
File Wizard

File Wizard is a self-hosted, browser-based utility for file conversion, OCR, and audio transcription. It wraps many CLI and Python converters, as well as faster-whisper and Tesseract OCR.

Screenshot

Features

  • Versatile File Conversion: Convert between various file formats. The system is designed to be extended with any command-line tool (like FFmpeg, ImageMagick, etc.) via a simple settings.yml configuration file.

  • OCR: Perform Optical Character Recognition (OCR) on PDFs and images to extract text.

  • Accurate Audio Transcription: Transcribe audio files into text using Whisper models.

  • Modern UI & UX:

    • Clean, responsive, dark-themed interface.
    • Drag-and-drop support for single or multiple files anywhere on the screen.
    • Traditional multi-file selection buttons.
    • A dialog to choose the desired action (Convert, OCR, Transcribe) for dropped files.
  • Real-time Updates & History:

    • Jobs are processed in the background, with the UI updating statuses in real-time via polling.
    • A persistent job history table displays file names, tasks, submission times, file sizes (input → output), and final status.
  • Configuration:

    • A dedicated /settings page allows for viewing and editing the configuration directly from the UI.
    • OAuth is configured in the config/settings.yml file; see the default file for reference. By default, the app runs without authentication in local mode.
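
To illustrate how a command-line tool could be plugged in through a configuration entry, here is a minimal sketch. The template syntax (`{input}` and `{output}` placeholders) and the `build_command` helper are assumptions made for illustration, not File Wizard's actual settings.yml schema; check the default settings.yml for the real format.

```python
import shlex


def build_command(template: str, input_path: str, output_path: str) -> list[str]:
    """Expand a hypothetical command template into an argv list.

    The {input}/{output} placeholder names are assumptions made for this
    sketch; they are not taken from File Wizard's settings.yml.
    """
    # Split the template first, then substitute per token, so a path that
    # contains spaces stays a single argument and never passes through a shell.
    tokens = shlex.split(template)
    mapping = {"input": input_path, "output": output_path}
    return [token.format(**mapping) for token in tokens]


argv = build_command(
    "ffmpeg -y -i {input} {output}",
    "/tmp/my clip.mov",
    "/tmp/my clip.mp4",
)
print(argv)
```

A list built this way can be handed to `subprocess.run(argv, check=True)` without `shell=True`, which avoids quoting problems with user-supplied file names.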

Tech Stack

  • Backend: FastAPI (Python)
  • Frontend: Vanilla HTML, CSS, JavaScript (no framework)
  • Task Queue: Huey (SQLite backend for now; might move to Redis soon)
  • Database: SQLAlchemy (SQLite database for now; might move to Postgres soon)
  • Configuration: YAML

Installation & Setup

With Docker (Local Build)

Clone the repo and make sure Docker Compose works in your environment:

git clone https://github.com/LoredCast/filewizard.git
cd filewizard

Start the containers. Initially, the settings.yml file is applied; you can edit it. For further configuration, edit the .env file.

Note: Building this image takes some time because it installs all dependencies (mostly TeX Live).

docker compose up --build

With Docker (From Docker Hub)

I've published two images to Docker Hub that you can pull:

  • loredcast/filewizard:latest
  • loredcast/filewizard:small

The small image leaves out TeX and many other large tools.

Copy the docker-compose.yml from the repo into a directory on your machine, adjust it to your needs, and start the containers:

docker compose up -d

Manually

1. Clone the Repository

git clone https://github.com/LoredCast/filewizard.git
cd filewizard

2. Create a Virtual Environment & Install Dependencies

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

# Make sure you have a requirements.txt file with all dependencies
pip install -r requirements.txt

(Note: Dependencies include fastapi, uvicorn, sqlalchemy, huey, faster-whisper, ocrmypdf, pytesseract, python-multipart, pyyaml, etc.)

3. Configure the Application

Copy the default settings file and customize it to your needs. This is where you define your conversion tools and other parameters.

4. Run the Web Server

This command starts the web server and Huey.

chmod +x run.sh
./run.sh

Usage 🖱️

  1. Open your browser to http://127.0.0.1:8000.
  2. Drag and drop any file or multiple files onto the page.
  3. A dialog will appear asking you to choose an action: Convert, OCR, or Transcribe.
  4. Alternatively, use the "Choose File" buttons in any of the three sections.
  5. Your job will appear in the "History" table, and its status will update automatically.
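
The automatic status update in step 5 relies on polling. A script that talks to the server could follow the same pattern, roughly as sketched below. The status names and the `wait_for_job` helper are hypothetical, and the status lookup is a stand-in function rather than a real File Wizard endpoint, so the sketch runs offline.

```python
import time
from typing import Callable

# Hypothetical terminal states; the real status names may differ.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 60.0,
) -> str:
    """Call fetch_status repeatedly until it returns a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        # Pause between checks so the server is not flooded with requests.
        time.sleep(interval)


# Stand-in for an HTTP request to the server.
fake_responses = iter(["pending", "processing", "completed"])
final_status = wait_for_job(lambda: next(fake_responses), interval=0.01)
print(final_status)
```

In a real client, `fetch_status` would wrap whatever request returns the job's current state; the timeout keeps a stuck job from blocking the caller forever.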