Documentation

Complete guide to WhisperSubTranslate - from installation to advanced features.

Installation

Windows (Portable)

  1. Download the latest .zip file from GitHub Releases
  2. Extract the ZIP file to your desired location
  3. Run WhisperSubTranslate.exe

The application bundles everything it needs (whisper.cpp, ffmpeg), so no additional installation is required.

Quick Start

  1. Launch WhisperSubTranslate
  2. Drag and drop your video/audio file or click "Select File"
  3. Select the Whisper model (small recommended for most users)
  4. Choose the source language (or select "Auto Detect")
  5. Click "Extract Subtitles" and wait for processing
  6. Save the generated subtitle file

System Requirements

Component   Minimum      Recommended
OS          Windows 10   Windows 10/11
RAM         4GB          8GB+
Storage     2GB          5GB+
GPU         Optional     NVIDIA GTX 1060+

Whisper Models

WhisperSubTranslate supports various Whisper models with different accuracy and speed trade-offs:

Source: whisper.cpp GitHub. VRAM requirements are lower than for PyTorch Whisper because of GGML optimizations.

Model            Size    VRAM   Speed
tiny             75MB    ~1GB   Fastest
base             142MB   ~1GB   Fast
small            466MB   ~2GB   Balanced
medium           1.5GB   ~4GB   Accurate
large            3GB     ~5GB   Best Quality
large-v2         3GB     ~5GB   Best Quality
large-v3         3GB     ~5GB   Best Quality
large-v3-turbo   809MB   ~4GB   Fast & Accurate ⭐

For most users, the "small" or "large-v3-turbo" model provides the best balance of speed and accuracy.

Model Selection by Hardware

Hardware                         Recommended
CPU Only / Integrated Graphics   tiny, base
Low RAM (4GB or less)            tiny
8GB RAM (CPU mode)               small
GPU 4GB VRAM                     small, medium, large-v3-turbo ⭐
GPU 6GB+ VRAM                    Any model (including large-v3)

For CPU-only users: Start with "tiny" or "base" for faster processing. "small" is usable but slower.
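The hardware table above can be read as a simple lookup. The following is a minimal sketch of that lookup, not code from the application: the function name, the exact thresholds between rows, and the choice to treat a GPU with less than 4GB of VRAM like CPU-only hardware are all assumptions made for illustration.

```python
from typing import List, Optional

ALL_MODELS = ["tiny", "base", "small", "medium",
              "large", "large-v2", "large-v3", "large-v3-turbo"]


def recommend_models(vram_gb: Optional[float], ram_gb: float) -> List[str]:
    """Return the models suggested by the hardware table.

    vram_gb is None for CPU-only or integrated graphics.
    Thresholds between table rows are assumptions.
    """
    if vram_gb is not None and vram_gb >= 6:
        return list(ALL_MODELS)                       # "Any model"
    if vram_gb is not None and vram_gb >= 4:
        return ["small", "medium", "large-v3-turbo"]  # "GPU 4GB VRAM"
    # CPU mode (assumption: GPUs under 4GB VRAM are treated the same way).
    if ram_gb <= 4:
        return ["tiny"]                               # "Low RAM"
    if ram_gb >= 8:
        return ["small"]                              # "8GB RAM (CPU mode)"
    return ["tiny", "base"]                           # "CPU Only"


if __name__ == "__main__":
    print(recommend_models(vram_gb=None, ram_gb=8))
    print(recommend_models(vram_gb=4, ram_gb=16))
```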

Output Formats

WhisperSubTranslate generates subtitles in the most compatible format:

  • SRT - The most widely supported subtitle format, accepted by virtually all video players. The app automatically generates SRT files with millisecond-precision timestamps.

Note: The underlying whisper.cpp engine supports additional formats (VTT, TXT, JSON) via command line, but the app interface focuses on SRT for maximum compatibility.
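For readers who post-process the generated files, it may help to see what a millisecond-precision SRT entry looks like. The sketch below formats a cue in the standard SRT layout (index, "HH:MM:SS,mmm --> HH:MM:SS,mmm", text, blank line). The function names are illustrative and are not part of WhisperSubTranslate.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue; cues are separated by a blank line in a file."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    print(srt_cue(1, 0.0, 2.25, "Hello"))
```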

Translation

WhisperSubTranslate can translate your subtitles using external APIs:

DeepL

  1. Create an account at deepl.com
  2. Get your API key from the account settings
  3. Enter the API key in WhisperSubTranslate settings

MyMemory (Free)

  1. No API key required for basic usage
  2. Select MyMemory as the translation service in settings
  3. Free tier available with daily limits
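To illustrate why no API key is needed for basic usage, here is a hedged sketch of a keyless request to MyMemory's public HTTP endpoint. How the app itself calls the service is not documented here, so treat the function names and the response handling as assumptions. The endpoint and the "langpair" parameter follow MyMemory's public API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

MYMEMORY_ENDPOINT = "https://api.mymemory.translated.net/get"


def mymemory_url(text: str, source: str, target: str) -> str:
    """Build a keyless MyMemory request URL for one piece of text."""
    query = urlencode({"q": text, "langpair": f"{source}|{target}"})
    return f"{MYMEMORY_ENDPOINT}?{query}"


def translate(text: str, source: str, target: str) -> str:
    """Fetch a translation (network call; subject to the free daily limit)."""
    with urlopen(mymemory_url(text, source, target), timeout=30) as resp:
        payload = json.load(resp)
    # Assumption: the translated text is under responseData.translatedText.
    return payload["responseData"]["translatedText"]


if __name__ == "__main__":
    print(mymemory_url("Hello world", "en", "ko"))
```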

GPT-5-nano (OpenAI)

Pricing: $0.05 input / $0.40 output per 1M tokens

  1. Create an account at openai.com
  2. Get your API key from the API settings
  3. Enter the API key in WhisperSubTranslate settings
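The per-token pricing above translates into cost as follows. This is a worked sketch only: the token counts in the example are hypothetical, since actual usage depends on the subtitle text and the prompt the app sends.

```python
# Prices quoted above, in USD per 1M tokens.
INPUT_USD_PER_M = 0.05
OUTPUT_USD_PER_M = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate translation cost from input and output token counts."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


if __name__ == "__main__":
    # Hypothetical job: 20,000 tokens sent, 20,000 tokens received.
    print(f"${estimate_cost_usd(20_000, 20_000):.4f}")
```

Under these assumed counts the job costs under one cent, because output tokens dominate at $0.40 per 1M.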

Gemini 3 Flash (Google)

Free: about 250 subtitles/day (roughly a 20-30 minute video). Paid: unlimited.

  1. Go to Google AI Studio
  2. Create an API key (Google account required)
  3. Enter the API key in WhisperSubTranslate settings

Translation APIs may incur costs based on usage. Check the pricing of each service.

GPU Acceleration

GPU acceleration can dramatically speed up subtitle extraction. WhisperSubTranslate supports NVIDIA CUDA.

Requirements

  • NVIDIA GPU (GTX 1060 or higher recommended)
  • Latest NVIDIA drivers
  • CUDA support (provided by the NVIDIA driver; a separate CUDA Toolkit installation should not be needed)

GPU acceleration can provide 5-10x faster processing compared to CPU-only mode.
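One way to check whether an NVIDIA GPU is visible to the system is to query nvidia-smi, the utility that ships with the NVIDIA driver. The sketch below is not part of the app. It assumes nvidia-smi is on PATH and that memory is reported in the "<number> MiB" form.

```python
import shutil
import subprocess
from typing import List, Tuple


def parse_gpu_csv(output: str) -> List[Tuple[str, int]]:
    """Parse 'name, <N> MiB' lines into (name, memory_mib) tuples."""
    gpus = []
    for line in output.strip().splitlines():
        if not line.strip():
            continue
        name, memory = (part.strip() for part in line.rsplit(",", 1))
        gpus.append((name, int(memory.split()[0])))
    return gpus


def list_nvidia_gpus() -> List[Tuple[str, int]]:
    """Return detected NVIDIA GPUs, or an empty list if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver utility not found
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    print(list_nvidia_gpus())
```

An empty result usually points to a missing or outdated driver rather than a problem with WhisperSubTranslate itself.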

Command Line

For advanced users, the underlying whisper.cpp can be used directly from the command line:

whisper-cli.exe -m _models/ggml-small.bin -l auto input.wav

Common Options

  • -m - Model path
  • -l - Source language (auto, en, ko, ja, etc.)
  • -osrt - Output SRT format
  • -ml - Maximum segment length in characters
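For batch jobs, the options listed above can be assembled programmatically. The following is a minimal sketch using only the flags documented in this section. The function name, its defaults, and the assumption that whisper-cli.exe is reachable from the working directory are illustrative.

```python
import subprocess
from typing import List, Optional


def whisper_command(model_path: str, audio_path: str,
                    language: str = "auto", srt: bool = True,
                    max_len: Optional[int] = None,
                    exe: str = "whisper-cli.exe") -> List[str]:
    """Build an argument list for whisper-cli from the documented options."""
    cmd = [exe, "-m", model_path, "-l", language]
    if srt:
        cmd.append("-osrt")             # write an SRT file
    if max_len is not None:
        cmd += ["-ml", str(max_len)]    # maximum segment length
    cmd.append(audio_path)
    return cmd


def transcribe(model_path: str, audio_path: str) -> None:
    """Run whisper-cli and raise if it exits with an error."""
    subprocess.run(whisper_command(model_path, audio_path), check=True)


if __name__ == "__main__":
    print(whisper_command("_models/ggml-small.bin", "input.wav"))
```

Passing the arguments as a list rather than a single string avoids quoting problems with file names that contain spaces.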

Data Storage

WhisperSubTranslate stores your settings and API keys locally on your computer. This data is never uploaded to any server.

Storage Location

All settings are stored in your Windows AppData folder:

%APPDATA%\whispersubtranslate\translation-config-encrypted.json

Typical path:

C:\Users\[YourUsername]\AppData\Roaming\whispersubtranslate\
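If you need to locate these files from a script, the paths can be derived from the APPDATA environment variable. This is a small sketch with illustrative function names. It uses the folder and file names given in this section, plus the error log path described under Log Files.

```python
import os
from pathlib import PureWindowsPath
from typing import Dict

APP_FOLDER = "whispersubtranslate"


def config_paths(appdata: str) -> Dict[str, PureWindowsPath]:
    """Resolve the settings folder, config file, and error log path."""
    base = PureWindowsPath(appdata) / APP_FOLDER
    return {
        "folder": base,
        "config": base / "translation-config-encrypted.json",
        "error_log": base / "logs" / "translation-errors.log",
    }


def current_user_paths() -> Dict[str, PureWindowsPath]:
    """Resolve the paths for the current Windows user via %APPDATA%."""
    return config_paths(os.environ["APPDATA"])


if __name__ == "__main__":
    for name, path in config_paths(r"C:\Users\Alice\AppData\Roaming").items():
        print(name, path)
```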

What is Stored

  • API Keys - DeepL, OpenAI, Google Gemini keys (encrypted with AES)
  • App Settings - Selected model, language, device, translation service
  • UI Preferences - Interface language
  • Error Logs - Translation error logs stored in the logs/ subfolder

Log Files

Translation error logs are saved to help with troubleshooting:

%APPDATA%\whispersubtranslate\logs\translation-errors.log

These logs contain only error information for debugging purposes and do not include any personal data or API keys.

Your API keys are encrypted before storage. The settings file is stored outside the project folder and will NOT be uploaded when you share the project on GitHub.

Reset Settings

To reset all settings to defaults, delete the config folder:

  1. Close WhisperSubTranslate
  2. Press Win + R, type %APPDATA% and press Enter
  3. Delete the whispersubtranslate folder
  4. Restart the application

Backup Settings

To back up your settings (including API keys), copy the entire whispersubtranslate folder from AppData to a safe location.

Keep your backup secure! The file contains your encrypted API keys.
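The manual copy can also be scripted. Below is a minimal sketch that copies the settings folder into a timestamped backup directory. The function name and the backup folder naming scheme are assumptions, and the application should be closed before running it.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_settings(settings_dir: Path, backup_root: Path) -> Path:
    """Copy the whole settings folder to a new timestamped folder."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"whispersubtranslate-backup-{stamp}"
    # copytree creates the target (and missing parents) and fails if
    # the target already exists, so existing backups are not overwritten.
    shutil.copytree(settings_dir, target)
    return target


if __name__ == "__main__":
    import os
    source = Path(os.environ["APPDATA"]) / "whispersubtranslate"
    print(backup_settings(source, Path.home() / "Backups"))
```

To restore, copy the backed-up folder back to the AppData location while the application is closed.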

Troubleshooting

App won't start

  • Make sure you have Windows 10 or later
  • Try running as administrator
  • Check if antivirus is blocking the application

Poor accuracy

  • Try a larger model (medium or large-v3)
  • Make sure the audio is clear with minimal background noise
  • Specify the source language instead of using auto-detect

GPU not detected

  • Update NVIDIA drivers to the latest version
  • Verify GPU compatibility (CUDA 11.x support required)
  • Restart the application after driver update