Configuration
S2ST Translator uses a layered configuration system with the following priority (highest to lowest):
- Command-line flags
- Environment variables
- config/default.yaml
- Built-in defaults
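
The priority order above can be illustrated with a small merge routine. This is only a sketch of the idea, not S2ST Translator's actual loader: `resolve_settings`, `BUILTIN_DEFAULTS`, and the setting keys (`device`, `model_size`, `log_level`) are hypothetical names chosen for illustration, while the environment variable names are the documented ones.

```python
import os

# Hypothetical built-in defaults; the keys are illustrative only.
BUILTIN_DEFAULTS = {"device": "auto", "model_size": "medium", "log_level": "INFO"}

# Documented environment variables mapped to the hypothetical keys above.
ENV_VARS = {
    "device": "SEAMLESS_DEVICE",
    "model_size": "SEAMLESS_MODEL_SIZE",
    "log_level": "LOG_LEVEL",
}


def resolve_settings(cli_flags, file_config, environ=None):
    """Merge the four layers, lowest priority first, so later layers win."""
    environ = os.environ if environ is None else environ
    settings = dict(BUILTIN_DEFAULTS)      # 4. built-in defaults
    settings.update(file_config)           # 3. values from config/default.yaml
    for key, var in ENV_VARS.items():      # 2. environment variables
        if var in environ:
            settings[key] = environ[var]
    # 1. command-line flags; None means the flag was not passed
    settings.update({k: v for k, v in cli_flags.items() if v is not None})
    return settings


resolved = resolve_settings(
    cli_flags={"device": "cpu", "model_size": None},
    file_config={"model_size": "large"},
    environ={"SEAMLESS_DEVICE": "cuda", "LOG_LEVEL": "DEBUG"},
)
print(resolved)
```

In this run the flag overrides `SEAMLESS_DEVICE`, the file value for the model size overrides the built-in default, and `LOG_LEVEL` comes from the environment because no higher layer sets it.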
Configuration File
The main configuration file is config/default.yaml:

    # Model settings
    model:
      size: "large"              # small, medium, large
      device: "auto"             # auto, cuda, mps, cpu
      dtype: "float16"           # float16, float32
      num_beams: 5               # Beam search beams
      temperature: 1.0           # Sampling temperature

    # Translation defaults
    translation:
      source_lang: "auto"        # auto-detect or language code
      target_lang: "eng"         # Default target language
      task: "s2st"               # Speech-to-speech translation

    # Audio processing
    audio:
      target_sample_rate: 16000  # SeamlessM4T requirement
      normalize: true
      to_mono: true

    # File paths
    paths:
      input_dir: "./input"
      output_dir: "./output"
      translated_subdir: "translated"
      logs_subdir: "logs"

    # Logging
    logging:
      level: "INFO"              # DEBUG, INFO, WARNING, ERROR
      console:
        enabled: true
      file:
        enabled: true
        max_bytes: 10485760      # 10MB
        backup_count: 5

    # Processing
    processing:
      num_workers: 1
      batch_size: 1
      resume_from_checkpoint: true

    # Security (optional)
    security:
      api_key: null              # Set to require API key

Environment Variables
| Variable | Description | Default |
|---|---|---|
| SEAMLESS_DEVICE | Force device: auto, cuda, mps, cpu | auto |
| SEAMLESS_MODEL_SIZE | Model size: small, medium, large | medium |
| LOG_LEVEL | Logging level | INFO |
| API_KEY | API authentication key | None |
| NUM_WORKERS | Number of worker processes | 1 |
Example:

    export SEAMLESS_DEVICE=cuda
    export SEAMLESS_MODEL_SIZE=large
    ./start.sh

Model Selection
| Size | Model ID | VRAM | Quality | Speed |
|---|---|---|---|---|
| small | seamless-m4t-medium (v1) | ~4GB | Good | Fast |
| medium | seamless-m4t-medium (v1) | ~8GB | Good | Medium |
| large | seamless-m4t-v2-large | ~16GB | Best | Slower |

Note: The small and medium sizes use the same v1 model. The large size uses the v2 model with improved quality.
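
As a rough illustration of how the table might guide a choice, the sketch below picks the largest size whose approximate VRAM figure fits a given budget. `MODELS` and `pick_size` are hypothetical names, the VRAM numbers are the approximate values from the table, and real memory use will also depend on dtype and input length.

```python
# Size -> (model ID, approximate VRAM in GB), copied from the table above.
MODELS = {
    "small": ("seamless-m4t-medium", 4),
    "medium": ("seamless-m4t-medium", 8),
    "large": ("seamless-m4t-v2-large", 16),
}


def pick_size(available_vram_gb):
    """Return the largest documented size whose approximate VRAM fits, or None."""
    fitting = [size for size, (_, vram) in MODELS.items() if vram <= available_vram_gb]
    if not fitting:
        return None
    return max(fitting, key=lambda size: MODELS[size][1])


for vram in (2, 6, 12, 24):
    print(vram, pick_size(vram))
```

The returned name could then be exported as SEAMLESS_MODEL_SIZE or written to model.size in the configuration file.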
Supported Languages
S2ST Translator supports these language codes:
Major Languages
| Code | Language |
|---|---|
| eng | English |
| deu | German |
| fra | French |
| spa | Spanish |
| ita | Italian |
| por | Portuguese |
| nld | Dutch |
| pol | Polish |
| rus | Russian |
| ukr | Ukrainian |
| cmn | Chinese (Mandarin) |
| jpn | Japanese |
| kor | Korean |
| arb | Arabic |
| hin | Hindi |
| tur | Turkish |
All Supported Codes

    afr, amh, arb, ary, arz, asm, azj, bel, ben, bos, bul, cat, ceb, ces, ckb,
    cmn, cym, dan, deu, ell, eng, est, eus, fin, fra, fuv, gaz, gle, glg, guj,
    hau, heb, hin, hrv, hun, hye, ibo, ind, isl, ita, jav, jpn, kam, kan, kat,
    kaz, kea, khk, khm, kir, kor, lao, lit, ltz, lug, luo, lvs, mai, mal, mar,
    mkd, mlt, mni, mya, nld, nno, nob, npi, nya, oci, ory, pan, pbt, pes, pol,
    por, ron, rus, slk, slv, sna, snd, som, spa, srp, swe, swh, tam, tel, tgk,
    tgl, tha, tur, ukr, urd, uzn, vie, xho, yor, zho, zul

API Security
To require API key authentication, set the key in the configuration file:

    # config/default.yaml
    security:
      api_key: "your-secret-key-here"

Or via environment:

    export API_KEY="your-secret-key-here"
    ./start.sh

Clients must include the key in requests:

    curl -H "X-API-KEY: your-secret-key-here" http://localhost:8000/api/jobs

Performance Tuning
GPU Memory Optimization
For limited VRAM, use the medium model:

    model:
      size: "medium"
      dtype: "float16"

Multi-Worker Processing
For high throughput with multiple GPUs:

    processing:
      num_workers: 2

Or via environment:

    NUM_WORKERS=2 ./start.sh

Chunk Size
Adjust the audio chunk duration to trade memory use against quality:

    audio:
      chunk_duration: 10.0  # seconds
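
To make the setting concrete, the sketch below shows what a 10-second chunk means at the 16000 Hz target sample rate. It assumes plain fixed-length chunks with no overlap; how S2ST Translator actually splits audio is not described here, and `split_into_chunks` is a hypothetical helper.

```python
SAMPLE_RATE = 16000      # audio.target_sample_rate from the configuration file
CHUNK_DURATION = 10.0    # audio.chunk_duration, in seconds


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, chunk_duration=CHUNK_DURATION):
    """Split a sequence of samples into consecutive chunks of at most chunk_duration seconds."""
    chunk_len = int(sample_rate * chunk_duration)
    if chunk_len <= 0:
        raise ValueError("chunk_duration must cover at least one sample")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


audio = [0.0] * (SAMPLE_RATE * 25)   # 25 seconds of silence as stand-in input
chunks = split_into_chunks(audio)
print([len(c) / SAMPLE_RATE for c in chunks])   # chunk durations in seconds
```

Each full chunk holds 160000 samples, so a shorter chunk_duration lowers the amount of audio processed at once, at the cost of giving the model less context per chunk.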