Add OmniVoice TTS module with config, API, profiles and CLI

- Create modules/omnivoice/ with VoiceAPI, VoiceProfiles, CLI
- Add config manager integration with local model support
- Add app/komAI.py entry point
- Add tests/test_omnivoice.py
- Clone OmniVoice to external/ for development
- Add omnivoice config to global.yaml
2026-04-16 17:51:15 +03:00
parent 22b85455e1
commit 55353654b7
11 changed files with 1064 additions and 3 deletions
--- a/modules/omnivoice/README.md
+++ b/modules/omnivoice/README.md
@@ -0,0 +1,93 @@
+# OmniVoice Module
+
+Text-to-speech (TTS) module built on OmniVoice.
+
+## VENV
+
+All dependencies are installed into the venv:
+```bash
+venv\Scripts\python.exe -m pip install torch==2.8.0+cpu torchaudio==2.8.0+cpu --index-url https://download.pytorch.org/whl/cpu
+venv\Scripts\python.exe -m pip install soundfile PyYAML transformers accelerate pydub gradio tensorboardX webdataset librosa
+venv\Scripts\python.exe -m pip install -e external/OmniVoice
+```
+
+## Usage
+
+### Python API
+
+```python
+# Inside the venv, start the interpreter with:
+#   venv\Scripts\python.exe
+
+# Initialize the module
+from modules.omnivoice import register_config, register_logging
+register_config()
+register_logging()
+
+from modules.omnivoice import get_api, get_profiles
+
+api = get_api()
+profiles = get_profiles()
+
+# Voice Cloning
+audio = api.clone(
+    text="Hello, this is a test.",
+    ref_audio="ref.wav",
+    ref_text="Reference transcription."
+)
+
+# Voice Design
+audio = api.design(
+    text="Hello, this is a test.",
+    instruct="female, british accent"
+)
+
+# Auto Voice
+audio = api.auto(text="Hello, this is a test.")
+
+# Save to a file
+path = api.save_audio(audio[0], "output.wav")
+
+# Profiles
+profiles.save_from_generated("my_voice", "Hello", ref_audio="ref.wav")
+audio = profiles.generate("my_voice", "Generated text")
+```
+
+### CLI
+
+```bash
+# Download the model locally
+venv\Scripts\python.exe -m modules.omnivoice.cli download
+
+# Voice Cloning (without --output, the audio is played back)
+venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav
+venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav --output hello.wav --profile my_voice
+
+# Voice Design
+venv\Scripts\python.exe -m modules.omnivoice.cli design --text "Hello" --instruct "female, british accent"
+
+# Auto Voice
+venv\Scripts\python.exe -m modules.omnivoice.cli auto --text "Hello"
+
+# Profiles
+venv\Scripts\python.exe -m modules.omnivoice.cli profiles
+venv\Scripts\python.exe -m modules.omnivoice.cli profile-add --name my_voice --ref-audio ref.wav
+venv\Scripts\python.exe -m modules.omnivoice.cli profile-remove --name my_voice
+venv\Scripts\python.exe -m modules.omnivoice.cli profile-use --profile my_voice --text "Hello"
+```
+
+## Configuration
+
+Parameters in `config/global.yaml`:
+
+```yaml
+# omnivoice
+omnivoice.model_name: "k2-fsa/OmniVoice"
+omnivoice.device: "cuda:0"
+omnivoice.dtype: "float16"
+omnivoice.num_steps: 32
+omnivoice.speed: 1.0
+omnivoice.profiles_dir: "data/voice_profiles"
+omnivoice.output_dir: "output/voice"
+omnivoice.enabled: true
+```
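
Note on the configuration above: the README installs the CPU-only PyTorch wheels (`+cpu`), while the defaults request `cuda:0` with `float16`. The following is a minimal sketch, not part of this commit, of how the flat `omnivoice.*` keys could be grouped and how a device fallback might look. The helper names `section` and `resolve_device` are hypothetical, and the choice of `float32` on CPU is an assumption rather than something the module is known to do.

```python
from __future__ import annotations

from typing import Any

# Flat dotted keys, copied from the config/global.yaml fragment in the README.
FLAT_CONFIG: dict[str, Any] = {
    "omnivoice.model_name": "k2-fsa/OmniVoice",
    "omnivoice.device": "cuda:0",
    "omnivoice.dtype": "float16",
    "omnivoice.num_steps": 32,
    "omnivoice.speed": 1.0,
    "omnivoice.profiles_dir": "data/voice_profiles",
    "omnivoice.output_dir": "output/voice",
    "omnivoice.enabled": True,
}


def section(flat: dict[str, Any], prefix: str) -> dict[str, Any]:
    """Return the entries under `prefix.` with that prefix stripped (hypothetical helper)."""
    head = prefix + "."
    return {key[len(head):]: value for key, value in flat.items() if key.startswith(head)}


def resolve_device(cfg: dict[str, Any], cuda_available: bool) -> tuple[str, str]:
    """Pick (device, dtype), falling back to CPU when CUDA was requested but is missing.

    Assumption: float32 is used on CPU; the real module may choose differently.
    """
    device, dtype = cfg["device"], cfg["dtype"]
    if device.startswith("cuda") and not cuda_available:
        return "cpu", "float32"
    return device, dtype


if __name__ == "__main__":
    cfg = section(FLAT_CONFIG, "omnivoice")
    # In a real environment the flag would come from torch.cuda.is_available().
    print(resolve_device(cfg, cuda_available=False))
```

With the CPU-only wheels from the install step, `torch.cuda.is_available()` returns `False`, so a check of this kind (or setting `omnivoice.device: "cpu"` directly) avoids a failure at model load time.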