Add OmniVoice TTS module with config, API, profiles and CLI

- Create modules/omnivoice/ with VoiceAPI, VoiceProfiles, CLI
- Add config manager integration with local model support
- Add app/komAI.py entry point
- Add tests/test_omnivoice.py
- Clone OmniVoice to external/ for development
- Add omnivoice config to global.yaml
2026-04-16 17:51:15 +03:00
parent 22b85455e1
commit 55353654b7
11 changed files with 1064 additions and 3 deletions


@@ -0,0 +1,93 @@
# OmniVoice Module
A TTS (text-to-speech) module built on OmniVoice.
## VENV
All dependencies are installed into the venv:
```bash
venv\Scripts\python.exe -m pip install torch==2.8.0+cpu torchaudio==2.8.0+cpu --index-url https://download.pytorch.org/whl/cpu
venv\Scripts\python.exe -m pip install soundfile PyYAML transformers accelerate pydub gradio tensorboardX webdataset librosa
venv\Scripts\python.exe -m pip install -e external/OmniVoice
```
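To confirm that the install succeeded, a quick check like the following can help. It is a minimal sketch: the import names are assumptions derived from the package names above (PyYAML imports as `yaml`), and `missing_modules` is a hypothetical helper, not part of this module.

```python
import importlib.util

# Import names assumed from the pip packages listed above (PyYAML -> yaml).
REQUIRED = [
    "torch", "torchaudio", "soundfile", "yaml", "transformers",
    "accelerate", "pydub", "gradio", "tensorboardX", "webdataset", "librosa",
]


def missing_modules(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it with `venv\Scripts\python.exe` so that it checks the venv rather than the system interpreter.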
## Usage
### Python API
```python
# Run inside the venv interpreter (venv\Scripts\python.exe)
# Initialize the module
from modules.omnivoice import register_config, register_logging
register_config()
register_logging()
from modules.omnivoice import get_api, get_profiles
api = get_api()
profiles = get_profiles()
# Voice Cloning
audio = api.clone(
text="Hello, this is a test.",
ref_audio="ref.wav",
ref_text="Reference transcription."
)
# Voice Design
audio = api.design(
text="Hello, this is a test.",
instruct="female, british accent"
)
# Auto Voice
audio = api.auto(text="Hello, this is a test.")
# Save to a file
path = api.save_audio(audio[0], "output.wav")
# Profiles
profiles.save_from_generated("my_voice", "Hello", ref_audio="ref.wav")
audio = profiles.generate("my_voice", "Generated text")
```
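The calls above can be combined to synthesize several lines in one pass. The sketch below is hypothetical: `synthesize_all` is not part of the module, and it assumes only what the example shows, namely that `api.auto(text=...)` returns a sequence whose first item is the audio and that `api.save_audio(audio, name)` returns the saved path.

```python
def synthesize_all(api, lines):
    """Generate one file per input line using the auto-voice mode.

    `api` is any object with `auto` and `save_audio` methods that behave
    as in the README example (an assumption, not a documented contract).
    """
    paths = []
    for index, text in enumerate(lines, start=1):
        audio = api.auto(text=text)
        # File names are illustrative; the example above passes a bare name too.
        name = f"line_{index:03d}.wav"
        paths.append(api.save_audio(audio[0], name))
    return paths
```

With the real module this would be called as `synthesize_all(get_api(), ["First line.", "Second line."])`.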
### CLI
```bash
# Download the model locally
venv\Scripts\python.exe -m modules.omnivoice.cli download
# Voice Cloning (without --output, the audio is played back)
venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav
venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav --output hello.wav --profile my_voice
# Voice Design
venv\Scripts\python.exe -m modules.omnivoice.cli design --text "Hello" --instruct "female, british accent"
# Auto Voice
venv\Scripts\python.exe -m modules.omnivoice.cli auto --text "Hello"
# Profiles
venv\Scripts\python.exe -m modules.omnivoice.cli profiles
venv\Scripts\python.exe -m modules.omnivoice.cli profile-add --name my_voice --ref-audio ref.wav
venv\Scripts\python.exe -m modules.omnivoice.cli profile-remove --name my_voice
venv\Scripts\python.exe -m modules.omnivoice.cli profile-use --profile my_voice --text "Hello"
```
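The same commands can be driven from a script. The following is a sketch that uses only the `clone` flags shown above. `clone_command` and `run_clone` are hypothetical helpers, and the sketch assumes the CLI returns a non-zero exit code on failure, which the source does not state.

```python
import subprocess
import sys


def clone_command(text, ref_audio, output=None, profile=None, python=sys.executable):
    """Build the argument list for the `clone` subcommand shown above."""
    cmd = [python, "-m", "modules.omnivoice.cli", "clone",
           "--text", text, "--ref-audio", ref_audio]
    if output is not None:
        cmd += ["--output", output]
    if profile is not None:
        cmd += ["--profile", profile]
    return cmd


def run_clone(**kwargs):
    """Run the CLI; raises CalledProcessError if it exits with a non-zero code."""
    return subprocess.run(clone_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting issues with texts that contain spaces or quotes. `sys.executable` resolves to the venv interpreter when the script itself is started with `venv\Scripts\python.exe`.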
## Configuration
Parameters in `config/global.yaml`:
```yaml
# omnivoice
omnivoice.model_name: "k2-fsa/OmniVoice"
omnivoice.device: "cuda:0"
omnivoice.dtype: "float16"
omnivoice.num_steps: 32
omnivoice.speed: 1.0
omnivoice.profiles_dir: "data/voice_profiles"
omnivoice.output_dir: "output/voice"
omnivoice.enabled: true
```
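Note that the VENV section installs the CPU-only PyTorch wheels (`+cpu`), for which `torch.cuda.is_available()` returns `False`, so `omnivoice.device: "cuda:0"` cannot be used with that build. Whether the module falls back automatically is not stated here. The sketch below shows one way such a fallback could look. `section` and `resolve_runtime` are hypothetical helpers, and the `float32` fallback dtype is an assumption.

```python
def section(flat, prefix):
    """Collect flat 'prefix.key' entries into a dict keyed by 'key'."""
    dotted = prefix + "."
    return {k[len(dotted):]: v for k, v in flat.items() if k.startswith(dotted)}


def resolve_runtime(cfg, cuda_available):
    """Return (device, dtype), falling back to CPU/float32 when CUDA is absent."""
    device = cfg.get("device", "cpu")
    dtype = cfg.get("dtype", "float32")
    if device.startswith("cuda") and not cuda_available:
        return "cpu", "float32"
    return device, dtype


# Stand-in for the parsed contents of config/global.yaml.
flat = {
    "omnivoice.model_name": "k2-fsa/OmniVoice",
    "omnivoice.device": "cuda:0",
    "omnivoice.dtype": "float16",
    "omnivoice.num_steps": 32,
}
cfg = section(flat, "omnivoice")
print(resolve_runtime(cfg, cuda_available=False))  # ('cpu', 'float32')
```

In real use, `cuda_available` would come from `torch.cuda.is_available()`.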