komAI/modules/omnivoice/README.md
Komisar 55353654b7 Add OmniVoice TTS module with config, API, profiles and CLI
- Create modules/omnivoice/ with VoiceAPI, VoiceProfiles, CLI
- Add config manager integration with local model support
- Add app/komAI.py entry point
- Add tests/test_omnivoice.py
- Clone OmniVoice to external/ for development
- Add omnivoice config to global.yaml
2026-04-16 17:51:15 +03:00

# OmniVoice Module
A TTS (Text-to-Speech) module built on OmniVoice.
## VENV
All dependencies are installed into the venv:
```bash
venv\Scripts\python.exe -m pip install torch==2.8.0+cpu torchaudio==2.8.0+cpu --index-url https://download.pytorch.org/whl/cpu
venv\Scripts\python.exe -m pip install soundfile PyYAML transformers accelerate pydub gradio tensorboardX webdataset librosa
venv\Scripts\python.exe -m pip install -e external/OmniVoice
```
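After installing, it can help to confirm that the venv interpreter actually sees every package. The following is a minimal sketch using only the standard library; the `missing_packages` helper and the `REQUIRED` list are illustrative and not part of the module. The list uses import names, which can differ from pip names (PyYAML is imported as `yaml`).

```python
from importlib.util import find_spec

# Import names for the packages installed above (PyYAML imports as `yaml`).
REQUIRED = [
    "torch", "torchaudio", "soundfile", "yaml", "transformers", "accelerate",
    "pydub", "gradio", "tensorboardX", "webdataset", "librosa",
]


def missing_packages(names):
    """Return the top-level names the current interpreter cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it with `venv\Scripts\python.exe` so the check applies to the venv rather than the system interpreter.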
## Usage
### Python API
```python
# Run this inside the venv interpreter: venv\Scripts\python.exe
# Initialize the module
from modules.omnivoice import register_config, register_logging
register_config()
register_logging()
from modules.omnivoice import get_api, get_profiles
api = get_api()
profiles = get_profiles()
# Voice Cloning
audio = api.clone(
    text="Hello, this is a test.",
    ref_audio="ref.wav",
    ref_text="Reference transcription.",
)
# Voice Design
audio = api.design(
    text="Hello, this is a test.",
    instruct="female, british accent",
)
# Auto Voice
audio = api.auto(text="Hello, this is a test.")
# Save the result
path = api.save_audio(audio[0], "output.wav")
# Profiles
profiles.save_from_generated("my_voice", "Hello", ref_audio="ref.wav")
audio = profiles.generate("my_voice", "Generated text")
```
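The calls above can be combined to render several lines of text in one go. This is a sketch only: `design_batch` is a hypothetical helper, and it assumes exactly what the example above shows, namely that `api.design(...)` returns a sequence whose first item can be passed to `api.save_audio(...)`, and that `save_audio` returns the output path.

```python
def design_batch(api, lines, instruct, prefix="line"):
    """Generate one audio file per text line using the Voice Design call.

    `api` is the object returned by get_api(); only the design() and
    save_audio() calls shown in this README are used.
    """
    paths = []
    for index, text in enumerate(lines, start=1):
        audio = api.design(text=text, instruct=instruct)
        # Save the first returned item, as in the example above.
        paths.append(api.save_audio(audio[0], f"{prefix}_{index:03d}.wav"))
    return paths
```

For example, `design_batch(api, ["Hello.", "Goodbye."], "female, british accent")` would request two files named `line_001.wav` and `line_002.wav`.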
### CLI
```bash
# Download the model locally
venv\Scripts\python.exe -m modules.omnivoice.cli download
# Voice Cloning (without --output, the audio is played back)
venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav
venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav --output hello.wav --profile my_voice
# Voice Design
venv\Scripts\python.exe -m modules.omnivoice.cli design --text "Hello" --instruct "female, british accent"
# Auto Voice
venv\Scripts\python.exe -m modules.omnivoice.cli auto --text "Hello"
# Profiles
venv\Scripts\python.exe -m modules.omnivoice.cli profiles
venv\Scripts\python.exe -m modules.omnivoice.cli profile-add --name my_voice --ref-audio ref.wav
venv\Scripts\python.exe -m modules.omnivoice.cli profile-remove --name my_voice
venv\Scripts\python.exe -m modules.omnivoice.cli profile-use --profile my_voice --text "Hello"
```
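When the CLI is driven from another Python script, building the argument list programmatically avoids shell quoting problems. The sketch below assembles the `clone` command from the flags shown above; `clone_command` and the `PYTHON` constant are illustrative names, not part of the module.

```python
# Interpreter path used throughout this README.
PYTHON = r"venv\Scripts\python.exe"


def clone_command(text, ref_audio, output=None, profile=None):
    """Build the argument list for the `clone` subcommand."""
    cmd = [
        PYTHON, "-m", "modules.omnivoice.cli", "clone",
        "--text", text,
        "--ref-audio", ref_audio,
    ]
    # Optional flags are appended only when a value is given.
    if output is not None:
        cmd += ["--output", output]
    if profile is not None:
        cmd += ["--profile", profile]
    return cmd
```

The returned list can be passed directly to `subprocess.run(..., check=True)` from the repository root.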
## Configuration
Parameters in `config/global.yaml`:
```yaml
# omnivoice
omnivoice.model_name: "k2-fsa/OmniVoice"
omnivoice.device: "cuda:0"
omnivoice.dtype: "float16"
omnivoice.num_steps: 32
omnivoice.speed: 1.0
omnivoice.profiles_dir: "data/voice_profiles"
omnivoice.output_dir: "output/voice"
omnivoice.enabled: true
```
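Because each key is written with a dot (`omnivoice.device`) rather than as a nested mapping, a YAML parser loads these entries as flat top-level keys. How the module's config manager reads them is not shown here; the sketch below is one hypothetical way to collect all `omnivoice.*` entries into a single dict. The `section` helper is an illustrative name, and the `flat` dict stands in for the result of `yaml.safe_load()` on the file.

```python
def section(flat_config, prefix):
    """Collect `prefix.key: value` entries into a dict keyed by `key`."""
    marker = prefix + "."
    return {
        key[len(marker):]: value
        for key, value in flat_config.items()
        if key.startswith(marker)
    }


# Stand-in for the dict yaml.safe_load() would return for config/global.yaml.
flat = {
    "omnivoice.model_name": "k2-fsa/OmniVoice",
    "omnivoice.device": "cuda:0",
    "omnivoice.num_steps": 32,
    "omnivoice.enabled": True,
}

settings = section(flat, "omnivoice")
print(settings["device"], settings["num_steps"])
```

Note that the install commands in the VENV section pull the CPU build of torch, so a `cuda:0` device would presumably require a CUDA-enabled build instead.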