Add OmniVoice TTS module with config, API, profiles and CLI
- Create modules/omnivoice/ with VoiceAPI, VoiceProfiles, CLI
- Add config manager integration with local model support
- Add app/komAI.py entry point
- Add tests/test_omnivoice.py
- Clone OmniVoice to external/ for development
- Add omnivoice config to global.yaml
93
modules/omnivoice/README.md
Normal file
@@ -0,0 +1,93 @@
# OmniVoice Module

TTS (Text-to-Speech) module based on OmniVoice.

## VENV

All dependencies are installed into the venv:

```bash
venv\Scripts\python.exe -m pip install torch==2.8.0+cpu torchaudio==2.8.0+cpu --index-url https://download.pytorch.org/whl/cpu
venv\Scripts\python.exe -m pip install soundfile PyYAML transformers accelerate pydub gradio tensorboardX webdataset librosa
venv\Scripts\python.exe -m pip install -e external/OmniVoice
```
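After running the install commands, it can help to confirm that the venv interpreter actually sees the packages. The following is a minimal sketch, not part of the module: the helper name `missing_modules` is hypothetical, `yaml` is the import name of PyYAML, and `omnivoice` is only an assumed import name for the editable install of `external/OmniVoice`.

```python
from importlib.util import find_spec

# Import names for the packages installed above. "yaml" is the import name
# of PyYAML; "omnivoice" is an ASSUMED import name for external/OmniVoice.
REQUIRED_MODULES = [
    "torch", "torchaudio", "soundfile", "yaml", "transformers",
    "accelerate", "pydub", "gradio", "tensorboardX", "webdataset",
    "librosa", "omnivoice",
]


def missing_modules(names):
    """Return the names that cannot be located by the current interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

Saved as a script and run with `venv\Scripts\python.exe`, this only checks that the modules can be found; it does not import them or verify versions.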

## Usage

### Python API

```python
# Run with the venv interpreter: venv\Scripts\python.exe

# Initialize the module
from modules.omnivoice import register_config, register_logging
register_config()
register_logging()

from modules.omnivoice import get_api, get_profiles

api = get_api()
profiles = get_profiles()

# Voice Cloning
audio = api.clone(
    text="Hello, this is a test.",
    ref_audio="ref.wav",
    ref_text="Reference transcription."
)

# Voice Design
audio = api.design(
    text="Hello, this is a test.",
    instruct="female, british accent"
)

# Auto Voice
audio = api.auto(text="Hello, this is a test.")

# Save the result
path = api.save_audio(audio[0], "output.wav")

# Profiles
profiles.save_from_generated("my_voice", "Hello", ref_audio="ref.wav")
audio = profiles.generate("my_voice", "Generated text")
```
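The calls above can be combined to synthesize several lines in one go. This is a minimal sketch rather than part of the module: `synthesize_batch` is a hypothetical helper, and it assumes, as the example above suggests, that `api.auto(text=...)` returns an indexable result and that `api.save_audio(clip, path)` returns the saved path.

```python
def synthesize_batch(api, texts, prefix="line"):
    """Generate one file per text with api.auto and return the saved paths.

    ASSUMPTION: api.auto(text=...) returns an indexable result and
    api.save_audio(clip, path) returns the path it wrote, as in the README.
    """
    saved = []
    for index, text in enumerate(texts, start=1):
        audio = api.auto(text=text)
        # Files are named line_001.wav, line_002.wav, ...
        filename = f"{prefix}_{index:03d}.wav"
        saved.append(api.save_audio(audio[0], filename))
    return saved
```

With the `api` object from above, `synthesize_batch(api, ["First line.", "Second line."])` would produce two files. Passing the API object in as a parameter also makes the helper easy to exercise with a stub in tests.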

### CLI

```bash
# Download the model locally
venv\Scripts\python.exe -m modules.omnivoice.cli download

# Voice Cloning (without --output = play the result)
venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav
venv\Scripts\python.exe -m modules.omnivoice.cli clone --text "Hello" --ref-audio ref.wav --output hello.wav --profile my_voice

# Voice Design
venv\Scripts\python.exe -m modules.omnivoice.cli design --text "Hello" --instruct "female, british accent"

# Auto Voice
venv\Scripts\python.exe -m modules.omnivoice.cli auto --text "Hello"

# Profiles
venv\Scripts\python.exe -m modules.omnivoice.cli profiles
venv\Scripts\python.exe -m modules.omnivoice.cli profile-add --name my_voice --ref-audio ref.wav
venv\Scripts\python.exe -m modules.omnivoice.cli profile-remove --name my_voice
venv\Scripts\python.exe -m modules.omnivoice.cli profile-use --profile my_voice --text "Hello"
```
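When the CLI is driven from another script, building the argument list programmatically avoids quoting problems. The sketch below uses a hypothetical helper, `build_cli_command`; it relies only on the subcommands and flags listed above and does not check that a given flag is valid for a given subcommand.

```python
import sys


def build_cli_command(subcommand, python=sys.executable, **options):
    """Build an argv list for `python -m modules.omnivoice.cli <subcommand>`.

    Keyword names become flags: ref_audio="ref.wav" turns into
    --ref-audio ref.wav. Options set to None are skipped.
    """
    argv = [python, "-m", "modules.omnivoice.cli", subcommand]
    for name, value in options.items():
        if value is None:
            continue
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv
```

The resulting list can be handed to `subprocess.run(..., check=True)`, for example `build_cli_command("clone", text="Hello", ref_audio="ref.wav", output="hello.wav")`. By default it uses the interpreter running the script, so it should be launched from the venv.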

## Configuration

Parameters in `config/global.yaml`:

```yaml
# omnivoice
omnivoice.model_name: "k2-fsa/OmniVoice"
omnivoice.device: "cuda:0"
omnivoice.dtype: "float16"
omnivoice.num_steps: 32
omnivoice.speed: 1.0
omnivoice.profiles_dir: "data/voice_profiles"
omnivoice.output_dir: "output/voice"
omnivoice.enabled: true
```
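The keys above are flat, dotted strings rather than a nested mapping, so a YAML loader such as `yaml.safe_load` returns them as-is. The following sketch shows one way to collect a module's settings from such a flat dict; `section_settings` is a hypothetical helper and is not the project's config manager.

```python
def section_settings(flat_config, section):
    """Collect the `section.*` keys of a flat config into a plain dict."""
    prefix = section + "."
    return {
        key[len(prefix):]: value
        for key, value in flat_config.items()
        if key.startswith(prefix)
    }


# The same values as in the YAML above, in the form a YAML loader returns.
flat = {
    "omnivoice.model_name": "k2-fsa/OmniVoice",
    "omnivoice.device": "cuda:0",
    "omnivoice.dtype": "float16",
    "omnivoice.num_steps": 32,
    "omnivoice.speed": 1.0,
    "omnivoice.profiles_dir": "data/voice_profiles",
    "omnivoice.output_dir": "output/voice",
    "omnivoice.enabled": True,
}

settings = section_settings(flat, "omnivoice")
print(settings["device"], settings["num_steps"])
```

Note that `cuda:0` with `float16` presupposes a CUDA-enabled torch build, whereas the install commands in the VENV section pull the CPU-only wheels; on such a setup `omnivoice.device` would presumably need to be `cpu`.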