Your Complete Voice Adaptation Research Workspace
VoiceStudio is a unified toolkit for text-style prompted speech synthesis. It supports instant voice adaptation and editing through natural language descriptions, and builds on research in voice style prompting, LoRA adaptation, and language-audio models.
Key Features:
uv add "voicestudio[all]" # Install with all available base TTS models (quoted so shells such as zsh do not expand the brackets)
git clone https://github.com/LatentForge/voicestudio.git
cd voicestudio
uv pip install -e ".[all]"
git clone https://github.com/LatentForge/voicestudio.git
cd voicestudio
uv pip install -e ".[all,web]"
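After any of the install options above, a quick way to confirm that the package is visible to your interpreter is to query its installed metadata. This is a minimal sketch that uses only the Python standard library; it assumes the distribution is registered under the name `voicestudio` (as in the install commands), and the helper `installed_version` is a hypothetical name introduced for illustration, not part of VoiceStudio.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution name matches the one used in the install commands.
    found = installed_version("voicestudio")
    if found is None:
        print("voicestudio is not installed in this environment")
    else:
        print(f"voicestudio {found} is installed")
```

Run it with the same interpreter that `uv` installed into (for example via `uv run python check_install.py`, where the file name is your choice).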
# Build package
uv build
# Upload to PyPI
uv publish
VoiceStudio works with various TTS architectures:
| Model | Status | Notes |
|---|---|---|
| Parler-TTS | Supported | Requires further testing |
| Higgs-Audio | Supported | Requires further testing |
| Qwen3-TTS | Supported | Requires further testing |
| Chroma | Supported | Requires further testing |
| Spark | Experimental | Coming soon |
| Dia | Supported | Fully tested (by HF) |
| CosyVoice | Experimental | Coming soon |
| F5-TTS | Experimental | Coming soon |
Add your own model: See our Integration Guide
We welcome contributions! See CONTRIBUTING.md for guidelines.
Areas we need help with:
This project is licensed under the MIT License; see the LICENSE file for details.
The base TTS models supported by this project are subject to their own respective licenses. Users are responsible for reviewing and complying with each model's license before use.
If you use VoiceStudio in your research, please cite:
@software{voicestudio2026,
title={VoiceStudio: A Unified Toolkit for Voice Style Adaptation},
author={Your Name},
year={2026},
url={https://github.com/LatentForge/voicestudio}
}
@article{t2a-lora-2025,
title={T2A-LoRA: Text-to-Audio LoRA Generation via Hypernetworks for Real-time Voice Adaptation},
author={LatentForge},
journal={arXiv preprint arXiv:2501.XXXXX},
year={2025}
}
Made with ❤️ by the LatentForge Team