Larynx - A viable Linux TTS
I spent some time over the weekend experimenting with voice2json and Rhasspy, trying to set up a fully offline voice assistant. The setup had four parts: Mozilla DeepSpeech for speech recognition, a template file containing all known phrases and their mappings to intents, an intent recognizer, and a local shell script that parses the recognized intent and invokes commands (think opening a website or folder when an intent is recognized). Rhasspy was easy to use and really fun. It’s amazing how far open-source tools have come in the TTS and speech recognition space.
Along the way, I discovered Larynx, a TTS system for Linux with high-quality voices from Glow-TTS and others, with intonations that sound human. I’ve kept an eye on the Linux TTS space for years and have been disappointed by how few options suit everyday consumer use. Often, the pre-trained TTS voices sound too robotic to listen to for long. I suppose that’s understandable given the dearth of open-source voice datasets (which is why projects like Mozilla Common Voice are so exciting!). It’s nice to have a pleasant pre-trained TTS model natively available on Linux.
My use case is to select text in a browser or a Thunderbird RSS article, hit a shortcut, and have the TTS system read the selected text aloud so I can look away from the screen and just listen.
Setup
I followed the Debian installation instructions, then downloaded and installed the tts, lang-en_us and Harvard Glow TTS files.
# cd Downloads # (or /path/to/downloaded/deb/files)
sudo apt install ./larynx*.deb
TTS Shortcut
To create a shortcut that invokes Larynx on selected text, I added aliases to my ~/.bash_aliases file. They use xclip to access clipboard and selection data. On Debian-based systems, you should be able to install it with sudo apt install xclip.
# Speak text passed as an argument or piped in on stdin
# Usage: speak "This is a test"
alias speak="larynx --voice harvard-glow_tts --interactive"
# Speak clipboard text
# Usage: speak-clipboard
alias speak-clipboard="xclip -out -selection clipboard | speak"
# Speak currently selected text
# Usage: speak-selection
alias speak-selection="xclip -out -selection primary | speak"
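Aliases are only expanded in interactive shells, so they are not available to plain scripts. As a rough sketch, the same commands could be written as shell functions instead. The names speak_text, speak_from and the SPEAK_CMD variable are made up for this example and are not part of Larynx or xclip; the larynx flags are the ones used in the aliases above.

```shell
# Hypothetical function equivalents of the aliases above.
# SPEAK_CMD is an assumed override hook for this sketch, not a Larynx feature.
SPEAK_CMD="${SPEAK_CMD:-larynx --voice harvard-glow_tts --interactive}"

# Read text from stdin and hand it to the TTS command.
speak_text() {
    # Left unquoted on purpose so the command string splits into words.
    $SPEAK_CMD
}

# Speak the contents of an X selection.
# $1 is the selection name: "primary" or "clipboard".
speak_from() {
    xclip -out -selection "$1" | speak_text
}
```

With these defined, speak_from primary would read the current selection and speak_from clipboard would read the clipboard.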
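Under Settings -> Keyboard on GNOME, I added a custom keybinding for Super+S that runs bash -i -c "speak-selection". The -i flag matters here: aliases are only loaded and expanded in an interactive shell. This lets me select any text and hit Super+S to have Larynx read it aloud.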
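One thing the shortcut does not handle is being pressed with nothing selected. A minimal sketch of a guard for that case follows, assuming that skipping the TTS call is the wanted behaviour. The function names has_text and speak_selection_if_any are hypothetical, and the larynx flags are the ones from the speak alias.

```shell
# Hypothetical guard: only call Larynx when the selection is non-empty.

# Succeeds if $1 contains at least one non-whitespace character.
has_text() {
    [ -n "$(printf '%s' "$1" | tr -d '[:space:]')" ]
}

speak_selection_if_any() {
    # Grab the primary selection; treat xclip errors as "nothing selected".
    _sel_text="$(xclip -out -selection primary 2>/dev/null || true)"
    if has_text "$_sel_text"; then
        printf '%s\n' "$_sel_text" | larynx --voice harvard-glow_tts --interactive
    fi
}
```

If this lives in ~/.bash_aliases alongside the aliases, the keybinding could run bash -i -c "speak_selection_if_any" instead.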