Vaja 8.0: Commercial Thai/English Text-to-Speech Software

Service Innovation

Spoken language is the most common communication tool in daily life. Speech technology synthesizes human-like sound to convey a message from a device to listeners, who do not need to look at a monitor or screen. A text-to-speech synthesis system is the practical intermediary for controlling a synthesizer. Given only a chunk of text, the system can generate synthetic speech to communicate with listeners, for example in an automatic announcement system driven by text, which is more flexible than using pre-recorded speech.

Vaja 8.0 is the newest version (2018) of the Thai/English text-to-speech synthesizer from NECTEC. It can produce speech in two voices, female and male. It comprises three newly re-engineered parts, and the whole system has been reorganized so that the parts work as plug-in modules; each part can be replaced by any upcoming engine or language. In terms of technical challenges, the text pre-processing part segments text into syllable-like units instead of words. The grapheme-to-phoneme conversion part uses the new G2P 3.0 module, which is fully machine-learning based instead of relying on a syllable-pattern dictionary. Finally, the speech synthesizer part uses linguistic information for storing and retrieving HMM models from a tree. The product is bundled and available on both Linux and Windows.
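The plug-in organization described above can be illustrated with a minimal sketch. It is not the Vaja API: every name here (`TtsPipeline`, `toy_segmenter`, `toy_g2p`, `toy_synthesizer`) is hypothetical, and the three stages are trivial stand-ins that only show how one stage might be swapped without touching the others.

```python
from dataclasses import dataclass
from typing import Callable, List

# All names below are hypothetical; none of them come from the Vaja API.
Segmenter = Callable[[str], List[str]]        # text -> syllable-like units
G2P = Callable[[List[str]], List[str]]        # units -> phoneme strings
Synthesizer = Callable[[List[str]], bytes]    # phonemes -> raw PCM bytes


@dataclass
class TtsPipeline:
    """Three replaceable stages chained in a fixed order."""
    segment: Segmenter
    to_phonemes: G2P
    synthesize: Synthesizer

    def speak(self, text: str) -> bytes:
        units = self.segment(text)
        phonemes = self.to_phonemes(units)
        return self.synthesize(phonemes)


def toy_segmenter(text: str) -> List[str]:
    # Stand-in: split on whitespace; a real segmenter would find syllable-like units.
    return text.split()


def toy_g2p(units: List[str]) -> List[str]:
    # Stand-in: lowercase each unit instead of predicting phonemes.
    return [u.lower() for u in units]


def toy_synthesizer(phonemes: List[str]) -> bytes:
    # Stand-in: emit one silent 16-bit sample (2 bytes) per phoneme string.
    return b"\x00\x00" * len(phonemes)


pipeline = TtsPipeline(toy_segmenter, toy_g2p, toy_synthesizer)
audio = pipeline.speak("Hello Vaja")

# Replace only the grapheme-to-phoneme stage; the other stages are untouched.
pipeline.to_phonemes = lambda units: [u.upper() for u in units]
```

Because each stage is just a callable with a fixed input and output type, a new engine or language only has to match that shape to be dropped in.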

Software features:

Output: Thai/English synthetic reading speech
Text pre-processing: a module for reading from a dictionary is available
Speech quality: closer to human speech than the previous version
Voice modification: able to tune volume and speed
API: UI and library
Voice:
- Name: Nok (female), A (male)
- Language: Bilingual
- Sample type: PCM, 44,100 Hz, 16 bits, Mono
- Style: reading speech
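For readers who handle the output audio, the sample type listed above (PCM, 44,100 Hz, 16 bits, mono) implies an uncompressed data rate of 44,100 × 2 × 1 = 88,200 bytes per second. The sketch below uses only Python's standard `wave` module to write and re-read a WAV container with those parameters. It does not call Vaja; a generated sine tone stands in for synthesized speech, and the function names are illustrative assumptions.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100   # Hz, from the voice specification
SAMPLE_WIDTH = 2       # bytes per sample (16 bits)
CHANNELS = 1           # mono


def bytes_per_second() -> int:
    """Uncompressed PCM data rate for the format above."""
    return SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS


def tone_as_wav(freq_hz: float = 440.0, seconds: float = 0.5) -> bytes:
    """Return an in-memory WAV file holding a sine tone (placeholder for speech)."""
    n_frames = int(SAMPLE_RATE * seconds)
    frames = b"".join(
        # "<h" packs one little-endian signed 16-bit sample.
        struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)
    return buf.getvalue()


wav_bytes = tone_as_wav()

# Read the header back to confirm the format.
with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
    params = wav.getparams()
```

The same header check can be applied to any WAV file to confirm that it matches the listed sample type before passing it to a downstream system.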

System requirements

Operating system: Windows 7, Windows 8.1, Windows 10
RAM: 1 GB or higher
Available storage space: 100 MB or higher
Sound card: General sound card

Examples of VAJA ver. 2-8

Software specification:

Supported APIs
Provides text pre-processing tools such as grapheme-to-phoneme conversion

Application:

Able to be plugged into an IVR system
Able to read e-books, email, or documents
Able to be used in a queuing system or information support system
Able to be bundled in a presentation or e-learning application
Able to be used in any responsive device
Able to be plugged into a screen reader

Availability:

Licensing

Research and development team:

Speech and Audio Technology Laboratory (SPT, HCCRU)

email: sawit.kas[at]nectec.or.th

Contact:

Business and Technology Transfer: Tel: 0 2564 6900 ext 2346, 2351-2354, 2357, 2382, 2383, 2399; email: business[at]nectec.or.th

Poster : VAJA 8.0

September-11-2018 16:26