MBROLA is speech synthesis software developed as a worldwide collaborative project. The MBROLA project web page provides diphone databases for many[1] spoken languages.

MBROLA
Original author(s): Thierry Dutoit
Developer(s): Vincent Pagel
Initial release: 1995
Stable release: 3.3 / 17 December 2019
Repository: github.com/numediart/MBROLA
Written in: C
Operating system: Linux, Windows, FreeBSD
Type: Speech synthesizer
License: GNU Affero General Public License
Website: github.com/numediart/MBROLA

The MBROLA software is not, by itself, a complete text-to-speech system for those languages: the text must first be converted into phoneme and prosody information in MBROLA's input format, which requires separate software (e.g. eSpeakNG).
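To make the phoneme-and-prosody input more concrete, here is a minimal sketch of how a front end might write it out. It assumes the commonly documented `.pho` layout: one line per phoneme, giving the phoneme symbol, its duration in milliseconds, and optional pairs of (position in percent of the duration, pitch in Hz). The function names `pho_line` and `to_pho` are hypothetical, and the phoneme symbols and timing values are illustrative only, since the valid symbols depend on the voice database in use.

```python
def pho_line(phoneme, duration_ms, pitch_points=()):
    """Format one phoneme as: symbol, duration (ms), then (position %, pitch Hz) pairs."""
    parts = [phoneme, str(duration_ms)]
    for position_pct, pitch_hz in pitch_points:
        parts += [str(position_pct), str(pitch_hz)]
    return " ".join(parts)


def to_pho(segments):
    """Join formatted phoneme lines into the text of a .pho-style file."""
    return "\n".join(pho_line(*segment) for segment in segments) + "\n"


# Illustrative values only: "_" is assumed to denote silence,
# and the other symbols are assumed to exist in the chosen voice.
segments = [
    ("_", 100, []),
    ("a", 120, [(50, 110)]),           # pitch target of 110 Hz at the midpoint
    ("m", 80, []),
    ("a", 150, [(0, 115), (100, 95)]),  # falling pitch across the vowel
    ("_", 100, []),
]
pho_text = to_pho(segments)
print(pho_text)
```

In a real pipeline, a front end such as eSpeakNG would produce these lines from text, and MBROLA would only perform the final conversion to audio.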

History

The MBROLA project started in 1995 at the TCTS Lab of the Faculté polytechnique de Mons (Belgium) as a scientific project to obtain a set of speech synthesizers for as many languages as possible. The first release of the MBROLA software was in 1996, and it was provided as freeware for non-commercial, non-military applications. Licenses for the voice databases differ, but most are likewise limited to non-commercial and non-military use.

Because it was free only for non-commercial applications, MBROLA served private and home users as an alternative to eSpeakNG, the de facto speech synthesis engine on Linux workstations, but it was mostly not used in commercial solutions (e.g. speaking clocks or boarding announcements at ports and terminals). After the initial development of the voice databases, updates and support for the MBROLA software ceased, and the closed-source binaries gradually fell behind newer hardware and operating systems.[2] To address this, the MBROLA development team decided to release MBROLA as open-source software, and on October 24, 2018, the source code was released on GitHub under the GNU Affero General Public License. On January 23, 2019, a tool called MBROLATOR, which creates MBROLA databases from WAV files, was released under the same license.

Used technology

The MBROLA software uses the MBROLA (Multi-Band Resynthesis OverLap Add)[3] algorithm for speech generation. Although it is diphone-based, the quality of MBROLA's synthesis is considered to be higher than that of most diphone synthesisers, because it preprocesses the diphones to impose constant pitch and harmonic phases, which improves their concatenation while only slightly degrading their segmental quality.

MBROLA voice sample of Leonhard Euler quote

MBROLA is a time-domain algorithm similar to PSOLA, which implies very low computational load at synthesis time. Unlike PSOLA, however, MBROLA does not require a preliminary marking of pitch periods. This feature has made it possible to develop the MBROLA project around the MBROLA algorithm, through which many speech research labs, companies, or individuals around the world have provided diphone databases for many languages and voices.
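The low synthesis cost of time-domain methods comes from the fact that the core operation is just adding windowed segments at shifted positions. The following is a generic overlap-add sketch, not MBROLA's actual implementation: the names `hann` and `overlap_add`, the Hann window, and the frame and hop sizes are all assumptions chosen for illustration. In a PSOLA-style synthesizer, the spacing between successive segments (the hop) is what sets the output pitch period.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def overlap_add(frames, hop):
    """Sum equal-length frames, placing frame k at sample offset k * hop."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for k, frame in enumerate(frames):
        start = k * hop
        for i, sample in enumerate(frame):
            out[start + i] += sample
    return out


# Illustrative check: frames cut from a constant signal of 1.0 with a
# periodic Hann window and 50% overlap add back up to 1.0 wherever
# two frames overlap.
frame_len, hop = 8, 4
window = hann(frame_len)
frames = [list(window) for _ in range(4)]
signal = overlap_add(frames, hop)
print([round(x, 3) for x in signal])
```

Each output sample costs only a few additions, with no frequency-domain processing at synthesis time, which is the sense in which such algorithms are computationally light.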

References

  1. List of MBROLA voices
  2. Mbrola-64 crashes immediately with a SEGFAULT
  3. Dutoit, T; Leich, H (Dec 1993). "MBR-PSOLA: Text-To-Speech synthesis based on an MBE re-synthesis of the segments database". Speech Communication. 13 (3–4): 435–440. doi:10.1016/0167-6393(93)90042-J.