Text to Speech support

Created by: Lester Caine, Last modification: Tue 15 of Jul, 2014 (07:44 UTC)

As the basic audio call system uses a set of pre-recorded sound segments to build the automatic announcement, it is restricted to simply announcing ticket numbers along with a counter, booth or desk identified by number or letter. Calling by name and supporting more diverse room locations has in the past been provided by the rVoice text to speech engine. With the takeover of Rhetorical by ScanSoft (now Nuance) 10 years ago, the rVoice TTS engine was shelved in favour of the systems marketed by the new American owners. This is rather annoying, since the engine is still today the best option for announcing names in the calling system, and Nuance no longer seems to sell a Text to Speech solution at all, just the Dragon Speech Recognition. None of the other replacements are as clean as rVoice.

The problem, however, is that while the engine is still performing perfectly, its licence only allows it to run on W2k. So while it can still be maintained, other restrictions are creating pressure for it to be replaced, and the difficulty is finding a suitable replacement. Most of the current options work well enough for personal interaction, perhaps via a computer speaker or headset, but how well they work through a PA system is not so clear.
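To illustrate why the pre-recorded segment approach cannot call people by name, here is a minimal sketch in Python of how such an announcement might be assembled. The function name, the clip names and the ".wav" naming scheme are all assumptions made for illustration; they are not taken from the actual call system.

```python
def announcement_segments(ticket, counter, location="counter"):
    """Return the ordered clip file names for one call (hypothetical naming)."""
    clips = ["ticket"]
    # Spell the ticket out one character at a time, e.g. "A042" -> a, 0, 4, 2
    clips += [ch.lower() for ch in str(ticket) if ch.isalnum()]
    # Fixed linking phrase, then the kind of location (counter, booth or desk)
    clips += ["please_go_to", location]
    # The location itself can only be a number or letter, again spelled out
    clips += [ch.lower() for ch in str(counter) if ch.isalnum()]
    # Assumed convention: one recorded file per clip name
    return [f"{name}.wav" for name in clips]


if __name__ == "__main__":
    print(announcement_segments("A042", 7, "desk"))
```

Every word spoken has to exist as a recorded clip in advance, so letters, digits and a handful of fixed phrases are practical, while an arbitrary name or room description is not. That gap is what a text to speech engine fills.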

Since Rhetorical was a spin-off from Edinburgh University, it is perhaps not surprising that the biggest open source engine is still hosted by them. Festival is still a little strained in its pronunciation, as is the alternative eSpeak engine, so this leaves the commercial options, of which nothing stands out. While researching the current alternatives I did come across a new company, CereProc, which has staff from the original Rhetorical team and so should be interesting, except for an annual charge for each voice.

Wikipedia has a well-researched, up-to-date view of the current marketplace, and nicely documents the history of these developments. It picks up that, in addition to Rhetorical, Nuance has also taken over both Lernout & Hauspie and SpeechWorks, which were two other good quality engines in the early days.

The system being used for the CMS framework was originally developed to support transport announcements at railway and bus stations, where the clarity of announcements is very important. Assessing a new option which can be used in that environment will require a substantial amount of work.

Currently reviewed options

CereProc: Demo on top bar
SitePal: On-line service only
NeoSpeech: No pricing details on-line
Cepstral: Sounds a little stunted
eSpeak: Too robotic for announcements
Festival: As above
NaturalReader: Not too bad when equalised in the amplifier
Nuance: Not found prices for Vocalizer yet

A few more options are listed on Wikipedia

Rhetorical history