Text-To-Speech (TTS)

Posted on | January 29, 2008 | 6 Comments

Last month, Steve posted on his blog the idea of exchanging recorded vocabulary lists. He then made a very nice sample recording of an English vocabulary list. The first time I listened to it, I told myself that I must get one for my own list.

Here is the problem. I have over 1000 words/phrases on my list. Asking anyone to record the entire list is just inhumane. Well, perhaps I can find a non-human to do it for me. How about a Text-To-Speech (TTS) engine?

A quick survey of several online TTS services gave me the impression that the one from AT&T Labs produces the best result. Here is a sample in Spanish:

After some trial and error, I have refined the process so that it takes me only 5-10 minutes to make the audio file. Here are the steps:

  1. Select those words/phrases you want to learn and generate a print-view in LingQ
  2. Copy and paste the list into a spreadsheet
  3. Remove the ‘hint’ column and massage the other 2 columns to make them consumable by the TTS engine
  4. Generate multiple WAV files, 300 characters at a time (a limitation of the demo version of the engine)
  5. Merge the multiple files into one
  6. Copy and paste the list to the lyrics section of the MP3 file
  7. Upload the MP3 file to the MP3 player.
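
Steps 3 to 5 could also be scripted instead of done by hand. The following is a minimal Python sketch, not the process described above: the function names, the (term, meaning) row layout, and the idea of ending each entry with a period to force a short pause are all assumptions made for illustration. It does not call any TTS engine. It only prepares text in chunks of at most 300 characters and joins the resulting WAV files, assuming they all share the same audio format.

```python
import wave
from typing import Iterable, List, Tuple

MAX_CHARS = 300  # per-request limit of the demo engine mentioned in step 4


def to_sentences(rows: Iterable[Tuple[str, str]]) -> List[str]:
    """Turn (term, meaning) rows into short entries, each part ending in a period.

    The two-column layout is an assumption about the spreadsheet after the
    'hint' column has been removed.
    """
    entries = []
    for term, meaning in rows:
        term = term.strip().rstrip(".")
        meaning = meaning.strip().rstrip(".")
        entries.append(f"{term}. {meaning}.")
    return entries


def chunk_lines(lines: Iterable[str], max_chars: int = MAX_CHARS) -> List[str]:
    """Pack whole entries into chunks no longer than max_chars characters."""
    chunks: List[str] = []
    current = ""
    for line in lines:
        if len(line) > max_chars:
            raise ValueError(f"entry longer than {max_chars} characters: {line!r}")
        candidate = line if not current else current + " " + line
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The entry does not fit, so close this chunk and start a new one.
            chunks.append(current)
            current = line
    if current:
        chunks.append(current)
    return chunks


def merge_wavs(paths: List[str], out_path: str) -> None:
    """Concatenate WAV files that share channels, sample width, and sample rate."""
    if not paths:
        raise ValueError("no input files given")
    with wave.open(out_path, "wb") as out:
        expected = None
        for path in paths:
            with wave.open(path, "rb") as src:
                fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
                if expected is None:
                    expected = fmt
                    out.setparams(src.getparams())
                elif fmt != expected:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(src.readframes(src.getnframes()))
```

Each string returned by `chunk_lines(to_sentences(rows))` would be pasted into the TTS engine one at a time, and the downloaded files would then be passed in order to `merge_wavs`. Entries are never split across chunks, so a word and its meaning always end up in the same recording.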

The audio quality from the TTS engine is acceptable for listening. But I would not recommend using it for serious speaking or parroting exercises.

Hopefully, in the near future, I will be able to generate the audio file with a few clicks from the LingQ system and have it played back with real voices.


6 Responses to “Text-To-Speech (TTS)”

  1. Steve Kaufmann
    January 30th, 2008 @ 2:19 am


    I did not listen to the computer-generated sound. I find it hard to listen to. You have 1000 words on your list. I have over 8000 on my list of Russian words and phrases. I cannot possibly hear them all. I need not review them all. I review on a random basis. I pick up some of the words. I am reminded of some of the words in phrases. It is a good exercise and process. I learn some of those words now, and later on as well, incidentally. Language learning is not for perfectionists and is not something to be done thoroughly. It is done like picking flowers while walking through the forest. Eventually we get to our destination.

  2. Keith
    January 30th, 2008 @ 7:38 am

    The Mac OS can read text too. It has TTS. It sounds just as good in English as the Spanish one you posted. But I don’t have any software to record the sound, otherwise I’d send you an example.

  3. Keith
    January 30th, 2008 @ 8:01 am

    I just followed your link to the AT&T Research page and tried the same text with the various voices. The TTS in Mac OS is better than the AT&T system. It’s not hard to listen to, but not perfect yet. It sounds like they have recorded phrases and blended them together (it just sounds that way). Where two phrases meet or overlap I can hear a double voice, but during individual phrases, it sounds pretty natural in speed, rhythm, and intonation.

    So, I think a whole text would be a bit annoying, but just single words or short phrases are really good. Put a period after each word so there is a short pause.

    Now if only I had a way to record it.

  4. Edwin
    January 30th, 2008 @ 11:15 am

    Even if I want to review a small list of words, it takes time to ask a native speaker to record it for me. The TTS does sound unnatural, but at least I can generate the audio file and carry it around in my MP3 player, all done within minutes. As you have said, we cannot be perfectionists all the time.

    I remember you mentioned something about adding the capability to attach audio samples to the vocabulary. Can’t wait to see that happen!

  5. Edwin
    January 30th, 2008 @ 11:19 am

    I am not a Mac person, so I cannot offer any advice to you.

    As for myself, I must have the audio file generated so that I can carry it around. I have tried many online TTS engines. A few of them have very good quality but do not generate audio files. Too bad!

  6. chris(mandarin_student)
    February 2nd, 2008 @ 12:09 pm

    Edwin, if you have the free Audacity sound recording software, you can always use the “record what you hear” functionality to create sound files from anything that normally comes out of your PC loudspeakers.

    Also, most of these TTS sites on the Internet are probably loading a sound file in the background. If you use Firefox, there are a few browser extensions that will sniff out the URL of the generated sound file (usually a WAV file) and allow you to download it to your PC. Be aware that these URLs are almost always temporary, though, so it is best to capture them soon after you have listened online. The best Firefox extension for this in my experience is called “DownloadHelper”.

    I like TTS. It is not perfect, but it is another useful tool. Let’s say you have a couple of paragraphs you want to learn to recite in a foreign language but no native speaker to help you. Generating TTS from the text is heaps better than having nothing at all.
