Monday, February 3, 2014
Speech synthesis from Coursera Lectures - 1
These days a lot of speech data is freely available on the Internet. Take Coursera lectures as an example: professors teach students online for free by recording their lectures. These recordings contain both audio and video streams, and the good thing is that they are all provided with subtitles (transcripts). Most of the lectures you find on Coursera are in English. With the advances in speech technology, especially in speech synthesis, we can build synthetic voices from the lecture videos. But there are some challenges involved in this procedure. I will list some of the problems here:
1) Recording conditions. This is a tough problem because we don't know how far the microphone is from the speaker or what types of noise are mixed into the recordings (e.g., fan or A/C noise, coughing, writing sounds, etc.). In this case we have to apply speech enhancement techniques to reduce the background noise.
2) Audio extraction. Bandwidth and storage space on the Internet are limited, so most of the time videos are post-processed to reduce their size. In general, the compression technique used is not specified. If a lossy codec was used (for example, an MPEG audio format such as MP3 or AAC), we can't restore the original quality; if the compression was lossless, the original audio can be recovered exactly by decoding it.
3) Text alignment with audio. We commonly see subtitles that are delayed or early by a few seconds. As viewers we don't mind these small mismatches, but in speech technology they play a major role, because training needs each piece of text matched to the audio in which it is actually spoken. So here we have to apply text alignment tools to correct these mismatches.
4) Length of utterances. Traditional speech synthesis systems are trained on read-aloud sentences, which are usually 6-7 words long. In lectures, however, we will find utterances of widely varying length, because most of the time professors explain things from their experience rather than reading pre-written text.
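To make points 3 and 4 a little more concrete, here is a minimal Python sketch. It assumes the subtitles are in SubRip (.srt) format, which is my assumption and not something Coursera guarantees, and the function names (parse_srt, shift_cues, word_counts) and the sample text are made up for illustration. It only applies a constant time offset to the subtitle cues and counts the words per cue. Real text alignment (forced alignment against the audio) needs an acoustic model and is not shown here.

```python
import re
from statistics import mean

# Matches an SRT timing line such as "00:00:01,500 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(h, m, s, ms):
    """Convert the captured hour/minute/second/millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(text):
    """Return a list of (start_sec, end_sec, caption_text) cues."""
    cues = []
    # Cues in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        for i, ln in enumerate(lines):
            match = TIMING.search(ln)
            if match:
                g = match.groups()
                start, end = to_seconds(*g[:4]), to_seconds(*g[4:])
                # Everything after the timing line is the caption text.
                cues.append((start, end, " ".join(lines[i + 1:])))
                break
    return cues


def shift_cues(cues, offset_sec):
    """Shift every cue by a constant offset (negative = earlier), clamped at 0."""
    return [
        (max(0.0, start + offset_sec), max(0.0, end + offset_sec), text)
        for start, end, text in cues
    ]


def word_counts(cues):
    """Number of whitespace-separated words in each cue."""
    return [len(text.split()) for _, _, text in cues]


# Hypothetical two-cue subtitle file, used only for illustration.
SAMPLE = """1
00:00:01,500 --> 00:00:04,000
Welcome to the course.

2
00:00:04,200 --> 00:00:09,800
Today we will look at how speech is produced and why that matters for synthesis.
"""

if __name__ == "__main__":
    cues = shift_cues(parse_srt(SAMPLE), -0.5)  # subtitles assumed 0.5 s late
    counts = word_counts(cues)
    print("first cue times:", cues[0][:2])
    print("words per cue:", counts, "mean:", mean(counts))
```

A constant shift like this can only fix a uniform delay. If the mismatch changes over the course of the lecture, each cue has to be aligned separately. The word counts give a quick picture of how far the lecture utterances are from the short read-aloud sentences mentioned in point 4, keeping in mind that a subtitle cue is not always a full sentence.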
Along with these four main issues there are some minor issues, which I will discuss in coming posts.