[Speech-reco] RE: [SpeechIO] Update on TTS work, with samples

From: Sina Bahram
Subject: [Speech-reco] RE: [SpeechIO] Update on TTS work, with samples
Date: Wed, 13 Oct 2010 10:27:48 -0400

Interesting... It sounds like it is coming along.

Take care,

-----Original Message-----
From: address@hidden [mailto:address@hidden] On Behalf Of Bill Cox
Sent: Wednesday, October 13, 2010 9:01 AM
To: address@hidden; address@hidden; speech input and output
Subject: [SpeechIO] Update on TTS work, with samples

This is still pretty rough, but it's improving.  The attached files are
espeak samples, sped up by varying amounts.  I find that I can
understand all of them, but the 4x sample only because I already know
the words it's trying to say.  However, I believe that with practice I
could understand this voice nearly as easily as voxin at high speed.
I've used pitch 65, which I believe is a bit easier to understand at
high speed, and I've used the UK voice.  The US voice is harder to
understand at high speed because some of its sounds are just too short.

From a technical point of view, this is what I've done.  I use standard
linear predictive coding to extract LPC coefficients and factor them
into poles.  During synthesis, I interpolate between the poles of
adjacent frames to smooth the sound a bit.  The pops you hear are due to
errors in my pole-matching algorithm, which still needs improvement.  I
have to match poles between frames in order to interpolate, and a
mismatch is easily heard as a pop.
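
To make the steps above concrete, here is a rough sketch in Python with
NumPy.  It is not the code behind the attached samples.  It assumes the
autocorrelation method with a Levinson-Durbin recursion for the LPC
step, a simple greedy nearest-neighbour rule for pole matching, and
straight-line interpolation in the complex plane.  All function names
are made up for illustration.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """LPC by the autocorrelation method (Levinson-Durbin recursion).

    Returns a with a[0] == 1, so A(z) = sum_k a[k] * z**-k.
    Assumes the frame is not silent (r[0] > 0).
    """
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def lpc_to_poles(a):
    """Factor A(z) into its roots, which are the poles of 1/A(z)."""
    return np.roots(a)

def match_poles(prev_poles, curr_poles):
    """Reorder curr_poles so entry i is the pole nearest prev_poles[i].

    Greedy and assumed for illustration only: an early bad pick can
    force a later mismatch, which is the kind of error heard as a pop.
    """
    remaining = list(curr_poles)
    matched = []
    for p in prev_poles:
        idx = int(np.argmin([abs(p - q) for q in remaining]))
        matched.append(remaining.pop(idx))
    return np.array(matched)

def interpolate_poles(p0, p1, t):
    """Blend matched pole sets; t = 0 gives p0, t = 1 gives p1."""
    return (1.0 - t) * p0 + t * p1

def poles_to_lpc(poles):
    """Rebuild real filter coefficients from a set of poles."""
    return np.real(np.poly(poles))
```

One reason straight-line interpolation is attractive here: the unit
disk is convex, so a blend of two stable poles is itself stable.  The
weak point is the matching step, since blending two poles that do not
belong to the same track sweeps a resonance across the spectrum.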

I'm using poles rather than traditional reflection coefficients because
I hope to do speech recognition by direct pattern matching of the major
pole tracks over time.

