[Speech-reco] RE: [SpeechIO] Update on TTS work, with samples
From: Sina Bahram
Subject: [Speech-reco] RE: [SpeechIO] Update on TTS work, with samples
Date: Wed, 13 Oct 2010 10:27:48 -0400
Interesting ... It is coming along it sounds like.
Take care,
Sina
-----Original Message-----
From: address@hidden [mailto:address@hidden] On Behalf Of Bill Cox
Sent: Wednesday, October 13, 2010 9:01 AM
To: address@hidden; address@hidden; speech input and output
Subject: [SpeechIO] Update on TTS work, with samples
This is still pretty rough, but it's improving. The attached files are espeak
samples, sped up by varying amounts. I find that I can understand all of them,
though the 4x sample only because I know the words it's trying to say.
However, I believe that with practice, I could understand this voice nearly as
easily as voxin at high speed. I've used pitch 65, which I believe is a bit
easier to understand at high speed, and I've used the UK voice. The US voice
is harder to understand at high speed because some of the sounds are just too
short.
From a technical point of view, this is what I've done. I use standard linear
predictive coding to extract LPC coefficients, factor them into poles, and
during synthesis, I interpolate between the poles to smooth the sound a bit.
The pops you hear are due to errors in my pole matching algorithm, which still
needs improvement. I have to match poles between frames in order to
interpolate, and errors can easily be heard as pops.
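To make the matching and interpolation step concrete, here is a rough sketch in
Python. It is not the actual code behind the samples. It assumes each frame's
LPC polynomial has already been factored, and that only the upper-half-plane
pole of each conjugate pair is kept. The function names (match_poles,
interpolate_pole, poles_to_lpc) are made up for illustration, and the
brute-force matching is only practical for the handful of pole pairs a typical
LPC order gives.

```python
import cmath
import math
from itertools import permutations

def match_poles(prev, curr):
    """Reorder curr so that curr[i] pairs with prev[i].

    Assumption: the best pairing is the one with the smallest total
    distance in the z-plane. Brute force over all orderings, so keep
    the pole count small.
    """
    best = min(permutations(curr),
               key=lambda order: sum(abs(a - b) for a, b in zip(prev, order)))
    return list(best)

def interpolate_pole(p0, p1, t):
    """Blend two matched poles in polar form, for t in [0, 1].

    Interpolating radius and angle separately keeps the radius between
    the two endpoint radii, so two stable poles give a stable result.
    """
    radius = (1.0 - t) * abs(p0) + t * abs(p1)
    a0, a1 = cmath.phase(p0), cmath.phase(p1)
    # Take the shorter way around the circle between the two angles.
    delta = (a1 - a0 + math.pi) % (2.0 * math.pi) - math.pi
    return cmath.rect(radius, a0 + t * delta)

def poles_to_lpc(poles):
    """Expand each pole and its conjugate back into real coefficients.

    Returns [1, a1, ..., aN] for the denominator
    prod (1 - p z^-1)(1 - conj(p) z^-1).
    """
    coeffs = [1.0]
    for p in poles:
        quad = [1.0, -2.0 * p.real, abs(p) ** 2]
        expanded = [0.0] * (len(coeffs) + 2)
        for i, c in enumerate(coeffs):
            for j, q in enumerate(quad):
                expanded[i + j] += c * q
        coeffs = expanded
    return coeffs

# Two frames with the same two resonances, listed in a different order.
frame_a = [cmath.rect(0.90, 0.30), cmath.rect(0.95, 1.20)]
frame_b = [cmath.rect(0.93, 1.25), cmath.rect(0.88, 0.35)]

matched_b = match_poles(frame_a, frame_b)
halfway = [interpolate_pole(a, b, 0.5) for a, b in zip(frame_a, matched_b)]
halfway_lpc = poles_to_lpc(halfway)
```

If the pairing comes out wrong, a pole gets interpolated toward the wrong
resonance and sweeps across the spectrum within a single frame, which is the
kind of error that is heard as a pop.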
The reason I'm using poles rather than traditional reflection coefficients is
that I am hoping to do direct pattern matching of the major pole tracks over
time to do speech recognition.
Bill