
Re: [gnuspeech-contact] Re: Parameter names and meanings


From: David Hill
Subject: Re: [gnuspeech-contact] Re: Parameter names and meanings
Date: Fri, 3 Feb 2006 14:24:09 -0800

Hi Eric,

This is the first time I've seen this message, so maybe there was a list problem. Very odd!


On Feb 3, 2006, at 4:40 AM, Eric Zoerner wrote:

I haven't seen a reply to this message which was sent to the list on 3 Jan. Is this information not available?

Thanks,
Eric



On 3 Jan 2006, at 18:47, Eric Zoerner wrote:

Maybe I'm not looking in the right place, but there seems to be some information missing in the GnuSpeech documentation which is causing some difficulty for me.

I am having some difficulty with the names used for the parameters, both at the posture (Synthesizer) and suprasegmental (Monet) levels, in that I cannot find documentation on how the parameters relate to the tools. There is general discussion of the parameters in some cases, but there is no direct mapping of the concepts to the parameter names.

I'll have to look into this.


For example, the Synthesizer app allows you to adjust the "breathiness" of a posture, but there is no reference to how adjusting this affects the output parameters of the Synthesizer app. My best guess is that it may affect the "fricBW" parameter.

The "breathiness" parameter actually injects noise as part of the glottal excitation. It simulates the fact that in some voices part of the glottis does not close fully; air passing through that unclosed portion as the main glottal closure increases causes "breathy" noise at the glottis itself. This is one of the features that distinguishes most female voices from most male voices ("she had a really husky voice" indicates a more extreme case).

You can find the source for the software Tube Resonance Model engine as "tube.c" on the GNU site under
"trillium/src/softwareTRM/tube.c".

The most relevant part of the code is:

            /*  CREATE GLOTTAL PULSE (OR SINE TONE)  */
            pulse = oscillator(f0);

            /*  CREATE PULSED NOISE  */
            pulsed_noise = lp_noise * pulse;

            /*  CREATE NOISY GLOTTAL PULSE  */
            pulse = ax * ((pulse * (1.0 - breathinessFactor)) +
                          (pulsed_noise * breathinessFactor));

            /*  CROSS-MIX PURE NOISE WITH PULSED NOISE  */
            if (modulation) {
                crossmix = ax * crossmixFactor;
                crossmix = (crossmix < 1.0) ? crossmix : 1.0;
                signal = (pulsed_noise * crossmix) +
                    (lp_noise * (1.0 - crossmix));
            }


Note that when voiced fricatives (especially "z") are synthesised, the frication noise is pulsed at glottal frequency. This is a different effect, and it is what the noise cross-mix is all about.


It would be helpful to get a description of each parameter, what the abbreviation stands for, what it means, and how it is adjusted by the tools.

Agreed.  It shall be done.


r1 through r8 are fairly obvious, and I've figured out the meanings of fricVol, fricPos, etc., but the ones I'm not sure of include fricCF (is this the throat transmission Cutoff Frequency?), and fricBW (bandwidth??).

This is "fricative centre frequency". Fricatives are well imitated by noise with a particular bandwidth and centre frequency (real fricatives are more complex if you get down to analysing them, and also pretty variable). Thus a "sh" sound has a wide bandwidth and low CF (2600 Hz and 2500 Hz respectively), whilst an "s" sound has a narrow bandwidth and higher CF (500 Hz and 5500 Hz respectively). This kind of information is in the posture data entries.


In the Monet documentation appendix, the BW and AX are listed, but no explanation of what these mean that I can find. (I did find explanation of qss, and duration parameters in transitions).

BW is "bandwidth" -- I am not sure of the context you had in mind.

AX is an old term that should have been updated. It dates back to the days of Lawrence's "Parametric Artificial Talker" (or "PAT"), probably the first fully functional formant-based synthesiser (like the later MITalk and DECTalk), which toured the US in 1953 with its British inventor. AX simply stands for "larynx amplitude" -- the A being "amplitude" and the X standing for "larynx" -- more properly referred to as the glottis or vocal folds these days. Don't ask me to justify it ;-) Computer memory was at a premium back in those days, and PAT did not, at that time, even use a computer, but the techniques of character saving spilled over, I suppose.

So AX stands for the amplitude of the glottal waveform, though even there some qualification is needed: a waveform's "amplitude" can be specified as peak amplitude, RMS amplitude, energy flow (power), and so on. We are using peak amplitude. An energy measure might be better. This is like the debate between VU readings and other measures of audio output for audio equipment.

Please keep asking questions. I am very happy to supply answers, and it will help me to see what is missing and steer me to producing a document to fill in the gaps.

I'll check out what might be needed and make a start asap, but I am currently trying to get a working version of the "Synthesizer" application up (it is going quite well), and I also want to get a "real-time Monet" working too.

Thanks for writing.

All good wishes.

david
----
David Hill
Imagination is more important than knowledge. (Albert Einstein)
Kill your television!



