Re: [gnuspeech-contact] source code organisation
From: David Hill
Date: Fri, 28 Apr 2006 18:06:16 -0700
Hi Eric,
I was away yesterday, as usual.
Several of your questions I shall only be able to answer properly and
fully once I've finished the Synthesizer port and moved on to the
extraction & implementation of "real-time-monet".
On Apr 27, 2006, at 2:25 AM, Eric Zoerner wrote:
> Where in the source for gnuspeech is the TTS logic, including where
> it selects an intonation contour based on the punctuation?
You can probably "read" the source code even better than I can.
Don't forget that the input text is parsed and converted to Monet
input format. At that point the tone-group and foot boundaries are
added, the tone groups are selected, and the tonic is chosen; this
information is used to set the input format elements. It is done on
the basis of dictionary look-up, to get the stresses, which provide
the foot boundaries, and of the punctuation, which selects the
tone-group boundaries and tone groups. If there is no other information (e.g.
bad punctuation) then defaults are used, so the tone group spans the
whole sentence, the tonic is the last foot of the sentence, and the
default tone group is tone group 1. The short answer to your
question is "during the parsing of the input to Monet". The
conversion to parameters is then carried out by methods within Monet,
but there's no provision for changing the intonation/rhythm rules and
data (unlike everything else), though they can be varied manually in
particular syntheses.
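
To make the defaults above concrete, here is a rough sketch in Python. It is not taken from the gnuspeech source; the names (STRESSED, TONE_GROUP_FOR, annotate) and the punctuation-to-tone-group entry are hypothetical and only illustrate the idea: punctuation splits the text into tone groups, stressed words start feet, the tonic defaults to the last foot, and the tone group defaults to 1.

```python
import re

# Hypothetical stress table: word -> True if the word carries stress.
# The real system gets stresses from its pronouncing dictionary.
STRESSED = {"where": True, "is": False, "the": False,
            "source": True, "code": True}

# Hypothetical punctuation-to-tone-group mapping. Only the default
# (tone group 1) comes from the description above.
TONE_GROUP_FOR = {"?": 2}
DEFAULT_TONE_GROUP = 1


def split_tone_groups(text):
    """Split text at punctuation; each chunk becomes one tone group."""
    chunks = re.findall(r"([^.,;:?!]+)([.,;:?!]?)", text)
    return [(words.split(), punct) for words, punct in chunks
            if words.strip()]


def split_feet(words):
    """Start a new foot at every stressed word.

    Unknown words are treated as stressed; leading unstressed words
    form a foot of their own.
    """
    feet = []
    for word in words:
        if STRESSED.get(word.lower(), True) or not feet:
            feet.append([word])
        else:
            feet[-1].append(word)
    return feet


def annotate(text):
    """Return one record per tone group with feet and tonic index."""
    result = []
    for words, punct in split_tone_groups(text):
        feet = split_feet(words)
        result.append({
            "tone_group": TONE_GROUP_FOR.get(punct, DEFAULT_TONE_GROUP),
            "feet": feet,
            "tonic": len(feet) - 1,  # default: the last foot is the tonic
        })
    return result
```

With no punctuation at all, annotate("Where is the source code") yields a single tone group spanning the whole sentence, with tone group 1 and the last foot as the tonic, which is the default behaviour described above.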
> Am I correct in saying that there are no tools that directly access
> the TTS functionality?
I am not sure what this question really means. Accessing the
functionality of the TTS system is pretty well what Monet is all
about. The bit that is missing is Synthesizer, which allows the
tube itself to be manipulated directly. But Monet allows all the
other elements of the TTS system to be created/deleted/edited (well,
the deletion for some things like rules is pretty kludgy since they
are only renamed to a dead name). Even the intonation can be varied
by using the intonation window. Perhaps you can be more explicit.
> Is this part of the real-time Monet subsystem, and where is the
> source for that as well?
The original real-time monet is under
"trillium/ObjectiveC/Monet.realtime" in the archive:
http://cvs.savannah.gnu.org/viewcvs/gnuspeech/?root=gnuspeech
> Is there information somewhere that describes the organisation of
> the source code in the project?
Unfortunately, no, other than the self-documenting properties of the
development environment and language -- unless Craig has kept
whatever notes he made when he was writing the system.
> At this point I have only checked out the "current" repository,
> which is just the OS X port, correct?
Correct.
> What is the complete list of CVS repositories, and what part of the
> project is contained in each?
You can find this out by visiting the repository at the URL above.
The names are pretty well self-documenting, except where deliberate
obscurantism was used (to hide password files, for example), and the
directory names were not changed when the stuff was dumped on the
Savannah site.
If you wanted to modify and recompile on the NeXT, you'd do well to
consult with Craig &/or Len to see what extra paths need to be set up
so ProjectBuilder can find everything. The MusicKit is one obvious
component that needs to be accessible. Try compiling and see what
the system complains it can't find and then figure out where the
stuff is and add the paths to your search path. If you are only
dealing with Monet, you'd be better dealing with the Mac version and
that shouldn't cause any problem. If it does, Steve is your best
source of info. The NeXT is quite slow, of course. You should find
everything needed is on your NeXTStation if you really want to go
that route.
Hope this helps.
All good wishes.
david