[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Octal-dev] Fourier Transforms and samples
From: |
n_nelson |
Subject: |
Re: [Octal-dev] Fourier Transforms and samples |
Date: |
Fri, 19 May 2000 15:09:14 -0700 |
Steve Mosher wrote:
[ I've been thinking about the pitch-scaling / time-dilation issues,
[ and learning some cool stuff about math at the same time. Perform-
[ ing Short-Time Fourier Transform on each sample played causes per-
[ formance problems, but is the bulk of it in the transform itself?
[ Or is reassembly similarly intense? I'm going to look at the reas-
[ sembly process more closely, so I can see this for myself. Anyhow,
[ what I'm getting at is this: would it be worth it to store select-
[ ed samples as transformed? Less work would have to be done in re-
[ al-time... in fact, all STFT compliant machines could neglect to
[ reassemble their data between one-another. Regardless, the signal
[ must only be transformed once, and reassembled once, which is good
[ in situations where you want to (say) scale the pitch, then put it
[ through a freq-filter.
[ Does this work out well, or am I missing something?
Do you mean by performance: the processing time required (most like-
ly), or the resulting audio quality?
But before going onto these questions, it appears that you have im-
plemented or found an STFT implementation. If this is code/software,
where can it be found?
In terms of speed, if only the generating phase (reverse Fourier
Transform) from a predetermined Fourier description could be used,
the forward transform phase could be avoided. The STFT method de-
scribed (thanks to your previous web reference) on
(1) cnmat.CNMAT.Berkeley.EDU/~alan/MS-html/MSv2.html#RTFToC2
obtains a spectrum of Fourier partials for each FFT block (and ac-
cording to the several parameters used for STFT selected for a sub-
sequent reverse transform). As can be seen under close scrutiny in
Figure 5. Spectral surface created from the analysis of a short
trumpet note
there are a succession of waves going from left to right, each
rising to a crest and then falling. And then the time dimension is
from back to front such that an STFT block would be a horizontal
slice of the waves in Figure 5. The rising to a crest and falling of
these frequency values is also shown in the tables on
(2) www.dspdimension.com/html/pscalestft.html
in section 4: About The Choice of Stride.
The point here is that if one uses the raw output of the STFT as
suggested in (1) before attempting the additional precise frequency
detection procedure in (2), the large number of frequencies in the
sloping portions of the cresting waves seen in the figure from (1)
will put an enormous computing load on the reverse transform. The
additional forward transform work described in (2) to obtain the
precise frequency description will be significantly less for the
overall forward and reverse transform sequence and benefit a strate-
gy use predetermined Fourier descriptions as against attempting to
obtain them in a real-time manner. Essentially, the reverse trans-
form load is proportional to the number of simultaneous frequencies
to be generated. (2) also notes that the STFT needs to processing in
a substantially overlapping manner (multiple FFTs per sample) to ob-
tain a precise result and avoid _smearing_.
As just suggested, the forward transform process should provide a
precise Fourier description to minimize overall processing time but
will obtain a trade-off between processing time and precision of re-
sult--as seen in the increase of FFTs per sample--to minimize Fouri-
er description distortion (with respect to the original sound).
But finally, the STFT process will obtain a floor amount of distor-
tion (some necessary portion of distortion) because of the inherent
block nature of the FFT. This effect will be most apparent at high,
transient frequencies (discussed in the parameter trade-offs in
(1)).
As mentioned by Davíð B. Franzson, obtaining a precise Fourier de-
scription in something close to real-time will require a significant
computational ability. Fourier processing is highly parallel and may
be pipelined, such that I expect a Beowulf Cluster would approach
the real-time result. A one megahertz Beowulf node using AMD's
Athlon is closing in on $1,000. This kind of Fourier processing
equipment might be suitable for recording studios where the time re-
quired to record and then playback would minimize the processing de-
lay issue. The computationally intensive forward transform should be
able provide a simplified, precise Fourier description within the
record to playback time delay and such that the reverse transform
could be processed in real time. Or is someone already doing this?
Neil Nelson address@hidden
[Octal-dev] OCTAL and literate programming, Dave O'Toole, 2000/05/19