On Fri, 20 Jul 2007, John Gehman wrote:
In using integers from 0 to 40 for your xdata, and computing data
with (b/c) as part of your function, you're introducing an undefined
number into the whole process. Hence the residual will be undefined,
and the sum of the squares of the residuals will be undefined, and
the gradients at that point, etc etc. At a minimum try float f = i
+1.0 where the data is built, and in the _f and _df functions.
Yeah, somebody else pointed it out to me yesterday. I can't believe I
didn't spot that. What a schoolboy error. Worse, I only re-wrote it
in the
period/t instead of frequency format because I was finding it
easier to
read while I was getting it to work.
I now changed it using the frequency, so it both runs faster (due to
multiply being roughly 4 clock cycles instead of about 80 for a
divide)
and avoids the div/0 problems. :-)
That's surely the biggest problem. But you also might want to be
careful about mixing your float data types with the doubles assumed
by gsl structures. I didn't look closely, but was left with the
impression that they were mixed a bit in your code. You might be
getting away with it anyway, but just a word of caution.
The only thing that seems to require doubles is the x_init array.
Everything else seems to work OK with floats. My main motivations for
using floats are:
1) Smaller - more data fits in the cache
2) Faster
2.1) SSE1 cannot vectorize doubles
2.2) SSE2+ can vectorize doubles but only 2 at a time instead of 4
at a
time.
Also, with regard to the ultimate intent of fitting the data, I
presume this was just an incremental step in your development of the
fitting procedure, but insofar as this goes, once you clean up the
above, you'll still get a shite fit, as the harmonic function, is not
described well by the data sampling.
Indeed I am aware of this, but it depends on the data, and it
depends on
the initial guesstimate. I have a reasonably good guesstimator
implemented, and I am finding that typically the error between the
fit and
the data falls to within 5-10% in most cases. But this may just
mean that
my data is quite clean (which, arguably, I expected). But you do
have a
point - my worst case data may well be much more noisy.
Which brings me to the next question. I have my own home-brewed solver
implemented and that works by steepest descent annealing (i.e. when it
finds a local minimum it ups the resolution). I'm finding that
overall it
fits better solutions - possibly because noise in the data somtimes
upsets
the algebraic gradient descent and it gets stuck in a local
minimum, and
the local-inspection approach with increasing precision may be more
robust. The GSL implementation also seems slower on my current
platform,
despite generally managing to descend in considerably fewer
iterations (up
to the point where it stops being able to fit a better solution -
but it
stops earlier than my own algorithm).
Now, granted, the speed could be an implementation issue (my code
was made
with ICC's auto-vectorization in mind), but are there parameters on
the
solver function that could be tweaked, other than the delta checker
sensitivity?
Also, how does sigma fit into all this? I have tried it with various
settings and it didn't seem to make any difference, so I just set
it to 1
and removed it from all the equations to save the seemingly needless
divide.
Finally - where is gsl_matrix_float_set() implemented? grep only
seems to
find it in the header files. I wanted to have a closer look at the
code to
see if I could do anything to help it vectorize with ICC (assuming it
doesn't already).
Thank you all for helping, I really appreciate it.
Gordan
_______________________________________________
Help-gsl mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/help-gsl