octave-maintainers

Re: Contribution to the optimization toolbox


From: Michael Creel
Subject: Re: Contribution to the optimization toolbox
Date: Mon, 7 Sep 2009 10:57:27 +0200

On Sun, Sep 6, 2009 at 9:55 AM, Jaroslav Hajek <address@hidden> wrote:
>
> Partially. I discovered the reason why fminunc was so much worse: it
> was the factor = 100 setting in fminunc. This was taken from fsolve
> and the original MINPACK and means that the starting trust region is
> 100 times the size of norm(x). While this may be OK for fsolve, which
> starts with a fresh, correct Jacobian, it seems to make little sense
> for fminunc, which starts with a Hessian approximation equal to the
> identity matrix (and is often quite wrong), so essentially the
> function starts from a 100x bigger random guess. I changed the default
> values to factor = 1 for fsolve and an even more conservative factor =
> 0.1 for fminunc. With this new setting, I get from your script:
>
> ans =
>
>   0.221018   1.000000   0.077295   1.000000
>
> I also tried dim = 10:
> ans =
>
>   0.52277   1.00000   0.21817   1.00000
>
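
For reference, the effect of the factor setting on the starting trust
region is easy to see directly; this is just a sketch of the
factor * norm(x) rule described above, not code taken from fminunc:

x0 = 2 * rand (5, 1);          # a random start like the ones in the script below
delta_old = 100 * norm (x0)    # old default: initial radius 100x the size of the guess
delta_new = 0.1 * norm (x0)    # new fminunc default: much more conservative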

Great! By the way, the version of fminunc in the 3.2.2 release
candidate has a typo in the guarded-evaluation code near the end:
an obvious jx/gx mix-up. I'm trying to compile a checkout of the
development version to see if I can replicate your results, but the
build fails with:
In file included from ./txt-eng-ft.h:28,
                 from ./gl-render.h:43,
                 from ./DLD-FUNCTIONS/fltk_backend.cc:60:
/usr/include/ft2build.h:56:38: error: freetype/config/ftheader.h: No
such file or directory
In file included from ./gl-render.h:43,
                 from ./DLD-FUNCTIONS/fltk_backend.cc:60:
./txt-eng-ft.h:29:10: error: #include expects "FILENAME" or <FILENAME>
In file included from ./gl-render.h:43,
                 from ./DLD-FUNCTIONS/fltk_backend.cc:60:
./txt-eng-ft.h:76: error: ‘FT_Face’ does not name a type
g++ -c                    -fPIC -I. -I.. -I../liboctave -I../src
-I../libcruft/misc -DHAVE_CONFIG_H -O3 -march=native -funroll-loops
-Wall -W -Wshadow -Wold-style-cast -Wformat -g -O2 -pthread
./DLD-FUNCTIONS/gammainc.cc -o pic/gammainc.o
make[2]: *** [pic/fltk_backend.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: Leaving directory `/home/michael/Desktop/octave/src'
make[1]: *** [src] Error 2
make[1]: Leaving directory `/home/michael/Desktop/octave'
make: *** [all] Error 2
address@hidden:~/Desktop/octave$


I have the fltk and ftgl libraries installed. Can I disable the fltk
stuff at configure time?


>
> so a lot of time is taken by the finite differences. I'm not sure
> bfgsmin uses the analytic gradient at all, though; is there any way to
> enforce it?

This version shows how to use analytic or numeric gradients with
bfgsmin (set replications to 1 and control = {100,3} for full
verbosity):


##################### cut here #####################
1;
# In the calls to the minimization functions below, use "objective" for
# the analytic gradient; otherwise use "objective2".
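# (rosenbrock is the Rosenbrock test function, available e.g. in the
# octave-forge optim package alongside bfgsmin)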

# example obj. fn.: with gradient
function [obj_value, gradient] = objective(x)
        [obj_value, gradient] = rosenbrock(x);
endfunction

# example obj. fn.: no gradient
function obj_value = objective2(x)
        obj_value = rosenbrock(x);
endfunction


dim = 5;
replications = 100;
results = zeros(replications,4);
ub = 2;
lb = 0;
control = {100,0};  # set the second number to 1, 2, or 3 for
                    # increasing levels of bfgsmin verbosity

for i = 1:replications
        x = (ub-lb).*rand(dim,1) + lb;
        tic;
        [theta, obj_value, convergence] = bfgsmin("objective2", {x}, control);
        results(i,1) = toc;
        results(i,2) = norm(theta - ones(dim,1)) < 1e-5;

        tic;
        [theta, obj_value] = fminunc("objective2", x);
        results(i,3) = toc;
        results(i,4) = norm(theta - ones(dim,1)) < 1e-5;
endfor

# columns: bfgsmin time, bfgsmin converged, fminunc time, fminunc converged
mean(results)
##################### cut here #####################



> Maybe there are also changes to fminunc that didn't yet get to 3.2.x.
> I'll check. I'm working with optim-1.0.6; I tried the SVN version but
> I can't get it to work: it seems the revision 5818
> http://octave.svn.sourceforge.net/viewvc/octave?view=rev&revision=5818
> broke things for me; I tried to fix it (including a screwed-up commit
> where my debug lines got in), but I wasn't successful yet. I think
> this is the only change since 1.0.6.


I hadn't tried out the OF version of __bfgsmin.cc since those changes
were made. It didn't work for me, either. I checked in a working
version this morning. Thanks for the heads up.

>
>> Great. But to facilitate testing and to avoid having redundant
>> algorithms, it would be nice to have a set of test problems to work
>> with. For example, this little test shows that fminunc is getting a
>> lot faster with the development version of Octave, and could probably
>> be made a little more robust with a little work. I'll have a look at
>> the fminunc code and see if I can make some suggestions. If it gets as
>> robust as bfgsmin, and is faster, then bfgsmin will probably need to
>> be retired.
>
> Well I'm not sure about that. The point is that bfgsmin also contains
> the LBFGS update. fminunc uses a factorized direct BFGS update (i.e.
> updating the Cholesky factor of the Hessian, via cholupdate), so one
> step of it is O(N^2) and the memory is also O(N^2). This is equal to
> the direct updating of the inverse employed by bfgsmin; a possible
> advantage of fminunc is that all the hidden intelligence of Octave's
> matrix division is employed, including the check for ill-conditioning.
> However, for large systems, LBFGS is bound to win; it's asymptotically
> superior.
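
For reference, the factorized update described above fits in a few lines
of Octave; this is only a sketch of the technique (B = R'*R, step s,
gradient change y), not the actual fminunc code:

## One factorized BFGS update: O(N^2) work and O(N^2) storage.
## R is the upper Cholesky factor of the current Hessian approximation
## B = R'*R; s is the step taken, y the change in gradient (y'*s > 0 assumed).
function R = bfgs_factored_update (R, s, y)
  Rs = R * s;
  u = (R' * Rs) / norm (Rs);    # B*s / sqrt(s'*B*s), since s'*B*s = norm(R*s)^2
  v = y / sqrt (y' * s);
  R = cholupdate (R, v, "+");   # rank-one update:   + y*y' / (y'*s)
  R = cholupdate (R, u, "-");   # rank-one downdate: - (B*s)*(B*s)' / (s'*B*s)
endfunction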

Right, I forgot about lbfgs; I don't use it that often.

> The optimization arsenal provided by Octave & OctaveForge does not
> need to be orthogonal, just broad enough.
>
> On the whole, I'm now convinced that optimization algorithms are well
> suited to be implemented as m-files. Very often in real applications,
> the overhead of the algorithm itself is actually negligible, so one
> doesn't need to worry too much about performance. And m-files have the
> advantage that more people are able to play around with them, tweak
> and improve them, and share the improvements.
> Another advantage of m-files is their genericity; for instance,
> fminunc will compute in single precision if given single precision
> inputs. That may not be a big advantage, as single precision
> optimization is rare, but the general mechanism is the point. For
> instance, it took just a few small changes to make fsolve able to
> solve complex (holomorphic) nonlinear systems, using a complex Newton
> method.
> On the other hand, it's essential to have good building blocks for the
> m-files. The choice between high-level code and building block,
> between m-file and .oct file, is an art.

I think that you're right, given that the .m file implementations are
getting faster. I will probably try to write lbfgs as an .m file at
some point. If that works out and it is more or less as fast as the
.oct version, then I think there is no reason to maintain the .oct
version. I am going to experiment with the development version of
Octave and see how things go.

Cheers, Michael

>
> regards
>
> --
> RNDr. Jaroslav Hajek
> computing expert & GNU Octave developer
> Aeronautical Research and Test Institute (VZLU)
> Prague, Czech Republic
> url: www.highegg.matfyz.cz
>


