Hi Riccardo,
On my machine, running OCTLES_TEST using 4p, I get:
Time Mem
11:21 am 694xxxx
11:27 am 696xxxx
11:32 am 698xxxx
11:37 am 702xxxx
In my research code (~5000 lines long), this memory increase becomes
severe. The type of simulations I do requires running my code for
several days using multiple processors. The simulations start with < 5
GB of RAM, grow up to the limit of the system (usually 24 GB), then
consume the swap memory, and finally crash.
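A minimal sketch of how time/memory samples like the ones above could be
logged automatically instead of being read off by hand. It assumes Linux
with a procps-style ps, that the processes of interest have a command name
starting with "octave", and that the summed resident set size in kB is an
acceptable stand-in for the Mem column. The function names and the log file
name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "<rss_kb> <command>" lines on standard input
# and print the total RSS (kB) of commands whose name starts with $1.
sum_rss_kb() {
    awk -v name="$1" 'index($2, name) == 1 { total += $1 } END { print total + 0 }'
}

# Print one "HH:MM <total kB>" sample covering all running octave processes.
# Assumption: ps accepts "-o rss= -o comm=" (procps) to suppress the header.
sample_octave_mem() {
    printf '%s %s\n' "$(date +%H:%M)" "$(ps -e -o rss= -o comm= | sum_rss_kb octave)"
}

# Usage (the 5-minute interval is an assumption matching the table above):
#   while true; do sample_octave_mem >> octave_mem.log; sleep 300; done
```

A steadily growing second column in the log, for a run whose working set
should be constant, would point to the same leak that "top" shows.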
On Tue, May 14, 2013 at 11:15 AM, Sukanta Basu <address@hidden> wrote:
> Hi Riccardo,
> I have done the experiments suggested by you. Please see the attached pdf file.
> You will be able to duplicate the memory problem if you could run the
> attached simple code on your machine and monitor the memory usage. It
> will keep growing with iterations.
> Best regards,
> Sukanta
> On Tue, May 14, 2013 at 10:14 AM, Riccardo Corradini <address@hidden> wrote:
>> Dear Sukanta,
>> since you are running np Octave instances, please take note of how many
>> bytes you lose with a simple octave -q --eval exit, then multiply that
>> by the number of Octave instances and see if you get a reasonable number.
>> Please have a look at
>> Try to run valgrind mpirun -np 12 octave --eval speedtest &>
>> octave_grindnp12.txt
>> and
>> valgrind mpirun -np 2 octave --eval speedtest &> octave_grindnp2.txt
>> Is the "definitely lost ..." figure of the first command a reasonable
>> multiple of the "definitely lost ..." figure of the second command?
>> I also added
>> clear comm comm_size my_rank;
>> clear oufile;
>> clear bandwidth;
>> clear
>> clear total_time;
>> clear end_time;
>> clear recv_data;
>> clear send_data;
>> clear zero_clock;
>> clear p ;
>> clear message_size ;
>> clear byte_size ;
>> clear tag start_time;
>> before MPI_Finalize();
>> Bests
>> Riccardo
>> ________________________________
>> From: Sukanta Basu <address@hidden>
>> To: Riccardo Corradini <address@hidden>
>> Cc: "address@hidden" <address@hidden>
>> Sent: Tuesday, 14 May 2013 13:56
>> Subject: Re: Fwd: Memory Leak & openmpi_ext
>> Dear Riccardo,
>> Thanks for checking. As I said before: I have no problem in "running"
>> openmpi_ext successfully. I get the same "success" message. The issue
>> I am facing is related to "memory leak". If you run speedtest with
>> valgrind, you will get the following output (in Valgrind.out):
>> ==24981== definitely lost: 42,031 bytes in 28 blocks
>> ==24981== indirectly lost: 25,802 bytes in 76 blocks
>> ==24981== possibly lost: 0 bytes in 0 blocks
>> ==24981== still reachable: 124,717 bytes in 603 blocks
>> ==24981== suppressed: 0 bytes in 0 blocks
>> valgrind --leak-check=yes -v --log-file=Valgrind.out mpirun -np 2
>> octave -q --eval speedtest &
>> The loss increases monotonically with the number of processors.
>> In other words, if you run speedtest with more processors,
>> the number of bytes lost will increase.
>> The other code (OCTLES_TEST.m) has a similar issue.
>> I also ran valgrind with massif:
>> valgrind --tool=massif --time-unit=ms mpirun -np 4 octave -q --eval
>> You will notice that the memory consumption increases with time. You can
>> use Linux's "top" command to monitor RAM usage.
>> Thanks for your continuous help and support.
>> Best regards,
>> Sukanta
>> On Tue, May 14, 2013 at 6:44 AM, Riccardo Corradini <address@hidden> wrote:
>>> Hi Sukanta
>>> I use Octave 3.6.4 and openmpi_ext 1.1.0.
>>> The version of openmpi is openmpi-1.4.3, compiled from source. speedtest
>>> works with no errors:
>>> mpirun -np 6 octave -q --eval speedtest
>>> my_rank: 1my_rank: 5
>>> my_rank: 0
>>> my_rank: 3
>>> my_rank: 4
>>> my_rank: 2
>>> Could you please use this version (1.4.3)?
>>> Add all the relevant info to your .bashrc:
>>> ompi_info=~/openmpi-1.4.3/bin/ompi_info
>>> if [ -f $ompi_info ]; then
>>> # ----------------------
>>> # Open-MPI software selection: choose the right ompi_info executable
>>> # All configuration information can be obtained from
>>> # ----------------------
>>> OMPIBIN=`$ompi_info -path bindir -parsable | cut -d: -f3`
>>> OMPILIB=`$ompi_info -path libdir -parsable | cut -d: -f3`
>>> OMPISCD=`$ompi_info -path sysconfdir -parsable | cut -d: -f3`
>>> export PATH=$OMPIBIN:$PATH
>>> fi
>>> If it works, then the bug is linked to the new version.
>>> Bests
>>> Riccardo
>>> ________________________________
>>> From: Sukanta Basu <address@hidden>
>>> To: Riccardo Corradini <address@hidden>; c. <address@hidden>
>>> Cc: "address@hidden" <address@hidden>
>>> Sent: Tuesday, 14 May 2013 5:23
>>> Subject: Re: Fwd: Memory Leak & openmpi_ext
>>> Hi,
>>> I do not know if the following info is helpful:
>>> (i) I monitored the "buffer" usage using the "top" command. The buffer
>>> number increases monotonically and at a constant rate (about 8
>>> buffers every refresh time).
>>> (ii) I installed the older version of openmpi_ext (1.0.2) via synaptic
>>> (Ubuntu 13.04 has this version as default). The memory leak persists.
>>> So, there is no difference between 1.0.2 and 1.1.0 in terms of memory
>>> leak.
>>> -Sukanta
>>> On Mon, May 13, 2013 at 7:27 PM, Sukanta Basu <address@hidden> wrote:
>>>> Hi Riccardo,
>>>> Thanks for your response. I appreciate it.
>>>> First of all, there is no mismatch in send-receive. I always get the
>>>> "correct" MPI send/receive. Unfortunately, the memory usage increases
>>>> continuously.
>>>> Deleting variables from the octave workspace does not solve this problem.
>>>> I have coded up very simple MPI programs here:
>>>> These octave codes clearly show the memory leak problem.
>>>> I do not have a good background in C++. So, debugging MPI_Send.cc and
>>>> MPI_Recv.cc is difficult for me.
>>>> Thanks again for your help!
>>>> Best regards,
>>>> Sukanta
>>>> On Mon, May 13, 2013 at 7:56 AM, Riccardo Corradini <address@hidden> wrote:
>>>>> Dear
>>>>> you may easily debug MPI_Send.cc and MPI_Recv.cc by adding some printf
>>>>> calls that print the integer flag info:
>>>>> int info = send_class (comm, args(0), tankrank, mytag);
>>>>> printf("info for sending class = %i",info);
>>>>> You may follow a bottom-up approach for every brick you want to send and
>>>>> receive. Eventually you will find the flag that detects what is not
>>>>> working as expected.
>>>>> The older versions contained lots of very similar pieces of code, but
>>>>> they were easier to debug in this naive way.
>>>>> I would also suggest that you clear each variable once you MPI_Send it.
>>>>> Try to get rid of variables that you do not need. The master will
>>>>> receive them, so this will save memory.
>>>>> Bests
>>>>> Riccardo
>>>>> ________________________________
>>>>> From: Sukanta Basu <address@hidden>
>>>>> To: address@hidden
>>>>> Sent: Sunday, 5 May 2013 19:56
>>>>> Subject: Fwd: Memory Leak & openmpi_ext
>>>>> FYI
>>>>> ---------- Forwarded message ----------
>>>>> From: Sukanta Basu <address@hidden>
>>>>> Date: Sun, May 5, 2013 at 1:51 PM
>>>>> Subject: Memory Leak & openmpi_ext
>>>>> To: "c." <address@hidden>, Octave Forge <address@hidden>,
>>>>> Carnë Draug <address@hidden>
>>>>> Hi Carlo and Carne,
>>>>> I hope all is well.
>>>>> A few months ago, you helped me with the openmpi_ext toolbox. The
>>>>> toolbox works like a charm. Unfortunately, I am facing a memory leak
>>>>> issue with this toolbox. I noticed this leak on all the platforms I
>>>>> have access to: Ubuntu (12.04, 12.10, and 13.04) and RedHat systems.
>>>>> The leak persists for all the recent versions of openmpi (1.6.2,
>>>>> 1.6.4, 1.7.1).
>>>>> Since my original code is too complicated for others to debug, I
>>>>> created a sample code for testing. Basically, I modified the
>>>>> speedtest.m file (written by Dr. Jeremy Kepner; MatlabMPI) to work in
>>>>> conjunction with openmpi_ext. I then ran this code with valgrind:
>>>>> valgrind --leak-check=yes -v --log-file=Valgrind.out mpirun -np 2
>>>>> octave -q --eval speedtest
>>>>> The summary of valgrind is:
>>>>> ==24981== definitely lost: 42,031 bytes in 28 blocks
>>>>> ==24981== indirectly lost: 25,802 bytes in 76 blocks
>>>>> ==24981== possibly lost: 0 bytes in 0 blocks
>>>>> ==24981== still reachable: 124,717 bytes in 603 blocks
>>>>> ==24981== suppressed: 0 bytes in 0 blocks
>>>>> I would appreciate it if you could help me out with identifying the
>>>>> memory leak in openmpi_ext. I am attaching the speedtest.m file and
>>>>> Valgrind.out file.
>>>>> Best regards,
>>>>> Sukanta
>>>>> --
>>>>> Sukanta
>>>>> Associate Professor
>>>>> North Carolina State University
>>>>> --
>>>>> Sukanta Basu
>>>>> Associate Professor
>>>>> North Carolina State University
>>>>> http://www4.ncsu.edu/~sbasu5/
>>>>> _______________________________________________
>>>>> Help-octave mailing list
Sukanta Basu
Associate Professor
North Carolina State University