Re: Enhancement to mtrace perl script

bug-glibc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Enhancement to mtrace perl script

From:	david
Subject:	Re: Enhancement to mtrace perl script
Date:	Sun, 04 Mar 2007 19:03:09 +0100
User-agent:	Thunderbird 1.5.0.9 (X11/20060911)

It seems that my caution in the original mail was justified: I was too hasty with sending the script.

I have corrected the mistakes and further improved it, and here is the revised version.

The log now looks like:

~/projects/svn-general/trunk/test> ~/mtrace --pea test_mtrace mtrace.txt ===================================================== 0x186a00 (1600000) bytes not freed in 1 allocation(s) ===================================================== Leaks listed by memory address ------------------------------ Address Size ( base10) at Caller ------------------ -------- (--------) ---------------------------------------------------------------------- 0x0000000000603460 0x186a00 ( 1600000) /home/david/projects/svn-general/trunk/test/test_mtrace.c:80 At termination leakage memory totalled 186a00 (1600000): termination leakage allocation by Caller ---------------------------------------- Caller percent Num calls bytes ( base 10) ---------------------------------------------------------------------- -------- --------- ------------ (------------) /home/david/projects/svn-general/trunk/test/test_mtrace.c:80 100% 1 0x186a00 ( 1600000) Leaks listed by Caller ---------------------- Caller percent Num calls bytes ( base 10) ---------------------------------------------------------------------- -------- --------- ------------ (------------) /home/david/projects/svn-general/trunk/test/test_mtrace.c:80 100% 1 0x186a00 ( 1600000) =============================================================== The peak allocation during this run was 186a00, (1600000) bytes =============================================================== At peak [1] allocated memory totalled 186a00 (1600000): peak [1] allocated allocation by Caller --------------------------------------- Caller percent Num calls bytes ( base 10) ---------------------------------------------------------------------- -------- --------- ------------ (------------) /home/david/projects/svn-general/trunk/test/../src/link_list.c:57 100% 100000 0x186a00 ( 1600000) At peak [2] allocated memory totalled 186a00 (1600000): peak [2] allocated allocation by Caller --------------------------------------- Caller percent Num calls bytes ( base 10) ---------------------------------------------------------------------- -------- --------- ------------ (------------) /home/david/projects/svn-general/trunk/test/test_mtrace.c:80 100% 1 0x186a00 ( 1600000)

David.

david wrote:

I have just spent a couple of days trying to puzzle out where all the memory went in a large
and complex application that crashed with "out of memory". It was a painful experience :-{

Lying in bed last night thought "there must be a better way" and went trawling in the linux
man pages. I came across mtrace and saw that the log mtrace made could also be used
to identify where all the memory was allocated.

I have thus enhanced the mtrace perl script adding a track of what the peak usage is and
the --peak option that additionally reports all the allocation locations that contribute to
that (those) peak(s). Here is a sample output for a very small test program:

~/mtrace --peak test_mtrace mtrace.txt
The peak allocation during this run was 3c0, (960) bytes

Memory not freed:
-----------------
           Address     Size             Caller
0x0000000000603460     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x0000000000603480     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x00000000006034a0     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x00000000006034c0     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x00000000006034e0     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x0000000000603500     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x0000000000603520     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x0000000000603540     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x0000000000603560     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x0000000000603580     0x10 (      16) at /home/david/projects/svn-general/trunk/test/../src/link_list.c:57
0x00000000006035a0    0x320 (     800) at /home/david/projects/svn-general/trunk/test/test_mtrace.c:46

At peak allocation - 3c0, (960) - these were the code locations where the memory was allocated.
Location                                                               bytes
/home/david/projects/svn-general/trunk/test/test_mtrace.c:46           320 (800) in 1 allocations
/home/david/projects/svn-general/trunk/test/../src/link_list.c:57      a0 (160) in 10 allocations

Caution: As yet this update has only had limited testing - I've not used it in anger yet! I would
appreciate any news of usage and any problems encountered.

This updated script is donated to the GNU libC project and thus is released under the license of
the original script - see the code header.

GNU Libc maintainer: please consider incorporating this update in the next glibc release.

Farewell little script - I hope you can find your way in the world :^)
  

#! /usr/bin/perl
eval "exec /usr/bin/perl -S $0 $*"
    if 0;

# Copyright (C) 1997-2004, 2005, 2006 Free Software Foundation, Inc.
# This file is part of the GNU C Library.
# Contributed by Ulrich Drepper <address@hidden>, 1997.
# Based on the mtrace.awk script.

# The GNU C Library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.

# The GNU C Library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.

# You should have received a copy of the GNU Lesser General Public
# License along with the GNU C Library; if not, write to the Free
# Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
# 02111-1307 USA.
#
# This script analyses the output from the GNU malloc module's mtrace function.
# It can help identify where memory is leaking. Additionally with the -peak
# option it can also identify where all the memory is being consumed at peak 
usage.
#
# History
# 2007-03-03 D.Ingamells Restructured code and added the -peak option.

use strict;

my $VERSION = "2.5";
my $PACKAGE = "libc";
my $progname = $0;

sub usage($)
{
    my $status = shift;

    print "Usage: mtrace [OPTION]... [Binary] MtraceData\n";
    print "  --help       print this help, then exit\n";
    print "  --version    print version number, then exit\n";
    print "  --peaks      also report allocations at peaks\n";
    print "\n";
    print "For bug reporting instructions, please see:\n";
    print "<http://www.gnu.org/software/libc/bugs.html>.\n";
    exit $status;
}

# We expect two arguments:
#   #1: the complete path to the binary
#   #2: the mtrace data filename
# Options are --peak.
# The usual options (--help and --version) are also recognized.

sub arglist(\@)
{
  my $ARGV = shift;
  my $peaks = 0;

    if ($ARGV->[0] =~ m/^--?v(e(r(s(i(o(n)?)?)?)?)?)?$/)
    {
        print "mtrace (GNU $PACKAGE) $VERSION\n";
        print "Copyright (C) 2006 Free Software Foundation, Inc.\n";
        print "This is free software; see the source for copying conditions.  
There is NO\n";
        print "warranty; not even for MERCHANTABILITY or FITNESS FOR A 
PARTICULAR PURPOSE.\n";
        print "Written by Ulrich Drepper <address@hidden>\n";

        exit 0;
    }
    elsif ($ARGV->[0] =~ m/^--?h(e(lp?)?)?$/)
    {
        usage 0;
    }
    elsif ($ARGV->[0] =~ m/^--?p(e(a(ks?)?)?)?$/)
    {
       $peaks = 1;
       shift @$ARGV;
    }
    elsif ($ARGV->[0] =~ /^-/)
    {
        print "$progname: unrecognized option `$ARGV->[0]'\n";
        print "Try `$progname --help' for more information.\n";
        exit 1;
    }
    else
    {
    }

    return $peaks;
}

##
# Print a neat header optionally bordered above or below by a sequence
# of the same length of the given symbol.

sub neatPrint($$$)
{
    my $pre = shift; # Single Character (or empty) for symbol above line.
    my $line = shift; # title to print.
    my $post = shift; # Single Character (or empty) for symbol below line.

    my $len = length $line;

    if ($pre)
    {
        printf "%s\n", ($pre x $len);
    }

    printf "%s\n", $line;

    if ($post)
    {
        printf "%s\n", ($post x $len);
    }
}

##
# determines the source code name for the given code address.

{
    my %cache;

    sub location($$\%)
    {
        my $str = shift;
        my $binary = shift;
        my $locs_hr = shift;

        return $str if ($str eq "");

        if ($str =~ /.*[[](0x[^]]*)]:(.)*/)
        {
            my $addr = $1;
            my $fct = $2;

            return $cache{$addr} if (exists $cache{$addr});

            if ($binary ne "" && open (ADDR, "addr2line -e $binary $addr|"))
            {
                my $line = <ADDR>;
                chomp $line;
                close (ADDR);
                if ($line ne '??:0')
                {
                    $cache{$addr} = $line;
                    return $cache{$addr};
                }
            }

            $cache{$addr} = $str = "$fct @ $addr";
        }
        elsif ($str =~ /^(.*):.*[[](0x[^]]*)]$/)
        {
            my $prog = $1;
            my $addr = $2;
            my $searchaddr;

            return $cache{$addr} if (exists $cache{$addr});

            if ($locs_hr->{$prog} ne "")
            {
                $searchaddr = sprintf "%#x", $addr - $locs_hr->{$prog};
            }
            else
            {
                $searchaddr = $addr;
                $prog = $binary;
            }

            if ($binary ne "" && open (ADDR, "addr2line -e $prog $searchaddr|"))
            {
                my $line = <ADDR>;
                chomp $line;
                close (ADDR);

                if ($line ne '??:0')
                {
                    $cache{$addr} = $line;
                    return $cache{$addr};
                }
            }

            $cache{$addr} = $str = $addr;
        }
        elsif ($str =~ /^.*[[](0x[^]]*)]$/)
        {
            my $addr = $1;

            return $cache{$addr} if (exists $cache{$addr});

            if ($binary ne "" && open (ADDR, "addr2line -e $binary $addr|"))
            {
                my $line = <ADDR>;
                chomp $line;
                close (ADDR);

                if ($line ne '??:0')
                {
                    $cache{$addr} = $line;
                    return $cache{$addr};
                }
            }

            $cache{$addr} = $str = $addr;
        }

        return $str;
    }
}

##
## Read all the command line arguments, act on any commands (like help and 
version)
## and return the remainder to the caller.
##

sub getArgs(\@)
{
    my $ARGV = shift;
    my $peaks = arglist(@$ARGV);

    my $data = $ARGV->[0];
    my $prog = "";

    if (1 == @$ARGV)
    {
      $data = $ARGV->[0];
    }
    elsif (2 == @$ARGV)
    {
        my $binary = $ARGV->[0];
        $data      = $ARGV->[1];

        if ($binary =~ /^.*[\/].*$/)
        {
          $prog = $binary;
        }
        else
        {
          $prog = "./$binary";
        }
    }
    else
    {
      die "Wrong number of arguments (" . scalar @ARGV . "), run $progname 
--help for help.";
    }

    return ($data, $prog, $peaks);
}

#
# Load all locations (as reported by env) in the executable
#

sub loadLocs($)
{
    my $prog = shift;
    my %locs;

    if ($prog && (open (LOCS, "env LD_TRACE_LOADED_OBJECTS=1 $prog |")))
    {
        while (my $l = <LOCS>)
        {
            chomp $l;
            if ($l =~ /^.*=> (.*) .(0x[0123456789abcdef]*).$/)
            {
                $locs{$1} = $2;
            }
        }

        close (LOCS);
    }

    return %locs;
}

##
## Find the memory allocated but not freed and return
## the amount of leakage and the high water mark of the memory.
##

sub findLeaks(\%\%\%$$)
{
    my $allocated_hr = shift;
    my $addrwas_hr   = shift;
    my $locs_hr      = shift;
    my $data         = shift;
    my $prog         = shift;

    my $currAllocated = 0;
    my $peakAllocated = 0;

    my $nr = 0;

    open(DATA, "< $data") || die "Cannot open mtrace data file";

    while (my $l = <DATA>)
    {
        chomp $l;

        my @cols = split / +/, $l;
        my $n;
        my $where;

        if ($cols[0] eq "@")
        {
            # We have address and/or function name.
            $where = $cols[1];
            $n = 2;
        }
        else
        {
            $where = "";
            $n = 0;
        }

        my $allocaddr = $cols[$n + 1];
        my $howmuch = hex($cols[$n + 2]);

        ++$nr;

        if (($cols[$n] eq "+") || ($cols[$n] eq ">"))
        {
            # report of an allocation of memory or a realloc new allocation.

            $currAllocated += $howmuch;
            $peakAllocated = $currAllocated if ($peakAllocated < 
$currAllocated);

            if (defined $allocated_hr->{$allocaddr})
            {
                if ($cols[$n] eq "+")
                {
                    printf ("+ %#018x Alloc %d duplicate: %s %s\n",
                            hex($allocaddr), $nr,
                            location($addrwas_hr->{$allocaddr}, $prog, 
%$locs_hr),
                            $where);
                }
                else
                {
                    printf ("+ %#018x Realloc %d duplicate: %#010x %s %s\n",
                            hex($allocaddr), $nr, $allocated_hr->{$allocaddr},
                            location($addrwas_hr->{$allocaddr}, $prog, 
%$locs_hr),
                            location($where, $prog, %$locs_hr));
                }
            }
            else
            {
                $allocated_hr->{$allocaddr} = $howmuch;
                $addrwas_hr->{$allocaddr}   = $where;
            }
        }
        elsif (($cols[$n] eq "-") || ($cols[$n] eq "<"))
        {
            # report of a free or realloc release

            $currAllocated -= $allocated_hr->{$allocaddr};

            if (defined $allocated_hr->{$allocaddr})
            {
                delete $allocated_hr->{$allocaddr};
                delete $addrwas_hr->{$allocaddr};
            }
            else
            {
                my $ty = ($cols[$n] eq "-") ? "Free" : "Realloc";

                printf ("- %#018x %s %d was never alloc'd %s\n",
                        hex($allocaddr), $ty, $nr,
                        location($where, $prog, %$locs_hr));
            }
        }
        elsif ($cols[$n] eq "=")
        {
            # Ignore "= Start".
        }
        elsif ($cols[$n] eq "!")
        {
            # Ignore failed realloc for now.
        }
    }
    close (DATA);

    return ($currAllocated, $peakAllocated);
}

##
# Produce a tabular report of allocating locations and the total memory
# allocated at this location (and not yet freed at the given moment) together
# with the count of the number of allocations that contributed to this size.

sub logCallerBytes($$\%\%\%$)
{
    my $title        = shift;
    my $total        = shift;
    my $whereUsed_hr = shift;
    my $whereCounts_hr = shift;
    my $locs_hr      = shift;
    my $prog         = shift;

    neatPrint("", $title, "-");
    my $fmt =  "%-70s %7d%% %9d %#12x (%12d)\n";
    my $fmtH = "%-70.70s %8.8s %9.9s %12.12s (%12.12s)\n";

    printf $fmtH, "Caller", "percent", "Num calls", "bytes", "base 10";

    my $under = "-" x 70;

    printf $fmtH, $under, $under, $under, $under, $under;

    foreach my $k (sort keys %$whereUsed_hr)
    {
        printf($fmt,
               location($k, $prog, %$locs_hr), int($whereUsed_hr->{$k}/$total * 
100),  
               $whereCounts_hr->{$k},
               $whereUsed_hr->{$k},
               $whereUsed_hr->{$k});
    }

    print "\n";
}

#
# Log the allocations of the given type (e.g. peak or termination) per
# code call to the allocation functions.
#

sub logAllocations($\%\%\%$$)
{
    my $type         = shift;
    my $whereUsed_hr = shift;
    my $whereCounts_hr = shift;
    my $locs_hr      = shift;
    my $prog         = shift;
    my $allocated    = shift;

    printf "\nAt %s memory totalled %x (%d):\n",
             $type, $allocated, $allocated;

    logCallerBytes("$type allocation by Caller", $allocated, 
                   %$whereUsed_hr, %$whereCounts_hr, %$locs_hr, $prog);
}

##
# produce a report of the allocations at the high water marks
# of the program execution.

sub reportPeaks($$$)
{
    my $data          = shift;
    my $prog          = shift;
    my $peakAllocated = shift;

    my %locs = loadLocs($prog);

    my %allocated;
    my %addrWas;
    my %whereUsed;
    my %whereCounts;

    my $currAllocated = 0;

    my $msg = sprintf("The peak allocation during this run was %x, (%d) bytes",
            $peakAllocated, $peakAllocated);

    neatPrint("=", $msg, "=");

    my $nr = 0;

    open(DATA, "< $data") || die "Cannot open mtrace data file";

    while (my $l = <DATA>)
    {
        chomp $l;

        my @cols = split / +/, $l;
        my $n;
        my $where;

        if ($cols[0] eq "@")
        {
            # We have address and/or function name.
            $where = $cols[1];
            $n = 2;
        }
        else
        {
            $where = "";
            $n = 0;
        }

        my $allocaddr = $cols[$n + 1];
        my $howmuch = hex($cols[$n + 2]);

        if (($cols[$n] eq "+") || ($cols[$n] eq ">"))
        {
            # report of an allocation of memory or the new memory part of a 
realloc.

            if (!defined $allocated{$allocaddr})
            {
                $allocated{$allocaddr} = $howmuch;
                $addrWas{$allocaddr}   = $where;

                $whereUsed{$where} += $howmuch;
                $whereCounts{$where} ++;

                $currAllocated += $howmuch;

                if ($currAllocated >= $peakAllocated)
                {
                    $nr++;
                    logAllocations("peak [$nr] allocated", %whereUsed, 
                                   %whereCounts, %locs, $prog, $peakAllocated);
                }
            }
        }
        elsif (($cols[$n] eq "-") || ($cols[$n] eq "<"))
        {
            # report of a free of the first half of a realloc.

            if (defined $allocated{$allocaddr})
            {
                $currAllocated -= $allocated{$allocaddr};

                $whereUsed{$addrWas{$allocaddr}} -= $allocated{$allocaddr};
                $whereCounts{$addrWas{$allocaddr}} --;

                delete  $whereUsed{$addrWas{$allocaddr}} if 
($whereUsed{$addrWas{$allocaddr}} <= 0);
                delete $whereCounts{$addrWas{$allocaddr}} if (!defined 
$whereUsed{$addrWas{$allocaddr}});

                delete $allocated{$allocaddr};
                delete $addrWas{$allocaddr};
            }
        }
        elsif ($cols[$n] eq "=")
        {
            # Ignore "= Start".
        }
        elsif ($cols[$n] eq "!")
        {
            # Ignore failed realloc for now.
        }
    }
    close (DATA);
}

##
# Produce a report of the leaked memory listed by
# the virtual address of the memory chunks.

sub logleaksByAddress(\%\%\%$)
{
    my $allocated_hr = shift;
    my $addrwas_hr   = shift;
    my $locs_hr      = shift;
    my $prog         = shift;

    neatPrint("", "Leaks listed by memory address", "-");

    my $fmt  = "%#018x %#8x (%8d) %s\n";
    my $fmtH = "%18.18s %8.8s (%8.8s) %-70.70s\n";

    printf $fmtH, "Address", "Size", "base10", "at Caller";

    my $under = "-" x 70;
    printf $fmtH, $under, $under, $under, $under;

    foreach my $addr (sort keys %$allocated_hr)
    {
        if (defined $allocated_hr->{$addr})
        {
            printf ($fmt, hex($addr), $allocated_hr->{$addr},
                    $allocated_hr->{$addr}, location($addrwas_hr->{$addr}, 
$prog, %$locs_hr));
        }
    }
    print "\n";
}

##
# Produce a report of the leaked memory grouped by
# the code line that allocated the memory chunks.

sub logLeaksByCode($\%\%\%$)
{
    my $leakedAllocated = shift;
    my $allocated_hr = shift;
    my $addrwas_hr   = shift;
    my $locs_hr      = shift;
    my $prog         = shift;

    my %SizeByLocation;
    my %CountByLocation;

    foreach my $addr (keys %$allocated_hr)
    {
        $SizeByLocation{$addrwas_hr->{$addr}} += $allocated_hr->{$addr};
        $CountByLocation{$addrwas_hr->{$addr}} ++;
    }

    logAllocations("termination leakage", %SizeByLocation, %CountByLocation, 
%$locs_hr, $prog, $leakedAllocated);

    logCallerBytes("Leaks listed by Caller", $leakedAllocated, %SizeByLocation, 
%CountByLocation, %$locs_hr, $prog);
}

##
# Log all the leak allocation locations.
# If the number of allocations is small print a report by address of allocated 
memory.
# Always produce a report of all code allocation locations with 
# counts and total memory allocated here.
#

sub logLeaks($\%\%\%$)
{
    my $leakedAllocated = shift;
    my $allocated_hr = shift;
    my $addrwas_hr   = shift;
    my $locs_hr      = shift;
    my $prog         = shift;

    my $anything = 0;

    my $num = scalar(keys %$allocated_hr);

    if ($num > 0)
    {
        $anything = 1;

        my $msg = sprintf "%#x (%d) bytes not freed in %d allocation(s)",
               $leakedAllocated, $leakedAllocated, $num;
        neatPrint("=", $msg, "=");

        if ($num < 200)
        {
            logleaksByAddress(%$allocated_hr, %$addrwas_hr, %$locs_hr, $prog);
        }

        logLeaksByCode($leakedAllocated, %$allocated_hr, %$addrwas_hr, 
%$locs_hr, $prog);
    }
    else
    {
        print "No memory leaks.\n";
    }

    return $anything;
}

##
# Find and report any memory leaks.
# Along the way determine the height of the high-water-mark for
# later reporting allocations at the high tides.
#

sub findAndReportLeaks($$)
{
    my $data          = shift;
    my $prog          = shift;

    my %locs = loadLocs($prog);

    my %allocated;
    my %addrwas;

    my ($leakedAllocated, $peakAllocated) = findLeaks(%allocated, %addrwas, 
%locs, $data, $prog);

    # Now print all remaining entries.

    my $anythingFound = logLeaks($leakedAllocated, %allocated, %addrwas, %locs, 
$prog);
  
    return ($peakAllocated, $anythingFound);
}

##
# The main program.
#

sub main()
{
    my ($data, $prog, $peaks) = getArgs(@ARGV);

    my ($peakAllocated, $anythingFound) = findAndReportLeaks($data, $prog);

    reportPeaks($data, $prog, $peakAllocated) if ($peaks);

    return $anythingFound != 0;
}

exit main();

[Prev in Thread]

Current Thread

[Next in Thread]

Enhancement to mtrace perl script, david, 2007/03/04
- Re: Enhancement to mtrace perl script, david <=

Prev by Date: Enhancement to mtrace perl script
Next by Date: MAC OS X porting
Previous by thread: Enhancement to mtrace perl script
Next by thread: MAC OS X porting
Index(es):
- Date
- Thread