gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gcl-devel] Profiling [ was Re: lisp reader enhancement ]


From: Camm Maguire
Subject: [Gcl-devel] Profiling [ was Re: lisp reader enhancement ]
Date: 04 Aug 2003 19:21:07 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

Greetings!  Just a followup note here in case you're still interested in
trying to analyze the performance.

1)  If you reran with the new gbc-time, I'd be interested to know
    what it was.

2) I've been looking at GCL's two profiling modes.  The first is based
   on the 'profil' call to libc.  It works, but unfortunately our
   documentation is misleading/wrong.  The idea is to:

        0) Apply the following two patches:

===================================================================
RCS file: /cvsroot/gcl/gcl/o/fat_string.c,v
retrieving revision 1.14
diff -u -r1.14 fat_string.c
--- fat_string.c        15 Feb 2003 00:38:28 -0000      1.14
+++ fat_string.c        4 Aug 2003 22:52:09 -0000
@@ -49,7 +49,7 @@
   if( type_of(start_address)!=t_fixnum ||   type_of(scale)!=t_fixnum)
     FEerror("Needs start address and scale as args",0);
 
-  profil((void *) (ar->ust.ust_self), (ar->ust.ust_dim),
+  profil(!(fix(start_address)*fix(scale)) ? NULL : (void *) 
(ar->ust.ust_self), (ar->ust.ust_dim),
         fix(start_address),fix(scale) << 8);
   RETURN1(start_address);
 }
--- ptable.h    12 Jul 2002 22:00:48 -0000      1.4
+++ ptable.h    4 Aug 2003 22:51:56 -0000
@@ -50,7 +50,7 @@
 #ifdef SPECIAL_RSYM
 struct string_address_table c_table;
 #else
-static struct bfd_link_info link_info;
+struct bfd_link_info link_info;
 #endif
 struct string_address_table combined_table;
 

        a) in your program, first run set-up-profile, defined in
        profile.lsp (should be in your gcl autoload path).  The first
        argument is the size of the profile array, with 1000000 being
        suggested in the docs -- this works for me.  What is not
        documented is the optional second argument specifying the
        maximum number of functions in your image.  I had to supply
        this to increase the default value when testing in maxima.
        Here is my call:

                :lisp (si::set-up-profile     1000000 8000)


                        Loaded c and other function addresses
                        Using profile-array length 1000000 
                        Use (si::prof 0 90) to start and (prof 0 0) to stop:
                        This starts monitoring at address 0 
                        thru byte (256/90)*(length *profile-array*)
                        (si::display-prof) displays the results
                        NIL

        b) run si::prof with the starting address you desire to
        examine, and a 'scale' parameter indicating how many counter
        profile array elements to allocate to each 256 bytes of code.
        You can see these addresses reported on loading binary
        modules, or you could use the DBEGIN value issued at configure
        time.  You could also try function-address.  We need to make
        it easier to just say 'profile my whole program', but what I
        did was find the address of _init in gdb, find the value of
        heap_end in gdb, figure out a good scale for this amount of
        memory to correspond to a 1000000 byte array, and issue the
        following call for maxima:

                :lisp (si::prof 134522412 15)

        c) run some intesive code, I did load("tests.lisp");

        d) Turn off the profiler

                :lisp (si::prof 0 0)

        e) display the results, probably using the same scale and
        start address as in b):

                :lisp (si::display-profile 134522412 15)

        My output follows.  There are a few gotchas in interpretation
        -- If a global function in GCL's core is separated from the
        next such function by a bunch of static functions, all counts
        for the static functions get lumped into the first's counter.
        The large values for 'enter_mark_origin' thus refers to the
        GBC time in the marker and sweeper located in between in the
        code.  Also needs cleaning up.

        All user functions are not thus afflicted.  But the core
        function counts are probably illustrative, as they may indicate
        inefficiencies in the lisp compiler's optimization/inlining of
        calls.  

        I short, I'd appreciate seeing your version of the following:

  0.10% (    1): strerror
  0.10% (    1): memmove
  0.39% (    4): IapplyVector
  0.39% (    4): Iinvoke_c_function_from_value_stack
  0.10% (    1): make_fixnum1
  0.10% (    1): pack_hash
  0.19% (    2): intern
  0.10% (    1): Lintern
  0.10% (    1): Lunuse_package
  0.10% (    1): Lparse_namestring
  0.10% (    1): fLrationalp
  0.10% (    1): fLfunctionp
  1.07% (   11): eql
  0.58% (    6): equal
  0.29% (    3): init_predicate_function
  0.10% (    1): edit_double
  0.78% (    8): write_object
  0.68% (    7): Lwrite_char
  0.10% (    1): princ
  0.19% (    2): init_prog
  0.58% (    6): read_object
  0.39% (    4): NIL
  0.19% (    2): NIL
  0.87% (    9): Lread_char
  0.10% (    1): rl_putc_em
  0.48% (    5): symbol_function
  0.10% (    1): Lelt
  1.55% (   16): elt
  0.39% (    4): elt_set
  0.87% (    9): length
  0.19% (    2): nreverse
  0.10% (    1): fSclear_connection_state
  0.19% (    2): coerce_to_string
  0.10% (    1): get_string_start_end
  0.19% (    2): Lstring_neq
  0.10% (    1): Lmake_string
  0.19% (    2): Lstring_right_trim
  0.10% (    1): structure_ref
  0.68% (    7): symbol_value
  1.84% (   19): getf
  0.10% (    1): get
  0.19% (    2): remf
  0.10% (    1): remprop
  0.19% (    2): Lgetf
  0.39% (    4): check_type_symbol
  0.10% (    1): check_type_string
  0.10% (    1): check_type_cons
  0.10% (    1): Ltype_of
  2.62% (   27): alloc_object
  3.30% (   34): make_cons
  0.39% (    4): alloc_relblock
  0.29% (    3): fLrow_major_aref
  0.10% (    1): fSaset1
  0.10% (    1): fSget_aelttype
  0.19% (    2): gset
  0.10% (    1): array_allocself
  0.10% (    1): fLfill_pointer
  0.48% (    5): setq
  0.10% (    1): fLmakunbound
  0.10% (    1): fSset_gmp_allocate_relocatable
  0.19% (    2): new_bignum
  0.10% (    1): normalize_big_to_object
  0.10% (    1): mul_int_big
  0.39% (    4): normalize_big
  0.10% (    1): big_minus
  0.19% (    2): maybe_replace_big
  0.39% (    4): integer_quotient_remainder_1
  0.19% (    2): parse_key_new_new
  0.19% (    2): Lchar_eq
  0.10% (    1): Lchar_neq
  0.10% (    1): Lchar_upcase
  0.10% (    1): cplus
  0.10% (    1): init_error
  1.94% (   20): funcall
  0.19% (    2): funcall_no_event
  0.10% (    1): super_funcall
  0.29% (    3): super_funcall_no_event
  0.10% (    1): readc_stream
  0.10% (    1): unreadc_stream
  0.58% (    6): writec_stream
  0.29% (    3): flush_stream
  0.39% (    4): read_fasl_data
  0.10% (    1): Lformat
  0.29% (    3): fLformat
  0.10% (    1): frs_sch_catch
  0.10% (    1): call_or_link
  0.48% (    5): c_apply_n
  3.39% (   35): call_proc_new
 31.62% (  326): enter_mark_origin
  0.10% (    1): perm_writable
  1.65% (   17): car
  1.65% (   17): cdr
  0.29% (    3): kar
  1.26% (   13): list
  0.19% (    2): listA
  0.10% (    1): append
  0.19% (    2): copy_list
  0.19% (    2): Lcdr
  0.10% (    1): cddr
  0.39% (    4): Llast
  0.39% (    4): Lmake_list
  0.10% (    1): Lrevappend
  0.29% (    3): nconc
  0.10% (    1): Lreconc
  0.10% (    1): fixnum_add
  0.19% (    2): number_plus
  0.19% (    2): number_minus
  0.19% (    2): get_gcd
  0.97% (   10): number_compare
  0.10% (    1): Lall_the_same
  0.10% (    1): number_expt
  0.10% (    1): GET-INSTREAM
  0.29% (    3): READ-FROM-STRING
  0.10% (    1): TYPEP
  0.10% (    1): SYSTEM::NORMALIZE-TYPE
  0.48% (    5): SYSTEM::KNOWN-TYPE-P
  0.29% (    3): SUBTYPEP
  0.10% (    1): CONCATENATE
  0.10% (    1): SYSTEM::BAD-SEQ-LIMIT
  0.10% (    1): REMOVE
  0.10% (    1): DELETE
  0.19% (    2): POSITION
  0.10% (    1): SORT
  0.10% (    1): WITH-HASH-TABLE-ITERATOR
  0.10% (    1): user_match
  0.10% (    1): ASSQR
  0.10% (    1): SYS-FREE-MEMORY
  0.19% (    2): MGET
  0.10% (    1): MACLISP-TYPEP
  0.19% (    2): BOTHCASE-IMPLODE
  0.10% (    1): LIST-STRING
  0.10% (    1): TYO
  0.39% (    4): TYI
  0.10% (    1): FILE-TO-STRING
  0.10% (    1): ALPHALESSP
  0.78% (    8): MEMQ
  0.10% (    1): DELQ
  0.29% (    3): SAFE-GET
  0.19% (    2): GETL
  0.10% (    1): SUB
  0.19% (    2): ASSOL
  0.10% (    1): ASSOLIKE
  0.10% (    1): MAXIMA-SUBSTITUTE
  0.19% (    2): GETOP
  0.19% (    2): GETOPR
  0.10% (    1): $LISTP
  0.10% (    1): SPANG1
  0.10% (    1): $GETCHAR
  0.10% (    1): MEVALARGS
  0.10% (    1): SAFE-MGETL
  1.36% (   14): MEVAL1
  0.10% (    1): GETL-LM-FCN-PROP
  0.10% (    1): MGETL
  0.10% (    1): $SQRT
  0.10% (    1): $BINOMIAL
  0.19% (    2): ONEP1
  0.39% (    4): ZEROP1
  0.19% (    2): MNUMP
  0.10% (    1): RATNUMP
  0.10% (    1): MEXPTP
  0.10% (    1): $RATNUMP
  0.10% (    1): SPECREPCHECK
  0.10% (    1): CONSTANT
  0.10% (    1): MAXIMA-CONSTANTP
  0.10% (    1): MXORLISTP1
  0.10% (    1): CONSTFUN
  0.10% (    1): FREE
  0.58% (    6): SIMPLIFYA
  0.29% (    3): EQTEST
  0.10% (    1): RULECHK
  0.10% (    1): TIMESK
  0.10% (    1): PLS
  0.10% (    1): TESTT
  0.10% (    1): TESTTNEG
  0.19% (    2): SIMPTIMES
  0.10% (    1): STIMEX
  0.10% (    1): TMS
  0.10% (    1): SIGNUM1
  0.19% (    2): EXPTRL
  0.10% (    1): SIMPEXPT
  0.29% (    3): TIMESIN
  0.58% (    6): ALIKE1
  0.10% (    1): ALIKE
  0.10% (    1): ORDHACK
  0.10% (    1): RATNUMERATOR
  0.39% (    4): NTHKDR
  0.10% (    1): $MKEY
  0.10% (    1): MEVAL*
  0.10% (    1): KILL1
  0.10% (    1): REMALIAS
  0.10% (    1): STRING*
  0.10% (    1): $NOUNIFY
  0.10% (    1): PFLATTEN
  0.10% (    1): PMINUSP
  0.10% (    1): PQUOTIENT
  0.19% (    2): ALGORD
  0.10% (    1): PSIMP
  0.10% (    1): PCETIMES1
  0.10% (    1): PCTIMES
  0.10% (    1): PEXPTSQ
  0.58% (    6): $APROPOS
  0.10% (    1): DIVISORS
  0.10% (    1): COMPLETEVECTOR
  0.10% (    1): CFACTOR
  0.19% (    2): SAVEFACTORS
  0.10% (    1): FACTOROUT1
  0.10% (    1): PFACTOR1
  0.10% (    1): PSQUOREM1
  0.10% (    1): OLDCONTENT1
  0.10% (    1): PCONTENTZ
  0.19% (    2): $RATSIMP
  0.10% (    1): $FACTOR
  0.10% (    1): RATSETUP1
  0.10% (    1): GENSYM-READABLE
  0.10% (    1): ORDERPOINTER
  0.10% (    1): RATREP*
  0.10% (    1): RATF
  0.29% (    3): PREP1
  0.10% (    1): NEWVAR1
  0.10% (    1): PDISREP
  0.10% (    1): PDISREP!
  0.10% (    1): PDISREP2
  0.10% (    1): $RATDISREP
  0.10% (    1): CDISREP
  0.10% (    1): NEWVAR
  0.10% (    1): RADSORT
  0.10% (    1): RDIS*
  0.10% (    1): PRODCOEF
  0.19% (    2): FREEOF
  0.10% (    1): FLGREAT1
  0.19% (    2): STRING1
  0.19% (    2): MSIZE
  0.10% (    1): MSIZE-ATOM
  0.10% (    1): MSZ
  0.10% (    1): MSZ-MEXPT
  0.29% (    3): STRMDOIN
  0.10% (    1): NFORMAT
  0.39% (    4): FORM-MTIMES
  0.10% (    1): IMEMBER
  0.10% (    1): ALPHABETP
  0.10% (    1): ASCII-NUMBERP
  0.10% (    1): TYI-PARSE-INT
  0.10% (    1): ALIASLOOKUP
  0.19% (    2): GOBBLE-WHITESPACE
  0.10% (    1): PARSER-ASSOC
  0.19% (    2): READ-COMMAND-TOKEN-AUX
  0.10% (    1): READLIST
  0.19% (    2): SCAN-DIGITS
  0.10% (    1): COLLISION-CHECK
  0.10% (    1): MREAD
  0.29% (    3): ADD-LINEINFO
  0.10% (    1): TRIGINT
  0.10% (    1): $LISTOFVARS
  0.19% (    2): DIMENSION
  0.10% (    1): MAKESTRING
  0.10% (    1): DIMENSION-SUPERSCRIPT
  0.10% (    1): CHECKFIT
  0.19% (    2): CHECKBREAK
  0.19% (    2): OUTPUT
  0.10% (    1): DRAW-LINEAR
  0.10% (    1): D-SUMSIGN
  0.48% (    5): EXPLODEN
  0.10% (    1): CNTP
  0.10% (    1): LEARN
  0.10% (    1): LIKE
  0.10% (    1): DEQ
  0.10% (    1): TESTA*
  0.10% (    1): E1-
  0.10% (    1): PSTIMES
  0.19% (    2): PSEXPT1
  0.10% (    1): GET-ARRAY-POINTER
  0.10% (    1): SIMPLIM%INVERSE_JACOBI_DS
  0.10% (    1): %$ETEST
Total ticks 1031
134522412

        f) I'd like suggestions on how to improve the usability here.

3)  The gprof mode will require a bit more work, as recent gprof in
    Linux calls monstartup authmatically in gcrt0.o, foiling the
    existing strategy in the code.

Take care,

                

-- 
Camm Maguire                                            address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah




reply via email to

[Prev in Thread] Current Thread [Next in Thread]