help-make

RE: GNU Make on Linux Feeding All Commands Through ksh


From: Rinehart, Raleigh
Subject: RE: GNU Make on Linux Feeding All Commands Through ksh
Date: Thu, 4 Dec 2008 12:13:19 -0600


> -----Original Message-----
> From: address@hidden [mailto:help-make-
> address@hidden On Behalf Of Steve Waltner
> Sent: Thursday, December 04, 2008 10:06 AM
> To: address@hidden
> Subject: Re: GNU Make on Linux Feeding All Commands Through ksh
> 
> After two months, I'm finally looking into this issue again. Gotta get
> it working by the end of the year since migrating builds to Linux
> (more specifically the faster x86 hardware) is one of my business
> objectives for the year... :-)
> 
> Thanks to both Philip and David for their original responses. I've
> done some more digging and found the following information:
> 
> - Philip was correct in that both platforms were running all commands
> through /bin/ksh, even though it didn't look like that was the case on
> Solaris. The Solaris ksh does in fact use the exec(2) system call when
> called with the -c option, so it replaces itself with the command and
> doesn't show up in the process tree. This means the "difference"
> between Solaris and Linux doesn't appear to be a significant one.
> 
> - I am running a stock version of GNU make 3.81 on both platforms that
> I personally compiled from the same source code download. There
> shouldn't be anything funky about differences between the copies of
> GNU make, especially from the possibility of RedHat making changes to
> the source code before compiling and including it in their Linux
> distribution.
> 
> - Per David's comment about whether the "SHELL = /bin/ksh" definition
> was actually necessary or not, I did try removing that from the
> makefiles, but still got the same result of not "enough" simultaneous
> jobs running for the build while running on Linux. This did make the
> tree of processes on Linux look like "gmake -> gmake -> gmake ->..."
> instead of "gmake -> ksh -> gmake -> ksh -> gmake ->...", but since
> this didn't change the behavior of the root issue on Linux, I removed
> that change from the makefiles. I do remember the developer that did
> most of the work on the makefiles making the comment about /bin/sh on
> Solaris being junk and switching to /bin/ksh. I didn't try a build on
> Solaris, and never let the Linux build run to completion to see whether
> it actually produced a valid build. Since removing the SHELL setting
> didn't fix the root issue, I'll set that aside for now and maybe look
> at it later.
> 
> The main question that remains would be: Is there a way to debug and
> follow the token check-in/check-out process that is used internally in
> GNU make to try and see what's going on here? I can work on trying to
> track down what's going wrong, but without a way to get visibility
> into the process, I'd just be making random changes to the makefiles,
> which isn't going to be very productive.
> 

You can try Remake from here to do some tracing and debugging:
http://bashdb.sourceforge.net/remake/

Also, John Graham-Cumming has some helpful info and macros for tracing
GNU make macros and rule execution in his "GNU Make Unleashed" book
and on his Ask Mr. Make website.
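
For the token check-in/check-out question specifically, here is a rough
sketch of something you could try. It assumes your gmake 3.81 accepts
--debug=j (job-level debug output) and that its messages about obtaining
and releasing jobserver tokens contain the word "token"; I have not
checked this against your build. The MAKE and LOG defaults and the
function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: capture GNU make's job-level debug output and keep the
# lines that mention jobserver tokens. MAKE and LOG are placeholder
# defaults (assumptions), not anything taken from the actual build.

MAKE=${MAKE:-gmake}
LOG=${LOG:-make-jobs.log}

# Filter stdin down to lines mentioning "token" (case-insensitive).
# "|| true" keeps the exit status at 0 when nothing matches.
token_lines() {
    grep -i 'token' || true
}

# Run make with --debug=j plus any extra arguments, save the complete
# output in $LOG, then print only the token-related lines from it.
# Returns make's own exit status.
trace_tokens() {
    "$MAKE" --debug=j "$@" > "$LOG" 2>&1
    status=$?
    token_lines < "$LOG"
    return "$status"
}

# Example (not run automatically): trace_tokens -j 4
```

Running something like "trace_tokens -j 4" from the top of the tree
should leave the full output in make-jobs.log and print only the token
lines, which may show where the tokens go as the recursion gets deeper.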

Hope this helps,
-raleigh


> Steve
> 
> On Oct 8, 2008, at 11:31 AM, Steve Waltner wrote:
> 
> > I'm working on completing the migration of our build process for a
> > rather large software project from Solaris SPARC to Linux x86 and have
> > run into an issue. This process is using GNU Make 3.81 on a Solaris 9
> > box and a RedHat AS 4.7 x86_64 system. The major symptom that I've
> > noticed is that the Linux system doesn't really honor the "-j 4"
> > option we typically build with. It quickly degrades into a
> > single-threaded build. Items of note include:
> >
> > - The builds are being passed to a cluster of systems running Sun Grid
> > Engine, so the "-j 4" option isn't passed at the command line. The
> > first build command looks for a $NSLOTS environment variable and
> > changes the MAKEFLAGS as appropriate.
> >
> > - I am running both builds with a copy of GNU Make that I compiled
> > from the same source code. I am not using the copy of make that was
> > included with the RedHat or Solaris systems.
> >
> > - The makefiles include settings like "override SHELL = /bin/ksh" to
> > force all shell interpretations to go through the ksh.
> >
> > It appears as though the Linux system feeds every command through a
> > ksh process, while the same function on the Solaris system calls the
> > command (whether it is a ccpentium, ccarm, or make command) directly.
> > I see this when looking at the process hierarchy using the pstree
> > command. The examples below were both done on a build with NSLOTS=4
> > (i.e., -j 4). You can see the Solaris build running three ccpentium
> > processes at the time this snapshot was taken, while the Linux build
> > has only spawned a single ccpentium command.
> >
> > Solaris:
> > ====================
> > ictgrid004:~> sgetree
> > -+- 00278 sgeadmin 9:36 /soft/gridware-wic/sge/6.0u6/bin/sol-sparc64/
> > sge_execd
> >  |-+- 18341 sgeadmin sge_shepherd-467543 -bg
> >  | \-+- 18342 root /soft/gridware-wic/sge/6.0u6/utilbin/sol-sparc64/
> > rshd -l
> >  |   \-+- 18343 swaltner /soft/gridware-wic/sge/6.0u6/utilbin/sol-
> > sparc64/qrsh_
> >  |     \-+- 18347 swaltner tcsh -c hostname ; gmake
> >  |       \-+- 18353 swaltner /soft/gnu/make/3.81/bin/gmake
> >  |         \-+- 26740 swaltner /soft/gnu/make/3.81/bin/gmake
> > Platform/.make App
> >  |           |-+- 26757 swaltner /soft/gnu/make/3.81/bin/gmake -C
> > Platform MKLe
> >  |           | \-+- 26819 swaltner /soft/gnu/make/3.81/bin/gmake
> > Boot/.make Sys
> >  |           |   \-+- 26906 swaltner /soft/gnu/make/3.81/bin/gmake -C
> > System MK
> >  |           |     \-+- 27024 swaltner /soft/gnu/make/3.81/bin/gmake
> > BSP/.make
> >  |           |       \-+- 10502 swaltner /soft/gnu/make/3.81/bin/
> > gmake -C DQ MK
> >  |           |         \-+- 10560 swaltner /soft/gnu/make/3.81/bin/
> > gmake DQ MKL
> >  |           |           \-+- 11323 swaltner ccpentium -c -o dq.o -
> > fmessage-len
> >  |           |             \--- 11331 swaltner /soft/windriver/gpp/
> > 3.4/gnu/3.4.
> >  |           \-+- 26788 swaltner /soft/gnu/make/3.81/bin/gmake -C
> > Application M
> >  |             \-+- 26853 swaltner /soft/gnu/make/3.81/bin/gmake
> > RAID/.make Deb
> >  |               |-+- 06868 swaltner /soft/gnu/make/3.81/bin/gmake -C
> > Debug MKL
> >  |               | \-+- 06928 swaltner /soft/gnu/make/3.81/bin/gmake
> > ccvm_dbg/.
> >  |               |   \-+- 11524 swaltner /soft/gnu/make/3.81/bin/
> > gmake -C safe_
> >  |               |     \-+- 11585 swaltner /soft/gnu/make/3.81/bin/
> > gmake safe_d
> >  |               |       \-+- 11639 swaltner ccpentium -c -o
> > safeSymbolDebug.o
> >  |               |         \--- 11642 swaltner /soft/windriver/gpp/
> > 3.4/gnu/3.4.
> >  |               |-+- 26909 swaltner /soft/gnu/make/3.81/bin/gmake -C
> > RAID MKLe
> >  |               | \-+- 27055 swaltner /soft/gnu/make/3.81/bin/gmake
> > cache/.mak
> >  |               |   \-+- 08612 swaltner /soft/gnu/make/3.81/bin/
> > gmake -C hid M
> >  |               |     \-+- 08728 swaltner /soft/gnu/make/3.81/bin/
> > gmake hid MK
> >  |               |       \-+- 11452 swaltner ccpentium -c -o
> > hidLUDispatch.o -f
> >  |               |         \--- 11457 swaltner /soft/windriver/gpp/
> > 3.4/gnu/3.4.
> >  |               \--- 11635 swaltner /soft/gnu/make/3.81/bin/gmake -C
> > MAPI MKLe
> > ====================
> >
> > Linux:
> > ====================
> > ictgrid005:~/ccm_wa/symbios/RAIDCore-swaltner_1636/
> > dev_09q4_fc_7091-68.10.00.03> ~/pstree-2.32/pstree 3543
> > -+= 03543 root /soft/gridware-wic/sge/6.0u6/bin/lx24-amd64/sge_execd
> >  \-+= 21589 root sge_shepherd-467474 -bg
> >    \-+= 21590 root /soft/gridware-wic/sge/6.0u6/utilbin/lx24-amd64/
> > rshd -l
> >      \-+= 21591 swaltner /soft/gridware-wic/sge/6.0u6/utilbin/lx24-
> > amd64/qrsh_starter /var/spool/sgeexecd/ictgrid005/active_jobs/467474.
> >        \-+= 21603 swaltner tcsh -c hostname ; gmake
> >          \-+- 21612 swaltner gmake
> >            \-+- 04707 swaltner /bin/ksh -c gmake Platform/.make
> > Application/.make  MKLevel=$(( 0 + 1 )) MKopts='';
> >              \-+- 04708 swaltner gmake Platform/.make
> > Application/.make MKLevel=1 MKopts=
> >                \-+- 04787 swaltner /bin/ksh -c gmake  -C
> > Application    MKLevel=$(( 1 + 1 ))
> >                  \-+- 04788 swaltner gmake -C Application MKLevel=2
> >                    \-+- 04868 swaltner /bin/ksh -c gmake RAID/.make
> > Debug/.make MAPI/.make TAPI/.make Spy/.make Stpsim/.make FBDT/.make
> >                      \-+- 04870 swaltner gmake RAID/.make Debug/.make
> > MAPI/.make TAPI/.make Spy/.make Stpsim/.make FBDT/.make IT/.make D
> >                        \-+- 04947 swaltner /bin/ksh -c gmake  -C
> > RAID    MKLevel=$(( 3 + 1 ))
> >                          \-+- 04948 swaltner gmake -C RAID MKLevel=4
> >                            \-+- 05074 swaltner /bin/ksh -c gmake
> > cache/.make iop/.make htd/.make hid/.make icn/.make rtr/.make
> > rpa/.make
> >                              \-+- 05075 swaltner gmake cache/.make
> > iop/.make htd/.make hid/.make icn/.make rtr/.make rpa/.make Fibre/.ma
> >                                \-+- 11193 swaltner /bin/ksh -c gmake
> > -C vdm    MKLevel=$(( 5 + 1 ))
> >                                  \-+- 11194 swaltner gmake -C vdm
> > MKLevel=6
> >                                    \-+- 18797 swaltner /bin/ksh -c
> > gmake vdm  MKLevel=$(( 6 + 1 )) MKopts='';
> >                                      \-+- 18798 swaltner gmake vdm
> > MKLevel=7 MKopts=
> >                                        \-+- 22893 swaltner /bin/ksh -
> > c HOME="" LM_LICENSE_FILE="" ccpentium -c -o vdmRVState.o -fmessage
> >                                          \-+- 22894 swaltner
> > ccpentium -c -o vdmRVState.o -fmessage-length=0 -O2 -nostdlib -fno-
> > builtin
> >                                            |--- 22896 swaltner /soft/
> > windriver/gpp/3.4/gnu/3.4.4-vxworks-6.4/x86-linux2/bin/../libexec/g
> >                                            \--- 22895 root
> > (get_feature)
> > ictgrid005:~/ccm_wa/symbios/RAIDCore-swaltner_1636/
> > dev_09q4_fc_7091-68.10.00.03>
> > ====================
> >
> > I believe this behavior is causing the make process to consume tokens
> > for the parallel builds when it shouldn't be. The ksh process that
> > launches the gmake command in the subdirectory is consuming the token.
> > Once you get deep enough in the source directory, all the tokens are
> > in use by these idle ksh processes, causing it to fall back to a single
> > thread on the build. This is confirmed by starting a build using a "-j
> > 8" or "-j 16" or higher. By giving the make process more tokens, it is
> > able to keep the CPU busy on this quad CPU Linux server. This works
> > fine when there is a single developer on the build system, but that
> > won't work well for the way we launch builds on these systems through
> > SGE. Once this issue is resolved, we can deploy the x86 hardware which
> > will give us the same build speeds in a box that is 20% the physical
> > size and costs about 10% of the price of the SPARC systems we have
> > been using.
> >
> > Thanks for any guidance you can provide. I've been fooling with this
> > for several days without any luck.
> >
> > Steve
> >
> >
> > _______________________________________________
> > Help-make mailing list
> > address@hidden
> > http://lists.gnu.org/mailman/listinfo/help-make




