Re: [Gluster-devel] 2.0.0rc4 (and rc5) locks up when used as root


From: Gordan Bobic
Subject: Re: [Gluster-devel] 2.0.0rc4 (and rc5) locks up when used as root
Date: Sat, 21 Mar 2009 00:46:32 +0000
User-agent: Thunderbird 2.0.0.19 (X11/20090107)

I'm not sure what you mean. The root volume is replicate/afr, but there is only one node active (I haven't bothered building the 2nd node yet).

The init scripts don't change any permissions during boot. What I mentioned below is happening in rc.sysinit.

If permissions were the problem, the init process wouldn't just lock up. If the execute permission weren't there, it would simply skip the script, which is what I forced by running chmod -x on the udev-stw.modules script.
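
The run-or-skip behaviour described above can be made visible with a traced variant of the rc.sysinit loop quoted further down. This is only a sketch: the function name run_modules_traced and the "trace:" messages are hypothetical and not part of rc.sysinit. It separates the -x test from the execution, so the last line printed before a lock-up shows which of the two steps hung.

```shell
# Hypothetical helper; run_modules_traced is not part of rc.sysinit.
# Same logic as the "[ -x $file ] && $file" loop, with a message printed
# before and after each step.
run_modules_traced() {
    dir=$1
    for file in "$dir"/*.modules; do
        [ -e "$file" ] || continue      # the glob matched nothing
        echo "trace: testing -x $file"
        if [ -x "$file" ]; then
            echo "trace: running $file"
            "$file"
            echo "trace: $file exited with status $?"
        else
            echo "trace: skipping $file (not executable)"
        fi
    done
}
```

Calling run_modules_traced /etc/sysconfig/modules in place of the original loop would print one "testing" line per script, followed by either a "running"/"exited" pair or a "skipping" line.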

Gordan

Harshavardhana wrote:
Gordan,

Were the permissions changed by being over replicate, by any chance? Can you write an additional script to check the permissions of the files, or at least take a dump to compare permissions over replicate and without it?

Regards
--
Harshavardhana
"Yantra Shilpi"
Z Research Inc - http://www.zresearch.com
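
A permission dump of the kind requested here could be taken along the following lines. This is a minimal sketch, not something posted in the thread: the function name dump_perms and the output paths in the usage comment are hypothetical, and it assumes GNU find (for -printf).

```shell
# Hypothetical helper (not from the thread): print "mode owner:group path"
# for everything under a directory. Paths are printed relative to that
# directory so dumps taken from two different mounts can be diffed directly.
# Assumes GNU find, which provides -printf.
dump_perms() {
    find "$1" -mindepth 1 -printf '%m %u:%g %P\n' | LC_ALL=C sort -k3
}

# Example usage, with hypothetical dump locations:
#   dump_perms /etc/sysconfig > /tmp/perms.replicate
#   (boot the same tree without replicate, then)
#   dump_perms /etc/sysconfig > /tmp/perms.plain
#   diff /tmp/perms.replicate /tmp/perms.plain
```

An empty diff would indicate that modes and ownership are the same with and without replicate.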



On Fri, Mar 20, 2009 at 11:46 PM, Gordan Bobic <address@hidden> wrote:

    As suspected, chmod -x /etc/sysconfig/modules/udev-stw.modules fixes
    the immediate problem. It would appear things like:


    for file in /etc/sysconfig/modules/*.modules ; do
     [ -x $file ] && $file
    done

    seem to cause it to lock up.

    Unfortunately, that sort of thing happens all over the place in the
    boot scripts, and now it locks up a few steps later. The last
    version this worked with was rc2 (possibly rc3, I haven't tested
    it). It's definitely not working on rc4 and rc5.

    Gordan


    Gordan Bobic wrote:

        Anand Avati wrote:

                It looks like it locks up when used as root
                (afr/replicate) at the point where it initially starts
                up udev (not 100% sure where exactly yet, will have to
                put some trace code in rc.sysinit).

                2.0.0rc2 didn't have this problem.


            Can you try rc5? Though it is still under QA, you might
            want to give it a try since some transport related code
            changes have gone in which might be the reason for your
            lockup.


        The lock-up still occurs with rc5.

        I've done some more digging, however. It appears to die at this
        point in rc.sysinit, between debug 3 and debug 4:

        #################
        echo "debug 3"
        # Load other user-defined modules
        for file in /etc/sysconfig/modules/*.modules ; do
         [ -x $file ] && $file
        done

        # Load modules (for backward compatibility with VARs)
        if [ -f /etc/rc.modules ]; then
               /etc/rc.modules
        fi
        echo "debug 4"
        #################

        There is no rc.modules file, so contents of that can be ruled out.

        # ls -l /etc/sysconfig/modules/
        total 8
        -rwxr-xr-x 1 root root 100 May 25  2008 udev-stw.modules

        # cat /etc/sysconfig/modules/udev-stw.modules
        #!/bin/sh
        for i in nvram floppy parport lp snd-powermac;do
               modprobe $i >/dev/null 2>&1
        done

        I have just rebuilt my initrds separately with rc2 and rc5. rc2
        works fine, rc5 fails. No other changes to the system between
        the two attempts.

        Oh, and the first access failure bug is still there.

        I couldn't test for the memory leak in rc5 since I couldn't get
        it to boot due to the lock-up mentioned above.

        I'll try disabling those modules listed above since I don't need
        them on this setup, but I can confirm that modprobe itself works
        fine. So it sounds like a problem/bug elsewhere. Possibly a
        buffer-overrun somewhere that gets triggered by rc4/rc5.

        BTW, in case it's relevant, I'm using the fuse kernel module
        from 2.6.24.7, rather than the one from the patched package,
        because the one in the kernel appears to be later. Can anyone
        confirm if there are any known problems with this? Is there any
        strong reason why I should use a different kernel module (e.g.
        one from the patched fuse 2.7.4 package)?

        Gordan


        _______________________________________________
        Gluster-devel mailing list
        address@hidden
        http://lists.nongnu.org/mailman/listinfo/gluster-devel



