
Re: [Gluster-devel] glusterfs_1.4.0qa19: small issues


From: Brent A Nelson
Subject: Re: [Gluster-devel] glusterfs_1.4.0qa19: small issues
Date: Tue, 10 Jun 2008 19:24:29 -0400 (EDT)

On Tue, 10 Jun 2008, Brent A Nelson wrote:

Okay, that's fixed, along with the removexattr issue. Another couple of quirks (quick commands to reproduce are below): cp -a /usr/bin /gluster/bin makes a perfect copy, except:

1) /gluster/bin itself doesn't end up with the correct permissions (rsync gets them right, though, and a cp -a to /tmp also works fine, so it's not a problem with cp); the contents are all fine, except:

2) /gluster/bin/sudoedit did not have the setuid bit set (this fails with rsync, too, but a manual chmod u+s succeeds).
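
For reference, the checks amount to roughly this (GNU coreutils stat assumed; same paths as above):

    cp -a /usr/bin /gluster/bin
    stat -c '%a %n' /usr/bin /gluster/bin                      # directory mode differs on the copy
    stat -c '%a %n' /usr/bin/sudoedit /gluster/bin/sudoedit    # setuid bit (4xxx) missing on the copy
    chmod u+s /gluster/bin/sudoedit                            # setting it by hand works
    stat -c '%a %n' /gluster/bin/sudoedit                      # setuid bit now present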


I tortured my GlusterFS pretty heavily today. Apart from the issues above, it held up perfectly for repeated rsync/rm cycles.

I noticed that when doing a large (10GB) dd write on a client, if the write was going to the client's own node (which is also a server), ls would hang for long periods on that client. So, I figured it was time to load up io-threads (just 2 threads per export) on the servers to split the metadata and I/O traffic apart. It worked beautifully; ls -al was very responsive on all clients, even with heavy write activity.
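
For the record, the server-side stanza I mean looks roughly like this, one per export (the volume names are placeholders and the option spelling is from memory, so double-check it against the 1.4 docs):

    volume brick-iot
      type performance/io-threads
      option thread-count 2        # the "2 threads per export" above
      subvolumes brick             # the storage volume actually exported
    end-volume

With io-threads sitting above the storage volume on the server side, the lookups and stats behind ls no longer have to wait in line behind the big writes, which seems to be why the other clients stay responsive.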

However, I noticed that with io-threads loaded, the server processes would sometimes consume several GB of RAM (up to 2.2GB). I didn't worry about it too much, as they would generally free the memory again. I did more exhaustive testing, with all 4 servers also acting as clients and all 4 doing dd writes, and it worked fine for a while. Eventually, however, one of my nodes ran out of memory; perhaps that extra io-threads memory consumption is an issue after all.

Also, when the node ran out of memory, my other GlusterFS clients hung. As someone mentioned previously, GlusterFS apparently doesn't time out properly when a server stops responding but its host is still up. After rebooting the hung node (but not yet restarting the server processes), the clients gave up on the node and became responsive again.

Thanks,

Brent



