gluster-devel

[Gluster-devel] client reconnect


From: Brent A Nelson
Subject: [Gluster-devel] client reconnect
Date: Fri, 25 May 2007 18:21:45 -0400 (EDT)

I mentioned in a previous email that client reconnection may not be 100% reliable. I ran into this again in the following scenario: one of my servers (in a multi-server unify/afr setup) was trying to format a bad drive, which knocked out access to all of the 3ware disks that GlusterFS exports from that machine. While the server was in this state, a couple of clients tried to ls directories on a filesystem that uses this server (and its mirror). I suspect they were able to contact the glusterfsd on the "bad" machine, but glusterfsd deadlocked trying to access the disk. I ended up rebooting the server, but the clients that were trying to ls never returned and had to be killed. The mountpoints then had to be unmounted and the filesystem remounted.

It seems to me (you will probably come up with something much better) that if the client successfully sends a request to a server but the server never completes it, the client needs to time out the I/O request it was waiting on and try again. In the case of afr, it should also check whether the mirror host can satisfy the request instead.
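
To make the idea concrete, here is a rough sketch of the timeout-and-fall-back behaviour in Python. It is not GlusterFS code: the function names, the (name, callable) replica list, and the timeout value are all made up for illustration, and it assumes the request is safe to repeat (as a directory listing would be).

```python
import queue
import threading


class AllReplicasFailed(Exception):
    """Raised when no replica answered within the timeout (hypothetical name)."""


def call_with_timeout(fn, request, timeout):
    """Run fn(request) in a daemon thread; raise TimeoutError if it stalls."""
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put(("ok", fn(request)))
        except Exception as exc:  # hand server-side errors back to the caller
            result.put(("err", exc))

    # Daemon thread: a request that never completes is abandoned, not cancelled.
    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = result.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(f"no reply within {timeout}s") from None
    if status == "err":
        raise value
    return value


def request_with_failover(replicas, request, timeout=1.0):
    """Try each (name, callable) replica in order, moving on after a timeout or error."""
    errors = []
    for name, fn in replicas:
        try:
            return name, call_with_timeout(fn, request, timeout)
        except Exception as exc:
            errors.append((name, exc))
    raise AllReplicasFailed(errors)


def hung_server(request):
    """Stand-in for a server that accepts the request but is stuck on disk I/O."""
    threading.Event().wait()  # blocks forever


def mirror(request):
    """Stand-in for the healthy mirror."""
    return f"listing for {request}"
```

Calling request_with_failover([("server1", hung_server), ("mirror", mirror)], "/data", timeout=0.2) gives up on the hung server after 0.2 seconds and returns the mirror's reply. A real client would also have to deal with requests that are not safe to repeat and with cleaning up the abandoned call, which this sketch ignores.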

Thanks,

Brent



