From: Brent A Nelson
Subject: Re: [Gluster-devel] NFS reexport works, still stat-prefetch issues, -s problem
Date: Thu, 10 May 2007 22:01:18 -0400 (EDT)

On Thu, 10 May 2007, Brent A Nelson wrote:

[May 10 18:14:18] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
[May 10 18:14:18] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-1: connection to server disconnected
[May 10 18:14:18] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[May 10 18:14:18] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=9)
[May 10 18:14:18] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-0: connection to server disconnected
[May 10 18:14:18] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
[May 10 18:14:18] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed
[May 10 18:14:19] [CRITICAL/client-protocol.c:218/call_bail()] client/protocol:bailing transport
[May 10 18:14:19] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=115)
[May 10 18:14:19] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:share4-0: connection to server disconnected
[May 10 18:14:19] [ERROR/client-protocol.c:204/client_protocol_xfer()] protocol/client:transport_submit failed

I've seen the "0 bytes r/w instead of 113" message plenty of times in the past (with older GlusterFS versions), but it was apparently harmless then. The code now treats it as a disconnection and tries to reconnect. For some reason, even when the reconnect succeeds, the result is still an I/O error. I wonder if this relates to a previous issue I mentioned with real disconnects (a node dies or glusterfsd is restarted), where the first access after a failure (at least for ls or df) fails but the next attempt succeeds. That suggests a problem in the reconnection logic (plus some sort of glitch masquerading as a disconnect in the first place)... This is probably the real problem triggering the read-ahead crash; the crash would not be triggered in my test case if it weren't for this issue.
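For context, the general shape of a "read/write fully" loop over a socket is sketched below. This is a minimal illustration, not the actual full_rw() from common-utils.c, and the names and details are my assumptions. The relevant point is that read() returning 0 means the peer closed the connection, which is presumably why a 0-byte result is logged as a disconnect. (On Linux, errno 115 is EINPROGRESS and errno 9 is EBADF, which hints the socket may have been mid-reconnect or already closed when the transfer was attempted.)

    /* Minimal sketch of a full-read loop over a blocking socket.
     * NOT the actual GlusterFS full_rw(); it only illustrates why a
     * 0-byte result is taken as a disconnect: read() returning 0
     * means the peer closed the connection. */
    #include <errno.h>
    #include <sys/types.h>
    #include <unistd.h>

    static ssize_t
    full_read (int fd, char *buf, size_t count)
    {
            size_t done = 0;

            while (done < count) {
                    ssize_t ret = read (fd, buf + done, count - done);

                    if (ret == 0)
                            break;            /* EOF: peer closed */
                    if (ret == -1) {
                            if (errno == EINTR)
                                    continue; /* transient, retry */
                            return -1;        /* real error */
                    }
                    done += (size_t) ret;
            }
            /* A short count here means the connection dropped
             * mid-message, e.g. 0 bytes instead of the expected 113. */
            return (ssize_t) done;
    }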


Well, it looks like I can reproduce this behavior (though, so far, not the memory leak) on a much simpler setup, with no NFS required. I was copying my test area (with several 10GB files) to a really simple GlusterFS volume (one share, no afr, no unify, glusterfsd on the same machine) when I hit the disconnect issue, after a few files had copied successfully. This looks like an issue in protocol/client and/or protocol/server, but I thought it would be a good idea to narrow things down a bit first...
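For reference, the single-share setup was along these lines (written from memory; exact option names for this GlusterFS version may differ, so treat it as a sketch rather than the literal spec files). Server side:

    # glusterfsd server spec (sketch): one posix brick, exported over tcp
    volume brick
      type storage/posix
      option directory /export/share   # assumed export path
    end-volume

    volume server
      type protocol/server
      option transport-type tcp/server
      option auth.ip.brick.allow *     # allow all clients (test setup only)
      subvolumes brick
    end-volume

Client side, with no afr or unify layered on top:

    # glusterfs client spec (sketch): plain protocol/client, nothing else
    volume share
      type protocol/client
      option transport-type tcp/client
      option remote-host 127.0.0.1     # glusterfsd on the same machine
      option remote-subvolume brick
    end-volume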

Thanks,

Brent



