
From: Shehjar Tikoo
Subject: Re: [Gluster-devel] Could be the bug of Glusterfs? The file system is unstable and hang
Date: Mon, 01 Jun 2009 10:14:26 +0530
User-agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103)

Alpha Electronics wrote:
We are testing GlusterFS before recommending it to enterprise clients. We found that the file system always hangs after running for about 2 days. After we kill the server-side process and restart it, everything goes back to normal.


What is the server config?
If you're not using io-threads on the server, I suggest you do,
because it does basic load-balancing to avoid timeouts.

Also, avoid using autoscaling in io-threads for now.
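
For illustration only, a minimal sketch of a server volfile with io-threads loaded and autoscaling left off. The server config was not posted, so everything here is an assumption: the export directory, the thread count, and the layering are hypothetical, and option names should be checked against the 2.0.1 documentation. Only the exported volume name (brick) and the port (7777) are taken from the client spec quoted below.

```
# Hypothetical server volfile; adjust paths and values to the real setup.
volume posix
  type storage/posix
  option directory /data/export   # assumed export path
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 8           # illustrative value, not a tuned one
  option autoscaling off          # keep autoscaling disabled for now
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.listen-port 7777   # assumed option name; matches the client's remote-port
  option auth.addr.brick.allow *             # wide open, for testing only
  subvolumes brick
end-volume
```

Naming the io-threads volume "brick" means the client's "option remote-subvolume brick" keeps working without any change on the client side.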

-Shehjar


Here are the spec and the errors logged:
GlusterFS version: v2.0.1

Client volume:
volume brick_1
  type protocol/client
  option transport-type tcp/client
  option remote-port 7777 # Non-default port
  option remote-host server1
  option remote-subvolume brick
end-volume

volume brick_2
  type protocol/client
  option transport-type tcp/client
  option remote-port 7777 # Non-default port
  option remote-host server2
  option remote-subvolume brick
end-volume

volume bricks
  type cluster/distribute
  subvolumes brick_1 brick_2
end-volume

Error logged on the client side in /var/log/glusterfs.log:
[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800
[2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not connected)

Error logged on the server:
[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: bailing out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800
[2009-05-29 14:59:15] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850860: ERR => -1 (Transport endpoint is not connected)

About 1 hour later, the following errors are logged on the server side in /var/log/messages:
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] lib/util_sock.c:write_data(564)
May 29 16:04:16 server2 winbindd[3649]: write_data: write failure. Error = Connection reset by peer
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:write_socket(158)
May 29 16:04:16 server2 winbindd[3649]: write_socket: Error writing 104 bytes to socket 18: ERRNO = Connection reset by peer
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:cli_send_smb(188)
May 29 16:04:16 server2 winbindd[3649]: Error writing 104 bytes to client. -1 (Connection reset by peer)
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/cliconnect.c:cli_session_setup_spnego(859)
May 29 16:04:16 server2 winbindd[3649]: Kinit failed: Cannot contact any KDC for requested realm

