Re: [Gluster-devel] Could be the bug of Glusterfs? The file system is unstable and hang

On Tue, Jun 2, 2009 at 12:25 AM, Shehjar Tikoo <address@hidden> wrote:

Hi

> Also, avoid using autoscaling in io-threads for now.
>
> -Shehjar

-Shehjar

Alpha Electronics wrote:

Thanks for looking into this. We do use io-threads. Here is the server config:
volume brick1-posix
type storage/posix
option directory /mnt/brick1
end-volume

volume brick2-posix
type storage/posix
option directory /mnt/brick2
end-volume

volume brick1-locks
type features/locks
subvolumes brick1-posix
end-volume

volume brick2-locks
type features/locks
subvolumes brick2-posix
end-volume

volume brick1
type performance/io-threads
option min-threads 16
option autoscaling on
subvolumes brick1-locks
end-volume

volume brick2
type performance/io-threads
option min-threads 16
option autoscaling on
subvolumes brick2-locks
end-volume

volume server
type protocol/server
option transport-type tcp
option auth.addr.brick1.allow *
option auth.addr.brick2.allow *
subvolumes brick1 brick2
end-volume
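
Since the advice in this thread is to avoid autoscaling in io-threads for now, a quick check for which volumes in a volfile still enable it may be useful. The following is only a minimal sketch: the function name autoscaling_volumes is hypothetical, the set of values treated as "enabled" is an assumption, and the sample text is copied from the server config in this message.

```python
import re

def autoscaling_volumes(volfile_text):
    """Return names of performance/io-threads volumes with autoscaling enabled."""
    flagged = []
    name = vol_type = None
    autoscaling = False
    for raw in volfile_text.splitlines():
        # Drop an optional "NN:" prefix (numbered volfile dumps) and any trailing comment.
        line = re.sub(r"^\s*\d*:\s*", "", raw).split("#", 1)[0].strip()
        words = line.split()
        if not words:
            continue
        if words[0] == "volume" and len(words) > 1:
            name, vol_type, autoscaling = words[1], None, False
        elif words[0] == "type" and len(words) > 1:
            vol_type = words[1]
        elif words[:2] == ["option", "autoscaling"] and len(words) > 2:
            # Assumption: these spellings all mean "enabled".
            autoscaling = words[2].lower() in ("on", "yes", "true", "enable")
        elif words[0] == "end-volume":
            if vol_type == "performance/io-threads" and autoscaling:
                flagged.append(name)
            name = vol_type = None
            autoscaling = False
    return flagged

sample = """
volume brick1
type performance/io-threads
option min-threads 16
option autoscaling on
subvolumes brick1-locks
end-volume

volume server
type protocol/server
option transport-type tcp
subvolumes brick1
end-volume
"""

print(autoscaling_volumes(sample))  # ['brick1']
```

Run against the full server config above, this should list both brick1 and brick2, the two volumes where the autoscaling option would need to be removed or turned off.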

On Sun, May 31, 2009 at 11:44 PM, Shehjar Tikoo <address@hidden> wrote:

Alpha Electronics wrote:

We are testing GlusterFS before recommending it to
enterprise clients. We found that the file system always hangs
after running for about 2 days. After killing the server-side
process and restarting it, everything goes back to normal.

What is the server config?
If you're not using io-threads on the server, I suggest you do,
because it does basic load-balancing to avoid timeouts.

Also, avoid using autoscaling in io-threads for now.

-Shehjar

Here is the spec and error logged:
GlusterFS version: v2.0.1

Client volume:
volume brick_1
type protocol/client
option transport-type tcp/client
option remote-port 7777 # Non-default port
option remote-host server1
option remote-subvolume brick
end-volume

volume brick_2
type protocol/client
option transport-type tcp/client
option remote-port 7777 # Non-default port
option remote-host server2
option remote-subvolume brick
end-volume

volume bricks
type cluster/distribute
subvolumes brick_1 brick_2
end-volume

Error logged on client side in /var/log/glusterfs.log:
[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail] brick_1: bailing out frame LK(28) frame sent = 2009-05-29 14:28:54. frame-timeout = 1800
[2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not connected)
Error logged on server:
[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail] brick_2: bailing out frame LK(28) frame sent = 2009-05-29 14:29:05. frame-timeout = 1800
[2009-05-29 14:59:15] W [fuse-bridge.c:2284:fuse_setlk_cbk] glusterfs-fuse: 106850860: ERR => -1 (Transport endpoint is not connected)
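
The call_bail entries above each give a "frame sent" time and a frame-timeout, so the wait before each bail-out can be worked out directly. The sketch below is a rough illustration rather than an official tool: the regular expression is inferred only from the two log entries quoted here, and the function name bail_outs is hypothetical.

```python
import re
from datetime import datetime

STAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
# Pattern inferred from the call_bail entries quoted in this message (an assumption).
BAIL_RE = re.compile(
    r"\[(?P<logged>" + STAMP + r")\] \w \[[^\]]*call_bail\] "
    r"(?P<volume>\S+): bailing out frame (?P<op>\w+)\(\d+\) "
    r"frame sent = (?P<sent>" + STAMP + r")\. "
    r"frame-timeout = (?P<timeout>\d+)"
)
FMT = "%Y-%m-%d %H:%M:%S"

def bail_outs(log_text):
    """Return (volume, operation, seconds_waited, frame_timeout) per bail-out."""
    # Mail wrapping splits entries across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    results = []
    for m in BAIL_RE.finditer(flat):
        waited = datetime.strptime(m["logged"], FMT) - datetime.strptime(m["sent"], FMT)
        results.append((m["volume"], m["op"], int(waited.total_seconds()), int(m["timeout"])))
    return results

log = """
[2009-05-29 14:58:55] E [client-protocol.c:292:call_bail]
brick_1: bailing out frame LK(28) frame sent = 2009-05-29
14:28:54. frame-timeout = 1800
[2009-05-29 14:58:55] W [fuse-bridge.c:2284:fuse_setlk_cbk]
glusterfs-fuse: 106850788: ERR => -1 (Transport endpoint is not
connected)
[2009-05-29 14:59:15] E [client-protocol.c:292:call_bail]
brick_2: bailing out frame LK(28) frame sent = 2009-05-29
14:29:05. frame-timeout = 1800
"""

for volume, op, waited, timeout in bail_outs(log):
    print(f"{volume}: {op} waited {waited}s (frame-timeout {timeout}s)")
# brick_1: LK waited 1801s (frame-timeout 1800s)
# brick_2: LK waited 1810s (frame-timeout 1800s)
```

Both lock (LK) requests were abandoned just past the 1800-second frame-timeout, which suggests that neither received a reply in the 30 minutes after it was sent.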

An error message is logged on the server side about 1 hour later in
/var/log/messages:
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] lib/util_sock.c:write_data(564)
May 29 16:04:16 server2 winbindd[3649]: write_data: write failure. Error = Connection reset by peer
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:write_socket(158)
May 29 16:04:16 server2 winbindd[3649]: write_socket: Error writing 104 bytes to socket 18: ERRNO = Connection reset by peer
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/clientgen.c:cli_send_smb(188)
May 29 16:04:16 server2 winbindd[3649]: Error writing 104 bytes to client. -1 (Connection reset by peer)
May 29 16:04:16 server2 winbindd[3649]: [2009/05/29 16:05:16, 0] libsmb/cliconnect.c:cli_session_setup_spnego(859)
May 29 16:04:16 server2 winbindd[3649]: Kinit failed: Cannot contact any KDC for requested realm


From:	Alpha Electronics
Subject:	Re: [Gluster-devel] Could be the bug of Glusterfs? The file system is unstable and hang
Date:	Wed, 3 Jun 2009 16:48:32 -0500