From: Brent A Nelson
Subject: Re: [Gluster-devel] Latest unstable (1.4) branch checkout seems... unstable
Date: Tue, 29 Jul 2008 12:44:18 -0400 (EDT)

I had to make the ip->addr change a number of checkouts ago. I hadn't yet switched from tcp/client and tcp/server to socket, as backwards compatibility seemed to work fine. I just made that change too, but as expected, it didn't help: the client is obviously communicating with all the servers (for example, df information is correct).

Other than this common complaint:
2008-07-29 12:04:18 C [dict.c:1141:data_to_str] dict: @data=(nil)

I have nothing in the server logs. However, I'm not sure how useful the server logs are, as I run 4-5 server processes per machine, and they all use the same log location.

This setup (which is a set of four machines, 4 exports per machine, 2 machines offering namespace, clientside AFR+unify) was working fine with a checkout that was probably about a week old.

It's possible that it's due to some changes I made to the kernel of my build machine to try to get shared writable mmap support into my fuse, but those patches were pretty specific, and I wouldn't expect it to cause this kind of behavior.

I'll try to figure out how to get tla to roll back to a particular patchset and see if I can identify which patch causes the breakage.
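The search for the offending patch can be organized as a bisection rather than a linear walk. The following is only a sketch under stated assumptions: the patch names are hypothetical, the list is ordered oldest to newest, the newest entry is known to be broken, and the breakage persists in every patch after the one that introduced it. The is_bad callback stands in for "check out that patchset, rebuild, mount, and see whether ls fails"; how to do that with tla is not shown here.

```python
from typing import Callable, Sequence


def first_bad(patches: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest patch for which is_bad() is true.

    Assumes patches are ordered oldest to newest, the last one is bad,
    and every patch after the first bad one is also bad.
    """
    if not patches:
        raise ValueError("need at least one patch")
    lo, hi = 0, len(patches) - 1  # invariant: patches[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(patches[mid]):
            hi = mid          # breakage is at mid or earlier
        else:
            lo = mid + 1      # breakage is after mid
    return patches[lo]


# Hypothetical patch names, purely for illustration.
PATCHES = [f"patch-{n}" for n in range(1, 11)]
```

With ten candidate patches this needs at most four build-and-test cycles instead of up to ten.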

Thanks,

Brent

On Tue, 29 Jul 2008, Raghavendra G wrote:

Hi Brent,

There are a couple of changes in 1.4. The authentication module "ip" has been
renamed to "addr", so the server volume spec file should have:

auth.addr.<brick-name>.allow <list-of-addresses>

list-of-addresses depends on the address-family specified in
transport/socket. It can be:
ip-address for inet/inet6/inet-sdp
path for unix

Do the server-side logs say "no authentication module is interested in
authenticating client xxxx"? If that's the case, the above fix works. If not,
can you send the server-side logs?
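For spec files that still use the old key, the rename can be applied mechanically. This is a minimal sketch, not an official migration tool: it assumes the old key was spelled auth.ip.<brick-name>.allow (inferred from the rename described above), and the sample line, brick name, and address pattern are made up for illustration.

```python
import re

# Old-style key: auth.ip.<brick-name>.allow (spelling assumed from the rename).
_OLD_AUTH = re.compile(r"\bauth\.ip\.(\S+)\.allow\b")


def rename_auth_module(spec_text: str) -> str:
    """Rewrite auth.ip.<brick>.allow keys to auth.addr.<brick>.allow."""
    return _OLD_AUTH.sub(r"auth.addr.\1.allow", spec_text)


# Hypothetical spec-file line; "brick" and the address are placeholders.
OLD_LINE = "  option auth.ip.brick.allow 192.168.0.*\n"
NEW_LINE = rename_auth_module(OLD_LINE)
```

Running the function a second time leaves the text unchanged, so it is safe to apply to files that have already been converted.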

regards,
On Mon, Jul 28, 2008 at 11:23 PM, Brent A Nelson <address@hidden> wrote:

The latest checkout seems to have a major defect, in my setup.  On the
bright side, the fchmod bug seems like it might be fixed (although it could
be that the filesystem isn't working well enough to tell)...

ls -al /beast
ls: cannot access /beast/vz: No such file or directory
ls: cannot access /beast/openvz: No such file or directory
ls: cannot access /beast/usr0: No such file or directory
ls: cannot access /beast/lost+found: No such file or directory
ls: reading directory /beast: File descriptor in bad state
total 128
drwxr-xr-x  6 root root 20480 2008-07-28 15:02 .
drwxrwxrwx 28 4791 kmem  4096 2008-07-18 20:32 ..
d?????????  ? ?    ?        ?                ? lost+found
-rwxr-xr-x  1 root root 92376 2008-04-04 02:42 ls
d?????????  ? ?    ?        ?                ? openvz
d?????????  ? ?    ?        ?                ? usr0
d?????????  ? ?    ?        ?                ? vz

Associated glusterfs.log:

2008-07-28 15:08:10 E [socket.c:1186:socket_submit] share4-0: transport not connected to submit (priv->connected = 0)
2008-07-28 15:08:10 E [afr.c:3428:afr_statfs_cbk] mirror4: (child=share4-0) op_ret=-1 op_errno=107(Transport endpoint is not connected)
2008-07-28 15:08:59 C [client-protocol.c:223:call_bail] ns0-0: bailing transport
2008-07-28 15:08:59 C [client-protocol.c:223:call_bail] ns0-1: bailing transport
2008-07-28 15:08:59 E [client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:08:59 E [client-protocol.c:4122:protocol_client_cleanup] ns0-1: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:08:59 E [socket.c:1186:socket_submit] ns0-0: transport not connected to submit (priv->connected = 0)
2008-07-28 15:08:59 E [socket.c:1186:socket_submit] ns0-1: transport not connected to submit (priv->connected = 0)
2008-07-28 15:08:59 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 18: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:09:49 C [client-protocol.c:223:call_bail] ns0-0: bailing transport
2008-07-28 15:09:49 C [client-protocol.c:223:call_bail] ns0-1: bailing transport
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding frame type(2) op(0) address@hidden
2008-07-28 15:09:49 E [dict.c:648:dict_unserialize] dict: sscanf on buf failed
2008-07-28 15:09:49 E [client-protocol.c:3980:client_setvolume_cbk] ns0-0: SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup] ns0-1: forced unwinding frame type(2) op(0) address@hidden
2008-07-28 15:09:49 E [dict.c:648:dict_unserialize] dict: sscanf on buf failed
2008-07-28 15:09:49 E [client-protocol.c:3980:client_setvolume_cbk] ns0-1: SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:09:49 E [client-protocol.c:4122:protocol_client_cleanup] ns0-1: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 18: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:09:49 E [socket.c:1186:socket_submit] ns0-0: transport not connected to submit (priv->connected = 0)
2008-07-28 15:09:49 E [socket.c:1186:socket_submit] ns0-1: transport not connected to submit (priv->connected = 0)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 19: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 19: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 20: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:09:49 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 20: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:09:49 E [afr.c:4180:afr_readdir_cbk] ns0: (child=ns0-1) op_ret=-1 op_errno=77(File descriptor in bad state)
2008-07-28 15:09:49 E [fuse-bridge.c:1947:fuse_readdir_cbk] glusterfs-fuse: 21: READDIR => -1 (File descriptor in bad state)
2008-07-28 15:09:49 E [afr.c:5641:afr_closedir] ns0: child_errno[] not 0, returning ENOTCONN
2008-07-28 15:09:49 E [fuse-bridge.c:940:fuse_err_cbk] glusterfs-fuse: 22: (op_num=24) ERR => -1 (Transport endpoint is not connected)
2008-07-28 15:10:42 C [client-protocol.c:223:call_bail] ns0-0: bailing transport
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding frame type(2) op(0) address@hidden
2008-07-28 15:10:42 E [dict.c:648:dict_unserialize] dict: sscanf on buf failed
2008-07-28 15:10:42 E [client-protocol.c:3980:client_setvolume_cbk] ns0-0: SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:10:42 C [client-protocol.c:223:call_bail] ns0-1: bailing transport
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup] ns0-1: forced unwinding frame type(2) op(0) address@hidden
2008-07-28 15:10:42 E [dict.c:648:dict_unserialize] dict: sscanf on buf failed
2008-07-28 15:10:42 E [client-protocol.c:3980:client_setvolume_cbk] ns0-1: SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:10:42 E [client-protocol.c:4122:protocol_client_cleanup] ns0-1: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:10:42 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 23: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:10:42 E [socket.c:1186:socket_submit] ns0-0: transport not connected to submit (priv->connected = 0)
2008-07-28 15:10:42 E [socket.c:1186:socket_submit] ns0-1: transport not connected to submit (priv->connected = 0)
2008-07-28 15:10:42 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 23: (op_num=34) / => -1 (No such file or directory)
2008-07-28 15:11:32 C [client-protocol.c:223:call_bail] ns0-1: bailing transport
2008-07-28 15:11:32 E [client-protocol.c:4122:protocol_client_cleanup] ns0-1: forced unwinding frame type(2) op(0) address@hidden
2008-07-28 15:11:32 E [dict.c:648:dict_unserialize] dict: sscanf on buf failed
2008-07-28 15:11:32 E [client-protocol.c:3980:client_setvolume_cbk] ns0-1: SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:11:32 E [client-protocol.c:4122:protocol_client_cleanup] ns0-1: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:11:32 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 24: (op_num=34) / => -1 (Transport endpoint is not connected)
2008-07-28 15:11:32 E [socket.c:1186:socket_submit] ns0-1: transport not connected to submit (priv->connected = 0)
2008-07-28 15:11:35 C [client-protocol.c:223:call_bail] ns0-0: bailing transport
2008-07-28 15:11:35 E [client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding frame type(2) op(0) address@hidden
2008-07-28 15:11:35 E [dict.c:648:dict_unserialize] dict: sscanf on buf failed
2008-07-28 15:11:35 E [client-protocol.c:3980:client_setvolume_cbk] ns0-0: SETVOLUME on remote-host failed: ret=-2 error=Unknown Error
2008-07-28 15:11:35 E [client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding frame type(1) op(34) address@hidden
2008-07-28 15:11:35 E [fuse-bridge.c:452:fuse_entry_cbk] glusterfs-fuse: 24: (op_num=34) / => -1 (No such file or directory)
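A log like the one above is easier to read once it is grouped by level, function, and volume name. The script below is a rough sketch for doing that. The entry layout (timestamp, level letter, [file:line:function], name, message) is inferred only from the lines shown in this mail, and the function and field names are my own. It collapses whitespace first, so it copes with entries that were wrapped by the mailer or run together without a newline.

```python
import re
from collections import Counter

# Entry header as it appears in the log above:
#   2008-07-28 15:08:59 C [client-protocol.c:223:call_bail] ns0-0:
HEADER = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ([A-Z]) "
    r"\[([\w.-]+):(\d+):(\w+)\] ([\w-]+):"
)


def parse_entries(log_text):
    """Split raw log text into one dict per entry."""
    flat = " ".join(log_text.split())  # undo line wrapping
    heads = list(HEADER.finditer(flat))
    entries = []
    for i, m in enumerate(heads):
        # The message runs up to the next header (or the end of the text).
        end = heads[i + 1].start() if i + 1 < len(heads) else len(flat)
        entries.append({
            "time": m.group(1),
            "level": m.group(2),
            "file": m.group(3),
            "line": int(m.group(4)),
            "func": m.group(5),
            "volume": m.group(6),
            "message": flat[m.end():end].strip(),
        })
    return entries


def summarize(entries):
    """Count entries per (level, function, volume)."""
    return Counter((e["level"], e["func"], e["volume"]) for e in entries)


# Three entries copied from the log above, in their wrapped form.
SAMPLE = (
    "2008-07-28 15:08:59 C [client-protocol.c:223:call_bail] ns0-0: bailing\n"
    "transport2008-07-28 15:08:59 C [client-protocol.c:223:call_bail] ns0-1:\n"
    "bailing transport2008-07-28 15:08:59 E\n"
    "[client-protocol.c:4122:protocol_client_cleanup] ns0-0: forced unwinding\n"
    "frame type(1) op(34) address@hidden\n"
)

if __name__ == "__main__":
    for key, count in summarize(parse_entries(SAMPLE)).most_common():
        print(count, *key)
```

Feeding it the whole glusterfs.log instead of SAMPLE shows at a glance which volumes fail SETVOLUME and how often.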

Also, when I tried to shut down after this test, the filesystem unmounted fine,
and most of the share glusterfsd processes were killed normally, but I had
to kill -9 the namespace glusterfsd processes.

Thanks,

Brent



--
Raghavendra G

A centipede was happy quite, until a toad in fun,
Said, "Pray, which leg comes after which?",
This raised his doubts to such a pitch,
He fell flat into the ditch,
Not knowing how to run.
-Anonymous