Re: [Gluster-devel] Single-process (server and client) AFR problems


From: Gordan Bobic
Subject: Re: [Gluster-devel] Single-process (server and client) AFR problems
Date: Tue, 20 May 2008 09:30:52 +0100
User-agent: RoundCube Webmail/0.1

This is with release 1.3.9.

Not much more that seems relevant turns up in the logs with -L DEBUG: DNS
chatter, and mentions that the 2nd server isn't talking (glusterfs is
switched off on it, because running it causes the lock-up).

This gets logged when I try to cat ~/.bashrc:

2008-05-20 09:14:08 D [fuse-bridge.c:375:fuse_entry_cbk] glusterfs-fuse: 39: (34) /gordan/.bashrc => 60166157
2008-05-20 09:14:08 D [inode.c:577:__create_inode] fuse/inode: create inode(60166157)
2008-05-20 09:14:08 D [inode.c:367:__active_inode] fuse/inode: activating inode(60166157), lru=7/1024
2008-05-20 09:14:08 D [inode.c:367:__active_inode] fuse/inode: activating inode(60166157), lru=7/1024
2008-05-20 09:14:08 D [fuse-bridge.c:1517:fuse_open] glusterfs-fuse: 40: OPEN /gordan/.bashrc
2008-05-20 09:14:08 E [afr.c:1985:afr_selfheal] home: none of the children are up for locking, returning EIO
2008-05-20 09:14:08 E [fuse-bridge.c:692:fuse_fd_cbk] glusterfs-fuse: 40: (12) /gordan/.bashrc => -1 (5)

On the command line, I get back "Input/output error". I can ls the files,
but cannot actually read them.
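As an aside for anyone else reading the trace: the trailing "(5)" in the
fuse_fd_cbk line is the errno handed back to the application, and on Linux
errno 5 is EIO, which the shell prints as "Input/output error". A small
Python sketch (log line copied from the excerpt above, line format inferred
from it) makes the mapping explicit:

```python
import errno
import os
import re

# fuse_fd_cbk error line from the excerpt above; the assumed format is
# "... <call-id>: (<pid>) <path> => -1 (<errno>)".
line = ("2008-05-20 09:14:08 E [fuse-bridge.c:692:fuse_fd_cbk] "
        "glusterfs-fuse: 40: (12) /gordan/.bashrc => -1 (5)")

# Pull out the path and the errno from the failed call.
m = re.search(r"(\S+) => -1 \((\d+)\)$", line)
path, err = m.group(1), int(m.group(2))

# Map the numeric errno back to its name and message.
print(path, errno.errorcode[err], os.strerror(err))
# → /gordan/.bashrc EIO Input/output error
```

So the "Input/output error" at the shell is just the EIO that afr_selfheal
returns when it finds no children up for locking.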

This is with only the first server up. Same happens when I mount home.vol
via fstab or via something like:
glusterfs -f /etc/glusterfs/home.vol /home
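For completeness, the fstab variant of that mount would be something like
the following (volfile path as above; this is the 1.3-era fstab form, so
treat it as a sketch rather than the definitive syntax):

```
# /etc/fstab entry: volfile as the device, glusterfs as the fs type
/etc/glusterfs/home.vol  /home  glusterfs  defaults  0  0
```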

I have also reduced the config (single-process, intended for servers) to a
bare minimum (removed the posix-locks layer) to get to the bottom of it,
but I still cannot get any reads to work:

volume home1
        type storage/posix
        option directory /gluster/home
end-volume

volume home2
        type protocol/client
        option transport-type tcp/client
        option remote-host 192.168.3.1
        option remote-subvolume home2
end-volume

volume home
        type cluster/afr
        option read-subvolume home1
        subvolumes home1 home2
end-volume

volume server
        type protocol/server
        option transport-type tcp/server
        subvolumes home home1
        option auth.ip.home.allow 127.0.0.1,192.168.*
        option auth.ip.home1.allow 127.0.0.1,192.168.*
end-volume

On a related note, if single-process mode is used, how does GlusterFS know
which volume to mount? For example, if it tries to mount the
protocol/client volume (home2), then obviously that won't work because the
2nd server is not up. If it mounts the protocol/server volume, is it
trying to mount home or home1? Or does it mount the outermost volume that
_isn't_ a protocol/[client|server] (which is "home" in this case)?

Thanks.

Gordan

On Tue, 20 May 2008 13:18:07 +0530, Krishna Srinivas
<address@hidden> wrote:
> Gordan,
> 
> Which patch set is this? Can you run the glusterfs server side with
> "-L DEBUG" and send the logs?
> 
> Thanks
> Krishna
> 
> On Tue, May 20, 2008 at 1:56 AM, Gordan Bobic <address@hidden> wrote:
>> Hi,
>>
>> I'm having rather major problems getting single-process AFR to work
>> between two servers. When both servers come up, the GlusterFS on both
>> locks up pretty solid. The processes that try to access the FS
>> (including ls) seem to get nowhere for a few minutes, and then
>> complete. But something gets stuck, and glusterfs cannot be killed
>> even with -9!
>>
>> Another worrying thing is that the fuse kernel module ends up holding a
>> reference count even after the glusterfs process gets killed (sometimes
>> killing the remote process that isn't locked up on its host can break
>> the locked-up operations and allow the local glusterfs process to be
>> killed). So fuse then cannot be unloaded.
>>
>> This error seems to come up in the logs all the time:
>> 2008-05-19 20:57:17 E [afr.c:1985:afr_selfheal] home: none of the
>> children are up for locking, returning EIO
>> 2008-05-19 20:57:17 E [fuse-bridge.c:692:fuse_fd_cbk] glusterfs-fuse:
>> 63: (12) /test => -1 (5)
>>
>> This implies some kind of a locking issue, but the same error and
>> conditions also arise when the posix locking module is removed.
>>
>> The configs for the two servers are attached. They are almost identical
>> to the examples on the glusterfs wiki:
>>
>> http://www.gluster.org/docs/index.php/AFR_single_process
>>
>> What am I doing wrong? Have I run into another bug?
>>
>> Gordan
>>
>> volume home1-store
>>        type storage/posix
>>        option directory /gluster/home
>> end-volume
>>
>> volume home1
>>        type features/posix-locks
>>        subvolumes home1-store
>> end-volume
>>
>> volume home2
>>        type protocol/client
>>        option transport-type tcp/client
>>        option remote-host 192.168.3.1
>>        option remote-subvolume home2
>> end-volume
>>
>> volume home
>>        type cluster/afr
>>        option read-subvolume home1
>>        subvolumes home1 home2
>> end-volume
>>
>> volume server
>>        type protocol/server
>>        option transport-type tcp/server
>>        subvolumes home home1
>>        option auth.ip.home.allow 127.0.0.1
>>        option auth.ip.home1.allow 192.168.*
>> end-volume
>>
>> volume home2-store
>>        type storage/posix
>>        option directory /gluster/home
>> end-volume
>>
>> volume home2
>>        type features/posix-locks
>>        subvolumes home2-store
>> end-volume
>>
>> volume home1
>>        type protocol/client
>>        option transport-type tcp/client
>>        option remote-host 192.168.0.1
>>        option remote-subvolume home1
>> end-volume
>>
>> volume home
>>        type cluster/afr
>>        option read-subvolume home2
>>        subvolumes home1 home2
>> end-volume
>>
>> volume server
>>        type protocol/server
>>        option transport-type tcp/server
>>        subvolumes home home2
>>        option auth.ip.home.allow 127.0.0.1
>>        option auth.ip.home2.allow 192.168.*
>> end-volume
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> address@hidden
>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>
>>




