Re: [Gluster-devel] about afr


From: nicolas prochazka
Subject: Re: [Gluster-devel] about afr
Date: Tue, 3 Feb 2009 16:15:11 +0100

Without the performance translators, the result is the same.
I will try with gdb as soon as possible.
You say that EBADFD is fine because AFR will try the operation on the other server. OK,
I understand, but if I then stop that other server, gluster cannot fall back to the first one, which is the one reporting EBADFD.
I think a lot of my problems come from this, because with my two servers,
if I stop the first, restart it, wait, then stop the second and restart it, everything fails.
If I only stop the first and test, everything is OK.
Nicolas

On Tue, Feb 3, 2009 at 3:50 PM, Krishna Srinivas <address@hidden> wrote:
Nicolas,

When you restart the server, log messages indicating EBADFD are fine; AFR
will try the operation on the other server. When you are in the situation
where the glusterfs client hangs, can you attach gdb to the glusterfs
process and mail us the backtrace?

gdb -p <pid of glusterfs>
type "bt" at the gdb command prompt.

Just want to confirm that glusterfs has not blocked in a system call
(as we have non-blocking I/O now).
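
The gdb steps above can also be run without an interactive session, which makes the backtrace easier to attach to a mail. This is only a minimal sketch under stated assumptions: gdb is installed, you have permission to attach to the process (same user or root), and the helper names bt_file_for and dump_backtraces and the output file naming are hypothetical. It uses "thread apply all bt" instead of plain "bt" so that every thread is captured, not just the current one.

```shell
#!/bin/sh
# Sketch: dump backtraces of running processes with gdb in batch mode.
# Helper names and output file names are illustrative assumptions.

# Print the output file name used for a given PID.
bt_file_for() {
    printf 'glusterfs-bt-%s.txt' "$1"
}

# Attach gdb to each PID given as an argument, print the backtrace of
# every thread, then detach. Output (including gdb errors) goes to one
# file per PID. With no arguments this does nothing.
dump_backtraces() {
    for pid in "$@"; do
        gdb -p "$pid" -batch -ex "thread apply all bt" \
            > "$(bt_file_for "$pid")" 2>&1
    done
}
```

For example, after sourcing the file, `dump_backtraces $(pidof glusterfs)` would write one backtrace file per glusterfs process, assuming pidof is available and the process is named glusterfs.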

Can you see if removing the performance translators helps? If it does, we
can narrow the problem down to a specific translator.

Krishna

On Tue, Feb 3, 2009 at 5:18 PM, nicolas prochazka
<address@hidden> wrote:
> OK,
> So now I know there are a few bugs:
>
> 1 - When I stop and restart a server, I get the EBADFD bug
> 2 - When I stop a server:
>       - with --disable-direct-io-mode: my big image file becomes corrupt
> (missing data ...)
>       - without --disable-direct-io-mode: my process hangs and the CPU load
> grows a lot (per process)
>
> Any ideas?
>
> Regards,
> Nicolas Prochazka
>
>  On Tue, Feb 3, 2009 at 5:42 AM, Raghavendra G <address@hidden>
> wrote:
>>
>> Hi Nicolas,
>>
>> On Tue, Feb 3, 2009 at 12:01 AM, nicolas prochazka
>> <address@hidden> wrote:
>>>
>>> I inspected the log and found something interesting:
>>> All is OK,
>>> I stopped 10.98.98.2 and restarted it:
>>>
>>> 2009-02-02 15:00:32 D [client-protocol.c:6498:notify] brick_10.98.98.2:
>>> got GF_EVENT_CHILD_UP
>>> 2009-02-02 15:00:32 D [socket.c:924:socket_connect] brick_10.98.98.2:
>>> connect () called on transport already connected
>>> 2009-02-02 15:00:32 N [client-protocol.c:5786:client_setvolume_cbk]
>>> brick_10.98.98.2: connection and handshake succeeded
>>> 2009-02-02 15:00:40 D [fuse-bridge.c:1945:fuse_statfs] glusterfs-fuse:
>>> 17399: STATFS
>>> 2009-02-02 15:00:40 D [fuse-bridge.c:368:fuse_entry_cbk] glusterfs-fuse:

