gluster-devel


From: Krishna Srinivas
Subject: Re: [Gluster-devel] Re: Timeout settings and self-healing ? (WAS: HA failover test unsuccessful (inaccessible mountpoint))
Date: Mon, 28 Apr 2008 18:42:00 +0530

Guido,
Can you paste the server and client spec files again?
(They have been deleted from the pastebin.)
Make sure you are using unify on the client side and have set
transport-timeout to 10 seconds.
If possible, try to reproduce the problem you are seeing with a minimal
spec file.
Thanks
Krishna

On Sat, Apr 26, 2008 at 4:36 AM, Amar S. Tumballi <address@hidden> wrote:
>
>
> On Wed, Apr 23, 2008 at 3:47 AM, Guido Smit <address@hidden> wrote:
> > Krishna,
> >
> > I did the test. I killed glusterfsd on one server.
> > All tests (ls, df, cp) worked as they should; I didn't even notice any
> > difference. Unplugging the cable, however, blocked all operations, and
> > finally, after a few minutes, the transport endpoint message appeared.
> >
> >
> >
> >
> The problem with TCP/IP is that when you unplug the cable, no message is
> delivered to the application's poll() on the network socket. The driver
> internally tries to reconnect, and only after a long time (around 10+
> minutes when we tested) do we get a message saying "no route to host". But
> when the application dies on the server, or the server is shut down, the
> connected nodes get a notification, so everything goes smoothly. That is
> why there is a delay when the network cable is unplugged.
>
> We came up with a workaround for managing this delay: the
> 'transport-timeout' option, which times out each request after a certain
> time. The default is now 108 seconds. We kept it this high because a few
> applications that use mandatory locks (which block the write until a lock
> is freed) can easily take more than a minute to release the locks. Users
> can set 'transport-timeout' (in the protocol/client volume), so they can
> tune it to the I/O time of their applications.
>
> In our test setups, requests timed out exactly after the configured
> transport-timeout, every time. So we could not reproduce the issue of
> freezing indefinitely.
>
>
> Regards,
> Amar
>
>
>
> --
> Amar Tumballi
>  Gluster/GlusterFS Hacker
> [bulde on #gluster/irc.gnu.org]
> http://www.zresearch.com - Commoditizing Super Storage!
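
The idea behind a per-request timeout, as described in the quoted message, can be illustrated with a small Python sketch. This is not GlusterFS code. It assumes a local TCP server that accepts connections but never replies and never closes them, which stands in for a peer that has gone silent (as with an unplugged cable, where no close notification arrives). The names start_silent_server and request_with_timeout are hypothetical, chosen only for this illustration.

```python
import socket
import threading
import time


def start_silent_server():
    """Listen on localhost; accept connections but never reply or close.

    Stands in for a peer that has silently disappeared: the client
    receives neither data nor a connection-closed notification.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    held = []  # keep accepted sockets referenced so they stay open

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, held


def request_with_timeout(addr, payload, timeout_s):
    """Send one request; give up if no reply arrives within timeout_s.

    Returns the reply bytes, or None if the request timed out.
    """
    with socket.create_connection(addr, timeout=timeout_s) as sock:
        sock.settimeout(timeout_s)  # applies to each send/recv call
        sock.sendall(payload)
        try:
            return sock.recv(4096)
        except socket.timeout:
            return None


if __name__ == "__main__":
    srv, held = start_silent_server()
    start = time.monotonic()
    reply = request_with_timeout(srv.getsockname(), b"ping", 0.5)
    print("reply:", reply, "after %.2fs" % (time.monotonic() - start))
    srv.close()
```

Without the settimeout() call, recv() in this sketch would block for as long as the peer stays silent; with it, the caller regains control after roughly the configured interval, which is the behaviour a short transport-timeout is meant to provide.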



