Re: [Gluster-devel] EAGAIN/EBUSY handling in glusterfs

On Wed, Jan 23, 2013 at 1:34 AM, Shishir Gowda <address@hidden> wrote:

Hi Avati,

One possible scenario is someone taking an LVM snapshot of the backend.

Can you describe in more detail the exact operations for which an LVM snapshot returns EAGAIN or EINTR? EINTR in posix is best retried at the posix level. However, I'm not sure that an LVM snapshot actually makes the disk filesystem return these non-standard errors for any reason. Can you give an example strace of this happening?

Avati

A few examples:
DHT's rebalance: we would not retry a migration in case we got an EAGAIN or even an EINTR error.
Does self-heal retry healing if the error was EAGAIN or EINTR?

These are just a few I can think of.
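For illustration only, here is a minimal sketch of what a bounded retry around an internally triggered op could look like. The names `internal_op_fn`, `run_with_bounded_retry` and `flaky_op` are hypothetical and are not glusterfs APIs; the sketch assumes the op reports failure as a negative errno value, and it leaves open whether EAGAIN from a backend filesystem should be retried at all.

```c
#include <errno.h>

/* Hypothetical callback type (not a glusterfs API):
 * returns 0 on success or a negative errno value on failure. */
typedef int (*internal_op_fn)(void *ctx);

/* Run op up to max_attempts times, retrying only on EAGAIN or EINTR.
 * Returns the result of the last attempt, or -EINVAL if no attempt was made. */
int run_with_bounded_retry(internal_op_fn op, void *ctx, int max_attempts)
{
    int ret = -EINVAL;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        ret = op(ctx);
        if (ret != -EAGAIN && ret != -EINTR)
            break; /* success, or an error that a retry is not expected to fix */
    }
    return ret;
}

/* Illustrative op: fails with -EAGAIN until the counter in ctx reaches zero. */
int flaky_op(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -EAGAIN;
    }
    return 0;
}
```

The attempt limit keeps a persistently failing op from looping forever. A real caller would probably also want a delay between attempts, which is omitted here.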

When the snap feature becomes supported (refer to the wiki link in the previous mail), a few ops would be blocked while a snap is in progress.

If we decide to provide complete snap in the future (not just crash-consistent), then in all probability all fops will be blocked.

Do we guarantee that all internally triggered ops that fail will be re-triggered? Or are we guaranteeing a state from which we can recover completely?

With regards,
Shishir

----- Original Message -----
From: "Anand Avati" <address@hidden>
To: "Shishir Gowda" <address@hidden>
Cc: address@hidden
Sent: Wednesday, January 23, 2013 1:23:09 PM
Subject: Re: [Gluster-devel] EAGAIN/EBUSY handling in glusterfs

On Tue, Jan 22, 2013 at 10:39 PM, Shishir Gowda <address@hidden> wrote:

Hi All,

Currently I see that almost all the xlators in glusterfs do not handle EAGAIN/EBUSY errors.

Though this should be handled by the applications,

If by "handled by the application" you meant "handled by the application retrying the syscall", that is not completely true. It is generally true for EINTR, and in some places for EAGAIN (i.e., when used on non-blocking pollable file descriptors like sockets, which specifically does NOT include a filesystem for regular read/write). EBUSY almost never suggests a poll/retry to the application.
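A minimal sketch of that distinction in C, for illustration only. The helper names `read_retry_eintr` and `errno_suggests_retry` are hypothetical and not part of glusterfs; only standard POSIX calls are used.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: restart read() only when it was interrupted by a
 * signal (EINTR). Any other result is returned to the caller unchanged. */
ssize_t read_retry_eintr(int fd, void *buf, size_t len)
{
    ssize_t ret;

    do {
        ret = read(fd, buf, len);
    } while (ret == -1 && errno == EINTR);

    return ret;
}

/* Hypothetical classifier for the cases described above.
 * Returns 1 if retrying the same call is a reasonable reaction, else 0. */
int errno_suggests_retry(int err, int fd_is_nonblocking_pollable)
{
    if (err == EINTR)
        return 1;
    if (err == EAGAIN || err == EWOULDBLOCK)
        return fd_is_nonblocking_pollable ? 1 : 0;
    return 0; /* EBUSY and everything else: report the error to the caller */
}
```

For a non-blocking socket, the retry after EAGAIN would normally happen only after poll() or a similar call reports the descriptor as ready, rather than in a tight loop.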

there are multiple paths where the ops are not performed by the applications (but are internal to glusterfs).

A few of these are:
a. Rebalance
b. Replace brick
c. Self-heal
d. lk's
etc...

With the proposed snap feature (http://www.gluster.org/community/documentation/index.php/Features/snapshot), would it not be better to identify such ops inside glusterfs?

Can you explain more on that? Why is that necessary?

Thanks,
Avati

Irrespective of the snap feature, I think handling EAGAIN/EBUSY in these code paths is a matter of correctness.

Please comment.

With regards,
Shishir

_______________________________________________
Gluster-devel mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/gluster-devel

From:	Anand Avati
Subject:	Re: [Gluster-devel] EAGAIN/EBUSY handling in glusterfs
Date:	Wed, 23 Jan 2013 10:48:42 -0800