gluster-devel

Re: [Gluster-devel] Need review for client-reopen changes


From: Raghavendra Gowdappa
Subject: Re: [Gluster-devel] Need review for client-reopen changes
Date: Mon, 7 Jan 2013 06:16:59 -0500 (EST)

Pranith,

This comment is on the second patch. While the implementation looks fine, I have 
some concerns about the idea itself. Consider the following situation with a 
replicate volume of two subvolumes:

1. Process 1 (p1) acquires a mandatory lock on a file.
2. The first server is stopped and its disk is replaced.
3. The reopen of the fd opened by p1 fails (since the file is not present).
4. Self-heal completes and the parent is notified that the child is up. However, 
the fd is still not reopened.
5. Now it is possible for another process, p2, to successfully write to another 
fd opened on the same file (on server1), since the lock (from p1) has not yet 
been re-acquired on server1.
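
The five steps above can be replayed in a small toy model. This is only a sketch 
of the reasoning, not GlusterFS code: the Brick class, its methods, and the file 
name "f" are all hypothetical, and a mandatory lock is modelled simply as "only 
the holder may write".

```python
# Toy model of the scenario above; none of these names are GlusterFS APIs.

class Brick:
    """One replica: the files it holds and who owns a mandatory lock on each."""

    def __init__(self, name):
        self.name = name
        self.files = set()
        self.lock_owner = {}  # path -> owning process

    def lock(self, path, owner):
        if path not in self.files:
            return False  # cannot lock a file that is not present
        if self.lock_owner.get(path, owner) != owner:
            return False  # another process holds the lock
        self.lock_owner[path] = owner
        return True

    def write(self, path, writer):
        if path not in self.files:
            return False
        holder = self.lock_owner.get(path)
        # A mandatory lock blocks writes from every process except the holder.
        return holder is None or holder == writer


def run_scenario():
    s1, s2 = Brick("server1"), Brick("server2")
    for brick in (s1, s2):
        brick.files.add("f")

    # Step 1: p1 acquires the lock on both replicas.
    assert s1.lock("f", "p1") and s2.lock("f", "p1")

    # Step 2: server1's disk is replaced, so its files and lock state are gone.
    s1 = Brick("server1")

    # Step 3: the reopen (with lock re-acquisition) for p1 fails: no file yet.
    reopened = s1.lock("f", "p1")

    # Step 4: self-heal recreates the file, but p1's fd is still not reopened.
    s1.files.add("f")

    # Step 5: p2 writes through another fd on the same file.
    return reopened, s1.write("f", "p2"), s2.write("f", "p2")
```

In this model run_scenario() reports that the reopen failed, that p2's write 
is accepted on server1, and that the same write is refused on server2, which 
is the window described in step 5.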

A similar situation can arise even without this patch, but only when p1 and p2 
are not running on the same mount point. With this patch it can happen even on 
a single mount point. I am not sure whether we can ignore this corner case. 
Others, please let us know your opinion on this.

regards,
Raghavendra.

----- Original Message -----
> From: "Pranith Kumar Karampuri" <address@hidden>
> To: "devel" <address@hidden>
> Cc: "Raghavendra Gowdappa" <address@hidden>, "Krishnan Parthasarathi" 
> <address@hidden>, "Jeff Darcy"
> <address@hidden>, "Amar Tumballi" <address@hidden>
> Sent: Monday, January 7, 2013 10:15:36 AM
> Subject: Need review for client-reopen changes
> 
> hi,
> http://review.gluster.org/#change,4357
> http://review.gluster.org/#change,4358
> 
> are the changes I made to handle re-opens of files in the case where
> a disk is replaced while a brick is offline. The idea is to attempt
> re-opens once self-heal completes and the file can be opened.
> With these changes, readv/fxattrop/writev/finodelk for fds with
> remote-fd -1 are attempted using anon-fds, and if the fop succeeds
> a re-open is attempted on every 1024th success. 1024 is an
> arbitrary number I used. The re-open of a file could fail because of
> posix lock re-acquisition failure, which is why re-opens are
> attempted periodically (for every 1024 successful fops on that fd).
> 
> I think the re-attempt logic could be better.
> For instance, we can attempt a re-open on the first success on the anon-fd
> instead of waiting until the 1024th success, and if this re-open fails we
> could fall back on 'periodic attempts', i.e. for every 1024 successes
> on the anon-fd.
> 
> Let me know your thoughts.
> 
> Pranith
> 
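
The two re-attempt schemes described in the quoted message can be compared with 
a short sketch. It is an illustration only and is not taken from the patches: 
ReopenPolicy, should_attempt_reopen, try_first and attempts are hypothetical 
names, and it assumes the periodic count keeps running from the first 
successful fop on the anon-fd.

```python
REOPEN_INTERVAL = 1024  # the arbitrary interval mentioned in the mail


class ReopenPolicy:
    """Decides after which successful anon-fd fops a re-open is attempted."""

    def __init__(self, try_first=False, interval=REOPEN_INTERVAL):
        self.try_first = try_first  # also attempt on the very first success
        self.interval = interval
        self.successes = 0

    def should_attempt_reopen(self):
        """Call once after each successful fop on the anon-fd."""
        self.successes += 1
        if self.try_first and self.successes == 1:
            return True
        return self.successes % self.interval == 0


def attempts(policy, n_fops):
    """Return the 1-based fop counts at which a re-open would be attempted."""
    return [i for i in range(1, n_fops + 1) if policy.should_attempt_reopen()]
```

Under these assumptions, attempts(ReopenPolicy(), 2048) gives [1024, 2048], 
while attempts(ReopenPolicy(try_first=True), 2048) gives [1, 1024, 2048]: the 
second scheme only adds one early attempt and otherwise behaves the same.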


