
Re: [Gluster-devel] Proposal to change locking in data-self-heal


From: Anand Avati
Subject: Re: [Gluster-devel] Proposal to change locking in data-self-heal
Date: Wed, 22 May 2013 18:32:09 -0700




On Wed, May 22, 2013 at 5:57 AM, Pranith Kumar Karampuri <address@hidden> wrote:


----- Original Message -----
> From: "Anand Avati" <address@hidden>
> To: "Jeff Darcy" <address@hidden>
> Cc: "Pranith Kumar Karampuri" <address@hidden>, "Anand Avati" <address@hidden>, "Raghavan Pichai"
> <address@hidden>, "Ravishankar Narayanankutty" <address@hidden>, "devel" <address@hidden>
> Sent: Wednesday, May 22, 2013 1:19:19 AM
> Subject: Re: [Gluster-devel] Proposal to change locking in data-self-heal
>
> On Tue, May 21, 2013 at 7:05 AM, Jeff Darcy <address@hidden> wrote:
>
> > On 05/21/2013 09:10 AM, Pranith Kumar Karampuri wrote:
> >
> >> Scenario-1 won't happen because truncate gets a chance to acquire its
> >> full-file lock after each 128k range sync completes.
> >>
> >> Scenario-2 won't happen because extra self-heals launched on the same
> >> file will be blocked in the self-heal domain, so the data-path's locks
> >> are not affected.
> >>
> >
> > This is two changes for two scenarios:
> >
> > * Change granular-lock order to avoid scenario 1.
> >
> > * Add a new lock to avoid scenario 2.
> >
> > I'm totally fine with the second part, but the first part makes me a bit
> > uneasy.  As I recall, the "chained" locking (lock second range before
> > unlocking the first) was a deliberate choice to avoid windows where
> > somebody could jump in with a truncate during self-heal.  It certainly
> > wasn't the obvious thing to do, suggesting there was a specific reason.
>
>
> Chained locking really was to avoid the start of a second self-heal while
> one is in progress. Because the healer gives up the big lock only after
> holding the small lock, regional operations can progress, but a second
> self-heal waiting for the big lock is still held off. Truncate isn't
> really an issue as long as the truncate transaction locks from the new
> EOF till infinity (which it does), so self-heal will not race in those
> regions. Of course, with chained locking, full-file truncates can starve.
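
A minimal sketch of this chained (hand-over-hand) locking pattern, in C.
The byte-range lock table below is a toy stand-in for GlusterFS's inodelk,
and every name in it is illustrative rather than the real API; requests
from the same owner are allowed to overlap, loosely mirroring inodelk's
lk-owner behaviour:

#include <pthread.h>
#include <stdint.h>

#define MAX_HELD 16
#define REGION   (128 * 1024ULL)    /* self-heal sync unit            */
#define INF      UINT64_MAX         /* "till infinity"                */

/* Toy byte-range lock table: a request waits while any granted range
 * from a different owner overlaps it. */
struct range { uint64_t start, end; int owner, used; };

static struct range    held[MAX_HELD];
static pthread_mutex_t tbl = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;

static int conflicts(int owner, uint64_t s, uint64_t e)
{
    for (int i = 0; i < MAX_HELD; i++)
        if (held[i].used && held[i].owner != owner &&
            s <= held[i].end && held[i].start <= e)
            return 1;
    return 0;
}

static void range_lock(int owner, uint64_t s, uint64_t e)
{
    pthread_mutex_lock(&tbl);
    while (conflicts(owner, s, e))
        pthread_cond_wait(&cv, &tbl);
    for (int i = 0; i < MAX_HELD; i++)
        if (!held[i].used) {
            held[i] = (struct range){ s, e, owner, 1 };
            break;
        }
    pthread_mutex_unlock(&tbl);
}

static void range_unlock(int owner, uint64_t s, uint64_t e)
{
    pthread_mutex_lock(&tbl);
    for (int i = 0; i < MAX_HELD; i++)
        if (held[i].used && held[i].owner == owner &&
            held[i].start == s && held[i].end == e) {
            held[i].used = 0;
            break;
        }
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&tbl);
}

/* Chained self-heal: take the lock on the NEXT 128k region before
 * releasing the current one, so some range is held at every instant.
 * A second self-heal starts by requesting [0, INF), which overlaps
 * whatever region is currently held, so it stays blocked until this
 * heal finishes.  A full-file truncate's [new-EOF, INF) lock waits
 * for the same reason -- which is exactly why it can starve. */
void chained_self_heal(int me, uint64_t file_size)
{
    range_lock(me, 0, INF);           /* big lock: inspect changelog  */
    range_lock(me, 0, REGION - 1);    /* small lock on first region   */
    range_unlock(me, 0, INF);         /* regional I/O elsewhere may   */
                                      /* now proceed                  */
    for (uint64_t off = 0; off < file_size; off += REGION) {
        /* ... sync bytes [off, off + REGION) here ... */
        if (off + REGION < file_size)
            range_lock(me, off + REGION, off + 2 * REGION - 1);
        range_unlock(me, off, off + REGION - 1);
    }
}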
>
> It's not obvious what use case Pranith is facing where self-heal and
> ftruncate() are competing. Is it just an artificial/hand-crafted test case,
> or a real-life access pattern?

artificial.

>
>
> > Until we recall what that reason was, I don't think we can determine
> > whether this is a bug or a feature.  If it's a feature, or if we don't
> > know, this change is likely to break something instead of fixing it.
>
>
> The sole reason was to prevent the start of a second self-heal. Having
> self-heals serialize in a separate domain solves the problem (except if we
> are looking at runtime compatibility across multiple versions in this
> scenario.. aargh!)

So you guys are OK with this proposal if we solve version compatibility issues?


I guess you can even just ignore the backward-compatibility issue in this case. Old servers will "safely" block themselves against the big data lock, causing starvation just the way things work today. Once all servers are upgraded, all self-heals will gracefully block themselves in the new domain. There isn't much compatibility handling needed, actually.

Avati
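
For comparison, a sketch of the proposed scheme on top of the same toy
lock table as above.  The self-heal domain is modeled here as a plain
mutex; in GlusterFS an inodelk carries a domain string, and locks in
different domains never conflict, which is what lets the data path stay
untouched:

/* Self-heals serialize on a full-file lock in a separate domain, so
 * chaining is no longer needed in the data domain. */
static pthread_mutex_t shd_domain = PTHREAD_MUTEX_INITIALIZER;

void domain_self_heal(int me, uint64_t file_size)
{
    /* A second self-heal blocks HERE, in the self-heal domain,
     * instead of against the data-path locks. */
    pthread_mutex_lock(&shd_domain);

    for (uint64_t off = 0; off < file_size; off += REGION) {
        range_lock(me, off, off + REGION - 1);      /* data domain    */
        /* ... sync bytes [off, off + REGION) here ... */
        range_unlock(me, off, off + REGION - 1);
        /* No chaining: nothing is held in the data domain between
         * regions, so a waiting truncate's [new-EOF, INF) request can
         * be granted -- full-file truncates no longer starve. */
    }

    pthread_mutex_unlock(&shd_domain);
}

This also illustrates why the compatibility story above is benign: a
not-yet-upgraded healer that still takes the big data-domain lock simply
queues behind the region locks, exactly as today.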

