Re: [Gluster-devel] solutions for split brain situation


From:	Mark Mielke
Subject:	Re: [Gluster-devel] solutions for split brain situation
Date:	Fri, 18 Sep 2009 10:39:20 -0400
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.1) Gecko/20090814 Fedora/3.0-2.6.b3.fc11 Thunderbird/3.0b3

On 09/18/2009 06:44 AM, Stephan von Krawczynski wrote:

Only, we talked about OPEN and not REMOVE. Your example comes up with a broken
replicate situation for a remove and you correctly say that after the second
subvolume goes alive again the remove should be completely done.
You may well ask the journal for that, and it tells you that the file on the
second should be removed. Now, if you enter the same situation for a local fed
file on secondary I would simply suggest - since there is no journal telling
you to remove - that the file should be valid and _not_ removed, but
replicated to node 1.
Since this decision can be taken based on the journal, both setups have a
valid answer. Still there is no race. Open in first setup fails, open in
second setup succeeds. Nevertheless both open tries need a stat to check for
the files' existence. The first stat finds the file should be gone, the second
stat replicates the file to node 1 and open can succeed. And guess what:
exactly that happens on glusterfs. If you stat a file that is only available
on secondary node, it gets replicated.

I failed to mention another consideration - the journal doesn't live forever. In most systems, the journal is either removed or marked as "done" as the backend storage is successfully updated. In many systems, such as PostgreSQL, the journal can be thought of as an infinite size list of instructions that if processed in order, will result in the correct backing data. With this model, the journal is front and foremost. If the journal does not say the file was created, then the file shouldn't be considered to exist even if it is found in the backing data.
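To make the journal-first model concrete, here is a minimal sketch in Python. It is purely illustrative: the entry format (`op`, `path`) and the function names `replay` and `stray_files` are made up for this example and are not how PostgreSQL or GlusterFS store or process their journals.

```python
def replay(journal):
    """Apply journal entries in order and return the set of paths that should exist.

    `journal` is a hypothetical list of (op, path) tuples, where op is
    "create" or "remove".
    """
    files = set()
    for op, path in journal:
        if op == "create":
            files.add(path)
        elif op == "remove":
            files.discard(path)
        else:
            raise ValueError(f"unknown journal op: {op!r}")
    return files


def stray_files(journal, backing_store):
    """Return paths found in the backing store that the journal never created.

    Under a strict journal-first model these are not considered to exist,
    even though they are physically present.
    """
    return set(backing_store) - replay(journal)


journal = [("create", "a"), ("create", "b"), ("remove", "a")]
on_disk = {"b", "c"}

expected = replay(journal)             # only "b" should exist
strays = stray_files(journal, on_disk) # "c" is on disk but unknown to the journal
```

In this model "c" would be discarded rather than replicated, which is exactly the behaviour being argued against for the locally fed file.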

That all said - the GlusterFS representative responded they have chosen to err on the side of "conservative", where they choose to keep the file if they cannot find proof that it should be removed, which unintentionally supports your model. This being the case, it does lead to supporting your position, as you are also looking for "conservative" behaviour in the case of an error path during self-heal and backend consistency checks.
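The "conservative" rule described above can be sketched as a small decision function. This is an assumption-laden illustration, not GlusterFS code: `Action`, `heal_action`, and the idea of passing the journal's recorded removals as a plain set are all hypothetical.

```python
from enum import Enum


class Action(Enum):
    REMOVE = "remove"        # journal proves the file was deleted
    REPLICATE = "replicate"  # no proof of deletion: keep the file and copy it over


def heal_action(path, journaled_removals):
    """Decide what to do with a file that exists on only one subvolume.

    `journaled_removals` is a hypothetical set of paths for which the
    journal records a remove. Without such a record the file is kept.
    """
    if path in journaled_removals:
        return Action.REMOVE
    return Action.REPLICATE


removals = {"/data/old.log"}
first = heal_action("/data/old.log", removals)   # removal was journaled
second = heal_action("/data/new.txt", removals)  # nothing journaled, so keep it
```

The point of the sketch is only that absence of a journal entry defaults to keeping the data, which is the opposite default from a strict journal-first system.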

The point here, is that the journal SHOULD be consulted.

You omitted the most important word: "too". The journal should be consulted
too. Nevertheless it cannot be the only reason for decision.

If this were a database system, the journal would trump the data every time, and nothing should go into backend storage without going through the journal. But, if the authors want to relax this to support other models (such as a backend file system restore where the backup process, or the restore process (fsck), strips the extended attributes, effectively burning the journal into smoke and ash), then it seems like you have a valid point. :-)


Cheers,
mark

--
Mark Mielke <address@hidden>
