
Re: [Gluster-devel] solutions for split brain situation


From: Mark Mielke
Subject: Re: [Gluster-devel] solutions for split brain situation
Date: Mon, 14 Sep 2009 10:25:40 -0400
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.1) Gecko/20090814 Fedora/3.0-2.6.b3.fc11 Thunderbird/3.0b3

On 09/14/2009 08:06 AM, Stephan von Krawczynski wrote:
> we have seen several split brain situations and think that the most common
> option for the situation is simply missing. You can define a favourite child,
> but you cannot define to use the latest file copy as definitive. Why not?
> Isn't it a logical approach to say that the latest copy of a file based on
> mtime must be the most up-to-date and therefore being used in split brain
> recovery?

Latest is *a* resolution, but it's probably not 100% the right answer for everybody. I don't think I would use it. If the file system is forked - and one client is doing one thing, and another is doing another thing - there is no clear answer. Split brain in general is bad. My personal conclusion on the matter is: 1) I want to make sure that only one server is modifying one file at one time, and only cut over if the master goes down, *or* 2) I want to lock a majority of the servers before allowing a transaction to start, such that split brain should not occur. For a 3-node cluster, this means requiring 2 locks.
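
To make the arithmetic in option 2 concrete, here is a minimal sketch of the majority rule. The function names are hypothetical and have nothing to do with the actual GlusterFS locking code; it only shows how many locks count as a majority.

```python
def majority(n_servers: int) -> int:
    """Smallest number of locks that is a strict majority of n_servers."""
    if n_servers < 1:
        raise ValueError("need at least one server")
    return n_servers // 2 + 1


def may_start_transaction(locks_held: int, n_servers: int) -> bool:
    """Hypothetical check: only proceed once a majority of servers is locked."""
    return locks_held >= majority(n_servers)


# For a 3-node cluster, 2 locks are required.
print(majority(3))
```

Two partitions cannot both hold a strict majority at the same time, which is why this rule is meant to prevent split brain rather than repair it afterwards.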

I don't think I would rely on self-healing of split-brain for a production service. Just my opinion.

If I did want to make a "best choice", though - I think I would choose "the volume associated with the longest-running glusterfsd that is also actively ping accessible". It's not perfect either, but at least it maximizes the chance that this is the copy most users would have seen and based their decisions upon.
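
As a rough sketch of that "best choice" rule: the `Replica` record and its fields below are made up for illustration, and how uptime and ping reachability would actually be collected from glusterfsd is left open.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Replica:
    """Hypothetical view of one server-side copy (not a GlusterFS structure)."""
    name: str
    uptime_seconds: float  # how long its glusterfsd has been running
    reachable: bool        # whether it currently answers pings


def preferred_replica(replicas: Iterable[Replica]) -> Optional[Replica]:
    """Pick the reachable replica whose daemon has been running the longest."""
    candidates = [r for r in replicas if r.reachable]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.uptime_seconds)
```

An unreachable server is never chosen, however long it has been up, since nobody could have been using it recently.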

> Currently it seems that there is no real choice besides a defined favourite
> child, the file action is only distributed between the children, which means
> you just get a subset of old file copies.
> I'd say the solution has to be placed somewhere at
> xlators/cluster/afr/src/afr-self-heal-data.c lines 855 ff.
> I have no idea though how to find out what the latest copy is ...
> Comments?

Look at the stat() results for each of the files, and track the latest mtime. But for two processes actively writing, this is still rolling a die. In fact, just because a copy is the latest now doesn't mean it will be the latest 2 seconds from now...
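
A minimal sketch of the stat() approach, written in Python rather than against the AFR code itself. It assumes each copy is visible as a local path, which is a simplification; inside the translator the stat results would come from the children instead.

```python
import os
from typing import Iterable, Optional


def latest_copy(paths: Iterable[str]) -> Optional[str]:
    """Return the path with the newest mtime, or None if no copy can be stat'ed."""
    best_path: Optional[str] = None
    best_mtime_ns: Optional[int] = None
    for path in paths:
        try:
            st = os.stat(path)
        except OSError:
            # A missing or unreadable copy simply does not take part.
            continue
        # Strict comparison: on a tie, the copy seen first is kept.
        if best_mtime_ns is None or st.st_mtime_ns > best_mtime_ns:
            best_path, best_mtime_ns = path, st.st_mtime_ns
    return best_path
```

The result is only a snapshot. If a writer touches another copy right after the loop finishes, the answer is already stale, and clock differences between servers can skew the comparison as well.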

Cheers,
mark

--
Mark Mielke <address@hidden>




