Re: [Gluster-devel] split brain


From: Jeff Darcy
Subject: Re: [Gluster-devel] split brain
Date: Thu, 16 Aug 2012 09:58:10 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0

On 08/16/2012 04:55 AM, Emmanuel Dreyfus wrote:
> On all bricks, .glusterfs/3e/6b/3e6b026a-b9ed-4845-a5d1-6eb06412b3ca
> is a symlink to the directory
> .glusterfs/4b/34/4b34a8a2-bff2-4684-b005-a36b069914ab/arch
>
> I am a bit surprised to see a link to a subdir of a .glusterfs hash.
> Is it something that makes sense? Or is it again a link(2) that should
> be replaced by linkat(2)?

It's part of the GFID-based back end that's new in 3.3.  Amar would be the
expert, but he's on leave right now.  Avati could probably also provide decent
answers.  I admit that I don't understand all of the nuances well enough to do 
so.
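
Judging only from the paths and xattrs quoted in this thread, the name under .glusterfs appears to be built from the trusted.gfid value: the first two hex pairs become two directory levels, followed by the GFID in hyphenated UUID form. The sketch below reproduces that mapping. The function name is made up for illustration, and the layout is an inference from this one example, not a statement of how the 3.3 back end is specified.

```python
import uuid
from pathlib import PurePosixPath


def gfid_backend_path(gfid_hex: str) -> PurePosixPath:
    """Map a trusted.gfid hex string to its apparent .glusterfs entry.

    Assumed layout (inferred from the paths quoted in this thread):
    .glusterfs/<first 2 hex digits>/<next 2 hex digits>/<gfid as UUID>
    """
    gfid = str(uuid.UUID(hex=gfid_hex))  # adds the hyphens, lowercases
    return PurePosixPath(".glusterfs") / gfid[0:2] / gfid[2:4] / gfid


# The trusted.gfid value reported for the directory on every brick.
print(gfid_backend_path("3e6b026ab9ed4845a5d16eb06412b3ca"))
```

For the GFID above this yields the same relative path that was reported as a symlink on all bricks.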

> Here are the xattr for the directory:
> 
> gfs33-client-0
> trusted.glusterfs.dht   00000001000000007fffffffffffffff
> trusted.afr.gfs33-client-1      000000000000000200000000
> trusted.afr.gfs33-client-0      000000000000000000000000
> trusted.gfid    3e6b026ab9ed4845a5d16eb06412b3ca
> 
> gfs33-client-1
> trusted.glusterfs.dht   00000001000000007fffffffffffffff
> trusted.afr.gfs33-client-1      000000000000000000000000
> trusted.afr.gfs33-client-0      000000000000000100000000
> trusted.gfid    3e6b026ab9ed4845a5d16eb06412b3ca
> 
> gfs33-client-2
> trusted.glusterfs.dht   0000000100000000000000007ffffffe
> trusted.afr.gfs33-client-3      000000000000000000000000
> trusted.afr.gfs33-client-2      000000000000000000000000
> trusted.gfid    3e6b026ab9ed4845a5d16eb06412b3ca
> 
> gfs33-client-3
> trusted.afr.gfs33-client-3      00000000000000000000000000
> trusted.afr.gfs33-client-2      00000000000000000000000000
> trusted.glusterfs.dht   0000000100000000000000007ffffffe
> trusted.gfid    3e6b026ab9ed4845a5d16eb06412b3ca

OK, here's something I'm much more comfortable with.  Note how this differs
from what you presented earlier, where the non-zero values were on client-0
pointing to client-1 and client-3 pointing to client-2.  Now we still have
client-0 pointing to client-1, but also client-1 pointing to client-0.  That's
a true split brain; operations seem to have completed on each node that didn't
complete on the other, so we don't know which values should take precedence.
The way I'd fix it would be to clear (not remove) one of the non-zero
trusted.afr xattrs, and let self-heal do the rest.
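
To make the "each side points at the other" check concrete, here is a minimal sketch that decodes two trusted.afr values and reports where both bricks accuse each other. It assumes each value is 12 bytes holding three big-endian 32-bit pending counters, in the order data, metadata, entry; that reading comes from the hekafs posts linked further down and should be treated as an assumption here. The function names are hypothetical and this is not GlusterFS code.

```python
import struct

# Assumed counter order inside a trusted.afr.* value.
COUNTERS = ("data", "metadata", "entry")


def decode_afr(hex_value: str) -> dict:
    """Split a trusted.afr.* hex value into its three pending counters."""
    raw = bytes.fromhex(hex_value)
    if len(raw) != 12:
        raise ValueError(f"expected 12 bytes, got {len(raw)}")
    return dict(zip(COUNTERS, struct.unpack(">III", raw)))


def split_brain_counters(a_about_b: str, b_about_a: str) -> list:
    """Return the counters that are non-zero in BOTH directions."""
    a = decode_afr(a_about_b)
    b = decode_afr(b_about_a)
    return [name for name in COUNTERS if a[name] and b[name]]


# Values quoted above for the first replica pair.
on_client0_about_client1 = "000000000000000200000000"
on_client1_about_client0 = "000000000000000100000000"

print(decode_afr(on_client0_about_client1))
print(split_brain_counters(on_client0_about_client1,
                           on_client1_about_client0))
```

Under that assumed layout, both quoted values carry a non-zero metadata counter, which matches the mutual accusation described above. Clearing one side would mean writing the all-zero value (24 hex zeros) to that xattr while leaving the xattr itself in place.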

> I understand pending are the trusted.afr from the bricks,
> but what do they represent, by the way?

These two posts explain it about as well as I'm able:

http://hekafs.org/index.php/2011/04/glusterfs-extended-attributes/
http://hekafs.org/index.php/2012/03/glusterfs-algorithms-replication-present/



-- 

ObSig: if you use "ask" as a noun I will ignore you for a week.


