Re: [Gluster-devel] ping timeout

gluster-devel

From:	Gordan Bobic
Subject:	Re: [Gluster-devel] ping timeout
Date:	Thu, 25 Mar 2010 09:56:24 +0000
User-agent:	Thunderbird 2.0.0.22 (X11/20090625)

Michael Cassaniti wrote:

On 03/25/10 10:21, Gordan Bobic wrote:
Christopher Hawkins wrote:
Correct me if I'm wrong, but something I would add to this debate is the type of split brain we are talking about. Glusterfs is quite different from GFS or OCFS2 in a key way, in that it is an overlay FS that uses locking to control who writes to the underlying files and how they do it.
It is not a cluster FS the way GFS is a cluster FS. For example if GFS has split brain, then fencing is the only thing preventing the complete destruction of all data as both nodes (assuming only two) write to the same disk at the same time and utterly destroy the filesystem. But glusterfs is passing writes to EXT3 or whatever, and at worst you get out of date files or lost updates, not a useless partition that used to have your data...
I think less stringent controls are appropriate in this case, and that GFS / OCFS2 are entirely different animals when it comes to how severe a split brain can be. They MUST be strict about fencing, but with Glusterfs you have a choice about how strict you need it to be.
Not really. The only reason it is less bad is because the corruption will affect individual files, rather than the complete file system. Granted, this is much better than hosing the entire file system, but the fact remains that you get left with files that cannot be healed without manual intervention or explicitly specifying which node should win with the favorite-child option.
Gordon,
Can you suggest how you would successfully manage to get the first node in your scenario in sync?

The point is that getting things in sync after split-brain isn't possible without throwing away at least some changes. The only way to deal with it is to not allow it to desync in the first place.
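To make the point about unavoidable loss concrete, here is a minimal sketch in Python. It assumes a simplified, hypothetical model in which each replica counts the writes it accepted that its peer never confirmed; the `Replica` class, the `pending` field and `classify` are illustrative names, not GlusterFS APIs. When only one side has unconfirmed writes there is an obvious heal direction, but when both sides do, any automatic choice discards one side's changes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Replica:
    name: str
    # Hypothetical bookkeeping: pending[peer] is the number of writes this
    # replica accepted that `peer` has not confirmed.
    pending: Dict[str, int] = field(default_factory=dict)


def classify(a: Replica, b: Replica) -> str:
    """Decide whether two replicas can be healed automatically."""
    a_blames_b = a.pending.get(b.name, 0) > 0
    b_blames_a = b.pending.get(a.name, 0) > 0
    if a_blames_b and b_blames_a:
        # Both sides hold writes the other never saw: picking either one
        # as the source throws away the other's changes.
        return "split-brain"
    if a_blames_b:
        return f"heal {b.name} from {a.name}"
    if b_blames_a:
        return f"heal {a.name} from {b.name}"
    return "in sync"
```

For example, `classify(Replica("node1", {"node2": 1}), Replica("node2", {"node1": 2}))` reports split-brain, which is the case that needs manual intervention or a pre-selected winner.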

If I have your mentioned scenario right, including what you believe should happen:


    * First node goes down. Simple enough.
    * Second node has new file operations performed on it that the first
      node does not get.
    * First node comes up. It is completely fenced from all other
      machines to get itself in sync with the second node.
    * Second node goes down. Is it before/after first node is synced?
          o If it is before then you have a fully isolated FS that is
            not accessible.
          o If it is after then you don't have a problem.

I would suggest writing a script and performing some firewalling to perform the fencing.

This is not really good enough - you need an out-of-band fencing device that you can use to forcibly down the node that disconnected, e.g. remote power-off by power management (e.g. UPS or a network controllable power bar) or remote server management (Dell DRAC, Raritan eRIC G4, HP iLO, Sun LOM, etc.). When the node gets rebooted, it has to notice there are other nodes already up and specifically set itself into such a mode that it will lose any contest on being the source node for resync until it has fully checked all the files' metadata against its peers.
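The rejoin rule described above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `Node`, its `verified` flag, the `version` counter and `pick_resync_source` are hypothetical names, not part of GlusterFS. The idea is that a freshly rebooted node starts unverified and is excluded from source selection, however recent its own data appears to be.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    verified: bool  # False after a reboot until metadata is checked against peers
    version: int    # hypothetical per-node change counter


def pick_resync_source(nodes: List[Node]) -> Optional[Node]:
    """Return the node to resync from, ignoring unverified nodes."""
    candidates = [n for n in nodes if n.verified]
    if not candidates:
        # Nobody is trustworthy yet, so no automatic resync should start.
        return None
    return max(candidates, key=lambda n: n.version)
```

With `Node("node1", verified=False, version=9)` and `Node("node2", verified=True, version=7)`, the function returns `node2`: the rebooted node loses the contest despite its higher counter.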

I believe you can run ls -R on the file-system to get it in sync. You would need to mount glfs locally on the first node, get it in sync, then open the firewall ports afterward. Is that an appropriate solution?
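The crawl suggested in the quoted paragraph can be sketched in Python as below. This assumes that looking up each entry through the client mount is what prompts the replicate translator to check and heal it, which is my understanding of GlusterFS at the time rather than something stated in this thread. The `crawl` function and the mount path are hypothetical.

```python
import os


def crawl(mount_point: str) -> int:
    """Stat every entry under mount_point; return how many were visited."""
    visited = 0
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            try:
                # The lookup is the point; the stat result itself is unused.
                os.lstat(os.path.join(dirpath, name))
                visited += 1
            except OSError:
                # Entries that vanish or cannot be read are skipped.
                pass
    return visited
```

Usage would be something like `crawl("/mnt/glusterfs")`, with the path replaced by wherever the volume is mounted locally.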

The problem is that firewalling would have to be applied by every node other than the node that dropped off, and this would need to be communicated to all the other nodes, and they would have to confirm before the fencing action is deemed to have succeeded. This is a lot more complex and error-prone compared to just using a single point of fencing for each node such as a network controlled power bar (e.g. http://www.linuxfordevices.com/c/a/News/Entrylevel-4port-IP-power-switch-runs-Linux/).
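The difference in complexity can be sketched in Python as follows. Both functions are hypothetical illustrations, not existing tools: firewall-based fencing only counts as successful once every surviving node has confirmed its block, whereas power fencing depends on a single action against a single device.

```python
from typing import Callable, Dict


def firewall_fence_succeeded(acks: Dict[str, bool]) -> bool:
    """acks maps each surviving node to whether it confirmed blocking the
    failed node. One missing confirmation means the fence has not held."""
    return bool(acks) and all(acks.values())


def power_fence_succeeded(power_off: Callable[[], bool]) -> bool:
    """power_off is a caller-supplied function that switches off the failed
    node's outlet and reports whether the device accepted the command."""
    return power_off()
```

In the firewall case, `firewall_fence_succeeded({"node2": True, "node3": False})` is false because one peer never confirmed, so the failed node may still reach part of the cluster.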

Gordan
