Re: [Gluster-devel] Gluster Recovery

From:	Danson Michael Joseph
Subject:	Re: [Gluster-devel] Gluster Recovery
Date:	Sun, 29 Apr 2007 14:10:04 +0200
User-agent:	Thunderbird 1.5.0.10 (X11/20070403)

Hi all,

To come back to the issue of "self-healing" in an AFR situation. Consider the rather complex situation below, where A=AFR, S=stripe and U=unify:

Client1, Client2 --- A
                     |--- U  (also used directly by Client3)
                     |    |--- S --- Server1, Server2
                     |    \--- S --- Server3, Server4
                     |
                     \--- U  (also used directly by Client4)
                          |--- S --- Server5, Server6
                          \--- S --- Server7, Server8


Client1 and 2: AFR of two separate unions of two separate stripes
Client3 and 4: Union of two separate stripes
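
For anyone who wants to reason about this arrangement programmatically, here is a minimal sketch of the same layout as a nested structure. The tuple layout and the helper names (stripe, unify, afr, servers_under, replica_count) are made up for illustration and are not GlusterFS volume-spec syntax.

```python
# Hypothetical model of the translator tree described above. The tuple
# layout and function names are illustrative, not GlusterFS syntax.

def stripe(*servers):
    return ("stripe", list(servers))

def unify(*children):
    return ("unify", list(children))

def afr(*children):
    return ("afr", list(children))

# Two unions, each built from two stripes of two servers.
upper = unify(stripe("Server1", "Server2"), stripe("Server3", "Server4"))
lower = unify(stripe("Server5", "Server6"), stripe("Server7", "Server8"))

# Which translator stack each client sits on, per the description above.
clients = {
    "Client1": afr(upper, lower),
    "Client2": afr(upper, lower),
    "Client3": upper,
    "Client4": lower,
}

def servers_under(node):
    """Return every server reachable beneath a translator node."""
    _kind, children = node
    found = []
    for child in children:
        if isinstance(child, str):
            found.append(child)
        else:
            found.extend(servers_under(child))
    return found

def replica_count(node):
    """Number of replicated subtrees visible at the top of a client's stack."""
    kind, children = node
    return len(children) if kind == "afr" else 1
```

With this model, Client1 and Client2 reach all eight servers through two replicas, while Client3 and Client4 each reach four servers through a single union.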

I think this is quite a complex arrangement and could probably account for 80% of large installation cases. The obvious question here is what the method of healing would be for a server failure. Some thoughts:

1) As mentioned later on in this thread, the flexibility of gluster is great, but it is somewhat ridiculous to imagine that this flexibility frees one from using good cluster design. For instance, the following configuration is probably of little use; the clients must have a useful configuration, possibly like the larger one above:

                      /---Server1---\
Client1---S---<                     >---U---Client2
                      \---Server2---/

2) If a server is replaced, healing must take place from any or all clients, otherwise the distributed nature of the system is lost.


3) No client should exist below a striping such as:
              Client2
                     \   /---- Server1...
                      U------- Server2...
...Client1---S---<
                      U------- Server3...
                         \---- Server4...

Correct me if I'm wrong, but trying to read striped data as the above drawing shows for Client2 would not be very useful to Client2.

4) A suggestion here is to have each AFR client with a self-heal filter/translator. ONLY AFR clients should have self-healing for replication. Other clients, such as the union clients, can have self-healing filters, but for different filesystem health checks. When a server fails and is replaced, all AFR clients get stuck in and attempt to reconstruct the data. Thus in this situation, Clients 1 and 2 will heal the system. Clients 3 and 4 cannot because they don't have a full set of data from which to work.

5) Who is the dominant reconstruction client? A simple possible solution is to have a "pre-healing" lock for each file to be reconstructed. For instance, Client1 finds "hello.c" in bad shape because of the failure. Client1 places a lock file in the directory, identifying itself with a timestamp. Client2 also notices that "hello.c" is in bad shape and moves to fix it, but notices a lock file with a timestamp on it, and so moves on to another file/folder. If Client2 notices that the timestamp has not been updated in 20s or something reasonable, that means that Client1 has crashed or failed in some manner and is no longer healing "hello.c". Therefore Client2 will take over healing "hello.c". Obviously, during healing, nothing else should access the file for fear of further corruption. Comments on that may run far, but so be it.

6) What it all comes down to is: do not make the system's distributed nature worthless; let all clients get stuck in as if they were all trying to make breakfast. If someone is making the eggs, don't make eggs, go make the toast. If the eggs start burning because the cook went to the toilet, take over and finish the eggs. Soon enough, with clever co-operation, the breakfast will be done.
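
To make the "pre-healing" lock idea in point 5 a bit more concrete, here is a rough sketch. It is only an illustration under assumptions: the HealLocks class, its method names and the in-memory dictionary standing in for lock files are all hypothetical, and a real version would need the lock file to be created atomically on the servers.

```python
import time

STALE_AFTER = 20.0  # seconds; the "20s or something reasonable" from point 5


class HealLocks:
    """In-memory stand-in for per-file lock files (hypothetical helper)."""

    def __init__(self):
        self._locks = {}  # path -> (client name, last timestamp)

    def try_acquire(self, path, client, now=None):
        """Claim `path` for healing; return True if `client` may heal it."""
        now = time.time() if now is None else now
        holder = self._locks.get(path)
        if holder is not None:
            owner, stamp = holder
            if owner != client and now - stamp < STALE_AFTER:
                return False  # someone else is actively healing; move on
        # Lock is free, already ours, or stale (previous healer presumably died).
        self._locks[path] = (client, now)
        return True

    def refresh(self, path, client, now=None):
        """Update the timestamp so other clients see the heal is still alive."""
        now = time.time() if now is None else now
        if self._locks.get(path, (None, 0.0))[0] == client:
            self._locks[path] = (client, now)

    def release(self, path, client):
        """Drop the lock once healing of `path` has finished."""
        if self._locks.get(path, (None, 0.0))[0] == client:
            del self._locks[path]
```

A healing client would call try_acquire() before touching a file, refresh() periodically while copying, and release() at the end. A client that gets False back simply moves on to the next file, which is the "go make the toast" behaviour from point 6.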


Comments?

Regards,
Danson Joseph




Anand Avati wrote:

The concern here is the following though:

Two separate clients are identically configured to use AFR to two identical 
server configurations as follows:
             Server1
           /         \
Client1 ---           ---Client2
           \         /
             Server2

Client1 puts "hello.c" onto both Server1 and Server2 via AFR.  Client2 then 
changes hello.c in some way.
Server1 goes down; data lost, no chance of recovery and is replaced by Server3, 
a brand new server with fresh disks.
In this case, how does the data get reconstructed from the client's side 
because you mentioned that the automatic recovery was going to be on the 
glusterfs side.  Client1 believes hello.c is something different to what 
Client2 believes.  Which client will responsibly reconstruct the data?  Will 
the journaling of the remaining servers be used to reconstruct the data on the 
new server?


'changes' are done in sync on both server1 and server2 always
(write()s are sent to all child nodes). when server3 comes in place
of server1, the self-heal should detect that hello.c is missing on
server3 and sync it from server2.
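
As a toy illustration of that detect-and-sync step (not the actual self-heal code; the self_heal function and the dictionary-per-server model are assumptions made for this sketch), the idea could look like:

```python
def self_heal(replicas):
    """Copy files missing from any replica from a replica that has them.

    `replicas` maps a server name to a dict of {file name: contents}.
    Returns a list of (file, source, target) tuples for each copy made.
    Only missing files are handled; differing contents are not compared.
    """
    copied = []
    all_files = sorted({name for files in replicas.values() for name in files})
    for name in all_files:
        # Pick the first replica (by name) that still holds the file.
        source = next(s for s in sorted(replicas) if name in replicas[s])
        for target in sorted(replicas):
            if name not in replicas[target]:
                replicas[target][name] = replicas[source][name]
                copied.append((name, source, target))
    return copied


# server3 has replaced server1 and starts out empty.
replicas = {"server2": {"hello.c": b"int main() {}"}, "server3": {}}
healed = self_heal(replicas)
```

Here `healed` records that hello.c was copied from server2 to server3, and a second call finds nothing left to do.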



regards,
avati
