gluster-devel

Re: [Gluster-devel] Gluster Recovery


From: Danson Michael Joseph
Subject: Re: [Gluster-devel] Gluster Recovery
Date: Sun, 29 Apr 2007 14:10:04 +0200
User-agent: Thunderbird 1.5.0.10 (X11/20070403)

Hi all,

To come back to the issue of "self-healing" in an AFR situation, consider the rather complex arrangement below, where A=AFR, S=stripe and U=unify:

                                     /---- Server1 ---\        Client3
                                    S---- Server2 ----S       /
                     /-------U---<                       >---U--------\
                    /               S---- Server3 ----S                \
                   /                 \---- Server4 ---/                 \
Client1 ------A--<                                                        >---A------Client2
                   \                 /---- Server5 ---\                 /
                    \               S---- Server6 ----S                /
                     \-------U---<                       >---U--------/
                            /       S---- Server7 ----S
                     Client4         \---- Server8 ---/


Client1 and 2: AFR of two separate unions of two separate stripes
Client3 and 4: Union of two separate stripes
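
For readers who find the drawing hard to follow, here is a rough sketch of the same arrangement as a nested tree in Python. It is not GlusterFS volume-spec syntax; the `Node` class, the helper functions and the pairing of servers into stripes (1+2, 3+4, 5+6, 7+8) are assumptions read off the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "server", "stripe", "unify" or "afr"
    name: str = ""
    children: list = field(default_factory=list)


def server(n):
    return Node("server", f"Server{n}")


def stripe(*children):
    return Node("stripe", children=list(children))


def unify(*children):
    return Node("unify", children=list(children))


def afr(*children):
    return Node("afr", children=list(children))


def servers(node):
    """List every server reachable from this node, left to right."""
    if node.kind == "server":
        return [node.name]
    found = []
    for child in node.children:
        found.extend(servers(child))
    return found


def has_replica(node):
    """True if an AFR node appears anywhere in this client's tree."""
    return node.kind == "afr" or any(has_replica(c) for c in node.children)


# Two unions, each built from two stripes of two servers (assumed pairing).
union_a = unify(stripe(server(1), server(2)), stripe(server(3), server(4)))
union_b = unify(stripe(server(5), server(6)), stripe(server(7), server(8)))

clients = {
    "Client1": afr(union_a, union_b),
    "Client2": afr(union_a, union_b),
    "Client3": union_a,
    "Client4": union_b,
}
```

With this model, `has_replica(clients["Client1"])` is true while `has_replica(clients["Client3"])` is false, which is one way to express that only Clients 1 and 2 see a replicated view of the data.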

I think this is quite a complex arrangement and could probably account for 80% of large installation cases. The obvious question here is what would the method of healing be for a server failure. Some thoughts:

1) As mentioned later on in this thread, the flexibility of gluster is great, but it is somewhat ridiculous to imagine that this flexibility frees one from using good cluster design. For instance, the following configuration is probably of little use; the clients must have a useful configuration, possibly like the larger one above:
                      /---Server1---\
Client1---S---<                     >---U---Client2
                      \---Server2---/

2) If a server is replaced, healing must take place from any or all clients, otherwise the distributed nature of the system is lost.

3) No client should exist below a striping such as:
              Client2
                        \    /---- Server1...
                          U---- Server2...
...Client1---S---<        U---- Server3....
                           \---- Server4....
Correct me if I'm wrong, but trying to read striped data as the above drawing shows for client2 would not be very useful to client2.

4) A suggestion here is to give each AFR client a self-heal filter/translator. ONLY AFR clients should have self-healing for replication. Other clients, such as the union clients, can have self-healing filters, but for different filesystem health checks. When a server fails and is replaced, all AFR clients get stuck in and attempt to reconstruct the data. Thus in this situation, Clients 1 and 2 will heal the system. Clients 3 and 4 cannot, because they don't have a full set of data from which to work.

5) Who is the dominant reconstruction client? A simple possible solution is to have a "pre-healing" lock for each file to be reconstructed. For instance, Client1 finds "hello.c" in bad shape because of the failure. Client1 places a lock file in the directory, identifying itself with a timestamp. Client2 also notices that "hello.c" is in bad shape and moves to fix it, but sees a lock file with a timestamp on it, and so moves on to another file/folder. If Client2 notices that the timestamp has not been updated in 20s or something reasonable, that means that Client1 has crashed or failed in some manner and is no longer healing "hello.c". Client2 will therefore take over and continue to heal "hello.c". Obviously, during healing, nothing else should access the file, for fear of further corruption. Comments on that may run far, but so be it.

6) What it all comes down to is: do not make the system's distributed nature worthless; let all clients get stuck in as if they were all trying to make breakfast. If someone is making the eggs, don't make eggs, go make the toast. If the eggs start burning because the cook went to the toilet, take over and finish the eggs. Soon enough, with clever co-operation, the breakfast will be done.
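
To make the "pre-healing" lock idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `.heal-lock` suffix, the JSON contents of the lock file, the function names, and the idea that exclusive file creation is atomic on the shared directory. It is not an existing GlusterFS mechanism. Note also that two clients could both see a stale lock and take over at the same instant; the sketch does not try to solve that.

```python
import json
import os
import time

STALE_AFTER = 20.0           # seconds without a refresh before takeover
LOCK_SUFFIX = ".heal-lock"   # hypothetical naming convention


def _read(lock):
    """Return the lock contents, or None if unreadable or half-written."""
    try:
        with open(lock) as f:
            return json.load(f)
    except (OSError, ValueError):
        return None


def _write(lock, client_id, now):
    """Replace the lock contents in one step via a temporary file."""
    tmp = f"{lock}.{client_id}.tmp"
    with open(tmp, "w") as f:
        json.dump({"client": client_id, "stamp": now}, f)
    os.replace(tmp, lock)


def try_claim(path, client_id, now=None, stale_after=STALE_AFTER):
    """Try to become the healer of `path`; True means go ahead and heal."""
    now = time.time() if now is None else now
    lock = path + LOCK_SUFFIX
    try:
        # O_EXCL makes creation fail if another client already holds the lock.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        holder = _read(lock)
        if holder is None:
            return False     # lock is mid-write; look at another file
        fresh = now - holder["stamp"] < stale_after
        if holder["client"] != client_id and fresh:
            return False     # someone else is making the eggs
        _write(lock, client_id, now)   # refresh our own lock or take over
        return True
    with os.fdopen(fd, "w") as f:
        json.dump({"client": client_id, "stamp": now}, f)
    return True


def refresh(path, client_id, now=None):
    """Update the timestamp while healing is still in progress."""
    _write(path + LOCK_SUFFIX, client_id, time.time() if now is None else now)


def release(path):
    """Remove the lock once the file has been healed."""
    os.remove(path + LOCK_SUFFIX)
```

A healing client would call `try_claim()` before touching a file, call `refresh()` every few seconds while copying, and call `release()` when done. A client that gets `False` simply moves on to the next file, as in point 5.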

Comments?

Regards,
Danson Joseph




Anand Avati wrote:
The concern here is the following though:

Two separate clients are identically configured to use AFR to two identical 
server configurations as follows:
             Server1
           /         \
Client1 ---           ---Client2
           \         /
             Server2

Client1 puts "hello.c" onto both Server1 and Server2 via AFR.  Client2 then 
changes hello.c in some way.
Server1 goes down; data lost, no chance of recovery and is replaced by Server3, 
a brand new server with fresh disks.
In this case, how does the data get reconstructed from the client's side 
because you mentioned that the automatic recovery was going to be on the 
glusterfs side.  Client1 believes hello.c is something different to what 
Client2 believes.  Which client will responsibly reconstruct the data?  Will 
the journaling of the remaining servers be used to reconstruct the data on the 
new server?

'changes' are always done in sync on both server1 and server2
(write()s are sent to all child nodes). when server3 comes in place
of server1, the self-heal should detect that hello.c is missing on
server3 and sync it from server2.



regards,
avati
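
The sync described in the reply above can be illustrated with a small Python sketch. It models the surviving replica (server2) and the replacement (server3) as two local directories, which is an assumption made purely for illustration; the function name `heal_missing` is hypothetical and this is not how GlusterFS implements self-heal. It only restores files that are missing on the new server and does not compare the contents of files present on both.

```python
import os
import shutil


def heal_missing(healthy_root, replaced_root):
    """Copy every file that exists under healthy_root but not under
    replaced_root, and return the relative paths that were copied."""
    copied = []
    for dirpath, _dirnames, filenames in os.walk(healthy_root):
        rel = os.path.relpath(dirpath, healthy_root)
        target_dir = os.path.normpath(os.path.join(replaced_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in sorted(filenames):
            dst = os.path.join(target_dir, name)
            if not os.path.exists(dst):
                # copy2 also carries over timestamps and permission bits
                shutil.copy2(os.path.join(dirpath, name), dst)
                copied.append(os.path.normpath(os.path.join(rel, name)))
    return copied
```

Running it a second time over the same two directories copies nothing, since every file is then present on both sides.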




