gluster-devel

Re: [Gluster-devel] afr logic


From: Kevan Benson
Subject: Re: [Gluster-devel] afr logic
Date: Wed, 17 Oct 2007 09:38:25 -0700
User-agent: Thunderbird 2.0.0.6 (X11/20070728)

Alexey Filin wrote:
Hi Kevan,

Consistency of afr'ed files matters with respect to backend-fs failures too. AFR is a remedy for node failures, not backend-fs failures (at least not directly): after a hw/sw failure, fsck can change files "legally" behind glusterfs's back, and those changes have to be handled on the corrupted replica, or else reads of the same file can return different data (especially with the forthcoming load-balanced reads across replicas). Fortunately, rsync'ing from the original should still produce a consistent replica in that case (provided cluster/stripe under afr lays data out identically on all replicas); unfortunately, rsync does not copy extended attributes (I tested it), which may be required during repair.

It seems glusterfs could try to handle hw/sw failures in the backend fs with checksums kept in extended attributes. The checksums should be computed per file chunk, because a single whole-file checksum has to be fully recomputed after appending or changing even one byte of a gigabyte file. In that case glusterfs would either have to recompute the checksums of every file on a corrupted fs (which may take far too long, just as with rsync'ing) or obtain a list of corrupted files from the backend fs in some way (e.g. via a flag set by fsck in extended attributes). Perhaps some kind of distributed RAID is a better solution; a first step in that direction was already taken by cluster/stripe (unfortunately one implementation, DDRaid, http://sources.redhat.com/cluster/ddraid/ by Daniel Phillips, seems to be suspended). Then again, that may be too computationally and network intensive, and RAID underneath the backend fs may be the best solution after all, even taking the disk-space overhead into account.
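The per-chunk checksum idea above could be sketched roughly as follows (the chunk size and hash choice are invented for illustration; the mail only proposes the concept, and where the digests would be stored — e.g. an extended attribute — is left open):

```python
import hashlib

CHUNK = 128 * 1024  # illustrative chunk size, not anything glusterfs uses


def chunk_sums(path, chunk=CHUNK):
    """Return one SHA-1 digest per fixed-size chunk of the file.

    Appending a byte only dirties the final chunk, so only that
    chunk's digest needs recomputing -- which is the point made
    above about avoiding a full recalculation on a gigabyte file.
    The resulting list could, in principle, be serialized into an
    extended attribute on the file.
    """
    sums = []
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            sums.append(hashlib.sha1(data).hexdigest())
    return sums
```

A repair pass would then only need to re-hash chunks whose stored digest no longer matches, rather than rescanning whole files.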

I'm very interested to hear the glusterfs developers' thoughts on this, to clear up my misunderstanding.

The rsync case can probably be handled with a separate pass: find the files carrying the appropriate attributes on the source and set them on the target. A simple bash/perl script could handle this in a few lines.
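A minimal sketch of such a pass (in Python rather than bash/perl; the idea of walking the source tree and mirroring xattrs is from the mail, but every detail here — and the Linux-only os.listxattr/os.getxattr/os.setxattr calls — is illustrative):

```python
#!/usr/bin/env python3
"""Copy extended attributes that rsync left behind (sketch only)."""
import os
import sys


def copy_xattrs(src_root, dst_root,
                list_xattr=None, get_xattr=None, set_xattr=None):
    """Mirror xattrs of every file under src_root onto dst_root.

    The xattr calls are injectable so the logic can be exercised
    without a filesystem that supports extended attributes; by
    default the Linux-only os.*xattr functions are used.
    """
    list_xattr = list_xattr or os.listxattr
    get_xattr = get_xattr or os.getxattr
    set_xattr = set_xattr or os.setxattr
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_root, os.path.relpath(src, src_root))
            if not os.path.exists(dst):
                continue  # rsync did not copy this file; skip it
            for attr in list_xattr(src):
                set_xattr(dst, attr, get_xattr(src, attr))


if __name__ == "__main__" and len(sys.argv) == 3:
    copy_xattrs(sys.argv[1], sys.argv[2])
```

Run after rsync completes, e.g. `copy_xattrs.py /export/src /export/dst`; note that copying trusted.* attributes requires root.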

The fsck case is more interesting, but if you could get fsck to report the names of files/directories that have problems without fixing them, it's easy to pipe that list to a script that removes the trusted.afr.version attribute on those files, and AFR will then heal itself.
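Assuming fsck could be made to emit one affected path per line, the attribute-strip step might look like this (the xattr name trusted.afr.version is from the mail; the Linux-only os.removexattr call, the one-path-per-line input format, and everything else here are assumptions):

```python
import os
import sys

AFR_ATTR = "trusted.afr.version"  # attribute name taken from the mail


def strip_afr_version(paths, remove_xattr=None):
    """Drop the AFR version xattr so self-heal re-syncs each file.

    remove_xattr is injectable for testing; by default the
    Linux-only os.removexattr is used. Returns the paths that
    were actually cleaned.
    """
    remove_xattr = remove_xattr or os.removexattr
    cleaned = []
    for path in paths:
        try:
            remove_xattr(path, AFR_ATTR)
            cleaned.append(path)
        except OSError:
            pass  # attribute absent or path gone; nothing to do
    return cleaned


if __name__ == "__main__" and len(sys.argv) == 2:
    # e.g.: strip_afr.py /tmp/fsck-report.txt (hypothetical report file)
    with open(sys.argv[1]) as report:
        strip_afr_version(line.strip() for line in report)
```

Removing trusted.* attributes requires root on the backend nodes; on the next access AFR should then treat the file as stale and re-sync it from a good replica.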

Checksums would of course give you much better tracking of corrupted files, but I imagine the CPU strain and speed decrease would make them infeasible.

--

-Kevan Benson
-A-1 Networks



