
Re: [Gluster-devel] GlusterFS Roadmap: Erasure codes.


From: gordan
Subject: Re: [Gluster-devel] GlusterFS Roadmap: Erasure codes.
Date: Thu, 24 Apr 2008 10:39:12 +0100 (BST)
User-agent: Alpine 1.10 (LRH 962 2008-03-14)

On Thu, 24 Apr 2008, Rodney McDuff wrote:

> I was just looking at the GlusterFS Roadmap and thought that a nifty
> feature for the future would be an AFR-like translator that uses
> Reed-Solomon erasure codes for file replication instead. That would
> add many 9s to the reliability without adding much storage overhead.

You mean having one file split across several servers, as in RAID[3-6]? That's a lot more complicated and CPU-intensive. It would also make writes across the servers quite expensive, because each write to part of a stripe requires the other segments (or at least the old data and old parity) to be read back before the Reed-Solomon checksums can be recalculated.
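
To make the write penalty concrete, here is a minimal sketch in Python (not GlusterFS code; the function names and single-parity layout are my own illustrative assumptions). It uses plain XOR parity, i.e. m = 1; real Reed-Solomon generalizes this with Galois-field arithmetic, but the read-modify-write cycle is the same:

  # Why a partial write is expensive in an n+m parity scheme.
  def xor_blocks(a, b):
      return bytes(x ^ y for x, y in zip(a, b))

  def partial_write(stripe, parity, index, new_block):
      """Overwrite one data block and return (stripe, updated parity).

      Even this optimized form must fetch the old data block (read #1)
      and the old parity (read #2) before it can write anything:
          new_parity = old_parity XOR old_block XOR new_block
      """
      old_block = stripe[index]
      new_parity = xor_blocks(xor_blocks(parity, old_block), new_block)
      stripe = stripe[:index] + [new_block] + stripe[index + 1:]
      return stripe, new_parity          # followed by two network writes

  # Toy stripe: 3 data blocks + 1 parity block.
  data = [b"AAAA", b"BBBB", b"CCCC"]
  parity = xor_blocks(xor_blocks(data[0], data[1]), data[2])
  data, parity = partial_write(data, parity, 1, b"XXXX")
  # Any single lost block is recoverable from the survivors:
  assert xor_blocks(xor_blocks(data[0], data[2]), parity) == b"XXXX"

So even the optimized path costs two reads and two writes per block updated (a naive implementation reads the whole stripe); plain replication just writes the same block to each mirror.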

Fundamentally, I think it's just not what GlusterFS is intended to do. The nice thing about GlusterFS is that it is very similar to Coda, only:

1) without the limitations that reduce Coda's usefulness (the 1MB limit on metadata per directory - and metadata includes file names - makes Coda fundamentally unsuitable for Maildirs or any application likely to see more than 1000-4000 files per directory, which is roughly what 1MB works out to at a few hundred bytes to 1KB of metadata per entry)

2) without the features that get in the way of clean integration into existing server clusters - e.g. Coda's permission system is ACL-based rather than POSIX, which is great for the truly global file system Coda is designed to be, but annoying for more tightly coupled clusters.

A Reed-Solomon n+m RAID-type solution, on the other hand, is a lot less flexible, although it does yield more usable storage per unit of physical storage. Recovery from a total failure is also a lot more difficult. With GlusterFS, all the files are still there with their original content and their original names. Even in Coda, where the on-disk files have hash-numbered names and the real names live in the metadata store, the content still matches the original file. All of this makes recovery from an extensive failure a lot saner (you'll know what I mean if you have ever had to recover data from a RAID5 stripe in which two disks went bad).
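
To put rough numbers on the storage trade-off (the 8+2 layout below is a hypothetical example, not anything GlusterFS supports):

  # Usable fraction of raw storage and failures tolerated:
  # replication vs. an n+m erasure code. Illustrative numbers only.
  def replication(copies):
      # AFR-style mirroring: 1/copies usable, survives copies - 1 failures
      return 1.0 / copies, copies - 1

  def erasure(n, m):
      # n data + m parity segments: n/(n+m) usable, survives any m failures
      return n / (n + m), m

  print(replication(2))   # (0.5, 1)  -- two-way AFR mirror
  print(erasure(8, 2))    # (0.8, 2)  -- hypothetical 8+2 Reed-Solomon layout

The erasure code buys usable capacity and failure tolerance, but pays for them in write cost and in the recovery pain described above.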

But if storage efficiency is important to you, there are at least two products/projects that already provide this functionality:

DDRAID
Network RAID 3.5; the node count is limited to powers of 2. The project appears to be unmaintained.
http://sourceware.org/cluster/ddraid/

Cleversafe Dispersed Storage
Network RAIDn (n+m). An awesome idea, but sadly it's Java-based (i.e. bloatware with the performance of a snail on sedatives).
http://www.cleversafe.org/dispersed-storage

I hope this helps.

Gordan



