
Re: [Gluster-devel] GlusterFS Roadmap: Erasure codes.


From: gordan
Subject: Re: [Gluster-devel] GlusterFS Roadmap: Erasure codes.
Date: Thu, 24 Apr 2008 10:39:12 +0100 (BST)
User-agent: Alpine 1.10 (LRH 962 2008-03-14)

On Thu, 24 Apr 2008, Rodney McDuff wrote:

> I was just looking at the GlusterFS Roadmap and thought that a nifty
> feature for the future would be an AFR-like translator that uses
> Reed-Solomon erasure codes for file replication instead. That would
> add many 9s to the reliability without adding much storage overhead.

You mean having one file split across several servers, as in RAID[3-6]? That's a lot more complicated and CPU-intensive. It would also make writes across the servers quite expensive, because each write to part of a stripe requires the other segments (or at least the old data and old parity) to be read back before the Reed-Solomon checksums can be recalculated.
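
To make the write penalty concrete, here is a minimal sketch in Python (not GlusterFS code; the function names and single-parity layout are my own illustrative assumptions). It uses plain XOR parity, i.e. m = 1; real Reed-Solomon generalizes this with Galois-field arithmetic, but the read-modify-write cycle is the same:

  # Why a partial write is expensive in an n+m parity scheme.
  def xor_blocks(a, b):
      return bytes(x ^ y for x, y in zip(a, b))

  def partial_write(stripe, parity, index, new_block):
      """Overwrite one data block and return (stripe, updated parity).

      Even this optimized form must fetch the old data block (read #1)
      and the old parity (read #2) before it can write anything:
          new_parity = old_parity XOR old_block XOR new_block
      """
      old_block = stripe[index]
      new_parity = xor_blocks(xor_blocks(parity, old_block), new_block)
      stripe = stripe[:index] + [new_block] + stripe[index + 1:]
      return stripe, new_parity          # followed by two network writes

  # Toy stripe: 3 data blocks + 1 parity block.
  data = [b"AAAA", b"BBBB", b"CCCC"]
  parity = xor_blocks(xor_blocks(data[0], data[1]), data[2])
  data, parity = partial_write(data, parity, 1, b"XXXX")
  # Any single lost block is recoverable from the survivors:
  assert xor_blocks(xor_blocks(data[0], data[2]), parity) == b"XXXX"

So even the optimized path costs two reads and two writes per block updated (a naive implementation reads the whole stripe); plain replication just writes the same block to each mirror.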

Fundamentally, I think it's just not what GlusterFS is intended to do. The nice thing about GlusterFS is that it is very similar to Coda, only:

1) without the limitations that reduce Coda's usefulness (the 1MB limit on metadata per directory - and metadata includes file names - makes Coda fundamentally unsuitable for Maildirs or any application likely to see more than 1000-4000 files per directory, which is roughly what 1MB works out to at a few hundred bytes to 1KB of metadata per entry)

2) without the features that get in the way of clean integration into existing server clusters - e.g. Coda's permission system is ACL-based rather than POSIX, which is great for the truly global file system Coda is designed to be, but annoying for more tightly coupled clusters.

A Reed-Solomon n+m RAID-type solution, on the other hand, is a lot less flexible, although it does yield more usable storage per unit of physical storage. Recovery from a total failure is also a lot more difficult. With GlusterFS, all the files are still there with their original content and their original names. Even in Coda, where the on-disk files have hash-numbered names and the real names live in the metadata store, the content still matches the original file. All of this makes recovery from an extensive failure a lot saner (you'll know what I mean if you have ever had to recover data from a RAID5 stripe in which two disks went bad).
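
To put rough numbers on the storage trade-off (the 8+2 layout below is a hypothetical example, not anything GlusterFS supports):

  # Usable fraction of raw storage and failures tolerated:
  # replication vs. an n+m erasure code. Illustrative numbers only.
  def replication(copies):
      # AFR-style mirroring: 1/copies usable, survives copies - 1 failures
      return 1.0 / copies, copies - 1

  def erasure(n, m):
      # n data + m parity segments: n/(n+m) usable, survives any m failures
      return n / (n + m), m

  print(replication(2))   # (0.5, 1)  -- two-way AFR mirror
  print(erasure(8, 2))    # (0.8, 2)  -- hypothetical 8+2 Reed-Solomon layout

The erasure code buys usable capacity and failure tolerance, but pays for them in write cost and in the recovery pain described above.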

But if storage efficiency is important to you, there are at least two products/projects that already provide this functionality:

DDRAID
Network RAID 3.5; the node count is limited to powers of 2. The project appears to be unmaintained.
http://sourceware.org/cluster/ddraid/

Cleversafe Dispersed Storage
Network RAIDn (n+m). An awesome idea, but sadly it's Java-based (i.e. bloatware with the performance of a snail on sedatives).
http://www.cleversafe.org/dispersed-storage

I hope this helps.

Gordan



