Re: [Gluster-devel] Improving real world performance by moving files clo

gluster-devel

From:	Gordan Bobic
Subject:	Re: [Gluster-devel] Improving real world performance by moving files closer to their target workloads
Date:	Mon, 19 May 2008 08:29:19 +0100
User-agent:	Thunderbird 1.5.0.12 (X11/20080430)

Luke McGregor wrote:

Firstly, I'm starting to think that maybe the best way forward is to somehow
centralise the metadata store (whether by distributing it or by having it
handled by a dedicated metadata server, I'm not too sure).

If you _REALLY_ want to do it that way (and personally, I'm not at all convinced), look at Lustre.

The reason I'm
thinking this may be the best way forward is that I think that any quorum-
based approach will not be able to guarantee that any write is successful. If
a write occurs and the network is queried for a quorum before the node with
the latest copy has a chance to replicate its new data, then there is
too much room for a quorum being reached without needing the approval of the
node with the latest copy.

Not true. If you have quorum of the majority of the nodes (50%+1) for every write, then the node with that latest copy would have had quorum to begin with.
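The majority argument can be checked by brute force. The following is only an illustrative sketch: the helper names `majority` and `quorums_always_overlap` are made up here and are not part of GlusterFS. It confirms that any two sets of strictly more than half the nodes share at least one node, so a later quorum cannot form without a member of the quorum that approved the latest write.

```python
from itertools import combinations


def majority(n):
    """Smallest node count that is strictly more than half of n."""
    return n // 2 + 1


def quorums_always_overlap(n):
    """Exhaustively check that any two majority-sized node sets share a node."""
    q = majority(n)
    return all(
        set(a) & set(b)
        for a in combinations(range(n), q)
        for b in combinations(range(n), q)
    )


# Small clusters only: the check is exponential in the node count.
results = {n: quorums_always_overlap(n) for n in range(1, 8)}
```

Every entry in `results` comes out true, for odd and even cluster sizes alike, because two sets that each hold more than half the nodes cannot be disjoint.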

This could cause some serious problems, especially
on a heavily accessed file.

No, you'd just have to have explicit file locking, like with any other cluster FS. (Note: GlusterFS doesn't do implicit locking at the moment. It will be doing it for O_APPEND access to guarantee that writes are atomic in that mode soon.)
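For illustration, explicit locking from an application might look like the sketch below, which takes a POSIX advisory lock around an append. The function name `append_record` and the file name are hypothetical, and the demo runs against a local temporary file; whether such a lock is honoured across cluster nodes is an assumption that depends on how the file system is configured.

```python
import fcntl
import os
import tempfile


def append_record(path, record):
    """Append one record while holding an exclusive advisory lock on the file."""
    with open(path, "ab") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())  # push the data out before releasing the lock
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


# Demo on a local temporary file (assumption: a Unix-like system with fcntl).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "shared.log")
append_record(demo_path, b"first\n")
append_record(demo_path, b"second\n")
with open(demo_path, "rb") as f:
    contents = f.read()
```

Advisory locks only protect writers that all take the lock, so every process touching the file has to follow the same convention.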

The problem would, I believe, be worsened as the
nodes which are hosting any heavily accessed file are the most likely to
not respond quickly to any kind of multicast.

I think it's a non-issue. If you always need majority response, without that majority response, those heavily accessed nodes wouldn't be able to get a lock themselves, either.

Furthermore, in the interest of performance you would want to act as soon as a
quorum was reached. This would effectively mean that the nodes which made the
decision on the lock would be the most lightly accessed nodes, with the more
heavily accessed nodes responding in slightly more time. I believe this would mean
that in order to be sure of a lock you would need to get a consensus from all
nodes. This would be impractical.

As I said, it's an even playing field - heavily accessed nodes need to get a lock by quorum just like any other node.

Having said that, I think you'd need quorum from not only majority of nodes (that would work if you have a shared fs, but this is a _replicated_ fs) but from majority of nodes that have a copy of the file.
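The difference between the two kinds of quorum can be made concrete with a small sketch. The node names, the three-replica layout, and the helper `has_file_quorum` are invented for illustration and do not reflect any actual GlusterFS code.

```python
def has_file_quorum(replica_holders, responders):
    """True if strictly more than half of the nodes holding the file responded."""
    votes = replica_holders & responders
    return len(votes) > len(replica_holders) / 2


# Hypothetical 10-node cluster in which only three nodes hold a copy of the file.
cluster = {"node%d" % i for i in range(1, 11)}
holders = {"node1", "node2", "node3"}

# Two of the three holders are slow to answer; everyone else responds.
responders = cluster - {"node2", "node3"}

cluster_majority = len(responders) > len(cluster) / 2  # 8 of 10 responded
file_quorum = has_file_quorum(holders, responders)     # only 1 of 3 holders
```

Here the cluster-wide majority is satisfied while the per-file quorum is not, which is the case where counting all nodes would grant a lock without hearing from most of the nodes that actually hold the data.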

Also, looking for a file to delete to free space would be a really
inefficient process, as a single request for space would potentially mean
querying the network for redundancy information on every other file stored.
This would not be practical.

It depends on the size of your files. A 1 packet broadcast is a pretty decent trade off for gaining 1GB of space, and pretty inefficient for a 1 byte file. But there isn't really a way around it.

I think a centralised metadata system would eliminate these problems as it
would be authoritative. A write shouldn't succeed without the central metadata
being updated and a lock shouldn't be granted without the central metadata
allowing it. This would also mean that old files would be invalidated by the
system centrally, giving a side effect of allowing an easy rollback mechanism
until those files were deleted (if anybody ever wanted that feature). It
would also mean that freeing up space would be a relatively simple operation.

Which has the downside that your metadata store is not as redundant/distributed as your data. IMO, having data and metadata distributed around equally is one of GlusterFS's major advantages over Lustre.

Just as an aside, I think your RAID6 set will outperform your RAID10
set. Even though RAID6 is slower than RAID10 and you may be using slower
drives, you still have 16 drives in the set, which I believe will actually
give you faster performance than any 4-drive configuration.

Whether RAID6 is "slow" is entirely down to your RAID controller. If it can calculate the Reed-Solomon codes faster than the disks can keep up, then of course the total speed of the disks will be the only factor affecting the speed, and in that case, 16 disks will beat 4 disks any day.
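A rough back-of-the-envelope model of that argument is sketched below. All figures are assumptions picked for illustration (60 MB/s per disk in the RAID6 set, 100 MB/s per disk in the RAID10 set, and two hypothetical controller parity rates); they are not measurements, and the model only covers large sequential writes.

```python
def raid6_write_mb_s(disks, per_disk_mb_s, parity_mb_s):
    """Sequential write estimate: data disks in parallel, capped by the parity engine."""
    data_disks = disks - 2  # RAID6 spends two disks' worth of capacity on parity
    return min(data_disks * per_disk_mb_s, parity_mb_s)


def raid10_write_mb_s(disks, per_disk_mb_s):
    """Sequential write estimate: each mirrored pair contributes one disk's rate."""
    return (disks // 2) * per_disk_mb_s


# Assumed figures, purely for illustration.
fast_controller = raid6_write_mb_s(16, 60, parity_mb_s=2000)  # disks are the limit
slow_controller = raid6_write_mb_s(16, 60, parity_mb_s=150)   # parity is the limit
raid10 = raid10_write_mb_s(4, 100)
```

Under these assumptions the 16-disk RAID6 set comes out well ahead of the 4-disk RAID10 set when the controller keeps up with parity, and falls behind it when parity calculation is the bottleneck.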


Gordan
