Re: [Gluster-devel] Improving real world performance by moving files clo

gluster-devel

From:	Gordan Bobic
Subject:	Re: [Gluster-devel] Improving real world performance by moving files closer to their target workloads
Date:	Fri, 16 May 2008 01:19:48 +0100
User-agent:	Thunderbird 2.0.0.14 (Windows/20080421)

Luke McGregor wrote:

We are currently experimenting with running Gluster over the nodes in
the cluster to produce a single large filesystem. For my Honors
research project I've been asked to look into making some improvements
to Gluster to try to improve performance by moving the files within
the GlusterFS closer to the node which is accessing the file.

What I was wondering is basically how hard it would be to write code
to modify the metadata so that when a file is accessed it is then
moved to the node which it is accessed from and its location is
updated in the metadata.

So, you want a unify/AFR hybrid translator that keeps track of what nodes use what files most often, and migrates the file to that node? Perhaps a probabilistic local caching approach would do well with this. When a node accesses a file, there is a chance that it will replicate the file to local storage. If a node accesses a file repeatedly, the cumulative chance approaches unity. The problem is that you need some way of ensuring that files don't exist on more than XYZ nodes, and that when the store fills up and you drop the least recently used file from a node, the file that gets dropped still exists somewhere else.
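To make the idea above a bit more concrete, here is a rough Python sketch of the bookkeeping involved. It is only a toy model, not GlusterFS translator code: the names `Cluster`, `Node`, `max_replicas`, `capacity` and `p` are all made up for illustration, and the shared `Cluster` object stands in for whatever metadata mechanism would really track replica locations. With a per-access replication chance p, the chance of holding a local copy after n accesses is 1 - (1 - p)^n, which is the "approaches unity" behaviour.

```python
import random
from collections import OrderedDict


def cumulative_chance(p, n):
    """Chance that at least one of n accesses triggered a local copy."""
    return 1.0 - (1.0 - p) ** n


class Cluster:
    """Hypothetical global view: which nodes hold a copy of each file."""

    def __init__(self, max_replicas):
        self.max_replicas = max_replicas  # the "XYZ nodes" limit
        self.holders = {}                 # file name -> set of node names

    def copies(self, fname):
        return len(self.holders.get(fname, ()))


class Node:
    def __init__(self, name, cluster, capacity, p, rng=None):
        self.name = name
        self.cluster = cluster
        self.capacity = capacity          # max number of files cached locally
        self.p = p                        # per-access replication chance
        self.rng = rng or random.Random()
        self.store = OrderedDict()        # least recently used entry first

    def put(self, fname):
        """Store a copy locally (no capacity check; also used for seeding)."""
        self.store[fname] = True
        self.store.move_to_end(fname)
        self.cluster.holders.setdefault(fname, set()).add(self.name)

    def _evict_one(self):
        """Drop the least recently used file that also exists elsewhere."""
        for fname in list(self.store):
            if self.cluster.copies(fname) > 1:
                del self.store[fname]
                self.cluster.holders[fname].discard(self.name)
                return True
        return False                      # every local file is a sole copy

    def access(self, fname):
        """Return True if the access was served from local storage."""
        if fname in self.store:
            self.store.move_to_end(fname)  # mark as most recently used
            return True
        # Remote read; maybe replicate locally for next time.
        if self.cluster.copies(fname) >= self.cluster.max_replicas:
            return False
        if self.rng.random() < self.p:
            if len(self.store) >= self.capacity and not self._evict_one():
                return False              # nothing safe to drop, stay remote
            self.put(fname)
        return False
```

Even in this toy form, every access touches shared state (`cluster.holders`), which is exactly the book-keeping overhead that would have to be kept small in a real implementation.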

Interesting enough idea, but I'm not sure if the book-keeping overheads would be overcome by speed benefits, especially on a fast network. You'd also not be able to route requests for a particular file easily, which might end up meaning a broadcast request to all nodes to establish who has the file available.

I suspect that designing an algorithm that does all this with sufficiently little overhead to keep you ahead in performance will be the most difficult part, not writing a GlusterFS plugin. You are almost looking at a variant of a probabilistically cached distributed hash table network, only without using hashes for routing (which makes it more difficult).
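The routing difference can be illustrated with a small Python sketch. This is an assumption-laden toy, not how GlusterFS or any particular DHT does it: `hash_route` and `broadcast_lookup` are hypothetical helpers, and plain modulo hashing is used only because it is the shortest thing that shows the point. With hash routing the file name alone tells you which node to ask; once files migrate to wherever they are used, the name says nothing about location, so every node has to be asked.

```python
import hashlib


def hash_route(fname, nodes):
    """Hash-based placement: the file name alone determines the node."""
    digest = hashlib.sha1(fname.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def broadcast_lookup(fname, stores):
    """No hash routing: ask every node who has the file.

    `stores` maps node name -> set of file names held there.
    Returns (sorted list of holders, number of nodes queried).
    """
    holders = sorted(node for node, files in stores.items() if fname in files)
    return holders, len(stores)


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4"]
    stores = {
        "node1": {"a.dat"},
        "node2": set(),
        "node3": {"a.dat", "b.dat"},
        "node4": set(),
    }
    # One lookup message, but the file cannot move to where it is used.
    print(hash_route("a.dat", nodes))
    # Files can live anywhere, at the cost of one query per node.
    print(broadcast_lookup("a.dat", stores))
```

The lookup cost of the broadcast grows with the number of nodes, whereas the hash lookup stays a single message regardless of cluster size.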


I'd _LOVE_ to see this done, though, it sounds like an awesome project. :)

Gordan
