gluster-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gluster-devel] Cluster sizes


From: Jeff Darcy
Subject: Re: [Gluster-devel] Cluster sizes
Date: Wed, 13 Nov 2013 10:49:29 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0

On 11/13/2013 10:19 AM, Vijay Bellur wrote:
Makes me wonder what would be a typical deployment scenario - would we
have a single volume that spans around 10K nodes? If yes, what are the
scalability problems that we foresee? DHT's directory spread is on the
top of my mind. Would the directory spread count option be good enough
to address this?

The idea of a single volume spanning 10K nodes kind of freaks me out, but if we support that many nodes (in this kind of scenario they're likely to be both servers and clients) then it's almost inevitable that we'll have users who try to create volumes across all of them. After all, that's the "unified namespace" value prop, right?

I don't think directory spread count is sufficient to address this. At that level, we can *never* do anything that involves hitting all bricks. That includes getxattr to fetch layouts (even if most of them or empty for a particular directory), it includes mkdir, and so on. We'll have to do *everything* via consistent hashing, including the things where we currently rely on information being global. Even having that many connections is going to be a serious problem, so we'll probably have to do some pooling or proxying or something. Tracking and coordinating rebalance state is going to be another problem, so we'll probably need a fundamentally different approach there as well.

But first, we have to solve the glusterd scaling issues. Any scaling in DHT or elsewhere in the I/O plane won't even matter until the management plane can support building a cluster that large.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]