Re: [Gluster-devel] AFR comments. Maximizing free space use when using m

From:	DeeDee Park
Subject:	Re: [Gluster-devel] AFR comments. Maximizing free space use when using mirroring
Date:	Tue, 31 Jul 2007 19:50:43 +0000

Thanks for the response. Comments inline below.

From: "Krishna Srinivas" <address@hidden>
To: "DeeDee Park" <address@hidden>
CC: address@hidden
Subject: Re: [Gluster-devel] AFR comments. Maximizing free space use when using mirroring.
Date: Tue, 31 Jul 2007 20:59:14 +0530

On 7/25/07, DeeDee Park <address@hidden> wrote:
> Here is my 2c on AFR.
>
> When I set up file servers, the first priority is always to get them up and
> running, and later the next priority is to add mirrors/high availability.
> Sometimes, due to business concerns, the second priority does not happen
> until something drastic happens. By the time there is budget for additional
> drives, the drive sizes available are typically also much bigger (remember,
> they double every 2 years or so). So when I've bought drives, I've gotten
> 40GB, 80GB, 120GB, 160GB, 200GB, 250GB, 300GB, 500GB, 750GB... What I'm
> saying is that when new drives are bought, either to expand the total file
> server size or to add additional replicas, the new drives are most of the
> time bigger than the original drives purchased.
>
> In the current implementation of AFR, the second brick (in a not
> well-managed environment) will most likely be bigger than the first brick,
> so the additional storage space is underutilized due to the mismatch in
> disk sizes.
>
> The idea I have is that I want to use as many available commodity parts as
> I can find, build the largest file server for my customer's needs, and
> reallocate the remaining space for replicas. I still have a lot of these
> 120GB drives sitting around from a few years ago, and I've got 500/750GB
> drives. It seems to be a difficult task to match each 120GB drive with
> another 120GB drive to optimize disk usage for AFR purposes. I could have
> 2 500GB drives for replication *:2, but if I want to move to *:3 in the
> future, most likely I'll have some 750GB drives lying around. Using a 750GB
> as my third brick would most likely waste the remaining 250GB.
You can just put the bigger drive brick as the first subvolume in the
subvolume list. This should fix the problem, right?

Yes, it would give the user more space, but wouldn't there be some kind of
error message when the replica disk runs out of space once the primary 750GB
contains more than 250GB of data? Once the user exceeds 250GB, then they no
longer have a replica (false sense of security... bad).
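
The "waste the remaining 250GB" case quoted above can be checked with a rough sketch. It assumes every file is replicated to every brick (as with *:3), ignores filesystem overhead, and uses a hypothetical helper name, `mirror_capacity`, that is not part of GlusterFS:

```python
def mirror_capacity(brick_sizes_gb):
    """Return (fully mirrored GB, unmirrored GB left on each brick).

    Assumption: every file is written to every brick, so the smallest
    brick limits how much data can be stored with a full set of replicas.
    """
    mirrored = min(brick_sizes_gb)
    leftover = [size - mirrored for size in brick_sizes_gb]
    return mirrored, leftover


# Two 500GB bricks plus one 750GB brick, sizes taken from the message above.
mirrored, leftover = mirror_capacity([750, 500, 500])
print(mirrored, leftover)
```

Under these assumptions, anything written to the largest brick beyond the fully mirrored capacity has fewer replicas than the configuration suggests.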

>
> Just as RR or ALU puts files anywhere, I originally envisioned that AFR
> did the same. If my dataset is larger than the largest possible RAID I can
> afford, then one brick will never carry all the files.
No, that would complicate the functionality of AFR and its self-heal feature.
>
> What I think would be cool would be to have AFR on top of unify, so that
> if the dataset is spread across X drives, that is fine: the remote mirrors
> would not require the same hardware, and I would just need to purchase the
> approximate 2X hard drive space at the new co-lo. I can just ask a client
> "How much disk space are you currently using?". If they say 20TB, all using
> 200GB drives (=100 drives), then I can set up the additional glusterfs
> replica to utilize 20TB using 750GB drives. I would like to have to buy
> only 27 750GB drives to make up my 20TB, instead of having to buy 100 750GB
> drives to replicate the existing 100 drives. (It doesn't make sense to buy
> 200GB drives when larger drives are available for purchase.)

I just tried afr over unify, there is some problem, I shall look into it.
However we need to see if it is advisable to use this kind of setup.
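
The drive counts quoted above can be verified with a quick calculation. This is only a sketch: it assumes decimal units (1TB = 1000GB), ignores filesystem and RAID overhead, and `drives_needed` is a hypothetical helper name:

```python
import math


def drives_needed(dataset_gb, drive_gb):
    """Number of whole drives required to hold dataset_gb of data."""
    return math.ceil(dataset_gb / drive_gb)


dataset_gb = 20 * 1000  # 20TB, assuming 1TB = 1000GB

print(drives_needed(dataset_gb, 200))  # existing 200GB drives -> 100
print(drives_needed(dataset_gb, 750))  # replica on 750GB drives -> 27
```

With a one-to-one brick pairing, the replica side would need one drive per existing drive regardless of size, which is the 100 versus 27 gap described in the quoted message.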

>
> Also, I have the premise that 100% of the dataset is critical (it is all
> user data), and I cannot say which file extensions should or should not be
> replicated. The example that *.c is more critical than *.o is probably
> true, but users have told me that they have .o files from systems that are
> no longer available, so those .o files are critical for that user. Since I
> cannot specify *.c:2,*.o:1 for some users and *.c:2,*.o:2 for others (nor
> would I really want to get that involved in the user data details, or think
> I'll have that much free time to investigate that level of detail), it only
> makes sense to replicate everything, e.g. *:2 or *:3. It is a cool feature
> to have. But if a user specifies *.c:2,*.o:1, then that assumes (with the
> current implementation of AFR) that the 2nd brick should be smaller than
> the first brick (then I have questions as to what happens when there isn't
> enough space, etc.).

That is right, the second brick should be smaller than or equal to the first disk.

Regards,
Krishna



Current Thread

[Gluster-devel] AFR comments. Maximizing free space use when using mirroring., DeeDee Park, 2007/07/24
- Re: [Gluster-devel] AFR comments. Maximizing free space use when using mirroring., Krishna Srinivas, 2007/07/31
  - Re: [Gluster-devel] AFR comments. Maximizing free space use when using mirroring, DeeDee Park <=
