
Re: [Gluster-devel] Puppet-Gluster+ThinP


From: Ric Wheeler
Subject: Re: [Gluster-devel] Puppet-Gluster+ThinP
Date: Sun, 20 Apr 2014 17:44:41 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

On 04/20/2014 05:11 PM, James wrote:
> On Sun, Apr 20, 2014 at 7:59 PM, Ric Wheeler <address@hidden> wrote:
>> The amount of space you set aside is very much workload dependent (rate of
>> change, rate of deletion, rate of notifying the storage about the freed
>> space).
> From the Puppet-Gluster perspective, this will be configurable. I
> would like to set a vaguely sensible default though, which I don't
> have at the moment.

This will require a bit of thinking as you have noticed, but let's start with some definitions.

The basic use case is one file system backed by an exclusive dm-thinp target (no other file system writing to that dm-thinp pool or contending for allocation).
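To make that concrete, roughly something like this (a minimal sketch; the volume group and LV names are made up, and XFS is just one choice of brick file system):

    # Dedicated thin pool; nothing else allocates from it.
    lvcreate --type thin-pool -L 10T -n brickpool vg_bricks

    # One thin volume backing one file system, presented at full size.
    lvcreate --thin -V 10T -n brick1 vg_bricks/brickpool

    # The file system is the only consumer of the pool.
    mkfs.xfs /dev/vg_bricks/brick1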

The goal is to get an alert in time to intervene before things get ugly, so we are hoping to get a sense of the rate of change in the file system and how long any snapshot will be retained.

For example, if we have a 10TB file system (presented as such to the user) and we write say 500GB of new data/day, daily snapshots will need that space for as long as we retain them. If you write much less (5GB/day), it will clearly take a lot less.

The above makes this all an effort to predict the future, but that is where the watermark alert kicks in to help us recover from a bad prediction.

Maybe we use a default of setting aside 20% of raw capacity for snapshots and set that watermark at 90% full? For a lot of use cases, I suspect a fairly low rate of change, and that means pretty skinny snapshots.
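In LVM terms the sketch might look like this (sizes and names made up, just to show the shape of the default):

    # 12T of raw capacity: present ~80% to the file system and keep
    # ~20% of the pool as headroom for snapshot copy-on-write churn.
    lvcreate --type thin-pool -L 12T -n brickpool vg_bricks
    lvcreate --thin -V 9.6T -n brick1 vg_bricks/brickpool

    # Watch pool usage against the 90% watermark (Data% column).
    lvs -o lv_name,data_percent,metadata_percent vg_bricks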

We will clearly need to put a lot of effort into explaining this to users so they can make the trade-off for their particular use case.


>> Keep in mind with snapshots (and thinly provisioned storage, whether using a
>> software target or thinly provisioned array) we need to issue the "discard"
>> commands down the IO stack in order to let the storage target reclaim space.
>>
>> That typically means running the fstrim command on the local file system
>> (XFS, ext4, btrfs, etc.) every so often. Less typically, you can mount your
>> local file system with "-o discard" to do it in band (but that usually comes
>> at a performance penalty).
> Do you think it would make sense to have Puppet-Gluster add a cron job
> to do this operation?
> Exactly what command should run, and how often? (Again for having
> sensible defaults.)

I think that we should probably run fstrim once a day or so (hopefully late at night or off peak)? Adding in Lukas, who led a lot of the discard work.
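Something along these lines, for example (illustrative only; the mount point and schedule are placeholders, one entry per brick):

    # /etc/cron.d/fstrim-bricks: discard unused blocks nightly, off peak.
    30 3 * * * root /usr/sbin/fstrim -v /export/brick1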


>> There is also an event mechanism to help us get notified when we hit a
>> configurable target watermark ("help, we are running short on real disk,
>> add more or clean up!").
> Can you point me to some docs about this feature?

My quick Google search only shows my own very shallow talk slides, so let me dig around for something better :)
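In the meantime, the moving parts as I understand them are dmeventd monitoring plus the autoextend knobs in lvm.conf (a sketch from memory, so double check the details):

    # Make sure dmeventd is watching the pool so threshold events fire.
    lvchange --monitor y vg_bricks/brickpool

    # dmeventd logs to syslog as the pool fills (80/85/90/95% steps), and
    # with thin_pool_autoextend_threshold/percent set in lvm.conf it will
    # also try to grow the pool automatically.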


>> Definitely worth following up with the LVM/device mapper people on how to do
>> this best,
>>
>> Ric
> Thanks for the comments. From everyone I've talked to, it seems some
> of the answers are still in progress. The good news is that I'm ahead
> of the curve in being ready for when this becomes more mainstream. I
> think Paul is in the same position too.
>
> James

This is all new stuff - even without Gluster on top of it - so this will mean hitting a few bumps, I fear. Definitely worth putting thought into this now and working on the documentation,

Ric



