
Re: [Gluster-devel] GlusterFS Snapshot internals


From: Rajesh Joseph
Subject: Re: [Gluster-devel] GlusterFS Snapshot internals
Date: Mon, 7 Apr 2014 06:12:53 -0400 (EDT)

Thanks, Paul, for your valuable comments. Please find my comments inlined below.

Please let us know if you have more questions or need further clarification. I will try to
update the doc wherever more clarity is needed.

Thanks & Regards,
Rajesh

----- Original Message -----
From: "Paul Cuzner" <address@hidden>
To: "Rajesh Joseph" <address@hidden>
Cc: "gluster-devel" <address@hidden>
Sent: Monday, April 7, 2014 1:59:10 AM
Subject: Re: [Gluster-devel] GlusterFS Snapshot internals

Hi Rajesh, 

Thanks for updating the design doc. It reads well. 

I have a number of questions that would help my understanding:

Logging : The doc doesn't mention how the snapshot process is logged - 
- will snapshot use an existing log or a new log? 
[RJ]: As of now snapshot makes use of the existing logging framework.
- Will the log be specific to a volume, or will all snapshot activity be logged 
in a single file? 
[RJ]: The snapshot module is embedded in the gluster core framework. Therefore the logs 
will also be part of the glusterd logs.
- will the log be visible on all nodes, or just the originating node? 
[RJ]: Similar to glusterd, the snapshot logs related to each node will be visible on 
that node.
- will the high-level snapshot action be visible when looking from the other 
nodes, either in the logs or at the cli? 
[RJ]: As of now the high-level snapshot action will be visible only in the logs of 
the originator node. However, the cli can be used to see the list and info of 
snapshots from any other node.

Restore : You mention that after a restore operation, the snapshot will be 
automatically deleted. 
- I don't believe this is a prudent thing to do. Here's an example I've seen 
a lot: an application has a programmatic error, leading to data 'corruption' - devs 
work on the program, storage guys roll the volume back. So far so good... devs 
provide the updated program, and away you go... BUT the issue is not resolved, 
so you need to roll back again to the same point in time. If you delete the 
snap automatically, you lose the restore point. Yes, the admin could take 
another snap after the restore - but why add more work to a recovery process 
where people are already stressed out :) I'd recommend leaving the snapshot if 
possible, and letting it age out naturally. 
[RJ]: Snapshot restore is a simple operation wherein the volume bricks will simply 
point to the brick snapshots instead of the original bricks. Therefore, once the 
restore is done we cannot use the same snapshot again. We are planning to 
implement a configurable option which will automatically take a snapshot of the 
snapshot to fulfill the above-mentioned requirement. But with the given 
timeline and resources we will not be able to target it in the coming release. 
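
To make the restore semantics above concrete, here is a minimal, purely illustrative
sketch (Python, not glusterd code; names such as Volume, Snapshot and restore are
assumptions) of why a snapshot cannot be reused after a restore, and how a
snapshot-of-the-snapshot would preserve the restore point:

# Illustrative model only -- not GlusterFS source. Shows why a restore
# consumes the snapshot: the volume's bricks are simply repointed at the
# snapshot's bricks, so the snapshot no longer exists independently.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Volume:
    name: str
    bricks: List[str]                  # brick paths (e.g. LVM thin volumes)


@dataclass
class Snapshot:
    name: str
    origin: str                        # name of the origin volume
    bricks: List[str]                  # per-brick snapshot paths
    consumed: bool = False


def restore(volume: Volume, snap: Snapshot) -> None:
    """Repoint the origin volume's bricks at the snapshot bricks."""
    if snap.consumed:
        raise RuntimeError("snapshot %s was already used for a restore" % snap.name)
    volume.bricks = list(snap.bricks)  # the bricks now ARE the snapshot bricks,
    snap.consumed = True               # hence the snapshot is gone after restore


def restore_keeping_point_in_time(volume: Volume, snap: Snapshot,
                                  take_snapshot: Callable[[Snapshot], Snapshot]) -> Snapshot:
    """Proposed future behaviour: snapshot the snapshot before restoring,
    so the same restore point can be reused later."""
    backup = take_snapshot(snap)       # snapshot-of-snapshot
    restore(volume, snap)
    return backup

The second helper corresponds to the configurable snapshot-of-the-snapshot option
mentioned above.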

Auto-delete : Is this a post phase of the snapshot create, so the successful 
creation of a new snapshot will trigger the pruning of old versions? 
[RJ]: Yes, if we reach the snapshot limit for a volume then the snapshot create 
operation will trigger pruning of older snapshots.
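
For illustration, the pruning step could look roughly like the sketch below
(hypothetical names; the actual per-volume limit is whatever the snapshot
configuration sets):

# Hypothetical sketch of auto-delete: creating a snapshot that would exceed
# the per-volume limit first prunes the oldest snapshots.

from collections import deque

SNAP_MAX_LIMIT = 256                          # assumed per-volume limit


def create_snapshot(snapshots: deque, new_snap: str,
                    limit: int = SNAP_MAX_LIMIT) -> None:
    """'snapshots' is ordered oldest -> newest."""
    while len(snapshots) >= limit:
        oldest = snapshots.popleft()          # auto-delete, triggered by create
        print("pruning old snapshot %s" % oldest)
    snapshots.append(new_snap)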

Snapshot Naming : The doc states the name is mandatory. 
- why not offer a default - volume_name_timestamp - instead of making the 
caller decide on a name. Having this as a default will also make the list under 
.snap more usable by default. 
- providing a sensible default will make it easier for end users for self 
service restore. More sensible defaults = more happy admins :)
[RJ]: This is a good-to-have feature; we will try to incorporate it in the 
next release.
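
For reference, a volume_name_timestamp default along the lines Paul suggests could
be as simple as the sketch below (the exact format here is an assumption, not
something defined by GlusterFS):

# Illustrative default-name generator: <volume_name>_<timestamp>.

from datetime import datetime, timezone


def default_snap_name(volume_name: str) -> str:
    ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return "%s_%s" % (volume_name, ts)


print(default_snap_name("vol0"))              # e.g. vol0_20140407_101253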

Quorum and snaprestore : the doc mentions that when an offline brick comes 
back, it will be snap'd before pending changes are applied. If I understand the 
use of quorum correctly, can you comment on the following scenario: 
- With a brick offline, we'll be tracking changes. Say after 1hr a snap is 
invoked because quorum is met 
- changes continue on the volume for another 15 minutes beyond the snap, when 
the offline brick comes back online. 
- at this point there are two points in time to bring the brick back to - the 
brick needs the changes up to the point of the snap, then a snap of the brick, 
followed by the 'replay' of the additional changes to get back to the same 
point in time as the other replicas in the replica set. 
- of course, the brick could be offline for 24 or 48 hours due to a hardware 
fault - during which time multiple snapshots could have been made 
- it wasn't clear to me from the doc how this scenario is dealt with. 
[RJ]: The following action is taken in case we miss a snapshot on a brick (a rough 
sketch follows below).
+ Let's say brick2 is down while taking snapshot s1.
+ Snapshot s1 will be taken for all the bricks except brick2. We will update the 
bookkeeping about the missed activity.
+ I/O can continue to happen on the origin volume.
+ brick2 comes up. At this moment we take a snapshot before we allow new I/O or 
heal of the brick. If multiple snaps were missed then all of them are taken at 
this time. We don't wait till the brick is brought to the same state as the other 
bricks.
+ brick2_s1 (snap of brick2) will be added to the s1 volume (snapshot volume). Self-heal 
will take care of bringing brick2's state in sync with the rest of its replica set.
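
The sequence above, paraphrased as a rough sketch (hypothetical structures and
helper names, not glusterd code): missed snapshots are recorded while the brick is
down and are all taken as soon as the brick returns, before new I/O or self-heal.

# Hypothetical bookkeeping for snapshots missed while a brick is down.
# It only mirrors the sequence described above.

from collections import defaultdict

missed = defaultdict(list)                    # brick -> [missed snapshot names]


def take_volume_snapshot(snap_name, bricks, online):
    """Snapshot every online brick; record the miss for offline bricks."""
    for brick in bricks:
        if online[brick]:
            take_brick_snapshot(brick, snap_name)
        else:
            missed[brick].append(snap_name)   # bookkeeping of missed activity


def on_brick_return(brick):
    """Runs when a brick comes back, before new I/O or self-heal starts."""
    for snap_name in missed.pop(brick, []):
        take_brick_snapshot(brick, snap_name)        # all missed snaps taken now
        add_brick_to_snap_volume(snap_name, brick)
    # self-heal is then free to bring the brick and its snapshots in sync


def take_brick_snapshot(brick, snap_name):           # placeholder for illustration
    print("snapshotting %s for %s" % (brick, snap_name))


def add_brick_to_snap_volume(snap_name, brick):      # placeholder for illustration
    print("adding %s snapshot to snapshot volume %s" % (brick, snap_name))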


barrier : two things are mentioned here - a buffer size and a timeout value. 
- from an admin's perspective, being able to specify the timeout (secs) is 
likely to be more workable - and will allow them to align this setting with any 
potential timeout setting within the application running against the gluster 
volume. I don't think most admins will know or want to know how to size the 
buffer properly. 
[RJ]: In the current release we are only providing the timeout value as a 
configurable option. The buffer size is being considered for a future release, 
either as a configurable option or with us determining the optimal value 
ourselves based on the user's system configuration.
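
As a rough illustration of the timeout semantics only (a toy model, not the
GlusterFS barrier translator): write-class operations are held while the barrier
is enabled, and if the barrier is not released within the configured timeout the
held operations are flushed so clients are not blocked indefinitely.

# Toy barrier with a configurable timeout; purely illustrative.

import threading


class Barrier:
    def __init__(self, timeout_secs: float = 120.0):
        self.timeout = timeout_secs
        self.held = []                        # buffered requests
        self.enabled = False
        self._timer = None

    def enable(self):
        """Start holding requests; auto-release after the timeout."""
        self.enabled = True
        self._timer = threading.Timer(self.timeout, self.disable)
        self._timer.start()

    def submit(self, request):
        if self.enabled:
            self.held.append(request)         # buffer while the barrier is up
        else:
            self._dispatch(request)

    def disable(self):
        """Release the barrier and flush everything that was held."""
        self.enabled = False
        if self._timer is not None:
            self._timer.cancel()
        for request in self.held:
            self._dispatch(request)
        self.held = []

    def _dispatch(self, request):
        print("dispatching %s" % request)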

Hopefully the above makes sense. 

Cheers, 

Paul C 

----- Original Message -----

> From: "Rajesh Joseph" <address@hidden>
> To: "gluster-devel" <address@hidden>
> Sent: Wednesday, 2 April, 2014 3:55:28 AM
> Subject: [Gluster-devel] GlusterFS Snapshot internals

> Hi all,

> I have updated the GlusterFS snapshot forge wiki.

> https://forge.gluster.org/snapshot/pages/Home

> Please go through it and let me know if you have any questions or queries.

> Best Regards,
> Rajesh

> [PS]: Please ignore previous mail. Accidentally hit send before completing :)

> _______________________________________________
> Gluster-devel mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/gluster-devel


