Re: [Gluster-devel] AFR Heal Bug


From: Gareth Bult
Subject: Re: [Gluster-devel] AFR Heal Bug
Date: Sun, 30 Dec 2007 20:33:22 +0000 (GMT)

Hi,

Many thanks, a fix would be great .. :)

I've been doing a little more testing and can confirm that AFR definitely does
not honor sparse files when healing.

This is particularly noticeable when using XEN images.

A typical XEN image might be 3G, for example, with "du" reporting 600M used.
After "healing" the image to another brick, it still shows 3G in size, but du
now reports 3G used.

This makes a fair difference to my "images" volume (!)

[in addition to the problems when applied to stripes!]
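
To illustrate what honoring sparse files would mean here (this is only a
sketch, not glusterfs code, and the file names are made up), a block-by-block
copy can keep the holes by seeking over all-zero blocks instead of writing
them:

/* Sketch only: a sparse-aware copy loop. Blocks that are entirely zero are
 * skipped with lseek() on the destination instead of being written, so the
 * copy keeps its holes. The file names are invented for the example. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BLK 65536

static int all_zero(const char *buf, ssize_t n)
{
    while (n--)
        if (*buf++)
            return 0;
    return 1;
}

int main(void)
{
    char buf[BLK];
    int src = open("xen-image.src", O_RDONLY);
    int dst = open("xen-image.dst", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    ssize_t n;

    if (src < 0 || dst < 0) {
        perror("open");
        return 1;
    }
    while ((n = read(src, buf, BLK)) > 0) {
        if (all_zero(buf, n)) {
            lseek(dst, n, SEEK_CUR);          /* leave a hole */
        } else if (write(dst, buf, n) != n) {
            perror("write");
            return 1;
        }
    }
    /* extend the file to its full length in case it ends with a hole */
    ftruncate(dst, lseek(dst, 0, SEEK_CUR));
    close(src);
    close(dst);
    return 0;
}

A plain read()/write() copy, by contrast, writes back the zeros it reads out
of every hole, which is exactly why du jumps from 600M to 3G after a heal.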

Regards,
Gareth.


----- Original Message -----
From: "Krishna Srinivas" <address@hidden>
To: "Gareth Bult" <address@hidden>
Cc: "gluster-devel" <address@hidden>
Sent: Sunday, December 30, 2007 8:10:42 PM (GMT) Europe/London
Subject: Re: [Gluster-devel] AFR Heal Bug

Hi Gareth,

Yes, this bug was introduced recently, after we changed the way the readdir()
call works in glusterfs: AFR is calling readdir() only on its first child
(which is blank in your case). A fix will be on its way in a couple of days.
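
Purely as an illustration of the idea (this is not the actual glusterfs code
or the coming fix, and the backend paths are invented), building a directory
listing as the union of the entries on every child, rather than trusting only
the first one, looks roughly like this:

/* Conceptual sketch only (not glusterfs code): list a directory as the union
 * of its entries on every replica, so an empty first replica cannot hide the
 * data held by the others. The backend paths are hypothetical. */
#include <dirent.h>
#include <search.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *replicas[] = { "/export/nodea/homes", "/export/nodeb/homes" };
    size_t i;

    hcreate(4096);                           /* crude dedup table */
    for (i = 0; i < sizeof(replicas) / sizeof(replicas[0]); i++) {
        DIR *d = opendir(replicas[i]);
        struct dirent *e;

        if (!d)
            continue;                        /* replica down: skip, don't trust it alone */
        while ((e = readdir(d)) != NULL) {
            ENTRY item;

            if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
                continue;
            item.key = e->d_name;
            item.data = NULL;
            if (!hsearch(item, FIND)) {      /* first time we've seen this name */
                item.key = strdup(e->d_name);
                hsearch(item, ENTER);
                puts(item.key);
            }
        }
        closedir(d);
    }
    return 0;
}

With that approach a blank first child simply contributes nothing, instead of
hiding what the other replicas hold.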

Thanks
Krishna

On Dec 31, 2007 12:39 AM, Gareth Bult <address@hidden> wrote:
> Ok, I'm going to call it a bug, tell me if I'm wrong .. :)
>
> (two servers, both define a "homes" volume)
>
> Client;
>
> volume nodea-homes
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host nodea
>   option remote-subvolume homes
> end-volume
>
> volume nodeb-homes
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host nodeb
>   option remote-subvolume homes
> end-volume
>
> volume homes-afr
>   type cluster/afr
>   subvolumes nodea-homes nodeb-homes   ### ISSUE IS HERE! ###
>   option scheduler rr
> end-volume
>
> Assume the system is completely up-to-date and working OK.
> Mount homes filesystem on "client".
> Kill the "nodea" server.
> System carries on, effectively using nodeb.
>
> Wipe nodea's physical volume.
> Restart nodea server.
>
> All of a sudden, the client sees an empty "homes" filesystem, although the
> data is still in place on "B" and "A" is blank.
> i.e. the client is seeing only the blank "nodea" (!)
>
> .. at this point you check nodeb to make sure your data really is there, then 
> you can mop up the coffee you've just spat all over your screens ..
>
> If you crash nodeB instead, there appears to be no problem, and a self-heal
> "find" will correct the blank volume.
> Alternatively, if you reverse the subvolumes as listed above, you don't see
> the problem.
>
> The issue appears to be blanking the first subvolume.
>
> I'm thinking the order of the volumes should not matter: gluster should know
> that one volume is empty/new and the other contains real data, and act
> accordingly, rather than relying on the order the volumes are listed in .. (???)
>
> I'm using fuse glfs7 and gluster 1.3.8 (tla).
> _______________________________________________
> Gluster-devel mailing list
> address@hidden
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>




