[Gluster-devel] AFR Heal Bug
From: Gareth Bult
Subject: [Gluster-devel] AFR Heal Bug
Date: Sun, 30 Dec 2007 19:09:56 +0000 (GMT)
Ok, I'm going to call it a bug, tell me if I'm wrong .. :)
(two servers, both define a "homes" volume)
Client:
volume nodea-homes
type protocol/client
option transport-type tcp/client
option remote-host nodea
option remote-subvolume homes
end-volume
volume nodeb-homes
type protocol/client
option transport-type tcp/client
option remote-host nodeb
option remote-subvolume homes
end-volume
volume homes-afr
type cluster/afr
subvolumes nodea-homes nodeb-homes ### ISSUE IS HERE! ###
option scheduler rr
end-volume
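For reference, the matching server-side volfile that both nodes would carry might look something like this. This is only a sketch: the export directory and the auth option are assumptions, not taken from the original report.

volume homes
type storage/posix
option directory /data/export/homes
end-volume

volume server
type protocol/server
option transport-type tcp/server
option auth.ip.homes.allow *
subvolumes homes
end-volume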
Assume the system is completely up to date and working OK.
Mount homes filesystem on "client".
Kill the "nodea" server.
System carries on, effectively using nodeb.
Wipe nodea's physical volume.
Restart nodea server.
All of a sudden, "client" sees an empty "homes" filesystem, although the data
is still in place on nodeb and nodea is blank.
i.e. the client is seeing only the blank "nodea" (!)
.. at this point you check nodeb to make sure your data really is there, then
you can mop up the coffee you've just spat all over your screens ..
If you crash nodeb instead, there appears to be no problem, and a self-heal
"find" will correct the blank volume.
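Since self-heal kicks in when a file is opened, the usual way to force a full heal is to walk the mount and open every file. A minimal sketch of such a "find" pass, assuming a mount point of /mnt/homes (that path is an assumption, not from the original report):

```shell
#!/bin/sh
# Walk the AFR mount and read the first byte of every file; each open()
# gives AFR a chance to self-heal that file from the good subvolume.
# MNT is an assumed mount point; override it for your own setup.
MNT="${MNT:-/mnt/homes}"
find "$MNT" -type f -print0 | xargs -0 -r head -c1 > /dev/null
```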
Alternatively, if you reverse the subvolumes as listed above, you don't see the
problem.
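Concretely, that workaround is just swapping the two names on the subvolumes line of the AFR declaration shown earlier (order reversed, everything else unchanged):

volume homes-afr
type cluster/afr
subvolumes nodeb-homes nodea-homes
option scheduler rr
end-volume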
The issue appears to be triggered when it is the first listed subvolume that
has been blanked.
I'm thinking the order of the volumes should not matter: gluster should be
able to tell that one volume is empty/new and the other contains real data,
and act accordingly, rather than relying on the order in which the volumes
are listed .. (???)
I'm using fuse glfs7 and gluster 1.3.8 (tla).