[Gluster-devel] AFR Heal Bug


From: Gareth Bult
Subject: [Gluster-devel] AFR Heal Bug
Date: Sun, 30 Dec 2007 19:09:56 +0000 (GMT)

Ok, I'm going to call it a bug, tell me if I'm wrong .. :) 

(two servers, both define a "homes" volume) 

Client:

volume nodea-homes
  type protocol/client
  option transport-type tcp/client
  option remote-host nodea
  option remote-subvolume homes
end-volume

volume nodeb-homes
  type protocol/client
  option transport-type tcp/client
  option remote-host nodeb
  option remote-subvolume homes
end-volume

volume homes-afr
  type cluster/afr
  subvolumes nodea-homes nodeb-homes   ### ISSUE IS HERE! ###
  option scheduler rr
end-volume
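
(For completeness, the server-side spec on each node looks roughly like the sketch
below. The export directory and the auth line are my guesses at a typical 1.3.x
setup rather than copies of the real files, so adjust to taste.)

Server (per node, sketch):

volume homes
  type storage/posix
  option directory /export/homes          # assumed export path
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option auth.ip.homes.allow *            # assumed: allow any client
  subvolumes homes
end-volume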

Assume the system is completely up-to-date and working OK.
Mount the homes filesystem on "client".
Kill the "nodea" server.
The system carries on, effectively using nodeb.

Wipe nodea's physical volume.
Restart the nodea server.
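
(For the record, the sequence on my side is roughly the shell sketch below. The
spec-file paths, mountpoint and export directory are placeholders, and the exact
glusterfs/glusterfsd invocations are from memory of the 1.3 docs, so treat them
as approximate.)

  # on the client: mount the AFR volume using the spec above
  glusterfs -f /etc/glusterfs/client.vol /mnt/homes

  # on nodea: stop the server, wipe its backend, start it again empty
  killall glusterfsd
  rm -rf /export/homes/*                  # assumed export path
  glusterfsd -f /etc/glusterfs/server.vol

  # back on the client: the mount now shows nodea's (empty) view
  ls /mnt/homes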

All of a sudden, "client" sees an empty "homes" filesystem, although the data is
still in place on "B" and "A" is blank.
I.e. the client is seeing only the blank "nodea" (!)

.. at this point you check nodeb to make sure your data really is there, then 
you can mop up the coffee you've just spat all over your screens .. 

If you crash nodeB instead, there appears to be no problem, and a self-heal
triggered by a "find" over the mount (one-liner below) will repopulate the blank
volume.
Alternatively, if you reverse the order of the subvolumes listed above, you don't
see the problem.
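
(This is the sort of thing I mean by a self-heal "find" - the mountpoint is just
an example; reading the first byte of each file forces AFR to open and heal it:)

  find /mnt/homes -type f -exec head -c1 '{}' \; > /dev/null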

The issue appears to be triggered by blanking the first listed subvolume.

I'm thinking the order of the volumes should not be an issue; gluster should know
that one volume is empty/new and the other contains real data, and act accordingly,
rather than relying on the order in which the volumes are listed .. (???)
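
(If I've read the docs right, AFR keeps the version info it needs for exactly this
decision in extended attributes on the backend files, so the data is there to be
compared. Something like the following on each server's backend should show it;
the attribute names are my assumption from the docs, hence the pattern match:)

  # run on the server's backend directory (path assumed), not on the mount
  getfattr -d -m trusted.afr -e hex /export/homes/somefile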

I'm using fuse glfs7 and gluster 1.3.8 (tla). 

