[Gluster-devel] AFR Heal Bug
From: Gareth Bult
Subject: [Gluster-devel] AFR Heal Bug
Date: Sun, 30 Dec 2007 19:09:56 +0000 (GMT)
Ok, I'm going to call it a bug, tell me if I'm wrong .. :)
(two servers, both define a "homes" volume)
Client:
volume nodea-homes
type protocol/client
option transport-type tcp/client
option remote-host nodea
option remote-subvolume homes
end-volume
volume nodeb-homes
type protocol/client
option transport-type tcp/client
option remote-host nodeb
option remote-subvolume homes
end-volume
volume homes-afr
type cluster/afr
subvolumes nodea-homes nodeb-homes ### ISSUE IS HERE! ###
option scheduler rr
end-volume
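For reference, the matching server-side volfile that both nodes would carry might look something like this. This is only a sketch: the export directory and the auth option are assumptions, not taken from the original report.

volume homes
type storage/posix
option directory /data/export/homes
end-volume

volume server
type protocol/server
option transport-type tcp/server
option auth.ip.homes.allow *
subvolumes homes
end-volume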
Assume the system is completely up to date and working OK.
Mount homes filesystem on "client".
Kill the "nodea" server.
System carries on, effectively using nodeb.
Wipe nodea's physical volume.
Restart nodea server.
All of a sudden, "client" sees an empty "homes" filesystem, although the data
is still in place on nodeb and nodea is blank.
i.e. the client is seeing only the blank "nodea" (!)
.. at this point you check nodeb to make sure your data really is there, then
you can mop up the coffee you've just spat all over your screens ..
If you crash nodeb instead, there appears to be no problem, and a self-heal
"find" will correct the blank volume.
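Since self-heal kicks in when a file is opened, the usual way to force a full heal is to walk the mount and open every file. A minimal sketch of such a "find" pass, assuming a mount point of /mnt/homes (that path is an assumption, not from the original report):

```shell
#!/bin/sh
# Walk the AFR mount and read the first byte of every file; each open()
# gives AFR a chance to self-heal that file from the good subvolume.
# MNT is an assumed mount point; override it for your own setup.
MNT="${MNT:-/mnt/homes}"
find "$MNT" -type f -print0 | xargs -0 -r head -c1 > /dev/null
```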
Alternatively, if you reverse the subvolumes as listed above, you don't see the
problem.
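Concretely, that workaround is just swapping the two names on the subvolumes line of the AFR declaration shown earlier (order reversed, everything else unchanged):

volume homes-afr
type cluster/afr
subvolumes nodeb-homes nodea-homes
option scheduler rr
end-volume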
The issue appears to be triggered when it is the first listed subvolume that
has been blanked.
I'm thinking the order of the volumes should not matter: gluster should be
able to tell that one volume is empty/new and the other contains real data,
and act accordingly, rather than relying on the order in which the volumes
are listed .. (???)
I'm using fuse glfs7 and gluster 1.3.8 (tla).