Re: [Gluster-devel] rc8


From: ender
Subject: Re: [Gluster-devel] rc8
Date: Wed, 22 Apr 2009 14:54:50 -0700
User-agent: Thunderbird 2.0.0.21 (X11/20090318)

Closer, but still no cigar..

all nodes: killall glusterfsd; killall glusterfs;
all nodes: rm -rf /tank/*
all nodes: glusterfsd -f /usr/local/etc/glusterfs/glusterfsd.vol
all nodes: mount -t glusterfs /usr/local/etc/glusterfs/glusterfs.vol /gtank
node3:~# cp -R gluster /gtank/gluster1
*simulating a hardware failure
node1:~# killall glusterfsd ; killall glusterfs;
node1:~# killall glusterfsd ; killall glusterfs;
glusterfsd: no process killed
glusterfs: no process killed
node1:~# rm -rf /tank/*
*data never stops changing, just because we have a failed node
node3:~# cp -R gluster /gtank/gluster2
all nodes but node1:~# ls -lR /gtank/ | wc -l
2782
all nodes but node1:~# ls -lR /gtank/gluster1 | wc -l
1393
all nodes but node1:~# ls -lR /gtank/gluster2 | wc -l
1393
*Adding hardware back into the network after replacing bad hard drive(s)
node1:~# glusterfsd -f /usr/local/etc/glusterfs/glusterfsd.vol
node1:~# mount -t glusterfs /usr/local/etc/glusterfs/glusterfs.vol /gtank
node3:~# ls -lR /gtank/ | wc -l
1802
node3:~# ls -lR /gtank/gluster1 | wc -l
413
node3:~# ls -lR /gtank/gluster2 | wc -l
1393

Are you aware that taking the broken node1 out fixes the gluster system again?
node1:~# killall glusterfsd ; killall glusterfs;
node1:~# killall glusterfsd ; killall glusterfs;
glusterfsd: no process killed
glusterfs: no process killed
all nodes but node1:~# ls -lR /gtank/ | wc -l
2782
all nodes but node1:~# ls -lR /gtank/gluster1 | wc -l
1393
all nodes but node1:~# ls -lR /gtank/gluster2 | wc -l
1393

Add it back in (start glusterfsd and remount as above):
node3:~# ls -lR /gtank/gluster1 | wc -l
413

And it's broken again.
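
A quick way to see which replica the stale listing is coming from is to compare
the backend export and the mount on every box. Rough sketch only -- node1.ip,
node2.ip and node3.ip stand in for the real addresses, and it assumes root ssh
between the boxes:

for n in node1.ip node2.ip node3.ip; do
    echo "== $n =="
    ssh root@"$n" 'echo -n "backend /tank:  "; ls -lR /tank  | wc -l;
                   echo -n "mount   /gtank: "; ls -lR /gtank | wc -l'
done

The node whose /gtank count drops after node1 rejoins is the one serving the
broken listing; if /tank still holds everything on node2 and node3, the data is
intact and only the self-heal/lookup path is at fault.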


Thank you for working on gluster, and for the response!
Anand Avati wrote:
Ender,
  There was a bug fix which went into git today which fixes a similar
bug: a case where only a subset of the files would be recreated if there
are a lot of files (~1000 or more) and the node which was down was
the first subvolume in the list. Please pull the latest patches and
see if it solves your case. Thank you for your patience!

Avati

On Thu, Apr 23, 2009 at 2:29 AM, ender <address@hidden> wrote:
I was just wondering if the self-heal bug is planned to be fixed, or if the
developers are just ignoring it in hopes it will go away. Every time I ask
someone privately if they can reproduce the problem on their own end, they
go silent (which leads me to believe that they in fact can reproduce it).

The setup is very simple: AFR with as many subvolumes as you want. The first
listed subvolume will always break the self-heal; node2 and node3 always heal
fine. Swap the IP address of the first listed subvolume and you will swap the
box which breaks the self-heal. I have been able to repeat this bug every day
with the newest git for the last month.
Please let us know if this is not considered a bug, or acknowledge it in
some fashion. Thank you.

Same configs:
all nodes: killall glusterfsd; killall glusterfs;
all nodes: rm -rf /tank/*
all nodes: glusterfsd -f /usr/local/etc/glusterfs/glusterfsd.vol
all nodes: mount -t glusterfs /usr/local/etc/glusterfs/glusterfs.vol /gtank
node3:~# cp -R gluster /gtank/gluster1
*simulating a hardware failure
node1:~# killall glusterfsd ; killall glusterfs;
node1:~# killall glusterfsd ; killall glusterfs;
glusterfsd: no process killed
glusterfs: no process killed
node1:~# rm -rf /tank/*
*data never stops changing, just because we have a failed node
node3:~# cp -R gluster /gtank/gluster2
all nodes but node1:~# ls -lR /gtank/ | wc -l
2780
all nodes but node1:~# ls -lR /gtank/gluster1 | wc -l
1387
all nodes but node1:~# ls -lR /gtank/gluster2 | wc -l
1387
*Adding hardware back into the network after replacing bad hard drive(s)
node1:~# glusterfsd -f /usr/local/etc/glusterfs/glusterfsd.vol
node1:~# mount -t glusterfs /usr/local/etc/glusterfs/glusterfs.vol /gtank
node3:~# ls -lR /gtank/ | wc -l
1664
node3:~# ls -lR /gtank/gluster1 | wc -l
271
node3:~# ls -lR /gtank/gluster2 | wc -l
1387


### Export volume "brick" with the contents of "/tank" directory.
volume posix
 type storage/posix                     # POSIX FS translator
 option directory /tank                 # Export this directory
end-volume

volume locks
 type features/locks
 subvolumes posix
end-volume

volume brick
 type performance/io-threads
 subvolumes locks
end-volume

### Add network serving capability to above brick.
volume server
 type protocol/server
 option transport-type tcp
 subvolumes brick
 option auth.addr.brick.allow *         # Allow access to "brick" volume
 option client-volume-filename /usr/local/etc/glusterfs/glusterfs.vol
end-volume
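
On the backend side it might also help to dump the extended attributes that
replicate leaves on the exported files. If I remember right, cluster/replicate
keeps its pending-change counts in trusted.afr.* xattrs, so something like this
(the file path is just a placeholder), run on each server, should show which
replica still thinks a heal is owed:

getfattr -d -m . -e hex /tank/gluster1/some-file-that-went-missing

Non-zero trusted.afr.* values on node2/node3 pointing at node1 would suggest
the accounting is there and the heal simply isn't being carried out.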


#
#mirror block0
#
volume node1
 type protocol/client
 option transport-type tcp
 option remote-host node1.ip            # IP address of the remote brick
# option transport-timeout 30           # seconds to wait for a reply from server for each request
 option remote-subvolume brick          # name of the remote volume
end-volume

volume node2
 type protocol/client
 option transport-type tcp
 option remote-host node2.ip            # IP address of the remote brick
# option transport-timeout 30           # seconds to wait for a reply from server for each request
 option remote-subvolume brick          # name of the remote volume
end-volume

volume node3
 type protocol/client
 option transport-type tcp
 option remote-host node3.ip            # IP address of the remote brick
# option transport-timeout 30           # seconds to wait for a reply from server for each request
 option remote-subvolume brick          # name of the remote volume
end-volume

volume mirrorblock0
 type cluster/replicate
 subvolumes node1 node2 node3
 option metadata-self-heal yes
end-volume
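
As far as I understand the 2.0 self-heal model, healing is triggered on lookup,
so after node1 is brought back the usual advice is to walk the whole mount from
one client so every entry gets looked up (and read, for file contents).
Something along these lines:

# force entry/metadata self-heal by listing every path
ls -lR /gtank > /dev/null
# force data self-heal by touching the contents of every file
find /gtank -type f -print0 | xargs -0 -n64 head -c1 > /dev/null

In the runs above that is effectively what the ls -lR commands are doing, and
the counts still come back short, which is why I think the entry self-heal
itself is what breaks when node1 is the first subvolume.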




Gordan Bobic wrote:
First-access failing bug still seems to be present.
But other than that, it seems to be distinctly better than rc4. :)
Good work! :)

Gordan


_______________________________________________
Gluster-devel mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/gluster-devel

