
Re: [Gluster-devel] self-heal behavior


From: DeeDee Park
Subject: Re: [Gluster-devel] self-heal behavior
Date: Mon, 09 Jul 2007 20:36:23 +0000

Just some more info for you. I'm glad to see there has been thought put into some sort of "backgrounding" of the AFR sync. I saw massive activity across all bricks when I started using AFR (with 750GB of user data, 270K files). I was doing an rsync to update the glusterfs data from the master file server. So anytime anyone wants to increase the number of replicas and then go through all the files (e.g. using an rsync), there will be a lot of activity going on.

I don't like the performance hit, and it is not "critical" that all the replication happen right away -- in fact, it is extremely low priority. My highest priority is low latency to my master files. It seems the best time to do the replication is not on open(), but maybe after some talk to a "scheduler" xlator that is also watching bandwidth availability.

I want to bring bandwidth into the picture, because making replicas across a WAN connection makes sense for disaster recovery. In the case of bricks that are out of sync with email files (where the replica data could chew up all available WAN bandwidth), maybe AFR could hand the work off to a separate "garbage cleanup" or file-syncing process that does its work in the background. I can foresee that with large datasets and network outages, bricks will go out of sync and have to resync so often that the resyncing is never "finished". Having a large performance hit or a heavy demand on the WAN bandwidth the moment a network outage is fixed would be horrible.
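To make that concrete, here is a minimal sketch of a background resync worker that yields to user traffic, assuming a hypothetical bandwidth probe and file-copy helper -- nothing like this exists in GlusterFS today, and the names, threshold, and file list are made up purely for illustration:

/* Sketch only: a background resync worker that backs off whenever the
 * WAN link is busy, so user traffic always wins over resync traffic.
 * link_utilisation() and resync_one() are stand-ins, not real code. */

#include <stdio.h>
#include <unistd.h>

/* stand-in: fraction of the WAN link currently in use, 0.0 - 1.0 */
static double link_utilisation (void) { return 0.2; }

/* stand-in: copy one out-of-sync file from a fresh brick */
static void resync_one (const char *path) { printf ("resyncing %s\n", path); }

#define BUSY_THRESHOLD 0.5   /* stay quiet while more than half the link is busy */
#define BACKOFF_SECS   30

int main (void)
{
        const char *pending[] = { "/data/mail/inbox", "/data/mail/archive" };
        size_t i;

        for (i = 0; i < sizeof (pending) / sizeof (pending[0]); i++) {
                /* back off instead of competing with user traffic */
                while (link_utilisation () > BUSY_THRESHOLD)
                        sleep (BACKOFF_SECS);

                resync_one (pending[i]);
        }
        return 0;
}

The point is only the control flow: resync work sleeps whenever the link is busy, so a fixed outage never turns into a sudden demand on the WAN.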


From: "Anand Avati" <address@hidden>
To: "Gerry Reno" <address@hidden>
CC: gluster-devel <address@hidden>
Subject: Re: [Gluster-devel] self-heal behavior
Date: Wed, 4 Jul 2007 19:33:14 +0530

Gerry,
your question is appropriate, but the answer to 'when to resync' is not
very simple. When a brick which was brought down is brought back up later, it
may be a completely new (empty) brick. In that case, starting to sync every
file would most likely be the wrong decision (we should rather sync the files
the user needs than some unused files). Even if we chose to sync files without
the user accessing them, it would be very sluggish, since it would interfere
with other operations.

The current approach is to sync a file on the next open() on it. This is
usually a good balance: if we sync a file during open(), even a 1 GB file
takes only 10-15 seconds, and for normal files (on the order of a few MB) it
is almost not noticeable. But if this were to happen for all files at once,
whether the user accessed them or not, there would be a lot of traffic and
everything would be very sluggish.
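As a rough illustration of the idea (not the actual AFR code -- the version counter, the helper functions, and the brick names below are invented), the decision made at open() time boils down to comparing a per-file change counter on each replica and copying from the fresher copy before the open returns:

/* Sketch of "sync on open()". Each replica keeps a per-file change
 * counter; on open() the counters are compared and, if they differ,
 * the stale copy is overwritten from the fresh one before the fd is
 * handed back to the application. All names here are illustrative. */

#include <stdio.h>

struct replica { const char *brick; long version; };

/* stand-in: full-file copy from the fresh brick to the stale one */
static void copy_file (const char *path, struct replica *from, struct replica *to)
{
        printf ("healing %s: %s -> %s\n", path, from->brick, to->brick);
        to->version = from->version;
}

/* called from open(): heal first, then let the open proceed normally */
static void heal_on_open (const char *path, struct replica *a, struct replica *b)
{
        if (a->version == b->version)
                return;                       /* both copies already in sync */

        if (a->version > b->version)
                copy_file (path, a, b);       /* a is fresher */
        else
                copy_file (path, b, a);       /* b is fresher */
}

int main (void)
{
        struct replica a = { "server1:/export", 7 };
        struct replica b = { "server2:/export", 5 };   /* server2 was down */

        heal_on_open ("/data/report.doc", &a, &b);
        return 0;
}

The cost described above comes from the copy step, which is proportional to the file size: a 1 GB file means a noticeable pause at open(), a few-MB file does not.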

This approach of syncing on open() is what other filesystems that support
redundancy do as well.

Detecting 'idle time', beginning the sync-up, and pausing it when the user
becomes active again is a very tricky job, but that is definitely what we aim
at eventually. It is not enough for AFR to detect that the client is free,
because the servers may be busy serving files to another client, and syncing
then may not be appropriate either. Upcoming versions of AFR will have more
options to tune 'when' to sync. Currently it happens only at open(). We plan
to add an option to sync on lookup() (which happens on ls). Later versions
will have proactive syncing (detecting that both servers and clients are
idle, etc).
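To illustrate how such tuning might look from a translator's point of view, here is a small sketch of a trigger policy -- the trigger names and this dispatcher are invented for illustration, not real AFR options:

/* Sketch of the "when to heal" choices being discussed. The enum,
 * the option names, and the dispatcher are illustrative only. */

#include <stdio.h>
#include <string.h>

typedef enum {
        HEAL_ON_OPEN,      /* current behaviour: heal when the file is opened   */
        HEAL_ON_LOOKUP,    /* planned: also heal on lookup(), i.e. during an ls */
        HEAL_PROACTIVE     /* later: heal in the background while idle          */
} heal_trigger_t;

/* stand-in: would start the actual self-heal of 'path' */
static void start_heal (const char *path, const char *why)
{
        printf ("healing %s (triggered by %s)\n", path, why);
}

/* a translator would call this from its open() and lookup() callbacks */
static void on_fop (heal_trigger_t policy, const char *fop, const char *path)
{
        if (policy == HEAL_ON_OPEN && strcmp (fop, "open") == 0)
                start_heal (path, fop);
        else if (policy == HEAL_ON_LOOKUP &&
                 (strcmp (fop, "open") == 0 || strcmp (fop, "lookup") == 0))
                start_heal (path, fop);
        /* HEAL_PROACTIVE would not wait for a fop at all; a background
           thread would scan and heal while servers and clients are idle */
}

int main (void)
{
        on_fop (HEAL_ON_LOOKUP, "lookup", "/data/report.doc");
        return 0;
}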

thanks,
avati

2007/7/4, Gerry Reno <address@hidden>:

  I've been doing some testing of self-heal: basically taking down one
brick, copying some files to one of the client mounts, then bringing the
downed brick back up.  What I see is that when I bring the downed brick back
up, no activity occurs.  It is only when I start doing something in one of
the client mounts that anything happens to rebuild the out-of-sync brick.  My
concern is this: suppose I have four applications on different client nodes
(separate machines) using the same data set (mounted on GlusterFS), and the
brick on one of these nodes is out of sync.  It is not until some user tries
to use the application that the brick starts to resync.  This results in
sluggish performance for the user, as all the data has to be brought over the
network from other bricks while the local brick is out of sync.  Now there
may have been ten minutes of idle time before this user tried to access the
data, but GlusterFS did not make any use of that time to rebuild the
out-of-sync brick; it waited until a user tried to access the data.  To me,
it appears that GlusterFS should be making use of such opportunities, and
this would diminish the overall impact of the out-of-sync condition on users.

Regards,
Gerry







--
Anand V. Avati
_______________________________________________
Gluster-devel mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/gluster-devel





