


From: Gordan Bobic
Subject: Re: [Gluster-devel] Re: [List-hacking] [bug #25207] an rm of a file should not cause that file to be replicated with afr self-heal.
Date: Mon, 05 Jan 2009 17:55:42 +0000
User-agent: Thunderbird 2.0.0.18 (X11/20081120)

Maybe I'm missing something here, but if you take self-healing out of AFR, then surely that makes the system completely useless and no better than running rsync every 5 minutes. Since that can't be right, what am I missing?

Gordan

Anand Babu Periasamy wrote:
Christopher, the main issue with self-heal is its complexity. Handling self-healing logic in a non-blocking, asynchronous code path is difficult. Replicating a missing file sounds simple, but holding off a lookup call, initiating a new series of calls to heal the file, and then resuming normal operation is tricky. Many of the bugs we faced in 1.3 were related to self-heal. We have handled most of these cases over a period of time. Self-healing is decent now, but not good enough. We feel that it has only complicated the code base, and this part of the code base is hard to test and maintain.
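
To make the complexity concrete, here is a minimal, self-contained sketch of healing inside an asynchronous lookup path. This is not GlusterFS code; every name here (heal_step_cbk, replica_has_file, and so on) is invented for illustration. The point is the shape of the problem: the original operation must be parked in a continuation, each heal sub-call completes through a callback that has to know where it is in the sequence, and only at the end is the caller resumed. Note also that lookup runs before *every* operation, so it cannot tell whether the operation it is holding off is an open (heal is useful) or an rm (heal is wasted work), which is exactly bug #25207.

    #include <stdio.h>

    /* Continuation: the held-off original operation. */
    typedef void (*resume_fn)(const char *path);

    struct heal_ctx {
        const char *path;
        int step;          /* where we are in the heal sequence */
        resume_fn resume;  /* what to run once healing finishes */
    };

    static int replica_has_file[2] = { 1, 0 };  /* replica 1 lost the file */

    static void heal_step_cbk(struct heal_ctx *ctx);

    /* Pretend each of these is an async network call whose reply arrives
     * later as a callback; here we just invoke the callback directly. */
    static void read_good_copy(struct heal_ctx *ctx) { heal_step_cbk(ctx); }
    static void write_missing_copy(struct heal_ctx *ctx)
    {
        replica_has_file[1] = 1;   /* the missing copy now exists */
        heal_step_cbk(ctx);
    }
    static void fix_metadata(struct heal_ctx *ctx) { heal_step_cbk(ctx); }

    /* The state machine: every reply must know which sub-call just
     * finished and which one to issue next. */
    static void heal_step_cbk(struct heal_ctx *ctx)
    {
        switch (ctx->step++) {
        case 0:
            printf("heal: reading the good copy of %s\n", ctx->path);
            read_good_copy(ctx);
            break;
        case 1:
            printf("heal: writing the missing copy of %s\n", ctx->path);
            write_missing_copy(ctx);
            break;
        case 2:
            printf("heal: fixing metadata on %s\n", ctx->path);
            fix_metadata(ctx);
            break;
        default:
            printf("heal: done, resuming the held-off operation\n");
            ctx->resume(ctx->path);
        }
    }

    /* lookup runs before every file operation; it cannot tell whether
     * 'op' is an open or an rm. */
    static void lookup(const char *path, resume_fn op)
    {
        if (replica_has_file[0] && replica_has_file[1]) {
            op(path);   /* fast path: nothing to heal */
            return;
        }
        static struct heal_ctx ctx;  /* one in-flight heal, for brevity */
        ctx.path = path;
        ctx.step = 0;
        ctx.resume = op;
        printf("lookup: %s missing on a replica, holding off the caller\n", path);
        heal_step_cbk(&ctx);
    }

    static void do_rm(const char *path)
    {
        printf("rm: unlinking %s on all replicas\n", path);
        replica_has_file[0] = replica_has_file[1] = 0;
    }

    int main(void)
    {
        /* Heals the file in full, then immediately deletes both copies. */
        lookup("/mnt/gluster/somefile", do_rm);
        return 0;
    }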

The plan is to drop the self-heal code altogether once the active healing tool is ready. Unlike self-healing, this active healing can be run by the user on a mounted file system (online) at any time. By moving the code out of the file system and into a tool (that is synchronous and linear), we can implement sophisticated healing techniques.

The code is not in the repository yet. Hopefully it will be ready for use in a month. You can simply turn off self-heal and run this utility while the file system is mounted.
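
For contrast with the asynchronous sketch above, here is a toy of what a synchronous, linear healer could look like. To be clear, this is not the announced tool (which is not public yet): it is a made-up illustration that only copies regular files missing from one local directory to another, ignoring metadata, subdirectories, and content mismatches. The appeal is that there are no callbacks and no state machine; the whole pass is a plain loop you can read top to bottom.

    #include <dirent.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static void copy_file(const char *src, const char *dst)
    {
        char buf[4096];
        ssize_t n;
        int out, in = open(src, O_RDONLY);
        if (in < 0) { perror(src); return; }
        out = open(dst, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (out < 0) { perror(dst); close(in); return; }
        while ((n = read(in, buf, sizeof buf)) > 0)
            if (write(out, buf, n) != n) { perror(dst); break; }
        close(in);
        close(out);
    }

    /* One linear pass: anything present under 'good' but absent under
     * 'bad' gets copied over. Synchronous, no suspended operations. */
    static void heal_dir(const char *good, const char *bad)
    {
        DIR *d = opendir(good);
        struct dirent *e;
        if (!d) { perror(good); return; }
        while ((e = readdir(d)) != NULL) {
            char src[4096], dst[4096];
            struct stat st;
            if (e->d_name[0] == '.')
                continue;  /* skip . and .. (and hidden entries) */
            snprintf(src, sizeof src, "%s/%s", good, e->d_name);
            snprintf(dst, sizeof dst, "%s/%s", bad, e->d_name);
            if (stat(src, &st) != 0 || !S_ISREG(st.st_mode))
                continue;  /* regular files only, in this toy */
            if (stat(dst, &st) == 0)
                continue;  /* already present on the other side */
            printf("healing %s -> %s\n", src, dst);
            copy_file(src, dst);
        }
        closedir(d);
    }

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s BRICK_A BRICK_B\n", argv[0]);
            return 1;
        }
        heal_dir(argv[1], argv[2]);  /* run both directions so each side */
        heal_dir(argv[2], argv[1]);  /* receives whatever it is missing  */
        return 0;
    }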

List-hacking is an internal company list, mostly junk :). We don't discuss technical / architectural matters there; those discussions mostly happen over the phone and in in-person meetings. We do want to actively involve the community right from the design phase. A mailing list is cumbersome and slow for interactively brainstorming design discussions, but we can organize IRC sessions for this purpose once in a while.

--
Anand Babu

Swankiest wrote:
Well,

I guess this is getting outside the scope of the bug. I suppose you are going to mark it as "won't fix"?

I'm trying to put gluster into production right now, so may I ask:

1) What are the current issues with self-heal that require a full re-write? Is there a place in the Wiki or elsewhere where it's being documented?

2) May I see the new code? I must not be looking in the correct place in TLA?

3) If it's not written yet, may I be included in the design discussion? (As I haven't put gluster into production yet, now would be a good time to know if it's not going to work in the near future.)

4) May I be placed on the address@hidden mailing list, please?

 Christopher.

 > Date: Mon, 5 Jan 2009 01:36:14 -0800
 > From: address@hidden
 > To: address@hidden
 > CC: address@hidden; address@hidden
> Subject: Re: [List-hacking] [bug #25207] an rm of a file should not cause that file to be replicated with afr self-heal.
 >
> Krishna, leave it as is. Once self-heal ensures that the volumes are intact, rm will
> remove both copies anyway. It is inefficient, but optimizing it in the current framework
> will be hacky.
 >
> Swankier, we are replacing the current self-healing framework with an active healing tool.
 > We can take care of it then.
 >
 >
 > Krishna Srinivas wrote:
>> The current self-heal logic is built into the lookup of a file; a lookup is
>> issued just before any file operation on a file, so the lookup call
>> does not know whether an open or an rm is going to be done on the file.
>> Will get back to you if we can do anything about this, i.e. to avoid
>> making the redundant copy of the file when it is going to be rm'ed.
 >>
 >> Krishna
 >>
>> On Mon, Jan 5, 2009 at 12:19 PM, swankier <address@hidden> wrote:
 >>> Follow-up Comment #2, bug #25207 (project gluster):
 >>>
>>> What I am doing:
>>>
>>> 1) deleting the file from the posix file system beneath afr on one side
>>> 2) running rm on the gluster file system
>>>
>>> The file is then replicated and only then deleted.



