Re: [Gluster-devel] Choice of Translator question


From:	Kevan Benson
Subject:	Re: [Gluster-devel] Choice of Translator question
Date:	Thu, 27 Dec 2007 12:16:53 -0800
User-agent:	Thunderbird 2.0.0.9 (X11/20071031)

Gareth Bult wrote:

Agreed, which is why I just showed the single file self-heal
method, since in your case targeted self heal (maybe before a full
filesystem self heal) might be more useful.


Sorry, I was mixing moans .. on the one hand there's no log hence no
automatic detection of out of date files (which means you need a
manual scan), and secondly, doing a full self-heal on a large
file-system "can" be prohibitively "expensive" ...

I'm vaguely wondering if it would be possible to have a "log"
translator that wrote changes to a namespace volume for quick
recovery following a node restart. (as an option of course)

An interesting thought. Possibly something that keeps a filename and timestamp so other AFR members could connect and request changed file AFR versions since X timestamp.

Automatic self-heal is supposed to be on the way, so I suspect they are already doing (or planning) something like this.
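
To make the "filename and timestamp" idea concrete, here is a minimal sketch of such a log. It is only an illustration: the ChangeLog class, its method names, and the example paths are hypothetical and do not correspond to any existing GlusterFS translator or API.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Hypothetical per-node journal mapping path -> time of last write."""
    entries: dict = field(default_factory=dict)

    def record(self, path, when=None):
        # Keep only the most recent write time for each path.
        self.entries[path] = time.time() if when is None else when

    def changed_since(self, since):
        # Paths written after `since`, i.e. what a returning node must heal.
        return sorted(p for p, t in self.entries.items() if t > since)


log = ChangeLog()
log.record("/vm/disk0.img", when=100.0)
log.record("/vm/disk1.img", when=250.0)
log.record("/vm/disk0.img", when=300.0)  # rewritten later

# A node that went down at t=200 asks for everything touched after that.
print(log.changed_since(200.0))  # ['/vm/disk0.img', '/vm/disk1.img']
```

With a log like this, a restarted node would only need to ask for the paths returned by changed_since() rather than scanning the whole filesystem.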

I don't see how the AFR could even be aware the chunks belong to
the same file, so how it would know to replicate all the chunks of
a file is a bit of a mystery to me.  I will admit I haven't done
much with the stripe translator though, so my understanding of its
operation may be wrong.


Mmm, trouble is there's nothing definitive in the documentation
either way .. I'm wondering whether it's a known critical omission
which is why it's not been documented (!) At the moment stripe is
pretty useless without self-heal (i.e. AFR). AFR is pretty useless
without stripe for anyone with large files. (which I'm guessing is
why stripe was implemented after all the "stripe is bad"
documentation) If the two don't play well and a self-heal on a
large file means a 1TB network data transfer - this would strike me
as a show stopper.

I think the original docs said it was implemented because it was easy, but there wasn't a whole lot to be gained by using it. Since then, I've seen people post numbers that seemed to indicate it gave a somewhat sizable boost, but the extra complexity it introduced never made it attractive to me.

The possibility it could be used to greatly speed up self-heal on large files seems like a real good reason to use it though, so hopefully we can find a way to make it work.

Understood.  I'll have to actually try this when I have some time,
instead of just doing some armchair theorizing.


Sure .. I think my tests were "proper" .. although I might try them
on TLA just to make sure.

Just thinking logically for a second, for AFR to do chunk level
self-heal, there must be a chunk level signature store somewhere. ...
where would this be ?

Well, to AFR each chunk should just look like another file, it shouldn't care that it's part of a whole.

I assume the stripe translator uses another extended attribute to tell what file it's part of. Perhaps the AFR translator is stripe aware and that's causing the problem?

Was this on AFR over stripe or stripe over AFR?


Logic told me it must be AFR over stripe, but I tried it both ways
round ..

Let's get rid of the over/under terminology (which I always seem to think of in reverse from other people), and use a representation that's more absolute:


client -> XLATOR(stripe) -> XLATOR(AFR) -> diskVol(1..N)

Throw in your network connections wherever you want, but this should be testable on a single box with two different directories exported as volumes.

The client writes to the stripe translator, which splits up the large file, which is then sent to the AFR translator so each chunk is stored redundantly in each disk volume supplied.

If the AFR and stripe are reversed, it will have to pull all stripe chunks to do a self heal (unless AFR is stripe aware), which isn't what we are aiming for.
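
The stripe-then-AFR ordering can be modelled in a few lines to show why it should keep self-heal traffic small. This is a toy simulation under stated assumptions, not a description of how GlusterFS works internally: the chunk naming ("big.img.N"), the 4-byte chunk size, and the digest comparison are all made up for illustration, and the real translators may track chunks and versions quite differently (for example through extended attributes).

```python
import hashlib

CHUNK = 4  # bytes per stripe chunk; tiny so the example stays readable


def stripe(data, chunk=CHUNK):
    """Split a byte string into fixed-size chunks, as a stripe layer might."""
    return [data[i:i + chunk] for i in range(0, len(data), chunk)]


def digest(blob):
    return hashlib.sha1(blob).hexdigest()


def heal(good, stale):
    """Copy only chunks that are missing or differ; return bytes transferred."""
    sent = 0
    for name, blob in good.items():
        if name not in stale or digest(stale[name]) != digest(blob):
            stale[name] = blob
            sent += len(blob)
    return sent


original = b"AAAABBBBCCCCDDDD"

# Each chunk is an independent "file" as far as the replication layer knows.
replica_a = {f"big.img.{i}": c for i, c in enumerate(stripe(original))}
replica_b = dict(replica_a)

# replica_b goes offline; meanwhile one chunk changes on replica_a.
replica_a["big.img.2"] = b"XXXX"

sent = heal(replica_a, replica_b)
print(sent, len(original))  # 4 16
```

Here only the one changed chunk crosses the wire. If replication sat above striping and treated the file as a single unit, the same heal would move all 16 bytes, which is the 1TB-transfer case at real-world scale.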


Is that similar to what you tested?

--

-Kevan Benson
-A-1 Networks
