From: Csaba Henk
Subject: [Gluster-devel] The Performance Dataset Problem and the [RFC] Filetree scheme microformat
Date: Wed, 29 May 2013 05:36:44 +0000 (UTC)
User-agent: slrn/1.0.1 (Linux)

Hi All,

I wonder who is aware of the thing we call the Performance Dataset Problem.
Well, I guess everyone is aware of the problem itself, just not necessarily
familiar with this coinage :)

The behavior of Glusterfs and related utilities (eminently, geo-rep) may well
be affected by the content of the volume. E.g., having *many* small files is a
pattern that tends to defeat general optimization efforts and needs dedicated
care to handle well; likewise, a deep and branchy directory structure provokes
quite different behavioral patterns than a flat hierarchy.

So for proper testing of the software, we need a large variety of filetree
layouts and content. The problem is that these are hard to describe and hard to
produce, and to make it worse, the two come hand in hand. For what can one do
when one wants to be more specific than saying vague things like "small files",
"deep and branchy", "flat"? Well, mostly s/he will end up coding a script that
creates the hierarchy s/he has in mind. That's an engineering effort which
definitely arrives at a precise description, but a highly inefficient one -- the
resulting code will be full of ugly loops and recursive constructs and various
ad-hoc naming and numeric parameters. It is quite hard for the reader to distill
the idea from it (the best s/he can do is to run it and then run find(1) on the
created tree), and what we get is quite specific, non-reusable imperative code
for the creation.
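
To illustrate the kind of throwaway script meant here, below is a minimal
sketch in Python. It is not taken from any actual Glusterfs test suite; the
function names (make_tree, count_entries), the parameters and the d%02d / f%03d
naming scheme are all made up for the example.

```python
import os


def make_tree(root, depth, branching, files_per_dir, file_size):
    """Create a directory tree under root (hypothetical ad-hoc generator).

    Every directory gets files_per_dir files of file_size bytes each.
    Every directory above the bottom level also gets `branching`
    subdirectories; `depth` is the number of levels below root.
    """
    os.makedirs(root, exist_ok=True)
    # Ad-hoc naming and numeric parameters, as typical for such scripts.
    for i in range(files_per_dir):
        with open(os.path.join(root, "f%03d" % i), "wb") as f:
            f.write(b"x" * file_size)
    if depth == 0:
        return
    # The recursion is the only place where the shape of the tree is encoded.
    for j in range(branching):
        make_tree(os.path.join(root, "d%02d" % j),
                  depth - 1, branching, files_per_dir, file_size)


def count_entries(root):
    """Return (directories, files) found below root, not counting root itself."""
    dirs = files = 0
    for _, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files
```

Calling make_tree(path, 2, 3, 2, 16) for some scratch path gives 12
subdirectories below the root and 26 files of 16 bytes each. That is a fact the
reader has to work out from the recursion (or by inspecting the result), as
nothing in the code states the layout declaratively.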

But we could do better. The idea of a filetree layout could be communicated
clearly, had we a language that empowers us to speak about it. A filetree of a
particular layout could be created by a general utility, had we a way to
specify that layout.

A language to specify filetree layouts -- that's what I have made an attempt at.

We need to get it right before moving on to the tool that creates file trees
according to its terms. We have some conflicting requirements -- the language
should be

- human friendly
- machine friendly
- compact
- versatile

I made up my mind to find a good trade-off and came up with this:

https://gist.github.com/csabahenk/5668160

Here is a printable version of the current revision of the document:

https://dl.dropboxusercontent.com/u/27330206/filetree-scheme.pdf

Please comment.

Thanks
Csaba