Re: [Gluster-devel] lseek


From: Xavier Hernandez
Subject: Re: [Gluster-devel] lseek
Date: Mon, 14 May 2012 13:47:10 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:12.0) Gecko/20120428 Thunderbird/12.0.1

Hello Ian,

I hadn't thought about statfs. In this special case things are a bit harder for a compression translator. I think it's impossible to return accurate data without a considerable amount of work.

Maybe an estimate of the available space based on the mean compression ratio achieved so far would be sufficient, though it can never be accurate. With more work you could even say exactly how much space has been used, but the best you can do for the remaining space is an estimate.

Regarding lseek, it does not map to lookup. I probably didn't explain it as well as I wanted.

There are basically two kinds of user mode calls: those that take a string containing a filename (stat, unlink, open, creat, ...), and those that take a file descriptor (fstat, read, write, ...). The kernel does not use names to handle files, so it has to translate names to inodes. This means that any call that takes a string needs to make a "lookup" to get the associated inode (the only exception is creat, which creates a new inode without using lookup).

This means that every filename based operation can generate a lookup request (although some caching mechanism may reduce the number of calls). All operations that work with a file descriptor do not generate a lookup request, because the file descriptor is already bound to an inode.

In your particular case, to do an lseek you must have made a previous call to open (that would have generated a lookup request) or creat.

Hope this better explains how kernel and gluster are bound...

Xavi

On 05/14/2012 01:18 PM, Ian Latter wrote:
Hello Xavier,


    I don't have a problem with the principles, these
were effectively how I was traveling (the notable
difference is statfs which I want to pass-through
unaffected, reporting the true file system capacity
such that a du [stat] may sum to a greater value
than a df [statfs]).  In 2009 I had a mostly-
functional hashing write function and a dubious
read function (I stumbled when I had to open a
file from within a fop).

   But I think what you're telling/showing me is that
I have no deep understanding of the mapping of
the system calls to their Fuse->Gluster fops -
which is expected :)  And, this is a better outcome
than learning that Gluster has gaps in its
framework with regard to my objective.  I.e. I
didn't know that lseek mapped to lookup.  And
the examples aren't comprehensive enough
(rot-13 is the only one that really manipulates
content, and it only plays with read and write,
obviously because it has a 1:1 relationship with
the data).

This is the key, and not something that I was
expecting;

In gluster there are a lot of fops that return a iatt
structure. You must guarantee that all these
functions return the correct size of the file in
the field ia_size to be sure that everything works
as expected.
I'll do my best to build a comprehensive list of iatt
returning fops from the examples ... but I'd say it'll
take a solid peer review to get this hammered out
properly.

Thanks for steering me straight Xavi, appreciate
it.



----- Original Message -----
From: "Xavier Hernandez"<address@hidden>
To: "Ian Latter"<address@hidden>
Subject:  Re: [Gluster-devel] lseek
Date: Mon, 14 May 2012 12:29:54 +0200

Hello Ian,

lseek calls are handled internally by the kernel and they never reach
user land as fuse calls. lseek only updates the current file offset
that is stored inside the kernel's file structure. This value is what is
passed to read/write fuse calls as an absolute offset.

There isn't any problem in this behavior as long as you hide all size
manipulations from fuse. If you write a translator that compresses a
file, you should do so in a transparent manner. This means, basically, that:

1. Whenever you are asked to return the file size, you must return the
   size of the uncompressed file.
2. Whenever you receive an offset, you must translate that offset to the
   corresponding offset in the compressed file and work with that.
3. Whenever you are asked to read or write data, you must return the
   number of uncompressed bytes read or written (even if you have
   compressed the chunk of data to a smaller size and you have physically
   written fewer bytes).
4. All read requests must return uncompressed data (this seems obvious,
   though).

This guarantees that your manipulations are not seen in any way by any
upper translator or even fuse, thus everything should work smoothly.
If you respect these rules, lseek (and your translator) will work as
expected.

In particular, when a user calls lseek with SEEK_END, the kernel takes
the size of the file from the internal kernel inode's structure. This
size is obtained through a previous call to lookup or updated using the
result of write operations. If you respect points 1 and 3, this value
will be correct.

In gluster there are a lot of fops that return a iatt structure. You
must guarantee that all these functions return the correct size of the
file in the field ia_size to be sure that everything works as expected.
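
The bookkeeping behind points 1 and 3 could look roughly like this. The types demo_file and demo_attr are stand-ins invented for the example (demo_attr only mimics the size field of an attribute structure); they are not the real gluster definitions, and a real translator would also have to handle truncate.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the size field of an attribute structure (hypothetical). */
struct demo_attr {
    uint64_t size;
};

/* Hypothetical per-file state kept by a compression translator. */
struct demo_file {
    uint64_t logical_size;  /* size of the uncompressed data */
};

/* Record a write of `len` uncompressed bytes at absolute offset `off`.
 * Returns the count to report upward: the uncompressed byte count,
 * regardless of how few bytes were physically stored.
 * Overflow of off + len is ignored in this sketch. */
static uint64_t note_write(struct demo_file *f, uint64_t off, uint64_t len)
{
    uint64_t end = off + len;
    if (end > f->logical_size)
        f->logical_size = end;
    return len;
}

/* Overwrite the size coming from the lower layer (the compressed size)
 * with the uncompressed size before passing attributes upward. */
static void fix_attr(const struct demo_file *f, struct demo_attr *a)
{
    a->size = f->logical_size;
}

int main(void)
{
    struct demo_file f = { 100 };  /* 100 uncompressed bytes */
    struct demo_attr a = { 40 };   /* lower layer sees 40 compressed bytes */

    fix_attr(&f, &a);
    assert(a.size == 100);

    /* Writing 50 bytes at offset 80 extends the file to 130 bytes. */
    uint64_t reported = note_write(&f, 80, 50);
    assert(reported == 50);
    fix_attr(&f, &a);
    printf("reported write=%llu, size=%llu\n",
           (unsigned long long)reported, (unsigned long long)a.size);
    return 0;
}
```
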
Xavi

On 05/14/2012 11:51 AM, Ian Latter wrote:
Hello Xavi,


    Ok - thanks.  I was hoping that this was how read
and write were working (i.e. with absolute offsets
and not just getting relative offsets from the current
seek point), however what of the raw seek
command?

       len = lseek(fd, 0, SEEK_END);

       Upon  successful completion, lseek() returns
       the resulting offset location as measured in
       bytes from the beginning of the  file.

    Any idea on where the return value comes from?
I will need to fake up a file size for this command ..



----- Original Message -----
From: "Xavier Hernandez"<address@hidden>
To:<address@hidden>
Subject:  Re: [Gluster-devel] lseek
Date: Mon, 14 May 2012 09:48:17 +0200

Hello Ian,

There is no such thing as an explicit seek in glusterfs. Each of readv,
writev, (f)truncate and rchecksum has an offset parameter that tells
you the position where the operation must be performed.

If you make something that changes the size of the file you must make it
in a way that is transparent to upper translators. This means that
all offsets you will receive are "real" (in your case, offsets in the
uncompressed version of the file). You should calculate in some way the
equivalent offset in the compressed version of the file and send it to
the corresponding fop of the lower translators.

In the same way, you must return in all iatt structures the real size of
the file (not the compressed size).

I'm not sure what the intended use of NONSEEKABLE is, but I think it is
for special file types, like devices or similar, that are sequential in
nature. Anyway, this is a fuse flag that you can't return from a regular
translator open fop.

Xavi

On 05/14/2012 03:22 AM, Ian Latter wrote:
Hello,


     I'm looking for a seek (lseek) implementation in
one of the modules and I can't see one.

     Do I need to care about seeking if my module
changes the file size (i.e. compresses) in Gluster?
I would have thought that I did except that I believe
that what I'm reading is that Gluster returns a
NONSEEKABLE flag on file open (fuse_kernel.h at
line 149).  Does this mitigate the need to correct
the user seeks?


Cheers,



--
Ian Latter
Late night coder ..
http://midnightcode.org/

_______________________________________________
Gluster-devel mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/gluster-devel