Re: [Gluster-devel] lookup caching

From: Raghavendra G
Subject: Re: [Gluster-devel] lookup caching
Date: Thu, 8 Apr 2010 15:15:50 +0400

On Thu, Apr 8, 2010 at 2:19 PM, Olivier Le Cam <address@hidden> wrote:
Hi -

Raghavendra G wrote:

   Well, as far as I understand the problem is rather that folders on
   the path of the targeted file are not cached at all by io-cache.

Did you mean caching dentries when you said "caching directories"? io-cache caches only file data. Also note that it is not a translator that does metadata (stat) caching.

That's it. Metadata caching is exactly what I would like to implement, but so far I haven't been able to find which part of the code deals with metadata.

Metadata caching is done in stat-prefetch, but that is not suitable for your use case. In stat-prefetch, the stats of dentries are kept only for the period from readdir until the directory fd is closed.
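
To make that lifetime concrete, here is a minimal sketch of a cache whose entries live only between readdir and close of a directory fd. It is a toy model, not the stat-prefetch code: the class DirStatCache and its methods are hypothetical names chosen for illustration.

```python
class DirStatCache:
    """Toy model: stats are cached per directory fd, from readdir until close."""

    def __init__(self):
        # directory fd -> {entry name: stat}
        self.by_fd = {}

    def readdir(self, fd, entries):
        # entries: iterable of (name, stat) pairs, as a readdir reply might carry
        self.by_fd.setdefault(fd, {}).update(entries)

    def lookup(self, fd, name):
        # Returns the cached stat, or None when the caller must ask the server
        return self.by_fd.get(fd, {}).get(name)

    def close(self, fd):
        # Closing the directory fd discards everything cached under it
        self.by_fd.pop(fd, None)
```

Once the directory fd is closed nothing remains cached, which is why such a scheme does not help a web server that opens files by path without listing their directories first.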

   This is why accessing a file generates quite a lot of network
   traffic and can introduce bottlenecks: each folder is stat'ed on
   the servers before the file is delivered, even if that file is
   already cached by io-cache.

io-cache sends a stat only for the file being read, not for all of its parent directories. I don't see why you are concerned about too many stat calls. During a read of a cached file, io-cache sends only a single stat call, and only if "cache-timeout" seconds have elapsed since the last validation.
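
The revalidation rule described above can be sketched as follows. This is a simplified model under stated assumptions, not io-cache's implementation: ValidatedCache, stat_fn and read_fn are hypothetical names, and the file's mtime stands in for whatever the translator actually compares.

```python
import time


class ValidatedCache:
    """Toy model: cached data is revalidated with one stat after a timeout."""

    def __init__(self, cache_timeout, stat_fn, read_fn, clock=time.monotonic):
        self.cache_timeout = cache_timeout  # seconds between validations
        self.stat_fn = stat_fn              # hypothetical: returns mtime from the server
        self.read_fn = read_fn              # hypothetical: fetches data from the server
        self.clock = clock
        self.entries = {}                   # path -> (data, mtime, validated_at)

    def read(self, path):
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None and now - entry[2] < self.cache_timeout:
            return entry[0]                 # still fresh: no network call at all
        # Cache miss or timeout elapsed: a single stat call
        mtime = self.stat_fn(path)
        if entry is not None and entry[1] == mtime:
            data = entry[0]                 # unchanged on the server: keep cached data
        else:
            data = self.read_fn(path)       # new or modified: fetch the data again
        self.entries[path] = (data, mtime, now)
        return data
```

In this model, a larger cache_timeout directly reduces how often the stat goes over the network for a cached file.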

Well, I now understand that I no longer need to focus my attention on io-cache at all.

To reach acceptable performance for my webfarm (at least the same as I get with an NFS server), I really have to find a way to limit stat calls over the network as much as possible. Stat calls introduce too much latency before the glusterfs client is able to deliver a file (even with io-cache enabled). This makes performance absolutely terrible!

Note: I'm not concerned at all about files being modified by other clients. The front-end reverse proxy always forwards HTTP connections to the same webserver (according to the domain name), except in case of failover. Having metadata caching would then __greatly__ improve overall performance in this situation!

Did you try with a very high cache-timeout value (say 5 minutes or more)? Also a bigger cache-size might help.

Hope you can point me in the right direction, please!

Thanks and best regards,

Raghavendra G

