gluster-devel

Re: [Gluster-devel] Posix locks - Patch works


From: Gordan Bobic
Subject: Re: [Gluster-devel] Posix locks - Patch works
Date: Fri, 05 Feb 2010 13:48:27 +0000
User-agent: Thunderbird 2.0.0.22 (X11/20090625)

Will this be included in the 3.0.2 release?

Gordan

Samuel Hassine wrote:
Hi all,

I just patched the GlusterFS Source code with
http://patches.gluster.com/patch/2716/ .

For PHP sessions, the problem is solved: it works with the locks
translator.
For the internal server errors under heavy traffic, same thing: no more
problems.

Thanks a lot. We will send you more feedback on the same
infrastructure (heavy traffic), but with a distributed-replicated
GlusterFS.

Nice job developers :)

Regards.


On Friday 05 February 2010 at 06:21 -0600, Tejas N. Bhise wrote:
Thanks, Samuel.

Also, as mentioned earlier, please provide us with details of the Linux kernel version and FUSE kernel module version on both the servers and the clients used,
along with the output of 'option trace on' in the locks translator.

Regards,
Tejas.

----- Original Message -----
From: "Samuel Hassine" <address@hidden>
To: "Pavan Vilas Sondur" <address@hidden>
Cc: address@hidden, "Yann Autissier" <address@hidden>, "Gluster List" 
<address@hidden>
Sent: Friday, February 5, 2010 5:43:46 PM GMT +05:30 Chennai, Kolkata, Mumbai, 
New Delhi
Subject: [Gluster-devel] Re: Feedback - Problem with the locks feature

Hi all,

Just before I test this patch, I have another bug to report,
with/without the locks translator. As I said in my first email, I just
changed from NFS to GlusterFS for my websites' storage partition (about
15,000 websites).

I thought that only PHP sessions didn't "like" the POSIX locks, but that
is not the case. The other simple distributed partition, for website
files, is affected too:

With the POSIX locks, I get 30% web server internal errors (500,
"premature end of script headers"), but without locks (I just changed the
configuration), there are no 500s and no "end of script headers" errors.
So I think there is a link. (We have heavy traffic; that may be another reason.)

I'm applying the patch right now and will give you feedback as soon
as possible.

Regards.

On Friday 05 February 2010 at 15:13 +0530, Pavan Vilas Sondur wrote:
Hi Samuel,
Looking at log messages such as these:
[2010-02-04 21:11:22] W [posix.c:246:posix_lstat_with_gen] posix1:
Access to /data//.. (on dev 2049) is crossing device (64768)
[2010-02-04 21:11:24] W [posix.c:246:posix_lstat_with_gen] posix1:
Access to /data//.. (on dev 2049) is crossing device (64768)
It seems you are also running into bug 571
(http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=576). Can
you apply this patch: http://patches.gluster.com/patch/2716 and let us know how
it goes? Also, can you provide us with details of the Linux kernel version and
FUSE kernel module version on both the servers and the clients used,
along with the output of 'option trace on' in the locks translator.
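
One possible way to gather the requested version details on each server and client is sketched below. It is only an illustration: the helper name `collect_versions` is hypothetical, and the sysfs path `/sys/module/fuse/version` is an assumption that may not exist on a given kernel, in which case the script reports `None` for it.

```python
import platform
from pathlib import Path


def collect_versions():
    """Gather the kernel release and whatever FUSE module info sysfs exposes."""
    info = {"kernel_release": platform.release()}

    # Assumption: a loaded (or built-in) fuse module shows up under /sys/module.
    fuse_dir = Path("/sys/module/fuse")
    info["fuse_module_present"] = fuse_dir.is_dir()

    # Assumption: not every kernel exposes a version file for the fuse module.
    version_file = fuse_dir / "version"
    info["fuse_module_version"] = (
        version_file.read_text().strip() if version_file.is_file() else None
    )
    return info


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

Running it on every machine involved and pasting the output into the reply keeps the server and client reports in the same shape.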

Pavan

On 04/02/10 21:42 -0600, Anand Avati wrote:
----- "Samuel Hassine" <address@hidden> wrote:

Hi all,

For the PHP script with small write/read accesses, I will try to find
it (I don't remember the exact syntax), but for PHP sessions, the bug
can be easily reproduced.

I just tested it on a new, very simple GlusterFS partition with no traffic
(just me), and I reproduced it immediately.

Explanations:
- 2 servers Debian Lenny stable
- GlusterFS 3.0.0 in distributed mode (one server and multiple
clients)
- Lighttpd / PHP5 Fast-CGI

I just mount the GlusterFS partition on the /var/www directory.

First of all, the PHP script you can execute:

<?php
session_save_path('.');
//if you want to verify if it worked
//echo session_save_path();
session_start();
?>

Secondly, there are 2 GlusterFS configurations and, of course, one
works and one does not.
The client configuration is the same in both cases:

glusterfs.vol
volume test-1
type protocol/client
option transport-type tcp
option remote-host test
option transport.socket.nodelay on
option transport.remote-port 6996
option remote-subvolume brick1
end-volume

volume writebehind
type performance/write-behind
option cache-size 4MB
subvolumes test-1
end-volume

volume readahead
type performance/read-ahead
option page-count 4
subvolumes writebehind
end-volume

volume iocache
type performance/io-cache
option cache-size 1GB
option cache-timeout 1
subvolumes readahead
end-volume

volume quickread
type performance/quick-read
option cache-timeout 1
option max-file-size 64kB
subvolumes iocache
end-volume

volume statprefetch
type performance/stat-prefetch
subvolumes quickread
end-volume

Now the server configuration:

glusterfsd.vol (this doesn't work)
volume posix1
type storage/posix
option directory /data
end-volume

volume locks1
type features/locks
subvolumes posix1
end-volume

volume brick1
type performance/io-threads
option thread-count 8
subvolumes locks1
end-volume

volume server-tcp
type protocol/server
option transport-type tcp
option auth.addr.brick1.allow *
option transport.socket.listen-port 6996
option transport.socket.nodelay on
subvolumes brick1
end-volume

glusterfsd.vol (this works)
volume posix1
type storage/posix
option directory /data
end-volume

#volume locks1
# type features/locks
# subvolumes posix1
#end-volume

volume brick1
type performance/io-threads
option thread-count 8
subvolumes posix1
end-volume

volume server-tcp
type protocol/server
option transport-type tcp
option auth.addr.brick1.allow *
option transport.socket.listen-port 6996
option transport.socket.nodelay on
subvolumes brick1
end-volume

So, with the locks translator, you can execute the script once (it
will be OK), but the second time the session file is on the file system
but locked, and nobody can access it. PHP freezes and the processes
cannot be killed.
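
To check from outside PHP whether a session file is still locked, a small probe along these lines could be used. This is a minimal sketch, assuming the hang comes from a POSIX (fcntl) advisory lock on the session file that is never released; `probe_lock` is a hypothetical helper name, not something from GlusterFS or PHP.

```python
import fcntl
import os


def probe_lock(path):
    """Return True if an exclusive POSIX lock on `path` can be taken right now."""
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # LOCK_NB asks the call to fail instead of blocking when another
            # process already holds a conflicting lock.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        # The lock was free: release it again so the probe leaves no trace.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
```

Calling `probe_lock()` on the session file under /var/www after the first run of the script should return False if a stale lock is still held. If the mount itself is hung, the probe may block as well, so it is best run from a shell that can be abandoned.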

When this happens, I have nothing in the client-side logs, but I have 2
kinds of messages in the server-side logs:
When I execute the script:
[2010-02-04 21:11:22] W [posix.c:246:posix_lstat_with_gen] posix1:
Access to /data//.. (on dev 2049) is crossing device (64768)
[2010-02-04 21:11:24] W [posix.c:246:posix_lstat_with_gen] posix1:
Access to /data//.. (on dev 2049) is crossing device (64768)

When I try to umount -f (disconnect the GlusterFS mount):
[2010-02-04 21:13:45] E [server-protocol.c:339:protocol_server_reply]
protocol/server: frame 20: failed to submit. op= 26, type= 4

As I said I will try to find the other PHP script.

I hope this will help you.

I tried to reproduce the problem with your exact configuration (only changing 
'option remote-host') from 1 server and 2 clients. I was not able to hit the 
problem with the configuration which is breaking for you. I used v3.0.0 as well.

Can you please turn 'option trace on' in the locks translator and give us the 
server log when the php session hangs?
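
For reference, a sketch of what that change might look like in the failing glusterfsd.vol quoted above, assuming the option is simply added to the existing locks1 volume and the server is restarted afterwards:

```
volume locks1
type features/locks
option trace on
subvolumes posix1
end-volume
```

The rest of the server volfile would stay as posted.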

Thanks,
Avati


_______________________________________________
Gluster-devel mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/gluster-devel


