From: Michael Brown
Subject: [Gluster-devel] Parallel readdir from NFS clients causes incorrect data
Date: Wed, 03 Apr 2013 17:37:39 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130308 Thunderbird/17.0.4
I'm seeing a problem on my fairly fresh RHEL gluster install. Smells to me like a parallelism problem on the server. If I mount a gluster volume via NFS (using glusterd's internal NFS server, not nfs-kernel-server) and read a directory from multiple clients *in parallel*, I get inconsistent results across the clients. Some files are missing from the directory listing, and some may be present twice! Exactly which files (or directories!) are missing/duplicated varies each time, but I can very consistently reproduce the behaviour. You can see a screenshot here: http://imgur.com/JU8AFrt

The reproduction steps are (a scripted equivalent is at the end of this mail):
* clusterssh to each NFS client
* unmount /gv0 (to clear the cache)
* mount /gv0 [1]
* ls -al /gv0/common/apache-jmeter-2.9/bin (which is where I first noticed this)

Here's the rub: if, instead of doing the 'ls' in parallel, I do it in series, it works just fine (consistent, correct results everywhere). But hitting the gluster server from multiple clients at the same time causes problems.

I can still stat() and open() the files missing from the directory listing; they just don't show up in an enumeration.

Mounting gv0 as a gluster client filesystem works just fine.

Details of my setup:
2 × gluster servers: 2×E5-2670, 128GB RAM, RHEL 6.4 64-bit, glusterfs-server-3.3.1-1.el6.x86_64 (from EPEL)
4 × NFS clients: 2×E5-2660, 128GB RAM, RHEL 5.7 64-bit, glusterfs-3.3.1-11.el5 (from kkeithley's repo, only used for testing)
gv0 volume information is below
bricks are 400GB SSDs with ext4 [2]
common network is 10GbE; replication between servers happens over a direct 10GbE link

I will be testing on xfs/btrfs/zfs eventually, but for now I'm on ext4.

Also attached is my chatlog from asking about this in #gluster.

[1]: fstab line is: fearless1:/gv0 /gv0 nfs defaults,sync,tcp,wsize=8192,rsize=8192 0 0
[2]: yes, I've turned off dir_index to avoid That Bug. I've run the d_off test; results are here: http://pastebin.com/zQt5gZnZ

----
gluster> volume info gv0

Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 20117b48-7f88-4f16-9490-a0349afacf71
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: fearless1:/export/bricks/500117310007a6d8/glusterdata
Brick2: fearless2:/export/bricks/500117310007a674/glusterdata
Brick3: fearless1:/export/bricks/500117310007a714/glusterdata
Brick4: fearless2:/export/bricks/500117310007a684/glusterdata
Brick5: fearless1:/export/bricks/500117310007a7dc/glusterdata
Brick6: fearless2:/export/bricks/500117310007a694/glusterdata
Brick7: fearless1:/export/bricks/500117310007a7e4/glusterdata
Brick8: fearless2:/export/bricks/500117310007a720/glusterdata
Brick9: fearless1:/export/bricks/500117310007a7ec/glusterdata
Brick10: fearless2:/export/bricks/500117310007a74c/glusterdata
Brick11: fearless1:/export/bricks/500117310007a838/glusterdata
Brick12: fearless2:/export/bricks/500117310007a814/glusterdata
Brick13: fearless1:/export/bricks/500117310007a850/glusterdata
Brick14: fearless2:/export/bricks/500117310007a84c/glusterdata
Brick15: fearless1:/export/bricks/500117310007a858/glusterdata
Brick16: fearless2:/export/bricks/500117310007a8f8/glusterdata
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.disable: off
----

--
Michael Brown              | `One of the main causes of the fall of
Systems Consultant         | the Roman Empire was that, lacking zero,
Net Direct Inc.            | they had no way to indicate successful
☎: +1 519 883 1172 x5106   | termination of their C programs.' - Firth
Attachment: chatlog.txt (text document)