
Re: [Gluster-devel] glfs vs. unfsd performance figures


From: Shehjar Tikoo
Subject: Re: [Gluster-devel] glfs vs. unfsd performance figures
Date: Sat, 09 Jan 2010 16:16:46 +0530
User-agent: Mozilla-Thunderbird 2.0.0.19 (X11/20090103)

Gordan Bobic wrote:
Shehjar Tikoo wrote:

The answer to your question is, yes, it will be possible to export your local file system with knfsd and glusterfs distributed-replicated volumes with Gluster NFS translator BUT not in the first release.

See comment above. Isn't that all the more reason to double check performance figures before even bothering?

In fact, I may have just convinced myself to acquire some iozone performance figures. Will report later.

OK, I couldn't get iozone to report sane results. glfs was reporting figures in the reasonable ballpark (between 7MB/s and 110MB/s, which is what I'd expect on gigabit ethernet). NFS was reporting figures that looked more like memory bandwidth, so I'd guess FS-Cache was taking over. With O_DIRECT and O_SYNC the figures were in the 700KB/s range for NFS, which is clearly not sane either, because in actual use the two seem fairly equivalent.

So - I did a redneck test instead - dd 64MB of /dev/zero to a file on the mounted partition.

On writes, NFS gets 4.4MB/s and GlusterFS (server-side AFR) gets 4.6MB/s. Pretty even. On reads GlusterFS gets 117MB/s and NFS gets 119MB/s (on the first read after flushing the caches; after that it goes up to 600MB/s). The difference in the unbuffered readings seems to be in the sane ballpark, and the difference on the reads is roughly what I'd expect considering NFS is running over UDP and GLFS over TCP.
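
For the record, the test amounted to something like the following (a sketch of the commands rather than an exact transcript; the mount point is illustrative and the cache-flush step assumes a Linux client):

# write: 64MB of zeros to a file on the mounted volume
dd if=/dev/zero of=/mnt/test/ddtest bs=1M count=64

# flush client caches before the read test
sync
echo 3 > /proc/sys/vm/drop_caches

# read: the first pass after the flush is uncached, later passes hit the cache
dd if=/mnt/test/ddtest of=/dev/null bs=1M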

So in conclusion - there is no performance difference between them worth speaking of. So what is the point in implementing a user-space NFS handler in glusterfsd when unfsd seems to do the job as well as glusterfsd could reasonably hope to?

A single dd, which is basically sequential IO, is something even
an undergrad OS 101 project can optimize for. We, on the other hand,
are aiming higher. We'll be providing much better meta-data
performance, something unfsd sucks at (not without reason; I
appreciate the measures it takes to ensure correctness) due to
the large number of system calls it performs, much better support for
concurrency in order to exploit the proliferating multi-cores, and much
better parallelism for multiple NFS clients when all of them are
hammering away at the server, again something unfsd does not do.

Since you (quite rightly) say that a single sequential I/O isn't a particularly valid real-world test case, I now have some performance figures, and they are showing a similar equivalence between glfs and unfsd client connections (see tests 8,9 below).

The testing was done using the following method:
make clean;
# prime the caches for the benefit of the doubt
find . -type f -exec cat '{}' > /dev/null \;;
sync;
# The machines involved are quad core
time make -j8 all

1) pure ext3        6:40    CPU bound
2) ext3            15:15    rootfs (glfs, no cache) I/O bound
3) ext3+knfsd        7:02    mostly network bound
4) ext3+unfsd        16:04
5) glfs            61:54    rootfs (glfs, no cache) I/O bound
6) glfs+cache        32:32    rootfs (glfs, no cache) I/O bound
7) glfs+unfsd        278:30
8) glfs+cache+unfsd    189:15
9) glfs+cache+glfs    186:43

Notes:
- Time is in minutes:seconds
- GlusterFS 2.0.9 was used in all cases, on RHEL 5.4, 64-bit
- The times are for building the RHEL 5.4 kernel
- noatime is used on all mounts
- "cache" means that caching was applied on the server, in the form of writebehind and io-cache translators directly on top of the assembled AFR bricks (see the sketch after these notes).
- All tests except 2, 5 and 6 were done on a Quad Core2 3.2GHz with 2GB of RAM.
- Tests 2, 5 and 6 were done on a Phenom X4 2.8GHz with 4GB of RAM. In this instance the figures are reasonably comparable.
- In tests 2, 5 and 6 rootfs (which is where gcc and other binaries are) was on glfs, which caused further slow-down.
- In all cases except 1 (where all the files were local), the server was the same Phenom X4 machine with 4GB of RAM. It was paired in AFR to an Atom 330 machine in all cases where glfs was used.
- Gigabit network was used in all cases.
- The client was always connecting to a single, server-assembled AFR volume (so the server was proxying write requests to the slaved Atom 330 machine).
- The glfs rootfs runs without any performance translators in all cases, and with --disable-direct-io=off.
- The volume containing /usr/src, where the source code being compiled resides, was always mounted without the direct-io mount parameter mentioned above.
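
For reference, the cached server-side stack looked roughly like the following. This is a sketch only: volume names, the export path and the hostname are illustrative rather than copied from the actual volfiles.

# Server-side AFR with writebehind and io-cache layered directly on top (sketch)
volume posix
  type storage/posix
  option directory /data/export          # illustrative export path on ext3
end-volume

volume atom330
  type protocol/client                   # the slaved Atom 330 replica
  option transport-type tcp
  option remote-host atom330             # illustrative hostname
  option remote-subvolume brick
end-volume

volume afr
  type cluster/replicate
  subvolumes posix atom330
end-volume

volume wb
  type performance/write-behind
  subvolumes afr
end-volume

volume iocache
  type performance/io-cache
  subvolumes wb
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.iocache.allow *
  subvolumes iocache
end-volume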

Even if we ignore tests 2, 5 and 6, the results are quite concerning:
1) pure ext3        6:40    CPU bound
3) ext3+knfsd        7:02    mostly network bound
4) ext3+unfsd        16:04
7) glfs+unfsd        278:30
8) glfs+cache+unfsd    189:15
9) glfs+cache+glfs    186:43

Results 1, 3 and 4 above are pretty much just the baseline for how long the operation takes without any glfs involvement.

The main point here is between results 7, 8 and 9:
7) glfs+unfsd        278:30
8) glfs+cache+unfsd    189:15
9) glfs+cache+glfs    186:43

This comes back to the point I was making earlier about glfs vs. unfsd performance. The difference appears to be negligible, so I'd dare say that rolling an NFS server into glusterfs will do absolutely nothing for performance.

..and if, based on that sweeping generalization, which is itself based
on this one particular test at one particular deployment, your point
is that there is no need for the NFS translator, then I cannot help
but notice that you're completely ignoring all the other points I've
made in the previous emails.

Then, let's ignore the config for a second (I'll come to it later):
you're extrapolating about NFS translator performance purely on the
basis of this comparison between unfsd and glusterfs. I find this
judgment on the NFS translator, without having tested it, unacceptable.

But, since you've put this much appreciated effort into running
the tests, a few more points deserve mention.

So in bullet points:
- unfsd runs at a bit under half the speed of knfsd.
- glfs without writebehind + io-cache translators runs approximately 10x slower than ext3 (when backed by ext3, as in this test, at least).
- writebehind + io-cache approximately doubles the performance. This is evident both from tests 5,6 and from tests 7,8.
- With glfs being used for the replicated volume to be exported to clients, the performance is approximately 30x lower than the nearest comparable case, which is ext3+unfsd.
- There is no performance difference between unfsd and glfs for the exported volumes.

ext3+unfsd is not the nearest comparable case, or even the baseline.
Where is the logic in comparing a disk filesystem to a distributed one
in terms of performance? The ext3+knfsd and ext3+unfsd cases are
relevant only for calibrating the test and showing the upper bound on
performance, not as data points against which you can make comparisons
and say things like "approximately 30x lower".

The comparable case to ext3+knfsd and ext3+unfsd is glusterfsd
exporting a directory on ext3 using just storage/posix and
protocol/client, and even that only when you'd like a cursory glance
at the performance of the various alternatives. If this were actually
your use case, I'd have suggested right away that you go with
ext3+knfsd.
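
To be concrete, that comparable setup amounts to nothing more than
this (a sketch; paths and hostnames are illustrative):

# Server: glusterfsd exporting a plain ext3 directory (sketch)
volume posix
  type storage/posix
  option directory /data/export          # directory on ext3
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.posix.allow *
  subvolumes posix
end-volume

# Client: just protocol/client, no clustering or performance translators
volume client
  type protocol/client
  option transport-type tcp
  option remote-host server1             # illustrative hostname
  option remote-subvolume posix
end-volume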

For glusterfs, comparables would've been other parallel and
distributed file systems, and further only those which use files and
directories as units of operation and not blocks.

For this particular test, the nearest comparison to case 7 would be
knfsd re-exporting a glusterfs FUSE mount that replicates exactly the
way you've set it up.

Then there is the very peculiar replication config. If I understand
it correctly, this is what it looks like for 8 and 9:

Clnt(Gl or NFS)<-Lnk(a)->(unfsd+gl client or gl srv)<-Lnk(b)->(gl srv)

The primary benefit of GlusterFS is that the replication and
distribution logic sits at the client, allowing the GlusterFS client
to talk directly to the servers/replicas.

With the need to support NFS (keeping in mind that when I say NFS, I
mean NFSv3 and, to an extent, v4), we need an extra layer that performs
the transformations between NFS ops/semantics and GlusterFS ops.
Doing so generally requires a new server instance that on one side of
the network (Lnk(a)) talks NFS, like (unfsd+gl client) above, and on
the other side talks GlusterFS, in order to reach the GlusterFS
backends. The point is that this extra network link, i.e. Lnk(b),
is unavoidable in an NFS+GlusterFS deployment.

However, the second link is completely avoidable for the case where
you need two replicas with GlusterFS clients and GlusterFS servers,
because, again, GlusterFS clients can talk to each of the replicas
directly without an intermediate server.
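
In other words, the usual replicate config puts cluster/replicate on
the client, talking to both servers directly, roughly like this (a
sketch; hostnames are illustrative):

# Client-side replication: the client itself talks to both replicas (sketch)
volume replica1
  type protocol/client
  option transport-type tcp
  option remote-host server1             # illustrative hostname
  option remote-subvolume brick
end-volume

volume replica2
  type protocol/client
  option transport-type tcp
  option remote-host server2             # illustrative hostname
  option remote-subvolume brick
end-volume

volume afr
  type cluster/replicate
  subvolumes replica1 replica2
end-volume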

In effect, what this particular config does is force glusterfs
performance down to that of an NFS+GlusterFS deployment,
by forcing GlusterFS to talk to at least one replica through an extra
network link. In the traditional and most commonly used configurations
of replicate, gluster clients talk directly to the glusterfs backends.
It is obvious that on those, GlusterFS replicated performance would be
better than NFS+GlusterFS replicated performance. The question, then,
is whether this is even a relevant comparison, let alone a fair one.
It is clearly neither.

That is why I believe this particular comparison of glusterfs
and unfsd is not useful for drawing the kind of conclusion you're
attempting regarding NFS in general and the NFS translator
in particular.

Now, having looked at and hacked on unfsd source, please trust me
when I say that the NFS translator will perform better than
unfsd when used with glusterfs. The inflection point of unfsd
and NFS xlator perf might not be in the first release, but rest
assured that incremental changes will start showing differences.

-Shehjar

It might be interesting to see whether running additional caching translators on the client itself would positively affect performance, but considering how long these tests have taken, I'm feeling less than motivated to run more at the moment.
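
If I do get around to it, something like the following on the client
side of the volfile is what I have in mind (a sketch only; "client"
here stands in for whatever the top of the client-side volfile
happens to be called):

# Client-side caching translators stacked over the client-side volume (sketch)
volume wb
  type performance/write-behind
  subvolumes client
end-volume

volume iocache
  type performance/io-cache
  subvolumes wb
end-volume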

I'm looking forward to seeing how 3.0.1 will stack up against this with its complete avoidance of libfuse.

Gordan






