From: Mikhail T.
Subject: [Gluster-devel] Profiling glusterfs (Re: Mirrored GlusterFS -- very poor read performance)
Date: Sat, 04 Jan 2014 14:13:47 -0500
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0

On 03.01.2014 01:59, Mikhail T. wrote:
We expected to pay some performance penalty for the features, but the actual numbers are causing sticker shock...

We tried mounting a GlusterFS replicated volume over NFS, rather than glusterfs/fuse, and were pleasantly surprised. The read performance improved dramatically -- and is in line with that of the NetApp NFS server we are also using:


                    Local FS          NetApp NFS        GlusterFS         GlusterNFS
                    Average  Minimum  Average  Minimum  Average  Minimum  Average  Minimum
Small static file     1.532    0.513    4.027    0.916    27.81    7.184    5.591    1.394
Large static file    14.45     2.721   14.56     3.643    37.90    7.433   14.95     4.829

This suggests the performance problem is not on the Gluster servers but on the client side, so I tried to use the profiler. Although the software is not easily profilable due to its use of shared libraries, partial coverage is possible... Here is what I did:

% env LDFLAGS="-pg" CFLAGS="-pg -O2 -march=core2 -pipe -fno-strict-aliasing -g" ./configure --enable-static
% make
% cd glusterfsd/src
% cc -pg -o glusterfsd.profilable *.o -lpthread ../../rpc/xdr/src/*.o ../../rpc/rpc-lib/src/*.o -lcrypto -lz ../../libglusterfs/src/*.o ../../xlators/mount/fuse/src/*.o -L ../../libglusterfs/src/.libs/ -lglusterfs -L ../../rpc/rpc-lib/src/.libs -lgfrpc -L ../../rpc/xdr/src/.libs -lgfxdr
% ln -s glusterfsd.profilable glusterfs
% ./glusterfs --no-daemon ......

I then ran some tests for two minutes and unmounted the share. The resulting gmon.out does not cover any calls that took place in the explicitly dlopen-ed objects (like fuse.so), but everything else is included: my glusterfsd.profilable executable does not use shared libraries for its own work; it links explicitly with the .o files. (The shared libs are needed only to be able to dlopen the various "xlators", which expect certain symbols to be available.)

Now, my testing made repeated requests for two files -- for one minute each. First a tiny 430-byte file, then a bigger 93 Kb one. In 122 seconds there were about 55K file transfers, and the glusterfs process accumulated 135 seconds of CPU time... According to gprof, these 55K transfers resulted in 597971 calls to each of rpc_clnt_notify and rpc_transport_notify, each accounting for over 34% of the total time.

Joseph Landman indicated earlier that he blames the kernel/user-space context switching, and that surely is responsible for some overhead. But the bulk of it seems to be in the client/server communication (that is, between the glusterfs client and the glusterfsd server), which appears to be unnecessarily chatty. Why are there ten or more notifications of each kind for every file transfer? I wonder if this can be improved, bringing the performance of glusterfs mounts closer to that of the NFS method... (In fact, the NFS method may benefit from such an optimization too.)

I may be reading the profiler's output wrong -- I am rusty in this area. Would anybody else care to take a look at the results?

Please let me know if you'd like me to massage the data in some different way -- I'll keep the raw results around for some time... Thank you! Yours,

-mi
