qemu-devel

Re: tools/virtiofs: Multi threading seems to hurt performance


From: Vivek Goyal
Subject: Re: tools/virtiofs: Multi threading seems to hurt performance
Date: Tue, 22 Sep 2020 13:47:33 -0400

On Tue, Sep 22, 2020 at 11:25:31AM +0100, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> > Hi,
> >   I've been doing some of my own perf tests and I think I agree
> > about the thread pool size;  my test is a kernel build
> > and I've tried a bunch of different options.
> > 
> > My config:
> >   Host: 16 core AMD EPYC (32 thread), 128G RAM,
> >      5.9.0-rc4 kernel, rhel 8.2ish userspace.
> >   5.1.0 qemu/virtiofsd built from git.
> >   Guest: Fedora 32 from cloud image with just enough extra installed for
> > a kernel build.
> > 
> >   git cloned and checkout v5.8 of Linux into /dev/shm/linux on the host
> > fresh before each test.  Then log into the guest, make defconfig,
> > time make -j 16 bzImage,  make clean; time make -j 16 bzImage 
> > The numbers below are the 'real' time in the guest from the initial make
> > (the subsequent makes don't vary much)
> > 
> > Below are the details of what each of these means, but here are the
> > numbers first
> > 
> > virtiofsdefault        4m0.978s
> > 9pdefault              9m41.660s
> > virtiofscache=none    10m29.700s
> > 9pmmappass             9m30.047s
> > 9pmbigmsize           12m4.208s
> > 9pmsecnone             9m21.363s
> > virtiofscache=noneT1   7m17.494s
> > virtiofsdefaultT1      3m43.326s
> > 
> > So the winner there by far is the 'virtiofsdefaultT1' - that's
> > the default virtiofs settings, but with --thread-pool-size=1 - so
> > yes it gives a small benefit.
> > But interestingly the cache=none virtiofs performance is pretty bad,
> > but thread-pool-size=1 on that makes a BIG improvement.
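
To put the two thread-pool-size=1 comparisons above into relative terms, here is a small sketch that converts the quoted 'real' times to seconds and computes the fraction of wall-clock time saved. The times are copied from the table; the helper names (to_seconds, saving) are made up for illustration and are not part of any tool used in the tests.

```python
import re

# 'real' times copied verbatim from the kernel build table quoted above.
TIMINGS = {
    "virtiofsdefault": "4m0.978s",
    "9pdefault": "9m41.660s",
    "virtiofscache=none": "10m29.700s",
    "9pmmappass": "9m30.047s",
    "9pmbigmsize": "12m4.208s",
    "9pmsecnone": "9m21.363s",
    "virtiofscache=noneT1": "7m17.494s",
    "virtiofsdefaultT1": "3m43.326s",
}


def to_seconds(real):
    """Convert a `time`-style string such as '4m0.978s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", real)
    if m is None:
        raise ValueError(f"unrecognised time format: {real!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def saving(before, after):
    """Fraction of wall-clock time saved going from `before` to `after`."""
    b = to_seconds(TIMINGS[before])
    a = to_seconds(TIMINGS[after])
    return (b - a) / b


if __name__ == "__main__":
    pairs = [
        ("virtiofsdefault", "virtiofsdefaultT1"),
        ("virtiofscache=none", "virtiofscache=noneT1"),
    ]
    for before, after in pairs:
        print(f"{before} -> {after}: {saving(before, after):.1%} less time")
```

On these numbers the default configuration saves roughly 7% with --thread-pool-size=1, while cache=none saves roughly 30%. This is a single run per configuration, so treat the percentages as indicative only.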
> 
> Here are fio runs that Vivek asked me to run in my same environment
> (there are some 0's in some of the mmap cases, and I've not investigated
> why yet).

With virtiofs, cache=none does not allow mmap. That is why you are
seeing 0 in those cases.

> virtiofs is looking good here in I think all of the cases;
> there's some division over which config; cache=none
> seems faster in some cases which surprises me.

I know cache=none is faster for write workloads. It forces direct
writes, and on that path we don't call file_remove_privs(). cache=auto,
on the other hand, goes through file_remove_privs(), and that adds a
GETXATTR request to every WRITE request.
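
As a rough illustration of that overhead, the sketch below counts requests under a deliberately simple model: one WRITE per write call, plus one GETXATTR per WRITE when cache=auto, and nothing else. The function name and the model itself are assumptions made for illustration; real traffic also depends on request sizes, caching and other operations.

```python
def fuse_requests(n_writes, cache_mode):
    """Count requests for n_writes write calls under the simple model.

    Assumed model (not measured): cache=none issues only WRITE requests,
    while cache=auto issues one extra GETXATTR per WRITE.
    """
    if n_writes < 0:
        raise ValueError("n_writes must be non-negative")
    if cache_mode == "none":
        getxattrs = 0
    elif cache_mode == "auto":
        getxattrs = n_writes
    else:
        raise ValueError(f"unknown cache mode: {cache_mode!r}")
    return {"WRITE": n_writes, "GETXATTR": getxattrs}


if __name__ == "__main__":
    for mode in ("none", "auto"):
        counts = fuse_requests(1000, mode)
        print(f"cache={mode}: {counts} total={sum(counts.values())}")
```

Under this model a write-heavy workload sends twice as many requests with cache=auto as with cache=none, which is consistent with cache=none looking faster in the write cases.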

Vivek



