
Re: tools/virtiofs: Multi threading seems to hurt performance


From: Venegas Munoz, Jose Carlos
Subject: Re: tools/virtiofs: Multi threading seems to hurt performance
Date: Thu, 24 Sep 2020 21:33:01 +0000

Hi Folks,

Sorry for the delay in getting back about how to reproduce the `fio` data.

I have some code to automate testing across multiple Kata configs and collect info 
such as:
- kata-env output, the Kata configuration.toml, the qemu command line, and the 
virtiofsd command line.

See: 
https://github.com/jcvenegas/mrunner/


Last time we agreed to narrow the cases and configs down to comparing virtiofs and 
9pfs.

The configs were the following:

- qemu + virtiofs (cache=auto, dax=0), a.k.a. `kata-qemu-virtiofs`, WITHOUT xattr
- qemu + 9pfs, a.k.a. `kata-qemu`

Please take a look at the HTML and raw results attached to this mail.

## Can I say that the current status is:
- As David's tests and Vivek's analysis point out, for the fio workload you are 
using, it seems the best candidate should be cache=none.
   - In the comparison I used cache=auto, as Vivek suggested; this makes sense 
since it seems that will be the default for Kata.
   - Even if cache=none works better for this case, can I assume that cache=auto 
with dax=0 will be better than any 9pfs config (once we find the root cause)?

- Vivek is taking a look at 9pfs's mmap mode, to see how it differs from the 
current virtiofs implementation. This is what we use by default for 9pfs in Kata.

## I'd like to identify what should come next in the debugging/testing:

- Should I try to narrow things down by testing with plain qemu, without Kata? 
(A standalone sketch is at the end of this mail.)
- Should I try first with a new patch you already have?
- Should I try qemu without the static build?
- Should I run the same test with thread-pool-size=1? (A possible config tweak 
is sketched right below.)
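
For that last item, a minimal sketch of how it could be wired through Kata's 
existing knobs. It assumes that flags passed via virtio_fs_extra_args (the same 
key the reproduction script below resets to '[]') reach virtiofsd unchanged:

# Hypothetical tweak: ask Kata to start virtiofsd with a single worker thread.
sudo crudini --set --existing \
  /opt/kata/share/defaults/kata-containers/configuration-qemu-virtiofs.toml \
  hypervisor.qemu virtio_fs_extra_args '["--thread-pool-size=1"]'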

Please let me know how I can help.

Cheers.

On 22/09/20 12:47, "Vivek Goyal" <vgoyal@redhat.com> wrote:

    On Tue, Sep 22, 2020 at 11:25:31AM +0100, Dr. David Alan Gilbert wrote:
    > * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
    > > Hi,
    > >   I've been doing some of my own perf tests and I think I agree
    > > about the thread pool size;  my test is a kernel build
    > > and I've tried a bunch of different options.
    > > 
    > > My config:
    > >   Host: 16 core AMD EPYC (32 thread), 128G RAM,
    > >      5.9.0-rc4 kernel, rhel 8.2ish userspace.
    > >   5.1.0 qemu/virtiofsd built from git.
    > >   Guest: Fedora 32 from cloud image with just enough extra installed for
    > > a kernel build.
    > > 
    > >   git cloned and checkout v5.8 of Linux into /dev/shm/linux on the host
    > > fresh before each test.  Then log into the guest, make defconfig,
    > > time make -j 16 bzImage,  make clean; time make -j 16 bzImage 
    > > The numbers below are the 'real' time in the guest from the initial make
    > > (the subsequent makes don't vary much)
    > > 
    > > Below are the details of what each of these means, but here are the
    > > numbers first
    > > 
    > > virtiofsdefault        4m0.978s
    > > 9pdefault              9m41.660s
    > > virtiofscache=none    10m29.700s
    > > 9pmmappass             9m30.047s
    > > 9pmbigmsize           12m4.208s
    > > 9pmsecnone             9m21.363s
    > > virtiofscache=noneT1   7m17.494s
    > > virtiofsdefaultT1      3m43.326s
    > > 
    > > So the winner there by far is the 'virtiofsdefaultT1' - that's
    > > the default virtiofs settings, but with --thread-pool-size=1 - so
    > > yes it gives a small benefit.
    > > But interestingly the cache=none virtiofs performance is pretty bad,
    > > but thread-pool-size=1 on that makes a BIG improvement.
    > 
    > Here are fio runs that Vivek asked me to run in my same environment
    > (there are some 0's in some of the mmap cases, and I've not investigated
    > why yet).

    cache=none does not allow mmap in case of virtiofs. That's why you
    are seeing 0s.

    > virtiofs is looking good here in, I think, all of the cases;
    > there's some division over which config; cache=none
    > seems faster in some cases, which surprises me.

    I know cache=none is faster in the case of write workloads. It forces
    direct writes, where we don't call file_remove_privs(), while cache=auto
    goes through file_remove_privs(), and that adds a GETXATTR request to
    every WRITE request.

    Vivek
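
One way to confirm the GETXATTR-per-WRITE overhead Vivek describes is to run 
virtiofsd with debug logging during the fio job and count opcodes. This is only 
a sketch: it assumes the debug log names FUSE opcodes the way libfuse's 
low-level debug output does, and it uses a standalone socket instead of Kata's 
--fd:

# Hedged sketch: run virtiofsd by hand with debug logging to stderr.
/opt/kata/bin/virtiofsd --socket-path=/tmp/vfsd.sock \
    -o source=/shared -o cache=auto -o no_posix_lock \
    -o log_level=debug -f 2> /tmp/virtiofsd.log &
# After the fio run, compare request counts:
grep -c GETXATTR /tmp/virtiofsd.log
grep -c WRITE /tmp/virtiofsd.log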


Attachment: results.tar.gz
Description: results.tar.gz

virtiofs vs 9pfs: fio comparison

Platform

Packet: c1.small.x86-01
- PROC: 1 x Intel E3-1240 v3
- RAM: 32GB
- DISK: 2 x 120GB SSD
- NIC: 2 x 1Gbps Bonded Port
- Nproc: 8

Env

Name             kata-qemu-virtiofs                       kata-qemu
Kata version     1.12.0-alpha1                            1.12.0-alpha1
Qemu version     5.0.0 (kata-static)                      5.0.0 (kata-static)
Qemu code repo   https://gitlab.com/virtio-fs/qemu.git    https://github.com/qemu/qemu
Qemu tag         qemu5.0-virtiofs-with51bits-dax          v5.0.0
Kernel code      https://gitlab.com/virtio-fs/linux.git   https://cdn.kernel.org/pub/linux/kernel/v4.x/
Kernel tag       kata-v5.6-april-09-2020                  v5.4.60
OS               Ubuntu 18.04.2 LTS (Bionic Beaver)
Host kernel      4.15.0-50-generic #54-Ubuntu

fio workload:

fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75 --output=/output/fio.txt

Results:

kata-qemu (9pfs), run status group 0 (all jobs):

   READ: bw=211MiB/s (222MB/s), 211MiB/s-211MiB/s (222MB/s-222MB/s), io=3070MiB (3219MB), run=14532-14532msec
  WRITE: bw=70.6MiB/s (74.0MB/s), 70.6MiB/s-70.6MiB/s (74.0MB/s-74.0MB/s), io=1026MiB (1076MB), run=14532-14532msec

kata-qemu-virtiofs, run status group 0 (all jobs):

   READ: bw=159MiB/s (167MB/s), 159MiB/s-159MiB/s (167MB/s-167MB/s), io=3070MiB (3219MB), run=19321-19321msec
  WRITE: bw=53.1MiB/s (55.7MB/s), 53.1MiB/s-53.1MiB/s (55.7MB/s-55.7MB/s), io=1026MiB (1076MB), run=19321-19321msec

For this job, 9pfs comes out roughly 33% ahead of virtiofs (cache=auto, dax=0) on 
both read and write bandwidth (211 vs 159 MiB/s read, 70.6 vs 53.1 MiB/s write).

Some other useful information:

Qemu command:

/opt/kata/bin/qemu-virtiofs-system-x86_64 
-name sandbox-6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb 
-uuid 6dfcd0cf-4d02-45a0-88d6-b5a3b3f88e52 
-machine pc,accel=kvm,kernel_irqchip,nvdimm 
-cpu host,pmu=off 
-qmp unix:/run/vc/vm/6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb/qmp.sock,server,nowait 
-m 2048M,slots=10,maxmem=33139M 
-device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= 
-device virtio-serial-pci,disable-modern=false,id=serial0,romfile= 
-device virtconsole,chardev=charconsole0,id=console0 
-chardev socket,id=charconsole0,path=/run/vc/vm/6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb/console.sock,server,nowait 
-device nvdimm,id=nv0,memdev=mem0 
-object memory-backend-file,id=mem0,mem-path=/opt/kata/share/kata-containers/kata-containers-image_clearlinux_1.12.0-alpha1_agent_8c9bbadcd4.img,size=268435456 
-device virtio-scsi-pci,id=scsi0,disable-modern=false,romfile= 
-object rng-random,id=rng0,filename=/dev/urandom 
-device virtio-rng-pci,rng=rng0,romfile= 
-device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 
-chardev socket,id=charch0,path=/run/vc/vm/6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb/kata.sock,server,nowait 
-chardev socket,id=char-31162971885078d3,path=/run/vc/vm/6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb/vhost-fs.sock 
-device vhost-user-fs-pci,chardev=char-31162971885078d3,tag=kataShared,romfile= 
-netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 
-device driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,disable-modern=false,mq=on,vectors=4,romfile= 
-rtc base=utc,driftfix=slew,clock=host -global kvm-pit.lost_tick_policy=discard 
-vga none -no-user-config -nodefaults 
-nographic 
--no-reboot 
-daemonize 
-object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on 
-numa node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinux-5.4.60-88 
-append 'tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 cryptomgr.notests net.ifnames=0 pci=lastbus=0 iommu=off root=/dev/pmem0p1 rootflags=dax,data="" ro rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=8 agent.use_vsock=false systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none'
-pidfile /run/vc/vm/6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb/pid 
-smp 1,cores=1,threads=1,sockets=8,maxcpus=8

virtiofsd command:

/opt/kata/bin/virtiofsd 
--fd=3 
-o source=/run/kata-containers/shared/sandboxes/6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb/shared 
-o cache=auto 
--syslog 
-o no_posix_lock 
-f
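
For the thread-pool-size=1 experiment discussed above, the same command line 
would just grow one flag (assuming the kata-static build carries upstream 
virtiofsd's --thread-pool-size option):

/opt/kata/bin/virtiofsd 
--fd=3 
-o source=/run/kata-containers/shared/sandboxes/6da5fc42e640c8e9bb4a0104a379c69b5a97d0074c8e88b27e1b58342f20aefb/shared 
-o cache=auto 
--syslog 
-o no_posix_lock 
--thread-pool-size=1 
-f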

qemu-9pfs:

/opt/kata/bin/qemu-system-x86_64 
-name sandbox-0c7f830064ca21e4b48d759b808eab11d8daaae6513192b533443ab8ce383970 
-uuid 56fce320-3e1d-4763-ae94-d8c539ee3e5f 
-machine pc,accel=kvm,kernel_irqchip,nvdimm 
-cpu host,pmu=off 
-qmp unix:/run/vc/vm/0c7f830064ca21e4b48d759b808eab11d8daaae6513192b533443ab8ce383970/qmp.sock,server,nowait 
-m 2048M,slots=10,maxmem=33139M 
-device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= 
-device virtio-serial-pci,disable-modern=false,id=serial0,romfile= 
-device virtconsole,chardev=charconsole0,id=console0 
-chardev socket,id=charconsole0,path=/run/vc/vm/0c7f830064ca21e4b48d759b808eab11d8daaae6513192b533443ab8ce383970/console.sock,server,nowait 
-device nvdimm,id=nv0,memdev=mem0 
-object memory-backend-file,id=mem0,mem-path=/opt/kata/share/kata-containers/kata-containers-image_clearlinux_1.12.0-alpha1_agent_8c9bbadcd4.img,size=268435456 
-device virtio-scsi-pci,id=scsi0,disable-modern=false,romfile= 
-object rng-random,id=rng0,filename=/dev/urandom 
-device virtio-rng-pci,rng=rng0,romfile= 
-device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 
-chardev socket,id=charch0,path=/run/vc/vm/0c7f830064ca21e4b48d759b808eab11d8daaae6513192b533443ab8ce383970/kata.sock,server,nowait 
-device virtio-9p-pci,disable-modern=false,fsdev=extra-9p-kataShared,mount_tag=kataShared,romfile= 
-fsdev local,id=extra-9p-kataShared,path=/shared,security_model=none,multidevs=remap 
-netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 
-device driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,disable-modern=false,mq=on,vectors=4,romfile= 
-rtc base=utc,driftfix=slew,clock=host 
-global kvm-pit.lost_tick_policy=discard 
-vga none -no-user-config -nodefaults -nographic 
--no-reboot -daemonize -object memory-backend-ram,id=dimm1,size=2048M 
-numa node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinux-5.4.60-88 
-append 'tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 cryptomgr.notests net.ifnames=0 pci=lastbus=0 iommu=off root=/dev/pmem0p1 rootflags=dax,data="" ro rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=8 agent.use_vsock=false systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none' 
-pidfile /run/vc/vm/0c7f830064ca21e4b48d759b808eab11d8daaae6513192b533443ab8ce383970/pid 
-smp 1,cores=1,threads=1,sockets=8,maxcpus=8

If you want to try it yourself:

Please use the following repository: https://github.com/jcvenegas/mrunner/

If you don't find the repository useful, here is a summary of what it does:

#!/bin/bash
set -x
set -e

# Build the fio workload container image.
(
cd workloads/fio/dockerfile/
docker build -f Dockerfile -t large-files-4gb .
)

# virtiofsd: configure kata-qemu-virtiofs for cache=auto with DAX disabled.
results_dir="${PWD}"
sudo crudini --set --existing /opt/kata/share/defaults/kata-containers/configuration-qemu-virtiofs.toml hypervisor.qemu virtio_fs_cache '"auto"'
sudo crudini --set --existing /opt/kata/share/defaults/kata-containers/configuration-qemu-virtiofs.toml hypervisor.qemu virtio_fs_cache_size 0
sudo crudini --set --existing /opt/kata/share/defaults/kata-containers/configuration-qemu-virtiofs.toml hypervisor.qemu virtio_fs_extra_args '[]'
sudo crudini --set --existing /opt/kata/share/defaults/kata-containers/configuration-qemu-virtiofs.toml hypervisor.qemu kernel '"/opt/kata/share/kata-containers/vmlinux.container"'
# Record the resulting Kata environment.
/opt/kata/bin/kata-qemu-virtiofs kata-env

# Run the fio job under kata-qemu-virtiofs.
docker run -dti --runtime kata-qemu-virtiofs -v ${results_dir}:/output --name large-files-4gb large-files-4gb
docker exec -i large-files-4gb sh -c 'fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75 --output=/output/fio-virtiofs.txt'
docker rm -f large-files-4gb

# 9pfs: run the same fio job under kata-qemu.
docker run -dti --runtime kata-qemu -v ${results_dir}:/output --name large-files-4gb large-files-4gb
docker exec -i large-files-4gb sh -c 'fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75 --output=/output/fio-9pfs.txt'
docker rm -f large-files-4gb
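
To narrow things down to plain qemu without Kata (the first open question above), 
something along these lines should reproduce the same fio job. This is only a 
sketch: the socket path, guest image, and mount point are illustrative, not taken 
from the runs above.

# Start virtiofsd by hand, exporting the shared directory.
/opt/kata/bin/virtiofsd --socket-path=/tmp/vfsd.sock -o source=/shared -o cache=auto &

# Boot a guest with a vhost-user-fs device; virtiofs needs a shared memory backend.
qemu-system-x86_64 -machine pc,accel=kvm -cpu host -smp 8 -m 2048M \
  -object memory-backend-file,id=mem0,size=2048M,mem-path=/dev/shm,share=on \
  -numa node,memdev=mem0 \
  -chardev socket,id=char0,path=/tmp/vfsd.sock \
  -device vhost-user-fs-pci,chardev=char0,tag=kataShared \
  -drive file=guest.img,if=virtio \
  -nographic

# Inside the guest: mount the export and run the exact fio job from above.
mount -t virtiofs kataShared /mnt
cd /mnt && fio --direct=1 --gtod_reduce=1 --name=test \
  --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G \
  --readwrite=randrw --rwmixread=75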

