
Re: [RFC patch 0/1] block: vhost-blk backend


From: Stefan Hajnoczi
Subject: Re: [RFC patch 0/1] block: vhost-blk backend
Date: Tue, 4 Oct 2022 14:26:12 -0400

On Mon, Jul 25, 2022 at 11:55:26PM +0300, Andrey Zhadchenko wrote:
> Although QEMU virtio-blk is quite fast, there is still some room for
> improvement. Disk latency can be reduced if we handle virtio-blk requests
> in the host kernel, avoiding a lot of syscalls and context switches.
> 
> The biggest disadvantage of this vhost-blk flavor is that it supports only
> the raw format. Luckily Kirill Thai has proposed a device-mapper driver for
> the QCOW2 format to attach such files as block devices:
> https://www.spinics.net/lists/kernel/msg4292965.html
> 
> Also, by using kernel modules we can bypass the iothread limitation and
> finally scale block requests with CPUs for high-performance devices. This
> is planned to be implemented in the next version.
> 
> Linux kernel module part:
> https://lore.kernel.org/kvm/20220725202753.298725-1-andrey.zhadchenko@virtuozzo.com/
> 
> test setups and results:
> fio --direct=1 --rw=randread  --bs=4k  --ioengine=libaio --iodepth=128

> QEMU drive options: cache=none
> filesystem: xfs

Please post the full QEMU command-line so it's clear exactly what this
is benchmarking.

A preallocated raw image file is a good baseline with:

  --object iothread,id=iothread0 \
  --blockdev file,filename=test.img,cache.direct=on,aio=native,node-name=drive0 \
  --device virtio-blk-pci,drive=drive0,iothread=iothread0

(BTW QEMU's default vq size is 256 descriptors and the number of vqs is
the number of vCPUs.)
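
For concreteness, a full command line along those lines might look something
like this (just a sketch; the accelerator flags, vCPU count, memory size, and
image path are placeholders):

  qemu-system-x86_64 -enable-kvm -cpu host -smp 4 -m 4G \
  --object iothread,id=iothread0 \
  --blockdev file,filename=test.img,cache.direct=on,aio=native,node-name=drive0 \
  --device virtio-blk-pci,drive=drive0,iothread=iothread0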

> 
> SSD:
>                | randread, IOPS | randwrite, IOPS |
> Host           |          95.8k |           85.3k |
> QEMU virtio    |          57.5k |           79.4k |
> QEMU vhost-blk |          95.6k |           84.3k |
> 
> RAMDISK (vq == vcpu):

With fio numjobs=vcpu here?
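
In other words, something like this for the 2-vCPU case (the target device
and job name below are just my guesses):

  fio --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=128 \
      --numjobs=2 --filename=/dev/vdb --name=bench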

>                  | randread, IOPS | randwrite, IOPS |
> virtio, 1vcpu    |           123k |            129k |
> virtio, 2vcpu    |      253k (??) |       250k (??) |

QEMU's aio=threads (the default) submits I/O from a worker thread pool, so it
gets around the single IOThread and can beat aio=native in some cases for
that reason. Were you using aio=native or aio=threads?
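
That is, whether the file node was opened more like the first or the second
of these (the filename is a placeholder):

  --blockdev file,filename=test.img,cache.direct=on,aio=native,node-name=drive0
  --blockdev file,filename=test.img,cache.direct=on,aio=threads,node-name=drive0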

> virtio, 4vcpu    |           158k |            154k |
> vhost-blk, 1vcpu |           110k |            113k |
> vhost-blk, 2vcpu |           247k |            252k |
