Re: [Qemu-devel] [network performance question] only ~2Gbps throughput between two linux guests which are running on the same host via netperf -t TCP_STREAM -m 1400, but xen can achieve ~7Gbps

From: Zhang Haoyu
Subject: Re: [Qemu-devel] [network performance question] only ~2Gbps throughput between two linux guests which are running on the same host via netperf -t TCP_STREAM -m 1400, but xen can achieve ~7Gbps
Date: Tue, 10 Jun 2014 11:50:38 +0800
I ran ethtool -k on the backend tap netdevice and found that its TSO is off:
Features for tap0:
rx-checksumming: off [fixed]
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: on
tcp-segmentation-offload: off
tx-tcp-segmentation: off [requested on]
tx-tcp-ecn-segmentation: off [requested on]
tx-tcp6-segmentation: off [requested on]
udp-fragmentation-offload: off [requested on]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: on
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
but I failed to enable its TSO; a "Could not change any device features" error
was reported. Why?
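(A possible explanation, my own aside rather than from the thread: a tap device's offload features are normally set via the TUNSETOFFLOAD ioctl by the process that owns the tap fd, i.e. qemu/vhost, driven by the features the guest's virtio-net driver negotiated, so toggling them from the host with ethtool -K can be refused. A quick way to read the relevant state, assuming the device is named tap0:)

```shell
# Inspect the tap's segmentation offloads; "off [requested on]" suggests
# the host asked for the feature but a dependency (here, the guest-side
# virtio feature negotiation pushed down via TUNSETOFFLOAD) keeps it off.
ethtool -k tap0 | grep -E 'segmentation|checksumming'
```

(If that is the cause, the fix lies on the guest side, in a virtio-net driver that negotiates TSO/checksum, not in ethtool -K on the host.)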
>>I see that RX checksumming is still off for you on virtio; this is
>>likely what's contributing to the problem.
>>
>>Here's how it looks for me:
>>ethtool -k eth1
>> Offload parameters for eth1:
>> rx-checksumming: on
>> tx-checksumming: on
>> scatter-gather: on
>> tcp-segmentation-offload: on
>> udp-fragmentation-offload: on
>> generic-segmentation-offload: on
>> generic-receive-offload: on
>> large-receive-offload: off
>>
>When I select centos-6.3 as the guest OS, rx-checksumming is on, too.
>After updating qemu from 1.4.0 to 2.0.0, the inter-VM throughput can achieve
>~5Gbps via netperf -t TCP_STREAM -m 1400.
>Here is ethtool -k eth1 on the centos-6.3 guest,
>ethtool -k eth1
>Offload parameters for eth1:
>rx-checksumming: on
>tx-checksumming: on
>scatter-gather: on
>tcp-segmentation-offload: on
>udp-fragmentation-offload: on
>generic-segmentation-offload: on
>generic-receive-offload: off
>large-receive-offload: off
>
>the only difference is GRO: on for you, off for me.
>I ran 'ethtool -K eth1 gro on' in my guest, and the error below was reported:
>"Cannot set device GRO settings: Invalid argument"
>
>>you don't supply kernel versions for host or guest kernels,
>>so it's hard to judge what's going on exactly.
>>
>host: linux-3.10.27(directly download from kernel.org)
>qemu: qemu-2.0.0(directly download from wiki.qemu.org/Download)
>guest: centos-6.3 (2.6.32-279.el6.x86_64), 2 vcpus
>
>>Bridge configuration also plays a huge role.
>>Things like ebtables might affect performance as well,
>>sometimes even if they are only loaded, not even enabled.
>>
>I will check it.
>
>>Also, some old scheduler versions didn't put VMs on different
>>CPUs aggressively enough, this resulted in conflicts
>>when VMs compete for the same CPU.
>I will check it.
>
No aggressive contention for the same CPU, but when I pin each vcpu to a
different pcpu, a ~1Gbps bonus is gained.
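(For reference, the pinning can be done roughly like this; a sketch with hypothetical thread IDs, which on a real host must be read from /proc/<qemu-pid>/task/ or the QMP query-cpus command:)

```shell
# Pin each qemu vcpu thread to its own physical CPU (illustrative TIDs).
VCPU0_TID=12345   # hypothetical: look these up on the real host
VCPU1_TID=12346
taskset -pc 2 "$VCPU0_TID"   # bind vcpu0 thread to pcpu 2
taskset -pc 3 "$VCPU1_TID"   # bind vcpu1 thread to pcpu 3
```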
>>On numa systems, some older host kernels would split VM memory
>>across NUMA nodes, this might lead to bad performance.
>>
>local first.
>
>>On Sat, Jun 07, 2014 at 11:07:10AM +0800, Zhang Haoyu wrote:
>>> After updating qemu from 1.4 to 2.0, the inter-VM throughput can
>>> achieve ~5Gbps via netperf -t TCP_STREAM -m 1400,
>>> but the performance gap (~2Gbps) between kvm and xen still exists.
>>>
>>> Thanks,
>>> Zhang Haoyu
>>>
>>> ------------------
>>> Zhang Haoyu
>>> 2014-06-07
>>>
>>> -----Original Message-----
>>> From: Zhang Haoyu
>>> Sent: 2014-06-07 09:27:16
>>> To: Venkateswara Rao Nandigam; kvm; qemu-devel
>>> Cc: Gleb Natapov; Paolo Bonzini; Michael S.Tsirkin; yewudi
>>> Subject: Re: [network performance question] only ~2Gbps throughput between
>>> two linux guests which are running on the same host via netperf -t TCP_STREAM
>>> -m 1400, but xen can achieve ~7Gbps
>>>
>>> > Doesn't that answer your original question about the performance gap!
>>> Sorry, do you mean it's the offloads that cause the performance gap?
>>> But even with checksum offload, TSO, GRO, etc. turned OFF, the performance
>>> gap still exists.
>>> If I understand correctly, kvm should have better performance than xen from
>>> the angle of implementation, because of a shorter path and fewer
>>> context switches, especially for inter-VM communication.
>>>
>>> And why is the performance gap so big (~2G vs ~7G) when checksum offload,
>>> TSO, GRO, etc. are on for both hypervisors?
>>> Why can the packets' size be so big (65160) and stable on xen, while most
>>> packets' size is 1448 and only a few are ~65000 on kvm, when running
>>> netperf -t TCP_STREAM -m 1400?
>>> Do some TCP configurations have a bearing on this? Or some virtio-net
>>> configurations?
>>>
>>> Thanks,
>>> Zhang Haoyu
>>>
>>> -----Original Message-----
>>> From: address@hidden [mailto:address@hidden On Behalf Of Zhang Haoyu
>>> Sent: Friday, June 06, 2014 3:44 PM
>>> To: Venkateswara Rao Nandigam; kvm; qemu-devel
>>> Cc: Gleb Natapov; Paolo Bonzini; Michael S.Tsirkin; yewudi
>>> Subject: Re: RE: [network performance question] only ~2Gbps
>>> throughput between two linux guests which are running on the same host via
>>> netperf -t TCP_STREAM -m 1400, but xen can achieve ~7Gbps
>>>
>>> > >> Try Rx/Tx checksum offload on all the concerned guests of both
>>> > >> Hypervisors.
>>> > >>
>>> > >Already ON on both hypervisors, so some other offloads (e.g. TSO, GSO)
>>> > >can be supported.
>>> >
>>> > Try Rx/Tx checksum offload "OFF" on all the concerned guests of
>>> > both Hypervisors
>>> >
>>> With Rx/Tx checksum offload off on the XEN guest, 1.6Gbps was achieved;
>>> tcpdump on the backend vif netdevice showed the packets' size is 1448, stable.
>>> With Rx/Tx checksum offload off on the KVM guest, only ~1Gbps was achieved;
>>> tcpdump on the backend tap netdevice showed the packets' size is 1448, stable.
>>>
>>> > And While launching the VM in KVM, in command line of virtio interface,
>>> > you can specify TSO, LRO, RxMergebuf. Try this instead of ethtool
>>> > interface.
>>> The current qemu command is shown below, and I will change the virtio-net
>>> configuration later as you advise:
>>> /usr/bin/kvm -id 8572667846472 -chardev
>>> socket,id=qmp,path=/var/run/qemu-server/8572667846472.qmp,server,nowait
>>> -mon chardev=qmp,mode=control -vnc :0,websocket,to=200,x509,password
>>> -pidfile /var/run/qemu-server/8572667846472.pid -daemonize -name
>>> centos6-196.5.5.72 -smp sockets=1,cores=2 -cpu core2duo -nodefaults -vga
>>> cirrus -no-hpet -k en-us -boot menu=on,splash-time=8000 -m 4096 -usb -drive
>>> file=/sf/data/local/iso/vmtools/virtio_auto_install.iso,if=none,id=drive-ide0,media=cdrom,aio=threads,forecast=disable
>>> -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0,id=ide0,bootindex=200
>>> -drive
>>> file=/sf/data/local/images/host-f8bc123b3e74/32f49b646d1e/centos6-196.5.5.72.vm/vm-disk-1.qcow2,if=none,id=drive-ide2,cache=directsync,aio=threads,forecast=disable
>>> -device ide-hd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100
>>> -netdev type=tap,id=net0,ifname=857266784647200,script=/sf/etc/kvm/vtp-bridge,vhost=on,vhostforce=on
>>> -device virtio-net-pci,mac=FE:FC:FE:95:EC:A7,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
>>> -rtc driftfix=slew,clock=rt -global kvm-pit.lost_tick_policy=discard
>>> -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1
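(A sketch of the suggested change: qemu's virtio-net-pci device exposes offload properties directly on the command line. The flag names below are from qemu's virtio-net-pci device properties; verify them on your build with `-device virtio-net-pci,help`. Only the -device fragment is shown; the rest of the original command line is unchanged.)

```shell
# Illustrative fragment: enable mergeable Rx buffers and guest TSO/checksum
# explicitly instead of relying on ethtool inside the guest.
-device virtio-net-pci,mac=FE:FC:FE:95:EC:A7,netdev=net0,bus=pci.0,addr=0x12,\
id=net0,bootindex=300,mrg_rxbuf=on,csum=on,guest_csum=on,\
guest_tso4=on,guest_tso6=on,guest_ecn=on
```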
>>>
>>> -----Original Message-----
>>> From: Zhang Haoyu [mailto:address@hidden
>>> Sent: Friday, June 06, 2014 1:26 PM
>>> To: Venkateswara Rao Nandigam; kvm; qemu-devel
>>> Cc: Gleb Natapov; Paolo Bonzini; Michael S.Tsirkin; yewudi
>>> Subject: RE: [network performance question] only ~2Gbps throughput
>>> betweentwo linux guests which are running on the same host via netperf
>>> -tTCP_STREAM -m 1400, but xen can ac
>>>
>>> Thanks for reply.
>>> > >>> And, vhost enabled, tx zero-copy enabled, virtio TSO enabled on kvm.
>>> >
>>> > Try LRO "ON" on the client side. This would require mergeable Rx buffers
>>> > to be ON.
>>> >
>>> current settings for gro and lro,
>>> generic-receive-offload: on
>>> large-receive-offload: off [fixed]
>>>
>>> > And Xen netfront to KVM virtio are not apples to apples because of their
>>> > implementation details.
>>> >
>>> You are right; I just want to compare network performance between the two
>>> virtualization platforms from the user's point of view.
>>>
>>> > Try Rx/Tx checksum offload on all the concerned guests of both
>>> > Hypervisors.
>>> >
>>> Already ON on both hypervisors, so some other offloads (e.g. TSO, GSO)
>>> can be supported.
>>>
>>> kvm virtio-net nic:
>>> ethtool -k eth0
>>> Features for eth0:
>>> rx-checksumming: off [fixed]
>>> tx-checksumming: on
>>> tx-checksum-ipv4: off [fixed]
>>> tx-checksum-ip-generic: on
>>> tx-checksum-ipv6: off [fixed]
>>> tx-checksum-fcoe-crc: off [fixed]
>>> tx-checksum-sctp: off [fixed]
>>> scatter-gather: on
>>> tx-scatter-gather: on
>>> tx-scatter-gather-fraglist: on
>>> tcp-segmentation-offload: on
>>> tx-tcp-segmentation: on
>>> tx-tcp-ecn-segmentation: on
>>> tx-tcp6-segmentation: on
>>> udp-fragmentation-offload: on
>>> generic-segmentation-offload: on
>>> generic-receive-offload: on
>>> large-receive-offload: off [fixed]
>>> rx-vlan-offload: off [fixed]
>>> tx-vlan-offload: off [fixed]
>>> ntuple-filters: off [fixed]
>>> receive-hashing: off [fixed]
>>> highdma: on [fixed]
>>> rx-vlan-filter: on [fixed]
>>> vlan-challenged: off [fixed]
>>> tx-lockless: off [fixed]
>>> netns-local: off [fixed]
>>> tx-gso-robust: off [fixed]
>>> tx-fcoe-segmentation: off [fixed]
>>> tx-gre-segmentation: off [fixed]
>>> tx-udp_tnl-segmentation: off [fixed]
>>> fcoe-mtu: off [fixed]
>>> tx-nocache-copy: on
>>> loopback: off [fixed]
>>> rx-fcs: off [fixed]
>>> rx-all: off [fixed]
>>> tx-vlan-stag-hw-insert: off [fixed]
>>> rx-vlan-stag-hw-parse: off [fixed]
>>> rx-vlan-stag-filter: off [fixed]
>>>
>>> xen netfront nic:
>>> ethtool -k eth0
>>> Offload features for eth0:
>>> rx-checksumming: on
>>> tx-checksumming: on
>>> scatter-gather: on
>>> tcp-segmentation-offload: on
>>> udp-fragmentation-offload: off
>>> generic-segmentation-offload: on
>>> generic-receive-offload: off
>>> large-receive-offload: off
>>>
>>> <piece of tcpdump result on xen backend vif netdevice >
>>> 15:46:41.279954 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193138968:1193204128, ack 1, win 115, options [nop,nop,TS val 102307210
>>> ecr 102291188], length 65160
>>> 15:46:41.279971 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193204128:1193269288, ack 1, win 115, options [nop,nop,TS val 102307210
>>> ecr 102291188], length 65160
>>> 15:46:41.279987 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193269288:1193334448, ack 1, win 115, options [nop,nop,TS val 102307210
>>> ecr 102291188], length 65160
>>> 15:46:41.280003 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193334448:1193399608, ack 1, win 115, options [nop,nop,TS val 102307210
>>> ecr 102291188], length 65160
>>> 15:46:41.280020 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193399608:1193464768, ack 1, win 115, options [nop,nop,TS val 102307210
>>> ecr 102291188], length 65160
>>> 15:46:41.280213 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193464768:1193529928, ack 1, win 115, options [nop,nop,TS val 102307211
>>> ecr 102291189], length 65160
>>> 15:46:41.280233 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193529928:1193595088, ack 1, win 115, options [nop,nop,TS val 102307211
>>> ecr 102291189], length 65160
>>> 15:46:41.280250 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193595088:1193660248, ack 1, win 115, options [nop,nop,TS val 102307211
>>> ecr 102291189], length 65160
>>> 15:46:41.280239 IP 196.6.6.71.53622 > 196.6.6.72.53507: Flags [.], ack
>>> 1193138968, win 22399, options [nop,nop,TS val 102291190 ecr 102307210],
>>> length 0
>>> 15:46:41.280267 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193660248:1193725408, ack 1, win 115, options [nop,nop,TS val 102307211
>>> ecr 102291189], length 65160
>>> 15:46:41.280284 IP 196.6.6.72.53507 > 196.6.6.71.53622: Flags [.], seq
>>> 1193725408:1193790568, ack 1, win 115, options [nop,nop,TS val 102307211
>>> ecr 102291189], length 65160
>>>
>>> Packets' size is very stable, 65160 Bytes.
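(A small arithmetic aside of mine, not from the thread: 65160 is exactly 45 MSS-sized payloads, where the MSS is 1448 = 1500 (MTU) - 20 (IP) - 20 (TCP) - 12 (TCP timestamp option). In other words, the capture shows TSO/GRO-coalesced super-packets handed to the vif in one piece, not wire-sized frames:)

```shell
# 1448 matches the per-packet size seen in the kvm capture; 65160 on xen
# is an exact multiple of it, i.e. 45 coalesced segments per super-packet.
mss=$((1500 - 20 - 20 - 12))
echo "$((65160 % mss)) $((65160 / mss))"   # prints "0 45"
```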
>>>
>>> Thanks,
>>> Zhang Haoyu
>>>
>>> -----Original Message-----
>>> From: address@hidden [mailto:address@hidden On Behalf Of Zhang Haoyu
>>> Sent: Friday, June 06, 2014 9:01 AM
>>> To: kvm; qemu-devel
>>> Cc: Gleb Natapov; Paolo Bonzini; Michael S.Tsirkin; yewudi
>>> Subject: [network performance question] only ~2Gbps throughput between two
>>> linux guests which are running on the same host via netperf -t TCP_STREAM
>>> -m 1400, but xen can achieve ~7Gbps
>>>
>>> Hi, all
>>>
>>> I ran two linux guests on the same kvm host, then started netserver in one
>>> vm and netperf in the other; the netperf command and test result are shown
>>> below: netperf -H 196.5.5.71 -t TCP_STREAM -l 60 -- -m 1400 -M 1400
>>> MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
>>> 196.5.5.71 () port 0 AF_INET : nodelay
>>> Recv Send Send
>>> Socket Socket Message Elapsed
>>> Size Size Size Time Throughput
>>> bytes bytes bytes secs. 10^6bits/sec
>>>
>>> 87380 16384 1400 60.01 2355.45
>>>
>>> but when I ran two linux guests on the same xen hypervisor, ~7Gbps throughput
>>> was achieved; the netperf command and test result are shown below: netperf -H
>>> 196.5.5.71 -t TCP_STREAM -l 60 -- -m 1400 -M 1400 MIGRATED TCP STREAM TEST
>>> from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 196.5.5.71 () port 0 AF_INET
>>> Recv Send Send
>>> Socket Socket Message Elapsed
>>> Size Size Size Time Throughput
>>> bytes bytes bytes secs. 10^6bits/sec
>>>
>>> 87380 16384 1400 60.01 2349.82
>>>
>>> The test was performed many times; the results were similar to the above.
>>>
>>> When I ran tcpdump on the backend tap netdevice, I found that most packets'
>>> size is 1448 bytes on kvm and only a few packets are ~60000 bytes; but when
>>> I ran tcpdump on the backend vif netdevice, I found that most packets' size
>>> is >60000 bytes on xen.
>>> The test result of netperf -t TCP_STREAM -m 64 is similar: more large
>>> packets on xen than on kvm.
>>>
>>> And, vhost enabled, tx zero-copy enabled, virtio TSO enabled on kvm.
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> Zhang Haoyu
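(For completeness, the vhost and tx zero-copy settings mentioned in the message above can be checked on the host roughly like this; a sketch, assuming the zero-copy knob is the vhost_net module parameter as in mainline kernels of this vintage:)

```shell
lsmod | grep vhost_net                                     # vhost in use?
cat /sys/module/vhost_net/parameters/experimental_zcopytx  # 1 = tx zero-copy on
```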