Re: [Qemu-devel] [PATCH COLO-Frame v15 00/38] COarse-grain LOck-stepping
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v15 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)
Date: Tue, 1 Mar 2016 12:25:54 +0000
User-agent: Mutt/1.5.24 (2015-08-30)
* Hailiang Zhang (address@hidden) wrote:
> On 2016/2/29 17:47, Dr. David Alan Gilbert wrote:
> >* Hailiang Zhang (address@hidden) wrote:
> >>On 2016/2/27 0:36, Dr. David Alan Gilbert wrote:
> >>>* Dr. David Alan Gilbert (address@hidden) wrote:
> >>>>* zhanghailiang (address@hidden) wrote:
> >>>>>From: root <address@hidden>
> >>>>>
> >>>>>This is the 15th version of COLO (Still only support periodic
> >>>>>checkpoint).
> >>>>>
> >>>>>Here is only COLO frame part, you can get the whole codes from github:
> >>>>>https://github.com/coloft/qemu/commits/colo-v2.6-periodic-mode
> >>>>>
> >>>>>There are few changes in this series except for the network related
> >>>>>part.
> >>>>
> >>>>I was looking at the time the guest is paused during COLO and
> >>>>was surprised to find one of the larger chunks was the time to reset
> >>>>the guest before loading each checkpoint; I've traced it part way, the
> >>>>biggest contributors for my test VM seem to be:
> >>>>
> >>>> 3.8ms pcibus_reset: VGA
> >>>> 1.8ms pcibus_reset: virtio-net-pci
> >>>> 1.5ms pcibus_reset: virtio-blk-pci
> >>>> 1.5ms qemu_devices_reset: piix4_reset
> >>>> 1.1ms pcibus_reset: piix3-ide
> >>>> 1.1ms pcibus_reset: virtio-rng-pci
> >>>>
> >>>>I've not looked deeper yet, but some of these are very silly;
> >>>>I'm running with -nographic, so why it takes 3.8ms to reset VGA is
> >>>>going to be interesting.
> >>>>Also, my only block device is the virtio-blk, so while I understand that
> >>>>the standard PC machine has the IDE controller, it's not clear why it
> >>>>takes over a ms to reset an unused device.
> >>>
> >>>OK, so I've dug a bit deeper, and it appears that it's the changes in
> >>>PCI bars that actually take the time; every time we do a reset we
> >>>reset all the BARs, this causes it to do a pci_update_mappings and
> >>>end up doing a memory_region_del_subregion.
> >>>Then we load the config space of the PCI device as we do the vmstate_load,
> >>>and this recreates all the mappings again.
> >>>
> >>>I'm not sure what the fix is, but that sounds like it would
> >>>speed up the checkpoints usefully if we can avoid the map/remap when
> >>>they're the same.
> >>>
> >>
> >>Interesting, and thanks for your report.
> >>
> >>We already know that qemu_system_reset() is a time-consuming function,
> >>so we shouldn't call it here; but if we don't, there is a bug, which we
> >>reported before in a previous COLO series. Below is a copy of the
> >>related patch comment:
Paolo suggested one fix, see the patch below; I'm not sure if it's safe
(in particular if the guest changed a BAR and the device code tried to
access the memory while loading the state???) - but it does seem to work
and shaves ~10ms off the reset/load times:
Dave
commit 7570b2984143860005ad9fe79f5394c75f294328
Author: Dr. David Alan Gilbert <address@hidden>
Date: Tue Mar 1 12:08:14 2016 +0000
COLO: Lock memory map around reset/load
Changing the memory map appears to be expensive; we see this
particularly when, on loading a checkpoint, we:
a) reset the devices
This causes PCI BARs to be reset
b) load the device states
This causes the PCI BARs to be reloaded.
Turning this all into a single memory_region_transaction saves
~10ms/checkpoint.
TBD: What happens if the device code accesses the RAM during loading
the checkpoint?
Signed-off-by: Dr. David Alan Gilbert <address@hidden>
Suggested-by: Paolo Bonzini <address@hidden>
diff --git a/migration/colo.c b/migration/colo.c
index 45c3432..c44fb2a 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -22,6 +22,7 @@
#include "net/colo-proxy.h"
#include "net/net.h"
#include "block/block_int.h"
+#include "exec/memory.h"
static bool vmstate_loading;
@@ -934,6 +935,7 @@ void *colo_process_incoming_thread(void *opaque)
stage_time_start = qemu_clock_get_us(QEMU_CLOCK_HOST);
qemu_mutex_lock_iothread();
+ memory_region_transaction_begin();
qemu_system_reset(VMRESET_SILENT);
stage_time_end = qemu_clock_get_us(QEMU_CLOCK_HOST);
timed_average_account(&mis->colo_state.time_reset,
@@ -947,6 +949,7 @@ void *colo_process_incoming_thread(void *opaque)
stage_time_end - stage_time_start);
stage_time_start = stage_time_end;
ret = qemu_load_device_state(fb);
+ memory_region_transaction_commit();
if (ret < 0) {
error_report("COLO: load device state failed\n");
vmstate_loading = false;
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK