From: Paolo Bonzini
Subject: [Qemu-devel] Re: [PATCH 09/10] Exit loop if we have been there too long
Date: Tue, 30 Nov 2010 15:12:11 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101103 Fedora/1.0-0.33.b2pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.6

On 11/30/2010 02:47 PM, Anthony Liguori wrote:
> On 11/30/2010 01:15 AM, Paolo Bonzini wrote:
>> On 11/30/2010 03:11 AM, Anthony Liguori wrote:
>>> BufferedFile should hit the qemu_file_rate_limit check when the socket buffer gets filled up.
>>
>> The problem is that the file rate limit is not hit because work is done elsewhere. The rate can limit the bandwidth used and makes QEMU aware that socket operations may block (because that's what the buffered file freeze/unfreeze logic does); but it cannot be used to limit the _time_ spent in the migration code.
>
> Yes, it can, if you set the rate limit sufficiently low.

You mean, just like you can drive a car without brakes by keeping the speed sufficiently low.
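
To make the distinction concrete, here is a minimal sketch of a send loop that checks a byte budget and a wall-clock budget separately. It is not QEMU code: `send_pages`, `byte_budget`, `max_seconds` and the injectable `clock` are hypothetical names, and the 8-byte cost of a zero page is taken from the 8/4096 figure quoted later in this message.

```python
import time

PAGE_SIZE = 4096      # bytes sent for an ordinary page
ZERO_PAGE_COST = 8    # assumed wire cost of a zero page (from the 8/4096 figure)

def send_pages(pages, byte_budget, max_seconds, clock=time.monotonic):
    """Walk `pages` (True means zero page) until a budget runs out.

    Returns (pages_processed, bytes_sent, reason), where reason is
    "bandwidth", "time" or "done". Hypothetical helper, not a QEMU API.
    """
    start = clock()
    sent = 0
    for i, is_zero in enumerate(pages):
        # Bandwidth check: only trips once enough bytes have been sent.
        if sent >= byte_budget:
            return i, sent, "bandwidth"
        # Time check: trips regardless of how few bytes were sent.
        if clock() - start > max_seconds:
            return i, sent, "time"
        sent += ZERO_PAGE_COST if is_zero else PAGE_SIZE
    return len(pages), sent, "done"
```

With a long run of zero pages, `sent` grows by only 8 bytes per page, so the bandwidth check alone would let the loop run for a very long time; the separate time check is what bounds the stay in the loop.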

> [..] accounting zero pages as full sized pages should "fix" the problem.

I know you used quotes, but it's a very very generous definition of fix. Both these proposed "fixes" are nothing more than workarounds, and even particularly ugly ones. The worst thing about them is that there is no guarantee of migration finishing in a reasonable time, or at all.
If you account zero pages as full, you don't effectively use the bandwidth that was allotted to you; you use only 0.2% of it (8/4096). It then takes an exaggerated amount of time to start iterating on the pages that matter. If you set the bandwidth low instead, you do not have the bandwidth you need in order to converge.
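
The 8/4096 arithmetic can be spelled out as a small calculation. This is only an illustration: the function names are made up, and the 4 GiB guest and 32 MiB/s limit in the usage note are assumed numbers, not values from this thread.

```python
PAGE_SIZE = 4096           # bytes charged when a zero page is accounted as full
ZERO_PAGE_WIRE_BYTES = 8   # bytes actually sent for a zero page (8/4096 above)

def link_utilisation(zero_pages, data_pages):
    """Fraction of the charged bandwidth that really goes on the wire."""
    charged = (zero_pages + data_pages) * PAGE_SIZE
    actual = zero_pages * ZERO_PAGE_WIRE_BYTES + data_pages * PAGE_SIZE
    return actual / charged

def seconds_to_pass_zero_pages(zero_bytes, limit_bytes_per_s, charge_full):
    """Time the rate limiter makes us spend on a region made only of zero pages."""
    pages = zero_bytes // PAGE_SIZE
    per_page = PAGE_SIZE if charge_full else ZERO_PAGE_WIRE_BYTES
    return pages * per_page / limit_bytes_per_s
```

For an all-zero region, `link_utilisation(n, 0)` is 8/4096, roughly 0.2%. With the assumed 4 GiB of zero pages and a 32 MiB/s limit, charging full pages costs 128 seconds before the first useful page is reached, against 0.25 seconds when the real 8 bytes are charged.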
Even from an aesthetic point of view, if there is such a thing, I don't understand why you advocate conflating network bandwidth and CPU usage into a single measurement. Nobody disagrees that all you propose is nice to have, and that what Juan sent is a stopgap measure (though a very effective one). However, this doesn't negate that Juan's accounting patches make a lot of sense in the current design.

> In the long term, we need a new dirty bit interface from kvm.ko that uses a multi-level table. That should dramatically improve scan performance. We also need to implement live migration in a separate thread that doesn't carry qemu_mutex while it runs.

This may be a good way to fix it, but it's also basically a rewrite.

Paolo
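
For readers unfamiliar with the multi-level table idea mentioned above, the following is a purely illustrative userspace sketch of why it speeds up scanning. It says nothing about what a kvm.ko interface would look like; `TwoLevelDirtyMap`, the chunk size of 512 pages and the `chunks_visited` counter are all assumptions made for the example.

```python
class TwoLevelDirtyMap:
    """Dirty-page tracker with a summary level over fixed-size chunks."""

    def __init__(self, num_pages, chunk=512):
        self.num_pages = num_pages
        self.chunk = chunk
        nchunks = (num_pages + chunk - 1) // chunk
        self.summary = [False] * nchunks   # top level: one flag per chunk
        self.bits = [False] * num_pages    # bottom level: one flag per page
        self.chunks_visited = 0            # how many chunks a scan had to open

    def mark_dirty(self, page):
        self.bits[page] = True
        self.summary[page // self.chunk] = True

    def collect_dirty(self):
        """Return dirty pages in order and clear them, skipping clean chunks."""
        dirty = []
        for c, any_dirty in enumerate(self.summary):
            if not any_dirty:
                continue                   # whole chunk is clean, skip its pages
            self.chunks_visited += 1
            lo = c * self.chunk
            hi = min(lo + self.chunk, self.num_pages)
            for p in range(lo, hi):
                if self.bits[p]:
                    dirty.append(p)
                    self.bits[p] = False
            self.summary[c] = False
        return dirty
```

When few pages are dirty, a scan touches one summary flag per chunk plus the pages of the dirty chunks only, instead of one bit per guest page.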