From: Peter Lieven
Subject: Re: [Qemu-block] [Qemu-devel] Migration sometimes fails with IDE and Qemu 2.2.1
Date: Thu, 09 Apr 2015 15:32:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 09.04.2015 at 14:49, Peter Lieven wrote:
> On 07.04.2015 at 21:01, Dr. David Alan Gilbert wrote:
> * Peter Lieven (address@hidden) wrote:
> On 07.04.2015 at 17:29, Dr. David Alan Gilbert wrote:
> * Peter Lieven (address@hidden) wrote:
>
> Hi David,
>
> On 07.04.2015 at 10:43, Dr. David Alan Gilbert wrote:
>
> Any particular workload or reproducer?
>
> Workload is almost zero. I am trying to figure out whether there is a way to trigger it. Possibly relevant: the machine type is -M pc-1.2 and we set -kvmclock as a CPU flag, since kvmclock seemed to be quite buggy in 2.6.16...
>
> Exact cmdline is:
>
> /usr/bin/qemu-2.2.1 -enable-kvm -M pc-1.2 -nodefaults -netdev type=tap,id=guest2,script=no,downscript=no,ifname=tap2 -device e1000,netdev=guest2,mac=52:54:00:ff:00:65 -drive format=raw,file=iscsi://172.21.200.53/iqn.2001-05.com.equallogic:4-52aed6-88a7e99a4-d9e00040fdc509a3-XXX-hd0/0,if=ide,cache=writeback,aio=native -serial null -parallel null -m 1024 -smp 2,sockets=1,cores=2,threads=1 -monitor tcp:0:4003,server,nowait -vnc :3 -qmp tcp:0:3003,server,nowait -name 'XXX' -boot order=c,once=dc,menu=off -drive index=2,media=cdrom,if=ide,cache=unsafe,aio=native,readonly=on -k de -incoming tcp:0:5003 -pidfile /var/run/qemu/vm-146.pid -mem-path /hugepages -mem-prealloc -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus -cpu qemu64,-kvmclock
>
> Exact kernel is: 2.6.16.46-0.12-smp (I think this is SLES10 or something similar).
>
> The machine does not hang. It seems that just I/O is hanging. So you can type at the console or ping the system, but you can no longer log in.
>
> Thank you,
> Peter
>
> Interesting observation: Migrating the vServer again seems to fix the problem (at least in the one case I could test just now). 2.6.8-24-smp is also affected.
>
> How often does it fail - you say 'sometimes' - is it 1/10 or 1/1000?
>
> It's more often than 1/10, I would say.
>
> OK, that's not too bad - it's the 1/1000 ones that are really nasty to find. In your setup, how easy would it be for you to try:
>   with either 2.1 or current head?
>   with a newer machine-type?
>   without the cdrom?
>
> It's all possible. I can clone the system and try everything on my test systems. I hope it reproduces there.
>
> Great. I think the order I would go would be:
>   Try head - if it works we know we've already got the fix somewhere
>   Try 2.1 - if it works we know it's something we introduced between 2.1 and 2.2.1
>   Try a newer machine type - because pc-1.2 probably isn't tested much
>   CDROM at the end.
>
> Update:
> - head -> not working
> - 2.1.3 -> not working
> - without CDROM -> not working
> - with head and no machine type specified -> not working
> - with -device isa-ide -> BIOS not booting harddisk
>
> Will now try 1.3.1 just to be sure.

- 1.3.1 => not working
- kernel parameter ide=nodma (ide-core.nodma is not supported by this kernel) => not working

In my setup I usually hit the failure at around 50-80 migrations. In production it seems to happen more often. Maybe it is load-dependent.

Peter
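
Since the failure only shows up after roughly 50-80 migrations in the test setup, a handful of clean migrations says little about whether a given QEMU version or option is actually unaffected. The sketch below is a rough estimate only, not something taken from the thread: it assumes every migration fails independently with the same probability p = 1 / (mean migrations until failure), and the function name `clean_runs_needed` and the 95% confidence level are illustrative choices. Under those assumptions it computes how many consecutive clean migrations are needed before a configuration can reasonably be called "working".

```python
import math


def clean_runs_needed(mean_migrations_to_failure: float,
                      confidence: float = 0.95) -> int:
    """Smallest n such that a still-broken setup would survive n
    migrations in a row with probability at most (1 - confidence).

    Assumption (not from the thread): each migration fails independently
    with probability p = 1 / mean_migrations_to_failure.
    """
    if mean_migrations_to_failure <= 1:
        raise ValueError("mean_migrations_to_failure must be > 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    p_fail = 1.0 / mean_migrations_to_failure
    # Solve (1 - p_fail) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    # 50 and 80 are the bounds reported for the test setup; 65 is the midpoint.
    for mean in (50, 65, 80):
        print(f"mean {mean} migrations to failure -> "
              f"{clean_runs_needed(mean)} clean migrations needed")
```

If the failure rate really is load-dependent, as suspected for production, the independence assumption does not hold and the numbers should be treated as a lower bound at best.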