Re: [Qemu-devel] strange crash in tracked_request_begin
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] strange crash in tracked_request_begin
Date: Tue, 8 Mar 2016 10:06:12 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

On Mon, Mar 07, 2016 at 08:00:49PM +0100, Christian Borntraeger wrote:
> On 03/07/2016 06:01 PM, Stefan Hajnoczi wrote:
> > On Mon, Mar 07, 2016 at 01:29:08PM +0100, Christian Borntraeger wrote:
> >> Folks,
> >>
> >> I had a crash of a qemu guest in tracked_request_begin.
> >> The testcase was a guest with ramdisk/kernel that reboots in a
> >> loop (about 10 times per second) with a single null-co disk
> >> attached. No idea how to reproduce this; it seems to be a lucky hit.
> >>
> >> (gdb) bt
> >> #0 0x00000000101db5ba in tracked_request_begin (address@hidden,
> >> address@hidden, address@hidden, address@hidden, address@hidden)
> >> at /home/cborntra/REPOS/qemu/block/io.c:390
> >> #1 0x00000000101de91e in bdrv_co_do_preadv (bs=0x42a39190, offset=0,
> >> bytes=4096, qiov=0x3ff7400cbd8, flags=<optimized out>,
> >> address@hidden(unknown: 0))
> >> at /home/cborntra/REPOS/qemu/block/io.c:1001
> >> #2 0x00000000101dfc3e in bdrv_co_do_readv (flags=(unknown: 0),
> >> qiov=<optimized out>, nb_sectors=<optimized out>, sector_num=<optimized
> >> out>, bs=<optimized out>)
> >> at /home/cborntra/REPOS/qemu/block/io.c:1024
> >> #3 bdrv_co_do_rw (opaque=0x3ff7400e370) at
> >> /home/cborntra/REPOS/qemu/block/io.c:2173
> >> #4 0x000000001022d8f6 in coroutine_trampoline (i0=<optimized out>,
> >> i1=-1946150928) at /home/cborntra/REPOS/qemu/util/coroutine-ucontext.c:79
> >> #5 0x000003ff95ed150a in __makecontext_ret () from /lib64/libc.so.6
> >>
> >> looking at the code we are at
> >>
> >> QLIST_INSERT_HEAD(&bs->tracked_requests, req, list);
> >> which translates to
> >>
> >> if (((req)->list.le_next = (&bs->tracked_requests)->lh_first) != NULL)
> >> (&bs->tracked_requests)->lh_first->list.le_prev = &(req)->list.le_next;
> >> (&bs->tracked_requests)->lh_first = (req);
> >> (req)->list.le_prev = &(&bs->tracked_requests)->lh_first;
> >>
> >> gdb says that (&bs->tracked_requests)->lh_first is zero in the core file:
> >> (gdb) print /x bs->tracked_requests
> >> $6 = {lh_first = 0x0}
> >>
> >> Now, looking at the code, I am asking myself whether this can happen in
> >> parallel with other code that touches tracked_requests, because gcc seems
> >> to read (&bs->tracked_requests)->lh_first twice (first to check the value,
> >> then to use it as a pointer).
> >
> > tracked_requests is protected by AioContext. Perhaps something is doing
> > I/O without acquiring AioContext?
>
> Hmm, the guest was rebooting, which resets all devices. Maybe something
> in that code is still not right? I will have a look.
virtio_blk_reset() does acquire the AioContext, so at least that part should
be safe while IOThreads are running.
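
For reference, here is a minimal sketch of the head insertion that the quoted
macro expansion performs, written as plain functions. The names struct req,
struct req_list, req_list_insert_head() and req_list_remove() are made up for
illustration and are not QEMU code; only the lh_first/le_next/le_prev fields
mirror the expansion quoted above. The sketch loads the list head once into a
local, but that does not make it safe against concurrent modification: a
second thread could still change the list between the load and the stores, so
the caller is assumed to serialize access (in QEMU's case, by holding the
AioContext).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracked request; only the list link matters. */
struct req {
    int id;
    struct req *le_next;   /* next element, or NULL at the tail */
    struct req **le_prev;  /* address of the pointer that points at us */
};

/* Hypothetical stand-in for the list head (bs->tracked_requests). */
struct req_list {
    struct req *lh_first;
};

/*
 * Same four steps as the quoted expansion of QLIST_INSERT_HEAD.
 * Assumption: the caller already holds whatever lock protects the list.
 */
static void req_list_insert_head(struct req_list *head, struct req *r)
{
    struct req *first = head->lh_first;  /* one load of the head */

    r->le_next = first;
    if (first != NULL) {
        /* The old first element is now reached through r->le_next. */
        first->le_prev = &r->le_next;
    }
    head->lh_first = r;
    r->le_prev = &head->lh_first;
}

/* Unlink an element; again assumes the caller serializes list access. */
static void req_list_remove(struct req *r)
{
    if (r->le_next != NULL) {
        r->le_next->le_prev = r->le_prev;
    }
    *r->le_prev = r->le_next;
}
```

If another thread emptied the list between the NULL check and the store
through lh_first, the macro form would dereference a NULL head, which would
match lh_first being 0x0 in the core file. That is only one possible
explanation, not a confirmed diagnosis.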
Stefan