Re: dma_blk helpers and infinite dma_memory_map retries
From: John Snow
Subject: Re: dma_blk helpers and infinite dma_memory_map retries
Date: Thu, 30 Jul 2020 14:26:42 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
On 7/29/20 7:43 AM, Paolo Bonzini wrote:
> This is the question: what is special about this address_space_map, that
> makes it never succeed? The usual case where address_space_map fails is
> simply two or more concurrent users of the bounce buffer, and these
> solve by themselves when address_space_unmap calls
> cpu_notify_map_clients. Do we never call address_space_unmap?
TLDR: the dma-helpers handle alignment ... poorly, and it's not clear
what the right way to handle this might be for a few reasons.
Read on for my usual thinking-out-loud analysis tracing this problem,
though I do have a question for Kevin/Paolo at the end. (See the
numbered list 1-4 at the bottom.)
First, the (partially bogus, fuzzer-generated) IDE command:
1. Requests a DMA write of 259 sectors starting at sector 1.
2. Provides a PRDT at addr 0x00 whose first PRD entry describes a data
buffer at 0xffffffff of length 0x10000. [a]
3. The remaining 8,191 PRD entries are uninitialized memory that all
wind up describing the same data buffer at 0x00 of length 0x10000.
Generally, the entire PRDT is going to be read, but truncated into an
SGList that's exactly as long as the IDE command. Here, that's 0x20600
bytes.
Yadda, yadda, yadda, that winds up turning into these map requests:
addr 0xffffffff; len 0x10000
-- mapped length: 0x01 (normal map return)
addr 0x100000000; len 0xffff
-- mapped length: 0x1000 (bounce buffer return)
addr 0x100001000; len 0xefff
-- bounce buffer is busy, cannot map
Then it proceeds and calls the iofunc. We return to dma_blk_cb and then:
unmap 0xffffffff; len 0x01; access_len 0x01;
... That unmaps the "direct" one, but we seemingly fail to unmap the
indirect one.
Uh, cool. When we build the IOV, we build it with two entries; but
qemu_iovec_discard_back discards the second entry entirely without
unmapping it.
IDE asks for an alignment of BDRV_SECTOR_SIZE (512 bytes). The IDE state
machine transfers an entire sector or nothing at all. The total IOV size
we have built thus far is 0x1001 bytes, which, as you might have
noticed, is not sector-aligned.
So, we try to align it:
qemu_iovec_discard_back(&dbs->iov, QEMU_ALIGN_DOWN(4097, 512))
... I think we probably wanted to ask to shave off one byte instead of
asking to shave off 4096 bytes.
So, a few perceived problems with dma_blk_cb:
1. Our alignment math is wrong. discard_back takes as an argument the
number of bytes to discard, not the number of bytes you want to have
afterwards.
2. qemu_iovec_discard_back will happily unwind entire IO vectors that we
would need to unmap and have now lost. Worse, whenever we do any kind of
truncating at all, those bytes are not re-added to the source SGList, so
subsequent transfers will have skipped some bytes in the guest SGList.
3. The dma_blk_io interfaces never check whether the SGList is an even
multiple of the alignment. They don't return a synchronous error, and no
callers check for an error case. (Though BMDMA does carefully prepare
the SGList such that it is aligned in this way. AHCI does too, IIRC.)
This means we might have an unaligned tail that we will just drop or
ignore, leading to another partial DMA.
4. There's no guarantee that any given individual IO vector will have an
entire sector's worth of data in it. It is theoretically valid to
describe a series of vectors of two bytes each. If we can only map 1-2
vectors at a time, depending, we're never going to be able to scrounge
up enough buffer real estate to transfer an entire sector.
Hm, Ow.
Before I go on some quest to ascend to a higher plane of being, I wanted
to solicit feedback on the problem as described thus far and see if
anyone has a suggestion for a fix that's not too involved.
--js
[a] This is against the BMDMA spec. The address must be aligned to 0x02
and cannot cross a 64K boundary. bit0 is documented as always being
zero, but it's not clear what should happen if the boundary constraint
is violated. Absent other concerns, it might just be easiest to fulfill
the transfer if it's possible.