From: Wei Chen
Subject: Re: [PATCH] hw/virtio/virtio-mem: Prohibit unplugging when size <= requested_size
Date: Tue, 26 Nov 2024 23:31:09 +0800
User-agent: Mozilla Thunderbird

> How can you be sure (IOW trigger) that the system will store
> "important data" like EPTs?

We cannot, but we have designed the attack (see below) to raise the
probability.

> So is one magic bit really that for your experiments, one needs a
> viommu?

Admittedly the way we accomplish a VM escape is a bit arcane.

We require device passthrough because it pins the VM's memory and
converts it to MIGRATE_UNMOVABLE. Hotplugged memory is also converted
to MIGRATE_UNMOVABLE. That way, when we give memory back to the
hypervisor, the pages stay UNMOVABLE. Otherwise we would have to convert
the pages to UNMOVABLE or exhaust ALL MIGRATE_MOVABLE pages, neither of
which can be easily accomplished.

Then we require vIOMMU because vIOMMU mappings, much like EPTEs, use
MIGRATE_UNMOVABLE pages as well. By spawning lots of meaningless vIOMMU
entries, we exhaust UNMOVABLE page blocks of lower orders (<9). Next
time KVM tries to allocate pages to store EPTEs, the kernel has to split
an order-9 page block, which is exactly the size of a 2MB sub-block.
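
To illustrate the splitting step described above, here is a toy buddy
allocator. It is only a sketch, not the Linux page allocator: it ignores
migrate types, fallback rules and buddy merging, and the names ToyBuddy,
block_bytes and SUB_BLOCK_PFN are made up for this example. It shows that
once the lower-order free lists are empty, an order-0 request is served
from the one free order-9 (2 MiB) block.

```python
PAGE_SIZE = 4096   # 4 KiB base page
MAX_ORDER = 9      # the toy model stops at order 9 (512 pages)


def block_bytes(order):
    """Size in bytes of a block of the given order."""
    return (1 << order) * PAGE_SIZE


class ToyBuddy:
    """free_lists[k] holds start page numbers of free 2**k-page blocks."""

    def __init__(self):
        self.free_lists = {order: [] for order in range(MAX_ORDER + 1)}

    def free_block(self, start_pfn, order):
        # No buddy merging; the split path is all this sketch needs.
        self.free_lists[order].append(start_pfn)

    def alloc(self, order):
        # Take the smallest free block that is large enough.
        for cur in range(order, MAX_ORDER + 1):
            if self.free_lists[cur]:
                start = self.free_lists[cur].pop()
                # Split down, putting each upper half back on a free list.
                while cur > order:
                    cur -= 1
                    self.free_block(start + (1 << cur), cur)
                return start
        raise MemoryError("no free block of sufficient order")


buddy = ToyBuddy()

# Orders 0..8 are exhausted (empty). The only free block is an order-9
# block standing in for a 2 MiB sub-block returned by the guest.
SUB_BLOCK_PFN = 0x40000  # hypothetical page frame number
buddy.free_block(SUB_BLOCK_PFN, 9)

# A single-page request (for example a page-table page) must split it.
page = buddy.alloc(0)
print(block_bytes(9) // (1024 * 1024), "MiB block split; got pfn", hex(page))
```

After the call, the allocated page lies inside the former sub-block and
the remaining 511 pages sit on the order-0 to order-8 free lists, ready to
serve the next small allocations.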

> Out of curiosity, are newer CPUs no longer affected?

When qemu pins down the VM's memory, it also establishes every possible
mapping to the VM's memory in the EPT.

To spawn new EPTEs, we exploit KVM's fix to the iTLB multihit bug.
Basically, we execute a bunch of no-op functions, and KVM will have to
split hugepages into 4KB pages. This process creates a large number of
EPTEs.

Roughly speaking, the iTLB multihit bug is only present on non-Atom
Intel CPUs manufactured before 2020.

> So it won't be sufficient to have a single sub-block plugged and then
> trigger VIRTIO_MEM_REQ_UNPLUG_ALL?

Could work in theory, but if the newly plugged sub-block does not
contain vulnerable pages, there is no promise that the attacker would
get a sub-block containing a different set of pages next time.

It also depends heavily on the configuration of the virtio-mem device.
If there is not much non-virtio-mem memory for the VM, the attacker
could easily run out of memory.


Best regards,
Wei Chen

On 2024/11/26 22:46, David Hildenbrand wrote:
On 26.11.24 15:20, Wei Chen wrote:
  > Please provide more information how this is supposed to work


Thanks for the information. A lot of what you wrote belongs in the patch description, especially that this might currently only be relevant with device passthrough + viommu.

We initially discovered that virtio-mem could be used by a malicious
agent to trigger the Rowhammer vulnerability and further achieve a VM
escape.

Simply speaking, Rowhammer is a DRAM vulnerability where frequent access
to a memory location might cause voltage leakage to adjacent locations,
effectively flipping bits in these locations. In other words, with
Rowhammer, an adversary can modify the data stored in the memory.

For a complete attack, an adversary needs to: a) determine which parts
of the memory are prone to bit flips, b) trick the system to store
important data on those parts of memory and c) trigger bit flips to
tamper important data.

Now, for an attacker who only has access to their VM but not to the
hypervisor, one important challenge among the three is b), i.e., to give
back the memory they determine as vulnerable to the hypervisor. This is
where the pitfall for virtio-mem lies: the attacker can modify the
virtio-mem driver in the VM's kernel and unplug memory proactively.

But b), as you write, is not only about giving back that memory to the hypervisor. How can you be sure (IOW trigger) that the system will store "important data" like EPTs?


The current impl of virtio-mem in qemu does not check if it is valid for
the VM to unplug memory. Therefore, as is proved by our experiments,
this method works in practice.
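
The check named in the patch subject can be modelled in a few lines.
This is a sketch of the policy only, not QEMU code: ToyVirtioMem, its
fields and the unplug() method are invented for illustration, and the
"ACK"/"NACK"/"ERROR" strings merely stand in for virtio-mem response
codes. The underflow guard is an addition of this sketch and is not taken
from the patch.

```python
from dataclasses import dataclass

BLOCK = 2 * 1024 * 1024  # 2 MiB sub-block, as discussed in this thread


@dataclass
class ToyVirtioMem:
    size: int            # bytes currently plugged
    requested_size: int  # bytes the hypervisor wants plugged

    def unplug(self, nb_bytes):
        """Handle a guest-initiated unplug request of nb_bytes."""
        # Guard added for this sketch: never unplug more than is plugged.
        if nb_bytes > self.size:
            return "ERROR"
        # Policy from the patch subject: prohibit unplugging when
        # size <= requested_size.
        if self.size <= self.requested_size:
            return "NACK"
        self.size -= nb_bytes
        return "ACK"


dev = ToyVirtioMem(size=4 * BLOCK, requested_size=4 * BLOCK)
print(dev.unplug(BLOCK))      # guest-initiated unplug is refused

dev.requested_size = 2 * BLOCK  # hypervisor asks the guest to shrink
print(dev.unplug(BLOCK))      # now allowed
```

With this policy the guest can only release memory while the hypervisor
has actually requested a smaller size.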

  > whether this is a purely theoretical case, and how relevant this is in
  > practice.

In our design, on a host machine equipped with certain Intel processors
and inside a VM that a) has a passed-through PCI device, b) has a vIOMMU
and c) has a virtio-mem device, an attacker can force the EPT to use
pages that are prone to Rowhammer bit flips and thus modify the EPT to
gain read and write privileges to an arbitrary memory location.

We conducted end-to-end attacks on two separate machines, with the
Core i3-10100 and the Xeon E2124 processors respectively, and achieved
successful VM escapes on both.

Out of curiosity, are newer CPUs no longer affected?


  > Further, what about virtio-balloon, which does not even support
  > rejecting requests?

virtio-balloon does not work with device passthrough currently, so we
have yet to produce a feasible attack with it.

So is one magic bit really that for your experiments, one needs a viommu?

The only mention of Rowhammer + memory ballooning I found is: https://www.whonix.org/pipermail/whonix-devel/2016-September/000746.html


  > I recall that that behavior was desired once the driver would support
  > de-fragmenting unplugged memory blocks.

By "that behavior" do you mean to unplug memory when size <=
requested_size? I am not sure how that is to be implemented.

To defragment, the idea was to unplug one additional block, so we can plug another block.


  > Note that VIRTIO_MEM_REQ_UNPLUG_ALL would still always be allowed

That is true, but the attacker will want the capability to release a
specific sub-block.

So it won't be sufficient to have a single sub-block plugged and then trigger VIRTIO_MEM_REQ_UNPLUG_ALL?


In fact, a sub-block is still somewhat coarse, because most likely there
is only one page in a sub-block that contains potential bit flips. When
the attacker spawns EPTEs, they have to spawn enough to make sure the
target page is used to store the EPTEs.

A 2MB sub-block can store 2MB/4KB*512=262,144 EPTEs, equating to at
least 1GB of memory. In other words, the attack program exhausts 1GB of
memory just for the possibility that KVM uses the target page to store
EPTEs.
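
The arithmetic above can be checked with a short script. It assumes
64-bit (8-byte) EPT entries and that every entry maps one 4 KiB guest
page; the variable names are chosen for this example only.

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3

SUB_BLOCK = 2 * MIB   # virtio-mem sub-block size used in this thread
PAGE = 4 * KIB        # base page size
EPTE_SIZE = 8         # bytes per EPT entry (assumption: 64-bit entries)

pages_per_sub_block = SUB_BLOCK // PAGE        # 512 pages
eptes_per_page = PAGE // EPTE_SIZE             # 512 entries per page
eptes_per_sub_block = pages_per_sub_block * eptes_per_page

# Each leaf EPTE maps one 4 KiB page, so filling the whole sub-block
# with EPTEs corresponds to this much mapped guest memory:
mapped_bytes = eptes_per_sub_block * PAGE

print(eptes_per_sub_block)       # 262144
print(mapped_bytes // GIB)       # 1
```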

Ah, that makes sense.

Can you compress what you wrote into the patch description? Further, I assume we want to add a Fixes: tag and Cc: QEMU Stable <qemu-stable@nongnu.org>

Thanks!



