qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v7 09/10] i386/pc: relocate 4g start to 1T where applicable


From: Igor Mammedov
Subject: Re: [PATCH v7 09/10] i386/pc: relocate 4g start to 1T where applicable
Date: Fri, 15 Jul 2022 13:57:21 +0200

On Thu, 14 Jul 2022 19:28:19 +0100
Joao Martins <joao.m.martins@oracle.com> wrote:

> It is assumed that the whole GPA space is available to be DMA
> addressable, within a given address space limit, except for a
> tiny region before the 4G. Since Linux v5.4, VFIO validates
> whether the selected GPA is indeed valid i.e. not reserved by
> IOMMU on behalf of some specific devices or platform-defined
> restrictions, and thus failing the ioctl(VFIO_DMA_MAP) with
>  -EINVAL.
> 
> AMD systems with an IOMMU are examples of such platforms and
> particularly may only have these ranges as allowed:
> 
>       0000000000000000 - 00000000fedfffff (0      .. 3.982G)
>       00000000fef00000 - 000000fcffffffff (3.983G .. 1011.9G)
>       0000010000000000 - ffffffffffffffff (1Tb    .. 16Pb[*])
> 
> We already account for the 4G hole, albeit if the guest is big
> enough we will fail to allocate a guest with  >1010G due to the
> ~12G hole at the 1Tb boundary, reserved for HyperTransport (HT).
> 
> [*] there is another reserved region unrelated to HT that exists
> in the 256T boundary in Fam 17h according to Errata #1286,
> documeted also in "Open-Source Register Reference for AMD Family
> 17h Processors (PUB)"
> 
> When creating the region above 4G, take into account that on AMD
> platforms the HyperTransport range is reserved and hence it
> cannot be used either as GPAs. On those cases rather than
> establishing the start of ram-above-4g to be 4G, relocate instead
> to 1Tb. See AMD IOMMU spec, section 2.1.2 "IOMMU Logical
> Topology", for more information on the underlying restriction of
> IOVAs.
> 
> After accounting for the 1Tb hole on AMD hosts, mtree should
> look like:
> 
> 0000000000000000-000000007fffffff (prio 0, i/o):
>        alias ram-below-4g @pc.ram 0000000000000000-000000007fffffff
> 0000010000000000-000001ff7fffffff (prio 0, i/o):
>       alias ram-above-4g @pc.ram 0000000080000000-000000ffffffffff
> 
> If the relocation is done or the address space covers it, we
> also add the the reserved HT e820 range as reserved.
> 
> Default phys-bits on Qemu is TCG_PHYS_ADDR_BITS (40) which is enough
> to address 1Tb (0xff ffff ffff). On AMD platforms, if a
> ram-above-4g relocation may be desired and the CPU wasn't configured
> with a big enough phys-bits, print an error message to the user
> and do not make the relocation of the above-4g-region if phys-bits
> is too low.
> 
> Suggested-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
>  hw/i386/pc.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 82 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index cda435e3baeb..17613974163e 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -880,6 +880,52 @@ static uint64_t pc_get_cxl_range_end(PCMachineState 
> *pcms)
>      return start;
>  }
>  
> +static hwaddr pc_max_used_gpa(PCMachineState *pcms, uint64_t pci_hole64_size)
> +{
> +    X86CPU *cpu = X86_CPU(first_cpu);
> +
> +    /* 32-bit systems don't have hole64 thus return max CPU address */
> +    if (cpu->phys_bits <= 32) {
> +        return ((hwaddr)1 << cpu->phys_bits) - 1;
> +    }
> +
> +    return pc_pci_hole64_start() + pci_hole64_size - 1;
> +}
> +
[...]

> +
> +    /*
> +     * Relocating ram-above-4G requires more than TCG_PHYS_ADDR_BITS (40).
> +     * So make sure phys-bits is required to be appropriately sized in order
> +     * to proceed with the above-4g-region relocation and thus boot.

drop mention of relocation here as it's orthogonal to the check.
Important thing we are checking here is that max used GPA is
reachable by configured vCPU (physbits).

> +     */
> +    maxusedaddr = pc_max_used_gpa(pcms, pci_hole64_size);
> +    maxphysaddr = ((hwaddr)1 << cpu->phys_bits) - 1;
> +    if (maxphysaddr < maxusedaddr) {
> +        error_report("Address space limit 0x%"PRIx64" < 0x%"PRIx64
> +                     " phys-bits too low (%u)",
> +                     maxphysaddr, maxusedaddr, cpu->phys_bits);
> +        exit(EXIT_FAILURE);
> +    }

these hunks should be a separate patch preceding relocation patch
as it basically does max_gpa vs physbits check regardless
of relocation (i.e. relocation is only one of the reasons
max_used_gpa might exceed physbits).

> +
>      /*
>       * Split single memory region and use aliases to address portions of it,
>       * done for backwards compatibility with older qemus.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]