qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 1/5] virtio-mem: Probe THP size to determine default block


From: David Hildenbrand
Subject: Re: [PATCH v1 1/5] virtio-mem: Probe THP size to determine default block size
Date: Mon, 28 Sep 2020 11:36:21 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0

On 28.09.20 11:31, Pankaj Gupta wrote:
>>>> Let's allow a minimum block size of 1 MiB in all configurations. Use
>>>> a default block size based on the THP size, and warn if something
>>>> smaller is configured by the user.
>>>>
>>>> VIRTIO_MEM only supports Linux (depends on LINUX), so we can probe the
>>>> THP size unconditionally.
>>>>
>>>> For now we only support virtio-mem on x86-64 - there isn't a user-visiable
>>>> change (x86-64 only supports 2 MiB THP on the PMD level) - the default
>>>> was, and will be 2 MiB.
>>>>
>>>> If we ever have THP on the PUD level (e.g., 1 GiB THP on x86-64), we
>>>> expect to have a trigger to explicitly opt-in for the new THP granularity.
>>>>
>>>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>>>> Cc: Wei Yang <richardw.yang@linux.intel.com>
>>>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>> Cc: Igor Mammedov <imammedo@redhat.com>
>>>> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>  hw/virtio/virtio-mem.c | 82 +++++++++++++++++++++++++++++++++++++++---
>>>>  1 file changed, 78 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
>>>> index 8fbec77ccc..58098686ee 100644
>>>> --- a/hw/virtio/virtio-mem.c
>>>> +++ b/hw/virtio/virtio-mem.c
>>>> @@ -33,10 +33,70 @@
>>>>  #include "trace.h"
>>>>
>>>>  /*
>>>> - * Use QEMU_VMALLOC_ALIGN, so no THP will have to be split when unplugging
>>>> - * memory (e.g., 2MB on x86_64).
>>>> + * Let's not allow blocks smaller than 1 MiB, for example, to keep the 
>>>> tracking
>>>> + * bitmap small.
>>>>   */
>>>> -#define VIRTIO_MEM_MIN_BLOCK_SIZE ((uint32_t)QEMU_VMALLOC_ALIGN)
>>>> +#define VIRTIO_MEM_MIN_BLOCK_SIZE ((uint32_t)(1 * MiB))
>>>> +
>>>> +/*
>>>> + * We want to have a reasonable default block size such that
>>>> + * 1. We avoid splitting THPs when unplugging memory, which degrades
>>>> + *    performance.
>>>> + * 2. We avoid placing THPs for plugged blocks that also cover unplugged
>>>> + *    blocks.
>>>> + *
>>>> + * The actual THP size might differ between Linux kernels, so we try to 
>>>> probe
>>>> + * it. In the future (if we ever run into issues regarding 2.), we might 
>>>> want
>>>> + * to disable THP in case we fail to properly probe the THP size, or if 
>>>> the
>>>> + * block size is configured smaller than the THP size.
>>>> + */
>>>> +static uint32_t default_block_size;
>>>> +
>>>> +#define HPAGE_PMD_SIZE_PATH 
>>>> "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"
>>>> +static uint32_t virtio_mem_default_block_size(void)
>>>> +{
>>>> +    gchar *content = NULL;
>>>> +    const char *endptr;
>>>> +    uint64_t tmp;
>>>> +
>>>> +    if (default_block_size) {
>>>> +        return default_block_size;
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * Try to probe the actual THP size, fallback to (sane but eventually
>>>> +     * incorrect) default sizes.
>>>> +     */
>>>> +    if (g_file_get_contents(HPAGE_PMD_SIZE_PATH, &content, NULL, NULL) &&
>>>> +        !qemu_strtou64(content, &endptr, 0, &tmp) &&
>>>> +        (!endptr || *endptr == '\n')) {
>>>> +        /*
>>>> +         * Sanity-check the value, if it's too big (e.g., aarch64 with 
>>>> 64k base
>>>> +         * pages) or weird, fallback to something smaller.
>>>> +         */
>>>> +        if (!tmp || !is_power_of_2(tmp) || tmp > 16 * MiB) {
>>>> +            warn_report("Detected a THP size of %" PRIx64
>>>> +                        " MiB, falling back to 1 MiB.", tmp / MiB);
>>>> +            default_block_size = 1 * MiB;
>>>
>>> Probably use macro "VIRTIO_MEM_MIN_BLOCK_SIZE"
>>
>> Indeed.
>>
>>>> +        } else {
>>>> +            default_block_size = tmp;
>>>> +        }
>>>> +    } else {
>>>> +#if defined(__x86_64__) || defined(__arm__) || defined(__aarch64__) || \
>>>> +    defined(__powerpc64__)
>>>> +        default_block_size = 2 * MiB;
>>>> +#else
>>>> +        /* fallback to 1 MiB (e.g., the THP size on s390x) */
>>>> +        default_block_size = 1 * MiB;
>>>> +#endif
>>>
>>> Maybe we can declare this macro near to "VIRTIO_MEM_MIN_BLOCK_SIZE
>>> ((uint32_t)(1 * MiB))"
>>> or club into one, just a thought.
>>
>> I decided to not use VIRTIO_MEM_MIN_BLOCK_SIZE here to not accidentally
>> mess up the s390x THP size when ever wanting to decrease
>> VIRTIO_MEM_MIN_BLOCK_SIZE. But as we have a comment here, people should
>> know whats happening when ever changing VIRTIO_MEM_MIN_BLOCK_SIZE.
> 
> Thanks for answering. Makes sense.
> 
> Overall the patch looks good to me.
> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
> 

I'll do some tweaks to the patch to make something weird (but possible)
be handled correctly (and make it overall look nicer):

It's possible to have a THP size that's bigger than a hugetlbfs page
size. In that case, we want to use the hugetlbfs page size instead (and
don't want to warn).

(I'll drop the RB for now, because it would be good if you gave it
another look - thanks!)

-- 
Thanks,

David / dhildenb




reply via email to

[Prev in Thread] Current Thread [Next in Thread]