All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gavin Shan <gshan@redhat.com>
To: David Hildenbrand <david@redhat.com>, qemu-arm@nongnu.org
Cc: peter.maydell@linaro.org, drjones@redhat.com,
	Michal Privoznik <mprivozn@redhat.com>,
	richard.henderson@linaro.org, qemu-devel@nongnu.org,
	eric.auger@redhat.com, shan.gavin@gmail.com,
	Jonathan.Cameron@huawei.com, imammedo@redhat.com
Subject: Re: [PATCH 1/1] hw/arm/virt: Support for virtio-mem-pci
Date: Fri, 3 Dec 2021 14:42:38 +1100	[thread overview]
Message-ID: <18aa700f-e540-50f5-423a-717ef39d52d3@redhat.com> (raw)
In-Reply-To: <1ae3f686-aaa9-4402-433e-325f422275df@redhat.com>

On 12/1/21 8:03 PM, David Hildenbrand wrote:
>>>>
>>>>      * It has been passing the tests with various combinations like 64KB
>>>>        and 4KB page sizes on host and guest, different memory device
>>>>        backends like normal, transparent huge page and HugeTLB, plus
>>>>        migration.
>>>
>>> Perfect. A note that hugetlbfs isn't fully supported/safe to use until
>>> we have preallocation support in QEMU (WIP).
>>>
>>
>> Yes, there is some warnings raised to enlarge 'request-size' on
>> host with 64KB page size. Note that the memory backends I used
>> in the testings always have "prealloc=on" property though.
> 
> 1. prealloc=on
> 
> "prealloc=on" on the memory backend won't get the job done, because the first
> thing virtio-mem does is discard all memory in the memory backend again when
> it initializes. So it's an expensive NOP :) See
> 
> https://lkml.kernel.org/r/20211130104136.40927-9-david@redhat.com
> 
> for the virtio-mem "prealloc=on" option that preallocates memory when
> exposing that memory to the VM.
> 
> To use huge pages in a safe way with virtio-mem, we need "reserve=off" on the
> memory backend and "prealloc=on" on the virtio-mem device. (I'm in the process
> of documenting that on virtio-mem.gitlab.io/ to make it clearer)
> 

David, I will reply in a different thread for discussion purpose only. I
need some time for the investigation :)

> 
> 2. Warning on arm64 with 64k
> 
> I assume the warning you're seeing is regarding the block-size:
> 
> "Read unsupported THP size: ..." followed by
> "Could not detect THP size, falling back to ..."
> 
> The right thing to do for now should be to remove that sanity check:
> 
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index d5a578142b..33c32afeb1 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -78,11 +78,8 @@ static uint32_t virtio_mem_thp_size(void)
>       if (g_file_get_contents(HPAGE_PMD_SIZE_PATH, &content, NULL, NULL) &&
>           !qemu_strtou64(content, &endptr, 0, &tmp) &&
>           (!endptr || *endptr == '\n')) {
> -        /*
> -         * Sanity-check the value, if it's too big (e.g., aarch64 with 64k base
> -         * pages) or weird, fallback to something smaller.
> -         */
> -        if (!tmp || !is_power_of_2(tmp) || tmp > 16 * MiB) {
> +        /* Sanity-check the value and fallback to something reasonable.  */
> +        if (!tmp || !is_power_of_2(tmp)) {
>               warn_report("Read unsupported THP size: %" PRIx64, tmp);
>           } else {
>               thp_size = tmp;
> 
> This will not affect setups we care about ( x86-64 KVM ).
> 
> It will imply that with a arm64 64k host, we can only hot(un)plug in
> 512 MiB granularity when not using hugetlb witht he default block-size.
> However, that is already the case with arm64 64k guests as well.
> The suggestion will be to use arm64 4k with virtio-mem in the host and
> the guest for increased flexibility -- fortunately most distros
> already have performed the switch to 4k on arm64, so we don't really
> care IMHO.
> 
> To support block_size < THP size when not using hugetlb,
> we'll have to disable THP (via MADV_NOHUGEPAGE) permanently for the memory
> region, also making sure that e.g., postcopy won't re-enable it by adding
> a proper flag (RAM_NOHUGEPAGE) to the RAMBlock. Because the issue is that
> once the guest touches some memory, we might populate a THP that would cover
> plugged and unplugged memory, which is bad.
> 
> Instead of warning in virtio_mem_device_realize() when
> 	vmem->block_size < virtio_mem_default_block_size(rb)
> we'd have to disable THP.
> 
> 
> Further, we should fixup the default THP size on arm64 in case we're
> running on an older kernel where we cannot sense the THP size:
> 
> 
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index d5a578142b..371cee380a 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -38,13 +38,21 @@
>    */
>   #define VIRTIO_MEM_MIN_BLOCK_SIZE ((uint32_t)(1 * MiB))
>   
> +static uint32_t virtio_mem_default_thp_size(void)
> +{
> +#if defined(__aarch64__)
> +    if (qemu_real_host_page_size == 64 * KiB) {
> +        return 512 * MiB;
> +    }
> +#endif
>   #if defined(__x86_64__) || defined(__arm__) || defined(__aarch64__) || \
>       defined(__powerpc64__)
> -#define VIRTIO_MEM_DEFAULT_THP_SIZE ((uint32_t)(2 * MiB))
> +    return 2 * MiB;
>   #else
> -        /* fallback to 1 MiB (e.g., the THP size on s390x) */
> -#define VIRTIO_MEM_DEFAULT_THP_SIZE VIRTIO_MEM_MIN_BLOCK_SIZE
> +    /* fallback to 1 MiB (e.g., the THP size on s390x) */
> +    return VIRTIO_MEM_MIN_BLOCK_SIZE;
>   #endif
> +}
>   
>   /*
>    * We want to have a reasonable default block size such that
> @@ -90,7 +98,7 @@ static uint32_t virtio_mem_thp_size(void)
>       }
>   
>       if (!thp_size) {
> -        thp_size = VIRTIO_MEM_DEFAULT_THP_SIZE;
> +        thp_size = virtio_mem_default_thp_size();
>           warn_report("Could not detect THP size, falling back to %" PRIx64
>                       "  MiB.", thp_size / MiB);
>       }
> 
> 
> 
> In the context of proper arm64 support, adding the above two changes
> should be good enough. If you agree, can you include them in your v2
> series as a separate patch?
> 
> Supporting block_size < thp_size when not using hugetlb is a different
> work IMHO, if someone ever cares about that.
> 
> 

Yeah, It's exactly the warnings I observed and I agree on the changes
except the 16KB-base-page-size case is missed. I've included all the
changes into separate patch in v2, which was just posted.

Thanks,
Gavin



      reply	other threads:[~2021-12-03  3:44 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-30  0:33 [PATCH 0/1] hw/arm/virt: Support for virtio-mem-pci Gavin Shan
2021-11-30  0:33 ` [PATCH 1/1] " Gavin Shan
2021-11-30  9:37   ` David Hildenbrand
2021-12-01  5:09     ` Gavin Shan
2021-12-01  9:03       ` David Hildenbrand
2021-12-03  3:42         ` Gavin Shan [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=18aa700f-e540-50f5-423a-717ef39d52d3@redhat.com \
    --to=gshan@redhat.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=david@redhat.com \
    --cc=drjones@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=imammedo@redhat.com \
    --cc=mprivozn@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=shan.gavin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.