All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/5] mm,memory_hotplug: Allocate memmap from the added memory range
Date: Thu, 18 Mar 2021 11:38:48 +0100	[thread overview]
Message-ID: <YFMtuKZ8Ho66D8hN@localhost.localdomain> (raw)
In-Reply-To: <YFMPBFSJPq2VEOk9@localhost.localdomain>

On Thu, Mar 18, 2021 at 09:27:48AM +0100, Oscar Salvador wrote:
> > If we check for
> > 
> > IS_ALIGNED(nr_vmemmap_pages, PMD_SIZE), please add a proper TODO comment
> > that this is most probably the wrong place to take care of this.
> 
> Sure, I will stuff the check in there and place a big TODO comment so we
> do not forget about addressing this issue the right way.

Ok, I realized something while working on v5.

Here is what I have right now:

 bool mhp_supports_memmap_on_memory(unsigned long size)
 {
        /*
         * Note: We calculate for a single memory section. The calculation
         * implicitly covers memory blocks that span multiple sections.
         *
         * Not all archs define SECTION_SIZE, but MIN_MEMORY_BLOCK_SIZE always
         * equals SECTION_SIZE, so use that instead.
         */
        unsigned long nr_vmemmap_pages = MIN_MEMORY_BLOCK_SIZE / PAGE_SIZE;
        unsigned long vmemmap_size = nr_vmemmap_pages * sizeof(struct page);
        unsigned long remaining_size = size - vmemmap_size;
 
        /*
         * Besides having arch support and the feature enabled at runtime, we
         * need a few more assumptions to hold true:
         *
         * a) We span a single memory block: memory onlining/offlinin;g happens
         *    in memory block granularity. We don't want the vmemmap of online
         *    memory blocks to reside on offline memory blocks. In the future,
         *    we might want to support variable-sized memory blocks to make the
         *    feature more versatile.
         *
         * b) The vmemmap pages span complete PMDs: We don't want vmemmap code
         *    to populate memory from the altmap for unrelated parts (i.e.,
         *    other memory blocks)
         *
         * c) The vmemmap pages (and thereby the pages that will be exposed to
         *    the buddy) have to cover full pageblocks: memory onlining/offlining
         *    code requires applicable ranges to be page-aligned, for example, to
         *    set the migratetypes properly.
         *
         * TODO: Although we have a check here to make sure that vmemmap pages
         *       fully populate a PMD, it is not the right place to check for
         *       this. A much better solution involves improving vmemmap code
         *       to fallback to base pages when trying to populate vmemmap using
         *       altmap as an alternative source of memory, and we do not exactly
         *       populate a single PMD.
         */
        return memmap_on_memory &&
               IS_ENABLED(CONFIG_MHP_MEMMAP_ON_MEMORY) &&
               size == memory_block_size_bytes() &&
               remaining_size &&
               IS_ALIGNED(remaining_size, pageblock_size) &&
               IS_ALIGNED(vmemmap_size, PMD_SIZE);
 }

 Assume we are on x86_64 to simplify the case.

 Above, nr_vmemmap_pages would be 32768 and vmemmap_size 2MB (exactly a
 PMD).

 Now, although correct, this nr_vmemmap_pages does not match with the
 altmap->alloc.

 static void * __meminit altmap_alloc_block_buf(unsigned long size,
  struct altmap)
 {
   ...
   ...
   nr_pfns = size >> PAGE_SHIFT; //size is PMD_SIZE
   altmap->alloc += nr_pfns;
 }

 altmap->alloc will be 512, 512 * 4K pages = 2MB.

Of course, the reason they do not match is because in one case, we are
saying a) how many pfns we need to cover a PMD_SIZE, while in the
other case we say b) how many pages we need to cover SECTION_SIZE

Then b) multiply for page_size to get the current size of it.

So, I have mixed feeling about this.
Would it be more clear to just do:

 bool mhp_supports_memmap_on_memory(unsigned long size)
 {
        /*
         * Note: We calculate for a single memory section. The calculation
         * implicitly covers memory blocks that span multiple sections.
         */
        unsigned long nr_vmemmap_pages = PMD_SIZE / PAGE_SIZE;
        unsigned long vmemmap_size = nr_vmemmap_pages * PAGE_SIZE;
        unsigned long remaining_size = size - vmemmap_size;
	...
	...


-- 
Oscar Salvador
SUSE L3

  reply	other threads:[~2021-03-18 10:39 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-09 17:55 [PATCH v4 0/5] Allocate memmap from hotadded memory (per device) Oscar Salvador
2021-03-09 17:55 ` [PATCH v4 1/5] mm,memory_hotplug: Allocate memmap from the added memory range Oscar Salvador
2021-03-11 19:06   ` David Hildenbrand
2021-03-15 10:22     ` Oscar Salvador
2021-03-16 16:46       ` David Hildenbrand
2021-03-16 17:45         ` David Hildenbrand
2021-03-17 14:08           ` Oscar Salvador
2021-03-17 14:35             ` David Hildenbrand
2021-03-18  8:27               ` Oscar Salvador
2021-03-18 10:38                 ` Oscar Salvador [this message]
2021-03-18 11:24                   ` David Hildenbrand
2021-03-18 12:03                     ` Oscar Salvador
2021-03-18 12:05                       ` David Hildenbrand
2021-03-09 17:55 ` [PATCH v4 2/5] acpi,memhotplug: Enable MHP_MEMMAP_ON_MEMORY when supported Oscar Salvador
2021-03-09 17:55 ` [PATCH v4 3/5] mm,memory_hotplug: Add kernel boot option to enable memmap_on_memory Oscar Salvador
2021-03-09 17:55 ` [PATCH v4 4/5] x86/Kconfig: Introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE Oscar Salvador
2021-03-09 17:55 ` [PATCH v4 5/5] arm64/Kconfig: " Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YFMtuKZ8Ho66D8hN@localhost.localdomain \
    --to=osalvador@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.