From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 1/5] mm,memory_hotplug: Allocate memmap from the added memory range
Date: Wed, 24 Mar 2021 20:16:53 +0100
Message-ID: <9591a0b8-c000-2f61-67a6-4402678fe50b@redhat.com>
In-Reply-To: <YFtjCMwYjx1BwEg0@dhcp22.suse.cz>

On 24.03.21 17:04, Michal Hocko wrote:
> On Wed 24-03-21 15:52:38, David Hildenbrand wrote:
>> On 24.03.21 15:42, Michal Hocko wrote:
>>> On Wed 24-03-21 13:03:29, Michal Hocko wrote:
>>>> On Wed 24-03-21 11:12:59, Oscar Salvador wrote:
>>> [...]
>>>>> I kind of understand being reluctant to use the vmemmap_pages terminology here, but
>>>>> unfortunately we need to know about it.
>>>>> We could rename nr_vmemmap_pages to offset_buddy_pages or something like that.
>>>>
>>>> I am not convinced. It seems you are just trying to graft the new
>>>> functionality in. But I still believe that {on,off}lining shouldn't care
>>>> about where their vmemmaps come from at all. It should be the
>>>> responsibility of the code which reserves that space to compensate for
>>>> the accounting. Otherwise we will end up with hard-to-maintain code
>>>> because expectations would be spread across way too many places. Not to
>>>> mention the different pfns that the code would have to care about.
>>>
>>> The below is a quick hack on top of this patch to illustrate my
>>> thinking. I have dug all the vmemmap pieces out of the
>>> {on,off}lining and hooked up all the accounting where the space is reserved.
>>> This just compiles and I haven't taken a deeper look, so there are likely
>>> some minor problems, but I haven't encountered any major problems or
>>> hacks that had to be introduced into the code. The separation seems to be possible.
>>> The diffstat also looks promising. Am I missing something fundamental in
>>> this?
>>>
>>
>> From a quick glance, this touches on two things discussed in the past:
>>
>> 1. If the underlying memory block is offline, all sections are offline. Zone
>> shrinking code will happily skip over the vmemmap pages and you can end up
>> with out-of-zone pages assigned to the zone. Can happen in corner cases.
> 
> You are right. But do we really care? Those pages should be of no
> interest to anybody iterating through zones/nodes anyway.

Well, we were just discussing getting the zone/node links + span right for 
all pages (including special reserved pages), because getting it wrong 
already resulted in BUGs. So I am not convinced that we *don't* have to care.

However, I agree that most code that looks at node/zone spans shouldn't have 
to care - e.g., it should never call set_pfnblock_flags_mask() on such blocks.
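
Something like the following (completely untested, just to illustrate the 
kind of guard I mean; the helper name is made up, and I'm assuming here that 
the self-hosted vmemmap pages are marked reserved, which may not match the 
patch exactly):

/*
 * Sketch only: don't touch pageblock state for pfns whose memmap lives
 * in the hotplugged range itself. The "reserved" check is an assumption
 * of this sketch, not necessarily what the patch ends up doing.
 */
static void maybe_set_pageblock_migratetype(unsigned long pfn, int migratetype)
{
	struct page *page = pfn_to_online_page(pfn);

	if (!page || PageReserved(page))
		return;

	set_pageblock_migratetype(page, migratetype);
}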

But I guess there are corner cases where we would end up with 
zone_is_empty() == true; I am not sure what the effect of that would be ... at 
least the node cannot vanish, as we disallow offlining it while we have a 
memory block linked to it.
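
(For reference, zone_is_empty() only looks at the span - roughly, from 
mmzone.h as I remember it:

static inline bool zone_is_empty(struct zone *zone)
{
	return zone->spanned_pages == 0;
}

so in that corner case we'd have vmemmap pages whose page_zone() points at a 
zone that claims to span no pages at all.)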


Another thing that comes to my mind is that our zone shrinking code 
currently searches in PAGES_PER_SUBSECTION (2 MiB IIRC) increments. If our 
vmemmap pages cover less than that, we could accidentally shrink the 
!vmemmap part too much, as we would mis-detect the type of a 
PAGES_PER_SUBSECTION block.

IIRC, this would apply for memory block sizes < 128 MiB. Not relevant on 
x86 and arm64. Could be relevant for ppc64, if we ever wanted to support 
memmap_on_memory there, or if we ever reduced the section size on some 
arch below 128 MiB. At least we would have to fence it somehow, along the 
lines of the sketch below.
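
Completely untested, and the helper name is made up, but the fence could be 
as simple as refusing the self-hosted memmap whenever the vmemmap portion is 
not a multiple of the subsection size:

/*
 * Sketch: refuse memmap_on_memory if the vmemmap part of the range is
 * not subsection aligned, so the shrinking code never has to guess the
 * type of a mixed PAGES_PER_SUBSECTION block.
 */
static bool vmemmap_range_is_shrink_safe(unsigned long nr_vmemmap_pages)
{
	return IS_ALIGNED(nr_vmemmap_pages, PAGES_PER_SUBSECTION);
}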


> 
>> There is no way to know that the memmap of these pages was initialized and
>> is of value.
>>
>> 2. You heavily fragment the zone layout although you might end up with
>> consecutive zones (e.g., when onlining all hotplugged memory as movable)
> 
> What would be consequences?

IIRC, set_zone_contiguous() will leave zone->contiguous = false.

This, in turn, will force pageblock_pfn_to_page() through the slow path, 
making page isolation a bit slower.

Not a deal breaker, but obviously something where Oscar's original patch 
can do better.
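
For reference, the fast path we'd lose is roughly (quoting from memory, so 
don't take it literally):

struct page *pageblock_pfn_to_page(unsigned long start_pfn,
				   unsigned long end_pfn, struct zone *zone)
{
	if (zone->contiguous)
		return pfn_to_page(start_pfn);

	/* Slow path: re-checks pfn validity, online sections and zone membership. */
	return __pageblock_pfn_to_page(start_pfn, end_pfn, zone);
}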


I still have to think about the other issues again (I remember that most of 
the issues we discussed back then were related to having the vmemmap only 
within the same memory block). I think 2) might be tolerable, although 
unfortunate. Regarding 1), we'll have to dive into more details.

-- 
Thanks,

David / dhildenb

