All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>, Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 1/5] mm,memory_hotplug: Allocate memmap from the added memory range
Date: Fri, 26 Mar 2021 09:52:58 +0100	[thread overview]
Message-ID: <5be95091-b4ac-8e05-4694-ac5c65f790a4@redhat.com> (raw)
In-Reply-To: <YF2ct/UZUBG1GcM3@dhcp22.suse.cz>

On 26.03.21 09:35, Michal Hocko wrote:
> On Thu 25-03-21 23:06:50, Oscar Salvador wrote:
>> On Thu, Mar 25, 2021 at 05:47:30PM +0100, Michal Hocko wrote:
>>> On Thu 25-03-21 17:36:22, Michal Hocko wrote:
>>>> If all it takes is to make pfn_to_online_page work (and my
>>>> previous attempt is incorrect because it should consult block rather
>>>> than section pfn range)
>>>
>>> This should work.
>>
>> Sorry, but while this solves some of the issues with that approach, I really
>> think that overcomplicates things and buys us not so much in return.
>> To me it seems that we are just patching things to make it work that
>> way.
> 
> I do agree that special casing vmemmap areas is something I do not
> really like but we are in that schrödinger situation when this memory is
> not onlineable unless it shares memory section with an onlineable
> memory. There are two ways around that, either special case it on
> pfn_to_online_page or mark the vmemmap section online even though it is
> not really.
> 
>> To be honest, I dislike this, and I guess we can only agree to disagree
>> here.
> 
> No problem there. I will not insist on my approach unless I can convince
> you that it is a better solution. It seems I have failed and I can live
> with that.
> 
>> I find the following much easier, cleaner, and less risky to encounter
>> pitfalls in the future:
>>
>> (!!!It is untested and incomplete, and I would be surprised if it even
>> compiles, but it is enough as a PoC !!!)
>>
>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
>> index 5ea2b3fbce02..799d14fc2f9b 100644
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -169,6 +169,60 @@ int memory_notify(unsigned long val, void *v)
>>   	return blocking_notifier_call_chain(&memory_chain, val, v);
>>   }
>>
>> +static int memory_block_online(unsigned long start_pfn, unsigned long nr_pages,
>> +			       unsigned long nr_vmemmap_pages, int online_type,
>> +			       int nid)
>> +{
>> +	int ret;
>> +	/*
>> +	 * Despite vmemmap pages having a different lifecycle than the pages
>> +	 * they describe, initialiating and accounting vmemmap pages at the
>> +	 * online/offline stage eases things a lot.
> 
> This requires quite some explaining.
> 
>> +	 * We do it out of {online,offline}_pages, so those routines only have
>> +	 * to deal with pages that are actual usable memory.
>> +	 */
>> +	if (nr_vmemmap_pages) {
>> +		struct zone *z;
>> +
>> +		z = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages);
>> +		move_pfn_range_to_zone(z, start_pfn, nr_vmemmap_pages, NULL,
>> +				       MIGRATE_UNMOVABLE);
>> +		/*
>> +		 * The below could go to a helper to make it less bulky here,
>> +		 * so {online,offline}_pages could also use it.
>> +		 */
>> +		z->present_pages += nr_pages;
>> +		pgdat_resize_lock(z->zone_pgdat, &flags);
>> +		z->zone_pgdat->node_present_pages += nr_pages;
>> +		pgdat_resize_unlock(z->zone_pgdat, &flags);

Might have to set fully spanned section online. (vmemmap >= SECTION_SIZE)

>> +	}
>> +
>> +	ret = online_pages(start_pfn + nr_vmemmap_pages, nr_pages - nr_vmemmap_pages,
>> +			   online_type);
>> +
>> +	/*
>> +	 * In case online_pages() failed for some reason, we should cleanup vmemmap
>> +	 * accounting as well.
>> +	 */
>> +	return ret;
>> +}
> 
> Yes this is much better! Just a minor suggestion would be to push
> memory_block all the way to memory_block_online (it oline a memory
> block). I would also slightly prefer to provide 2 helpers that would make
> it clear that this is to reserve/cleanup the vmemamp space (defined in
> the memory_hotplug proper).
> 
> Thanks!
> 

Something else to note:


We'll not call the memory notifier (e.g., MEM_ONLINE) for the vmemmap. 
The result is that

1. We won't allocate extended struct pages for the range. Don't think 
this is really problematic (pages are never allocated/freed, so I guess 
we don't care - like ZONE_DEVICE code).

2. We won't allocate kasan shadow memory. We most probably have to do it 
explicitly via kasan_add_zero_shadow()/kasan_remove_zero_shadow(), see 
mm/memremap.c:pagemap_range()


Further a locking rework might be necessary. We hold the device hotplug 
lock, but not the memory hotplug lock. E.g., for get_online_mems(). 
Might have to move that out online_pages.

-- 
Thanks,

David / dhildenb


  reply	other threads:[~2021-03-26  8:53 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-19  9:26 [PATCH v5 0/5] Allocate memmap from hotadded memory (per device) Oscar Salvador
2021-03-19  9:26 ` [PATCH v5 1/5] mm,memory_hotplug: Allocate memmap from the added memory range Oscar Salvador
2021-03-19 10:20   ` David Hildenbrand
2021-03-19 10:31     ` Oscar Salvador
2021-03-19 12:04       ` David Hildenbrand
2021-03-23 10:11   ` Michal Hocko
2021-03-24 10:12     ` Oscar Salvador
2021-03-24 12:03       ` Michal Hocko
2021-03-24 12:10         ` Michal Hocko
2021-03-24 12:23           ` David Hildenbrand
2021-03-24 12:37             ` Michal Hocko
2021-03-24 13:13               ` David Hildenbrand
2021-03-24 13:40                 ` Michal Hocko
2021-03-24 14:05                   ` David Hildenbrand
2021-03-24 13:27         ` Oscar Salvador
2021-03-24 14:42         ` Michal Hocko
2021-03-24 14:52           ` David Hildenbrand
2021-03-24 16:04             ` Michal Hocko
2021-03-24 19:16               ` David Hildenbrand
2021-03-25  8:07                 ` Oscar Salvador
2021-03-25  9:17                   ` Michal Hocko
2021-03-25 10:55                     ` Oscar Salvador
2021-03-25 11:08                       ` David Hildenbrand
2021-03-25 11:23                         ` Oscar Salvador
2021-03-25 12:35                         ` Michal Hocko
2021-03-25 12:40                           ` David Hildenbrand
2021-03-25 14:08                             ` Michal Hocko
2021-03-25 14:09                               ` David Hildenbrand
2021-03-25 14:34                                 ` Michal Hocko
2021-03-25 14:46                                   ` David Hildenbrand
2021-03-25 15:12                                     ` Michal Hocko
2021-03-25 15:19                                       ` David Hildenbrand
2021-03-25 15:35                                         ` Michal Hocko
2021-03-25 15:40                                           ` David Hildenbrand
2021-03-25 16:07                                           ` Michal Hocko
2021-03-25 16:20                                             ` David Hildenbrand
2021-03-25 16:36                                               ` Michal Hocko
2021-03-25 16:47                                                 ` Michal Hocko
2021-03-25 16:55                                                   ` David Hildenbrand
2021-03-25 22:06                                                   ` Oscar Salvador
2021-03-26  8:35                                                     ` Michal Hocko
2021-03-26  8:52                                                       ` David Hildenbrand [this message]
2021-03-26  8:57                                                         ` Oscar Salvador
2021-03-26 12:15                                                           ` Oscar Salvador
2021-03-26 13:36                                                             ` David Hildenbrand
2021-03-26 14:38                                                         ` Michal Hocko
2021-03-26 14:53                                                           ` David Hildenbrand
2021-03-26 15:31                                                             ` Michal Hocko
2021-03-26 16:03                                                               ` David Hildenbrand
2021-03-26  8:55                                                       ` Oscar Salvador
2021-03-26  9:11                                                         ` Michal Hocko
2021-03-25 18:08                                                 ` David Hildenbrand
2021-03-25 12:26                       ` Michal Hocko
2021-03-25 14:02                         ` Oscar Salvador
2021-03-25 14:40                           ` Michal Hocko
2021-03-19  9:26 ` [PATCH v5 2/5] acpi,memhotplug: Enable MHP_MEMMAP_ON_MEMORY when supported Oscar Salvador
2021-03-23 10:40   ` Michal Hocko
2021-03-19  9:26 ` [PATCH v5 3/5] mm,memory_hotplug: Add kernel boot option to enable memmap_on_memory Oscar Salvador
2021-03-23 10:47   ` Michal Hocko
2021-03-24  8:45     ` Oscar Salvador
2021-03-24  9:02       ` Michal Hocko
2021-03-19  9:26 ` [PATCH v5 4/5] x86/Kconfig: Introduce ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE Oscar Salvador
2021-03-19  9:26 ` [PATCH v5 5/5] arm64/Kconfig: " Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5be95091-b4ac-8e05-4694-ac5c65f790a4@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.