linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Baoquan He <bhe@redhat.com>, Oscar Salvador <osalvador@suse.de>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Subject: Re: [PATCH RFC 1/2] mm/memory_hotplug: no need to init new pgdat with node_start_pfn
Date: Tue, 21 Apr 2020 15:06:20 +0200	[thread overview]
Message-ID: <c5b693e0-61b7-ca05-68b5-eb19c517759f@redhat.com> (raw)
In-Reply-To: <20200421125250.GG27314@dhcp22.suse.cz>

On 21.04.20 14:52, Michal Hocko wrote:
> On Tue 21-04-20 14:35:12, David Hildenbrand wrote:
>> On 21.04.20 14:30, Michal Hocko wrote:
>>> Sorry for the late reply
>>>
>>> On Thu 16-04-20 12:47:06, David Hildenbrand wrote:
>>>> A hotadded node/pgdat will span no pages at all, until memory is moved to
>>>> the zone/node via move_pfn_range_to_zone() -> resize_pgdat_range - e.g.,
>>>> when onlining memory blocks. We don't have to initialize the
>>>> node_start_pfn to the memory we are adding.
>>>
>>> You are right that the node is empty at this phase but that is already
>>> reflected by zero present pages (hmm, I do not see spanned pages to be
>>> set 0 though). What I am missing here is why this is an improvement. The
>>> new node is already visible here and I do not see why we hide the
>>> information we already know.
>>
>> "information we already know" - no, not before we online the memory.
> 
> Is this really the case? All add_memory_resource users operate on a
> physical memory range.

Having the first add_memory() to magically set node_start_pfn of a hotplugged
node isn't dangerous, I think we agree on that. It's just completely
unnecessary here and at least left me confused why this is needed at all-
because the node start/end pfn is only really touched when
onlining/offlining memory (when resizing the zone and the pgdat).

> 
>> Before onlining, it's just setting node_start_pfn to *some value* to be
>> overwritten in move_pfn_range_to_zone()->resize_pgdat_range().
> 
> Yes the value is overwritten but I am not sure this is actually correct
> thing to do. I cannot remember why I've chosen to do that. It doesn't
> really seem unlikely to online node in a higher physical address.
> 

Well, we decided to glue the node span to onlining/offlining of memory.
So, the value really has no meaning without any of that memory being
online/the node span being 0.

> Btw. one thing that I have in my notes, I was never able to actually
> test the no numa node case. Because I have always been testing with node
> being allocated during the boot. Do you have any way to trigger this
> path?

Sure, here is my test case

#! /bin/bash
sudo qemu-system-x86_64 \
    --enable-kvm \
    -m 4G,maxmem=20G,slots=2 \
    -smp sockets=2,cores=2 \
    -numa node,nodeid=0,cpus=0-1,mem=4G -numa node,nodeid=1,mem=0G \
    -kernel /home/dhildenb/git/linux/arch/x86_64/boot/bzImage \
    -append "console=ttyS0 rd.shell rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 page_owner=on" \
    -initrd /boot/initramfs-5.4.7-200.fc31.x86_64.img \
    -machine pc \
    -nographic \
    -nodefaults \
    -chardev stdio,id=serial \
    -device isa-serial,chardev=serial \
    -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
    -mon chardev=monitor,mode=readline \
    -device virtio-balloon \
    -object memory-backend-ram,id=mem0,size=512M \
    -object memory-backend-ram,id=mem1,size=512M \
    -device pc-dimm,id=dimm0,memdev=mem0,node=1 \
    -device pc-dimm,id=dimm1,memdev=mem1,node=1

Instead of coldplugging the DIMMs to node 1, you could also hotplug them later
(let me know if you need information on how to do that). I use this test to
verify that the node is properly onlined/offlined once I unplug/replug the two
DIMMs (e.g., after onlining/offlining the memory blocks).

-- 
Thanks,

David / dhildenb



  reply	other threads:[~2020-04-21 13:06 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-16 10:47 [PATCH RFC 0/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK David Hildenbrand
2020-04-16 10:47 ` [PATCH RFC 1/2] mm/memory_hotplug: no need to init new pgdat with node_start_pfn David Hildenbrand
2020-04-16 14:11   ` Pankaj Gupta
2020-04-21 12:30   ` Michal Hocko
2020-04-21 12:35     ` David Hildenbrand
2020-04-21 12:52       ` Michal Hocko
2020-04-21 13:06         ` David Hildenbrand [this message]
2020-04-22  8:21           ` Michal Hocko
2020-04-22  8:32             ` David Hildenbrand
2020-04-22 10:00               ` Michal Hocko
2020-04-16 10:47 ` [PATCH RFC 2/2] mm/memory_hotplug: handle memblocks only with CONFIG_ARCH_KEEP_MEMBLOCK David Hildenbrand
2020-04-16 17:09   ` Mike Rapoport
2020-04-21 12:39   ` Michal Hocko
2020-04-21 12:41     ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c5b693e0-61b7-ca05-68b5-eb19c517759f@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=osalvador@suse.de \
    --cc=pankaj.gupta.linux@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).