From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Baoquan He <bhe@redhat.com>, Oscar Salvador <osalvador@suse.de>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Subject: Re: [PATCH RFC 1/2] mm/memory_hotplug: no need to init new pgdat with node_start_pfn
Date: Tue, 21 Apr 2020 15:06:20 +0200	[thread overview]
Message-ID: <c5b693e0-61b7-ca05-68b5-eb19c517759f@redhat.com> (raw)
In-Reply-To: <20200421125250.GG27314@dhcp22.suse.cz>

On 21.04.20 14:52, Michal Hocko wrote:
> On Tue 21-04-20 14:35:12, David Hildenbrand wrote:
>> On 21.04.20 14:30, Michal Hocko wrote:
>>> Sorry for the late reply
>>>
>>> On Thu 16-04-20 12:47:06, David Hildenbrand wrote:
>>>> A hotadded node/pgdat will span no pages at all, until memory is moved to
>>>> the zone/node via move_pfn_range_to_zone() -> resize_pgdat_range - e.g.,
>>>> when onlining memory blocks. We don't have to initialize the
>>>> node_start_pfn to the memory we are adding.
>>>
>>> You are right that the node is empty at this phase but that is already
>>> reflected by zero present pages (hmm, I do not see spanned pages to be
>>> set 0 though). What I am missing here is why this is an improvement. The
>>> new node is already visible here and I do not see why we hide the
>>> information we already know.
>>
>> "information we already know" - no, not before we online the memory.
> 
> Is this really the case? All add_memory_resource users operate on a
> physical memory range.

Having the first add_memory() magically set the node_start_pfn of a hotplugged
node isn't dangerous; I think we agree on that. It's just completely
unnecessary here, and it left me confused about why it is needed at all,
because the node start/end pfn is only really touched when
onlining/offlining memory (when resizing the zone and the pgdat).

> 
>> Before onlining, it's just setting node_start_pfn to *some value* to be
>> overwritten in move_pfn_range_to_zone()->resize_pgdat_range().
> 
> Yes the value is overwritten but I am not sure this is actually correct
> thing to do. I cannot remember why I've chosen to do that. It doesn't
> really seem unlikely to online node in a higher physical address.
> 

Well, we decided to glue the node span to onlining/offlining of memory.
So the value really has no meaning while none of that memory is online,
i.e., while the node span is 0.
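
To illustrate why the initial value is irrelevant, here is a toy shell model of
how the span is updated on onlining. It is only a sketch of the behavior
described above, not the kernel's resize_pgdat_range() itself: the two shell
variables merely stand in for the pgdat fields, and the PFN values in the
example call are made up.

```shell
#!/bin/bash
# Toy model only: these variables stand in for pgdat->node_start_pfn and
# pgdat->node_spanned_pages. This is not kernel code.
node_start_pfn=0
node_spanned_pages=0

# Grow the modeled node span so it covers [start_pfn, start_pfn + nr_pages).
resize_pgdat_range() {
    local start_pfn=$(( $1 )) nr_pages=$(( $2 ))
    local end_pfn=$(( start_pfn + nr_pages ))

    # An empty span means node_start_pfn carries no information yet, so it
    # is simply replaced, whatever it was initialized to.
    if (( node_spanned_pages == 0 )); then
        node_start_pfn=$start_pfn
        node_spanned_pages=$nr_pages
        return 0
    fi

    # Otherwise extend the existing span downwards and/or upwards.
    local old_end_pfn=$(( node_start_pfn + node_spanned_pages ))
    if (( start_pfn < node_start_pfn )); then
        node_start_pfn=$start_pfn
    fi
    if (( end_pfn < old_end_pfn )); then
        end_pfn=$old_end_pfn
    fi
    node_spanned_pages=$(( end_pfn - node_start_pfn ))
}
```

With node_start_pfn preset to an arbitrary value and a span of 0, a call such
as "resize_pgdat_range 0x140000 0x8000" leaves node_start_pfn at the start of
the onlined range, so the preset value never matters in this model.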

> Btw. one thing that I have in my notes, I was never able to actually
> test the no numa node case. Because I have always been testing with node
> being allocated during the boot. Do you have any way to trigger this
> path?

Sure, here is my test case:

#! /bin/bash
sudo qemu-system-x86_64 \
    --enable-kvm \
    -m 4G,maxmem=20G,slots=2 \
    -smp sockets=2,cores=2 \
    -numa node,nodeid=0,cpus=0-1,mem=4G -numa node,nodeid=1,mem=0G \
    -kernel /home/dhildenb/git/linux/arch/x86_64/boot/bzImage \
    -append "console=ttyS0 rd.shell rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0 page_owner=on" \
    -initrd /boot/initramfs-5.4.7-200.fc31.x86_64.img \
    -machine pc \
    -nographic \
    -nodefaults \
    -chardev stdio,id=serial \
    -device isa-serial,chardev=serial \
    -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
    -mon chardev=monitor,mode=readline \
    -device virtio-balloon \
    -object memory-backend-ram,id=mem0,size=512M \
    -object memory-backend-ram,id=mem1,size=512M \
    -device pc-dimm,id=dimm0,memdev=mem0,node=1 \
    -device pc-dimm,id=dimm1,memdev=mem1,node=1

Instead of coldplugging the DIMMs to node 1, you could also hotplug them later
(let me know if you need information on how to do that). I use this test to
verify that the node is properly onlined/offlined once I unplug/replug the two
DIMMs (e.g., after onlining/offlining the memory blocks).
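
In case it helps, a rough sketch of the hotplug variant: start the guest
without the four "-object memory-backend-ram" / "-device pc-dimm" lines, then
send monitor commands to the socket configured above. The helper function names
and the use of socat are assumptions made for illustration; object_add,
device_add, device_del and object_del are the regular monitor commands.

```shell
#!/bin/bash
# Print the monitor commands that hotplug DIMM number $1 of size $2 to node $3.
dimm_plug_cmds() {
    local idx=$1 size=$2 node=$3
    printf 'object_add memory-backend-ram,id=mem%s,size=%s\n' "$idx" "$size"
    printf 'device_add pc-dimm,id=dimm%s,memdev=mem%s,node=%s\n' "$idx" "$idx" "$node"
}

# Print the monitor commands that unplug DIMM number $1 again.
# Note: the unplug is asynchronous, so object_del is only expected to succeed
# once the guest has offlined the memory and completed the device removal.
dimm_unplug_cmds() {
    local idx=$1
    printf 'device_del dimm%s\n' "$idx"
    printf 'object_del mem%s\n' "$idx"
}
```

Something like "dimm_plug_cmds 0 512M 1 | socat - UNIX-CONNECT:/var/tmp/monitor"
should then plug the first DIMM. Unless the guest onlines memory
automatically, the new memory blocks still have to be onlined inside the guest
(e.g., "echo online > /sys/devices/system/memory/memoryN/state").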

-- 
Thanks,

David / dhildenb



