From: Dan Williams <dan.j.williams@intel.com>
To: David Hildenbrand <david@redhat.com>
Cc: Linux MM <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-ia64@vger.kernel.org,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
linux-s390 <linux-s390@vger.kernel.org>,
Linux-sh <linux-sh@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"mike.travis@hpe.com" <mike.travis@hpe.com>,
Ingo Molnar <mingo@kernel.org>,
Andrew Banman <andrew.banman@hpe.com>,
Oscar Salvador <osalvador@suse.de>,
Michal Hocko <mhocko@suse.com>,
Pavel Tatashin <pasha.tatashin@soleen.com>, Qian Cai <cai@lca.pw>,
Wei Yang <richard.weiyang@gmail.com>,
Arun KS <arunks@codeaurora.org>,
Mathieu Malaterre <malat@debian.org>
Subject: Re: [PATCH v2 4/8] mm/memory_hotplug: Create memory block devices after arch_add_memory()
Date: Tue, 7 May 2019 14:17:16 -0700
Message-ID: <CAPcyv4jiVyaPbUrQwSiy65xk=EegJwuGSDKkVYWkGiTJz847gg@mail.gmail.com>
In-Reply-To: <20190507183804.5512-5-david@redhat.com>
On Tue, May 7, 2019 at 11:38 AM David Hildenbrand <david@redhat.com> wrote:
>
> Only memory to be added to the buddy and to be onlined/offlined by
> user space using memory block devices needs (and should have!) memory
> block devices.
>
> Factor out creation of memory block devices. Create all devices after
> arch_add_memory() succeeded. We can later drop the want_memblock parameter,
> because it is now effectively stale.
>
> Only after memory block devices have been added, memory can be onlined
> by user space. This implies that memory is not visible to user space at
> all before arch_add_memory() succeeded.
Nice!
>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: "mike.travis@hpe.com" <mike.travis@hpe.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Andrew Banman <andrew.banman@hpe.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Cc: Arun KS <arunks@codeaurora.org>
> Cc: Mathieu Malaterre <malat@debian.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> drivers/base/memory.c | 70 ++++++++++++++++++++++++++----------------
> include/linux/memory.h | 2 +-
> mm/memory_hotplug.c | 15 ++++-----
> 3 files changed, 53 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index 6e0cb4fda179..862c202a18ca 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -701,44 +701,62 @@ static int add_memory_block(int base_section_nr)
> return 0;
> }
>
> +static void unregister_memory(struct memory_block *memory)
> +{
> + BUG_ON(memory->dev.bus != &memory_subsys);
Given this should never happen and only a future kernel developer
might trip over it, do we really need to kill that developer's
machine? I.e. s/BUG/WARN/? I guess an argument can be made to move
such a change to a follow-on patch since you're just preserving
existing behavior, but I figure we might as well address these as the
code is refactored.
> +
> + /* drop the ref. we got via find_memory_block() */
> + put_device(&memory->dev);
> + device_unregister(&memory->dev);
> +}
> +
> /*
> - * need an interface for the VM to add new memory regions,
> - * but without onlining it.
> + * Create memory block devices for the given memory area. Start and size
> + * have to be aligned to memory block granularity. Memory block devices
> + * will be initialized as offline.
> */
> -int hotplug_memory_register(int nid, struct mem_section *section)
> +int hotplug_memory_register(unsigned long start, unsigned long size)
> {
> - int ret = 0;
> + unsigned long block_nr_pages = memory_block_size_bytes() >> PAGE_SHIFT;
> + unsigned long start_pfn = PFN_DOWN(start);
> + unsigned long end_pfn = start_pfn + (size >> PAGE_SHIFT);
> + unsigned long pfn;
> struct memory_block *mem;
> + int ret = 0;
>
> - mutex_lock(&mem_sysfs_mutex);
> + BUG_ON(!IS_ALIGNED(start, memory_block_size_bytes()));
> + BUG_ON(!IS_ALIGNED(size, memory_block_size_bytes()));
Perhaps:

        if (WARN_ON(...))
                return -EINVAL;
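
To spell that out, here is a minimal userspace sketch of the pattern, not
kernel code and untested against the tree. The names warn_on(),
BLOCK_SIZE_BYTES and check_block_aligned() are made up for illustration;
in the patch the block size would come from memory_block_size_bytes() and
the check would sit at the top of hotplug_memory_register().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's IS_ALIGNED(); "a" must be a power of two. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Hypothetical block size for illustration; the kernel asks memory_block_size_bytes(). */
#define BLOCK_SIZE_BYTES (128UL << 20)

/* Userspace stand-in for WARN_ON(): report the condition and hand it back to the caller. */
static int warn_on(int cond, const char *expr)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", expr);
	return cond;
}
#define WARN_ON(cond) warn_on(!!(cond), #cond)

/* Reject misaligned ranges with -EINVAL instead of halting the machine. */
static int check_block_aligned(unsigned long start, unsigned long size)
{
	if (WARN_ON(!IS_ALIGNED(start, BLOCK_SIZE_BYTES)) ||
	    WARN_ON(!IS_ALIGNED(size, BLOCK_SIZE_BYTES)))
		return -EINVAL;
	return 0;
}
```

With this shape a misaligned caller gets a warning in the log plus an
error it can propagate, e.g. check_block_aligned(4096, BLOCK_SIZE_BYTES)
returns -EINVAL, while aligned input passes through silently.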
>
> - mem = find_memory_block(section);
> - if (mem) {
> - mem->section_count++;
> - put_device(&mem->dev);
> - } else {
> - ret = init_memory_block(&mem, section, MEM_OFFLINE);
> + mutex_lock(&mem_sysfs_mutex);
> + for (pfn = start_pfn; pfn != end_pfn; pfn += block_nr_pages) {
> + mem = find_memory_block(__pfn_to_section(pfn));
> + if (mem) {
> + WARN_ON_ONCE(false);
?? Isn't that a nop?
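
To illustrate why I read it as a nop, a small userspace sketch follows. It
is only a model: warn_on_once(), warnings_emitted and the two
duplicate_block_*() helpers are hypothetical stand-ins, and the "assumed
intent" variant is my guess at what was meant, not something the patch
states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts emitted warnings so the behavior can be observed in a test. */
static int warnings_emitted;

/*
 * Userspace stand-in for WARN_ON_ONCE(): warn the first time the condition
 * is true at a given call site, and always return the condition.
 */
static bool warn_on_once(bool cond, bool *warned, const char *expr)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_emitted++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

/* As posted: the condition is constant false, so nothing is ever reported. */
static void duplicate_block_as_posted(void)
{
	static bool warned;

	warn_on_once(false, &warned, "false");
}

/* Assumed intent: flag the unexpected, already-registered block exactly once. */
static void duplicate_block_assumed_intent(void)
{
	static bool warned;

	warn_on_once(true, &warned, "true");
}
```

Calling duplicate_block_as_posted() any number of times leaves
warnings_emitted at 0, whereas repeated calls to
duplicate_block_assumed_intent() raise it to 1 and no further.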
> + put_device(&mem->dev);
> + continue;
> + }
> + ret = init_memory_block(&mem, __pfn_to_section(pfn),
> + MEM_OFFLINE);
> if (ret)
> - goto out;
> - mem->section_count++;
> + break;
> + mem->section_count = memory_block_size_bytes() /
> + MIN_MEMORY_BLOCK_SIZE;
> + }
> + if (ret) {
> + end_pfn = pfn;
> + for (pfn = start_pfn; pfn != end_pfn; pfn += block_nr_pages) {
> + mem = find_memory_block(__pfn_to_section(pfn));
> + if (!mem)
> + continue;
> + mem->section_count = 0;
> + unregister_memory(mem);
> + }
> }
> -
> -out:
> mutex_unlock(&mem_sysfs_mutex);
> return ret;
> }
>
> -static void
> -unregister_memory(struct memory_block *memory)
> -{
> - BUG_ON(memory->dev.bus != &memory_subsys);
> -
> - /* drop the ref. we got via find_memory_block() */
> - put_device(&memory->dev);
> - device_unregister(&memory->dev);
> -}
> -
> -void unregister_memory_section(struct mem_section *section)
> +static int remove_memory_section(struct mem_section *section)
> {
> struct memory_block *mem;
>
> diff --git a/include/linux/memory.h b/include/linux/memory.h
> index 474c7c60c8f2..95505fbb5f85 100644
> --- a/include/linux/memory.h
> +++ b/include/linux/memory.h
> @@ -111,7 +111,7 @@ extern int register_memory_notifier(struct notifier_block *nb);
> extern void unregister_memory_notifier(struct notifier_block *nb);
> extern int register_memory_isolate_notifier(struct notifier_block *nb);
> extern void unregister_memory_isolate_notifier(struct notifier_block *nb);
> -int hotplug_memory_register(int nid, struct mem_section *section);
> +int hotplug_memory_register(unsigned long start, unsigned long size);
> extern void unregister_memory_section(struct mem_section *);
> extern int memory_dev_init(void);
> extern int memory_notify(unsigned long val, void *v);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 7b5439839d67..e1637c8a0723 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -258,13 +258,7 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn,
> return -EEXIST;
>
> ret = sparse_add_one_section(nid, phys_start_pfn, altmap);
> - if (ret < 0)
> - return ret;
> -
> - if (!want_memblock)
> - return 0;
> -
> - return hotplug_memory_register(nid, __pfn_to_section(phys_start_pfn));
> + return ret < 0 ? ret : 0;
> }
>
> /*
> @@ -1106,6 +1100,13 @@ int __ref add_memory_resource(int nid, struct resource *res)
> if (ret < 0)
> goto error;
>
> + /* create memory block devices after memory was added */
> + ret = hotplug_memory_register(start, size);
> + if (ret) {
> + arch_remove_memory(nid, start, size, NULL);
> + goto error;
> + }
> +
> if (new_node) {
> /* If sysfs file of new node can't be created, cpu on the node
> * can't be hot-added. There is no rollback way now.
> --
> 2.20.1
>