* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 10:12 ` [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection Wei Yang
@ 2018-11-22 10:15 ` Wei Yang
2018-11-22 10:29 ` Michal Hocko
2018-11-22 10:37 ` osalvador
` (2 subsequent siblings)
3 siblings, 1 reply; 40+ messages in thread
From: Wei Yang @ 2018-11-22 10:15 UTC (permalink / raw)
To: Wei Yang; +Cc: mhocko, osalvador, akpm, linux-mm
On Thu, Nov 22, 2018 at 06:12:41PM +0800, Wei Yang wrote:
>During online_pages phase, pgdat->nr_zones will be updated in case this
>zone is empty.
>
>Currently the online_pages phase is protected by the global lock
>mem_hotplug_begin(), which ensures there is no contention during the
>update of nr_zones. But this global lock introduces scalability issues.
>
>This patch is a preparation for removing the global lock during
>online_pages phase. Also this patch changes the documentation of
>node_size_lock to include the protectioin of nr_zones.
>
>Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Missed this, if I am correct. :-)
Acked-by: Michal Hocko <mhocko@suse.com>
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 10:15 ` Wei Yang
@ 2018-11-22 10:29 ` Michal Hocko
2018-11-22 14:27 ` Wei Yang
0 siblings, 1 reply; 40+ messages in thread
From: Michal Hocko @ 2018-11-22 10:29 UTC (permalink / raw)
To: Wei Yang; +Cc: osalvador, akpm, linux-mm
On Thu 22-11-18 10:15:57, Wei Yang wrote:
> On Thu, Nov 22, 2018 at 06:12:41PM +0800, Wei Yang wrote:
> >During online_pages phase, pgdat->nr_zones will be updated in case this
> >zone is empty.
> >
> >Currently the online_pages phase is protected by the global lock
> >mem_hotplug_begin(), which ensures there is no contention during the
> >update of nr_zones. But this global lock introduces scalability issues.
> >
> >This patch is a preparation for removing the global lock during
> >online_pages phase. Also this patch changes the documentation of
> >node_size_lock to include the protectioin of nr_zones.
I would just add that the patch moves init_currently_empty_zone under
both zone_span_writelock and pgdat_resize_lock because both the pgdat
state is changed (nr_zones) and the zone's start_pfn
> >
> >Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>
> Missed this, if I am correct. :-)
>
> Acked-by: Michal Hocko <mhocko@suse.com>
Yes, thank you.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 10:29 ` Michal Hocko
@ 2018-11-22 14:27 ` Wei Yang
0 siblings, 0 replies; 40+ messages in thread
From: Wei Yang @ 2018-11-22 14:27 UTC (permalink / raw)
To: Michal Hocko; +Cc: Wei Yang, osalvador, akpm, linux-mm
On Thu, Nov 22, 2018 at 11:29:22AM +0100, Michal Hocko wrote:
>On Thu 22-11-18 10:15:57, Wei Yang wrote:
>> On Thu, Nov 22, 2018 at 06:12:41PM +0800, Wei Yang wrote:
>> >During online_pages phase, pgdat->nr_zones will be updated in case this
>> >zone is empty.
>> >
>> >Currently the online_pages phase is protected by the global lock
>> >mem_hotplug_begin(), which ensures there is no contention during the
>> >update of nr_zones. But this global lock introduces scalability issues.
>> >
>> >This patch is a preparation for removing the global lock during
>> >online_pages phase. Also this patch changes the documentation of
>> >node_size_lock to include the protectioin of nr_zones.
>
>I would just add that the patch moves init_currently_empty_zone under
>both zone_span_writelock and pgdat_resize_lock because both the pgdat
>state is changed (nr_zones) and the zone's start_pfn
>
>> >
>> >Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>
>> Missed this, if I am correct. :-)
>>
>> Acked-by: Michal Hocko <mhocko@suse.com>
>
>Yes, thank you.
My pleasure :-)
>--
>Michal Hocko
>SUSE Labs
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 10:12 ` [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection Wei Yang
2018-11-22 10:15 ` Wei Yang
@ 2018-11-22 10:37 ` osalvador
2018-11-22 14:28 ` Wei Yang
2018-11-22 15:26 ` David Hildenbrand
2018-11-30 6:58 ` [PATCH v3] " Wei Yang
3 siblings, 1 reply; 40+ messages in thread
From: osalvador @ 2018-11-22 10:37 UTC (permalink / raw)
To: Wei Yang, mhocko; +Cc: akpm, linux-mm
On Thu, 2018-11-22 at 18:12 +0800, Wei Yang wrote:
>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Thanks ;-)
Reviewed-by: Oscar Salvador <osalvador@suse.de>
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 10:37 ` osalvador
@ 2018-11-22 14:28 ` Wei Yang
0 siblings, 0 replies; 40+ messages in thread
From: Wei Yang @ 2018-11-22 14:28 UTC (permalink / raw)
To: osalvador; +Cc: Wei Yang, mhocko, akpm, linux-mm
On Thu, Nov 22, 2018 at 11:37:38AM +0100, osalvador wrote:
>On Thu, 2018-11-22 at 18:12 +0800, Wei Yang wrote:
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>
>Thanks ;-)
>
Thanks for your suggestions :-)
>Reviewed-by: Oscar Salvador <osalvador@suse.de>
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 10:12 ` [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection Wei Yang
2018-11-22 10:15 ` Wei Yang
2018-11-22 10:37 ` osalvador
@ 2018-11-22 15:26 ` David Hildenbrand
2018-11-22 21:28 ` Wei Yang
2018-11-23 8:42 ` Michal Hocko
2018-11-30 6:58 ` [PATCH v3] " Wei Yang
3 siblings, 2 replies; 40+ messages in thread
From: David Hildenbrand @ 2018-11-22 15:26 UTC (permalink / raw)
To: Wei Yang, mhocko, osalvador; +Cc: akpm, linux-mm
On 22.11.18 11:12, Wei Yang wrote:
> During online_pages phase, pgdat->nr_zones will be updated in case this
> zone is empty.
>
> Currently the online_pages phase is protected by the global lock
> mem_hotplug_begin(), which ensures there is no contention during the
> update of nr_zones. But this global lock introduces scalability issues.
>
> This patch is a preparation for removing the global lock during
> online_pages phase. Also this patch changes the documentation of
> node_size_lock to include the protectioin of nr_zones.
I looked into locking recently, and there is more to it.
Please read:
commit dee6da22efac451d361f5224a60be2796d847b51
Author: David Hildenbrand <david@redhat.com>
Date: Tue Oct 30 15:10:44 2018 -0700
memory-hotplug.rst: add some details about locking internals
Let's document the magic a bit, especially why device_hotplug_lock is
required when adding/removing memory and how it all play together with
requests to online/offline memory from user space.
Short summary: Onlining/offlining of memory requires the device_hotplug_lock
as of now.
mem_hotplug_begin() is just an internal optimization. (we don't want
everybody to take the device lock)
>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
> v2:
> * commit log changes
> * modify the code in move_pfn_range_to_zone() instead of in
> init_currently_empty_zone()
> * documentation change
>
> ---
> include/linux/mmzone.h | 7 ++++---
> mm/memory_hotplug.c | 5 ++---
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 68d7b558924b..1bb749bee284 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -636,9 +636,10 @@ typedef struct pglist_data {
> #endif
> #if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT)
> /*
> - * Must be held any time you expect node_start_pfn, node_present_pages
> - * or node_spanned_pages stay constant. Holding this will also
> - * guarantee that any pfn_valid() stays that way.
> + * Must be held any time you expect node_start_pfn,
> + * node_present_pages, node_spanned_pages or nr_zones stay constant.
> + * Holding this will also guarantee that any pfn_valid() stays that
> + * way.
> *
> * pgdat_resize_lock() and pgdat_resize_unlock() are provided to
> * manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 61972da38d93..f626e7e5f57b 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -742,14 +742,13 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> int nid = pgdat->node_id;
> unsigned long flags;
>
> - if (zone_is_empty(zone))
> - init_currently_empty_zone(zone, start_pfn, nr_pages);
> -
> clear_zone_contiguous(zone);
>
> /* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */
> pgdat_resize_lock(pgdat, &flags);
> zone_span_writelock(zone);
> + if (zone_is_empty(zone))
> + init_currently_empty_zone(zone, start_pfn, nr_pages);
> resize_zone_range(zone, start_pfn, nr_pages);
> zone_span_writeunlock(zone);
> resize_pgdat_range(pgdat, start_pfn, nr_pages);
>
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 15:26 ` David Hildenbrand
@ 2018-11-22 21:28 ` Wei Yang
2018-11-22 21:53 ` David Hildenbrand
2018-11-23 8:42 ` Michal Hocko
1 sibling, 1 reply; 40+ messages in thread
From: Wei Yang @ 2018-11-22 21:28 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Wei Yang, mhocko, osalvador, akpm, linux-mm
On Thu, Nov 22, 2018 at 04:26:40PM +0100, David Hildenbrand wrote:
>On 22.11.18 11:12, Wei Yang wrote:
>> During online_pages phase, pgdat->nr_zones will be updated in case this
>> zone is empty.
>>
>> Currently the online_pages phase is protected by the global lock
>> mem_hotplug_begin(), which ensures there is no contention during the
>> update of nr_zones. But this global lock introduces scalability issues.
>>
>> This patch is a preparation for removing the global lock during
>> online_pages phase. Also this patch changes the documentation of
>> node_size_lock to include the protectioin of nr_zones.
>
>I looked into locking recently, and there is more to it.
>
>Please read:
>
>commit dee6da22efac451d361f5224a60be2796d847b51
>Author: David Hildenbrand <david@redhat.com>
>Date: Tue Oct 30 15:10:44 2018 -0700
>
> memory-hotplug.rst: add some details about locking internals
>
> Let's document the magic a bit, especially why device_hotplug_lock is
> required when adding/removing memory and how it all play together with
> requests to online/offline memory from user space.
>
>Short summary: Onlining/offlining of memory requires the device_hotplug_lock
>as of now.
>
>mem_hotplug_begin() is just an internal optimization. (we don't want
> everybody to take the device lock)
>
Hi, David
Thanks for your comment.
Hmm... I didn't catch your point.
Related to memory hot-plug, there are (at least) three locks,
* device_hotplug_lock (global)
* device lock (device scope)
* mem_hotplug_lock (global)
But with two different hold sequence in two cases:
* device_online()
device_hotplug_lock
device_lock
mem_hotplug_lock
* add_memory_resource()
device_hotplug_lock
mem_hotplug_lock
device_lock
^
|
I don't find where this is hold in add_memory_resource().
Would you mind giving me a hint?
If my understanding is correct, what is your point?
I guess your point is : just remove mem_hotplug_lock is not enough to
resolve the scalability issue?
Please correct me, if I am not. :-)
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 21:28 ` Wei Yang
@ 2018-11-22 21:53 ` David Hildenbrand
2018-11-22 23:53 ` Wei Yang
0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2018-11-22 21:53 UTC (permalink / raw)
To: Wei Yang; +Cc: mhocko, osalvador, akpm, linux-mm
On 22.11.18 22:28, Wei Yang wrote:
> On Thu, Nov 22, 2018 at 04:26:40PM +0100, David Hildenbrand wrote:
>> On 22.11.18 11:12, Wei Yang wrote:
>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>> zone is empty.
>>>
>>> Currently the online_pages phase is protected by the global lock
>>> mem_hotplug_begin(), which ensures there is no contention during the
>>> update of nr_zones. But this global lock introduces scalability issues.
>>>
>>> This patch is a preparation for removing the global lock during
>>> online_pages phase. Also this patch changes the documentation of
>>> node_size_lock to include the protectioin of nr_zones.
>>
>> I looked into locking recently, and there is more to it.
>>
>> Please read:
>>
>> commit dee6da22efac451d361f5224a60be2796d847b51
>> Author: David Hildenbrand <david@redhat.com>
>> Date: Tue Oct 30 15:10:44 2018 -0700
>>
>> memory-hotplug.rst: add some details about locking internals
>>
>> Let's document the magic a bit, especially why device_hotplug_lock is
>> required when adding/removing memory and how it all play together with
>> requests to online/offline memory from user space.
>>
>> Short summary: Onlining/offlining of memory requires the device_hotplug_lock
>> as of now.
>>
>> mem_hotplug_begin() is just an internal optimization. (we don't want
>> everybody to take the device lock)
>>
>
> Hi, David
>
> Thanks for your comment.
My last sentence should have been "we don't want everybody to take the
device hotplug lock" :) That caused confusion.
>
> Hmm... I didn't catch your point.
>
> Related to memory hot-plug, there are (at least) three locks,
>
> * device_hotplug_lock (global)
> * device lock (device scope)
> * mem_hotplug_lock (global)
>
> But with two different hold sequence in two cases:
>
> * device_online()
>
> device_hotplug_lock
> device_lock
> mem_hotplug_lock
>
> * add_memory_resource()
>
> device_hotplug_lock
> mem_hotplug_lock
> device_lock
> ^
> |
> I don't find where this is hold in add_memory_resource().
> Would you mind giving me a hint?
>
> If my understanding is correct, what is your point?
>
The point I was trying to make:
Right now all onlining/offlining/adding/removing is protected by the
device_hotplug_lock (and that's a good thing, things are fragile enough
already).
mem_hotplug_lock() is used in addition for get_online_mems().
"This patch is a preparation for removing the global lock during
online_pages phase." - is more like "one global lock".
> I guess your point is : just remove mem_hotplug_lock is not enough to
> resolve the scalability issue?
Depends on which scalability issue :)
Getting rid of / removing the impact of mem_hotplug_lock is certainly a
very good idea. And improves scalability of all callers of
get_online_mems(). If that is the intention, very good :)
If the intention is to make onlining/offlining more scalable (e.g. in
parallel or such), then scalability is limited by device_hotplug_lock.
>
> Please correct me, if I am not. :-)
>
Guess I was just wondering which scalability issue we are trying to solve :)
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 21:53 ` David Hildenbrand
@ 2018-11-22 23:53 ` Wei Yang
0 siblings, 0 replies; 40+ messages in thread
From: Wei Yang @ 2018-11-22 23:53 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Wei Yang, mhocko, osalvador, akpm, linux-mm
On Thu, Nov 22, 2018 at 10:53:31PM +0100, David Hildenbrand wrote:
>On 22.11.18 22:28, Wei Yang wrote:
>> On Thu, Nov 22, 2018 at 04:26:40PM +0100, David Hildenbrand wrote:
>>> On 22.11.18 11:12, Wei Yang wrote:
>>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>>> zone is empty.
>>>>
>>>> Currently the online_pages phase is protected by the global lock
>>>> mem_hotplug_begin(), which ensures there is no contention during the
>>>> update of nr_zones. But this global lock introduces scalability issues.
>>>>
>>>> This patch is a preparation for removing the global lock during
>>>> online_pages phase. Also this patch changes the documentation of
>>>> node_size_lock to include the protectioin of nr_zones.
>>>
>>> I looked into locking recently, and there is more to it.
>>>
>>> Please read:
>>>
>>> commit dee6da22efac451d361f5224a60be2796d847b51
>>> Author: David Hildenbrand <david@redhat.com>
>>> Date: Tue Oct 30 15:10:44 2018 -0700
>>>
>>> memory-hotplug.rst: add some details about locking internals
>>>
>>> Let's document the magic a bit, especially why device_hotplug_lock is
>>> required when adding/removing memory and how it all play together with
>>> requests to online/offline memory from user space.
>>>
>>> Short summary: Onlining/offlining of memory requires the device_hotplug_lock
>>> as of now.
>>>
>>> mem_hotplug_begin() is just an internal optimization. (we don't want
>>> everybody to take the device lock)
>>>
>>
>> Hi, David
>>
>> Thanks for your comment.
>
>My last sentence should have been "we don't want everybody to take the
>device hotplug lock" :) That caused confusion.
>
>>
>> Hmm... I didn't catch your point.
>>
>> Related to memory hot-plug, there are (at least) three locks,
>>
>> * device_hotplug_lock (global)
>> * device lock (device scope)
>> * mem_hotplug_lock (global)
>>
>> But with two different hold sequence in two cases:
>>
>> * device_online()
>>
>> device_hotplug_lock
>> device_lock
>> mem_hotplug_lock
>>
>> * add_memory_resource()
>>
>> device_hotplug_lock
>> mem_hotplug_lock
>> device_lock
>> ^
>> |
>> I don't find where this is hold in add_memory_resource().
>> Would you mind giving me a hint?
>>
>> If my understanding is correct, what is your point?
>>
>
>The point I was trying to make:
>
>Right now all onlining/offlining/adding/removing is protected by the
>device_hotplug_lock (and that's a good thing, things are fragile enough
>already).
>
>mem_hotplug_lock() is used in addition for get_online_mems().
>
>"This patch is a preparation for removing the global lock during
>online_pages phase." - is more like "one global lock".
>
Thanks for reminding. You are right.
>> I guess your point is : just remove mem_hotplug_lock is not enough to
>> resolve the scalability issue?
>
>Depends on which scalability issue :)
>
>Getting rid of / removing the impact of mem_hotplug_lock is certainly a
>very good idea. And improves scalability of all callers of
>get_online_mems(). If that is the intention, very good :)
>
Maybe not exact.
The intention is to get rid of mem_hotplug_begin/done, if I am correct.
>If the intention is to make onlining/offlining more scalable (e.g. in
>parallel or such), then scalability is limited by device_hotplug_lock.
>
I didn't notice this lock.
While this is a step by step improvement.
>
>>
>> Please correct me, if I am not. :-)
>>
>
>Guess I was just wondering which scalability issue we are trying to solve :)
>
>--
>
>Thanks,
>
>David / dhildenb
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 15:26 ` David Hildenbrand
2018-11-22 21:28 ` Wei Yang
@ 2018-11-23 8:42 ` Michal Hocko
2018-11-23 8:46 ` David Hildenbrand
1 sibling, 1 reply; 40+ messages in thread
From: Michal Hocko @ 2018-11-23 8:42 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Wei Yang, osalvador, akpm, linux-mm
On Thu 22-11-18 16:26:40, David Hildenbrand wrote:
> On 22.11.18 11:12, Wei Yang wrote:
> > During online_pages phase, pgdat->nr_zones will be updated in case this
> > zone is empty.
> >
> > Currently the online_pages phase is protected by the global lock
> > mem_hotplug_begin(), which ensures there is no contention during the
> > update of nr_zones. But this global lock introduces scalability issues.
> >
> > This patch is a preparation for removing the global lock during
> > online_pages phase. Also this patch changes the documentation of
> > node_size_lock to include the protectioin of nr_zones.
>
> I looked into locking recently, and there is more to it.
>
> Please read:
>
> commit dee6da22efac451d361f5224a60be2796d847b51
> Author: David Hildenbrand <david@redhat.com>
> Date: Tue Oct 30 15:10:44 2018 -0700
>
> memory-hotplug.rst: add some details about locking internals
>
> Let's document the magic a bit, especially why device_hotplug_lock is
> required when adding/removing memory and how it all play together with
> requests to online/offline memory from user space.
>
> Short summary: Onlining/offlining of memory requires the device_hotplug_lock
> as of now.
Well, I would tend to disagree here. You might be describing the current
state of art but the device_hotplug_lock doesn't make much sense for the
memory hotplug in principle. There is absolutely nothing in the core MM
that would require this lock. The current state just uses a BKL in some
sense and we really want to get rid of that longterm. This patch is a tiny
step in that direction and I suspect many more will need to come on the
way. We really want to end up with a clear scope of each lock being
taken. A project for a brave soul...
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-23 8:42 ` Michal Hocko
@ 2018-11-23 8:46 ` David Hildenbrand
2018-11-26 1:44 ` Wei Yang
0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2018-11-23 8:46 UTC (permalink / raw)
To: Michal Hocko; +Cc: Wei Yang, osalvador, akpm, linux-mm
On 23.11.18 09:42, Michal Hocko wrote:
> On Thu 22-11-18 16:26:40, David Hildenbrand wrote:
>> On 22.11.18 11:12, Wei Yang wrote:
>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>> zone is empty.
>>>
>>> Currently the online_pages phase is protected by the global lock
>>> mem_hotplug_begin(), which ensures there is no contention during the
>>> update of nr_zones. But this global lock introduces scalability issues.
>>>
>>> This patch is a preparation for removing the global lock during
>>> online_pages phase. Also this patch changes the documentation of
>>> node_size_lock to include the protectioin of nr_zones.
>>
>> I looked into locking recently, and there is more to it.
>>
>> Please read:
>>
>> commit dee6da22efac451d361f5224a60be2796d847b51
>> Author: David Hildenbrand <david@redhat.com>
>> Date: Tue Oct 30 15:10:44 2018 -0700
>>
>> memory-hotplug.rst: add some details about locking internals
>>
>> Let's document the magic a bit, especially why device_hotplug_lock is
>> required when adding/removing memory and how it all play together with
>> requests to online/offline memory from user space.
>>
>> Short summary: Onlining/offlining of memory requires the device_hotplug_lock
>> as of now.
>
> Well, I would tend to disagree here. You might be describing the current
> state of art but the device_hotplug_lock doesn't make much sense for the
> memory hotplug in principle. There is absolutely nothing in the core MM
There are collisions with CPU hotplug that require this lock (when nodes
come and go as far as I remember). And there is the problematic lock
inversion that can happen when adding/remving memory. This all has to be
sorted out, we'll have to see if we really need it for
onlining/offlining, though, however ...
> that would require this lock. The current state just uses a BKL in some
> sense and we really want to get rid of that longterm. This patch is a tiny
> step in that direction and I suspect many more will need to come on the
> way. We really want to end up with a clear scope of each lock being
> taken. A project for a brave soul...
... for now I don't consider "optimize for parallel
onlining/offlining/adding/removing of memory and cpus" really necessary.
What is necessary indeed is to not slowdown the whole system just
because some memory is coming/going. Therefore I agree, this patch is a
step into the right direction.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-23 8:46 ` David Hildenbrand
@ 2018-11-26 1:44 ` Wei Yang
2018-11-26 9:24 ` David Hildenbrand
0 siblings, 1 reply; 40+ messages in thread
From: Wei Yang @ 2018-11-26 1:44 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Michal Hocko, Wei Yang, osalvador, akpm, linux-mm
On Fri, Nov 23, 2018 at 09:46:52AM +0100, David Hildenbrand wrote:
>On 23.11.18 09:42, Michal Hocko wrote:
>> On Thu 22-11-18 16:26:40, David Hildenbrand wrote:
>>> On 22.11.18 11:12, Wei Yang wrote:
>>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>>> zone is empty.
>>>>
>>>> Currently the online_pages phase is protected by the global lock
>>>> mem_hotplug_begin(), which ensures there is no contention during the
>>>> update of nr_zones. But this global lock introduces scalability issues.
>>>>
>>>> This patch is a preparation for removing the global lock during
>>>> online_pages phase. Also this patch changes the documentation of
>>>> node_size_lock to include the protectioin of nr_zones.
>>>
>>> I looked into locking recently, and there is more to it.
>>>
>>> Please read:
>>>
>>> commit dee6da22efac451d361f5224a60be2796d847b51
>>> Author: David Hildenbrand <david@redhat.com>
>>> Date: Tue Oct 30 15:10:44 2018 -0700
>>>
>>> memory-hotplug.rst: add some details about locking internals
>>>
>>> Let's document the magic a bit, especially why device_hotplug_lock is
>>> required when adding/removing memory and how it all play together with
>>> requests to online/offline memory from user space.
>>>
>>> Short summary: Onlining/offlining of memory requires the device_hotplug_lock
>>> as of now.
>>
>> Well, I would tend to disagree here. You might be describing the current
>> state of art but the device_hotplug_lock doesn't make much sense for the
>> memory hotplug in principle. There is absolutely nothing in the core MM
>
>There are collisions with CPU hotplug that require this lock (when nodes
>come and go as far as I remember). And there is the problematic lock
>inversion that can happen when adding/remving memory. This all has to be
>sorted out, we'll have to see if we really need it for
>onlining/offlining, though, however ...
>
Seems I get a little understanding on this part.
There are two hotplug:
* CPU hotplug
* Memory hotplug.
There are two phase for Memory hotplug:
* physical add/remove
* logical online/offline
All of them are protected by device_hotplug_lock now, so we need to be
careful to release this in any case. Is my understanding correct?
>> that would require this lock. The current state just uses a BKL in some
>> sense and we really want to get rid of that longterm. This patch is a tiny
>> step in that direction and I suspect many more will need to come on the
>> way. We really want to end up with a clear scope of each lock being
>> taken. A project for a brave soul...
>
>... for now I don't consider "optimize for parallel
>onlining/offlining/adding/removing of memory and cpus" really necessary.
>What is necessary indeed is to not slowdown the whole system just
>because some memory is coming/going. Therefore I agree, this patch is a
>step into the right direction.
>
Agree.
The target is to accelerate the hot-plug without slow down the normal
process.
>--
>
>Thanks,
>
>David / dhildenb
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-26 1:44 ` Wei Yang
@ 2018-11-26 9:24 ` David Hildenbrand
2018-11-27 0:23 ` Wei Yang
0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2018-11-26 9:24 UTC (permalink / raw)
To: Wei Yang; +Cc: Michal Hocko, osalvador, akpm, linux-mm
On 26.11.18 02:44, Wei Yang wrote:
> On Fri, Nov 23, 2018 at 09:46:52AM +0100, David Hildenbrand wrote:
>> On 23.11.18 09:42, Michal Hocko wrote:
>>> On Thu 22-11-18 16:26:40, David Hildenbrand wrote:
>>>> On 22.11.18 11:12, Wei Yang wrote:
>>>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>>>> zone is empty.
>>>>>
>>>>> Currently the online_pages phase is protected by the global lock
>>>>> mem_hotplug_begin(), which ensures there is no contention during the
>>>>> update of nr_zones. But this global lock introduces scalability issues.
>>>>>
>>>>> This patch is a preparation for removing the global lock during
>>>>> online_pages phase. Also this patch changes the documentation of
>>>>> node_size_lock to include the protectioin of nr_zones.
>>>>
>>>> I looked into locking recently, and there is more to it.
>>>>
>>>> Please read:
>>>>
>>>> commit dee6da22efac451d361f5224a60be2796d847b51
>>>> Author: David Hildenbrand <david@redhat.com>
>>>> Date: Tue Oct 30 15:10:44 2018 -0700
>>>>
>>>> memory-hotplug.rst: add some details about locking internals
>>>>
>>>> Let's document the magic a bit, especially why device_hotplug_lock is
>>>> required when adding/removing memory and how it all play together with
>>>> requests to online/offline memory from user space.
>>>>
>>>> Short summary: Onlining/offlining of memory requires the device_hotplug_lock
>>>> as of now.
>>>
>>> Well, I would tend to disagree here. You might be describing the current
>>> state of art but the device_hotplug_lock doesn't make much sense for the
>>> memory hotplug in principle. There is absolutely nothing in the core MM
>>
>> There are collisions with CPU hotplug that require this lock (when nodes
>> come and go as far as I remember). And there is the problematic lock
>> inversion that can happen when adding/remving memory. This all has to be
>> sorted out, we'll have to see if we really need it for
>> onlining/offlining, though, however ...
>>
>
> Seems I get a little understanding on this part.
>
> There are two hotplug:
> * CPU hotplug
> * Memory hotplug.
>
> There are two phase for Memory hotplug:
> * physical add/remove
> * logical online/offline
>
> All of them are protected by device_hotplug_lock now, so we need to be
> careful to release this in any case. Is my understanding correct?
Yes, e.g. the acpi driver always held the device_hotplug_lock (due to
possible problems with concurrent cpu/memory hot(un)plug). Onlining
offlining of devices (including cpu/memory) from user space always held
the device_hotplug_lock. So this part was executed sequentially for a
long time.
I recently made sure that any adding/removing/onlining/offlining
correctly grabs the device_hotplug_lock AND the mem_hotplug_lock in any
case (because it was inconsistent and broken), so it is all executed
sequentially.
So when getting rid of mem_hotplug_lock we only have to care about all
users that don't take the device_hotplug_lock.
>
>>> that would require this lock. The current state just uses a BKL in some
>>> sense and we really want to get rid of that longterm. This patch is a tiny
>>> step in that direction and I suspect many more will need to come on the
>>> way. We really want to end up with a clear scope of each lock being
>>> taken. A project for a brave soul...
>>
>> ... for now I don't consider "optimize for parallel
>> onlining/offlining/adding/removing of memory and cpus" really necessary.
>> What is necessary indeed is to not slowdown the whole system just
>> because some memory is coming/going. Therefore I agree, this patch is a
>> step into the right direction.
>>
>
> Agree.
>
> The target is to accelerate the hot-plug without slow down the normal
> process.
Indeed!
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-26 9:24 ` David Hildenbrand
@ 2018-11-27 0:23 ` Wei Yang
0 siblings, 0 replies; 40+ messages in thread
From: Wei Yang @ 2018-11-27 0:23 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Wei Yang, Michal Hocko, osalvador, akpm, linux-mm
On Mon, Nov 26, 2018 at 10:24:26AM +0100, David Hildenbrand wrote:
>On 26.11.18 02:44, Wei Yang wrote:
>> On Fri, Nov 23, 2018 at 09:46:52AM +0100, David Hildenbrand wrote:
>>> On 23.11.18 09:42, Michal Hocko wrote:
>>>> On Thu 22-11-18 16:26:40, David Hildenbrand wrote:
>>>>> On 22.11.18 11:12, Wei Yang wrote:
>>>>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>>>>> zone is empty.
>>>>>>
>>>>>> Currently the online_pages phase is protected by the global lock
>>>>>> mem_hotplug_begin(), which ensures there is no contention during the
>>>>>> update of nr_zones. But this global lock introduces scalability issues.
>>>>>>
>>>>>> This patch is a preparation for removing the global lock during
>>>>>> online_pages phase. Also this patch changes the documentation of
>>>>>> node_size_lock to include the protectioin of nr_zones.
>>>>>
>>>>> I looked into locking recently, and there is more to it.
>>>>>
>>>>> Please read:
>>>>>
>>>>> commit dee6da22efac451d361f5224a60be2796d847b51
>>>>> Author: David Hildenbrand <david@redhat.com>
>>>>> Date: Tue Oct 30 15:10:44 2018 -0700
>>>>>
>>>>> memory-hotplug.rst: add some details about locking internals
>>>>>
>>>>> Let's document the magic a bit, especially why device_hotplug_lock is
>>>>> required when adding/removing memory and how it all play together with
>>>>> requests to online/offline memory from user space.
>>>>>
>>>>> Short summary: Onlining/offlining of memory requires the device_hotplug_lock
>>>>> as of now.
>>>>
>>>> Well, I would tend to disagree here. You might be describing the current
>>>> state of art but the device_hotplug_lock doesn't make much sense for the
>>>> memory hotplug in principle. There is absolutely nothing in the core MM
>>>
>>> There are collisions with CPU hotplug that require this lock (when nodes
>>> come and go as far as I remember). And there is the problematic lock
>>> inversion that can happen when adding/remving memory. This all has to be
>>> sorted out, we'll have to see if we really need it for
>>> onlining/offlining, though, however ...
>>>
>>
>> Seems I get a little understanding on this part.
>>
>> There are two hotplug:
>> * CPU hotplug
>> * Memory hotplug.
>>
>> There are two phase for Memory hotplug:
>> * physical add/remove
>> * logical online/offline
>>
>> All of them are protected by device_hotplug_lock now, so we need to be
>> careful to release this in any case. Is my understanding correct?
>
>Yes, e.g. the acpi driver always held the device_hotplug_lock (due to
>possible problems with concurrent cpu/memory hot(un)plug). Onlining
>offlining of devices (including cpu/memory) from user space always held
>the device_hotplug_lock. So this part was executed sequentially for a
>long time.
>
>I recently made sure that any adding/removing/onlining/offlining
>correctly grabs the device_hotplug_lock AND the mem_hotplug_lock in any
>case (because it was inconsistent and broken), so it is all executed
>sequentially.
>
>So when getting rid of mem_hotplug_lock we only have to care about all
>users that don't take the device_hotplug_lock.
>
Thanks for your sharing.
So there are two global locks to protect the hotplug procedure.
* device_hotplug_lock
* mem_hotplug_lock
Seems even removing mem_hotplug_lock doesn't help much on scalability?
Maybe we need to move one by one.
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v3] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-22 10:12 ` [PATCH v2] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection Wei Yang
` (2 preceding siblings ...)
2018-11-22 15:26 ` David Hildenbrand
@ 2018-11-30 6:58 ` Wei Yang
2018-11-30 9:30 ` David Hildenbrand
2018-12-03 20:50 ` [PATCH v4] " Wei Yang
3 siblings, 2 replies; 40+ messages in thread
From: Wei Yang @ 2018-11-30 6:58 UTC (permalink / raw)
To: mhocko, osalvador, david; +Cc: akpm, linux-mm, Wei Yang
During online_pages phase, pgdat->nr_zones will be updated in case this
zone is empty.
Currently the online_pages phase is protected by the global lock
mem_hotplug_begin(), which ensures there is no contention during the
update of nr_zones. But this global lock introduces scalability issues.
The patch moves init_currently_empty_zone under both zone_span_writelock
and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
and the zone's start_pfn. Also this patch changes the documentation
of node_size_lock to include the protectioin of nr_zones.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
CC: David Hildenbrand <david@redhat.com>
---
David, I may not catch you exact comment on the code or changelog. If I
missed, just let me know.
---
v3:
* slightly modify the last paragraph of changelog based on Michal's
comment
v2:
* commit log changes
* modify the code in move_pfn_range_to_zone() instead of in
init_currently_empty_zone()
* pgdat_resize_lock documentation change
---
include/linux/mmzone.h | 7 ++++---
mm/memory_hotplug.c | 5 ++---
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3d0c472438d2..37d9c5c3faa6 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -635,9 +635,10 @@ typedef struct pglist_data {
#endif
#if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT)
/*
- * Must be held any time you expect node_start_pfn, node_present_pages
- * or node_spanned_pages stay constant. Holding this will also
- * guarantee that any pfn_valid() stays that way.
+ * Must be held any time you expect node_start_pfn,
+ * node_present_pages, node_spanned_pages or nr_zones stay constant.
+ * Holding this will also guarantee that any pfn_valid() stays that
+ * way.
*
* pgdat_resize_lock() and pgdat_resize_unlock() are provided to
* manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 61972da38d93..f626e7e5f57b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -742,14 +742,13 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
int nid = pgdat->node_id;
unsigned long flags;
- if (zone_is_empty(zone))
- init_currently_empty_zone(zone, start_pfn, nr_pages);
-
clear_zone_contiguous(zone);
/* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */
pgdat_resize_lock(pgdat, &flags);
zone_span_writelock(zone);
+ if (zone_is_empty(zone))
+ init_currently_empty_zone(zone, start_pfn, nr_pages);
resize_zone_range(zone, start_pfn, nr_pages);
zone_span_writeunlock(zone);
resize_pgdat_range(pgdat, start_pfn, nr_pages);
--
2.15.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH v3] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-30 6:58 ` [PATCH v3] " Wei Yang
@ 2018-11-30 9:30 ` David Hildenbrand
2018-12-01 0:27 ` Wei Yang
2018-12-03 20:50 ` [PATCH v4] " Wei Yang
1 sibling, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2018-11-30 9:30 UTC (permalink / raw)
To: Wei Yang, mhocko, osalvador; +Cc: akpm, linux-mm
On 30.11.18 07:58, Wei Yang wrote:
> During online_pages phase, pgdat->nr_zones will be updated in case this
> zone is empty.
>
> Currently the online_pages phase is protected by the global lock
> mem_hotplug_begin(), which ensures there is no contention during the
> update of nr_zones. But this global lock introduces scalability issues.
>
> The patch moves init_currently_empty_zone under both zone_span_writelock
> and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
> and the zone's start_pfn. Also this patch changes the documentation
> of node_size_lock to include the protectioin of nr_zones.
s/protectioin/protection/
>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> CC: David Hildenbrand <david@redhat.com>
>
> ---
> David, I may not catch you exact comment on the code or changelog. If I
> missed, just let me know.
I guess I would have rewritten it to something like the following
"
Currently the online_pages phase is protected by two global locks
(device_device_hotplug_lock and mem_hotplug_lock). Especial the latter
can result in scalability issues, as it will slow down code relying on
get_online_mems(). Let's prepare code for not having to rely on
get_online_mems() but instead some more fine grained locks.
During online_pages phase, pgdat->nr_zones will be updated in case the
zone is empty. Right now mem_hotplug_lock ensures that there is no
contention during the update of nr_zones.
The patch moves init_currently_empty_zone under both zone_span_writelock
and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
and the zone's start_pfn. Also this patch changes the documentation
of node_size_lock to include the protection of nr_zones.
"
Does that make sense?
>
> ---
> v3:
> * slightly modify the last paragraph of changelog based on Michal's
> comment
> v2:
> * commit log changes
> * modify the code in move_pfn_range_to_zone() instead of in
> init_currently_empty_zone()
> * pgdat_resize_lock documentation change
> ---
> include/linux/mmzone.h | 7 ++++---
> mm/memory_hotplug.c | 5 ++---
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 3d0c472438d2..37d9c5c3faa6 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -635,9 +635,10 @@ typedef struct pglist_data {
> #endif
> #if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT)
> /*
> - * Must be held any time you expect node_start_pfn, node_present_pages
> - * or node_spanned_pages stay constant. Holding this will also
> - * guarantee that any pfn_valid() stays that way.
> + * Must be held any time you expect node_start_pfn,
> + * node_present_pages, node_spanned_pages or nr_zones stay constant.
> + * Holding this will also guarantee that any pfn_valid() stays that
> + * way.
> *
> * pgdat_resize_lock() and pgdat_resize_unlock() are provided to
> * manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 61972da38d93..f626e7e5f57b 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -742,14 +742,13 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
> int nid = pgdat->node_id;
> unsigned long flags;
>
> - if (zone_is_empty(zone))
> - init_currently_empty_zone(zone, start_pfn, nr_pages);
> -
> clear_zone_contiguous(zone);
>
> /* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */
> pgdat_resize_lock(pgdat, &flags);
> zone_span_writelock(zone);
> + if (zone_is_empty(zone))
> + init_currently_empty_zone(zone, start_pfn, nr_pages);
> resize_zone_range(zone, start_pfn, nr_pages);
> zone_span_writeunlock(zone);
> resize_pgdat_range(pgdat, start_pfn, nr_pages);
>
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-30 9:30 ` David Hildenbrand
@ 2018-12-01 0:27 ` Wei Yang
2018-12-03 10:09 ` David Hildenbrand
0 siblings, 1 reply; 40+ messages in thread
From: Wei Yang @ 2018-12-01 0:27 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Wei Yang, mhocko, osalvador, akpm, linux-mm
On Fri, Nov 30, 2018 at 10:30:22AM +0100, David Hildenbrand wrote:
>On 30.11.18 07:58, Wei Yang wrote:
>> During online_pages phase, pgdat->nr_zones will be updated in case this
>> zone is empty.
>>
>> Currently the online_pages phase is protected by the global lock
>> mem_hotplug_begin(), which ensures there is no contention during the
>> update of nr_zones. But this global lock introduces scalability issues.
>>
>> The patch moves init_currently_empty_zone under both zone_span_writelock
>> and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
>> and the zone's start_pfn. Also this patch changes the documentation
>> of node_size_lock to include the protectioin of nr_zones.
>
>s/protectioin/protection/
>
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Acked-by: Michal Hocko <mhocko@suse.com>
>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>> CC: David Hildenbrand <david@redhat.com>
>>
>> ---
>> David, I may not catch you exact comment on the code or changelog. If I
>> missed, just let me know.
>
>I guess I would have rewritten it to something like the following
>
>"
>Currently the online_pages phase is protected by two global locks
>(device_device_hotplug_lock and mem_hotplug_lock). Especial the latter
>can result in scalability issues, as it will slow down code relying on
>get_online_mems(). Let's prepare code for not having to rely on
>get_online_mems() but instead some more fine grained locks.
I am not sure why we specify get_online_mems() here. mem_hotplug_lock is
grabed in many places besides this one. In my mind, each place introduce
scalability issue, not only this one.
Or you want to say, the mem_hotplug_lock will introduce scalability
issue in two place:
* hotplug process itself
* slab allocation process
The second one is more critical. And this is what we try to address?
>
>During online_pages phase, pgdat->nr_zones will be updated in case the
>zone is empty. Right now mem_hotplug_lock ensures that there is no
>contention during the update of nr_zones.
>
>The patch moves init_currently_empty_zone under both zone_span_writelock
>and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
>and the zone's start_pfn. Also this patch changes the documentation
>of node_size_lock to include the protection of nr_zones.
>"
>
>Does that make sense?
>
>>
>> ---
>> v3:
>> * slightly modify the last paragraph of changelog based on Michal's
>> comment
>> v2:
>> * commit log changes
>> * modify the code in move_pfn_range_to_zone() instead of in
>> init_currently_empty_zone()
>> * pgdat_resize_lock documentation change
>> ---
>> include/linux/mmzone.h | 7 ++++---
>> mm/memory_hotplug.c | 5 ++---
>> 2 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 3d0c472438d2..37d9c5c3faa6 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -635,9 +635,10 @@ typedef struct pglist_data {
>> #endif
>> #if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT)
>> /*
>> - * Must be held any time you expect node_start_pfn, node_present_pages
>> - * or node_spanned_pages stay constant. Holding this will also
>> - * guarantee that any pfn_valid() stays that way.
>> + * Must be held any time you expect node_start_pfn,
>> + * node_present_pages, node_spanned_pages or nr_zones stay constant.
>> + * Holding this will also guarantee that any pfn_valid() stays that
>> + * way.
>> *
>> * pgdat_resize_lock() and pgdat_resize_unlock() are provided to
>> * manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 61972da38d93..f626e7e5f57b 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -742,14 +742,13 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>> int nid = pgdat->node_id;
>> unsigned long flags;
>>
>> - if (zone_is_empty(zone))
>> - init_currently_empty_zone(zone, start_pfn, nr_pages);
>> -
>> clear_zone_contiguous(zone);
>>
>> /* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */
>> pgdat_resize_lock(pgdat, &flags);
>> zone_span_writelock(zone);
>> + if (zone_is_empty(zone))
>> + init_currently_empty_zone(zone, start_pfn, nr_pages);
>> resize_zone_range(zone, start_pfn, nr_pages);
>> zone_span_writeunlock(zone);
>> resize_pgdat_range(pgdat, start_pfn, nr_pages);
>>
>
>
>--
>
>Thanks,
>
>David / dhildenb
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-12-01 0:27 ` Wei Yang
@ 2018-12-03 10:09 ` David Hildenbrand
2018-12-03 20:37 ` Wei Yang
0 siblings, 1 reply; 40+ messages in thread
From: David Hildenbrand @ 2018-12-03 10:09 UTC (permalink / raw)
To: Wei Yang; +Cc: mhocko, osalvador, akpm, linux-mm
On 01.12.18 01:27, Wei Yang wrote:
> On Fri, Nov 30, 2018 at 10:30:22AM +0100, David Hildenbrand wrote:
>> On 30.11.18 07:58, Wei Yang wrote:
>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>> zone is empty.
>>>
>>> Currently the online_pages phase is protected by the global lock
>>> mem_hotplug_begin(), which ensures there is no contention during the
>>> update of nr_zones. But this global lock introduces scalability issues.
>>>
>>> The patch moves init_currently_empty_zone under both zone_span_writelock
>>> and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
>>> and the zone's start_pfn. Also this patch changes the documentation
>>> of node_size_lock to include the protectioin of nr_zones.
>>
>> s/protectioin/protection/
>>
>>>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>>> CC: David Hildenbrand <david@redhat.com>
>>>
>>> ---
>>> David, I may not catch you exact comment on the code or changelog. If I
>>> missed, just let me know.
>>
>> I guess I would have rewritten it to something like the following
>>
>> "
>> Currently the online_pages phase is protected by two global locks
>> (device_device_hotplug_lock and mem_hotplug_lock). Especial the latter
>> can result in scalability issues, as it will slow down code relying on
>> get_online_mems(). Let's prepare code for not having to rely on
>> get_online_mems() but instead some more fine grained locks.
>
> I am not sure why we specify get_online_mems() here. mem_hotplug_lock is
> grabed in many places besides this one. In my mind, each place introduce
> scalability issue, not only this one.
mem_hotplug_lock is grabbed in write only when
adding/removing/onlining/offlining memory and when adding/removing
device memory. The read locker are the critical part for now.
>
> Or you want to say, the mem_hotplug_lock will introduce scalability
> issue in two place:
>
> * hotplug process itself
> * slab allocation process
>
> The second one is more critical. And this is what we try to address?
Indeed, especially as the first usually (except device memory) also uses
the device_hotplug_lock, I only consider the second one critical.
Feel free to change this description to whatever you like.
As I already stated scalability of adding/removing/onlining/offlining is
not really an issue as of now (prove me wrong :) ). So I would not care
about including such information in this patch.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH v3] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-12-03 10:09 ` David Hildenbrand
@ 2018-12-03 20:37 ` Wei Yang
0 siblings, 0 replies; 40+ messages in thread
From: Wei Yang @ 2018-12-03 20:37 UTC (permalink / raw)
To: David Hildenbrand; +Cc: Wei Yang, mhocko, osalvador, akpm, linux-mm
On Mon, Dec 03, 2018 at 11:09:52AM +0100, David Hildenbrand wrote:
>On 01.12.18 01:27, Wei Yang wrote:
>> On Fri, Nov 30, 2018 at 10:30:22AM +0100, David Hildenbrand wrote:
>>> On 30.11.18 07:58, Wei Yang wrote:
>>>> During online_pages phase, pgdat->nr_zones will be updated in case this
>>>> zone is empty.
>>>>
>>>> Currently the online_pages phase is protected by the global lock
>>>> mem_hotplug_begin(), which ensures there is no contention during the
>>>> update of nr_zones. But this global lock introduces scalability issues.
>>>>
>>>> The patch moves init_currently_empty_zone under both zone_span_writelock
>>>> and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
>>>> and the zone's start_pfn. Also this patch changes the documentation
>>>> of node_size_lock to include the protectioin of nr_zones.
>>>
>>> s/protectioin/protection/
>>>
>>>>
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>>>> CC: David Hildenbrand <david@redhat.com>
>>>>
>>>> ---
>>>> David, I may not catch you exact comment on the code or changelog. If I
>>>> missed, just let me know.
>>>
>>> I guess I would have rewritten it to something like the following
>>>
>>> "
>>> Currently the online_pages phase is protected by two global locks
>>> (device_device_hotplug_lock and mem_hotplug_lock). Especial the latter
>>> can result in scalability issues, as it will slow down code relying on
>>> get_online_mems(). Let's prepare code for not having to rely on
>>> get_online_mems() but instead some more fine grained locks.
>>
>> I am not sure why we specify get_online_mems() here. mem_hotplug_lock is
>> grabed in many places besides this one. In my mind, each place introduce
>> scalability issue, not only this one.
>
>mem_hotplug_lock is grabbed in write only when
>adding/removing/onlining/offlining memory and when adding/removing
>device memory. The read locker are the critical part for now.
>
>>
>> Or you want to say, the mem_hotplug_lock will introduce scalability
>> issue in two place:
>>
>> * hotplug process itself
>> * slab allocation process
>>
>> The second one is more critical. And this is what we try to address?
>
>Indeed, especially as the first usually (except device memory) also uses
>the device_hotplug_lock, I only consider the second one critical.
>
>Feel free to change this description to whatever you like.
>As I already stated scalability of adding/removing/onlining/offlining is
>not really an issue as of now (prove me wrong :) ). So I would not care
>about including such information in this patch.
>
Thanks for your information.
Let me try to reword the changelog.
>--
>
>Thanks,
>
>David / dhildenb
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH v4] mm, hotplug: move init_currently_empty_zone() under zone_span_lock protection
2018-11-30 6:58 ` [PATCH v3] " Wei Yang
2018-11-30 9:30 ` David Hildenbrand
@ 2018-12-03 20:50 ` Wei Yang
1 sibling, 0 replies; 40+ messages in thread
From: Wei Yang @ 2018-12-03 20:50 UTC (permalink / raw)
To: mhocko, osalvador, david; +Cc: akpm, linux-mm, Wei Yang
During online_pages phase, pgdat->nr_zones will be updated in case this
zone is empty.
Currently the online_pages phase is protected by the global locks
(device_device_hotplug_lock and mem_hotplug_lock), which ensures there
is no contention during the update of nr_zones.
These global locks introduces scalability issues (especially the second
one), which slow down code relying on get_online_mems(). This is also a
preparation for not having to rely on get_online_mems() but instead some
more fine grained locks.
The patch moves init_currently_empty_zone under both zone_span_writelock
and pgdat_resize_lock because both the pgdat state is changed (nr_zones)
and the zone's start_pfn. Also this patch changes the documentation
of node_size_lock to include the protection of nr_zones.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
CC: David Hildenbrand <david@redhat.com>
---
v4:
* mention the preparation for improving scalability by David's
comment
v3:
* slightly modify the last paragraph of changelog based on Michal's
comment
v2:
* commit log changes
* modify the code in move_pfn_range_to_zone() instead of in
init_currently_empty_zone()
* pgdat_resize_lock documentation change
---
include/linux/mmzone.h | 7 ++++---
mm/memory_hotplug.c | 5 ++---
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3d0c472438d2..37d9c5c3faa6 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -635,9 +635,10 @@ typedef struct pglist_data {
#endif
#if defined(CONFIG_MEMORY_HOTPLUG) || defined(CONFIG_DEFERRED_STRUCT_PAGE_INIT)
/*
- * Must be held any time you expect node_start_pfn, node_present_pages
- * or node_spanned_pages stay constant. Holding this will also
- * guarantee that any pfn_valid() stays that way.
+ * Must be held any time you expect node_start_pfn,
+ * node_present_pages, node_spanned_pages or nr_zones stay constant.
+ * Holding this will also guarantee that any pfn_valid() stays that
+ * way.
*
* pgdat_resize_lock() and pgdat_resize_unlock() are provided to
* manipulate node_size_lock without checking for CONFIG_MEMORY_HOTPLUG
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 61972da38d93..f626e7e5f57b 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -742,14 +742,13 @@ void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
int nid = pgdat->node_id;
unsigned long flags;
- if (zone_is_empty(zone))
- init_currently_empty_zone(zone, start_pfn, nr_pages);
-
clear_zone_contiguous(zone);
/* TODO Huh pgdat is irqsave while zone is not. It used to be like that before */
pgdat_resize_lock(pgdat, &flags);
zone_span_writelock(zone);
+ if (zone_is_empty(zone))
+ init_currently_empty_zone(zone, start_pfn, nr_pages);
resize_zone_range(zone, start_pfn, nr_pages);
zone_span_writeunlock(zone);
resize_pgdat_range(pgdat, start_pfn, nr_pages);
--
2.15.1
^ permalink raw reply related [flat|nested] 40+ messages in thread