* [PATCH 0/2] mm, memory_hotplug: remove zone onlining restriction @ 2017-07-14 12:12 Michal Hocko 2017-07-14 12:12 ` [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering Michal Hocko 2017-07-14 12:12 ` [PATCH 2/2] mm, memory_hotplug: remove zone restrictions Michal Hocko 0 siblings, 2 replies; 8+ messages in thread From: Michal Hocko @ 2017-07-14 12:12 UTC (permalink / raw) To: Andrew Morton Cc: Mel Gorman, Vlastimil Babka, Andrea Arcangeli, Reza Arbab, Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub, Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, Wei Yang, linux-mm, LKML, Joonsoo Kim, linux-api, Michal Hocko

Hi, I have sent this as an RFC previously [1] and there haven't been any fundamental objections to the approach. The biggest concern was that if anybody starts depending on the default onlining semantics introduced in the 4.13 merge window then this would break it [2]. I find that rather unlikely, but if we are worried we can try to push this later in the release cycle. Unfortunately I didn't have much time to work on this sooner.

This work should help Joonsoo with his CMA zone based approach when reusing the MOVABLE zone. I think it will also help to remove more code from memory hotplug (e.g. zone shrinking).

Patch 1 restores the original memoryXY/valid_zones semantics wrt zone ordering. It can be merged without patch 2, which removes the zone overlap restriction and defines the semantics for default onlining. See more in the patches.

Questions, concerns, objections?
Shortlog
Michal Hocko (2):
      mm, memory_hotplug: display allowed zones in the preferred ordering
      mm, memory_hotplug: remove zone restrictions

Diffstat
 drivers/base/memory.c          | 30 ++++++++++-----
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 87 +++++++++++++++++-------------------------
 3 files changed, 55 insertions(+), 64 deletions(-)

[1] http://lkml.kernel.org/r/20170629073509.623-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20170710064540.GA19185@dhcp22.suse.cz

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org

^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering 2017-07-14 12:12 [PATCH 0/2] mm, memory_hotplug: remove zone onlining restriction Michal Hocko @ 2017-07-14 12:12 ` Michal Hocko 2017-07-14 12:12 ` [PATCH 2/2] mm, memory_hotplug: remove zone restrictions Michal Hocko 1 sibling, 0 replies; 8+ messages in thread From: Michal Hocko @ 2017-07-14 12:12 UTC (permalink / raw) To: Andrew Morton Cc: Mel Gorman, Vlastimil Babka, Andrea Arcangeli, Reza Arbab, Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub, Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, Wei Yang, linux-mm, LKML, Michal Hocko, Joonsoo Kim From: Michal Hocko <mhocko@suse.com>

Prior to "mm, memory_hotplug: do not associate hotadded memory to zones until online" we used to allow changing the valid zone types of a memory block if it was adjacent to a different zone type. This fact was reflected in memoryNN/valid_zones by the ordering of the printed zones. The first one was the default (echo online > memoryNN/state) and the other could be onlined explicitly by online_{movable,kernel}. The said patch removed this behavior, and as such the ordering was no longer all that important; in most cases a kernel zone would be the default anyway. The only exception is movable_node, handled by "mm, memory_hotplug: support movable_node for hotpluggable nodes".

Let's reintroduce this behavior because a later patch will remove the zone overlap restriction, so users will be allowed to online a kernel resp. movable block regardless of its placement. The original ordering will then become significant again because it would otherwise be non-trivial for users to see which zone is the default to online into.

The implementation is really simple. Pull zone selection out of move_pfn_range into the zone_for_pfn_range helper and use it in show_valid_zones to display the zone for default onlining, and then both kernel and movable if they are allowed. The default online zone is not duplicated.
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Michal Hocko <mhocko@suse.com> --- drivers/base/memory.c | 33 +++++++++++++------ include/linux/memory_hotplug.h | 2 +- mm/memory_hotplug.c | 73 ++++++++++++++++++++++++------------------ 3 files changed, 65 insertions(+), 43 deletions(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index c7c4e0325cdb..26383af9900c 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -388,6 +388,22 @@ static ssize_t show_phys_device(struct device *dev, } #ifdef CONFIG_MEMORY_HOTREMOVE +static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn, + unsigned long nr_pages, int online_type, + struct zone *default_zone) +{ + struct zone *zone; + + if (!allow_online_pfn_range(nid, start_pfn, nr_pages, online_type)) + return; + + zone = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages); + if (zone != default_zone) { + strcat(buf, " "); + strcat(buf, zone->name); + } +} + static ssize_t show_valid_zones(struct device *dev, struct device_attribute *attr, char *buf) { @@ -395,7 +411,7 @@ static ssize_t show_valid_zones(struct device *dev, unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr); unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block; unsigned long valid_start_pfn, valid_end_pfn; - bool append = false; + struct zone *default_zone; int nid; /* @@ -418,16 +434,13 @@ static ssize_t show_valid_zones(struct device *dev, } nid = pfn_to_nid(start_pfn); - if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL)) { - strcat(buf, default_zone_for_pfn(nid, start_pfn, nr_pages)->name); - append = true; - } + default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages); + strcat(buf, default_zone->name); - if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE)) { - if (append) - strcat(buf, " "); - strcat(buf, NODE_DATA(nid)->node_zones[ZONE_MOVABLE].name); - } + 
print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL, + default_zone); + print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE, + default_zone); out: strcat(buf, "\n"); diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index c8a5056a5ae0..5e6e4cc36ff4 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -319,6 +319,6 @@ extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum); extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type); -extern struct zone *default_zone_for_pfn(int nid, unsigned long pfn, +extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn, unsigned long nr_pages); #endif /* __LINUX_MEMORY_HOTPLUG_H */ diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index d620d0427b6b..b4f2583677b1 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -777,31 +777,6 @@ static void node_states_set_node(int node, struct memory_notify *arg) node_set_state(node, N_MEMORY); } -bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type) -{ - struct pglist_data *pgdat = NODE_DATA(nid); - struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE]; - struct zone *default_zone = default_zone_for_pfn(nid, pfn, nr_pages); - - /* - * TODO there shouldn't be any inherent reason to have ZONE_NORMAL - * physically before ZONE_MOVABLE. All we need is they do not - * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE - * though so let's stick with it for simplicity for now. 
- * TODO make sure we do not overlap with ZONE_DEVICE - */ - if (online_type == MMOP_ONLINE_KERNEL) { - if (zone_is_empty(movable_zone)) - return true; - return movable_zone->zone_start_pfn >= pfn + nr_pages; - } else if (online_type == MMOP_ONLINE_MOVABLE) { - return zone_end_pfn(default_zone) <= pfn; - } - - /* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */ - return online_type == MMOP_ONLINE_KEEP; -} - static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages) { @@ -860,7 +835,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, * If no kernel zone covers this pfn range it will automatically go * to the ZONE_NORMAL. */ -struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn, +static struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn, unsigned long nr_pages) { struct pglist_data *pgdat = NODE_DATA(nid); @@ -876,6 +851,31 @@ struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn, return &pgdat->node_zones[ZONE_NORMAL]; } +bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type) +{ + struct pglist_data *pgdat = NODE_DATA(nid); + struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE]; + struct zone *default_zone = default_zone_for_pfn(nid, pfn, nr_pages); + + /* + * TODO there shouldn't be any inherent reason to have ZONE_NORMAL + * physically before ZONE_MOVABLE. All we need is they do not + * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE + * though so let's stick with it for simplicity for now. 
+ * TODO make sure we do not overlap with ZONE_DEVICE + */ + if (online_type == MMOP_ONLINE_KERNEL) { + if (zone_is_empty(movable_zone)) + return true; + return movable_zone->zone_start_pfn >= pfn + nr_pages; + } else if (online_type == MMOP_ONLINE_MOVABLE) { + return zone_end_pfn(default_zone) <= pfn; + } + + /* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */ + return online_type == MMOP_ONLINE_KEEP; +} + static inline bool movable_pfn_range(int nid, struct zone *default_zone, unsigned long start_pfn, unsigned long nr_pages) { @@ -889,12 +889,8 @@ static inline bool movable_pfn_range(int nid, struct zone *default_zone, return !zone_intersects(default_zone, start_pfn, nr_pages); } -/* - * Associates the given pfn range with the given node and the zone appropriate - * for the given online type. - */ -static struct zone * __meminit move_pfn_range(int online_type, int nid, - unsigned long start_pfn, unsigned long nr_pages) +struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn, + unsigned long nr_pages) { struct pglist_data *pgdat = NODE_DATA(nid); struct zone *zone = default_zone_for_pfn(nid, start_pfn, nr_pages); @@ -913,6 +909,19 @@ static struct zone * __meminit move_pfn_range(int online_type, int nid, zone = &pgdat->node_zones[ZONE_MOVABLE]; } + return zone; +} + +/* + * Associates the given pfn range with the given node and the zone appropriate + * for the given online type. + */ +static struct zone * __meminit move_pfn_range(int online_type, int nid, + unsigned long start_pfn, unsigned long nr_pages) +{ + struct zone *zone; + + zone = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages); move_pfn_range_to_zone(zone, start_pfn, nr_pages); return zone; } -- 2.11.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
* [PATCH 2/2] mm, memory_hotplug: remove zone restrictions 2017-07-14 12:12 [PATCH 0/2] mm, memory_hotplug: remove zone onlining restriction Michal Hocko 2017-07-14 12:12 ` [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering Michal Hocko @ 2017-07-14 12:12 ` Michal Hocko 2017-07-14 12:17 ` Vlastimil Babka 2017-07-14 14:26 ` Reza Arbab 1 sibling, 2 replies; 8+ messages in thread From: Michal Hocko @ 2017-07-14 12:12 UTC (permalink / raw) To: Andrew Morton Cc: Mel Gorman, Vlastimil Babka, Andrea Arcangeli, Reza Arbab, Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub, Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, Wei Yang, linux-mm, LKML, Michal Hocko, Joonsoo Kim, linux-api From: Michal Hocko <mhocko@suse.com>

Historically we have enforced that any kernel zone (e.g. ZONE_NORMAL) has to precede the Movable zone in the physical memory range. The purpose of the movable zone, however, is not bound to any physical memory restriction. It merely defines a class of migratable and reclaimable memory.

There are users (e.g. CMA) who might want to reserve specific physical memory ranges for their own purposes. Moreover, our pfn walkers have to be prepared for zones overlapping in the physical range already, because we do support interleaving NUMA nodes and therefore zones can interleave as well. This means we can allow each memory block to be associated with a different zone.

Loosen the current onlining semantics and allow an explicit onlining type on any memblock. That means online_{kernel,movable} will be allowed regardless of the physical address of the memblock, as long as it is offline of course. This might result in the movable zone overlapping with other kernel zones. Default onlining then becomes a bit tricky but still sensible.
echo online > memoryXY/state will online the given block to
	1) the default zone if the given range is outside of any zone
	2) the enclosing zone if such a zone doesn't interleave with any other zone
	3) the default zone if more zones interleave for this range
where the default zone is the movable zone only if movable_node is enabled, otherwise it is a kernel zone.

Here is an example of the semantics (movable_node is not present but it works in an analogous way). We start with the following memblocks, all of them offline:
memory34/valid_zones:Normal Movable
memory35/valid_zones:Normal Movable
memory36/valid_zones:Normal Movable
memory37/valid_zones:Normal Movable
memory38/valid_zones:Normal Movable
memory39/valid_zones:Normal Movable
memory40/valid_zones:Normal Movable
memory41/valid_zones:Normal Movable

Now, we online block 34 in default mode and block 37 as movable:
root@test1:/sys/devices/system/node/node1# echo online > memory34/state
root@test1:/sys/devices/system/node/node1# echo online_movable > memory37/state
memory34/valid_zones:Normal
memory35/valid_zones:Normal Movable
memory36/valid_zones:Normal Movable
memory37/valid_zones:Movable
memory38/valid_zones:Normal Movable
memory39/valid_zones:Normal Movable
memory40/valid_zones:Normal Movable
memory41/valid_zones:Normal Movable

As we can see, all the other blocks can still be onlined into both the Normal and Movable zones, and Normal is the default because the Movable zone spans only block 37 now.
root@test1:/sys/devices/system/node/node1# echo online_movable > memory41/state
memory34/valid_zones:Normal
memory35/valid_zones:Normal Movable
memory36/valid_zones:Normal Movable
memory37/valid_zones:Movable
memory38/valid_zones:Movable Normal
memory39/valid_zones:Movable Normal
memory40/valid_zones:Movable Normal
memory41/valid_zones:Movable

Now the default zone for blocks 37-41 has changed because the movable zone spans that range.
root@test1:/sys/devices/system/node/node1# echo online_kernel > memory39/state
memory34/valid_zones:Normal
memory35/valid_zones:Normal Movable
memory36/valid_zones:Normal Movable
memory37/valid_zones:Movable
memory38/valid_zones:Normal Movable
memory39/valid_zones:Normal
memory40/valid_zones:Movable Normal
memory41/valid_zones:Movable

Note that block 39 now belongs to the Normal zone, and so block 38 falls into Normal by default as well.

For completeness:
root@test1:/sys/devices/system/node/node1# for i in memory[34]?
do
	echo online > $i/state 2>/dev/null
done
memory34/valid_zones:Normal
memory35/valid_zones:Normal
memory36/valid_zones:Normal
memory37/valid_zones:Movable
memory38/valid_zones:Normal
memory39/valid_zones:Normal
memory40/valid_zones:Movable
memory41/valid_zones:Movable

Implementation-wise the change is quite straightforward. We can get rid of allow_online_pfn_range altogether; online_pages allows only offline nodes already. The original default_zone_for_pfn becomes default_kernel_zone_for_pfn. The new default_zone_for_pfn implements the above semantics. zone_for_pfn_range is slightly reorganized to implement the kernel and movable online types explicitly, and MMOP_ONLINE_KEEP becomes a catch-all default behavior.
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <linux-api@vger.kernel.org> Signed-off-by: Michal Hocko <mhocko@suse.com> --- drivers/base/memory.c | 3 --- mm/memory_hotplug.c | 74 ++++++++++++++++----------------------------------- 2 files changed, 23 insertions(+), 54 deletions(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 26383af9900c..4e3b61cda520 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -394,9 +394,6 @@ static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn, { struct zone *zone; - if (!allow_online_pfn_range(nid, start_pfn, nr_pages, online_type)) - return; - zone = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages); if (zone != default_zone) { strcat(buf, " "); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index b4f2583677b1..d8b771b1ae29 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -835,7 +835,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone, * If no kernel zone covers this pfn range it will automatically go * to the ZONE_NORMAL. 
*/ -static struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn, +static struct zone *default_kernel_zone_for_pfn(int nid, unsigned long start_pfn, unsigned long nr_pages) { struct pglist_data *pgdat = NODE_DATA(nid); @@ -851,65 +851,40 @@ static struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn, return &pgdat->node_zones[ZONE_NORMAL]; } -bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type) +static inline struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn, + unsigned long nr_pages) { - struct pglist_data *pgdat = NODE_DATA(nid); - struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE]; - struct zone *default_zone = default_zone_for_pfn(nid, pfn, nr_pages); + struct zone *kernel_zone = default_kernel_zone_for_pfn(nid, start_pfn, + nr_pages); + struct zone *movable_zone = &NODE_DATA(nid)->node_zones[ZONE_MOVABLE]; + bool in_kernel = zone_intersects(kernel_zone, start_pfn, nr_pages); + bool in_movable = zone_intersects(movable_zone, start_pfn, nr_pages); /* - * TODO there shouldn't be any inherent reason to have ZONE_NORMAL - * physically before ZONE_MOVABLE. All we need is they do not - * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE - * though so let's stick with it for simplicity for now. 
- * TODO make sure we do not overlap with ZONE_DEVICE + * We inherit the existing zone in a simple case where zones do not + * overlap in the given range */ - if (online_type == MMOP_ONLINE_KERNEL) { - if (zone_is_empty(movable_zone)) - return true; - return movable_zone->zone_start_pfn >= pfn + nr_pages; - } else if (online_type == MMOP_ONLINE_MOVABLE) { - return zone_end_pfn(default_zone) <= pfn; - } - - /* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */ - return online_type == MMOP_ONLINE_KEEP; -} - -static inline bool movable_pfn_range(int nid, struct zone *default_zone, - unsigned long start_pfn, unsigned long nr_pages) -{ - if (!allow_online_pfn_range(nid, start_pfn, nr_pages, - MMOP_ONLINE_KERNEL)) - return true; - - if (!movable_node_is_enabled()) - return false; + if (in_kernel ^ in_movable) + return (in_kernel) ? kernel_zone : movable_zone; - return !zone_intersects(default_zone, start_pfn, nr_pages); + /* + * If the range doesn't belong to any zone or two zones overlap in the + * given range then we use movable zone only if movable_node is + * enabled because we always online to a kernel zone by default. + */ + return movable_node_enabled ? movable_zone : kernel_zone; } struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn, unsigned long nr_pages) { - struct pglist_data *pgdat = NODE_DATA(nid); - struct zone *zone = default_zone_for_pfn(nid, start_pfn, nr_pages); + if (online_type == MMOP_ONLINE_KERNEL) + return default_kernel_zone_for_pfn(nid, start_pfn, nr_pages); - if (online_type == MMOP_ONLINE_KEEP) { - struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE]; - /* - * MMOP_ONLINE_KEEP defaults to MMOP_ONLINE_KERNEL but use - * movable zone if that is not possible (e.g. we are within - * or past the existing movable zone). 
movable_node overrides - * this default and defaults to movable zone - */ - if (movable_pfn_range(nid, zone, start_pfn, nr_pages)) - zone = movable_zone; - } else if (online_type == MMOP_ONLINE_MOVABLE) { - zone = &pgdat->node_zones[ZONE_MOVABLE]; - } + if (online_type == MMOP_ONLINE_MOVABLE) + return &NODE_DATA(nid)->node_zones[ZONE_MOVABLE]; - return zone; + return default_zone_for_pfn(nid, start_pfn, nr_pages); } /* @@ -938,9 +913,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ struct memory_notify arg; nid = pfn_to_nid(pfn); - if (!allow_online_pfn_range(nid, pfn, nr_pages, online_type)) - return -EINVAL; - /* associate pfn range with the zone */ zone = move_pfn_range(online_type, nid, pfn, nr_pages); -- 2.11.0 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] mm, memory_hotplug: remove zone restrictions 2017-07-14 12:12 ` [PATCH 2/2] mm, memory_hotplug: remove zone restrictions Michal Hocko @ 2017-07-14 12:17 ` Vlastimil Babka 2017-07-14 14:26 ` Reza Arbab 1 sibling, 0 replies; 8+ messages in thread From: Vlastimil Babka @ 2017-07-14 12:17 UTC (permalink / raw) To: Michal Hocko, Andrew Morton Cc: Mel Gorman, Andrea Arcangeli, Reza Arbab, Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub, Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, Wei Yang, linux-mm, LKML, Michal Hocko, Joonsoo Kim, linux-api On 07/14/2017 02:12 PM, Michal Hocko wrote: > From: Michal Hocko <mhocko@suse.com> > > Historically we have enforced that any kernel zone (e.g ZONE_NORMAL) has > to precede the Movable zone in the physical memory range. The purpose of > the movable zone is, however, not bound to any physical memory restriction. > It merely defines a class of migrateable and reclaimable memory. > > There are users (e.g. CMA) who might want to reserve specific physical > memory ranges for their own purpose. Moreover our pfn walkers have to be > prepared for zones overlapping in the physical range already because we > do support interleaving NUMA nodes and therefore zones can interleave as > well. This means we can allow each memory block to be associated with a > different zone. > > Loosen the current onlining semantic and allow explicit onlining type on > any memblock. That means that online_{kernel,movable} will be allowed > regardless of the physical address of the memblock as long as it is > offline of course. This might result in moveble zone overlapping with > other kernel zones. Default onlining then becomes a bit tricky but still > sensible. 
echo online > memoryXY/state will online the given block to > 1) the default zone if the given range is outside of any zone > 2) the enclosing zone if such a zone doesn't interleave with > any other zone > 3) the default zone if more zones interleave for this range > where default zone is movable zone only if movable_node is enabled > otherwise it is a kernel zone. > > Here is an example of the semantic with (movable_node is not present but > it work in an analogous way). We start with following memblocks, all of > them offline > memory34/valid_zones:Normal Movable > memory35/valid_zones:Normal Movable > memory36/valid_zones:Normal Movable > memory37/valid_zones:Normal Movable > memory38/valid_zones:Normal Movable > memory39/valid_zones:Normal Movable > memory40/valid_zones:Normal Movable > memory41/valid_zones:Normal Movable > > Now, we online block 34 in default mode and block 37 as movable > root@test1:/sys/devices/system/node/node1# echo online > memory34/state > root@test1:/sys/devices/system/node/node1# echo online_movable > memory37/state > memory34/valid_zones:Normal > memory35/valid_zones:Normal Movable > memory36/valid_zones:Normal Movable > memory37/valid_zones:Movable > memory38/valid_zones:Normal Movable > memory39/valid_zones:Normal Movable > memory40/valid_zones:Normal Movable > memory41/valid_zones:Normal Movable > > As we can see all other blocks can still be onlined both into Normal and > Movable zones and the Normal is default because the Movable zone spans > only block37 now. > root@test1:/sys/devices/system/node/node1# echo online_movable > memory41/state > memory34/valid_zones:Normal > memory35/valid_zones:Normal Movable > memory36/valid_zones:Normal Movable > memory37/valid_zones:Movable > memory38/valid_zones:Movable Normal > memory39/valid_zones:Movable Normal > memory40/valid_zones:Movable Normal > memory41/valid_zones:Movable > > Now the default zone for blocks 37-41 has changed because movable zone > spans that range. 
> root@test1:/sys/devices/system/node/node1# echo online_kernel > memory39/state > memory34/valid_zones:Normal > memory35/valid_zones:Normal Movable > memory36/valid_zones:Normal Movable > memory37/valid_zones:Movable > memory38/valid_zones:Normal Movable > memory39/valid_zones:Normal > memory40/valid_zones:Movable Normal > memory41/valid_zones:Movable > > Note that the block 39 now belongs to the zone Normal and so block38 > falls into Normal by default as well. > > For completness > root@test1:/sys/devices/system/node/node1# for i in memory[34]? > do > echo online > $i/state 2>/dev/null > done > > memory34/valid_zones:Normal > memory35/valid_zones:Normal > memory36/valid_zones:Normal > memory37/valid_zones:Movable > memory38/valid_zones:Normal > memory39/valid_zones:Normal > memory40/valid_zones:Movable > memory41/valid_zones:Movable > > Implementation wise the change is quite straightforward. We can get rid > of allow_online_pfn_range altogether. online_pages allows only offline > nodes already. The original default_zone_for_pfn will become > default_kernel_zone_for_pfn. New default_zone_for_pfn implements the > above semantic. zone_for_pfn_range is slightly reorganized to implement > kernel and movable online type explicitly and MMOP_ONLINE_KEEP becomes > a catch all default behavior. > > Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> > Cc: <linux-api@vger.kernel.org> > Signed-off-by: Michal Hocko <mhocko@suse.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] mm, memory_hotplug: remove zone restrictions 2017-07-14 12:12 ` [PATCH 2/2] mm, memory_hotplug: remove zone restrictions Michal Hocko 2017-07-14 12:17 ` Vlastimil Babka @ 2017-07-14 14:26 ` Reza Arbab 1 sibling, 0 replies; 8+ messages in thread From: Reza Arbab @ 2017-07-14 14:26 UTC (permalink / raw) To: Michal Hocko Cc: Andrew Morton, Mel Gorman, Vlastimil Babka, Andrea Arcangeli, Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub, Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov, Wei Yang, linux-mm, LKML, Michal Hocko, Joonsoo Kim, linux-api On Fri, Jul 14, 2017 at 02:12:33PM +0200, Michal Hocko wrote: >Historically we have enforced that any kernel zone (e.g ZONE_NORMAL) has >to precede the Movable zone in the physical memory range. The purpose of >the movable zone is, however, not bound to any physical memory restriction. >It merely defines a class of migrateable and reclaimable memory. > >There are users (e.g. CMA) who might want to reserve specific physical >memory ranges for their own purpose. Moreover our pfn walkers have to be >prepared for zones overlapping in the physical range already because we >do support interleaving NUMA nodes and therefore zones can interleave as >well. This means we can allow each memory block to be associated with a >different zone. > >Loosen the current onlining semantic and allow explicit onlining type on >any memblock. That means that online_{kernel,movable} will be allowed >regardless of the physical address of the memblock as long as it is >offline of course. This might result in moveble zone overlapping with >other kernel zones. Default onlining then becomes a bit tricky but still >sensible. 
echo online > memoryXY/state will online the given block to > 1) the default zone if the given range is outside of any zone > 2) the enclosing zone if such a zone doesn't interleave with > any other zone > 3) the default zone if more zones interleave for this range >where default zone is movable zone only if movable_node is enabled >otherwise it is a kernel zone. > >Here is an example of the semantic with (movable_node is not present but >it work in an analogous way). We start with following memblocks, all of >them offline >memory34/valid_zones:Normal Movable >memory35/valid_zones:Normal Movable >memory36/valid_zones:Normal Movable >memory37/valid_zones:Normal Movable >memory38/valid_zones:Normal Movable >memory39/valid_zones:Normal Movable >memory40/valid_zones:Normal Movable >memory41/valid_zones:Normal Movable > >Now, we online block 34 in default mode and block 37 as movable >root@test1:/sys/devices/system/node/node1# echo online > memory34/state >root@test1:/sys/devices/system/node/node1# echo online_movable > memory37/state >memory34/valid_zones:Normal >memory35/valid_zones:Normal Movable >memory36/valid_zones:Normal Movable >memory37/valid_zones:Movable >memory38/valid_zones:Normal Movable >memory39/valid_zones:Normal Movable >memory40/valid_zones:Normal Movable >memory41/valid_zones:Normal Movable > >As we can see all other blocks can still be onlined both into Normal and >Movable zones and the Normal is default because the Movable zone spans >only block37 now. >root@test1:/sys/devices/system/node/node1# echo online_movable > memory41/state >memory34/valid_zones:Normal >memory35/valid_zones:Normal Movable >memory36/valid_zones:Normal Movable >memory37/valid_zones:Movable >memory38/valid_zones:Movable Normal >memory39/valid_zones:Movable Normal >memory40/valid_zones:Movable Normal >memory41/valid_zones:Movable > >Now the default zone for blocks 37-41 has changed because movable zone >spans that range. 
>root@test1:/sys/devices/system/node/node1# echo online_kernel > memory39/state >memory34/valid_zones:Normal >memory35/valid_zones:Normal Movable >memory36/valid_zones:Normal Movable >memory37/valid_zones:Movable >memory38/valid_zones:Normal Movable >memory39/valid_zones:Normal >memory40/valid_zones:Movable Normal >memory41/valid_zones:Movable > >Note that the block 39 now belongs to the zone Normal and so block38 >falls into Normal by default as well. > >For completness >root@test1:/sys/devices/system/node/node1# for i in memory[34]? >do > echo online > $i/state 2>/dev/null >done > >memory34/valid_zones:Normal >memory35/valid_zones:Normal >memory36/valid_zones:Normal >memory37/valid_zones:Movable >memory38/valid_zones:Normal >memory39/valid_zones:Normal >memory40/valid_zones:Movable >memory41/valid_zones:Movable > >Implementation wise the change is quite straightforward. We can get rid >of allow_online_pfn_range altogether. online_pages allows only offline >nodes already. The original default_zone_for_pfn will become >default_kernel_zone_for_pfn. New default_zone_for_pfn implements the >above semantic. zone_for_pfn_range is slightly reorganized to implement >kernel and movable online type explicitly and MMOP_ONLINE_KEEP becomes >a catch all default behavior. > >Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Reza Arbab <arbab@linux.vnet.ibm.com> >Cc: <linux-api@vger.kernel.org> >Signed-off-by: Michal Hocko <mhocko@suse.com> -- Reza Arbab -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 8+ messages in thread
* [RFC PATCH 0/2] mm, memory_hotplug: remove zone onlining restriction
@ 2017-06-29  7:35 Michal Hocko
  2017-06-29  7:35 ` [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2017-06-29  7:35 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, Mel Gorman, Vlastimil Babka, Andrea Arcangeli,
	Reza Arbab, Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub,
	Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov,
	Wei Yang, LKML

Hi,
I am sending this as an RFC because this hasn't seen a lot of testing
yet, but I would like to see whether the semantics I came up with (see
patch 2) are sensible.

This work should help Joonsoo with his CMA zone-based approach when
reusing the MOVABLE zone. I think it will also help to remove more code
from the memory hotplug (e.g. zone shrinking).

Patch 1 restores the original memoryXY/valid_zones semantics wrt zone
ordering. It can be merged without patch 2, which removes the zone
overlap restriction and defines semantics for the default onlining.
See more in the patch.

Questions, concerns, objections?

Shortlog
Michal Hocko (2):
      mm, memory_hotplug: display allowed zones in the preferred ordering
      mm, memory_hotplug: remove zone restrictions

Diffstat
 drivers/base/memory.c          | 30 ++++++++++-----
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 87 +++++++++++++++++-------------------
 3 files changed, 55 insertions(+), 64 deletions(-)
* [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering
  2017-06-29  7:35 [RFC PATCH 0/2] mm, memory_hotplug: remove zone onlining restriction Michal Hocko
@ 2017-06-29  7:35 ` Michal Hocko
  2017-06-30  0:45   ` Joonsoo Kim
  2017-07-07 14:34   ` Vlastimil Babka
  0 siblings, 2 replies; 8+ messages in thread
From: Michal Hocko @ 2017-06-29  7:35 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, Mel Gorman, Vlastimil Babka, Andrea Arcangeli,
	Reza Arbab, Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub,
	Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov,
	Wei Yang, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Prior to "mm, memory_hotplug: do not associate hotadded memory to zones
until online" we used to allow changing the valid zone types of a
memory block if it was adjacent to a different zone type. This fact was
reflected in memoryNN/valid_zones by the ordering of printed zones.
The first one was the default (echo online > memoryNN/state) and the
other one could be onlined explicitly by online_{movable,kernel}. This
behavior was removed by the said patch, and as such the ordering was
not all that important. In most cases a kernel zone would be the
default anyway. The only exception is movable_node, handled by "mm,
memory_hotplug: support movable_node for hotpluggable nodes".

Let's reintroduce this behavior because a later patch will remove the
zone overlap restriction, and so the user will be allowed to online a
kernel resp. movable block regardless of its placement. The original
behavior will then become significant again, because it would be
non-trivial for users to see which is the default zone to online into.

The implementation is really simple. Pull zone selection out of
move_pfn_range into the zone_for_pfn_range helper and use it in
show_valid_zones to display the zone for default onlining, and then
both kernel and movable if they are allowed. The default online zone
is not duplicated.
Signed-off-by: Michal Hocko <mhocko@suse.com>

fold me "mm, memory_hotplug: display allowed zones in the preferred ordering"
---
 drivers/base/memory.c          | 33 +++++++++++++------
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 73 ++++++++++++++++++------------------
 3 files changed, 65 insertions(+), 43 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c7c4e0325cdb..26383af9900c 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -388,6 +388,22 @@ static ssize_t show_phys_device(struct device *dev,
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
+static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn,
+		unsigned long nr_pages, int online_type,
+		struct zone *default_zone)
+{
+	struct zone *zone;
+
+	if (!allow_online_pfn_range(nid, start_pfn, nr_pages, online_type))
+		return;
+
+	zone = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages);
+	if (zone != default_zone) {
+		strcat(buf, " ");
+		strcat(buf, zone->name);
+	}
+}
+
 static ssize_t show_valid_zones(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
@@ -395,7 +411,7 @@ static ssize_t show_valid_zones(struct device *dev,
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long valid_start_pfn, valid_end_pfn;
-	bool append = false;
+	struct zone *default_zone;
 	int nid;
 
 	/*
@@ -418,16 +434,13 @@ static ssize_t show_valid_zones(struct device *dev,
 	}
 
 	nid = pfn_to_nid(start_pfn);
-	if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL)) {
-		strcat(buf, default_zone_for_pfn(nid, start_pfn, nr_pages)->name);
-		append = true;
-	}
+	default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
+	strcat(buf, default_zone->name);
 
-	if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE)) {
-		if (append)
-			strcat(buf, " ");
-		strcat(buf, NODE_DATA(nid)->node_zones[ZONE_MOVABLE].name);
-	}
+	print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL,
+			default_zone);
+	print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE,
+			default_zone);
 out:
 	strcat(buf, "\n");
 
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index c8a5056a5ae0..5e6e4cc36ff4 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -319,6 +319,6 @@ extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
 					  unsigned long pnum);
 extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
 		int online_type);
-extern struct zone *default_zone_for_pfn(int nid, unsigned long pfn,
+extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
 		unsigned long nr_pages);
 #endif /* __LINUX_MEMORY_HOTPLUG_H */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b4015a39d108..6b9a60115e37 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -836,31 +836,6 @@ static void node_states_set_node(int node, struct memory_notify *arg)
 	node_set_state(node, N_MEMORY);
 }
 
-bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
-{
-	struct pglist_data *pgdat = NODE_DATA(nid);
-	struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
-	struct zone *default_zone = default_zone_for_pfn(nid, pfn, nr_pages);
-
-	/*
-	 * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
-	 * physically before ZONE_MOVABLE. All we need is they do not
-	 * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
-	 * though so let's stick with it for simplicity for now.
-	 * TODO make sure we do not overlap with ZONE_DEVICE
-	 */
-	if (online_type == MMOP_ONLINE_KERNEL) {
-		if (zone_is_empty(movable_zone))
-			return true;
-		return movable_zone->zone_start_pfn >= pfn + nr_pages;
-	} else if (online_type == MMOP_ONLINE_MOVABLE) {
-		return zone_end_pfn(default_zone) <= pfn;
-	}
-
-	/* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */
-	return online_type == MMOP_ONLINE_KEEP;
-}
-
 static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages)
 {
@@ -919,7 +894,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone,
  * If no kernel zone covers this pfn range it will automatically go
  * to the ZONE_NORMAL.
  */
-struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn,
+static struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn,
 		unsigned long nr_pages)
 {
 	struct pglist_data *pgdat = NODE_DATA(nid);
@@ -935,6 +910,31 @@ struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn,
 	return &pgdat->node_zones[ZONE_NORMAL];
 }
 
+bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
+{
+	struct pglist_data *pgdat = NODE_DATA(nid);
+	struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
+	struct zone *default_zone = default_zone_for_pfn(nid, pfn, nr_pages);
+
+	/*
+	 * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
+	 * physically before ZONE_MOVABLE. All we need is they do not
+	 * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
+	 * though so let's stick with it for simplicity for now.
+	 * TODO make sure we do not overlap with ZONE_DEVICE
+	 */
+	if (online_type == MMOP_ONLINE_KERNEL) {
+		if (zone_is_empty(movable_zone))
+			return true;
+		return movable_zone->zone_start_pfn >= pfn + nr_pages;
+	} else if (online_type == MMOP_ONLINE_MOVABLE) {
+		return zone_end_pfn(default_zone) <= pfn;
+	}
+
+	/* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */
+	return online_type == MMOP_ONLINE_KEEP;
+}
+
 static inline bool movable_pfn_range(int nid, struct zone *default_zone,
 		unsigned long start_pfn, unsigned long nr_pages)
 {
@@ -948,12 +948,8 @@ static inline bool movable_pfn_range(int nid, struct zone *default_zone,
 	return !zone_intersects(default_zone, start_pfn, nr_pages);
 }
 
-/*
- * Associates the given pfn range with the given node and the zone appropriate
- * for the given online type.
- */
-static struct zone * __meminit move_pfn_range(int online_type, int nid,
-		unsigned long start_pfn, unsigned long nr_pages)
+struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
+		unsigned long nr_pages)
 {
 	struct pglist_data *pgdat = NODE_DATA(nid);
 	struct zone *zone = default_zone_for_pfn(nid, start_pfn, nr_pages);
@@ -972,6 +968,19 @@ static struct zone * __meminit move_pfn_range(int online_type, int nid,
 		zone = &pgdat->node_zones[ZONE_MOVABLE];
 	}
 
+	return zone;
+}
+
+/*
+ * Associates the given pfn range with the given node and the zone appropriate
+ * for the given online type.
+ */
+static struct zone * __meminit move_pfn_range(int online_type, int nid,
+		unsigned long start_pfn, unsigned long nr_pages)
+{
+	struct zone *zone;
+
+	zone = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages);
 	move_pfn_range_to_zone(zone, start_pfn, nr_pages);
 	return zone;
 }
-- 
2.11.0
* Re: [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering
  2017-06-29  7:35 ` [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering Michal Hocko
@ 2017-06-30  0:45   ` Joonsoo Kim
  2017-07-07 14:34   ` Vlastimil Babka
  1 sibling, 0 replies; 8+ messages in thread
From: Joonsoo Kim @ 2017-06-30  0:45 UTC (permalink / raw)
To: Michal Hocko
Cc: linux-mm, Andrew Morton, Mel Gorman, Vlastimil Babka,
	Andrea Arcangeli, Reza Arbab, Yasuaki Ishimatsu, qiuxishi,
	Kani Toshimitsu, slaoub, Daniel Kiper, Igor Mammedov,
	Vitaly Kuznetsov, Wei Yang, LKML, Michal Hocko

On Thu, Jun 29, 2017 at 09:35:08AM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> Prior to "mm, memory_hotplug: do not associate hotadded memory to zones
> until online" we used to allow to change the valid zone types of a
> memory block if it is adjacent to a different zone type. This fact was
> reflected in memoryNN/valid_zones by the ordering of printed zones.
> The first one was default (echo online > memoryNN/state) and the other
> one could be onlined explicitly by online_{movable,kernel}. This
> behavior was removed by the said patch and as such the ordering was
> not all that important. In most cases a kernel zone would be default
> anyway. The only exception is movable_node handled by "mm,
> memory_hotplug: support movable_node for hotpluggable nodes".
>
> Let's reintroduce this behavior again because later patch will remove
> the zone overlap restriction and so user will be allowed to online
> kernel resp. movable block regardless of its placement. Original
> behavior will then become significant again because it would be
> non-trivial for users to see what is the default zone to online into.
>
> Implementation is really simple. Pull out zone selection out of
> move_pfn_range into zone_for_pfn_range helper and use it in
> show_valid_zones to display the zone for default onlining and then
> both kernel and movable if they are allowed.
> Default online zone is not
> duplicated.
>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Thanks.
* Re: [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering
  2017-06-29  7:35 ` [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering Michal Hocko
  2017-06-30  0:45   ` Joonsoo Kim
@ 2017-07-07 14:34   ` Vlastimil Babka
  1 sibling, 0 replies; 8+ messages in thread
From: Vlastimil Babka @ 2017-07-07 14:34 UTC (permalink / raw)
To: Michal Hocko, linux-mm
Cc: Andrew Morton, Mel Gorman, Andrea Arcangeli, Reza Arbab,
	Yasuaki Ishimatsu, qiuxishi, Kani Toshimitsu, slaoub,
	Joonsoo Kim, Daniel Kiper, Igor Mammedov, Vitaly Kuznetsov,
	Wei Yang, LKML, Michal Hocko

On 06/29/2017 09:35 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> Prior to "mm, memory_hotplug: do not associate hotadded memory to zones
> until online" we used to allow to change the valid zone types of a
> memory block if it is adjacent to a different zone type. This fact was
> reflected in memoryNN/valid_zones by the ordering of printed zones.
> The first one was default (echo online > memoryNN/state) and the other
> one could be onlined explicitly by online_{movable,kernel}. This
> behavior was removed by the said patch and as such the ordering was
> not all that important. In most cases a kernel zone would be default
> anyway. The only exception is movable_node handled by "mm,
> memory_hotplug: support movable_node for hotpluggable nodes".
>
> Let's reintroduce this behavior again because later patch will remove
> the zone overlap restriction and so user will be allowed to online
> kernel resp. movable block regardless of its placement. Original
> behavior will then become significant again because it would be
> non-trivial for users to see what is the default zone to online into.
>
> Implementation is really simple. Pull out zone selection out of
> move_pfn_range into zone_for_pfn_range helper and use it in
> show_valid_zones to display the zone for default onlining and then
> both kernel and movable if they are allowed.
> Default online zone is not
> duplicated.

Hm I wouldn't call this maze of functions simple, but seems to be
correct. Maybe Patch 2/2 will simplify the code...

> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

>
> fold me "mm, memory_hotplug: display allowed zones in the preferred ordering"
> ---
>  drivers/base/memory.c          | 33 +++++++++++++------
>  include/linux/memory_hotplug.h |  2 +-
>  mm/memory_hotplug.c            | 73 ++++++++++++++++++------------------
>  3 files changed, 65 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> index c7c4e0325cdb..26383af9900c 100644
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -388,6 +388,22 @@ static ssize_t show_phys_device(struct device *dev,
>  }
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
> +static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn,
> +		unsigned long nr_pages, int online_type,
> +		struct zone *default_zone)
> +{
> +	struct zone *zone;
> +
> +	if (!allow_online_pfn_range(nid, start_pfn, nr_pages, online_type))
> +		return;
> +
> +	zone = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages);
> +	if (zone != default_zone) {
> +		strcat(buf, " ");
> +		strcat(buf, zone->name);
> +	}
> +}
> +
>  static ssize_t show_valid_zones(struct device *dev,
>  				struct device_attribute *attr, char *buf)
>  {
> @@ -395,7 +411,7 @@ static ssize_t show_valid_zones(struct device *dev,
>  	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
>  	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
>  	unsigned long valid_start_pfn, valid_end_pfn;
> -	bool append = false;
> +	struct zone *default_zone;
>  	int nid;
>  
>  	/*
> @@ -418,16 +434,13 @@ static ssize_t show_valid_zones(struct device *dev,
>  	}
>  
>  	nid = pfn_to_nid(start_pfn);
> -	if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL)) {
> -		strcat(buf, default_zone_for_pfn(nid, start_pfn, nr_pages)->name);
> -		append = true;
> -	}
> +	default_zone = zone_for_pfn_range(MMOP_ONLINE_KEEP, nid, start_pfn, nr_pages);
> +	strcat(buf, default_zone->name);
>  
> -	if (allow_online_pfn_range(nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE)) {
> -		if (append)
> -			strcat(buf, " ");
> -		strcat(buf, NODE_DATA(nid)->node_zones[ZONE_MOVABLE].name);
> -	}
> +	print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_KERNEL,
> +			default_zone);
> +	print_allowed_zone(buf, nid, start_pfn, nr_pages, MMOP_ONLINE_MOVABLE,
> +			default_zone);
>  out:
>  	strcat(buf, "\n");
>  
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index c8a5056a5ae0..5e6e4cc36ff4 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -319,6 +319,6 @@ extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
>  					  unsigned long pnum);
>  extern bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages,
>  		int online_type);
> -extern struct zone *default_zone_for_pfn(int nid, unsigned long pfn,
> +extern struct zone *zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
>  		unsigned long nr_pages);
>  #endif /* __LINUX_MEMORY_HOTPLUG_H */
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b4015a39d108..6b9a60115e37 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -836,31 +836,6 @@ static void node_states_set_node(int node, struct memory_notify *arg)
>  	node_set_state(node, N_MEMORY);
>  }
>  
> -bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
> -{
> -	struct pglist_data *pgdat = NODE_DATA(nid);
> -	struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
> -	struct zone *default_zone = default_zone_for_pfn(nid, pfn, nr_pages);
> -
> -	/*
> -	 * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
> -	 * physically before ZONE_MOVABLE. All we need is they do not
> -	 * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
> -	 * though so let's stick with it for simplicity for now.
> -	 * TODO make sure we do not overlap with ZONE_DEVICE
> -	 */
> -	if (online_type == MMOP_ONLINE_KERNEL) {
> -		if (zone_is_empty(movable_zone))
> -			return true;
> -		return movable_zone->zone_start_pfn >= pfn + nr_pages;
> -	} else if (online_type == MMOP_ONLINE_MOVABLE) {
> -		return zone_end_pfn(default_zone) <= pfn;
> -	}
> -
> -	/* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */
> -	return online_type == MMOP_ONLINE_KEEP;
> -}
> -
>  static void __meminit resize_zone_range(struct zone *zone, unsigned long start_pfn,
>  		unsigned long nr_pages)
>  {
> @@ -919,7 +894,7 @@ void __ref move_pfn_range_to_zone(struct zone *zone,
>   * If no kernel zone covers this pfn range it will automatically go
>   * to the ZONE_NORMAL.
>   */
> -struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn,
> +static struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn,
>  		unsigned long nr_pages)
>  {
>  	struct pglist_data *pgdat = NODE_DATA(nid);
> @@ -935,6 +910,31 @@ struct zone *default_zone_for_pfn(int nid, unsigned long start_pfn,
>  	return &pgdat->node_zones[ZONE_NORMAL];
>  }
>  
> +bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type)
> +{
> +	struct pglist_data *pgdat = NODE_DATA(nid);
> +	struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE];
> +	struct zone *default_zone = default_zone_for_pfn(nid, pfn, nr_pages);
> +
> +	/*
> +	 * TODO there shouldn't be any inherent reason to have ZONE_NORMAL
> +	 * physically before ZONE_MOVABLE. All we need is they do not
> +	 * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE
> +	 * though so let's stick with it for simplicity for now.
> +	 * TODO make sure we do not overlap with ZONE_DEVICE
> +	 */
> +	if (online_type == MMOP_ONLINE_KERNEL) {
> +		if (zone_is_empty(movable_zone))
> +			return true;
> +		return movable_zone->zone_start_pfn >= pfn + nr_pages;
> +	} else if (online_type == MMOP_ONLINE_MOVABLE) {
> +		return zone_end_pfn(default_zone) <= pfn;
> +	}
> +
> +	/* MMOP_ONLINE_KEEP will always succeed and inherits the current zone */
> +	return online_type == MMOP_ONLINE_KEEP;
> +}
> +
>  static inline bool movable_pfn_range(int nid, struct zone *default_zone,
>  		unsigned long start_pfn, unsigned long nr_pages)
>  {
> @@ -948,12 +948,8 @@ static inline bool movable_pfn_range(int nid, struct zone *default_zone,
>  	return !zone_intersects(default_zone, start_pfn, nr_pages);
>  }
>  
> -/*
> - * Associates the given pfn range with the given node and the zone appropriate
> - * for the given online type.
> - */
> -static struct zone * __meminit move_pfn_range(int online_type, int nid,
> -		unsigned long start_pfn, unsigned long nr_pages)
> +struct zone * zone_for_pfn_range(int online_type, int nid, unsigned start_pfn,
> +		unsigned long nr_pages)
>  {
>  	struct pglist_data *pgdat = NODE_DATA(nid);
>  	struct zone *zone = default_zone_for_pfn(nid, start_pfn, nr_pages);
> @@ -972,6 +968,19 @@ static struct zone * __meminit move_pfn_range(int online_type, int nid,
>  		zone = &pgdat->node_zones[ZONE_MOVABLE];
>  	}
>  
> +	return zone;
> +}
> +
> +/*
> + * Associates the given pfn range with the given node and the zone appropriate
> + * for the given online type.
> + */
> +static struct zone * __meminit move_pfn_range(int online_type, int nid,
> +		unsigned long start_pfn, unsigned long nr_pages)
> +{
> +	struct zone *zone;
> +
> +	zone = zone_for_pfn_range(online_type, nid, start_pfn, nr_pages);
>  	move_pfn_range_to_zone(zone, start_pfn, nr_pages);
>  	return zone;
>  }
> 
end of thread, other threads: [~2017-07-14 14:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-14 12:12 [PATCH 0/2] mm, memory_hotplug: remove zone onlining restriction Michal Hocko
2017-07-14 12:12 ` [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering Michal Hocko
2017-07-14 12:12 ` [PATCH 2/2] mm, memory_hotplug: remove zone restrictions Michal Hocko
2017-07-14 12:17   ` Vlastimil Babka
2017-07-14 14:26   ` Reza Arbab
-- strict thread matches above, loose matches on Subject: below --
2017-06-29  7:35 [RFC PATCH 0/2] mm, memory_hotplug: remove zone onlining restriction Michal Hocko
2017-06-29  7:35 ` [PATCH 1/2] mm, memory_hotplug: display allowed zones in the preferred ordering Michal Hocko
2017-06-30  0:45   ` Joonsoo Kim
2017-07-07 14:34   ` Vlastimil Babka