linux-mm.kvack.org archive mirror
* [PATCH RFC 0/1] mm: balancing the node zones occupancy
@ 2021-02-18 17:24 Charan Teja Reddy
  2021-02-18 17:24 ` [PATCH 1/1] mm: proof-of-concept for balancing " Charan Teja Reddy
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Charan Teja Reddy @ 2021-02-18 17:24 UTC (permalink / raw)
  To: akpm, rientjes, vbabka, mhocko, david, mgorman, linux-mm
  Cc: vinmenon, sudaraja, linux-kernel, Charan Teja Reddy

I would like to start a discussion about balancing the occupancy of
memory zones within a node when the imbalance is caused by migrating
pages to other zones during hot-remove and then hot-adding the same
memory. In this case the newly hot-added memory has a lot of free space
that can be filled by the previously migrated pages (moved out as part
of offline/hot-remove), which may relieve some pressure in the other
zones of the node.

Say a system has two zones (Normal and Movable), where the Normal zone
has been almost filled up with pages from the Movable zone as part of
an offline operation, and we then hot-add that memory back as a Movable
zone. At this moment the Movable zone is almost empty and the Normal
zone is almost full (with the pages migrated during the previous
offline/hot-remove operation). Allocation requests from the Normal zone
may now have to go through reclaim cycles, causing some pressure. This
problem is quite common on systems that aggressively offline and online
memory blocks: the offline part migrates pages to the lower zones, the
online part does not reverse that migration, and as a result the
offline operation may contribute to pressure in the other zones of the
system.

To overcome this problem, after onlining the memory blocks we can do
the reverse of what the offline operation did, i.e. **try to
reverse-migrate the pages that were moved to other zones as part of the
offline operation**. This may relieve some of the pressure built up in
the other zones by the offline operation. Since we are migrating the
pages back, we could call this "reverse migration", or, since we are
really balancing the occupancy of the zones by migrating pages,
"balancing the node zones occupancy", or any other name...

We have tried the proof-of-concept code on Snapdragon systems with the
following configuration: a single memory node with just two zones, a
6GB Normal zone and a 2GB Movable zone. The Movable zone is hot-added
once and thereafter offlined/onlined based on need.

We ran the unit test below to evaluate this:
1) Completely fill up the already hot-added Movable zone with anon
pages.

2) Offline this hot-added Movable zone. At this point there is no
pressure in the Normal zone yet, but the pages migrated out of the
Movable zone can contribute to it later.

3) Now fill up the Normal zone until 300MB or less is left free in
the system.

4) Now online the Movable zone memory blocks again.

5) Run a test that allocates 512MB of memory from the Normal zone,
trying higher-order pages first and gradually falling back to lower
orders. I borrowed this pattern from the ION system heap, which tries
to allocate memory in the available orders 9, 4 and 0 (see the sketch
after this list).

6) Repeat the above steps for 10 iterations; the results below are the
averages.
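
For reference, here is a minimal sketch of the order-fallback
allocation pattern used in step 5, modeled on the ION system heap's
behaviour. The helper name and gfp flags are illustrative assumptions,
not taken from the actual test code:

#include <linux/gfp.h>
#include <linux/kernel.h>
#include <linux/mm.h>

static const unsigned int orders[] = { 9, 4, 0 };

/*
 * Try the largest order that still fits within the remaining request,
 * falling back to smaller orders when higher-order allocation fails.
 */
static struct page *alloc_largest_available(unsigned long size_remaining)
{
	struct page *page;
	int i;

	for (i = 0; i < ARRAY_SIZE(orders); i++) {
		if (size_remaining < (PAGE_SIZE << orders[i]))
			continue;
		page = alloc_pages(GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
				   orders[i]);
		if (page)
			return page;
	}
	return NULL;
}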

We collected the time it takes to complete the test and the
distribution of anon pages (the pages participating in the test) across
the node's zones.
a) Without reverse migration, the test took an average of around
145msec to complete.
b) With reverse migration, the test took an average of around 120msec
to complete.

For the distribution of anon pages, we collected the anon pages left
in each zone before and after the test:
--------------------------------------|------------------------------
                 Base                 |      Reverse Migration
--------------------------------------|------------------------------
            Before test     After test|  Before test     After test
Normal zone (Anon)                    |
  Active         499825        45756  |     481441        203124
  Inactive        46008       446687  |      51553         58602
  Free            80350       224252  |      84877      **440586**
Movable zone (Anon)                   |
  Active           2224         2626  |       2239      **484622**
  Inactive            8            8  |          9          7663
--------------------------------------|------------------------------

The above table shows that in the base case (left column) a lot of
anon pages remain in the Normal zone that could be migrated back to the
almost completely free Movable zone, which would free up space in the
Normal zone. With reverse migration (right column), the anon pages are
distributed much more evenly across the zones and the Normal zone is
left with far more free memory.

The code demonstrates the PoC assuming just two zones (Normal and
Movable) on a single node. The number of pages to reverse-migrate is
written to the sysctl interface from userspace, which monitors the
memory pressure events in the system.
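
For illustration, a rough sketch of that userspace side is shown below.
Only the sysctl path comes from the patch; the use of PSI, the trigger
thresholds and the 64MB request size are assumptions of mine:

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* Arm a PSI memory-pressure trigger: 100ms stall in a 1s window. */
	const char *trig = "some 100000 1000000";
	char buf[32];
	int psi = open("/proc/pressure/memory", O_RDWR);

	if (psi < 0 || write(psi, trig, strlen(trig) + 1) < 0) {
		perror("psi trigger");
		return 1;
	}

	for (;;) {
		struct pollfd pfd = { .fd = psi, .events = POLLPRI };

		if (poll(&pfd, 1, -1) < 0)
			break;

		/* On pressure, ask for 64MB worth of 4K pages to be moved. */
		int ctl = open("/proc/sys/vm/balance_node_occupancy_pages",
			       O_WRONLY);
		if (ctl >= 0) {
			snprintf(buf, sizeof(buf), "%d", 16384);
			write(ctl, buf, strlen(buf));
			close(ctl);
		}
	}
	return 0;
}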

Charan Teja Reddy (1):
  mm: proof-of-concept for balancing node zones occupancy

 include/linux/migrate.h |   8 +-
 include/linux/mm.h      |   3 +
 include/linux/mmzone.h  |   2 +
 kernel/sysctl.c         |  11 ++
 mm/compaction.c         |   4 +-
 mm/memory_hotplug.c     | 265 ++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 290 insertions(+), 3 deletions(-)

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation




* [PATCH 1/1] mm: proof-of-concept for balancing node zones occupancy
  2021-02-18 17:24 [PATCH RFC 0/1] mm: balancing the node zones occupancy Charan Teja Reddy
@ 2021-02-18 17:24 ` Charan Teja Reddy
  2021-02-18 18:16 ` [PATCH RFC 0/1] mm: balancing the " David Hildenbrand
  2021-02-19 11:26 ` Vlastimil Babka
  2 siblings, 0 replies; 6+ messages in thread
From: Charan Teja Reddy @ 2021-02-18 17:24 UTC (permalink / raw)
  To: akpm, rientjes, vbabka, mhocko, david, mgorman, linux-mm
  Cc: vinmenon, sudaraja, linux-kernel, Charan Teja Reddy

This is proof-of-concept code for balancing the occupancy of a node's
zones when the imbalance is caused by memory hotplug.

Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
---
 include/linux/migrate.h |   8 +-
 include/linux/mm.h      |   3 +
 include/linux/mmzone.h  |   2 +
 kernel/sysctl.c         |  11 ++
 mm/compaction.c         |   4 +-
 mm/memory_hotplug.c     | 265 ++++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 290 insertions(+), 3 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 4594838..b7dc259 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -53,6 +53,8 @@ extern int migrate_huge_page_move_mapping(struct address_space *mapping,
 				  struct page *newpage, struct page *page);
 extern int migrate_page_move_mapping(struct address_space *mapping,
 		struct page *newpage, struct page *page, int extra_count);
+extern void split_map_pages(struct list_head *list);
+extern unsigned long release_freepages(struct list_head *freelist);
 #else
 
 static inline void putback_movable_pages(struct list_head *l) {}
@@ -81,7 +83,11 @@ static inline int migrate_huge_page_move_mapping(struct address_space *mapping,
 {
 	return -ENOSYS;
 }
-
+static inline void split_map_pages(struct list_head *list) { }
+static inline unsigned long release_freepages(struct list_head *freelist)
+{
+	return 0;
+}
 #endif /* CONFIG_MIGRATION */
 
 #ifdef CONFIG_COMPACTION
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecdf8a8..1014139 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2465,6 +2465,9 @@ extern int watermark_boost_factor;
 extern int watermark_scale_factor;
 extern bool arch_has_descending_max_zone_pfns(void);
 
+/* memory_hotplug.c */
+extern int balance_node_occupancy_pages;
+
 /* nommu.c */
 extern atomic_long_t mmap_pages_allocated;
 extern int nommu_shrink_inode_mappings(struct inode *, size_t, size_t);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b593316..ce417c3 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -977,6 +977,8 @@ int sysctl_min_slab_ratio_sysctl_handler(struct ctl_table *, int,
 		void *, size_t *, loff_t *);
 int numa_zonelist_order_handler(struct ctl_table *, int,
 		void *, size_t *, loff_t *);
+extern int sysctl_balance_node_occupancy_handler(struct ctl_table *tbl,
+		int write, void *buf, size_t *len, loff_t *pos);
 extern int percpu_pagelist_fraction;
 extern char numa_zonelist_order[];
 #define NUMA_ZONELIST_ORDER_LEN	16
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c9fbdd8..4b95a90 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -3140,6 +3140,17 @@ static struct ctl_table vm_table[] = {
 		.extra2		= SYSCTL_ONE,
 	},
 #endif
+#ifdef CONFIG_MEMORY_HOTPLUG
+	{
+		.procname	= "balance_node_occupancy_pages",
+		.data		= &balance_node_occupancy_pages,
+		.maxlen		= sizeof(balance_node_occupancy_pages),
+		.mode		= 0200,
+		.proc_handler	= sysctl_balance_node_occupancy_handler,
+		.extra1		= SYSCTL_ZERO,
+	},
+
+#endif
 	{ }
 };
 
diff --git a/mm/compaction.c b/mm/compaction.c
index 190ccda..da3c015 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -68,7 +68,7 @@ static const unsigned int HPAGE_FRAG_CHECK_INTERVAL_MSEC = 500;
 #define COMPACTION_HPAGE_ORDER	(PMD_SHIFT - PAGE_SHIFT)
 #endif
 
-static unsigned long release_freepages(struct list_head *freelist)
+unsigned long release_freepages(struct list_head *freelist)
 {
 	struct page *page, *next;
 	unsigned long high_pfn = 0;
@@ -84,7 +84,7 @@ static unsigned long release_freepages(struct list_head *freelist)
 	return high_pfn;
 }
 
-static void split_map_pages(struct list_head *list)
+void split_map_pages(struct list_head *list)
 {
 	unsigned int i, order, nr_pages;
 	struct page *page, *next;
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f9d57b9..2780c91 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -97,6 +97,271 @@ void mem_hotplug_done(void)
 
 u64 max_mem_size = U64_MAX;
 
+int balance_node_occupancy_pages;
+static atomic_t target_migrate_pages = ATOMIC_INIT(0);
+
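+/*
+ * Per-pass control state for reverse migration: 'freepages' holds free
+ * pages isolated from the Movable zone (scanned over [start_pfn, end_pfn))
+ * that serve as migration targets, 'nr_migrate_pages' counts the anon
+ * pages isolated from the Normal zone in the current batch, 'target' is
+ * the remaining number of pages the user asked to move, and 'limit' is
+ * the Movable zone's managed page count.
+ */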
+struct movable_zone_fill_control {
+	struct list_head freepages;
+	unsigned long start_pfn;
+	unsigned long end_pfn;
+	unsigned long nr_migrate_pages;
+	unsigned long nr_free_pages;
+	unsigned long limit;
+	int target;
+	struct zone *zone;
+};
+
+static void fill_movable_zone_fn(struct work_struct *work);
+static DECLARE_WORK(fill_movable_zone_work, fill_movable_zone_fn);
+static DEFINE_MUTEX(page_migrate_lock);
+
+static inline void reset_page_order(struct page *page)
+{
+	__ClearPageBuddy(page);
+	set_page_private(page, 0);
+}
+
+static int isolate_free_page(struct page *page, unsigned int order)
+{
+	struct zone *zone;
+
+	zone = page_zone(page);
+	list_del(&page->lru);
+	zone->free_area[order].nr_free--;
+	reset_page_order(page);
+
+	return 1UL << order;
+}
+
+static void isolate_free_pages(struct movable_zone_fill_control *fc)
+{
+	struct page *page;
+	unsigned long flags;
+	unsigned int order;
+	unsigned long start_pfn = fc->start_pfn;
+	unsigned long end_pfn = fc->end_pfn;
+
+	spin_lock_irqsave(&fc->zone->lock, flags);
+	for (; start_pfn < end_pfn; start_pfn++) {
+		unsigned long isolated;
+
+		if (!pfn_valid(start_pfn))
+			continue;
+
+		page = pfn_to_page(start_pfn);
+		if (!page)
+			continue;
+
+		if (PageCompound(page)) {
+			struct page *head = compound_head(page);
+			int skip;
+
+			skip = (1 << compound_order(head)) - (page - head);
+			start_pfn += skip - 1;
+			continue;
+		}
+
+		if (!PageBuddy(page))
+			continue;
+
+		order = page_private(page);
+		isolated = isolate_free_page(page, order);
+		set_page_private(page, order);
+		list_add_tail(&page->lru, &fc->freepages);
+		fc->nr_free_pages += isolated;
+		__mod_zone_page_state(fc->zone, NR_FREE_PAGES, -isolated);
+		start_pfn += isolated - 1;
+
+		/*
+		 * Make sure that the zone->lock is not held for long by
+		 * returning once we have SWAP_CLUSTER_MAX pages in the
+		 * free list for migration.
+		 */
+		if (fc->nr_free_pages >= SWAP_CLUSTER_MAX)
+			break;
+	}
+	fc->start_pfn = start_pfn + 1;
+	spin_unlock_irqrestore(&fc->zone->lock, flags);
+
+	split_map_pages(&fc->freepages);
+}
+
+static struct page *movable_page_alloc(struct page *page, unsigned long data)
+{
+	struct movable_zone_fill_control *fc;
+	struct page *freepage;
+
+	fc = (struct movable_zone_fill_control *)data;
+	if (list_empty(&fc->freepages)) {
+		isolate_free_pages(fc);
+		if (list_empty(&fc->freepages))
+			return NULL;
+	}
+
+	freepage = list_entry(fc->freepages.next, struct page, lru);
+	list_del(&freepage->lru);
+	fc->nr_free_pages--;
+
+	return freepage;
+}
+
+static void movable_page_free(struct page *page, unsigned long data)
+{
+	struct movable_zone_fill_control *fc;
+
+	fc = (struct movable_zone_fill_control *)data;
+	list_add(&page->lru, &fc->freepages);
+	fc->nr_free_pages++;
+}
+
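+/*
+ * Scan Normal-zone PFNs in [start_pfn, end_pfn) and isolate up to
+ * min(fc->target, pageblock_nr_pages) anon LRU pages onto 'list' for
+ * migration to the Movable zone.  Returns the PFN at which the scan
+ * stopped so the caller can resume from there.
+ */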
+static unsigned long get_anon_movable_pages(
+			struct movable_zone_fill_control *fc,
+			unsigned long start_pfn,
+			unsigned long end_pfn, struct list_head *list)
+{
+	int found = 0, ret;
+	unsigned long pfn;
+	int limit = min_t(int, fc->target, (int)pageblock_nr_pages);
+
+	fc->nr_migrate_pages = 0;
+	for (pfn = start_pfn; pfn < end_pfn && found < limit; ++pfn) {
+		struct page *page;
+
+		/* Validate the pfn before touching its struct page. */
+		if (!pfn_valid(pfn))
+			continue;
+
+		page = pfn_to_page(pfn);
+
+		if (PageCompound(page)) {
+			struct page *head = compound_head(page);
+			int skip;
+
+			skip = (1 << compound_order(head)) - (page - head);
+			pfn += skip - 1;
+			continue;
+		}
+
+		if (PageBuddy(page)) {
+			unsigned long freepage_order;
+
+			freepage_order = READ_ONCE(page_private(page));
+			if (freepage_order > 0 && freepage_order < MAX_ORDER)
+				pfn += (1 << page_private(page)) - 1;
+			continue;
+		}
+
+		if (!PageLRU(page) || !PageAnon(page))
+			continue;
+
+		if (!get_page_unless_zero(page))
+			continue;
+
+		found++;
+		ret = isolate_lru_page(page);
+		if (!ret) {
+			list_add_tail(&page->lru, list);
+			inc_node_page_state(page, NR_ISOLATED_ANON +
+					page_is_file_lru(page));
+			++fc->nr_migrate_pages;
+		}
+
+		put_page(page);
+	}
+
+	return pfn;
+}
+
+static void prepare_fc(struct movable_zone_fill_control *fc)
+{
+	struct zone *zone;
+
+	zone = &(NODE_DATA(0)->node_zones[ZONE_MOVABLE]);
+	fc->zone = zone;
+	fc->start_pfn = zone->zone_start_pfn;
+	fc->end_pfn = zone_end_pfn(zone);
+	fc->limit = atomic64_read(&zone->managed_pages);
+	INIT_LIST_HEAD(&fc->freepages);
+}
+
+#define MIGRATE_TIMEOUT_SEC (20)
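+/*
+ * Worker that performs the reverse migration: repeatedly isolate anon
+ * pages from the Normal zone and migrate them into free pages taken
+ * from the Movable zone, until the user-requested target is met, the
+ * Movable zone would drop below its high watermark, the scan reaches
+ * the end of the Normal zone, or MIGRATE_TIMEOUT_SEC elapses.
+ */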
+static void fill_movable_zone_fn(struct work_struct *work)
+{
+	unsigned long start_pfn, end_pfn;
+	unsigned long movable_highmark;
+	struct zone *normal_zone = &(NODE_DATA(0)->node_zones[ZONE_NORMAL]);
+	struct zone *movable_zone = &(NODE_DATA(0)->node_zones[ZONE_MOVABLE]);
+	LIST_HEAD(source);
+	int ret, free;
+	struct movable_zone_fill_control fc = { {0} };
+	unsigned long timeout = MIGRATE_TIMEOUT_SEC * HZ, expire;
+
+	start_pfn = normal_zone->zone_start_pfn;
+	end_pfn = zone_end_pfn(normal_zone);
+	movable_highmark = high_wmark_pages(movable_zone);
+
+	lru_add_drain_all();
+	drain_all_pages(normal_zone);
+	if (!mutex_trylock(&page_migrate_lock))
+		return;
+	prepare_fc(&fc);
+	if (!fc.limit)
+		goto out;
+	expire = jiffies + timeout;
+restart:
+	fc.target = atomic_xchg(&target_migrate_pages, 0);
+	if (!fc.target)
+		goto out;
+repeat:
+	cond_resched();
+	if (time_after(jiffies, expire))
+		goto out;
+	free = zone_page_state(movable_zone, NR_FREE_PAGES);
+	if (free - fc.target <= movable_highmark)
+		fc.target = free - movable_highmark;
+	if (fc.target <= 0)
+		goto out;
+
+	start_pfn = get_anon_movable_pages(&fc, start_pfn, end_pfn, &source);
+	if (list_empty(&source) && start_pfn < end_pfn)
+		goto repeat;
+
+	ret = migrate_pages(&source, movable_page_alloc, movable_page_free,
+			(unsigned long) &fc,
+			MIGRATE_ASYNC, MR_MEMORY_HOTPLUG);
+	if (ret)
+		putback_movable_pages(&source);
+
+	fc.target -= fc.nr_migrate_pages;
+	if (ret == -ENOMEM || start_pfn >= end_pfn)
+		goto out;
+	else if (fc.target <= 0)
+		goto restart;
+
+	goto repeat;
+out:
+	mutex_unlock(&page_migrate_lock);
+	if (fc.nr_free_pages > 0)
+		release_freepages(&fc.freepages);
+}
+
+int sysctl_balance_node_occupancy_handler(struct ctl_table *table, int write,
+		void *buffer, size_t *length, loff_t *ppos)
+{
+	int rc;
+
+	rc = proc_dointvec_minmax(table, write, buffer, length, ppos);
+	if (rc)
+		return rc;
+
+	if (write) {
+		atomic_add(balance_node_occupancy_pages, &target_migrate_pages);
+
+		if (!work_pending(&fill_movable_zone_work))
+			queue_work(system_unbound_wq, &fill_movable_zone_work);
+	}
+
+	return 0;
+}
+
 /* add this memory to iomem resource */
 static struct resource *register_memory_resource(u64 start, u64 size,
 						 const char *resource_name)
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation




* Re: [PATCH RFC 0/1] mm: balancing the node zones occupancy
  2021-02-18 17:24 [PATCH RFC 0/1] mm: balancing the node zones occupancy Charan Teja Reddy
  2021-02-18 17:24 ` [PATCH 1/1] mm: proof-of-concept for balancing " Charan Teja Reddy
@ 2021-02-18 18:16 ` David Hildenbrand
  2021-02-23 12:30   ` Charan Teja Kalla
  2021-02-19 11:26 ` Vlastimil Babka
  2 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand @ 2021-02-18 18:16 UTC (permalink / raw)
  To: Charan Teja Reddy, akpm, rientjes, vbabka, mhocko, mgorman, linux-mm
  Cc: vinmenon, sudaraja, linux-kernel

On 18.02.21 18:24, Charan Teja Reddy wrote:
> I would like to start a discussion about balancing the occupancy of
> memory zones within a node when the imbalance is caused by migrating
> pages to other zones during hot-remove and then hot-adding the same
> memory. In this case the newly hot-added memory has a lot of free space
> that can be filled by the previously migrated pages (moved out as part
> of offline/hot-remove), which may relieve some pressure in the other
> zones of the node.

Why is this specific to memory hot(un)plug? I think the problem is more 
generic:

Assume

1. Application 1 allocates a lot of memory and gets ZONE_MOVABLE.
2. Application 2 allocates a lot of memory and gets ZONE_NORMAL.
3. Application 1 quits.

Same problem, no?

-- 
Thanks,

David / dhildenb




* Re: [PATCH RFC 0/1] mm: balancing the node zones occupancy
  2021-02-18 17:24 [PATCH RFC 0/1] mm: balancing the node zones occupancy Charan Teja Reddy
  2021-02-18 17:24 ` [PATCH 1/1] mm: proof-of-concept for balancing " Charan Teja Reddy
  2021-02-18 18:16 ` [PATCH RFC 0/1] mm: balancing the " David Hildenbrand
@ 2021-02-19 11:26 ` Vlastimil Babka
  2021-02-23 13:45   ` Charan Teja Kalla
  2 siblings, 1 reply; 6+ messages in thread
From: Vlastimil Babka @ 2021-02-19 11:26 UTC (permalink / raw)
  To: Charan Teja Reddy, akpm, rientjes, mhocko, david, mgorman, linux-mm
  Cc: vinmenon, sudaraja, linux-kernel, Dave Hansen

On 2/18/21 6:24 PM, Charan Teja Reddy wrote:
> I would like to start a discussion about balancing the occupancy of
> memory zones within a node when the imbalance is caused by migrating
> pages to other zones during hot-remove and then hot-adding the same
> memory. In this case the newly hot-added memory has a lot of free space
> that can be filled by the previously migrated pages (moved out as part
> of offline/hot-remove), which may relieve some pressure in the other
> zones of the node.

Can you share the use case for doing this? If it's to replace a failed RAM, then
it's probably extremely rare, right.

> We have tried the proof-of-concept code on Snapdragon systems with the
> following configuration: a single memory node with just two zones, a
> 6GB Normal zone and a 2GB Movable zone. The Movable zone is hot-added
> once and thereafter offlined/onlined based on need.

Hm, snapdragon... so is this some kind of power saving thing?

Anyway, shouldn't auto NUMA balancing help here, and especially "Migrate Pages in
lieu of discard" (CC'd Dave) as a generic mechanism, so we wouldn't need to have
hotplug-specific actions?




* Re: [PATCH RFC 0/1] mm: balancing the node zones occupancy
  2021-02-18 18:16 ` [PATCH RFC 0/1] mm: balancing the " David Hildenbrand
@ 2021-02-23 12:30   ` Charan Teja Kalla
  0 siblings, 0 replies; 6+ messages in thread
From: Charan Teja Kalla @ 2021-02-23 12:30 UTC (permalink / raw)
  To: David Hildenbrand, akpm, rientjes, vbabka, mhocko, mgorman, linux-mm
  Cc: vinmenon, sudaraja, linux-kernel


Thanks David for the review comments!!

On 2/18/2021 11:46 PM, David Hildenbrand wrote:
>> I would like to start a discussion about balancing the occupancy of
>> memory zones within a node when the imbalance is caused by migrating
>> pages to other zones during hot-remove and then hot-adding the same
>> memory. In this case the newly hot-added memory has a lot of free space
>> that can be filled by the previously migrated pages (moved out as part
>> of offline/hot-remove), which may relieve some pressure in the other
>> zones of the node.
> 
> Why is this specific to memory hot(un)plug? I think the problem is more
> generic:
> 
> Assume
> 
> 1. Application 1 allocates a lot of memory and gets ZONE_MOVABLE.
> 2. Application 2 allocates a lot of memory and gets ZONE_NORMAL.
> 3. Application 1 quits.
> 
> Same problem, no?

Thanks for simplifying this problem. Yeah, this looks like a more
generic problem. But for this type of problem the user/system
administrator has a clear view of the state of the system and thus may
need to take some action to keep the node's zones balanced, e.g. a
change like this one that migrates the eligible pages to other zones.

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project



* Re: [PATCH RFC 0/1] mm: balancing the node zones occupancy
  2021-02-19 11:26 ` Vlastimil Babka
@ 2021-02-23 13:45   ` Charan Teja Kalla
  0 siblings, 0 replies; 6+ messages in thread
From: Charan Teja Kalla @ 2021-02-23 13:45 UTC (permalink / raw)
  To: Vlastimil Babka, akpm, rientjes, mhocko, david, mgorman, linux-mm
  Cc: vinmenon, sudaraja, linux-kernel, Dave Hansen

Thanks Vlastimil for the review comments!!

On 2/19/2021 4:56 PM, Vlastimil Babka wrote:
> Can you share the use case for doing this? If it's to replace a failed RAM, then
> it's probably extremely rare, right.
> 
>> We have tried the proof-of-concept code on Snapdragon systems with the
>> following configuration: a single memory node with just two zones, a
>> 6GB Normal zone and a 2GB Movable zone. The Movable zone is hot-added
>> once and thereafter offlined/onlined based on need.
> Hm, snapdragon... so is this some kind of power saving thing?
> 

You are correct. This is the power-saving use case, where the user
offlines and onlines memory blocks in the system. It is not about
failed RAM.

> Anyway, shouldn't auto NUMA balancing help here, and especially "Migrate Pages in
> lieu of discard" (CC'd Dave) as a generic mechanism,

On Snapdragon systems we have only a single memory node with Normal and
Movable zones. And my understanding is that most embedded systems will
have just a single memory node.

My limited understanding of auto NUMA balancing is that it needs at
least two nodes to trigger. Please correct me if I am wrong here. If I
am correct, then that approach is not suitable for us. Moreover, the
idea I would like to convey in this RFC patch is about __balancing the
zones within a node, not across NUMA nodes__.

> so we wouldn't need to have hotplug-specific actions?
David has given a very simple view of this problem which has nothing to
do with hotplug-specific actions.

With just 2 zones(Normal and Movable) in a single node in the system,

1. Application 1 allocates a lot of memory and gets ZONE_MOVABLE.
2. Application 2 allocates a lot of memory and gets ZONE_NORMAL.
3. Application 1 quits.

Then after step 3, we can expect a lot of free memory in the Movable
zone while the Normal zone is under pressure. Applying semantics
similar to auto NUMA balancing ("Migrate pages in lieu of
swap/discard"), we could migrate some eligible pages of Application 2
to the Movable zone and thereby relieve some pressure in the Normal
zone.

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project



Thread overview: 6+ messages
2021-02-18 17:24 [PATCH RFC 0/1] mm: balancing the node zones occupancy Charan Teja Reddy
2021-02-18 17:24 ` [PATCH 1/1] mm: proof-of-concept for balancing " Charan Teja Reddy
2021-02-18 18:16 ` [PATCH RFC 0/1] mm: balancing the " David Hildenbrand
2021-02-23 12:30   ` Charan Teja Kalla
2021-02-19 11:26 ` Vlastimil Babka
2021-02-23 13:45   ` Charan Teja Kalla
