linux-mm.kvack.org archive mirror
* [PATCH 0/4] show_mem updates
@ 2017-01-12 13:16 Michal Hocko
  2017-01-12 13:16 ` [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem Michal Hocko
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Michal Hocko @ 2017-01-12 13:16 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes,
	Chris Metcalf, David S. Miller, Fenghua Yu, Guan Xuetao,
	Helge Deller, James E.J. Bottomley, Michal Hocko, Tony Luck

Hi,
this is a mixture of one bug fix (patch 1), an enhancement (patch 2)
and cleanups (the rest of the series). The first two patches should be
really straightforward. Patch 3 removes some arch specific show_mem
implementations because I think they are quite outdated and no longer
serve any useful purpose. I might be missing something, which is why
this patch is an RFC. I think we should really strive to have a
consistent show_mem output regardless of the architecture. If some
architecture is really special and wants to dump something additional,
we should do that via an arch specific hook.

The last patch adds a nodemask parameter so that we do not rely on
the hardcoded mems_allowed of the current task when doing the node
filtering. I consider this more of a cleanup than a fix because basically
all users use a nodemask which is a subset of mems_allowed. There is
only one call path in the memory hotplug code which doesn't comply with
this, but that is hardly something to worry about.

Thoughts, comments?

Michal Hocko (4):
      mm, page_alloc: do not report all nodes in show_mem
      mm, page_alloc: warn_alloc print nodemask
      arch, mm: remove arch specific show_mem
      lib/show_mem.c: teach show_mem to work with the given nodemask

 arch/ia64/mm/init.c                 | 48 ------------------------------------
 arch/parisc/mm/init.c               | 49 -------------------------------------
 arch/powerpc/xmon/xmon.c            |  2 +-
 arch/sparc/kernel/setup_32.c        |  2 +-
 arch/sparc/mm/init_32.c             | 11 ---------
 arch/tile/mm/pgtable.c              | 45 ----------------------------------
 arch/unicore32/mm/init.c            | 44 ---------------------------------
 drivers/net/ethernet/sgi/ioc3-eth.c |  2 +-
 drivers/tty/sysrq.c                 |  2 +-
 drivers/tty/vt/keyboard.c           |  2 +-
 include/linux/mm.h                  |  9 +++----
 lib/show_mem.c                      |  4 +--
 mm/nommu.c                          |  6 ++---
 mm/oom_kill.c                       |  2 +-
 mm/page_alloc.c                     | 48 +++++++++++++++++++-----------------
 mm/vmalloc.c                        |  4 +--
 16 files changed, 43 insertions(+), 237 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem
  2017-01-12 13:16 [PATCH 0/4] show_mem updates Michal Hocko
@ 2017-01-12 13:16 ` Michal Hocko
  2017-01-12 13:47   ` Mel Gorman
  2017-01-14 16:26   ` Johannes Weiner
  2017-01-12 13:16 ` [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask Michal Hocko
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Michal Hocko @ 2017-01-12 13:16 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

599d0c954f91 ("mm, vmscan: move LRU lists to node") added per NUMA
node statistics to show_mem but forgot to add a skip_free_areas_node
check to filter out nodes which are outside of the allocating task's
NUMA policy. Add this check so that the output is not polluted with
pointless information.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/page_alloc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8ff25883c172..8f4f306d804c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4345,6 +4345,9 @@ void show_free_areas(unsigned int filter)
 		global_page_state(NR_FREE_CMA_PAGES));
 
 	for_each_online_pgdat(pgdat) {
+		if (skip_free_areas_node(filter, pgdat->node_id))
+			continue;
+
 		printk("Node %d"
 			" active_anon:%lukB"
 			" inactive_anon:%lukB"
-- 
2.11.0

* [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask
  2017-01-12 13:16 [PATCH 0/4] show_mem updates Michal Hocko
  2017-01-12 13:16 ` [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem Michal Hocko
@ 2017-01-12 13:16 ` Michal Hocko
  2017-01-12 13:47   ` Mel Gorman
  2017-01-13 11:31   ` Vlastimil Babka
  2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
  2017-01-12 13:16 ` [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask Michal Hocko
  3 siblings, 2 replies; 20+ messages in thread
From: Michal Hocko @ 2017-01-12 13:16 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

warn_alloc is currently used to report an allocation failure or an
allocation stall. We print some details of the allocation request like
the gfp mask and the request order. We do not print the allocation
nodemask, which is also important when debugging the reason for the
allocation failure. We already print the nodemask in the OOM report.

Add nodemask to warn_alloc and print it in warn_alloc as well.
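
For readers unfamiliar with the list-style bitmap output, the following is
a rough userspace sketch of what a range-list rendering of a nodemask looks
like. It is not the kernel's formatter: format_node_list() is a made-up
helper, and a single unsigned long stands in for nodemask_t.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel function): write the set bits of
 * mask into buf as a comma separated list of ranges, e.g. "0-1,3".
 * Returns the number of characters written, not counting the NUL.
 * Formatting stops early if buf is too small for the next range.
 */
static int format_node_list(unsigned long mask, char *buf, size_t len)
{
	const int nbits = (int)(sizeof(mask) * CHAR_BIT);
	size_t pos = 0;
	int bit = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	while (bit < nbits) {
		int start, end, n;

		if (!((mask >> bit) & 1UL)) {
			bit++;
			continue;
		}

		/* Found a set bit: extend the run as far as it goes. */
		start = bit;
		while (bit < nbits && ((mask >> bit) & 1UL))
			bit++;
		end = bit - 1;

		if (start == end)
			n = snprintf(buf + pos, len - pos, "%s%d",
				     pos ? "," : "", start);
		else
			n = snprintf(buf + pos, len - pos, "%s%d-%d",
				     pos ? "," : "", start, end);
		if (n < 0 || (size_t)n >= len - pos)
			break;	/* output truncated, give up */
		pos += (size_t)n;
	}
	return (int)pos;
}
```

With nodes 0, 1 and 3 set (mask 0xb) the helper produces "0-1,3", which is
the kind of value one would expect to see after "nodemask=" in the warning.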

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/linux/mm.h | 4 ++--
 mm/page_alloc.c    | 9 +++++----
 mm/vmalloc.c       | 4 ++--
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 57dc3c3b53c1..3e35eb04a28a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1912,8 +1912,8 @@ extern void si_meminfo_node(struct sysinfo *val, int nid);
 extern unsigned long arch_reserved_kernel_pages(void);
 #endif
 
-extern __printf(2, 3)
-void warn_alloc(gfp_t gfp_mask, const char *fmt, ...);
+extern __printf(3, 4)
+void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...);
 
 extern void setup_per_cpu_pageset(void);
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8f4f306d804c..0a9805a696bb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3031,12 +3031,13 @@ static void warn_alloc_show_mem(gfp_t gfp_mask)
 	show_mem(filter);
 }
 
-void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
+void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 {
 	struct va_format vaf;
 	va_list args;
 	static DEFINE_RATELIMIT_STATE(nopage_rs, DEFAULT_RATELIMIT_INTERVAL,
 				      DEFAULT_RATELIMIT_BURST);
+	nodemask_t *nm = (nodemask) ? nodemask : &cpuset_current_mems_allowed;
 
 	if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs) ||
 	    debug_guardpage_minorder() > 0)
@@ -3050,7 +3051,7 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
 	pr_cont("%pV", &vaf);
 	va_end(args);
 
-	pr_cont(", mode:%#x(%pGg)\n", gfp_mask, &gfp_mask);
+	pr_cont(", mode:%#x(%pGg), nodemask=%*pbl\n", gfp_mask, &gfp_mask, nodemask_pr_args(nm));
 
 	dump_stack();
 	warn_alloc_show_mem(gfp_mask);
@@ -3709,7 +3710,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 
 	/* Make sure we know about allocations which stall for too long */
 	if (time_after(jiffies, alloc_start + stall_timeout)) {
-		warn_alloc(gfp_mask,
+		warn_alloc(gfp_mask, ac->nodemask,
 			"page allocation stalls for %ums, order:%u",
 			jiffies_to_msecs(jiffies-alloc_start), order);
 		stall_timeout += 10 * HZ;
@@ -3743,7 +3744,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	}
 
 nopage:
-	warn_alloc(gfp_mask,
+	warn_alloc(gfp_mask, ac->nodemask,
 			"page allocation failure: order:%u", order);
 got_pg:
 	return page;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b9999fc44aa6..0600bbbd1080 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1662,7 +1662,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	return area->addr;
 
 fail:
-	warn_alloc(gfp_mask,
+	warn_alloc(gfp_mask, NULL,
 			  "vmalloc: allocation failure, allocated %ld of %ld bytes",
 			  (area->nr_pages*PAGE_SIZE), area->size);
 	vfree(area->addr);
@@ -1724,7 +1724,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
 	return addr;
 
 fail:
-	warn_alloc(gfp_mask,
+	warn_alloc(gfp_mask, NULL,
 			  "vmalloc: allocation failure: %lu bytes", real_size);
 	return NULL;
 }
-- 
2.11.0

* [RFC PATCH 3/4] arch, mm: remove arch specific show_mem
  2017-01-12 13:16 [PATCH 0/4] show_mem updates Michal Hocko
  2017-01-12 13:16 ` [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem Michal Hocko
  2017-01-12 13:16 ` [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask Michal Hocko
@ 2017-01-12 13:16 ` Michal Hocko
  2017-01-12 13:48   ` Mel Gorman
                     ` (4 more replies)
  2017-01-12 13:16 ` [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask Michal Hocko
  3 siblings, 5 replies; 20+ messages in thread
From: Michal Hocko @ 2017-01-12 13:16 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes,
	Michal Hocko, Tony Luck, Fenghua Yu, James E.J. Bottomley,
	Helge Deller, David S. Miller, Chris Metcalf, Guan Xuetao,
	linux-ia64, linux-parisc

From: Michal Hocko <mhocko@suse.com>

We have had a generic implementation for quite some time already. If
there is any arch specific information to be printed then we should add
a callback called from the generic code rather than duplicate the whole
show_mem. The current code has resulted in code duplication and output
divergence, which is both confusing and adds maintenance costs. Let's
just get rid of this mess.
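
To make the callback idea a bit more concrete, here is a minimal userspace
sketch of how generic code could call an optional arch hook. Nothing like
this is part of the series; every name below (arch_show_mem_fn,
register_arch_show_mem, show_mem_generic, example_arch_show_mem) is
hypothetical and only illustrates the shape of such an interface.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hook type: an architecture registers at most one callback. */
typedef void (*arch_show_mem_fn)(FILE *out);

/* NULL means the architecture has nothing extra to report. */
static arch_show_mem_fn arch_show_mem_hook;

static void register_arch_show_mem(arch_show_mem_fn fn)
{
	arch_show_mem_hook = fn;
}

/*
 * Generic report: the common part is the same everywhere, the arch part
 * is appended only when a hook is registered. Returns the number of
 * sections printed so that callers can check what happened.
 */
static int show_mem_generic(FILE *out)
{
	int sections = 1;

	fprintf(out, "Mem-Info:\n");
	if (arch_show_mem_hook) {
		arch_show_mem_hook(out);
		sections++;
	}
	return sections;
}

/* Made-up arch callback used only for this illustration. */
static void example_arch_show_mem(FILE *out)
{
	fprintf(out, "arch: extra page table cache statistics\n");
}
```

The point of the design is that the common output stays identical across
architectures, and anything special is an addition rather than a
replacement of the whole function.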

Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/ia64/mm/init.c      | 48 -----------------------------------------------
 arch/parisc/mm/init.c    | 49 ------------------------------------------------
 arch/sparc/mm/init_32.c  | 11 -----------
 arch/tile/mm/pgtable.c   | 45 --------------------------------------------
 arch/unicore32/mm/init.c | 44 -------------------------------------------
 5 files changed, 197 deletions(-)

diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 1841ef69183d..46afc8d5ebfc 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -684,51 +684,3 @@ int arch_remove_memory(u64 start, u64 size)
 }
 #endif
 #endif
-
-/**
- * show_mem - give short summary of memory stats
- *
- * Shows a simple page count of reserved and used pages in the system.
- * For discontig machines, it does this on a per-pgdat basis.
- */
-void show_mem(unsigned int filter)
-{
-	int total_reserved = 0;
-	unsigned long total_present = 0;
-	pg_data_t *pgdat;
-
-	printk(KERN_INFO "Mem-info:\n");
-	show_free_areas(filter);
-	printk(KERN_INFO "Node memory in pages:\n");
-	for_each_online_pgdat(pgdat) {
-		unsigned long present;
-		unsigned long flags;
-		int reserved = 0;
-		int nid = pgdat->node_id;
-		int zoneid;
-
-		if (skip_free_areas_node(filter, nid))
-			continue;
-		pgdat_resize_lock(pgdat, &flags);
-
-		for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
-			struct zone *zone = &pgdat->node_zones[zoneid];
-			if (!populated_zone(zone))
-				continue;
-
-			reserved += zone->present_pages - zone->managed_pages;
-		}
-		present = pgdat->node_present_pages;
-
-		pgdat_resize_unlock(pgdat, &flags);
-		total_present += present;
-		total_reserved += reserved;
-		printk(KERN_INFO "Node %4d:  RAM: %11ld, rsvd: %8d, ",
-		       nid, present, reserved);
-	}
-	printk(KERN_INFO "%ld pages of RAM\n", total_present);
-	printk(KERN_INFO "%d reserved pages\n", total_reserved);
-	printk(KERN_INFO "Total of %ld pages in page table cache\n",
-	       quicklist_total_size());
-	printk(KERN_INFO "%ld free buffer pages\n", nr_free_buffer_pages());
-}
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index e02ada312be8..64bfdf636f39 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -653,55 +653,6 @@ void __init mem_init(void)
 unsigned long *empty_zero_page __read_mostly;
 EXPORT_SYMBOL(empty_zero_page);
 
-void show_mem(unsigned int filter)
-{
-	int total = 0,reserved = 0;
-	pg_data_t *pgdat;
-
-	printk(KERN_INFO "Mem-info:\n");
-	show_free_areas(filter);
-
-	for_each_online_pgdat(pgdat) {
-		unsigned long flags;
-		int zoneid;
-
-		pgdat_resize_lock(pgdat, &flags);
-		for (zoneid = 0; zoneid < MAX_NR_ZONES; zoneid++) {
-			struct zone *zone = &pgdat->node_zones[zoneid];
-			if (!populated_zone(zone))
-				continue;
-
-			total += zone->present_pages;
-			reserved = zone->present_pages - zone->managed_pages;
-		}
-		pgdat_resize_unlock(pgdat, &flags);
-	}
-
-	printk(KERN_INFO "%d pages of RAM\n", total);
-	printk(KERN_INFO "%d reserved pages\n", reserved);
-
-#ifdef CONFIG_DISCONTIGMEM
-	{
-		struct zonelist *zl;
-		int i, j;
-
-		for (i = 0; i < npmem_ranges; i++) {
-			zl = node_zonelist(i, 0);
-			for (j = 0; j < MAX_NR_ZONES; j++) {
-				struct zoneref *z;
-				struct zone *zone;
-
-				printk("Zone list for zone %d on node %d: ", j, i);
-				for_each_zone_zonelist(zone, z, zl, j)
-					printk("[%d/%s] ", zone_to_nid(zone),
-								zone->name);
-				printk("\n");
-			}
-		}
-	}
-#endif
-}
-
 /*
  * pagetable_init() sets up the page tables
  *
diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
index eb8287155279..c6afe98de4d9 100644
--- a/arch/sparc/mm/init_32.c
+++ b/arch/sparc/mm/init_32.c
@@ -55,17 +55,6 @@ extern unsigned int sparc_ramdisk_size;
 
 unsigned long highstart_pfn, highend_pfn;
 
-void show_mem(unsigned int filter)
-{
-	printk("Mem-info:\n");
-	show_free_areas(filter);
-	printk("Free swap:       %6ldkB\n",
-	       get_nr_swap_pages() << (PAGE_SHIFT-10));
-	printk("%ld pages of RAM\n", totalram_pages);
-	printk("%ld free pages\n", nr_free_pages());
-}
-
-
 unsigned long last_valid_pfn;
 
 unsigned long calc_highpages(void)
diff --git a/arch/tile/mm/pgtable.c b/arch/tile/mm/pgtable.c
index 7cc6ee7f1a58..492a7361e58e 100644
--- a/arch/tile/mm/pgtable.c
+++ b/arch/tile/mm/pgtable.c
@@ -36,51 +36,6 @@
 
 #define K(x) ((x) << (PAGE_SHIFT-10))
 
-/*
- * The normal show_free_areas() is too verbose on Tile, with dozens
- * of processors and often four NUMA zones each with high and lowmem.
- */
-void show_mem(unsigned int filter)
-{
-	struct zone *zone;
-
-	pr_err("Active:%lu inactive:%lu dirty:%lu writeback:%lu unstable:%lu free:%lu\n slab:%lu mapped:%lu pagetables:%lu bounce:%lu pagecache:%lu swap:%lu\n",
-	       (global_node_page_state(NR_ACTIVE_ANON) +
-		global_node_page_state(NR_ACTIVE_FILE)),
-	       (global_node_page_state(NR_INACTIVE_ANON) +
-		global_node_page_state(NR_INACTIVE_FILE)),
-	       global_node_page_state(NR_FILE_DIRTY),
-	       global_node_page_state(NR_WRITEBACK),
-	       global_node_page_state(NR_UNSTABLE_NFS),
-	       global_page_state(NR_FREE_PAGES),
-	       (global_page_state(NR_SLAB_RECLAIMABLE) +
-		global_page_state(NR_SLAB_UNRECLAIMABLE)),
-	       global_node_page_state(NR_FILE_MAPPED),
-	       global_page_state(NR_PAGETABLE),
-	       global_page_state(NR_BOUNCE),
-	       global_node_page_state(NR_FILE_PAGES),
-	       get_nr_swap_pages());
-
-	for_each_zone(zone) {
-		unsigned long flags, order, total = 0, largest_order = -1;
-
-		if (!populated_zone(zone))
-			continue;
-
-		spin_lock_irqsave(&zone->lock, flags);
-		for (order = 0; order < MAX_ORDER; order++) {
-			int nr = zone->free_area[order].nr_free;
-			total += nr << order;
-			if (nr)
-				largest_order = order;
-		}
-		spin_unlock_irqrestore(&zone->lock, flags);
-		pr_err("Node %d %7s: %lukB (largest %luKb)\n",
-		       zone_to_nid(zone), zone->name,
-		       K(total), largest_order ? K(1UL) << largest_order : 0);
-	}
-}
-
 /**
  * shatter_huge_page() - ensure a given address is mapped by a small page.
  *
diff --git a/arch/unicore32/mm/init.c b/arch/unicore32/mm/init.c
index be2bde9b07cf..f4950fbfe574 100644
--- a/arch/unicore32/mm/init.c
+++ b/arch/unicore32/mm/init.c
@@ -57,50 +57,6 @@ early_param("initrd", early_initrd);
  */
 struct meminfo meminfo;
 
-void show_mem(unsigned int filter)
-{
-	int free = 0, total = 0, reserved = 0;
-	int shared = 0, cached = 0, slab = 0, i;
-	struct meminfo *mi = &meminfo;
-
-	printk(KERN_DEFAULT "Mem-info:\n");
-	show_free_areas(filter);
-
-	for_each_bank(i, mi) {
-		struct membank *bank = &mi->bank[i];
-		unsigned int pfn1, pfn2;
-		struct page *page, *end;
-
-		pfn1 = bank_pfn_start(bank);
-		pfn2 = bank_pfn_end(bank);
-
-		page = pfn_to_page(pfn1);
-		end  = pfn_to_page(pfn2 - 1) + 1;
-
-		do {
-			total++;
-			if (PageReserved(page))
-				reserved++;
-			else if (PageSwapCache(page))
-				cached++;
-			else if (PageSlab(page))
-				slab++;
-			else if (!page_count(page))
-				free++;
-			else
-				shared += page_count(page) - 1;
-			page++;
-		} while (page < end);
-	}
-
-	printk(KERN_DEFAULT "%d pages of RAM\n", total);
-	printk(KERN_DEFAULT "%d free pages\n", free);
-	printk(KERN_DEFAULT "%d reserved pages\n", reserved);
-	printk(KERN_DEFAULT "%d slab pages\n", slab);
-	printk(KERN_DEFAULT "%d pages shared\n", shared);
-	printk(KERN_DEFAULT "%d pages swap cached\n", cached);
-}
-
 static void __init find_limits(unsigned long *min, unsigned long *max_low,
 	unsigned long *max_high)
 {
-- 
2.11.0

* [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask
  2017-01-12 13:16 [PATCH 0/4] show_mem updates Michal Hocko
                   ` (2 preceding siblings ...)
  2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
@ 2017-01-12 13:16 ` Michal Hocko
  2017-01-12 13:49   ` Mel Gorman
  2017-01-13 13:08   ` Vlastimil Babka
  3 siblings, 2 replies; 20+ messages in thread
From: Michal Hocko @ 2017-01-12 13:16 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

show_mem() allows filtering out node specific data which is irrelevant
to the allocation request via SHOW_MEM_FILTER_NODES. The filtering
is done in skip_free_areas_node, which skips all nodes which are not
in the mems_allowed of the current process. This works as expected most
of the time because the nodemask shouldn't be outside of the allocating
task's mems_allowed, but there are some exceptions. E.g. memory hotplug
might want to request allocations from outside of the allowed nodes (see
new_node_page).

Get rid of this hardcoded behavior and push the allocation mask down the
show_mem path and use it instead of cpuset_current_mems_allowed. A NULL
nodemask is interpreted as cpuset_current_mems_allowed.
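
The intended semantics can be modelled in userspace roughly as follows.
This is only a sketch: the nodemask_t below is a single-word stand-in for
the kernel type, current_mems_allowed is a placeholder for
cpuset_current_mems_allowed, and node_isset() here is a plain function
rather than the kernel macro.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins, not the kernel definitions. */
typedef struct { unsigned long bits; } nodemask_t;
#define SHOW_MEM_FILTER_NODES	(0x0001u)

/* Placeholder for cpuset_current_mems_allowed: nodes 0 and 1 allowed. */
static nodemask_t current_mems_allowed = { 0x3UL };

static bool node_isset(int nid, const nodemask_t *mask)
{
	return (mask->bits >> nid) & 1UL;
}

/*
 * Skip a node only when filtering was requested and the node is not in
 * the mask. A NULL mask falls back to the current task's allowed nodes.
 */
static bool show_mem_node_skip(unsigned int flags, int nid,
			       const nodemask_t *nodemask)
{
	if (!(flags & SHOW_MEM_FILTER_NODES))
		return false;

	if (!nodemask)
		nodemask = &current_mems_allowed;

	return !node_isset(nid, nodemask);
}
```

In this model a caller passing an explicit mask containing only node 2
gets node 2 reported even though the placeholder for the task's own
mems_allowed covers just nodes 0 and 1, which is the hotplug-like case
described above.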

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/powerpc/xmon/xmon.c            |  2 +-
 arch/sparc/kernel/setup_32.c        |  2 +-
 drivers/net/ethernet/sgi/ioc3-eth.c |  2 +-
 drivers/tty/sysrq.c                 |  2 +-
 drivers/tty/vt/keyboard.c           |  2 +-
 include/linux/mm.h                  |  5 ++---
 lib/show_mem.c                      |  4 ++--
 mm/nommu.c                          |  6 +++---
 mm/oom_kill.c                       |  2 +-
 mm/page_alloc.c                     | 38 ++++++++++++++++++-------------------
 10 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 760545519a0b..e285a89a65ec 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -913,7 +913,7 @@ cmds(struct pt_regs *excp)
 				memzcan();
 				break;
 			case 'i':
-				show_mem(0);
+				show_mem(0, NULL);
 				break;
 			default:
 				termch = cmd;
diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c
index c4e65cb3280f..6f06058c5ae7 100644
--- a/arch/sparc/kernel/setup_32.c
+++ b/arch/sparc/kernel/setup_32.c
@@ -82,7 +82,7 @@ static void prom_sync_me(void)
 			     "nop\n\t" : : "r" (&trapbase));
 
 	prom_printf("PROM SYNC COMMAND...\n");
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	if (!is_idle_task(current)) {
 		local_irq_enable();
 		sys_sync();
diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c b/drivers/net/ethernet/sgi/ioc3-eth.c
index 7a254da85dd7..231e96d8bd14 100644
--- a/drivers/net/ethernet/sgi/ioc3-eth.c
+++ b/drivers/net/ethernet/sgi/ioc3-eth.c
@@ -914,7 +914,7 @@ static void ioc3_alloc_rings(struct net_device *dev)
 
 			skb = ioc3_alloc_skb(RX_BUF_ALLOC_SIZE, GFP_ATOMIC);
 			if (!skb) {
-				show_free_areas(0);
+				show_free_areas(0, NULL);
 				continue;
 			}
 
diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 52bbd27e93ae..667fa3931161 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -317,7 +317,7 @@ static struct sysrq_key_op sysrq_ftrace_dump_op = {
 
 static void sysrq_handle_showmem(int key)
 {
-	show_mem(0);
+	show_mem(0, NULL);
 }
 static struct sysrq_key_op sysrq_showmem_op = {
 	.handler	= sysrq_handle_showmem,
diff --git a/drivers/tty/vt/keyboard.c b/drivers/tty/vt/keyboard.c
index 0f8caae4267d..09511a362ade 100644
--- a/drivers/tty/vt/keyboard.c
+++ b/drivers/tty/vt/keyboard.c
@@ -572,7 +572,7 @@ static void fn_scroll_back(struct vc_data *vc)
 
 static void fn_show_mem(struct vc_data *vc)
 {
-	show_mem(0);
+	show_mem(0, NULL);
 }
 
 static void fn_show_state(struct vc_data *vc)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3e35eb04a28a..95488f901c6f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1124,8 +1124,7 @@ extern void pagefault_out_of_memory(void);
  */
 #define SHOW_MEM_FILTER_NODES		(0x0001u)	/* disallowed nodes */
 
-extern void show_free_areas(unsigned int flags);
-extern bool skip_free_areas_node(unsigned int flags, int nid);
+extern void show_free_areas(unsigned int flags, nodemask_t *nodemask);
 
 int shmem_zero_setup(struct vm_area_struct *);
 #ifdef CONFIG_SHMEM
@@ -1904,7 +1903,7 @@ extern void setup_per_zone_wmarks(void);
 extern int __meminit init_per_zone_wmark_min(void);
 extern void mem_init(void);
 extern void __init mmap_init(void);
-extern void show_mem(unsigned int flags);
+extern void show_mem(unsigned int flags, nodemask_t *nodemask);
 extern long si_mem_available(void);
 extern void si_meminfo(struct sysinfo * val);
 extern void si_meminfo_node(struct sysinfo *val, int nid);
diff --git a/lib/show_mem.c b/lib/show_mem.c
index 1feed6a2b12a..0beaa1d899aa 100644
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -9,13 +9,13 @@
 #include <linux/quicklist.h>
 #include <linux/cma.h>
 
-void show_mem(unsigned int filter)
+void show_mem(unsigned int filter, nodemask_t *nodemask)
 {
 	pg_data_t *pgdat;
 	unsigned long total = 0, reserved = 0, highmem = 0;
 
 	printk("Mem-Info:\n");
-	show_free_areas(filter);
+	show_free_areas(filter, nodemask);
 
 	for_each_online_pgdat(pgdat) {
 		unsigned long flags;
diff --git a/mm/nommu.c b/mm/nommu.c
index ca239988fb68..5bd401b7f9a9 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1191,7 +1191,7 @@ static int do_mmap_private(struct vm_area_struct *vma,
 enomem:
 	pr_err("Allocation of length %lu from process %d (%s) failed\n",
 	       len, current->pid, current->comm);
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	return -ENOMEM;
 }
 
@@ -1412,13 +1412,13 @@ unsigned long do_mmap(struct file *file,
 	kmem_cache_free(vm_region_jar, region);
 	pr_warn("Allocation of vma for %lu byte allocation from process %d failed\n",
 			len, current->pid);
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	return -ENOMEM;
 
 error_getting_region:
 	pr_warn("Allocation of vm region for %lu byte allocation from process %d failed\n",
 			len, current->pid);
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	return -ENOMEM;
 }
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ead093c6f2a6..7cf61b928ba8 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -417,7 +417,7 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 	if (oc->memcg)
 		mem_cgroup_print_oom_info(oc->memcg, p);
 	else
-		show_mem(SHOW_MEM_FILTER_NODES);
+		show_mem(SHOW_MEM_FILTER_NODES, nm);
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc->memcg, oc->nodemask);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0a9805a696bb..44ba8b27a2b1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3008,7 +3008,7 @@ static inline bool should_suppress_show_mem(void)
 	return ret;
 }
 
-static void warn_alloc_show_mem(gfp_t gfp_mask)
+static void warn_alloc_show_mem(gfp_t gfp_mask, nodemask_t *nodemask)
 {
 	unsigned int filter = SHOW_MEM_FILTER_NODES;
 	static DEFINE_RATELIMIT_STATE(show_mem_rs, HZ, 1);
@@ -3028,7 +3028,7 @@ static void warn_alloc_show_mem(gfp_t gfp_mask)
 	if (in_interrupt() || !(gfp_mask & __GFP_DIRECT_RECLAIM))
 		filter &= ~SHOW_MEM_FILTER_NODES;
 
-	show_mem(filter);
+	show_mem(filter, nodemask);
 }
 
 void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
@@ -3054,7 +3054,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 	pr_cont(", mode:%#x(%pGg), nodemask=%*pbl\n", gfp_mask, &gfp_mask, nodemask_pr_args(nm));
 
 	dump_stack();
-	warn_alloc_show_mem(gfp_mask);
+	warn_alloc_show_mem(gfp_mask, nm);
 }
 
 static inline struct page *
@@ -4250,20 +4250,20 @@ void si_meminfo_node(struct sysinfo *val, int nid)
  * Determine whether the node should be displayed or not, depending on whether
  * SHOW_MEM_FILTER_NODES was passed to show_free_areas().
  */
-bool skip_free_areas_node(unsigned int flags, int nid)
+static bool show_mem_node_skip(unsigned int flags, int nid, nodemask_t *nodemask)
 {
-	bool ret = false;
-	unsigned int cpuset_mems_cookie;
-
 	if (!(flags & SHOW_MEM_FILTER_NODES))
-		goto out;
+		return false;
 
-	do {
-		cpuset_mems_cookie = read_mems_allowed_begin();
-		ret = !node_isset(nid, cpuset_current_mems_allowed);
-	} while (read_mems_allowed_retry(cpuset_mems_cookie));
-out:
-	return ret;
+	/*
+	 * no node mask - aka implicit memory numa policy. Do not bother with the
+	 * synchronization - read_mems_allowed_begin - because we do not have to be
+	 * precise here.
+	 */
+	if (!nodemask)
+		nodemask = &cpuset_current_mems_allowed;
+
+	return !node_isset(nid, *nodemask);
 }
 
 #define K(x) ((x) << (PAGE_SHIFT-10))
@@ -4304,7 +4304,7 @@ static void show_migration_types(unsigned char type)
  * SHOW_MEM_FILTER_NODES: suppress nodes that are not allowed by current's
  *   cpuset.
  */
-void show_free_areas(unsigned int filter)
+void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 {
 	unsigned long free_pcp = 0;
 	int cpu;
@@ -4312,7 +4312,7 @@ void show_free_areas(unsigned int filter)
 	pg_data_t *pgdat;
 
 	for_each_populated_zone(zone) {
-		if (skip_free_areas_node(filter, zone_to_nid(zone)))
+		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
 
 		for_each_online_cpu(cpu)
@@ -4346,7 +4346,7 @@ void show_free_areas(unsigned int filter)
 		global_page_state(NR_FREE_CMA_PAGES));
 
 	for_each_online_pgdat(pgdat) {
-		if (skip_free_areas_node(filter, pgdat->node_id))
+		if (show_mem_node_skip(filter, pgdat->node_id, nodemask))
 			continue;
 
 		printk("Node %d"
@@ -4398,7 +4398,7 @@ void show_free_areas(unsigned int filter)
 	for_each_populated_zone(zone) {
 		int i;
 
-		if (skip_free_areas_node(filter, zone_to_nid(zone)))
+		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
 
 		free_pcp = 0;
@@ -4463,7 +4463,7 @@ void show_free_areas(unsigned int filter)
 		unsigned long nr[MAX_ORDER], flags, total = 0;
 		unsigned char types[MAX_ORDER];
 
-		if (skip_free_areas_node(filter, zone_to_nid(zone)))
+		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
 		show_node(zone);
 		printk(KERN_CONT "%s: ", zone->name);
-- 
2.11.0

* Re: [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem
  2017-01-12 13:16 ` [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem Michal Hocko
@ 2017-01-12 13:47   ` Mel Gorman
  2017-01-14 16:26   ` Johannes Weiner
  1 sibling, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2017-01-12 13:47 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Johannes Weiner, David Rientjes, Michal Hocko

On Thu, Jan 12, 2017 at 02:16:56PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> 599d0c954f91 ("mm, vmscan: move LRU lists to node") added per NUMA
> node statistics to show_mem but forgot to add a skip_free_areas_node
> check to filter out nodes which are outside of the allocating task's
> NUMA policy. Add this check so that the output is not polluted with
> pointless information.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

* Re: [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask
  2017-01-12 13:16 ` [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask Michal Hocko
@ 2017-01-12 13:47   ` Mel Gorman
  2017-01-13 11:31   ` Vlastimil Babka
  1 sibling, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2017-01-12 13:47 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Johannes Weiner, David Rientjes, Michal Hocko

On Thu, Jan 12, 2017 at 02:16:57PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> warn_alloc is currently used to report an allocation failure or an
> allocation stall. We print some details of the allocation request like
> the gfp mask and the request order. We do not print the allocation
> nodemask, which is also important when debugging the reason for the
> allocation failure. We already print the nodemask in the OOM report.
> 
> Add nodemask to warn_alloc and print it in warn_alloc as well.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

* Re: [RFC PATCH 3/4] arch, mm: remove arch specific show_mem
  2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
@ 2017-01-12 13:48   ` Mel Gorman
  2017-01-12 17:53   ` Chris Metcalf
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2017-01-12 13:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Johannes Weiner, David Rientjes,
	Michal Hocko, Tony Luck, Fenghua Yu, James E.J. Bottomley,
	Helge Deller, David S. Miller, Chris Metcalf, Guan Xuetao,
	linux-ia64, linux-parisc

On Thu, Jan 12, 2017 at 02:16:58PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> We have a generic implementation for quite some time already. If there
> is any arch specific information to be printed then we should add a
> callback called from the generic code rather than duplicate the whole
> show_mem. The current code has resulted in the code duplication and
> the output divergence which is both confusing and adds maintenance
> costs. Let's just get rid of this mess.
> 
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
> Cc: Helge Deller <deller@gmx.de>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Chris Metcalf <cmetcalf@mellanox.com>
> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
> Cc: linux-ia64@vger.kernel.org
> Cc: linux-parisc@vger.kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>

This is overdue. The last time it was brought up, no one objected to
arch-specific information from show_mem but maybe they weren't looking
that carefully. For me;

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask
  2017-01-12 13:16 ` [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask Michal Hocko
@ 2017-01-12 13:49   ` Mel Gorman
  2017-01-13 13:08   ` Vlastimil Babka
  1 sibling, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2017-01-12 13:49 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Johannes Weiner, David Rientjes, Michal Hocko

On Thu, Jan 12, 2017 at 02:16:59PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> show_mem() allows filtering out node specific data which is irrelevant
> to the allocation request via SHOW_MEM_FILTER_NODES. The filtering
> is done in skip_free_areas_node which skips all nodes which are not
> in the mems_allowed of the current process. This works most of the
> time as expected because the nodemask shouldn't be outside of the
> allocating task but there are some exceptions. E.g. memory hotplug might
> want to request allocations from outside of the allowed nodes (see
> new_node_page).
> 
> Get rid of this hardcoded behavior and push the allocation mask down the
> show_mem path and use it instead of cpuset_current_mems_allowed. NULL
> nodemask is interpreted as cpuset_current_mems_allowed.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Fairly marginal but

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs


* Re: [RFC PATCH 3/4] arch, mm: remove arch specific show_mem
  2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
  2017-01-12 13:48   ` Mel Gorman
@ 2017-01-12 17:53   ` Chris Metcalf
  2017-01-12 20:04   ` Helge Deller
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Chris Metcalf @ 2017-01-12 17:53 UTC (permalink / raw)
  To: Michal Hocko, linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes,
	Michal Hocko, Tony Luck, Fenghua Yu, James E.J. Bottomley,
	Helge Deller, David S. Miller, Guan Xuetao, linux-ia64,
	linux-parisc

On 1/12/2017 8:16 AM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> We have a generic implementation for quite some time already. If there
> is any arch specific information to be printed then we should add a
> callback called from the generic code rather than duplicate the whole
> show_mem. The current code has resulted in the code duplication and
> the output divergence which is both confusing and adds maintenance
> costs. Let's just get rid of this mess.
>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
> Cc: Helge Deller <deller@gmx.de>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Chris Metcalf <cmetcalf@mellanox.com>
> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
> Cc: linux-ia64@vger.kernel.org
> Cc: linux-parisc@vger.kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>   arch/ia64/mm/init.c      | 48 -----------------------------------------------
>   arch/parisc/mm/init.c    | 49 ------------------------------------------------
>   arch/sparc/mm/init_32.c  | 11 -----------
>   arch/tile/mm/pgtable.c   | 45 --------------------------------------------
>   arch/unicore32/mm/init.c | 44 -------------------------------------------
>   5 files changed, 197 deletions(-)

Acked-by: Chris Metcalf <cmetcalf@mellanox.com> [for tile]

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com


* Re: [RFC PATCH 3/4] arch, mm: remove arch specific show_mem
  2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
  2017-01-12 13:48   ` Mel Gorman
  2017-01-12 17:53   ` Chris Metcalf
@ 2017-01-12 20:04   ` Helge Deller
  2017-01-13  2:49   ` Xuetao Guan
  2017-01-14 16:29   ` Johannes Weiner
  4 siblings, 0 replies; 20+ messages in thread
From: Helge Deller @ 2017-01-12 20:04 UTC (permalink / raw)
  To: Michal Hocko, linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes,
	Michal Hocko, Tony Luck, Fenghua Yu, James E.J. Bottomley,
	David S. Miller, Chris Metcalf, Guan Xuetao, linux-ia64,
	linux-parisc

On 12.01.2017 14:16, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> We have a generic implementation for quite some time already. If there
> is any arch specific information to be printed then we should add a
> callback called from the generic code rather than duplicate the whole
> show_mem. The current code has resulted in the code duplication and
> the output divergence which is both confusing and adds maintenance
> costs. Let's just get rid of this mess.
> 
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
> Cc: Helge Deller <deller@gmx.de>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Chris Metcalf <cmetcalf@mellanox.com>
> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
> Cc: linux-ia64@vger.kernel.org
> Cc: linux-parisc@vger.kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  arch/ia64/mm/init.c      | 48 -----------------------------------------------
>  arch/parisc/mm/init.c    | 49 ------------------------------------------------
>  arch/sparc/mm/init_32.c  | 11 -----------
>  arch/tile/mm/pgtable.c   | 45 --------------------------------------------
>  arch/unicore32/mm/init.c | 44 -------------------------------------------
>  5 files changed, 197 deletions(-)

Thanks!

Acked-by: Helge Deller <deller@gmx.de> [for parisc] 


* Re: [RFC PATCH 3/4] arch, mm: remove arch specific show_mem
  2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
                     ` (2 preceding siblings ...)
  2017-01-12 20:04   ` Helge Deller
@ 2017-01-13  2:49   ` Xuetao Guan
  2017-01-14 16:29   ` Johannes Weiner
  4 siblings, 0 replies; 20+ messages in thread
From: Xuetao Guan @ 2017-01-13  2:49 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Johannes Weiner, Mel Gorman,
	David Rientjes, Michal Hocko, Tony Luck, Fenghua Yu,
	James E.J. Bottomley, Helge Deller, David S. Miller,
	Chris Metcalf, Guan Xuetao, linux-ia64, linux-parisc

> From: Michal Hocko <mhocko@suse.com>
>
> We have a generic implementation for quite some time already. If there
> is any arch specific information to be printed then we should add a
> callback called from the generic code rather than duplicate the whole
> show_mem. The current code has resulted in the code duplication and
> the output divergence which is both confusing and adds maintenance
> costs. Let's just get rid of this mess.
>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
> Cc: Helge Deller <deller@gmx.de>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Chris Metcalf <cmetcalf@mellanox.com>
> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
> Cc: linux-ia64@vger.kernel.org
> Cc: linux-parisc@vger.kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  arch/ia64/mm/init.c      | 48 -----------------------------------------------
>  arch/parisc/mm/init.c    | 49 ------------------------------------------------
>  arch/sparc/mm/init_32.c  | 11 -----------
>  arch/tile/mm/pgtable.c   | 45 --------------------------------------------
>  arch/unicore32/mm/init.c | 44 -------------------------------------------
>  5 files changed, 197 deletions(-)

For UniCore32:
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>

Thanks.


* Re: [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask
  2017-01-12 13:16 ` [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask Michal Hocko
  2017-01-12 13:47   ` Mel Gorman
@ 2017-01-13 11:31   ` Vlastimil Babka
  2017-01-13 14:58     ` Michal Hocko
  1 sibling, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2017-01-13 11:31 UTC (permalink / raw)
  To: Michal Hocko, linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes, Michal Hocko

On 01/12/2017 02:16 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> warn_alloc is currently used to report an allocation failure or an
> allocation stall. We print some details of the allocation request like
> the gfp mask and the request order. We do not print the allocation
> nodemask which is important when debugging the reason for the allocation
> failure as well. We already print the nodemask in the OOM report.
>
> Add nodemask to warn_alloc and print it in warn_alloc as well.

That's helpful, but still IMHO incomplete compared to oom killer, see below.

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3031,12 +3031,13 @@ static void warn_alloc_show_mem(gfp_t gfp_mask)
>  	show_mem(filter);
>  }
>
> -void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
> +void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
>  {
>  	struct va_format vaf;
>  	va_list args;
>  	static DEFINE_RATELIMIT_STATE(nopage_rs, DEFAULT_RATELIMIT_INTERVAL,
>  				      DEFAULT_RATELIMIT_BURST);
> +	nodemask_t *nm = (nodemask) ? nodemask : &cpuset_current_mems_allowed;

Yes that's same as oom's dump_header() does it. But what if there's both 
mempolicy nodemask and cpuset at play? From oom report you'll see that as it 
also calls cpuset_print_current_mems_allowed(). So could we do that here as well?

Thanks.
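
As a rough illustration of the point above, here is a minimal user-space
sketch of reporting both the allocation nodemask and the cpuset's
mems_allowed side by side. It is not the kernel code: the demo_* names, the
one-bit-per-node unsigned long mask, the "(null)" placeholder and the output
wording are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: render a one-bit-per-node mask as a range list
 * such as "0-1,3". The mask layout is an assumption of this sketch.
 */
static void demo_format_nodes(unsigned long mask, char *buf, size_t len)
{
	const int nbits = (int)(sizeof(mask) * 8);
	size_t off = 0;
	int nid = 0;

	if (len == 0)
		return;
	buf[0] = '\0';

	while (nid < nbits) {
		int start, n;

		if (!((mask >> nid) & 1UL)) {
			nid++;
			continue;
		}
		/* Extend the run while the next node is also set. */
		start = nid;
		while (nid + 1 < nbits && ((mask >> (nid + 1)) & 1UL))
			nid++;

		if (start == nid)
			n = snprintf(buf + off, len - off, "%s%d",
				     off ? "," : "", start);
		else
			n = snprintf(buf + off, len - off, "%s%d-%d",
				     off ? "," : "", start, nid);
		if (n < 0 || (size_t)n >= len - off)
			return;	/* output truncated, stop here */
		off += (size_t)n;
		nid++;
	}
}

/*
 * Hypothetical helper: describe the request's nodemask (may be NULL) and
 * the task's mems_allowed together, so neither restriction is hidden.
 */
static void demo_describe_alloc(const unsigned long *nodemask,
				unsigned long mems_allowed,
				char *buf, size_t len)
{
	char nm[128] = "(null)";
	char allowed[128];

	if (nodemask)
		demo_format_nodes(*nodemask, nm, sizeof(nm));
	demo_format_nodes(mems_allowed, allowed, sizeof(allowed));
	snprintf(buf, len, "nodemask=%s mems_allowed=%s", nm, allowed);
}
```

With a mempolicy nodemask covering nodes 1-2 and a cpuset allowing nodes 0-1,
the combined line makes it visible that only node 1 satisfies both.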


* Re: [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask
  2017-01-12 13:16 ` [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask Michal Hocko
  2017-01-12 13:49   ` Mel Gorman
@ 2017-01-13 13:08   ` Vlastimil Babka
  2017-01-13 15:08     ` Michal Hocko
  1 sibling, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2017-01-13 13:08 UTC (permalink / raw)
  To: Michal Hocko, linux-mm
  Cc: Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes, Michal Hocko

On 01/12/2017 02:16 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
>
> show_mem() allows filtering out node specific data which is irrelevant
> to the allocation request via SHOW_MEM_FILTER_NODES. The filtering
> is done in skip_free_areas_node which skips all nodes which are not
> in the mems_allowed of the current process. This works most of the
> time as expected because the nodemask shouldn't be outside of the
> allocating task but there are some exceptions. E.g. memory hotplug might
> want to request allocations from outside of the allowed nodes (see
> new_node_page).

Hm AFAICS memory hotplug's new_node_page() is restricted both by cpusets (by 
using GFP_USER), and by the nodemask it constructs. That's probably a bug in 
itself, as it shouldn't matter which task is triggering the offline?

Which probably means that if show_mem() wants to be really precise, it would 
have to start from nodemask and intersect with cpuset when the allocation in 
question cannot escape it. But if we accept that it's ok when we print too many 
nodes (because we can filter them out when reading the output by having also 
nodemask and mems_allowed printed), and strive only to not miss any nodes, then 
this patch could really fix cases when we do miss (although new_node_page() 
currently isn't such example).

Or am I wrong?
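
The two options described above can be sketched in user space as follows.
This is only a model under stated assumptions: the sketch_* names and the
one-bit-per-node unsigned long mask are invented for the example, and the
cpuset_bound flag stands in for "the allocation in question cannot escape
the cpuset".

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption of this sketch: one bit per NUMA node. */
typedef unsigned long sketch_nodemask_t;

/*
 * "Precise" variant: start from the request's nodemask (or from
 * mems_allowed when there is none) and intersect with the cpuset when
 * the allocation is bound by it.
 */
static sketch_nodemask_t sketch_precise_mask(const sketch_nodemask_t *nodemask,
					     sketch_nodemask_t mems_allowed,
					     bool cpuset_bound)
{
	sketch_nodemask_t mask = nodemask ? *nodemask : mems_allowed;

	if (cpuset_bound)
		mask &= mems_allowed;
	return mask;
}

/*
 * "Permissive" variant: use the nodemask as given and fall back to
 * mems_allowed only when it is NULL. It may show extra nodes, but it
 * never hides a node the precise variant would show.
 */
static sketch_nodemask_t sketch_permissive_mask(const sketch_nodemask_t *nodemask,
						sketch_nodemask_t mems_allowed)
{
	return nodemask ? *nodemask : mems_allowed;
}
```

Because the precise mask is always a subset of the permissive one, reading
the permissive output together with the printed nodemask and mems_allowed
loses no information.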


* Re: [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask
  2017-01-13 11:31   ` Vlastimil Babka
@ 2017-01-13 14:58     ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-01-13 14:58 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes

On Fri 13-01-17 12:31:52, Vlastimil Babka wrote:
> On 01/12/2017 02:16 PM, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > warn_alloc is currently used to report an allocation failure or an
> > allocation stall. We print some details of the allocation request like
> > the gfp mask and the request order. We do not print the allocation
> > nodemask which is important when debugging the reason for the allocation
> > failure as well. We already print the nodemask in the OOM report.
> > 
> > Add nodemask to warn_alloc and print it in warn_alloc as well.
> 
> That's helpful, but still IMHO incomplete compared to oom killer, see below.
> 
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3031,12 +3031,13 @@ static void warn_alloc_show_mem(gfp_t gfp_mask)
> >  	show_mem(filter);
> >  }
> > 
> > -void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
> > +void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
> >  {
> >  	struct va_format vaf;
> >  	va_list args;
> >  	static DEFINE_RATELIMIT_STATE(nopage_rs, DEFAULT_RATELIMIT_INTERVAL,
> >  				      DEFAULT_RATELIMIT_BURST);
> > +	nodemask_t *nm = (nodemask) ? nodemask : &cpuset_current_mems_allowed;
> 
> Yes that's same as oom's dump_header() does it. But what if there's both
> mempolicy nodemask and cpuset at play? From oom report you'll see that as it
> also calls cpuset_print_current_mems_allowed(). So could we do that here as
> well?

OK, I will add it. It cannot be harmful.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask
  2017-01-13 13:08   ` Vlastimil Babka
@ 2017-01-13 15:08     ` Michal Hocko
  2017-01-13 15:30       ` Vlastimil Babka
  0 siblings, 1 reply; 20+ messages in thread
From: Michal Hocko @ 2017-01-13 15:08 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes

On Fri 13-01-17 14:08:34, Vlastimil Babka wrote:
> On 01/12/2017 02:16 PM, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > show_mem() allows filtering out node specific data which is irrelevant
> > to the allocation request via SHOW_MEM_FILTER_NODES. The filtering
> > is done in skip_free_areas_node which skips all nodes which are not
> > in the mems_allowed of the current process. This works most of the
> > time as expected because the nodemask shouldn't be outside of the
> > allocating task but there are some exceptions. E.g. memory hotplug might
> > want to request allocations from outside of the allowed nodes (see
> > new_node_page).
> 
> Hm AFAICS memory hotplug's new_node_page() is restricted both by cpusets (by
> using GFP_USER), and by the nodemask it constructs. That's probably a bug in
> itself, as it shouldn't matter which task is triggering the offline?

yes that is true. A task bound to a node which is offlined would be
funny...

> Which probably means that if show_mem() wants to be really precise, it would
> have to start from nodemask and intersect with cpuset when the allocation in
> question cannot escape it. But if we accept that it's ok when we print too
> many nodes (because we can filter them out when reading the output by having
> also nodemask and mems_allowed printed), and strive only to not miss any
> nodes, then this patch could really fix cases when we do miss (although
> new_node_page() currently isn't such example).

I guess it should be sufficient to add cpuset_print_current_mems_allowed()
in warn_alloc. This should give us the full picture without doing too
much twiddling. What do you think?

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask
  2017-01-13 15:08     ` Michal Hocko
@ 2017-01-13 15:30       ` Vlastimil Babka
  0 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-01-13 15:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Johannes Weiner, Mel Gorman, David Rientjes

On 01/13/2017 04:08 PM, Michal Hocko wrote:
> I guess it should be sufficient to add cpuset_print_current_mems_allowed()
> in warn_alloc. This should give us the full picture without doing too
> much twiddling. What do you think?

Agree!


* Re: [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem
  2017-01-12 13:16 ` [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem Michal Hocko
  2017-01-12 13:47   ` Mel Gorman
@ 2017-01-14 16:26   ` Johannes Weiner
  1 sibling, 0 replies; 20+ messages in thread
From: Johannes Weiner @ 2017-01-14 16:26 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, David Rientjes, Michal Hocko

On Thu, Jan 12, 2017 at 02:16:56PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> 599d0c954f91 ("mm, vmscan: move LRU lists to node") has added per numa
> node statistics to show_mem but it forgot to add skip_free_areas_node
> to filter out nodes which are outside of the allocating task numa
> policy. Add this check to not pollute the output with the pointless
> information.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [RFC PATCH 3/4] arch, mm: remove arch specific show_mem
  2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
                     ` (3 preceding siblings ...)
  2017-01-13  2:49   ` Xuetao Guan
@ 2017-01-14 16:29   ` Johannes Weiner
  4 siblings, 0 replies; 20+ messages in thread
From: Johannes Weiner @ 2017-01-14 16:29 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Andrew Morton, Mel Gorman, David Rientjes,
	Michal Hocko, Tony Luck, Fenghua Yu, James E.J. Bottomley,
	Helge Deller, David S. Miller, Chris Metcalf, Guan Xuetao,
	linux-ia64, linux-parisc

On Thu, Jan 12, 2017 at 02:16:58PM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> We have a generic implementation for quite some time already. If there
> is any arch specific information to be printed then we should add a
> callback called from the generic code rather than duplicate the whole
> show_mem. The current code has resulted in the code duplication and
> the output divergence which is both confusing and adds maintenance
> costs. Let's just get rid of this mess.
> 
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
> Cc: Helge Deller <deller@gmx.de>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Chris Metcalf <cmetcalf@mellanox.com>
> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
> Cc: linux-ia64@vger.kernel.org
> Cc: linux-parisc@vger.kernel.org
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask
  2017-01-17  9:15 [PATCH 0/4 v2] show_mem updates Michal Hocko
@ 2017-01-17  9:15 ` Michal Hocko
  0 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-01-17  9:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Mel Gorman, Vlastimil Babka, David Rientjes,
	linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

show_mem() allows filtering out node specific data which is irrelevant
to the allocation request via SHOW_MEM_FILTER_NODES. The filtering
is done in skip_free_areas_node which skips all nodes which are not
in the mems_allowed of the current process. This works most of the
time as expected because the nodemask shouldn't be outside of the
allocating task but there are some exceptions. E.g. memory hotplug might
want to request allocations from outside of the allowed nodes (see
new_node_page).

Get rid of this hardcoded behavior and push the allocation mask down the
show_mem path and use it instead of cpuset_current_mems_allowed. NULL
nodemask is interpreted as cpuset_current_mems_allowed.

Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/powerpc/xmon/xmon.c            |  2 +-
 arch/sparc/kernel/setup_32.c        |  2 +-
 drivers/net/ethernet/sgi/ioc3-eth.c |  2 +-
 drivers/tty/sysrq.c                 |  2 +-
 drivers/tty/vt/keyboard.c           |  2 +-
 include/linux/mm.h                  |  5 ++---
 lib/show_mem.c                      |  4 ++--
 mm/nommu.c                          |  6 +++---
 mm/oom_kill.c                       |  2 +-
 mm/page_alloc.c                     | 38 ++++++++++++++++++-------------------
 10 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 760545519a0b..e285a89a65ec 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -913,7 +913,7 @@ cmds(struct pt_regs *excp)
 				memzcan();
 				break;
 			case 'i':
-				show_mem(0);
+				show_mem(0, NULL);
 				break;
 			default:
 				termch = cmd;
diff --git a/arch/sparc/kernel/setup_32.c b/arch/sparc/kernel/setup_32.c
index c4e65cb3280f..6f06058c5ae7 100644
--- a/arch/sparc/kernel/setup_32.c
+++ b/arch/sparc/kernel/setup_32.c
@@ -82,7 +82,7 @@ static void prom_sync_me(void)
 			     "nop\n\t" : : "r" (&trapbase));
 
 	prom_printf("PROM SYNC COMMAND...\n");
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	if (!is_idle_task(current)) {
 		local_irq_enable();
 		sys_sync();
diff --git a/drivers/net/ethernet/sgi/ioc3-eth.c b/drivers/net/ethernet/sgi/ioc3-eth.c
index 7a254da85dd7..231e96d8bd14 100644
--- a/drivers/net/ethernet/sgi/ioc3-eth.c
+++ b/drivers/net/ethernet/sgi/ioc3-eth.c
@@ -914,7 +914,7 @@ static void ioc3_alloc_rings(struct net_device *dev)
 
 			skb = ioc3_alloc_skb(RX_BUF_ALLOC_SIZE, GFP_ATOMIC);
 			if (!skb) {
-				show_free_areas(0);
+				show_free_areas(0, NULL);
 				continue;
 			}
 
diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 52bbd27e93ae..667fa3931161 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -317,7 +317,7 @@ static struct sysrq_key_op sysrq_ftrace_dump_op = {
 
 static void sysrq_handle_showmem(int key)
 {
-	show_mem(0);
+	show_mem(0, NULL);
 }
 static struct sysrq_key_op sysrq_showmem_op = {
 	.handler	= sysrq_handle_showmem,
diff --git a/drivers/tty/vt/keyboard.c b/drivers/tty/vt/keyboard.c
index 0f8caae4267d..09511a362ade 100644
--- a/drivers/tty/vt/keyboard.c
+++ b/drivers/tty/vt/keyboard.c
@@ -572,7 +572,7 @@ static void fn_scroll_back(struct vc_data *vc)
 
 static void fn_show_mem(struct vc_data *vc)
 {
-	show_mem(0);
+	show_mem(0, NULL);
 }
 
 static void fn_show_state(struct vc_data *vc)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3e35eb04a28a..95488f901c6f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1124,8 +1124,7 @@ extern void pagefault_out_of_memory(void);
  */
 #define SHOW_MEM_FILTER_NODES		(0x0001u)	/* disallowed nodes */
 
-extern void show_free_areas(unsigned int flags);
-extern bool skip_free_areas_node(unsigned int flags, int nid);
+extern void show_free_areas(unsigned int flags, nodemask_t *nodemask);
 
 int shmem_zero_setup(struct vm_area_struct *);
 #ifdef CONFIG_SHMEM
@@ -1904,7 +1903,7 @@ extern void setup_per_zone_wmarks(void);
 extern int __meminit init_per_zone_wmark_min(void);
 extern void mem_init(void);
 extern void __init mmap_init(void);
-extern void show_mem(unsigned int flags);
+extern void show_mem(unsigned int flags, nodemask_t *nodemask);
 extern long si_mem_available(void);
 extern void si_meminfo(struct sysinfo * val);
 extern void si_meminfo_node(struct sysinfo *val, int nid);
diff --git a/lib/show_mem.c b/lib/show_mem.c
index 1feed6a2b12a..0beaa1d899aa 100644
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -9,13 +9,13 @@
 #include <linux/quicklist.h>
 #include <linux/cma.h>
 
-void show_mem(unsigned int filter)
+void show_mem(unsigned int filter, nodemask_t *nodemask)
 {
 	pg_data_t *pgdat;
 	unsigned long total = 0, reserved = 0, highmem = 0;
 
 	printk("Mem-Info:\n");
-	show_free_areas(filter);
+	show_free_areas(filter, nodemask);
 
 	for_each_online_pgdat(pgdat) {
 		unsigned long flags;
diff --git a/mm/nommu.c b/mm/nommu.c
index ca239988fb68..5bd401b7f9a9 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1191,7 +1191,7 @@ static int do_mmap_private(struct vm_area_struct *vma,
 enomem:
 	pr_err("Allocation of length %lu from process %d (%s) failed\n",
 	       len, current->pid, current->comm);
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	return -ENOMEM;
 }
 
@@ -1412,13 +1412,13 @@ unsigned long do_mmap(struct file *file,
 	kmem_cache_free(vm_region_jar, region);
 	pr_warn("Allocation of vma for %lu byte allocation from process %d failed\n",
 			len, current->pid);
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	return -ENOMEM;
 
 error_getting_region:
 	pr_warn("Allocation of vm region for %lu byte allocation from process %d failed\n",
 			len, current->pid);
-	show_free_areas(0);
+	show_free_areas(0, NULL);
 	return -ENOMEM;
 }
 
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index ead093c6f2a6..7cf61b928ba8 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -417,7 +417,7 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 	if (oc->memcg)
 		mem_cgroup_print_oom_info(oc->memcg, p);
 	else
-		show_mem(SHOW_MEM_FILTER_NODES);
+		show_mem(SHOW_MEM_FILTER_NODES, nm);
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc->memcg, oc->nodemask);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7f9c0ee18ae0..380bfe340336 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3008,7 +3008,7 @@ static inline bool should_suppress_show_mem(void)
 	return ret;
 }
 
-static void warn_alloc_show_mem(gfp_t gfp_mask)
+static void warn_alloc_show_mem(gfp_t gfp_mask, nodemask_t *nodemask)
 {
 	unsigned int filter = SHOW_MEM_FILTER_NODES;
 	static DEFINE_RATELIMIT_STATE(show_mem_rs, HZ, 1);
@@ -3028,7 +3028,7 @@ static void warn_alloc_show_mem(gfp_t gfp_mask)
 	if (in_interrupt() || !(gfp_mask & __GFP_DIRECT_RECLAIM))
 		filter &= ~SHOW_MEM_FILTER_NODES;
 
-	show_mem(filter);
+	show_mem(filter, nodemask);
 }
 
 void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
@@ -3055,7 +3055,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 	cpuset_print_current_mems_allowed();
 
 	dump_stack();
-	warn_alloc_show_mem(gfp_mask);
+	warn_alloc_show_mem(gfp_mask, nm);
 }
 
 static inline struct page *
@@ -4251,20 +4251,20 @@ void si_meminfo_node(struct sysinfo *val, int nid)
  * Determine whether the node should be displayed or not, depending on whether
  * SHOW_MEM_FILTER_NODES was passed to show_free_areas().
  */
-bool skip_free_areas_node(unsigned int flags, int nid)
+static bool show_mem_node_skip(unsigned int flags, int nid, nodemask_t *nodemask)
 {
-	bool ret = false;
-	unsigned int cpuset_mems_cookie;
-
 	if (!(flags & SHOW_MEM_FILTER_NODES))
-		goto out;
+		return false;
 
-	do {
-		cpuset_mems_cookie = read_mems_allowed_begin();
-		ret = !node_isset(nid, cpuset_current_mems_allowed);
-	} while (read_mems_allowed_retry(cpuset_mems_cookie));
-out:
-	return ret;
+	/*
+	 * no node mask - aka implicit memory numa policy. Do not bother with the
+	 * synchronization - read_mems_allowed_begin - because we do not have to be
+	 * precise here.
+	 */
+	if (!nodemask)
+		nodemask = &cpuset_current_mems_allowed;
+
+	return !node_isset(nid, *nodemask);
 }
 
 #define K(x) ((x) << (PAGE_SHIFT-10))
@@ -4305,7 +4305,7 @@ static void show_migration_types(unsigned char type)
  * SHOW_MEM_FILTER_NODES: suppress nodes that are not allowed by current's
  *   cpuset.
  */
-void show_free_areas(unsigned int filter)
+void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 {
 	unsigned long free_pcp = 0;
 	int cpu;
@@ -4313,7 +4313,7 @@ void show_free_areas(unsigned int filter)
 	pg_data_t *pgdat;
 
 	for_each_populated_zone(zone) {
-		if (skip_free_areas_node(filter, zone_to_nid(zone)))
+		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
 
 		for_each_online_cpu(cpu)
@@ -4347,7 +4347,7 @@ void show_free_areas(unsigned int filter)
 		global_page_state(NR_FREE_CMA_PAGES));
 
 	for_each_online_pgdat(pgdat) {
-		if (skip_free_areas_node(filter, pgdat->node_id))
+		if (show_mem_node_skip(filter, pgdat->node_id, nodemask))
 			continue;
 
 		printk("Node %d"
@@ -4399,7 +4399,7 @@ void show_free_areas(unsigned int filter)
 	for_each_populated_zone(zone) {
 		int i;
 
-		if (skip_free_areas_node(filter, zone_to_nid(zone)))
+		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
 
 		free_pcp = 0;
@@ -4464,7 +4464,7 @@ void show_free_areas(unsigned int filter)
 		unsigned long nr[MAX_ORDER], flags, total = 0;
 		unsigned char types[MAX_ORDER];
 
-		if (skip_free_areas_node(filter, zone_to_nid(zone)))
+		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
 		show_node(zone);
 		printk(KERN_CONT "%s: ", zone->name);
-- 
2.11.0
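
For readers who want to experiment with the filtering rule outside the
kernel, the following is a minimal userspace sketch of what
show_mem_node_skip() does in the hunk above. It is not the kernel code:
nodemask_sketch_t, node_isset_sketch(), current_mems_allowed_sketch and
SHOW_MEM_FILTER_NODES_SKETCH are hypothetical stand-ins for nodemask_t,
node_isset(), cpuset_current_mems_allowed and SHOW_MEM_FILTER_NODES, and the
mask is assumed to fit in a single unsigned long.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SHOW_MEM_FILTER_NODES; the value is illustrative. */
#define SHOW_MEM_FILTER_NODES_SKETCH 0x1u

/* Hypothetical stand-in for nodemask_t: one bit per node id. */
typedef struct {
	unsigned long bits;
} nodemask_sketch_t;

/* Assumed default for the sketch: the current task may use nodes 0 and 1. */
static nodemask_sketch_t current_mems_allowed_sketch = { 0x3UL };

/* Stand-in for node_isset(); out-of-range node ids count as not set. */
static bool node_isset_sketch(int nid, const nodemask_sketch_t *mask)
{
	if (nid < 0 || (size_t)nid >= sizeof(mask->bits) * CHAR_BIT)
		return false;
	return (mask->bits >> nid) & 1UL;
}

/*
 * Mirrors the control flow of show_mem_node_skip() from the patch:
 * without the filter flag nothing is skipped; a NULL nodemask falls
 * back to the task's allowed nodes; otherwise a node is skipped when
 * it is not part of the mask.
 */
static bool show_mem_node_skip_sketch(unsigned int flags, int nid,
				      const nodemask_sketch_t *nodemask)
{
	if (!(flags & SHOW_MEM_FILTER_NODES_SKETCH))
		return false;

	if (!nodemask)
		nodemask = &current_mems_allowed_sketch;

	return !node_isset_sketch(nid, nodemask);
}
```

With the assumed default mask, a NULL nodemask reports nodes 0 and 1 and
skips node 2, while an explicit mask containing only node 2 reverses that.
Like the patched function, the sketch reads the fallback mask without any
retry loop, which matches the patch comment that precision is not needed here.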


Thread overview: 20+ messages
2017-01-12 13:16 [PATCH 0/4] show_mem updates Michal Hocko
2017-01-12 13:16 ` [PATCH 1/4] mm, page_alloc: do not report all nodes in show_mem Michal Hocko
2017-01-12 13:47   ` Mel Gorman
2017-01-14 16:26   ` Johannes Weiner
2017-01-12 13:16 ` [PATCH 2/4] mm, page_alloc: warn_alloc print nodemask Michal Hocko
2017-01-12 13:47   ` Mel Gorman
2017-01-13 11:31   ` Vlastimil Babka
2017-01-13 14:58     ` Michal Hocko
2017-01-12 13:16 ` [RFC PATCH 3/4] arch, mm: remove arch specific show_mem Michal Hocko
2017-01-12 13:48   ` Mel Gorman
2017-01-12 17:53   ` Chris Metcalf
2017-01-12 20:04   ` Helge Deller
2017-01-13  2:49   ` Xuetao Guan
2017-01-14 16:29   ` Johannes Weiner
2017-01-12 13:16 ` [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask Michal Hocko
2017-01-12 13:49   ` Mel Gorman
2017-01-13 13:08   ` Vlastimil Babka
2017-01-13 15:08     ` Michal Hocko
2017-01-13 15:30       ` Vlastimil Babka
2017-01-17  9:15 [PATCH 0/4 v2] show_mem updates Michal Hocko
2017-01-17  9:15 ` [PATCH 4/4] lib/show_mem.c: teach show_mem to work with the given nodemask Michal Hocko
