From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, js1304@gmail.com, vbabka@suse.cz, mgorman@suse.de,
    minchan@kernel.org, akpm@linux-foundation.org,
    aneesh.kumar@linux.vnet.ibm.com, bsingharora@gmail.com
Subject: [DEBUG 07/10] mm: Add debugfs interface to dump each node's zonelist information
Date: Mon, 24 Oct 2016 10:12:26 +0530
Message-ID: <1477284149-2976-8-git-send-email-khandual@linux.vnet.ibm.com>
In-Reply-To: <1477284149-2976-1-git-send-email-khandual@linux.vnet.ibm.com>

Each node in the system has a ZONELIST_FALLBACK zonelist and a
ZONELIST_NOFALLBACK zonelist. These zonelists decide the fallback order
of zones during memory allocation. It sometimes helps to dump them to
see the priority order of the zones they contain.

This is particularly useful on platforms that support memory hotplug
into zones that did not exist at boot: the interface shows which
zonelists the newly hot-added memory ends up in, and at what priority.
POWER is such a platform. All memory detected at boot stays in ZONE_DMA
for good, but the hotplug process can place new memory in ZONE_MOVABLE.
A way to take a snapshot of the system's zonelists after memory or node
hot[un]plug is therefore desirable.

This change adds a new debugfs interface (/sys/kernel/debug/zonelists)
that fetches and dumps this information.

Example zonelist information from a KVM guest with four NUMA nodes on a
POWER8 platform:

[NODE (0)]
        ZONELIST_FALLBACK
        (0) (Node 0) (DMA)
        (1) (Node 1) (DMA)
        (2) (Node 2) (DMA)
        (3) (Node 3) (DMA)
        ZONELIST_NOFALLBACK
        (0) (Node 0) (DMA)
[NODE (1)]
        ZONELIST_FALLBACK
        (0) (Node 1) (DMA)
        (1) (Node 2) (DMA)
        (2) (Node 3) (DMA)
        (3) (Node 0) (DMA)
        ZONELIST_NOFALLBACK
        (0) (Node 1) (DMA)
[NODE (2)]
        ZONELIST_FALLBACK
        (0) (Node 2) (DMA)
        (1) (Node 3) (DMA)
        (2) (Node 0) (DMA)
        (3) (Node 1) (DMA)
        ZONELIST_NOFALLBACK
        (0) (Node 2) (DMA)
[NODE (3)]
        ZONELIST_FALLBACK
        (0) (Node 3) (DMA)
        (1) (Node 0) (DMA)
        (2) (Node 1) (DMA)
        (3) (Node 2) (DMA)
        ZONELIST_NOFALLBACK
        (0) (Node 3) (DMA)

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 mm/memory.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index e18c57b..3be1753 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -64,6 +64,7 @@
 #include <linux/debugfs.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/dax.h>
+#include <linux/mmzone.h>
 
 #include <asm/io.h>
 #include <asm/mmu_context.h>
@@ -3087,6 +3088,68 @@ static int __init fault_around_debugfs(void)
 		pr_warn("Failed to create fault_around_bytes in debugfs");
 	return 0;
 }
+
+#ifdef CONFIG_NUMA
+static void show_zonelist(struct seq_file *m, struct zonelist *zonelist)
+{
+	unsigned int i;
+
+	for (i = 0; zonelist->_zonerefs[i].zone; i++) {
+		seq_printf(m, "\t\t(%d) (Node %d) (%-7s 0x%pK)\n", i,
+			zonelist->_zonerefs[i].zone->zone_pgdat->node_id,
+			zone_names[zonelist->_zonerefs[i].zone_idx],
+			(void *) zonelist->_zonerefs[i].zone);
+	}
+}
+
+static int zonelists_show(struct seq_file *m, void *v)
+{
+	struct zonelist *zonelist;
+	unsigned int node;
+
+	for_each_online_node(node) {
+		zonelist = &(NODE_DATA(node)->
+				node_zonelists[ZONELIST_FALLBACK]);
+		seq_printf(m, "[NODE (%d)]\n", node);
+		seq_puts(m, "\tZONELIST_FALLBACK ");
+		seq_printf(m, "(0x%pK)\n", zonelist);
+		show_zonelist(m, zonelist);
+
+		zonelist = &(NODE_DATA(node)->
+				node_zonelists[ZONELIST_NOFALLBACK]);
+		seq_puts(m, "\tZONELIST_NOFALLBACK ");
+		seq_printf(m, "(0x%pK)\n", zonelist);
+		show_zonelist(m, zonelist);
+	}
+	return 0;
+}
+
+static int zonelists_open(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, zonelists_show, NULL);
+}
+
+static const struct file_operations zonelists_fops = {
+	.open		= zonelists_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static int __init zonelists_debugfs(void)
+{
+	void *ret;
+
+	ret = debugfs_create_file("zonelists", 0444, NULL, NULL,
+			&zonelists_fops);
+	if (!ret)
+		pr_warn("Failed to create zonelists in debugfs");
+	return 0;
+}
+
+late_initcall(zonelists_debugfs);
+#endif /* CONFIG_NUMA */
+
 late_initcall(fault_around_debugfs);
 #endif
-- 
2.1.0
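
For anyone who wants to post-process the dump from userspace, here is a minimal sketch of a parser for one entry line of /sys/kernel/debug/zonelists. It is not part of the patch. The struct zoneref_entry and parse_zoneref_line() names are hypothetical, and the line layout is assumed from the format string in show_zonelist() and the example output above. The format string also prints a pointer after the zone name, which the example output omits, so the sketch accepts both forms and ignores the pointer.

```c
#include <stdio.h>

/* Hypothetical record for one zonelist entry; not part of the patch. */
struct zoneref_entry {
	int prio;	/* position in the zonelist, 0 is tried first */
	int node;	/* NUMA node that owns the zone */
	char zone[16];	/* zone name, e.g. "DMA" or "Movable" */
};

/*
 * Parse one entry line such as "\t\t(1) (Node 2) (DMA)".
 * Anything after the zone name (padding, a printed pointer, the
 * closing parenthesis) is ignored.
 *
 * Returns 0 on success and -1 if the line is not an entry line,
 * for example a "[NODE (0)]" or "ZONELIST_FALLBACK" header.
 * On failure the contents of *e are unspecified.
 */
static int parse_zoneref_line(const char *line, struct zoneref_entry *e)
{
	int n = sscanf(line, " (%d) (Node %d) (%15[A-Za-z0-9_]",
		       &e->prio, &e->node, e->zone);

	return n == 3 ? 0 : -1;
}
```

A caller would typically open the debugfs file (this needs debugfs mounted and sufficient privileges), read it line by line with fgets(), remember the most recent "[NODE (n)]" and zonelist header, and pass every other line to parse_zoneref_line().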