linux-kernel.vger.kernel.org archive mirror
From: Cody P Schafer <cody@linux.vnet.ibm.com>
To: Linux MM <linux-mm@kvack.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Cody P Schafer <cody@linux.vnet.ibm.com>,
	Simon Jeons <simon.jeons@gmail.com>
Subject: [RFC PATCH v3 31/31] mm: add a early_param "extra_nr_node_ids" to increase nr_node_ids above the minimum by a percentage.
Date: Thu,  2 May 2013 17:01:03 -0700
Message-ID: <1367539263-19999-32-git-send-email-cody@linux.vnet.ibm.com>
In-Reply-To: <1367539263-19999-1-git-send-email-cody@linux.vnet.ibm.com>

For dynamic NUMA, the hypervisor we're running under will sometimes want
to split a single NUMA node into multiple NUMA nodes. If the number of
NUMA nodes is limited to the number available when the system booted (as
it is on x86), we may not be able to fully adopt the new memory layout
provided by the hypervisor.

This option allows reserving some extra node ids as a percentage of the
boot time node ids. While not perfect (ideally nr_node_ids would be fully
dynamic), this allows decent functionality without invasive changes to
the SL{U,A}B allocators.

Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
---
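As a worked example of the percentage arithmetic described above, here is a
minimal userspace sketch. It is not kernel code: expand_nr_node_ids() is a
hypothetical helper name, DIV_ROUND_UP is redefined locally as a stand-in for
the kernel macro, and MAX_NUMNODES is set to 64 purely for illustration (the
real value depends on CONFIG_NODES_SHIFT).

```c
/* Userspace stand-ins for illustration; the kernel provides its own. */
#define MAX_NUMNODES 64	/* assumed value, not taken from any real config */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper mirroring the arithmetic this patch adds to
 * setup_nr_node_ids(): grow the boot-time id count by @percent
 * (rounded up), then clamp the result to MAX_NUMNODES.
 */
static unsigned int expand_nr_node_ids(unsigned int boot_ids,
				       unsigned int percent)
{
	unsigned int ids = boot_ids + DIV_ROUND_UP(boot_ids * percent, 100);

	return ids > MAX_NUMNODES ? MAX_NUMNODES : ids;
}
```

Under these assumptions, expand_nr_node_ids(4, 100) gives 8 (doubled),
expand_nr_node_ids(3, 50) gives 5 (1.5 extra ids rounded up to 2), and
expand_nr_node_ids(48, 50) gives 64 (72 clamped to the assumed
MAX_NUMNODES). A percentage of 0 leaves the count unchanged.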
 Documentation/kernel-parameters.txt |  6 ++++++
 mm/page_alloc.c                     | 24 ++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9653cf2..c606371 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2082,6 +2082,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			use hotplug cpu feature to put more cpu back to online.
 			just like you compile the kernel NR_CPUS=n
 
+	extra_nr_node_ids= [NUMA] Increase the maximum number of NUMA nodes
+			above the number detected at boot by the specified
+			percentage (rounded up). For example:
+			extra_nr_node_ids=100 would double the number of
+			node_ids available (up to a max of MAX_NUMNODES).
+
 	nr_uarts=	[SERIAL] maximum number of UARTs to be registered.
 
 	numa_balancing=	[KNL,X86] Enable or disable automatic NUMA balancing.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cc7b332..1fd2f2f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4837,6 +4837,17 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 
 #if MAX_NUMNODES > 1
+
+static unsigned nr_node_ids_mod_percent;
+static int __init setup_extra_nr_node_ids(char *arg)
+{
+	int r = kstrtouint(arg, 10, &nr_node_ids_mod_percent);
+	if (r)
+		pr_err("invalid param value extra_nr_node_ids=\"%s\"\n", arg);
+	return 0;
+}
+early_param("extra_nr_node_ids", setup_extra_nr_node_ids);
+
 /*
  * Figure out the number of possible node ids.
  */
@@ -4848,6 +4859,19 @@ void __init setup_nr_node_ids(void)
 	for_each_node_mask(node, node_possible_map)
 		highest = node;
 	nr_node_ids = highest + 1;
+
+	/*
+	 * expand nr_node_ids and node_possible_map so more can be onlined
+	 * later
+	 */
+	nr_node_ids +=
+		DIV_ROUND_UP(nr_node_ids * nr_node_ids_mod_percent, 100);
+
+	if (nr_node_ids > MAX_NUMNODES)
+		nr_node_ids = MAX_NUMNODES;
+
+	for (node = highest + 1; node < nr_node_ids; node++)
+		node_set(node, node_possible_map);
 }
 #endif
 
-- 
1.8.2.2


Thread overview: 32+ messages
2013-05-03  0:00 [RFC PATCH v3 00/31] Dynamic NUMA: Runtime NUMA memory layout reconfiguration Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 01/31] rbtree: add postorder iteration functions Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 02/31] rbtree: add rbtree_postorder_for_each_entry_safe() helper Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 03/31] mm/memory_hotplug: factor out zone+pgdat growth Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 04/31] memory_hotplug: export ensure_zone_is_initialized() in mm/internal.h Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 05/31] mm/memory_hotplug: use {pgdat,zone}_is_empty() when resizing zones & pgdats Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 06/31] mm: add nid_zone() helper Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 07/31] mm: Add Dynamic NUMA Kconfig Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 08/31] page_alloc: add return_pages_to_zone() when DYNAMIC_NUMA is enabled Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 09/31] page_alloc: in move_freepages(), skip pages instead of VM_BUG on node differences Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 10/31] page_alloc: when dynamic numa is enabled, don't check that all pages in a block belong to the same zone Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 11/31] page-flags dnuma: reserve a pageflag for determining if a page needs a node lookup Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 12/31] memory_hotplug: factor out locks in mem_online_cpu() Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 13/31] mm: add memlayout & dnuma to track pfn->nid & transplant pages between nodes Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 14/31] mm: memlayout+dnuma: add debugfs interface Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 15/31] drivers/base/memory.c: alphabetize headers Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 16/31] drivers/base/node,memory: rename function to match interface Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 17/31] drivers/base/node: rename unregister_mem_blk_under_nodes() to be more acurate Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 18/31] drivers/base/node: add unregister_mem_block_under_nodes() Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 19/31] mm: memory,memlayout: add refresh_memory_blocks() for Dynamic NUMA Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 20/31] page_alloc: use dnuma to transplant newly freed pages in __free_pages_ok() Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 21/31] page_alloc: use dnuma to transplant newly freed pages in free_hot_cold_page() Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 22/31] page_alloc: transplant pages that are being flushed from the per-cpu lists Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 23/31] x86: memlayout: add a arch specific inital memlayout setter Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 24/31] init/main: call memlayout_global_init() in start_kernel() Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 25/31] dnuma: memlayout: add memory_add_physaddr_to_nid() for memory_hotplug Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 26/31] x86/mm/numa: when dnuma is enabled, use memlayout to handle memory hotplug's physaddr_to_nid Cody P Schafer
2013-05-03  0:00 ` [RFC PATCH v3 27/31] mm/memory_hotplug: VM_BUG if nid is too large Cody P Schafer
2013-05-03  0:01 ` [RFC PATCH v3 28/31] mm/page_alloc: in page_outside_zone_boundaries(), avoid premature decisions Cody P Schafer
2013-05-03  0:01 ` [RFC PATCH v3 29/31] mm/page_alloc: make pr_err() in page_outside_zone_boundaries() more useful Cody P Schafer
2013-05-03  0:01 ` [RFC PATCH v3 30/31] mm/page_alloc: use manage_pages instead of present pages when calculating default_zonelist_order() Cody P Schafer
2013-05-03  0:01 ` Cody P Schafer [this message]
