From: Huang Ying <ying.huang@intel.com>
To: Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Huang Ying <ying.huang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>, "H. Peter Anvin" <hpa@zytor.com>,
	Dan Williams <dan.j.williams@intel.com>,
	David Rientjes <rientjes@google.com>,
	Dave Jiang <dave.jiang@intel.com>
Subject: [PATCH] x86, fakenuma: Avoid too large emulated node
Date: Tue,  8 Sep 2020 14:09:12 +0800
Message-ID: <20200908060912.12200-1-ying.huang@intel.com>

The testing system has 2 physical NUMA nodes, 8GB of memory, a small
memory hole from 640KB to 1MB, and a large memory hole from 3GB to
4GB.  If "numa=fake=1G" is used on the kernel command line, the
resulting fake NUMA nodes are as follows,

    NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0xbfffffff] -> [mem 0x00000000-0xbfffffff]
    NUMA: Node 0 [mem 0x00000000-0xbfffffff] + [mem 0x100000000-0x13fffffff] -> [mem 0x00000000-0x13fffffff]
    Faking node 0 at [mem 0x0000000000000000-0x0000000041ffffff] (1056MB)
    Faking node 1 at [mem 0x0000000140000000-0x000000017fffffff] (1024MB)
    Faking node 2 at [mem 0x0000000042000000-0x0000000081ffffff] (1024MB)
    Faking node 3 at [mem 0x0000000180000000-0x00000001bfffffff] (1024MB)
    Faking node 4 at [mem 0x0000000082000000-0x000000013fffffff] (3040MB)
    Faking node 5 at [mem 0x00000001c0000000-0x00000001ffffffff] (1024MB)
    Faking node 6 at [mem 0x0000000200000000-0x000000023fffffff] (1024MB)

Here, 7 fake NUMA nodes are emulated.  Fake node 4 spans 3040MB, but
1024MB of that is the memory hole from 3GB to 4GB, so its usable size
is 3040 - 1024 = 2016MB.  This is nearly twice the size of the other
fake nodes (about 1024MB), which isn't a reasonable split.  It is
better to keep the fake node size from becoming too large or too
small.  So in this patch, the splitting algorithm is changed to keep
the fake node size between 1/2 and 3/2 of the specified node size.
After applying this patch, the resulting fake NUMA nodes become,

    Faking node 0 at [mem 0x0000000000000000-0x0000000041ffffff] (1056MB)
    Faking node 1 at [mem 0x0000000140000000-0x000000017fffffff] (1024MB)
    Faking node 2 at [mem 0x0000000042000000-0x0000000081ffffff] (1024MB)
    Faking node 3 at [mem 0x0000000180000000-0x00000001bfffffff] (1024MB)
    Faking node 4 at [mem 0x0000000082000000-0x0000000103ffffff] (2080MB)
    Faking node 5 at [mem 0x00000001c0000000-0x00000001ffffffff] (1024MB)
    Faking node 6 at [mem 0x0000000104000000-0x000000013fffffff] (960MB)
    Faking node 7 at [mem 0x0000000200000000-0x000000023fffffff] (1024MB)

The newly added node 6 is a little smaller than the specified node
size (960MB vs. 1024MB).  But the overall results look more
reasonable.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dave Jiang <dave.jiang@intel.com>
---
 arch/x86/mm/numa_emulation.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index 683cd12f4793..231469e1de6a 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -300,9 +300,10 @@ static int __init split_nodes_size_interleave_uniform(struct numa_meminfo *ei,
 			/*
 			 * If there won't be enough non-reserved memory for the
 			 * next node, this one must extend to the end of the
-			 * physical node.
+			 * physical node.  The size of the emulated node should
+			 * be between size/2 and size*3/2.
 			 */
-			if ((limit - end - mem_hole_size(end, limit) < size)
+			if ((limit - end - mem_hole_size(end, limit) < size / 2)
 					&& !uniform)
 				end = limit;
 
-- 
2.28.0

