From: Jiri Slaby <jslaby@suse.cz>
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Thomas Tai <thomas.tai@oracle.com>,
	"David S . Miller" <davem@davemloft.net>,
	Jiri Slaby <jslaby@suse.cz>
Subject: [PATCH 3.12 37/38] sparc64: Fix find_node warning if numa node cannot be found
Date: Tue, 13 Dec 2016 20:53:03 +0100
Message-ID: <d68ed46bd6c06585e61556d4a7171d8ea198101c.1481658746.git.jslaby@suse.cz>
In-Reply-To: <15034b96ec06ee859b67c6cd4e3be569a4ef286b.1481658746.git.jslaby@suse.cz>
In-Reply-To: <cover.1481658746.git.jslaby@suse.cz>

From: Thomas Tai <thomas.tai@oracle.com>

3.12-stable review patch.  If anyone has any objections, please let me know.

===============

[ Upstream commit 74a5ed5c4f692df2ff0a2313ea71e81243525519 ]

When booting an LDOM guest, find_node() warns that a physical
address doesn't match a NUMA node rule.

WARNING: CPU: 0 PID: 0 at arch/sparc/mm/init_64.c:835
find_node+0xf4/0x120 find_node: A physical address doesn't
match a NUMA node rule. Some physical memory will be
owned by node 0.Modules linked in:

CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc3 #4
Call Trace:
 [0000000000468ba0] __warn+0xc0/0xe0
 [0000000000468c74] warn_slowpath_fmt+0x34/0x60
 [00000000004592f4] find_node+0xf4/0x120
 [0000000000dd0774] add_node_ranges+0x38/0xe4
 [0000000000dd0b1c] numa_parse_mdesc+0x268/0x2e4
 [0000000000dd0e9c] bootmem_init+0xb8/0x160
 [0000000000dd174c] paging_init+0x808/0x8fc
 [0000000000dcb0d0] setup_arch+0x2c8/0x2f0
 [0000000000dc68a0] start_kernel+0x48/0x424
 [0000000000dcb374] start_early_boot+0x27c/0x28c
 [0000000000a32c08] tlb_fixup_done+0x4c/0x64
 [0000000000027f08] 0x27f08

This happens because Linux uses an internal structure, node_masks[],
that keeps only the best memory latency node. However, an LDOM mdesc
can contain a single latency-group with multiple memory latency nodes.

If the address doesn't match the best latency node within
node_masks[], find_node() should check for an alternative via mdesc.
The warning message should only be printed if the address
matches neither node_masks[] nor any group in mdesc. To minimize
the cost of searching mdesc every time, the last matched
mask and index are stored in static variables.
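
For illustration only, here is a minimal user-space sketch of the lookup
order described above. It is not the kernel code: the tables fast_rules[]
and slow_rules[], the counter slow_lookups, and the functions slow_find()
and sketch_find_node() are hypothetical stand-ins for node_masks[], the
mdesc walk, and find_node(). The addresses are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct node_mem_mask. */
struct mem_mask {
	unsigned long mask;
	unsigned long val;
};

/* Fast table: one "best latency" rule per node (models node_masks[]). */
static const struct mem_mask fast_rules[] = {
	{ 0xf000UL, 0x0000UL },	/* node 0 */
	{ 0xf000UL, 0x1000UL },	/* node 1 */
};
#define NUM_NODES ((int)(sizeof(fast_rules) / sizeof(fast_rules[0])))

/* Slow table: extra ranges that belong to a node (models the mdesc). */
struct slow_rule {
	struct mem_mask m;
	int node;
};
static const struct slow_rule slow_rules[] = {
	{ { 0xf000UL, 0x8000UL }, 0 },
	{ { 0xf000UL, 0x9000UL }, 1 },
};
#define NUM_SLOW (sizeof(slow_rules) / sizeof(slow_rules[0]))

/* Counts slow-path searches so the effect of the cache is observable. */
static int slow_lookups;

/* Search the slow table; on a hit, report the matching mask/value. */
static int slow_find(unsigned long pa, struct mem_mask *out)
{
	size_t i;

	assert(out != NULL);
	slow_lookups++;
	for (i = 0; i < NUM_SLOW; i++) {
		if ((pa & slow_rules[i].m.mask) == slow_rules[i].m.val) {
			*out = slow_rules[i].m;
			return slow_rules[i].node;
		}
	}
	return -1;
}

static int sketch_find_node(unsigned long pa)
{
	static bool search_slow = true;
	static struct mem_mask last = { ~0UL, ~0UL };
	static int last_index;
	int i;

	/* 1. Fast path: best-latency rules. */
	for (i = 0; i < NUM_NODES; i++) {
		if ((pa & fast_rules[i].mask) == fast_rules[i].val)
			return i;
	}

	/* 2. Slow path, skipped when the cached mask still matches. */
	if (search_slow && (pa & last.mask) != last.val) {
		last_index = slow_find(pa, &last);
		if (last_index < 0 || last_index >= NUM_NODES) {
			/* 3. Nothing matched: fall back to node 0 and
			 * stop searching the slow table from now on.
			 */
			search_slow = false;
			last_index = 0;
		}
	}

	return last_index;
}
```

In this sketch, two consecutive addresses in the same slow-table range
trigger only one slow search. After the first complete miss, the slow
table is no longer consulted and the last index (node 0) is returned for
any address that misses the fast table.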

Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
---
 arch/sparc/mm/init_64.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 61 insertions(+), 4 deletions(-)

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 4650a3840305..5979f01af0d2 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -807,6 +807,8 @@ struct mdesc_mblock {
 };
 static struct mdesc_mblock *mblocks;
 static int num_mblocks;
+static int find_numa_node_for_addr(unsigned long pa,
+				   struct node_mem_mask *pnode_mask);
 
 static unsigned long ra_to_pa(unsigned long addr)
 {
@@ -826,6 +828,9 @@ static unsigned long ra_to_pa(unsigned long addr)
 
 static int find_node(unsigned long addr)
 {
+	static bool search_mdesc = true;
+	static struct node_mem_mask last_mem_mask = { ~0UL, ~0UL };
+	static int last_index;
 	int i;
 
 	addr = ra_to_pa(addr);
@@ -835,10 +840,27 @@ static int find_node(unsigned long addr)
 		if ((addr & p->mask) == p->val)
 			return i;
 	}
-	/* The following condition has been observed on LDOM guests.*/
-	WARN_ONCE(1, "find_node: A physical address doesn't match a NUMA node"
-		" rule. Some physical memory will be owned by node 0.");
-	return 0;
+	/* The following condition has been observed on LDOM guests because
+	 * node_masks only contains the best latency mask and value.
+	 * LDOM guest's mdesc can contain a single latency group to
+	 * cover multiple address range. Print warning message only if the
+	 * address cannot be found in node_masks nor mdesc.
+	 */
+	if ((search_mdesc) &&
+	    ((addr & last_mem_mask.mask) != last_mem_mask.val)) {
+		/* find the available node in the mdesc */
+		last_index = find_numa_node_for_addr(addr, &last_mem_mask);
+		numadbg("find_node: latency group for address 0x%lx is %d\n",
+			addr, last_index);
+		if ((last_index < 0) || (last_index >= num_node_masks)) {
+			/* WARN_ONCE() and use default group 0 */
+			WARN_ONCE(1, "find_node: A physical address doesn't match a NUMA node rule. Some physical memory will be owned by node 0.");
+			search_mdesc = false;
+			last_index = 0;
+		}
+	}
+
+	return last_index;
 }
 
 static u64 memblock_nid_range(u64 start, u64 end, int *nid)
@@ -1150,6 +1172,41 @@ static struct mdesc_mlgroup * __init find_mlgroup(u64 node)
 	return NULL;
 }
 
+static int find_numa_node_for_addr(unsigned long pa,
+				   struct node_mem_mask *pnode_mask)
+{
+	struct mdesc_handle *md = mdesc_grab();
+	u64 node, arc;
+	int i = 0;
+
+	node = mdesc_node_by_name(md, MDESC_NODE_NULL, "latency-groups");
+	if (node == MDESC_NODE_NULL)
+		goto out;
+
+	mdesc_for_each_node_by_name(md, node, "group") {
+		mdesc_for_each_arc(arc, md, node, MDESC_ARC_TYPE_FWD) {
+			u64 target = mdesc_arc_target(md, arc);
+			struct mdesc_mlgroup *m = find_mlgroup(target);
+
+			if (!m)
+				continue;
+			if ((pa & m->mask) == m->match) {
+				if (pnode_mask) {
+					pnode_mask->mask = m->mask;
+					pnode_mask->val = m->match;
+				}
+				mdesc_release(md);
+				return i;
+			}
+		}
+		i++;
+	}
+
+out:
+	mdesc_release(md);
+	return -1;
+}
+
 static int __init numa_attach_mlgroup(struct mdesc_handle *md, u64 grp,
 				      int index)
 {
-- 
2.11.0
