linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/mm: avoid truncating memblocks for SGX memory
@ 2021-06-17 19:46 Dave Hansen
  2021-06-18  8:44 ` Du, Fan
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Dave Hansen @ 2021-06-17 19:46 UTC
  To: linux-mm
  Cc: linux-kernel, Dave Hansen, fan.du, reinette.chatre, jarkko,
	dan.j.williams, dave.hansen, x86, linux-sgx, luto, peterz


From: Fan Du <fan.du@intel.com>

tl;dr:

Several SGX users reported seeing the following message on NUMA systems:

	sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0.

This turned out to be the 'memblock' code mistakenly throwing away
SGX memory.

=== Full Changelog ===

The 'max_pfn' variable represents the highest known RAM address.  It can
be used, for instance, to quickly determine for which physical addresses
there is mem_map[] space allocated.  The numa_meminfo code makes an
effort to throw out ("trim") all memory blocks which are above 'max_pfn'.

SGX memory is not considered RAM (it is marked as "Reserved" in the
e820) and is not taken into account by max_pfn.  Despite this, SGX
memory areas have NUMA affinity and are enumerated in the ACPI SRAT.
The existing SGX code uses the numa_meminfo mechanism to look up the
NUMA affinity for its memory areas.

In cases where SGX memory is above max_pfn (usually just the one EPC
section in the last NUMA node), the numa_memblock is truncated
at 'max_pfn', which is below the SGX memory.  When the SGX code tries to
look up the affinity of this memory, it fails and produces an error message:

	sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0.

and assigns the memory to NUMA node 0.

Instead of silently truncating the memory block at 'max_pfn' and
dropping the SGX memory, add the truncated portion to
'numa_reserved_meminfo'.  This allows the SGX code to later determine
the NUMA affinity of its 'Reserved' area.

Without this patch, numa_meminfo looks like this (from 'crash'):

  blk = { start =          0x0, end = 0x2080000000, nid = 0x0 }
        { start = 0x2080000000, end = 0x4000000000, nid = 0x1 }

numa_reserved_meminfo is empty.

After the patch, numa_meminfo looks like this:

  blk = { start =          0x0, end = 0x2080000000, nid = 0x0 }
        { start = 0x2080000000, end = 0x4000000000, nid = 0x1 }

and numa_reserved_meminfo has an entry for node 1's SGX memory:

  blk =  { start = 0x4000000000, end = 0x4080000000, nid = 0x1 }

 [ daveh: completely rewrote/reworked changelog ]

Signed-off-by: Fan Du <fan.du@intel.com>
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Fixes: 5d30f92e7631 ("x86/NUMA: Provide a range-to-target_node lookup facility")
Cc: x86@kernel.org
Cc: linux-sgx@vger.kernel.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
---

 b/arch/x86/mm/numa.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff -puN arch/x86/mm/numa.c~sgx-srat arch/x86/mm/numa.c
--- a/arch/x86/mm/numa.c~sgx-srat	2021-06-17 11:23:05.116159990 -0700
+++ b/arch/x86/mm/numa.c	2021-06-17 11:55:46.117155100 -0700
@@ -254,7 +254,13 @@ int __init numa_cleanup_meminfo(struct n
 
 		/* make sure all non-reserved blocks are inside the limits */
 		bi->start = max(bi->start, low);
-		bi->end = min(bi->end, high);
+
+		/* preserve info for non-RAM areas above 'max_pfn': */
+		if (bi->end > high) {
+			numa_add_memblk_to(bi->nid, high, bi->end,
+					   &numa_reserved_meminfo);
+			bi->end = high;
+		}
 
 		/* and there's no empty block */
 		if (bi->start >= bi->end)
_
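
The effect of the hunk above can be sketched outside the kernel. The following is a minimal userspace model, not the kernel implementation: the sketch_* types and helpers and the SKETCH_MAX_BLKS capacity are hypothetical stand-ins for struct numa_meminfo and numa_add_memblk_to(), and the assumed input is that node 1 was originally enumerated as 0x2080000000-0x4080000000 with 'high' at 0x4000000000, which is inferred from the before/after dumps in the changelog rather than stated in it.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BLKS 8	/* hypothetical capacity, not the kernel's */

/* Simplified stand-in for one NUMA memory block: [start, end) on node nid. */
struct sketch_blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/* Simplified stand-in for a table such as numa_meminfo. */
struct sketch_meminfo {
	size_t nr_blks;
	struct sketch_blk blk[SKETCH_MAX_BLKS];
};

/* Append [start, end) for node nid; returns 0 on success, -1 if rejected. */
int sketch_add_blk(struct sketch_meminfo *mi, int nid,
		   uint64_t start, uint64_t end)
{
	if (start >= end || mi->nr_blks >= SKETCH_MAX_BLKS)
		return -1;

	mi->blk[mi->nr_blks].start = start;
	mi->blk[mi->nr_blks].end = end;
	mi->blk[mi->nr_blks].nid = nid;
	mi->nr_blks++;
	return 0;
}

/*
 * Clamp every block in 'mi' so that it ends at 'high'.  Whatever lies
 * above 'high' is recorded in 'reserved' instead of being dropped.
 */
void sketch_trim(struct sketch_meminfo *mi, struct sketch_meminfo *reserved,
		 uint64_t high)
{
	size_t i = 0;

	while (i < mi->nr_blks) {
		struct sketch_blk *bi = &mi->blk[i];

		if (bi->end > high) {
			/*
			 * Keep the node affinity of the part above 'high'.
			 * The patch passes 'high' directly; the sketch also
			 * copes with a block that starts above 'high'.
			 */
			uint64_t from = bi->start > high ? bi->start : high;

			sketch_add_blk(reserved, bi->nid, from, bi->end);
			bi->end = high;
		}

		if (bi->start >= bi->end) {
			/* Block is now empty: overwrite it with the last one. */
			*bi = mi->blk[--mi->nr_blks];
			continue;
		}
		i++;
	}
}
```

Feeding the two assumed input blocks through sketch_trim() with high = 0x4000000000 should leave the main table as shown in the changelog and put a single node-1 entry for 0x4000000000-0x4080000000 in the reserved table.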


* RE: [PATCH] x86/mm: avoid truncating memblocks for SGX memory
  2021-06-17 19:46 [PATCH] x86/mm: avoid truncating memblocks for SGX memory Dave Hansen
@ 2021-06-18  8:44 ` Du, Fan
  2021-06-18 15:19 ` Dave Hansen
  2021-06-18 17:42 ` [tip: x86/urgent] x86/mm: Avoid " tip-bot2 for Fan Du
  2 siblings, 0 replies; 4+ messages in thread
From: Du, Fan @ 2021-06-18  8:44 UTC
  To: Dave Hansen, linux-mm
  Cc: linux-kernel, Chatre, Reinette, jarkko, Williams, Dan J, Hansen,
	Dave, x86, linux-sgx, luto, peterz, Du, Fan



>-----Original Message-----
>From: Dave Hansen <dave.hansen@linux.intel.com>
> [ daveh: completely rewrote/reworked changelog ]

Really, what's your PROBLEM?!
I neither asked you to send my patch nor agreed to change it.
Who granted you the right to do this?!
It's disgraceful to do this without notifying me.

If you have comments, please DO align first with the other two maintainers, Jarkko and Dan,
who already reviewed the patch in this format.

https://lkml.org/lkml/2021/6/17/1151




* Re: [PATCH] x86/mm: avoid truncating memblocks for SGX memory
  2021-06-17 19:46 [PATCH] x86/mm: avoid truncating memblocks for SGX memory Dave Hansen
  2021-06-18  8:44 ` Du, Fan
@ 2021-06-18 15:19 ` Dave Hansen
  2021-06-18 17:42 ` [tip: x86/urgent] x86/mm: Avoid " tip-bot2 for Fan Du
  2 siblings, 0 replies; 4+ messages in thread
From: Dave Hansen @ 2021-06-18 15:19 UTC
  To: Dave Hansen, linux-mm
  Cc: linux-kernel, fan.du, reinette.chatre, jarkko, dan.j.williams,
	x86, linux-sgx, luto, peterz

On 6/17/21 12:46 PM, Dave Hansen wrote:
> Signed-off-by: Fan Du <fan.du@intel.com>
> Reported-by: Reinette Chatre <reinette.chatre@intel.com>
> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Dave Hansen <dave.hansen@intel.com>
> Fixes: 5d30f92e7631 ("x86/NUMA: Provide a range-to-target_node lookup facility")
> Cc: x86@kernel.org
> Cc: linux-sgx@vger.kernel.org
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>

Forgot to add:

Signed-off-by: Dave Hansen <dave.hansen@intel.com>


* [tip: x86/urgent] x86/mm: Avoid truncating memblocks for SGX memory
  2021-06-17 19:46 [PATCH] x86/mm: avoid truncating memblocks for SGX memory Dave Hansen
  2021-06-18  8:44 ` Du, Fan
  2021-06-18 15:19 ` Dave Hansen
@ 2021-06-18 17:42 ` tip-bot2 for Fan Du
  2 siblings, 0 replies; 4+ messages in thread
From: tip-bot2 for Fan Du @ 2021-06-18 17:42 UTC
  To: linux-tip-commits
  Cc: Reinette Chatre, Fan Du, Dave Hansen, Borislav Petkov,
	Jarkko Sakkinen, Dan Williams, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     28e5e44aa3f4e0e0370864ed008fb5e2d85f4dc8
Gitweb:        https://git.kernel.org/tip/28e5e44aa3f4e0e0370864ed008fb5e2d85f4dc8
Author:        Fan Du <fan.du@intel.com>
AuthorDate:    Thu, 17 Jun 2021 12:46:57 -07:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Fri, 18 Jun 2021 19:37:01 +02:00

x86/mm: Avoid truncating memblocks for SGX memory

tl;dr:

Several SGX users reported seeing the following message on NUMA systems:

  sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0.

This turned out to be the memblock code mistakenly throwing away SGX
memory.

=== Full Changelog ===

The 'max_pfn' variable represents the highest known RAM address.  It can
be used, for instance, to quickly determine for which physical addresses
there is mem_map[] space allocated.  The numa_meminfo code makes an
effort to throw out ("trim") all memory blocks which are above 'max_pfn'.

SGX memory is not considered RAM (it is marked as "Reserved" in the
e820) and is not taken into account by max_pfn. Despite this, SGX memory
areas have NUMA affinity and are enumerated in the ACPI SRAT. The
existing SGX code uses the numa_meminfo mechanism to look up the NUMA
affinity for its memory areas.

In cases where SGX memory is above max_pfn (usually just the one EPC
section in the last NUMA node), the numa_memblock is truncated
at 'max_pfn', which is below the SGX memory.  When the SGX code tries to
look up the affinity of this memory, it fails and produces an error message:

  sgx: [Firmware Bug]: Unable to map EPC section to online node. Fallback to the NUMA node 0.

and assigns the memory to NUMA node 0.

Instead of silently truncating the memory block at 'max_pfn' and
dropping the SGX memory, add the truncated portion to
'numa_reserved_meminfo'.  This allows the SGX code to later determine
the NUMA affinity of its 'Reserved' area.

Before, numa_meminfo looked like this (from 'crash'):

  blk = { start =          0x0, end = 0x2080000000, nid = 0x0 }
        { start = 0x2080000000, end = 0x4000000000, nid = 0x1 }

numa_reserved_meminfo is empty.

With this, numa_meminfo looks like this:

  blk = { start =          0x0, end = 0x2080000000, nid = 0x0 }
        { start = 0x2080000000, end = 0x4000000000, nid = 0x1 }

and numa_reserved_meminfo has an entry for node 1's SGX memory:

  blk =  { start = 0x4000000000, end = 0x4080000000, nid = 0x1 }

 [ daveh: completely rewrote/reworked changelog ]

Fixes: 5d30f92e7631 ("x86/NUMA: Provide a range-to-target_node lookup facility")
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Fan Du <fan.du@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20210617194657.0A99CB22@viggo.jf.intel.com
---
 arch/x86/mm/numa.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 5eb4dc2..e94da74 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -254,7 +254,13 @@ int __init numa_cleanup_meminfo(struct numa_meminfo *mi)
 
 		/* make sure all non-reserved blocks are inside the limits */
 		bi->start = max(bi->start, low);
-		bi->end = min(bi->end, high);
+
+		/* preserve info for non-RAM areas above 'max_pfn': */
+		if (bi->end > high) {
+			numa_add_memblk_to(bi->nid, high, bi->end,
+					   &numa_reserved_meminfo);
+			bi->end = high;
+		}
 
 		/* and there's no empty block */
 		if (bi->start >= bi->end)
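
The lookup side described in the changelog can be modelled in the same spirit. This is a hedged userspace sketch, not the SGX or NUMA code itself: every sketch_* name is hypothetical, and it assumes the lookup consults the RAM table first and the reserved table second, then falls back to node 0 when neither has an entry, as the warning message suggests.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NO_NODE (-1)	/* hypothetical "no node found" marker */

/* Simplified stand-in for one NUMA memory block: [start, end) on node nid. */
struct sketch_blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/* Return the node of the block containing 'addr', or SKETCH_NO_NODE. */
int sketch_blks_to_nid(const struct sketch_blk *blk, size_t nr, uint64_t addr)
{
	for (size_t i = 0; i < nr; i++) {
		if (blk[i].start <= addr && addr < blk[i].end)
			return blk[i].nid;
	}
	return SKETCH_NO_NODE;
}

/* Assumed lookup order: RAM blocks first, then reserved blocks. */
int sketch_addr_to_nid(const struct sketch_blk *ram, size_t nr_ram,
		       const struct sketch_blk *reserved, size_t nr_reserved,
		       uint64_t addr)
{
	int nid = sketch_blks_to_nid(ram, nr_ram, addr);

	if (nid != SKETCH_NO_NODE)
		return nid;
	return sketch_blks_to_nid(reserved, nr_reserved, addr);
}

/* Model of the fallback in the warning: unknown affinity becomes node 0. */
int sketch_epc_node(const struct sketch_blk *ram, size_t nr_ram,
		    const struct sketch_blk *reserved, size_t nr_reserved,
		    uint64_t addr)
{
	int nid = sketch_addr_to_nid(ram, nr_ram, reserved, nr_reserved, addr);

	return nid == SKETCH_NO_NODE ? 0 : nid;
}
```

With the "before" tables from the changelog (an empty reserved table), an EPC address of 0x4000000000 falls back to node 0 in this model. With the "after" tables, the same address resolves to node 1.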

