From: Kai Huang <kai.huang@intel.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com, dave.hansen@intel.com,
	len.brown@intel.com, tony.luck@intel.com,
	rafael.j.wysocki@intel.com, reinette.chatre@intel.com,
	dan.j.williams@intel.com, peterz@infradead.org,
	ak@linux.intel.com, kirill.shutemov@linux.intel.com,
	sathyanarayanan.kuppuswamy@linux.intel.com,
	isaku.yamahata@intel.com, kai.huang@intel.com
Subject: [PATCH v4 05/22] x86/virt/tdx: Prevent hot-add driver managed memory
Date: Wed,  1 Jun 2022 07:39:28 +1200
Message-ID: <b19661b081d502c767fe503920ff8975f11c2012.1654025431.git.kai.huang@intel.com>
In-Reply-To: <cover.1654025430.git.kai.huang@intel.com>

TDX provides increased levels of memory confidentiality and integrity.
This requires special hardware support for features like memory
encryption and storage of memory integrity checksums.  Not all memory
satisfies these requirements.

As a result, TDX introduces the concept of a "Convertible Memory
Region" (CMR).  During boot, the firmware builds a list of all of the
memory ranges that can provide the TDX security guarantees.  This list
of ranges is available to the kernel by querying the TDX module.

However, those TDX-capable memory regions are not automatically usable
by the TDX module.  The kernel needs to choose which convertible memory
regions will be the TDX-usable memory and pass those regions to the TDX
module when initializing it.  Once those ranges have been passed to the
TDX module, the TDX-usable memory regions are fixed for the module's
lifetime.

To avoid having to modify the page allocator to distinguish between TDX
and non-TDX memory allocations, this implementation guarantees that all
pages managed by the page allocator are TDX memory.  This means any
memory hot-added to the page allocator would break that guarantee and
thus must be prevented.

There are basically two memory hot-add cases that need to be prevented:
ACPI memory hot-add and driver-managed memory hot-add.  However, adding
new memory to ZONE_DEVICE should not be prevented, as those pages are
not managed by the page allocator.  Therefore the memremap_pages()
variants should still be allowed, even though they internally also use
memory hotplug functions (see the sketch below).
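
For reference, a rough sketch of how those two paths differ
(illustrative only, not part of this patch; the callers and arguments
are simplified from typical users such as dax/kmem and device-dax):

	/*
	 * Driver-managed hot-add (e.g. dax/kmem): goes through
	 * add_memory_resource(), so the new pages end up in the page
	 * allocator and must be TDX memory.
	 */
	rc = add_memory_driver_managed(nid, range.start, range_len(&range),
				       "System RAM (kmem)", MHP_NONE);

	/*
	 * ZONE_DEVICE (e.g. device-dax, pmem): memremap_pages() maps the
	 * range via arch_add_memory()/add_pages() directly; the pages are
	 * never handed to the page allocator, so this path is unaffected.
	 */
	addr = devm_memremap_pages(dev, &pgmap);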

ACPI memory hotplug is already prevented.  To prevent driver-managed
memory hot-add while still allowing the memremap_pages() variants to
work, add a __weak hook to do an arch-specific check in
add_memory_resource().  Implement the x86 version to prevent new memory
regions from being added when TDX is enabled by the BIOS.

A __weak arch-specific hook is used instead of a new CC_ATTR similar to
the one that disables software CPU hotplug.  This is because some
driver-managed memory resources may actually be TDX-capable (such as
legacy PMEM, which is indeed RAM underneath), and the arch-specific hook
can be enhanced later to allow those when needed (see the hypothetical
sketch below).
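
As an illustration of that possible enhancement (purely hypothetical
and not part of this patch; tdx_cmr_covers_range() is a made-up helper
used only to show the idea), the x86 hook could later grow into
something like:

	int arch_memory_add_precheck(int nid, u64 start, u64 size, mhp_t mhp_flags)
	{
		if (!platform_tdx_enabled())
			return 0;

		/*
		 * Hypothetical: allow the hot-add if the whole range is
		 * already covered by the TDX-usable memory regions.
		 */
		if (tdx_cmr_covers_range(start, start + size))
			return 0;

		pr_err("Unable to add memory [0x%llx, 0x%llx) on TDX enabled platform.\n",
		       start, start + size);
		return -EINVAL;
	}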

Note that an arch-specific hook for __remove_memory() is not required,
as neither ACPI hot-removal nor driver-managed memory removal can reach
it.

Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/mm/init_64.c          | 21 +++++++++++++++++++++
 include/linux/memory_hotplug.h |  2 ++
 mm/memory_hotplug.c            | 15 +++++++++++++++
 3 files changed, 38 insertions(+)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 96d34ebb20a9..ce89cf88a818 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -55,6 +55,7 @@
 #include <asm/uv/uv.h>
 #include <asm/setup.h>
 #include <asm/ftrace.h>
+#include <asm/tdx.h>
 
 #include "mm_internal.h"
 
@@ -972,6 +973,26 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return add_pages(nid, start_pfn, nr_pages, params);
 }
 
+int arch_memory_add_precheck(int nid, u64 start, u64 size, mhp_t mhp_flags)
+{
+	if (!platform_tdx_enabled())
+		return 0;
+
+	/*
+	 * TDX needs to guarantee all pages managed by the page allocator
+	 * are TDX memory in order to not have to distinguish TDX and
+	 * non-TDX memory allocation.  The kernel needs to pass the
+	 * TDX-usable memory regions to the TDX module when it gets
+	 * initialized.  After that, the TDX-usable memory regions are
+	 * fixed.  This means any memory hot-add to the page allocator
+	 * will break above guarantee thus should be prevented.
+	 */
+	pr_err("Unable to add memory [0x%llx, 0x%llx) on TDX enabled platform.\n",
+			start, start + size);
+
+	return -EINVAL;
+}
+
 static void __meminit free_pagetable(struct page *page, int order)
 {
 	unsigned long magic;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 1ce6f8044f1e..306ef4ceb419 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -325,6 +325,8 @@ extern int add_memory_resource(int nid, struct resource *resource,
 extern int add_memory_driver_managed(int nid, u64 start, u64 size,
 				     const char *resource_name,
 				     mhp_t mhp_flags);
+extern int arch_memory_add_precheck(int nid, u64 start, u64 size,
+				    mhp_t mhp_flags);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 				   unsigned long nr_pages,
 				   struct vmem_altmap *altmap, int migratetype);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 416b38ca8def..2ad4b2603c7c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1296,6 +1296,17 @@ bool mhp_supports_memmap_on_memory(unsigned long size)
 	       IS_ALIGNED(remaining_size, (pageblock_nr_pages << PAGE_SHIFT));
 }
 
+/*
+ * Pre-check whether hot-add memory is allowed before arch_add_memory().
+ *
+ * Arch to provide replacement version if required.
+ */
+int __weak arch_memory_add_precheck(int nid, u64 start, u64 size,
+				    mhp_t mhp_flags)
+{
+	return 0;
+}
+
 /*
  * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
  * and online/offline operations (triggered e.g. by sysfs).
@@ -1319,6 +1330,10 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	if (ret)
 		return ret;
 
+	ret = arch_memory_add_precheck(nid, start, size, mhp_flags);
+	if (ret)
+		return ret;
+
 	if (mhp_flags & MHP_NID_IS_MGID) {
 		group = memory_group_find_by_id(nid);
 		if (!group)
-- 
2.35.3


Thread overview: 23+ messages
2022-05-31 19:39 [PATCH v4 00/22] TDX host kernel support Kai Huang
2022-05-31 19:39 ` [PATCH v4 01/22] x86/virt/tdx: Detect TDX during kernel boot Kai Huang
2022-05-31 19:39 ` [PATCH v4 02/22] cc_platform: Add new attribute to prevent ACPI CPU hotplug Kai Huang
2022-05-31 19:39 ` [PATCH v4 03/22] cc_platform: Add new attribute to prevent ACPI memory hotplug Kai Huang
2022-05-31 19:39 ` [PATCH v4 04/22] x86/virt/tdx: Prevent ACPI CPU hotplug and " Kai Huang
2022-05-31 19:39 ` Kai Huang [this message]
2022-05-31 19:39 ` [PATCH v4 06/22] x86/virt/tdx: Add skeleton to initialize TDX on demand Kai Huang
2022-05-31 19:39 ` [PATCH v4 07/22] x86/virt/tdx: Implement SEAMCALL function Kai Huang
2022-05-31 19:39 ` [PATCH v4 08/22] x86/virt/tdx: Shut down TDX module in case of error Kai Huang
2022-05-31 19:39 ` [PATCH v4 09/22] x86/virt/tdx: Detect TDX module by doing module global initialization Kai Huang
2022-05-31 19:39 ` [PATCH v4 10/22] x86/virt/tdx: Do logical-cpu scope TDX module initialization Kai Huang
2022-05-31 19:39 ` [PATCH v4 11/22] x86/virt/tdx: Get information about TDX module and TDX-capable memory Kai Huang
2022-05-31 19:39 ` [PATCH v4 12/22] x86/virt/tdx: Convert all memory regions in memblock to TDX memory Kai Huang
2022-05-31 19:39 ` [PATCH v4 13/22] x86/virt/tdx: Add placeholder to construct TDMRs based on memblock Kai Huang
2022-05-31 19:39 ` [PATCH v4 14/22] x86/virt/tdx: Create TDMRs to cover all memblock memory regions Kai Huang
2022-05-31 19:39 ` [PATCH v4 15/22] x86/virt/tdx: Allocate and set up PAMTs for TDMRs Kai Huang
2022-05-31 19:39 ` [PATCH v4 16/22] x86/virt/tdx: Set up reserved areas for all TDMRs Kai Huang
2022-05-31 19:39 ` [PATCH v4 17/22] x86/virt/tdx: Reserve TDX module global KeyID Kai Huang
2022-05-31 19:39 ` [PATCH v4 18/22] x86/virt/tdx: Configure TDX module with TDMRs and " Kai Huang
2022-05-31 19:39 ` [PATCH v4 19/22] x86/virt/tdx: Configure global KeyID on all packages Kai Huang
2022-05-31 19:39 ` [PATCH v4 20/22] x86/virt/tdx: Initialize all TDMRs Kai Huang
2022-05-31 19:39 ` [PATCH v4 21/22] x86/virt/tdx: Support kexec() Kai Huang
2022-05-31 19:39 ` [PATCH v4 22/22] Documentation/x86: Add documentation for TDX host support Kai Huang
