From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@intel.com, luto@kernel.org, peterz@infradead.org
Cc: sathyanarayanan.kuppuswamy@linux.intel.com, aarcange@redhat.com,
	ak@linux.intel.com, dan.j.williams@intel.com, david@redhat.com,
	hpa@zytor.com, jgross@suse.com, jmattson@google.com,
	joro@8bytes.org, jpoimboe@redhat.com, knsathya@kernel.org,
	pbonzini@redhat.com, sdeep@vmware.com, seanjc@google.com,
	tony.luck@intel.com, vkuznets@redhat.com, wanpengli@tencent.com,
	thomas.lendacky@amd.com, brijesh.singh@amd.com, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: [PATCHv8 21/30] x86/acpi, x86/boot: Add multiprocessor wake-up support
Date: Wed,  6 Apr 2022 02:29:30 +0300
Message-ID: <20220405232939.73860-22-kirill.shutemov@linux.intel.com>
In-Reply-To: <20220405232939.73860-1-kirill.shutemov@linux.intel.com>

From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Secondary CPU startup is currently performed with something called
the "INIT/SIPI protocol".  This protocol requires assistance from
VMMs to boot guests.  As should be a familiar story by now, that
support cannot be provided to TDX guests because TDX VMMs are
not trusted by guests.

To remedy this situation, a new[1] "Multiprocessor Wakeup Structure"
has been added to an existing ACPI table (MADT).  This structure
provides the physical address of a "mailbox".  A write to the mailbox
then steers the secondary CPU to the boot code.
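
For illustration only, the mailbox handshake can be modelled in plain
userspace C.  This is a sketch under stated assumptions, not the kernel
implementation: the struct below is a simplified stand-in for the real
mailbox, the field layout is assumed, C11 atomics stand in for
smp_store_release()/READ_ONCE(), and the names mp_wake_mailbox,
os_post_wakeup(), fw_poll_once() and os_wakeup_acked() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Command values; the names are local to this sketch. */
enum { MP_WAKE_NOOP = 0, MP_WAKE_WAKEUP = 1 };

/*
 * Simplified model of the start of the mailbox.  The field order and
 * widths are an assumption; consult the ACPI specification for the real
 * layout, which also contains reserved areas that are omitted here.
 */
struct mp_wake_mailbox {
	_Atomic uint16_t command;
	uint16_t reserved;
	uint32_t apic_id;
	uint64_t wakeup_vector;
};

static_assert(offsetof(struct mp_wake_mailbox, wakeup_vector) == 8,
	      "unexpected padding in the mailbox model");

/*
 * OS side: fill in the target and entry point first, then publish the
 * command with release semantics so the payload is visible before it.
 */
static void os_post_wakeup(struct mp_wake_mailbox *mb, uint32_t apicid,
			   uint64_t start_ip)
{
	mb->apic_id = apicid;
	mb->wakeup_vector = start_ip;
	atomic_store_explicit(&mb->command, MP_WAKE_WAKEUP,
			      memory_order_release);
}

/*
 * Waiting-CPU side: one polling step.  Returns 1 and stores the entry
 * point in *entry if a wakeup addressed to my_apicid was pending, and
 * acknowledges it by clearing the command.  Returns 0 otherwise.
 */
static int fw_poll_once(struct mp_wake_mailbox *mb, uint32_t my_apicid,
			uint64_t *entry)
{
	if (atomic_load_explicit(&mb->command, memory_order_acquire) !=
	    MP_WAKE_WAKEUP)
		return 0;
	if (mb->apic_id != my_apicid)
		return 0;

	*entry = mb->wakeup_vector;
	atomic_store_explicit(&mb->command, MP_WAKE_NOOP,
			      memory_order_release);
	return 1;
}

/* OS side: has the target CPU acknowledged the request yet? */
static int os_wakeup_acked(struct mp_wake_mailbox *mb)
{
	return atomic_load_explicit(&mb->command, memory_order_acquire) ==
	       MP_WAKE_NOOP;
}
```

In this model the OS would call os_post_wakeup() and then spin until
os_wakeup_acked() returns true, while each waiting CPU repeatedly calls
fw_poll_once() with its own APIC ID and jumps to the returned entry point
once it is selected.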

Add ACPI MADT wake structure parsing support and wake support.  Use
this support to wake CPUs whenever it is present instead of INIT/SIPI.

While this structure can theoretically be used on 32-bit kernels,
there are no 32-bit TDX guest kernels.  It has not been tested and
cannot practically *be* tested on 32-bit.  Make it 64-bit only.

[1] Details about the new structure can be found in ACPI v6.4, in the
    "Multiprocessor Wakeup Structure" section.

Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
---
 arch/x86/include/asm/apic.h |  5 ++
 arch/x86/kernel/acpi/boot.c | 93 ++++++++++++++++++++++++++++++++++++-
 arch/x86/kernel/apic/apic.c | 10 ++++
 3 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 35006e151774..bd8ae0a7010a 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -490,6 +490,11 @@ static inline unsigned int read_apic_id(void)
 	return apic->get_apic_id(reg);
 }
 
+#ifdef CONFIG_X86_64
+typedef int (*wakeup_cpu_handler)(int apicid, unsigned long start_eip);
+extern void acpi_wake_cpu_handler_update(wakeup_cpu_handler handler);
+#endif
+
 extern int default_apic_id_valid(u32 apicid);
 extern int default_acpi_madt_oem_check(char *, char *);
 extern void default_setup_apic_routing(void);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 0d01e7f5078c..6d2c50819501 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -65,6 +65,13 @@ static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE;
 static bool acpi_support_online_capable;
 #endif
 
+#ifdef CONFIG_X86_64
+/* Physical address of the Multiprocessor Wakeup Structure mailbox */
+static u64 acpi_mp_wake_mailbox_paddr;
+/* Virtual address of the Multiprocessor Wakeup Structure mailbox */
+static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox;
+#endif
+
 #ifdef CONFIG_X86_IO_APIC
 /*
  * Locks related to IOAPIC hotplug
@@ -336,7 +343,60 @@ acpi_parse_lapic_nmi(union acpi_subtable_headers * header, const unsigned long e
 	return 0;
 }
 
-#endif				/*CONFIG_X86_LOCAL_APIC */
+#ifdef CONFIG_X86_64
+static int acpi_wakeup_cpu(int apicid, unsigned long start_ip)
+{
+	/*
+	 * Remap mailbox memory only for the first call to acpi_wakeup_cpu().
+	 *
+	 * Wakeup of secondary CPUs is fully serialized in the core code.
+	 * No need to protect acpi_mp_wake_mailbox from concurrent accesses.
+	 */
+	if (!acpi_mp_wake_mailbox) {
+		acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr,
+						sizeof(*acpi_mp_wake_mailbox),
+						MEMREMAP_WB);
+	}
+
+	/*
+	 * Mailbox memory is shared between the firmware and OS. Firmware will
+	 * listen on mailbox command address, and once it receives the wakeup
+	 * command, the CPU associated with the given apicid will be booted.
+	 *
+	 * The value of 'apic_id' and 'wakeup_vector' must be visible to the
+	 * firmware before the wakeup command is visible.  smp_store_release()
+	 * ensures ordering and visibility.
+	 */
+	acpi_mp_wake_mailbox->apic_id	    = apicid;
+	acpi_mp_wake_mailbox->wakeup_vector = start_ip;
+	smp_store_release(&acpi_mp_wake_mailbox->command,
+			  ACPI_MP_WAKE_COMMAND_WAKEUP);
+
+	/*
+	 * Wait for the CPU to wake up.
+	 *
+	 * The CPU being woken up is essentially in a spin loop waiting to be
+	 * woken up. It should not take long for it to wake up and acknowledge
+	 * by zeroing out ->command.
+	 *
+	 * The ACPI specification doesn't provide any guidance on how long the
+	 * kernel has to wait for a wake-up acknowledgement. It also doesn't
+	 * provide a way to cancel a wake-up request if it takes too long.
+	 *
+	 * In a TDX environment, the VMM has control over how long it takes to
+	 * wake up a secondary CPU. It can postpone scheduling the secondary
+	 * vCPU indefinitely. Giving up on the wake-up request and reporting an
+	 * error opens a possible attack vector for the VMM: it can wake up a
+	 * secondary CPU when the kernel doesn't expect it. Wait until the
+	 * wake-up request succeeds.
+	 */
+	while (READ_ONCE(acpi_mp_wake_mailbox->command))
+		cpu_relax();
+
+	return 0;
+}
+#endif /* CONFIG_X86_64 */
+#endif /* CONFIG_X86_LOCAL_APIC */
 
 #ifdef CONFIG_X86_IO_APIC
 #define MP_ISA_BUS		0
@@ -1083,6 +1143,29 @@ static int __init acpi_parse_madt_lapic_entries(void)
 	}
 	return 0;
 }
+
+#ifdef CONFIG_X86_64
+static int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
+				     const unsigned long end)
+{
+	struct acpi_madt_multiproc_wakeup *mp_wake;
+
+	if (!IS_ENABLED(CONFIG_SMP))
+		return -ENODEV;
+
+	mp_wake = (struct acpi_madt_multiproc_wakeup *)header;
+	if (BAD_MADT_ENTRY(mp_wake, end))
+		return -EINVAL;
+
+	acpi_table_print_madt_entry(&header->common);
+
+	acpi_mp_wake_mailbox_paddr = mp_wake->base_address;
+
+	acpi_wake_cpu_handler_update(acpi_wakeup_cpu);
+
+	return 0;
+}
+#endif				/* CONFIG_X86_64 */
 #endif				/* CONFIG_X86_LOCAL_APIC */
 
 #ifdef	CONFIG_X86_IO_APIC
@@ -1278,6 +1361,14 @@ static void __init acpi_process_madt(void)
 
 				smp_found_config = 1;
 			}
+
+#ifdef CONFIG_X86_64
+			/*
+			 * Parse MADT MP Wake entry.
+			 */
+			acpi_table_parse_madt(ACPI_MADT_TYPE_MULTIPROC_WAKEUP,
+					      acpi_parse_mp_wake, 1);
+#endif
 		}
 		if (error == -EINVAL) {
 			/*
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index b70344bf6600..3c8f2c797a98 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2551,6 +2551,16 @@ u32 x86_msi_msg_get_destid(struct msi_msg *msg, bool extid)
 }
 EXPORT_SYMBOL_GPL(x86_msi_msg_get_destid);
 
+#ifdef CONFIG_X86_64
+void __init acpi_wake_cpu_handler_update(wakeup_cpu_handler handler)
+{
+	struct apic **drv;
+
+	for (drv = __apicdrivers; drv < __apicdrivers_end; drv++)
+		(*drv)->wakeup_secondary_cpu_64 = handler;
+}
+#endif
+
 /*
  * Override the generic EOI implementation with an optimized version.
  * Only called during early boot when only one CPU is active and with
-- 
2.35.1


