All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/3] TDX Guest Quote generation support
@ 2023-03-26  6:20 Kuppuswamy Sathyanarayanan
  2023-03-26  6:20 ` [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-03-26  6:20 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Guorui Yu, Du Fan,
	linux-kernel, linux-kselftest, linux-doc

Hi All,

In TDX guest, the attestation process is used to verify the TDX guest
trustworthiness to other entities before provisioning secrets to the
guest.

The TDX guest attestation process consists of two steps:

1. TDREPORT generation
2. Quote generation.

The First step (TDREPORT generation) involves getting the TDX guest
measurement data in the format of TDREPORT which is further used to
validate the authenticity of the TDX guest. The second step involves
sending the TDREPORT to a Quoting Enclave (QE) server to generate a
remotely verifiable Quote. TDREPORT by design can only be verified on
the local platform. To support remote verification of the TDREPORT,
TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
locally and convert it to a remotely verifiable Quote. Although
attestation software can use communication methods like TCP/IP or
vsock to send the TDREPORT to QE, not all platforms support these
communication models. So TDX GHCI specification [1] defines a method
for Quote generation via hypercalls. Please check the discussion from
Google [2] and Alibaba [3] which clarifies the need for hypercall based
Quote generation support. This patch set adds this support.

Support for TDREPORT generation already exists in the TDX guest driver. 
This patchset extends the same driver to add the Quote generation
support.

Following are the details of the patch set:

Patch 1/3 -> Adds event notification IRQ support.
Patch 2/3 -> Adds Quote generation support.
Patch 3/3 -> Adds selftest support for Quote generation feature.

[1] https://cdrdv2.intel.com/v1/dl/getContent/726790, section titled "TDG.VP.VMCALL<GetQuote>".
[2] https://lore.kernel.org/lkml/CAAYXXYxxs2zy_978GJDwKfX5Hud503gPc8=1kQ-+JwG_kA79mg@mail.gmail.com/
[3] https://lore.kernel.org/lkml/a69faebb-11e8-b386-d591-dbd08330b008@linux.alibaba.com/

Kuppuswamy Sathyanarayanan (3):
  x86/tdx: Add TDX Guest event notify interrupt support
  virt: tdx-guest: Add Quote generation support
  selftests/tdx: Test GetQuote TDX attestation feature

 Documentation/virt/coco/tdx-guest.rst        |  11 +
 arch/x86/coco/tdx/tdx.c                      | 203 +++++++++++++++
 arch/x86/include/asm/tdx.h                   |   8 +
 drivers/virt/coco/tdx-guest/tdx-guest.c      | 249 ++++++++++++++++++-
 include/uapi/linux/tdx-guest.h               |  44 ++++
 tools/testing/selftests/tdx/tdx_guest_test.c |  68 ++++-
 6 files changed, 575 insertions(+), 8 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-03-26  6:20 [PATCH v1 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
@ 2023-03-26  6:20 ` Kuppuswamy Sathyanarayanan
  2023-03-28  2:38   ` Huang, Kai
  2023-03-26  6:20 ` [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support Kuppuswamy Sathyanarayanan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-03-26  6:20 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Guorui Yu, Du Fan,
	linux-kernel, linux-kselftest, linux-doc

Host-guest event notification via configured interrupt vector is useful
in cases where a guest makes an asynchronous request and needs a
callback from the host to indicate the completion or to let the host
notify the guest about events like device removal. One usage example is,
callback requirement of GetQuote asynchronous hypercall.

In TDX guest, SetupEventNotifyInterrupt hypercall can be used by the
guest to specify which interrupt vector to use as an event-notify
vector to the VMM. Details about the SetupEventNotifyInterrupt
hypercall can be found in TDX Guest-Host Communication Interface
(GHCI) Specification, sec 3.5 "VP.VMCALL<SetupEventNotifyInterrupt>".

As per design, VMM will post the event completion IRQ using the same
CPU in which SetupEventNotifyInterrupt hypercall request is received.
So allocate an IRQ vector from "x86_vector_domain", and set the CPU
affinity of the IRQ vector to the current CPU.

Add tdx_register_event_irq_cb()/tdx_unregister_event_irq_cb()
interfaces to allow drivers register/unregister event noficiation
handlers.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Wander Lairson Costa <wander@redhat.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 arch/x86/coco/tdx/tdx.c    | 163 +++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/tdx.h |   6 ++
 2 files changed, 169 insertions(+)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 055300e08fb3..d03985952d45 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,12 +7,18 @@
 #include <linux/cpufeature.h>
 #include <linux/export.h>
 #include <linux/io.h>
+#include <linux/string.h>
+#include <linux/uaccess.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/numa.h>
 #include <asm/coco.h>
 #include <asm/tdx.h>
 #include <asm/vmx.h>
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
 #include <asm/pgtable.h>
+#include <asm/irqdomain.h>
 
 /* TDX module Call Leaf IDs */
 #define TDX_GET_INFO			1
@@ -27,6 +33,7 @@
 /* TDX hypercall Leaf IDs */
 #define TDVMCALL_MAP_GPA		0x10001
 #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
+#define TDVMCALL_SETUP_NOTIFY_INTR	0x10004
 
 /* MMIO direction */
 #define EPT_READ	0
@@ -51,6 +58,16 @@
 
 #define TDREPORT_SUBTYPE_0	0
 
+struct event_irq_entry {
+	tdx_event_irq_cb_t handler;
+	void *data;
+	struct list_head head;
+};
+
+static int tdx_event_irq;
+static LIST_HEAD(event_irq_cb_list);
+static DEFINE_SPINLOCK(event_irq_cb_lock);
+
 /*
  * Wrapper for standard use of __tdx_hypercall with no output aside from
  * return code.
@@ -873,3 +890,149 @@ void __init tdx_early_init(void)
 
 	pr_info("Guest detected\n");
 }
+
+static irqreturn_t tdx_event_irq_handler(int irq, void *dev_id)
+{
+	struct event_irq_entry *entry;
+
+	spin_lock(&event_irq_cb_lock);
+	list_for_each_entry(entry, &event_irq_cb_list, head) {
+		if (entry->handler)
+			entry->handler(entry->data);
+	}
+	spin_unlock(&event_irq_cb_lock);
+
+	return IRQ_HANDLED;
+}
+
+/* Reserve an IRQ from x86_vector_domain for TD event notification */
+static int __init tdx_event_irq_init(void)
+{
+	struct irq_alloc_info info;
+	cpumask_t saved_cpumask;
+	struct irq_cfg *cfg;
+	int cpu, irq;
+
+	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
+		return 0;
+
+	init_irq_alloc_info(&info, NULL);
+
+	/*
+	 * Event notification vector will be delivered to the CPU
+	 * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
+	 * So set the IRQ affinity to the current CPU.
+	 */
+	cpu = get_cpu();
+	cpumask_copy(&saved_cpumask, current->cpus_ptr);
+	info.mask = cpumask_of(cpu);
+	put_cpu();
+
+	irq = irq_domain_alloc_irqs(x86_vector_domain, 1, NUMA_NO_NODE, &info);
+	if (irq <= 0) {
+		pr_err("Event notification IRQ allocation failed %d\n", irq);
+		return -EIO;
+	}
+
+	irq_set_handler(irq, handle_edge_irq);
+
+	cfg = irq_cfg(irq);
+	if (!cfg) {
+		pr_err("Event notification IRQ config not found\n");
+		goto err_free_irqs;
+	}
+
+	if (request_irq(irq, tdx_event_irq_handler, IRQF_NOBALANCING,
+			"tdx_event_irq", NULL)) {
+		pr_err("Event notification IRQ request failed\n");
+		goto err_free_irqs;
+	}
+
+	set_cpus_allowed_ptr(current, cpumask_of(cpu));
+
+	/*
+	 * Register callback vector address with VMM. More details
+	 * about the ABI can be found in TDX Guest-Host-Communication
+	 * Interface (GHCI), sec titled
+	 * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
+	 */
+	if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, cfg->vector, 0, 0, 0)) {
+		pr_err("Event notification hypercall failed\n");
+		goto err_restore_cpus;
+	}
+
+	set_cpus_allowed_ptr(current, &saved_cpumask);
+
+	tdx_event_irq = irq;
+
+	return 0;
+
+err_restore_cpus:
+	set_cpus_allowed_ptr(current, &saved_cpumask);
+	free_irq(irq, NULL);
+err_free_irqs:
+	irq_domain_free_irqs(irq, 1);
+
+	return -EIO;
+}
+arch_initcall(tdx_event_irq_init)
+
+/**
+ * tdx_register_event_irq_cb() - Register TDX event IRQ callback handler.
+ * @handler: Address of driver specific event IRQ callback handler. Handler
+ *           will be called in IRQ context and hence cannot sleep.
+ * @data: Context data to be passed to the callback handler.
+ *
+ * Return: 0 on success or standard error code on other failures.
+ */
+int tdx_register_event_irq_cb(tdx_event_irq_cb_t handler, void *data)
+{
+	struct event_irq_entry *entry;
+	unsigned long flags;
+
+	if (tdx_event_irq <= 0)
+		return -EIO;
+
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return -ENOMEM;
+
+	entry->data = data;
+	entry->handler = handler;
+
+	spin_lock_irqsave(&event_irq_cb_lock, flags);
+	list_add_tail(&entry->head, &event_irq_cb_list);
+	spin_unlock_irqrestore(&event_irq_cb_lock, flags);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(tdx_register_event_irq_cb);
+
+/**
+ * tdx_unregister_event_irq_cb() - Unregister TDX event IRQ callback handler.
+ * @handler: Address of driver specific event IRQ callback handler.
+ * @data: Context data to be passed to the callback handler.
+ *
+ * Return: 0 on success or -EIO if event IRQ is not allocated.
+ */
+int tdx_unregister_event_irq_cb(tdx_event_irq_cb_t handler, void *data)
+{
+	struct event_irq_entry *entry;
+	unsigned long flags;
+
+	if (tdx_event_irq <= 0)
+		return -EIO;
+
+	spin_lock_irqsave(&event_irq_cb_lock, flags);
+	list_for_each_entry(entry, &event_irq_cb_list, head) {
+		if (entry->handler == handler && entry->data == data) {
+			list_del(&entry->head);
+			kfree(entry);
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&event_irq_cb_lock, flags);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(tdx_unregister_event_irq_cb);
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 28d889c9aa16..8807fe1b1f3f 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -53,6 +53,8 @@ struct ve_info {
 
 #ifdef CONFIG_INTEL_TDX_GUEST
 
+typedef int (*tdx_event_irq_cb_t)(void *);
+
 void __init tdx_early_init(void);
 
 /* Used to communicate with the TDX module */
@@ -69,6 +71,10 @@ bool tdx_early_handle_ve(struct pt_regs *regs);
 
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
 
+int tdx_register_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
+
+int tdx_unregister_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
+
 #else
 
 static inline void tdx_early_init(void) { };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support
  2023-03-26  6:20 [PATCH v1 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
  2023-03-26  6:20 ` [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
@ 2023-03-26  6:20 ` Kuppuswamy Sathyanarayanan
  2023-03-26  6:34   ` Greg KH
  2023-03-26  6:20 ` [PATCH v1 3/3] selftests/tdx: Test GetQuote TDX attestation feature Kuppuswamy Sathyanarayanan
  2023-03-27 17:36 ` [PATCH v1 0/3] TDX Guest Quote generation support Erdem Aktas
  3 siblings, 1 reply; 16+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-03-26  6:20 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Guorui Yu, Du Fan,
	linux-kernel, linux-kselftest, linux-doc

In TDX guest, the second stage in attestation process is to send the
TDREPORT to QE/QGS to generate the TD Quote. For platforms that does
not support communication channels like vsock or TCP/IP, implement
support to get TD Quote using hypercall. GetQuote hypercall can be used
by the TD guest to request VMM facilitate the Quote generation via
QE/QGS. More details about GetQuote hypercall can be found in TDX
Guest-Host Communication Interface (GHCI) for Intel TDX 1.0, section
titled "TDG.VP.VMCALL<GetQuote>".

Add support for TDX_CMD_GET_QUOTE IOCTL to allow attestation agent
submit GetQuote requests from the user space using GetQuote hypercall.

Since GetQuote is an asynchronous request hypercall, VMM will use
callback interrupt vector configured by SetupEventNotifyInterrupt
hypercall to notify the guest about Quote generation completion or
failure. So register an IRQ handler for it.

GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
with TDREPORT data as input, which is further used by the VMM to copy
the TD Quote result after successful Quote generation. To create the
shared buffer without breaking the direct map, use DMA alloc APIs.

Also note that, shared buffer allocation is currently handled in IOCTL
handler, although it will increase the TDX_CMD_GET_QUOTE IOCTL response
time, it is negligible compared to the time required for the quote
generation completion. So IOCTL performance optimization is not
considered at this time.

To support parallel GetQuote requests, use linked list to track the
active GetQuote requests and upon receiving the callback IRQ, loop
through the active requests and mark the processed requests complete.
Users can open multiple instances of the attestation device and send
GetQuote requests in parallel.

Since GetQuote support requires usage of DMA APIs, convert TDX guest
driver to a platform driver.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 Documentation/virt/coco/tdx-guest.rst   |  11 ++
 arch/x86/coco/tdx/tdx.c                 |  40 ++++
 arch/x86/include/asm/tdx.h              |   2 +
 drivers/virt/coco/tdx-guest/tdx-guest.c | 249 +++++++++++++++++++++++-
 include/uapi/linux/tdx-guest.h          |  44 +++++
 5 files changed, 344 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/coco/tdx-guest.rst b/Documentation/virt/coco/tdx-guest.rst
index 46e316db6bb4..69954ed85c68 100644
--- a/Documentation/virt/coco/tdx-guest.rst
+++ b/Documentation/virt/coco/tdx-guest.rst
@@ -42,6 +42,17 @@ ABI. However, in the future, if the TDX Module supports more than one subtype,
 a new IOCTL CMD will be created to handle it. To keep the IOCTL naming
 consistent, a subtype index is added as part of the IOCTL CMD.
 
+2.2 TDX_CMD_GET_QUOTE
+----------------------
+
+:Input parameters: struct tdx_quote_req
+:Output: Return 0 on success, -EINTR for interrupted request, -EIO on TDCALL
+         failure or standard error number on common failures. Upon successful
+         execution, QUOTE data is copied to tdx_quote_req.buf.
+
+The TDX_CMD_GET_QUOTE IOCTL can be used by attestation software to generate
+QUOTE for the given TDREPORT using TDG.VP.VMCALL<GetQuote> hypercall.
+
 Reference
 ---------
 
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index d03985952d45..b823c4f493d4 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -34,6 +34,7 @@
 #define TDVMCALL_MAP_GPA		0x10001
 #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
 #define TDVMCALL_SETUP_NOTIFY_INTR	0x10004
+#define TDVMCALL_GET_QUOTE		0x10002
 
 /* MMIO direction */
 #define EPT_READ	0
@@ -199,6 +200,45 @@ static void __noreturn tdx_panic(const char *msg)
 		__tdx_hypercall(&args, 0);
 }
 
+/**
+ * tdx_hcall_get_quote() - Wrapper to request TD Quote using GetQuote
+ *                         hypercall.
+ * @tdquote: Address of the direct mapped kernel buffer which contains
+ *	     TDREPORT data. The same buffer will be used by VMM to store
+ *	     the generated TD Quote output.
+ * @size: size of the tdquote buffer.
+ *
+ * Refer to section titled "TDG.VP.VMCALL<GetQuote>" in the TDX GHCI
+ * v1.0 specification for more information on GetQuote hypercall.
+ * It is used in the TDX guest driver module to get the TD Quote.
+ *
+ * Return 0 on success or error code on failure.
+ */
+int tdx_hcall_get_quote(u8 *tdquote, size_t size)
+{
+	struct tdx_hypercall_args args = {0};
+
+	/*
+	 * TDX guest driver is the only user of this function and it uses
+	 * the kernel mapped memory. So use virt_to_phys() to get the
+	 * physical address of the TDQuote buffer without any additional
+	 * checks for memory type.
+	 */
+	args.r10 = TDX_HYPERCALL_STANDARD;
+	args.r11 = TDVMCALL_GET_QUOTE;
+	args.r12 = cc_mkdec(virt_to_phys(tdquote));
+	args.r13 = size;
+
+	/*
+	 * Pass the physical address of TDREPORT to the VMM and
+	 * trigger the Quote generation. It is not a blocking
+	 * call, hence completion of this request will be notified to
+	 * the TD guest via a callback interrupt.
+	 */
+	return __tdx_hypercall(&args, 0);
+}
+EXPORT_SYMBOL_GPL(tdx_hcall_get_quote);
+
 static void tdx_parse_tdinfo(u64 *cc_mask)
 {
 	struct tdx_module_output out;
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 8807fe1b1f3f..a72bd7b96564 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -75,6 +75,8 @@ int tdx_register_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
 
 int tdx_unregister_event_irq_cb(tdx_event_irq_cb_t handler, void *data);
 
+int tdx_hcall_get_quote(u8 *tdquote, size_t size);
+
 #else
 
 static inline void tdx_early_init(void) { };
diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
index 5e44a0fa69bd..2424a0d02bc8 100644
--- a/drivers/virt/coco/tdx-guest/tdx-guest.c
+++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
@@ -12,12 +12,151 @@
 #include <linux/mod_devicetable.h>
 #include <linux/string.h>
 #include <linux/uaccess.h>
+#include <linux/platform_device.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/dma-mapping.h>
 
 #include <uapi/linux/tdx-guest.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/tdx.h>
 
+/**
+ * struct quote_entry - Quote request struct
+ * @valid: Flag to check validity of the GetQuote request.
+ * @buf: Kernel buffer to share data with VMM (size is page aligned).
+ * @buf_len: Size of the buf in bytes.
+ * @dma_address: DMA address of the buffer.
+ * @dev: Reference to device used in DMA allocation.
+ * @compl: Completion object to track completion of GetQuote request.
+ * @list: List object for reference in quote_list.
+ */
+struct quote_entry {
+	bool valid;
+	void *buf;
+	size_t buf_len;
+	dma_addr_t dma_address;
+	struct device *dev;
+	struct completion compl;
+	struct list_head list;
+};
+
+/*
+ * To support parallel GetQuote requests, use the list
+ * to track active GetQuote requests.
+ */
+static LIST_HEAD(quote_list);
+
+/* Lock to protect quote_list */
+static DEFINE_MUTEX(quote_lock);
+
+/* Quote generation work */
+static void quote_callback_handler(struct work_struct *work);
+
+static DECLARE_WORK(quote_work, quote_callback_handler);
+
+static struct platform_device *tdx_dev;
+
+static struct quote_entry *alloc_quote_entry(struct device *dev, size_t buf_len)
+{
+	struct quote_entry *entry = NULL;
+	size_t new_len = PAGE_ALIGN(buf_len);
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return NULL;
+
+	entry->buf = dma_alloc_coherent(dev, new_len, &entry->dma_address,
+					GFP_KERNEL);
+	if (!entry->buf) {
+		kfree(entry);
+		return NULL;
+	}
+
+	entry->buf_len = new_len;
+	init_completion(&entry->compl);
+	entry->valid = true;
+	entry->dev = dev;
+
+	return entry;
+}
+
+static void free_quote_entry(struct quote_entry *entry)
+{
+	dma_free_coherent(entry->dev, entry->buf_len, entry->buf,
+			  entry->dma_address);
+	kfree(entry);
+}
+
+/* Must be called with quote_lock held */
+static void _del_quote_entry(struct quote_entry *entry)
+{
+	list_del(&entry->list);
+	free_quote_entry(entry);
+}
+
+static void del_quote_entry(struct quote_entry *entry)
+{
+	mutex_lock(&quote_lock);
+	_del_quote_entry(entry);
+	mutex_unlock(&quote_lock);
+}
+
+/* Handles early termination of GetQuote requests */
+static void terminate_quote_request(struct quote_entry *entry)
+{
+	struct tdx_quote_hdr *quote_hdr;
+
+	/*
+	 * For early termination, if the request is not yet
+	 * processed by VMM (GET_QUOTE_IN_FLIGHT), the VMM
+	 * still owns the shared buffer, so mark the request
+	 * invalid to let quote_callback_handler() handle the
+	 * memory cleanup function. If the request is already
+	 * processed, then do the cleanup and return.
+	 */
+	mutex_lock(&quote_lock);
+	quote_hdr = entry->buf;
+	if (quote_hdr->status == GET_QUOTE_IN_FLIGHT) {
+		entry->valid = false;
+		mutex_unlock(&quote_lock);
+		return;
+	}
+	_del_quote_entry(entry);
+	mutex_unlock(&quote_lock);
+}
+
+static int attestation_callback_handler(void *dev_id)
+{
+	schedule_work(&quote_work);
+	return 0;
+}
+
+static void quote_callback_handler(struct work_struct *work)
+{
+	struct tdx_quote_hdr *quote_hdr;
+	struct quote_entry *entry, *next;
+
+	/* Find processed quote request and mark it complete */
+	mutex_lock(&quote_lock);
+	list_for_each_entry_safe(entry, next, &quote_list, list) {
+		quote_hdr = entry->buf;
+		if (quote_hdr->status == GET_QUOTE_IN_FLIGHT)
+			continue;
+		/*
+		 * If user invalidated the current request, remove the
+		 * entry from the quote list and free it. If the request
+		 * is still valid, mark it complete.
+		 */
+		if (entry->valid)
+			complete(&entry->compl);
+		else
+			_del_quote_entry(entry);
+	}
+	mutex_unlock(&quote_lock);
+}
+
 static long tdx_get_report0(struct tdx_report_req __user *req)
 {
 	u8 *reportdata, *tdreport;
@@ -53,12 +192,78 @@ static long tdx_get_report0(struct tdx_report_req __user *req)
 	return ret;
 }
 
+static long tdx_get_quote(struct tdx_quote_req __user *ureq)
+{
+	struct device *dev = &tdx_dev->dev;
+	struct quote_entry *entry;
+	struct tdx_quote_req req;
+	long ret;
+
+	if (copy_from_user(&req, ureq, sizeof(req)))
+		return -EFAULT;
+
+	/*
+	 * TDX GHCI specification does not specify any upper limit
+	 * on Quote buffer size. So no upper limit check is added.
+	 * If a user passes a large value, it is expected that it
+	 * will fail during memory allocation in alloc_quote_entry().
+	 */
+	if (!req.len)
+		return -EINVAL;
+
+	entry = alloc_quote_entry(dev, req.len);
+	if (!entry)
+		return -ENOMEM;
+
+	if (copy_from_user(entry->buf, (void __user *)req.buf, req.len)) {
+		free_quote_entry(entry);
+		return -EFAULT;
+	}
+
+	mutex_lock(&quote_lock);
+
+	/* Submit GetQuote Request using GetQuote hypercall */
+	ret = tdx_hcall_get_quote(entry->buf, entry->buf_len);
+	if (ret) {
+		mutex_unlock(&quote_lock);
+		dev_err(dev, "GetQuote hypercall failed, status:%lx\n", ret);
+		free_quote_entry(entry);
+		return -EIO;
+	}
+
+	/* Add current quote entry to quote_list to track active requests */
+	list_add_tail(&entry->list, &quote_list);
+
+	mutex_unlock(&quote_lock);
+
+	/*
+	 * Wait till GetQuote completion or request is cancelled by the
+	 * user signal. If the request is cancelled by the user, don't
+	 * cleanup the quote buffer as it is still owned by the VMM.
+	 */
+	ret = wait_for_completion_interruptible(&entry->compl);
+	if (ret < 0) {
+		dev_err(dev, "GetQuote request terminated\n");
+		terminate_quote_request(entry);
+		return -EINTR;
+	}
+
+	if (copy_to_user((void __user *)req.buf, entry->buf, req.len))
+		ret = -EFAULT;
+
+	del_quote_entry(entry);
+
+	return ret;
+}
+
 static long tdx_guest_ioctl(struct file *file, unsigned int cmd,
 			    unsigned long arg)
 {
 	switch (cmd) {
 	case TDX_CMD_GET_REPORT0:
 		return tdx_get_report0((struct tdx_report_req __user *)arg);
+	case TDX_CMD_GET_QUOTE:
+		return tdx_get_quote((struct tdx_quote_req *)arg);
 	default:
 		return -ENOTTY;
 	}
@@ -76,6 +281,27 @@ static struct miscdevice tdx_misc_dev = {
 	.fops = &tdx_guest_fops,
 };
 
+static int tdx_guest_probe(struct platform_device *pdev)
+{
+	if (tdx_register_event_irq_cb(attestation_callback_handler, pdev))
+		return -EIO;
+
+	return misc_register(&tdx_misc_dev);
+}
+
+static int tdx_guest_remove(struct platform_device *pdev)
+{
+	tdx_unregister_event_irq_cb(attestation_callback_handler, pdev);
+	misc_deregister(&tdx_misc_dev);
+	return 0;
+}
+
+static struct platform_driver tdx_guest_driver = {
+	.probe = tdx_guest_probe,
+	.remove = tdx_guest_remove,
+	.driver.name = KBUILD_MODNAME,
+};
+
 static const struct x86_cpu_id tdx_guest_ids[] = {
 	X86_MATCH_FEATURE(X86_FEATURE_TDX_GUEST, NULL),
 	{}
@@ -84,16 +310,35 @@ MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
 
 static int __init tdx_guest_init(void)
 {
+	int ret;
+
 	if (!x86_match_cpu(tdx_guest_ids))
 		return -ENODEV;
 
-	return misc_register(&tdx_misc_dev);
+	ret = platform_driver_register(&tdx_guest_driver);
+	if (ret) {
+		pr_err("failed to register driver, err=%d\n", ret);
+		return ret;
+	}
+
+	tdx_dev = platform_device_register_simple(KBUILD_MODNAME,
+						  PLATFORM_DEVID_NONE,
+						  NULL, 0);
+	if (IS_ERR(tdx_dev)) {
+		ret = PTR_ERR(tdx_dev);
+		pr_err("failed to allocate device, err=%d\n", ret);
+		platform_driver_unregister(&tdx_guest_driver);
+		return ret;
+	}
+
+	return 0;
 }
 module_init(tdx_guest_init);
 
 static void __exit tdx_guest_exit(void)
 {
-	misc_deregister(&tdx_misc_dev);
+	platform_device_unregister(tdx_dev);
+	platform_driver_unregister(&tdx_guest_driver);
 }
 module_exit(tdx_guest_exit);
 
diff --git a/include/uapi/linux/tdx-guest.h b/include/uapi/linux/tdx-guest.h
index a6a2098c08ff..94c78d08f2aa 100644
--- a/include/uapi/linux/tdx-guest.h
+++ b/include/uapi/linux/tdx-guest.h
@@ -17,6 +17,12 @@
 /* Length of TDREPORT used in TDG.MR.REPORT TDCALL */
 #define TDX_REPORT_LEN                  1024
 
+/* TD Quote status codes */
+#define GET_QUOTE_SUCCESS               0
+#define GET_QUOTE_IN_FLIGHT             0xffffffffffffffff
+#define GET_QUOTE_ERROR                 0x8000000000000000
+#define GET_QUOTE_SERVICE_UNAVAILABLE   0x8000000000000001
+
 /**
  * struct tdx_report_req - Request struct for TDX_CMD_GET_REPORT0 IOCTL.
  *
@@ -30,6 +36,35 @@ struct tdx_report_req {
 	__u8 tdreport[TDX_REPORT_LEN];
 };
 
+/* struct tdx_quote_hdr: Format of Quote request buffer header.
+ * @version: Quote format version, filled by TD.
+ * @status: Status code of Quote request, filled by VMM.
+ * @in_len: Length of TDREPORT, filled by TD.
+ * @out_len: Length of Quote data, filled by VMM.
+ * @data: Quote data on output or TDREPORT on input.
+ *
+ * More details of Quote data header can be found in TDX
+ * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
+ * section titled "TDG.VP.VMCALL<GetQuote>"
+ */
+struct tdx_quote_hdr {
+	__u64 version;
+	__u64 status;
+	__u32 in_len;
+	__u32 out_len;
+	__u64 data[];
+};
+
+/* struct tdx_quote_req: Request struct for TDX_CMD_GET_QUOTE IOCTL.
+ * @buf: Address of user buffer that includes TDREPORT. Upon successful
+ *	 completion of IOCTL, output is copied back to the same buffer.
+ * @len: Length of the Quote buffer.
+ */
+struct tdx_quote_req {
+	__u64 buf;
+	__u64 len;
+};
+
 /*
  * TDX_CMD_GET_REPORT0 - Get TDREPORT0 (a.k.a. TDREPORT subtype 0) using
  *                       TDCALL[TDG.MR.REPORT]
@@ -39,4 +74,13 @@ struct tdx_report_req {
  */
 #define TDX_CMD_GET_REPORT0              _IOWR('T', 1, struct tdx_report_req)
 
+/*
+ * TDX_CMD_GET_QUOTE - Get TD Guest Quote from QE/QGS using GetQuote
+ *		       TDVMCALL.
+ *
+ * Returns 0 on success, -EINTR for interrupted request, and
+ * standard errno on other failures.
+ */
+#define TDX_CMD_GET_QUOTE		_IOR('T', 2, struct tdx_quote_req)
+
 #endif /* _UAPI_LINUX_TDX_GUEST_H_ */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v1 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-03-26  6:20 [PATCH v1 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
  2023-03-26  6:20 ` [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
  2023-03-26  6:20 ` [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support Kuppuswamy Sathyanarayanan
@ 2023-03-26  6:20 ` Kuppuswamy Sathyanarayanan
  2023-03-28 20:24   ` Shuah Khan
  2023-03-27 17:36 ` [PATCH v1 0/3] TDX Guest Quote generation support Erdem Aktas
  3 siblings, 1 reply; 16+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2023-03-26  6:20 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kuppuswamy Sathyanarayanan, Kirill A . Shutemov,
	Tony Luck, Wander Lairson Costa, Erdem Aktas, Guorui Yu, Du Fan,
	linux-kernel, linux-kselftest, linux-doc

In TDX guest, the second stage of the attestation process is Quote
generation. This process is required to convert the locally generated
TDREPORT into a remotely verifiable Quote. It involves sending the
TDREPORT data to a Quoting Enclave (QE) which will verify the
integerity of the TDREPORT and sign it with an attestation key.

Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
allow user agent get the TD Quote.

Add a kernel selftest module to verify the Quote generation feature.

TD Quote generation involves following steps:

* Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
* Embed the TDREPORT data in quote buffer and request for quote
  generation via TDX_CMD_GET_QUOTE IOCTL request.
* Upon completion of the GetQuote request, check for non zero value
  in the status field of Quote header to make sure the generated
  quote is valid.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 tools/testing/selftests/tdx/tdx_guest_test.c | 68 ++++++++++++++++++--
 1 file changed, 62 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/tdx/tdx_guest_test.c b/tools/testing/selftests/tdx/tdx_guest_test.c
index 81d8cb88ea1a..2eccde54185b 100644
--- a/tools/testing/selftests/tdx/tdx_guest_test.c
+++ b/tools/testing/selftests/tdx/tdx_guest_test.c
@@ -18,6 +18,7 @@
 #define TDX_GUEST_DEVNAME "/dev/tdx_guest"
 #define HEX_DUMP_SIZE 8
 #define DEBUG 0
+#define QUOTE_SIZE 8192
 
 /**
  * struct tdreport_type - Type header of TDREPORT_STRUCT.
@@ -128,21 +129,29 @@ static void print_array_hex(const char *title, const char *prefix_str,
 	printf("\n");
 }
 
+/* Helper function to get TDREPORT */
+long get_tdreport0(int devfd, struct tdx_report_req *req)
+{
+	int i;
+
+	/* Generate sample report data */
+	for (i = 0; i < TDX_REPORTDATA_LEN; i++)
+		req->reportdata[i] = i;
+
+	return ioctl(devfd, TDX_CMD_GET_REPORT0, req);
+}
+
 TEST(verify_report)
 {
 	struct tdx_report_req req;
 	struct tdreport *tdreport;
-	int devfd, i;
+	int devfd;
 
 	devfd = open(TDX_GUEST_DEVNAME, O_RDWR | O_SYNC);
 	ASSERT_LT(0, devfd);
 
-	/* Generate sample report data */
-	for (i = 0; i < TDX_REPORTDATA_LEN; i++)
-		req.reportdata[i] = i;
-
 	/* Get TDREPORT */
-	ASSERT_EQ(0, ioctl(devfd, TDX_CMD_GET_REPORT0, &req));
+	ASSERT_EQ(0, get_tdreport0(devfd, &req));
 
 	if (DEBUG) {
 		print_array_hex("\n\t\tTDX report data\n", "",
@@ -160,4 +169,51 @@ TEST(verify_report)
 	ASSERT_EQ(0, close(devfd));
 }
 
+TEST(verify_quote)
+{
+	struct tdx_quote_hdr *quote_hdr;
+	struct tdx_report_req rep_req;
+	struct tdx_quote_req req;
+	__u64 quote_buf_size;
+	__u8 *quote_buf;
+	int devfd;
+
+	/* Open attestation device */
+	devfd = open(TDX_GUEST_DEVNAME, O_RDWR | O_SYNC);
+
+	ASSERT_LT(0, devfd);
+
+	/* Add size for quote header */
+	quote_buf_size = sizeof(*quote_hdr) + QUOTE_SIZE;
+
+	/* Allocate quote buffer */
+	quote_buf = malloc(quote_buf_size);
+	ASSERT_NE(NULL, quote_buf);
+
+	quote_hdr = (struct tdx_quote_hdr *)quote_buf;
+
+	/* Initialize GetQuote header */
+	quote_hdr->version = 1;
+	quote_hdr->status  = GET_QUOTE_SUCCESS;
+	quote_hdr->in_len  = TDX_REPORT_LEN;
+	quote_hdr->out_len = 0;
+
+	/* Get TDREPORT data */
+	ASSERT_EQ(0, get_tdreport0(devfd, &rep_req));
+
+	/* Fill GetQuote request */
+	memcpy(quote_hdr->data, rep_req.tdreport, TDX_REPORT_LEN);
+	req.buf	  = (__u64)quote_buf;
+	req.len	  = quote_buf_size;
+
+	ASSERT_EQ(0, ioctl(devfd, TDX_CMD_GET_QUOTE, &req));
+
+	/* Check whether GetQuote request is successful */
+	EXPECT_EQ(0, quote_hdr->status);
+
+	free(quote_buf);
+
+	ASSERT_EQ(0, close(devfd));
+}
+
 TEST_HARNESS_MAIN
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support
  2023-03-26  6:20 ` [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support Kuppuswamy Sathyanarayanan
@ 2023-03-26  6:34   ` Greg KH
  2023-03-26 19:06     ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 16+ messages in thread
From: Greg KH @ 2023-03-26  6:34 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa,
	Erdem Aktas, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc

On Sat, Mar 25, 2023 at 11:20:38PM -0700, Kuppuswamy Sathyanarayanan wrote:
> Since GetQuote support requires usage of DMA APIs, convert TDX guest
> driver to a platform driver.

Sorry, but that's not a valid reason to use a platform device for fake
things like this:

> +static struct platform_device *tdx_dev;

Especially a single static one.

> +static int tdx_guest_probe(struct platform_device *pdev)
> +{
> +	if (tdx_register_event_irq_cb(attestation_callback_handler, pdev))
> +		return -EIO;
> +
> +	return misc_register(&tdx_misc_dev);
> +}
> +
> +static int tdx_guest_remove(struct platform_device *pdev)
> +{
> +	tdx_unregister_event_irq_cb(attestation_callback_handler, pdev);
> +	misc_deregister(&tdx_misc_dev);
> +	return 0;
> +}
> +
> +static struct platform_driver tdx_guest_driver = {
> +	.probe = tdx_guest_probe,
> +	.remove = tdx_guest_remove,
> +	.driver.name = KBUILD_MODNAME,
> +};
> +
>  static const struct x86_cpu_id tdx_guest_ids[] = {
>  	X86_MATCH_FEATURE(X86_FEATURE_TDX_GUEST, NULL),
>  	{}
> @@ -84,16 +310,35 @@ MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
>  
>  static int __init tdx_guest_init(void)
>  {
> +	int ret;
> +
>  	if (!x86_match_cpu(tdx_guest_ids))
>  		return -ENODEV;
>  
> -	return misc_register(&tdx_misc_dev);
> +	ret = platform_driver_register(&tdx_guest_driver);
> +	if (ret) {
> +		pr_err("failed to register driver, err=%d\n", ret);
> +		return ret;
> +	}

No, please do not create a fake platform driver.

> +	tdx_dev = platform_device_register_simple(KBUILD_MODNAME,
> +						  PLATFORM_DEVID_NONE,
> +						  NULL, 0);

And please do not create a fake platform device.

As always, do not create fake platform devices for things that are NOT
platform devices.

If this device needs DMA (but why?) then make it a real device and tie
it to the bus it belongs to (that it is obviously doing DMA on.)

But as-is, this isn't ok, sorry.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support
  2023-03-26  6:34   ` Greg KH
@ 2023-03-26 19:06     ` Sathyanarayanan Kuppuswamy
  2023-03-27  6:30       ` Greg KH
  0 siblings, 1 reply; 16+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-03-26 19:06 UTC (permalink / raw)
  To: Greg KH
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa,
	Erdem Aktas, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc

Hi Greg,

Thanks for the review comments.

On 3/25/23 11:34 PM, Greg KH wrote:
> On Sat, Mar 25, 2023 at 11:20:38PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> Since GetQuote support requires usage of DMA APIs, convert TDX guest
>> driver to a platform driver.
> 
> Sorry, but that's not a valid reason to use a platform device for fake
> things like this:
> 
>> +static struct platform_device *tdx_dev;
> 
> Especially a single static one.
> 
>> +static int tdx_guest_probe(struct platform_device *pdev)
>> +{
>> +	if (tdx_register_event_irq_cb(attestation_callback_handler, pdev))
>> +		return -EIO;
>> +
>> +	return misc_register(&tdx_misc_dev);
>> +}
>> +
>> +static int tdx_guest_remove(struct platform_device *pdev)
>> +{
>> +	tdx_unregister_event_irq_cb(attestation_callback_handler, pdev);
>> +	misc_deregister(&tdx_misc_dev);
>> +	return 0;
>> +}
>> +
>> +static struct platform_driver tdx_guest_driver = {
>> +	.probe = tdx_guest_probe,
>> +	.remove = tdx_guest_remove,
>> +	.driver.name = KBUILD_MODNAME,
>> +};
>> +
>>  static const struct x86_cpu_id tdx_guest_ids[] = {
>>  	X86_MATCH_FEATURE(X86_FEATURE_TDX_GUEST, NULL),
>>  	{}
>> @@ -84,16 +310,35 @@ MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
>>  
>>  static int __init tdx_guest_init(void)
>>  {
>> +	int ret;
>> +
>>  	if (!x86_match_cpu(tdx_guest_ids))
>>  		return -ENODEV;
>>  
>> -	return misc_register(&tdx_misc_dev);
>> +	ret = platform_driver_register(&tdx_guest_driver);
>> +	if (ret) {
>> +		pr_err("failed to register driver, err=%d\n", ret);
>> +		return ret;
>> +	}
> 
> No, please do not create a fake platform driver.
> 
>> +	tdx_dev = platform_device_register_simple(KBUILD_MODNAME,
>> +						  PLATFORM_DEVID_NONE,
>> +						  NULL, 0);
> 
> And please do not create a fake platform device.
> 
> As always, do not create fake platform devices for things that are NOT
> platform devices.
> 
> If this device needs DMA (but why?) then make it a real device and tie

Since the Quote generation process requires the guest and VMM to share data,
we need to use shared buffers. But in TDX guest, VMM cannot directly access
the guest private memory. Any memory that is required for communication with
the VMM must be shared explicitly. One way is to allocate the memory on
demand and share it explicitly using set_memory_decrypted() call. Although
this approach is clean, since it breaks the direct mapping, it is not
efficient. An alternative way is to reserve a pool of shared buffers during
boot time and re-use them for guest/VMM communication. Since in TDX, DMA API
is configured to tap into the SWIOTLB memory framework (which is shared by
default), we try to use dma_alloc_* APIs to allocate the shared buffers from
the shared pool without breaking the direct map.

If usage of DMA APIs / platform device is not acceptable for this use case,
an alternative approach is to allocate a fixed number of shared buffers during
the TDX guest driver probe and use it for GetQuote requests. Although it would
limit the amount of memory we can use for GetQuote requests at a time and also
reserve a chunk of memory during the init() time, I think it is an acceptable
tradeoff when compared to alternative choices. The AMD SEV guest driver also
adopts the similar approach. Please let me know if this approach is acceptable.

> it to the bus it belongs to (that it is obviously doing DMA on.)


> 
> thanks,
> 
> greg k-h

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support
  2023-03-26 19:06     ` Sathyanarayanan Kuppuswamy
@ 2023-03-27  6:30       ` Greg KH
  0 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2023-03-27  6:30 UTC (permalink / raw)
  To: Sathyanarayanan Kuppuswamy
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa,
	Erdem Aktas, Guorui Yu, Du Fan, linux-kernel, linux-kselftest,
	linux-doc

On Sun, Mar 26, 2023 at 12:06:26PM -0700, Sathyanarayanan Kuppuswamy wrote:
> If usage of DMA APIs / platform device is not acceptable for this use case,

It is not ok to use a platform device for this because you just do not
have a platform device for it, don't make one up out of thin air please,
as that really doesn't even have the correct bindings to the DMA memory
that you want here.

> an alternative approach is to allocate a fixed number of shared buffers during
> the TDX guest driver probe and use it for GetQuote requests. Although it would
> limit the amount of memory we can use for GetQuote requests at a time and also
> reserve a chunk of memory during the init() time, I think it is an acceptable
> tradeoff when compared to alternative choices. The AMD SEV guest driver also
> adopts the similar approach. Please let me know if this approach is acceptable.

This sounds like a better approach.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 0/3] TDX Guest Quote generation support
  2023-03-26  6:20 [PATCH v1 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
                   ` (2 preceding siblings ...)
  2023-03-26  6:20 ` [PATCH v1 3/3] selftests/tdx: Test GetQuote TDX attestation feature Kuppuswamy Sathyanarayanan
@ 2023-03-27 17:36 ` Erdem Aktas
  2023-03-28 19:59   ` Dionna Amalie Glaze
  3 siblings, 1 reply; 16+ messages in thread
From: Erdem Aktas @ 2023-03-27 17:36 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Shuah Khan, Jonathan Corbet, H . Peter Anvin,
	Kirill A . Shutemov, Tony Luck, Wander Lairson Costa, Guorui Yu,
	Du Fan, linux-kernel, linux-kselftest, linux-doc

On Sat, Mar 25, 2023 at 11:20 PM Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
>
> Hi All,
>
> In TDX guest, the attestation process is used to verify the TDX guest
> trustworthiness to other entities before provisioning secrets to the
> guest.
>
> The TDX guest attestation process consists of two steps:
>
> 1. TDREPORT generation
> 2. Quote generation.
>
> The First step (TDREPORT generation) involves getting the TDX guest
> measurement data in the format of TDREPORT which is further used to
> validate the authenticity of the TDX guest. The second step involves
> sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> remotely verifiable Quote. TDREPORT by design can only be verified on
> the local platform. To support remote verification of the TDREPORT,
> TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> locally and convert it to a remotely verifiable Quote. Although
> attestation software can use communication methods like TCP/IP or
> vsock to send the TDREPORT to QE, not all platforms support these
> communication models. So TDX GHCI specification [1] defines a method
> for Quote generation via hypercalls. Please check the discussion from
> Google [2] and Alibaba [3] which clarifies the need for hypercall based
Thanks Sathyanarayanan for submitting patches again.

I just wanted to reiterate what I said before that having a clean
TDVMCALL based interface to get TDX Quote without any virtio/vsock
dependency  is critical for us to support many use cases.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-03-26  6:20 ` [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
@ 2023-03-28  2:38   ` Huang, Kai
  2023-03-28  2:50     ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 16+ messages in thread
From: Huang, Kai @ 2023-03-28  2:38 UTC (permalink / raw)
  To: corbet, dave.hansen, bp, shuah, sathyanarayanan.kuppuswamy, tglx,
	x86, mingo
  Cc: Yu, Guorui, linux-kselftest, wander, hpa, linux-kernel, Aktas,
	Erdem, kirill.shutemov, Luck, Tony, Du, Fan, linux-doc

On Sat, 2023-03-25 at 23:20 -0700, Kuppuswamy Sathyanarayanan wrote:
> Host-guest event notification via configured interrupt vector is useful
> in cases where a guest makes an asynchronous request and needs a
> callback from the host to indicate the completion or to let the host
> notify the guest about events like device removal. One usage example is,
> callback requirement of GetQuote asynchronous hypercall.
> 
> In TDX guest, SetupEventNotifyInterrupt hypercall can be used by the
> guest to specify which interrupt vector to use as an event-notify
> vector to the VMM. 
> 

"to the VMM" -> "from the VMM"?

> Details about the SetupEventNotifyInterrupt
> hypercall can be found in TDX Guest-Host Communication Interface
> (GHCI) Specification, sec 3.5 "VP.VMCALL<SetupEventNotifyInterrupt>".

It seems we shouldn't mention the exact section number.

> 
> As per design, VMM will post the event completion IRQ using the same
> CPU in which SetupEventNotifyInterrupt hypercall request is received.

"in which" -> "on which"

> So allocate an IRQ vector from "x86_vector_domain", and set the CPU
> affinity of the IRQ vector to the current CPU.

IMHO "current CPU" is a little bit vague.  Perhaps just "to the CPU on which
SetupEventNotifyInterrupt hypercall is made".

Also, perhaps it's better to mention to use IRQF_NOBALANCING to prevent the IRQ
from being migrated to another cpu.

> 
> Add tdx_register_event_irq_cb()/tdx_unregister_event_irq_cb()
> interfaces to allow drivers register/unregister event noficiation
> handlers.
> 
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Acked-by: Wander Lairson Costa <wander@redhat.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> ---
>  arch/x86/coco/tdx/tdx.c    | 163 +++++++++++++++++++++++++++++++++++++
>  arch/x86/include/asm/tdx.h |   6 ++
>  2 files changed, 169 insertions(+)
> 
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 055300e08fb3..d03985952d45 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -7,12 +7,18 @@
>  #include <linux/cpufeature.h>
>  #include <linux/export.h>
>  #include <linux/io.h>
> +#include <linux/string.h>
> +#include <linux/uaccess.h>

Do you need above two headers?

Also, perhaps you should explicitly include <.../list.h> and <.../spinlock.h>.


> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/numa.h>
>  #include <asm/coco.h>
>  #include <asm/tdx.h>
>  #include <asm/vmx.h>
>  #include <asm/insn.h>
>  #include <asm/insn-eval.h>
>  #include <asm/pgtable.h>
> +#include <asm/irqdomain.h>
>  
>  /* TDX module Call Leaf IDs */
>  #define TDX_GET_INFO			1
> @@ -27,6 +33,7 @@
>  /* TDX hypercall Leaf IDs */
>  #define TDVMCALL_MAP_GPA		0x10001
>  #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
> +#define TDVMCALL_SETUP_NOTIFY_INTR	0x10004
>  
>  /* MMIO direction */
>  #define EPT_READ	0
> @@ -51,6 +58,16 @@
>  
>  #define TDREPORT_SUBTYPE_0	0
>  
> +struct event_irq_entry {
> +	tdx_event_irq_cb_t handler;
> +	void *data;
> +	struct list_head head;
> +};
> +
> +static int tdx_event_irq;


__ro_after_init?

> +static LIST_HEAD(event_irq_cb_list);
> +static DEFINE_SPINLOCK(event_irq_cb_lock);
> +
>  /*
>   * Wrapper for standard use of __tdx_hypercall with no output aside from
>   * return code.
> @@ -873,3 +890,149 @@ void __init tdx_early_init(void)
>  
>  	pr_info("Guest detected\n");
>  }
> +
> +static irqreturn_t tdx_event_irq_handler(int irq, void *dev_id)
> +{
> +	struct event_irq_entry *entry;
> +
> +	spin_lock(&event_irq_cb_lock);
> +	list_for_each_entry(entry, &event_irq_cb_list, head) {
> +		if (entry->handler)
> +			entry->handler(entry->data);
> +	}
> +	spin_unlock(&event_irq_cb_lock);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +/* Reserve an IRQ from x86_vector_domain for TD event notification */
> +static int __init tdx_event_irq_init(void)
> +{
> +	struct irq_alloc_info info;
> +	cpumask_t saved_cpumask;
> +	struct irq_cfg *cfg;
> +	int cpu, irq;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> +		return 0;
> +
> +	init_irq_alloc_info(&info, NULL);
> +
> +	/*
> +	 * Event notification vector will be delivered to the CPU
> +	 * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
> +	 * So set the IRQ affinity to the current CPU.
> +	 */
> +	cpu = get_cpu();
> +	cpumask_copy(&saved_cpumask, current->cpus_ptr);
> +	info.mask = cpumask_of(cpu);
> +	put_cpu();

The 'saved_cpumask' related code is ugly.  If you move put_cpu() to the end of
this function, I think you can remove all related code:

	cpu = get_cpu();

	/*
	 * Set @info->mask to local cpu to make sure a valid vector is
	 * pre-allocated when TDX event notification IRQ is allocated
	 * from x86_vector_domain.
	 */
	init_irq_alloc_info(&info, cpumask_of(cpu));

	// rest staff: request_irq(), hypercall ...

	put_cpu();
	
> +
> +	irq = irq_domain_alloc_irqs(x86_vector_domain, 1, NUMA_NO_NODE, &info);

Should you use cpu_to_node(cpu) instead of NUMA_NO_NODE?

> +	if (irq <= 0) {
> +		pr_err("Event notification IRQ allocation failed %d\n", irq);
> +		return -EIO;
> +	}
> +
> +	irq_set_handler(irq, handle_edge_irq);
> +
> +	cfg = irq_cfg(irq);
> +	if (!cfg) {
> +		pr_err("Event notification IRQ config not found\n");
> +		goto err_free_irqs;
> +	}

You are depending on irq_domain_alloc_irqs() to have already allocated a vector
on the local cpu.  Then if !cfg, your code of calling irq_domain_alloc_irqs() to
allocate vector is broken.

So, perhaps you should just WARN() if vector hasn't been allocated to catch bug.

	WARN(!(irq_cfg(irq)->vector));


> +
> +	if (request_irq(irq, tdx_event_irq_handler, IRQF_NOBALANCING,

It's better to add a comment to explain why using IRQF_NOBALANCING.

> +			"tdx_event_irq", NULL)) {
> +		pr_err("Event notification IRQ request failed\n");
> +		goto err_free_irqs;
> +	}
> +
> +	set_cpus_allowed_ptr(current, cpumask_of(cpu));
> +
> +	/*
> +	 * Register callback vector address with VMM. More details
> +	 * about the ABI can be found in TDX Guest-Host-Communication
> +	 * Interface (GHCI), sec titled
> +	 * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
> +	 */
> +	if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, cfg->vector, 0, 0, 0)) {
> +		pr_err("Event notification hypercall failed\n");
> +		goto err_restore_cpus;
> +	}
> +
> +	set_cpus_allowed_ptr(current, &saved_cpumask);
> +
> +	tdx_event_irq = irq;
> +
> +	return 0;
> +
> +err_restore_cpus:
> +	set_cpus_allowed_ptr(current, &saved_cpumask);
> +	free_irq(irq, NULL);
> +err_free_irqs:
> +	irq_domain_free_irqs(irq, 1);
> +
> +	return -EIO;
> +}
> +arch_initcall(tdx_event_irq_init)
> +


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-03-28  2:38   ` Huang, Kai
@ 2023-03-28  2:50     ` Sathyanarayanan Kuppuswamy
  2023-03-28  4:02       ` Huang, Kai
  0 siblings, 1 reply; 16+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-03-28  2:50 UTC (permalink / raw)
  To: Huang, Kai, corbet, dave.hansen, bp, shuah, tglx, x86, mingo
  Cc: Yu, Guorui, linux-kselftest, wander, hpa, linux-kernel, Aktas,
	Erdem, kirill.shutemov, Luck, Tony, Du, Fan, linux-doc

Hi Kai,

On 3/27/23 7:38 PM, Huang, Kai wrote:
>> +/* Reserve an IRQ from x86_vector_domain for TD event notification */
>> +static int __init tdx_event_irq_init(void)
>> +{
>> +	struct irq_alloc_info info;
>> +	cpumask_t saved_cpumask;
>> +	struct irq_cfg *cfg;
>> +	int cpu, irq;
>> +
>> +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>> +		return 0;
>> +
>> +	init_irq_alloc_info(&info, NULL);
>> +
>> +	/*
>> +	 * Event notification vector will be delivered to the CPU
>> +	 * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
>> +	 * So set the IRQ affinity to the current CPU.
>> +	 */
>> +	cpu = get_cpu();
>> +	cpumask_copy(&saved_cpumask, current->cpus_ptr);
>> +	info.mask = cpumask_of(cpu);
>> +	put_cpu();
> The 'saved_cpumask' related code is ugly.  If you move put_cpu() to the end of
> this function, I think you can remove all related code:
> 
> 	cpu = get_cpu();
> 
> 	/*
> 	 * Set @info->mask to local cpu to make sure a valid vector is
> 	 * pre-allocated when TDX event notification IRQ is allocated
> 	 * from x86_vector_domain.
> 	 */
> 	init_irq_alloc_info(&info, cpumask_of(cpu));
> 
> 	// rest staff: request_irq(), hypercall ...
> 
> 	put_cpu();
> 	

init_irq_alloc_info() is a sleeping function. Since get_cpu() disables
preemption, we cannot call sleeping function after it. Initially, I
have implemented it like you have mentioned. However, I discovered the
following error.

[    2.400755] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
[    2.404664] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
[    2.408671] preempt_count: 1, expected: 0
[    2.412650] RCU nest depth: 0, expected: 0
[    2.412666] no locks held by swapper/0/1.
[    2.416650] Preemption disabled at:
[    2.416650] [<ffffffff83b8089f>] tdx_arch_init+0x38/0x117
[    2.420670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc4-00117-g672ca073d9f9-dirty #2527
[    2.424650] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[    2.424650] Call Trace:
[    2.424650]  <TASK>
[    2.424650]  dump_stack_lvl+0x6a/0x86
[    2.424650]  __might_resched.cold+0xf4/0x12f
[    2.424650]  __mutex_lock+0x50/0x810
[    2.424650]  ? lock_is_held_type+0xd8/0x130
[    2.424650]  ? __irq_alloc_descs+0xcf/0x310
[    2.424650]  ? find_held_lock+0x2b/0x80
[    2.424650]  ? __irq_alloc_descs+0xcf/0x310
[    2.424650]  __irq_alloc_descs+0xcf/0x310
[    2.424650]  irq_domain_alloc_descs.part.0+0x49/0xa0
[    2.424650]  __irq_domain_alloc_irqs+0x2a0/0x4f0
[    2.424650]  ? next_arg+0x129/0x1f0
[    2.424650]  ? tdx_guest_init+0x5b/0x5b
[    2.424650]  tdx_arch_init+0x8e/0x117
[    2.424650]  do_one_initcall+0x137/0x2ec
[    2.424650]  ? rcu_read_lock_sched_held+0x36/0x60
[    2.424650]  kernel_init_freeable+0x1e3/0x241
[    2.424650]  ? rest_init+0x1a0/0x1a0
[    2.424650]  kernel_init+0x17/0x170
[    2.424650]  ? rest_init+0x1a0/0x1a0
[    2.424650]  ret_from_fork+0x1f/0x30
[    2.424650]  </TASK>
-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-03-28  2:50     ` Sathyanarayanan Kuppuswamy
@ 2023-03-28  4:02       ` Huang, Kai
  2023-04-05  1:02         ` Sathyanarayanan Kuppuswamy
  0 siblings, 1 reply; 16+ messages in thread
From: Huang, Kai @ 2023-03-28  4:02 UTC (permalink / raw)
  To: corbet, bp, dave.hansen, shuah, sathyanarayanan.kuppuswamy, tglx,
	x86, mingo
  Cc: Yu, Guorui, linux-kselftest, wander, hpa, linux-kernel,
	kirill.shutemov, Luck, Tony, Du, Fan, Aktas, Erdem, linux-doc

On Mon, 2023-03-27 at 19:50 -0700, Sathyanarayanan Kuppuswamy wrote:
> Hi Kai,
> 
> On 3/27/23 7:38 PM, Huang, Kai wrote:
> > > +/* Reserve an IRQ from x86_vector_domain for TD event notification */
> > > +static int __init tdx_event_irq_init(void)
> > > +{
> > > +	struct irq_alloc_info info;
> > > +	cpumask_t saved_cpumask;
> > > +	struct irq_cfg *cfg;
> > > +	int cpu, irq;
> > > +
> > > +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> > > +		return 0;
> > > +
> > > +	init_irq_alloc_info(&info, NULL);
> > > +
> > > +	/*
> > > +	 * Event notification vector will be delivered to the CPU
> > > +	 * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
> > > +	 * So set the IRQ affinity to the current CPU.
> > > +	 */
> > > +	cpu = get_cpu();
> > > +	cpumask_copy(&saved_cpumask, current->cpus_ptr);
> > > +	info.mask = cpumask_of(cpu);
> > > +	put_cpu();
> > The 'saved_cpumask' related code is ugly.  If you move put_cpu() to the end of
> > this function, I think you can remove all related code:
> > 
> > 	cpu = get_cpu();
> > 
> > 	/*
> > 	 * Set @info->mask to local cpu to make sure a valid vector is
> > 	 * pre-allocated when TDX event notification IRQ is allocated
> > 	 * from x86_vector_domain.
> > 	 */
> > 	init_irq_alloc_info(&info, cpumask_of(cpu));
> > 
> > 	// rest staff: request_irq(), hypercall ...
> > 
> > 	put_cpu();
> > 	
> 
> init_irq_alloc_info() is a sleeping function. Since get_cpu() disables
> preemption, we cannot call sleeping function after it. Initially, I
> have implemented it like you have mentioned. However, I discovered the
> following error.

Oh sorry I forgot this.  So I think we should use migrate_disable() instead:

	migrate_disable();

	init_irq_alloc_info(&info, cpumask_of(smp_processor_id()));

	...

	migrate_enable();

Or, should we just use early_initcall() so that only BSP is running?  IMHO it's
OK to always allocate the vector from BSP.

Anyway, either way is fine to me.

> 
> [    2.400755] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
> [    2.404664] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
> [    2.408671] preempt_count: 1, expected: 0
> [    2.412650] RCU nest depth: 0, expected: 0
> [    2.412666] no locks held by swapper/0/1.
> [    2.416650] Preemption disabled at:
> [    2.416650] [<ffffffff83b8089f>] tdx_arch_init+0x38/0x117
> [    2.420670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc4-00117-g672ca073d9f9-dirty #2527
> [    2.424650] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> [    2.424650] Call Trace:
> [    2.424650]  <TASK>
> [    2.424650]  dump_stack_lvl+0x6a/0x86
> [    2.424650]  __might_resched.cold+0xf4/0x12f
> [    2.424650]  __mutex_lock+0x50/0x810
> [    2.424650]  ? lock_is_held_type+0xd8/0x130
> [    2.424650]  ? __irq_alloc_descs+0xcf/0x310
> [    2.424650]  ? find_held_lock+0x2b/0x80
> [    2.424650]  ? __irq_alloc_descs+0xcf/0x310
> [    2.424650]  __irq_alloc_descs+0xcf/0x310
> [    2.424650]  irq_domain_alloc_descs.part.0+0x49/0xa0
> [    2.424650]  __irq_domain_alloc_irqs+0x2a0/0x4f0
> [    2.424650]  ? next_arg+0x129/0x1f0
> [    2.424650]  ? tdx_guest_init+0x5b/0x5b
> [    2.424650]  tdx_arch_init+0x8e/0x117
> [    2.424650]  do_one_initcall+0x137/0x2ec
> [    2.424650]  ? rcu_read_lock_sched_held+0x36/0x60
> [    2.424650]  kernel_init_freeable+0x1e3/0x241
> [    2.424650]  ? rest_init+0x1a0/0x1a0
> [    2.424650]  kernel_init+0x17/0x170
> [    2.424650]  ? rest_init+0x1a0/0x1a0
> [    2.424650]  ret_from_fork+0x1f/0x30
> [    2.424650]  </TASK>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 0/3] TDX Guest Quote generation support
  2023-03-27 17:36 ` [PATCH v1 0/3] TDX Guest Quote generation support Erdem Aktas
@ 2023-03-28 19:59   ` Dionna Amalie Glaze
  2023-03-28 20:43     ` Chong Cai
  0 siblings, 1 reply; 16+ messages in thread
From: Dionna Amalie Glaze @ 2023-03-28 19:59 UTC (permalink / raw)
  To: Erdem Aktas, Chong Cai
  Cc: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet,
	H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

+Chong Cai

Adding a colleague per his request since he's not subscribed to the list yet.

On Mon, Mar 27, 2023 at 10:36 AM Erdem Aktas <erdemaktas@google.com> wrote:
>
> On Sat, Mar 25, 2023 at 11:20 PM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> >
> > Hi All,
> >
> > In TDX guest, the attestation process is used to verify the TDX guest
> > trustworthiness to other entities before provisioning secrets to the
> > guest.
> >
> > The TDX guest attestation process consists of two steps:
> >
> > 1. TDREPORT generation
> > 2. Quote generation.
> >
> > The First step (TDREPORT generation) involves getting the TDX guest
> > measurement data in the format of TDREPORT which is further used to
> > validate the authenticity of the TDX guest. The second step involves
> > sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> > remotely verifiable Quote. TDREPORT by design can only be verified on
> > the local platform. To support remote verification of the TDREPORT,
> > TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> > locally and convert it to a remotely verifiable Quote. Although
> > attestation software can use communication methods like TCP/IP or
> > vsock to send the TDREPORT to QE, not all platforms support these
> > communication models. So TDX GHCI specification [1] defines a method
> > for Quote generation via hypercalls. Please check the discussion from
> > Google [2] and Alibaba [3] which clarifies the need for hypercall based
> Thanks Sathyanarayanan for submitting patches again.
>
> I just wanted to reiterate what I said before that having a clean
> TDVMCALL based interface to get TDX Quote without any virtio/vsock
> dependency  is critical for us to support many use cases.



-- 
-Dionna Glaze, PhD (she/her)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 3/3] selftests/tdx: Test GetQuote TDX attestation feature
  2023-03-26  6:20 ` [PATCH v1 3/3] selftests/tdx: Test GetQuote TDX attestation feature Kuppuswamy Sathyanarayanan
@ 2023-03-28 20:24   ` Shuah Khan
  0 siblings, 0 replies; 16+ messages in thread
From: Shuah Khan @ 2023-03-28 20:24 UTC (permalink / raw)
  To: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Shuah Khan, Jonathan Corbet
  Cc: H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Erdem Aktas, Guorui Yu, Du Fan,
	linux-kernel, linux-kselftest, linux-doc, Shuah Khan

On 3/26/23 00:20, Kuppuswamy Sathyanarayanan wrote:
> In TDX guest, the second stage of the attestation process is Quote
> generation. This process is required to convert the locally generated
> TDREPORT into a remotely verifiable Quote. It involves sending the
> TDREPORT data to a Quoting Enclave (QE) which will verify the
> integerity of the TDREPORT and sign it with an attestation key.
> 
> Intel's TDX attestation driver exposes TDX_CMD_GET_QUOTE IOCTL to
> allow user agent get the TD Quote.
> 
> Add a kernel selftest module to verify the Quote generation feature.
> 
> TD Quote generation involves following steps:
> 
> * Get the TDREPORT data using TDX_CMD_GET_REPORT IOCTL.
> * Embed the TDREPORT data in quote buffer and request for quote
>    generation via TDX_CMD_GET_QUOTE IOCTL request.
> * Upon completion of the GetQuote request, check for non zero value
>    in the status field of Quote header to make sure the generated
>    quote is valid.
> 
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> ---
>   tools/testing/selftests/tdx/tdx_guest_test.c | 68 ++++++++++++++++++--
>   1 file changed, 62 insertions(+), 6 deletions(-)
> 

Looks good to me.

Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 0/3] TDX Guest Quote generation support
  2023-03-28 19:59   ` Dionna Amalie Glaze
@ 2023-03-28 20:43     ` Chong Cai
  0 siblings, 0 replies; 16+ messages in thread
From: Chong Cai @ 2023-03-28 20:43 UTC (permalink / raw)
  To: Dionna Amalie Glaze
  Cc: Erdem Aktas, Kuppuswamy Sathyanarayanan, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, Shuah Khan,
	Jonathan Corbet, H . Peter Anvin, Kirill A . Shutemov, Tony Luck,
	Wander Lairson Costa, Guorui Yu, Du Fan, linux-kernel,
	linux-kselftest, linux-doc

On Tue, Mar 28, 2023 at 12:59 PM Dionna Amalie Glaze
<dionnaglaze@google.com> wrote:
>
> +Chong Cai
>
> Adding a colleague per his request since he's not subscribed to the list yet.
>
> On Mon, Mar 27, 2023 at 10:36 AM Erdem Aktas <erdemaktas@google.com> wrote:
> >
> > On Sat, Mar 25, 2023 at 11:20 PM Kuppuswamy Sathyanarayanan
> > <sathyanarayanan.kuppuswamy@linux.intel.com> wrote:
> > >
> > > Hi All,
> > >
> > > In TDX guest, the attestation process is used to verify the TDX guest
> > > trustworthiness to other entities before provisioning secrets to the
> > > guest.
> > >
> > > The TDX guest attestation process consists of two steps:
> > >
> > > 1. TDREPORT generation
> > > 2. Quote generation.
> > >
> > > The First step (TDREPORT generation) involves getting the TDX guest
> > > measurement data in the format of TDREPORT which is further used to
> > > validate the authenticity of the TDX guest. The second step involves
> > > sending the TDREPORT to a Quoting Enclave (QE) server to generate a
> > > remotely verifiable Quote. TDREPORT by design can only be verified on
> > > the local platform. To support remote verification of the TDREPORT,
> > > TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT
> > > locally and convert it to a remotely verifiable Quote. Although
> > > attestation software can use communication methods like TCP/IP or
> > > vsock to send the TDREPORT to QE, not all platforms support these
> > > communication models. So TDX GHCI specification [1] defines a method
> > > for Quote generation via hypercalls. Please check the discussion from
> > > Google [2] and Alibaba [3] which clarifies the need for hypercall based
> > Thanks Sathyanarayanan for submitting patches again.
> >
> > I just wanted to reiterate what I said before that having a clean
> > TDVMCALL based interface to get TDX Quote without any virtio/vsock
> > dependency  is critical for us to support many use cases.
>
> +1 to Erdem's point. A simple TDVMCALL interface could make it much
> easier for user cases that can not depend on virtio and vsock.
> Without the TDVMCALL, it will largely limit those user cases to adopt TDX.
> Thanks Sathyanarayanan for submitting this patch.
> --
> -Dionna Glaze, PhD (she/her)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-03-28  4:02       ` Huang, Kai
@ 2023-04-05  1:02         ` Sathyanarayanan Kuppuswamy
  2023-04-05  2:45           ` Huang, Kai
  0 siblings, 1 reply; 16+ messages in thread
From: Sathyanarayanan Kuppuswamy @ 2023-04-05  1:02 UTC (permalink / raw)
  To: Huang, Kai, corbet, bp, dave.hansen, shuah, tglx, x86, mingo
  Cc: Yu, Guorui, linux-kselftest, wander, hpa, linux-kernel,
	kirill.shutemov, Luck, Tony, Du, Fan, Aktas, Erdem, linux-doc



On 3/27/23 9:02 PM, Huang, Kai wrote:
> On Mon, 2023-03-27 at 19:50 -0700, Sathyanarayanan Kuppuswamy wrote:
>> Hi Kai,
>>
>> On 3/27/23 7:38 PM, Huang, Kai wrote:
>>>> +/* Reserve an IRQ from x86_vector_domain for TD event notification */
>>>> +static int __init tdx_event_irq_init(void)
>>>> +{
>>>> +	struct irq_alloc_info info;
>>>> +	cpumask_t saved_cpumask;
>>>> +	struct irq_cfg *cfg;
>>>> +	int cpu, irq;
>>>> +
>>>> +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>>>> +		return 0;
>>>> +
>>>> +	init_irq_alloc_info(&info, NULL);
>>>> +
>>>> +	/*
>>>> +	 * Event notification vector will be delivered to the CPU
>>>> +	 * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
>>>> +	 * So set the IRQ affinity to the current CPU.
>>>> +	 */
>>>> +	cpu = get_cpu();
>>>> +	cpumask_copy(&saved_cpumask, current->cpus_ptr);
>>>> +	info.mask = cpumask_of(cpu);
>>>> +	put_cpu();
>>> The 'saved_cpumask' related code is ugly.  If you move put_cpu() to the end of
>>> this function, I think you can remove all related code:
>>>
>>> 	cpu = get_cpu();
>>>
>>> 	/*
>>> 	 * Set @info->mask to local cpu to make sure a valid vector is
>>> 	 * pre-allocated when TDX event notification IRQ is allocated
>>> 	 * from x86_vector_domain.
>>> 	 */
>>> 	init_irq_alloc_info(&info, cpumask_of(cpu));
>>>
>>> 	// rest staff: request_irq(), hypercall ...
>>>
>>> 	put_cpu();
>>> 	
>>
>> init_irq_alloc_info() is a sleeping function. Since get_cpu() disables
>> preemption, we cannot call sleeping function after it. Initially, I
>> have implemented it like you have mentioned. However, I discovered the
>> following error.
> 
> Oh sorry I forgot this.  So I think we should use migrate_disable() instead:
> 
> 	migrate_disable();
> 
> 	init_irq_alloc_info(&info, cpumask_of(smp_processor_id()));
> 
> 	...
> 
> 	migrate_enable();
> 
> Or, should we just use early_initcall() so that only BSP is running?  IMHO it's
> OK to always allocate the vector from BSP.
> 
> Anyway, either way is fine to me.

Final version looks like below. 

static int __init tdx_event_irq_init(void)
{
        struct irq_alloc_info info;
        struct irq_cfg *cfg;
        int irq;

        if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
                return 0;

        init_irq_alloc_info(&info, NULL);

        /*
         * Event notification vector will be delivered to the CPU
         * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
         * So set the IRQ affinity to the current CPU.
         */
        info.mask = cpumask_of(0);

        irq = irq_domain_alloc_irqs(x86_vector_domain, 1, cpu_to_node(0), &info);
        if (irq <= 0) {
                pr_err("Event notification IRQ allocation failed %d\n", irq);
                return -EIO;
        }

        irq_set_handler(irq, handle_edge_irq);

        /* Since the IRQ affinity is set, it cannot be balanced */
        if (request_irq(irq, tdx_event_irq_handler, IRQF_NOBALANCING,
                        "tdx_event_irq", NULL)) {
                pr_err("Event notification IRQ request failed\n");
                goto err_free_domain_irqs;
        }

        cfg = irq_cfg(irq);

        /*
         * Since tdx_event_irq_init() is triggered via early_initcall(),
         * it will called before secondary CPUs bringup. Since there is
         * only one CPU, it complies with the requirement of executing
         * the TDVMCALL_SETUP_NOTIFY_INTR hypercall on the same CPU where
         * the IRQ vector is allocated.
         *
         * Register callback vector address with VMM. More details
         * about the ABI can be found in TDX Guest-Host-Communication
         * Interface (GHCI), sec titled
         * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
         */
        if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, cfg->vector, 0, 0, 0)) {
                pr_err("Event notification hypercall failed\n");
                goto err_free_irqs;
        }

        tdx_event_irq = irq;

        return 0;

err_free_irqs:
        free_irq(irq, NULL);
err_free_domain_irqs:
        irq_domain_free_irqs(irq, 1);

        return -EIO;
}
early_initcall(tdx_event_irq_init)


> 
>>
>> [    2.400755] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
>> [    2.404664] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
>> [    2.408671] preempt_count: 1, expected: 0
>> [    2.412650] RCU nest depth: 0, expected: 0
>> [    2.412666] no locks held by swapper/0/1.
>> [    2.416650] Preemption disabled at:
>> [    2.416650] [<ffffffff83b8089f>] tdx_arch_init+0x38/0x117
>> [    2.420670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc4-00117-g672ca073d9f9-dirty #2527
>> [    2.424650] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>> [    2.424650] Call Trace:
>> [    2.424650]  <TASK>
>> [    2.424650]  dump_stack_lvl+0x6a/0x86
>> [    2.424650]  __might_resched.cold+0xf4/0x12f
>> [    2.424650]  __mutex_lock+0x50/0x810
>> [    2.424650]  ? lock_is_held_type+0xd8/0x130
>> [    2.424650]  ? __irq_alloc_descs+0xcf/0x310
>> [    2.424650]  ? find_held_lock+0x2b/0x80
>> [    2.424650]  ? __irq_alloc_descs+0xcf/0x310
>> [    2.424650]  __irq_alloc_descs+0xcf/0x310
>> [    2.424650]  irq_domain_alloc_descs.part.0+0x49/0xa0
>> [    2.424650]  __irq_domain_alloc_irqs+0x2a0/0x4f0
>> [    2.424650]  ? next_arg+0x129/0x1f0
>> [    2.424650]  ? tdx_guest_init+0x5b/0x5b
>> [    2.424650]  tdx_arch_init+0x8e/0x117
>> [    2.424650]  do_one_initcall+0x137/0x2ec
>> [    2.424650]  ? rcu_read_lock_sched_held+0x36/0x60
>> [    2.424650]  kernel_init_freeable+0x1e3/0x241
>> [    2.424650]  ? rest_init+0x1a0/0x1a0
>> [    2.424650]  kernel_init+0x17/0x170
>> [    2.424650]  ? rest_init+0x1a0/0x1a0
>> [    2.424650]  ret_from_fork+0x1f/0x30
>> [    2.424650]  </TASK>
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support
  2023-04-05  1:02         ` Sathyanarayanan Kuppuswamy
@ 2023-04-05  2:45           ` Huang, Kai
  0 siblings, 0 replies; 16+ messages in thread
From: Huang, Kai @ 2023-04-05  2:45 UTC (permalink / raw)
  To: corbet, dave.hansen, bp, shuah, sathyanarayanan.kuppuswamy, tglx,
	x86, mingo
  Cc: Yu, Guorui, linux-kselftest, wander, hpa, linux-kernel,
	kirill.shutemov, Luck, Tony, Du, Fan, Aktas, Erdem, linux-doc

On Tue, 2023-04-04 at 18:02 -0700, Sathyanarayanan Kuppuswamy wrote:
> 
> On 3/27/23 9:02 PM, Huang, Kai wrote:
> > On Mon, 2023-03-27 at 19:50 -0700, Sathyanarayanan Kuppuswamy wrote:
> > > Hi Kai,
> > > 
> > > On 3/27/23 7:38 PM, Huang, Kai wrote:
> > > > > +/* Reserve an IRQ from x86_vector_domain for TD event notification */
> > > > > +static int __init tdx_event_irq_init(void)
> > > > > +{
> > > > > +	struct irq_alloc_info info;
> > > > > +	cpumask_t saved_cpumask;
> > > > > +	struct irq_cfg *cfg;
> > > > > +	int cpu, irq;
> > > > > +
> > > > > +	if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
> > > > > +		return 0;
> > > > > +
> > > > > +	init_irq_alloc_info(&info, NULL);
> > > > > +
> > > > > +	/*
> > > > > +	 * Event notification vector will be delivered to the CPU
> > > > > +	 * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
> > > > > +	 * So set the IRQ affinity to the current CPU.
> > > > > +	 */
> > > > > +	cpu = get_cpu();
> > > > > +	cpumask_copy(&saved_cpumask, current->cpus_ptr);
> > > > > +	info.mask = cpumask_of(cpu);
> > > > > +	put_cpu();
> > > > The 'saved_cpumask' related code is ugly.  If you move put_cpu() to the end of
> > > > this function, I think you can remove all related code:
> > > > 
> > > > 	cpu = get_cpu();
> > > > 
> > > > 	/*
> > > > 	 * Set @info->mask to local cpu to make sure a valid vector is
> > > > 	 * pre-allocated when TDX event notification IRQ is allocated
> > > > 	 * from x86_vector_domain.
> > > > 	 */
> > > > 	init_irq_alloc_info(&info, cpumask_of(cpu));
> > > > 
> > > > 	// rest staff: request_irq(), hypercall ...
> > > > 
> > > > 	put_cpu();
> > > > 	
> > > 
> > > init_irq_alloc_info() is a sleeping function. Since get_cpu() disables
> > > preemption, we cannot call sleeping function after it. Initially, I
> > > have implemented it like you have mentioned. However, I discovered the
> > > following error.
> > 
> > Oh sorry I forgot this.  So I think we should use migrate_disable() instead:
> > 
> > 	migrate_disable();
> > 
> > 	init_irq_alloc_info(&info, cpumask_of(smp_processor_id()));
> > 
> > 	...
> > 
> > 	migrate_enable();
> > 
> > Or, should we just use early_initcall() so that only BSP is running?  IMHO it's
> > OK to always allocate the vector from BSP.
> > 
> > Anyway, either way is fine to me.
> 
> Final version looks like below. 
> 
> static int __init tdx_event_irq_init(void)
> {
>         struct irq_alloc_info info;
>         struct irq_cfg *cfg;
>         int irq;
> 
>         if (!cpu_feature_enabled(X86_FEATURE_TDX_GUEST))
>                 return 0;
> 
>         init_irq_alloc_info(&info, NULL);
> 
>         /*
>          * Event notification vector will be delivered to the CPU
>          * in which TDVMCALL_SETUP_NOTIFY_INTR hypercall is requested.
>          * So set the IRQ affinity to the current CPU.
>          */
>         info.mask = cpumask_of(0);
> 
>         irq = irq_domain_alloc_irqs(x86_vector_domain, 1, cpu_to_node(0), &info);
>         if (irq <= 0) {
>                 pr_err("Event notification IRQ allocation failed %d\n", irq);
>                 return -EIO;
>         }
> 
>         irq_set_handler(irq, handle_edge_irq);
> 
>         /* Since the IRQ affinity is set, it cannot be balanced */
>         if (request_irq(irq, tdx_event_irq_handler, IRQF_NOBALANCING,
>                         "tdx_event_irq", NULL)) {
>                 pr_err("Event notification IRQ request failed\n");
>                 goto err_free_domain_irqs;
>         }
> 
>         cfg = irq_cfg(irq);
> 
>         /*
>          * Since tdx_event_irq_init() is triggered via early_initcall(),
>          * it will called before secondary CPUs bringup. Since there is
>          * only one CPU, it complies with the requirement of executing
>          * the TDVMCALL_SETUP_NOTIFY_INTR hypercall on the same CPU where
>          * the IRQ vector is allocated.
>          *
>          * Register callback vector address with VMM. More details
>          * about the ABI can be found in TDX Guest-Host-Communication
>          * Interface (GHCI), sec titled
>          * "TDG.VP.VMCALL<SetupEventNotifyInterrupt>".
>          */
>         if (_tdx_hypercall(TDVMCALL_SETUP_NOTIFY_INTR, cfg->vector, 0, 0, 0)) {
>                 pr_err("Event notification hypercall failed\n");
>                 goto err_free_irqs;
>         }
> 
>         tdx_event_irq = irq;
> 
>         return 0;
> 
> err_free_irqs:
>         free_irq(irq, NULL);
> err_free_domain_irqs:
>         irq_domain_free_irqs(irq, 1);
> 
>         return -EIO;
> }
> early_initcall(tdx_event_irq_init)

I found there's another series also doing similar thing, and it seems Thomas
wasn't happy about using x86_vector_domain directly:

https://lore.kernel.org/lkml/877cv99k0y.ffs@tglx/

An alternative was also posted (creating IRQ domain on top of
x86_vector_domain):

https://lore.kernel.org/lkml/20230328182933.GA1403032@vm02.guest.corp.microsoft.com/

I think we should monitor that and hear from others more.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-04-05  2:45 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-26  6:20 [PATCH v1 0/3] TDX Guest Quote generation support Kuppuswamy Sathyanarayanan
2023-03-26  6:20 ` [PATCH v1 1/3] x86/tdx: Add TDX Guest event notify interrupt support Kuppuswamy Sathyanarayanan
2023-03-28  2:38   ` Huang, Kai
2023-03-28  2:50     ` Sathyanarayanan Kuppuswamy
2023-03-28  4:02       ` Huang, Kai
2023-04-05  1:02         ` Sathyanarayanan Kuppuswamy
2023-04-05  2:45           ` Huang, Kai
2023-03-26  6:20 ` [PATCH v1 2/3] virt: tdx-guest: Add Quote generation support Kuppuswamy Sathyanarayanan
2023-03-26  6:34   ` Greg KH
2023-03-26 19:06     ` Sathyanarayanan Kuppuswamy
2023-03-27  6:30       ` Greg KH
2023-03-26  6:20 ` [PATCH v1 3/3] selftests/tdx: Test GetQuote TDX attestation feature Kuppuswamy Sathyanarayanan
2023-03-28 20:24   ` Shuah Khan
2023-03-27 17:36 ` [PATCH v1 0/3] TDX Guest Quote generation support Erdem Aktas
2023-03-28 19:59   ` Dionna Amalie Glaze
2023-03-28 20:43     ` Chong Cai

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.