* [PATCH v2 00/58] TDX QEMU support
@ 2023-08-18  9:49 Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 01/58] *** HACK *** linux-headers: Update headers to pull in TDX API changes Xiaoyao Li
                   ` (57 more replies)
  0 siblings, 58 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

This is v2 of the series adding TDX support to QEMU.

This patch series enables TDX support so that a TD (TDX VM) can be created
and booted with QEMU. It must be used together with the corresponding KVM v15
patch series [1]. TDX-related documents can be found at [2].

This series is based on the QEMU gmem implementation posted at [3].
It is also available on GitHub:
https://github.com/intel/qemu-tdx/tree/tdx-qemu-upstream-v2

This version updates the QEMU side of TDX to match the latest KVM side
implementation, which exposes gmem for private memory. It is not intended
to be the final version, because how QEMU should support KVM gmem is not
finalized yet. Review comments are welcome nonetheless.


[1] KVM TDX basic feature support v15
https://lore.kernel.org/kvm/cover.1690322424.git.isaku.yamahata@intel.com/

[2] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html

[3] https://lore.kernel.org/all/20230731162201.271114-1-xiaoyao.li@intel.com/


== Limitations and future work ==
- Readonly memslot

  TDX only supports readonly (write-protected) memslots for shared memory,
  not for private memory. For simplicity, readonly memslots are marked as
  unsupported entirely for TDX.

- CPU model

  A TD cannot be created with an arbitrary CPU model the way a non-TDX VM
  can, because only a subset of features is configurable for a TD.

  - It's recommended to use '-cpu host' to create a TD;
  - '+feature/-feature' might not work as expected;

  Future work: introduce a specific CPU model for TDs and improve
               +/-feature handling for TDs.

- gdb support

  gdb support for debugging a TD in off-debug mode is future work.


== Change history ==
Changes from v1:
[v1] https://lore.kernel.org/qemu-devel/20220802074750.2581308-1-xiaoyao.li@intel.com/

- Switch to KVM gmem interface for private memory;
- Add TDVMCALL and its sub leaves support;
- Mark LMCE as unsupported for TD VM;
- Bring back the support of mrconfigid/mrowner/mrownerconfig;
- Update documentation;

Changes from RFC v4:
[RFC v4] https://lore.kernel.org/qemu-devel/20220512031803.3315890-1-xiaoyao.li@intel.com/

- Add 3 more patches (9, 10, 11) to improve tdx_get_supported_cpuid();
- make attributes of object tdx-guest not settable by user;
- improve get_tdx_capabilities() by using a known starting value and
  limiting the loop with a known size;
- clarify why isa.bios needs to be skipped;
- remove the MMIO hob setup since OVMF sets them up itself;

Changes from RFC v3:
[RFC v3] https://lore.kernel.org/qemu-devel/20220317135913.2166202-1-xiaoyao.li@intel.com/

- Load TDVF with -bios interface;
- Adapt to KVM API changes;
	- KVM_TDX_CAPABILITIES changes back to KVM-scope;
	- struct kvm_tdx_init_vm changes;
- Define TDX_SUPPORTED_KVM_FEATURES;
- Drop the patch of introducing property sept-ve-disable since it's not
  public yet;
- some misc cleanups

Changes from RFC v2:
[RFC v2] https://lore.kernel.org/qemu-devel/cover.1625704980.git.isaku.yamahata@intel.com/

- Get vm-type from confidential-guest-support object type;
- Drop machine_init_done_late_notifiers;
- Refactor tdx_ioctl implementation;
- re-use existing pflash interface to load TDVF (i.e., OVMF binaries);
- Introduce a new data structure to track memory type instead of changing
  the e820 table;
- Force smm to off for TDX VM;
- Drop the patches that suppress level-trigger/SMI/INIT/SIPI since KVM
  will ignore them;
- Add documentation;

Changes from RFC v1:
[RFC v1] https://lore.kernel.org/qemu-devel/cover.1613188118.git.isaku.yamahata@intel.com/

- suppress level trigger/SMI/INIT/SIPI related to IOAPIC.
- add VM attribute sha384 to TD measurement.
- guest TSC Hz specification



Chao Peng (1):
  i386/tdx: register TDVF as private memory

Chenyi Qiang (2):
  i386/tdx: register the fd read callback with the main loop to read the
    quote data
  i386/tdx: setup a timer for the qio channel

Isaku Yamahata (14):
  i386/tdx: Make sept_ve_disable set by default
  qom: implement property helper for sha384
  i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM
  i386/tdx: Create kvm gmem for TD
  kvm/tdx: Don't complain when converting vMMIO region to shared
  kvm/tdx: Ignore memory conversion to shared of unassigned region
  i386/tdvf: Introduce function to parse TDVF metadata
  i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION
  i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt>
  i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall
  i386/tdx: Limit the range size for MapGPA
  hw/i386: add option to forcibly report edge trigger in acpi tables
  i386/tdx: Don't synchronize guest tsc for TDs

Sean Christopherson (2):
  i386/kvm: Move architectural CPUID leaf generation to separate helper
  i386/tdx: Don't get/put guest state for TDX VMs

Xiaoyao Li (39):
  *** HACK *** linux-headers: Update headers to pull in TDX API changes
  i386: Introduce tdx-guest object
  target/i386: Parse TDX vm type
  target/i386: Introduce kvm_confidential_guest_init()
  i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  i386/tdx: Adjust the supported CPUID based on TDX restrictions
  i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by
    tdx_caps.cpuid_config[]
  i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup
  i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup
  kvm: Introduce kvm_arch_pre_create_vcpu()
  i386/tdx: Initialize TDX before creating TD vcpus
  i386/tdx: Add property sept-ve-disable for tdx-guest object
  i386/tdx: Wire CPU features up with attributes of TD guest
  i386/tdx: Validate TD attributes
  i386/tdx: Implement user specified tsc frequency
  i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  i386/tdx: Make memory type private by default
  i386/tdx: Parse TDVF metadata for TDX VM
  i386/tdx: Skip BIOS shadowing setup
  i386/tdx: Don't initialize pc.rom for TDX VMs
  i386/tdx: Track mem_ptr for each firmware entry of TDVF
  i386/tdx: Track RAM entries for TDX VM
  headers: Add definitions from UEFI spec for volumes, resources, etc...
  i386/tdx: Setup the TD HOB list
  memory: Introduce memory_region_init_ram_gmem()
  i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
  i386/tdx: Finalize TDX VM
  i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR>
  i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility
  i386/tdx: Disable SMM for TDX VMs
  i386/tdx: Disable PIC for TDX VMs
  i386/tdx: Don't allow system reset for TDX VMs
  i386/tdx: LMCE is not supported for TDX
  hw/i386: add eoi_intercept_unsupported member to X86MachineState
  i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  i386/tdx: Skip kvm_put_apicbase() for TDs
  docs: Add TDX documentation

 accel/kvm/kvm-all.c                        |   55 +-
 configs/devices/i386-softmmu/default.mak   |    1 +
 docs/system/confidential-guest-support.rst |    1 +
 docs/system/i386/tdx.rst                   |  114 ++
 docs/system/target-i386.rst                |    1 +
 hw/i386/Kconfig                            |    6 +
 hw/i386/acpi-build.c                       |   99 +-
 hw/i386/acpi-common.c                      |   50 +-
 hw/i386/meson.build                        |    1 +
 hw/i386/pc.c                               |   21 +-
 hw/i386/pc_sysfw.c                         |    7 +
 hw/i386/tdvf-hob.c                         |  147 ++
 hw/i386/tdvf-hob.h                         |   24 +
 hw/i386/tdvf.c                             |  200 +++
 hw/i386/x86.c                              |   38 +-
 include/exec/memory.h                      |    6 +
 include/hw/i386/tdvf.h                     |   58 +
 include/hw/i386/x86.h                      |    1 +
 include/qom/object.h                       |   17 +
 include/standard-headers/uefi/uefi.h       |  198 +++
 include/sysemu/kvm.h                       |    3 +
 linux-headers/asm-x86/kvm.h                |   90 ++
 linux-headers/linux/kvm.h                  |   87 ++
 qapi/qom.json                              |   26 +
 qapi/run-state.json                        |   17 +-
 qom/object.c                               |   76 +
 softmmu/memory.c                           |   52 +
 softmmu/runstate.c                         |   49 +
 target/i386/cpu-internal.h                 |    9 +
 target/i386/cpu.c                          |   12 -
 target/i386/cpu.h                          |   21 +
 target/i386/kvm/kvm-cpu.c                  |    5 +
 target/i386/kvm/kvm.c                      |  586 ++++----
 target/i386/kvm/kvm_i386.h                 |    5 +
 target/i386/kvm/meson.build                |    2 +
 target/i386/kvm/tdx-stub.c                 |   22 +
 target/i386/kvm/tdx.c                      | 1543 ++++++++++++++++++++
 target/i386/kvm/tdx.h                      |   73 +
 target/i386/sev.c                          |    1 -
 target/i386/sev.h                          |    2 +
 40 files changed, 3382 insertions(+), 344 deletions(-)
 create mode 100644 docs/system/i386/tdx.rst
 create mode 100644 hw/i386/tdvf-hob.c
 create mode 100644 hw/i386/tdvf-hob.h
 create mode 100644 hw/i386/tdvf.c
 create mode 100644 include/hw/i386/tdvf.h
 create mode 100644 include/standard-headers/uefi/uefi.h
 create mode 100644 target/i386/kvm/tdx-stub.c
 create mode 100644 target/i386/kvm/tdx.c
 create mode 100644 target/i386/kvm/tdx.h

-- 
2.34.1



* [PATCH v2 01/58] *** HACK *** linux-headers: Update headers to pull in TDX API changes
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 02/58] i386: Introduce tdx-guest object Xiaoyao Li
                   ` (56 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Pull in recent TDX updates, which are not backwards compatible.

This is only meant to make the series runnable. The headers will be
updated with the script

	scripts/update-linux-headers.sh

once TDX support is upstreamed in the Linux kernel.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
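Note (illustration only, not part of the patch): the reserved[] sizing of
struct kvm_tdx_init_vm in the diff below can be checked with a small
standalone sketch. The type names tdx_init_vm_mirror and cpuid2_header are
hypothetical local mirrors, not the real headers, and the sketch assumes
struct kvm_cpuid2 contributes only its 8-byte nent/padding header to sizeof
(its flexible entries[] array adds nothing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the fixed header of struct kvm_cpuid2.
 * The flexible entries[] array is left out because it does not
 * contribute to sizeof.
 */
struct cpuid2_header {
    uint32_t nent;
    uint32_t padding;
};

/* Hypothetical mirror of struct kvm_tdx_init_vm as laid out in this patch. */
struct tdx_init_vm_mirror {
    uint64_t attributes;
    uint64_t mrconfigid[6];     /* sha384 digest: 6 * 8 = 48 bytes */
    uint64_t mrowner[6];        /* sha384 digest */
    uint64_t mrownerconfig[6];  /* sha384 digest */
    uint64_t reserved[1004];
    struct cpuid2_header cpuid;
};

/* Bytes used by the fields placed in front of reserved[]. */
static size_t tdx_init_vm_fixed_bytes(void)
{
    return offsetof(struct tdx_init_vm_mirror, reserved);
}

/* Total size of the mirrored structure. */
static size_t tdx_init_vm_total_bytes(void)
{
    return sizeof(struct tdx_init_vm_mirror);
}

/* Compile-time check of the 8KB target mentioned in the header comment. */
static_assert(sizeof(struct tdx_init_vm_mirror) == 8192,
              "mirror of kvm_tdx_init_vm should be 8KB");
```

Under these assumptions the fixed fields take 152 bytes, reserved[] takes
8032 bytes and the kvm_cpuid2 header takes 8 bytes, 8192 bytes in total.
---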
 linux-headers/asm-x86/kvm.h | 90 +++++++++++++++++++++++++++++++++++++
 linux-headers/linux/kvm.h   | 87 +++++++++++++++++++++++++++++++++++
 2 files changed, 177 insertions(+)

diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 003fb745347c..4c3deb0e2a75 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -562,5 +562,95 @@ struct kvm_pmu_event_filter {
 
 #define KVM_X86_DEFAULT_VM	0
 #define KVM_X86_SW_PROTECTED_VM	1
+#define KVM_X86_TDX_VM		2
+#define KVM_X86_SNP_VM		3
+
+/* Trust Domain eXtension sub-ioctl() commands. */
+enum kvm_tdx_cmd_id {
+	KVM_TDX_CAPABILITIES = 0,
+	KVM_TDX_INIT_VM,
+	KVM_TDX_INIT_VCPU,
+	KVM_TDX_INIT_MEM_REGION,
+	KVM_TDX_FINALIZE_VM,
+
+	KVM_TDX_CMD_NR_MAX,
+};
+
+struct kvm_tdx_cmd {
+	/* enum kvm_tdx_cmd_id */
+	__u32 id;
+	/* flags for sub-command. If sub-command doesn't use this, set zero. */
+	__u32 flags;
+	/*
+	 * data for each sub-command. An immediate or a pointer to the actual
+	 * data in process virtual address.  If sub-command doesn't use it,
+	 * set zero.
+	 */
+	__u64 data;
+	/*
+	 * Auxiliary error code.  The sub-command may return TDX SEAMCALL
+	 * status code in addition to -Exxx.
+	 * Defined for consistency with struct kvm_sev_cmd.
+	 */
+	__u64 error;
+};
+
+struct kvm_tdx_cpuid_config {
+	__u32 leaf;
+	__u32 sub_leaf;
+	__u32 eax;
+	__u32 ebx;
+	__u32 ecx;
+	__u32 edx;
+};
+
+struct kvm_tdx_capabilities {
+	__u64 attrs_fixed0;
+	__u64 attrs_fixed1;
+	__u64 xfam_fixed0;
+	__u64 xfam_fixed1;
+#define TDX_CAP_GPAW_48	(1 << 0)
+#define TDX_CAP_GPAW_52	(1 << 1)
+	__u32 supported_gpaw;
+	__u32 padding;
+	__u64 reserved[251];
+
+	__u32 nr_cpuid_configs;
+	struct kvm_tdx_cpuid_config cpuid_configs[];
+};
+
+struct kvm_tdx_init_vm {
+	__u64 attributes;
+	__u64 mrconfigid[6];	/* sha384 digest */
+	__u64 mrowner[6];	/* sha384 digest */
+	__u64 mrownerconfig[6];	/* sha384 digest */
+	/*
+	 * For future extensibility to make sizeof(struct kvm_tdx_init_vm) = 8KB.
+	 * This should be enough given sizeof(TD_PARAMS) = 1024.
+	 * 8KB was chosen because
+	 * sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES(=256) = 8KB.
+	 */
+	__u64 reserved[1004];
+
+	/*
+	 * Call KVM_TDX_INIT_VM before vcpu creation, thus before
+	 * KVM_SET_CPUID2.
+	 * This configuration supersedes KVM_SET_CPUID2s for VCPUs because the
+	 * TDX module directly virtualizes those CPUIDs without VMM.  The user
+	 * space VMM, e.g. qemu, should make KVM_SET_CPUID2 consistent with
+	 * those values.  If it doesn't, KVM may have wrong idea of vCPUIDs of
+	 * the guest, and KVM may wrongly emulate CPUIDs or MSRs that the TDX
+	 * module doesn't virtualize.
+	 */
+	struct kvm_cpuid2 cpuid;
+};
+
+#define KVM_TDX_MEASURE_MEMORY_REGION	(1UL << 0)
+
+struct kvm_tdx_init_mem_region {
+	__u64 source_addr;
+	__u64 gpa;
+	__u64 nr_pages;
+};
 
 #endif /* _ASM_X86_KVM_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 278bed78f98e..280f1730fc27 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -237,6 +237,90 @@ struct kvm_xen_exit {
 	} u;
 };
 
+struct kvm_tdx_exit {
+#define KVM_EXIT_TDX_VMCALL	1
+	__u32 type;
+	__u32 pad;
+
+	union {
+		struct kvm_tdx_vmcall {
+			/*
+			 * RAX(bit 0), RCX(bit 1) and RSP(bit 4) are reserved.
+			 * RAX(bit 0): TDG.VP.VMCALL status code.
+			 * RCX(bit 1): bitmap for used registers.
+			 * RSP(bit 4): the caller stack.
+			 */
+#define TDX_VMCALL_REG_MASK_RBX	BIT_ULL(2)
+#define TDX_VMCALL_REG_MASK_RDX	BIT_ULL(3)
+#define TDX_VMCALL_REG_MASK_RSI	BIT_ULL(6)
+#define TDX_VMCALL_REG_MASK_RDI	BIT_ULL(7)
+#define TDX_VMCALL_REG_MASK_R8	BIT_ULL(8)
+#define TDX_VMCALL_REG_MASK_R9	BIT_ULL(9)
+#define TDX_VMCALL_REG_MASK_R10	BIT_ULL(10)
+#define TDX_VMCALL_REG_MASK_R11	BIT_ULL(11)
+#define TDX_VMCALL_REG_MASK_R12	BIT_ULL(12)
+#define TDX_VMCALL_REG_MASK_R13	BIT_ULL(13)
+#define TDX_VMCALL_REG_MASK_R14	BIT_ULL(14)
+#define TDX_VMCALL_REG_MASK_R15	BIT_ULL(15)
+			union {
+				__u64 in_rcx;
+				__u64 reg_mask;
+			};
+
+			/*
+			 * Guest-Host-Communication Interface for TDX spec
+			 * defines the ABI for TDG.VP.VMCALL.
+			 */
+			/* Input parameters: guest -> VMM */
+			union {
+				__u64 in_r10;
+				__u64 type;
+			};
+			union {
+				__u64 in_r11;
+				__u64 subfunction;
+			};
+			/*
+			 * Subfunction specific.
+			 * Registers are used in this order to pass input
+			 * arguments.  r12=arg0, r13=arg1, etc.
+			 */
+			__u64 in_r12;
+			__u64 in_r13;
+			__u64 in_r14;
+			__u64 in_r15;
+			__u64 in_rbx;
+			__u64 in_rdi;
+			__u64 in_rsi;
+			__u64 in_r8;
+			__u64 in_r9;
+			__u64 in_rdx;
+
+			/* Output parameters: VMM -> guest */
+			union {
+				__u64 out_r10;
+				__u64 status_code;
+			};
+			/*
+			 * Subfunction specific.
+			 * Registers are used in this order to output return
+			 * values.  r11=ret0, r12=ret1, etc.
+			 */
+			__u64 out_r11;
+			__u64 out_r12;
+			__u64 out_r13;
+			__u64 out_r14;
+			__u64 out_r15;
+			__u64 out_rbx;
+			__u64 out_rdi;
+			__u64 out_rsi;
+			__u64 out_r8;
+			__u64 out_r9;
+			__u64 out_rdx;
+		} vmcall;
+	} u;
+};
+
 #define KVM_S390_GET_SKEYS_NONE   1
 #define KVM_S390_SKEYS_MAX        1048576
 
@@ -279,6 +363,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_RISCV_CSR        36
 #define KVM_EXIT_NOTIFY           37
 #define KVM_EXIT_MEMORY_FAULT     38
+#define KVM_EXIT_TDX              39
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -528,6 +613,8 @@ struct kvm_run {
 			__u64 gpa;
 			__u64 size;
 		} memory;
+		/* KVM_EXIT_TDX_VMCALL */
+		struct kvm_tdx_exit tdx;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
-- 
2.34.1
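
Note (illustration only, not part of the patch): the reg_mask field of
struct kvm_tdx_vmcall above is a bitmap of the registers used by a
TDG.VP.VMCALL. A minimal sketch of how such a mask could be inspected
follows. The helper names and SKETCH_REG_MASK_RESERVED are hypothetical,
BIT_ULL is redefined locally, and whether a VMM should reject a mask with
reserved bits set is an assumption rather than something this patch states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT_ULL() helper. */
#define BIT_ULL(n) (1ULL << (n))

/* A few of the register bits defined by the patch. */
#define TDX_VMCALL_REG_MASK_RBX BIT_ULL(2)
#define TDX_VMCALL_REG_MASK_R10 BIT_ULL(10)
#define TDX_VMCALL_REG_MASK_R11 BIT_ULL(11)
#define TDX_VMCALL_REG_MASK_R12 BIT_ULL(12)

/*
 * RAX (bit 0), RCX (bit 1) and RSP (bit 4) are reserved according to the
 * comment in struct kvm_tdx_vmcall. The macro name is hypothetical.
 */
#define SKETCH_REG_MASK_RESERVED (BIT_ULL(0) | BIT_ULL(1) | BIT_ULL(4))

/* True if the guest-provided mask touches any reserved register bit. */
static bool sketch_mask_has_reserved(uint64_t reg_mask)
{
    return (reg_mask & SKETCH_REG_MASK_RESERVED) != 0;
}

/* True if the register identified by reg_bit is marked as used. */
static bool sketch_reg_is_passed(uint64_t reg_mask, uint64_t reg_bit)
{
    return (reg_mask & reg_bit) != 0;
}
```

For example, a call that passes its type in r10, its subfunction in r11 and
one argument in r12 would carry a mask of R10 | R11 | R12, which contains no
reserved bit.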



* [PATCH v2 02/58] i386: Introduce tdx-guest object
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 01/58] *** HACK *** linux-headers: Update headers to pull in TDX API changes Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-22  6:22   ` Markus Armbruster
  2023-08-18  9:49 ` [PATCH v2 03/58] target/i386: Parse TDX vm type Xiaoyao Li
                   ` (55 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Introduce tdx-guest object which implements the interface of
CONFIDENTIAL_GUEST_SUPPORT, and will be used to create TDX VMs (TDs) by

  qemu -machine ...,confidential-guest-support=tdx0	\
       -object tdx-guest,id=tdx0

So far it has a single property, 'attributes', which is fixed to 0 and is
not user-configurable.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
changes from RFC-V4
- make @attributes not user-settable
---
 configs/devices/i386-softmmu/default.mak |  1 +
 hw/i386/Kconfig                          |  5 +++
 qapi/qom.json                            | 12 +++++++
 target/i386/kvm/meson.build              |  2 ++
 target/i386/kvm/tdx.c                    | 40 ++++++++++++++++++++++++
 target/i386/kvm/tdx.h                    | 19 +++++++++++
 6 files changed, 79 insertions(+)
 create mode 100644 target/i386/kvm/tdx.c
 create mode 100644 target/i386/kvm/tdx.h

diff --git a/configs/devices/i386-softmmu/default.mak b/configs/devices/i386-softmmu/default.mak
index 598c6646dfc0..9b5ec59d65b0 100644
--- a/configs/devices/i386-softmmu/default.mak
+++ b/configs/devices/i386-softmmu/default.mak
@@ -18,6 +18,7 @@
 #CONFIG_QXL=n
 #CONFIG_SEV=n
 #CONFIG_SGA=n
+#CONFIG_TDX=n
 #CONFIG_TEST_DEVICES=n
 #CONFIG_TPM_CRB=n
 #CONFIG_TPM_TIS_ISA=n
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 9051083c1e78..929f6c3f0e85 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -10,6 +10,10 @@ config SGX
     bool
     depends on KVM
 
+config TDX
+    bool
+    depends on KVM
+
 config PC
     bool
     imply APPLESMC
@@ -26,6 +30,7 @@ config PC
     imply QXL
     imply SEV
     imply SGX
+    imply TDX
     imply TEST_DEVICES
     imply TPM_CRB
     imply TPM_TIS_ISA
diff --git a/qapi/qom.json b/qapi/qom.json
index e0b2044e3d20..2ca7ce7c0da5 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -866,6 +866,16 @@
             'reduced-phys-bits': 'uint32',
             '*kernel-hashes': 'bool' } }
 
+##
+# @TdxGuestProperties:
+#
+# Properties for tdx-guest objects.
+#
+# Since: 8.2
+##
+{ 'struct': 'TdxGuestProperties',
+  'data': { }}
+
 ##
 # @ThreadContextProperties:
 #
@@ -944,6 +954,7 @@
     'sev-guest',
     'thread-context',
     's390-pv-guest',
+    'tdx-guest',
     'throttle-group',
     'tls-creds-anon',
     'tls-creds-psk',
@@ -1010,6 +1021,7 @@
       'secret_keyring':             { 'type': 'SecretKeyringProperties',
                                       'if': 'CONFIG_SECRET_KEYRING' },
       'sev-guest':                  'SevGuestProperties',
+      'tdx-guest':                  'TdxGuestProperties',
       'thread-context':             'ThreadContextProperties',
       'throttle-group':             'ThrottleGroupProperties',
       'tls-creds-anon':             'TlsCredsAnonProperties',
diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
index 40fbde96cac6..21ab03fe1349 100644
--- a/target/i386/kvm/meson.build
+++ b/target/i386/kvm/meson.build
@@ -11,6 +11,8 @@ i386_softmmu_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files('xen-emu.c'))
 
 i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
 
+i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
+
 i386_system_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
 
 i386_system_ss.add_all(when: 'CONFIG_KVM', if_true: i386_softmmu_kvm_ss)
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
new file mode 100644
index 000000000000..d3792d4a3d56
--- /dev/null
+++ b/target/i386/kvm/tdx.c
@@ -0,0 +1,40 @@
+/*
+ * QEMU TDX support
+ *
+ * Copyright Intel
+ *
+ * Author:
+ *      Xiaoyao Li <xiaoyao.li@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qom/object_interfaces.h"
+
+#include "tdx.h"
+
+/* tdx guest */
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
+                                   tdx_guest,
+                                   TDX_GUEST,
+                                   CONFIDENTIAL_GUEST_SUPPORT,
+                                   { TYPE_USER_CREATABLE },
+                                   { NULL })
+
+static void tdx_guest_init(Object *obj)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+
+    tdx->attributes = 0;
+}
+
+static void tdx_guest_finalize(Object *obj)
+{
+}
+
+static void tdx_guest_class_init(ObjectClass *oc, void *data)
+{
+}
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
new file mode 100644
index 000000000000..415aeb5af746
--- /dev/null
+++ b/target/i386/kvm/tdx.h
@@ -0,0 +1,19 @@
+#ifndef QEMU_I386_TDX_H
+#define QEMU_I386_TDX_H
+
+#include "exec/confidential-guest-support.h"
+
+#define TYPE_TDX_GUEST "tdx-guest"
+#define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
+
+typedef struct TdxGuestClass {
+    ConfidentialGuestSupportClass parent_class;
+} TdxGuestClass;
+
+typedef struct TdxGuest {
+    ConfidentialGuestSupport parent_obj;
+
+    uint64_t attributes;    /* TD attributes */
+} TdxGuest;
+
+#endif /* QEMU_I386_TDX_H */
-- 
2.34.1



* [PATCH v2 03/58] target/i386: Parse TDX vm type
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 01/58] *** HACK *** linux-headers: Update headers to pull in TDX API changes Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 02/58] i386: Introduce tdx-guest object Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-21  8:27   ` Daniel P. Berrangé
  2023-08-18  9:49 ` [PATCH v2 04/58] target/i386: Introduce kvm_confidential_guest_init() Xiaoyao Li
                   ` (54 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

A TDX VM requires VM type KVM_X86_TDX_VM to be passed to
kvm_ioctl(KVM_CREATE_VM).

If a tdx-guest object is specified for confidential-guest-support, e.g.,

  qemu -machine ...,confidential-guest-support=tdx0 \
       -object tdx-guest,id=tdx0,...

the VM type is resolved to KVM_X86_TDX_VM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
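Note (illustration only, not part of the patch): the selection logic in the
diff below can be modelled without QEMU as follows. pick_vm_type() and its
cgs_is_tdx_guest flag are hypothetical stand-ins for kvm_get_vm_type() and
the object_dynamic_cast() check. POSIX strcasecmp() replaces
g_ascii_strcasecmp(), the "default" string for the first branch is an
assumption because that line is outside the hunk, and an unknown string
returns -1 here instead of calling exit(1).

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>    /* strcasecmp() (POSIX) */

/* VM type values as defined in linux-headers/asm-x86/kvm.h by this series. */
#define KVM_X86_DEFAULT_VM      0
#define KVM_X86_SW_PROTECTED_VM 1
#define KVM_X86_TDX_VM          2

/*
 * Hypothetical model of kvm_get_vm_type(): map an optional kvm-type string
 * to a VM type, then let a tdx-guest object override the result.
 * Returns -1 for an unknown string.
 */
static int pick_vm_type(const char *vm_type, bool cgs_is_tdx_guest)
{
    int kvm_type = KVM_X86_DEFAULT_VM;

    if (vm_type) {
        if (!strcasecmp(vm_type, "default")) {
            kvm_type = KVM_X86_DEFAULT_VM;
        } else if (!strcasecmp(vm_type, "sw-protected-vm")) {
            kvm_type = KVM_X86_SW_PROTECTED_VM;
        } else if (!strcasecmp(vm_type, "tdx")) {
            kvm_type = KVM_X86_TDX_VM;
        } else {
            return -1;
        }
    }

    /* A tdx-guest object always forces the TDX VM type. */
    if (cgs_is_tdx_guest) {
        kvm_type = KVM_X86_TDX_VM;
    }

    return kvm_type;
}
```

In this model a tdx-guest object takes precedence over any valid kvm-type
string, which mirrors the order of the checks in the patch.
---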
 target/i386/kvm/kvm.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 62f237068a3a..77f4772afe6c 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -32,6 +32,7 @@
 #include "sysemu/runstate.h"
 #include "kvm_i386.h"
 #include "sev.h"
+#include "tdx.h"
 #include "xen-emu.h"
 #include "hyperv.h"
 #include "hyperv-proto.h"
@@ -158,6 +159,7 @@ static int kvm_get_one_msr(X86CPU *cpu, int index, uint64_t *value);
 static const char* vm_type_name[] = {
     [KVM_X86_DEFAULT_VM] = "default",
     [KVM_X86_SW_PROTECTED_VM] = "sw-protected-vm",
+    [KVM_X86_TDX_VM] = "tdx",
 };
 
 int kvm_get_vm_type(MachineState *ms, const char *vm_type)
@@ -170,12 +172,18 @@ int kvm_get_vm_type(MachineState *ms, const char *vm_type)
             kvm_type = KVM_X86_DEFAULT_VM;
         } else if (!g_ascii_strcasecmp(vm_type, "sw-protected-vm")) {
             kvm_type = KVM_X86_SW_PROTECTED_VM;
-        } else {
+        } else if (!g_ascii_strcasecmp(vm_type, "tdx")) {
+            kvm_type = KVM_X86_TDX_VM;
+        } else {
             error_report("Unknown kvm-type specified '%s'", vm_type);
             exit(1);
         }
     }
 
+    if (ms->cgs && object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
+        kvm_type = KVM_X86_TDX_VM;
+    }
+
     /*
      * old KVM doesn't support KVM_CAP_VM_TYPES and KVM_X86_DEFAULT_VM
      * is always supported
-- 
2.34.1



* [PATCH v2 04/58] target/i386: Introduce kvm_confidential_guest_init()
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (2 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 03/58] target/i386: Parse TDX vm type Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-29 14:42   ` Philippe Mathieu-Daudé
  2023-08-18  9:49 ` [PATCH v2 05/58] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context Xiaoyao Li
                   ` (53 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Introduce a separate function kvm_confidential_guest_init() for SEV (and
future TDX).

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/kvm.c | 11 ++++++++++-
 target/i386/sev.c     |  1 -
 target/i386/sev.h     |  2 ++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 77f4772afe6c..051307437ecd 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2633,6 +2633,15 @@ static MemoryListener kvm_x86_sw_protected_vm_memory_listener = {
     .priority = MEMORY_LISTENER_PRIORITY_ACCEL_HIGH,
 };
 
+static int kvm_confidential_guest_init(MachineState *ms, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_SEV_GUEST)) {
+        return sev_kvm_init(ms->cgs, errp);
+    }
+
+    return 0;
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     X86MachineState *x86ms = X86_MACHINE(ms);
@@ -2654,7 +2663,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
      * mechanisms are supported in future (e.g. TDX), they'll need
      * their own initialization either here or elsewhere.
      */
-    ret = sev_kvm_init(ms->cgs, &local_err);
+    ret = kvm_confidential_guest_init(ms, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
         return ret;
diff --git a/target/i386/sev.c b/target/i386/sev.c
index fe2144c0388b..5aa04863846d 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -39,7 +39,6 @@
 #include "hw/i386/pc.h"
 #include "exec/address-spaces.h"
 
-#define TYPE_SEV_GUEST "sev-guest"
 OBJECT_DECLARE_SIMPLE_TYPE(SevGuestState, SEV_GUEST)
 
 
diff --git a/target/i386/sev.h b/target/i386/sev.h
index 7b1528248a54..64fbf186dbd2 100644
--- a/target/i386/sev.h
+++ b/target/i386/sev.h
@@ -20,6 +20,8 @@
 
 #include "exec/confidential-guest-support.h"
 
+#define TYPE_SEV_GUEST "sev-guest"
+
 #define SEV_POLICY_NODBG        0x1
 #define SEV_POLICY_NOKS         0x2
 #define SEV_POLICY_ES           0x4
-- 
2.34.1



* [PATCH v2 05/58] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (3 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 04/58] target/i386: Introduce kvm_confidential_guest_init() Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES Xiaoyao Li
                   ` (52 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Introduce tdx_kvm_init() and invoke it in kvm_confidential_guest_init()
if it's a TDX VM. More initialization will be added later.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
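Note (illustration only, not part of the patch): the dispatch added to
kvm_confidential_guest_init() and the role of tdx-stub.c can be modelled
without QEMU as follows. The enum, the sketch_* functions and the
SKETCH_CONFIG_TDX switch are all hypothetical. The real code uses
object_dynamic_cast() on ms->cgs and selects tdx.c or tdx-stub.c through
meson.

```c
#include <errno.h>

/* Hypothetical stand-in for the QOM type of ms->cgs. */
enum sketch_cgs_kind {
    SKETCH_CGS_NONE,    /* ms->cgs == NULL */
    SKETCH_CGS_SEV,     /* sev-guest object */
    SKETCH_CGS_TDX,     /* tdx-guest object */
    SKETCH_CGS_OTHER,   /* any other confidential guest mechanism */
};

/* Hypothetical build switch modelling CONFIG_TDX. */
#ifndef SKETCH_CONFIG_TDX
#define SKETCH_CONFIG_TDX 1
#endif

/* Stand-in for sev_kvm_init(). */
static int sketch_sev_init(void)
{
    return 0;
}

#if SKETCH_CONFIG_TDX
/* Models tdx.c: the real init, which does nothing yet in this patch. */
static int sketch_tdx_init(void)
{
    return 0;
}
#else
/* Models tdx-stub.c: TDX was requested but is not compiled in. */
static int sketch_tdx_init(void)
{
    return -EINVAL;
}
#endif

/* Models kvm_confidential_guest_init(): dispatch on the guest type. */
static int sketch_confidential_guest_init(enum sketch_cgs_kind kind)
{
    if (kind == SKETCH_CGS_SEV) {
        return sketch_sev_init();
    } else if (kind == SKETCH_CGS_TDX) {
        return sketch_tdx_init();
    }

    /* No confidential guest, or an unrelated mechanism: nothing to do. */
    return 0;
}
```

Building the sketch with -DSKETCH_CONFIG_TDX=0 shows the stub path, where a
TDX request fails with -EINVAL instead of being silently accepted.
---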
 target/i386/kvm/kvm.c       | 15 ++++++---------
 target/i386/kvm/meson.build |  2 +-
 target/i386/kvm/tdx-stub.c  |  8 ++++++++
 target/i386/kvm/tdx.c       |  7 +++++++
 target/i386/kvm/tdx.h       |  2 ++
 5 files changed, 24 insertions(+), 10 deletions(-)
 create mode 100644 target/i386/kvm/tdx-stub.c

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 051307437ecd..d6b988d6c2d1 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -63,6 +63,7 @@
 #include "migration/blocker.h"
 #include "exec/memattrs.h"
 #include "trace.h"
+#include "tdx.h"
 
 #include CONFIG_DEVICES
 
@@ -2637,6 +2638,8 @@ static int kvm_confidential_guest_init(MachineState *ms, Error **errp)
 {
     if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_SEV_GUEST)) {
         return sev_kvm_init(ms->cgs, errp);
+    } else if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
+        return tdx_kvm_init(ms, errp);
     }
 
     return 0;
@@ -2652,16 +2655,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     Error *local_err = NULL;
 
     /*
-     * Initialize SEV context, if required
+     * Initialize confidential guest (SEV/TDX) context, if required
      *
-     * If no memory encryption is requested (ms->cgs == NULL) this is
-     * a no-op.
-     *
-     * It's also a no-op if a non-SEV confidential guest support
-     * mechanism is selected.  SEV is the only mechanism available to
-     * select on x86 at present, so this doesn't arise, but if new
-     * mechanisms are supported in future (e.g. TDX), they'll need
-     * their own initialization either here or elsewhere.
+     * It's a no-op if no confidential guest support is requested
+     * (ms->cgs == NULL) or if the selected mechanism is neither SEV nor TDX.
      */
     ret = kvm_confidential_guest_init(ms, &local_err);
     if (ret < 0) {
diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
index 21ab03fe1349..876350a387aa 100644
--- a/target/i386/kvm/meson.build
+++ b/target/i386/kvm/meson.build
@@ -11,7 +11,7 @@ i386_softmmu_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files('xen-emu.c'))
 
 i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
 
-i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
+i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'), if_false: files('tdx-stub.c'))
 
 i386_system_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
 
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
new file mode 100644
index 000000000000..1d866d5496bf
--- /dev/null
+++ b/target/i386/kvm/tdx-stub.c
@@ -0,0 +1,8 @@
+#include "qemu/osdep.h"
+
+#include "tdx.h"
+
+int tdx_kvm_init(MachineState *ms, Error **errp)
+{
+    return -EINVAL;
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index d3792d4a3d56..77e33ae01147 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -12,10 +12,17 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 #include "qom/object_interfaces.h"
 
+#include "hw/i386/x86.h"
 #include "tdx.h"
 
+int tdx_kvm_init(MachineState *ms, Error **errp)
+{
+    return 0;
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 415aeb5af746..c8a23d95258d 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -16,4 +16,6 @@ typedef struct TdxGuest {
     uint64_t attributes;    /* TD attributes */
 } TdxGuest;
 
+int tdx_kvm_init(MachineState *ms, Error **errp);
+
 #endif /* QEMU_I386_TDX_H */
-- 
2.34.1



* [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (4 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 05/58] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-21  8:46   ` Daniel P. Berrangé
  2023-08-18  9:49 ` [PATCH v2 07/58] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object Xiaoyao Li
                   ` (51 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

KVM provides TDX capabilities via the KVM_TDX_CAPABILITIES subcommand of
IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the capabilities when initializing the
TDX context. They will be used later to validate the user's settings.

Since there is no interface reporting how many CPUID configs
KVM_TDX_CAPABILITIES contains, QEMU starts with a known number, doubles
it each time KVM returns -E2BIG, and aborts once the number exceeds
KVM_MAX_CPUID_ENTRIES.

Also introduce helpers to invoke TDX "ioctls" at different scopes (KVM,
VM and vCPU) in preparation for later patches.
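
For illustration only, a minimal standalone sketch of the grow-and-retry
loop described above. mock_get_caps() and the two structs are
hypothetical stand-ins for the KVM_TDX_CAPABILITIES call and its data
layout, not the real KVM interface; MAX_ENTRIES mirrors
KVM_MAX_CPUID_ENTRIES, and the sketch returns NULL where the patch
calls exit(1).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ENTRIES 100 /* mirrors KVM_MAX_CPUID_ENTRIES */

/* Hypothetical layouts, only loosely modelled on the KVM structs. */
struct cfg {
    uint32_t leaf, sub_leaf, eax, ebx, ecx, edx;
};

struct caps {
    uint32_t nr;        /* in: buffer capacity, out: entries filled */
    struct cfg cfgs[];  /* flexible array member */
};

/* Hypothetical stand-in for the ioctl: -E2BIG if the buffer is too small. */
static int mock_get_caps(struct caps *c, uint32_t available)
{
    if (c->nr < available) {
        return -E2BIG;
    }
    c->nr = available;
    return 0;
}

/* Start at 6 entries and double on -E2BIG; give up past MAX_ENTRIES. */
static struct caps *query_caps(uint32_t available)
{
    uint32_t nr = 6;

    for (;;) {
        struct caps *c = calloc(1, sizeof(*c) + nr * sizeof(struct cfg));
        int r;

        assert(c != NULL);
        c->nr = nr;

        r = mock_get_caps(c, available);
        if (r == 0) {
            return c;       /* caller frees */
        }
        free(c);
        if (r != -E2BIG) {
            return NULL;    /* unrelated failure */
        }
        nr *= 2;
        if (nr > MAX_ENTRIES) {
            return NULL;    /* the reported count looks implausible */
        }
    }
}
```

With doubling from 6, the sizes tried are 6, 12, 24, 48 and 96; the next
step (192) already exceeds the limit.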

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
changes from v1:
  - Make the error message more clear;

changes from RFC v4:
  - start from nr_cpuid_configs = 6 for the loop;
  - stop the loop when nr_cpuid_configs exceeds KVM_MAX_CPUID_ENTRIES;
---
 target/i386/kvm/kvm.c      |  2 -
 target/i386/kvm/kvm_i386.h |  2 +
 target/i386/kvm/tdx.c      | 93 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d6b988d6c2d1..ec5c07bffd38 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1751,8 +1751,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
 
 static Error *invtsc_mig_blocker;
 
-#define KVM_MAX_CPUID_ENTRIES  100
-
 static void kvm_init_xsave(CPUX86State *env)
 {
     if (has_xsave2) {
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index ea3a5b174ac0..769eadbba56c 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -13,6 +13,8 @@
 
 #include "sysemu/kvm.h"
 
+#define KVM_MAX_CPUID_ENTRIES  100
+
 #define kvm_apic_in_kernel() (kvm_irqchip_in_kernel())
 
 #ifdef CONFIG_KVM
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 77e33ae01147..255c47a2a553 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -12,14 +12,107 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qom/object_interfaces.h"
+#include "sysemu/kvm.h"
 
 #include "hw/i386/x86.h"
+#include "kvm_i386.h"
 #include "tdx.h"
 
+static struct kvm_tdx_capabilities *tdx_caps;
+
+enum tdx_ioctl_level{
+    TDX_PLATFORM_IOCTL,
+    TDX_VM_IOCTL,
+    TDX_VCPU_IOCTL,
+};
+
+static int __tdx_ioctl(void *state, enum tdx_ioctl_level level, int cmd_id,
+                        __u32 flags, void *data)
+{
+    struct kvm_tdx_cmd tdx_cmd;
+    int r;
+
+    memset(&tdx_cmd, 0x0, sizeof(tdx_cmd));
+
+    tdx_cmd.id = cmd_id;
+    tdx_cmd.flags = flags;
+    tdx_cmd.data = (__u64)(unsigned long)data;
+
+    switch (level) {
+    case TDX_PLATFORM_IOCTL:
+        r = kvm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
+        break;
+    case TDX_VM_IOCTL:
+        r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
+        break;
+    case TDX_VCPU_IOCTL:
+        r = kvm_vcpu_ioctl(state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
+        break;
+    default:
+        error_report("Invalid tdx_ioctl_level %d", level);
+        exit(1);
+    }
+
+    return r;
+}
+
+static inline int tdx_platform_ioctl(int cmd_id, __u32 flags, void *data)
+{
+    return __tdx_ioctl(NULL, TDX_PLATFORM_IOCTL, cmd_id, flags, data);
+}
+
+static inline int tdx_vm_ioctl(int cmd_id, __u32 flags, void *data)
+{
+    return __tdx_ioctl(NULL, TDX_VM_IOCTL, cmd_id, flags, data);
+}
+
+static inline int tdx_vcpu_ioctl(void *vcpu_fd, int cmd_id, __u32 flags,
+                                 void *data)
+{
+    return  __tdx_ioctl(vcpu_fd, TDX_VCPU_IOCTL, cmd_id, flags, data);
+}
+
+static void get_tdx_capabilities(void)
+{
+    struct kvm_tdx_capabilities *caps;
+    /* 1st generation of TDX reports 6 cpuid configs */
+    int nr_cpuid_configs = 6;
+    int r, size;
+
+    do {
+        size = sizeof(struct kvm_tdx_capabilities) +
+               nr_cpuid_configs * sizeof(struct kvm_tdx_cpuid_config);
+        caps = g_malloc0(size);
+        caps->nr_cpuid_configs = nr_cpuid_configs;
+
+        r = tdx_vm_ioctl(KVM_TDX_CAPABILITIES, 0, caps);
+        if (r == -E2BIG) {
+            g_free(caps);
+            nr_cpuid_configs *= 2;
+            if (nr_cpuid_configs > KVM_MAX_CPUID_ENTRIES) {
+                error_report("KVM TDX seems broken that number of CPUID entries in kvm_tdx_capabilities exceeds limit");
+                exit(1);
+            }
+        } else if (r < 0) {
+            g_free(caps);
+            error_report("KVM_TDX_CAPABILITIES failed: %s", strerror(-r));
+            exit(1);
+        }
+    }
+    while (r == -E2BIG);
+
+    tdx_caps = caps;
+}
+
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
+    if (!tdx_caps) {
+        get_tdx_capabilities();
+    }
+
     return 0;
 }
 
-- 
2.34.1



* [PATCH v2 07/58] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (5 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-21  8:48   ` Daniel P. Berrangé
  2023-08-18  9:49 ` [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions Xiaoyao Li
                   ` (50 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

TDX VMs need special handling in many places across QEMU. Introduce the
is_tdx_vm() helper to query whether the current VM is a TDX VM.

Cache the tdx_guest object so there is no need to cast from ms->cgs
every time.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 13 +++++++++++++
 target/i386/kvm/tdx.h | 10 ++++++++++
 2 files changed, 23 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 255c47a2a553..56cb826f6125 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -21,8 +21,16 @@
 #include "kvm_i386.h"
 #include "tdx.h"
 
+static TdxGuest *tdx_guest;
+
 static struct kvm_tdx_capabilities *tdx_caps;
 
+/* It's valid after kvm_confidential_guest_init()->tdx_kvm_init() */
+bool is_tdx_vm(void)
+{
+    return !!tdx_guest;
+}
+
 enum tdx_ioctl_level{
     TDX_PLATFORM_IOCTL,
     TDX_VM_IOCTL,
@@ -109,10 +117,15 @@ static void get_tdx_capabilities(void)
 
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
+    TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
+                                                    TYPE_TDX_GUEST);
+
     if (!tdx_caps) {
         get_tdx_capabilities();
     }
 
+    tdx_guest = tdx;
+
     return 0;
 }
 
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index c8a23d95258d..4036ca2f3f99 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -1,6 +1,10 @@
 #ifndef QEMU_I386_TDX_H
 #define QEMU_I386_TDX_H
 
+#ifndef CONFIG_USER_ONLY
+#include CONFIG_DEVICES /* CONFIG_TDX */
+#endif
+
 #include "exec/confidential-guest-support.h"
 
 #define TYPE_TDX_GUEST "tdx-guest"
@@ -16,6 +20,12 @@ typedef struct TdxGuest {
     uint64_t attributes;    /* TD attributes */
 } TdxGuest;
 
+#ifdef CONFIG_TDX
+bool is_tdx_vm(void);
+#else
+#define is_tdx_vm() 0
+#endif /* CONFIG_TDX */
+
 int tdx_kvm_init(MachineState *ms, Error **errp);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.34.1



* [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (6 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 07/58] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-21 23:00   ` Isaku Yamahata
  2023-10-10  1:02   ` Tina Zhang
  2023-08-18  9:49 ` [PATCH v2 09/58] i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by tdx_caps.cpuid_config[] Xiaoyao Li
                   ` (49 subsequent siblings)
  57 siblings, 2 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
bits of TD can be classified into 6 types:

------------------------------------------------------------------------
1 | As configured | configurable by VMM, independent of native value;
------------------------------------------------------------------------
2 | As configured | configurable by VMM if the bit is supported natively
    (if native)   | Otherwise it equals the native value (0).
------------------------------------------------------------------------
3 | Fixed         | fixed to 0/1
------------------------------------------------------------------------
4 | Native        | reflect the native value
------------------------------------------------------------------------
5 | Calculated    | calculated by TDX module.
------------------------------------------------------------------------
6 | Inducing #VE  | get #VE exception
------------------------------------------------------------------------

Note:
1. All the configurable XFAM related features and TD attributes related
   features fall into type #2. And fixed0/1 bits of XFAM and TD
   attributes fall into type #3.

2. For CPUID leaves not listed in "CPUID virtualization Overview" table
   in TDX module spec, TDX module injects #VE to TDs when those are
   queried. For this case, TDs can request CPUID emulation from VMM via
   TDVMCALL and the values are fully controlled by VMM.

Because the TDX module has its own virtualization policy for CPUID bits,
what KVM_GET_SUPPORTED_CPUID reports diverges from the CPUID bits that
are actually supported for TDs. To keep the CPUID configuration
consistent between the VMM and TDs, adjust the supported CPUID for TDs
based on TDX restrictions.

For now, only handle the CPUID leaves recognized by QEMU's
feature_word_info[], which is indexed by FeatureWord.

Introduce a TDX CPUID lookup table, which maintains one entry for each
FeatureWord. Each entry has the following fields:

 - tdx_fixed0/1: The bits that are fixed to 0/1.

 - vmm_fixup:   The bits that are configurable from the view of the TDX
                module but require VMM emulation when they are enabled.
                They are not supported if the VMM doesn't report them as
                supported, so they need to be fixed up by checking
                whether the VMM supports them.

 - inducing_ve: The TD gets #VE when querying this CPUID leaf. The
                result is fully configurable by the VMM.

 - supported_on_ve: Valid only when @inducing_ve is true. It represents
                    the maximum feature set that can be emulated for
                    TDs.

By applying the TDX CPUID lookup table and the TDX capabilities reported
by the TDX module, the supported CPUID for TDs can be obtained with the
following steps:

- get the base VMM-supported feature set;

- if the leaf is not a FeatureWord, just return the VMM's value without
  modification;

- if the leaf is of inducing_ve type, apply the supported_on_ve mask and
  return;

- include all native bits; this covers types #2 and #4 and part of
  type #1. (It also includes some unsupported bits, which the following
  step corrects.)

- apply fixed0/1 (this covers type #3 and rectifies the previous step);

- add the configurable bits (this covers the rest of type #1);

- fix up the bits in vmm_fixup;

- filter the ones that have a valid .supported field;

(The Calculated type is ignored since it is determined at runtime.)

Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/cpu.h     |  16 +++
 target/i386/kvm/kvm.c |   4 +
 target/i386/kvm/tdx.c | 254 ++++++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h |   2 +
 4 files changed, 276 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e0771a10433b..c93dcd274531 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -780,6 +780,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 
 /* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
 #define CPUID_7_0_EBX_FSGSBASE          (1U << 0)
+/* Support for TSC adjustment MSR 0x3B */
+#define CPUID_7_0_EBX_TSC_ADJUST        (1U << 1)
 /* Support SGX */
 #define CPUID_7_0_EBX_SGX               (1U << 2)
 /* 1st Group of Advanced Bit Manipulation Extensions */
@@ -798,8 +800,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_0_EBX_INVPCID           (1U << 10)
 /* Restricted Transactional Memory */
 #define CPUID_7_0_EBX_RTM               (1U << 11)
+/* Cache QoS Monitoring */
+#define CPUID_7_0_EBX_PQM               (1U << 12)
 /* Memory Protection Extension */
 #define CPUID_7_0_EBX_MPX               (1U << 14)
+/* Resource Director Technology Allocation */
+#define CPUID_7_0_EBX_RDT_A             (1U << 15)
 /* AVX-512 Foundation */
 #define CPUID_7_0_EBX_AVX512F           (1U << 16)
 /* AVX-512 Doubleword & Quadword Instruction */
@@ -855,10 +861,16 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_0_ECX_AVX512VNNI        (1U << 11)
 /* Support for VPOPCNT[B,W] and VPSHUFBITQMB */
 #define CPUID_7_0_ECX_AVX512BITALG      (1U << 12)
+/* Intel Total Memory Encryption */
+#define CPUID_7_0_ECX_TME               (1U << 13)
 /* POPCNT for vectors of DW/QW */
 #define CPUID_7_0_ECX_AVX512_VPOPCNTDQ  (1U << 14)
+/* Placeholder for bit 15 */
+#define CPUID_7_0_ECX_FZM               (1U << 15)
 /* 5-level Page Tables */
 #define CPUID_7_0_ECX_LA57              (1U << 16)
+/* MAWAU for MPX */
+#define CPUID_7_0_ECX_MAWAU             (31U << 17)
 /* Read Processor ID */
 #define CPUID_7_0_ECX_RDPID             (1U << 22)
 /* Bus Lock Debug Exception */
@@ -869,6 +881,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_0_ECX_MOVDIRI           (1U << 27)
 /* Move 64 Bytes as Direct Store Instruction */
 #define CPUID_7_0_ECX_MOVDIR64B         (1U << 28)
+/* ENQCMD and ENQCMDS instructions */
+#define CPUID_7_0_ECX_ENQCMD            (1U << 29)
 /* Support SGX Launch Control */
 #define CPUID_7_0_ECX_SGX_LC            (1U << 30)
 /* Protection Keys for Supervisor-mode Pages */
@@ -886,6 +900,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_0_EDX_SERIALIZE         (1U << 14)
 /* TSX Suspend Load Address Tracking instruction */
 #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
+/* PCONFIG instruction */
+#define CPUID_7_0_EDX_PCONFIG           (1U << 18)
 /* Architectural LBRs */
 #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
 /* AMX_BF16 instruction */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ec5c07bffd38..46a455a1e331 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -539,6 +539,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
         ret |= 1U << KVM_HINTS_REALTIME;
     }
 
+    if (is_tdx_vm()) {
+        tdx_get_supported_cpuid(function, index, reg, &ret);
+    }
+
     return ret;
 }
 
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 56cb826f6125..3198bc9fd5fb 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -15,11 +15,129 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "qom/object_interfaces.h"
+#include "standard-headers/asm-x86/kvm_para.h"
 #include "sysemu/kvm.h"
+#include "sysemu/sysemu.h"
 
 #include "hw/i386/x86.h"
 #include "kvm_i386.h"
 #include "tdx.h"
+#include "../cpu-internal.h"
+
+#define TDX_SUPPORTED_KVM_FEATURES  ((1U << KVM_FEATURE_NOP_IO_DELAY) | \
+                                     (1U << KVM_FEATURE_PV_UNHALT) | \
+                                     (1U << KVM_FEATURE_PV_TLB_FLUSH) | \
+                                     (1U << KVM_FEATURE_PV_SEND_IPI) | \
+                                     (1U << KVM_FEATURE_POLL_CONTROL) | \
+                                     (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
+                                     (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
+
+typedef struct KvmTdxCpuidLookup {
+    uint32_t tdx_fixed0;
+    uint32_t tdx_fixed1;
+
+    /*
+     * The CPUID bits that are configurable from the view of TDX module
+     * but require VMM emulation if configured to enabled by VMM.
+     *
+     * For those bits, they cannot be enabled actually if VMM (KVM/QEMU) cannot
+     * virtualize them.
+     */
+    uint32_t vmm_fixup;
+
+    bool inducing_ve;
+    /*
+     * The maximum supported feature set for given inducing-#VE leaf.
+     * It's valid only when .inducing_ve is true.
+     */
+    uint32_t supported_on_ve;
+} KvmTdxCpuidLookup;
+
+ /*
+  * QEMU maintained TDX CPUID lookup tables, which reflects how CPUIDs are
+  * virtualized for guest TDs based on "CPUID virtualization" of TDX spec.
+  *
+  * Note:
+  *
+  * This table will be updated at runtime by tdx_caps reported by the platform.
+  *
+  */
+static KvmTdxCpuidLookup tdx_cpuid_lookup[FEATURE_WORDS] = {
+    [FEAT_1_EDX] = {
+        .tdx_fixed0 =
+            BIT(10) /* Reserved */ | BIT(20) /* Reserved */ | CPUID_IA64,
+        .tdx_fixed1 =
+            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_APIC |
+            CPUID_MTRR | CPUID_MCA | CPUID_CLFLUSH | CPUID_DTS,
+        .vmm_fixup =
+            CPUID_ACPI | CPUID_PBE,
+    },
+    [FEAT_1_ECX] = {
+        .tdx_fixed0 =
+            CPUID_EXT_VMX | CPUID_EXT_SMX | BIT(16) /* Reserved */,
+        .tdx_fixed1 =
+            CPUID_EXT_CX16 | CPUID_EXT_PDCM | CPUID_EXT_X2APIC |
+            CPUID_EXT_AES | CPUID_EXT_XSAVE | CPUID_EXT_RDRAND |
+            CPUID_EXT_HYPERVISOR,
+        .vmm_fixup =
+            CPUID_EXT_EST | CPUID_EXT_TM2 | CPUID_EXT_XTPR | CPUID_EXT_DCA,
+    },
+    [FEAT_8000_0001_EDX] = {
+        .tdx_fixed1 =
+            CPUID_EXT2_NX | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP |
+            CPUID_EXT2_LM,
+    },
+    [FEAT_7_0_EBX] = {
+        .tdx_fixed0 =
+            CPUID_7_0_EBX_TSC_ADJUST | CPUID_7_0_EBX_SGX | CPUID_7_0_EBX_MPX,
+        .tdx_fixed1 =
+            CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_RTM |
+            CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_SMAP |
+            CPUID_7_0_EBX_CLFLUSHOPT | CPUID_7_0_EBX_CLWB |
+            CPUID_7_0_EBX_SHA_NI,
+        .vmm_fixup =
+            CPUID_7_0_EBX_PQM | CPUID_7_0_EBX_RDT_A,
+    },
+    [FEAT_7_0_ECX] = {
+        .tdx_fixed0 =
+            CPUID_7_0_ECX_FZM | CPUID_7_0_ECX_MAWAU |
+            CPUID_7_0_ECX_ENQCMD | CPUID_7_0_ECX_SGX_LC,
+        .tdx_fixed1 =
+            CPUID_7_0_ECX_MOVDIR64B | CPUID_7_0_ECX_BUS_LOCK_DETECT,
+        .vmm_fixup =
+            CPUID_7_0_ECX_TME,
+    },
+    [FEAT_7_0_EDX] = {
+        .tdx_fixed1 =
+            CPUID_7_0_EDX_SPEC_CTRL | CPUID_7_0_EDX_ARCH_CAPABILITIES |
+            CPUID_7_0_EDX_CORE_CAPABILITY | CPUID_7_0_EDX_SPEC_CTRL_SSBD,
+        .vmm_fixup =
+            CPUID_7_0_EDX_PCONFIG,
+    },
+    [FEAT_8000_0008_EBX] = {
+        .tdx_fixed0 =
+            ~CPUID_8000_0008_EBX_WBNOINVD,
+        .tdx_fixed1 =
+            CPUID_8000_0008_EBX_WBNOINVD,
+    },
+    [FEAT_XSAVE] = {
+        .tdx_fixed1 =
+            CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
+            CPUID_XSAVE_XSAVES,
+    },
+    [FEAT_6_EAX] = {
+        .inducing_ve = true,
+        .supported_on_ve = CPUID_6_EAX_ARAT,
+    },
+    [FEAT_8000_0007_EDX] = {
+        .inducing_ve = true,
+        .supported_on_ve = -1U,
+    },
+    [FEAT_KVM] = {
+        .inducing_ve = true,
+        .supported_on_ve = TDX_SUPPORTED_KVM_FEATURES,
+    },
+};
 
 static TdxGuest *tdx_guest;
 
@@ -31,6 +149,142 @@ bool is_tdx_vm(void)
     return !!tdx_guest;
 }
 
+static inline uint32_t host_cpuid_reg(uint32_t function,
+                                      uint32_t index, int reg)
+{
+    uint32_t eax, ebx, ecx, edx;
+    uint32_t ret = 0;
+
+    host_cpuid(function, index, &eax, &ebx, &ecx, &edx);
+
+    switch (reg) {
+    case R_EAX:
+        ret |= eax;
+        break;
+    case R_EBX:
+        ret |= ebx;
+        break;
+    case R_ECX:
+        ret |= ecx;
+        break;
+    case R_EDX:
+        ret |= edx;
+        break;
+    }
+    return ret;
+}
+
+static inline uint32_t tdx_cap_cpuid_config(uint32_t function,
+                                            uint32_t index, int reg)
+{
+    struct kvm_tdx_cpuid_config *cpuid_c;
+    int ret = 0;
+    int i;
+
+    if (tdx_caps->nr_cpuid_configs <= 0) {
+        return ret;
+    }
+
+    for (i = 0; i < tdx_caps->nr_cpuid_configs; i++) {
+        cpuid_c = &tdx_caps->cpuid_configs[i];
+        /* 0xffffffff in sub_leaf means the leaf doesn't require a subleaf */
+        if (cpuid_c->leaf == function &&
+            (cpuid_c->sub_leaf == 0xffffffff || cpuid_c->sub_leaf == index)) {
+            switch (reg) {
+            case R_EAX:
+                ret = cpuid_c->eax;
+                break;
+            case R_EBX:
+                ret = cpuid_c->ebx;
+                break;
+            case R_ECX:
+                ret = cpuid_c->ecx;
+                break;
+            case R_EDX:
+                ret = cpuid_c->edx;
+                break;
+            default:
+                return 0;
+            }
+        }
+    }
+    return ret;
+}
+
+static FeatureWord get_cpuid_featureword_index(uint32_t function,
+                                               uint32_t index, int reg)
+{
+    FeatureWord w;
+
+    for (w = 0; w < FEATURE_WORDS; w++) {
+        FeatureWordInfo *f = &feature_word_info[w];
+
+        if (f->type == MSR_FEATURE_WORD || f->cpuid.eax != function ||
+            f->cpuid.reg != reg ||
+            (f->cpuid.needs_ecx && f->cpuid.ecx != index)) {
+            continue;
+        }
+
+        return w;
+    }
+
+    return w;
+}
+
+/*
+ * TDX supported CPUID varies from what KVM reports. Adjust the result by
+ * applying the TDX restrictions.
+ */
+void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
+                             uint32_t *ret)
+{
+    uint32_t vmm_cap = *ret;
+    FeatureWord w;
+
+    /* Only handle feature leaves recognized by feature_word_info[] */
+    w = get_cpuid_featureword_index(function, index, reg);
+    if (w == FEATURE_WORDS) {
+        return;
+    }
+
+    if (tdx_cpuid_lookup[w].inducing_ve) {
+        *ret &= tdx_cpuid_lookup[w].supported_on_ve;
+        return;
+    }
+
+    /*
+     * Include all the native bits as first step. It covers types
+     * - As configured (if native)
+     * - Native
+     * - XFAM related and Attributes related
+     *
+     * It also has side effect to enable unsupported bits, e.g., the
+     * bits of "fixed0" type while present natively. It's safe because
+     * the unsupported bits will be masked off by .fixed0 later.
+     */
+    *ret |= host_cpuid_reg(function, index, reg);
+
+    /* Adjust according to "fixed" type in tdx_cpuid_lookup. */
+    *ret |= tdx_cpuid_lookup[w].tdx_fixed1;
+    *ret &= ~tdx_cpuid_lookup[w].tdx_fixed0;
+
+    /*
+     * Configurable cpuids are supported unconditionally. It's mainly to
+     * include those configurable regardless of native existence.
+     */
+    *ret |= tdx_cap_cpuid_config(function, index, reg);
+
+    /*
+     * clear the configurable bits that require VMM emulation and VMM doesn't
+     * report the support.
+     */
+    *ret &= ~(~vmm_cap & tdx_cpuid_lookup[w].vmm_fixup);
+
+    /* special handling */
+    if (function == 1 && reg == R_ECX && !enable_cpu_pm)
+        *ret &= ~CPUID_EXT_MONITOR;
+}
+
 enum tdx_ioctl_level{
     TDX_PLATFORM_IOCTL,
     TDX_VM_IOCTL,
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 4036ca2f3f99..06599b65b827 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -27,5 +27,7 @@ bool is_tdx_vm(void);
 #endif /* CONFIG_TDX */
 
 int tdx_kvm_init(MachineState *ms, Error **errp);
+void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
+                             uint32_t *ret);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.34.1



* [PATCH v2 09/58] i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by tdx_caps.cpuid_config[]
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (7 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 10/58] i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup Xiaoyao Li
                   ` (48 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

tdx_cpuid_lookup[].tdx_fixed0/1 is QEMU-maintained data that reflects
the TDX restrictions on how some CPUID bits are virtualized by TDX.

It is derived from the TDX spec. However, TDX may change some fixed
fields to configurable in the future. Update the
tdx_cpuid_lookup[].tdx_fixed0/1 fields by removing the bits that the
TDX module reports as configurable. This lets QEMU adapt to an updated
TDX module automatically.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 3198bc9fd5fb..4518c79aecc8 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -369,6 +369,34 @@ static void get_tdx_capabilities(void)
     tdx_caps = caps;
 }
 
+static void update_tdx_cpuid_lookup_by_tdx_caps(void)
+{
+    KvmTdxCpuidLookup *entry;
+    FeatureWordInfo *fi;
+    uint32_t config;
+    FeatureWord w;
+
+    /*
+     * Patch tdx_fixed0/1 using tdx_caps: whatever the TDX module reports
+     * as configurable is not fixed.
+     */
+    for (w = 0; w < FEATURE_WORDS; w++) {
+        fi = &feature_word_info[w];
+        entry = &tdx_cpuid_lookup[w];
+
+        if (fi->type != CPUID_FEATURE_WORD) {
+            continue;
+        }
+
+        config = tdx_cap_cpuid_config(fi->cpuid.eax,
+                                      fi->cpuid.needs_ecx ? fi->cpuid.ecx : ~0u,
+                                      fi->cpuid.reg);
+
+        entry->tdx_fixed0 &= ~config;
+        entry->tdx_fixed1 &= ~config;
+    }
+}
+
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
     TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
@@ -378,6 +406,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         get_tdx_capabilities();
     }
 
+    update_tdx_cpuid_lookup_by_tdx_caps();
+
     tdx_guest = tdx;
 
     return 0;
-- 
2.34.1



* [PATCH v2 10/58] i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (8 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 09/58] i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by tdx_caps.cpuid_config[] Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 11/58] i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup Xiaoyao Li
                   ` (47 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

KVM requires userspace to pass the XFAM configuration via the CPUID 0xD
leaves.

Convert tdx_caps->xfam_fixed0/1 into the corresponding
tdx_cpuid_lookup[].tdx_fixed0/1 fields of the CPUID 0xD leaves, so that
the restriction is applied naturally.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
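Note for reviewers: the split done in this patch can be tried in isolation
with the minimal sketch below. It masks a 64-bit fixed0/fixed1 pair by the
bits a CPUID word pair can express and divides the result into a low and a
high 32-bit word. FixedBits, xfam_to_cpuid_words() and the mask values used
with it are illustrative assumptions, not QEMU definitions; the inversion of
fixed0 follows what the patch itself does.

```c
#include <stdint.h>

/* Hypothetical stand-in for one tdx_cpuid_lookup[] entry. */
typedef struct FixedBits {
    uint32_t fixed0;   /* bits the guest must see as 0 */
    uint32_t fixed1;   /* bits the guest must see as 1 */
} FixedBits;

/*
 * Split a 64-bit XFAM restriction into low/high 32-bit CPUID words.
 * As in the patch, a clear bit in xfam_fixed0 is treated as "must be 0"
 * (hence the inversion) and a set bit in xfam_fixed1 as "must be 1".
 * word_mask selects the bits this pair of CPUID words can describe.
 */
static void xfam_to_cpuid_words(uint64_t xfam_fixed0, uint64_t xfam_fixed1,
                                uint64_t word_mask,
                                FixedBits *lo, FixedBits *hi)
{
    uint64_t must_be_0 = ~xfam_fixed0 & word_mask;
    uint64_t must_be_1 = xfam_fixed1 & word_mask;

    lo->fixed0 = (uint32_t)must_be_0;
    lo->fixed1 = (uint32_t)must_be_1;
    hi->fixed0 = (uint32_t)(must_be_0 >> 32);
    hi->fixed1 = (uint32_t)(must_be_1 >> 32);
}
```

Masking in 64 bits first and shifting afterwards keeps the high word from
being lost to an early truncation.
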
 target/i386/cpu.c     |  3 ---
 target/i386/cpu.h     |  3 +++
 target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 97ad229d8ba3..b8850a02539a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1568,9 +1568,6 @@ static const X86RegisterInfo32 x86_reg_info_32[CPU_NB_REGS32] = {
 };
 #undef REGISTER
 
-/* CPUID feature bits available in XSS */
-#define CPUID_XSTATE_XSS_MASK    (XSTATE_ARCH_LBR_MASK)
-
 ExtSaveArea x86_ext_save_areas[XSAVE_STATE_AREA_COUNT] = {
     [XSTATE_FP_BIT] = {
         /* x87 FP state component is always enabled if XSAVE is supported */
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index c93dcd274531..d4a7f9e3f54c 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -588,6 +588,9 @@ typedef enum X86Seg {
                                  XSTATE_Hi16_ZMM_MASK | XSTATE_PKRU_MASK | \
                                  XSTATE_XTILE_CFG_MASK | XSTATE_XTILE_DATA_MASK)
 
+/* CPUID feature bits available in XSS */
+#define CPUID_XSTATE_XSS_MASK    (XSTATE_ARCH_LBR_MASK)
+
 /* CPUID feature words */
 typedef enum FeatureWord {
     FEAT_1_EDX,         /* CPUID[1].EDX */
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 4518c79aecc8..19aed03f12c6 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -395,6 +395,30 @@ static void update_tdx_cpuid_lookup_by_tdx_caps(void)
         entry->tdx_fixed0 &= ~config;
         entry->tdx_fixed1 &= ~config;
     }
+
+    /*
+     * Because KVM gets XFAM settings via CPUID leaves 0xD,  map
+     * tdx_caps->xfam_fixed{0, 1} into tdx_cpuid_lookup[].tdx_fixed{0, 1}.
+     *
+     * Then the enforcement applies in tdx_get_configurable_cpuid() naturally.
+     */
+    tdx_cpuid_lookup[FEAT_XSAVE_XCR0_LO].tdx_fixed0 =
+            (uint32_t)~tdx_caps->xfam_fixed0 & CPUID_XSTATE_XCR0_MASK;
+    tdx_cpuid_lookup[FEAT_XSAVE_XCR0_LO].tdx_fixed1 =
+            (uint32_t)tdx_caps->xfam_fixed1 & CPUID_XSTATE_XCR0_MASK;
+    tdx_cpuid_lookup[FEAT_XSAVE_XCR0_HI].tdx_fixed0 =
+            (~tdx_caps->xfam_fixed0 & CPUID_XSTATE_XCR0_MASK) >> 32;
+    tdx_cpuid_lookup[FEAT_XSAVE_XCR0_HI].tdx_fixed1 =
+            (tdx_caps->xfam_fixed1 & CPUID_XSTATE_XCR0_MASK) >> 32;
+
+    tdx_cpuid_lookup[FEAT_XSAVE_XSS_LO].tdx_fixed0 =
+            (uint32_t)~tdx_caps->xfam_fixed0 & CPUID_XSTATE_XSS_MASK;
+    tdx_cpuid_lookup[FEAT_XSAVE_XSS_LO].tdx_fixed1 =
+            (uint32_t)tdx_caps->xfam_fixed1 & CPUID_XSTATE_XSS_MASK;
+    tdx_cpuid_lookup[FEAT_XSAVE_XSS_HI].tdx_fixed0 =
+            (~tdx_caps->xfam_fixed0 & CPUID_XSTATE_XSS_MASK) >> 32;
+    tdx_cpuid_lookup[FEAT_XSAVE_XSS_HI].tdx_fixed1 =
+            (tdx_caps->xfam_fixed1 & CPUID_XSTATE_XSS_MASK) >> 32;
 }
 
 int tdx_kvm_init(MachineState *ms, Error **errp)
-- 
2.34.1



* [PATCH v2 11/58] i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (9 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 10/58] i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 12/58] i386/kvm: Move architectural CPUID leaf generation to separate helper Xiaoyao Li
                   ` (46 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Some bits in the TD attributes have corresponding CPUID feature bits. Apply
the fixed0/1 restrictions on the TD attributes to those CPUID bits in
tdx_cpuid_lookup[] as well.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
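Note for reviewers: a standalone sketch of the table-driven mapping used
here, assuming a sparse array indexed by attribute bit position. AttrMap,
attr_map[], the WORD_* indices and the mask values are hypothetical and only
illustrate the lookup; they are not the QEMU definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_MAX_BITS 64

/* Hypothetical feature-word indices, for illustration only. */
enum { WORD_A, WORD_B, WORD_COUNT };

typedef struct AttrMap {
    int word;        /* CPUID feature word the attribute bit maps to */
    uint32_t mask;   /* CPUID bit(s) in that word; 0 means "no mapping" */
} AttrMap;

/* Sparse table: entries not listed are zero-initialized. */
static const AttrMap attr_map[ATTR_MAX_BITS] = {
    [30] = { .word = WORD_B, .mask = 1U << 31 },
    [31] = { .word = WORD_B, .mask = 1U << 23 },
};

/* OR the CPUID mask of every attribute bit set in 'attrs' into 'words'. */
static void attrs_to_cpuid(uint64_t attrs, uint32_t words[WORD_COUNT])
{
    for (size_t i = 0; i < ATTR_MAX_BITS; i++) {
        if (attrs & (1ULL << i)) {
            words[attr_map[i].word] |= attr_map[i].mask;
        }
    }
}
```

Unmapped attribute bits resolve to word 0 with an empty mask, so OR-ing them
in changes nothing; this is why the loop can walk all 64 positions without a
separate "is mapped" check.
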
 target/i386/cpu-internal.h |  9 +++++++++
 target/i386/cpu.c          |  9 ---------
 target/i386/cpu.h          |  2 ++
 target/i386/kvm/tdx.c      | 21 +++++++++++++++++++++
 4 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/target/i386/cpu-internal.h b/target/i386/cpu-internal.h
index 9baac5c0b450..e980f6e3147f 100644
--- a/target/i386/cpu-internal.h
+++ b/target/i386/cpu-internal.h
@@ -20,6 +20,15 @@
 #ifndef I386_CPU_INTERNAL_H
 #define I386_CPU_INTERNAL_H
 
+typedef struct FeatureMask {
+    FeatureWord index;
+    uint64_t mask;
+} FeatureMask;
+
+typedef struct FeatureDep {
+    FeatureMask from, to;
+} FeatureDep;
+
 typedef enum FeatureWordType {
    CPUID_FEATURE_WORD,
    MSR_FEATURE_WORD,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b8850a02539a..a529c07e749c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1439,15 +1439,6 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
     },
 };
 
-typedef struct FeatureMask {
-    FeatureWord index;
-    uint64_t mask;
-} FeatureMask;
-
-typedef struct FeatureDep {
-    FeatureMask from, to;
-} FeatureDep;
-
 static FeatureDep feature_dependencies[] = {
     {
         .from = { FEAT_7_0_EDX,             CPUID_7_0_EDX_ARCH_CAPABILITIES },
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d4a7f9e3f54c..f51e93e21f73 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -876,6 +876,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_0_ECX_MAWAU             (31U << 17)
 /* Read Processor ID */
 #define CPUID_7_0_ECX_RDPID             (1U << 22)
+/* KeyLocker */
+#define CPUID_7_0_ECX_KeyLocker         (1U << 23)
 /* Bus Lock Debug Exception */
 #define CPUID_7_0_ECX_BUS_LOCK_DETECT   (1U << 24)
 /* Cache Line Demote Instruction */
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 19aed03f12c6..29f50fb9529e 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -32,6 +32,13 @@
                                      (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
                                      (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
 
+#define TDX_ATTRIBUTES_MAX_BITS      64
+
+static FeatureMask tdx_attrs_ctrl_fields[TDX_ATTRIBUTES_MAX_BITS] = {
+    [30] = { .index = FEAT_7_0_ECX, .mask = CPUID_7_0_ECX_PKS },
+    [31] = { .index = FEAT_7_0_ECX, .mask = CPUID_7_0_ECX_KeyLocker },
+};
+
 typedef struct KvmTdxCpuidLookup {
     uint32_t tdx_fixed0;
     uint32_t tdx_fixed1;
@@ -375,6 +382,8 @@ static void update_tdx_cpuid_lookup_by_tdx_caps(void)
     FeatureWordInfo *fi;
     uint32_t config;
     FeatureWord w;
+    FeatureMask *fm;
+    int i;
 
     /*
      * Patch tdx_fixed0/1 by tdx_caps that what TDX module reports as
@@ -396,6 +405,18 @@ static void update_tdx_cpuid_lookup_by_tdx_caps(void)
         entry->tdx_fixed1 &= ~config;
     }
 
+    for (i = 0; i < ARRAY_SIZE(tdx_attrs_ctrl_fields); i++) {
+        fm = &tdx_attrs_ctrl_fields[i];
+
+        if (tdx_caps->attrs_fixed0 & (1ULL << i)) {
+            tdx_cpuid_lookup[fm->index].tdx_fixed0 |= fm->mask;
+        }
+
+        if (tdx_caps->attrs_fixed1 & (1ULL << i)) {
+            tdx_cpuid_lookup[fm->index].tdx_fixed1 |= fm->mask;
+        }
+    }
+
     /*
      * Because KVM gets XFAM settings via CPUID leaves 0xD,  map
      * tdx_caps->xfam_fixed{0, 1} into tdx_cpuid_lookup[].tdx_fixed{0, 1}.
-- 
2.34.1



* [PATCH v2 12/58] i386/kvm: Move architectural CPUID leaf generation to separate helper
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (10 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 11/58] i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-18  9:49 ` [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu() Xiaoyao Li
                   ` (45 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
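Note for reviewers: the calling convention of the new helper (append to a
caller-owned array starting at a given index, return the updated index) is
sketched below in reduced form. Entry, read_leaf(), fill_entries() and
MAX_ENTRIES are hypothetical names standing in for the real types and for
KVM_MAX_CPUID_ENTRIES.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_ENTRIES 8   /* hypothetical capacity of the caller's array */

typedef struct Entry {
    uint32_t function;
    uint32_t eax;
} Entry;

/* Hypothetical leaf source: even leaves report nothing. */
static uint32_t read_leaf(uint32_t function)
{
    return (function % 2) ? function * 0x10 : 0;
}

/*
 * Append leaves 0..limit to 'entries', starting at slot 'idx'.
 * Entries before 'idx' belong to the caller and are left untouched.
 * Returns the index one past the last slot written.
 */
static uint32_t fill_entries(Entry *entries, uint32_t idx, uint32_t limit)
{
    for (uint32_t i = 0; i <= limit; i++) {
        uint32_t eax = read_leaf(i);

        if (!eax) {
            continue;   /* skip all-zero leaves to save slots */
        }
        if (idx == MAX_ENTRIES) {
            fprintf(stderr, "entry array is full at leaf 0x%x\n", i);
            abort();
        }
        entries[idx].function = i;
        entries[idx].eax = eax;
        idx++;
    }
    return idx;
}
```

Passing the index in and out lets a caller that already filled some slots
(as kvm_arch_init_vcpu() does for the KVM leaves) and a caller that starts
from slot 0 share the same generator.
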
 target/i386/kvm/kvm.c      | 454 +++++++++++++++++++------------------
 target/i386/kvm/kvm_i386.h |   3 +
 2 files changed, 235 insertions(+), 222 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 46a455a1e331..9ee41fffc445 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1799,6 +1799,236 @@ static void kvm_init_nested_state(CPUX86State *env)
     }
 }
 
+uint32_t kvm_x86_arch_cpuid(CPUX86State *env, struct kvm_cpuid_entry2 *entries,
+                            uint32_t cpuid_i)
+{
+    uint32_t limit, i, j;
+    uint32_t unused;
+    struct kvm_cpuid_entry2 *c;
+
+    cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
+
+    for (i = 0; i <= limit; i++) {
+        if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+            fprintf(stderr, "unsupported level value: 0x%x\n", limit);
+            abort();
+        }
+        c = &entries[cpuid_i++];
+
+        switch (i) {
+        case 2: {
+            /* Keep reading function 2 till all the input is received */
+            int times;
+
+            c->function = i;
+            c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC |
+                       KVM_CPUID_FLAG_STATE_READ_NEXT;
+            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+            times = c->eax & 0xff;
+
+            for (j = 1; j < times; ++j) {
+                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+                    fprintf(stderr, "cpuid_data is full, no space for "
+                            "cpuid(eax:2):eax & 0xf = 0x%x\n", times);
+                    abort();
+                }
+                c = &entries[cpuid_i++];
+                c->function = i;
+                c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC;
+                cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+            }
+            break;
+        }
+        case 0x1f:
+            if (env->nr_dies < 2) {
+                cpuid_i--;
+                break;
+            }
+            /* fallthrough */
+        case 4:
+        case 0xb:
+        case 0xd:
+            for (j = 0; ; j++) {
+                if (i == 0xd && j == 64) {
+                    break;
+                }
+
+                c->function = i;
+                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+                c->index = j;
+                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+
+                if (i == 4 && c->eax == 0) {
+                    break;
+                }
+                if (i == 0xb && !(c->ecx & 0xff00)) {
+                    break;
+                }
+                if (i == 0x1f && !(c->ecx & 0xff00)) {
+                    break;
+                }
+                if (i == 0xd && c->eax == 0) {
+                    continue;
+                }
+                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+                    fprintf(stderr, "cpuid_data is full, no space for "
+                            "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
+                    abort();
+                }
+                c = &entries[cpuid_i++];
+            }
+            break;
+        case 0x7:
+        case 0x12:
+            for (j = 0; ; j++) {
+                c->function = i;
+                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+                c->index = j;
+                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+
+                if (j > 1 && (c->eax & 0xf) != 1) {
+                    break;
+                }
+
+                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+                    fprintf(stderr, "cpuid_data is full, no space for "
+                                "cpuid(eax:0x12,ecx:0x%x)\n", j);
+                    abort();
+                }
+                c = &entries[cpuid_i++];
+            }
+            break;
+        case 0x14:
+        case 0x1d:
+        case 0x1e: {
+            uint32_t times;
+
+            c->function = i;
+            c->index = 0;
+            c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+            times = c->eax;
+
+            for (j = 1; j <= times; ++j) {
+                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+                    fprintf(stderr, "cpuid_data is full, no space for "
+                                "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
+                    abort();
+                }
+                c = &entries[cpuid_i++];
+                c->function = i;
+                c->index = j;
+                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+            }
+            break;
+        }
+        default:
+            c->function = i;
+            c->flags = 0;
+            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+            if (!c->eax && !c->ebx && !c->ecx && !c->edx) {
+                /*
+                 * KVM already returns all zeroes if a CPUID entry is missing,
+                 * so we can omit it and avoid hitting KVM's 80-entry limit.
+                 */
+                cpuid_i--;
+            }
+            break;
+        }
+    }
+
+    if (limit >= 0x0a) {
+        uint32_t eax, edx;
+
+        cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
+
+        has_architectural_pmu_version = eax & 0xff;
+        if (has_architectural_pmu_version > 0) {
+            num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
+
+            /* Shouldn't be more than 32, since that's the number of bits
+             * available in EBX to tell us _which_ counters are available.
+             * Play it safe.
+             */
+            if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
+                num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
+            }
+
+            if (has_architectural_pmu_version > 1) {
+                num_architectural_pmu_fixed_counters = edx & 0x1f;
+
+                if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
+                    num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
+                }
+            }
+        }
+    }
+
+    cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused);
+
+    for (i = 0x80000000; i <= limit; i++) {
+        if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+            fprintf(stderr, "unsupported xlevel value: 0x%x\n", limit);
+            abort();
+        }
+        c = &entries[cpuid_i++];
+
+        switch (i) {
+        case 0x8000001d:
+            /* Query for all AMD cache information leaves */
+            for (j = 0; ; j++) {
+                c->function = i;
+                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
+                c->index = j;
+                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
+
+                if (c->eax == 0) {
+                    break;
+                }
+                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+                    fprintf(stderr, "cpuid_data is full, no space for "
+                            "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
+                    abort();
+                }
+                c = &entries[cpuid_i++];
+            }
+            break;
+        default:
+            c->function = i;
+            c->flags = 0;
+            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+            if (!c->eax && !c->ebx && !c->ecx && !c->edx) {
+                /*
+                 * KVM already returns all zeroes if a CPUID entry is missing,
+                 * so we can omit it and avoid hitting KVM's 80-entry limit.
+                 */
+                cpuid_i--;
+            }
+            break;
+        }
+    }
+
+    /* Call Centaur's CPUID instructions if they are supported. */
+    if (env->cpuid_xlevel2 > 0) {
+        cpu_x86_cpuid(env, 0xC0000000, 0, &limit, &unused, &unused, &unused);
+
+        for (i = 0xC0000000; i <= limit; i++) {
+            if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
+                fprintf(stderr, "unsupported xlevel2 value: 0x%x\n", limit);
+                abort();
+            }
+            c = &entries[cpuid_i++];
+
+            c->function = i;
+            c->flags = 0;
+            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+        }
+    }
+
+    return cpuid_i;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
     struct {
@@ -1815,8 +2045,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
     X86CPU *cpu = X86_CPU(cs);
     CPUX86State *env = &cpu->env;
-    uint32_t limit, i, j, cpuid_i;
-    uint32_t unused;
+    uint32_t cpuid_i;
     struct kvm_cpuid_entry2 *c;
     uint32_t signature[3];
     int kvm_base = KVM_CPUID_SIGNATURE;
@@ -1965,8 +2194,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
         c->edx = env->features[FEAT_KVM_HINTS];
     }
 
-    cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
-
     if (cpu->kvm_pv_enforce_cpuid) {
         r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1);
         if (r < 0) {
@@ -1977,224 +2204,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
         }
     }
 
-    for (i = 0; i <= limit; i++) {
-        if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-            fprintf(stderr, "unsupported level value: 0x%x\n", limit);
-            abort();
-        }
-        c = &cpuid_data.entries[cpuid_i++];
-
-        switch (i) {
-        case 2: {
-            /* Keep reading function 2 till all the input is received */
-            int times;
-
-            c->function = i;
-            c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC |
-                       KVM_CPUID_FLAG_STATE_READ_NEXT;
-            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
-            times = c->eax & 0xff;
-
-            for (j = 1; j < times; ++j) {
-                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-                    fprintf(stderr, "cpuid_data is full, no space for "
-                            "cpuid(eax:2):eax & 0xf = 0x%x\n", times);
-                    abort();
-                }
-                c = &cpuid_data.entries[cpuid_i++];
-                c->function = i;
-                c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC;
-                cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
-            }
-            break;
-        }
-        case 0x1f:
-            if (env->nr_dies < 2) {
-                cpuid_i--;
-                break;
-            }
-            /* fallthrough */
-        case 4:
-        case 0xb:
-        case 0xd:
-            for (j = 0; ; j++) {
-                if (i == 0xd && j == 64) {
-                    break;
-                }
-
-                c->function = i;
-                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-                c->index = j;
-                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
-
-                if (i == 4 && c->eax == 0) {
-                    break;
-                }
-                if (i == 0xb && !(c->ecx & 0xff00)) {
-                    break;
-                }
-                if (i == 0x1f && !(c->ecx & 0xff00)) {
-                    break;
-                }
-                if (i == 0xd && c->eax == 0) {
-                    continue;
-                }
-                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-                    fprintf(stderr, "cpuid_data is full, no space for "
-                            "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
-                    abort();
-                }
-                c = &cpuid_data.entries[cpuid_i++];
-            }
-            break;
-        case 0x7:
-        case 0x12:
-            for (j = 0; ; j++) {
-                c->function = i;
-                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-                c->index = j;
-                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
-
-                if (j > 1 && (c->eax & 0xf) != 1) {
-                    break;
-                }
-
-                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-                    fprintf(stderr, "cpuid_data is full, no space for "
-                                "cpuid(eax:0x12,ecx:0x%x)\n", j);
-                    abort();
-                }
-                c = &cpuid_data.entries[cpuid_i++];
-            }
-            break;
-        case 0x14:
-        case 0x1d:
-        case 0x1e: {
-            uint32_t times;
-
-            c->function = i;
-            c->index = 0;
-            c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
-            times = c->eax;
-
-            for (j = 1; j <= times; ++j) {
-                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-                    fprintf(stderr, "cpuid_data is full, no space for "
-                                "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
-                    abort();
-                }
-                c = &cpuid_data.entries[cpuid_i++];
-                c->function = i;
-                c->index = j;
-                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
-            }
-            break;
-        }
-        default:
-            c->function = i;
-            c->flags = 0;
-            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
-            if (!c->eax && !c->ebx && !c->ecx && !c->edx) {
-                /*
-                 * KVM already returns all zeroes if a CPUID entry is missing,
-                 * so we can omit it and avoid hitting KVM's 80-entry limit.
-                 */
-                cpuid_i--;
-            }
-            break;
-        }
-    }
-
-    if (limit >= 0x0a) {
-        uint32_t eax, edx;
-
-        cpu_x86_cpuid(env, 0x0a, 0, &eax, &unused, &unused, &edx);
-
-        has_architectural_pmu_version = eax & 0xff;
-        if (has_architectural_pmu_version > 0) {
-            num_architectural_pmu_gp_counters = (eax & 0xff00) >> 8;
-
-            /* Shouldn't be more than 32, since that's the number of bits
-             * available in EBX to tell us _which_ counters are available.
-             * Play it safe.
-             */
-            if (num_architectural_pmu_gp_counters > MAX_GP_COUNTERS) {
-                num_architectural_pmu_gp_counters = MAX_GP_COUNTERS;
-            }
-
-            if (has_architectural_pmu_version > 1) {
-                num_architectural_pmu_fixed_counters = edx & 0x1f;
-
-                if (num_architectural_pmu_fixed_counters > MAX_FIXED_COUNTERS) {
-                    num_architectural_pmu_fixed_counters = MAX_FIXED_COUNTERS;
-                }
-            }
-        }
-    }
-
-    cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused);
-
-    for (i = 0x80000000; i <= limit; i++) {
-        if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-            fprintf(stderr, "unsupported xlevel value: 0x%x\n", limit);
-            abort();
-        }
-        c = &cpuid_data.entries[cpuid_i++];
-
-        switch (i) {
-        case 0x8000001d:
-            /* Query for all AMD cache information leaves */
-            for (j = 0; ; j++) {
-                c->function = i;
-                c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
-                c->index = j;
-                cpu_x86_cpuid(env, i, j, &c->eax, &c->ebx, &c->ecx, &c->edx);
-
-                if (c->eax == 0) {
-                    break;
-                }
-                if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-                    fprintf(stderr, "cpuid_data is full, no space for "
-                            "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
-                    abort();
-                }
-                c = &cpuid_data.entries[cpuid_i++];
-            }
-            break;
-        default:
-            c->function = i;
-            c->flags = 0;
-            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
-            if (!c->eax && !c->ebx && !c->ecx && !c->edx) {
-                /*
-                 * KVM already returns all zeroes if a CPUID entry is missing,
-                 * so we can omit it and avoid hitting KVM's 80-entry limit.
-                 */
-                cpuid_i--;
-            }
-            break;
-        }
-    }
-
-    /* Call Centaur's CPUID instructions they are supported. */
-    if (env->cpuid_xlevel2 > 0) {
-        cpu_x86_cpuid(env, 0xC0000000, 0, &limit, &unused, &unused, &unused);
-
-        for (i = 0xC0000000; i <= limit; i++) {
-            if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
-                fprintf(stderr, "unsupported xlevel2 value: 0x%x\n", limit);
-                abort();
-            }
-            c = &cpuid_data.entries[cpuid_i++];
-
-            c->function = i;
-            c->flags = 0;
-            cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
-        }
-    }
-
+    cpuid_i = kvm_x86_arch_cpuid(env, cpuid_data.entries, cpuid_i);
     cpuid_data.cpuid.nent = cpuid_i;
 
     if (((env->cpuid_version >> 8)&0xF) >= 6
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index 769eadbba56c..fd7e76fcf847 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -26,6 +26,9 @@
 #define kvm_ioapic_in_kernel() \
     (kvm_irqchip_in_kernel() && !kvm_irqchip_is_split())
 
+uint32_t kvm_x86_arch_cpuid(CPUX86State *env, struct kvm_cpuid_entry2 *entries,
+                            uint32_t cpuid_i);
+
 #else
 
 #define kvm_pit_in_kernel()      0
-- 
2.34.1



* [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu()
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (11 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 12/58] i386/kvm: Move architectural CPUID leaf generation to separate helper Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-21  8:55   ` Daniel P. Berrangé
  2023-08-29 14:40   ` Philippe Mathieu-Daudé
  2023-08-18  9:49 ` [PATCH v2 14/58] i386/tdx: Initialize TDX before creating TD vcpus Xiaoyao Li
                   ` (44 subsequent siblings)
  57 siblings, 2 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Introduce kvm_arch_pre_create_vcpu() to perform arch-dependent work
before any vcpu is created. i386 TDX needs this because it must call
KVM_TDX_INIT_VM before creating any vcpu.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
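Note for reviewers: the weak-default hook pattern used by this patch, reduced
to a standalone sketch. It relies on the GCC/Clang __attribute__((weak))
extension; arch_pre_create_vcpu() and init_vcpu() are hypothetical names that
mirror, but are not, the QEMU functions.

```c
#include <stdio.h>

/*
 * Generic code: weak default, used when no other object file provides a
 * strong definition of the same symbol.
 */
int __attribute__((weak)) arch_pre_create_vcpu(int cpu_index)
{
    (void)cpu_index;
    return 0;
}

/* Generic caller: run the hook first and bail out on a negative errno. */
static int init_vcpu(int cpu_index)
{
    int ret = arch_pre_create_vcpu(cpu_index);

    if (ret < 0) {
        fprintf(stderr, "%s: arch_pre_create_vcpu() failed: %d\n",
                __func__, ret);
        return ret;
    }

    /* ... vcpu creation would follow here ... */
    return 0;
}
```

An architecture that needs the hook supplies a non-weak definition with the
same signature and the linker picks it over the default; all other
architectures need no stub.
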
 accel/kvm/kvm-all.c  | 12 ++++++++++++
 include/sysemu/kvm.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index c9f3aab5e587..5071af917ae0 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -422,6 +422,11 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
     return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
 }
 
+int __attribute__ ((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu)
+{
+    return 0;
+}
+
 int kvm_init_vcpu(CPUState *cpu, Error **errp)
 {
     KVMState *s = kvm_state;
@@ -430,6 +435,13 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
+    ret = kvm_arch_pre_create_vcpu(cpu);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "%s: kvm_arch_pre_create_vcpu() failed",
+                        __func__);
+        goto err;
+    }
+
     ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
     if (ret < 0) {
         error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 49c896d8a512..d89ec87072d7 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -371,6 +371,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level);
 
 int kvm_arch_init(MachineState *ms, KVMState *s);
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu);
 int kvm_arch_init_vcpu(CPUState *cpu);
 int kvm_arch_destroy_vcpu(CPUState *cpu);
 
-- 
2.34.1



* [PATCH v2 14/58] i386/tdx: Initialize TDX before creating TD vcpus
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (12 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu() Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-21  8:54   ` Daniel P. Berrangé
  2023-08-18  9:49 ` [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object Xiaoyao Li
                   ` (43 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Invoke KVM_TDX_INIT_VM in kvm_arch_pre_create_vcpu(). KVM_TDX_INIT_VM
configures global TD state, e.g. the canonical CPUID config, and must be
executed prior to creating vCPUs.

Use kvm_x86_arch_cpuid() to set up the CPUID settings for the TDX VM.

Note, this doesn't address the fact that QEMU may change the CPUID
configuration when creating vCPUs, i.e. punts on refactoring QEMU to
provide a stable CPUID config prior to kvm_arch_init().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
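Note for reviewers: since the hook runs once per vCPU while the VM-scoped
initialization must run only once, the patch guards it with a lock and an
"initialized" flag. A minimal sketch of that pattern with POSIX mutexes
follows. VmState, do_vm_init() and pre_create_vcpu() are hypothetical names,
and setting the flag only on success is a choice made in this sketch rather
than a statement about the patch.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct VmState {
    pthread_mutex_t lock;
    bool initialized;
    int init_calls;   /* how often the VM-scoped step actually ran */
} VmState;

static VmState vm = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Hypothetical stand-in for the VM-scoped initialization call. */
static int do_vm_init(VmState *s)
{
    s->init_calls++;
    return 0;
}

/* Called once per vCPU; only the first successful call does the work. */
static int pre_create_vcpu(VmState *s)
{
    int r = 0;

    pthread_mutex_lock(&s->lock);
    if (!s->initialized) {
        r = do_vm_init(s);
        if (r == 0) {
            s->initialized = true;
        }
    }
    pthread_mutex_unlock(&s->lock);

    return r;
}
```

Holding the lock across both the check and the initialization keeps two
vCPU threads from both observing "not initialized" and running the step
twice.
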
 accel/kvm/kvm-all.c        |  9 +++++++-
 target/i386/kvm/kvm.c      |  8 +++++++
 target/i386/kvm/tdx-stub.c |  5 +++++
 target/i386/kvm/tdx.c      | 45 ++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h      |  4 ++++
 5 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 5071af917ae0..fceec7f2a83f 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -435,10 +435,17 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
+    /*
+     * tdx_pre_create_vcpu() may call cpu_x86_cpuid(). It in turn may call
+     * kvm_vm_ioctl(). Set cpu->kvm_state in advance to avoid NULL pointer
+     * dereference.
+     */
+    cpu->kvm_state = s;
     ret = kvm_arch_pre_create_vcpu(cpu);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "%s: kvm_arch_pre_create_vcpu() failed",
                         __func__);
+        cpu->kvm_state = NULL;
         goto err;
     }
 
@@ -446,11 +453,11 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
     if (ret < 0) {
         error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
                          kvm_arch_vcpu_id(cpu));
+        cpu->kvm_state = NULL;
         goto err;
     }
 
     cpu->kvm_fd = ret;
-    cpu->kvm_state = s;
     cpu->vcpu_dirty = true;
     cpu->dirty_pages = 0;
     cpu->throttle_us_per_full = 0;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 9ee41fffc445..d51067fdc12a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2331,6 +2331,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
     return r;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu)
+{
+    if (is_tdx_vm())
+        return tdx_pre_create_vcpu(cpu);
+
+    return 0;
+}
+
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
index 1d866d5496bf..61f70cc0d1d9 100644
--- a/target/i386/kvm/tdx-stub.c
+++ b/target/i386/kvm/tdx-stub.c
@@ -6,3 +6,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
 {
     return -EINVAL;
 }
+
+int tdx_pre_create_vcpu(CPUState *cpu)
+{
+    return -EINVAL;
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 29f50fb9529e..3d313ed46bd1 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -458,6 +458,49 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
     return 0;
 }
 
+int tdx_pre_create_vcpu(CPUState *cpu)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    X86CPU *x86cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86cpu->env;
+    struct kvm_tdx_init_vm *init_vm;
+    int r = 0;
+
+    qemu_mutex_lock(&tdx_guest->lock);
+    if (tdx_guest->initialized) {
+        goto out;
+    }
+
+    init_vm = g_malloc0(sizeof(struct kvm_tdx_init_vm) +
+                        sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
+
+    r = kvm_vm_enable_cap(kvm_state, KVM_CAP_MAX_VCPUS, 0, ms->smp.cpus);
+    if (r < 0) {
+        error_report("Unable to set MAX VCPUS to %d", ms->smp.cpus);
+        goto out_free;
+    }
+
+    init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
+
+    init_vm->attributes = tdx_guest->attributes;
+
+    do {
+        r = tdx_vm_ioctl(KVM_TDX_INIT_VM, 0, init_vm);
+    } while (r == -EAGAIN);
+    if (r < 0) {
+        error_report("KVM_TDX_INIT_VM failed %s", strerror(-r));
+        goto out_free;
+    }
+
+    tdx_guest->initialized = true;
+
+out_free:
+    g_free(init_vm);
+out:
+    qemu_mutex_unlock(&tdx_guest->lock);
+    return r;
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
@@ -470,6 +513,8 @@ static void tdx_guest_init(Object *obj)
 {
     TdxGuest *tdx = TDX_GUEST(obj);
 
+    qemu_mutex_init(&tdx->lock);
+
     tdx->attributes = 0;
 }
 
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 06599b65b827..46a24ee8c7cc 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -17,6 +17,9 @@ typedef struct TdxGuestClass {
 typedef struct TdxGuest {
     ConfidentialGuestSupport parent_obj;
 
+    QemuMutex lock;
+
+    bool initialized;
     uint64_t attributes;    /* TD attributes */
 } TdxGuest;
 
@@ -29,5 +32,6 @@ bool is_tdx_vm(void);
 int tdx_kvm_init(MachineState *ms, Error **errp);
 void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
                              uint32_t *ret);
+int tdx_pre_create_vcpu(CPUState *cpu);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (13 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 14/58] i386/tdx: Initialize TDX before creating TD vcpus Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-21  8:59   ` Daniel P. Berrangé
  2023-08-18  9:49 ` [PATCH v2 16/58] i386/tdx: Make sept_ve_disable set by default Xiaoyao Li
                   ` (42 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Bit 28 of the TD attributes is named SEPT_VE_DISABLE. When set to 1, it
disables the conversion of EPT violations to #VE on guest TD access to
PENDING pages.

Some guest OSes (e.g., a Linux TD guest) may require this bit to be 1 and
refuse to boot otherwise.

Add a sept-ve-disable property to the tdx-guest object so that the user can
configure this bit.
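
For illustration only, the bit handling behind such a boolean property can
be sketched as plain C outside of QOM. The helper names below are
hypothetical and are not part of this patch; only the bit position (28)
comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 28 of the TD attributes, as described in this patch. */
#define SEPT_VE_DISABLE_BIT (UINT64_C(1) << 28)

/* Hypothetical helper: report whether SEPT_VE_DISABLE is set. */
static bool attrs_get_sept_ve_disable(uint64_t attrs)
{
    return (attrs & SEPT_VE_DISABLE_BIT) != 0;
}

/* Hypothetical helper: return attrs with SEPT_VE_DISABLE set or cleared. */
static uint64_t attrs_set_sept_ve_disable(uint64_t attrs, bool value)
{
    return value ? (attrs | SEPT_VE_DISABLE_BIT)
                 : (attrs & ~SEPT_VE_DISABLE_BIT);
}
```

All other attribute bits are left untouched, so the property can be set
independently of anything else stored in the attributes word.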

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 qapi/qom.json         |  4 +++-
 target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 2ca7ce7c0da5..cc08b9a98df9 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -871,10 +871,12 @@
 #
 # Properties for tdx-guest objects.
 #
+# @sept-ve-disable: bit 28 of TD attributes (default: 0)
+#
 # Since: 8.2
 ##
 { 'struct': 'TdxGuestProperties',
-  'data': { }}
+  'data': { '*sept-ve-disable': 'bool' } }
 
 ##
 # @ThreadContextProperties:
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 3d313ed46bd1..22130382c0c5 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -32,6 +32,8 @@
                                      (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
                                      (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
 
+#define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
+
 #define TDX_ATTRIBUTES_MAX_BITS      64
 
 static FeatureMask tdx_attrs_ctrl_fields[TDX_ATTRIBUTES_MAX_BITS] = {
@@ -501,6 +503,24 @@ out:
     return r;
 }
 
+static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+
+    return !!(tdx->attributes & TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE);
+}
+
+static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error **errp)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+
+    if (value) {
+        tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE;
+    } else {
+        tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE;
+    }
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
@@ -516,6 +536,10 @@ static void tdx_guest_init(Object *obj)
     qemu_mutex_init(&tdx->lock);
 
     tdx->attributes = 0;
+
+    object_property_add_bool(obj, "sept-ve-disable",
+                             tdx_guest_get_sept_ve_disable,
+                             tdx_guest_set_sept_ve_disable);
 }
 
 static void tdx_guest_finalize(Object *obj)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 16/58] i386/tdx: Make sept_ve_disable set by default
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (14 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object Xiaoyao Li
@ 2023-08-18  9:49 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 17/58] i386/tdx: Wire CPU features up with attributes of TD guest Xiaoyao Li
                   ` (41 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:49 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

For the TDX KVM use case, a Linux guest is the most common one.  It requires
sept_ve_disable to be set, so make that the default.  For other use cases, it
can be enabled or disabled via the QEMU command line.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 target/i386/kvm/tdx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 22130382c0c5..153a75a02599 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -535,7 +535,7 @@ static void tdx_guest_init(Object *obj)
 
     qemu_mutex_init(&tdx->lock);
 
-    tdx->attributes = 0;
+    tdx->attributes = TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE;
 
     object_property_add_bool(obj, "sept-ve-disable",
                              tdx_guest_get_sept_ve_disable,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 17/58] i386/tdx: Wire CPU features up with attributes of TD guest
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (15 preceding siblings ...)
  2023-08-18  9:49 ` [PATCH v2 16/58] i386/tdx: Make sept_ve_disable set by default Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 18/58] i386/tdx: Validate TD attributes Xiaoyao Li
                   ` (40 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

For QEMU VMs, PKS is configured via CPUID_7_0_ECX_PKS and PMU is
configured by x86cpu->enable_pmu. Reuse the existing configuration
interface for TDX VMs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 153a75a02599..629abd267da8 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -33,6 +33,8 @@
                                      (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
 
 #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
+#define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
+#define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
 
 #define TDX_ATTRIBUTES_MAX_BITS      64
 
@@ -460,6 +462,15 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
     return 0;
 }
 
+static void setup_td_guest_attributes(X86CPU *x86cpu)
+{
+    CPUX86State *env = &x86cpu->env;
+
+    tdx_guest->attributes |= (env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS) ?
+                             TDX_TD_ATTRIBUTES_PKS : 0;
+    tdx_guest->attributes |= x86cpu->enable_pmu ? TDX_TD_ATTRIBUTES_PERFMON : 0;
+}
+
 int tdx_pre_create_vcpu(CPUState *cpu)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
@@ -482,8 +493,9 @@ int tdx_pre_create_vcpu(CPUState *cpu)
         goto out_free;
     }
 
+    setup_td_guest_attributes(x86cpu);
+
     init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
-
     init_vm->attributes = tdx_guest->attributes;
 
     do {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 18/58] i386/tdx: Validate TD attributes
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (16 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 17/58] i386/tdx: Wire CPU features up with attributes of TD guest Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:16   ` Daniel P. Berrangé
  2023-08-18  9:50 ` [PATCH v2 19/58] qom: implement property helper for sha384 Xiaoyao Li
                   ` (39 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Validate TD attributes against tdx_caps: fixed-0 bits must be zero and
fixed-1 bits must be set.

Also reject attribute bits that QEMU does not support yet, e.g. the debug
bit. It will be allowed once debug TD support lands in QEMU.
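
As a rough standalone sketch of the fixed-0/fixed-1 check (the function
name is hypothetical; the expression mirrors the one used in the diff
below), assuming fixed0 is the mask of bits that are allowed to be 1 and
fixed1 is the mask of bits that must be 1:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * fixed0: mask of bits that may be 1 (a clear bit here must stay 0).
 * fixed1: mask of bits that must be 1.
 *
 * Masking with fixed0 drops any forbidden bit and OR-ing fixed1 adds any
 * missing mandatory bit, so the result equals attrs only when attrs
 * already satisfies both constraints.
 */
static bool td_attrs_valid(uint64_t attrs, uint64_t fixed0, uint64_t fixed1)
{
    return ((attrs & fixed0) | fixed1) == attrs;
}
```

With fixed0 allowing only bits 28, 30 and 63 and fixed1 equal to 0, an
attributes value with only bit 28 set passes, while a value with bit 0
set is rejected.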

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 629abd267da8..73da15377ec3 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -32,6 +32,7 @@
                                      (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
                                      (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
 
+#define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
 #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
 #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
 #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
@@ -462,13 +463,32 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
     return 0;
 }
 
-static void setup_td_guest_attributes(X86CPU *x86cpu)
+static int tdx_validate_attributes(TdxGuest *tdx)
+{
+    if (((tdx->attributes & tdx_caps->attrs_fixed0) | tdx_caps->attrs_fixed1) !=
+        tdx->attributes) {
+            error_report("Invalid attributes 0x%lx for TDX VM (fixed0 0x%llx, fixed1 0x%llx)",
+                          tdx->attributes, tdx_caps->attrs_fixed0, tdx_caps->attrs_fixed1);
+            return -EINVAL;
+    }
+
+    if (tdx->attributes & TDX_TD_ATTRIBUTES_DEBUG) {
+        error_report("Current QEMU doesn't support attributes.debug[bit 0] for TDX VM");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static int setup_td_guest_attributes(X86CPU *x86cpu)
 {
     CPUX86State *env = &x86cpu->env;
 
     tdx_guest->attributes |= (env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS) ?
                              TDX_TD_ATTRIBUTES_PKS : 0;
     tdx_guest->attributes |= x86cpu->enable_pmu ? TDX_TD_ATTRIBUTES_PERFMON : 0;
+
+    return tdx_validate_attributes(tdx_guest);
 }
 
 int tdx_pre_create_vcpu(CPUState *cpu)
@@ -493,7 +513,10 @@ int tdx_pre_create_vcpu(CPUState *cpu)
         goto out_free;
     }
 
-    setup_td_guest_attributes(x86cpu);
+    r = setup_td_guest_attributes(x86cpu);
+    if (r) {
+        goto out;
+    }
 
     init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
     init_vm->attributes = tdx_guest->attributes;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 19/58] qom: implement property helper for sha384
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (17 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 18/58] i386/tdx: Validate TD attributes Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:25   ` Daniel P. Berrangé
  2023-08-18  9:50 ` [PATCH v2 20/58] i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM Xiaoyao Li
                   ` (38 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

Implement object_property_add_sha384(), which converts between a hex string
and a uint8_t[48].  It will be used by TDX, which uses SHA-384 for measurement.
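
The conversion itself can be sketched without QOM or visitors as follows.
This is only an illustration with hypothetical helper names; unlike the
setter in this patch, it parses into a temporary buffer so that a failed
parse leaves the destination untouched, and it avoids sscanf().

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHA384_DIGEST_SIZE 48

/* Map one hex character to its value; the caller has checked isxdigit(). */
static uint8_t hex_nibble(char c)
{
    if (c >= '0' && c <= '9') {
        return (uint8_t)(c - '0');
    }
    return (uint8_t)(tolower((unsigned char)c) - 'a' + 10);
}

/* Parse exactly 96 hex characters into out[48]; 0 on success, -1 on error. */
static int sha384_from_hex(const char *str, uint8_t out[SHA384_DIGEST_SIZE])
{
    uint8_t tmp[SHA384_DIGEST_SIZE];
    size_t i;

    if (strlen(str) != SHA384_DIGEST_SIZE * 2) {
        return -1;
    }
    for (i = 0; i < SHA384_DIGEST_SIZE; i++) {
        unsigned char hi = (unsigned char)str[i * 2];
        unsigned char lo = (unsigned char)str[i * 2 + 1];

        if (!isxdigit(hi) || !isxdigit(lo)) {
            return -1;
        }
        tmp[i] = (uint8_t)((hex_nibble((char)hi) << 4) | hex_nibble((char)lo));
    }
    memcpy(out, tmp, sizeof(tmp));
    return 0;
}

/* Format digest[48] as 96 lowercase hex characters plus a terminating NUL. */
static void sha384_to_hex(const uint8_t digest[SHA384_DIGEST_SIZE],
                          char out[SHA384_DIGEST_SIZE * 2 + 1])
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < SHA384_DIGEST_SIZE; i++) {
        out[i * 2] = digits[digest[i] >> 4];
        out[i * 2 + 1] = digits[digest[i] & 0xf];
    }
    out[SHA384_DIGEST_SIZE * 2] = '\0';
}
```

Parsing a valid 96-character lowercase string and formatting the result
again yields the original string.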

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 include/qom/object.h | 17 ++++++++++
 qom/object.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 93 insertions(+)

diff --git a/include/qom/object.h b/include/qom/object.h
index ef7258a5e149..70399a5b1940 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -1887,6 +1887,23 @@ ObjectProperty *object_property_add_alias(Object *obj, const char *name,
 ObjectProperty *object_property_add_const_link(Object *obj, const char *name,
                                                Object *target);
 
+
+/**
+ * object_property_add_sha384:
+ * @obj: the object to add a property to
+ * @name: the name of the property
+ * @v: pointer to value
+ * @flags: bitwise-or'd ObjectPropertyFlags
+ *
+ * Add an sha384 property in memory.  This function will add a
+ * property of type 'sha384'.
+ *
+ * Returns: The newly added property on success, or %NULL on failure.
+ */
+ObjectProperty * object_property_add_sha384(Object *obj, const char *name,
+                                            const uint8_t *v,
+                                            ObjectPropertyFlags flags);
+
 /**
  * object_property_set_description:
  * @obj: the object owning the property
diff --git a/qom/object.c b/qom/object.c
index e25f1e96db1e..e71ce46ed576 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -15,6 +15,7 @@
 #include "qapi/error.h"
 #include "qom/object.h"
 #include "qom/object_interfaces.h"
+#include "qemu/ctype.h"
 #include "qemu/cutils.h"
 #include "qemu/memalign.h"
 #include "qapi/visitor.h"
@@ -2781,6 +2782,81 @@ object_property_add_alias(Object *obj, const char *name,
     return op;
 }
 
+#define SHA384_DIGEST_SIZE      48
+static void property_get_sha384(Object *obj, Visitor *v, const char *name,
+                                void *opaque, Error **errp)
+{
+    uint8_t *value = (uint8_t *)opaque;
+    char str[SHA384_DIGEST_SIZE * 2 + 1];
+    char *str_ = (char*)str;
+    size_t i;
+
+    for (i = 0; i < SHA384_DIGEST_SIZE; i++) {
+        char *buf;
+        buf = &str[i * 2];
+
+        sprintf(buf, "%02hhx", value[i]);
+    }
+    str[SHA384_DIGEST_SIZE * 2] = '\0';
+
+    visit_type_str(v, name, &str_, errp);
+}
+
+static void property_set_sha384(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    uint8_t *value = (uint8_t *)opaque;
+    char* str;
+    size_t len;
+    size_t i;
+
+    if (!visit_type_str(v, name, &str, errp)) {
+        goto err;
+    }
+
+    len = strlen(str);
+    if (len != SHA384_DIGEST_SIZE * 2) {
+        error_setg(errp, "invalid length for sha384 hex string %s. "
+                   "it must be 48 * 2 hex", name);
+        goto err;
+    }
+
+    for (i = 0; i < SHA384_DIGEST_SIZE; i++) {
+        if (!qemu_isxdigit(str[i * 2]) || !qemu_isxdigit(str[i * 2 + 1])) {
+            error_setg(errp, "invalid char for sha384 hex string %s at %c%c",
+                       name, str[i * 2], str[i * 2 + 1]);
+            goto err;
+        }
+
+        if (sscanf(str + i * 2, "%02hhx", &value[i]) != 1) {
+            error_setg(errp, "invalid format for sha384 hex string %s", name);
+            goto err;
+        }
+    }
+
+err:
+    g_free(str);
+}
+
+ObjectProperty *
+object_property_add_sha384(Object *obj, const char *name,
+                           const uint8_t *v, ObjectPropertyFlags flags)
+{
+    ObjectPropertyAccessor *getter = NULL;
+    ObjectPropertyAccessor *setter = NULL;
+
+    if ((flags & OBJ_PROP_FLAG_READ) == OBJ_PROP_FLAG_READ) {
+        getter = property_get_sha384;
+    }
+
+    if ((flags & OBJ_PROP_FLAG_WRITE) == OBJ_PROP_FLAG_WRITE) {
+        setter = property_set_sha384;
+    }
+
+    return object_property_add(obj, name, "sha384",
+                               getter, setter, NULL, (void *)v);
+}
+
 void object_property_set_description(Object *obj, const char *name,
                                      const char *description)
 {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 20/58] i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (18 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 19/58] qom: implement property helper for sha384 Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:29   ` Daniel P. Berrangé
  2023-08-18  9:50 ` [PATCH v2 21/58] i386/tdx: Implement user specified tsc frequency Xiaoyao Li
                   ` (37 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

When creating a TDX VM, three SHA-384 hash values can be provided for
TDX attestation.

So far they were hard-coded as 0. Now allow the user to specify those values
via the properties mrconfigid, mrowner and mrownerconfig. Use a hex-encoded
string as the format since it is easy for the user to input.

Example:
-object tdx-guest, \
  mrconfigid=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef, \
  mrowner=fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210, \
  mrownerconfig=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
TODO:
 - community requests to use base64 encoding if no special reason
---
 qapi/qom.json         | 11 ++++++++++-
 target/i386/kvm/tdx.c | 13 +++++++++++++
 target/i386/kvm/tdx.h |  3 +++
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index cc08b9a98df9..87c1d440f331 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -873,10 +873,19 @@
 #
 # @sept-ve-disable: bit 28 of TD attributes (default: 0)
 #
+# @mrconfigid: MRCONFIGID SHA384 hex string of 48 * 2 length (default: 0)
+#
+# @mrowner: MROWNER SHA384 hex string of 48 * 2 length (default: 0)
+#
+# @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
+#
 # Since: 8.2
 ##
 { 'struct': 'TdxGuestProperties',
-  'data': { '*sept-ve-disable': 'bool' } }
+  'data': { '*sept-ve-disable': 'bool',
+            '*mrconfigid': 'str',
+            '*mrowner': 'str',
+            '*mrownerconfig': 'str' } }
 
 ##
 # @ThreadContextProperties:
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 73da15377ec3..33d015a08c34 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -521,6 +521,13 @@ int tdx_pre_create_vcpu(CPUState *cpu)
     init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
     init_vm->attributes = tdx_guest->attributes;
 
+    QEMU_BUILD_BUG_ON(sizeof(init_vm->mrconfigid) != sizeof(tdx_guest->mrconfigid));
+    QEMU_BUILD_BUG_ON(sizeof(init_vm->mrowner) != sizeof(tdx_guest->mrowner));
+    QEMU_BUILD_BUG_ON(sizeof(init_vm->mrownerconfig) != sizeof(tdx_guest->mrownerconfig));
+    memcpy(init_vm->mrconfigid, tdx_guest->mrconfigid, sizeof(tdx_guest->mrconfigid));
+    memcpy(init_vm->mrowner, tdx_guest->mrowner, sizeof(tdx_guest->mrowner));
+    memcpy(init_vm->mrownerconfig, tdx_guest->mrownerconfig, sizeof(tdx_guest->mrownerconfig));
+
     do {
         r = tdx_vm_ioctl(KVM_TDX_INIT_VM, 0, init_vm);
     } while (r == -EAGAIN);
@@ -575,6 +582,12 @@ static void tdx_guest_init(Object *obj)
     object_property_add_bool(obj, "sept-ve-disable",
                              tdx_guest_get_sept_ve_disable,
                              tdx_guest_set_sept_ve_disable);
+    object_property_add_sha384(obj, "mrconfigid", tdx->mrconfigid,
+                               OBJ_PROP_FLAG_READWRITE);
+    object_property_add_sha384(obj, "mrowner", tdx->mrowner,
+                               OBJ_PROP_FLAG_READWRITE);
+    object_property_add_sha384(obj, "mrownerconfig", tdx->mrownerconfig,
+                               OBJ_PROP_FLAG_READWRITE);
 }
 
 static void tdx_guest_finalize(Object *obj)
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 46a24ee8c7cc..68f8327f2231 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -21,6 +21,9 @@ typedef struct TdxGuest {
 
     bool initialized;
     uint64_t attributes;    /* TD attributes */
+    uint8_t mrconfigid[48];     /* sha384 digest */
+    uint8_t mrowner[48];        /* sha384 digest */
+    uint8_t mrownerconfig[48];  /* sha384 digest */
 } TdxGuest;
 
 #ifdef CONFIG_TDX
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 21/58] i386/tdx: Implement user specified tsc frequency
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (19 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 20/58] i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:30   ` Daniel P. Berrangé
  2023-08-18  9:50 ` [PATCH v2 22/58] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM Xiaoyao Li
                   ` (36 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Reuse "-cpu,tsc-frequency=" to get the TSC frequency requested by the user,
and call the VM-scope KVM_SET_TSC_KHZ to set the TSC frequency of the TD
before KVM_TDX_INIT_VM.

Also sanity-check that the TSC frequency is within the legal range and has
the legal granularity (required by the TDX module).
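
The range and granularity check can be sketched on its own as below. The
limits and the 25 MHz step are taken from the diff in this patch; the
function name and the TDX_TSC_GRANULARITY_KHZ macro are hypothetical. A
value of 0 means "not specified" and is accepted, in which case KVM uses
the host's TSC frequency.

```c
#include <stdbool.h>
#include <stdint.h>

#define TDX_MIN_TSC_FREQUENCY_KHZ   (100 * 1000)
#define TDX_MAX_TSC_FREQUENCY_KHZ   (10 * 1000 * 1000)
#define TDX_TSC_GRANULARITY_KHZ     (25 * 1000)   /* hypothetical name */

/* Return true if tsc_khz is acceptable for a TD; 0 means "use the host's". */
static bool tdx_tsc_khz_valid(int64_t tsc_khz)
{
    if (tsc_khz == 0) {
        return true;
    }
    if (tsc_khz < TDX_MIN_TSC_FREQUENCY_KHZ ||
        tsc_khz > TDX_MAX_TSC_FREQUENCY_KHZ) {
        return false;
    }
    /* The frequency must be a multiple of 25 MHz. */
    return tsc_khz % TDX_TSC_GRANULARITY_KHZ == 0;
}
```

For example, 2500000 kHz (2.5 GHz) is accepted, while 2410000 kHz is
rejected because it is not a multiple of 25 MHz.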

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes from RFC v4:
  - Use the VM-scope KVM_SET_TSC_KHZ to set the TSC frequency of the TD, since
    the KVM side dropped the @tsc_khz field from struct kvm_tdx_init_vm
---
 target/i386/kvm/kvm.c |  9 +++++++++
 target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d51067fdc12a..4a146bc42f63 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -859,6 +859,15 @@ static int kvm_arch_set_tsc_khz(CPUState *cs)
     int r, cur_freq;
     bool set_ioctl = false;
 
+    /*
+     * The TSC of a TD vcpu is immutable. It cannot be set or changed via the
+     * vcpu-scope KVM_SET_TSC_KHZ; it can only be initialized via the VM-scope
+     * KVM_SET_TSC_KHZ before ioctl KVM_TDX_INIT_VM in tdx_pre_create_vcpu().
+     */
+    if (is_tdx_vm()) {
+        return 0;
+    }
+
     if (!env->tsc_khz) {
         return 0;
     }
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 33d015a08c34..a72badfbfd65 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -32,6 +32,9 @@
                                      (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
                                      (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
 
+#define TDX_MIN_TSC_FREQUENCY_KHZ   (100 * 1000)
+#define TDX_MAX_TSC_FREQUENCY_KHZ   (10 * 1000 * 1000)
+
 #define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
 #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
 #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
@@ -513,6 +516,27 @@ int tdx_pre_create_vcpu(CPUState *cpu)
         goto out_free;
     }
 
+    r = -EINVAL;
+    if (env->tsc_khz && (env->tsc_khz < TDX_MIN_TSC_FREQUENCY_KHZ ||
+                         env->tsc_khz > TDX_MAX_TSC_FREQUENCY_KHZ)) {
+        error_report("Invalid TSC %" PRId64 " kHz, tsc-frequency must be within [%d, %d] kHz",
+                      env->tsc_khz, TDX_MIN_TSC_FREQUENCY_KHZ,
+                      TDX_MAX_TSC_FREQUENCY_KHZ);
+        goto out;
+    }
+
+    if (env->tsc_khz % (25 * 1000)) {
+        error_report("Invalid TSC %" PRId64 " kHz, it must be a multiple of 25 MHz", env->tsc_khz);
+        goto out;
+    }
+
+    /* Safe even if env->tsc_khz is 0: KVM then uses the host's tsc_khz. */
+    r = kvm_vm_ioctl(kvm_state, KVM_SET_TSC_KHZ, env->tsc_khz);
+    if (r < 0) {
+        error_report("Unable to set TSC frequency to %" PRId64 " kHz", env->tsc_khz);
+        goto out;
+    }
+
     r = setup_td_guest_attributes(x86cpu);
     if (r) {
         goto out;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 22/58] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (20 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 21/58] i386/tdx: Implement user specified tsc frequency Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 23/58] i386/tdx: Make memory type private by default Xiaoyao Li
                   ` (35 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

TDX only supports readonly for shared memory, not for private memory.

QEMU has no idea whether a memslot is used as shared or private memory.
Thus, for simplicity, just set kvm_readonly_mem_allowed to false for a
TDX VM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index a72badfbfd65..8a2491ed03c2 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -461,6 +461,15 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
 
     update_tdx_cpuid_lookup_by_tdx_caps();
 
+    /*
+     * Set kvm_readonly_mem_allowed to false, because TDX only supports readonly
+     * memory for shared memory but not for private memory. Besides, whether a
+     * memslot is private or shared is not determined by QEMU.
+     *
+     * Thus, just mark readonly memory not supported for simplicity.
+     */
+    kvm_readonly_mem_allowed = false;
+
     tdx_guest = tdx;
 
     return 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 23/58] i386/tdx: Make memory type private by default
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (21 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 22/58] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 24/58] i386/tdx: Create kvm gmem for TD Xiaoyao Li
                   ` (34 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

By default (due to the recent UPM change), the restricted memory attribute
is shared.  Convert the memory region from shared to private at memory
slot creation time.

Add a KVM region registering function to check the flag and convert the
region, and add a memory listener to the TDX guest code to set the flag
on the eligible memory regions.

Without this patch
- Secure-EPT violation on private area
- KVM_MEMORY_FAULT EXIT (kvm -> qemu)
- qemu converts the 4K page from shared to private
- Resume VCPU execution
- Secure-EPT violation again
- KVM resolves EPT Violation
This also prevents huge pages because page conversion is done at 4K
granularity.  Although it's possible to merge 4K private mappings into a
2M large page, that slows guest boot.

With this patch
- After memory slot creation, convert the region from shared to private
- Secure-EPT violation on private area.
- KVM resolves EPT Violation
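
To put a number on the difference, here is a back-of-the-envelope model.
It assumes every 4K page of a region is touched once and that each first
touch costs one exit to QEMU, as in the "without this patch" flow above.
The function names are hypothetical and the model ignores everything
else that happens on a fault.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_4K 4096u

/* Exits to userspace when each 4K page is converted lazily on first touch. */
static uint64_t lazy_conversion_exits(uint64_t region_size)
{
    assert(region_size % SKETCH_PAGE_4K == 0);
    return region_size / SKETCH_PAGE_4K;
}

/* Converting the whole region at memslot creation needs no per-page exit. */
static uint64_t upfront_conversion_exits(uint64_t region_size)
{
    (void)region_size;
    return 0;
}
```

Under these assumptions a single 2M range costs 512 round trips when
converted lazily, and none when the region is converted up front.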

Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 8a2491ed03c2..775110f8bd02 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -18,6 +18,7 @@
 #include "standard-headers/asm-x86/kvm_para.h"
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
+#include "exec/address-spaces.h"
 
 #include "hw/i386/x86.h"
 #include "kvm_i386.h"
@@ -578,6 +579,21 @@ out:
     return r;
 }
 
+static void tdx_guest_region_add(MemoryListener *listener,
+                                 MemoryRegionSection *section)
+{
+    if (memory_region_can_be_private(section->mr)) {
+        memory_region_set_default_private(section->mr);
+    }
+}
+
+static MemoryListener tdx_memory_listener = {
+    .name = TYPE_TDX_GUEST,
+    .region_add = tdx_guest_region_add,
+    /* Higher than KVM memory listener = 10. */
+    .priority = MEMORY_LISTENER_PRIORITY_ACCEL_HIGH,
+};
+
 static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp)
 {
     TdxGuest *tdx = TDX_GUEST(obj);
@@ -607,6 +623,12 @@ OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
 static void tdx_guest_init(Object *obj)
 {
     TdxGuest *tdx = TDX_GUEST(obj);
+    static bool memory_listener_registered = false;
+
+    if (!memory_listener_registered) {
+        memory_listener_register(&tdx_memory_listener, &address_space_memory);
+        memory_listener_registered = true;
+    }
 
     qemu_mutex_init(&tdx->lock);
 
-- 
2.34.1



* [PATCH v2 24/58] i386/tdx: Create kvm gmem for TD
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (22 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 23/58] i386/tdx: Make memory type private by default Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 25/58] kvm/tdx: Don't complain when converting vMMIO region to shared Xiaoyao Li
                   ` (33 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

Allocate private gmem for the TD guest if the MemoryRegion belongs to a
memory backend that has the private property turned on.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 775110f8bd02..f1305191e939 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -19,6 +19,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "exec/address-spaces.h"
+#include "exec/ramblock.h"
 
 #include "hw/i386/x86.h"
 #include "kvm_i386.h"
@@ -582,8 +583,30 @@ out:
 static void tdx_guest_region_add(MemoryListener *listener,
                                  MemoryRegionSection *section)
 {
-    if (memory_region_can_be_private(section->mr)) {
-        memory_region_set_default_private(section->mr);
+    MemoryRegion *mr = section->mr;
+    Object *owner = memory_region_owner(mr);
+
+    if (owner && object_dynamic_cast(owner, TYPE_MEMORY_BACKEND) &&
+        object_property_get_bool(owner, "private", NULL) &&
+        mr->ram_block && mr->ram_block->gmem_fd < 0) {
+        struct kvm_create_guest_memfd gmem = {
+            .size = memory_region_size(mr),
+            /* TODO: add property to hostmem backend for huge pmd */
+            .flags = KVM_GUEST_MEMFD_ALLOW_HUGEPAGE,
+        };
+        int fd;
+
+        fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &gmem);
+        if (fd < 0) {
+            fprintf(stderr, "%s: error creating gmem: %s\n", __func__,
+                    strerror(-fd));
+            abort();
+        }
+        memory_region_set_gmem_fd(mr, fd);
+    }
+
+    if (memory_region_can_be_private(mr)) {
+        memory_region_set_default_private(mr);
     }
 }
 
-- 
2.34.1



* [PATCH v2 25/58] kvm/tdx: Don't complain when converting vMMIO region to shared
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (23 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 24/58] i386/tdx: Create kvm gmem for TD Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:34   ` Daniel P. Berrangé
  2023-08-18  9:50 ` [PATCH v2 26/58] kvm/tdx: Ignore memory conversion to shared of unassigned region Xiaoyao Li
                   ` (32 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

Because a vMMIO region needs to be shared, the guest TD may explicitly
convert such a region from private to shared.  Don't complain about such
conversions.
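
The rule can be read as a single predicate. The sketch below models it
on a plain struct rather than on QEMU's MemoryRegion; struct region_kind
and both function names are hypothetical, and the four booleans stand in
for the results of the memory_region_is_*() checks used in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the memory_region_is_*() results. */
struct region_kind {
    bool is_ram;
    bool is_ram_device;
    bool is_rom;
    bool is_romd;
};

/* True when the region is plain MMIO, i.e. none of the RAM/ROM kinds. */
static bool region_is_mmio(const struct region_kind *k)
{
    assert(k != NULL);
    return !(k->is_ram || k->is_ram_device || k->is_rom || k->is_romd);
}

/* Warn unless this is an MMIO region being converted to shared. */
static bool conversion_deserves_warning(const struct region_kind *k,
                                        bool to_private)
{
    return to_private || !region_is_mmio(k);
}
```

Only the combination "MMIO region, converted to shared" is silent; any
conversion to private, or any conversion of a RAM-like region that has
no guest-memfd backing, still produces the warning.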

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 accel/kvm/kvm-all.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index fceec7f2a83f..9d0aa8c97feb 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3094,8 +3094,24 @@ static int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
          */
         ram_block_convert_range(rb, offset, size, to_private);
     } else {
-        warn_report("Convert non guest-memfd backed memory region (0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") to %s",
-                    start, size, to_private ? "private" : "shared");
+        MemoryRegion *mr = section.mr;
+
+        /*
+         * Because vMMIO region must be shared, guest TD may convert vMMIO
+         * region to shared explicitly.  Don't complain about such a case.  See
+         * memory_region_type() for checking if the region is MMIO region.
+         */
+        if (to_private ||
+            memory_region_is_ram(mr) ||
+            memory_region_is_ram_device(mr) ||
+            memory_region_is_rom(mr) ||
+            memory_region_is_romd(mr)) {
+            warn_report("Convert non guest-memfd backed memory region (0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") of %s to %s",
+                        start, size, mr->name, to_private ? "private" : "shared");
+        } else {
+            ret = 0;
+        }
+
     }
 
     memory_region_unref(section.mr);
-- 
2.34.1



* [PATCH v2 26/58] kvm/tdx: Ignore memory conversion to shared of unassigned region
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (24 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 25/58] kvm/tdx: Don't complain when converting vMMIO region to shared Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 27/58] i386/tdvf: Introduce function to parse TDVF metadata Xiaoyao Li
                   ` (31 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

TDX requires vMMIO regions to be shared.  For KVM, an MMIO region is a
region that no kvm memslot is assigned to (except for in-kernel
emulation).  qemu has a memory region for vMMIO at each device level.

While OVMF conservatively issues MapGPA(to-shared) on the 32bit PCI MMIO
region, qemu doesn't find a corresponding vMMIO region because this
happens before PCI device allocation, and memory_region_find() finds the
device region, not the PCI bus region.  It's safe to ignore
MapGPA(to-shared) because when the guest accesses those regions it uses
GPAs with the shared bit set for vMMIO.  Ignore memory conversion
requests of non-assigned regions to shared and return success.
Otherwise OVMF is confused and panics there.
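
A minimal model of this policy, assuming a flat table of assigned ranges
in place of QEMU's memory region tree. struct gpa_range, find_range()
and convert_memory_result() are hypothetical names, and -1 is only an
assumed failure value; the real conversion of assigned ranges is elided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one assigned guest-physical range. */
struct gpa_range {
    uint64_t start;
    uint64_t size;
};

/* Return the range containing gpa, or NULL if nothing is assigned there. */
static const struct gpa_range *find_range(const struct gpa_range *map,
                                          size_t n, uint64_t gpa)
{
    size_t i;

    assert(map != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (gpa >= map[i].start && gpa - map[i].start < map[i].size) {
            return &map[i];
        }
    }
    return NULL;
}

/*
 * An unassigned range converted to shared is reported as success (0);
 * converted to private it fails (-1).  Assigned ranges are reported as
 * 0 here because their handling is outside the scope of this sketch.
 */
static int convert_memory_result(const struct gpa_range *map, size_t n,
                                 uint64_t gpa, bool to_private)
{
    if (!find_range(map, n, gpa)) {
        return to_private ? -1 : 0;
    }
    return 0;
}
```

With only low RAM assigned, a to-shared request for an address in the
PCI hole succeeds silently, while a to-private request for the same
address is still rejected.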

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 accel/kvm/kvm-all.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 9d0aa8c97feb..8d53c89e9dbf 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3070,6 +3070,18 @@ static int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
     trace_kvm_convert_memory(start, size, to_private ? "shared_to_private" : "private_to_shared");
     section = memory_region_find(get_system_memory(), start, size);
     if (!section.mr) {
+        /*
+         * Ignore converting non-assigned region to shared.
+         *
+         * TDX requires vMMIO region to be shared to inject #VE to guest.
+         * OVMF issues conservatively MapGPA(shared) on 32bit PCI MMIO region,
+         * and vIO-APIC 0xFEC00000 4K page.
+         * OVMF assigns 32bit PCI MMIO region to
+         * [top of low memory: typically 2GB=0x80000000,  0xFC000000)
+         */
+        if (!to_private) {
+            ret = 0;
+        }
         return ret;
     }
 
-- 
2.34.1



* [PATCH v2 27/58] i386/tdvf: Introduce function to parse TDVF metadata
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (25 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 26/58] kvm/tdx: Ignore memory conversion to shared of unassigned region Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 28/58] i386/tdx: Parse TDVF metadata for TDX VM Xiaoyao Li
                   ` (30 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

A TDX VM needs to boot with its specialized firmware, Trusted Domain
Virtual Firmware (TDVF). QEMU needs to parse TDVF and map it in TD
guest memory prior to running the TDX VM.

The TDVF metadata in a TDVF image describes the structure of the firmware.
QEMU refers to it to set up memory for TDVF. Introduce function
tdvf_parse_metadata() to parse the metadata from the TDVF image and store
the info of each TDVF section.

TDX metadata is located by a TDX metadata offset block, which is a
GUID-ed structure. The data portion of the GUID structure contains
only a 4-byte field that is the offset of the TDX metadata from the end
of the firmware file.
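
The lookup described above can be sketched as follows. This is an
illustration under stated assumptions, not the code in this patch: the
4-byte value is passed in directly instead of being found through the
OVMF GUID table, the helper names and the SKETCH_ macros are
hypothetical, and the extra check that the value does not exceed the
image size is an addition of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TDVF_SIGNATURE   0x46564454u /* "TDVF" read as little endian */
#define SKETCH_TDVF_HEADER_SIZE 16u /* Signature, Length, Version, count */

/* Read a 32-bit little-endian value regardless of host endianness. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the offset of the metadata header from the start of the image,
 * or -1 if the header does not fit or the signature does not match.
 * offset_from_end is the 4-byte value stored in the offset block.
 */
static int64_t tdvf_metadata_offset(const uint8_t *image, size_t size,
                                    uint32_t offset_from_end)
{
    size_t offset;

    assert(image != NULL);
    if (offset_from_end > size || offset_from_end < SKETCH_TDVF_HEADER_SIZE) {
        return -1;
    }
    offset = size - offset_from_end;
    if (read_le32(image + offset) != SKETCH_TDVF_SIGNATURE) {
        return -1;
    }
    return (int64_t)offset;
}
```

The position is counted back from the end of the file, so the same
offset block stays valid when the firmware image is padded at the front.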

Select X86_FW_OVMF when TDX is enabled to leverage existing functions
to parse and search OVMF's GUID-ed structures.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>

---
Changes from RFC v4:
 - rename tdvf_parse_section_entry() to
   tdvf_parse_and_check_section_entry()
Changes in v4:
 - rename TDX_METADATA_GUID to TDX_METADATA_OFFSET_GUID
---
 hw/i386/Kconfig        |   1 +
 hw/i386/meson.build    |   1 +
 hw/i386/tdvf.c         | 199 +++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/tdvf.h |  51 +++++++++++
 4 files changed, 252 insertions(+)
 create mode 100644 hw/i386/tdvf.c
 create mode 100644 include/hw/i386/tdvf.h

diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 929f6c3f0e85..6007bdef184d 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -12,6 +12,7 @@ config SGX
 
 config TDX
     bool
+    select X86_FW_OVMF
     depends on KVM
 
 config PC
diff --git a/hw/i386/meson.build b/hw/i386/meson.build
index cfdbfdcbcb2d..45d90bb2af52 100644
--- a/hw/i386/meson.build
+++ b/hw/i386/meson.build
@@ -28,6 +28,7 @@ i386_ss.add(when: 'CONFIG_PC', if_true: files(
   'port92.c'))
 i386_ss.add(when: 'CONFIG_X86_FW_OVMF', if_true: files('pc_sysfw_ovmf.c'),
                                         if_false: files('pc_sysfw_ovmf-stubs.c'))
+i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c'))
 
 subdir('kvm')
 subdir('xen')
diff --git a/hw/i386/tdvf.c b/hw/i386/tdvf.c
new file mode 100644
index 000000000000..ff51f40088f0
--- /dev/null
+++ b/hw/i386/tdvf.c
@@ -0,0 +1,199 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+
+ * Copyright (c) 2020 Intel Corporation
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+
+#include "hw/i386/pc.h"
+#include "hw/i386/tdvf.h"
+#include "sysemu/kvm.h"
+
+#define TDX_METADATA_OFFSET_GUID    "e47a6535-984a-4798-865e-4685a7bf8ec2"
+#define TDX_METADATA_VERSION        1
+#define TDVF_SIGNATURE              0x46564454 /* TDVF as little endian */
+
+typedef struct {
+    uint32_t DataOffset;
+    uint32_t RawDataSize;
+    uint64_t MemoryAddress;
+    uint64_t MemoryDataSize;
+    uint32_t Type;
+    uint32_t Attributes;
+} TdvfSectionEntry;
+
+typedef struct {
+    uint32_t Signature;
+    uint32_t Length;
+    uint32_t Version;
+    uint32_t NumberOfSectionEntries;
+    TdvfSectionEntry SectionEntries[];
+} TdvfMetadata;
+
+struct tdx_metadata_offset {
+    uint32_t offset;
+};
+
+static TdvfMetadata *tdvf_get_metadata(void *flash_ptr, int size)
+{
+    TdvfMetadata *metadata;
+    uint32_t offset = 0;
+    uint8_t *data;
+
+    if ((uint32_t) size != size) {
+        return NULL;
+    }
+
+    if (pc_system_ovmf_table_find(TDX_METADATA_OFFSET_GUID, &data, NULL)) {
+        offset = size - le32_to_cpu(((struct tdx_metadata_offset *)data)->offset);
+
+        if (offset + sizeof(*metadata) > size) {
+            return NULL;
+        }
+    } else {
+        error_report("Cannot find TDX_METADATA_OFFSET_GUID");
+        return NULL;
+    }
+
+    metadata = flash_ptr + offset;
+
+    /* Finally, verify the signature to determine if this is a TDVF image. */
+    metadata->Signature = le32_to_cpu(metadata->Signature);
+    if (metadata->Signature != TDVF_SIGNATURE) {
+        error_report("Invalid TDVF signature in metadata!");
+        return NULL;
+    }
+
+    /* Sanity check that the TDVF doesn't overlap its own metadata. */
+    metadata->Length = le32_to_cpu(metadata->Length);
+    if (offset + metadata->Length > size) {
+        return NULL;
+    }
+
+    /* Only version 1 is supported/defined. */
+    metadata->Version = le32_to_cpu(metadata->Version);
+    if (metadata->Version != TDX_METADATA_VERSION) {
+        return NULL;
+    }
+
+    return metadata;
+}
+
+static int tdvf_parse_and_check_section_entry(const TdvfSectionEntry *src,
+                                              TdxFirmwareEntry *entry)
+{
+    entry->data_offset = le32_to_cpu(src->DataOffset);
+    entry->data_len = le32_to_cpu(src->RawDataSize);
+    entry->address = le64_to_cpu(src->MemoryAddress);
+    entry->size = le64_to_cpu(src->MemoryDataSize);
+    entry->type = le32_to_cpu(src->Type);
+    entry->attributes = le32_to_cpu(src->Attributes);
+
+    /* sanity check */
+    if (entry->size < entry->data_len) {
+        error_report("Broken metadata RawDataSize 0x%x MemoryDataSize 0x%"
+                     PRIx64, entry->data_len, entry->size);
+        return -1;
+    }
+    if (!QEMU_IS_ALIGNED(entry->address, TARGET_PAGE_SIZE)) {
+        error_report("MemoryAddress 0x%" PRIx64 " not page aligned", entry->address);
+        return -1;
+    }
+    if (!QEMU_IS_ALIGNED(entry->size, TARGET_PAGE_SIZE)) {
+        error_report("MemoryDataSize 0x%" PRIx64 " not page aligned", entry->size);
+        return -1;
+    }
+
+    switch (entry->type) {
+    case TDVF_SECTION_TYPE_BFV:
+    case TDVF_SECTION_TYPE_CFV:
+        /* The sections that must be copied from firmware image to TD memory */
+        if (entry->data_len == 0) {
+            error_report("%d section with RawDataSize == 0", entry->type);
+            return -1;
+        }
+        break;
+    case TDVF_SECTION_TYPE_TD_HOB:
+    case TDVF_SECTION_TYPE_TEMP_MEM:
+        /* The sections that need not be copied from the firmware image */
+        if (entry->data_len != 0) {
+            error_report("%d section with RawDataSize 0x%x != 0",
+                         entry->type, entry->data_len);
+            return -1;
+        }
+        break;
+    default:
+        error_report("TDVF contains unsupported section type %d", entry->type);
+        return -1;
+    }
+
+    return 0;
+}
+
+int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size)
+{
+    TdvfSectionEntry *sections;
+    TdvfMetadata *metadata;
+    ssize_t entries_size;
+    uint32_t len, i;
+
+    metadata = tdvf_get_metadata(flash_ptr, size);
+    if (!metadata) {
+        return -EINVAL;
+    }
+
+    /* load and parse metadata entries */
+    fw->nr_entries = le32_to_cpu(metadata->NumberOfSectionEntries);
+    if (fw->nr_entries < 2) {
+        error_report("Invalid number of fw entries (%u) in TDVF", fw->nr_entries);
+        return -EINVAL;
+    }
+
+    len = le32_to_cpu(metadata->Length);
+    entries_size = fw->nr_entries * sizeof(TdvfSectionEntry);
+    if (len != sizeof(*metadata) + entries_size) {
+        error_report("TDVF metadata len (0x%x) mismatch, expected (0x%x)",
+                     len, (uint32_t)(sizeof(*metadata) + entries_size));
+        return -EINVAL;
+    }
+
+    fw->entries = g_new(TdxFirmwareEntry, fw->nr_entries);
+    sections = g_new(TdvfSectionEntry, fw->nr_entries);
+
+    if (!memcpy(sections, (void *)metadata + sizeof(*metadata), entries_size))  {
+        error_report("Failed to read TDVF section entries");
+        goto err;
+    }
+
+    for (i = 0; i < fw->nr_entries; i++) {
+        if (tdvf_parse_and_check_section_entry(&sections[i], &fw->entries[i])) {
+            goto err;
+        }
+    }
+    g_free(sections);
+
+    return 0;
+
+err:
+    g_free(sections);
+    g_free(fw->entries);
+    fw->entries = NULL;
+    return -EINVAL;
+}
diff --git a/include/hw/i386/tdvf.h b/include/hw/i386/tdvf.h
new file mode 100644
index 000000000000..593341eb2e93
--- /dev/null
+++ b/include/hw/i386/tdvf.h
@@ -0,0 +1,51 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+
+ * Copyright (c) 2020 Intel Corporation
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_I386_TDVF_H
+#define HW_I386_TDVF_H
+
+#include "qemu/osdep.h"
+
+#define TDVF_SECTION_TYPE_BFV               0
+#define TDVF_SECTION_TYPE_CFV               1
+#define TDVF_SECTION_TYPE_TD_HOB            2
+#define TDVF_SECTION_TYPE_TEMP_MEM          3
+
+#define TDVF_SECTION_ATTRIBUTES_MR_EXTEND   (1U << 0)
+#define TDVF_SECTION_ATTRIBUTES_PAGE_AUG    (1U << 1)
+
+typedef struct TdxFirmwareEntry {
+    uint32_t data_offset;
+    uint32_t data_len;
+    uint64_t address;
+    uint64_t size;
+    uint32_t type;
+    uint32_t attributes;
+} TdxFirmwareEntry;
+
+typedef struct TdxFirmware {
+    uint32_t nr_entries;
+    TdxFirmwareEntry *entries;
+} TdxFirmware;
+
+int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size);
+
+#endif /* HW_I386_TDVF_H */
-- 
2.34.1
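
The length check in tdvf_parse_metadata() comes down to simple
arithmetic on the two structure sizes. The sketch below restates it with
hypothetical type and function names that mirror the field layout used
in the patch. It assumes the usual ABI where the section entry packs
into 32 bytes and the fixed header into 16.

```c
#include <assert.h>
#include <stdint.h>

/* Same field layout as the section entry in the patch (names differ). */
struct sketch_section_entry {
    uint32_t data_offset;
    uint32_t raw_data_size;
    uint64_t memory_address;
    uint64_t memory_data_size;
    uint32_t type;
    uint32_t attributes;
};

/* Fixed part of the metadata: signature, length, version, entry count. */
struct sketch_metadata_header {
    uint32_t signature;
    uint32_t length;
    uint32_t version;
    uint32_t nr_entries;
};

/* Length the metadata must report for a given number of section entries. */
static uint64_t expected_metadata_length(uint32_t nr_entries)
{
    /* The patch rejects images with fewer than two entries. */
    assert(nr_entries >= 2);
    return sizeof(struct sketch_metadata_header) +
           (uint64_t)nr_entries * sizeof(struct sketch_section_entry);
}
```

For example, four section entries give 16 + 4 * 32 = 144 bytes (0x90),
and any other Length value in the header is treated as a mismatch.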



* [PATCH v2 28/58] i386/tdx: Parse TDVF metadata for TDX VM
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (26 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 27/58] i386/tdvf: Introduce function to parse TDVF metadata Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 29/58] i386/tdx: Skip BIOS shadowing setup Xiaoyao Li
                   ` (29 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

TDX cannot support the pflash device since it supports neither read-only
memslots nor emulation. Load TDVF (OVMF) with the -bios option for TDs.

When booting a TD, besides loading TDVF to an address below 4G, QEMU
needs to parse the TDVF metadata.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 hw/i386/pc_sysfw.c         | 7 +++++++
 hw/i386/x86.c              | 3 ++-
 target/i386/kvm/tdx-stub.c | 5 +++++
 target/i386/kvm/tdx.c      | 5 +++++
 target/i386/kvm/tdx.h      | 4 ++++
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index c8d9e71b889b..cf63434ba89d 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -37,6 +37,7 @@
 #include "hw/block/flash.h"
 #include "sysemu/kvm.h"
 #include "sev.h"
+#include "kvm/tdx.h"
 
 #define FLASH_SECTOR_SIZE 4096
 
@@ -265,5 +266,11 @@ void x86_firmware_configure(void *ptr, int size)
         }
 
         sev_encrypt_flash(ptr, size, &error_fatal);
+    } else if (is_tdx_vm()) {
+        ret = tdx_parse_tdvf(ptr, size);
+        if (ret) {
+            error_report("failed to parse TDVF for TDX VM");
+            exit(1);
+        }
     }
 }
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 3ccd06154249..dabd33cb830b 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -47,6 +47,7 @@
 #include "hw/intc/i8259.h"
 #include "hw/rtc/mc146818rtc.h"
 #include "target/i386/sev.h"
+#include "kvm/tdx.h"
 
 #include "hw/acpi/cpu_hotplug.h"
 #include "hw/irq.h"
@@ -1152,7 +1153,7 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
     }
     bios = g_malloc(sizeof(*bios));
     memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
-    if (sev_enabled()) {
+    if (sev_enabled() || is_tdx_vm()) {
         /*
          * The concept of a "reset" simply doesn't exist for
          * confidential computing guests, we have to destroy and
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
index 61f70cc0d1d9..3cd114476d78 100644
--- a/target/i386/kvm/tdx-stub.c
+++ b/target/i386/kvm/tdx-stub.c
@@ -11,3 +11,8 @@ int tdx_pre_create_vcpu(CPUState *cpu)
 {
     return -EINVAL;
 }
+
+int tdx_parse_tdvf(void *flash_ptr, int size)
+{
+    return -EINVAL;
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index f1305191e939..e41ea18c04db 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -580,6 +580,11 @@ out:
     return r;
 }
 
+int tdx_parse_tdvf(void *flash_ptr, int size)
+{
+    return tdvf_parse_metadata(&tdx_guest->tdvf, flash_ptr, size);
+}
+
 static void tdx_guest_region_add(MemoryListener *listener,
                                  MemoryRegionSection *section)
 {
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 68f8327f2231..e9d2888162ce 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -6,6 +6,7 @@
 #endif
 
 #include "exec/confidential-guest-support.h"
+#include "hw/i386/tdvf.h"
 
 #define TYPE_TDX_GUEST "tdx-guest"
 #define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
@@ -24,6 +25,8 @@ typedef struct TdxGuest {
     uint8_t mrconfigid[48];     /* sha348 digest */
     uint8_t mrowner[48];        /* sha348 digest */
     uint8_t mrownerconfig[48];  /* sha348 digest */
+
+    TdxFirmware tdvf;
 } TdxGuest;
 
 #ifdef CONFIG_TDX
@@ -36,5 +39,6 @@ int tdx_kvm_init(MachineState *ms, Error **errp);
 void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
                              uint32_t *ret);
 int tdx_pre_create_vcpu(CPUState *cpu);
+int tdx_parse_tdvf(void *flash_ptr, int size);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.34.1



* [PATCH v2 29/58] i386/tdx: Skip BIOS shadowing setup
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (27 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 28/58] i386/tdx: Parse TDVF metadata for TDX VM Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 30/58] i386/tdx: Don't initialize pc.rom for TDX VMs Xiaoyao Li
                   ` (28 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

TDX doesn't support mapping different GPAs to the same private memory.
Thus, aliasing the top 128KB of the BIOS as isa-bios is not supported.

On the other hand, a TDX guest cannot enter real mode, so it works fine
without isa-bios.
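
A worked example of the aliasing that is being skipped. The sketch
reproduces the arithmetic of the isa-bios setup (size capped at 128KB,
alias placed just below 1MB) and assumes the full image is mapped so
that it ends at 4G. struct bios_alias, isa_bios_alias() and the SKETCH_
macros are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_KIB       1024ull
#define SKETCH_ISA_MAX   (128 * SKETCH_KIB)
#define SKETCH_ONE_MIB   0x100000ull
#define SKETCH_FOUR_GIB  0x100000000ull

struct bios_alias {
    uint64_t alias_size;     /* bytes of firmware visible below 1MB */
    uint64_t offset_in_bios; /* where the alias starts inside the image */
    uint64_t low_gpa;        /* guest address of the isa-bios alias */
    uint64_t high_gpa;       /* guest address of the same bytes below 4G */
};

/* Compute where the tail of the firmware shows up twice in guest memory. */
static struct bios_alias isa_bios_alias(uint64_t bios_size)
{
    struct bios_alias a;

    assert(bios_size > 0 && bios_size <= SKETCH_FOUR_GIB);
    a.alias_size = bios_size < SKETCH_ISA_MAX ? bios_size : SKETCH_ISA_MAX;
    a.offset_in_bios = bios_size - a.alias_size;
    a.low_gpa = SKETCH_ONE_MIB - a.alias_size;
    a.high_gpa = SKETCH_FOUR_GIB - a.alias_size;
    return a;
}
```

For a 4MB image, the last 128KB (image offset 0x3E0000) would be
reachable at both 0xE0000 and 0xFFFE0000, which is exactly the
two-GPAs-to-one-page situation that private memory cannot express.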

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
Changes from RFC v4:
 - update commit message and comment to clarify
---
 hw/i386/x86.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index dabd33cb830b..e2a1369e5dfc 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1175,17 +1175,20 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
     }
     g_free(filename);
 
-    /* map the last 128KB of the BIOS in ISA space */
-    isa_bios_size = MIN(bios_size, 128 * KiB);
-    isa_bios = g_malloc(sizeof(*isa_bios));
-    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
-                             bios_size - isa_bios_size, isa_bios_size);
-    memory_region_add_subregion_overlap(rom_memory,
-                                        0x100000 - isa_bios_size,
-                                        isa_bios,
-                                        1);
-    if (!isapc_ram_fw) {
-        memory_region_set_readonly(isa_bios, true);
+    /* TDX doesn't support aliasing different GPAs to the same private memory */
+    if (!is_tdx_vm()) {
+        /* map the last 128KB of the BIOS in ISA space */
+        isa_bios_size = MIN(bios_size, 128 * KiB);
+        isa_bios = g_malloc(sizeof(*isa_bios));
+        memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
+                                 bios_size - isa_bios_size, isa_bios_size);
+        memory_region_add_subregion_overlap(rom_memory,
+                                            0x100000 - isa_bios_size,
+                                            isa_bios,
+                                            1);
+        if (!isapc_ram_fw) {
+            memory_region_set_readonly(isa_bios, true);
+        }
     }
 
     /* map all the bios at the top of memory */
-- 
2.34.1



* [PATCH v2 30/58] i386/tdx: Don't initialize pc.rom for TDX VMs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (28 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 29/58] i386/tdx: Skip BIOS shadowing setup Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 31/58] i386/tdx: Track mem_ptr for each firmware entry of TDVF Xiaoyao Li
                   ` (27 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

For TDX, the addresses below 1MB are entirely general RAM. There is no
need to initialize the pc.rom memory region for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
This is more of a workaround for the issue that, for the q35 machine type,
the real memslot update (which requires memslot deletion) for pc.rom
happens after tdx_init_memory_region. It causes the private memory ADD'ed
earlier to get lost. I haven't worked out a good solution to resolve the
ordering issue. So just skip the pc.rom setup to avoid memslot deletion.
---
 hw/i386/pc.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index abeadd903827..bc307fed0f44 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -62,6 +62,7 @@
 #include "sysemu/reset.h"
 #include "sysemu/runstate.h"
 #include "kvm/kvm_i386.h"
+#include "kvm/tdx.h"
 #include "hw/xen/xen.h"
 #include "hw/xen/start_info.h"
 #include "ui/qemu-spice.h"
@@ -1095,16 +1096,18 @@ void pc_memory_init(PCMachineState *pcms,
     /* Initialize PC system firmware */
     pc_system_firmware_init(pcms, rom_memory);
 
-    option_rom_mr = g_malloc(sizeof(*option_rom_mr));
-    memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
-                           &error_fatal);
-    if (pcmc->pci_enabled) {
-        memory_region_set_readonly(option_rom_mr, true);
+    if (!is_tdx_vm()) {
+        option_rom_mr = g_malloc(sizeof(*option_rom_mr));
+        memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
+                            &error_fatal);
+        if (pcmc->pci_enabled) {
+            memory_region_set_readonly(option_rom_mr, true);
+        }
+        memory_region_add_subregion_overlap(rom_memory,
+                                            PC_ROM_MIN_VGA,
+                                            option_rom_mr,
+                                            1);
     }
-    memory_region_add_subregion_overlap(rom_memory,
-                                        PC_ROM_MIN_VGA,
-                                        option_rom_mr,
-                                        1);
 
     fw_cfg = fw_cfg_arch_create(machine,
                                 x86ms->boot_cpus, x86ms->apic_id_limit);
-- 
2.34.1



* [PATCH v2 31/58] i386/tdx: Track mem_ptr for each firmware entry of TDVF
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (29 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 30/58] i386/tdx: Don't initialize pc.rom for TDX VMs Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM Xiaoyao Li
                   ` (26 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

For each TDVF section, QEMU needs to copy the content to guest
private memory via the KVM API (KVM_TDX_INIT_MEM_REGION).

Introduce a field @mem_ptr in TdxFirmwareEntry to track the memory
pointer of each TDVF section, so that QEMU can add/copy them to guest
private memory later.

TDVF sections can be classified into two groups:
 - Firmware itself, e.g., TDVF BFV and CFV, which is located separately
   from guest RAM. Its memory pointer is the bios pointer.

 - Sections located in guest RAM, e.g., TEMP_MEM and TD_HOB.
   mmap a new memory range for them.

Register a machine_init_done callback to set up @mem_ptr for each section.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 hw/i386/tdvf.c         |  1 +
 include/hw/i386/tdvf.h |  7 +++++++
 target/i386/kvm/tdx.c  | 31 +++++++++++++++++++++++++++++++
 3 files changed, 39 insertions(+)

diff --git a/hw/i386/tdvf.c b/hw/i386/tdvf.c
index ff51f40088f0..0a6445705160 100644
--- a/hw/i386/tdvf.c
+++ b/hw/i386/tdvf.c
@@ -189,6 +189,7 @@ int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size)
     }
     g_free(sections);
 
+    fw->mem_ptr = flash_ptr;
     return 0;
 
 err:
diff --git a/include/hw/i386/tdvf.h b/include/hw/i386/tdvf.h
index 593341eb2e93..d880af245a73 100644
--- a/include/hw/i386/tdvf.h
+++ b/include/hw/i386/tdvf.h
@@ -39,13 +39,20 @@ typedef struct TdxFirmwareEntry {
     uint64_t size;
     uint32_t type;
     uint32_t attributes;
+
+    void *mem_ptr;
 } TdxFirmwareEntry;
 
 typedef struct TdxFirmware {
+    void *mem_ptr;
+
     uint32_t nr_entries;
     TdxFirmwareEntry *entries;
 } TdxFirmware;
 
+#define for_each_tdx_fw_entry(fw, e)    \
+    for (e = (fw)->entries; e != (fw)->entries + (fw)->nr_entries; e++)
+
 int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size);
 
 #endif /* HW_I386_TDVF_H */
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index e41ea18c04db..bb806736b4ff 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -13,6 +13,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/error-report.h"
+#include "qemu/mmap-alloc.h"
 #include "qapi/error.h"
 #include "qom/object_interfaces.h"
 #include "standard-headers/asm-x86/kvm_para.h"
@@ -22,6 +23,7 @@
 #include "exec/ramblock.h"
 
 #include "hw/i386/x86.h"
+#include "hw/i386/tdvf.h"
 #include "kvm_i386.h"
 #include "tdx.h"
 #include "../cpu-internal.h"
@@ -452,6 +454,33 @@ static void update_tdx_cpuid_lookup_by_tdx_caps(void)
             (tdx_caps->xfam_fixed1 & CPUID_XSTATE_XSS_MASK) >> 32;
 }
 
+static void tdx_finalize_vm(Notifier *notifier, void *unused)
+{
+    TdxFirmware *tdvf = &tdx_guest->tdvf;
+    TdxFirmwareEntry *entry;
+
+    for_each_tdx_fw_entry(tdvf, entry) {
+        switch (entry->type) {
+        case TDVF_SECTION_TYPE_BFV:
+        case TDVF_SECTION_TYPE_CFV:
+            entry->mem_ptr = tdvf->mem_ptr + entry->data_offset;
+            break;
+        case TDVF_SECTION_TYPE_TD_HOB:
+        case TDVF_SECTION_TYPE_TEMP_MEM:
+            entry->mem_ptr = qemu_ram_mmap(-1, entry->size,
+                                           qemu_real_host_page_size(), 0, 0);
+            break;
+        default:
+            error_report("Unsupported TDVF section %d", entry->type);
+            exit(1);
+        }
+    }
+}
+
+static Notifier tdx_machine_done_notify = {
+    .notify = tdx_finalize_vm,
+};
+
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
     TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
@@ -472,6 +501,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
      */
     kvm_readonly_mem_allowed = false;
 
+    qemu_add_machine_init_done_notifier(&tdx_machine_done_notify);
+
     tdx_guest = tdx;
 
     return 0;
-- 
2.34.1



* [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (30 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 31/58] i386/tdx: Track mem_ptr for each firmware entry of TDVF Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:38   ` Daniel P. Berrangé
  2023-08-21 23:40   ` Isaku Yamahata
  2023-08-18  9:50 ` [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc Xiaoyao Li
                   ` (25 subsequent siblings)
  57 siblings, 2 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

The RAM of a TDX VM can be classified into two types:

 - TDX_RAM_UNACCEPTED: the default type of TDX memory, which needs to be
   accepted by the TDX guest before it can be used and will be all-zeros
   after being accepted.

 - TDX_RAM_ADDED: RAM that is ADD'ed to the TD guest before it runs and
   can be used directly, e.g., the TD HOB and TEMP MEM needed by TDVF.

Maintain TdxRamEntries[], which grabs the initial RAM info from the e820
table and marks each RAM range as the default type TDX_RAM_UNACCEPTED.

Then turn the ranges of TD HOB and TEMP MEM into TDX_RAM_ADDED, since these
ranges will be ADD'ed before the TD runs and do not need to be accepted at
runtime.

The TdxRamEntries[] are later used to set up the TD's memory resource HOBs,
which pass memory info from QEMU to TDVF.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>

---
Changes from RFC v4:
  - simplify the algorithm of tdx_accept_ram_range() (Suggested-by: Gerd Hoffmann)
    (1) Change the existing entry to cover the accepted RAM range.
    (2) If there is room before the accepted RAM range, add a
        TDX_RAM_UNACCEPTED entry for that.
    (3) If there is room after the accepted RAM range, add a
        TDX_RAM_UNACCEPTED entry for that.
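
For readers who want to try the three steps above in isolation, here is a
minimal standalone sketch. It is not the code from this patch: the names
RamMap, ram_map_add() and ram_map_accept() are hypothetical, a fixed-size
array stands in for the g_renew()'d array, and a partially overlapping
range is simply reported as "not found" rather than -EINVAL.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for TDX_RAM_UNACCEPTED / TDX_RAM_ADDED. */
enum { RAM_UNACCEPTED, RAM_ADDED };

typedef struct {
    uint64_t address;
    uint64_t length;
    uint32_t type;
} RamEntry;

#define MAX_ENTRIES 16 /* assumption: fixed capacity keeps the sketch simple */

typedef struct {
    RamEntry e[MAX_ENTRIES];
    unsigned n;
} RamMap;

static void ram_map_add(RamMap *m, uint64_t address, uint64_t length,
                        uint32_t type)
{
    assert(m->n < MAX_ENTRIES);
    m->e[m->n++] = (RamEntry){ address, length, type };
}

/*
 * Mark [address, address + length) as ADDED.
 * Returns 0 on success, -1 if no single unaccepted entry contains the range.
 */
static int ram_map_accept(RamMap *m, uint64_t address, uint64_t length)
{
    uint64_t end = address + length;

    for (unsigned i = 0; i < m->n; i++) {
        RamEntry *e = &m->e[i];
        uint64_t old_start = e->address;
        uint64_t old_end = e->address + e->length;

        if (e->type != RAM_UNACCEPTED ||
            address < old_start || end > old_end) {
            continue;
        }

        /* (1) Reuse the existing entry for the accepted range. */
        e->address = address;
        e->length = length;
        e->type = RAM_ADDED;

        /* (2) Room before the accepted range stays unaccepted. */
        if (address > old_start) {
            ram_map_add(m, old_start, address - old_start, RAM_UNACCEPTED);
        }

        /* (3) Room after the accepted range stays unaccepted. */
        if (end < old_end) {
            ram_map_add(m, end, old_end - end, RAM_UNACCEPTED);
        }
        return 0;
    }
    return -1;
}
```

Because the head and tail entries are appended at the end of the array, the
result is not ordered by address, which is presumably why the patch sorts
the entries once all sections have been processed.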
---
 target/i386/kvm/tdx.c | 110 ++++++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h |  14 ++++++
 2 files changed, 124 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index bb806736b4ff..ed617ebab266 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -22,6 +22,7 @@
 #include "exec/address-spaces.h"
 #include "exec/ramblock.h"
 
+#include "hw/i386/e820_memory_layout.h"
 #include "hw/i386/x86.h"
 #include "hw/i386/tdvf.h"
 #include "kvm_i386.h"
@@ -454,11 +455,116 @@ static void update_tdx_cpuid_lookup_by_tdx_caps(void)
             (tdx_caps->xfam_fixed1 & CPUID_XSTATE_XSS_MASK) >> 32;
 }
 
+static void tdx_add_ram_entry(uint64_t address, uint64_t length, uint32_t type)
+{
+    uint32_t nr_entries = tdx_guest->nr_ram_entries;
+    tdx_guest->ram_entries = g_renew(TdxRamEntry, tdx_guest->ram_entries,
+                                     nr_entries + 1);
+
+    tdx_guest->ram_entries[nr_entries].address = address;
+    tdx_guest->ram_entries[nr_entries].length = length;
+    tdx_guest->ram_entries[nr_entries].type = type;
+    tdx_guest->nr_ram_entries++;
+}
+
+static int tdx_accept_ram_range(uint64_t address, uint64_t length)
+{
+    uint64_t head_start, tail_start, head_length, tail_length;
+    uint64_t tmp_address, tmp_length;
+    TdxRamEntry *e;
+    int i;
+
+    for (i = 0; i < tdx_guest->nr_ram_entries; i++) {
+        e = &tdx_guest->ram_entries[i];
+
+        if (address + length <= e->address ||
+            e->address + e->length <= address) {
+            continue;
+        }
+
+        /*
+         * The to-be-accepted ram range must be fully contained by one
+         * RAM entry.
+         */
+        if (e->address > address ||
+            e->address + e->length < address + length) {
+            return -EINVAL;
+        }
+
+        if (e->type == TDX_RAM_ADDED) {
+            return -EINVAL;
+        }
+
+        break;
+    }
+
+    if (i == tdx_guest->nr_ram_entries) {
+        return -1;
+    }
+
+    tmp_address = e->address;
+    tmp_length = e->length;
+
+    e->address = address;
+    e->length = length;
+    e->type = TDX_RAM_ADDED;
+
+    head_length = address - tmp_address;
+    if (head_length > 0) {
+        head_start = tmp_address;
+        tdx_add_ram_entry(head_start, head_length, TDX_RAM_UNACCEPTED);
+    }
+
+    tail_start = address + length;
+    if (tail_start < tmp_address + tmp_length) {
+        tail_length = tmp_address + tmp_length - tail_start;
+        tdx_add_ram_entry(tail_start, tail_length, TDX_RAM_UNACCEPTED);
+    }
+
+    return 0;
+}
+
+static int tdx_ram_entry_compare(const void *lhs_, const void *rhs_)
+{
+    const TdxRamEntry *lhs = lhs_;
+    const TdxRamEntry *rhs = rhs_;
+
+    if (lhs->address == rhs->address) {
+        return 0;
+    }
+    if (lhs->address > rhs->address) {
+        return 1;
+    }
+    return -1;
+}
+
+static void tdx_init_ram_entries(void)
+{
+    unsigned i, j, nr_e820_entries;
+
+    nr_e820_entries = e820_get_num_entries();
+    tdx_guest->ram_entries = g_new(TdxRamEntry, nr_e820_entries);
+
+    for (i = 0, j = 0; i < nr_e820_entries; i++) {
+        uint64_t addr, len;
+
+        if (e820_get_entry(i, E820_RAM, &addr, &len)) {
+            tdx_guest->ram_entries[j].address = addr;
+            tdx_guest->ram_entries[j].length = len;
+            tdx_guest->ram_entries[j].type = TDX_RAM_UNACCEPTED;
+            j++;
+        }
+    }
+    tdx_guest->nr_ram_entries = j;
+}
+
 static void tdx_finalize_vm(Notifier *notifier, void *unused)
 {
     TdxFirmware *tdvf = &tdx_guest->tdvf;
     TdxFirmwareEntry *entry;
 
+    tdx_init_ram_entries();
+
     for_each_tdx_fw_entry(tdvf, entry) {
         switch (entry->type) {
         case TDVF_SECTION_TYPE_BFV:
@@ -469,12 +575,16 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
         case TDVF_SECTION_TYPE_TEMP_MEM:
             entry->mem_ptr = qemu_ram_mmap(-1, entry->size,
                                            qemu_real_host_page_size(), 0, 0);
+            tdx_accept_ram_range(entry->address, entry->size);
             break;
         default:
             error_report("Unsupported TDVF section %d", entry->type);
             exit(1);
         }
     }
+
+    qsort(tdx_guest->ram_entries, tdx_guest->nr_ram_entries,
+          sizeof(TdxRamEntry), &tdx_ram_entry_compare);
 }
 
 static Notifier tdx_machine_done_notify = {
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index e9d2888162ce..9b3c427766ef 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -15,6 +15,17 @@ typedef struct TdxGuestClass {
     ConfidentialGuestSupportClass parent_class;
 } TdxGuestClass;
 
+enum TdxRamType {
+    TDX_RAM_UNACCEPTED,
+    TDX_RAM_ADDED,
+};
+
+typedef struct TdxRamEntry {
+    uint64_t address;
+    uint64_t length;
+    uint32_t type;
+} TdxRamEntry;
+
 typedef struct TdxGuest {
     ConfidentialGuestSupport parent_obj;
 
@@ -27,6 +38,9 @@ typedef struct TdxGuest {
     uint8_t mrownerconfig[48];  /* sha348 digest */
 
     TdxFirmware tdvf;
+
+    uint32_t nr_ram_entries;
+    TdxRamEntry *ram_entries;
 } TdxGuest;
 
 #ifdef CONFIG_TDX
-- 
2.34.1



* [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc...
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (31 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-23 19:41   ` Isaku Yamahata
  2023-08-18  9:50 ` [PATCH v2 34/58] i386/tdx: Setup the TD HOB list Xiaoyao Li
                   ` (24 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Add UEFI definitions for literals, enums, structs, GUIDs, etc. that
will be used by TDX to build the UEFI Hand-Off Block (HOB) that is passed
to the Trust Domain Virtual Firmware (TDVF).

All values come from the UEFI specification and the TDVF design guide. [1]

Note, EFI_RESOURCE_MEMORY_UNACCEPTED will be added in a future UEFI spec.

[1] https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
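Since these structures are later written byte-for-byte into guest memory,
their layout matters. The following is only a sketch of how the expected
sizes could be checked at compile time; the type names Guid, HobHeader and
HobResourceDescriptor are local stand-ins for EFI_GUID,
EFI_HOB_GENERIC_HEADER and EFI_HOB_RESOURCE_DESCRIPTOR, and the numbers are
simply the sums of the field sizes under natural alignment, not values
quoted from the spec text. In QEMU proper, QEMU_BUILD_BUG_ON() would be the
more usual tool.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for EFI_GUID: 4 + 2 + 2 + 8 = 16 bytes. */
typedef struct {
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t Data4[8];
} Guid;

/* Stand-in for EFI_HOB_GENERIC_HEADER: 2 + 2 + 4 = 8 bytes. */
typedef struct {
    uint16_t HobType;
    uint16_t HobLength;
    uint32_t Reserved;
} HobHeader;

/* Stand-in for EFI_HOB_RESOURCE_DESCRIPTOR: 8 + 16 + 4 + 4 + 8 + 8 = 48. */
typedef struct {
    HobHeader Header;
    Guid Owner;
    uint32_t ResourceType;
    uint32_t ResourceAttribute;
    uint64_t PhysicalStart;
    uint64_t ResourceLength;
} HobResourceDescriptor;

/* Fail the build if the compiler inserts unexpected padding. */
_Static_assert(sizeof(Guid) == 16, "GUID must be 16 bytes");
_Static_assert(sizeof(HobHeader) == 8, "HOB header must be 8 bytes");
_Static_assert(sizeof(HobResourceDescriptor) == 48,
               "resource descriptor HOB must be 48 bytes");
_Static_assert(offsetof(HobResourceDescriptor, PhysicalStart) == 32,
               "PhysicalStart must follow the two 32-bit fields");
```

All of these sizes are multiples of 8, so a list built from them stays
8-byte aligned without extra padding between entries.
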
 include/standard-headers/uefi/uefi.h | 198 +++++++++++++++++++++++++++
 1 file changed, 198 insertions(+)
 create mode 100644 include/standard-headers/uefi/uefi.h

diff --git a/include/standard-headers/uefi/uefi.h b/include/standard-headers/uefi/uefi.h
new file mode 100644
index 000000000000..b15aba796156
--- /dev/null
+++ b/include/standard-headers/uefi/uefi.h
@@ -0,0 +1,198 @@
+/*
+ * Copyright (C) 2020 Intel Corporation
+ *
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef HW_I386_UEFI_H
+#define HW_I386_UEFI_H
+
+/***************************************************************************/
+/*
+ * basic EFI definitions
+ * supplemented with UEFI Specification Version 2.8 (Errata A)
+ * released February 2020
+ */
+/* UEFI integer is little endian */
+
+typedef struct {
+    uint32_t Data1;
+    uint16_t Data2;
+    uint16_t Data3;
+    uint8_t Data4[8];
+} EFI_GUID;
+
+typedef enum {
+    EfiReservedMemoryType,
+    EfiLoaderCode,
+    EfiLoaderData,
+    EfiBootServicesCode,
+    EfiBootServicesData,
+    EfiRuntimeServicesCode,
+    EfiRuntimeServicesData,
+    EfiConventionalMemory,
+    EfiUnusableMemory,
+    EfiACPIReclaimMemory,
+    EfiACPIMemoryNVS,
+    EfiMemoryMappedIO,
+    EfiMemoryMappedIOPortSpace,
+    EfiPalCode,
+    EfiPersistentMemory,
+    EfiUnacceptedMemoryType,
+    EfiMaxMemoryType
+} EFI_MEMORY_TYPE;
+
+#define EFI_HOB_HANDOFF_TABLE_VERSION 0x0009
+
+#define EFI_HOB_TYPE_HANDOFF              0x0001
+#define EFI_HOB_TYPE_MEMORY_ALLOCATION    0x0002
+#define EFI_HOB_TYPE_RESOURCE_DESCRIPTOR  0x0003
+#define EFI_HOB_TYPE_GUID_EXTENSION       0x0004
+#define EFI_HOB_TYPE_FV                   0x0005
+#define EFI_HOB_TYPE_CPU                  0x0006
+#define EFI_HOB_TYPE_MEMORY_POOL          0x0007
+#define EFI_HOB_TYPE_FV2                  0x0009
+#define EFI_HOB_TYPE_LOAD_PEIM_UNUSED     0x000A
+#define EFI_HOB_TYPE_UEFI_CAPSULE         0x000B
+#define EFI_HOB_TYPE_FV3                  0x000C
+#define EFI_HOB_TYPE_UNUSED               0xFFFE
+#define EFI_HOB_TYPE_END_OF_HOB_LIST      0xFFFF
+
+typedef struct {
+    uint16_t HobType;
+    uint16_t HobLength;
+    uint32_t Reserved;
+} EFI_HOB_GENERIC_HEADER;
+
+typedef uint64_t EFI_PHYSICAL_ADDRESS;
+typedef uint32_t EFI_BOOT_MODE;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    uint32_t Version;
+    EFI_BOOT_MODE BootMode;
+    EFI_PHYSICAL_ADDRESS EfiMemoryTop;
+    EFI_PHYSICAL_ADDRESS EfiMemoryBottom;
+    EFI_PHYSICAL_ADDRESS EfiFreeMemoryTop;
+    EFI_PHYSICAL_ADDRESS EfiFreeMemoryBottom;
+    EFI_PHYSICAL_ADDRESS EfiEndOfHobList;
+} EFI_HOB_HANDOFF_INFO_TABLE;
+
+#define EFI_RESOURCE_SYSTEM_MEMORY          0x00000000
+#define EFI_RESOURCE_MEMORY_MAPPED_IO       0x00000001
+#define EFI_RESOURCE_IO                     0x00000002
+#define EFI_RESOURCE_FIRMWARE_DEVICE        0x00000003
+#define EFI_RESOURCE_MEMORY_MAPPED_IO_PORT  0x00000004
+#define EFI_RESOURCE_MEMORY_RESERVED        0x00000005
+#define EFI_RESOURCE_IO_RESERVED            0x00000006
+#define EFI_RESOURCE_MEMORY_UNACCEPTED      0x00000007
+#define EFI_RESOURCE_MAX_MEMORY_TYPE        0x00000008
+
+#define EFI_RESOURCE_ATTRIBUTE_PRESENT                  0x00000001
+#define EFI_RESOURCE_ATTRIBUTE_INITIALIZED              0x00000002
+#define EFI_RESOURCE_ATTRIBUTE_TESTED                   0x00000004
+#define EFI_RESOURCE_ATTRIBUTE_SINGLE_BIT_ECC           0x00000008
+#define EFI_RESOURCE_ATTRIBUTE_MULTIPLE_BIT_ECC         0x00000010
+#define EFI_RESOURCE_ATTRIBUTE_ECC_RESERVED_1           0x00000020
+#define EFI_RESOURCE_ATTRIBUTE_ECC_RESERVED_2           0x00000040
+#define EFI_RESOURCE_ATTRIBUTE_READ_PROTECTED           0x00000080
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_PROTECTED          0x00000100
+#define EFI_RESOURCE_ATTRIBUTE_EXECUTION_PROTECTED      0x00000200
+#define EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE              0x00000400
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_COMBINEABLE        0x00000800
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_THROUGH_CACHEABLE  0x00001000
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_BACK_CACHEABLE     0x00002000
+#define EFI_RESOURCE_ATTRIBUTE_16_BIT_IO                0x00004000
+#define EFI_RESOURCE_ATTRIBUTE_32_BIT_IO                0x00008000
+#define EFI_RESOURCE_ATTRIBUTE_64_BIT_IO                0x00010000
+#define EFI_RESOURCE_ATTRIBUTE_UNCACHED_EXPORTED        0x00020000
+#define EFI_RESOURCE_ATTRIBUTE_READ_ONLY_PROTECTED      0x00040000
+#define EFI_RESOURCE_ATTRIBUTE_READ_ONLY_PROTECTABLE    0x00080000
+#define EFI_RESOURCE_ATTRIBUTE_READ_PROTECTABLE         0x00100000
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_PROTECTABLE        0x00200000
+#define EFI_RESOURCE_ATTRIBUTE_EXECUTION_PROTECTABLE    0x00400000
+#define EFI_RESOURCE_ATTRIBUTE_PERSISTENT               0x00800000
+#define EFI_RESOURCE_ATTRIBUTE_PERSISTABLE              0x01000000
+#define EFI_RESOURCE_ATTRIBUTE_MORE_RELIABLE            0x02000000
+
+typedef uint32_t EFI_RESOURCE_TYPE;
+typedef uint32_t EFI_RESOURCE_ATTRIBUTE_TYPE;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_GUID Owner;
+    EFI_RESOURCE_TYPE ResourceType;
+    EFI_RESOURCE_ATTRIBUTE_TYPE ResourceAttribute;
+    EFI_PHYSICAL_ADDRESS PhysicalStart;
+    uint64_t ResourceLength;
+} EFI_HOB_RESOURCE_DESCRIPTOR;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_GUID Name;
+
+    /* guid specific data follows */
+} EFI_HOB_GUID_TYPE;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+} EFI_HOB_FIRMWARE_VOLUME;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+    EFI_GUID FvName;
+    EFI_GUID FileName;
+} EFI_HOB_FIRMWARE_VOLUME2;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+    uint32_t AuthenticationStatus;
+    bool ExtractedFv;
+    EFI_GUID FvName;
+    EFI_GUID FileName;
+} EFI_HOB_FIRMWARE_VOLUME3;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    uint8_t SizeOfMemorySpace;
+    uint8_t SizeOfIoSpace;
+    uint8_t Reserved[6];
+} EFI_HOB_CPU;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+} EFI_HOB_MEMORY_POOL;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+} EFI_HOB_UEFI_CAPSULE;
+
+#define EFI_HOB_OWNER_ZERO                                      \
+    ((EFI_GUID){ 0x00000000, 0x0000, 0x0000,                    \
+        { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 } })
+
+#endif
-- 
2.34.1



* [PATCH v2 34/58] i386/tdx: Setup the TD HOB list
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (32 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 35/58] i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION Xiaoyao Li
                   ` (23 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

The TD HOB list is used to pass information from the VMM to TDVF. The TD
HOB list must include a PHIT HOB and Resource Descriptor HOBs. More details
can be found in the TDVF specification and the PI specification.

Build the TD HOB list in TDX's machine_init_done callback.

Co-developed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>

---
Changes from RFC v4:
  - drop the code that adds MMIO resources, since OVMF prepares all the
    MMIO HOBs itself.
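
To illustrate the list layout from the consumer's side, here is a small
standalone sketch of walking such a list: start at the first HOB, advance
by HobLength, and stop at the end-of-list HOB. It is not TDVF or QEMU code;
hob_append() and hob_count() are hypothetical helpers, the header struct is
a local copy of EFI_HOB_GENERIC_HEADER, and a little-endian host is assumed
so that no byte swapping is needed.

```c
#include <stdint.h>
#include <string.h>

#define HOB_TYPE_HANDOFF             0x0001
#define HOB_TYPE_RESOURCE_DESCRIPTOR 0x0003
#define HOB_TYPE_END_OF_HOB_LIST     0xFFFF

/* Local copy of the generic HOB header layout. */
typedef struct {
    uint16_t HobType;
    uint16_t HobLength;
    uint32_t Reserved;
} HobHeader;

/*
 * Append a zero-filled HOB of 'len' bytes (header included) at offset
 * 'off' and return the offset of the next HOB.
 */
static size_t hob_append(uint8_t *buf, size_t off, uint16_t type,
                         uint16_t len)
{
    HobHeader h = { .HobType = type, .HobLength = len, .Reserved = 0 };

    memset(buf + off, 0, len);
    memcpy(buf + off, &h, sizeof(h));
    return off + len;
}

/*
 * Count the HOBs of the given type. Returns -1 if the list is malformed:
 * no end-of-list HOB within 'size' bytes, or a length that is too small
 * or not a multiple of 8.
 */
static int hob_count(const uint8_t *buf, size_t size, uint16_t type)
{
    size_t off = 0;
    int count = 0;

    while (off + sizeof(HobHeader) <= size) {
        HobHeader h;

        memcpy(&h, buf + off, sizeof(h));
        if (h.HobType == type) {
            count++;
        }
        if (h.HobType == HOB_TYPE_END_OF_HOB_LIST) {
            return count;
        }
        if (h.HobLength < sizeof(h) || h.HobLength % 8 != 0) {
            return -1;
        }
        off += h.HobLength;
    }
    return -1;
}
```

A list shaped like the one built here (one PHIT HOB, one resource
descriptor per RAM entry, then the end marker) would yield a resource
descriptor count equal to nr_ram_entries.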
---
 hw/i386/meson.build   |   2 +-
 hw/i386/tdvf-hob.c    | 147 ++++++++++++++++++++++++++++++++++++++++++
 hw/i386/tdvf-hob.h    |  24 +++++++
 target/i386/kvm/tdx.c |  16 +++++
 4 files changed, 188 insertions(+), 1 deletion(-)
 create mode 100644 hw/i386/tdvf-hob.c
 create mode 100644 hw/i386/tdvf-hob.h

diff --git a/hw/i386/meson.build b/hw/i386/meson.build
index 45d90bb2af52..b38ea89665f0 100644
--- a/hw/i386/meson.build
+++ b/hw/i386/meson.build
@@ -28,7 +28,7 @@ i386_ss.add(when: 'CONFIG_PC', if_true: files(
   'port92.c'))
 i386_ss.add(when: 'CONFIG_X86_FW_OVMF', if_true: files('pc_sysfw_ovmf.c'),
                                         if_false: files('pc_sysfw_ovmf-stubs.c'))
-i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c'))
+i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c', 'tdvf-hob.c'))
 
 subdir('kvm')
 subdir('xen')
diff --git a/hw/i386/tdvf-hob.c b/hw/i386/tdvf-hob.c
new file mode 100644
index 000000000000..0da6ff2df576
--- /dev/null
+++ b/hw/i386/tdvf-hob.c
@@ -0,0 +1,147 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+
+ * Copyright (c) 2020 Intel Corporation
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/error-report.h"
+#include "e820_memory_layout.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/x86.h"
+#include "hw/pci/pcie_host.h"
+#include "sysemu/kvm.h"
+#include "standard-headers/uefi/uefi.h"
+#include "tdvf-hob.h"
+
+typedef struct TdvfHob {
+    hwaddr hob_addr;
+    void *ptr;
+    int size;
+
+    /* working area */
+    void *current;
+    void *end;
+} TdvfHob;
+
+static uint64_t tdvf_current_guest_addr(const TdvfHob *hob)
+{
+    return hob->hob_addr + (hob->current - hob->ptr);
+}
+
+static void tdvf_align(TdvfHob *hob, size_t align)
+{
+    hob->current = QEMU_ALIGN_PTR_UP(hob->current, align);
+}
+
+static void *tdvf_get_area(TdvfHob *hob, uint64_t size)
+{
+    void *ret;
+
+    if (hob->current + size > hob->end) {
+        error_report("TD_HOB overrun, size = 0x%" PRIx64, size);
+        exit(1);
+    }
+
+    ret = hob->current;
+    hob->current += size;
+    tdvf_align(hob, 8);
+    return ret;
+}
+
+static void tdvf_hob_add_memory_resources(TdxGuest *tdx, TdvfHob *hob)
+{
+    EFI_HOB_RESOURCE_DESCRIPTOR *region;
+    EFI_RESOURCE_ATTRIBUTE_TYPE attr;
+    EFI_RESOURCE_TYPE resource_type;
+
+    TdxRamEntry *e;
+    int i;
+
+    for (i = 0; i < tdx->nr_ram_entries; i++) {
+        e = &tdx->ram_entries[i];
+
+        if (e->type == TDX_RAM_UNACCEPTED) {
+            resource_type = EFI_RESOURCE_MEMORY_UNACCEPTED;
+            attr = EFI_RESOURCE_ATTRIBUTE_TDVF_UNACCEPTED;
+        } else if (e->type == TDX_RAM_ADDED) {
+            resource_type = EFI_RESOURCE_SYSTEM_MEMORY;
+            attr = EFI_RESOURCE_ATTRIBUTE_TDVF_PRIVATE;
+        } else {
+            error_report("unknown TDX_RAM_ENTRY type %d", e->type);
+            exit(1);
+        }
+
+        region = tdvf_get_area(hob, sizeof(*region));
+        *region = (EFI_HOB_RESOURCE_DESCRIPTOR) {
+            .Header = {
+                .HobType = EFI_HOB_TYPE_RESOURCE_DESCRIPTOR,
+                .HobLength = cpu_to_le16(sizeof(*region)),
+                .Reserved = cpu_to_le32(0),
+            },
+            .Owner = EFI_HOB_OWNER_ZERO,
+            .ResourceType = cpu_to_le32(resource_type),
+            .ResourceAttribute = cpu_to_le32(attr),
+            .PhysicalStart = cpu_to_le64(e->address),
+            .ResourceLength = cpu_to_le64(e->length),
+        };
+    }
+}
+
+void tdvf_hob_create(TdxGuest *tdx, TdxFirmwareEntry *td_hob)
+{
+    TdvfHob hob = {
+        .hob_addr = td_hob->address,
+        .size = td_hob->size,
+        .ptr = td_hob->mem_ptr,
+
+        .current = td_hob->mem_ptr,
+        .end = td_hob->mem_ptr + td_hob->size,
+    };
+
+    EFI_HOB_GENERIC_HEADER *last_hob;
+    EFI_HOB_HANDOFF_INFO_TABLE *hit;
+
+    /* Note, Efi{Free}Memory{Bottom,Top} are ignored, leave 'em zeroed. */
+    hit = tdvf_get_area(&hob, sizeof(*hit));
+    *hit = (EFI_HOB_HANDOFF_INFO_TABLE) {
+        .Header = {
+            .HobType = EFI_HOB_TYPE_HANDOFF,
+            .HobLength = cpu_to_le16(sizeof(*hit)),
+            .Reserved = cpu_to_le32(0),
+        },
+        .Version = cpu_to_le32(EFI_HOB_HANDOFF_TABLE_VERSION),
+        .BootMode = cpu_to_le32(0),
+        .EfiMemoryTop = cpu_to_le64(0),
+        .EfiMemoryBottom = cpu_to_le64(0),
+        .EfiFreeMemoryTop = cpu_to_le64(0),
+        .EfiFreeMemoryBottom = cpu_to_le64(0),
+        .EfiEndOfHobList = cpu_to_le64(0), /* initialized later */
+    };
+
+    tdvf_hob_add_memory_resources(tdx, &hob);
+
+    last_hob = tdvf_get_area(&hob, sizeof(*last_hob));
+    *last_hob = (EFI_HOB_GENERIC_HEADER) {
+        .HobType = EFI_HOB_TYPE_END_OF_HOB_LIST,
+        .HobLength = cpu_to_le16(sizeof(*last_hob)),
+        .Reserved = cpu_to_le32(0),
+    };
+    hit->EfiEndOfHobList = cpu_to_le64(tdvf_current_guest_addr(&hob));
+}
diff --git a/hw/i386/tdvf-hob.h b/hw/i386/tdvf-hob.h
new file mode 100644
index 000000000000..1b737e946a8d
--- /dev/null
+++ b/hw/i386/tdvf-hob.h
@@ -0,0 +1,24 @@
+#ifndef HW_I386_TD_HOB_H
+#define HW_I386_TD_HOB_H
+
+#include "hw/i386/tdvf.h"
+#include "target/i386/kvm/tdx.h"
+
+void tdvf_hob_create(TdxGuest *tdx, TdxFirmwareEntry *td_hob);
+
+#define EFI_RESOURCE_ATTRIBUTE_TDVF_PRIVATE     \
+    (EFI_RESOURCE_ATTRIBUTE_PRESENT |           \
+     EFI_RESOURCE_ATTRIBUTE_INITIALIZED |       \
+     EFI_RESOURCE_ATTRIBUTE_TESTED)
+
+#define EFI_RESOURCE_ATTRIBUTE_TDVF_UNACCEPTED  \
+    (EFI_RESOURCE_ATTRIBUTE_PRESENT |           \
+     EFI_RESOURCE_ATTRIBUTE_INITIALIZED |       \
+     EFI_RESOURCE_ATTRIBUTE_TESTED)
+
+#define EFI_RESOURCE_ATTRIBUTE_TDVF_MMIO        \
+    (EFI_RESOURCE_ATTRIBUTE_PRESENT     |       \
+     EFI_RESOURCE_ATTRIBUTE_INITIALIZED |       \
+     EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE)
+
+#endif
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index ed617ebab266..3a93ad293129 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -25,6 +25,7 @@
 #include "hw/i386/e820_memory_layout.h"
 #include "hw/i386/x86.h"
 #include "hw/i386/tdvf.h"
+#include "hw/i386/tdvf-hob.h"
 #include "kvm_i386.h"
 #include "tdx.h"
 #include "../cpu-internal.h"
@@ -455,6 +456,19 @@ static void update_tdx_cpuid_lookup_by_tdx_caps(void)
             (tdx_caps->xfam_fixed1 & CPUID_XSTATE_XSS_MASK) >> 32;
 }
 
+static TdxFirmwareEntry *tdx_get_hob_entry(TdxGuest *tdx)
+{
+    TdxFirmwareEntry *entry;
+
+    for_each_tdx_fw_entry(&tdx->tdvf, entry) {
+        if (entry->type == TDVF_SECTION_TYPE_TD_HOB) {
+            return entry;
+        }
+    }
+    error_report("TDVF metadata doesn't specify TD_HOB location.");
+    exit(1);
+}
+
 static void tdx_add_ram_entry(uint64_t address, uint64_t length, uint32_t type)
 {
     uint32_t nr_entries = tdx_guest->nr_ram_entries;
@@ -585,6 +599,8 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
 
     qsort(tdx_guest->ram_entries, tdx_guest->nr_ram_entries,
           sizeof(TdxRamEntry), &tdx_ram_entry_compare);
+
+    tdvf_hob_create(tdx_guest, tdx_get_hob_entry(tdx_guest));
 }
 
 static Notifier tdx_machine_done_notify = {
-- 
2.34.1



* [PATCH v2 35/58] i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (33 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 34/58] i386/tdx: Setup the TD HOB list Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem() Xiaoyao Li
                   ` (22 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

TDVF firmware (CODE and VARS) needs to be added/copied to TD's private
memory via KVM_TDX_INIT_MEM_REGION, as well as TD HOB and TEMP memory.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>

---
Changes from RFC v4:
  - rename variable @metadata to @flags
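
Illustration only, not part of the patch: a minimal, self-contained sketch
of how each TDVF section could be turned into the arguments passed to
KVM_TDX_INIT_MEM_REGION. All sketch_* names and SKETCH_* constants are
hypothetical stand-ins for the QEMU/KVM definitions. The alignment check is
an assumption added here; the patch itself simply computes size / 4096.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QEMU/KVM definitions used in the patch. */
#define SKETCH_PAGE_SIZE       4096ULL
#define SKETCH_ATTR_MR_EXTEND  (1U << 0) /* role of TDVF_SECTION_ATTRIBUTES_MR_EXTEND */
#define SKETCH_MEASURE_REGION  (1U << 0) /* role of KVM_TDX_MEASURE_MEMORY_REGION */

struct sketch_fw_entry {
    uint64_t address;     /* guest physical address of the section */
    uint64_t size;        /* section size in bytes */
    uint32_t attributes;  /* TDVF section attributes */
};

struct sketch_init_region {
    uint64_t gpa;
    uint64_t nr_pages;
    uint32_t flags;
};

/*
 * Fill *out from *e. Returns false when the section is empty or not page
 * aligned, a case in which a plain "size / 4096" would drop the tail.
 */
static bool sketch_build_region(const struct sketch_fw_entry *e,
                                struct sketch_init_region *out)
{
    assert(e && out);
    if (e->size == 0 ||
        (e->address % SKETCH_PAGE_SIZE) != 0 ||
        (e->size % SKETCH_PAGE_SIZE) != 0) {
        return false;
    }
    out->gpa = e->address;
    out->nr_pages = e->size / SKETCH_PAGE_SIZE;
    /* Only sections marked MR_EXTEND contribute to the TD measurement. */
    out->flags = (e->attributes & SKETCH_ATTR_MR_EXTEND) ?
                 SKETCH_MEASURE_REGION : 0;
    return true;
}
```

A caller would run this once per firmware entry and issue the ioctl only
when it returns true, reporting an error otherwise.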
---
 target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 3a93ad293129..37ff0f4eea11 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -576,6 +576,7 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
 {
     TdxFirmware *tdvf = &tdx_guest->tdvf;
     TdxFirmwareEntry *entry;
+    int r;
 
     tdx_init_ram_entries();
 
@@ -601,6 +602,29 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
           sizeof(TdxRamEntry), &tdx_ram_entry_compare);
 
     tdvf_hob_create(tdx_guest, tdx_get_hob_entry(tdx_guest));
+
+    for_each_tdx_fw_entry(tdvf, entry) {
+        struct kvm_tdx_init_mem_region mem_region = {
+            .source_addr = (__u64)entry->mem_ptr,
+            .gpa = entry->address,
+            .nr_pages = entry->size / 4096,
+        };
+
+        __u32 flags = entry->attributes & TDVF_SECTION_ATTRIBUTES_MR_EXTEND ?
+                      KVM_TDX_MEASURE_MEMORY_REGION : 0;
+
+        r = tdx_vm_ioctl(KVM_TDX_INIT_MEM_REGION, flags, &mem_region);
+        if (r < 0) {
+             error_report("KVM_TDX_INIT_MEM_REGION failed %s", strerror(-r));
+             exit(1);
+        }
+
+        if (entry->type == TDVF_SECTION_TYPE_TD_HOB ||
+            entry->type == TDVF_SECTION_TYPE_TEMP_MEM) {
+            qemu_ram_munmap(-1, entry->mem_ptr, entry->size);
+            entry->mem_ptr = NULL;
+        }
+    }
 }
 
 static Notifier tdx_machine_done_notify = {
-- 
2.34.1



* [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem()
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (34 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 35/58] i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:40   ` Daniel P. Berrangé
  2023-08-29 14:33   ` Philippe Mathieu-Daudé
  2023-08-18  9:50 ` [PATCH v2 37/58] i386/tdx: register TDVF as private memory Xiaoyao Li
                   ` (21 subsequent siblings)
  57 siblings, 2 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Introduce memory_region_init_ram_gmem() to allocate private gmem during
MemoryRegion initialization. It's for the use case of TDVF, which must
be private in the TDX case.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 include/exec/memory.h |  6 +++++
 softmmu/memory.c      | 52 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 759f797b6acd..127ffb6556b9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1564,6 +1564,12 @@ void memory_region_init_ram(MemoryRegion *mr,
                             uint64_t size,
                             Error **errp);
 
+void memory_region_init_ram_gmem(MemoryRegion *mr,
+                                 Object *owner,
+                                 const char *name,
+                                 uint64_t size,
+                                 Error **errp);
+
 /**
  * memory_region_init_rom: Initialize a ROM memory region.
  *
diff --git a/softmmu/memory.c b/softmmu/memory.c
index af6aa3c1e3c9..ded44dcef1aa 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -25,6 +25,7 @@
 #include "qom/object.h"
 #include "trace.h"
 
+#include <linux/kvm.h>
 #include "exec/memory-internal.h"
 #include "exec/ram_addr.h"
 #include "sysemu/kvm.h"
@@ -3602,6 +3603,57 @@ void memory_region_init_ram(MemoryRegion *mr,
     vmstate_register_ram(mr, owner_dev);
 }
 
+#ifdef CONFIG_KVM
+void memory_region_init_ram_gmem(MemoryRegion *mr,
+                                 Object *owner,
+                                 const char *name,
+                                 uint64_t size,
+                                 Error **errp)
+{
+    DeviceState *owner_dev;
+    Error *err = NULL;
+    int priv_fd;
+
+    memory_region_init_ram_nomigrate(mr, owner, name, size, &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+
+    if (object_dynamic_cast(OBJECT(current_accel()), TYPE_KVM_ACCEL)) {
+        KVMState *s = KVM_STATE(current_accel());
+        struct kvm_create_guest_memfd gmem = {
+            .size = size,
+            /* TODO: add property to hostmem backend for huge pmd */
+            .flags = KVM_GUEST_MEMFD_ALLOW_HUGEPAGE,
+        };
+
+        priv_fd = kvm_vm_ioctl(s, KVM_CREATE_GUEST_MEMFD, &gmem);
+        if (priv_fd < 0) {
+            fprintf(stderr, "%s: error creating gmem: %s\n", __func__,
+                    strerror(-priv_fd));
+            abort();
+        }
+    } else {
+        fprintf(stderr, "%s: gmem unsupported accel: %s\n", __func__,
+                current_accel_name());
+        abort();
+    }
+
+    memory_region_set_gmem_fd(mr, priv_fd);
+    memory_region_set_default_private(mr);
+
+    /* This will assert if owner is neither NULL nor a DeviceState.
+     * We only want the owner here for the purposes of defining a
+     * unique name for migration. TODO: Ideally we should implement
+     * a naming scheme for Objects which are not DeviceStates, in
+     * which case we can relax this restriction.
+     */
+    owner_dev = DEVICE(owner);
+    vmstate_register_ram(mr, owner_dev);
+}
+#endif
+
 void memory_region_init_rom(MemoryRegion *mr,
                             Object *owner,
                             const char *name,
-- 
2.34.1



* [PATCH v2 37/58] i386/tdx: register TDVF as private memory
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (35 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem() Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 38/58] i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu Xiaoyao Li
                   ` (20 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Chao Peng <chao.p.peng@linux.intel.com>

Allocate private gmem memory for the BIOS if it's a TD VM.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/x86.c         |  9 ++++++++-
 target/i386/kvm/tdx.c | 17 +++++++++++++++++
 target/i386/kvm/tdx.h |  2 ++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index e2a1369e5dfc..a0c9f4d646e2 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1151,8 +1151,15 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
         (bios_size % 65536) != 0) {
         goto bios_error;
     }
+
     bios = g_malloc(sizeof(*bios));
-    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
+    if (is_tdx_vm()) {
+        memory_region_init_ram_gmem(bios, NULL, "pc.bios", bios_size, &error_fatal);
+        tdx_set_tdvf_region(bios);
+    } else {
+        memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
+    }
+
     if (sev_enabled() || is_tdx_vm()) {
         /*
          * The concept of a "reset" simply doesn't exist for
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 37ff0f4eea11..5b688eb39327 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -456,6 +456,12 @@ static void update_tdx_cpuid_lookup_by_tdx_caps(void)
             (tdx_caps->xfam_fixed1 & CPUID_XSTATE_XSS_MASK) >> 32;
 }
 
+void tdx_set_tdvf_region(MemoryRegion *tdvf_region)
+{
+    assert(!tdx_guest->tdvf_region);
+    tdx_guest->tdvf_region = tdvf_region;
+}
+
 static TdxFirmwareEntry *tdx_get_hob_entry(TdxGuest *tdx)
 {
     TdxFirmwareEntry *entry;
@@ -576,6 +582,7 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
 {
     TdxFirmware *tdvf = &tdx_guest->tdvf;
     TdxFirmwareEntry *entry;
+    RAMBlock *ram_block;
     int r;
 
     tdx_init_ram_entries();
@@ -610,6 +617,12 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
             .nr_pages = entry->size / 4096,
         };
 
+        r = kvm_set_memory_attributes_private(entry->address, entry->size);
+        if (r < 0) {
+             error_report("Reserve initial private memory failed %s", strerror(-r));
+             exit(1);
+        }
+
         __u32 flags = entry->attributes & TDVF_SECTION_ATTRIBUTES_MR_EXTEND ?
                       KVM_TDX_MEASURE_MEMORY_REGION : 0;
 
@@ -625,6 +638,10 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
             entry->mem_ptr = NULL;
         }
     }
+
+    /* Tdvf image was copied into private region above. It becomes unnecessary. */
+    ram_block = tdx_guest->tdvf_region->ram_block;
+    ram_block_discard_range(ram_block, 0, ram_block->max_length);
 }
 
 static Notifier tdx_machine_done_notify = {
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 9b3c427766ef..1c444b6cdb3f 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -38,6 +38,7 @@ typedef struct TdxGuest {
     uint8_t mrownerconfig[48];  /* sha348 digest */
 
     TdxFirmware tdvf;
+    MemoryRegion *tdvf_region;
 
     uint32_t nr_ram_entries;
     TdxRamEntry *ram_entries;
@@ -53,6 +54,7 @@ int tdx_kvm_init(MachineState *ms, Error **errp);
 void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
                              uint32_t *ret);
 int tdx_pre_create_vcpu(CPUState *cpu);
+void tdx_set_tdvf_region(MemoryRegion *tdvf_region);
 int tdx_parse_tdvf(void *flash_ptr, int size);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.34.1



* [PATCH v2 38/58] i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (36 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 37/58] i386/tdx: register TDVF as private memory Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 39/58] i386/tdx: Finalize TDX VM Xiaoyao Li
                   ` (19 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

A TDX vcpu needs to be initialized by SEAMCALL(TDH.VP.INIT), and KVM
provides the vcpu-level ioctl KVM_TDX_INIT_VCPU for it.

KVM_TDX_INIT_VCPU needs the address of the HOB as input. Invoke it for
each vcpu after the HOB list is created.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 5b688eb39327..2d0efca6787b 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -578,6 +578,22 @@ static void tdx_init_ram_entries(void)
     tdx_guest->nr_ram_entries = j;
 }
 
+static void tdx_post_init_vcpus(void)
+{
+    TdxFirmwareEntry *hob;
+    CPUState *cpu;
+    int r;
+
+    hob = tdx_get_hob_entry(tdx_guest);
+    CPU_FOREACH(cpu) {
+        r = tdx_vcpu_ioctl(cpu, KVM_TDX_INIT_VCPU, 0, (void *)hob->address);
+        if (r < 0) {
+            error_report("KVM_TDX_INIT_VCPU failed %s", strerror(-r));
+            exit(1);
+        }
+    }
+}
+
 static void tdx_finalize_vm(Notifier *notifier, void *unused)
 {
     TdxFirmware *tdvf = &tdx_guest->tdvf;
@@ -610,6 +626,8 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
 
     tdvf_hob_create(tdx_guest, tdx_get_hob_entry(tdx_guest));
 
+    tdx_post_init_vcpus();
+
     for_each_tdx_fw_entry(tdvf, entry) {
         struct kvm_tdx_init_mem_region mem_region = {
             .source_addr = (__u64)entry->mem_ptr,
-- 
2.34.1



* [PATCH v2 39/58] i386/tdx: Finalize TDX VM
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (37 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 38/58] i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 40/58] i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt> Xiaoyao Li
                   ` (18 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Invoke KVM_TDX_FINALIZE_VM to finalize the TD's measurement and make
the TD vCPUs runnable once machine initialization is complete.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 2d0efca6787b..5eabdafbe95c 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -660,6 +660,13 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
     /* Tdvf image was copied into private region above. It becomes unnecessary. */
     ram_block = tdx_guest->tdvf_region->ram_block;
     ram_block_discard_range(ram_block, 0, ram_block->max_length);
+
+    r = tdx_vm_ioctl(KVM_TDX_FINALIZE_VM, 0, NULL);
+    if (r < 0) {
+        error_report("KVM_TDX_FINALIZE_VM failed %s", strerror(-r));
+        exit(1);
+    }
+    tdx_guest->parent_obj.ready = true;
 }
 
 static Notifier tdx_machine_done_notify = {
-- 
2.34.1



* [PATCH v2 40/58] i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt>
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (38 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 39/58] i386/tdx: Finalize TDX VM Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote> Xiaoyao Li
                   ` (17 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

For SetupEventNotifyInterrupt, record the interrupt vector and the APIC ID
of the vcpu that received this TDVMCALL.

Later, QEMU can inject an interrupt with the given vector into the specific
vcpu that received SetupEventNotifyInterrupt.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
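
Illustration only, not part of the patch: a minimal, self-contained sketch
of the vector check described above. The sketch_* names are hypothetical;
the status code values are the ones defined in this patch. Unlike the patch,
which first narrows R12 to an int, this sketch validates the full 64-bit
value, so an operand with stray upper bits is rejected.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_VMCALL_SUCCESS          0x0000000000000000ULL
#define SKETCH_VMCALL_INVALID_OPERAND  0x8000000000000000ULL

/* Hypothetical stand-in for the two runtime fields added to TdxGuest. */
struct sketch_notify_state {
    int vector;        /* -1 until the guest registers a vector */
    uint32_t apic_id;  /* APIC ID of the vcpu that made the call */
};

/*
 * Record the vector only when it is a usable external interrupt vector
 * (32..255); otherwise leave the state untouched and report an error.
 */
static uint64_t sketch_setup_event_notify(struct sketch_notify_state *s,
                                          uint64_t r12, uint32_t apic_id)
{
    assert(s);
    if (r12 < 32 || r12 > 255) {
        return SKETCH_VMCALL_INVALID_OPERAND;
    }
    s->vector = (int)r12;
    s->apic_id = apic_id;
    return SKETCH_VMCALL_SUCCESS;
}
```

The returned value plays the role of vmcall->status_code; locking is left
out because the sketch has no concurrent callers.
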
 target/i386/kvm/kvm.c      |  9 ++++++
 target/i386/kvm/tdx-stub.c |  4 +++
 target/i386/kvm/tdx.c      | 61 ++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h      |  6 ++++
 4 files changed, 80 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4a146bc42f63..601683d836c8 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5657,6 +5657,15 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
         ret = kvm_xen_handle_exit(cpu, &run->xen);
         break;
 #endif
+    case KVM_EXIT_TDX:
+        if (!is_tdx_vm()) {
+            error_report("KVM: get KVM_EXIT_TDX for a non-TDX VM.");
+            ret = -1;
+            break;
+        }
+        tdx_handle_exit(cpu, &run->tdx);
+        ret = 0;
+        break;
     default:
         fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
         ret = -1;
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
index 3cd114476d78..e26e2111f606 100644
--- a/target/i386/kvm/tdx-stub.c
+++ b/target/i386/kvm/tdx-stub.c
@@ -16,3 +16,7 @@ int tdx_parse_tdvf(void *flash_ptr, int size)
 {
     return -EINVAL;
 }
+
+void tdx_handle_exit(X86CPU *cpu, struct kvm_tdx_exit *tdx_exit)
+{
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 5eabdafbe95c..1b444886e294 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -894,6 +894,9 @@ static void tdx_guest_init(Object *obj)
                                OBJ_PROP_FLAG_READWRITE);
     object_property_add_sha384(obj, "mrownerconfig", tdx->mrownerconfig,
                                OBJ_PROP_FLAG_READWRITE);
+
+    tdx->event_notify_interrupt = -1;
+    tdx->event_notify_apic_id = -1;
 }
 
 static void tdx_guest_finalize(Object *obj)
@@ -903,3 +906,61 @@ static void tdx_guest_finalize(Object *obj)
 static void tdx_guest_class_init(ObjectClass *oc, void *data)
 {
 }
+
+#define TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT      0x10004ULL
+
+#define TDG_VP_VMCALL_SUCCESS           0x0000000000000000ULL
+#define TDG_VP_VMCALL_RETRY             0x0000000000000001ULL
+#define TDG_VP_VMCALL_INVALID_OPERAND   0x8000000000000000ULL
+#define TDG_VP_VMCALL_GPA_INUSE         0x8000000000000001ULL
+#define TDG_VP_VMCALL_ALIGN_ERROR       0x8000000000000002ULL
+
+static void tdx_handle_setup_event_notify_interrupt(X86CPU *cpu,
+                                                    struct kvm_tdx_vmcall *vmcall)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    TdxGuest *tdx = TDX_GUEST(ms->cgs);
+    int event_notify_interrupt = vmcall->in_r12;
+
+    if (32 <= event_notify_interrupt && event_notify_interrupt <= 255) {
+        qemu_mutex_lock(&tdx->lock);
+        tdx->event_notify_interrupt = event_notify_interrupt;
+        tdx->event_notify_apic_id = cpu->apic_id;
+        qemu_mutex_unlock(&tdx->lock);
+        vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
+    }
+}
+
+static void tdx_handle_vmcall(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
+{
+    vmcall->status_code = TDG_VP_VMCALL_INVALID_OPERAND;
+
+    /* For now handle only TDG.VP.VMCALL. */
+    if (vmcall->type != 0) {
+        warn_report("unknown tdg.vp.vmcall type 0x%llx subfunction 0x%llx",
+                    vmcall->type, vmcall->subfunction);
+        return;
+    }
+
+    switch (vmcall->subfunction) {
+    case TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT:
+        tdx_handle_setup_event_notify_interrupt(cpu, vmcall);
+        break;
+    default:
+        warn_report("unknown tdg.vp.vmcall type 0x%llx subfunction 0x%llx",
+                    vmcall->type, vmcall->subfunction);
+        break;
+    }
+}
+
+void tdx_handle_exit(X86CPU *cpu, struct kvm_tdx_exit *tdx_exit)
+{
+    switch (tdx_exit->type) {
+    case KVM_EXIT_TDX_VMCALL:
+        tdx_handle_vmcall(cpu, &tdx_exit->u.vmcall);
+        break;
+    default:
+        warn_report("unknown tdx exit type 0x%x", tdx_exit->type);
+        break;
+    }
+}
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 1c444b6cdb3f..50a151fc79c2 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -7,6 +7,7 @@
 
 #include "exec/confidential-guest-support.h"
 #include "hw/i386/tdvf.h"
+#include "sysemu/kvm.h"
 
 #define TYPE_TDX_GUEST "tdx-guest"
 #define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
@@ -42,6 +43,10 @@ typedef struct TdxGuest {
 
     uint32_t nr_ram_entries;
     TdxRamEntry *ram_entries;
+
+    /* runtime state */
+    int event_notify_interrupt;
+    uint32_t event_notify_apic_id;
 } TdxGuest;
 
 #ifdef CONFIG_TDX
@@ -56,5 +61,6 @@ void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
 int tdx_pre_create_vcpu(CPUState *cpu);
 void tdx_set_tdvf_region(MemoryRegion *tdvf_region);
 int tdx_parse_tdvf(void *flash_ptr, int size);
+void tdx_handle_exit(X86CPU *cpu, struct kvm_tdx_exit *tdx_exit);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.34.1



* [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (39 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 40/58] i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt> Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-22  6:52   ` Markus Armbruster
  2023-08-18  9:50 ` [PATCH v2 42/58] i386/tdx: register the fd read callback with the main loop to read the quote data Xiaoyao Li
                   ` (16 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

For GetQuote, delegate the request to the Quote Generation Service (QGS).
Add a property that holds the address of the QGS. On a GetQuote request,
connect to the server, read the request buffer from shared guest memory,
send it to the server, store the response into shared guest memory, and
notify the TD guest by interrupt.

"quote-generation-service" is a property that specifies the Quote
Generation Service (QGS) in QEMU socket address format. Examples of
supported formats are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".

command line example:
  qemu-system-x86_64 \
    -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
    -machine confidential-guest-support=tdx0

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
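
Illustration only, not part of the patch: a minimal sketch of the MSI
encoding used to notify the guest, written with explicit shifts instead of
bitfields. It assumes the compiler allocates bitfields starting from the
least significant bit (as GCC does on x86) and that APIC_DM_FIXED is 0.
Like the patch, it leaves the base address bits of address_lo at zero. The
sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The three 32-bit words that end up in struct kvm_msi. */
struct sketch_msi {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/*
 * Encode a fixed, edge-triggered, physical-destination MSI for the vcpu
 * identified by apic_id, delivering the given vector.
 */
static struct sketch_msi sketch_encode_msi(uint32_t apic_id, uint8_t vector)
{
    struct sketch_msi m;

    assert(vector >= 32);                    /* 0..31 are reserved vectors */
    m.address_lo = (apic_id & 0xffu) << 12;  /* destid_0_7: bits 12..19 */
    m.address_hi = (apic_id >> 8) << 8;      /* destid_8_31: bits 8..31 */
    m.data = vector;                         /* vector: bits 0..7, rest 0 */
    return m;
}
```

The three words map one to one onto the address_lo, address_hi and data
fields handed to KVM_SIGNAL_MSI.
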
 qapi/qom.json         |   5 +-
 target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h |   7 +
 3 files changed, 391 insertions(+), 1 deletion(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 87c1d440f331..37139949d761 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -879,13 +879,16 @@
 #
 # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
 #
+# @quote-generation-service: socket address for Quote Generation Service (QGS)
+#
 # Since: 8.2
 ##
 { 'struct': 'TdxGuestProperties',
   'data': { '*sept-ve-disable': 'bool',
             '*mrconfigid': 'str',
             '*mrowner': 'str',
-            '*mrownerconfig': 'str' } }
+            '*mrownerconfig': 'str',
+            '*quote-generation-service': 'str' } }
 
 ##
 # @ThreadContextProperties:
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 1b444886e294..73d6cd88af9e 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -22,6 +22,8 @@
 #include "exec/address-spaces.h"
 #include "exec/ramblock.h"
 
+#include "exec/address-spaces.h"
+#include "hw/i386/apic_internal.h"
 #include "hw/i386/e820_memory_layout.h"
 #include "hw/i386/x86.h"
 #include "hw/i386/tdvf.h"
@@ -863,6 +865,25 @@ static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error **errp)
     }
 }
 
+static char *tdx_guest_get_quote_generation(
+    Object *obj, Error **errp)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+    return g_strdup(tdx->quote_generation_str);
+}
+
+static void tdx_guest_set_quote_generation(
+    Object *obj, const char *value, Error **errp)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+    tdx->quote_generation = socket_parse(value, errp);
+    if (!tdx->quote_generation)
+        return;
+
+    g_free(tdx->quote_generation_str);
+    tdx->quote_generation_str = g_strdup(value);
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
@@ -895,6 +916,12 @@ static void tdx_guest_init(Object *obj)
     object_property_add_sha384(obj, "mrownerconfig", tdx->mrownerconfig,
                                OBJ_PROP_FLAG_READWRITE);
 
+    tdx->quote_generation_str = NULL;
+    tdx->quote_generation = NULL;
+    object_property_add_str(obj, "quote-generation-service",
+                            tdx_guest_get_quote_generation,
+                            tdx_guest_set_quote_generation);
+
     tdx->event_notify_interrupt = -1;
     tdx->event_notify_apic_id = -1;
 }
@@ -907,6 +934,7 @@ static void tdx_guest_class_init(ObjectClass *oc, void *data)
 {
 }
 
+#define TDG_VP_VMCALL_GET_QUOTE                         0x10002ULL
 #define TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT      0x10004ULL
 
 #define TDG_VP_VMCALL_SUCCESS           0x0000000000000000ULL
@@ -915,6 +943,355 @@ static void tdx_guest_class_init(ObjectClass *oc, void *data)
 #define TDG_VP_VMCALL_GPA_INUSE         0x8000000000000001ULL
 #define TDG_VP_VMCALL_ALIGN_ERROR       0x8000000000000002ULL
 
+#define TDX_GET_QUOTE_STRUCTURE_VERSION 1ULL
+
+#define TDX_VP_GET_QUOTE_SUCCESS                0ULL
+#define TDX_VP_GET_QUOTE_IN_FLIGHT              (-1ULL)
+#define TDX_VP_GET_QUOTE_ERROR                  0x8000000000000000ULL
+#define TDX_VP_GET_QUOTE_QGS_UNAVAILABLE        0x8000000000000001ULL
+
+/* Limit to avoid resource starvation. */
+#define TDX_GET_QUOTE_MAX_BUF_LEN       (128 * 1024)
+#define TDX_MAX_GET_QUOTE_REQUEST       16
+
+/* Format of pages shared with guest. */
+struct tdx_get_quote_header {
+    /* Format version: must be 1 in little endian. */
+    uint64_t structure_version;
+
+    /*
+     * GetQuote status code in little endian:
+     *   Guest must set error_code to 0 to avoid information leak.
+     *   Qemu sets this before interrupting guest.
+     */
+    uint64_t error_code;
+
+    /*
+     * in-message size in little endian: The message will follow this header.
+     * The in-message will be sent to QGS.
+     */
+    uint32_t in_len;
+
+    /*
+     * out-message size in little endian:
+     * On request, out_len must be zero to avoid information leak.
+     * On return, message size from QGS. Qemu overwrites this field.
+     * The message will follow this header.  The in-message is overwritten.
+     */
+    uint32_t out_len;
+
+    /*
+     * Message buffer follows.
+     * Guest sets message that will be sent to QGS.  If out_len > in_len, guest
+     * should zero remaining buffer to avoid information leak.
+     * Qemu overwrites this buffer with a message returned from QGS.
+     */
+};
+
+static hwaddr tdx_shared_bit(X86CPU *cpu)
+{
+    return (cpu->phys_bits > 48) ? BIT_ULL(51) : BIT_ULL(47);
+}
+
+struct tdx_get_quote_task {
+    uint32_t apic_id;
+    hwaddr gpa;
+    uint64_t buf_len;
+    struct tdx_get_quote_header hdr;
+    int event_notify_interrupt;
+    QIOChannelSocket *ioc;
+};
+
+struct x86_msi {
+    union {
+        struct {
+            uint32_t    reserved_0              : 2,
+                        dest_mode_logical       : 1,
+                        redirect_hint           : 1,
+                        reserved_1              : 1,
+                        virt_destid_8_14        : 7,
+                        destid_0_7              : 8,
+                        base_address            : 12;
+        } QEMU_PACKED x86_address_lo;
+        uint32_t address_lo;
+    };
+    union {
+        struct {
+            uint32_t    reserved        : 8,
+                        destid_8_31     : 24;
+        } QEMU_PACKED x86_address_hi;
+        uint32_t address_hi;
+    };
+    union {
+        struct {
+            uint32_t    vector                  : 8,
+                        delivery_mode           : 3,
+                        dest_mode_logical       : 1,
+                        reserved                : 2,
+                        active_low              : 1,
+                        is_level                : 1;
+        } QEMU_PACKED x86_data;
+        uint32_t data;
+    };
+};
+
+static void tdx_td_notify(struct tdx_get_quote_task *t)
+{
+    struct x86_msi x86_msi;
+    struct kvm_msi msi;
+    int ret;
+
+    /* It is optional for host VMM to interrupt TD. */
+    if (!(32 <= t->event_notify_interrupt && t->event_notify_interrupt <= 255))
+        return;
+
+    x86_msi = (struct x86_msi) {
+        .x86_address_lo  = {
+            .reserved_0 = 0,
+            .dest_mode_logical = 0,
+            .redirect_hint = 0,
+            .reserved_1 = 0,
+            .virt_destid_8_14 = 0,
+            .destid_0_7 = t->apic_id & 0xff,
+        },
+        .x86_address_hi = {
+            .reserved = 0,
+            .destid_8_31 = t->apic_id >> 8,
+        },
+        .x86_data = {
+            .vector = t->event_notify_interrupt,
+            .delivery_mode = APIC_DM_FIXED,
+            .dest_mode_logical = 0,
+            .reserved = 0,
+            .active_low = 0,
+            .is_level = 0,
+        },
+    };
+    msi = (struct kvm_msi) {
+        .address_lo = x86_msi.address_lo,
+        .address_hi = x86_msi.address_hi,
+        .data = x86_msi.data,
+        .flags = 0,
+        .devid = 0,
+    };
+    ret = kvm_vm_ioctl(kvm_state, KVM_SIGNAL_MSI, &msi);
+    if (ret < 0) {
+        /* There is no better way to report this to the guest, so log it. */
+        error_report("TDX: injection %d failed, interrupt lost (%s).",
+                     t->event_notify_interrupt, strerror(-ret));
+    }
+}
+
+/*
+ * TODO: If QGS doesn't reply for long time, make it an error and interrupt
+ * guest.
+ */
+static void tdx_handle_get_quote_connected(QIOTask *task, gpointer opaque)
+{
+    struct tdx_get_quote_task *t = opaque;
+    Error *err = NULL;
+    char *in_data = NULL;
+    char *out_data = NULL;
+    size_t out_len;
+    ssize_t size;
+    MachineState *ms;
+    TdxGuest *tdx;
+
+    t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
+    if (qio_task_propagate_error(task, NULL)) {
+        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
+        goto error;
+    }
+
+    in_data = g_malloc(le32_to_cpu(t->hdr.in_len));
+    if (address_space_read(&address_space_memory, t->gpa + sizeof(t->hdr),
+                           MEMTXATTRS_UNSPECIFIED, in_data,
+                           le32_to_cpu(t->hdr.in_len)) != MEMTX_OK) {
+        goto error;
+    }
+
+    if (qio_channel_write_all(QIO_CHANNEL(t->ioc), in_data,
+                              le32_to_cpu(t->hdr.in_len), &err) ||
+        err) {
+        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
+        goto error;
+    }
+
+    out_data = g_malloc(t->buf_len);
+    out_len = 0;
+    size = 0;
+    while (true) {
+        char *buf;
+        size_t buf_size;
+
+        if (out_len < t->buf_len) {
+            buf = out_data + out_len;
+            buf_size = t->buf_len - out_len;
+        } else {
+            /*
+             * The received data is too large to fit in the shared GPA.
+             * Discard the received data and try to know the data size.
+             */
+            buf = out_data;
+            buf_size = t->buf_len;
+        }
+
+        size = qio_channel_read(QIO_CHANNEL(t->ioc), buf, buf_size, &err);
+        if (err) {
+            break;
+        }
+        if (size <= 0) {
+            break;
+        }
+        out_len += size;
+    }
+    /*
+     * Treat partial read as success and let the QGS client to handle it because
+     * the client knows better about the QGS.
+     */
+    if (out_len == 0 && (err || size < 0)) {
+        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
+        goto error;
+    }
+    if (out_len > 0 && out_len > t->buf_len) {
+        /*
+         * There is no specific error code defined for this case(E2BIG) at the
+         * moment.
+         * TODO: Once an error code for this case is defined in GHCI spec ,
+         * update the error code.
+         */
+        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
+        t->hdr.out_len = cpu_to_le32(out_len);
+        goto error_hdr;
+    }
+
+    if (address_space_write(
+            &address_space_memory, t->gpa + sizeof(t->hdr),
+            MEMTXATTRS_UNSPECIFIED, out_data, out_len) != MEMTX_OK) {
+        goto error;
+    }
+    /*
+     * Even if out_len == 0, it's a success.  It's up to the QGS-client contract
+     * how to interpret the zero-sized message as return message.
+     */
+    t->hdr.out_len = cpu_to_le32(out_len);
+    t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS);
+
+error:
+    if (t->hdr.error_code != cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS)) {
+        t->hdr.out_len = cpu_to_le32(0);
+    }
+error_hdr:
+    if (address_space_write(
+            &address_space_memory, t->gpa,
+            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
+        error_report("TDX: failed to updsate GetQuote header.\n");
+    }
+    tdx_td_notify(t);
+
+    qio_channel_close(QIO_CHANNEL(t->ioc), &err);
+    object_unref(OBJECT(t->ioc));
+    g_free(in_data);
+    g_free(out_data);
+
+    /* Maintain the number of in-flight requests. */
+    ms = MACHINE(qdev_get_machine());
+    tdx = TDX_GUEST(ms->cgs);
+    qemu_mutex_lock(&tdx->lock);
+    tdx->quote_generation_num--;
+    qemu_mutex_unlock(&tdx->lock);
+
+    return;
+}
+
+static void tdx_handle_get_quote(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
+{
+    hwaddr gpa = vmcall->in_r12;
+    uint64_t buf_len = vmcall->in_r13;
+    struct tdx_get_quote_header hdr;
+    MachineState *ms;
+    TdxGuest *tdx;
+    QIOChannelSocket *ioc;
+    struct tdx_get_quote_task *t;
+
+    vmcall->status_code = TDG_VP_VMCALL_INVALID_OPERAND;
+
+    /* GPA must be shared. */
+    if (!(gpa & tdx_shared_bit(cpu))) {
+        return;
+    }
+    gpa &= ~tdx_shared_bit(cpu);
+
+    if (!QEMU_IS_ALIGNED(gpa, 4096) || !QEMU_IS_ALIGNED(buf_len, 4096)) {
+        vmcall->status_code = TDG_VP_VMCALL_ALIGN_ERROR;
+        return;
+    }
+    if (buf_len == 0) {
+        return;
+    }
+
+    if (address_space_read(&address_space_memory, gpa, MEMTXATTRS_UNSPECIFIED,
+                           &hdr, sizeof(hdr)) != MEMTX_OK) {
+        return;
+    }
+    if (le64_to_cpu(hdr.structure_version) != TDX_GET_QUOTE_STRUCTURE_VERSION) {
+        return;
+    }
+    /*
+     * Paranoid: Guest should clear error_code and out_len to avoid information
+     * leak.  Enforce it.  The initial value of them doesn't matter for qemu to
+     * process the request.
+     */
+    if (le64_to_cpu(hdr.error_code) != TDX_VP_GET_QUOTE_SUCCESS ||
+        le32_to_cpu(hdr.out_len) != 0) {
+        return;
+    }
+
+    /* Safeguard check only, to reject an overly large buffer size. */
+    if (buf_len > TDX_GET_QUOTE_MAX_BUF_LEN ||
+        le32_to_cpu(hdr.in_len) > TDX_GET_QUOTE_MAX_BUF_LEN ||
+        le32_to_cpu(hdr.in_len) > buf_len) {
+        return;
+    }
+
+    /* Mark the buffer in-flight. */
+    hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_IN_FLIGHT);
+    if (address_space_write(&address_space_memory, gpa, MEMTXATTRS_UNSPECIFIED,
+                            &hdr, sizeof(hdr)) != MEMTX_OK) {
+        return;
+    }
+
+    ms = MACHINE(qdev_get_machine());
+    tdx = TDX_GUEST(ms->cgs);
+    ioc = qio_channel_socket_new();
+
+    t = g_malloc(sizeof(*t));
+    t->apic_id = tdx->event_notify_apic_id;
+    t->gpa = gpa;
+    t->buf_len = buf_len;
+    t->hdr = hdr;
+    t->ioc = ioc;
+
+    qemu_mutex_lock(&tdx->lock);
+    if (!tdx->quote_generation ||
+        /* Prevent too many in-flight get-quote requests. */
+        tdx->quote_generation_num >= TDX_MAX_GET_QUOTE_REQUEST) {
+        qemu_mutex_unlock(&tdx->lock);
+        vmcall->status_code = TDG_VP_VMCALL_RETRY;
+        object_unref(OBJECT(ioc));
+        g_free(t);
+        return;
+    }
+    tdx->quote_generation_num++;
+    t->event_notify_interrupt = tdx->event_notify_interrupt;
+    qio_channel_socket_connect_async(
+        ioc, tdx->quote_generation, tdx_handle_get_quote_connected, t, g_free,
+        NULL);
+    qemu_mutex_unlock(&tdx->lock);
+
+    vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
+}
+
 static void tdx_handle_setup_event_notify_interrupt(X86CPU *cpu,
                                                     struct kvm_tdx_vmcall *vmcall)
 {
@@ -943,6 +1320,9 @@ static void tdx_handle_vmcall(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
     }
 
     switch (vmcall->subfunction) {
+    case TDG_VP_VMCALL_GET_QUOTE:
+        tdx_handle_get_quote(cpu, vmcall);
+        break;
     case TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT:
         tdx_handle_setup_event_notify_interrupt(cpu, vmcall);
         break;
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 50a151fc79c2..d861d8516668 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -5,8 +5,10 @@
 #include CONFIG_DEVICES /* CONFIG_TDX */
 #endif
 
+#include <linux/kvm.h>
 #include "exec/confidential-guest-support.h"
 #include "hw/i386/tdvf.h"
+#include "io/channel-socket.h"
 #include "sysemu/kvm.h"
 
 #define TYPE_TDX_GUEST "tdx-guest"
@@ -47,6 +49,11 @@ typedef struct TdxGuest {
     /* runtime state */
     int event_notify_interrupt;
     uint32_t event_notify_apic_id;
+
+    /* GetQuote */
+    int quote_generation_num;
+    char *quote_generation_str;
+    SocketAddress *quote_generation;
 } TdxGuest;
 
 #ifdef CONFIG_TDX
-- 
2.34.1



* [PATCH v2 42/58] i386/tdx: register the fd read callback with the main loop to read the quote data
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (40 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote> Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-24  6:27   ` Chenyi Qiang
  2023-08-18  9:50 ` [PATCH v2 43/58] i386/tdx: setup a timer for the qio channel Xiaoyao Li
                   ` (15 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Chenyi Qiang <chenyi.qiang@intel.com>

When a TD guest invokes the GetQuote TDVMCALL, QEMU registers an async qio
task with the default context, which runs once the qio channel is connected.
However, qio_channel_read() performs a blocking recvmsg(), so it blocks the
main thread and leaves the TD guest unresponsive until the server returns.

Set the io channel non-blocking and register the socket fd with the main
loop. Move the read operation into the callback. When the fd is readable,
invoke the callback to handle the quote data.
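
The read-side pattern described here can be sketched outside QEMU with plain
POSIX calls. This is only an illustration under stated assumptions, not the
code in this patch: struct quote_reader, set_nonblocking() and
quote_reader_drain() are hypothetical names, and read() on an O_NONBLOCK
descriptor stands in for qio_channel_read() returning QIO_CHANNEL_ERR_BLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-request state, kept across callback invocations. */
struct quote_reader {
    char *buf;      /* destination buffer */
    size_t cap;     /* capacity of buf */
    size_t len;     /* bytes accumulated so far */
};

enum drain_status { DRAIN_AGAIN, DRAIN_DONE, DRAIN_ERROR };

/* Put the descriptor in non-blocking mode; returns 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Meant to be called each time the event loop reports the fd readable.
 * Reads until the peer closes (DRAIN_DONE), the fd would block
 * (DRAIN_AGAIN: return to the loop and wait for the next event), or an
 * error or overflow occurs (DRAIN_ERROR).
 */
static enum drain_status quote_reader_drain(int fd, struct quote_reader *r)
{
    assert(r->len <= r->cap);

    for (;;) {
        char scratch;
        char *dst = r->buf + r->len;
        size_t room = r->cap - r->len;
        ssize_t n;

        if (room == 0) {
            /* Buffer full: probe one byte to tell EOF from overflow. */
            dst = &scratch;
            room = 1;
        }
        n = read(fd, dst, room);
        if (n == 0) {
            return DRAIN_DONE;
        }
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                return DRAIN_AGAIN;
            }
            return DRAIN_ERROR;
        }
        if (dst == &scratch) {
            return DRAIN_ERROR;     /* reply does not fit in the buffer */
        }
        r->len += (size_t)n;
    }
}
```

The key point is that DRAIN_AGAIN returns control to the event loop without
tearing anything down, so partial data survives in the per-request state until
the next readable event. Unlike the patch, this sketch stops at the first
overflowing byte instead of continuing to read to learn the full reply size.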

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 147 +++++++++++++++++++++++++++---------------
 1 file changed, 96 insertions(+), 51 deletions(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 73d6cd88af9e..3cb2163a0335 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -997,6 +997,8 @@ struct tdx_get_quote_task {
     uint32_t apic_id;
     hwaddr gpa;
     uint64_t buf_len;
+    char *out_data;
+    uint64_t out_len;
     struct tdx_get_quote_header hdr;
     int event_notify_interrupt;
     QIOChannelSocket *ioc;
@@ -1082,78 +1084,53 @@ static void tdx_td_notify(struct tdx_get_quote_task *t)
     }
 }
 
-/*
- * TODO: If QGS doesn't reply for long time, make it an error and interrupt
- * guest.
- */
-static void tdx_handle_get_quote_connected(QIOTask *task, gpointer opaque)
+static void tdx_get_quote_read(void *opaque)
 {
     struct tdx_get_quote_task *t = opaque;
+    ssize_t size = 0;
     Error *err = NULL;
-    char *in_data = NULL;
-    char *out_data = NULL;
-    size_t out_len;
-    ssize_t size;
     MachineState *ms;
     TdxGuest *tdx;
 
-    t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
-    if (qio_task_propagate_error(task, NULL)) {
-        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
-        goto error;
-    }
-
-    in_data = g_malloc(le32_to_cpu(t->hdr.in_len));
-    if (address_space_read(&address_space_memory, t->gpa + sizeof(t->hdr),
-                           MEMTXATTRS_UNSPECIFIED, in_data,
-                           le32_to_cpu(t->hdr.in_len)) != MEMTX_OK) {
-        goto error;
-    }
-
-    if (qio_channel_write_all(QIO_CHANNEL(t->ioc), in_data,
-                              le32_to_cpu(t->hdr.in_len), &err) ||
-        err) {
-        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
-        goto error;
-    }
-
-    out_data = g_malloc(t->buf_len);
-    out_len = 0;
-    size = 0;
     while (true) {
         char *buf;
         size_t buf_size;
 
-        if (out_len < t->buf_len) {
-            buf = out_data + out_len;
-            buf_size = t->buf_len - out_len;
+        if (t->out_len < t->buf_len) {
+            buf = t->out_data + t->out_len;
+            buf_size = t->buf_len - t->out_len;
         } else {
             /*
              * The received data is too large to fit in the shared GPA.
              * Discard the received data and try to know the data size.
              */
-            buf = out_data;
+            buf = t->out_data;
             buf_size = t->buf_len;
         }
 
         size = qio_channel_read(QIO_CHANNEL(t->ioc), buf, buf_size, &err);
-        if (err) {
+        if (!size) {
             break;
         }
-        if (size <= 0) {
-            break;
+
+        if (size < 0) {
+            if (size == QIO_CHANNEL_ERR_BLOCK) {
+                return;
+            } else {
+                break;
+            }
         }
-        out_len += size;
+        t->out_len += size;
     }
     /*
-     * Treat partial read as success and let the QGS client to handle it because
-     * the client knows better about the QGS.
+     * If a partial read succeeded but the final read returned an error,
+     * treat the whole transfer as a failure.
      */
-    if (out_len == 0 && (err || size < 0)) {
+    if (size < 0) {
         t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
         goto error;
     }
-    if (out_len > 0 && out_len > t->buf_len) {
+    if (t->out_len > 0 && t->out_len > t->buf_len) {
         /*
          * There is no specific error code defined for this case(E2BIG) at the
          * moment.
@@ -1161,20 +1138,20 @@ static void tdx_handle_get_quote_connected(QIOTask *task, gpointer opaque)
          * update the error code.
          */
         t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
-        t->hdr.out_len = cpu_to_le32(out_len);
+        t->hdr.out_len = cpu_to_le32(t->out_len);
         goto error_hdr;
     }
 
     if (address_space_write(
             &address_space_memory, t->gpa + sizeof(t->hdr),
-            MEMTXATTRS_UNSPECIFIED, out_data, out_len) != MEMTX_OK) {
+            MEMTXATTRS_UNSPECIFIED, t->out_data, t->out_len) != MEMTX_OK) {
         goto error;
     }
     /*
      * Even if out_len == 0, it's a success.  It's up to the QGS-client contract
      * how to interpret the zero-sized message as return message.
      */
-    t->hdr.out_len = cpu_to_le32(out_len);
+    t->hdr.out_len = cpu_to_le32(t->out_len);
     t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS);
 
 error:
@@ -1185,14 +1162,15 @@ error_hdr:
     if (address_space_write(
             &address_space_memory, t->gpa,
             MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
-        error_report("TDX: failed to updsate GetQuote header.\n");
+        error_report("TDX: failed to update GetQuote header.");
     }
     tdx_td_notify(t);
 
+    qemu_set_fd_handler(t->ioc->fd, NULL, NULL, NULL);
     qio_channel_close(QIO_CHANNEL(t->ioc), &err);
     object_unref(OBJECT(t->ioc));
-    g_free(in_data);
-    g_free(out_data);
+    g_free(t->out_data);
+    g_free(t);
 
     /* Maintain the number of in-flight requests. */
     ms = MACHINE(qdev_get_machine());
@@ -1200,7 +1178,71 @@ error_hdr:
     qemu_mutex_lock(&tdx->lock);
     tdx->quote_generation_num--;
     qemu_mutex_unlock(&tdx->lock);
+}
+
+/*
+ * TODO: If QGS doesn't reply for long time, make it an error and interrupt
+ * guest.
+ */
+static void tdx_handle_get_quote_connected(QIOTask *task, gpointer opaque)
+{
+    struct tdx_get_quote_task *t = opaque;
+    Error *err = NULL;
+    char *in_data = NULL;
+    MachineState *ms;
+    TdxGuest *tdx;
+
+    t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
+    if (qio_task_propagate_error(task, NULL)) {
+        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
+        goto error;
+    }
+
+    in_data = g_malloc(le32_to_cpu(t->hdr.in_len));
+    if (!in_data) {
+        goto error;
+    }
+
+    if (address_space_read(&address_space_memory, t->gpa + sizeof(t->hdr),
+                           MEMTXATTRS_UNSPECIFIED, in_data,
+                           le32_to_cpu(t->hdr.in_len)) != MEMTX_OK) {
+        goto error;
+    }
+
+    qio_channel_set_blocking(QIO_CHANNEL(t->ioc), false, NULL);
+
+    if (qio_channel_write_all(QIO_CHANNEL(t->ioc), in_data,
+                              le32_to_cpu(t->hdr.in_len), &err) ||
+        err) {
+        t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
+        goto error;
+    }
+
+    g_free(in_data);
+    qemu_set_fd_handler(t->ioc->fd, tdx_get_quote_read, NULL, t);
+
+    return;
+error:
+    t->hdr.out_len = cpu_to_le32(0);
 
+    if (address_space_write(
+            &address_space_memory, t->gpa,
+            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
+        error_report("TDX: failed to update GetQuote header.\n");
+    }
+    tdx_td_notify(t);
+
+    qio_channel_close(QIO_CHANNEL(t->ioc), &err);
+    object_unref(OBJECT(t->ioc));
+    g_free(t);
+    g_free(in_data);
+
+    /* Maintain the number of in-flight requests. */
+    ms = MACHINE(qdev_get_machine());
+    tdx = TDX_GUEST(ms->cgs);
+    qemu_mutex_lock(&tdx->lock);
+    tdx->quote_generation_num--;
+    qemu_mutex_unlock(&tdx->lock);
     return;
 }
 
@@ -1269,6 +1311,8 @@ static void tdx_handle_get_quote(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
     t->apic_id = tdx->event_notify_apic_id;
     t->gpa = gpa;
     t->buf_len = buf_len;
+    t->out_data = g_malloc(t->buf_len);
+    t->out_len = 0;
     t->hdr = hdr;
     t->ioc = ioc;
 
@@ -1279,13 +1323,14 @@ static void tdx_handle_get_quote(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
         qemu_mutex_unlock(&tdx->lock);
         vmcall->status_code = TDG_VP_VMCALL_RETRY;
         object_unref(OBJECT(ioc));
+        g_free(t->out_data);
         g_free(t);
         return;
     }
     tdx->quote_generation_num++;
     t->event_notify_interrupt = tdx->event_notify_interrupt;
     qio_channel_socket_connect_async(
-        ioc, tdx->quote_generation, tdx_handle_get_quote_connected, t, g_free,
+        ioc, tdx->quote_generation, tdx_handle_get_quote_connected, t, NULL,
         NULL);
     qemu_mutex_unlock(&tdx->lock);
 
-- 
2.34.1



* [PATCH v2 43/58] i386/tdx: setup a timer for the qio channel
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (41 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 42/58] i386/tdx: register the fd read callback with the main loop to read the quote data Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-24  7:21   ` Chenyi Qiang
  2023-08-18  9:50 ` [PATCH v2 44/58] i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall Xiaoyao Li
                   ` (14 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Chenyi Qiang <chenyi.qiang@intel.com>

To handle an unresponsive QGS server, set up a timer for the transaction. If
it times out, report an error and interrupt the guest. The timeout is set to
30s for now; it may be changed if that value turns out to be inappropriate.

Extract the common cleanup code to make it clearer.
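
The idea of bounding the wait for a reply can be sketched with plain POSIX
calls. This is a simplified stand-in, not the mechanism used in this patch:
the patch arms a QEMUTimer next to the fd handler in the main loop, whereas
the hypothetical read_with_timeout() below uses poll() with a timeout, and
READ_TIMED_OUT is an illustrative return value.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative sentinel, distinct from read()'s -1 error return. */
#define READ_TIMED_OUT ((ssize_t)-2)

/*
 * Wait up to timeout_ms for fd to become readable, then read once.
 * Returns the byte count, 0 on EOF, -1 on error, or READ_TIMED_OUT.
 * Note: a retry after EINTR restarts the full timeout in this sketch.
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len,
                                 int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    assert(timeout_ms >= 0);

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) {
        return -1;
    }
    if (rc == 0) {
        /* Peer stayed silent: the caller should fail the request. */
        return READ_TIMED_OUT;
    }
    return read(fd, buf, len);
}
```

Whichever mechanism is used, the timeout path and the data path must end in
the same cleanup, so that the request is completed exactly once.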

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 151 ++++++++++++++++++++++++------------------
 1 file changed, 85 insertions(+), 66 deletions(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 3cb2163a0335..fa658ce1f2e4 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -1002,6 +1002,7 @@ struct tdx_get_quote_task {
     struct tdx_get_quote_header hdr;
     int event_notify_interrupt;
     QIOChannelSocket *ioc;
+    QEMUTimer timer;
 };
 
 struct x86_msi {
@@ -1084,13 +1085,48 @@ static void tdx_td_notify(struct tdx_get_quote_task *t)
     }
 }
 
+static void tdx_getquote_task_cleanup(struct tdx_get_quote_task *t, bool outlen_overflow)
+{
+    MachineState *ms;
+    TdxGuest *tdx;
+
+    if (t->hdr.error_code != cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS) && !outlen_overflow) {
+        t->hdr.out_len = cpu_to_le32(0);
+    }
+
+    /* Publish the response contents before marking this request completed. */
+    smp_wmb();
+    if (address_space_write(
+            &address_space_memory, t->gpa,
+            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
+        error_report("TDX: failed to update GetQuote header.");
+    }
+    tdx_td_notify(t);
+
+    if (t->ioc->fd >= 0) {
+        qemu_set_fd_handler(t->ioc->fd, NULL, NULL, NULL);
+    }
+    qio_channel_close(QIO_CHANNEL(t->ioc), NULL);
+    object_unref(OBJECT(t->ioc));
+    timer_del(&t->timer);
+    g_free(t->out_data);
+    g_free(t);
+
+    /* Maintain the number of in-flight requests. */
+    ms = MACHINE(qdev_get_machine());
+    tdx = TDX_GUEST(ms->cgs);
+    qemu_mutex_lock(&tdx->lock);
+    tdx->quote_generation_num--;
+    qemu_mutex_unlock(&tdx->lock);
+}
+
+
 static void tdx_get_quote_read(void *opaque)
 {
     struct tdx_get_quote_task *t = opaque;
     ssize_t size = 0;
     Error *err = NULL;
-    MachineState *ms;
-    TdxGuest *tdx;
+    bool outlen_overflow = false;
 
     while (true) {
         char *buf;
@@ -1135,11 +1171,12 @@ static void tdx_get_quote_read(void *opaque)
          * There is no specific error code defined for this case(E2BIG) at the
          * moment.
          * TODO: Once an error code for this case is defined in GHCI spec ,
-         * update the error code.
+         * update the error code and the tdx_getquote_task_cleanup() argument.
          */
         t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
         t->hdr.out_len = cpu_to_le32(t->out_len);
-        goto error_hdr;
+        outlen_overflow = true;
+        goto error;
     }
 
     if (address_space_write(
@@ -1155,94 +1192,76 @@ static void tdx_get_quote_read(void *opaque)
     t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS);
 
 error:
-    if (t->hdr.error_code != cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS)) {
-        t->hdr.out_len = cpu_to_le32(0);
-    }
-error_hdr:
-    if (address_space_write(
-            &address_space_memory, t->gpa,
-            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
-        error_report("TDX: failed to update GetQuote header.");
-    }
-    tdx_td_notify(t);
+    tdx_getquote_task_cleanup(t, outlen_overflow);
+}
+
+#define TRANSACTION_TIMEOUT 30000
+
+static void getquote_timer_expired(void *opaque)
+{
+    struct tdx_get_quote_task *t = opaque;
+
+    tdx_getquote_task_cleanup(t, false);
+}
 
-    qemu_set_fd_handler(t->ioc->fd, NULL, NULL, NULL);
-    qio_channel_close(QIO_CHANNEL(t->ioc), &err);
-    object_unref(OBJECT(t->ioc));
-    g_free(t->out_data);
-    g_free(t);
+static void tdx_transaction_start(struct tdx_get_quote_task *t)
+{
+    int64_t time;
 
-    /* Maintain the number of in-flight requests. */
-    ms = MACHINE(qdev_get_machine());
-    tdx = TDX_GUEST(ms->cgs);
-    qemu_mutex_lock(&tdx->lock);
-    tdx->quote_generation_num--;
-    qemu_mutex_unlock(&tdx->lock);
+    time = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+    /*
+     * Timeout callback and fd callback both run in main loop thread,
+     * thus no need to worry about race condition.
+     */
+    qemu_set_fd_handler(t->ioc->fd, tdx_get_quote_read, NULL, t);
+    timer_init_ms(&t->timer, QEMU_CLOCK_VIRTUAL, getquote_timer_expired, t);
+    timer_mod(&t->timer, time + TRANSACTION_TIMEOUT);
 }
 
-/*
- * TODO: If QGS doesn't reply for long time, make it an error and interrupt
- * guest.
- */
 static void tdx_handle_get_quote_connected(QIOTask *task, gpointer opaque)
 {
     struct tdx_get_quote_task *t = opaque;
     Error *err = NULL;
     char *in_data = NULL;
-    MachineState *ms;
-    TdxGuest *tdx;
+    int ret = 0;
 
     t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_ERROR);
-    if (qio_task_propagate_error(task, NULL)) {
+    ret = qio_task_propagate_error(task, NULL);
+    if (ret) {
         t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
-        goto error;
+        goto out;
     }
 
     in_data = g_malloc(le32_to_cpu(t->hdr.in_len));
     if (!in_data) {
-        goto error;
+        ret = -1;
+        goto out;
     }
 
-    if (address_space_read(&address_space_memory, t->gpa + sizeof(t->hdr),
-                           MEMTXATTRS_UNSPECIFIED, in_data,
-                           le32_to_cpu(t->hdr.in_len)) != MEMTX_OK) {
-        goto error;
+    ret = address_space_read(&address_space_memory, t->gpa + sizeof(t->hdr),
+                             MEMTXATTRS_UNSPECIFIED, in_data,
+                             le32_to_cpu(t->hdr.in_len));
+    if (ret) {
+        g_free(in_data);
+        goto out;
     }
 
     qio_channel_set_blocking(QIO_CHANNEL(t->ioc), false, NULL);
 
-    if (qio_channel_write_all(QIO_CHANNEL(t->ioc), in_data,
-                              le32_to_cpu(t->hdr.in_len), &err) ||
-        err) {
+    ret = qio_channel_write_all(QIO_CHANNEL(t->ioc), in_data,
+                              le32_to_cpu(t->hdr.in_len), &err);
+    if (ret) {
         t->hdr.error_code = cpu_to_le64(TDX_VP_GET_QUOTE_QGS_UNAVAILABLE);
-        goto error;
+        g_free(in_data);
+        goto out;
     }
 
-    g_free(in_data);
-    qemu_set_fd_handler(t->ioc->fd, tdx_get_quote_read, NULL, t);
-
-    return;
-error:
-    t->hdr.out_len = cpu_to_le32(0);
-
-    if (address_space_write(
-            &address_space_memory, t->gpa,
-            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
-        error_report("TDX: failed to update GetQuote header.\n");
+out:
+    if (ret) {
+        tdx_getquote_task_cleanup(t, false);
+    } else {
+        tdx_transaction_start(t);
     }
-    tdx_td_notify(t);
-
-    qio_channel_close(QIO_CHANNEL(t->ioc), &err);
-    object_unref(OBJECT(t->ioc));
-    g_free(t);
-    g_free(in_data);
-
-    /* Maintain the number of in-flight requests. */
-    ms = MACHINE(qdev_get_machine());
-    tdx = TDX_GUEST(ms->cgs);
-    qemu_mutex_lock(&tdx->lock);
-    tdx->quote_generation_num--;
-    qemu_mutex_unlock(&tdx->lock);
     return;
 }
 
-- 
2.34.1



* [PATCH v2 44/58] i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (42 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 43/58] i386/tdx: setup a timer for the qio channel Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 45/58] i386/tdx: Limit the range size for MapGPA Xiaoyao Li
                   ` (13 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

MapGPA is a hypercall that converts a GPA range between private and shared.
As the conversion function is already implemented as kvm_convert_memory(),
wire it to the TDX hypercall exit.
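
How a MapGPA request is decoded can be sketched as a standalone function. The
shared-bit rule and the range checks mirror tdx_shared_bit() and
tdx_handle_map_gpa() in this patch; struct map_gpa_req, decode_map_gpa() and
the status names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAP_GPA_PAGE_SIZE 4096ULL

/* Hypothetical decoded form of the R12/R13 hypercall operands. */
struct map_gpa_req {
    uint64_t gpa;        /* start address with the shared bit stripped */
    uint64_t size;       /* length in bytes */
    bool to_private;     /* true when the shared bit was clear */
};

enum map_gpa_status { MAP_GPA_OK, MAP_GPA_ALIGN_ERROR, MAP_GPA_INVALID };

/* Same rule as the patch: bit 51 for wide guests, else bit 47. */
static uint64_t map_gpa_shared_bit(unsigned phys_bits)
{
    return phys_bits > 48 ? 1ULL << 51 : 1ULL << 47;
}

static enum map_gpa_status decode_map_gpa(uint64_t r12, uint64_t r13,
                                          unsigned phys_bits,
                                          struct map_gpa_req *out)
{
    uint64_t sbit = map_gpa_shared_bit(phys_bits);
    uint64_t gpa = r12 & ~sbit;
    uint64_t limit;

    assert(phys_bits > 0 && phys_bits < 64);
    limit = 1ULL << phys_bits;

    if ((gpa | r13) & (MAP_GPA_PAGE_SIZE - 1)) {
        return MAP_GPA_ALIGN_ERROR;
    }
    if (gpa + r13 < gpa) {
        return MAP_GPA_INVALID;     /* address wraparound */
    }
    if (gpa >= limit || gpa + r13 >= limit) {
        return MAP_GPA_INVALID;     /* beyond the guest physical range */
    }

    out->gpa = gpa;
    out->size = r13;
    out->to_private = !(r12 & sbit);
    return MAP_GPA_OK;
}
```

The direction of the conversion is carried entirely by the shared bit of the
requested address: a set bit asks for shared memory, a clear bit for private.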

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 accel/kvm/kvm-all.c   |  2 +-
 include/sysemu/kvm.h  |  2 ++
 target/i386/kvm/tdx.c | 37 +++++++++++++++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 8d53c89e9dbf..b0a18800fd63 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3059,7 +3059,7 @@ static void kvm_eat_signals(CPUState *cpu)
     } while (sigismember(&chkset, SIG_IPI));
 }
 
-static int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
+int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
 {
     MemoryRegionSection section;
     void *addr;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index d89ec87072d7..7e0f56d1c5c5 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -584,4 +584,6 @@ uint32_t kvm_dirty_ring_size(void);
 
 int kvm_set_memory_attributes_private(hwaddr start, hwaddr size);
 int kvm_set_memory_attributes_shared(hwaddr start, hwaddr size);
+
+int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
 #endif
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index fa658ce1f2e4..0c43c1f7759f 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -934,6 +934,7 @@ static void tdx_guest_class_init(ObjectClass *oc, void *data)
 {
 }
 
+#define TDG_VP_VMCALL_MAP_GPA                           0x10001ULL
 #define TDG_VP_VMCALL_GET_QUOTE                         0x10002ULL
 #define TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT      0x10004ULL
 
@@ -993,6 +994,39 @@ static hwaddr tdx_shared_bit(X86CPU *cpu)
     return (cpu->phys_bits > 48) ? BIT_ULL(51) : BIT_ULL(47);
 }
 
+static void tdx_handle_map_gpa(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
+{
+    hwaddr shared_bit = tdx_shared_bit(cpu);
+    hwaddr gpa = vmcall->in_r12 & ~shared_bit;
+    bool private = !(vmcall->in_r12 & shared_bit);
+    hwaddr size = vmcall->in_r13;
+    int ret = 0;
+
+    vmcall->status_code = TDG_VP_VMCALL_INVALID_OPERAND;
+
+    if (!QEMU_IS_ALIGNED(gpa, 4096) || !QEMU_IS_ALIGNED(size, 4096)) {
+        vmcall->status_code = TDG_VP_VMCALL_ALIGN_ERROR;
+        return;
+    }
+
+    /* Overflow case. */
+    if (gpa + size < gpa) {
+        return;
+    }
+    if (gpa >= (1ULL << cpu->phys_bits) ||
+        gpa + size >= (1ULL << cpu->phys_bits)) {
+        return;
+    }
+
+    if (size > 0) {
+        ret = kvm_convert_memory(gpa, size, private);
+    }
+
+    if (!ret) {
+        vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
+    }
+}
+
 struct tdx_get_quote_task {
     uint32_t apic_id;
     hwaddr gpa;
@@ -1384,6 +1418,9 @@ static void tdx_handle_vmcall(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
     }
 
     switch (vmcall->subfunction) {
+    case TDG_VP_VMCALL_MAP_GPA:
+        tdx_handle_map_gpa(cpu, vmcall);
+        break;
     case TDG_VP_VMCALL_GET_QUOTE:
         tdx_handle_get_quote(cpu, vmcall);
         break;
-- 
2.34.1



* [PATCH v2 45/58] i386/tdx: Limit the range size for MapGPA
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (43 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 44/58] i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21 22:30   ` Isaku Yamahata
  2023-08-18  9:50 ` [PATCH v2 46/58] i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR> Xiaoyao Li
                   ` (12 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

If the range for TDG.VP.VMCALL<MapGPA> is too large, process only a
limited size and return a retry error.  It is bad for the VMM to block
vcpu execution for too long, e.g. on the order of seconds, because that
results in too many missed timer interrupts.
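
As a rough illustration of the chunk-and-retry contract described above, here
is a minimal standalone sketch.  All sketch_* names and the guest-side loop are
hypothetical and are not part of this patch; the shared-bit handling of the
real handler is left out, and the 64MB cap is only the value this patch
currently uses.

```c
#include <stdint.h>

/* Hypothetical cap, mirroring the 64MB value used in this patch. */
#define SKETCH_MAP_GPA_MAX_LEN (64ULL * 1024 * 1024)

enum sketch_status { SKETCH_SUCCESS, SKETCH_RETRY };

struct sketch_result {
    enum sketch_status status;
    uint64_t next_gpa;   /* meaningful only when status == SKETCH_RETRY */
    uint64_t converted;  /* bytes handled by this call */
};

/* VMM side: handle at most one chunk and report where to resume. */
static struct sketch_result sketch_map_gpa(uint64_t gpa, uint64_t size)
{
    struct sketch_result r = { SKETCH_SUCCESS, 0, size };

    if (size > SKETCH_MAP_GPA_MAX_LEN) {
        r.status = SKETCH_RETRY;
        r.converted = SKETCH_MAP_GPA_MAX_LEN;
        r.next_gpa = gpa + r.converted;
    }
    return r;
}

/*
 * Guest side (assumed behaviour): reissue the call from next_gpa until
 * the whole range is done.  Returns the number of calls that were needed.
 */
static unsigned sketch_guest_convert(uint64_t gpa, uint64_t size)
{
    uint64_t end = gpa + size;
    unsigned calls = 0;

    for (;;) {
        struct sketch_result r = sketch_map_gpa(gpa, end - gpa);

        calls++;
        if (r.status == SKETCH_SUCCESS) {
            return calls;
        }
        gpa = r.next_gpa;
    }
}
```

With this model a 200MB request takes four calls (three retries of 64MB and a
final 8MB), so the vcpu gets back to the guest between chunks.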

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 0c43c1f7759f..ced55be506d1 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -994,12 +994,16 @@ static hwaddr tdx_shared_bit(X86CPU *cpu)
     return (cpu->phys_bits > 48) ? BIT_ULL(51) : BIT_ULL(47);
 }
 
+/* 64MB at most in one call. What value is appropriate? */
+#define TDX_MAP_GPA_MAX_LEN     (64 * 1024 * 1024)
+
 static void tdx_handle_map_gpa(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
 {
     hwaddr shared_bit = tdx_shared_bit(cpu);
     hwaddr gpa = vmcall->in_r12 & ~shared_bit;
     bool private = !(vmcall->in_r12 & shared_bit);
     hwaddr size = vmcall->in_r13;
+    bool retry = false;
     int ret = 0;
 
     vmcall->status_code = TDG_VP_VMCALL_INVALID_OPERAND;
@@ -1018,12 +1022,25 @@ static void tdx_handle_map_gpa(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
         return;
     }
 
+    if (size > TDX_MAP_GPA_MAX_LEN) {
+        retry = true;
+        size = TDX_MAP_GPA_MAX_LEN;
+    }
+
     if (size > 0) {
         ret = kvm_convert_memory(gpa, size, private);
     }
 
     if (!ret) {
-        vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
+        if (retry) {
+            vmcall->status_code = TDG_VP_VMCALL_RETRY;
+            vmcall->out_r11 = gpa + size;
+            if (!private) {
+                vmcall->out_r11 |= shared_bit;
+            }
+        } else {
+            vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
+        }
     }
 }
 
-- 
2.34.1



* [PATCH v2 46/58] i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR>
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (44 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 45/58] i386/tdx: Limit the range size for MapGPA Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility Xiaoyao Li
                   ` (11 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index ced55be506d1..f111b46dac92 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -936,6 +936,7 @@ static void tdx_guest_class_init(ObjectClass *oc, void *data)
 
 #define TDG_VP_VMCALL_MAP_GPA                           0x10001ULL
 #define TDG_VP_VMCALL_GET_QUOTE                         0x10002ULL
+#define TDG_VP_VMCALL_REPORT_FATAL_ERROR                0x10003ULL
 #define TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT      0x10004ULL
 
 #define TDG_VP_VMCALL_SUCCESS           0x0000000000000000ULL
@@ -1407,6 +1408,42 @@ static void tdx_handle_get_quote(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
     vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
 }
 
+static void tdx_handle_report_fatal_error(X86CPU *cpu,
+                                          struct kvm_tdx_vmcall *vmcall)
+{
+    uint64_t error_code = vmcall->in_r12;
+    char *message = NULL;
+
+    if (error_code & 0xffff) {
+        error_report("invalid error code of TDG.VP.VMCALL<REPORT_FATAL_ERROR>\n");
+        exit(1);
+    }
+
+    /* it has optional message */
+    if (vmcall->in_r14) {
+        uint64_t *tmp;
+
+#define GUEST_PANIC_INFO_TDX_MESSAGE_MAX        64
+        message = g_malloc0(GUEST_PANIC_INFO_TDX_MESSAGE_MAX + 1);
+
+        tmp = (uint64_t *)message;
+        /* The order is defined in TDX GHCI spec */
+        *(tmp++) = cpu_to_le64(vmcall->in_r14);
+        *(tmp++) = cpu_to_le64(vmcall->in_r15);
+        *(tmp++) = cpu_to_le64(vmcall->in_rbx);
+        *(tmp++) = cpu_to_le64(vmcall->in_rdi);
+        *(tmp++) = cpu_to_le64(vmcall->in_rsi);
+        *(tmp++) = cpu_to_le64(vmcall->in_r8);
+        *(tmp++) = cpu_to_le64(vmcall->in_r9);
+        *(tmp++) = cpu_to_le64(vmcall->in_rdx);
+        message[GUEST_PANIC_INFO_TDX_MESSAGE_MAX] = '\0';
+        assert((char *)tmp == message + GUEST_PANIC_INFO_TDX_MESSAGE_MAX);
+    }
+
+    error_report("TD guest reports fatal error. %s\n", message ? : "");
+    exit(1);
+}
+
 static void tdx_handle_setup_event_notify_interrupt(X86CPU *cpu,
                                                     struct kvm_tdx_vmcall *vmcall)
 {
@@ -1441,6 +1478,9 @@ static void tdx_handle_vmcall(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
     case TDG_VP_VMCALL_GET_QUOTE:
         tdx_handle_get_quote(cpu, vmcall);
         break;
+    case TDG_VP_VMCALL_REPORT_FATAL_ERROR:
+        tdx_handle_report_fatal_error(cpu, vmcall);
+        break;
     case TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT:
         tdx_handle_setup_event_notify_interrupt(cpu, vmcall);
         break;
-- 
2.34.1



* [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (45 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 46/58] i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR> Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-21  9:58   ` Daniel P. Berrangé
  2023-08-18  9:50 ` [PATCH v2 48/58] i386/tdx: Disable SMM for TDX VMs Xiaoyao Li
                   ` (10 subsequent siblings)
  57 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 qapi/run-state.json   | 17 +++++++++++++--
 softmmu/runstate.c    | 49 +++++++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.c | 24 ++++++++++++++++++++-
 3 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/qapi/run-state.json b/qapi/run-state.json
index f216ba54ec4c..506bbe31541f 100644
--- a/qapi/run-state.json
+++ b/qapi/run-state.json
@@ -499,7 +499,7 @@
 # Since: 2.9
 ##
 { 'enum': 'GuestPanicInformationType',
-  'data': [ 'hyper-v', 's390' ] }
+  'data': [ 'hyper-v', 's390', 'tdx' ] }
 
 ##
 # @GuestPanicInformation:
@@ -514,7 +514,8 @@
  'base': {'type': 'GuestPanicInformationType'},
  'discriminator': 'type',
  'data': {'hyper-v': 'GuestPanicInformationHyperV',
-          's390': 'GuestPanicInformationS390'}}
+          's390': 'GuestPanicInformationS390',
+          'tdx' : 'GuestPanicInformationTdx'}}
 
 ##
 # @GuestPanicInformationHyperV:
@@ -577,6 +578,18 @@
           'psw-addr': 'uint64',
           'reason': 'S390CrashReason'}}
 
+##
+# @GuestPanicInformationTdx:
+#
+# TDX GHCI TDG.VP.VMCALL<ReportFatalError> specific guest panic information
+#
+# Since: 8.2
+##
+{'struct': 'GuestPanicInformationTdx',
+ 'data': {'error-code': 'uint64',
+          'gpa': 'uint64',
+          'message': 'str'}}
+
 ##
 # @MEMORY_FAILURE:
 #
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index f3bd86281813..cab11484ed7e 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
                           S390CrashReason_str(info->u.s390.reason),
                           info->u.s390.psw_mask,
                           info->u.s390.psw_addr);
+        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
+            char *buf = NULL;
+            bool printable = false;
+
+            /*
+             * Although message is defined as a json string, we shouldn't
+             * unconditionally treat it as is because the guest generated it and
+             * it's not necessarily trustable.
+             */
+            if (info->u.tdx.message) {
+                /* The caller guarantees the NUL-terminated string. */
+                int len = strlen(info->u.tdx.message);
+                int i;
+
+                printable = len > 0;
+                for (i = 0; i < len; i++) {
+                    if (!(0x20 <= info->u.tdx.message[i] &&
+                          info->u.tdx.message[i] <= 0x7e)) {
+                        printable = false;
+                        break;
+                    }
+                }
+
+                /* 3 = length of "%02x "; +1 for sprintf()'s trailing NUL */
+                buf = g_malloc(len * 3 + 1);
+                for (i = 0; i < len; i++) {
+                    if (info->u.tdx.message[i] == '\0') {
+                        break;
+                    } else {
+                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
+                    }
+                }
+                if (i > 0)
+                    /* replace the last ' '(space) to NUL */
+                    buf[i * 3 - 1] = '\0';
+                else
+                    buf[0] = '\0';
+            }
+
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          //" TDX report fatal error:\"%s\" %s",
+                          " TDX report fatal error:\"%s\" "
+                          "error: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
+                          printable ? info->u.tdx.message : "",
+                          //buf ? buf : "",
+                          info->u.tdx.error_code,
+                          info->u.tdx.gpa);
+            g_free(buf);
         }
+
         qapi_free_GuestPanicInformation(info);
     }
 }
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index f111b46dac92..7efaa13f59e2 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -18,6 +18,7 @@
 #include "qom/object_interfaces.h"
 #include "standard-headers/asm-x86/kvm_para.h"
 #include "sysemu/kvm.h"
+#include "sysemu/runstate.h"
 #include "sysemu/sysemu.h"
 #include "exec/address-spaces.h"
 #include "exec/ramblock.h"
@@ -1408,11 +1409,26 @@ static void tdx_handle_get_quote(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
     vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
 }
 
+static void tdx_panicked_on_fatal_error(X86CPU *cpu, uint64_t error_code,
+                                        uint64_t gpa, char *message)
+{
+    GuestPanicInformation *panic_info;
+
+    panic_info = g_new0(GuestPanicInformation, 1);
+    panic_info->type = GUEST_PANIC_INFORMATION_TYPE_TDX;
+    panic_info->u.tdx.error_code = error_code;
+    panic_info->u.tdx.gpa = gpa;
+    panic_info->u.tdx.message = (char *)message;
+
+    qemu_system_guest_panicked(panic_info);
+}
+
 static void tdx_handle_report_fatal_error(X86CPU *cpu,
                                           struct kvm_tdx_vmcall *vmcall)
 {
     uint64_t error_code = vmcall->in_r12;
     char *message = NULL;
+    uint64_t gpa = -1ull;
 
     if (error_code & 0xffff) {
         error_report("invalid error code of TDG.VP.VMCALL<REPORT_FATAL_ERROR>\n");
@@ -1441,7 +1457,13 @@ static void tdx_handle_report_fatal_error(X86CPU *cpu,
     }
 
     error_report("TD guest reports fatal error. %s\n", message ? : "");
-    exit(1);
+
+#define TDX_REPORT_FATAL_ERROR_GPA_VALID    BIT_ULL(63)
+    if (error_code & TDX_REPORT_FATAL_ERROR_GPA_VALID) {
+        gpa = vmcall->in_r13;
+    }
+
+    tdx_panicked_on_fatal_error(cpu, error_code, gpa, message);
 }
 
 static void tdx_handle_setup_event_notify_interrupt(X86CPU *cpu,
-- 
2.34.1



* [PATCH v2 48/58] i386/tdx: Disable SMM for TDX VMs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (46 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 49/58] i386/tdx: Disable PIC " Xiaoyao Li
                   ` (9 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

TDX doesn't support SMM, and the VMM cannot emulate SMM for TDX VMs
because the VMM cannot manipulate a TDX VM's memory.

Disable SMM for TDX VMs and error out if the user requests SMM.
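
The on/off/auto handling described above can be sketched in isolation as
follows.  This is only an illustration: the sketch_* enum and function are
hypothetical stand-ins and do not use QEMU's OnOffAuto type or Error API.

```c
#include <errno.h>

/* Hypothetical stand-in for a tri-state machine option. */
enum sketch_on_off_auto { SKETCH_AUTO, SKETCH_ON, SKETCH_OFF };

/*
 * Resolve a tri-state option for a feature the platform cannot provide:
 * "auto" silently becomes "off", "off" stays "off", and an explicit "on"
 * is rejected instead of being ignored.
 */
static int sketch_force_off(enum sketch_on_off_auto *opt)
{
    if (*opt == SKETCH_ON) {
        return -EINVAL;
    }
    *opt = SKETCH_OFF;
    return 0;
}
```

Rejecting an explicit "on" rather than overriding it keeps the user from
believing that a feature is active when it is not.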

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 7efaa13f59e2..f9d03ab0f461 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -678,9 +678,17 @@ static Notifier tdx_machine_done_notify = {
 
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
+    X86MachineState *x86ms = X86_MACHINE(ms);
     TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
                                                     TYPE_TDX_GUEST);
 
+    if (x86ms->smm == ON_OFF_AUTO_AUTO) {
+        x86ms->smm = ON_OFF_AUTO_OFF;
+    } else if (x86ms->smm == ON_OFF_AUTO_ON) {
+        error_setg(errp, "TDX VM doesn't support SMM");
+        return -EINVAL;
+    }
+
     if (!tdx_caps) {
         get_tdx_capabilities();
     }
-- 
2.34.1



* [PATCH v2 49/58] i386/tdx: Disable PIC for TDX VMs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (47 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 48/58] i386/tdx: Disable SMM for TDX VMs Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 50/58] i386/tdx: Don't allow system reset " Xiaoyao Li
                   ` (8 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Legacy PIC (8259) cannot be supported for TDX VMs since the TDX module
doesn't allow direct interrupt injection.  Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.

Hence disable PIC for TDX VMs and error out if the user wants PIC.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/tdx.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index f9d03ab0f461..23ecd84a9e21 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -689,6 +689,13 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         return -EINVAL;
     }
 
+    if (x86ms->pic == ON_OFF_AUTO_AUTO) {
+        x86ms->pic = ON_OFF_AUTO_OFF;
+    } else if (x86ms->pic == ON_OFF_AUTO_ON) {
+        error_setg(errp, "TDX VM doesn't support PIC");
+        return -EINVAL;
+    }
+
     if (!tdx_caps) {
         get_tdx_capabilities();
     }
-- 
2.34.1



* [PATCH v2 50/58] i386/tdx: Don't allow system reset for TDX VMs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (48 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 49/58] i386/tdx: Disable PIC " Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 51/58] i386/tdx: LMCE is not supported for TDX Xiaoyao Li
                   ` (7 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

TDX CPU state is protected and thus vcpu state can't be reset by the VMM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 601683d836c8..50b0218a8044 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5918,7 +5918,7 @@ bool kvm_has_waitpkg(void)
 
 bool kvm_arch_cpu_check_are_resettable(void)
 {
-    return !sev_es_enabled();
+    return !sev_es_enabled() && !is_tdx_vm();
 }
 
 #define ARCH_REQ_XCOMP_GUEST_PERM       0x1025
-- 
2.34.1



* [PATCH v2 51/58] i386/tdx: LMCE is not supported for TDX
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (49 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 50/58] i386/tdx: Don't allow system reset " Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 52/58] hw/i386: add eoi_intercept_unsupported member to X86MachineState Xiaoyao Li
                   ` (6 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

LMCE is not supported for TDX since KVM doesn't provide emulation for
MSR_IA32_FEAT_CTL.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm-cpu.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c
index 7237378a7d4e..bec8b5f918e7 100644
--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -15,6 +15,7 @@
 #include "sysemu/sysemu.h"
 #include "hw/boards.h"
 
+#include "tdx.h"
 #include "kvm_i386.h"
 #include "hw/core/accel-cpu.h"
 
@@ -59,6 +60,10 @@ static bool lmce_supported(void)
     if (kvm_ioctl(kvm_state, KVM_X86_GET_MCE_CAP_SUPPORTED, &mce_cap) < 0) {
         return false;
     }
+
+    if (is_tdx_vm())
+        return false;
+
     return !!(mce_cap & MCG_LMCE_P);
 }
 
-- 
2.34.1



* [PATCH v2 52/58] hw/i386: add eoi_intercept_unsupported member to X86MachineState
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (50 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 51/58] i386/tdx: LMCE is not supported for TDX Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 53/58] hw/i386: add option to forcibly report edge trigger in acpi tables Xiaoyao Li
                   ` (5 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Add a new bool member, eoi_intercept_unsupported, to X86MachineState
with default value false. Set it to true for TDX VMs.

When EOI cannot be intercepted, a level-triggered interrupt cannot be
re-injected while its level is still kept active, which affects
interrupt controller emulation.
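
To make the dependency on EOI interception concrete, here is a deliberately
simplified model of a level-triggered line.  The sketch_* names are
hypothetical and this is not QEMU's interrupt controller code; it only shows
that re-injection hinges on the VMM seeing the EOI.

```c
#include <stdbool.h>

/* Hypothetical model of one level-triggered interrupt line. */
struct sketch_irq_line {
    bool asserted;       /* the device still holds the line active */
    bool in_service;     /* delivered, waiting for the guest's EOI */
    unsigned delivered;  /* number of injections so far */
};

/* The device raises the line; inject unless one is already in service. */
static void sketch_raise(struct sketch_irq_line *l)
{
    l->asserted = true;
    if (!l->in_service) {
        l->in_service = true;
        l->delivered++;
    }
}

/*
 * Runs only if the VMM can observe the guest's EOI.  If the line is
 * still active at that point, the interrupt has to be injected again.
 */
static void sketch_eoi(struct sketch_irq_line *l)
{
    l->in_service = false;
    if (l->asserted) {
        l->in_service = true;
        l->delivered++;
    }
}
```

If sketch_eoi() is never called, as in the case where EOI cannot be
intercepted, the line stays in service after the first injection and the
still-active level is never delivered again.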

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 hw/i386/x86.c         | 1 +
 include/hw/i386/x86.h | 1 +
 target/i386/kvm/tdx.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index a0c9f4d646e2..567384484244 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1426,6 +1426,7 @@ static void x86_machine_initfn(Object *obj)
     x86ms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
     x86ms->bus_lock_ratelimit = 0;
     x86ms->above_4g_mem_start = 4 * GiB;
+    x86ms->eoi_intercept_unsupported = false;
 }
 
 static void x86_machine_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index a3d03f78cefe..c4bfb67b03c7 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -61,6 +61,7 @@ struct X86MachineState {
 
     /* CPU and apic information: */
     bool apic_xrupt_override;
+    bool eoi_intercept_unsupported;
     unsigned pci_irq_mask;
     unsigned apic_id_limit;
     uint16_t boot_cpus;
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 23ecd84a9e21..9c017cf16d0d 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -696,6 +696,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         return -EINVAL;
     }
 
+    x86ms->eoi_intercept_unsupported = true;
+
     if (!tdx_caps) {
         get_tdx_capabilities();
     }
-- 
2.34.1



* [PATCH v2 53/58] hw/i386: add option to forcibly report edge trigger in acpi tables
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (51 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 52/58] hw/i386: add eoi_intercept_unsupported member to X86MachineState Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 54/58] i386/tdx: Don't synchronize guest tsc for TDs Xiaoyao Li
                   ` (4 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

When level trigger isn't supported on the x86 platform,
forcibly report edge trigger in the ACPI tables.
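
The selection that this patch threads through the ACPI builders boils down to
the following.  This is a standalone sketch with hypothetical sketch_* types;
the real code passes AML_EDGE or AML_LEVEL to aml_interrupt().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the AML trigger modes and machine state. */
enum sketch_trigger { SKETCH_TRIGGER_EDGE, SKETCH_TRIGGER_LEVEL };

struct sketch_machine {
    bool eoi_intercept_unsupported;
};

/* Pick the trigger mode to advertise for a PCI interrupt link device. */
static enum sketch_trigger
sketch_pci_link_trigger(const struct sketch_machine *ms)
{
    bool level_trigger_unsupported = ms->eoi_intercept_unsupported;

    return level_trigger_unsupported ? SKETCH_TRIGGER_EDGE
                                     : SKETCH_TRIGGER_LEVEL;
}
```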

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 hw/i386/acpi-build.c  | 99 ++++++++++++++++++++++++++++---------------
 hw/i386/acpi-common.c | 50 ++++++++++++++++------
 2 files changed, 104 insertions(+), 45 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 9c74fa17ade0..0505514480bd 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -976,7 +976,8 @@ static void build_dbg_aml(Aml *table)
     aml_append(table, scope);
 }
 
-static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
+static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg,
+                           bool level_trigger_unsupported)
 {
     Aml *dev;
     Aml *crs;
@@ -988,7 +989,10 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
     aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
 
     crs = aml_resource_template();
-    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
+    aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                  level_trigger_unsupported ?
+                                  AML_EDGE : AML_LEVEL,
+                                  AML_ACTIVE_HIGH,
                                   AML_SHARED, irqs, ARRAY_SIZE(irqs)));
     aml_append(dev, aml_name_decl("_PRS", crs));
 
@@ -1012,7 +1016,8 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
     return dev;
  }
 
-static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
+static Aml *build_gsi_link_dev(const char *name, uint8_t uid,
+                               uint8_t gsi, bool level_trigger_unsupported)
 {
     Aml *dev;
     Aml *crs;
@@ -1025,7 +1030,10 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
 
     crs = aml_resource_template();
     irqs = gsi;
-    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
+    aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                  level_trigger_unsupported ?
+                                  AML_EDGE : AML_LEVEL,
+                                  AML_ACTIVE_HIGH,
                                   AML_SHARED, &irqs, 1));
     aml_append(dev, aml_name_decl("_PRS", crs));
 
@@ -1044,7 +1052,7 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
 }
 
 /* _CRS method - get current settings */
-static Aml *build_iqcr_method(bool is_piix4)
+static Aml *build_iqcr_method(bool is_piix4, bool level_trigger_unsupported)
 {
     Aml *if_ctx;
     uint32_t irqs;
@@ -1052,7 +1060,9 @@ static Aml *build_iqcr_method(bool is_piix4)
     Aml *crs = aml_resource_template();
 
     irqs = 0;
-    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
+    aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                  level_trigger_unsupported ?
+                                  AML_EDGE : AML_LEVEL,
                                   AML_ACTIVE_HIGH, AML_SHARED, &irqs, 1));
     aml_append(method, aml_name_decl("PRR0", crs));
 
@@ -1086,7 +1096,7 @@ static Aml *build_irq_status_method(void)
     return method;
 }
 
-static void build_piix4_pci0_int(Aml *table)
+static void build_piix4_pci0_int(Aml *table, bool level_trigger_unsupported)
 {
     Aml *dev;
     Aml *crs;
@@ -1099,12 +1109,16 @@ static void build_piix4_pci0_int(Aml *table)
     aml_append(sb_scope, pci0_scope);
 
     aml_append(sb_scope, build_irq_status_method());
-    aml_append(sb_scope, build_iqcr_method(true));
+    aml_append(sb_scope, build_iqcr_method(true, level_trigger_unsupported));
 
-    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0")));
-    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1")));
-    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2")));
-    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3")));
+    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3"),
+                                        level_trigger_unsupported));
 
     dev = aml_device("LNKS");
     {
@@ -1113,7 +1127,9 @@ static void build_piix4_pci0_int(Aml *table)
 
         crs = aml_resource_template();
         irqs = 9;
-        aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
+        aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                      level_trigger_unsupported ?
+                                      AML_EDGE : AML_LEVEL,
                                       AML_ACTIVE_HIGH, AML_SHARED,
                                       &irqs, 1));
         aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1199,7 +1215,7 @@ static Aml *build_q35_routing_table(const char *str)
     return pkg;
 }
 
-static void build_q35_pci0_int(Aml *table)
+static void build_q35_pci0_int(Aml *table, bool level_trigger_unsupported)
 {
     Aml *method;
     Aml *sb_scope = aml_scope("_SB");
@@ -1238,25 +1254,41 @@ static void build_q35_pci0_int(Aml *table)
     aml_append(sb_scope, pci0_scope);
 
     aml_append(sb_scope, build_irq_status_method());
-    aml_append(sb_scope, build_iqcr_method(false));
+    aml_append(sb_scope, build_iqcr_method(false, level_trigger_unsupported));
 
-    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA")));
-    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB")));
-    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC")));
-    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD")));
-    aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE")));
-    aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF")));
-    aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG")));
-    aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH")));
+    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH"),
+                                        level_trigger_unsupported));
 
-    aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10));
-    aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11));
-    aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12));
-    aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13));
-    aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14));
-    aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15));
-    aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16));
-    aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17));
+    aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17,
+                                            level_trigger_unsupported));
 
     aml_append(table, sb_scope);
 }
@@ -1436,6 +1468,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
     X86MachineState *x86ms = X86_MACHINE(machine);
+    bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
     AcpiMcfgInfo mcfg;
     bool mcfg_valid = !!acpi_get_mcfg(&mcfg);
     uint32_t nr_mem = machine->ram_slots;
@@ -1469,7 +1502,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         if (pm->pcihp_bridge_en || pm->pcihp_root_en) {
             build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
         }
-        build_piix4_pci0_int(dsdt);
+        build_piix4_pci0_int(dsdt, level_trigger_unsupported);
     } else if (q35) {
         sb_scope = aml_scope("_SB");
         dev = aml_device("PCI0");
@@ -1514,7 +1547,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         if (pm->pcihp_bridge_en) {
             build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
         }
-        build_q35_pci0_int(dsdt);
+        build_q35_pci0_int(dsdt, level_trigger_unsupported);
     }
 
     if (misc->has_hpet) {
diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c
index 8a0932fe84ea..eafdd7d35d47 100644
--- a/hw/i386/acpi-common.c
+++ b/hw/i386/acpi-common.c
@@ -104,6 +104,7 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
     AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(adev);
     AcpiTable table = { .sig = "APIC", .rev = 3, .oem_id = oem_id,
                         .oem_table_id = oem_table_id };
+    bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
 
     acpi_table_begin(&table, table_data);
     /* Local APIC Address */
@@ -123,18 +124,43 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
                      IO_APIC_SECONDARY_ADDRESS, IO_APIC_SECONDARY_IRQBASE);
     }
 
-    if (x86ms->apic_xrupt_override) {
-        build_xrupt_override(table_data, 0, 2,
-            0 /* Flags: Conforms to the specifications of the bus */);
-    }
-
-    for (i = 1; i < 16; i++) {
-        if (!(x86ms->pci_irq_mask & (1 << i))) {
-            /* No need for a INT source override structure. */
-            continue;
-        }
-        build_xrupt_override(table_data, i, i,
-            0xd /* Flags: Active high, Level Triggered */);
+    if (level_trigger_unsupported) {
+        /* Force edge trigger */
+        if (x86ms->apic_xrupt_override) {
+            build_xrupt_override(table_data, 0, 2,
+                                 /* Flags: active high, edge triggered */
+                                 1 | (1 << 2));
+        }
+
+        for (i = x86ms->apic_xrupt_override ? 1 : 0; i < 16; i++) {
+            build_xrupt_override(table_data, i, i,
+                                 /* Flags: active high, edge triggered */
+                                 1 | (1 << 2));
+        }
+
+        if (x86ms->ioapic2) {
+            for (i = 0; i < 16; i++) {
+                build_xrupt_override(table_data, IO_APIC_SECONDARY_IRQBASE + i,
+                                     IO_APIC_SECONDARY_IRQBASE + i,
+                                     /* Flags: active high, edge triggered */
+                                     1 | (1 << 2));
+            }
+        }
+    } else {
+        if (x86ms->apic_xrupt_override) {
+            build_xrupt_override(table_data, 0, 2,
+                                 0 /* Flags: Conforms to the specifications of the bus */);
+        }
+
+        for (i = 1; i < 16; i++) {
+            if (!(x86ms->pci_irq_mask & (1 << i))) {
+                /* No need for a INT source override structure. */
+                continue;
+            }
+            build_xrupt_override(table_data, i, i,
+                                 0xd /* Flags: Active high, Level Triggered */);
+
+        }
     }
 
     if (x2apic_mode) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 54/58] i386/tdx: Don't synchronize guest tsc for TDs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (52 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 53/58] hw/i386: add option to forcibly report edge trigger in acpi tables Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 55/58] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() " Xiaoyao Li
                   ` (3 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Isaku Yamahata <isaku.yamahata@intel.com>

The TSC of a TD is not accessible, and KVM doesn't allow access to
MSR_IA32_TSC for TDs. To avoid the assert() in kvm_get_tsc(), make
kvm_synchronize_all_tsc() a no-op for TDs.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 50b0218a8044..3f18a8d0f40b 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -298,7 +298,7 @@ void kvm_synchronize_all_tsc(void)
 {
     CPUState *cpu;
 
-    if (kvm_enabled()) {
+    if (kvm_enabled() && !is_tdx_vm()) {
         CPU_FOREACH(cpu) {
             run_on_cpu(cpu, do_kvm_synchronize_tsc, RUN_ON_CPU_NULL);
         }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 55/58] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (53 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 54/58] i386/tdx: Don't synchronize guest tsc for TDs Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 56/58] i386/tdx: Skip kvm_put_apicbase() " Xiaoyao Li
                   ` (2 subsequent siblings)
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

For TDs, MSR_IA32_UCODE_REV is the only MSR in kvm_init_msrs() that the
VMM can configure. The features enumerated or controlled by the other
MSRs set in kvm_init_msrs() are not under control of the VMM.

So configure only MSR_IA32_UCODE_REV for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/kvm.c | 44 ++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 3f18a8d0f40b..53d8d65f6667 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3426,32 +3426,34 @@ static void kvm_init_msrs(X86CPU *cpu)
     CPUX86State *env = &cpu->env;
 
     kvm_msr_buf_reset(cpu);
-    if (has_msr_arch_capabs) {
-        kvm_msr_entry_add(cpu, MSR_IA32_ARCH_CAPABILITIES,
-                          env->features[FEAT_ARCH_CAPABILITIES]);
-    }
-
-    if (has_msr_core_capabs) {
-        kvm_msr_entry_add(cpu, MSR_IA32_CORE_CAPABILITY,
-                          env->features[FEAT_CORE_CAPABILITY]);
-    }
-
-    if (has_msr_perf_capabs && cpu->enable_pmu) {
-        kvm_msr_entry_add_perf(cpu, env->features);
+
+    if (!is_tdx_vm()) {
+        if (has_msr_arch_capabs) {
+            kvm_msr_entry_add(cpu, MSR_IA32_ARCH_CAPABILITIES,
+                                env->features[FEAT_ARCH_CAPABILITIES]);
+        }
+
+        if (has_msr_core_capabs) {
+            kvm_msr_entry_add(cpu, MSR_IA32_CORE_CAPABILITY,
+                                env->features[FEAT_CORE_CAPABILITY]);
+        }
+
+        if (has_msr_perf_capabs && cpu->enable_pmu) {
+            kvm_msr_entry_add_perf(cpu, env->features);
+        }
+
+        /*
+         * Older kernels do not include VMX MSRs in KVM_GET_MSR_INDEX_LIST, but
+         * all kernels with MSR features should have them.
+         */
+        if (kvm_feature_msrs && cpu_has_vmx(env)) {
+            kvm_msr_entry_add_vmx(cpu, env->features);
+        }
     }
 
     if (has_msr_ucode_rev) {
         kvm_msr_entry_add(cpu, MSR_IA32_UCODE_REV, cpu->ucode_rev);
     }
-
-    /*
-     * Older kernels do not include VMX MSRs in KVM_GET_MSR_INDEX_LIST, but
-     * all kernels with MSR features should have them.
-     */
-    if (kvm_feature_msrs && cpu_has_vmx(env)) {
-        kvm_msr_entry_add_vmx(cpu, env->features);
-    }
-
     assert(kvm_buf_set_msrs(cpu) == 0);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 56/58] i386/tdx: Skip kvm_put_apicbase() for TDs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (54 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 55/58] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() " Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 57/58] i386/tdx: Don't get/put guest state for TDX VMs Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 58/58] docs: Add TDX documentation Xiaoyao Li
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

KVM doesn't allow writing to MSR_IA32_APICBASE for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/kvm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 53d8d65f6667..d542351983cd 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3208,6 +3208,11 @@ void kvm_put_apicbase(X86CPU *cpu, uint64_t value)
 {
     int ret;
 
+    /* TODO: Allow accessing guest state for debug TDs. */
+    if (is_tdx_vm()) {
+        return;
+    }
+
     ret = kvm_put_one_msr(cpu, MSR_IA32_APICBASE, value);
     assert(ret == 1);
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 57/58] i386/tdx: Don't get/put guest state for TDX VMs
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (55 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 56/58] i386/tdx: Skip kvm_put_apicbase() " Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  2023-08-18  9:50 ` [PATCH v2 58/58] docs: Add TDX documentation Xiaoyao Li
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Don't get/put state of TDX VMs since accessing/mutating guest state of
production TDs is not supported.

Note, it will be allowed for a debug TD. Corresponding support will be
introduced when debug TD support is implemented in the future.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
---
 target/i386/kvm/kvm.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index d542351983cd..1422c79aca40 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4852,6 +4852,11 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
+    /* TODO: Allow accessing guest state for debug TDs. */
+    if (is_tdx_vm()) {
+        return 0;
+    }
+
     /*
      * Put MSR_IA32_FEATURE_CONTROL first, this ensures the VM gets out of VMX
      * root operation upon vCPU reset. kvm_put_msr_feature_control() should also
@@ -4962,6 +4967,12 @@ int kvm_arch_get_registers(CPUState *cs)
     if (ret < 0) {
         goto out;
     }
+
+    /* TODO: Allow accessing guest state for debug TDs. */
+    if (is_tdx_vm()) {
+        return 0;
+    }
+
     ret = kvm_getput_regs(cpu, 0);
     if (ret < 0) {
         goto out;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* [PATCH v2 58/58] docs: Add TDX documentation
  2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
                   ` (56 preceding siblings ...)
  2023-08-18  9:50 ` [PATCH v2 57/58] i386/tdx: Don't get/put guest state for TDX VMs Xiaoyao Li
@ 2023-08-18  9:50 ` Xiaoyao Li
  57 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-18  9:50 UTC (permalink / raw)
  To: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, xiaoyao.li,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

Add docs/system/i386/tdx.rst for TDX support, and add tdx in
confidential-guest-support.rst

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>

---
Changes since v1:
 - Add prerequisite of private gmem;
 - update example command to launch TD;

Changes since RFC v4:
 - add the restriction that kernel-irqchip must be split
---
 docs/system/confidential-guest-support.rst |   1 +
 docs/system/i386/tdx.rst                   | 114 +++++++++++++++++++++
 docs/system/target-i386.rst                |   1 +
 3 files changed, 116 insertions(+)
 create mode 100644 docs/system/i386/tdx.rst

diff --git a/docs/system/confidential-guest-support.rst b/docs/system/confidential-guest-support.rst
index 0c490dbda2b7..66129fbab64c 100644
--- a/docs/system/confidential-guest-support.rst
+++ b/docs/system/confidential-guest-support.rst
@@ -38,6 +38,7 @@ Supported mechanisms
 Currently supported confidential guest mechanisms are:
 
 * AMD Secure Encrypted Virtualization (SEV) (see :doc:`i386/amd-memory-encryption`)
+* Intel Trust Domain Extensions (TDX) (see :doc:`i386/tdx`)
 * POWER Protected Execution Facility (PEF) (see :ref:`power-papr-protected-execution-facility-pef`)
 * s390x Protected Virtualization (PV) (see :doc:`s390x/protvirt`)
 
diff --git a/docs/system/i386/tdx.rst b/docs/system/i386/tdx.rst
new file mode 100644
index 000000000000..48c0861c0530
--- /dev/null
+++ b/docs/system/i386/tdx.rst
@@ -0,0 +1,114 @@
+Intel Trust Domain Extensions (TDX)
+===================================
+
+Intel Trust Domain Extensions (TDX) refers to an Intel technology that extends
+Virtual Machine Extensions (VMX) and Multi-Key Total Memory Encryption (MKTME)
+with a new kind of virtual machine guest called a Trust Domain (TD). A TD runs
+in a CPU mode that is designed to protect the confidentiality of its memory
+contents and its CPU state from any other software, including the hosting
+Virtual Machine Monitor (VMM), unless explicitly shared by the TD itself.
+
+Prerequisites
+-------------
+
+To run a TD, the physical machine must have the TDX module loaded and
+initialized, and KVM must have TDX support enabled. If those requirements are
+met, ``KVM_CAP_VM_TYPES`` reports support for ``KVM_X86_TDX_VM``.
+
+Trust Domain Virtual Firmware (TDVF)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Trust Domain Virtual Firmware (TDVF) is required to provide TD services to boot
+the TD guest OS. TDVF needs to be copied to guest private memory and measured
+before a TD boots.
+
+The VM scope ``MEMORY_ENCRYPT_OP`` ioctl provides command ``KVM_TDX_INIT_MEM_REGION``
+to copy the TDVF image to TD's private memory space.
+
+Since TDX doesn't support readonly memslots, TDVF cannot be mapped as a pflash
+device; it effectively works as RAM. The ``-bios`` option is used to load TDVF.
+
+OVMF is the open-source firmware that implements TDVF support. Thus the
+command line to specify and load TDVF is ``-bios OVMF.fd``.
+
+KVM private gmem
+~~~~~~~~~~~~~~~~
+
+A TD's memory (RAM) needs to be convertible between private and shared, and its
+BIOS (OVMF/TDVF) needs to be mapped as private. QEMU thus allocates private
+gmem for them via the KVM ioctl ``KVM_CREATE_GUEST_MEMFD``, which requires a KVM
+that is new enough to support gmem.
+
+Feature Control
+---------------
+
+Unlike a non-TDX VM, the CPU features (enumerated by CPUID or MSR) of a TD are
+not under full control of the VMM. The VMM can only configure a subset of them,
+via the ``KVM_TDX_INIT_VM`` command of the VM scope ``MEMORY_ENCRYPT_OP`` ioctl.
+
+The configurable features have three types:
+
+- Attributes:
+  - PKS (bit 30) controls whether Supervisor Protection Keys is exposed to TD,
+  which determines related CPUID bit and CR4 bit;
+  - PERFMON (bit 63) controls whether PMU is exposed to TD.
+
+- XSAVE related features (XFAM):
+  XFAM is a 64-bit mask, which has the same format as XCR0 or IA32_XSS MSR. It
+  determines the set of extended features available for use by the guest TD.
+
+- CPUID features:
+  Only some bits of some CPUID leaves are directly configurable by VMM.
+
+Which features can be configured is reported via the TDX capabilities.
+
+TDX capabilities
+~~~~~~~~~~~~~~~~
+
+The VM scope ``MEMORY_ENCRYPT_OP`` ioctl provides the ``KVM_TDX_CAPABILITIES``
+command to get the TDX capabilities from KVM. It returns a
+``struct kvm_tdx_capabilities``, which describes the supported configuration of
+attributes, XFAM and CPUIDs.
+
+Launching a TD (TDX VM)
+-----------------------
+
+To launch a TDX guest:
+
+.. parsed-literal::
+
+    |qemu_system_x86| \\
+        -object memory-backend-ram,id=mem0,size=${mem},private=on \\
+        -object tdx-guest,id=tdx0 \\
+        -machine ...,kernel-irqchip=split,confidential-guest-support=tdx0,memory-backend=mem0 \\
+        -bios OVMF.fd
+
+Debugging
+---------
+
+Bit 0 of the TD attributes is the DEBUG bit, which decides whether the TD runs
+in off-TD debug mode. In off-TD debug mode, the TD's VCPU state and private
+memory are accessible via given SEAMCALLs. This requires KVM to expose APIs to
+invoke those SEAMCALLs, and corresponding QEMU changes.
+
+It's targeted as future work.
+
+Restrictions
+------------
+
+ - kernel-irqchip must be split;
+
+ - No readonly support for private memory;
+
+ - No SMM support: SMM support requires manipulating the guest register state,
+   which is not allowed;
+
+Live Migration
+--------------
+
+TODO
+
+References
+----------
+
+- `TDX Homepage <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html>`__
diff --git a/docs/system/target-i386.rst b/docs/system/target-i386.rst
index 1b8a1f248abb..4d58cdbc4e06 100644
--- a/docs/system/target-i386.rst
+++ b/docs/system/target-i386.rst
@@ -29,6 +29,7 @@ Architectural features
    i386/kvm-pv
    i386/sgx
    i386/amd-memory-encryption
+   i386/tdx
 
 OS requirements
 ~~~~~~~~~~~~~~~
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 120+ messages in thread

* Re: [PATCH v2 03/58] target/i386: Parse TDX vm type
  2023-08-18  9:49 ` [PATCH v2 03/58] target/i386: Parse TDX vm type Xiaoyao Li
@ 2023-08-21  8:27   ` Daniel P. Berrangé
  2023-08-21 13:37     ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  8:27 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:49:46AM -0400, Xiaoyao Li wrote:
> TDX VM requires VM type KVM_X86_TDX_VM to be passed to
> kvm_ioctl(KVM_CREATE_VM).
> 
> If tdx-guest object is specified to confidential-guest-support, like,
> 
>   qemu -machine ...,confidential-guest-support=tdx0 \
>        -object tdx-guest,id=tdx0,...
> 
> it parses VM type as KVM_X86_TDX_VM.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  target/i386/kvm/kvm.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 62f237068a3a..77f4772afe6c 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -32,6 +32,7 @@
>  #include "sysemu/runstate.h"
>  #include "kvm_i386.h"
>  #include "sev.h"
> +#include "tdx.h"
>  #include "xen-emu.h"
>  #include "hyperv.h"
>  #include "hyperv-proto.h"
> @@ -158,6 +159,7 @@ static int kvm_get_one_msr(X86CPU *cpu, int index, uint64_t *value);
>  static const char* vm_type_name[] = {
>      [KVM_X86_DEFAULT_VM] = "default",
>      [KVM_X86_SW_PROTECTED_VM] = "sw-protected-vm",
> +    [KVM_X86_TDX_VM] = "tdx",
>  };
>  
>  int kvm_get_vm_type(MachineState *ms, const char *vm_type)
> @@ -170,12 +172,18 @@ int kvm_get_vm_type(MachineState *ms, const char *vm_type)
>              kvm_type = KVM_X86_DEFAULT_VM;
>          } else if (!g_ascii_strcasecmp(vm_type, "sw-protected-vm")) {
>              kvm_type = KVM_X86_SW_PROTECTED_VM;
> -        } else {
> +        } else if (!g_ascii_strcasecmp(vm_type, "tdx")) {
> +            kvm_type = KVM_X86_TDX_VM;
> +        }else {
>              error_report("Unknown kvm-type specified '%s'", vm_type);
>              exit(1);
>          }
>      }

This whole block of code should go away - as this should not exist
as a user visible property. It should be sufficient to use the
tdx-guest object type to identify use of TDX.
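
As a rough illustration of what I mean, here is a self-contained sketch
(not QEMU code) in which the VM type is derived purely from the type of
the confidential-guest-support object, with no user-visible string
property. The names struct cgs_object, vm_type_from_cgs() and the enum
values are made up for this example; real code would use
object_dynamic_cast() on ms->cgs as the hunk below already does.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the KVM VM type constants; the values are illustrative only. */
enum vm_type { VM_TYPE_DEFAULT = 0, VM_TYPE_TDX = 1 };

/* Hypothetical, simplified model of a confidential-guest-support object. */
struct cgs_object {
    const char *type_name;   /* e.g. "tdx-guest" */
};

/*
 * Pick the VM type from the object type alone. Anything that is not a
 * tdx-guest object (including no object at all) gets the default type.
 */
static enum vm_type vm_type_from_cgs(const struct cgs_object *cgs)
{
    if (cgs && cgs->type_name && strcmp(cgs->type_name, "tdx-guest") == 0) {
        return VM_TYPE_TDX;
    }
    return VM_TYPE_DEFAULT;
}
```

With that shape there is only one way to request a TDX VM, namely
passing a tdx-guest object, so the two cannot disagree.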

>  
> +    if (ms->cgs && object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
> +        kvm_type = KVM_X86_TDX_VM;
> +    }
> +
>      /*
>       * old KVM doesn't support KVM_CAP_VM_TYPES and KVM_X86_DEFAULT_VM
>       * is always supported
> -- 
> 2.34.1
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  2023-08-18  9:49 ` [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES Xiaoyao Li
@ 2023-08-21  8:46   ` Daniel P. Berrangé
  2023-08-22  7:31     ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  8:46 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:49:49AM -0400, Xiaoyao Li wrote:
> KVM provides TDX capabilities via sub command KVM_TDX_CAPABILITIES of
> IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the capabilities when initializing
> TDX context. It will be used to validate user's setting later.
> 
> Since there is no interface reporting how many cpuid configs contains in
> KVM_TDX_CAPABILITIES, QEMU chooses to try starting with a known number
> and abort when it exceeds KVM_MAX_CPUID_ENTRIES.
> 
> Besides, introduce the interfaces to invoke TDX "ioctls" at different
> scope (KVM, VM and VCPU) in preparation.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
> changes from v1:
>   - Make the error message more clear;
> 
> changes from RFC v4:
>   - start from nr_cpuid_configs = 6 for the loop;
>   - stop the loop when nr_cpuid_configs exceeds KVM_MAX_CPUID_ENTRIES;
> ---
>  target/i386/kvm/kvm.c      |  2 -
>  target/i386/kvm/kvm_i386.h |  2 +
>  target/i386/kvm/tdx.c      | 93 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 95 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index d6b988d6c2d1..ec5c07bffd38 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -1751,8 +1751,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
>  
>  static Error *invtsc_mig_blocker;
>  
> -#define KVM_MAX_CPUID_ENTRIES  100
> -
>  static void kvm_init_xsave(CPUX86State *env)
>  {
>      if (has_xsave2) {
> diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
> index ea3a5b174ac0..769eadbba56c 100644
> --- a/target/i386/kvm/kvm_i386.h
> +++ b/target/i386/kvm/kvm_i386.h
> @@ -13,6 +13,8 @@
>  
>  #include "sysemu/kvm.h"
>  
> +#define KVM_MAX_CPUID_ENTRIES  100
> +
>  #define kvm_apic_in_kernel() (kvm_irqchip_in_kernel())
>  
>  #ifdef CONFIG_KVM
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 77e33ae01147..255c47a2a553 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -12,14 +12,107 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/error-report.h"
>  #include "qapi/error.h"
>  #include "qom/object_interfaces.h"
> +#include "sysemu/kvm.h"
>  
>  #include "hw/i386/x86.h"
> +#include "kvm_i386.h"
>  #include "tdx.h"
>  
> +static struct kvm_tdx_capabilities *tdx_caps;
> +
> +enum tdx_ioctl_level{
> +    TDX_PLATFORM_IOCTL,
> +    TDX_VM_IOCTL,
> +    TDX_VCPU_IOCTL,
> +};
> +
> +static int __tdx_ioctl(void *state, enum tdx_ioctl_level level, int cmd_id,
> +                        __u32 flags, void *data)

Names with an initial double underscore are reserved for use by the
platform implementation, so shouldn't be used in userspace app
code.

> +{
> +    struct kvm_tdx_cmd tdx_cmd;
> +    int r;
> +
> +    memset(&tdx_cmd, 0x0, sizeof(tdx_cmd));
> +
> +    tdx_cmd.id = cmd_id;
> +    tdx_cmd.flags = flags;
> +    tdx_cmd.data = (__u64)(unsigned long)data;
> +
> +    switch (level) {
> +    case TDX_PLATFORM_IOCTL:
> +        r = kvm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
> +        break;
> +    case TDX_VM_IOCTL:
> +        r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
> +        break;
> +    case TDX_VCPU_IOCTL:
> +        r = kvm_vcpu_ioctl(state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
> +        break;
> +    default:
> +        error_report("Invalid tdx_ioctl_level %d", level);
> +        exit(1);
> +    }
> +
> +    return r;
> +}
> +
> +static inline int tdx_platform_ioctl(int cmd_id, __u32 flags, void *data)
> +{
> +    return __tdx_ioctl(NULL, TDX_PLATFORM_IOCTL, cmd_id, flags, data);
> +}
> +
> +static inline int tdx_vm_ioctl(int cmd_id, __u32 flags, void *data)
> +{
> +    return __tdx_ioctl(NULL, TDX_VM_IOCTL, cmd_id, flags, data);
> +}
> +
> +static inline int tdx_vcpu_ioctl(void *vcpu_fd, int cmd_id, __u32 flags,
> +                                 void *data)
> +{
> +    return  __tdx_ioctl(vcpu_fd, TDX_VCPU_IOCTL, cmd_id, flags, data);
> +}
> +
> +static void get_tdx_capabilities(void)

Pass in 'Error **errp'

> +{
> +    struct kvm_tdx_capabilities *caps;
> +    /* 1st generation of TDX reports 6 cpuid configs */
> +    int nr_cpuid_configs = 6;
> +    int r, size;

It is preferable to use 'size_t' for memory allocation sizes.

> +
> +    do {
> +        size = sizeof(struct kvm_tdx_capabilities) +
> +               nr_cpuid_configs * sizeof(struct kvm_tdx_cpuid_config);
> +        caps = g_malloc0(size);
> +        caps->nr_cpuid_configs = nr_cpuid_configs;
> +
> +        r = tdx_vm_ioctl(KVM_TDX_CAPABILITIES, 0, caps);
> +        if (r == -E2BIG) {
> +            g_free(caps);
> +            nr_cpuid_configs *= 2;
> +            if (nr_cpuid_configs > KVM_MAX_CPUID_ENTRIES) {
> +                error_report("KVM TDX seems broken that number of CPUID entries in kvm_tdx_capabilities exceeds limit");

Include the limit in the error message, so if we ever need to change
the limit, it'll be clear what limit the QEMU version was built with.

Also use error_setg(errp, ...);

> +                exit(1);

Return -1

> +            }
> +        } else if (r < 0) {
> +            g_free(caps);
> +            error_report("KVM_TDX_CAPABILITIES failed: %s", strerror(-r));

Use error_setg_errno(errp, ...) instead of calling strerror yourself.

> +            exit(1);

Return -1

> +        }
> +    }
> +    while (r == -E2BIG);
> +
> +    tdx_caps = caps;

Return 0

> +}
> +
>  int tdx_kvm_init(MachineState *ms, Error **errp)
>  {
> +    if (!tdx_caps) {
> +        get_tdx_capabilities();

Pass 'errp' into this method, and check return value for failure
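
Putting the comments on this function together, the overall shape I
have in mind is roughly the following. This is a standalone sketch, not
QEMU code: the structures are simplified stand-ins, the error is
written to a plain buffer where real code would use error_setg() or
error_setg_errno() on errp, and get_caps(), caps_query_fn and the two
fake_query_*() backends are hypothetical names used only here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_CPUID_ENTRIES 100   /* mirrors KVM_MAX_CPUID_ENTRIES in the patch */

/* Simplified stand-ins for the kernel structures; layout is illustrative. */
struct cpuid_config { unsigned int leaf, sub_leaf, eax, ebx, ecx, edx; };
struct caps {
    unsigned int nr_cpuid_configs;
    struct cpuid_config cpuid_configs[];
};

/* Hypothetical query callback: returns 0, or a negative errno like -E2BIG. */
typedef int (*caps_query_fn)(struct caps *caps);

/* Returns 0 and stores the buffer in *out, or -1 with a message in err. */
static int get_caps(caps_query_fn query, struct caps **out,
                    char *err, size_t err_len)
{
    unsigned int nr = 6;   /* starting guess, as in the patch */

    for (;;) {
        size_t size = sizeof(struct caps) + nr * sizeof(struct cpuid_config);
        struct caps *caps = calloc(1, size);
        int r;

        if (!caps) {
            snprintf(err, err_len, "out of memory");
            return -1;
        }
        caps->nr_cpuid_configs = nr;

        r = query(caps);
        if (r == 0) {
            *out = caps;
            return 0;
        }
        free(caps);

        if (r != -E2BIG) {
            snprintf(err, err_len, "capabilities query failed: errno %d", -r);
            return -1;
        }
        nr *= 2;   /* buffer was too small: double it and retry */
        if (nr > MAX_CPUID_ENTRIES) {
            snprintf(err, err_len, "number of CPUID configs exceeds limit %d",
                     MAX_CPUID_ENTRIES);
            return -1;
        }
    }
}

/* Fake backend for demonstration: needs room for 20 configs. */
static int fake_query_20(struct caps *caps)
{
    if (caps->nr_cpuid_configs < 20) {
        return -E2BIG;
    }
    caps->nr_cpuid_configs = 20;
    return 0;
}

/* Fake backend that never fits, to exercise the limit check. */
static int fake_query_always_big(struct caps *caps)
{
    (void)caps;
    return -E2BIG;
}
```

The caller then decides what to do with a failure, instead of the
helper calling exit(1) itself.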

> +    }
> +
>      return 0;
>  }
>  
> -- 
> 2.34.1
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 120+ messages in thread

* Re: [PATCH v2 07/58] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  2023-08-18  9:49 ` [PATCH v2 07/58] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object Xiaoyao Li
@ 2023-08-21  8:48   ` Daniel P. Berrangé
  2023-08-22  7:46     ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  8:48 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:49:50AM -0400, Xiaoyao Li wrote:
> It will need special handling for TDX VMs all around the QEMU.
> Introduce is_tdx_vm() helper to query if it's a TDX VM.
> 
> Cache tdx_guest object thus no need to cast from ms->cgs every time.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  target/i386/kvm/tdx.c | 13 +++++++++++++
>  target/i386/kvm/tdx.h | 10 ++++++++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 255c47a2a553..56cb826f6125 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -21,8 +21,16 @@
>  #include "kvm_i386.h"
>  #include "tdx.h"
>  
> +static TdxGuest *tdx_guest;
> +
>  static struct kvm_tdx_capabilities *tdx_caps;
>  
> +/* It's valid after kvm_confidential_guest_init()->kvm_tdx_init() */
> +bool is_tdx_vm(void)
> +{
> +    return !!tdx_guest;
> +}
> +
>  enum tdx_ioctl_level{
>      TDX_PLATFORM_IOCTL,
>      TDX_VM_IOCTL,
> @@ -109,10 +117,15 @@ static void get_tdx_capabilities(void)
>  
>  int tdx_kvm_init(MachineState *ms, Error **errp)
>  {
> +    TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
> +                                                    TYPE_TDX_GUEST);

object_dynamic_cast() can return NULL. Presumably tdx_kvm_init() should
only be called once we have already checked that ms->cgs is a
TYPE_TDX_GUEST. If so, use object_dynamic_cast_assert() instead.

> +
>      if (!tdx_caps) {
>          get_tdx_capabilities();
>      }
>  
> +    tdx_guest = tdx;
> +
>      return 0;
>  }
>  
> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
> index c8a23d95258d..4036ca2f3f99 100644
> --- a/target/i386/kvm/tdx.h
> +++ b/target/i386/kvm/tdx.h
> @@ -1,6 +1,10 @@
>  #ifndef QEMU_I386_TDX_H
>  #define QEMU_I386_TDX_H
>  
> +#ifndef CONFIG_USER_ONLY
> +#include CONFIG_DEVICES /* CONFIG_TDX */
> +#endif
> +
>  #include "exec/confidential-guest-support.h"
>  
>  #define TYPE_TDX_GUEST "tdx-guest"
> @@ -16,6 +20,12 @@ typedef struct TdxGuest {
>      uint64_t attributes;    /* TD attributes */
>  } TdxGuest;
>  
> +#ifdef CONFIG_TDX
> +bool is_tdx_vm(void);
> +#else
> +#define is_tdx_vm() 0
> +#endif /* CONFIG_TDX */
> +
>  int tdx_kvm_init(MachineState *ms, Error **errp);
>  
>  #endif /* QEMU_I386_TDX_H */
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 14/58] i386/tdx: Initialize TDX before creating TD vcpus
  2023-08-18  9:49 ` [PATCH v2 14/58] i386/tdx: Initialize TDX before creating TD vcpus Xiaoyao Li
@ 2023-08-21  8:54   ` Daniel P. Berrangé
  0 siblings, 0 replies; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  8:54 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:49:57AM -0400, Xiaoyao Li wrote:
> Invoke KVM_TDX_INIT in kvm_arch_pre_create_vcpu() that KVM_TDX_INIT
> configures global TD configurations, e.g. the canonical CPUID config,
> and must be executed prior to creating vCPUs.
> 
> Use kvm_x86_arch_cpuid() to setup the CPUID settings for TDX VM.
> 
> Note, this doesn't address the fact that QEMU may change the CPUID
> configuration when creating vCPUs, i.e. punts on refactoring QEMU to
> provide a stable CPUID config prior to kvm_arch_init().
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  accel/kvm/kvm-all.c        |  9 +++++++-
>  target/i386/kvm/kvm.c      |  8 +++++++
>  target/i386/kvm/tdx-stub.c |  5 +++++
>  target/i386/kvm/tdx.c      | 45 ++++++++++++++++++++++++++++++++++++++
>  target/i386/kvm/tdx.h      |  4 ++++
>  5 files changed, 70 insertions(+), 1 deletion(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 5071af917ae0..fceec7f2a83f 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -435,10 +435,17 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  
>      trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  
> +    /*
> +     * tdx_pre_create_vcpu() may call cpu_x86_cpuid(). It in turn may call
> +     * kvm_vm_ioctl(). Set cpu->kvm_state in advance to avoid NULL pointer
> +     * dereference.
> +     */
> +    cpu->kvm_state = s;
>      ret = kvm_arch_pre_create_vcpu(cpu);
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "%s: kvm_arch_pre_create_vcpu() failed",
>                          __func__);
> +        cpu->kvm_state = NULL;
>          goto err;
>      }
>  
> @@ -446,11 +453,11 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
>                           kvm_arch_vcpu_id(cpu));
> +        cpu->kvm_state = NULL;
>          goto err;
>      }
>  
>      cpu->kvm_fd = ret;
> -    cpu->kvm_state = s;
>      cpu->vcpu_dirty = true;
>      cpu->dirty_pages = 0;
>      cpu->throttle_us_per_full = 0;
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 9ee41fffc445..d51067fdc12a 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2331,6 +2331,14 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      return r;
>  }
>  
> +int kvm_arch_pre_create_vcpu(CPUState *cpu)
> +{
> +    if (is_tdx_vm())
> +        return tdx_pre_create_vcpu(cpu);

Curly braces are required by the coding style; run 'scripts/checkpatch.pl'
to validate it.

> +
> +    return 0;
> +}
> +
>  int kvm_arch_destroy_vcpu(CPUState *cs)
>  {
>      X86CPU *cpu = X86_CPU(cs);
> diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
> index 1d866d5496bf..61f70cc0d1d9 100644
> --- a/target/i386/kvm/tdx-stub.c
> +++ b/target/i386/kvm/tdx-stub.c
> @@ -6,3 +6,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>  {
>      return -EINVAL;
>  }
> +
> +int tdx_pre_create_vcpu(CPUState *cpu)
> +{
> +    return -EINVAL;
> +}
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 29f50fb9529e..3d313ed46bd1 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -458,6 +458,49 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>      return 0;
>  }
>  
> +int tdx_pre_create_vcpu(CPUState *cpu)

Add 'Error **errp' to this method

> +{
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    X86CPU *x86cpu = X86_CPU(cpu);
> +    CPUX86State *env = &x86cpu->env;
> +    struct kvm_tdx_init_vm *init_vm;
> +    int r = 0;
> +
> +    qemu_mutex_lock(&tdx_guest->lock);
> +    if (tdx_guest->initialized) {
> +        goto out;
> +    }
> +
> +    init_vm = g_malloc0(sizeof(struct kvm_tdx_init_vm) +
> +                        sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
> +
> +    r = kvm_vm_enable_cap(kvm_state, KVM_CAP_MAX_VCPUS, 0, ms->smp.cpus);
> +    if (r < 0) {
> +        error_report("Unable to set MAX VCPUS to %d", ms->smp.cpus);

Use error_setg / error_setg_errno in this method.

> +        goto out_free;
> +    }
> +
> +    init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
> +
> +    init_vm->attributes = tdx_guest->attributes;
> +
> +    do {
> +        r = tdx_vm_ioctl(KVM_TDX_INIT_VM, 0, init_vm);
> +    } while (r == -EAGAIN);
> +    if (r < 0) {
> +        error_report("KVM_TDX_INIT_VM failed %s", strerror(-r));
> +        goto out_free;
> +    }
> +
> +    tdx_guest->initialized = true;
> +
> +out_free:
> +    g_free(init_vm);
> +out:
> +    qemu_mutex_unlock(&tdx_guest->lock);
> +    return r;
> +}
> +
>  /* tdx guest */
>  OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
>                                     tdx_guest,
> @@ -470,6 +513,8 @@ static void tdx_guest_init(Object *obj)
>  {
>      TdxGuest *tdx = TDX_GUEST(obj);
>  
> +    qemu_mutex_init(&tdx->lock);
> +
>      tdx->attributes = 0;
>  }
>  
> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
> index 06599b65b827..46a24ee8c7cc 100644
> --- a/target/i386/kvm/tdx.h
> +++ b/target/i386/kvm/tdx.h
> @@ -17,6 +17,9 @@ typedef struct TdxGuestClass {
>  typedef struct TdxGuest {
>      ConfidentialGuestSupport parent_obj;
>  
> +    QemuMutex lock;
> +
> +    bool initialized;
>      uint64_t attributes;    /* TD attributes */
>  } TdxGuest;
>  
> @@ -29,5 +32,6 @@ bool is_tdx_vm(void);
>  int tdx_kvm_init(MachineState *ms, Error **errp);
>  void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
>                               uint32_t *ret);
> +int tdx_pre_create_vcpu(CPUState *cpu);
>  
>  #endif /* QEMU_I386_TDX_H */
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu()
  2023-08-18  9:49 ` [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu() Xiaoyao Li
@ 2023-08-21  8:55   ` Daniel P. Berrangé
  2023-08-29 14:40   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  8:55 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:49:56AM -0400, Xiaoyao Li wrote:
> Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
> work prior to create any vcpu. This is for i386 TDX because it needs
> call TDX_INIT_VM before creating any vcpu.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  accel/kvm/kvm-all.c  | 12 ++++++++++++
>  include/sysemu/kvm.h |  1 +
>  2 files changed, 13 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index c9f3aab5e587..5071af917ae0 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -422,6 +422,11 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>      return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
>  }
>  
> +int __attribute__ ((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu)
> +{
> +    return 0;
> +}
> +
>  int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  {
>      KVMState *s = kvm_state;
> @@ -430,6 +435,13 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  
>      trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  
> +    ret = kvm_arch_pre_create_vcpu(cpu);
> +    if (ret < 0) {
> +        error_setg_errno(errp, -ret, "%s: kvm_arch_pre_create_vcpu() failed",
> +                        __func__);

Don't report a generic error message here when kvm_arch_pre_create_vcpu()
can provide a better one; pass 'errp' into kvm_arch_pre_create_vcpu()
instead.

> +        goto err;
> +    }
> +
>      ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 49c896d8a512..d89ec87072d7 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -371,6 +371,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level);
>  
>  int kvm_arch_init(MachineState *ms, KVMState *s);
>  
> +int kvm_arch_pre_create_vcpu(CPUState *cpu);
>  int kvm_arch_init_vcpu(CPUState *cpu);
>  int kvm_arch_destroy_vcpu(CPUState *cpu);
>  
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2023-08-18  9:49 ` [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object Xiaoyao Li
@ 2023-08-21  8:59   ` Daniel P. Berrangé
  2023-08-22  6:27     ` Markus Armbruster
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  8:59 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:49:58AM -0400, Xiaoyao Li wrote:
> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it disables
> EPT violation conversion to #VE on guest TD access of PENDING pages.
> 
> Some guest OS (e.g., Linux TD guest) may require this bit as 1.
> Otherwise refuse to boot.
> 
> Add sept-ve-disable property for tdx-guest object, for user to configure
> this bit.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  qapi/qom.json         |  4 +++-
>  target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
>  2 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 2ca7ce7c0da5..cc08b9a98df9 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -871,10 +871,12 @@
>  #
>  # Properties for tdx-guest objects.
>  #
> +# @sept-ve-disable: bit 28 of TD attributes (default: 0)

This description isn't very useful, as it forces the user to go off and
read the TDX specification to find out what bit 28 means. You've got a
more useful description in the commit message, so please use that
in the docs too, e.g. something like this:

  @sept-ve-disable: toggle bit 28 of TD attributes to control disabling
                    of EPT violation conversion to #VE on guest
                    TD access of PENDING pages. Some guest OS (e.g.
                    Linux TD guest) may require this set, otherwise
                    they refuse to boot.

> +#
>  # Since: 8.2
>  ##
>  { 'struct': 'TdxGuestProperties',
> -  'data': { }}
> +  'data': { '*sept-ve-disable': 'bool' } }
>  
>  ##
>  # @ThreadContextProperties:
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 3d313ed46bd1..22130382c0c5 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -32,6 +32,8 @@
>                                       (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>                                       (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>  
> +#define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
> +
>  #define TDX_ATTRIBUTES_MAX_BITS      64
>  
>  static FeatureMask tdx_attrs_ctrl_fields[TDX_ATTRIBUTES_MAX_BITS] = {
> @@ -501,6 +503,24 @@ out:
>      return r;
>  }
>  
> +static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp)
> +{
> +    TdxGuest *tdx = TDX_GUEST(obj);
> +
> +    return !!(tdx->attributes & TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE);
> +}
> +
> +static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error **errp)
> +{
> +    TdxGuest *tdx = TDX_GUEST(obj);
> +
> +    if (value) {
> +        tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE;
> +    } else {
> +        tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE;
> +    }
> +}
> +
>  /* tdx guest */
>  OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
>                                     tdx_guest,
> @@ -516,6 +536,10 @@ static void tdx_guest_init(Object *obj)
>      qemu_mutex_init(&tdx->lock);
>  
>      tdx->attributes = 0;
> +
> +    object_property_add_bool(obj, "sept-ve-disable",
> +                             tdx_guest_get_sept_ve_disable,
> +                             tdx_guest_set_sept_ve_disable);
>  }
>  
>  static void tdx_guest_finalize(Object *obj)
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 18/58] i386/tdx: Validate TD attributes
  2023-08-18  9:50 ` [PATCH v2 18/58] i386/tdx: Validate TD attributes Xiaoyao Li
@ 2023-08-21  9:16   ` Daniel P. Berrangé
  2023-08-22 14:21     ` Xiaoyao Li
  2023-08-22 14:30     ` Xiaoyao Li
  0 siblings, 2 replies; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:16 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:01AM -0400, Xiaoyao Li wrote:
> Validate TD attributes with tdx_caps that fixed-0 bits must be zero and
> fixed-1 bits must be set.
> 
> Besides, sanity check the attribute bits that have not been supported by
> QEMU yet. e.g., debug bit, it will be allowed in the future when debug
> TD support lands in QEMU.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>  target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 629abd267da8..73da15377ec3 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -32,6 +32,7 @@
>                                       (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>                                       (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>  
> +#define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
>  #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
>  #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
>  #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
> @@ -462,13 +463,32 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>      return 0;
>  }
>  
> -static void setup_td_guest_attributes(X86CPU *x86cpu)
> +static int tdx_validate_attributes(TdxGuest *tdx)
> +{
> +    if (((tdx->attributes & tdx_caps->attrs_fixed0) | tdx_caps->attrs_fixed1) !=
> +        tdx->attributes) {
> +            error_report("Invalid attributes 0x%lx for TDX VM (fixed0 0x%llx, fixed1 0x%llx)",
> +                          tdx->attributes, tdx_caps->attrs_fixed0, tdx_caps->attrs_fixed1);
> +            return -EINVAL;
> +    }
> +
> +    if (tdx->attributes & TDX_TD_ATTRIBUTES_DEBUG) {
> +        error_report("Current QEMU doesn't support attributes.debug[bit 0] for TDX VM");
> +        return -EINVAL;
> +    }

Use error_setg() in both cases, passing in an 'Error **errp' parameter,
and 'return -1' instead of returning an errno value.
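
As a worked illustration of the fixed-0/fixed-1 check quoted above, here
is a minimal standalone sketch. The mask test is the one from the patch;
everything else is an assumption for illustration: the function name,
the SKETCH_ATTR_DEBUG macro, and the plain character buffer that stands
in for an 'Error **errp' argument. It uses PRIx64 to print the uint64_t
values and returns -1 rather than an errno value.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_ATTR_DEBUG (UINT64_C(1) << 0) /* hypothetical name, bit 0 */

/*
 * fixed0: bits that are clear here must be 0 in attrs.
 * fixed1: bits that are set here must be 1 in attrs.
 * Returns 0 if attrs is acceptable, -1 otherwise (errbuf then holds why).
 */
static int sketch_validate_attributes(uint64_t attrs, uint64_t fixed0,
                                      uint64_t fixed1,
                                      char *errbuf, size_t errlen)
{
    /* Masking with fixed0 and OR-ing fixed1 must leave attrs unchanged. */
    if (((attrs & fixed0) | fixed1) != attrs) {
        snprintf(errbuf, errlen,
                 "Invalid attributes 0x%" PRIx64 " for TDX VM "
                 "(fixed0 0x%" PRIx64 ", fixed1 0x%" PRIx64 ")",
                 attrs, fixed0, fixed1);
        return -1;
    }

    /* Architecturally valid, but rejected until debug TDs are supported. */
    if (attrs & SKETCH_ATTR_DEBUG) {
        snprintf(errbuf, errlen,
                 "attributes.debug[bit 0] is not supported for TDX VM");
        return -1;
    }

    return 0;
}
```

For example, with fixed0 allowing bits 0, 28 and 30 and fixed1 empty,
attrs = bit 28 passes, attrs = bit 5 fails the mask test, and attrs =
bit 0 passes the mask test but is still rejected by the debug check.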

> +
> +    return 0;
> +}
> +
> +static int setup_td_guest_attributes(X86CPU *x86cpu)
>  {
>      CPUX86State *env = &x86cpu->env;
>  
>      tdx_guest->attributes |= (env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS) ?
>                               TDX_TD_ATTRIBUTES_PKS : 0;
>      tdx_guest->attributes |= x86cpu->enable_pmu ? TDX_TD_ATTRIBUTES_PERFMON : 0;
> +
> +    return tdx_validate_attributes(tdx_guest);

Pass along "errp" into this

>  }
>  
>  int tdx_pre_create_vcpu(CPUState *cpu)
> @@ -493,7 +513,10 @@ int tdx_pre_create_vcpu(CPUState *cpu)

In an earlier patch I suggested adding 'Error **errp' to this method...

>          goto out_free;
>      }
>  
> -    setup_td_guest_attributes(x86cpu);
> +    r = setup_td_guest_attributes(x86cpu);

...it can also be passed into this method

> +    if (r) {
> +        goto out;
> +    }
>  
>      init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
>      init_vm->attributes = tdx_guest->attributes;
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 19/58] qom: implement property helper for sha384
  2023-08-18  9:50 ` [PATCH v2 19/58] qom: implement property helper for sha384 Xiaoyao Li
@ 2023-08-21  9:25   ` Daniel P. Berrangé
  2023-08-21 23:28     ` Isaku Yamahata
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:25 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:02AM -0400, Xiaoyao Li wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Implement property_add_sha384() which converts hex string <-> uint8_t[48]
> It will be used for TDX which uses sha384 for measurement.

I think it is likely a better idea to use base64 for encoding
the binary hash - we use base64 for all the sev-guest properties
that are binary data.

At that point the property set/get logic is much simpler, as it
just needs a call to g_base64_encode / g_base64_decode, plus
length validation for the decode case.
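
To make the size difference concrete, here is a small standalone sketch
of the arithmetic. It deliberately does not call GLib; the helper names
and the SKETCH_SHA384_DIGEST_SIZE macro are hypothetical, and the last
function only models the length validation a base64-based setter would
apply to the decoded byte count.

```c
#include <stddef.h>

#define SKETCH_SHA384_DIGEST_SIZE 48 /* bytes in a SHA-384 digest */

/* Characters needed to hex-encode n bytes (two per byte). */
static size_t sketch_hex_len(size_t n)
{
    return n * 2;
}

/* Characters needed to base64-encode n bytes, '=' padding included. */
static size_t sketch_base64_len(size_t n)
{
    return 4 * ((n + 2) / 3);
}

/* Accept only input that decodes to exactly one SHA-384 digest. */
static int sketch_check_digest_len(size_t decoded_len)
{
    return decoded_len == SKETCH_SHA384_DIGEST_SIZE ? 0 : -1;
}
```

For a 48-byte digest this gives 96 characters in hex and 64 in base64;
since 48 is a multiple of 3, the base64 form needs no padding.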

> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  include/qom/object.h | 17 ++++++++++
>  qom/object.c         | 76 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 93 insertions(+)
> 
> diff --git a/include/qom/object.h b/include/qom/object.h
> index ef7258a5e149..70399a5b1940 100644
> --- a/include/qom/object.h
> +++ b/include/qom/object.h
> @@ -1887,6 +1887,23 @@ ObjectProperty *object_property_add_alias(Object *obj, const char *name,
>  ObjectProperty *object_property_add_const_link(Object *obj, const char *name,
>                                                 Object *target);
>  
> +
> +/**
> + * object_property_add_sha384:
> + * @obj: the object to add a property to
> + * @name: the name of the property
> + * @v: pointer to value
> + * @flags: bitwise-or'd ObjectPropertyFlags
> + *
> + * Add an sha384 property in memory.  This function will add a
> + * property of type 'sha384'.
> + *
> + * Returns: The newly added property on success, or %NULL on failure.
> + */
> +ObjectProperty * object_property_add_sha384(Object *obj, const char *name,
> +                                            const uint8_t *v,
> +                                            ObjectPropertyFlags flags);
> +
>  /**
>   * object_property_set_description:
>   * @obj: the object owning the property
> diff --git a/qom/object.c b/qom/object.c
> index e25f1e96db1e..e71ce46ed576 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -15,6 +15,7 @@
>  #include "qapi/error.h"
>  #include "qom/object.h"
>  #include "qom/object_interfaces.h"
> +#include "qemu/ctype.h"
>  #include "qemu/cutils.h"
>  #include "qemu/memalign.h"
>  #include "qapi/visitor.h"
> @@ -2781,6 +2782,81 @@ object_property_add_alias(Object *obj, const char *name,
>      return op;
>  }
>  
> +#define SHA384_DIGEST_SIZE      48
> +static void property_get_sha384(Object *obj, Visitor *v, const char *name,
> +                                void *opaque, Error **errp)
> +{
> +    uint8_t *value = (uint8_t *)opaque;
> +    char str[SHA384_DIGEST_SIZE * 2 + 1];
> +    char *str_ = (char*)str;
> +    size_t i;
> +
> +    for (i = 0; i < SHA384_DIGEST_SIZE; i++) {
> +        char *buf;
> +        buf = &str[i * 2];
> +
> +        sprintf(buf, "%02hhx", value[i]);
> +    }
> +    str[SHA384_DIGEST_SIZE * 2] = '\0';
> +
> +    visit_type_str(v, name, &str_, errp);
> +}
> +
> +static void property_set_sha384(Object *obj, Visitor *v, const char *name,
> +                                    void *opaque, Error **errp)
> +{
> +    uint8_t *value = (uint8_t *)opaque;
> +    char* str;
> +    size_t len;
> +    size_t i;
> +
> +    if (!visit_type_str(v, name, &str, errp)) {
> +        goto err;
> +    }
> +
> +    len = strlen(str);
> +    if (len != SHA384_DIGEST_SIZE * 2) {
> +        error_setg(errp, "invalid length for sha348 hex string %s. "
> +                   "it must be 48 * 2 hex", name);
> +        goto err;
> +    }
> +
> +    for (i = 0; i < SHA384_DIGEST_SIZE; i++) {
> +        if (!qemu_isxdigit(str[i * 2]) || !qemu_isxdigit(str[i * 2 + 1])) {
> +            error_setg(errp, "invalid char for sha318 hex string %s at %c%c",
> +                       name, str[i * 2], str[i * 2 + 1]);
> +            goto err;
> +        }
> +
> +        if (sscanf(str + i * 2, "%02hhx", &value[i]) != 1) {
> +            error_setg(errp, "invalid format for sha318 hex string %s", name);
> +            goto err;
> +        }
> +    }
> +
> +err:
> +    g_free(str);
> +}
> +
> +ObjectProperty *
> +object_property_add_sha384(Object *obj, const char *name,
> +                           const uint8_t *v, ObjectPropertyFlags flags)
> +{
> +    ObjectPropertyAccessor *getter = NULL;
> +    ObjectPropertyAccessor *setter = NULL;
> +
> +    if ((flags & OBJ_PROP_FLAG_READ) == OBJ_PROP_FLAG_READ) {
> +        getter = property_get_sha384;
> +    }
> +
> +    if ((flags & OBJ_PROP_FLAG_WRITE) == OBJ_PROP_FLAG_WRITE) {
> +        setter = property_set_sha384;
> +    }
> +
> +    return object_property_add(obj, name, "sha384",
> +                               getter, setter, NULL, (void *)v);
> +}
> +
>  void object_property_set_description(Object *obj, const char *name,
>                                       const char *description)
>  {
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 20/58] i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM
  2023-08-18  9:50 ` [PATCH v2 20/58] i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM Xiaoyao Li
@ 2023-08-21  9:29   ` Daniel P. Berrangé
  2023-08-22  6:35     ` Markus Armbruster
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:29 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:03AM -0400, Xiaoyao Li wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> When creating TDX vm, three sha384 hash values can be provided for
> TDX attestation.
> 
> So far they were hard coded as 0. Now allow user to specify those values
> via property mrconfigid, mrowner and mrownerconfig. Choose hex-encoded
> string as format since it's friendly for user to input.
> 
> example
> -object tdx-guest, \
>   mrconfigid=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef, \
>   mrowner=fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210, \
>   mrownerconfig=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
> TODO:
>  - community requests to use base64 encoding if no special reason
> ---
>  qapi/qom.json         | 11 ++++++++++-
>  target/i386/kvm/tdx.c | 13 +++++++++++++
>  target/i386/kvm/tdx.h |  3 +++
>  3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi/qom.json b/qapi/qom.json
> index cc08b9a98df9..87c1d440f331 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -873,10 +873,19 @@
>  #
>  # @sept-ve-disable: bit 28 of TD attributes (default: 0)
>  #
> +# @mrconfigid: MRCONFIGID SHA384 hex string of 48 * 2 length (default: 0)
> +#
> +# @mrowner: MROWNER SHA384 hex string of 48 * 2 length (default: 0)
> +#
> +# @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)

Per the previous patch, I suggest these should all be passed in base64
instead of hex. Also 'default: 0' makes no sense for a string,
which would be 'default: nil', and there is no need to document that, as
the default is implicit from the fact that it's an optional string
field. So e.g.

  @mrconfigid: base64 encoded MRCONFIGID SHA384 digest

> +#
>  # Since: 8.2
>  ##
>  { 'struct': 'TdxGuestProperties',
> -  'data': { '*sept-ve-disable': 'bool' } }
> +  'data': { '*sept-ve-disable': 'bool',
> +            '*mrconfigid': 'str',
> +            '*mrowner': 'str',
> +            '*mrownerconfig': 'str' } }
>  
>  ##
>  # @ThreadContextProperties:
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 73da15377ec3..33d015a08c34 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -521,6 +521,13 @@ int tdx_pre_create_vcpu(CPUState *cpu)
>      init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
>      init_vm->attributes = tdx_guest->attributes;
>  
> +    QEMU_BUILD_BUG_ON(sizeof(init_vm->mrconfigid) != sizeof(tdx_guest->mrconfigid));
> +    QEMU_BUILD_BUG_ON(sizeof(init_vm->mrowner) != sizeof(tdx_guest->mrowner));
> +    QEMU_BUILD_BUG_ON(sizeof(init_vm->mrownerconfig) != sizeof(tdx_guest->mrownerconfig));
> +    memcpy(init_vm->mrconfigid, tdx_guest->mrconfigid, sizeof(tdx_guest->mrconfigid));
> +    memcpy(init_vm->mrowner, tdx_guest->mrowner, sizeof(tdx_guest->mrowner));
> +    memcpy(init_vm->mrownerconfig, tdx_guest->mrownerconfig, sizeof(tdx_guest->mrownerconfig));
> +
>      do {
>          r = tdx_vm_ioctl(KVM_TDX_INIT_VM, 0, init_vm);
>      } while (r == -EAGAIN);
> @@ -575,6 +582,12 @@ static void tdx_guest_init(Object *obj)
>      object_property_add_bool(obj, "sept-ve-disable",
>                               tdx_guest_get_sept_ve_disable,
>                               tdx_guest_set_sept_ve_disable);
> +    object_property_add_sha384(obj, "mrconfigid", tdx->mrconfigid,
> +                               OBJ_PROP_FLAG_READWRITE);
> +    object_property_add_sha384(obj, "mrowner", tdx->mrowner,
> +                               OBJ_PROP_FLAG_READWRITE);
> +    object_property_add_sha384(obj, "mrownerconfig", tdx->mrownerconfig,
> +                               OBJ_PROP_FLAG_READWRITE);
>  }
>  
>  static void tdx_guest_finalize(Object *obj)
> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
> index 46a24ee8c7cc..68f8327f2231 100644
> --- a/target/i386/kvm/tdx.h
> +++ b/target/i386/kvm/tdx.h
> @@ -21,6 +21,9 @@ typedef struct TdxGuest {
>  
>      bool initialized;
>      uint64_t attributes;    /* TD attributes */
> +    uint8_t mrconfigid[48];     /* sha348 digest */
> +    uint8_t mrowner[48];        /* sha348 digest */
> +    uint8_t mrownerconfig[48];  /* sha348 digest */
>  } TdxGuest;
>  
>  #ifdef CONFIG_TDX
> -- 
> 2.34.1
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



* Re: [PATCH v2 21/58] i386/tdx: Implement user specified tsc frequency
  2023-08-18  9:50 ` [PATCH v2 21/58] i386/tdx: Implement user specified tsc frequency Xiaoyao Li
@ 2023-08-21  9:30   ` Daniel P. Berrangé
  0 siblings, 0 replies; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:30 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:04AM -0400, Xiaoyao Li wrote:
> Reuse "-cpu,tsc-frequency=" to get user wanted tsc frequency and call VM
> scope VM_SET_TSC_KHZ to set the tsc frequency of TD before KVM_TDX_INIT_VM.
> 
> Besides, sanity check the tsc frequency to be in the legal range and
> legal granularity (required by TDX module).
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
> Changes from RFC v4:
>   - Use VM scope VM_SET_TSC_KHZ to set the TSC frequency of TD since KVM
>     side drop the @tsc_khz field in struct kvm_tdx_init_vm
> ---
>  target/i386/kvm/kvm.c |  9 +++++++++
>  target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index d51067fdc12a..4a146bc42f63 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -859,6 +859,15 @@ static int kvm_arch_set_tsc_khz(CPUState *cs)
>      int r, cur_freq;
>      bool set_ioctl = false;
>  
> +    /*
> +     * TSC of TD vcpu is immutable, it cannot be set/changed via vcpu scope
> +     * VM_SET_TSC_KHZ, but only be initialized via VM scope VM_SET_TSC_KHZ
> +     * before ioctl KVM_TDX_INIT_VM in tdx_pre_create_vcpu()
> +     */
> +    if (is_tdx_vm()) {
> +        return 0;
> +    }
> +
>      if (!env->tsc_khz) {
>          return 0;
>      }
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 33d015a08c34..a72badfbfd65 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -32,6 +32,9 @@
>                                       (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>                                       (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>  
> +#define TDX_MIN_TSC_FREQUENCY_KHZ   (100 * 1000)
> +#define TDX_MAX_TSC_FREQUENCY_KHZ   (10 * 1000 * 1000)
> +
>  #define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
>  #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
>  #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
> @@ -513,6 +516,27 @@ int tdx_pre_create_vcpu(CPUState *cpu)
>          goto out_free;
>      }
>  
> +    r = -EINVAL;
> +    if (env->tsc_khz && (env->tsc_khz < TDX_MIN_TSC_FREQUENCY_KHZ ||
> +                         env->tsc_khz > TDX_MAX_TSC_FREQUENCY_KHZ)) {
> +        error_report("Invalid TSC %ld KHz, must specify cpu_frequency between [%d, %d] kHz",
> +                      env->tsc_khz, TDX_MIN_TSC_FREQUENCY_KHZ,
> +                      TDX_MAX_TSC_FREQUENCY_KHZ);
> +        goto out;
> +    }
> +
> +    if (env->tsc_khz % (25 * 1000)) {
> +        error_report("Invalid TSC %ld KHz, it must be multiple of 25MHz", env->tsc_khz);
> +        goto out;
> +    }
> +
> +    /* it's safe even env->tsc_khz is 0. KVM uses host's tsc_khz in this case */
> +    r = kvm_vm_ioctl(kvm_state, KVM_SET_TSC_KHZ, env->tsc_khz);
> +    if (r < 0) {
> +        error_report("Unable to set TSC frequency to %" PRId64 " kHz", env->tsc_khz);
> +        goto out;
> +    }

Use error_setg(errp, ...) instead of error_report() in all of these cases.
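
A minimal sketch of the idea, assuming nothing about the final patch: the
range and granularity checks fill in an error object for the caller instead
of printing directly. SketchError and tdx_check_tsc_khz() are hypothetical
stand-ins for illustration, not QEMU's Error API; the limits are the ones
from the quoted hunk.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Limits taken from the quoted patch (values in kHz). */
#define TDX_MIN_TSC_FREQUENCY_KHZ   (100 * 1000)
#define TDX_MAX_TSC_FREQUENCY_KHZ   (10 * 1000 * 1000)
#define TDX_TSC_GRANULARITY_KHZ     (25 * 1000)

/* Hypothetical stand-in for an error object; NOT QEMU's Error type. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/*
 * Validate a user-requested TSC frequency.  Returns 0 on success and -1 on
 * failure, filling *err (when non-NULL) instead of reporting directly.
 * A value of 0 means "not specified" and is accepted.
 */
int tdx_check_tsc_khz(int64_t tsc_khz, SketchError *err)
{
    if (tsc_khz == 0) {
        return 0;
    }

    if (tsc_khz < TDX_MIN_TSC_FREQUENCY_KHZ ||
        tsc_khz > TDX_MAX_TSC_FREQUENCY_KHZ) {
        if (err) {
            snprintf(err->msg, sizeof(err->msg),
                     "Invalid TSC %" PRId64 " kHz, must be in [%d, %d] kHz",
                     tsc_khz, TDX_MIN_TSC_FREQUENCY_KHZ,
                     TDX_MAX_TSC_FREQUENCY_KHZ);
        }
        return -1;
    }

    if (tsc_khz % TDX_TSC_GRANULARITY_KHZ) {
        if (err) {
            snprintf(err->msg, sizeof(err->msg),
                     "Invalid TSC %" PRId64 " kHz, must be a multiple of %d kHz",
                     tsc_khz, TDX_TSC_GRANULARITY_KHZ);
        }
        return -1;
    }

    return 0;
}
```

The caller then decides whether the failure is fatal and how to present the
message, which is the point of passing the error up rather than reporting
it at the point of detection.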


With regards,
Daniel

* Re: [PATCH v2 25/58] kvm/tdx: Don't complain when converting vMMIO region to shared
  2023-08-18  9:50 ` [PATCH v2 25/58] kvm/tdx: Don't complain when converting vMMIO region to shared Xiaoyao Li
@ 2023-08-21  9:34   ` Daniel P. Berrangé
  0 siblings, 0 replies; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:34 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:08AM -0400, Xiaoyao Li wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Because vMMIO region needs to be shared region, guest TD may explicitly
> convert such region from private to shared.  Don't complain such
> conversion.
> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  accel/kvm/kvm-all.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index fceec7f2a83f..9d0aa8c97feb 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -3094,8 +3094,24 @@ static int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
>           */
>          ram_block_convert_range(rb, offset, size, to_private);
>      } else {
> -        warn_report("Convert non guest-memfd backed memory region (0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") to %s",
> -                    start, size, to_private ? "private" : "shared");
> +        MemoryRegion *mr = section.mr;
> +
> +        /*
> +         * Because vMMIO region must be shared, guest TD may convert vMMIO
> +         * region to shared explicitly.  Don't complain such case.  See
> +         * memory_region_type() for checking if the region is MMIO region.
> +         */
> +        if (to_private ||
> +            memory_region_is_ram(mr) ||
> +            memory_region_is_ram_device(mr) ||
> +            memory_region_is_rom(mr) ||
> +            memory_region_is_romd(mr)) {

Should we also have !is_tdx_vm() as the first clause to check? IIUC you
only need the special logic for TDX VMs.
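
As a rough sketch of the resulting condition (RegionKind and
conversion_needs_warning() are hypothetical names that only model the
predicates used in the quoted hunk): with the extra clause, the warning is
suppressed only when a TDX guest converts an MMIO-like region to shared.

```c
#include <stdbool.h>

/* Hypothetical summary of the region properties the quoted hunk inspects. */
typedef struct RegionKind {
    bool is_ram;
    bool is_ram_device;
    bool is_rom;
    bool is_romd;
} RegionKind;

/*
 * Decide whether converting a non-guest-memfd backed region deserves a
 * warning.  Only a TDX guest converting an MMIO-like region to shared
 * stays silent; every other combination still warns.
 */
bool conversion_needs_warning(bool tdx_vm, bool to_private, RegionKind k)
{
    return !tdx_vm || to_private ||
           k.is_ram || k.is_ram_device || k.is_rom || k.is_romd;
}
```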

> +            warn_report("Convert non guest-memfd backed memory region (0x%"HWADDR_PRIx" ,+ 0x%"HWADDR_PRIx") of %s to %s",
> +                        start, size, mr->name, to_private ? "private" : "shared");
> +	    } else {
> +		    ret = 0;
> +	    }

Inconsistent indentation here due to use of tabs

> +
>      }
>  
>      memory_region_unref(section.mr);
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM
  2023-08-18  9:50 ` [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM Xiaoyao Li
@ 2023-08-21  9:38   ` Daniel P. Berrangé
  2023-08-22 15:39     ` Xiaoyao Li
  2023-08-21 23:40   ` Isaku Yamahata
  1 sibling, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:38 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:15AM -0400, Xiaoyao Li wrote:
> The RAM of TDX VM can be classified into two types:
> 
>  - TDX_RAM_UNACCEPTED: default type of TDX memory, which needs to be
>    accepted by TDX guest before it can be used and will be all-zeros
>    after being accepted.
> 
>  - TDX_RAM_ADDED: the RAM that is ADD'ed to TD guest before running, and
>    can be used directly. E.g., TD HOB and TEMP MEM that needed by TDVF.
> 
> Maintain TdxRamEntries[] which grabs the initial RAM info from e820 table
> and mark each RAM range as default type TDX_RAM_UNACCEPTED.
> 
> Then turn the range of TD HOB and TEMP MEM to TDX_RAM_ADDED since these
> ranges will be ADD'ed before TD runs and no need to be accepted runtime.
> 
> The TdxRamEntries[] are later used to setup the memory TD resource HOB
> that passes memory info from QEMU to TDVF.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> 
> ---
> Changes from RFC v4:
>   - simplify the algorithm of tdx_accept_ram_range() (Suggested-by: Gerd Hoffman)
>     (1) Change the existing entry to cover the accepted ram range.
>     (2) If there is room before the accepted ram range add a
> 	TDX_RAM_UNACCEPTED entry for that.
>     (3) If there is room after the accepted ram range add a
> 	TDX_RAM_UNACCEPTED entry for that.
> ---
>  target/i386/kvm/tdx.c | 110 ++++++++++++++++++++++++++++++++++++++++++
>  target/i386/kvm/tdx.h |  14 ++++++
>  2 files changed, 124 insertions(+)
> 
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index bb806736b4ff..ed617ebab266 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> +static int tdx_accept_ram_range(uint64_t address, uint64_t length)
> +{
> +    uint64_t head_start, tail_start, head_length, tail_length;
> +    uint64_t tmp_address, tmp_length;
> +    TdxRamEntry *e;
> +    int i;
> +
> +    for (i = 0; i < tdx_guest->nr_ram_entries; i++) {
> +        e = &tdx_guest->ram_entries[i];
> +
> +        if (address + length <= e->address ||
> +            e->address + e->length <= address) {
> +                continue;

Indented too far

> +        }
> +
> +        /*
> +         * The to-be-accepted ram range must be fully contained by one
> +         * RAM entry.
> +         */
> +        if (e->address > address ||
> +            e->address + e->length < address + length) {
> +            return -EINVAL;
> +        }
> +
> +        if (e->type == TDX_RAM_ADDED) {
> +            return -EINVAL;
> +        }
> +
> +        break;
> +    }
> +
> +    if (i == tdx_guest->nr_ram_entries) {
> +        return -1;
> +    }
> +
> +    tmp_address = e->address;
> +    tmp_length = e->length;
> +
> +    e->address = address;
> +    e->length = length;
> +    e->type = TDX_RAM_ADDED;
> +
> +    head_length = address - tmp_address;
> +    if (head_length > 0) {
> +        head_start = tmp_address;
> +        tdx_add_ram_entry(head_start, head_length, TDX_RAM_UNACCEPTED);
> +    }
> +
> +    tail_start = address + length;
> +    if (tail_start < tmp_address + tmp_length) {
> +        tail_length = tmp_address + tmp_length - tail_start;
> +        tdx_add_ram_entry(tail_start, tail_length, TDX_RAM_UNACCEPTED);
> +    }
> +
> +    return 0;
> +}

With regards,
Daniel

* Re: [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem()
  2023-08-18  9:50 ` [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem() Xiaoyao Li
@ 2023-08-21  9:40   ` Daniel P. Berrangé
  2023-08-29 14:33   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:40 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:19AM -0400, Xiaoyao Li wrote:
> Introduce memory_region_init_ram_gmem() to allocate private gmem on the
> MemoryRegion initialization. It's for the usercase of TDVF, which must
> be private on TDX case.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  include/exec/memory.h |  6 +++++
>  softmmu/memory.c      | 52 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 58 insertions(+)

> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index af6aa3c1e3c9..ded44dcef1aa 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -25,6 +25,7 @@
>  #include "qom/object.h"
>  #include "trace.h"
>  
> +#include <linux/kvm.h>
>  #include "exec/memory-internal.h"
>  #include "exec/ram_addr.h"
>  #include "sysemu/kvm.h"
> @@ -3602,6 +3603,57 @@ void memory_region_init_ram(MemoryRegion *mr,
>      vmstate_register_ram(mr, owner_dev);
>  }
>  
> +#ifdef CONFIG_KVM
> +void memory_region_init_ram_gmem(MemoryRegion *mr,
> +                                 Object *owner,
> +                                 const char *name,
> +                                 uint64_t size,
> +                                 Error **errp)

Since you have an 'errp' parameter here....

> +{
> +    DeviceState *owner_dev;
> +    Error *err = NULL;
> +    int priv_fd;
> +
> +    memory_region_init_ram_nomigrate(mr, owner, name, size, &err);
> +    if (err) {
> +        error_propagate(errp, err);
> +        return;
> +    }
> +
> +    if (object_dynamic_cast(OBJECT(current_accel()), TYPE_KVM_ACCEL)) {
> +        KVMState *s = KVM_STATE(current_accel());
> +        struct kvm_create_guest_memfd gmem = {
> +            .size = size,
> +            /* TODO: add property to hostmem backend for huge pmd */
> +            .flags = KVM_GUEST_MEMFD_ALLOW_HUGEPAGE,
> +        };
> +
> +        priv_fd = kvm_vm_ioctl(s, KVM_CREATE_GUEST_MEMFD, &gmem);
> +        if (priv_fd < 0) {
> +            fprintf(stderr, "%s: error creating gmem: %s\n", __func__,
> +                    strerror(-priv_fd));
> +            abort();

This should use error_setg_errno() and return here, not abort()

> +        }
> +    } else {
> +        fprintf(stderr, "%s: gmem unsupported accel: %s\n", __func__,
> +                current_accel_name());

and error_setg() here and return.

> +        abort();
> +    }
> +
> +    memory_region_set_gmem_fd(mr, priv_fd);
> +    memory_region_set_default_private(mr);
> +
> +    /* This will assert if owner is neither NULL nor a DeviceState.
> +     * We only want the owner here for the purposes of defining a
> +     * unique name for migration. TODO: Ideally we should implement
> +     * a naming scheme for Objects which are not DeviceStates, in
> +     * which case we can relax this restriction.
> +     */
> +    owner_dev = DEVICE(owner);
> +    vmstate_register_ram(mr, owner_dev);
> +}
> +#endif
> +
>  void memory_region_init_rom(MemoryRegion *mr,
>                              Object *owner,
>                              const char *name,
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility
  2023-08-18  9:50 ` [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility Xiaoyao Li
@ 2023-08-21  9:58   ` Daniel P. Berrangé
  2023-08-28 13:14     ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-21  9:58 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:30AM -0400, Xiaoyao Li wrote:
> Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  qapi/run-state.json   | 17 +++++++++++++--
>  softmmu/runstate.c    | 49 +++++++++++++++++++++++++++++++++++++++++++
>  target/i386/kvm/tdx.c | 24 ++++++++++++++++++++-
>  3 files changed, 87 insertions(+), 3 deletions(-)
> 
> diff --git a/qapi/run-state.json b/qapi/run-state.json
> index f216ba54ec4c..506bbe31541f 100644
> --- a/qapi/run-state.json
> +++ b/qapi/run-state.json
> @@ -499,7 +499,7 @@
>  # Since: 2.9
>  ##
>  { 'enum': 'GuestPanicInformationType',
> -  'data': [ 'hyper-v', 's390' ] }
> +  'data': [ 'hyper-v', 's390', 'tdx' ] }

Missing documentation for the 'tdx' value

>  
>  ##
>  # @GuestPanicInformation:
> @@ -514,7 +514,8 @@
>   'base': {'type': 'GuestPanicInformationType'},
>   'discriminator': 'type',
>   'data': {'hyper-v': 'GuestPanicInformationHyperV',
> -          's390': 'GuestPanicInformationS390'}}
> +          's390': 'GuestPanicInformationS390',
> +          'tdx' : 'GuestPanicInformationTdx'}}
>  
>  ##
>  # @GuestPanicInformationHyperV:
> @@ -577,6 +578,18 @@
>            'psw-addr': 'uint64',
>            'reason': 'S390CrashReason'}}
>  
> +##
> +# @GuestPanicInformationTdx:
> +#
> +# TDX GHCI TDG.VP.VMCALL<ReportFatalError> specific guest panic information

None of the struct members are documented. In particular, please include
a warning that 'message' comes from the guest and so must not be
trusted or assumed to be well formed.

> +#
> +# Since: 8.2
> +##
> +{'struct': 'GuestPanicInformationTdx',
> + 'data': {'error-code': 'uint64',
> +          'gpa': 'uint64',
> +          'message': 'str'}}
> +
>  ##
>  # @MEMORY_FAILURE:
>  #
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index f3bd86281813..cab11484ed7e 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
>                            S390CrashReason_str(info->u.s390.reason),
>                            info->u.s390.psw_mask,
>                            info->u.s390.psw_addr);
> +        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
> +            char *buf = NULL;
> +            bool printable = false;
> +
> +            /*
> +             * Although message is defined as a json string, we shouldn't
> +             * unconditionally treat it as is because the guest generated it and
> +             * it's not necessarily trustable.
> +             */
> +            if (info->u.tdx.message) {
> +                /* The caller guarantees the NUL-terminated string. */
> +                int len = strlen(info->u.tdx.message);
> +                int i;
> +
> +                printable = len > 0;
> +                for (i = 0; i < len; i++) {
> +                    if (!(0x20 <= info->u.tdx.message[i] &&
> +                          info->u.tdx.message[i] <= 0x7e)) {
> +                        printable = false;
> +                        break;
> +                    }
> +                }
> +
> +                /* 3 = length of "%02x " */
> +                buf = g_malloc(len * 3);
> +                for (i = 0; i < len; i++) {
> +                    if (info->u.tdx.message[i] == '\0') {
> +                        break;
> +                    } else {
> +                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
> +                    }
> +                }
> +                if (i > 0)
> +                    /* replace the last ' '(space) to NUL */
> +                    buf[i * 3 - 1] = '\0';
> +                else
> +                    buf[0] = '\0';

You're building this escaped buffer but...

> +            }
> +
> +            qemu_log_mask(LOG_GUEST_ERROR,
> +                          //" TDX report fatal error:\"%s\" %s",
> +                          " TDX report fatal error:\"%s\""
> +                          "error: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
> +                          printable ? info->u.tdx.message : "",
> +                          //buf ? buf : "",

...then not actually using it

Either delete the 'buf' code, or use it.
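
If the escaped form is kept, a standalone sketch of the two pieces could
look like this. msg_is_printable() and msg_hex_dump() are hypothetical
helper names, plain malloc() stands in for g_malloc(), and the buffer is
sized with one extra byte so that the terminating NUL written after the
last "xx " group always fits.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* True only for a non-empty string of printable ASCII (0x20..0x7e). */
bool msg_is_printable(const char *msg)
{
    size_t i, len = strlen(msg);

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)msg[i];

        if (c < 0x20 || c > 0x7e) {
            return false;
        }
    }
    return len > 0;
}

/*
 * Return a newly allocated "xx xx xx" hex dump of msg, or NULL if the
 * allocation fails.  The caller frees the result.  An empty message
 * yields an empty string.
 */
char *msg_hex_dump(const char *msg)
{
    size_t i, len = strlen(msg);
    /* 3 characters per byte ("xx ") plus one byte for the final NUL. */
    char *buf = malloc(len * 3 + 1);

    if (!buf) {
        return NULL;
    }
    buf[0] = '\0';
    for (i = 0; i < len; i++) {
        /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
        snprintf(buf + 3 * i, 4, "%02x ",
                 (unsigned int)(unsigned char)msg[i]);
    }
    if (len > 0) {
        buf[len * 3 - 1] = '\0';    /* drop the trailing space */
    }
    return buf;
}
```

A logging call can then print the message itself when msg_is_printable()
holds and fall back to the hex dump otherwise, so guest-controlled bytes
never reach the log unescaped.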

> +                          info->u.tdx.error_code,
> +                          info->u.tdx.gpa);
> +            g_free(buf);
>          }
> +
>          qapi_free_GuestPanicInformation(info);
>      }
>  }
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index f111b46dac92..7efaa13f59e2 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -18,6 +18,7 @@
>  #include "qom/object_interfaces.h"
>  #include "standard-headers/asm-x86/kvm_para.h"
>  #include "sysemu/kvm.h"
> +#include "sysemu/runstate.h"
>  #include "sysemu/sysemu.h"
>  #include "exec/address-spaces.h"
>  #include "exec/ramblock.h"
> @@ -1408,11 +1409,26 @@ static void tdx_handle_get_quote(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
>      vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
>  }
>  
> +static void tdx_panicked_on_fatal_error(X86CPU *cpu, uint64_t error_code,
> +                                        uint64_t gpa, char *message)
> +{
> +    GuestPanicInformation *panic_info;
> +
> +    panic_info = g_new0(GuestPanicInformation, 1);
> +    panic_info->type = GUEST_PANIC_INFORMATION_TYPE_TDX;
> +    panic_info->u.tdx.error_code = error_code;
> +    panic_info->u.tdx.gpa = gpa;
> +    panic_info->u.tdx.message = (char *)message;
> +
> +    qemu_system_guest_panicked(panic_info);
> +}
> +
>  static void tdx_handle_report_fatal_error(X86CPU *cpu,
>                                            struct kvm_tdx_vmcall *vmcall)
>  {
>      uint64_t error_code = vmcall->in_r12;
>      char *message = NULL;
> +    uint64_t gpa = -1ull;
>  
>      if (error_code & 0xffff) {
>          error_report("invalid error code of TDG.VP.VMCALL<REPORT_FATAL_ERROR>\n");
> @@ -1441,7 +1457,13 @@ static void tdx_handle_report_fatal_error(X86CPU *cpu,
>      }
>  
>      error_report("TD guest reports fatal error. %s\n", message ? : "");

The panic path (qemu_system_guest_panicked(), reached via
tdx_panicked_on_fatal_error()) avoids printing 'message' if it contains
non-printable characters, but here you're printing it regardless.

Do we still need this error_report() call at all?

> -    exit(1);
> +
> +#define TDX_REPORT_FATAL_ERROR_GPA_VALID    BIT_ULL(63)
> +    if (error_code & TDX_REPORT_FATAL_ERROR_GPA_VALID) {
> +	gpa = vmcall->in_r13;

Bad indent

> +    }
> +
> +    tdx_panicked_on_fatal_error(cpu, error_code, gpa, message);
>  }
>  
>  static void tdx_handle_setup_event_notify_interrupt(X86CPU *cpu,
> -- 
> 2.34.1
> 

With regards,
Daniel

* Re: [PATCH v2 03/58] target/i386: Parse TDX vm type
  2023-08-21  8:27   ` Daniel P. Berrangé
@ 2023-08-21 13:37     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-21 13:37 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/21/2023 4:27 PM, Daniel P. Berrangé wrote:
> On Fri, Aug 18, 2023 at 05:49:46AM -0400, Xiaoyao Li wrote:
>> TDX VM requires VM type KVM_X86_TDX_VM to be passed to
>> kvm_ioctl(KVM_CREATE_VM).
>>
>> If tdx-guest object is specified to confidential-guest-support, like,
>>
>>    qemu -machine ...,confidential-guest-support=tdx0 \
>>         -object tdx-guest,id=tdx0,...
>>
>> it parses VM type as KVM_X86_TDX_VM.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   target/i386/kvm/kvm.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index 62f237068a3a..77f4772afe6c 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -32,6 +32,7 @@
>>   #include "sysemu/runstate.h"
>>   #include "kvm_i386.h"
>>   #include "sev.h"
>> +#include "tdx.h"
>>   #include "xen-emu.h"
>>   #include "hyperv.h"
>>   #include "hyperv-proto.h"
>> @@ -158,6 +159,7 @@ static int kvm_get_one_msr(X86CPU *cpu, int index, uint64_t *value);
>>   static const char* vm_type_name[] = {
>>       [KVM_X86_DEFAULT_VM] = "default",
>>       [KVM_X86_SW_PROTECTED_VM] = "sw-protected-vm",
>> +    [KVM_X86_TDX_VM] = "tdx",
>>   };
>>   
>>   int kvm_get_vm_type(MachineState *ms, const char *vm_type)
>> @@ -170,12 +172,18 @@ int kvm_get_vm_type(MachineState *ms, const char *vm_type)
>>               kvm_type = KVM_X86_DEFAULT_VM;
>>           } else if (!g_ascii_strcasecmp(vm_type, "sw-protected-vm")) {
>>               kvm_type = KVM_X86_SW_PROTECTED_VM;
>> -        } else {
>> +        } else if (!g_ascii_strcasecmp(vm_type, "tdx")) {
>> +            kvm_type = KVM_X86_TDX_VM;
>> +        }else {
>>               error_report("Unknown kvm-type specified '%s'", vm_type);
>>               exit(1);
>>           }
>>       }
> 
> This whole block of code should go away - as this should not exist
> as a user visible property. It should be sufficient to use the
> tdx-guest object type to identify use of TDX.
> 

yes, agreed.

It's here because this series is based on the gmem series, which
introduced the kvm-type property. I'm sorry that I forgot to mention it
in the commit message.

The next gmem series will drop the implementation of the kvm-type
property [1], and the above code will be dropped in the next version as well.

[1] 
https://lore.kernel.org/qemu-devel/9b3a3e88-21f4-bfd2-a9c3-60a25832e698@intel.com/




* Re: [PATCH v2 45/58] i386/tdx: Limit the range size for MapGPA
  2023-08-18  9:50 ` [PATCH v2 45/58] i386/tdx: Limit the range size for MapGPA Xiaoyao Li
@ 2023-08-21 22:30   ` Isaku Yamahata
  0 siblings, 0 replies; 120+ messages in thread
From: Isaku Yamahata @ 2023-08-21 22:30 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:28AM -0400,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> If the range for TDG.VP.VMCALL<MapGPA> is too large, process the limited
> size and return retry error.  It's bad for VMM to take too long time,
> e.g. second order, with blocking vcpu execution.  It results in too many
> missing timer interrupts.

This patch requires the guest-side patch [1].
Unless the guest has a large amount of memory, though, it's unlikely to
hit the limit with KVM/QEMU.

[1] https://lore.kernel.org/all/20230811021246.821-1-decui@microsoft.com/
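
A simplified model of the retry protocol, for illustration only:
host_map_gpa() and guest_map_gpa() are hypothetical functions, the status
values are placeholders, and the shared-bit handling of the real handler is
left out. The host converts at most 64MB per call and hands back the address
to resume from; the guest keeps calling until it stops getting a retry.

```c
#include <stdint.h>

#define MAP_GPA_MAX_LEN   (64ULL * 1024 * 1024)  /* cap from the quoted patch */

enum { VMCALL_SUCCESS = 0, VMCALL_RETRY = 1 };   /* illustrative values only */

/*
 * Hypothetical host-side handler: processes at most MAP_GPA_MAX_LEN bytes
 * per call.  *converted accumulates the bytes handled and stands in for
 * the actual memory conversion.
 */
int host_map_gpa(uint64_t gpa, uint64_t size,
                 uint64_t *next_gpa, uint64_t *converted)
{
    uint64_t chunk = size > MAP_GPA_MAX_LEN ? MAP_GPA_MAX_LEN : size;

    *converted += chunk;
    if (chunk < size) {
        *next_gpa = gpa + chunk;    /* where the guest should resume */
        return VMCALL_RETRY;
    }
    return VMCALL_SUCCESS;
}

/*
 * Hypothetical guest-side loop: on retry, restart from the address the
 * host returned.  Returns the number of calls that were needed.
 */
int guest_map_gpa(uint64_t gpa, uint64_t size, uint64_t *converted)
{
    uint64_t end = gpa + size;
    int calls = 0;

    for (;;) {
        uint64_t next = 0;
        int status = host_map_gpa(gpa, end - gpa, &next, converted);

        calls++;
        if (status != VMCALL_RETRY) {
            return calls;
        }
        gpa = next;
    }
}
```

With these assumptions a 160MB request takes three calls (64MB, 64MB,
32MB), and each call blocks the vcpu only for one chunk.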

> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  target/i386/kvm/tdx.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 0c43c1f7759f..ced55be506d1 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -994,12 +994,16 @@ static hwaddr tdx_shared_bit(X86CPU *cpu)
>      return (cpu->phys_bits > 48) ? BIT_ULL(51) : BIT_ULL(47);
>  }
>  
> +/* 64MB at most in one call. What value is appropriate? */
> +#define TDX_MAP_GPA_MAX_LEN     (64 * 1024 * 1024)
> +
>  static void tdx_handle_map_gpa(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
>  {
>      hwaddr shared_bit = tdx_shared_bit(cpu);
>      hwaddr gpa = vmcall->in_r12 & ~shared_bit;
>      bool private = !(vmcall->in_r12 & shared_bit);
>      hwaddr size = vmcall->in_r13;
> +    bool retry = false;
>      int ret = 0;
>  
>      vmcall->status_code = TDG_VP_VMCALL_INVALID_OPERAND;
> @@ -1018,12 +1022,25 @@ static void tdx_handle_map_gpa(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
>          return;
>      }
>  
> +    if (size > TDX_MAP_GPA_MAX_LEN) {
> +        retry = true;
> +        size = TDX_MAP_GPA_MAX_LEN;
> +    }
> +
>      if (size > 0) {
>          ret = kvm_convert_memory(gpa, size, private);
>      }
>  
>      if (!ret) {
> -        vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
> +        if (retry) {
> +            vmcall->status_code = TDG_VP_VMCALL_RETRY;
> +            vmcall->out_r11 = gpa + size;
> +            if (!private) {
> +                vmcall->out_r11 |= shared_bit;
> +            }
> +        } else {
> +            vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
> +        }
>      }
>  }
>  
> -- 
> 2.34.1
> 
> 

-- 
Isaku Yamahata <isaku.yamahata@linux.intel.com>


* Re: [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions
  2023-08-18  9:49 ` [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions Xiaoyao Li
@ 2023-08-21 23:00   ` Isaku Yamahata
  2023-08-23  3:59     ` Xiaoyao Li
  2023-10-10  1:02   ` Tina Zhang
  1 sibling, 1 reply; 120+ messages in thread
From: Isaku Yamahata @ 2023-08-21 23:00 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:49:51AM -0400,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 56cb826f6125..3198bc9fd5fb 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
...
> +static inline uint32_t host_cpuid_reg(uint32_t function,
> +                                      uint32_t index, int reg)
> +{
> +    uint32_t eax, ebx, ecx, edx;
> +    uint32_t ret = 0;
> +
> +    host_cpuid(function, index, &eax, &ebx, &ecx, &edx);
> +
> +    switch (reg) {
> +    case R_EAX:
> +        ret |= eax;
> +        break;
> +    case R_EBX:
> +        ret |= ebx;
> +        break;
> +    case R_ECX:
> +        ret |= ecx;
> +        break;
> +    case R_EDX:
> +        ret |= edx;

Nitpick: "|" isn't needed as we initialize ret = 0 above. Just '='.
-- 
Isaku Yamahata <isaku.yamahata@linux.intel.com>


* Re: [PATCH v2 19/58] qom: implement property helper for sha384
  2023-08-21  9:25   ` Daniel P. Berrangé
@ 2023-08-21 23:28     ` Isaku Yamahata
  0 siblings, 0 replies; 120+ messages in thread
From: Isaku Yamahata @ 2023-08-21 23:28 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Mon, Aug 21, 2023 at 10:25:35AM +0100,
"Daniel P. Berrangé" <berrange@redhat.com> wrote:

> On Fri, Aug 18, 2023 at 05:50:02AM -0400, Xiaoyao Li wrote:
> > From: Isaku Yamahata <isaku.yamahata@intel.com>
> > 
> > Implement property_add_sha384() which converts hex string <-> uint8_t[48]
> > It will be used for TDX which uses sha384 for measurement.
> 
> I think it is likely a better idea to use base64 for the encoding
> the binary hash - we use base64 for all the sev-guest properties
> that were binary data.
> 
> At which points the property set/get logic is much simpler as it
> is just needing a call to  g_base64_encode / g_base64_decode and
> length validation for the decode case.

Hex strings are popular for showing hash values, aren't they?  Anyway, it's
easy for a human operator, shell scripts, libvirt, or whatever to convert
between those representations with utility commands like base64 or xxd, or
with a library call.  Either way would work.
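
To illustrate how small the conversion is, here is a self-contained sketch in
plain C. The names hex_decode() and base64_encode() are made up for this
example and are not QEMU or GLib functions; in QEMU itself one would
presumably use g_base64_encode() / g_base64_decode() as suggested above.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one hex digit; returns -1 on an invalid character (including NUL). */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode exactly out_len bytes from a hex string of 2 * out_len characters.
 * Returns 0 on success, -1 on a bad digit or a wrong string length.
 */
static int hex_decode(const char *hex, uint8_t *out, size_t out_len)
{
    size_t i;

    for (i = 0; i < out_len; i++) {
        int hi, lo;

        hi = hex_nibble(hex[2 * i]);
        if (hi < 0) return -1;
        lo = hex_nibble(hex[2 * i + 1]);
        if (lo < 0) return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return hex[2 * out_len] == '\0' ? 0 : -1;
}

/*
 * Encode len bytes as base64 with '=' padding.
 * out must hold at least 4 * ((len + 2) / 3) + 1 characters.
 */
static void base64_encode(const uint8_t *in, size_t len, char *out)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i, o = 0;

    for (i = 0; i + 2 < len; i += 3) {
        uint32_t v = ((uint32_t)in[i] << 16) | ((uint32_t)in[i + 1] << 8) |
                     in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
    }
    if (i < len) {                      /* one or two trailing bytes */
        uint32_t v = (uint32_t)in[i] << 16;
        if (i + 1 < len) v |= (uint32_t)in[i + 1] << 8;
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
}
```

A 48-byte SHA384 digest is 96 hex characters or 64 base64 characters, and
since 48 is a multiple of 3 the base64 form needs no padding.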
-- 
Isaku Yamahata <isaku.yamahata@linux.intel.com>


* Re: [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM
  2023-08-18  9:50 ` [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM Xiaoyao Li
  2023-08-21  9:38   ` Daniel P. Berrangé
@ 2023-08-21 23:40   ` Isaku Yamahata
  2023-08-22 15:45     ` Xiaoyao Li
  1 sibling, 1 reply; 120+ messages in thread
From: Isaku Yamahata @ 2023-08-21 23:40 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Fri, Aug 18, 2023 at 05:50:15AM -0400,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
> index e9d2888162ce..9b3c427766ef 100644
> --- a/target/i386/kvm/tdx.h
> +++ b/target/i386/kvm/tdx.h
> @@ -15,6 +15,17 @@ typedef struct TdxGuestClass {
>      ConfidentialGuestSupportClass parent_class;
>  } TdxGuestClass;
>  
> +enum TdxRamType{
> +    TDX_RAM_UNACCEPTED,
> +    TDX_RAM_ADDED,
> +};
> +
> +typedef struct TdxRamEntry {
> +    uint64_t address;
> +    uint64_t length;
> +    uint32_t type;

Nitpick: use enum TdxRamType here, and for the related function arguments.
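
Roughly what that could look like, as a sketch only. The typedef on the enum
and the helper tdx_ram_entry_make() are assumptions added for this example
and are not part of the posted patch.

```c
#include <stdint.h>

typedef enum TdxRamType {
    TDX_RAM_UNACCEPTED,
    TDX_RAM_ADDED,
} TdxRamType;

typedef struct TdxRamEntry {
    uint64_t address;
    uint64_t length;
    TdxRamType type;    /* uint32_t in the posted patch */
} TdxRamEntry;

/* Hypothetical helper showing the enum used as a function argument too. */
static inline TdxRamEntry tdx_ram_entry_make(uint64_t address,
                                             uint64_t length,
                                             TdxRamType type)
{
    TdxRamEntry e = { .address = address, .length = length, .type = type };
    return e;
}
```

With the enum type in the signature, a debugger prints the symbolic name and
the compiler can warn about unhandled values in a switch.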

-- 
Isaku Yamahata <isaku.yamahata@linux.intel.com>


* Re: [PATCH v2 02/58] i386: Introduce tdx-guest object
  2023-08-18  9:49 ` [PATCH v2 02/58] i386: Introduce tdx-guest object Xiaoyao Li
@ 2023-08-22  6:22   ` Markus Armbruster
  2023-08-23  7:27     ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Markus Armbruster @ 2023-08-22  6:22 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

Xiaoyao Li <xiaoyao.li@intel.com> writes:

> Introduce tdx-guest object which implements the interface of
> CONFIDENTIAL_GUEST_SUPPORT, and will be used to create TDX VMs (TDs) by
>
>   qemu -machine ...,confidential-guest-support=tdx0	\
>        -object tdx-guset,id=tdx0

Typo: tdx-guest

> It has only one property 'attributes' with fixed value 0 and not
> configurable so far.

This must refer to TdxGuest member @attributes.

"Property" suggests QOM property, which @attributes isn't, at least not
in this patch.  Will it become a QOM property later in this series?

Hmm, @attributes appears to remain unused until PATCH 14.  I recommend
delaying its addition until then.

> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
> changes from RFC-V4
> - make @attributes not user-settable
> ---
>  configs/devices/i386-softmmu/default.mak |  1 +
>  hw/i386/Kconfig                          |  5 +++
>  qapi/qom.json                            | 12 +++++++
>  target/i386/kvm/meson.build              |  2 ++
>  target/i386/kvm/tdx.c                    | 40 ++++++++++++++++++++++++
>  target/i386/kvm/tdx.h                    | 19 +++++++++++
>  6 files changed, 79 insertions(+)
>  create mode 100644 target/i386/kvm/tdx.c
>  create mode 100644 target/i386/kvm/tdx.h
>
> diff --git a/configs/devices/i386-softmmu/default.mak b/configs/devices/i386-softmmu/default.mak
> index 598c6646dfc0..9b5ec59d65b0 100644
> --- a/configs/devices/i386-softmmu/default.mak
> +++ b/configs/devices/i386-softmmu/default.mak
> @@ -18,6 +18,7 @@
>  #CONFIG_QXL=n
>  #CONFIG_SEV=n
>  #CONFIG_SGA=n
> +#CONFIG_TDX=n
>  #CONFIG_TEST_DEVICES=n
>  #CONFIG_TPM_CRB=n
>  #CONFIG_TPM_TIS_ISA=n
> diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
> index 9051083c1e78..929f6c3f0e85 100644
> --- a/hw/i386/Kconfig
> +++ b/hw/i386/Kconfig
> @@ -10,6 +10,10 @@ config SGX
>      bool
>      depends on KVM
>  
> +config TDX
> +    bool
> +    depends on KVM
> +
>  config PC
>      bool
>      imply APPLESMC
> @@ -26,6 +30,7 @@ config PC
>      imply QXL
>      imply SEV
>      imply SGX
> +    imply TDX
>      imply TEST_DEVICES
>      imply TPM_CRB
>      imply TPM_TIS_ISA
> diff --git a/qapi/qom.json b/qapi/qom.json
> index e0b2044e3d20..2ca7ce7c0da5 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -866,6 +866,16 @@
>              'reduced-phys-bits': 'uint32',
>              '*kernel-hashes': 'bool' } }
>  
> +##
> +# @TdxGuestProperties:
> +#
> +# Properties for tdx-guest objects.
> +#
> +# Since: 8.2
> +##
> +{ 'struct': 'TdxGuestProperties',
> +  'data': { }}
> +
>  ##
>  # @ThreadContextProperties:
>  #
> @@ -944,6 +954,7 @@
>      'sev-guest',
>      'thread-context',
>      's390-pv-guest',
> +    'tdx-guest',
>      'throttle-group',
>      'tls-creds-anon',
>      'tls-creds-psk',
> @@ -1010,6 +1021,7 @@
>        'secret_keyring':             { 'type': 'SecretKeyringProperties',
>                                        'if': 'CONFIG_SECRET_KEYRING' },
>        'sev-guest':                  'SevGuestProperties',
> +      'tdx-guest':                  'TdxGuestProperties',
>        'thread-context':             'ThreadContextProperties',
>        'throttle-group':             'ThrottleGroupProperties',
>        'tls-creds-anon':             'TlsCredsAnonProperties',

Actually useful only when CONFIG_TDX is on, but can't make it
conditional here, as CONFIG_TDX is poisoned.

> diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
> index 40fbde96cac6..21ab03fe1349 100644
> --- a/target/i386/kvm/meson.build
> +++ b/target/i386/kvm/meson.build
> @@ -11,6 +11,8 @@ i386_softmmu_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files('xen-emu.c'))
>  
>  i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
>  
> +i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
> +
>  i386_system_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
>  
>  i386_system_ss.add_all(when: 'CONFIG_KVM', if_true: i386_softmmu_kvm_ss)
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> new file mode 100644
> index 000000000000..d3792d4a3d56
> --- /dev/null
> +++ b/target/i386/kvm/tdx.c
> @@ -0,0 +1,40 @@
> +/*
> + * QEMU TDX support
> + *
> + * Copyright Intel
> + *
> + * Author:
> + *      Xiaoyao Li <xiaoyao.li@intel.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qom/object_interfaces.h"
> +
> +#include "tdx.h"
> +
> +/* tdx guest */
> +OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
> +                                   tdx_guest,
> +                                   TDX_GUEST,
> +                                   CONFIDENTIAL_GUEST_SUPPORT,
> +                                   { TYPE_USER_CREATABLE },
> +                                   { NULL })
> +
> +static void tdx_guest_init(Object *obj)
> +{
> +    TdxGuest *tdx = TDX_GUEST(obj);
> +
> +    tdx->attributes = 0;
> +}
> +
> +static void tdx_guest_finalize(Object *obj)
> +{
> +}
> +
> +static void tdx_guest_class_init(ObjectClass *oc, void *data)
> +{
> +}
> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
> new file mode 100644
> index 000000000000..415aeb5af746
> --- /dev/null
> +++ b/target/i386/kvm/tdx.h
> @@ -0,0 +1,19 @@
> +#ifndef QEMU_I386_TDX_H
> +#define QEMU_I386_TDX_H
> +
> +#include "exec/confidential-guest-support.h"
> +
> +#define TYPE_TDX_GUEST "tdx-guest"
> +#define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
> +
> +typedef struct TdxGuestClass {
> +    ConfidentialGuestSupportClass parent_class;
> +} TdxGuestClass;
> +
> +typedef struct TdxGuest {
> +    ConfidentialGuestSupport parent_obj;
> +
> +    uint64_t attributes;    /* TD attributes */
> +} TdxGuest;
> +
> +#endif /* QEMU_I386_TDX_H */

QAPI schema
Acked-by: Markus Armbruster <armbru@redhat.com>




* Re: [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2023-08-21  8:59   ` Daniel P. Berrangé
@ 2023-08-22  6:27     ` Markus Armbruster
  2023-08-22  8:39       ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Markus Armbruster @ 2023-08-22  6:27 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Aug 18, 2023 at 05:49:58AM -0400, Xiaoyao Li wrote:
>> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it disables
>> EPT violation conversion to #VE on guest TD access of PENDING pages.
>> 
>> Some guest OS (e.g., Linux TD guest) may require this bit as 1.
>> Otherwise refuse to boot.
>> 
>> Add sept-ve-disable property for tdx-guest object, for user to configure
>> this bit.
>> 
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>> ---
>>  qapi/qom.json         |  4 +++-
>>  target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
>>  2 files changed, 27 insertions(+), 1 deletion(-)
>> 
>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index 2ca7ce7c0da5..cc08b9a98df9 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -871,10 +871,12 @@
>>  #
>>  # Properties for tdx-guest objects.
>>  #
>> +# @sept-ve-disable: bit 28 of TD attributes (default: 0)
>
> This description isn't very useful as it forces the user to go off and
> read the TDX specification to find out what bit 28 means. You've got a

Seconded.

> more useful description in the commit message, so please use that
> in the docs too. eg something like this
>
>   @sept-ve-disable: toggle bit 28 of TD attributes to control disabling
>                     of EPT violation conversion to #VE on guest
>                     TD access of PENDING pages. Some guest OS (e.g.
>                     Linux TD guest) may require this set, otherwise
>                     they refuse to boot.

But please format like

# @sept-ve-disable: toggle bit 28 of TD attributes to control disabling
#     of EPT violation conversion to #VE on guest TD access of PENDING
#     pages.  Some guest OS (e.g. Linux TD guest) may require this to
#     be set, otherwise they refuse to boot.

to blend in with recent commit a937b6aa739 (qapi: Reformat doc comments
to conform to current conventions).

>> +#
>>  # Since: 8.2
>>  ##
>>  { 'struct': 'TdxGuestProperties',
>> -  'data': { }}
>> +  'data': { '*sept-ve-disable': 'bool' } }
>>  
>>  ##
>>  # @ThreadContextProperties:

[...]




* Re: [PATCH v2 20/58] i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM
  2023-08-21  9:29   ` Daniel P. Berrangé
@ 2023-08-22  6:35     ` Markus Armbruster
  0 siblings, 0 replies; 120+ messages in thread
From: Markus Armbruster @ 2023-08-22  6:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Aug 18, 2023 at 05:50:03AM -0400, Xiaoyao Li wrote:
>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>> 
>> When creating TDX vm, three sha384 hash values can be provided for
>> TDX attestation.
>> 
>> So far they were hard coded as 0. Now allow user to specify those values
>> via property mrconfigid, mrowner and mrownerconfig. Choose hex-encoded
>> string as format since it's friendly for user to input.
>> 
>> example
>> -object tdx-guest, \
>>   mrconfigid=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef, \
>>   mrowner=fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210fedcba9876543210, \
>>   mrownerconfig=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
>> 
>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>> TODO:
>>  - community requests to use base64 encoding if no special reason
>> ---
>>  qapi/qom.json         | 11 ++++++++++-
>>  target/i386/kvm/tdx.c | 13 +++++++++++++
>>  target/i386/kvm/tdx.h |  3 +++
>>  3 files changed, 26 insertions(+), 1 deletion(-)
>> 
>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index cc08b9a98df9..87c1d440f331 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -873,10 +873,19 @@
>>  #
>>  # @sept-ve-disable: bit 28 of TD attributes (default: 0)
>>  #
>> +# @mrconfigid: MRCONFIGID SHA384 hex string of 48 * 2 length (default: 0)
>> +#
>> +# @mrowner: MROWNER SHA384 hex string of 48 * 2 length (default: 0)
>> +#
>> +# @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
>
> Per previous patch, I suggest these should all be passed in base64
> instead of hex.

I'm upgrading this suggestion to a demand: we use base64 for encoding
binary data everywhere in QAPI/QMP.  Consistency matters.

>                 Also 'default: 0' makes no sense for a string,
> which would be 'default: nil', and no need to document that as
> the default is implicit from the fact that its an optional string
> field. So eg
>
>   @mrconfigid: base64 encoded MRCONFIGID SHA384 digest

Agree.

The member names are abbreviations all run together, whereas QAPI/QMP
favors words-separated-with-dashes.  If you invented them, please change
them to QAPI/QMP style.  If they are established TDX terminology, keep
them as they are, but please point to your evidence.

>> +#
>>  # Since: 8.2
>>  ##
>>  { 'struct': 'TdxGuestProperties',
>> -  'data': { '*sept-ve-disable': 'bool' } }
>> +  'data': { '*sept-ve-disable': 'bool',
>> +            '*mrconfigid': 'str',
>> +            '*mrowner': 'str',
>> +            '*mrownerconfig': 'str' } }
>>  
>>  ##
>>  # @ThreadContextProperties:

[...]




* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-18  9:50 ` [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote> Xiaoyao Li
@ 2023-08-22  6:52   ` Markus Armbruster
  2023-08-22  8:24     ` Daniel P. Berrangé
  0 siblings, 1 reply; 120+ messages in thread
From: Markus Armbruster @ 2023-08-22  6:52 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

Xiaoyao Li <xiaoyao.li@intel.com> writes:

> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> For GetQuote, delegate a request to Quote Generation Service.  Add property
> of address of quote generation server and On request, connect to the
> server, read request buffer from shared guest memory, send the request
> buffer to the server and store the response into shared guest memory and
> notify TD guest by interrupt.
>
> "quote-generation-service" is a property to specify Quote Generation
> Service(QGS) in qemu socket address format.  The examples of the supported
> format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
>
> command line example:
>   qemu-system-x86_64 \
>     -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
>     -machine confidential-guest-support=tdx0
>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  qapi/qom.json         |   5 +-
>  target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
>  target/i386/kvm/tdx.h |   7 +
>  3 files changed, 391 insertions(+), 1 deletion(-)
>
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 87c1d440f331..37139949d761 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -879,13 +879,16 @@
>  #
>  # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
>  #
> +# @quote-generation-service: socket address for Quote Generation Service(QGS)
> +#
>  # Since: 8.2
>  ##
>  { 'struct': 'TdxGuestProperties',
>    'data': { '*sept-ve-disable': 'bool',
>              '*mrconfigid': 'str',
>              '*mrowner': 'str',
> -            '*mrownerconfig': 'str' } }
> +            '*mrownerconfig': 'str',
> +            '*quote-generation-service': 'str' } }

Why not type SocketAddress?
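
To illustrate the point: with a plain 'str', every consumer has to parse the
address forms from the commit message itself. The toy parser below shows what
that entails for the three quoted examples. QgsAddr and qgs_addr_parse() are
invented for this sketch and are not QEMU code; it does not handle IPv6
literals or the other cases a real socket address parser must cover. A
structured type in the schema would make this kind of ad-hoc parsing
unnecessary.

```c
#include <stdio.h>
#include <string.h>

typedef enum { QGS_ADDR_INET, QGS_ADDR_UNIX, QGS_ADDR_VSOCK } QgsAddrType;

/* Hypothetical tagged union mirroring the three address forms. */
typedef struct {
    QgsAddrType type;
    union {
        struct { char host[64]; unsigned port; } inet;
        struct { char path[108]; } un;
        struct { unsigned cid; unsigned port; } vsock;
    } u;
} QgsAddr;

/* Parse "vsock:CID:PORT", "unix:PATH" or "HOST:PORT". Returns 0 or -1. */
static int qgs_addr_parse(const char *s, QgsAddr *out)
{
    const char *colon;
    size_t n;

    if (strncmp(s, "vsock:", 6) == 0) {
        out->type = QGS_ADDR_VSOCK;
        return sscanf(s + 6, "%u:%u", &out->u.vsock.cid,
                      &out->u.vsock.port) == 2 ? 0 : -1;
    }
    if (strncmp(s, "unix:", 5) == 0) {
        n = strlen(s + 5);
        if (n == 0 || n >= sizeof(out->u.un.path)) return -1;
        out->type = QGS_ADDR_UNIX;
        memcpy(out->u.un.path, s + 5, n + 1);   /* includes the NUL */
        return 0;
    }
    colon = strrchr(s, ':');                    /* last colon splits the port */
    if (!colon || colon == s) return -1;
    n = (size_t)(colon - s);
    if (n >= sizeof(out->u.inet.host)) return -1;
    if (sscanf(colon + 1, "%u", &out->u.inet.port) != 1) return -1;
    out->type = QGS_ADDR_INET;
    memcpy(out->u.inet.host, s, n);
    out->u.inet.host[n] = '\0';
    return 0;
}
```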

>  
>  ##
>  # @ThreadContextProperties:

[...]




* Re: [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  2023-08-21  8:46   ` Daniel P. Berrangé
@ 2023-08-22  7:31     ` Xiaoyao Li
  2023-08-22  8:19       ` Daniel P. Berrangé
  0 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-22  7:31 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/21/2023 4:46 PM, Daniel P. Berrangé wrote:
> On Fri, Aug 18, 2023 at 05:49:49AM -0400, Xiaoyao Li wrote:
>> KVM provides TDX capabilities via sub command KVM_TDX_CAPABILITIES of
>> IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the capabilities when initializing
>> TDX context. It will be used to validate user's setting later.
>>
>> Since there is no interface reporting how many cpuid configs contains in
>> KVM_TDX_CAPABILITIES, QEMU chooses to try starting with a known number
>> and abort when it exceeds KVM_MAX_CPUID_ENTRIES.
>>
>> Besides, introduce the interfaces to invoke TDX "ioctls" at different
>> scope (KVM, VM and VCPU) in preparation.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>> changes from v1:
>>    - Make the error message more clear;
>>
>> changes from RFC v4:
>>    - start from nr_cpuid_configs = 6 for the loop;
>>    - stop the loop when nr_cpuid_configs exceeds KVM_MAX_CPUID_ENTRIES;
>> ---
>>   target/i386/kvm/kvm.c      |  2 -
>>   target/i386/kvm/kvm_i386.h |  2 +
>>   target/i386/kvm/tdx.c      | 93 ++++++++++++++++++++++++++++++++++++++
>>   3 files changed, 95 insertions(+), 2 deletions(-)
>>
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index d6b988d6c2d1..ec5c07bffd38 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -1751,8 +1751,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
>>   
>>   static Error *invtsc_mig_blocker;
>>   
>> -#define KVM_MAX_CPUID_ENTRIES  100
>> -
>>   static void kvm_init_xsave(CPUX86State *env)
>>   {
>>       if (has_xsave2) {
>> diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
>> index ea3a5b174ac0..769eadbba56c 100644
>> --- a/target/i386/kvm/kvm_i386.h
>> +++ b/target/i386/kvm/kvm_i386.h
>> @@ -13,6 +13,8 @@
>>   
>>   #include "sysemu/kvm.h"
>>   
>> +#define KVM_MAX_CPUID_ENTRIES  100
>> +
>>   #define kvm_apic_in_kernel() (kvm_irqchip_in_kernel())
>>   
>>   #ifdef CONFIG_KVM
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 77e33ae01147..255c47a2a553 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -12,14 +12,107 @@
>>    */
>>   
>>   #include "qemu/osdep.h"
>> +#include "qemu/error-report.h"
>>   #include "qapi/error.h"
>>   #include "qom/object_interfaces.h"
>> +#include "sysemu/kvm.h"
>>   
>>   #include "hw/i386/x86.h"
>> +#include "kvm_i386.h"
>>   #include "tdx.h"
>>   
>> +static struct kvm_tdx_capabilities *tdx_caps;
>> +
>> +enum tdx_ioctl_level{
>> +    TDX_PLATFORM_IOCTL,
>> +    TDX_VM_IOCTL,
>> +    TDX_VCPU_IOCTL,
>> +};
>> +
>> +static int __tdx_ioctl(void *state, enum tdx_ioctl_level level, int cmd_id,
>> +                        __u32 flags, void *data)
> 
> Names with an initial double underscore are reserved for us by the
> platform implementation, so shouldn't be used in userspace app
> code.

How about tdx_ioctl_internal() ?
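
A sketch of the renamed dispatcher, to show that only the name changes. The
struct and the three stub_* functions are stand-ins invented so the example
compiles without KVM headers; in the patch they are struct kvm_tdx_cmd and
kvm_ioctl() / kvm_vm_ioctl() / kvm_vcpu_ioctl(). Returning -EINVAL in the
default case is also an assumption of this sketch; the patch calls exit(1).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum tdx_ioctl_level {
    TDX_PLATFORM_IOCTL,
    TDX_VM_IOCTL,
    TDX_VCPU_IOCTL,
};

/* Simplified stand-in for struct kvm_tdx_cmd. */
struct tdx_cmd_sketch {
    uint32_t id;
    uint32_t flags;
    uint64_t data;
};

/* Stubs with distinct return values so the dispatch can be observed. */
static int stub_kvm_ioctl(struct tdx_cmd_sketch *cmd) { (void)cmd; return 1; }
static int stub_kvm_vm_ioctl(struct tdx_cmd_sketch *cmd) { (void)cmd; return 2; }
static int stub_kvm_vcpu_ioctl(void *state, struct tdx_cmd_sketch *cmd)
{
    (void)state;
    (void)cmd;
    return 3;
}

/* No leading double underscore: that namespace is reserved. */
static int tdx_ioctl_internal(void *state, enum tdx_ioctl_level level,
                              int cmd_id, uint32_t flags, void *data)
{
    struct tdx_cmd_sketch cmd;

    memset(&cmd, 0, sizeof(cmd));
    cmd.id = (uint32_t)cmd_id;
    cmd.flags = flags;
    cmd.data = (uint64_t)(uintptr_t)data;

    switch (level) {
    case TDX_PLATFORM_IOCTL:
        return stub_kvm_ioctl(&cmd);
    case TDX_VM_IOCTL:
        return stub_kvm_vm_ioctl(&cmd);
    case TDX_VCPU_IOCTL:
        return stub_kvm_vcpu_ioctl(state, &cmd);
    default:
        return -EINVAL;
    }
}

static inline int tdx_platform_ioctl(int cmd_id, uint32_t flags, void *data)
{
    return tdx_ioctl_internal(NULL, TDX_PLATFORM_IOCTL, cmd_id, flags, data);
}

static inline int tdx_vm_ioctl(int cmd_id, uint32_t flags, void *data)
{
    return tdx_ioctl_internal(NULL, TDX_VM_IOCTL, cmd_id, flags, data);
}

static inline int tdx_vcpu_ioctl(void *vcpu, int cmd_id, uint32_t flags,
                                 void *data)
{
    return tdx_ioctl_internal(vcpu, TDX_VCPU_IOCTL, cmd_id, flags, data);
}
```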

>> +{
>> +    struct kvm_tdx_cmd tdx_cmd;
>> +    int r;
>> +
>> +    memset(&tdx_cmd, 0x0, sizeof(tdx_cmd));
>> +
>> +    tdx_cmd.id = cmd_id;
>> +    tdx_cmd.flags = flags;
>> +    tdx_cmd.data = (__u64)(unsigned long)data;
>> +
>> +    switch (level) {
>> +    case TDX_PLATFORM_IOCTL:
>> +        r = kvm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
>> +        break;
>> +    case TDX_VM_IOCTL:
>> +        r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
>> +        break;
>> +    case TDX_VCPU_IOCTL:
>> +        r = kvm_vcpu_ioctl(state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
>> +        break;
>> +    default:
>> +        error_report("Invalid tdx_ioctl_level %d", level);
>> +        exit(1);
>> +    }
>> +
>> +    return r;
>> +}
>> +
>> +static inline int tdx_platform_ioctl(int cmd_id, __u32 flags, void *data)
>> +{
>> +    return __tdx_ioctl(NULL, TDX_PLATFORM_IOCTL, cmd_id, flags, data);
>> +}
>> +
>> +static inline int tdx_vm_ioctl(int cmd_id, __u32 flags, void *data)
>> +{
>> +    return __tdx_ioctl(NULL, TDX_VM_IOCTL, cmd_id, flags, data);
>> +}
>> +
>> +static inline int tdx_vcpu_ioctl(void *vcpu_fd, int cmd_id, __u32 flags,
>> +                                 void *data)
>> +{
>> +    return  __tdx_ioctl(vcpu_fd, TDX_VCPU_IOCTL, cmd_id, flags, data);
>> +}
>> +
>> +static void get_tdx_capabilities(void)
> 
> Pass in 'Error **errp'

OK. Will do it and all the following.

Thanks!
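
For reference, a self-contained sketch of the probe loop with the review
comments applied: size_t for the allocation size, the limit included in the
message, and -1 returned instead of exit(1). Everything with a _sketch or
fake_ prefix is invented for this example. The error buffer stands in for
QEMU's Error **errp, where error_setg() / error_setg_errno() would be used,
and fake_caps_ioctl() stands in for the KVM_TDX_CAPABILITIES call.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUID_ENTRIES 100   /* mirrors KVM_MAX_CPUID_ENTRIES in the patch */

struct cpuid_config_sketch { unsigned leaf, sub_leaf, eax, ebx, ecx, edx; };

struct caps_sketch {
    unsigned nr_cpuid_configs;
    struct cpuid_config_sketch cpuid_configs[];
};

/* Fake kernel side: reports -E2BIG while the caller's buffer is too small. */
static unsigned fake_kernel_nr = 20;

static int fake_caps_ioctl(struct caps_sketch *caps)
{
    if (caps->nr_cpuid_configs < fake_kernel_nr) {
        return -E2BIG;
    }
    caps->nr_cpuid_configs = fake_kernel_nr;
    return 0;
}

/* Returns 0 and sets *out on success; returns -1 and fills err on failure. */
static int get_caps_sketch(struct caps_sketch **out, char *err, size_t errlen)
{
    unsigned nr = 6;    /* same starting guess as the patch */

    for (;;) {
        size_t size = sizeof(struct caps_sketch) +
                      nr * sizeof(struct cpuid_config_sketch);
        struct caps_sketch *caps = calloc(1, size);
        int r;

        if (!caps) {
            snprintf(err, errlen, "out of memory");
            return -1;
        }
        caps->nr_cpuid_configs = nr;

        r = fake_caps_ioctl(caps);
        if (r == 0) {
            *out = caps;
            return 0;
        }
        free(caps);
        if (r != -E2BIG) {
            snprintf(err, errlen, "KVM_TDX_CAPABILITIES failed: %s",
                     strerror(-r));
            return -1;
        }
        nr *= 2;        /* buffer too small: double it and retry */
        if (nr > MAX_CPUID_ENTRIES) {
            snprintf(err, errlen,
                     "number of CPUID entries exceeds limit (%d)",
                     MAX_CPUID_ENTRIES);
            return -1;
        }
    }
}
```

The caller would then check the return value and propagate the error rather
than letting the helper terminate the process.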

>> +{
>> +    struct kvm_tdx_capabilities *caps;
>> +    /* 1st generation of TDX reports 6 cpuid configs */
>> +    int nr_cpuid_configs = 6;
>> +    int r, size;
> 
> It is preferrable to use  'size_t' for memory allocation sizes.
> 
>> +
>> +    do {
>> +        size = sizeof(struct kvm_tdx_capabilities) +
>> +               nr_cpuid_configs * sizeof(struct kvm_tdx_cpuid_config);
>> +        caps = g_malloc0(size);
>> +        caps->nr_cpuid_configs = nr_cpuid_configs;
>> +
>> +        r = tdx_vm_ioctl(KVM_TDX_CAPABILITIES, 0, caps);
>> +        if (r == -E2BIG) {
>> +            g_free(caps);
>> +            nr_cpuid_configs *= 2;
>> +            if (nr_cpuid_configs > KVM_MAX_CPUID_ENTRIES) {
>> +                error_report("KVM TDX seems broken that number of CPUID entries in kvm_tdx_capabilities exceeds limit");
> 
> Include the limit in the error message, so if we ever need to change
> the limit, it'll be clear what limit the QEMU version was built with.
> 
> Also use error_setg(errp, ...);
> 
>> +                exit(1);
> 
> Return -1
> 
>> +            }
>> +        } else if (r < 0) {
>> +            g_free(caps);
>> +            error_report("KVM_TDX_CAPABILITIES failed: %s", strerror(-r));
> 
> Use error_setg_errno(errp, ...) instead of calling strerror yourself;
> 
>> +            exit(1);
> 
> Return -1
> 
>> +        }
>> +    }
>> +    while (r == -E2BIG);
>> +
>> +    tdx_caps = caps;
> 
> Return 0
> 
>> +}
>> +
>>   int tdx_kvm_init(MachineState *ms, Error **errp)
>>   {
>> +    if (!tdx_caps) {
>> +        get_tdx_capabilities();
> 
> Pass 'errp' into this method, and check return value for failure
> 
>> +    }
>> +
>>       return 0;
>>   }
>>   
>> -- 
>> 2.34.1
>>
> 
> With regards,
> Daniel



* Re: [PATCH v2 07/58] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  2023-08-21  8:48   ` Daniel P. Berrangé
@ 2023-08-22  7:46     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-22  7:46 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/21/2023 4:48 PM, Daniel P. Berrangé wrote:
> On Fri, Aug 18, 2023 at 05:49:50AM -0400, Xiaoyao Li wrote:
>> It will need special handling for TDX VMs all around the QEMU.
>> Introduce is_tdx_vm() helper to query if it's a TDX VM.
>>
>> Cache tdx_guest object thus no need to cast from ms->cgs every time.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>> ---
>>   target/i386/kvm/tdx.c | 13 +++++++++++++
>>   target/i386/kvm/tdx.h | 10 ++++++++++
>>   2 files changed, 23 insertions(+)
>>
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 255c47a2a553..56cb826f6125 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -21,8 +21,16 @@
>>   #include "kvm_i386.h"
>>   #include "tdx.h"
>>   
>> +static TdxGuest *tdx_guest;
>> +
>>   static struct kvm_tdx_capabilities *tdx_caps;
>>   
>> +/* It's valid after kvm_confidential_guest_init()->kvm_tdx_init() */
>> +bool is_tdx_vm(void)
>> +{
>> +    return !!tdx_guest;
>> +}
>> +
>>   enum tdx_ioctl_level{
>>       TDX_PLATFORM_IOCTL,
>>       TDX_VM_IOCTL,
>> @@ -109,10 +117,15 @@ static void get_tdx_capabilities(void)
>>   
>>   int tdx_kvm_init(MachineState *ms, Error **errp)
>>   {
>> +    TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
>> +                                                    TYPE_TDX_GUEST);
> 
> This method can return NULL.  Presumably tdx_kvm_init() should only
> be called if we already checked  ms->cgs == TYPE_TDX_GUEST. If so
> then use object_dynamic_cast_assert() instead.
> 

object_dynamic_cast_assert() is for OBJECT_CHECK() and INTERFACE_CHECK().

So I will use TDX_GUEST(OBJECT(ms->cgs)) (introduced in patch 2)
instead, which is a wrapper around OBJECT_CHECK().
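
The difference between the two styles of cast, as a toy model. ObjSketch and
both functions are invented for illustration and only mimic the idea of a
cast that returns NULL on mismatch versus one that aborts; they are not the
QOM implementation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy object carrying only its type name. */
typedef struct ObjSketch {
    const char *type_name;
} ObjSketch;

/* Returns obj if the type matches, NULL otherwise: the caller must check. */
static ObjSketch *dynamic_cast_sketch(ObjSketch *obj, const char *type_name)
{
    if (obj && strcmp(obj->type_name, type_name) == 0) {
        return obj;
    }
    return NULL;
}

/* Aborts on mismatch, so the result can be used without a NULL check. */
static ObjSketch *checked_cast_sketch(ObjSketch *obj, const char *type_name)
{
    ObjSketch *ret = dynamic_cast_sketch(obj, type_name);

    if (!ret) {
        abort();
    }
    return ret;
}
```

When the caller has already established the type, the checked form documents
that assumption and fails loudly if it is ever violated.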



* Re: [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  2023-08-22  7:31     ` Xiaoyao Li
@ 2023-08-22  8:19       ` Daniel P. Berrangé
  0 siblings, 0 replies; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-22  8:19 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Tue, Aug 22, 2023 at 03:31:44PM +0800, Xiaoyao Li wrote:
> On 8/21/2023 4:46 PM, Daniel P. Berrangé wrote:
> > On Fri, Aug 18, 2023 at 05:49:49AM -0400, Xiaoyao Li wrote:
> > > KVM provides TDX capabilities via sub command KVM_TDX_CAPABILITIES of
> > > IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the capabilities when initializing
> > > TDX context. It will be used to validate user's setting later.
> > > 
> > > Since there is no interface reporting how many cpuid configs contains in
> > > KVM_TDX_CAPABILITIES, QEMU chooses to try starting with a known number
> > > and abort when it exceeds KVM_MAX_CPUID_ENTRIES.
> > > 
> > > Besides, introduce the interfaces to invoke TDX "ioctls" at different
> > > scope (KVM, VM and VCPU) in preparation.
> > > 
> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > ---
> > > changes from v1:
> > >    - Make the error message more clear;
> > > 
> > > changes from RFC v4:
> > >    - start from nr_cpuid_configs = 6 for the loop;
> > >    - stop the loop when nr_cpuid_configs exceeds KVM_MAX_CPUID_ENTRIES;
> > > ---
> > >   target/i386/kvm/kvm.c      |  2 -
> > >   target/i386/kvm/kvm_i386.h |  2 +
> > >   target/i386/kvm/tdx.c      | 93 ++++++++++++++++++++++++++++++++++++++
> > >   3 files changed, 95 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> > > index d6b988d6c2d1..ec5c07bffd38 100644
> > > --- a/target/i386/kvm/kvm.c
> > > +++ b/target/i386/kvm/kvm.c
> > > @@ -1751,8 +1751,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
> > >   static Error *invtsc_mig_blocker;
> > > -#define KVM_MAX_CPUID_ENTRIES  100
> > > -
> > >   static void kvm_init_xsave(CPUX86State *env)
> > >   {
> > >       if (has_xsave2) {
> > > diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
> > > index ea3a5b174ac0..769eadbba56c 100644
> > > --- a/target/i386/kvm/kvm_i386.h
> > > +++ b/target/i386/kvm/kvm_i386.h
> > > @@ -13,6 +13,8 @@
> > >   #include "sysemu/kvm.h"
> > > +#define KVM_MAX_CPUID_ENTRIES  100
> > > +
> > >   #define kvm_apic_in_kernel() (kvm_irqchip_in_kernel())
> > >   #ifdef CONFIG_KVM
> > > diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> > > index 77e33ae01147..255c47a2a553 100644
> > > --- a/target/i386/kvm/tdx.c
> > > +++ b/target/i386/kvm/tdx.c
> > > @@ -12,14 +12,107 @@
> > >    */
> > >   #include "qemu/osdep.h"
> > > +#include "qemu/error-report.h"
> > >   #include "qapi/error.h"
> > >   #include "qom/object_interfaces.h"
> > > +#include "sysemu/kvm.h"
> > >   #include "hw/i386/x86.h"
> > > +#include "kvm_i386.h"
> > >   #include "tdx.h"
> > > +static struct kvm_tdx_capabilities *tdx_caps;
> > > +
> > > +enum tdx_ioctl_level{
> > > +    TDX_PLATFORM_IOCTL,
> > > +    TDX_VM_IOCTL,
> > > +    TDX_VCPU_IOCTL,
> > > +};
> > > +
> > > +static int __tdx_ioctl(void *state, enum tdx_ioctl_level level, int cmd_id,
> > > +                        __u32 flags, void *data)
> > 
> > Names with an initial double underscore are reserved for us by the
> > platform implementation, so shouldn't be used in userspace app
> > code.
> 
> How about tdx_ioctl_internal() ?

Sure, that's fine.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-22  6:52   ` Markus Armbruster
@ 2023-08-22  8:24     ` Daniel P. Berrangé
  2023-08-29  5:31       ` Chenyi Qiang
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-22  8:24 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> 
> > From: Isaku Yamahata <isaku.yamahata@intel.com>
> >
> > For GetQuote, delegate a request to Quote Generation Service.  Add property
> > of address of quote generation server and On request, connect to the
> > server, read request buffer from shared guest memory, send the request
> > buffer to the server and store the response into shared guest memory and
> > notify TD guest by interrupt.
> >
> > "quote-generation-service" is a property to specify Quote Generation
> > Service(QGS) in qemu socket address format.  The examples of the supported
> > format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
> >
> > command line example:
> >   qemu-system-x86_64 \
> >     -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
> >     -machine confidential-guest-support=tdx0
> >
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > ---
> >  qapi/qom.json         |   5 +-
> >  target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
> >  target/i386/kvm/tdx.h |   7 +
> >  3 files changed, 391 insertions(+), 1 deletion(-)
> >
> > diff --git a/qapi/qom.json b/qapi/qom.json
> > index 87c1d440f331..37139949d761 100644
> > --- a/qapi/qom.json
> > +++ b/qapi/qom.json
> > @@ -879,13 +879,16 @@
> >  #
> >  # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
> >  #
> > +# @quote-generation-service: socket address for Quote Generation Service(QGS)
> > +#
> >  # Since: 8.2
> >  ##
> >  { 'struct': 'TdxGuestProperties',
> >    'data': { '*sept-ve-disable': 'bool',
> >              '*mrconfigid': 'str',
> >              '*mrowner': 'str',
> > -            '*mrownerconfig': 'str' } }
> > +            '*mrownerconfig': 'str',
> > +            '*quote-generation-service': 'str' } }
> 
> Why not type SocketAddress?

Yes, the code uses SocketAddress internally when it eventually
calls qio_channel_socket_connect_async(), so we should directly
use SocketAddress in the QAPI from the start.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



* Re: [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2023-08-22  6:27     ` Markus Armbruster
@ 2023-08-22  8:39       ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-22  8:39 UTC (permalink / raw)
  To: Markus Armbruster, Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 8/22/2023 2:27 PM, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
>> On Fri, Aug 18, 2023 at 05:49:58AM -0400, Xiaoyao Li wrote:
>>> Bit 28 of TD attribute, named SEPT_VE_DISABLE. When set to 1, it disables
>>> EPT violation conversion to #VE on guest TD access of PENDING pages.
>>>
>>> Some guest OS (e.g., Linux TD guest) may require this bit as 1.
>>> Otherwise refuse to boot.
>>>
>>> Add sept-ve-disable property for tdx-guest object, for user to configure
>>> this bit.
>>>
>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>>> ---
>>>   qapi/qom.json         |  4 +++-
>>>   target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
>>>   2 files changed, 27 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>> index 2ca7ce7c0da5..cc08b9a98df9 100644
>>> --- a/qapi/qom.json
>>> +++ b/qapi/qom.json
>>> @@ -871,10 +871,12 @@
>>>   #
>>>   # Properties for tdx-guest objects.
>>>   #
>>> +# @sept-ve-disable: bit 28 of TD attributes (default: 0)
>>
>> This description isn't very useful as it forces the user to go off and
>> read the TDX specification to find out what bit 28 means. You've got a
> 
> Seconded.
> 
>> more useful description in the commit message, so please use that
>> in the docs too. eg something like this
>>
>>    @sept-ve-disable: toggle bit 28 of TD attributes to control disabling
>>                      of EPT violation conversion to #VE on guest
>>                      TD access of PENDING pages. Some guest OS (e.g.
>>                      Linux TD guest) may require this set, otherwise
>>                      they refuse to boot.
> 
> But please format like
> 
> # @sept-ve-disable: toggle bit 28 of TD attributes to control disabling
> #     of EPT violation conversion to #VE on guest TD access of PENDING
> #     pages.  Some guest OS (e.g. Linux TD guest) may require this to
> #     be set, otherwise they refuse to boot.
>

Thank you, Daniel and Markus.

Will use the above in the next version.

> to blend in with recent commit a937b6aa739 (qapi: Reformat doc comments
> to conform to current conventions).
>
>>> +#
>>>   # Since: 8.2
>>>   ##
>>>   { 'struct': 'TdxGuestProperties',
>>> -  'data': { }}
>>> +  'data': { '*sept-ve-disable': 'bool' } }
>>>   
>>>   ##
>>>   # @ThreadContextProperties:
> 
> [...]
> 



* Re: [PATCH v2 18/58] i386/tdx: Validate TD attributes
  2023-08-21  9:16   ` Daniel P. Berrangé
@ 2023-08-22 14:21     ` Xiaoyao Li
  2023-08-22 14:30     ` Xiaoyao Li
  1 sibling, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-22 14:21 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/21/2023 5:16 PM, Daniel P. Berrangé wrote:
> On Fri, Aug 18, 2023 at 05:50:01AM -0400, Xiaoyao Li wrote:
>> Validate TD attributes with tdx_caps that fixed-0 bits must be zero and
>> fixed-1 bits must be set.
>>
>> Besides, sanity check the attribute bits that have not been supported by
>> QEMU yet. e.g., debug bit, it will be allowed in the future when debug
>> TD support lands in QEMU.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>> ---
>>   target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
>>   1 file changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 629abd267da8..73da15377ec3 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -32,6 +32,7 @@
>>                                        (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>>                                        (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>>   
>> +#define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
>>   #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
>>   #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
>>   #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
>> @@ -462,13 +463,32 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>>       return 0;
>>   }
>>   
>> -static void setup_td_guest_attributes(X86CPU *x86cpu)
>> +static int tdx_validate_attributes(TdxGuest *tdx)
>> +{
>> +    if (((tdx->attributes & tdx_caps->attrs_fixed0) | tdx_caps->attrs_fixed1) !=
>> +        tdx->attributes) {
>> +            error_report("Invalid attributes 0x%lx for TDX VM (fixed0 0x%llx, fixed1 0x%llx)",
>> +                          tdx->attributes, tdx_caps->attrs_fixed0, tdx_caps->attrs_fixed1);
>> +            return -EINVAL;
>> +    }
>> +
>> +    if (tdx->attributes & TDX_TD_ATTRIBUTES_DEBUG) {
>> +        error_report("Current QEMU doesn't support attributes.debug[bit 0] for TDX VM");
>> +        return -EINVAL;
>> +    }
> 
> Use error_setg() in both cases, passing in a 'Error **errp' object,
> and 'return -1' instead of returning an errno value.

Will do it in the next version.

thanks!
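
For reference, the fixed-bit check in the quoted hunk can be tried in isolation. This is only a sketch under an assumption inferred from the expression itself: a clear bit in fixed0 means the attribute bit must be 0, and a set bit in fixed1 means the attribute bit must be 1. The function name td_attrs_fixed_bits_ok() is made up for illustration and is not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed semantics (inferred from the quoted check, not from the spec):
 *   - a bit that is clear in fixed0 must be 0 in attrs
 *   - a bit that is set in fixed1 must be 1 in attrs
 * Masking with fixed0 drops forbidden bits and OR-ing fixed1 forces the
 * mandatory ones; attrs is valid only if that leaves it unchanged.
 */
static bool td_attrs_fixed_bits_ok(uint64_t attrs, uint64_t fixed0,
                                   uint64_t fixed1)
{
    return ((attrs & fixed0) | fixed1) == attrs;
}
```

With fixed0 = 0x50000000 and fixed1 = 0, an attributes value of 0x10000000 (bit 28) passes, while 0x1 (bit 0) is rejected because bit 0 is clear in fixed0.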

>> +
>> +    return 0;
>> +}
>> +
>> +static int setup_td_guest_attributes(X86CPU *x86cpu)
>>   {
>>       CPUX86State *env = &x86cpu->env;
>>   
>>       tdx_guest->attributes |= (env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS) ?
>>                                TDX_TD_ATTRIBUTES_PKS : 0;
>>       tdx_guest->attributes |= x86cpu->enable_pmu ? TDX_TD_ATTRIBUTES_PERFMON : 0;
>> +
>> +    return tdx_validate_attributes(tdx_guest);
> 
> Pass along "errp" into this
> 
>>   }
>>   
>>   int tdx_pre_create_vcpu(CPUState *cpu)
>> @@ -493,7 +513,10 @@ int tdx_pre_create_vcpu(CPUState *cpu)
> 
> In an earlier patch I suggested adding 'Error **errp' to this method...
> 
>>           goto out_free;
>>       }
>>   
>> -    setup_td_guest_attributes(x86cpu);
>> +    r = setup_td_guest_attributes(x86cpu);
> 
> ...it can also be passed into this method
> 
>> +    if (r) {
>> +        goto out;
>> +    }
>>   
>>       init_vm->cpuid.nent = kvm_x86_arch_cpuid(env, init_vm->cpuid.entries, 0);
>>       init_vm->attributes = tdx_guest->attributes;
>> -- 
>> 2.34.1
>>
> 
> With regards,
> Daniel



* Re: [PATCH v2 18/58] i386/tdx: Validate TD attributes
  2023-08-21  9:16   ` Daniel P. Berrangé
  2023-08-22 14:21     ` Xiaoyao Li
@ 2023-08-22 14:30     ` Xiaoyao Li
  2023-08-22 14:42       ` Daniel P. Berrangé
  1 sibling, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-22 14:30 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/21/2023 5:16 PM, Daniel P. Berrangé wrote:
> On Fri, Aug 18, 2023 at 05:50:01AM -0400, Xiaoyao Li wrote:
>> Validate TD attributes with tdx_caps that fixed-0 bits must be zero and
>> fixed-1 bits must be set.
>>
>> Besides, sanity check the attribute bits that have not been supported by
>> QEMU yet. e.g., debug bit, it will be allowed in the future when debug
>> TD support lands in QEMU.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>> ---
>>   target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
>>   1 file changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 629abd267da8..73da15377ec3 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -32,6 +32,7 @@
>>                                        (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>>                                        (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>>   
>> +#define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
>>   #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
>>   #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
>>   #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
>> @@ -462,13 +463,32 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>>       return 0;
>>   }
>>   
>> -static void setup_td_guest_attributes(X86CPU *x86cpu)
>> +static int tdx_validate_attributes(TdxGuest *tdx)
>> +{
>> +    if (((tdx->attributes & tdx_caps->attrs_fixed0) | tdx_caps->attrs_fixed1) !=
>> +        tdx->attributes) {
>> +            error_report("Invalid attributes 0x%lx for TDX VM (fixed0 0x%llx, fixed1 0x%llx)",
>> +                          tdx->attributes, tdx_caps->attrs_fixed0, tdx_caps->attrs_fixed1);
>> +            return -EINVAL;
>> +    }
>> +
>> +    if (tdx->attributes & TDX_TD_ATTRIBUTES_DEBUG) {
>> +        error_report("Current QEMU doesn't support attributes.debug[bit 0] for TDX VM");
>> +        return -EINVAL;
>> +    }
> 
> Use error_setg() in both cases, passing in a 'Error **errp' object,
> and 'return -1' instead of returning an errno value.
> 

Why return -1 instead of -EINVAL?



* Re: [PATCH v2 18/58] i386/tdx: Validate TD attributes
  2023-08-22 14:30     ` Xiaoyao Li
@ 2023-08-22 14:42       ` Daniel P. Berrangé
  2023-08-23  7:31         ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-22 14:42 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Tue, Aug 22, 2023 at 10:30:47PM +0800, Xiaoyao Li wrote:
> On 8/21/2023 5:16 PM, Daniel P. Berrangé wrote:
> > On Fri, Aug 18, 2023 at 05:50:01AM -0400, Xiaoyao Li wrote:
> > > Validate TD attributes with tdx_caps that fixed-0 bits must be zero and
> > > fixed-1 bits must be set.
> > > 
> > > Besides, sanity check the attribute bits that have not been supported by
> > > QEMU yet. e.g., debug bit, it will be allowed in the future when debug
> > > TD support lands in QEMU.
> > > 
> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > ---
> > >   target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
> > >   1 file changed, 25 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> > > index 629abd267da8..73da15377ec3 100644
> > > --- a/target/i386/kvm/tdx.c
> > > +++ b/target/i386/kvm/tdx.c
> > > @@ -32,6 +32,7 @@
> > >                                        (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
> > >                                        (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
> > > +#define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
> > >   #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
> > >   #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
> > >   #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
> > > @@ -462,13 +463,32 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
> > >       return 0;
> > >   }
> > > -static void setup_td_guest_attributes(X86CPU *x86cpu)
> > > +static int tdx_validate_attributes(TdxGuest *tdx)
> > > +{
> > > +    if (((tdx->attributes & tdx_caps->attrs_fixed0) | tdx_caps->attrs_fixed1) !=
> > > +        tdx->attributes) {
> > > +            error_report("Invalid attributes 0x%lx for TDX VM (fixed0 0x%llx, fixed1 0x%llx)",
> > > +                          tdx->attributes, tdx_caps->attrs_fixed0, tdx_caps->attrs_fixed1);
> > > +            return -EINVAL;
> > > +    }
> > > +
> > > +    if (tdx->attributes & TDX_TD_ATTRIBUTES_DEBUG) {
> > > +        error_report("Current QEMU doesn't support attributes.debug[bit 0] for TDX VM");
> > > +        return -EINVAL;
> > > +    }
> > 
> > Use error_setg() in both cases, passing in a 'Error **errp' object,
> > and 'return -1' instead of returning an errno value.
> > 
> 
> why return -1 instead of -EINVAL?

Returning errno values is useful if the method isn't providing an
"Error **errp" parameter, because it lets the caller report a
more detailed error message via strerror(). Once you add an Error **
parameter though, there is almost never any reason for the caller
to care about the original errno value, and so we use 0 / -1 as
success/fail indicators.
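
A minimal, self-contained sketch of that convention follows. It does not use QEMU's real Error API: DemoError, demo_error_setg() and demo_validate_attributes() are simplified stand-ins invented for this example, so that it compiles outside the QEMU tree. The point is only the shape: the detail travels in the error object, and the return value is a plain 0 / -1.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for QEMU's Error object; NOT the real API. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Stand-in for error_setg(): allocate the error and format the message. */
static void demo_error_setg(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;                 /* caller does not want the details */
    }
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        return;
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

#define DEMO_ATTR_DEBUG (1ULL << 0)     /* bit 0, as in the quoted patch */

/* Returns 0 on success and -1 on failure; no errno value is encoded. */
static int demo_validate_attributes(uint64_t attrs, DemoError **errp)
{
    if (attrs & DEMO_ATTR_DEBUG) {
        demo_error_setg(errp, "attributes.debug[bit 0] is not supported");
        return -1;
    }
    return 0;
}
```

A caller only tests the return value for failure and, if it cares, reads the message from the error object; nothing is left for strerror() to decode.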

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



* Re: [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM
  2023-08-21  9:38   ` Daniel P. Berrangé
@ 2023-08-22 15:39     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-22 15:39 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/21/2023 5:38 PM, Daniel P. Berrangé wrote:
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index bb806736b4ff..ed617ebab266 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> +static int tdx_accept_ram_range(uint64_t address, uint64_t length)
>> +{
>> +    uint64_t head_start, tail_start, head_length, tail_length;
>> +    uint64_t tmp_address, tmp_length;
>> +    TdxRamEntry *e;
>> +    int i;
>> +
>> +    for (i = 0; i < tdx_guest->nr_ram_entries; i++) {
>> +        e = &tdx_guest->ram_entries[i];
>> +
>> +        if (address + length <= e->address ||
>> +            e->address + e->length <= address) {
>> +                continue;
> Indented too far
> 

Fixed.

Thanks!
-Xiaoyao
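
As an aside, the condition in the quoted hunk is the usual "no overlap" test for two half-open ranges. A small sketch of the inverse, assuming non-zero lengths and no 64-bit wraparound; ranges_overlap() is a hypothetical helper and not something the patch defines:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when [a, a + a_len) and [b, b + b_len) share at least one byte.
 * Assumes non-zero lengths and that neither sum wraps around.
 */
static bool ranges_overlap(uint64_t a, uint64_t a_len,
                           uint64_t b, uint64_t b_len)
{
    /* Disjoint if one range ends at or before the start of the other. */
    return !(a + a_len <= b || b + b_len <= a);
}
```

Two ranges that merely touch, such as [0, 0x1000) and [0x1000, 0x2000), count as disjoint, which matches the "continue" branch in the quoted loop.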


* Re: [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM
  2023-08-21 23:40   ` Isaku Yamahata
@ 2023-08-22 15:45     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-22 15:45 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	erdemaktas, Chenyi Qiang

On 8/22/2023 7:40 AM, Isaku Yamahata wrote:
> On Fri, Aug 18, 2023 at 05:50:15AM -0400,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
>> index e9d2888162ce..9b3c427766ef 100644
>> --- a/target/i386/kvm/tdx.h
>> +++ b/target/i386/kvm/tdx.h
>> @@ -15,6 +15,17 @@ typedef struct TdxGuestClass {
>>       ConfidentialGuestSupportClass parent_class;
>>   } TdxGuestClass;
>>   
>> +enum TdxRamType{
>> +    TDX_RAM_UNACCEPTED,
>> +    TDX_RAM_ADDED,
>> +};
>> +
>> +typedef struct TdxRamEntry {
>> +    uint64_t address;
>> +    uint64_t length;
>> +    uint32_t type;
> 
> nitpick: enum TdxRamType. and related function arguments.
> 

Will do it.

Thanks!
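
One possible shape of that change, as a sketch only: the enum and struct are copied from the quoted hunk with the field type switched, while tdx_ram_entry_init() is a hypothetical helper added here just to show the enum in a function signature.

```c
#include <stdint.h>

enum TdxRamType {
    TDX_RAM_UNACCEPTED,
    TDX_RAM_ADDED,
};

typedef struct TdxRamEntry {
    uint64_t address;
    uint64_t length;
    enum TdxRamType type;   /* was uint32_t in the quoted patch */
} TdxRamEntry;

/* Hypothetical helper: takes the enum instead of a bare integer. */
static void tdx_ram_entry_init(TdxRamEntry *e, uint64_t address,
                               uint64_t length, enum TdxRamType type)
{
    e->address = address;
    e->length = length;
    e->type = type;
}
```

Using the enum type in both the struct and the function arguments lets the compiler and debuggers show the symbolic names rather than raw integers.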



* Re: [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions
  2023-08-21 23:00   ` Isaku Yamahata
@ 2023-08-23  3:59     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-23  3:59 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	erdemaktas, Chenyi Qiang

On 8/22/2023 7:00 AM, Isaku Yamahata wrote:
> On Fri, Aug 18, 2023 at 05:49:51AM -0400,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 56cb826f6125..3198bc9fd5fb 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
> ...
>> +static inline uint32_t host_cpuid_reg(uint32_t function,
>> +                                      uint32_t index, int reg)
>> +{
>> +    uint32_t eax, ebx, ecx, edx;
>> +    uint32_t ret = 0;
>> +
>> +    host_cpuid(function, index, &eax, &ebx, &ecx, &edx);
>> +
>> +    switch (reg) {
>> +    case R_EAX:
>> +        ret |= eax;
>> +        break;
>> +    case R_EBX:
>> +        ret |= ebx;
>> +        break;
>> +    case R_ECX:
>> +        ret |= ecx;
>> +        break;
>> +    case R_EDX:
>> +        ret |= edx;
> 
> Nitpick: "|" isn't needed as we initialize ret = 0 above. Just '='.

Will fix it in the next version.

thanks!
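
A sketch of the register selection with plain assignment, for illustration. To keep it buildable outside QEMU, the host_cpuid() call is left out and the four register values are passed in directly; the DEMO_R_* indices and demo_select_cpuid_reg() are stand-ins invented for this example, not the names used in the patch.

```c
#include <stdint.h>

/* Local stand-ins for the register indices used in the quoted switch. */
enum demo_reg { DEMO_R_EAX, DEMO_R_EBX, DEMO_R_ECX, DEMO_R_EDX };

static uint32_t demo_select_cpuid_reg(int reg, uint32_t eax, uint32_t ebx,
                                      uint32_t ecx, uint32_t edx)
{
    uint32_t ret = 0;   /* returned unchanged for an unknown index */

    switch (reg) {
    case DEMO_R_EAX:
        ret = eax;      /* plain '=' suffices since ret starts at 0 */
        break;
    case DEMO_R_EBX:
        ret = ebx;
        break;
    case DEMO_R_ECX:
        ret = ecx;
        break;
    case DEMO_R_EDX:
        ret = edx;
        break;
    }

    return ret;
}
```

Because exactly one case runs and ret is zero beforehand, '|=' and '=' give the same result here; '=' just states the intent more directly.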


* Re: [PATCH v2 02/58] i386: Introduce tdx-guest object
  2023-08-22  6:22   ` Markus Armbruster
@ 2023-08-23  7:27     ` Xiaoyao Li
  2023-08-23 11:14       ` Markus Armbruster
  0 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-23  7:27 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 8/22/2023 2:22 PM, Markus Armbruster wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> 
>> Introduce tdx-guest object which implements the interface of
>> CONFIDENTIAL_GUEST_SUPPORT, and will be used to create TDX VMs (TDs) by
>>
>>    qemu -machine ...,confidential-guest-support=tdx0	\
>>         -object tdx-guset,id=tdx0
> 
> Typo: tdx-guest

Will fix it.

>> It has only one property 'attributes' with fixed value 0 and not
>> configurable so far.
> 
> This must refer to TdxGuest member @attributes.
> 
> "Property" suggests QOM property, which @attributes isn't, at least not
> in this patch.  Will it become a QOM property later in this series?

At least not in this series. Maybe in the future there will be a request to
configure the whole attributes directly via a QOM property, but there is
none for now.

I will change the description of it to avoid confusion.

> Hmm, @attributes appears to remain unused until PATCH 14.  Recommend to
> delay its addition until then.

IMHO, it's not suitable to introduce it in patch 14. Using a separate
patch seems unnecessary. I'll leave it in this patch unless there is a
strong objection to it.

>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>> ---
>> changes from RFC-V4
>> - make @attributes not user-settable
>> ---
>>   configs/devices/i386-softmmu/default.mak |  1 +
>>   hw/i386/Kconfig                          |  5 +++
>>   qapi/qom.json                            | 12 +++++++
>>   target/i386/kvm/meson.build              |  2 ++
>>   target/i386/kvm/tdx.c                    | 40 ++++++++++++++++++++++++
>>   target/i386/kvm/tdx.h                    | 19 +++++++++++
>>   6 files changed, 79 insertions(+)
>>   create mode 100644 target/i386/kvm/tdx.c
>>   create mode 100644 target/i386/kvm/tdx.h
>>
>> diff --git a/configs/devices/i386-softmmu/default.mak b/configs/devices/i386-softmmu/default.mak
>> index 598c6646dfc0..9b5ec59d65b0 100644
>> --- a/configs/devices/i386-softmmu/default.mak
>> +++ b/configs/devices/i386-softmmu/default.mak
>> @@ -18,6 +18,7 @@
>>   #CONFIG_QXL=n
>>   #CONFIG_SEV=n
>>   #CONFIG_SGA=n
>> +#CONFIG_TDX=n
>>   #CONFIG_TEST_DEVICES=n
>>   #CONFIG_TPM_CRB=n
>>   #CONFIG_TPM_TIS_ISA=n
>> diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
>> index 9051083c1e78..929f6c3f0e85 100644
>> --- a/hw/i386/Kconfig
>> +++ b/hw/i386/Kconfig
>> @@ -10,6 +10,10 @@ config SGX
>>       bool
>>       depends on KVM
>>   
>> +config TDX
>> +    bool
>> +    depends on KVM
>> +
>>   config PC
>>       bool
>>       imply APPLESMC
>> @@ -26,6 +30,7 @@ config PC
>>       imply QXL
>>       imply SEV
>>       imply SGX
>> +    imply TDX
>>       imply TEST_DEVICES
>>       imply TPM_CRB
>>       imply TPM_TIS_ISA
>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index e0b2044e3d20..2ca7ce7c0da5 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -866,6 +866,16 @@
>>               'reduced-phys-bits': 'uint32',
>>               '*kernel-hashes': 'bool' } }
>>   
>> +##
>> +# @TdxGuestProperties:
>> +#
>> +# Properties for tdx-guest objects.
>> +#
>> +# Since: 8.2
>> +##
>> +{ 'struct': 'TdxGuestProperties',
>> +  'data': { }}
>> +
>>   ##
>>   # @ThreadContextProperties:
>>   #
>> @@ -944,6 +954,7 @@
>>       'sev-guest',
>>       'thread-context',
>>       's390-pv-guest',
>> +    'tdx-guest',
>>       'throttle-group',
>>       'tls-creds-anon',
>>       'tls-creds-psk',
>> @@ -1010,6 +1021,7 @@
>>         'secret_keyring':             { 'type': 'SecretKeyringProperties',
>>                                         'if': 'CONFIG_SECRET_KEYRING' },
>>         'sev-guest':                  'SevGuestProperties',
>> +      'tdx-guest':                  'TdxGuestProperties',
>>         'thread-context':             'ThreadContextProperties',
>>         'throttle-group':             'ThrottleGroupProperties',
>>         'tls-creds-anon':             'TlsCredsAnonProperties',
> 
> Actually useful only when CONFIG_TDX is on, but can't make it
> conditional here, as CONFIG_TDX is poisoned.

In fact, I just followed what SEV did.

To me, it looks OK to make it conditional on CONFIG_TDX. Could you
please elaborate on "but can't make it conditional here, as CONFIG_TDX is
poisoned"?


>> diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
>> index 40fbde96cac6..21ab03fe1349 100644
>> --- a/target/i386/kvm/meson.build
>> +++ b/target/i386/kvm/meson.build
>> @@ -11,6 +11,8 @@ i386_softmmu_kvm_ss.add(when: 'CONFIG_XEN_EMU', if_true: files('xen-emu.c'))
>>   
>>   i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
>>   
>> +i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
>> +
>>   i386_system_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
>>   
>>   i386_system_ss.add_all(when: 'CONFIG_KVM', if_true: i386_softmmu_kvm_ss)
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> new file mode 100644
>> index 000000000000..d3792d4a3d56
>> --- /dev/null
>> +++ b/target/i386/kvm/tdx.c
>> @@ -0,0 +1,40 @@
>> +/*
>> + * QEMU TDX support
>> + *
>> + * Copyright Intel
>> + *
>> + * Author:
>> + *      Xiaoyao Li <xiaoyao.li@intel.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory
>> + *
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qom/object_interfaces.h"
>> +
>> +#include "tdx.h"
>> +
>> +/* tdx guest */
>> +OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
>> +                                   tdx_guest,
>> +                                   TDX_GUEST,
>> +                                   CONFIDENTIAL_GUEST_SUPPORT,
>> +                                   { TYPE_USER_CREATABLE },
>> +                                   { NULL })
>> +
>> +static void tdx_guest_init(Object *obj)
>> +{
>> +    TdxGuest *tdx = TDX_GUEST(obj);
>> +
>> +    tdx->attributes = 0;
>> +}
>> +
>> +static void tdx_guest_finalize(Object *obj)
>> +{
>> +}
>> +
>> +static void tdx_guest_class_init(ObjectClass *oc, void *data)
>> +{
>> +}
>> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
>> new file mode 100644
>> index 000000000000..415aeb5af746
>> --- /dev/null
>> +++ b/target/i386/kvm/tdx.h
>> @@ -0,0 +1,19 @@
>> +#ifndef QEMU_I386_TDX_H
>> +#define QEMU_I386_TDX_H
>> +
>> +#include "exec/confidential-guest-support.h"
>> +
>> +#define TYPE_TDX_GUEST "tdx-guest"
>> +#define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
>> +
>> +typedef struct TdxGuestClass {
>> +    ConfidentialGuestSupportClass parent_class;
>> +} TdxGuestClass;
>> +
>> +typedef struct TdxGuest {
>> +    ConfidentialGuestSupport parent_obj;
>> +
>> +    uint64_t attributes;    /* TD attributes */
>> +} TdxGuest;
>> +
>> +#endif /* QEMU_I386_TDX_H */
> 
> QAPI schema
> Acked-by: Markus Armbruster <armbru@redhat.com>

Thank you!




* Re: [PATCH v2 18/58] i386/tdx: Validate TD attributes
  2023-08-22 14:42       ` Daniel P. Berrangé
@ 2023-08-23  7:31         ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-23  7:31 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/22/2023 10:42 PM, Daniel P. Berrangé wrote:
> On Tue, Aug 22, 2023 at 10:30:47PM +0800, Xiaoyao Li wrote:
>> On 8/21/2023 5:16 PM, Daniel P. Berrangé wrote:
>>> On Fri, Aug 18, 2023 at 05:50:01AM -0400, Xiaoyao Li wrote:
>>>> Validate TD attributes with tdx_caps that fixed-0 bits must be zero and
>>>> fixed-1 bits must be set.
>>>>
>>>> Besides, sanity check the attribute bits that have not been supported by
>>>> QEMU yet. e.g., debug bit, it will be allowed in the future when debug
>>>> TD support lands in QEMU.
>>>>
>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>>>> ---
>>>>    target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
>>>>    1 file changed, 25 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>>>> index 629abd267da8..73da15377ec3 100644
>>>> --- a/target/i386/kvm/tdx.c
>>>> +++ b/target/i386/kvm/tdx.c
>>>> @@ -32,6 +32,7 @@
>>>>                                         (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>>>>                                         (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>>>> +#define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
>>>>    #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
>>>>    #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
>>>>    #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
>>>> @@ -462,13 +463,32 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>>>>        return 0;
>>>>    }
>>>> -static void setup_td_guest_attributes(X86CPU *x86cpu)
>>>> +static int tdx_validate_attributes(TdxGuest *tdx)
>>>> +{
>>>> +    if (((tdx->attributes & tdx_caps->attrs_fixed0) | tdx_caps->attrs_fixed1) !=
>>>> +        tdx->attributes) {
>>>> +            error_report("Invalid attributes 0x%lx for TDX VM (fixed0 0x%llx, fixed1 0x%llx)",
>>>> +                          tdx->attributes, tdx_caps->attrs_fixed0, tdx_caps->attrs_fixed1);
>>>> +            return -EINVAL;
>>>> +    }
>>>> +
>>>> +    if (tdx->attributes & TDX_TD_ATTRIBUTES_DEBUG) {
>>>> +        error_report("Current QEMU doesn't support attributes.debug[bit 0] for TDX VM");
>>>> +        return -EINVAL;
>>>> +    }
>>>
>>> Use error_setg() in both cases, passing in a 'Error **errp' object,
>>> and 'return -1' instead of returning an errno value.
>>>
>>
>> why return -1 instead of -EINVAL?
> 
> Returning errno values is useful if the method isn't providing an
> "Error **errp" parameter, because it lets the caller report a
> more detailed error message via strerror(). Once you add a Error **
> parameter though, there is almost never any reason for the caller
> to care about the original errno value, and so we use 0 / -1 as
> success/fail indicators.

I see.

Thanks,
-Xiaoyao

> With regards,
> Daniel



* Re: [PATCH v2 02/58] i386: Introduce tdx-guest object
  2023-08-23  7:27     ` Xiaoyao Li
@ 2023-08-23 11:14       ` Markus Armbruster
  0 siblings, 0 replies; 120+ messages in thread
From: Markus Armbruster @ 2023-08-23 11:14 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

Xiaoyao Li <xiaoyao.li@intel.com> writes:

> On 8/22/2023 2:22 PM, Markus Armbruster wrote:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> 
>>> Introduce tdx-guest object which implements the interface of
>>> CONFIDENTIAL_GUEST_SUPPORT, and will be used to create TDX VMs (TDs) by
>>>
>>>    qemu -machine ...,confidential-guest-support=tdx0	\
>>>         -object tdx-guset,id=tdx0
>>
>> Typo: tdx-guest
>
> Will fix it.
>
>>> It has only one property 'attributes' with fixed value 0 and not
>>> configurable so far.
>>
>> This must refer to TdxGuest member @attributes.
>> "Property" suggests QOM property, which @attributes isn't, at least not
>> in this patch.  Will it become a QOM property later in this series?
>
> At least not in this series. Maybe in the future there is request to directly configure the whole attributes via QOM property, but none from now.
>
> I will change the description of it to avoid confusion.
>
>> Hmm, @attributes appears to remain unused until PATCH 14.  Recommend to
>> delay its addition until then.
>
> IMHO, it's not suitable to introduce it in patch 14. Using a separate patch seems unnecessary. I'll leave it in this patch unless strong objection on it.

Not worth arguing about.

>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>>> ---
>>> changes from RFC-V4
>>> - make @attributes not user-settable
>>> ---
>>>   configs/devices/i386-softmmu/default.mak |  1 +
>>>   hw/i386/Kconfig                          |  5 +++
>>>   qapi/qom.json                            | 12 +++++++
>>>   target/i386/kvm/meson.build              |  2 ++
>>>   target/i386/kvm/tdx.c                    | 40 ++++++++++++++++++++++++
>>>   target/i386/kvm/tdx.h                    | 19 +++++++++++
>>>   6 files changed, 79 insertions(+)
>>>   create mode 100644 target/i386/kvm/tdx.c
>>>   create mode 100644 target/i386/kvm/tdx.h
>>>
>>> diff --git a/configs/devices/i386-softmmu/default.mak b/configs/devices/i386-softmmu/default.mak
>>> index 598c6646dfc0..9b5ec59d65b0 100644
>>> --- a/configs/devices/i386-softmmu/default.mak
>>> +++ b/configs/devices/i386-softmmu/default.mak
>>> @@ -18,6 +18,7 @@
>>>   #CONFIG_QXL=n
>>>   #CONFIG_SEV=n
>>>   #CONFIG_SGA=n
>>> +#CONFIG_TDX=n
>>>   #CONFIG_TEST_DEVICES=n
>>>   #CONFIG_TPM_CRB=n
>>>   #CONFIG_TPM_TIS_ISA=n
>>> diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
>>> index 9051083c1e78..929f6c3f0e85 100644
>>> --- a/hw/i386/Kconfig
>>> +++ b/hw/i386/Kconfig
>>> @@ -10,6 +10,10 @@ config SGX
>>>       bool
>>>       depends on KVM
>>>   +config TDX
>>> +    bool
>>> +    depends on KVM
>>> +
>>>   config PC
>>>       bool
>>>       imply APPLESMC
>>> @@ -26,6 +30,7 @@ config PC
>>>       imply QXL
>>>       imply SEV
>>>       imply SGX
>>> +    imply TDX
>>>       imply TEST_DEVICES
>>>       imply TPM_CRB
>>>       imply TPM_TIS_ISA
>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>> index e0b2044e3d20..2ca7ce7c0da5 100644
>>> --- a/qapi/qom.json
>>> +++ b/qapi/qom.json
>>> @@ -866,6 +866,16 @@
>>>               'reduced-phys-bits': 'uint32',
>>>               '*kernel-hashes': 'bool' } }
>>>   +##
>>> +# @TdxGuestProperties:
>>> +#
>>> +# Properties for tdx-guest objects.
>>> +#
>>> +# Since: 8.2
>>> +##
>>> +{ 'struct': 'TdxGuestProperties',
>>> +  'data': { }}
>>> +
>>>   ##
>>>   # @ThreadContextProperties:
>>>   #
>>> @@ -944,6 +954,7 @@
>>>       'sev-guest',
>>>       'thread-context',
>>>       's390-pv-guest',
>>> +    'tdx-guest',
>>>       'throttle-group',
>>>       'tls-creds-anon',
>>>       'tls-creds-psk',
>>> @@ -1010,6 +1021,7 @@
>>>         'secret_keyring':             { 'type': 'SecretKeyringProperties',
>>>                                         'if': 'CONFIG_SECRET_KEYRING' },
>>>         'sev-guest':                  'SevGuestProperties',
>>> +      'tdx-guest':                  'TdxGuestProperties',
>>>         'thread-context':             'ThreadContextProperties',
>>>         'throttle-group':             'ThrottleGroupProperties',
>>>         'tls-creds-anon':             'TlsCredsAnonProperties',
>>
>> Actually useful only when CONFIG_TDX is on, but can't make it
>> conditional here, as CONFIG_TDX is poisoned.
>
> In fact, I just followed what SEV did.

Yup.

> To me, it looks OK to make it conditional on CONFIG_TDX. Could you please elaborate "but can't make it conditional here, as CONFIG_TDX is poisoned." ?

CONFIG_TDX is one of the macros that can only be used in
target-dependent code.  Enforced by config-poison.h's

    #pragma GCC poison CONFIG_TDX

The code generated from qom.json is target-independent.

To use 'if': 'CONFIG_TDX', we'd have to move the definition to a
target-dependent QAPI module, say qom-machine.json.  Sadly, that's more
trouble than it's worth.

[...]




* Re: [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc...
  2023-08-18  9:50 ` [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc Xiaoyao Li
@ 2023-08-23 19:41   ` Isaku Yamahata
  2023-08-24  7:50     ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Isaku Yamahata @ 2023-08-23 19:41 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang, isaku.yamahata,
	isaku.yamahata

On Fri, Aug 18, 2023 at 05:50:16AM -0400,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> Add UEFI definitions for literals, enums, structs, GUIDs, etc... that
> will be used by TDX to build the UEFI Hand-Off Block (HOB) that is passed
> to the Trusted Domain Virtual Firmware (TDVF).
> 
> All values come from the UEFI specification and TDVF design guide. [1]
> 
> Note, EFI_RESOURCE_MEMORY_UNACCEPTED will be added in future UEFI spec.
> 
> [1] https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf

Nitpick: The specs [1] [2] include unaccepted memory.

[1] UEFI Specification Version 2.10 (released August 2022)
[2] UEFI Platform Initialization Distribution Packaging Specification Version 1.1
-- 
Isaku Yamahata <isaku.yamahata@linux.intel.com>


* Re: [PATCH v2 42/58] i386/tdx: register the fd read callback with the main loop to read the quote data
  2023-08-18  9:50 ` [PATCH v2 42/58] i386/tdx: register the fd read callback with the main loop to read the quote data Xiaoyao Li
@ 2023-08-24  6:27   ` Chenyi Qiang
  0 siblings, 0 replies; 120+ messages in thread
From: Chenyi Qiang @ 2023-08-24  6:27 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas



On 8/18/2023 5:50 PM, Xiaoyao Li wrote:
> From: Chenyi Qiang <chenyi.qiang@intel.com>
> 
> When TD guest invokes getquote tdvmcall, QEMU will register a async qio
> task with default context when the qio channel is connected. However, as
> there is a blocking action (recvmsg()) in qio_channel_read() and it will
> block main thread and make TD guest have no response until the server
> returns.
> 
> Set the io channel non-blocking and register the socket fd with the main
> loop. Move the read operation into the callback. When the fd is readable,
> inovke the callback to handle the quote data.
> 
> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  target/i386/kvm/tdx.c | 147 +++++++++++++++++++++++++++---------------
>  1 file changed, 96 insertions(+), 51 deletions(-)
> 

How about squashing this patch into the previous one? I think this patch
is essentially a bug fix for it, resolving the thread-blocking issue.


* Re: [PATCH v2 43/58] i386/tdx: setup a timer for the qio channel
  2023-08-18  9:50 ` [PATCH v2 43/58] i386/tdx: setup a timer for the qio channel Xiaoyao Li
@ 2023-08-24  7:21   ` Chenyi Qiang
  2023-08-24  8:34     ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Chenyi Qiang @ 2023-08-24  7:21 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas



On 8/18/2023 5:50 PM, Xiaoyao Li wrote:
> From: Chenyi Qiang <chenyi.qiang@intel.com>
> 
> To avoid no response from QGS server, setup a timer for the transaction. If
> timeout, make it an error and interrupt guest. Define the threshold of time
> to 30s at present, maybe change to other value if not appropriate.
> 
> Extract the common cleanup code to make it more clear.
> 
> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  target/i386/kvm/tdx.c | 151 ++++++++++++++++++++++++------------------
>  1 file changed, 85 insertions(+), 66 deletions(-)
> 
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 3cb2163a0335..fa658ce1f2e4 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -1002,6 +1002,7 @@ struct tdx_get_quote_task {
>      struct tdx_get_quote_header hdr;
>      int event_notify_interrupt;
>      QIOChannelSocket *ioc;
> +    QEMUTimer timer;
>  };
>  
>  struct x86_msi {
> @@ -1084,13 +1085,48 @@ static void tdx_td_notify(struct tdx_get_quote_task *t)
>      }
>  }
>  
> +static void tdx_getquote_task_cleanup(struct tdx_get_quote_task *t, bool outlen_overflow)
> +{
> +    MachineState *ms;
> +    TdxGuest *tdx;
> +
> +    if (t->hdr.error_code != cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS) && !outlen_overflow) {
> +        t->hdr.out_len = cpu_to_le32(0);
> +    }
> +
> +    /* Publish the response contents before marking this request completed. */
> +    smp_wmb();
> +    if (address_space_write(
> +            &address_space_memory, t->gpa,
> +            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
> +        error_report("TDX: failed to update GetQuote header.");
> +    }
> +    tdx_td_notify(t);
> +
> +    if (t->ioc->fd > 0) {
> +        qemu_set_fd_handler(t->ioc->fd, NULL, NULL, NULL);
> +    }
> +    qio_channel_close(QIO_CHANNEL(t->ioc), NULL);
> +    object_unref(OBJECT(t->ioc));
> +    timer_del(&t->timer);

Xiaoyao, I guess you missed a bug-fix patch here: t->timer could be
uninitialized, and timer_del() would then cause a segfault.

> +    g_free(t->out_data);
> +    g_free(t);
> +
> +    /* Maintain the number of in-flight requests. */
> +    ms = MACHINE(qdev_get_machine());
> +    tdx = TDX_GUEST(ms->cgs);
> +    qemu_mutex_lock(&tdx->lock);
> +    tdx->quote_generation_num--;
> +    qemu_mutex_unlock(&tdx->lock);
> +}
> +
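
One way to make such a cleanup path safe is to record whether the timer
was ever set up. The following is a minimal sketch with stand-in types:
demo_timer, demo_task and the timer_armed flag are hypothetical names
made up for this illustration, and this is not necessarily the fix that
was applied to the patch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a timer object; not QEMU's QEMUTimer. */
struct demo_timer {
    int *pending;   /* points at scheduler state once initialized */
};

struct demo_task {
    struct demo_timer timer;
    bool timer_armed;   /* set only after the timer has been initialized */
};

static void demo_timer_init(struct demo_timer *t, int *pending)
{
    t->pending = pending;
    *t->pending = 1;
}

/* Dereferences t->pending, so it must never see an uninitialized timer. */
static void demo_timer_del(struct demo_timer *t)
{
    *t->pending = 0;
}

/* Zeroed allocation keeps timer_armed false until the timer is set up. */
static struct demo_task *demo_task_new(void)
{
    return calloc(1, sizeof(struct demo_task));
}

static void demo_task_arm(struct demo_task *task, int *pending)
{
    demo_timer_init(&task->timer, pending);
    task->timer_armed = true;
}

/* Cleanup can run on early-error paths, before the timer exists. */
static void demo_task_cleanup(struct demo_task *task)
{
    if (task->timer_armed) {
        demo_timer_del(&task->timer);
    }
    free(task);
}
```

An alternative with the same effect would be to initialize the timer
unconditionally when the task is allocated, so that cleanup never sees
an uninitialized object.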



* Re: [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc...
  2023-08-23 19:41   ` Isaku Yamahata
@ 2023-08-24  7:50     ` Xiaoyao Li
  2023-08-24  7:55       ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-24  7:50 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang, isaku.yamahata

On 8/24/2023 3:41 AM, Isaku Yamahata wrote:
> On Fri, Aug 18, 2023 at 05:50:16AM -0400,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> Add UEFI definitions for literals, enums, structs, GUIDs, etc... that
>> will be used by TDX to build the UEFI Hand-Off Block (HOB) that is passed
>> to the Trusted Domain Virtual Firmware (TDVF).
>>
>> All values come from the UEFI specification and TDVF design guide. [1]
>>
>> Note, EFI_RESOURCE_MEMORY_UNACCEPTED will be added in future UEFI spec.
>>
>> [1] https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf
> 
> Nitpick: The specs [1] [2] include unaccepted memory.

EfiUnacceptedMemoryType appears in the UEFI spec, while
EFI_RESOURCE_MEMORY_UNACCEPTED is still missing from the PI spec.

https://github.com/tianocore/edk2/commit/00bbb1e584ec05547159f405cca383e8ba5e4ddb

> [1] UEFI Specification Version 2.10 (released August 2022)
> [2] UEFI Platform Initialization Distribution Packaging Specification Version 1.1



* Re: [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc...
  2023-08-24  7:50     ` Xiaoyao Li
@ 2023-08-24  7:55       ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-24  7:55 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang, isaku.yamahata

On 8/24/2023 3:50 PM, Xiaoyao Li wrote:
> On 8/24/2023 3:41 AM, Isaku Yamahata wrote:
>> On Fri, Aug 18, 2023 at 05:50:16AM -0400,
>> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
>>
>>> Add UEFI definitions for literals, enums, structs, GUIDs, etc... that
>>> will be used by TDX to build the UEFI Hand-Off Block (HOB) that is 
>>> passed
>>> to the Trusted Domain Virtual Firmware (TDVF).
>>>
>>> All values come from the UEFI specification and TDVF design guide. [1]
>>>
>>> Note, EFI_RESOURCE_MEMORY_UNACCEPTED will be added in future UEFI spec.
>>>
>>> [1] 
>>> https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf
>>
>> Nitpick: The specs [1] [2] include unaccepted memory.
> 
> EfiUnacceptedMemoryType appears in the UEFI spec, while
> EFI_RESOURCE_MEMORY_UNACCEPTED is still missing from the PI spec.
> 
> https://github.com/tianocore/edk2/commit/00bbb1e584ec05547159f405cca383e8ba5e4ddb

Sorry, I just found that it appears in the latest PI spec.

https://uefi.org/sites/default/files/resources/UEFI_PI_Spec_1_8_March3.pdf

>> [1] UEFI Specification Version 2.10 (released August 2022)
>> [2] UEFI Platform Initialization Distribution Packaging Specification 
>> Version 1.1
> 
> 



* Re: [PATCH v2 43/58] i386/tdx: setup a timer for the qio channel
  2023-08-24  7:21   ` Chenyi Qiang
@ 2023-08-24  8:34     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-24  8:34 UTC (permalink / raw)
  To: Chenyi Qiang, Paolo Bonzini, Richard Henderson,
	Michael S. Tsirkin, Marcel Apfelbaum, Igor Mammedov, Ani Sinha,
	Peter Xu, David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas

On 8/24/2023 3:21 PM, Chenyi Qiang wrote:
> 
> 
> On 8/18/2023 5:50 PM, Xiaoyao Li wrote:
>> From: Chenyi Qiang <chenyi.qiang@intel.com>
>>
>> To avoid no response from QGS server, setup a timer for the transaction. If
>> timeout, make it an error and interrupt guest. Define the threshold of time
>> to 30s at present, maybe change to other value if not appropriate.
>>
>> Extract the common cleanup code to make it more clear.
>>
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   target/i386/kvm/tdx.c | 151 ++++++++++++++++++++++++------------------
>>   1 file changed, 85 insertions(+), 66 deletions(-)
>>
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 3cb2163a0335..fa658ce1f2e4 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -1002,6 +1002,7 @@ struct tdx_get_quote_task {
>>       struct tdx_get_quote_header hdr;
>>       int event_notify_interrupt;
>>       QIOChannelSocket *ioc;
>> +    QEMUTimer timer;
>>   };
>>   
>>   struct x86_msi {
>> @@ -1084,13 +1085,48 @@ static void tdx_td_notify(struct tdx_get_quote_task *t)
>>       }
>>   }
>>   
>> +static void tdx_getquote_task_cleanup(struct tdx_get_quote_task *t, bool outlen_overflow)
>> +{
>> +    MachineState *ms;
>> +    TdxGuest *tdx;
>> +
>> +    if (t->hdr.error_code != cpu_to_le64(TDX_VP_GET_QUOTE_SUCCESS) && !outlen_overflow) {
>> +        t->hdr.out_len = cpu_to_le32(0);
>> +    }
>> +
>> +    /* Publish the response contents before marking this request completed. */
>> +    smp_wmb();
>> +    if (address_space_write(
>> +            &address_space_memory, t->gpa,
>> +            MEMTXATTRS_UNSPECIFIED, &t->hdr, sizeof(t->hdr)) != MEMTX_OK) {
>> +        error_report("TDX: failed to update GetQuote header.");
>> +    }
>> +    tdx_td_notify(t);
>> +
>> +    if (t->ioc->fd > 0) {
>> +        qemu_set_fd_handler(t->ioc->fd, NULL, NULL, NULL);
>> +    }
>> +    qio_channel_close(QIO_CHANNEL(t->ioc), NULL);
>> +    object_unref(OBJECT(t->ioc));
>> +    timer_del(&t->timer);
> 
> Xiaoyao, I guess you missed a bug-fix patch here: t->timer could be
> uninitialized, and timer_del() would then cause a segfault.

Thanks for the reminder.
I'll update this patch to include the fix.

Thanks,
-Xiaoyao

>> +    g_free(t->out_data);
>> +    g_free(t);
>> +
>> +    /* Maintain the number of in-flight requests. */
>> +    ms = MACHINE(qdev_get_machine());
>> +    tdx = TDX_GUEST(ms->cgs);
>> +    qemu_mutex_lock(&tdx->lock);
>> +    tdx->quote_generation_num--;
>> +    qemu_mutex_unlock(&tdx->lock);
>> +}
>> +
> 



* Re: [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility
  2023-08-21  9:58   ` Daniel P. Berrangé
@ 2023-08-28 13:14     ` Xiaoyao Li
  2023-08-29 10:28       ` Daniel P. Berrangé
  0 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-28 13:14 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/21/2023 5:58 PM, Daniel P. Berrangé wrote:
> On Fri, Aug 18, 2023 at 05:50:30AM -0400, Xiaoyao Li wrote:
>> Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   qapi/run-state.json   | 17 +++++++++++++--
>>   softmmu/runstate.c    | 49 +++++++++++++++++++++++++++++++++++++++++++
>>   target/i386/kvm/tdx.c | 24 ++++++++++++++++++++-
>>   3 files changed, 87 insertions(+), 3 deletions(-)
>>
>> diff --git a/qapi/run-state.json b/qapi/run-state.json
>> index f216ba54ec4c..506bbe31541f 100644
>> --- a/qapi/run-state.json
>> +++ b/qapi/run-state.json
>> @@ -499,7 +499,7 @@
>>   # Since: 2.9
>>   ##
>>   { 'enum': 'GuestPanicInformationType',
>> -  'data': [ 'hyper-v', 's390' ] }
>> +  'data': [ 'hyper-v', 's390', 'tdx' ] }
> 
> Missing documentation for the 'tdx' value
> 
>>   
>>   ##
>>   # @GuestPanicInformation:
>> @@ -514,7 +514,8 @@
>>    'base': {'type': 'GuestPanicInformationType'},
>>    'discriminator': 'type',
>>    'data': {'hyper-v': 'GuestPanicInformationHyperV',
>> -          's390': 'GuestPanicInformationS390'}}
>> +          's390': 'GuestPanicInformationS390',
>> +          'tdx' : 'GuestPanicInformationTdx'}}
>>   
>>   ##
>>   # @GuestPanicInformationHyperV:
>> @@ -577,6 +578,18 @@
>>             'psw-addr': 'uint64',
>>             'reason': 'S390CrashReason'}}
>>   
>> +##
>> +# @GuestPanicInformationTdx:
>> +#
>> +# TDX GHCI TDG.VP.VMCALL<ReportFatalError> specific guest panic information
> 
> Not documented any of the struct members. Especially please include
> the warning that 'message' comes from the guest and so must not be
> trusted, not assumed to be well formed.

Will do it in the next version.

thanks!


>> +#
>> +# Since: 8.2
>> +##
>> +{'struct': 'GuestPanicInformationTdx',
>> + 'data': {'error-code': 'uint64',
>> +          'gpa': 'uint64',
>> +          'message': 'str'}}
>> +
>>   ##
>>   # @MEMORY_FAILURE:
>>   #
>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>> index f3bd86281813..cab11484ed7e 100644
>> --- a/softmmu/runstate.c
>> +++ b/softmmu/runstate.c
>> @@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
>>                             S390CrashReason_str(info->u.s390.reason),
>>                             info->u.s390.psw_mask,
>>                             info->u.s390.psw_addr);
>> +        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
>> +            char *buf = NULL;
>> +            bool printable = false;
>> +
>> +            /*
>> +             * Although message is defined as a json string, we shouldn't
>> +             * unconditionally treat it as is because the guest generated it and
>> +             * it's not necessarily trustable.
>> +             */
>> +            if (info->u.tdx.message) {
>> +                /* The caller guarantees the NUL-terminated string. */
>> +                int len = strlen(info->u.tdx.message);
>> +                int i;
>> +
>> +                printable = len > 0;
>> +                for (i = 0; i < len; i++) {
>> +                    if (!(0x20 <= info->u.tdx.message[i] &&
>> +                          info->u.tdx.message[i] <= 0x7e)) {
>> +                        printable = false;
>> +                        break;
>> +                    }
>> +                }
>> +
>> +                /* 3 = length of "%02x " */
>> +                buf = g_malloc(len * 3);
>> +                for (i = 0; i < len; i++) {
>> +                    if (info->u.tdx.message[i] == '\0') {
>> +                        break;
>> +                    } else {
>> +                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
>> +                    }
>> +                }
>> +                if (i > 0)
>> +                    /* replace the last ' '(space) to NUL */
>> +                    buf[i * 3 - 1] = '\0';
>> +                else
>> +                    buf[0] = '\0';
> 
> You're building this escaped buffer but...
> 
>> +            }
>> +
>> +            qemu_log_mask(LOG_GUEST_ERROR,
>> +                          //" TDX report fatal error:\"%s\" %s",
>> +                          " TDX report fatal error:\"%s\""
>> +                          "error: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
>> +                          printable ? info->u.tdx.message : "",
>> +                          //buf ? buf : "",
> 
> ...then not actually using it
> 
> Either delete the 'buf' code, or use it.

Sorry for posting an internal testing version.
Does the version below look good to you?

@@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
                            S390CrashReason_str(info->u.s390.reason),
                            info->u.s390.psw_mask,
                            info->u.s390.psw_addr);
+        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
+            bool printable = false;
+            char *buf = NULL;
+            int len = 0, i;
+
+            /*
+             * Although message is defined as a json string, we shouldn't
+             * unconditionally treat it as is because the guest generated it and
+             * it's not necessarily trustable.
+             */
+            if (info->u.tdx.message) {
+                /* The caller guarantees the NUL-terminated string. */
+                len = strlen(info->u.tdx.message);
+
+                printable = len > 0;
+                for (i = 0; i < len; i++) {
+                    if (!(0x20 <= info->u.tdx.message[i] &&
+                          info->u.tdx.message[i] <= 0x7e)) {
+                        printable = false;
+                        break;
+                    }
+                }
+            }
+
+            if (!printable && len) {
+                /* 3 = length of "%02x " */
+                buf = g_malloc(len * 3);
+                for (i = 0; i < len; i++) {
+                    if (info->u.tdx.message[i] == '\0') {
+                        break;
+                    } else {
+                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
+                    }
+                }
+                if (i > 0)
+                    /* replace the last ' '(space) to NUL */
+                    buf[i * 3 - 1] = '\0';
+                else
+                    buf[0] = '\0';
+            }
+
+            qemu_log_mask(LOG_GUEST_ERROR,
+                          " TDX guest reports fatal error:\"%s\""
+                          " error code: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
+                          printable ? info->u.tdx.message : buf,
+                          info->u.tdx.error_code,
+                          info->u.tdx.gpa);
+            g_free(buf);
          }
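
The printable check and hex escaping above can also be sketched as two
standalone helpers. This is a hedged illustration in plain C, not the
code that was merged; demo_is_printable() and demo_hex_dump() are
hypothetical names. Two details differ from the snippet as posted:
the buffer gets one extra byte, because the last "%02x " conversion
also writes a terminating NUL, and each byte is cast to unsigned char,
because plain char may be signed and a byte such as 0xff would
otherwise be sign-extended and printed as more than two hex digits.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* True if s is non-empty and every byte is printable ASCII (0x20..0x7e). */
static int demo_is_printable(const char *s)
{
    size_t len = strlen(s);
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c < 0x20 || c > 0x7e) {
            return 0;
        }
    }
    return len > 0;
}

/*
 * Return a newly allocated "xx xx xx" hex dump of s, or NULL on
 * allocation failure. The caller frees the result.
 */
static char *demo_hex_dump(const char *s)
{
    size_t len = strlen(s);
    size_t i;
    /* 3 bytes per input byte ("xx "), plus 1 for the NUL snprintf appends. */
    char *buf = malloc(len * 3 + 1);

    if (buf == NULL) {
        return NULL;
    }
    buf[0] = '\0';
    for (i = 0; i < len; i++) {
        /* Cast so bytes >= 0x80 are not sign-extended before formatting. */
        snprintf(buf + 3 * i, 4, "%02x ",
                 (unsigned int)(unsigned char)s[i]);
    }
    if (len > 0) {
        buf[len * 3 - 1] = '\0'; /* drop the trailing space */
    }
    return buf;
}
```

A logging call could then pass the message itself when
demo_is_printable() returns true and the result of demo_hex_dump()
otherwise, freeing the dump afterwards.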



>> +                          info->u.tdx.error_code,
>> +                          info->u.tdx.gpa);
>> +            g_free(buf);
>>           }
>> +
>>           qapi_free_GuestPanicInformation(info);
>>       }
>>   }
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index f111b46dac92..7efaa13f59e2 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -18,6 +18,7 @@
>>   #include "qom/object_interfaces.h"
>>   #include "standard-headers/asm-x86/kvm_para.h"
>>   #include "sysemu/kvm.h"
>> +#include "sysemu/runstate.h"
>>   #include "sysemu/sysemu.h"
>>   #include "exec/address-spaces.h"
>>   #include "exec/ramblock.h"
>> @@ -1408,11 +1409,26 @@ static void tdx_handle_get_quote(X86CPU *cpu, struct kvm_tdx_vmcall *vmcall)
>>       vmcall->status_code = TDG_VP_VMCALL_SUCCESS;
>>   }
>>   
>> +static void tdx_panicked_on_fatal_error(X86CPU *cpu, uint64_t error_code,
>> +                                        uint64_t gpa, char *message)
>> +{
>> +    GuestPanicInformation *panic_info;
>> +
>> +    panic_info = g_new0(GuestPanicInformation, 1);
>> +    panic_info->type = GUEST_PANIC_INFORMATION_TYPE_TDX;
>> +    panic_info->u.tdx.error_code = error_code;
>> +    panic_info->u.tdx.gpa = gpa;
>> +    panic_info->u.tdx.message = (char *)message;
>> +
>> +    qemu_system_guest_panicked(panic_info);
>> +}
>> +
>>   static void tdx_handle_report_fatal_error(X86CPU *cpu,
>>                                             struct kvm_tdx_vmcall *vmcall)
>>   {
>>       uint64_t error_code = vmcall->in_r12;
>>       char *message = NULL;
>> +    uint64_t gpa = -1ull;
>>   
>>       if (error_code & 0xffff) {
>>           error_report("invalid error code of TDG.VP.VMCALL<REPORT_FATAL_ERROR>\n");
>> @@ -1441,7 +1457,13 @@ static void tdx_handle_report_fatal_error(X86CPU *cpu,
>>       }
>>   
>>       error_report("TD guest reports fatal error. %s\n", message ? : "");
> 
> In tdx_panicked_on_fatal_error you're avoiding printing 'message' if it
> contains non-printable characters, but here you're printing it regardless.

I guess you meant qemu_system_guest_panicked().

> Do we still need this error_report call at all ?

Yes, it can be dropped, and it should have been before I sent this to the
mailing list. I kept it internally for testing purposes because
qemu_log_mask() output isn't printed by default.

>> -    exit(1);
>> +
>> +#define TDX_REPORT_FATAL_ERROR_GPA_VALID    BIT_ULL(63)
>> +    if (error_code & TDX_REPORT_FATAL_ERROR_GPA_VALID) {
>> +	gpa = vmcall->in_r13;
> 
> Bad indent

Fixed.

>> +    }
>> +
>> +    tdx_panicked_on_fatal_error(cpu, error_code, gpa, message);
>>   }
>>   
>>   static void tdx_handle_setup_event_notify_interrupt(X86CPU *cpu,
>> -- 
>> 2.34.1
>>
> 
> With regards,
> Daniel



* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-22  8:24     ` Daniel P. Berrangé
@ 2023-08-29  5:31       ` Chenyi Qiang
  2023-08-29 10:25         ` Daniel P. Berrangé
  2023-09-26 20:33         ` Markus Armbruster
  0 siblings, 2 replies; 120+ messages in thread
From: Chenyi Qiang @ 2023-08-29  5:31 UTC (permalink / raw)
  To: Daniel P. Berrangé, Markus Armbruster
  Cc: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas



On 8/22/2023 4:24 PM, Daniel P. Berrangé wrote:
> On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>
>>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>>>
>>> For GetQuote, delegate a request to Quote Generation Service.  Add property
>>> of address of quote generation server and On request, connect to the
>>> server, read request buffer from shared guest memory, send the request
>>> buffer to the server and store the response into shared guest memory and
>>> notify TD guest by interrupt.
>>>
>>> "quote-generation-service" is a property to specify Quote Generation
>>> Service(QGS) in qemu socket address format.  The examples of the supported
>>> format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
>>>
>>> command line example:
>>>   qemu-system-x86_64 \
>>>     -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
>>>     -machine confidential-guest-support=tdx0
>>>
>>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>> ---
>>>  qapi/qom.json         |   5 +-
>>>  target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
>>>  target/i386/kvm/tdx.h |   7 +
>>>  3 files changed, 391 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>> index 87c1d440f331..37139949d761 100644
>>> --- a/qapi/qom.json
>>> +++ b/qapi/qom.json
>>> @@ -879,13 +879,16 @@
>>>  #
>>>  # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
>>>  #
>>> +# @quote-generation-service: socket address for Quote Generation Service(QGS)
>>> +#
>>>  # Since: 8.2
>>>  ##
>>>  { 'struct': 'TdxGuestProperties',
>>>    'data': { '*sept-ve-disable': 'bool',
>>>              '*mrconfigid': 'str',
>>>              '*mrowner': 'str',
>>> -            '*mrownerconfig': 'str' } }
>>> +            '*mrownerconfig': 'str',
>>> +            '*quote-generation-service': 'str' } }
>>
>> Why not type SocketAddress?
> 
> Yes, the code uses SocketAddress internally when it eventually
> calls qio_channel_socket_connect_async(), so we should directly
> use SocketAddress in the QAPI from the start.

Is there any benefit to using SocketAddress directly?

"quote-generation-service" is optional here, and it does not seem trivial
to add and parse the SocketAddress type on the QEMU command line. After I
changed 'str' to 'SocketAddress' and specified the option as "-object
tdx-guest,type=vsock,cid=2,port=1234...", QEMU reported "invalid
parameter cid".

> 
> With regards,
> Daniel


* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-29  5:31       ` Chenyi Qiang
@ 2023-08-29 10:25         ` Daniel P. Berrangé
  2023-08-30  5:18           ` Chenyi Qiang
  2023-09-26 20:33         ` Markus Armbruster
  1 sibling, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-29 10:25 UTC (permalink / raw)
  To: Chenyi Qiang
  Cc: Markus Armbruster, Xiaoyao Li, Paolo Bonzini, Richard Henderson,
	Michael S. Tsirkin, Marcel Apfelbaum, Igor Mammedov, Ani Sinha,
	Peter Xu, David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas

On Tue, Aug 29, 2023 at 01:31:37PM +0800, Chenyi Qiang wrote:
> 
> 
> On 8/22/2023 4:24 PM, Daniel P. Berrangé wrote:
> > On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
> >> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> >>
> >>> From: Isaku Yamahata <isaku.yamahata@intel.com>
> >>>
> >>> For GetQuote, delegate a request to Quote Generation Service.  Add property
> >>> of address of quote generation server and On request, connect to the
> >>> server, read request buffer from shared guest memory, send the request
> >>> buffer to the server and store the response into shared guest memory and
> >>> notify TD guest by interrupt.
> >>>
> >>> "quote-generation-service" is a property to specify Quote Generation
> >>> Service(QGS) in qemu socket address format.  The examples of the supported
> >>> format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
> >>>
> >>> command line example:
> >>>   qemu-system-x86_64 \
> >>>     -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
> >>>     -machine confidential-guest-support=tdx0
> >>>
> >>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> >>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> >>> ---
> >>>  qapi/qom.json         |   5 +-
> >>>  target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
> >>>  target/i386/kvm/tdx.h |   7 +
> >>>  3 files changed, 391 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/qapi/qom.json b/qapi/qom.json
> >>> index 87c1d440f331..37139949d761 100644
> >>> --- a/qapi/qom.json
> >>> +++ b/qapi/qom.json
> >>> @@ -879,13 +879,16 @@
> >>>  #
> >>>  # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
> >>>  #
> >>> +# @quote-generation-service: socket address for Quote Generation Service(QGS)
> >>> +#
> >>>  # Since: 8.2
> >>>  ##
> >>>  { 'struct': 'TdxGuestProperties',
> >>>    'data': { '*sept-ve-disable': 'bool',
> >>>              '*mrconfigid': 'str',
> >>>              '*mrowner': 'str',
> >>> -            '*mrownerconfig': 'str' } }
> >>> +            '*mrownerconfig': 'str',
> >>> +            '*quote-generation-service': 'str' } }
> >>
> >> Why not type SocketAddress?
> > 
> > Yes, the code uses SocketAddress internally when it eventually
> > calls qio_channel_socket_connect_async(), so we should directly
> > use SocketAddress in the QAPI from the start.
> 
> Any benefit to directly use SocketAddress?

We don't want whatever code consumes the configuration to
do a second level of parsing to convert the 'str' value
into the 'SocketAddress' object it actually needs.

QEMU has a long history of having a second round of ad-hoc
parsing of configuration and we've found it to be a serious
maintenance burden. Thus we strive to have everything
represented in QAPI using the desired final type, and avoid
the second round of parsing.

> "quote-generation-service" here is optional, it seems not trivial to add
> and parse the SocketAddress type in QEMU command. After I change 'str'
> to 'SocketAddress' and specify the command like "-object
> tdx-guest,type=vsock,cid=2,port=1234...", it will report "invalid
> parameter cid".

The -object parameter supports JSON syntax for this reason:

   -object '{"qom-type":"tdx-guest","quote-generation-service":{"type": "vsock", "cid":"2","port":"1234"}}'

libvirt will always use the JSON syntax for -object with a new enough
QEMU.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



* Re: [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility
  2023-08-28 13:14     ` Xiaoyao Li
@ 2023-08-29 10:28       ` Daniel P. Berrangé
  2023-08-30  2:15         ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-29 10:28 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Mon, Aug 28, 2023 at 09:14:41PM +0800, Xiaoyao Li wrote:
> On 8/21/2023 5:58 PM, Daniel P. Berrangé wrote:
> > On Fri, Aug 18, 2023 at 05:50:30AM -0400, Xiaoyao Li wrote:
> > > Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > ---
> > >   qapi/run-state.json   | 17 +++++++++++++--
> > >   softmmu/runstate.c    | 49 +++++++++++++++++++++++++++++++++++++++++++
> > >   target/i386/kvm/tdx.c | 24 ++++++++++++++++++++-
> > >   3 files changed, 87 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/qapi/run-state.json b/qapi/run-state.json
> > > index f216ba54ec4c..506bbe31541f 100644
> > > --- a/qapi/run-state.json
> > > +++ b/qapi/run-state.json
> > > @@ -499,7 +499,7 @@
> > >   # Since: 2.9
> > >   ##
> > >   { 'enum': 'GuestPanicInformationType',
> > > -  'data': [ 'hyper-v', 's390' ] }
> > > +  'data': [ 'hyper-v', 's390', 'tdx' ] }

> 
> > > +#
> > > +# Since: 8.2
> > > +##
> > > +{'struct': 'GuestPanicInformationTdx',
> > > + 'data': {'error-code': 'uint64',
> > > +          'gpa': 'uint64',
> > > +          'message': 'str'}}
> > > +
> > >   ##
> > >   # @MEMORY_FAILURE:
> > >   #
> > > diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> > > index f3bd86281813..cab11484ed7e 100644
> > > --- a/softmmu/runstate.c
> > > +++ b/softmmu/runstate.c
> > > @@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
> > >                             S390CrashReason_str(info->u.s390.reason),
> > >                             info->u.s390.psw_mask,
> > >                             info->u.s390.psw_addr);
> > > +        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
> > > +            char *buf = NULL;
> > > +            bool printable = false;
> > > +
> > > +            /*
> > > +             * Although message is defined as a json string, we shouldn't
> > > +             * unconditionally treat it as is because the guest generated it and
> > > +             * it's not necessarily trustable.
> > > +             */
> > > +            if (info->u.tdx.message) {
> > > +                /* The caller guarantees the NUL-terminated string. */
> > > +                int len = strlen(info->u.tdx.message);
> > > +                int i;
> > > +
> > > +                printable = len > 0;
> > > +                for (i = 0; i < len; i++) {
> > > +                    if (!(0x20 <= info->u.tdx.message[i] &&
> > > +                          info->u.tdx.message[i] <= 0x7e)) {
> > > +                        printable = false;
> > > +                        break;
> > > +                    }
> > > +                }
> > > +
> > > +                /* 3 = length of "%02x " */
> > > +                buf = g_malloc(len * 3);
> > > +                for (i = 0; i < len; i++) {
> > > +                    if (info->u.tdx.message[i] == '\0') {
> > > +                        break;
> > > +                    } else {
> > > +                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
> > > +                    }
> > > +                }
> > > +                if (i > 0)
> > > +                    /* replace the last ' '(space) to NUL */
> > > +                    buf[i * 3 - 1] = '\0';
> > > +                else
> > > +                    buf[0] = '\0';
> > 
> > You're building this escaped buffer but...
> > 
> > > +            }
> > > +
> > > +            qemu_log_mask(LOG_GUEST_ERROR,
> > > +                          //" TDX report fatal error:\"%s\" %s",
> > > +                          " TDX report fatal error:\"%s\""
> > > +                          "error: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
> > > +                          printable ? info->u.tdx.message : "",
> > > +                          //buf ? buf : "",
> > 
> > ...then not actually using it
> > 
> > Either delete the 'buf' code, or use it.
> 
> Sorry for posting some internal testing version.
> Does below look good to you?
> 
> @@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
>                            S390CrashReason_str(info->u.s390.reason),
>                            info->u.s390.psw_mask,
>                            info->u.s390.psw_addr);
> +        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
> +            bool printable = false;
> +            char *buf = NULL;
> +            int len = 0, i;
> +
> +            /*
> +             * Although message is defined as a json string, we shouldn't
> +             * unconditionally treat it as is because the guest generated it and
> +             * it's not necessarily trustable.
> +             */
> +            if (info->u.tdx.message) {
> +                /* The caller guarantees the NUL-terminated string. */
> +                len = strlen(info->u.tdx.message);
> +
> +                printable = len > 0;
> +                for (i = 0; i < len; i++) {
> +                    if (!(0x20 <= info->u.tdx.message[i] &&
> +                          info->u.tdx.message[i] <= 0x7e)) {
> +                        printable = false;
> +                        break;
> +                    }
> +                }
> +            }
> +
> +            if (!printable && len) {
> +                /* 3 = length of "%02x " */
> +                buf = g_malloc(len * 3);
> +                for (i = 0; i < len; i++) {
> +                    if (info->u.tdx.message[i] == '\0') {
> +                        break;
> +                    } else {
> +                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
> +                    }
> +                }
> +                if (i > 0)
> +                    /* replace the last ' '(space) to NUL */
> +                    buf[i * 3 - 1] = '\0';
> +                else
> +                    buf[0] = '\0';
> +            }
> +
> +            qemu_log_mask(LOG_GUEST_ERROR,
> +                          " TDX guest reports fatal error:\"%s\""
> +                          " error code: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
> +                          printable ? info->u.tdx.message : buf,
> +                          info->u.tdx.error_code,
> +                          info->u.tdx.gpa);
> +            g_free(buf);
>          }


OK, that makes more sense now. BTW, it is probably a nice idea to create a
separate helper method that sanitizes the guest-provided JSON into
the safe 'buf' string.
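
A minimal sketch of what such a helper could look like, using plain libc
so that it compiles on its own. The names tdx_msg_is_printable() and
tdx_sanitize_message() are hypothetical and not part of the posted patch.
The sketch allocates len * 3 + 1 bytes, because the last "%02x " written
is followed by a NUL that a len * 3 buffer has no room for, and it casts
each byte to unsigned char so that bytes >= 0x80 still format as two hex
digits.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if msg is non-empty printable ASCII. */
static bool tdx_msg_is_printable(const char *msg)
{
    size_t i, len = strlen(msg);

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)msg[i];

        if (c < 0x20 || c > 0x7e) {
            return false;
        }
    }
    return len > 0;
}

/*
 * Hypothetical helper: return a newly allocated string that is safe to
 * log. A printable message is copied as is; anything else becomes a
 * space-separated hex dump. NULL or empty input yields "".
 * The caller frees the result; NULL is returned if allocation fails.
 */
static char *tdx_sanitize_message(const char *msg)
{
    size_t i, len = msg ? strlen(msg) : 0;
    char *buf;

    if (len == 0 || tdx_msg_is_printable(msg)) {
        buf = malloc(len + 1);
        if (buf) {
            memcpy(buf, msg ? msg : "", len);
            buf[len] = '\0';
        }
        return buf;
    }

    /* 3 bytes per "%02x " group, plus 1 for the NUL of the last group. */
    buf = malloc(len * 3 + 1);
    if (!buf) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        snprintf(buf + 3 * i, 4, "%02x ", (unsigned char)msg[i]);
    }
    buf[len * 3 - 1] = '\0';    /* replace the trailing space */
    return buf;
}
```

With this, the logging branch would only need to call the helper, pass the
result to qemu_log_mask() and free it. Inside QEMU the allocation calls
would presumably be g_malloc()/g_strdup()/g_free() instead of the libc
ones used here.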


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



* Re: [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem()
  2023-08-18  9:50 ` [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem() Xiaoyao Li
  2023-08-21  9:40   ` Daniel P. Berrangé
@ 2023-08-29 14:33   ` Philippe Mathieu-Daudé
  2023-08-30  1:53     ` Xiaoyao Li
  1 sibling, 1 reply; 120+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-08-29 14:33 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 18/8/23 11:50, Xiaoyao Li wrote:
> Introduce memory_region_init_ram_gmem() to allocate private gmem on the
> MemoryRegion initialization. It's for the usercase of TDVF, which must
> be private on TDX case.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>   include/exec/memory.h |  6 +++++
>   softmmu/memory.c      | 52 +++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 58 insertions(+)


> diff --git a/softmmu/memory.c b/softmmu/memory.c
> index af6aa3c1e3c9..ded44dcef1aa 100644
> --- a/softmmu/memory.c
> +++ b/softmmu/memory.c
> @@ -25,6 +25,7 @@
>   #include "qom/object.h"
>   #include "trace.h"
>   
> +#include <linux/kvm.h>

Unlikely to build on non-Linux hosts.

>   #include "exec/memory-internal.h"
>   #include "exec/ram_addr.h"
>   #include "sysemu/kvm.h"



* Re: [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu()
  2023-08-18  9:49 ` [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu() Xiaoyao Li
  2023-08-21  8:55   ` Daniel P. Berrangé
@ 2023-08-29 14:40   ` Philippe Mathieu-Daudé
  2023-08-30  1:45     ` Xiaoyao Li
  1 sibling, 1 reply; 120+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-08-29 14:40 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 18/8/23 11:49, Xiaoyao Li wrote:
> Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
> work prior to create any vcpu. This is for i386 TDX because it needs
> call TDX_INIT_VM before creating any vcpu.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>   accel/kvm/kvm-all.c  | 12 ++++++++++++
>   include/sysemu/kvm.h |  1 +
>   2 files changed, 13 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index c9f3aab5e587..5071af917ae0 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -422,6 +422,11 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>       return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
>   }
>   
> +int __attribute__ ((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu)
> +{
> +    return 0;
> +}

kvm_arch_init_vcpu() is implemented for each arch. Why not use the
same approach here?

>   int kvm_init_vcpu(CPUState *cpu, Error **errp)
>   {
>       KVMState *s = kvm_state;
> @@ -430,6 +435,13 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>   
>       trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>   
> +    ret = kvm_arch_pre_create_vcpu(cpu);
> +    if (ret < 0) {
> +        error_setg_errno(errp, -ret, "%s: kvm_arch_pre_create_vcpu() failed",
> +                        __func__);
> +        goto err;
> +    }
> +
>       ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>       if (ret < 0) {
>           error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index 49c896d8a512..d89ec87072d7 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -371,6 +371,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level);
>   
>   int kvm_arch_init(MachineState *ms, KVMState *s);
>   
> +int kvm_arch_pre_create_vcpu(CPUState *cpu);
>   int kvm_arch_init_vcpu(CPUState *cpu);
>   int kvm_arch_destroy_vcpu(CPUState *cpu);
>   



* Re: [PATCH v2 04/58] target/i386: Introduce kvm_confidential_guest_init()
  2023-08-18  9:49 ` [PATCH v2 04/58] target/i386: Introduce kvm_confidential_guest_init() Xiaoyao Li
@ 2023-08-29 14:42   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 120+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-08-29 14:42 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 18/8/23 11:49, Xiaoyao Li wrote:
> Introduce a separate function kvm_confidential_guest_init() for SEV (and
> future TDX).
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> ---
>   target/i386/kvm/kvm.c | 11 ++++++++++-
>   target/i386/sev.c     |  1 -
>   target/i386/sev.h     |  2 ++
>   3 files changed, 12 insertions(+), 2 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



* Re: [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu()
  2023-08-29 14:40   ` Philippe Mathieu-Daudé
@ 2023-08-30  1:45     ` Xiaoyao Li
  2023-08-30 16:54       ` Isaku Yamahata
  0 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-30  1:45 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 8/29/2023 10:40 PM, Philippe Mathieu-Daudé wrote:
> On 18/8/23 11:49, Xiaoyao Li wrote:
>> Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
>> work prior to create any vcpu. This is for i386 TDX because it needs
>> call TDX_INIT_VM before creating any vcpu.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> Acked-by: Gerd Hoffmann <kraxel@redhat.com>
>> ---
>>   accel/kvm/kvm-all.c  | 12 ++++++++++++
>>   include/sysemu/kvm.h |  1 +
>>   2 files changed, 13 insertions(+)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index c9f3aab5e587..5071af917ae0 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -422,6 +422,11 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>>       return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
>>   }
>> +int __attribute__ ((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu)
>> +{
>> +    return 0;
>> +}
> 
> kvm_arch_init_vcpu() is implemented for each arch. Why not use the
> same approach here?

Because only x86 needs it currently, for TDX. Other arches don't require 
an implementation.

Without the weak function, every other arch would have to implement an
empty function (just returning 0) as a placeholder. If the QEMU community
prefers that approach, I can switch to it in the next version.
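
To illustrate the trade-off, here is a minimal standalone sketch of the
weak-default pattern. CPUState is only forward-declared as a stand-in for
QEMU's type, and init_vcpu_sketch() is a hypothetical caller that mirrors
how kvm_init_vcpu() uses the hook. It assumes a toolchain that supports
the GCC/Clang weak attribute.

```c
#include <stddef.h>

/* Stand-in for QEMU's CPUState so the sketch compiles on its own. */
typedef struct CPUState CPUState;

/*
 * Generic default. Because the definition is weak, the linker drops it
 * when another object file provides a strong definition of the same
 * symbol, which is what the i386 KVM code would do for TDX.
 */
int __attribute__((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu)
{
    (void)cpu;
    return 0;
}

/* Hypothetical caller: run the arch hook before creating the vCPU. */
int init_vcpu_sketch(CPUState *cpu)
{
    int ret = kvm_arch_pre_create_vcpu(cpu);

    if (ret < 0) {
        return ret;     /* propagate the arch-specific failure */
    }
    return 0;           /* vCPU creation would continue here */
}
```

The alternative raised in the review is to drop the weak attribute and add
an explicit "return 0" stub to every arch, as is done for
kvm_arch_init_vcpu(). That costs a few lines per arch but makes a missing
implementation a link error instead of a silent fallback.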

>>   int kvm_init_vcpu(CPUState *cpu, Error **errp)
>>   {
>>       KVMState *s = kvm_state;
>> @@ -430,6 +435,13 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>>       trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>> +    ret = kvm_arch_pre_create_vcpu(cpu);
>> +    if (ret < 0) {
>> +        error_setg_errno(errp, -ret, "%s: kvm_arch_pre_create_vcpu() failed",
>> +                        __func__);
>> +        goto err;
>> +    }
>> +
>>       ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>>       if (ret < 0) {
>>           error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
>> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
>> index 49c896d8a512..d89ec87072d7 100644
>> --- a/include/sysemu/kvm.h
>> +++ b/include/sysemu/kvm.h
>> @@ -371,6 +371,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level);
>>   int kvm_arch_init(MachineState *ms, KVMState *s);
>> +int kvm_arch_pre_create_vcpu(CPUState *cpu);
>>   int kvm_arch_init_vcpu(CPUState *cpu);
>>   int kvm_arch_destroy_vcpu(CPUState *cpu);
> 



* Re: [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem()
  2023-08-29 14:33   ` Philippe Mathieu-Daudé
@ 2023-08-30  1:53     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-30  1:53 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 8/29/2023 10:33 PM, Philippe Mathieu-Daudé wrote:
> On 18/8/23 11:50, Xiaoyao Li wrote:
>> Introduce memory_region_init_ram_gmem() to allocate private gmem on the
>> MemoryRegion initialization. It's for the usercase of TDVF, which must
>> be private on TDX case.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   include/exec/memory.h |  6 +++++
>>   softmmu/memory.c      | 52 +++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 58 insertions(+)
> 
> 
>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>> index af6aa3c1e3c9..ded44dcef1aa 100644
>> --- a/softmmu/memory.c
>> +++ b/softmmu/memory.c
>> @@ -25,6 +25,7 @@
>>   #include "qom/object.h"
>>   #include "trace.h"
>> +#include <linux/kvm.h>
> 
> Unlikely to build on non-Linux hosts.

Thanks for catching it!

Will wrap it with CONFIG_KVM.
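
A rough sketch of the guard, under the assumption that CONFIG_KVM is
defined by the build system and visible in this source file.
gmem_compiled_in() is a hypothetical helper, added only to show how the
code that depends on the header would be guarded in the same way.

```c
#include <stdbool.h>

/* Only pull in the Linux-specific header for KVM-enabled builds. */
#ifdef CONFIG_KVM
#include <linux/kvm.h>
#endif

/* Hypothetical helper: reports whether the gmem path was compiled in. */
static bool gmem_compiled_in(void)
{
#ifdef CONFIG_KVM
    return true;
#else
    return false;
#endif
}
```

Any code that uses definitions from <linux/kvm.h> needs the same guard, or
a stub for non-KVM builds, so that non-Linux hosts still compile.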

Anyway, how to integrate KVM gmem into QEMU's memory system is the main
open question in the QEMU gmem series[*]. I'm still working on it.

[*] 
https://lore.kernel.org/qemu-devel/20230731162201.271114-1-xiaoyao.li@intel.com/

>>   #include "exec/memory-internal.h"
>>   #include "exec/ram_addr.h"
>>   #include "sysemu/kvm.h"
> 



* Re: [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility
  2023-08-29 10:28       ` Daniel P. Berrangé
@ 2023-08-30  2:15         ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-30  2:15 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On 8/29/2023 6:28 PM, Daniel P. Berrangé wrote:
> On Mon, Aug 28, 2023 at 09:14:41PM +0800, Xiaoyao Li wrote:
>> On 8/21/2023 5:58 PM, Daniel P. Berrangé wrote:
>>> On Fri, Aug 18, 2023 at 05:50:30AM -0400, Xiaoyao Li wrote:
>>>> Originated-from: Isaku Yamahata <isaku.yamahata@intel.com>
>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>>> ---
>>>>    qapi/run-state.json   | 17 +++++++++++++--
>>>>    softmmu/runstate.c    | 49 +++++++++++++++++++++++++++++++++++++++++++
>>>>    target/i386/kvm/tdx.c | 24 ++++++++++++++++++++-
>>>>    3 files changed, 87 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/qapi/run-state.json b/qapi/run-state.json
>>>> index f216ba54ec4c..506bbe31541f 100644
>>>> --- a/qapi/run-state.json
>>>> +++ b/qapi/run-state.json
>>>> @@ -499,7 +499,7 @@
>>>>    # Since: 2.9
>>>>    ##
>>>>    { 'enum': 'GuestPanicInformationType',
>>>> -  'data': [ 'hyper-v', 's390' ] }
>>>> +  'data': [ 'hyper-v', 's390', 'tdx' ] }
> 
>>
>>>> +#
>>>> +# Since: 8.2
>>>> +##
>>>> +{'struct': 'GuestPanicInformationTdx',
>>>> + 'data': {'error-code': 'uint64',
>>>> +          'gpa': 'uint64',
>>>> +          'message': 'str'}}
>>>> +
>>>>    ##
>>>>    # @MEMORY_FAILURE:
>>>>    #
>>>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>>>> index f3bd86281813..cab11484ed7e 100644
>>>> --- a/softmmu/runstate.c
>>>> +++ b/softmmu/runstate.c
>>>> @@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
>>>>                              S390CrashReason_str(info->u.s390.reason),
>>>>                              info->u.s390.psw_mask,
>>>>                              info->u.s390.psw_addr);
>>>> +        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
>>>> +            char *buf = NULL;
>>>> +            bool printable = false;
>>>> +
>>>> +            /*
>>>> +             * Although message is defined as a json string, we shouldn't
>>>> +             * unconditionally treat it as is because the guest generated it and
>>>> +             * it's not necessarily trustable.
>>>> +             */
>>>> +            if (info->u.tdx.message) {
>>>> +                /* The caller guarantees the NUL-terminated string. */
>>>> +                int len = strlen(info->u.tdx.message);
>>>> +                int i;
>>>> +
>>>> +                printable = len > 0;
>>>> +                for (i = 0; i < len; i++) {
>>>> +                    if (!(0x20 <= info->u.tdx.message[i] &&
>>>> +                          info->u.tdx.message[i] <= 0x7e)) {
>>>> +                        printable = false;
>>>> +                        break;
>>>> +                    }
>>>> +                }
>>>> +
>>>> +                /* 3 = length of "%02x " */
>>>> +                buf = g_malloc(len * 3);
>>>> +                for (i = 0; i < len; i++) {
>>>> +                    if (info->u.tdx.message[i] == '\0') {
>>>> +                        break;
>>>> +                    } else {
>>>> +                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
>>>> +                    }
>>>> +                }
>>>> +                if (i > 0)
>>>> +                    /* replace the last ' '(space) to NUL */
>>>> +                    buf[i * 3 - 1] = '\0';
>>>> +                else
>>>> +                    buf[0] = '\0';
>>>
>>> You're building this escaped buffer but...
>>>
>>>> +            }
>>>> +
>>>> +            qemu_log_mask(LOG_GUEST_ERROR,
>>>> +                          //" TDX report fatal error:\"%s\" %s",
>>>> +                          " TDX report fatal error:\"%s\""
>>>> +                          "error: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
>>>> +                          printable ? info->u.tdx.message : "",
>>>> +                          //buf ? buf : "",
>>>
>>> ...then not actually using it
>>>
>>> Either delete the 'buf' code, or use it.
>>
>> Sorry for posting some internal testing version.
>> Does below look good to you?
>>
>> @@ -518,7 +518,56 @@ void qemu_system_guest_panicked(GuestPanicInformation *info)
>>                             S390CrashReason_str(info->u.s390.reason),
>>                             info->u.s390.psw_mask,
>>                             info->u.s390.psw_addr);
>> +        } else if (info->type == GUEST_PANIC_INFORMATION_TYPE_TDX) {
>> +            bool printable = false;
>> +            char *buf = NULL;
>> +            int len = 0, i;
>> +
>> +            /*
>> +             * Although message is defined as a json string, we shouldn't
>> +             * unconditionally treat it as is because the guest generated it and
>> +             * it's not necessarily trustable.
>> +             */
>> +            if (info->u.tdx.message) {
>> +                /* The caller guarantees the NUL-terminated string. */
>> +                len = strlen(info->u.tdx.message);
>> +
>> +                printable = len > 0;
>> +                for (i = 0; i < len; i++) {
>> +                    if (!(0x20 <= info->u.tdx.message[i] &&
>> +                          info->u.tdx.message[i] <= 0x7e)) {
>> +                        printable = false;
>> +                        break;
>> +                    }
>> +                }
>> +            }
>> +
>> +            if (!printable && len) {
>> +                /* 3 = length of "%02x " */
>> +                buf = g_malloc(len * 3);
>> +                for (i = 0; i < len; i++) {
>> +                    if (info->u.tdx.message[i] == '\0') {
>> +                        break;
>> +                    } else {
>> +                        sprintf(buf + 3 * i, "%02x ", info->u.tdx.message[i]);
>> +                    }
>> +                }
>> +                if (i > 0)
>> +                    /* replace the last ' '(space) to NUL */
>> +                    buf[i * 3 - 1] = '\0';
>> +                else
>> +                    buf[0] = '\0';
>> +            }
>> +
>> +            qemu_log_mask(LOG_GUEST_ERROR,
>> +                          " TDX guest reports fatal error:\"%s\""
>> +                          " error code: 0x%016" PRIx64 " gpa page: 0x%016" PRIx64 "\n",
>> +                          printable ? info->u.tdx.message : buf,
>> +                          info->u.tdx.error_code,
>> +                          info->u.tdx.gpa);
>> +            g_free(buf);
>>           }
> 
> 
> Ok that makes more sense now. BTW, probably a nice idea to create a
> separate helper method that sanitizes the guest provided JSON into
> the safe 'buf' string.
> 

OK. Thanks for the suggestion.

Will do it in the next version.

> With regards,
> Daniel



* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-29 10:25         ` Daniel P. Berrangé
@ 2023-08-30  5:18           ` Chenyi Qiang
  2023-08-30  5:57             ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Chenyi Qiang @ 2023-08-30  5:18 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Markus Armbruster, Xiaoyao Li, Paolo Bonzini, Richard Henderson,
	Michael S. Tsirkin, Marcel Apfelbaum, Igor Mammedov, Ani Sinha,
	Peter Xu, David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas



On 8/29/2023 6:25 PM, Daniel P. Berrangé wrote:
> On Tue, Aug 29, 2023 at 01:31:37PM +0800, Chenyi Qiang wrote:
>>
>>
>> On 8/22/2023 4:24 PM, Daniel P. Berrangé wrote:
>>> On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
>>>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>>>
>>>>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>>>>>
>>>>> For GetQuote, delegate a request to Quote Generation Service.  Add property
>>>>> of address of quote generation server and On request, connect to the
>>>>> server, read request buffer from shared guest memory, send the request
>>>>> buffer to the server and store the response into shared guest memory and
>>>>> notify TD guest by interrupt.
>>>>>
>>>>> "quote-generation-service" is a property to specify Quote Generation
>>>>> Service(QGS) in qemu socket address format.  The examples of the supported
>>>>> format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
>>>>>
>>>>> command line example:
>>>>>   qemu-system-x86_64 \
>>>>>     -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
>>>>>     -machine confidential-guest-support=tdx0
>>>>>
>>>>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>>>> ---
>>>>>  qapi/qom.json         |   5 +-
>>>>>  target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
>>>>>  target/i386/kvm/tdx.h |   7 +
>>>>>  3 files changed, 391 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>>>> index 87c1d440f331..37139949d761 100644
>>>>> --- a/qapi/qom.json
>>>>> +++ b/qapi/qom.json
>>>>> @@ -879,13 +879,16 @@
>>>>>  #
>>>>>  # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
>>>>>  #
>>>>> +# @quote-generation-service: socket address for Quote Generation Service(QGS)
>>>>> +#
>>>>>  # Since: 8.2
>>>>>  ##
>>>>>  { 'struct': 'TdxGuestProperties',
>>>>>    'data': { '*sept-ve-disable': 'bool',
>>>>>              '*mrconfigid': 'str',
>>>>>              '*mrowner': 'str',
>>>>> -            '*mrownerconfig': 'str' } }
>>>>> +            '*mrownerconfig': 'str',
>>>>> +            '*quote-generation-service': 'str' } }
>>>>
>>>> Why not type SocketAddress?
>>>
>>> Yes, the code uses SocketAddress internally when it eventually
>>> calls qio_channel_socket_connect_async(), so we should directly
>>> use SocketAddress in the QAPI from the start.
>>
>> Any benefit to directly use SocketAddress?
> 
> We don't want whatever code consumes the configuration to
> do a second level of parsing to convert the 'str' value
> into the 'SocketAddress' object it actually needs.
> 
> QEMU has a long history of having a second round of ad-hoc
> parsing of configuration and we've found it to be a serious
> maintenance burden. Thus we strive to have everything
> represented in QAPI using the desired final type, and avoid
> the second round of parsing.

Thanks for your explanation.

> 
>> "quote-generation-service" here is optional, it seems not trivial to add
>> and parse the SocketAddress type in QEMU command. After I change 'str'
>> to 'SocketAddress' and specify the command like "-object
>> tdx-guest,type=vsock,cid=2,port=1234...", it will report "invalid
>> parameter cid".
> 
> The -object parameter supports JSON syntax for this reason
> 
>    -object '{"qom-type":"tdx-guest","quote-generation-service":{"type": "vsock", "cid":"2","port":"1234"}}'
> 
> libvirt will always use the JSON syntax for -object with a new enough
> QEMU.

The JSON syntax works for me. We should then document that the JSON
syntax must be used whenever quote-generation-service is specified.

> 
> With regards,
> Daniel


* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-30  5:18           ` Chenyi Qiang
@ 2023-08-30  5:57             ` Xiaoyao Li
  2023-08-30  7:48               ` Daniel P. Berrangé
  0 siblings, 1 reply; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-30  5:57 UTC (permalink / raw)
  To: Chenyi Qiang, Daniel P. Berrangé
  Cc: Markus Armbruster, Paolo Bonzini, Richard Henderson,
	Michael S. Tsirkin, Marcel Apfelbaum, Igor Mammedov, Ani Sinha,
	Peter Xu, David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas

On 8/30/2023 1:18 PM, Chenyi Qiang wrote:
> 
> 
> On 8/29/2023 6:25 PM, Daniel P. Berrangé wrote:
>> On Tue, Aug 29, 2023 at 01:31:37PM +0800, Chenyi Qiang wrote:
>>>
>>>
>>> On 8/22/2023 4:24 PM, Daniel P. Berrangé wrote:
>>>> On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
>>>>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>>>>
>>>>>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>>>>>>
>>>>>> For GetQuote, delegate a request to Quote Generation Service.  Add property
>>>>>> of address of quote generation server and On request, connect to the
>>>>>> server, read request buffer from shared guest memory, send the request
>>>>>> buffer to the server and store the response into shared guest memory and
>>>>>> notify TD guest by interrupt.
>>>>>>
>>>>>> "quote-generation-service" is a property to specify Quote Generation
>>>>>> Service(QGS) in qemu socket address format.  The examples of the supported
>>>>>> format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
>>>>>>
>>>>>> command line example:
>>>>>>    qemu-system-x86_64 \
>>>>>>      -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
>>>>>>      -machine confidential-guest-support=tdx0
>>>>>>
>>>>>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>>>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>>>>> ---
>>>>>>   qapi/qom.json         |   5 +-
>>>>>>   target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>   target/i386/kvm/tdx.h |   7 +
>>>>>>   3 files changed, 391 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>>>>> index 87c1d440f331..37139949d761 100644
>>>>>> --- a/qapi/qom.json
>>>>>> +++ b/qapi/qom.json
>>>>>> @@ -879,13 +879,16 @@
>>>>>>   #
>>>>>>   # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
>>>>>>   #
>>>>>> +# @quote-generation-service: socket address for Quote Generation Service(QGS)
>>>>>> +#
>>>>>>   # Since: 8.2
>>>>>>   ##
>>>>>>   { 'struct': 'TdxGuestProperties',
>>>>>>     'data': { '*sept-ve-disable': 'bool',
>>>>>>               '*mrconfigid': 'str',
>>>>>>               '*mrowner': 'str',
>>>>>> -            '*mrownerconfig': 'str' } }
>>>>>> +            '*mrownerconfig': 'str',
>>>>>> +            '*quote-generation-service': 'str' } }
>>>>>
>>>>> Why not type SocketAddress?
>>>>
>>>> Yes, the code uses SocketAddress internally when it eventually
>>>> calls qio_channel_socket_connect_async(), so we should directly
>>>> use SocketAddress in the QAPI from the start.
>>>
>>> Any benefit to directly use SocketAddress?
>>
>> We don't want whatever code consumes the configuration to
>> do a second level of parsing to convert the 'str' value
>> into the 'SocketAddress' object it actually needs.
>>
>> QEMU has a long history of having a second round of ad-hoc
>> parsing of configuration and we've found it to be a serious
>> maintenance burden. Thus we strive to have everything
>> represented in QAPI using the desired final type, and avoid
>> the second round of parsing.
> 
> Thanks for your explanation.
> 
>>
>>> "quote-generation-service" here is optional, it seems not trivial to add
>>> and parse the SocketAddress type in QEMU command. After I change 'str'
>>> to 'SocketAddress' and specify the command like "-object
>>> tdx-guest,type=vsock,cid=2,port=1234...", it will report "invalid
>>> parameter cid".
>>
>> The -object parameter supports JSON syntax for this reason
>>
>>     -object '{"qom-type":"tdx-guest","quote-generation-service":{"type": "vsock", "cid":"2","port":"1234"}}'
>>
>> libvirt will always use the JSON syntax for -object with a new enough
>> QEMU.
> 
> The JSON syntax works for me. Then, we need to add some doc about using
> JSON syntax when quote-generation-service is required.

This limitation doesn't look reasonable to me.

@Daniel,

Is this acceptable to the QEMU community?

>>
>> With regards,
>> Daniel



* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-30  5:57             ` Xiaoyao Li
@ 2023-08-30  7:48               ` Daniel P. Berrangé
  2023-08-31  6:49                 ` Xiaoyao Li
  0 siblings, 1 reply; 120+ messages in thread
From: Daniel P. Berrangé @ 2023-08-30  7:48 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Chenyi Qiang, Markus Armbruster, Paolo Bonzini,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Igor Mammedov, Ani Sinha, Peter Xu, David Hildenbrand,
	Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas

On Wed, Aug 30, 2023 at 01:57:59PM +0800, Xiaoyao Li wrote:
> On 8/30/2023 1:18 PM, Chenyi Qiang wrote:
> > 
> > 
> > On 8/29/2023 6:25 PM, Daniel P. Berrangé wrote:
> > > On Tue, Aug 29, 2023 at 01:31:37PM +0800, Chenyi Qiang wrote:
> > > > 
> > > > 
> > > > On 8/22/2023 4:24 PM, Daniel P. Berrangé wrote:
> > > > > On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
> > > > > > Xiaoyao Li <xiaoyao.li@intel.com> writes:
> > > > > > 
> > > > > > > From: Isaku Yamahata <isaku.yamahata@intel.com>
> > > > > > > 
> > > > > > > For GetQuote, delegate a request to Quote Generation Service.  Add property
> > > > > > > of address of quote generation server and On request, connect to the
> > > > > > > server, read request buffer from shared guest memory, send the request
> > > > > > > buffer to the server and store the response into shared guest memory and
> > > > > > > notify TD guest by interrupt.
> > > > > > > 
> > > > > > > "quote-generation-service" is a property to specify Quote Generation
> > > > > > > Service(QGS) in qemu socket address format.  The examples of the supported
> > > > > > > format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
> > > > > > > 
> > > > > > > command line example:
> > > > > > >    qemu-system-x86_64 \
> > > > > > >      -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
> > > > > > >      -machine confidential-guest-support=tdx0
> > > > > > > 
> > > > > > > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> > > > > > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > > > > > ---
> > > > > > >   qapi/qom.json         |   5 +-
> > > > > > >   target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
> > > > > > >   target/i386/kvm/tdx.h |   7 +
> > > > > > >   3 files changed, 391 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/qapi/qom.json b/qapi/qom.json
> > > > > > > index 87c1d440f331..37139949d761 100644
> > > > > > > --- a/qapi/qom.json
> > > > > > > +++ b/qapi/qom.json
> > > > > > > @@ -879,13 +879,16 @@
> > > > > > >   #
> > > > > > >   # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
> > > > > > >   #
> > > > > > > +# @quote-generation-service: socket address for Quote Generation Service(QGS)
> > > > > > > +#
> > > > > > >   # Since: 8.2
> > > > > > >   ##
> > > > > > >   { 'struct': 'TdxGuestProperties',
> > > > > > >     'data': { '*sept-ve-disable': 'bool',
> > > > > > >               '*mrconfigid': 'str',
> > > > > > >               '*mrowner': 'str',
> > > > > > > -            '*mrownerconfig': 'str' } }
> > > > > > > +            '*mrownerconfig': 'str',
> > > > > > > +            '*quote-generation-service': 'str' } }
> > > > > > 
> > > > > > Why not type SocketAddress?
> > > > > 
> > > > > Yes, the code uses SocketAddress internally when it eventually
> > > > > calls qio_channel_socket_connect_async(), so we should directly
> > > > > use SocketAddress in the QAPI from the start.
> > > > 
> > > > Any benefit to directly use SocketAddress?
> > > 
> > > We don't want whatever code consumes the configuration to
> > > do a second level of parsing to convert the 'str' value
> > > into the 'SocketAddress' object it actually needs.
> > > 
> > > QEMU has a long history of having a second round of ad-hoc
> > > parsing of configuration and we've found it to be a serious
> > > maintenance burden. Thus we strive to have everything
> > > represented in QAPI using the desired final type, and avoid
> > > the second round of parsing.
> > 
> > Thanks for your explanation.
> > 
> > > 
> > > > "quote-generation-service" here is optional, it seems not trivial to add
> > > > and parse the SocketAddress type in QEMU command. After I change 'str'
> > > > to 'SocketAddress' and specify the command like "-object
> > > > tdx-guest,type=vsock,cid=2,port=1234...", it will report "invalid
> > > > parameter cid".
> > > 
> > > The -object parameter supports JSON syntax for this reason
> > > 
> > >     -object '{"qom-type":"tdx-guest","quote-generation-service":{"type": "vsock", "cid":"2","port":"1234"}}'
> > > 
> > > libvirt will always use the JSON syntax for -object with a new enough
> > > QEMU.
> > 
> > The JSON syntax works for me. Then, we need to add some doc about using
> > JSON syntax when quote-generation-service is required.
> 
> This limitation doesn't look reasonable to me.
> 
> @Daniel,
> 
> Is it acceptable by QEMU community?

This is the expected approach for object types which have non-scalar
properties.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



* Re: [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu()
  2023-08-30  1:45     ` Xiaoyao Li
@ 2023-08-30 16:54       ` Isaku Yamahata
  0 siblings, 0 replies; 120+ messages in thread
From: Isaku Yamahata @ 2023-08-30 16:54 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann, qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek,
	Isaku Yamahata, erdemaktas, Chenyi Qiang

On Wed, Aug 30, 2023 at 09:45:58AM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> On 8/29/2023 10:40 PM, Philippe Mathieu-Daudé wrote:
> > On 18/8/23 11:49, Xiaoyao Li wrote:
> > > Introduce kvm_arch_pre_create_vcpu() to perform arch-dependent
> > > work before any vcpu is created. This is for i386 TDX, which needs
> > > to call TDX_INIT_VM before creating any vcpu.
> > > 
> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > Acked-by: Gerd Hoffmann <kraxel@redhat.com>
> > > ---
> > >   accel/kvm/kvm-all.c  | 12 ++++++++++++
> > >   include/sysemu/kvm.h |  1 +
> > >   2 files changed, 13 insertions(+)
> > > 
> > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > > index c9f3aab5e587..5071af917ae0 100644
> > > --- a/accel/kvm/kvm-all.c
> > > +++ b/accel/kvm/kvm-all.c
> > > @@ -422,6 +422,11 @@ static int kvm_get_vcpu(KVMState *s, unsigned
> > > long vcpu_id)
> > >       return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
> > >   }
> > > +int __attribute__ ((weak)) kvm_arch_pre_create_vcpu(CPUState *cpu)
> > > +{
> > > +    return 0;
> > > +}
> > 
> > kvm_arch_init_vcpu() is implemented for each arch. Why not use the
> > same approach here?
> 
> Because only x86 needs it currently, for TDX. Other arches don't require an
> implementation.
> 
> Without the _weak_ function, every other arch would need an empty
> implementation (just returning 0) as a placeholder.
> If the QEMU community prefers that approach, I can switch to it in the next version.

An alternative is to move the hook into an x86-specific function rather than
the common KVM code. From a quick grep, x86_cpus_init() or x86_cpu_realizefn()
look like candidates.
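
For illustration, the weak-default pattern discussed above can be sketched in
isolation as follows. This is only a minimal sketch, not QEMU code:
arch_pre_create_vcpu() and create_vcpu() are hypothetical names, and
__attribute__((weak)) is a GCC/Clang extension.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for an arch hook. The weak definition is the
 * default; it is used only when no other object file provides a strong
 * definition of the same symbol.
 */
__attribute__((weak)) int arch_pre_create_vcpu(int vcpu_id)
{
    (void)vcpu_id;
    return 0; /* nothing to do by default */
}

/* Hypothetical common-code caller: run the hook, then "create" the vcpu. */
int create_vcpu(int vcpu_id)
{
    int ret = arch_pre_create_vcpu(vcpu_id);

    if (ret < 0) {
        return ret; /* propagate the arch-specific failure */
    }
    printf("vcpu %d created\n", vcpu_id);
    return 0;
}
```

An arch that needs the hook would supply a non-weak arch_pre_create_vcpu() in
its own file, and the linker would pick that one over the default. The
trade-off is that nothing forces an arch to notice the hook exists, which is
presumably why per-arch stubs or an x86-only call site are being considered.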
-- 
Isaku Yamahata <isaku.yamahata@linux.intel.com>


* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-30  7:48               ` Daniel P. Berrangé
@ 2023-08-31  6:49                 ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-08-31  6:49 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Chenyi Qiang, Markus Armbruster, Paolo Bonzini,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Igor Mammedov, Ani Sinha, Peter Xu, David Hildenbrand,
	Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas

On 8/30/2023 3:48 PM, Daniel P. Berrangé wrote:
> On Wed, Aug 30, 2023 at 01:57:59PM +0800, Xiaoyao Li wrote:
>> On 8/30/2023 1:18 PM, Chenyi Qiang wrote:
>>>
>>>
>>> On 8/29/2023 6:25 PM, Daniel P. Berrangé wrote:
>>>> On Tue, Aug 29, 2023 at 01:31:37PM +0800, Chenyi Qiang wrote:
>>>>>
>>>>>
>>>>> On 8/22/2023 4:24 PM, Daniel P. Berrangé wrote:
>>>>>> On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
>>>>>>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>>>>>>
>>>>>>>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>>>>>>>>
>>>>>>>> For GetQuote, delegate a request to Quote Generation Service.  Add property
>>>>>>>> of address of quote generation server and On request, connect to the
>>>>>>>> server, read request buffer from shared guest memory, send the request
>>>>>>>> buffer to the server and store the response into shared guest memory and
>>>>>>>> notify TD guest by interrupt.
>>>>>>>>
>>>>>>>> "quote-generation-service" is a property to specify Quote Generation
>>>>>>>> Service(QGS) in qemu socket address format.  The examples of the supported
>>>>>>>> format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
>>>>>>>>
>>>>>>>> command line example:
>>>>>>>>     qemu-system-x86_64 \
>>>>>>>>       -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
>>>>>>>>       -machine confidential-guest-support=tdx0
>>>>>>>>
>>>>>>>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>>>>>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>>>>>>> ---
>>>>>>>>    qapi/qom.json         |   5 +-
>>>>>>>>    target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>    target/i386/kvm/tdx.h |   7 +
>>>>>>>>    3 files changed, 391 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>>>>>>> index 87c1d440f331..37139949d761 100644
>>>>>>>> --- a/qapi/qom.json
>>>>>>>> +++ b/qapi/qom.json
>>>>>>>> @@ -879,13 +879,16 @@
>>>>>>>>    #
>>>>>>>>    # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
>>>>>>>>    #
>>>>>>>> +# @quote-generation-service: socket address for Quote Generation Service(QGS)
>>>>>>>> +#
>>>>>>>>    # Since: 8.2
>>>>>>>>    ##
>>>>>>>>    { 'struct': 'TdxGuestProperties',
>>>>>>>>      'data': { '*sept-ve-disable': 'bool',
>>>>>>>>                '*mrconfigid': 'str',
>>>>>>>>                '*mrowner': 'str',
>>>>>>>> -            '*mrownerconfig': 'str' } }
>>>>>>>> +            '*mrownerconfig': 'str',
>>>>>>>> +            '*quote-generation-service': 'str' } }
>>>>>>>
>>>>>>> Why not type SocketAddress?
>>>>>>
>>>>>> Yes, the code uses SocketAddress internally when it eventually
>>>>>> calls qio_channel_socket_connect_async(), so we should directly
>>>>>> use SocketAddress in the QAPI from the start.
>>>>>
>>>>> Any benefit to directly use SocketAddress?
>>>>
>>>> We don't want whatever code consumes the configuration to
>>>> do a second level of parsing to convert the 'str' value
>>>> into the 'SocketAddress' object it actually needs.
>>>>
>>>> QEMU has a long history of having a second round of ad-hoc
>>>> parsing of configuration and we've found it to be a serious
>>>> maintenance burden. Thus we strive to have everything
>>>> represented in QAPI using the desired final type, and avoid
>>>> the second round of parsing.
>>>
>>> Thanks for your explanation.
>>>
>>>>
>>>>> "quote-generation-service" here is optional, it seems not trivial to add
>>>>> and parse the SocketAddress type in QEMU command. After I change 'str'
>>>>> to 'SocketAddress' and specify the command like "-object
>>>>> tdx-guest,type=vsock,cid=2,port=1234...", it will report "invalid
>>>>> parameter cid".
>>>>
>>>> The -object parameter supports JSON syntax for this reason
>>>>
>>>>      -object '{"qom-type":"tdx-guest","quote-generation-service":{"type": "vsock", "cid":"2","port":"1234"}}'
>>>>
>>>> libvirt will always use the JSON syntax for -object with a new enough
>>>> QEMU.
>>>
>>> The JSON syntax works for me. Then, we need to add some doc about using
>>> JSON syntax when quote-generation-service is required.
>>
>> This limitation doesn't look reasonable to me.
>>
>> @Daniel,
>>
>> Is it acceptable by QEMU community?
> 
> This is the expected approach for object types which have non-scalar
> properties.

Learned it.

thanks!

> With regards,
> Daniel



* Re: [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote>
  2023-08-29  5:31       ` Chenyi Qiang
  2023-08-29 10:25         ` Daniel P. Berrangé
@ 2023-09-26 20:33         ` Markus Armbruster
  1 sibling, 0 replies; 120+ messages in thread
From: Markus Armbruster @ 2023-09-26 20:33 UTC (permalink / raw)
  To: Chenyi Qiang
  Cc: Daniel P. Berrangé,
	Markus Armbruster, Xiaoyao Li, Paolo Bonzini, Richard Henderson,
	Michael S. Tsirkin, Marcel Apfelbaum, Igor Mammedov, Ani Sinha,
	Peter Xu, David Hildenbrand, Philippe Mathieu-Daudé,
	Cornelia Huck, Eric Blake, Marcelo Tosatti, Gerd Hoffmann,
	qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas

I sent this reply to your question on the same day, but it got eaten by
malfunctioning servers, and I noticed only now after another failure
made me dig through my logs.  Sorry for the inconvenience!

Chenyi Qiang <chenyi.qiang@intel.com> writes:

> On 8/22/2023 4:24 PM, Daniel P. Berrangé wrote:
>> On Tue, Aug 22, 2023 at 08:52:30AM +0200, Markus Armbruster wrote:
>>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>>
>>>> From: Isaku Yamahata <isaku.yamahata@intel.com>
>>>>
>>>> For GetQuote, delegate a request to Quote Generation Service.  Add property
>>>> of address of quote generation server and On request, connect to the
>>>> server, read request buffer from shared guest memory, send the request
>>>> buffer to the server and store the response into shared guest memory and
>>>> notify TD guest by interrupt.
>>>>
>>>> "quote-generation-service" is a property to specify Quote Generation
>>>> Service(QGS) in qemu socket address format.  The examples of the supported
>>>> format are "vsock:2:1234", "unix:/run/qgs", "localhost:1234".
>>>>
>>>> command line example:
>>>>   qemu-system-x86_64 \
>>>>     -object 'tdx-guest,id=tdx0,quote-generation-service=localhost:1234' \
>>>>     -machine confidential-guest-support=tdx0
>>>>
>>>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
>>>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>>>> ---
>>>>  qapi/qom.json         |   5 +-
>>>>  target/i386/kvm/tdx.c | 380 ++++++++++++++++++++++++++++++++++++++++++
>>>>  target/i386/kvm/tdx.h |   7 +
>>>>  3 files changed, 391 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>>> index 87c1d440f331..37139949d761 100644
>>>> --- a/qapi/qom.json
>>>> +++ b/qapi/qom.json
>>>> @@ -879,13 +879,16 @@
>>>>  #
>>>>  # @mrownerconfig: MROWNERCONFIG SHA384 hex string of 48 * 2 length (default: 0)
>>>>  #
>>>> +# @quote-generation-service: socket address for Quote Generation Service(QGS)
>>>> +#
>>>>  # Since: 8.2
>>>>  ##
>>>>  { 'struct': 'TdxGuestProperties',
>>>>    'data': { '*sept-ve-disable': 'bool',
>>>>              '*mrconfigid': 'str',
>>>>              '*mrowner': 'str',
>>>> -            '*mrownerconfig': 'str' } }
>>>> +            '*mrownerconfig': 'str',
>>>> +            '*quote-generation-service': 'str' } }
>>>
>>> Why not type SocketAddress?
>> 
>> Yes, the code uses SocketAddress internally when it eventually
>> calls qio_channel_socket_connect_async(), so we should directly
>> use SocketAddress in the QAPI from the start.
>
> Any benefit to directly use SocketAddress?

Design principle: use JSON to encode structured data as text in
QAPI/QMP.

Do: "mumble": [1, 2, 3]

Don't: "mumble": "1,2,3"

Do: "server": { "type": "inet", "host": "localhost", "port": "12345" }

Don't: "server": "host=localhost,port=12345"

We violate the principle in a couple of places.  Some are arguably
mistakes, some are pragmatic exceptions.

The principle implies "the only parser QAPI needs is the JSON parser".

The other benefit is consistency with existing interfaces.  They use
SocketAddress (a few old ones use SocketAddressLegacy).
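
To make the point concrete, here is a small C sketch of what "no second
parser" means for the consumer. It is an assumption-laden illustration only:
AddrSketch and addr_describe() are hypothetical and merely resemble the shape
of a tagged union; they are not QEMU's generated SocketAddress type.

```c
#include <stdio.h>

/* Hypothetical tagged union, loosely modelled on a socket address. */
typedef enum { ADDR_INET, ADDR_UNIX, ADDR_VSOCK } AddrType;

typedef struct {
    AddrType type;
    union {
        struct { const char *host; const char *port; } inet;
        struct { const char *path; } un;
        struct { const char *cid; const char *port; } vsock;
    } u;
} AddrSketch;

/*
 * The consumer switches on the tag and reads typed fields directly.
 * There is no strtok()/sscanf() pass over a "vsock:2:1234" style string.
 * Returns the snprintf() result, or -1 for an unknown type.
 */
int addr_describe(const AddrSketch *a, char *buf, size_t len)
{
    switch (a->type) {
    case ADDR_INET:
        return snprintf(buf, len, "inet %s:%s", a->u.inet.host, a->u.inet.port);
    case ADDR_UNIX:
        return snprintf(buf, len, "unix %s", a->u.un.path);
    case ADDR_VSOCK:
        return snprintf(buf, len, "vsock cid=%s port=%s",
                        a->u.vsock.cid, a->u.vsock.port);
    }
    return -1;
}
```

With a 'str' property, every consumer would instead have to split and validate
the string itself, which is exactly the second parsing layer the principle is
meant to avoid.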

> "quote-generation-service" here is optional, it seems not trivial to add
> and parse the SocketAddress type in QEMU command. After I change 'str'
> to 'SocketAddress' and specify the command like "-object
> tdx-guest,type=vsock,cid=2,port=1234...", it will report "invalid
> parameter cid".

Try "quote-generation-service.port=1234".




* Re: [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions
  2023-08-18  9:49 ` [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions Xiaoyao Li
  2023-08-21 23:00   ` Isaku Yamahata
@ 2023-10-10  1:02   ` Tina Zhang
  2023-10-10  5:29     ` Xiaoyao Li
  1 sibling, 1 reply; 120+ messages in thread
From: Tina Zhang @ 2023-10-10  1:02 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

Hi,

On 8/18/23 17:49, Xiaoyao Li wrote:
> According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
> bits of TD can be classified into 6 types:
> 
> ------------------------------------------------------------------------
> 1 | As configured | configurable by VMM, independent of native value;
> ------------------------------------------------------------------------
> 2 | As configured | configurable by VMM if the bit is supported natively;
>   | (if native)   | otherwise it equals the native value (0).
> ------------------------------------------------------------------------
> 3 | Fixed         | fixed to 0/1
> ------------------------------------------------------------------------
> 4 | Native        | reflect the native value
> ------------------------------------------------------------------------
> 5 | Calculated    | calculated by TDX module.
> ------------------------------------------------------------------------
> 6 | Inducing #VE  | get #VE exception
> ------------------------------------------------------------------------
> 
> Note:
> 1. All the configurable XFAM related features and TD attributes related
>     features fall into type #2. And fixed0/1 bits of XFAM and TD
>     attributes fall into type #3.
> 
> 2. For CPUID leaves not listed in "CPUID virtualization Overview" table
>     in TDX module spec, TDX module injects #VE to TDs when those are
>     queried. For this case, TDs can request CPUID emulation from VMM via
>     TDVMCALL and the values are fully controlled by VMM.
> 
> Because the TDX module has its own virtualization policy for CPUID bits,
> what KVM_GET_SUPPORTED_CPUID reports diverges from the CPUID bits actually
> supported for TDs. To keep the CPUID configuration consistent between the
> VMM and TDs, adjust the supported CPUID for TDs based on TDX
> restrictions.
> 
> Currently only focus on the CPUID leaves recognized by QEMU's
> feature_word_info[] that are indexed by a FeatureWord.
> 
> Introduce a TDX CPUID lookup table, which maintains 1 entry for each
> FeatureWord. Each entry has below fields:
> 
>   - tdx_fixed0/1: The bits that are fixed to 0/1;
> 
>   - vmm_fixup:   The bits that are configurable from the TDX module's point
>                  of view but require VMM emulation when enabled. They are
>                  not supported unless the VMM reports them as supported, so
>                  they need to be fixed up by checking whether the VMM
>                  supports them.
> 
>   - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
>                  fully configurable by the VMM.
> 
>   - supported_on_ve: Valid only when @inducing_ve is true. It represents
>                      the maximum feature set that can be emulated
>                      for TDs.
> 
> By applying TDX CPUID lookup table and TDX capabilities reported from
> TDX module, the supported CPUID for TDs can be obtained from following
> steps:
> 
> - get the base of VMM supported feature set;
> 
> - if the leaf is not a FeatureWord, just return VMM's value without
>    modification;
> 
> - if the leaf is an inducing_ve type, apply the supported_on_ve mask and
>    return;
> 
> - include all native bits; this covers types #2, #4, and part of type #1.
>    (It also includes some unsupported bits. The following step
>     corrects that.)
> 
> - apply fixed0/1 to it (this covers #3 and rectifies the previous step);
> 
> - add configurable bits (this covers the other part of type #1);
> 
> - fix up the bits in vmm_fixup;
> 
> - filter by the .supported field where it is valid;
> 
> (Calculated type is ignored since it's determined at runtime).
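
The steps listed above can be sketched as a standalone bitmask computation.
This is a reading of the commit message only, not the code from the patch:
CpuidLookupSketch and td_supported_cpuid_bits() are hypothetical names, the
inputs (vmm_supported, native, configurable, supported_mask) are assumed to be
supplied by the caller, and supported_mask == 0 is used here to mean "no valid
.supported field".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the per-FeatureWord lookup entry described above. */
typedef struct {
    uint32_t tdx_fixed0;      /* bits fixed to 0 */
    uint32_t tdx_fixed1;      /* bits fixed to 1 */
    uint32_t vmm_fixup;       /* configurable, but need VMM emulation */
    bool inducing_ve;         /* leaf triggers #VE in the TD */
    uint32_t supported_on_ve; /* valid only when inducing_ve is true */
} CpuidLookupSketch;

uint32_t td_supported_cpuid_bits(const CpuidLookupSketch *e,
                                 uint32_t vmm_supported,
                                 uint32_t native,
                                 uint32_t configurable,
                                 uint32_t supported_mask)
{
    uint32_t ret = vmm_supported;           /* base: what the VMM supports */

    /* A bit cannot be fixed to 0 and to 1 at the same time. */
    assert((e->tdx_fixed0 & e->tdx_fixed1) == 0);

    if (e->inducing_ve) {
        return ret & e->supported_on_ve;    /* fully VMM-controlled leaf */
    }

    ret |= native;                          /* include all native bits */
    ret |= e->tdx_fixed1;                   /* apply fixed1 ... */
    ret &= ~e->tdx_fixed0;                  /* ... and fixed0 */
    ret |= configurable;                    /* add configurable bits */
    ret &= ~(e->vmm_fixup & ~vmm_supported); /* drop fixup bits VMM lacks */

    if (supported_mask) {
        ret &= supported_mask;              /* filter by a valid mask */
    }
    return ret;
}
```

As a worked case: with tdx_fixed0 = 0x1, tdx_fixed1 = 0x2, vmm_fixup = 0x4,
vmm_supported = 0x8, native = 0x15, configurable = 0x20 and no supported
mask, the intermediate values are 0x1d, 0x1f, 0x1e, 0x3e, and the result is
0x3a, because bit 2 is in vmm_fixup but not reported by the VMM.
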
> 
> Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>   target/i386/cpu.h     |  16 +++
>   target/i386/kvm/kvm.c |   4 +
>   target/i386/kvm/tdx.c | 254 ++++++++++++++++++++++++++++++++++++++++++
>   target/i386/kvm/tdx.h |   2 +
>   4 files changed, 276 insertions(+)
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index e0771a10433b..c93dcd274531 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -780,6 +780,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>   
>   /* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
>   #define CPUID_7_0_EBX_FSGSBASE          (1U << 0)
> +/* Support for TSC adjustment MSR 0x3B */
> +#define CPUID_7_0_EBX_TSC_ADJUST        (1U << 1)
>   /* Support SGX */
>   #define CPUID_7_0_EBX_SGX               (1U << 2)
>   /* 1st Group of Advanced Bit Manipulation Extensions */
> @@ -798,8 +800,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>   #define CPUID_7_0_EBX_INVPCID           (1U << 10)
>   /* Restricted Transactional Memory */
>   #define CPUID_7_0_EBX_RTM               (1U << 11)
> +/* Cache QoS Monitoring */
> +#define CPUID_7_0_EBX_PQM               (1U << 12)
>   /* Memory Protection Extension */
>   #define CPUID_7_0_EBX_MPX               (1U << 14)
> +/* Resource Director Technology Allocation */
> +#define CPUID_7_0_EBX_RDT_A             (1U << 15)
>   /* AVX-512 Foundation */
>   #define CPUID_7_0_EBX_AVX512F           (1U << 16)
>   /* AVX-512 Doubleword & Quadword Instruction */
> @@ -855,10 +861,16 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>   #define CPUID_7_0_ECX_AVX512VNNI        (1U << 11)
>   /* Support for VPOPCNT[B,W] and VPSHUFBITQMB */
>   #define CPUID_7_0_ECX_AVX512BITALG      (1U << 12)
> +/* Intel Total Memory Encryption */
> +#define CPUID_7_0_ECX_TME               (1U << 13)
>   /* POPCNT for vectors of DW/QW */
>   #define CPUID_7_0_ECX_AVX512_VPOPCNTDQ  (1U << 14)
> +/* Placeholder for bit 15 */
> +#define CPUID_7_0_ECX_FZM               (1U << 15)
>   /* 5-level Page Tables */
>   #define CPUID_7_0_ECX_LA57              (1U << 16)
> +/* MAWAU for MPX */
> +#define CPUID_7_0_ECX_MAWAU             (31U << 17)
>   /* Read Processor ID */
>   #define CPUID_7_0_ECX_RDPID             (1U << 22)
>   /* Bus Lock Debug Exception */
> @@ -869,6 +881,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>   #define CPUID_7_0_ECX_MOVDIRI           (1U << 27)
>   /* Move 64 Bytes as Direct Store Instruction */
>   #define CPUID_7_0_ECX_MOVDIR64B         (1U << 28)
> +/* ENQCMD and ENQCMDS instructions */
> +#define CPUID_7_0_ECX_ENQCMD            (1U << 29)
>   /* Support SGX Launch Control */
>   #define CPUID_7_0_ECX_SGX_LC            (1U << 30)
>   /* Protection Keys for Supervisor-mode Pages */
> @@ -886,6 +900,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>   #define CPUID_7_0_EDX_SERIALIZE         (1U << 14)
>   /* TSX Suspend Load Address Tracking instruction */
>   #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
> +/* PCONFIG instruction */
> +#define CPUID_7_0_EDX_PCONFIG           (1U << 18)
>   /* Architectural LBRs */
>   #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
>   /* AMX_BF16 instruction */
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index ec5c07bffd38..46a455a1e331 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -539,6 +539,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
>           ret |= 1U << KVM_HINTS_REALTIME;
>       }
>   
> +    if (is_tdx_vm()) {
> +        tdx_get_supported_cpuid(function, index, reg, &ret);
> +    }
> +
>       return ret;
>   }
>   
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 56cb826f6125..3198bc9fd5fb 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -15,11 +15,129 @@
>   #include "qemu/error-report.h"
>   #include "qapi/error.h"
>   #include "qom/object_interfaces.h"
> +#include "standard-headers/asm-x86/kvm_para.h"
>   #include "sysemu/kvm.h"
> +#include "sysemu/sysemu.h"
>   
>   #include "hw/i386/x86.h"
>   #include "kvm_i386.h"
>   #include "tdx.h"
> +#include "../cpu-internal.h"
> +
> +#define TDX_SUPPORTED_KVM_FEATURES  ((1U << KVM_FEATURE_NOP_IO_DELAY) | \
> +                                     (1U << KVM_FEATURE_PV_UNHALT) | \
> +                                     (1U << KVM_FEATURE_PV_TLB_FLUSH) | \
> +                                     (1U << KVM_FEATURE_PV_SEND_IPI) | \
> +                                     (1U << KVM_FEATURE_POLL_CONTROL) | \
> +                                     (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
> +                                     (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
> +
> +typedef struct KvmTdxCpuidLookup {
> +    uint32_t tdx_fixed0;
> +    uint32_t tdx_fixed1;
> +
> +    /*
> +     * The CPUID bits that are configurable from the view of TDX module
> +     * but require VMM emulation if configured to enabled by VMM.
> +     *
> +     * For those bits, they cannot be enabled actually if VMM (KVM/QEMU) cannot
> +     * virtualize them.
> +     */
> +    uint32_t vmm_fixup;
> +
> +    bool inducing_ve;
> +    /*
> +     * The maximum supported feature set for given inducing-#VE leaf.
> +     * It's valid only when .inducing_ve is true.
> +     */
> +    uint32_t supported_on_ve;
> +} KvmTdxCpuidLookup;
> +
> + /*
> +  * QEMU maintained TDX CPUID lookup tables, which reflects how CPUIDs are
> +  * virtualized for guest TDs based on "CPUID virtualization" of TDX spec.
> +  *
> +  * Note:
> +  *
> +  * This table will be updated runtime by tdx_caps reported by platform.
> +  *
> +  */
> +static KvmTdxCpuidLookup tdx_cpuid_lookup[FEATURE_WORDS] = {
> +    [FEAT_1_EDX] = {
> +        .tdx_fixed0 =
> +            BIT(10) /* Reserved */ | BIT(20) /* Reserved */ | CPUID_IA64,
> +        .tdx_fixed1 =
> +            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_APIC |
> +            CPUID_MTRR | CPUID_MCA | CPUID_CLFLUSH | CPUID_DTS,
> +        .vmm_fixup =
> +            CPUID_ACPI | CPUID_PBE,
CPUID_HT might also be needed here, as QEMU disables it when the TD
guest has only a single processor core.

Regards,
-Tina



* Re: [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions
  2023-10-10  1:02   ` Tina Zhang
@ 2023-10-10  5:29     ` Xiaoyao Li
  0 siblings, 0 replies; 120+ messages in thread
From: Xiaoyao Li @ 2023-10-10  5:29 UTC (permalink / raw)
  To: Tina Zhang, Paolo Bonzini, Richard Henderson, Michael S. Tsirkin,
	Marcel Apfelbaum, Igor Mammedov, Ani Sinha, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé,
	Daniel P. Berrangé,
	Cornelia Huck, Eric Blake, Markus Armbruster, Marcelo Tosatti,
	Gerd Hoffmann
  Cc: qemu-devel, kvm, Eduardo Habkost, Laszlo Ersek, Isaku Yamahata,
	erdemaktas, Chenyi Qiang

On 10/10/2023 9:02 AM, Tina Zhang wrote:
> Hi,
> 
> On 8/18/23 17:49, Xiaoyao Li wrote:
>> According to Chapter "CPUID Virtualization" in TDX module spec, CPUID
>> bits of TD can be classified into 6 types:
>>
>> ------------------------------------------------------------------------
>> 1 | As configured | configurable by VMM, independent of native value;
>> ------------------------------------------------------------------------
>> 2 | As configured | configurable by VMM if the bit is supported natively
>>      (if native)   | Otherwise it equals as native(0).
>> ------------------------------------------------------------------------
>> 3 | Fixed         | fixed to 0/1
>> ------------------------------------------------------------------------
>> 4 | Native        | reflect the native value
>> ------------------------------------------------------------------------
>> 5 | Calculated    | calculated by TDX module.
>> ------------------------------------------------------------------------
>> 6 | Inducing #VE  | get #VE exception
>> ------------------------------------------------------------------------
>>
>> Note:
>> 1. All the configurable XFAM related features and TD attributes related
>>     features fall into type #2. And fixed0/1 bits of XFAM and TD
>>     attributes fall into type #3.
>>
>> 2. For CPUID leaves not listed in "CPUID virtualization Overview" table
>>     in TDX module spec, TDX module injects #VE to TDs when those are
>>     queried. For this case, TDs can request CPUID emulation from VMM via
>>     TDVMCALL and the values are fully controlled by VMM.
>>
>> Because the TDX module has its own virtualization policy for CPUID bits,
>> what is reported via KVM_GET_SUPPORTED_CPUID diverges from the supported
>> CPUID bits for TDs. To keep the CPUID configuration consistent between
>> the VMM and TDs, adjust the supported CPUID for TDs based on TDX
>> restrictions.
>>
>> Currently this only covers the CPUID leaves recognized by QEMU's
>> feature_word_info[] that are indexed by a FeatureWord.
>>
>> Introduce a TDX CPUID lookup table, which maintains one entry for each
>> FeatureWord. Each entry has the following fields:
>>
>>   - tdx_fixed0/1: The bits that are fixed as 0/1;
>>
>>   - vmm_fixup:   The bits that are configurable from the view of the TDX
>>                  module, but that require VMM emulation when they are
>>                  configured as enabled. They are not supported if the VMM
>>                  doesn't report them as supported, so they need to be
>>                  fixed up by checking whether the VMM supports them.
>>
>>   - inducing_ve: TD gets #VE when querying this CPUID leaf. The result is
>>                  totally configurable by VMM.
>>
>>   - supported_on_ve: It's valid only when @inducing_ve is true. It
>>                  represents the maximum feature set that can be emulated
>>                  for TDs.
>>
>> By applying TDX CPUID lookup table and TDX capabilities reported from
>> TDX module, the supported CPUID for TDs can be obtained from following
>> steps:
>>
>> - get the base of VMM supported feature set;
>>
>> - if the leaf is not a FeatureWord just return VMM's value without
>>    modification;
>>
>> - if the leaf is an inducing_ve type, apply the supported_on_ve mask and
>>    return;
>>
>> - include all native bits; this covers types #2 and #4, and part of
>>    type #1 (it also includes some unsupported bits, which the following
>>    step corrects);
>>
>> - apply fixed0/1 to it (this covers type #3 and rectifies the previous
>>    step);
>>
>> - add configurable bits (this covers the other part of type #1);
>>
>> - fix up the bits in vmm_fixup;
>>
>> - filter the ones that have a valid .supported field;
>>
>> (Calculated type is ignored since it's determined at runtime).
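
The steps in the quoted commit message can be sketched as a sequence of mask operations. This is a minimal, self-contained illustration under stated assumptions, not the patch's tdx_get_supported_cpuid(): the struct and function names are hypothetical, the native and configurable masks are passed in as plain arguments rather than read from the host or from tdx_caps, and the final ".supported" filtering step is left out because the quoted text does not define it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified mirror of one lookup-table entry. */
typedef struct {
    uint32_t fixed0;           /* bits the TDX module fixes to 0 */
    uint32_t fixed1;           /* bits the TDX module fixes to 1 */
    uint32_t vmm_fixup;        /* configurable bits that need VMM emulation */
    bool     inducing_ve;      /* leaf causes #VE; VMM controls the value */
    uint32_t supported_on_ve;  /* valid only when inducing_ve is true */
} TdxCpuidEntrySketch;

/*
 * Hypothetical helper: adjust one 32-bit CPUID register value.
 *   e             lookup entry, or NULL if the leaf is not a FeatureWord
 *   vmm_supported value the VMM reports as supported
 *   native        value read from the host CPU (assumed input)
 *   configurable  bits the TDX module lets the VMM configure (assumed input)
 */
static uint32_t tdx_adjust_supported_sketch(const TdxCpuidEntrySketch *e,
                                            uint32_t vmm_supported,
                                            uint32_t native,
                                            uint32_t configurable)
{
    uint32_t ret = vmm_supported;      /* base: VMM supported feature set */

    if (e == NULL) {
        return ret;                    /* not a FeatureWord: unchanged */
    }
    if (e->inducing_ve) {
        return ret & e->supported_on_ve;
    }

    ret |= native;                     /* types #2, #4 and part of #1 */
    ret |= e->fixed1;                  /* type #3, fixed to 1 */
    ret &= ~e->fixed0;                 /* type #3, fixed to 0 */
    ret |= configurable;               /* remaining part of type #1 */

    /* Drop vmm_fixup bits that the VMM did not report as supported. */
    ret &= ~(e->vmm_fixup & ~vmm_supported);

    return ret;
}
```

With fixed0 = 0x4, fixed1 = 0x1 and vmm_fixup = 0x30, calling the helper with vmm_supported = 0x14, native = 0x26 and configurable = 0x28 yields 0x1B: bit 2 is cleared by fixed0, and bit 5 is cleared because it needs VMM emulation that the VMM did not report.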
>>
>> Co-developed-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   target/i386/cpu.h     |  16 +++
>>   target/i386/kvm/kvm.c |   4 +
>>   target/i386/kvm/tdx.c | 254 ++++++++++++++++++++++++++++++++++++++++++
>>   target/i386/kvm/tdx.h |   2 +
>>   4 files changed, 276 insertions(+)
>>
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index e0771a10433b..c93dcd274531 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -780,6 +780,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   /* Support RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE */
>>   #define CPUID_7_0_EBX_FSGSBASE          (1U << 0)
>> +/* Support for TSC adjustment MSR 0x3B */
>> +#define CPUID_7_0_EBX_TSC_ADJUST        (1U << 1)
>>   /* Support SGX */
>>   #define CPUID_7_0_EBX_SGX               (1U << 2)
>>   /* 1st Group of Advanced Bit Manipulation Extensions */
>> @@ -798,8 +800,12 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_EBX_INVPCID           (1U << 10)
>>   /* Restricted Transactional Memory */
>>   #define CPUID_7_0_EBX_RTM               (1U << 11)
>> +/* Cache QoS Monitoring */
>> +#define CPUID_7_0_EBX_PQM               (1U << 12)
>>   /* Memory Protection Extension */
>>   #define CPUID_7_0_EBX_MPX               (1U << 14)
>> +/* Resource Director Technology Allocation */
>> +#define CPUID_7_0_EBX_RDT_A             (1U << 15)
>>   /* AVX-512 Foundation */
>>   #define CPUID_7_0_EBX_AVX512F           (1U << 16)
>>   /* AVX-512 Doubleword & Quadword Instruction */
>> @@ -855,10 +861,16 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_ECX_AVX512VNNI        (1U << 11)
>>   /* Support for VPOPCNT[B,W] and VPSHUFBITQMB */
>>   #define CPUID_7_0_ECX_AVX512BITALG      (1U << 12)
>> +/* Intel Total Memory Encryption */
>> +#define CPUID_7_0_ECX_TME               (1U << 13)
>>   /* POPCNT for vectors of DW/QW */
>>   #define CPUID_7_0_ECX_AVX512_VPOPCNTDQ  (1U << 14)
>> +/* Placeholder for bit 15 */
>> +#define CPUID_7_0_ECX_FZM               (1U << 15)
>>   /* 5-level Page Tables */
>>   #define CPUID_7_0_ECX_LA57              (1U << 16)
>> +/* MAWAU for MPX */
>> +#define CPUID_7_0_ECX_MAWAU             (31U << 17)
>>   /* Read Processor ID */
>>   #define CPUID_7_0_ECX_RDPID             (1U << 22)
>>   /* Bus Lock Debug Exception */
>> @@ -869,6 +881,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_ECX_MOVDIRI           (1U << 27)
>>   /* Move 64 Bytes as Direct Store Instruction */
>>   #define CPUID_7_0_ECX_MOVDIR64B         (1U << 28)
>> +/* ENQCMD and ENQCMDS instructions */
>> +#define CPUID_7_0_ECX_ENQCMD            (1U << 29)
>>   /* Support SGX Launch Control */
>>   #define CPUID_7_0_ECX_SGX_LC            (1U << 30)
>>   /* Protection Keys for Supervisor-mode Pages */
>> @@ -886,6 +900,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
>>   #define CPUID_7_0_EDX_SERIALIZE         (1U << 14)
>>   /* TSX Suspend Load Address Tracking instruction */
>>   #define CPUID_7_0_EDX_TSX_LDTRK         (1U << 16)
>> +/* PCONFIG instruction */
>> +#define CPUID_7_0_EDX_PCONFIG           (1U << 18)
>>   /* Architectural LBRs */
>>   #define CPUID_7_0_EDX_ARCH_LBR          (1U << 19)
>>   /* AMX_BF16 instruction */
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index ec5c07bffd38..46a455a1e331 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -539,6 +539,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
>>           ret |= 1U << KVM_HINTS_REALTIME;
>>       }
>> +    if (is_tdx_vm()) {
>> +        tdx_get_supported_cpuid(function, index, reg, &ret);
>> +    }
>> +
>>       return ret;
>>   }
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index 56cb826f6125..3198bc9fd5fb 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -15,11 +15,129 @@
>>   #include "qemu/error-report.h"
>>   #include "qapi/error.h"
>>   #include "qom/object_interfaces.h"
>> +#include "standard-headers/asm-x86/kvm_para.h"
>>   #include "sysemu/kvm.h"
>> +#include "sysemu/sysemu.h"
>>   #include "hw/i386/x86.h"
>>   #include "kvm_i386.h"
>>   #include "tdx.h"
>> +#include "../cpu-internal.h"
>> +
>> +#define TDX_SUPPORTED_KVM_FEATURES  ((1U << KVM_FEATURE_NOP_IO_DELAY) | \
>> +                                     (1U << KVM_FEATURE_PV_UNHALT) | \
>> +                                     (1U << KVM_FEATURE_PV_TLB_FLUSH) | \
>> +                                     (1U << KVM_FEATURE_PV_SEND_IPI) | \
>> +                                     (1U << KVM_FEATURE_POLL_CONTROL) | \
>> +                                     (1U << KVM_FEATURE_PV_SCHED_YIELD) | \
>> +                                     (1U << KVM_FEATURE_MSI_EXT_DEST_ID))
>> +
>> +typedef struct KvmTdxCpuidLookup {
>> +    uint32_t tdx_fixed0;
>> +    uint32_t tdx_fixed1;
>> +
>> +    /*
>> +     * The CPUID bits that are configurable from the view of TDX module
>> +     * but require VMM emulation if configured to enabled by VMM.
>> +     *
>> +     * For those bits, they cannot be enabled actually if VMM (KVM/QEMU) cannot
>> +     * virtualize them.
>> +     */
>> +    uint32_t vmm_fixup;
>> +
>> +    bool inducing_ve;
>> +    /*
>> +     * The maximum supported feature set for given inducing-#VE leaf.
>> +     * It's valid only when .inducing_ve is true.
>> +     */
>> +    uint32_t supported_on_ve;
>> +} KvmTdxCpuidLookup;
>> +
>> + /*
>> +  * QEMU maintained TDX CPUID lookup tables, which reflects how CPUIDs are
>> +  * virtualized for guest TDs based on "CPUID virtualization" of TDX spec.
>> +  *
>> +  * Note:
>> +  *
>> +  * This table will be updated runtime by tdx_caps reported by platform.
>> +  *
>> +  */
>> +static KvmTdxCpuidLookup tdx_cpuid_lookup[FEATURE_WORDS] = {
>> +    [FEAT_1_EDX] = {
>> +        .tdx_fixed0 =
>> +            BIT(10) /* Reserved */ | BIT(20) /* Reserved */ | CPUID_IA64,
>> +        .tdx_fixed1 =
>> +            CPUID_MSR | CPUID_PAE | CPUID_MCE | CPUID_APIC |
>> +            CPUID_MTRR | CPUID_MCA | CPUID_CLFLUSH | CPUID_DTS,
>> +        .vmm_fixup =
>> +            CPUID_ACPI | CPUID_PBE,
> CPUID_HT might also be needed here, as it's disabled by QEMU when TD 
> guest only has a single processor core.

Adding CPUID_HT here does not seem to be the correct fix. The root cause
is that CPUID_HT is wrongly treated as an auto_enabled bit; I will send a
fix separately.
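
To illustrate the behavior under discussion, here is a minimal sketch of a CPUID.1:EDX value in which the HT bit (bit 28) depends on the number of logical processors. The helper and macro names are hypothetical and this is not QEMU's code; it only models the observation that a single-processor guest ends up without CPUID_HT.

```c
#include <stdint.h>

/* CPUID.1:EDX bit 28 (HTT); local name to avoid implying QEMU's macro. */
#define SKETCH_CPUID_HT (1U << 28)

/*
 * Hypothetical helper: report HT only when the guest has more than one
 * logical processor. Clearing the bit otherwise is an assumption made
 * for this illustration.
 */
static uint32_t sketch_leaf1_edx(uint32_t edx, unsigned nr_cores,
                                 unsigned nr_threads)
{
    if (nr_cores * nr_threads > 1) {
        return edx | SKETCH_CPUID_HT;
    }
    return edx & ~SKETCH_CPUID_HT;
}
```

A value derived this way depends on the guest topology, so it cannot be described by a static per-FeatureWord mask alone.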

> Regards,
> -Tina
> 



end of thread, other threads:[~2023-10-10  5:30 UTC | newest]

Thread overview: 120+ messages
2023-08-18  9:49 [PATCH v2 00/58] TDX QEMU support Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 01/58] *** HACK *** linux-headers: Update headers to pull in TDX API changes Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 02/58] i386: Introduce tdx-guest object Xiaoyao Li
2023-08-22  6:22   ` Markus Armbruster
2023-08-23  7:27     ` Xiaoyao Li
2023-08-23 11:14       ` Markus Armbruster
2023-08-18  9:49 ` [PATCH v2 03/58] target/i386: Parse TDX vm type Xiaoyao Li
2023-08-21  8:27   ` Daniel P. Berrangé
2023-08-21 13:37     ` Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 04/58] target/i386: Introduce kvm_confidential_guest_init() Xiaoyao Li
2023-08-29 14:42   ` Philippe Mathieu-Daudé
2023-08-18  9:49 ` [PATCH v2 05/58] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 06/58] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES Xiaoyao Li
2023-08-21  8:46   ` Daniel P. Berrangé
2023-08-22  7:31     ` Xiaoyao Li
2023-08-22  8:19       ` Daniel P. Berrangé
2023-08-18  9:49 ` [PATCH v2 07/58] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object Xiaoyao Li
2023-08-21  8:48   ` Daniel P. Berrangé
2023-08-22  7:46     ` Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 08/58] i386/tdx: Adjust the supported CPUID based on TDX restrictions Xiaoyao Li
2023-08-21 23:00   ` Isaku Yamahata
2023-08-23  3:59     ` Xiaoyao Li
2023-10-10  1:02   ` Tina Zhang
2023-10-10  5:29     ` Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 09/58] i386/tdx: Update tdx_cpuid_lookup[].tdx_fixed0/1 by tdx_caps.cpuid_config[] Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 10/58] i386/tdx: Integrate tdx_caps->xfam_fixed0/1 into tdx_cpuid_lookup Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 11/58] i386/tdx: Integrate tdx_caps->attrs_fixed0/1 to tdx_cpuid_lookup Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 12/58] i386/kvm: Move architectural CPUID leaf generation to separate helper Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 13/58] kvm: Introduce kvm_arch_pre_create_vcpu() Xiaoyao Li
2023-08-21  8:55   ` Daniel P. Berrangé
2023-08-29 14:40   ` Philippe Mathieu-Daudé
2023-08-30  1:45     ` Xiaoyao Li
2023-08-30 16:54       ` Isaku Yamahata
2023-08-18  9:49 ` [PATCH v2 14/58] i386/tdx: Initialize TDX before creating TD vcpus Xiaoyao Li
2023-08-21  8:54   ` Daniel P. Berrangé
2023-08-18  9:49 ` [PATCH v2 15/58] i386/tdx: Add property sept-ve-disable for tdx-guest object Xiaoyao Li
2023-08-21  8:59   ` Daniel P. Berrangé
2023-08-22  6:27     ` Markus Armbruster
2023-08-22  8:39       ` Xiaoyao Li
2023-08-18  9:49 ` [PATCH v2 16/58] i386/tdx: Make sept_ve_disable set by default Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 17/58] i386/tdx: Wire CPU features up with attributes of TD guest Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 18/58] i386/tdx: Validate TD attributes Xiaoyao Li
2023-08-21  9:16   ` Daniel P. Berrangé
2023-08-22 14:21     ` Xiaoyao Li
2023-08-22 14:30     ` Xiaoyao Li
2023-08-22 14:42       ` Daniel P. Berrangé
2023-08-23  7:31         ` Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 19/58] qom: implement property helper for sha384 Xiaoyao Li
2023-08-21  9:25   ` Daniel P. Berrangé
2023-08-21 23:28     ` Isaku Yamahata
2023-08-18  9:50 ` [PATCH v2 20/58] i386/tdx: Allows mrconfigid/mrowner/mrownerconfig for TDX_INIT_VM Xiaoyao Li
2023-08-21  9:29   ` Daniel P. Berrangé
2023-08-22  6:35     ` Markus Armbruster
2023-08-18  9:50 ` [PATCH v2 21/58] i386/tdx: Implement user specified tsc frequency Xiaoyao Li
2023-08-21  9:30   ` Daniel P. Berrangé
2023-08-18  9:50 ` [PATCH v2 22/58] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 23/58] i386/tdx: Make memory type private by default Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 24/58] i386/tdx: Create kvm gmem for TD Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 25/58] kvm/tdx: Don't complain when converting vMMIO region to shared Xiaoyao Li
2023-08-21  9:34   ` Daniel P. Berrangé
2023-08-18  9:50 ` [PATCH v2 26/58] kvm/tdx: Ignore memory conversion to shared of unassigned region Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 27/58] i386/tdvf: Introduce function to parse TDVF metadata Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 28/58] i386/tdx: Parse TDVF metadata for TDX VM Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 29/58] i386/tdx: Skip BIOS shadowing setup Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 30/58] i386/tdx: Don't initialize pc.rom for TDX VMs Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 31/58] i386/tdx: Track mem_ptr for each firmware entry of TDVF Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 32/58] i386/tdx: Track RAM entries for TDX VM Xiaoyao Li
2023-08-21  9:38   ` Daniel P. Berrangé
2023-08-22 15:39     ` Xiaoyao Li
2023-08-21 23:40   ` Isaku Yamahata
2023-08-22 15:45     ` Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 33/58] headers: Add definitions from UEFI spec for volumes, resources, etc Xiaoyao Li
2023-08-23 19:41   ` Isaku Yamahata
2023-08-24  7:50     ` Xiaoyao Li
2023-08-24  7:55       ` Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 34/58] i386/tdx: Setup the TD HOB list Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 35/58] i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 36/58] memory: Introduce memory_region_init_ram_gmem() Xiaoyao Li
2023-08-21  9:40   ` Daniel P. Berrangé
2023-08-29 14:33   ` Philippe Mathieu-Daudé
2023-08-30  1:53     ` Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 37/58] i386/tdx: register TDVF as private memory Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 38/58] i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 39/58] i386/tdx: Finalize TDX VM Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 40/58] i386/tdx: handle TDG.VP.VMCALL<SetupEventNotifyInterrupt> Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 41/58] i386/tdx: handle TDG.VP.VMCALL<GetQuote> Xiaoyao Li
2023-08-22  6:52   ` Markus Armbruster
2023-08-22  8:24     ` Daniel P. Berrangé
2023-08-29  5:31       ` Chenyi Qiang
2023-08-29 10:25         ` Daniel P. Berrangé
2023-08-30  5:18           ` Chenyi Qiang
2023-08-30  5:57             ` Xiaoyao Li
2023-08-30  7:48               ` Daniel P. Berrangé
2023-08-31  6:49                 ` Xiaoyao Li
2023-09-26 20:33         ` Markus Armbruster
2023-08-18  9:50 ` [PATCH v2 42/58] i386/tdx: register the fd read callback with the main loop to read the quote data Xiaoyao Li
2023-08-24  6:27   ` Chenyi Qiang
2023-08-18  9:50 ` [PATCH v2 43/58] i386/tdx: setup a timer for the qio channel Xiaoyao Li
2023-08-24  7:21   ` Chenyi Qiang
2023-08-24  8:34     ` Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 44/58] i386/tdx: handle TDG.VP.VMCALL<MapGPA> hypercall Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 45/58] i386/tdx: Limit the range size for MapGPA Xiaoyao Li
2023-08-21 22:30   ` Isaku Yamahata
2023-08-18  9:50 ` [PATCH v2 46/58] i386/tdx: Handle TDG.VP.VMCALL<REPORT_FATAL_ERROR> Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 47/58] i386/tdx: Wire REPORT_FATAL_ERROR with GuestPanic facility Xiaoyao Li
2023-08-21  9:58   ` Daniel P. Berrangé
2023-08-28 13:14     ` Xiaoyao Li
2023-08-29 10:28       ` Daniel P. Berrangé
2023-08-30  2:15         ` Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 48/58] i386/tdx: Disable SMM for TDX VMs Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 49/58] i386/tdx: Disable PIC " Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 50/58] i386/tdx: Don't allow system reset " Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 51/58] i386/tdx: LMCE is not supported for TDX Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 52/58] hw/i386: add eoi_intercept_unsupported member to X86MachineState Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 53/58] hw/i386: add option to forcibly report edge trigger in acpi tables Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 54/58] i386/tdx: Don't synchronize guest tsc for TDs Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 55/58] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() " Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 56/58] i386/tdx: Skip kvm_put_apicbase() " Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 57/58] i386/tdx: Don't get/put guest state for TDX VMs Xiaoyao Li
2023-08-18  9:50 ` [PATCH v2 58/58] docs: Add TDX documentation Xiaoyao Li
