* [RFC PATCH v3 00/36] TDX QEMU support
@ 2022-03-17 13:58 ` Xiaoyao Li
  0 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

This patch series enables TDX support in QEMU so that it can create and boot a
TD (TDX VM). It depends on the corresponding KVM patch series for TDX [1].
TDX-related documents can be found in [2].

This series is also available in the following GitHub repository:

https://github.com/intel/qemu-tdx.git

It is based on two cleanup patches:

https://lore.kernel.org/qemu-devel/20220310122811.807794-1-xiaoyao.li@intel.com/


Booting a TDX VM requires several changes and additional steps in the flow:

 1. specify the VM type KVM_X86_TDX_VM when creating the VM with
    IOCTL(KVM_CREATE_VM);
 2. initialize the VM-scope configuration before creating any VCPU;
 3. initialize the VCPU-scope configuration;
 4. initialize the virtual firmware in guest private memory before any VCPU
    runs.
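
The ordering constraints above can be sketched as a small state machine. This
is only an illustration, not code from the series: the td_stage enum and the
td_advance() helper are hypothetical names, no KVM ioctl is issued, and the
per-VCPU steps are collapsed into a single stage for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stages mirroring the four steps listed above. */
enum td_stage {
    TD_NONE = 0,          /* nothing done yet */
    TD_VM_CREATED,        /* step 1: VM created with type KVM_X86_TDX_VM */
    TD_VM_INITIALIZED,    /* step 2: VM-scope configuration done */
    TD_VCPU_INITIALIZED,  /* step 3: VCPU-scope configuration done */
    TD_FW_LOADED,         /* step 4: firmware placed in private memory */
    TD_RUNNING,           /* VCPUs may now run */
};

/*
 * Move to the next stage. Only the transition to the immediately
 * following stage is accepted; anything else leaves *cur untouched.
 */
static bool td_advance(enum td_stage *cur, enum td_stage next)
{
    assert(cur != NULL);
    if ((int)next != (int)*cur + 1) {
        return false;
    }
    *cur = next;
    return true;
}
```

Calling td_advance(&stage, TD_VCPU_INITIALIZED) while stage is still
TD_VM_CREATED returns false, which models the rule that VM-scope
initialization must come before any VCPU setup.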

In addition, a TDX VM must boot with TDVF (TDX virtual firmware, shipped as
OVMF). This series adds support for parsing TDVF, loading it into the guest's
private memory, and preparing the TD HOB info for TDVF.

[1] KVM TDX basic feature support
https://lore.kernel.org/all/cover.1646422845.git.isaku.yamahata@intel.com/

[2] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html

== Limitations and future work ==
- Readonly memslot

  TDX supports readonly (write-protected) memslots only for shared memory,
  not for private memory. For simplicity, readonly memslots are marked as
  entirely unsupported for TDX.

- CPU model

  A TD cannot be created with an arbitrary CPU model the way a normal VM can,
  because only a subset of features is configurable for a TD.

  - '-cpu host' is recommended when creating a TD;
  - '+feature/-feature' might not work as expected.

  Future work: introduce a specific CPU model for TDs and improve
               +/-feature handling for TDs.

- gdb support

  gdb support for debugging a TD in off-debug mode is future work.

== Patch organization ==
1           Manually fetch Linux UAPI changes for TDX;
2-15,24,26  Basic TDX support that parses the vm-type and invokes
            TDX-specific ioctls;
16-25       Load, parse and initialize TDVF for TDX VM;
27-31       Disable unsupported functions for TDX VM;
32-35       Avoid errors due to KVM's requirement on TDX;
36          Add documentation of TDX;

== Change history ==

Changes from v2:
- Get vm-type from confidential-guest-support object type;
- Drop machine_init_done_late_notifiers;
- Refactor tdx_ioctl implementation;
- Re-use existing pflash interface to load TDVF (i.e., OVMF binaries);
- Introduce new data structure to track memory type instead of changing
  e820 table;
- Force smm to off for TDX VM;
- Drop the patches that suppress level-trigger/SMI/INIT/SIPI since KVM
  will ignore them;
- Add documentation;

[v2] https://lore.kernel.org/qemu-devel/cover.1625704980.git.isaku.yamahata@intel.com/

Changes from v1:
- suppress level trigger/SMI/INIT/SIPI related to IOAPIC.
- add VM attribute sha384 to TD measurement.
- guest TSC Hz specification

[v1] https://lore.kernel.org/qemu-devel/cover.1613188118.git.isaku.yamahata@intel.com/

---
Isaku Yamahata (4):
  i386/tdvf: Introduce function to parse TDVF metadata
  i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION
  hw/i386: add option to forcibly report edge trigger in acpi tables
  i386/tdx: Don't synchronize guest tsc for TDs

Sean Christopherson (2):
  i386/kvm: Move architectural CPUID leaf generation to separate helper
  i386/tdx: Don't get/put guest state for TDX VMs

Xiaoyao Li (30):
  *** HACK *** linux-headers: Update headers to pull in TDX API changes
  i386: Introduce tdx-guest object
  target/i386: Implement mc->kvm_type() to get VM type
  target/i386: Introduce kvm_confidential_guest_init()
  i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  i386/tdx: Adjust get_supported_cpuid() for TDX VM
  KVM: Introduce kvm_arch_pre_create_vcpu()
  i386/tdx: Initialize TDX before creating TD vcpus
  i386/tdx: Add property sept-ve-disable for tdx-guest object
  i386/tdx: Wire CPU features up with attributes of TD guest
  i386/tdx: Validate TD attributes
  i386/tdx: Implement user specified tsc frequency
  i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  i386/tdx: Parse TDVF metadata for TDX VM
  i386/tdx: Get and store the mem_ptr of TDVF firmware
  i386/tdx: Track mem_ptr for each firmware entry of TDVF
  i386/tdx: Track RAM entries for TDX VM
  i386/tdx: Create the TD HOB list upon machine init done
  i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
  i386/tdx: Finalize TDX VM
  i386/tdx: Disable SMM for TDX VMs
  i386/tdx: Disable PIC for TDX VMs
  i386/tdx: Don't allow system reset for TDX VMs
  hw/i386: add eoi_intercept_unsupported member to X86MachineState
  i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  i386/tdx: Skip kvm_put_apicbase() for TDs
  docs: Add TDX documentation

 accel/kvm/kvm-all.c                        |  16 +-
 configs/devices/i386-softmmu/default.mak   |   1 +
 docs/system/confidential-guest-support.rst |   1 +
 docs/system/i386/tdx.rst                   | 103 ++++
 docs/system/target-i386.rst                |   1 +
 hw/block/pflash_cfi01.c                    |  25 +-
 hw/i386/Kconfig                            |   6 +
 hw/i386/acpi-build.c                       |  99 ++--
 hw/i386/acpi-common.c                      |  50 +-
 hw/i386/meson.build                        |   1 +
 hw/i386/pc_sysfw.c                         |  49 +-
 hw/i386/tdvf-hob.c                         | 212 ++++++++
 hw/i386/tdvf-hob.h                         |  25 +
 hw/i386/tdvf.c                             | 196 ++++++++
 hw/i386/uefi.h                             | 198 ++++++++
 hw/i386/x86.c                              |   7 +
 include/hw/i386/tdvf.h                     |  60 +++
 include/hw/i386/x86.h                      |   1 +
 include/sysemu/kvm.h                       |   1 +
 linux-headers/asm-x86/kvm.h                |  60 +++
 linux-headers/linux/kvm.h                  |   2 +
 qapi/qom.json                              |  17 +
 target/arm/kvm64.c                         |   5 +
 target/i386/cpu.h                          |   5 +
 target/i386/kvm/kvm.c                      | 362 ++++++++------
 target/i386/kvm/kvm_i386.h                 |   5 +
 target/i386/kvm/meson.build                |   2 +
 target/i386/kvm/tdx-stub.c                 |  24 +
 target/i386/kvm/tdx.c                      | 541 +++++++++++++++++++++
 target/i386/kvm/tdx.h                      |  56 +++
 target/i386/sev.c                          |   1 -
 target/i386/sev.h                          |   2 +
 target/mips/kvm.c                          |   5 +
 target/ppc/kvm.c                           |   5 +
 target/s390x/kvm/kvm.c                     |   5 +
 35 files changed, 1940 insertions(+), 209 deletions(-)
 create mode 100644 docs/system/i386/tdx.rst
 create mode 100644 hw/i386/tdvf-hob.c
 create mode 100644 hw/i386/tdvf-hob.h
 create mode 100644 hw/i386/tdvf.c
 create mode 100644 hw/i386/uefi.h
 create mode 100644 include/hw/i386/tdvf.h
 create mode 100644 target/i386/kvm/tdx-stub.c
 create mode 100644 target/i386/kvm/tdx.c
 create mode 100644 target/i386/kvm/tdx.h

-- 
2.27.0


^ permalink raw reply	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 01/36] *** HACK *** linux-headers: Update headers to pull in TDX API changes
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Pull in recent TDX updates, which are not backwards compatible.

This is only to make the series runnable. The headers will be regenerated
with the script

	scripts/update-linux-headers.sh

once TDX support is upstreamed in the Linux kernel.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
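
For readers skimming the header changes below, here is a rough sketch of how
user space might size the variable-length kvm_tdx_capabilities buffer and
zero-initialize kvm_tdx_init_vm. The struct layouts are copied from this patch
with <stdint.h> types substituted for __u32/__u64 and a C99 flexible array
member instead of [0]. The helpers tdx_caps_alloc() and tdx_init_vm_prepare()
are hypothetical, recording the capacity in nr_cpuid_configs is an assumption
about how the caller communicates the buffer size, and no ioctl is issued.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local copies of the UAPI layouts added by this patch (illustration only). */
struct kvm_tdx_cpuid_config {
    uint32_t leaf, sub_leaf, eax, ebx, ecx, edx;
};

struct kvm_tdx_capabilities {
    uint64_t attrs_fixed0, attrs_fixed1, xfam_fixed0, xfam_fixed1;
    uint32_t nr_cpuid_configs;
    uint32_t padding;
    struct kvm_tdx_cpuid_config cpuid_configs[];  /* variable-length tail */
};

struct kvm_tdx_init_vm {
    uint32_t max_vcpus;
    uint32_t tsc_khz;
    uint64_t attributes;
    uint64_t cpuid;
    uint64_t mrconfigid[6];     /* sha384 digest */
    uint64_t mrowner[6];        /* sha384 digest */
    uint64_t mrownerconfig[6];  /* sha384 digest */
    uint64_t reserved[43];      /* must be zero */
};

/* With these field sizes the structure comes to exactly 512 bytes. */
_Static_assert(sizeof(struct kvm_tdx_init_vm) == 512, "unexpected layout");

/*
 * Allocate a zeroed capabilities buffer with room for nr_cpuid_configs
 * trailing entries. Assumption: the capacity is recorded in
 * nr_cpuid_configs before the buffer is handed to the kernel.
 */
static struct kvm_tdx_capabilities *tdx_caps_alloc(uint32_t nr_cpuid_configs)
{
    size_t size = sizeof(struct kvm_tdx_capabilities) +
                  (size_t)nr_cpuid_configs *
                  sizeof(struct kvm_tdx_cpuid_config);
    struct kvm_tdx_capabilities *caps = calloc(1, size);

    if (caps) {
        caps->nr_cpuid_configs = nr_cpuid_configs;
    }
    return caps;
}

/* Clear everything first so the reserved words are guaranteed to be zero. */
static void tdx_init_vm_prepare(struct kvm_tdx_init_vm *init,
                                uint32_t max_vcpus, uint32_t tsc_khz)
{
    memset(init, 0, sizeof(*init));
    init->max_vcpus = max_vcpus;
    init->tsc_khz = tsc_khz;
}
```

Zeroing the whole structure before filling individual fields is the simplest
way to honor the "must be zero" comment on the reserved area.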
 linux-headers/asm-x86/kvm.h | 60 +++++++++++++++++++++++++++++++++++++
 linux-headers/linux/kvm.h   |  2 ++
 2 files changed, 62 insertions(+)

diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index bf6e96011dfe..c4692bac8e51 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -525,4 +525,64 @@ struct kvm_pmu_event_filter {
 #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
 #define   KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
 
+
+#define KVM_X86_DEFAULT_VM	0
+#define KVM_X86_TDX_VM		1
+
+/* Trust Domain eXtension command*/
+enum kvm_tdx_cmd_id {
+	KVM_TDX_CAPABILITIES = 0,
+	KVM_TDX_INIT_VM,
+	KVM_TDX_INIT_VCPU,
+	KVM_TDX_INIT_MEM_REGION,
+	KVM_TDX_FINALIZE_VM,
+
+	KVM_TDX_CMD_NR_MAX,
+};
+
+struct kvm_tdx_cmd {
+	__u32 id;
+	__u32 metadata;
+	__u64 data;
+};
+
+struct kvm_tdx_cpuid_config {
+	__u32 leaf;
+	__u32 sub_leaf;
+	__u32 eax;
+	__u32 ebx;
+	__u32 ecx;
+	__u32 edx;
+};
+
+struct kvm_tdx_capabilities {
+	__u64 attrs_fixed0;
+	__u64 attrs_fixed1;
+	__u64 xfam_fixed0;
+	__u64 xfam_fixed1;
+
+	__u32 nr_cpuid_configs;
+	__u32 padding;
+	struct kvm_tdx_cpuid_config cpuid_configs[0];
+};
+
+struct kvm_tdx_init_vm {
+	__u32 max_vcpus;
+	__u32 tsc_khz;
+	__u64 attributes;
+	__u64 cpuid;
+	__u64 mrconfigid[6];    /* sha384 digest */
+	__u64 mrowner[6];       /* sha384 digest */
+	__u64 mrownerconfig[6]; /* sha384 digest */
+	__u64 reserved[43];     /* must be zero for future extensibility */
+};
+
+#define KVM_TDX_MEASURE_MEMORY_REGION	(1UL << 0)
+
+struct kvm_tdx_init_mem_region {
+	__u64 source_addr;
+	__u64 gpa;
+	__u64 nr_pages;
+};
+
 #endif /* _ASM_X86_KVM_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index d232feaae972..466f43f6d746 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1135,6 +1135,8 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_XSAVE2 208
 #define KVM_CAP_SYS_ATTRIBUTES 209
 
+#define KVM_CAP_VM_TYPES 1000
+
 #ifdef KVM_CAP_IRQ_ROUTING
 
 struct kvm_irq_routing_irqchip {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 02/36] i386: Introduce tdx-guest object
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Introduce the tdx-guest object, which implements the interface of
CONFIDENTIAL_GUEST_SUPPORT and is used to create TDX VMs (TDs) with

  qemu -machine ...,confidential-guest-support=tdx0	\
       -object tdx-guest,id=tdx0

So far it has a single property, 'attributes', which is fixed at 0 and not
configurable.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 configs/devices/i386-softmmu/default.mak |  1 +
 hw/i386/Kconfig                          |  5 +++
 qapi/qom.json                            | 14 +++++++++
 target/i386/kvm/meson.build              |  2 ++
 target/i386/kvm/tdx.c                    | 40 ++++++++++++++++++++++++
 target/i386/kvm/tdx.h                    | 19 +++++++++++
 6 files changed, 81 insertions(+)
 create mode 100644 target/i386/kvm/tdx.c
 create mode 100644 target/i386/kvm/tdx.h

diff --git a/configs/devices/i386-softmmu/default.mak b/configs/devices/i386-softmmu/default.mak
index 598c6646dfc0..9b5ec59d65b0 100644
--- a/configs/devices/i386-softmmu/default.mak
+++ b/configs/devices/i386-softmmu/default.mak
@@ -18,6 +18,7 @@
 #CONFIG_QXL=n
 #CONFIG_SEV=n
 #CONFIG_SGA=n
+#CONFIG_TDX=n
 #CONFIG_TEST_DEVICES=n
 #CONFIG_TPM_CRB=n
 #CONFIG_TPM_TIS_ISA=n
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index d22ac4a4b952..9e40ff79fc2d 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -10,6 +10,10 @@ config SGX
     bool
     depends on KVM
 
+config TDX
+    bool
+    depends on KVM
+
 config PC
     bool
     imply APPLESMC
@@ -26,6 +30,7 @@ config PC
     imply QXL
     imply SEV
     imply SGX
+    imply TDX
     imply SGA
     imply TEST_DEVICES
     imply TPM_CRB
diff --git a/qapi/qom.json b/qapi/qom.json
index eeb5395ff3b7..1415ab22e531 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -785,6 +785,18 @@
             'reduced-phys-bits': 'uint32',
             '*kernel-hashes': 'bool' } }
 
+##
+# @TdxGuestProperties:
+#
+# Properties for tdx-guest objects.
+#
+# @attributes: TDX guest's attributes (default: 0)
+#
+# Since: 7.0
+##
+{ 'struct': 'TdxGuestProperties',
+  'data': { '*attributes': 'uint64' } }
+
 ##
 # @ObjectType:
 #
@@ -837,6 +849,7 @@
       'if': 'CONFIG_SECRET_KEYRING' },
     'sev-guest',
     's390-pv-guest',
+    'tdx-guest',
     'throttle-group',
     'tls-creds-anon',
     'tls-creds-psk',
@@ -900,6 +913,7 @@
       'secret_keyring':             { 'type': 'SecretKeyringProperties',
                                       'if': 'CONFIG_SECRET_KEYRING' },
       'sev-guest':                  'SevGuestProperties',
+      'tdx-guest':                  'TdxGuestProperties',
       'throttle-group':             'ThrottleGroupProperties',
       'tls-creds-anon':             'TlsCredsAnonProperties',
       'tls-creds-psk':              'TlsCredsPskProperties',
diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
index 736df8b72e3f..b2d7d41acde2 100644
--- a/target/i386/kvm/meson.build
+++ b/target/i386/kvm/meson.build
@@ -9,6 +9,8 @@ i386_softmmu_kvm_ss.add(files(
 
 i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
 
+i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
+
 i386_softmmu_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
 
 i386_softmmu_ss.add_all(when: 'CONFIG_KVM', if_true: i386_softmmu_kvm_ss)
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
new file mode 100644
index 000000000000..d3792d4a3d56
--- /dev/null
+++ b/target/i386/kvm/tdx.c
@@ -0,0 +1,40 @@
+/*
+ * QEMU TDX support
+ *
+ * Copyright Intel
+ *
+ * Author:
+ *      Xiaoyao Li <xiaoyao.li@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qom/object_interfaces.h"
+
+#include "tdx.h"
+
+/* tdx guest */
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
+                                   tdx_guest,
+                                   TDX_GUEST,
+                                   CONFIDENTIAL_GUEST_SUPPORT,
+                                   { TYPE_USER_CREATABLE },
+                                   { NULL })
+
+static void tdx_guest_init(Object *obj)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+
+    tdx->attributes = 0;
+}
+
+static void tdx_guest_finalize(Object *obj)
+{
+}
+
+static void tdx_guest_class_init(ObjectClass *oc, void *data)
+{
+}
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
new file mode 100644
index 000000000000..415aeb5af746
--- /dev/null
+++ b/target/i386/kvm/tdx.h
@@ -0,0 +1,19 @@
+#ifndef QEMU_I386_TDX_H
+#define QEMU_I386_TDX_H
+
+#include "exec/confidential-guest-support.h"
+
+#define TYPE_TDX_GUEST "tdx-guest"
+#define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
+
+typedef struct TdxGuestClass {
+    ConfidentialGuestSupportClass parent_class;
+} TdxGuestClass;
+
+typedef struct TdxGuest {
+    ConfidentialGuestSupport parent_obj;
+
+    uint64_t attributes;    /* TD attributes */
+} TdxGuest;
+
+#endif /* QEMU_I386_TDX_H */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 02/36] i386: Introduce tdx-guest object
@ 2022-03-17 13:58   ` Xiaoyao Li
  0 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: isaku.yamahata, kvm, Connor Kuehl, seanjc, xiaoyao.li,
	qemu-devel, erdemaktas

Introduce tdx-guest object which implements the interface of
CONFIDENTIAL_GUEST_SUPPORT, and will be used to create TDX VMs (TDs) by

  qemu -machine ...,confidential-guest-support=tdx0	\
       -object tdx-guset,id=tdx0

It has only one property 'attributes' with fixed value 0 and not
configurable so far.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 configs/devices/i386-softmmu/default.mak |  1 +
 hw/i386/Kconfig                          |  5 +++
 qapi/qom.json                            | 14 +++++++++
 target/i386/kvm/meson.build              |  2 ++
 target/i386/kvm/tdx.c                    | 40 ++++++++++++++++++++++++
 target/i386/kvm/tdx.h                    | 19 +++++++++++
 6 files changed, 81 insertions(+)
 create mode 100644 target/i386/kvm/tdx.c
 create mode 100644 target/i386/kvm/tdx.h

diff --git a/configs/devices/i386-softmmu/default.mak b/configs/devices/i386-softmmu/default.mak
index 598c6646dfc0..9b5ec59d65b0 100644
--- a/configs/devices/i386-softmmu/default.mak
+++ b/configs/devices/i386-softmmu/default.mak
@@ -18,6 +18,7 @@
 #CONFIG_QXL=n
 #CONFIG_SEV=n
 #CONFIG_SGA=n
+#CONFIG_TDX=n
 #CONFIG_TEST_DEVICES=n
 #CONFIG_TPM_CRB=n
 #CONFIG_TPM_TIS_ISA=n
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index d22ac4a4b952..9e40ff79fc2d 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -10,6 +10,10 @@ config SGX
     bool
     depends on KVM
 
+config TDX
+    bool
+    depends on KVM
+
 config PC
     bool
     imply APPLESMC
@@ -26,6 +30,7 @@ config PC
     imply QXL
     imply SEV
     imply SGX
+    imply TDX
     imply SGA
     imply TEST_DEVICES
     imply TPM_CRB
diff --git a/qapi/qom.json b/qapi/qom.json
index eeb5395ff3b7..1415ab22e531 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -785,6 +785,18 @@
             'reduced-phys-bits': 'uint32',
             '*kernel-hashes': 'bool' } }
 
+##
+# @TdxGuestProperties:
+#
+# Properties for tdx-guest objects.
+#
+# @attributes: TDX guest's attributes (default: 0)
+#
+# Since: 7.0
+##
+{ 'struct': 'TdxGuestProperties',
+  'data': { '*attributes': 'uint64' } }
+
 ##
 # @ObjectType:
 #
@@ -837,6 +849,7 @@
       'if': 'CONFIG_SECRET_KEYRING' },
     'sev-guest',
     's390-pv-guest',
+    'tdx-guest',
     'throttle-group',
     'tls-creds-anon',
     'tls-creds-psk',
@@ -900,6 +913,7 @@
       'secret_keyring':             { 'type': 'SecretKeyringProperties',
                                       'if': 'CONFIG_SECRET_KEYRING' },
       'sev-guest':                  'SevGuestProperties',
+      'tdx-guest':                  'TdxGuestProperties',
       'throttle-group':             'ThrottleGroupProperties',
       'tls-creds-anon':             'TlsCredsAnonProperties',
       'tls-creds-psk':              'TlsCredsPskProperties',
diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
index 736df8b72e3f..b2d7d41acde2 100644
--- a/target/i386/kvm/meson.build
+++ b/target/i386/kvm/meson.build
@@ -9,6 +9,8 @@ i386_softmmu_kvm_ss.add(files(
 
 i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
 
+i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
+
 i386_softmmu_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
 
 i386_softmmu_ss.add_all(when: 'CONFIG_KVM', if_true: i386_softmmu_kvm_ss)
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
new file mode 100644
index 000000000000..d3792d4a3d56
--- /dev/null
+++ b/target/i386/kvm/tdx.c
@@ -0,0 +1,40 @@
+/*
+ * QEMU TDX support
+ *
+ * Copyright Intel
+ *
+ * Author:
+ *      Xiaoyao Li <xiaoyao.li@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qom/object_interfaces.h"
+
+#include "tdx.h"
+
+/* tdx guest */
+OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
+                                   tdx_guest,
+                                   TDX_GUEST,
+                                   CONFIDENTIAL_GUEST_SUPPORT,
+                                   { TYPE_USER_CREATABLE },
+                                   { NULL })
+
+static void tdx_guest_init(Object *obj)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+
+    tdx->attributes = 0;
+}
+
+static void tdx_guest_finalize(Object *obj)
+{
+}
+
+static void tdx_guest_class_init(ObjectClass *oc, void *data)
+{
+}
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
new file mode 100644
index 000000000000..415aeb5af746
--- /dev/null
+++ b/target/i386/kvm/tdx.h
@@ -0,0 +1,19 @@
+#ifndef QEMU_I386_TDX_H
+#define QEMU_I386_TDX_H
+
+#include "exec/confidential-guest-support.h"
+
+#define TYPE_TDX_GUEST "tdx-guest"
+#define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
+
+typedef struct TdxGuestClass {
+    ConfidentialGuestSupportClass parent_class;
+} TdxGuestClass;
+
+typedef struct TdxGuest {
+    ConfidentialGuestSupport parent_obj;
+
+    uint64_t attributes;    /* TD attributes */
+} TdxGuest;
+
+#endif /* QEMU_I386_TDX_H */
-- 
2.27.0




* [RFC PATCH v3 03/36] target/i386: Implement mc->kvm_type() to get VM type
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

A TDX VM requires the VM type KVM_X86_TDX_VM to be passed to
kvm_ioctl(KVM_CREATE_VM). Hence implement mc->kvm_type() for the i386
architecture.

If a tdx-guest object is given as the machine's
confidential-guest-support, e.g.

  qemu -machine ...,confidential-guest-support=tdx0 \
       -object tdx-guest,id=tdx0,...

the VM type is KVM_X86_TDX_VM. Otherwise it is KVM_X86_DEFAULT_VM. For a
non-default VM type, QEMU exits with an error if KVM does not report the
type in KVM_CAP_VM_TYPES.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
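Note (illustration only, not part of the patch): the selection logic
described above can be sketched in standalone C as below. The SKETCH_*
constants and sketch_pick_vm_type() are hypothetical names made up for
this note; `supported_types` merely models the bitmask that a
KVM_CAP_VM_TYPES query is assumed to return, and the sketch returns -1
where the patch calls exit(1).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the KVM VM type constants named above. */
enum { SKETCH_DEFAULT_VM = 0, SKETCH_TDX_VM = 1 };

/*
 * Pick the VM type for a guest. `supported_types` models the bitmask
 * reported for KVM_CAP_VM_TYPES (0 on a KVM without the capability).
 * Returns the VM type, or -1 if the requested type is not supported.
 */
static int sketch_pick_vm_type(bool is_tdx_guest, uint32_t supported_types)
{
    int type = is_tdx_guest ? SKETCH_TDX_VM : SKETCH_DEFAULT_VM;

    /* The default type is accepted even when the capability is absent. */
    if (type == SKETCH_DEFAULT_VM) {
        return type;
    }

    /* Any other type must have its bit set in the capability mask. */
    if (!(supported_types & (UINT32_C(1) << type))) {
        return -1;
    }

    return type;
}
```

The early return for the default type is what keeps ordinary VMs working
on a KVM that predates the capability.
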
 hw/i386/x86.c              |  6 ++++++
 target/i386/kvm/kvm.c      | 30 ++++++++++++++++++++++++++++++
 target/i386/kvm/kvm_i386.h |  1 +
 3 files changed, 37 insertions(+)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 8e30daccdb7c..10a88faf4c0e 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1330,6 +1330,11 @@ static void machine_set_sgx_epc(Object *obj, Visitor *v, const char *name,
     qapi_free_SgxEPCList(list);
 }
 
+static int x86_kvm_type(MachineState *ms, const char *vm_type)
+{
+    return kvm_get_vm_type(ms, vm_type);
+}
+
 static void x86_machine_initfn(Object *obj)
 {
     X86MachineState *x86ms = X86_MACHINE(obj);
@@ -1353,6 +1358,7 @@ static void x86_machine_class_init(ObjectClass *oc, void *data)
     mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
     mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
     mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
+    mc->kvm_type = x86_kvm_type;
     x86mc->save_tsc_khz = true;
     x86mc->fwcfg_dma_enabled = true;
     nc->nmi_monitor_handler = x86_nmi;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ef2c68a6f475..89d5eb58cb3e 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -30,6 +30,7 @@
 #include "sysemu/runstate.h"
 #include "kvm_i386.h"
 #include "sev.h"
+#include "tdx.h"
 #include "hyperv.h"
 #include "hyperv-proto.h"
 
@@ -139,6 +140,35 @@ static struct kvm_msr_list *kvm_feature_msrs;
 #define BUS_LOCK_SLICE_TIME 1000000000ULL /* ns */
 static RateLimit bus_lock_ratelimit_ctrl;
 
+static const char* vm_type_name[] = {
+    [KVM_X86_DEFAULT_VM] = "X86_DEFAULT_VM",
+    [KVM_X86_TDX_VM] = "X86_TDX_VM",
+};
+
+int kvm_get_vm_type(MachineState *ms, const char *vm_type)
+{
+    int kvm_type = KVM_X86_DEFAULT_VM;
+
+    if (ms->cgs && object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
+        kvm_type = KVM_X86_TDX_VM;
+    }
+
+    /*
+     * Old KVM doesn't support KVM_CAP_VM_TYPES, but KVM_X86_DEFAULT_VM
+     * is always supported, so skip the capability check for it.
+     */
+    if (kvm_type == KVM_X86_DEFAULT_VM) {
+        return kvm_type;
+    }
+
+    if (!(kvm_check_extension(KVM_STATE(ms->accelerator), KVM_CAP_VM_TYPES) & BIT(kvm_type))) {
+        error_report("vm-type %s not supported by KVM", vm_type_name[kvm_type]);
+        exit(1);
+    }
+
+    return kvm_type;
+}
+
 int kvm_has_pit_state2(void)
 {
     return has_pit_state2;
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index 4124912c202e..b434feaa6b1d 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -37,6 +37,7 @@ bool kvm_has_adjust_clock(void);
 bool kvm_has_adjust_clock_stable(void);
 bool kvm_has_exception_payload(void);
 void kvm_synchronize_all_tsc(void);
+int kvm_get_vm_type(MachineState *ms, const char *vm_type);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_do_init_vcpu(X86CPU *cs);
 
-- 
2.27.0



* [RFC PATCH v3 04/36] target/i386: Introduce kvm_confidential_guest_init()
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

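Move the SEV initialization call in kvm_arch_init() into a separate
function, kvm_confidential_guest_init(), so that TDX can hook in later.
TYPE_SEV_GUEST moves to sev.h so that kvm.c can check for it.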

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm.c | 11 ++++++++++-
 target/i386/sev.c     |  1 -
 target/i386/sev.h     |  2 ++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 89d5eb58cb3e..70454355f3bf 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2356,6 +2356,15 @@ static void register_smram_listener(Notifier *n, void *unused)
                                  &smram_address_space, 1, "kvm-smram");
 }
 
+static int kvm_confidential_guest_init(MachineState *ms, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_SEV_GUEST)) {
+        return sev_kvm_init(ms->cgs, errp);
+    }
+
+    return 0;
+}
+
 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
     uint64_t identity_base = 0xfffbc000;
@@ -2376,7 +2385,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
      * mechanisms are supported in future (e.g. TDX), they'll need
      * their own initialization either here or elsewhere.
      */
-    ret = sev_kvm_init(ms->cgs, &local_err);
+    ret = kvm_confidential_guest_init(ms, &local_err);
     if (ret < 0) {
         error_report_err(local_err);
         return ret;
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 025ff7a6f845..912f5cdfb91d 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -39,7 +39,6 @@
 #include "hw/i386/pc.h"
 #include "exec/address-spaces.h"
 
-#define TYPE_SEV_GUEST "sev-guest"
 OBJECT_DECLARE_SIMPLE_TYPE(SevGuestState, SEV_GUEST)
 
 
diff --git a/target/i386/sev.h b/target/i386/sev.h
index 83e82aa42c41..a9c980dd4b2d 100644
--- a/target/i386/sev.h
+++ b/target/i386/sev.h
@@ -20,6 +20,8 @@
 
 #include "exec/confidential-guest-support.h"
 
+#define TYPE_SEV_GUEST "sev-guest"
+
 #define SEV_POLICY_NODBG        0x1
 #define SEV_POLICY_NOKS         0x2
 #define SEV_POLICY_ES           0x4
-- 
2.27.0



* [RFC PATCH v3 05/36] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Introduce tdx_kvm_init() and invoke it in kvm_confidential_guest_init()
if it's a TDX VM. More initialization will be added later.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
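Note (illustration only, not part of the patch): the dispatch added here
can be sketched in standalone C as below. All sketch_* names are
hypothetical. The `cgs_type` string stands in for the QOM type of
ms->cgs, and `tdx_built_in` models whether tdx.c or tdx-stub.c was
compiled in.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical init hooks; the real ones talk to KVM. */
static int sketch_sev_init(void)
{
    return 0;
}

static int sketch_tdx_init(bool tdx_built_in)
{
    /* The stub variant of the patch always fails with -EINVAL. */
    return tdx_built_in ? 0 : -EINVAL;
}

/*
 * cgs_type is NULL when no confidential-guest-support object was given.
 */
static int sketch_confidential_guest_init(const char *cgs_type,
                                          bool tdx_built_in)
{
    if (cgs_type == NULL) {
        return 0;               /* ordinary VM: nothing to do */
    }
    if (strcmp(cgs_type, "sev-guest") == 0) {
        return sketch_sev_init();
    }
    if (strcmp(cgs_type, "tdx-guest") == 0) {
        return sketch_tdx_init(tdx_built_in);
    }
    return 0;                   /* other mechanism: no-op */
}
```
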
 target/i386/kvm/kvm.c       | 15 ++++++---------
 target/i386/kvm/meson.build |  2 +-
 target/i386/kvm/tdx-stub.c  |  9 +++++++++
 target/i386/kvm/tdx.c       | 13 +++++++++++++
 target/i386/kvm/tdx.h       |  2 ++
 5 files changed, 31 insertions(+), 10 deletions(-)
 create mode 100644 target/i386/kvm/tdx-stub.c

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 70454355f3bf..26ed5faf07b8 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -54,6 +54,7 @@
 #include "migration/blocker.h"
 #include "exec/memattrs.h"
 #include "trace.h"
+#include "tdx.h"
 
 //#define DEBUG_KVM
 
@@ -2360,6 +2361,8 @@ static int kvm_confidential_guest_init(MachineState *ms, Error **errp)
 {
     if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_SEV_GUEST)) {
         return sev_kvm_init(ms->cgs, errp);
+    } else if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
+        return tdx_kvm_init(ms, errp);
     }
 
     return 0;
@@ -2374,16 +2377,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     Error *local_err = NULL;
 
     /*
-     * Initialize SEV context, if required
+     * Initialize confidential guest (SEV/TDX) context, if required
      *
-     * If no memory encryption is requested (ms->cgs == NULL) this is
-     * a no-op.
-     *
-     * It's also a no-op if a non-SEV confidential guest support
-     * mechanism is selected.  SEV is the only mechanism available to
-     * select on x86 at present, so this doesn't arise, but if new
-     * mechanisms are supported in future (e.g. TDX), they'll need
-     * their own initialization either here or elsewhere.
+     * It's a no-op if ms->cgs == NULL (no confidential guest requested)
+     * or if a mechanism other than SEV or TDX is selected.
      */
     ret = kvm_confidential_guest_init(ms, &local_err);
     if (ret < 0) {
diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
index b2d7d41acde2..fd30b93ecec9 100644
--- a/target/i386/kvm/meson.build
+++ b/target/i386/kvm/meson.build
@@ -9,7 +9,7 @@ i386_softmmu_kvm_ss.add(files(
 
 i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
 
-i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
+i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'), if_false: files('tdx-stub.c'))
 
 i386_softmmu_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
 
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
new file mode 100644
index 000000000000..1df24735201e
--- /dev/null
+++ b/target/i386/kvm/tdx-stub.c
@@ -0,0 +1,9 @@
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#include "tdx.h"
+
+int tdx_kvm_init(MachineState *ms, Error **errp)
+{
+    return -EINVAL;
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index d3792d4a3d56..e3b94373b316 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -12,10 +12,23 @@
  */
 
 #include "qemu/osdep.h"
+#include "qapi/error.h"
 #include "qom/object_interfaces.h"
 
+#include "hw/i386/x86.h"
 #include "tdx.h"
 
+int tdx_kvm_init(MachineState *ms, Error **errp)
+{
+    TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
+                                                    TYPE_TDX_GUEST);
+    if (!tdx) {
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 415aeb5af746..c8a23d95258d 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -16,4 +16,6 @@ typedef struct TdxGuest {
     uint64_t attributes;    /* TD attributes */
 } TdxGuest;
 
+int tdx_kvm_init(MachineState *ms, Error **errp);
+
 #endif /* QEMU_I386_TDX_H */
-- 
2.27.0



* [RFC PATCH v3 06/36] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

KVM provides TDX capabilities via the sub-command KVM_TDX_CAPABILITIES of
IOCTL(KVM_MEMORY_ENCRYPT_OP). Get the capabilities when initializing the
TDX context. They will be used to validate the user's settings later.

In addition, introduce helpers to invoke TDX "ioctls" at VM and VCPU
scope, in preparation for later patches.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
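Note (illustration only, not part of the patch): get_tdx_capabilities()
uses a grow-and-retry loop, doubling the buffer each time the query
fails with -E2BIG. A standalone sketch of that pattern follows. The
struct, the mock sketch_query() and the entry count are all hypothetical
stand-ins for the KVM interface, and the sketch returns NULL where the
patch calls exit(1).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct kvm_tdx_capabilities. */
struct sketch_caps {
    uint32_t nr_entries;        /* in: capacity, out: entries filled */
    uint64_t entries[];         /* variable-length tail */
};

/* Mock of the kernel side, which has this many entries to report. */
static const uint32_t sketch_kernel_entries = 6;

static int sketch_query(struct sketch_caps *caps)
{
    if (caps->nr_entries < sketch_kernel_entries) {
        return -E2BIG;          /* caller's buffer is too small */
    }
    for (uint32_t i = 0; i < sketch_kernel_entries; i++) {
        caps->entries[i] = i;
    }
    caps->nr_entries = sketch_kernel_entries;
    return 0;
}

/* Returns a heap buffer the caller must free(), or NULL on failure. */
static struct sketch_caps *sketch_get_caps(void)
{
    uint32_t max_ent = 1;

    for (;;) {
        size_t size = sizeof(struct sketch_caps) +
                      (size_t)max_ent * sizeof(uint64_t);
        struct sketch_caps *caps = calloc(1, size);
        int r;

        if (!caps) {
            return NULL;
        }
        caps->nr_entries = max_ent;

        r = sketch_query(caps);
        if (r == 0) {
            return caps;
        }
        free(caps);             /* released on every failure path */
        if (r != -E2BIG) {
            return NULL;
        }
        max_ent *= 2;           /* retry with 2, 4, 8, ... entries */
    }
}
```

Freeing the buffer before checking the error code means no failure path
leaks it.
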
 target/i386/kvm/tdx.c | 71 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index e3b94373b316..bed337e5ba18 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -14,10 +14,77 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qom/object_interfaces.h"
+#include "sysemu/kvm.h"
 
 #include "hw/i386/x86.h"
 #include "tdx.h"
 
+enum tdx_ioctl_level{
+    TDX_VM_IOCTL,
+    TDX_VCPU_IOCTL,
+};
+
+static int __tdx_ioctl(void *state, enum tdx_ioctl_level level, int cmd_id,
+                        __u32 metadata, void *data)
+{
+    struct kvm_tdx_cmd tdx_cmd;
+    int r;
+
+    memset(&tdx_cmd, 0x0, sizeof(tdx_cmd));
+
+    tdx_cmd.id = cmd_id;
+    tdx_cmd.metadata = metadata;
+    tdx_cmd.data = (__u64)(unsigned long)data;
+
+    switch (level) {
+    case TDX_VM_IOCTL:
+        r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
+        break;
+    case TDX_VCPU_IOCTL:
+        r = kvm_vcpu_ioctl(state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
+        break;
+    default:
+        error_report("Invalid tdx_ioctl_level %d", level);
+        exit(1);
+    }
+
+    return r;
+}
+
+#define tdx_vm_ioctl(cmd_id, metadata, data) \
+        __tdx_ioctl(NULL, TDX_VM_IOCTL, cmd_id, metadata, data)
+
+#define tdx_vcpu_ioctl(cpu, cmd_id, metadata, data) \
+        __tdx_ioctl(cpu, TDX_VCPU_IOCTL, cmd_id, metadata, data)
+
+static struct kvm_tdx_capabilities *tdx_caps;
+
+static void get_tdx_capabilities(void)
+{
+    struct kvm_tdx_capabilities *caps;
+    int max_ent = 1;
+    int r, size;
+
+    do {
+        size = sizeof(struct kvm_tdx_capabilities) +
+               max_ent * sizeof(struct kvm_tdx_cpuid_config);
+        caps = g_malloc0(size);
+        caps->nr_cpuid_configs = max_ent;
+
+        r = tdx_vm_ioctl(KVM_TDX_CAPABILITIES, 0, caps);
+        if (r == -E2BIG) {
+            g_free(caps);
+            max_ent *= 2;
+        } else if (r < 0) {
+            error_report("KVM_TDX_CAPABILITIES failed: %s", strerror(-r));
+            exit(1);
+        }
+    }
+    while (r == -E2BIG);
+
+    tdx_caps = caps;
+}
+
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
     TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
@@ -26,6 +93,10 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         return -EINVAL;
     }
 
+    if (!tdx_caps) {
+        get_tdx_capabilities();
+    }
+
     return 0;
 }
 
-- 
2.27.0



* [RFC PATCH v3 07/36] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

TDX VMs need special handling in many places in QEMU. Introduce the
is_tdx_vm() helper to query whether the current VM is a TDX VM.

Cache the tdx_guest object so that there is no need to cast from ms->cgs
every time.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
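Note (illustration only, not part of the patch): the cached-pointer
pattern behind is_tdx_vm() can be sketched in standalone C as below. The
sketch_* names and the trimmed-down struct are hypothetical. The point
is that the predicate only becomes true once the init path has run
successfully.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down guest object. */
struct sketch_tdx_guest {
    uint64_t attributes;
};

/* Set once by the init path; NULL means "not a TDX VM, or not yet known". */
static struct sketch_tdx_guest *sketch_cached_guest;

static bool sketch_is_tdx_vm(void)
{
    return sketch_cached_guest != NULL;
}

/* Models the tail of the init function: cache only after success. */
static int sketch_tdx_init(struct sketch_tdx_guest *guest)
{
    if (guest == NULL) {
        return -1;
    }
    sketch_cached_guest = guest;
    return 0;
}
```

Callers that run before the init path see false, which is why the
comment in the patch spells out when the helper becomes valid.
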
 target/i386/kvm/tdx.c | 10 ++++++++++
 target/i386/kvm/tdx.h | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index bed337e5ba18..846511b299f4 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -19,6 +19,14 @@
 #include "hw/i386/x86.h"
 #include "tdx.h"
 
+static TdxGuest *tdx_guest;
+
+/* It's valid after kvm_confidential_guest_init()->tdx_kvm_init() */
+bool is_tdx_vm(void)
+{
+    return !!tdx_guest;
+}
+
 enum tdx_ioctl_level{
     TDX_VM_IOCTL,
     TDX_VCPU_IOCTL,
@@ -97,6 +105,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         get_tdx_capabilities();
     }
 
+    tdx_guest = tdx;
+
     return 0;
 }
 
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index c8a23d95258d..4036ca2f3f99 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -1,6 +1,10 @@
 #ifndef QEMU_I386_TDX_H
 #define QEMU_I386_TDX_H
 
+#ifndef CONFIG_USER_ONLY
+#include CONFIG_DEVICES /* CONFIG_TDX */
+#endif
+
 #include "exec/confidential-guest-support.h"
 
 #define TYPE_TDX_GUEST "tdx-guest"
@@ -16,6 +20,12 @@ typedef struct TdxGuest {
     uint64_t attributes;    /* TD attributes */
 } TdxGuest;
 
+#ifdef CONFIG_TDX
+bool is_tdx_vm(void);
+#else
+#define is_tdx_vm() 0
+#endif /* CONFIG_TDX */
+
 int tdx_kvm_init(MachineState *ms, Error **errp);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 08/36] i386/tdx: Adjust get_supported_cpuid() for TDX VM
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

For TDX, the allowable CPUID configuration differs from what KVM
reports for KVM scope via KVM_GET_SUPPORTED_CPUID.

- Some CPUID bits are not supported for TDX VMs even though KVM reports
  them as supported. Mask them off for TDX VMs, e.g. CPUID_EXT_VMX and
  some PV features.

- The supported XCR0 and XSS bits need to be capped by tdx_caps, because
  KVM uses them to set up the XFAM of the TD.

Introduce tdx_get_supported_cpuid() to adjust the result of
kvm_arch_get_supported_cpuid() for TDX VMs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
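The XCR0 capping described above can be sketched in isolation as follows.
This is only an illustration under stated assumptions: DEMO_XCR0_MASK
hard-codes the bit positions of the XSTATE components listed in XCR0_MASK
(bits 0-7 and bit 9), and demo_cap_xcr0() is a hypothetical helper, not a
function in this series. It assumes the usual fixed0/fixed1 reading: a bit
clear in fixed0 must be 0, and a bit set in fixed1 must be 1.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout: FP, SSE, YMM, BNDREGS, BNDCSR, OPMASK, ZMM_Hi256,
 * Hi16_ZMM (bits 0-7) and PKRU (bit 9). */
#define DEMO_XCR0_MASK 0x2FFULL

/*
 * Cap one 32-bit half of the supported XCR0 bits.
 * high_half == 0 models CPUID[0xD,0].EAX, high_half != 0 models EDX.
 */
static uint32_t demo_cap_xcr0(uint32_t supported, uint64_t fixed0,
                              uint64_t fixed1, int high_half)
{
    uint64_t allowed = fixed0 & DEMO_XCR0_MASK;  /* bits that may be 1 */
    uint64_t forced = fixed1 & DEMO_XCR0_MASK;   /* bits that must be 1 */

    if (high_half) {
        allowed >>= 32;
        forced >>= 32;
    }

    supported &= (uint32_t)allowed;
    supported |= (uint32_t)forced;
    return supported;
}
```

Because every bit in the mask is below bit 32, the high half always comes out
as zero in this sketch; the shift is kept only to mirror the EAX/EDX split.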
 target/i386/cpu.h     |  5 +++++
 target/i386/kvm/kvm.c |  4 ++++
 target/i386/kvm/tdx.c | 39 +++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h |  2 ++
 4 files changed, 50 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 5e406088a91a..7fa30f4ed7db 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -566,6 +566,11 @@ typedef enum X86Seg {
 #define ESA_FEATURE_XFD_MASK            (1U << ESA_FEATURE_XFD_BIT)
 
 
+#define XCR0_MASK       (XSTATE_FP_MASK | XSTATE_SSE_MASK | XSTATE_YMM_MASK | \
+                         XSTATE_BNDREGS_MASK | XSTATE_BNDCSR_MASK | \
+                         XSTATE_OPMASK_MASK | XSTATE_ZMM_Hi256_MASK | \
+                         XSTATE_Hi16_ZMM_MASK | XSTATE_PKRU_MASK)
+
 /* CPUID feature words */
 typedef enum FeatureWord {
     FEAT_1_EDX,         /* CPUID[1].EDX */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 26ed5faf07b8..ddbe8f64fadb 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -486,6 +486,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
         ret |= 1U << KVM_HINTS_REALTIME;
     }
 
+    if (is_tdx_vm()) {
+        tdx_get_supported_cpuid(function, index, reg, &ret);
+    }
+
     return ret;
 }
 
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 846511b299f4..e4ee55f30c79 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -14,6 +14,7 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qom/object_interfaces.h"
+#include "standard-headers/asm-x86/kvm_para.h"
 #include "sysemu/kvm.h"
 
 #include "hw/i386/x86.h"
@@ -110,6 +111,44 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
     return 0;
 }
 
+void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
+                             uint32_t *ret)
+{
+    switch (function) {
+    case 1:
+        if (reg == R_ECX) {
+            *ret &= ~CPUID_EXT_VMX;
+        }
+        break;
+    case 0xd:
+        if (index == 0) {
+            if (reg == R_EAX) {
+                *ret &= (uint32_t)tdx_caps->xfam_fixed0 & XCR0_MASK;
+                *ret |= (uint32_t)tdx_caps->xfam_fixed1 & XCR0_MASK;
+            } else if (reg == R_EDX) {
+                *ret &= (tdx_caps->xfam_fixed0 & XCR0_MASK) >> 32;
+                *ret |= (tdx_caps->xfam_fixed1 & XCR0_MASK) >> 32;
+            }
+        } else if (index == 1) {
+            /* TODO: Adjust XSS when it's supported. */
+        }
+        break;
+    case KVM_CPUID_FEATURES:
+        if (reg == R_EAX) {
+            *ret &= ~((1ULL << KVM_FEATURE_CLOCKSOURCE) |
+                      (1ULL << KVM_FEATURE_CLOCKSOURCE2) |
+                      (1ULL << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
+                      (1ULL << KVM_FEATURE_ASYNC_PF) |
+                      (1ULL << KVM_FEATURE_ASYNC_PF_VMEXIT) |
+                      (1ULL << KVM_FEATURE_ASYNC_PF_INT));
+        }
+        break;
+    default:
+        /* TODO: Use tdx_caps to adjust CPUID leafs. */
+        break;
+    }
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 4036ca2f3f99..06599b65b827 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -27,5 +27,7 @@ bool is_tdx_vm(void);
 #endif /* CONFIG_TDX */
 
 int tdx_kvm_init(MachineState *ms, Error **errp);
+void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
+                             uint32_t *ret);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 09/36] KVM: Introduce kvm_arch_pre_create_vcpu()
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Introduce kvm_arch_pre_create_vcpu() to perform arch-dependent
work prior to creating any vCPU. This is for i386 TDX, which needs
to call TDX_INIT_VM before creating any vCPU.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
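Since the hook is invoked from kvm_init_vcpu(), it runs once per vCPU, so an
arch implementation that needs VM-scope setup would presumably have to guard
against repeating it. The sketch below shows one way that could look. It is
an assumption, not code from this series: every demo_* name is hypothetical,
and demo_vm_scope_init() merely stands in for a call such as TDX_INIT_VM.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_vm_initialized;   /* set once VM-scope init has succeeded */
static int demo_vm_init_calls;     /* counts how often the init really ran */

/* Stand-in for the VM-scope setup that must precede vCPU creation. */
static int demo_vm_scope_init(void)
{
    demo_vm_init_calls++;
    return 0;
}

/* Arch hook: performs the VM-scope work on the first call only. */
static int demo_arch_pre_create_vcpu(void)
{
    int ret;

    if (demo_vm_initialized) {
        return 0;
    }

    ret = demo_vm_scope_init();
    if (ret < 0) {
        return ret;
    }

    demo_vm_initialized = true;
    return 0;
}

/* Generic path: call the hook, then (conceptually) create the vCPU. */
static int demo_init_vcpu(int cpu_index)
{
    int ret;

    assert(cpu_index >= 0);

    ret = demo_arch_pre_create_vcpu();
    if (ret < 0) {
        return ret;
    }

    /* vCPU creation would happen here. */
    return 0;
}
```

Architectures with nothing to do simply return 0 from the hook, as the stubs
in this patch do.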
 accel/kvm/kvm-all.c    | 7 +++++++
 include/sysemu/kvm.h   | 1 +
 target/arm/kvm64.c     | 5 +++++
 target/i386/kvm/kvm.c  | 5 +++++
 target/mips/kvm.c      | 5 +++++
 target/ppc/kvm.c       | 5 +++++
 target/s390x/kvm/kvm.c | 5 +++++
 7 files changed, 33 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 27864dfaeaaa..a4bb449737a6 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -465,6 +465,13 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
+    ret = kvm_arch_pre_create_vcpu(cpu);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "kvm_init_vcpu: kvm_arch_pre_create_vcpu() failed");
+        goto err;
+    }
+
     ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
     if (ret < 0) {
         error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index a783c7886811..0e94031ab7c7 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -373,6 +373,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level);
 
 int kvm_arch_init(MachineState *ms, KVMState *s);
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu);
 int kvm_arch_init_vcpu(CPUState *cpu);
 int kvm_arch_destroy_vcpu(CPUState *cpu);
 
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index ccadfbbe72be..ae7336851c62 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -935,6 +935,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     return kvm_arm_init_cpreg_list(cpu);
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu)
+{
+    return 0;
+}
+
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
     return 0;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ddbe8f64fadb..7bd5589e1e6c 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2102,6 +2102,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     return r;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu)
+{
+    return 0;
+}
+
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
diff --git a/target/mips/kvm.c b/target/mips/kvm.c
index 086debd9f013..0647fe7c654a 100644
--- a/target/mips/kvm.c
+++ b/target/mips/kvm.c
@@ -92,6 +92,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     return ret;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu)
+{
+    return 0;
+}
+
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
     return 0;
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index dc93b99189ea..c14a20b80f12 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -507,6 +507,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     return ret;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu)
+{
+    return 0;
+}
+
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
     return 0;
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 6acf14d5ecb4..8170c5fad0b8 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -405,6 +405,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
     return 0;
 }
 
+int kvm_arch_pre_create_vcpu(CPUState *cpu)
+{
+    return 0;
+}
+
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
     S390CPU *cpu = S390_CPU(cs);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 10/36] i386/kvm: Move architectural CPUID leaf generation to separate helper
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

From: Sean Christopherson <sean.j.christopherson@intel.com>

Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
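The calling convention of the extracted helper (append to a caller-owned
array starting at a given index, return the new entry count) can be
illustrated with the simplified sketch below. It is not the real
kvm_x86_arch_cpuid(): DemoCpuidEntry, DEMO_MAX_ENTRIES and
demo_fill_arch_leaves() are hypothetical, and the leaf contents are dummies.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_ENTRIES 8

/* Hypothetical, trimmed-down stand-in for struct kvm_cpuid_entry2. */
typedef struct DemoCpuidEntry {
    uint32_t function;
    uint32_t eax;
} DemoCpuidEntry;

/*
 * Append leaves 0..limit after the 'count' entries already present and
 * return the updated count. Overflow is treated as a fatal error, in the
 * same spirit as the abort() calls in the original loop.
 */
static uint32_t demo_fill_arch_leaves(DemoCpuidEntry *entries, uint32_t count,
                                      uint32_t limit)
{
    for (uint32_t i = 0; i <= limit; i++) {
        assert(count < DEMO_MAX_ENTRIES);
        entries[count].function = i;
        entries[count].eax = 0;    /* dummy value for the sketch */
        count++;
    }
    return count;
}
```

Passing the running index in and out lets one caller prepend its own entries
(such as the Hyper-V and KVM signature leaves in kvm_arch_init_vcpu()) while
another caller starts from an empty array.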
 target/i386/kvm/kvm.c      | 222 +++++++++++++++++++------------------
 target/i386/kvm/kvm_i386.h |   4 +
 2 files changed, 119 insertions(+), 107 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7bd5589e1e6c..02849f6ef142 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1621,8 +1621,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
 
 static Error *invtsc_mig_blocker;
 
-#define KVM_MAX_CPUID_ENTRIES  100
-
 static void kvm_init_xsave(CPUX86State *env)
 {
     if (has_xsave2) {
@@ -1643,115 +1641,21 @@ static void kvm_init_xsave(CPUX86State *env)
            env->xsave_buf_len);
 }
 
-int kvm_arch_init_vcpu(CPUState *cs)
+uint32_t kvm_x86_arch_cpuid(CPUX86State *env, struct kvm_cpuid_entry2 *entries,
+                            uint32_t cpuid_i)
 {
-    struct {
-        struct kvm_cpuid2 cpuid;
-        struct kvm_cpuid_entry2 entries[KVM_MAX_CPUID_ENTRIES];
-    } cpuid_data;
-    /*
-     * The kernel defines these structs with padding fields so there
-     * should be no extra padding in our cpuid_data struct.
-     */
-    QEMU_BUILD_BUG_ON(sizeof(cpuid_data) !=
-                      sizeof(struct kvm_cpuid2) +
-                      sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
-
-    X86CPU *cpu = X86_CPU(cs);
-    CPUX86State *env = &cpu->env;
-    uint32_t limit, i, j, cpuid_i;
+    uint32_t limit, i, j;
     uint32_t unused;
     struct kvm_cpuid_entry2 *c;
-    uint32_t signature[3];
-    int kvm_base = KVM_CPUID_SIGNATURE;
-    int max_nested_state_len;
-    int r;
-    Error *local_err = NULL;
-
-    memset(&cpuid_data, 0, sizeof(cpuid_data));
-
-    cpuid_i = 0;
-
-    has_xsave2 = kvm_check_extension(cs->kvm_state, KVM_CAP_XSAVE2);
-
-    r = kvm_arch_set_tsc_khz(cs);
-    if (r < 0) {
-        return r;
-    }
-
-    /* vcpu's TSC frequency is either specified by user, or following
-     * the value used by KVM if the former is not present. In the
-     * latter case, we query it from KVM and record in env->tsc_khz,
-     * so that vcpu's TSC frequency can be migrated later via this field.
-     */
-    if (!env->tsc_khz) {
-        r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
-            kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
-            -ENOTSUP;
-        if (r > 0) {
-            env->tsc_khz = r;
-        }
-    }
-
-    env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
-
-    /*
-     * kvm_hyperv_expand_features() is called here for the second time in case
-     * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly handle
-     * 'query-cpu-model-expansion' in this case as we don't have a KVM vCPU to
-     * check which Hyper-V enlightenments are supported and which are not, we
-     * can still proceed and check/expand Hyper-V enlightenments here so legacy
-     * behavior is preserved.
-     */
-    if (!kvm_hyperv_expand_features(cpu, &local_err)) {
-        error_report_err(local_err);
-        return -ENOSYS;
-    }
-
-    if (hyperv_enabled(cpu)) {
-        r = hyperv_init_vcpu(cpu);
-        if (r) {
-            return r;
-        }
-
-        cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
-        kvm_base = KVM_CPUID_SIGNATURE_NEXT;
-        has_msr_hv_hypercall = true;
-    }
-
-    if (cpu->expose_kvm) {
-        memcpy(signature, "KVMKVMKVM\0\0\0", 12);
-        c = &cpuid_data.entries[cpuid_i++];
-        c->function = KVM_CPUID_SIGNATURE | kvm_base;
-        c->eax = KVM_CPUID_FEATURES | kvm_base;
-        c->ebx = signature[0];
-        c->ecx = signature[1];
-        c->edx = signature[2];
-
-        c = &cpuid_data.entries[cpuid_i++];
-        c->function = KVM_CPUID_FEATURES | kvm_base;
-        c->eax = env->features[FEAT_KVM];
-        c->edx = env->features[FEAT_KVM_HINTS];
-    }
 
     cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
 
-    if (cpu->kvm_pv_enforce_cpuid) {
-        r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1);
-        if (r < 0) {
-            fprintf(stderr,
-                    "failed to enable KVM_CAP_ENFORCE_PV_FEATURE_CPUID: %s",
-                    strerror(-r));
-            abort();
-        }
-    }
-
     for (i = 0; i <= limit; i++) {
         if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
             fprintf(stderr, "unsupported level value: 0x%x\n", limit);
             abort();
         }
-        c = &cpuid_data.entries[cpuid_i++];
+        c = &entries[cpuid_i++];
 
         switch (i) {
         case 2: {
@@ -1770,7 +1674,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
                             "cpuid(eax:2):eax & 0xf = 0x%x\n", times);
                     abort();
                 }
-                c = &cpuid_data.entries[cpuid_i++];
+                c = &entries[cpuid_i++];
                 c->function = i;
                 c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC;
                 cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
@@ -1816,7 +1720,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
                             "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
                     abort();
                 }
-                c = &cpuid_data.entries[cpuid_i++];
+                c = &entries[cpuid_i++];
             }
             break;
         case 0x7:
@@ -1836,7 +1740,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
                                 "cpuid(eax:0x12,ecx:0x%x)\n", j);
                     abort();
                 }
-                c = &cpuid_data.entries[cpuid_i++];
+                c = &entries[cpuid_i++];
             }
             break;
         case 0x14:
@@ -1856,7 +1760,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
                                 "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
                     abort();
                 }
-                c = &cpuid_data.entries[cpuid_i++];
+                c = &entries[cpuid_i++];
                 c->function = i;
                 c->index = j;
                 c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
@@ -1913,7 +1817,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
             fprintf(stderr, "unsupported xlevel value: 0x%x\n", limit);
             abort();
         }
-        c = &cpuid_data.entries[cpuid_i++];
+        c = &entries[cpuid_i++];
 
         switch (i) {
         case 0x8000001d:
@@ -1932,7 +1836,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
                             "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
                     abort();
                 }
-                c = &cpuid_data.entries[cpuid_i++];
+                c = &entries[cpuid_i++];
             }
             break;
         default:
@@ -1959,7 +1863,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
                 fprintf(stderr, "unsupported xlevel2 value: 0x%x\n", limit);
                 abort();
             }
-            c = &cpuid_data.entries[cpuid_i++];
+            c = &entries[cpuid_i++];
 
             c->function = i;
             c->flags = 0;
@@ -1967,6 +1871,110 @@ int kvm_arch_init_vcpu(CPUState *cs)
         }
     }
 
+    return cpuid_i;
+}
+
+int kvm_arch_init_vcpu(CPUState *cs)
+{
+    struct {
+        struct kvm_cpuid2 cpuid;
+        struct kvm_cpuid_entry2 entries[KVM_MAX_CPUID_ENTRIES];
+    } cpuid_data;
+    /*
+     * The kernel defines these structs with padding fields so there
+     * should be no extra padding in our cpuid_data struct.
+     */
+    QEMU_BUILD_BUG_ON(sizeof(cpuid_data) !=
+                      sizeof(struct kvm_cpuid2) +
+                      sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
+
+    X86CPU *cpu = X86_CPU(cs);
+    CPUX86State *env = &cpu->env;
+    uint32_t cpuid_i;
+    struct kvm_cpuid_entry2 *c;
+    uint32_t signature[3];
+    int kvm_base = KVM_CPUID_SIGNATURE;
+    int max_nested_state_len;
+    int r;
+    Error *local_err = NULL;
+
+    memset(&cpuid_data, 0, sizeof(cpuid_data));
+
+    cpuid_i = 0;
+
+    has_xsave2 = kvm_check_extension(cs->kvm_state, KVM_CAP_XSAVE2);
+
+    r = kvm_arch_set_tsc_khz(cs);
+    if (r < 0) {
+        return r;
+    }
+
+    /* vcpu's TSC frequency is either specified by user, or following
+     * the value used by KVM if the former is not present. In the
+     * latter case, we query it from KVM and record in env->tsc_khz,
+     * so that vcpu's TSC frequency can be migrated later via this field.
+     */
+    if (!env->tsc_khz) {
+        r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+            kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
+            -ENOTSUP;
+        if (r > 0) {
+            env->tsc_khz = r;
+        }
+    }
+
+    env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
+
+    /*
+     * kvm_hyperv_expand_features() is called here for the second time in case
+     * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly handle
+     * 'query-cpu-model-expansion' in this case as we don't have a KVM vCPU to
+     * check which Hyper-V enlightenments are supported and which are not, we
+     * can still proceed and check/expand Hyper-V enlightenments here so legacy
+     * behavior is preserved.
+     */
+    if (!kvm_hyperv_expand_features(cpu, &local_err)) {
+        error_report_err(local_err);
+        return -ENOSYS;
+    }
+
+    if (hyperv_enabled(cpu)) {
+        r = hyperv_init_vcpu(cpu);
+        if (r) {
+            return r;
+        }
+
+        cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
+        kvm_base = KVM_CPUID_SIGNATURE_NEXT;
+        has_msr_hv_hypercall = true;
+    }
+
+    if (cpu->expose_kvm) {
+        memcpy(signature, "KVMKVMKVM\0\0\0", 12);
+        c = &cpuid_data.entries[cpuid_i++];
+        c->function = KVM_CPUID_SIGNATURE | kvm_base;
+        c->eax = KVM_CPUID_FEATURES | kvm_base;
+        c->ebx = signature[0];
+        c->ecx = signature[1];
+        c->edx = signature[2];
+
+        c = &cpuid_data.entries[cpuid_i++];
+        c->function = KVM_CPUID_FEATURES | kvm_base;
+        c->eax = env->features[FEAT_KVM];
+        c->edx = env->features[FEAT_KVM_HINTS];
+    }
+
+    if (cpu->kvm_pv_enforce_cpuid) {
+        r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1);
+        if (r < 0) {
+            fprintf(stderr,
+                    "failed to enable KVM_CAP_ENFORCE_PV_FEATURE_CPUID: %s",
+                    strerror(-r));
+            abort();
+        }
+    }
+
+    cpuid_i = kvm_x86_arch_cpuid(env, cpuid_data.entries, cpuid_i);
     cpuid_data.cpuid.nent = cpuid_i;
 
     if (((env->cpuid_version >> 8)&0xF) >= 6
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index b434feaa6b1d..5c7972f617e8 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -24,6 +24,10 @@
 #define kvm_ioapic_in_kernel() \
     (kvm_irqchip_in_kernel() && !kvm_irqchip_is_split())
 
+#define KVM_MAX_CPUID_ENTRIES  100
+uint32_t kvm_x86_arch_cpuid(CPUX86State *env, struct kvm_cpuid_entry2 *entries,
+                            uint32_t cpuid_i);
+
 #else
 
 #define kvm_pit_in_kernel()      0
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 11/36] i386/tdx: Initialize TDX before creating TD vcpus
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Invoke KVM_TDX_INIT_VM from kvm_arch_pre_create_vcpu(). KVM_TDX_INIT_VM
configures global TD state, e.g. the canonical CPUID config, and must be
executed before any vCPU is created.

Use kvm_x86_arch_cpuid() to set up the CPUID configuration for the TDX VM
and tie x86cpu->enable_pmu to the TD's attributes.

Note, this doesn't address the fact that QEMU may change the CPUID
configuration when creating vCPUs, i.e. punts on refactoring QEMU to
provide a stable CPUID config prior to kvm_arch_init().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
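Note (not part of the commit): a minimal, self-contained sketch of the
"initialize VM scope once, before the first vCPU" pattern described above.
It assumes a pthread mutex in place of QemuMutex; struct td_state,
fake_init_vm() and pre_create_vcpu() are hypothetical stand-ins for
TdxGuest, the KVM_TDX_INIT_VM ioctl and tdx_pre_create_vcpu(), not QEMU
or KVM APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TdxGuest; only the fields used here. */
struct td_state {
    pthread_mutex_t lock;
    bool initialized;
    int init_calls;     /* how often the VM-scope step actually ran */
};

/* Hypothetical stand-in for the VM-scope ioctl; returns 0 or -errno. */
static int fake_init_vm(struct td_state *td, int result)
{
    td->init_calls++;
    return result;
}

/*
 * Called once per vCPU. Only the first successful call performs the
 * VM-scope initialization; later calls return 0 without repeating it.
 * A failed attempt leaves 'initialized' false so it can be retried.
 */
static int pre_create_vcpu(struct td_state *td, int result)
{
    int r = 0;

    assert(td != NULL);

    pthread_mutex_lock(&td->lock);
    if (!td->initialized) {
        r = fake_init_vm(td, result);
        if (r == 0) {
            td->initialized = true;
        }
    }
    pthread_mutex_unlock(&td->lock);
    return r;
}
```

The flag is set only after the VM-scope step succeeds, so a failure on one
vCPU does not mask the error for the next caller.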
 accel/kvm/kvm-all.c        |  9 ++++++-
 target/i386/kvm/kvm.c      |  3 +++
 target/i386/kvm/tdx-stub.c |  5 ++++
 target/i386/kvm/tdx.c      | 49 ++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h      |  4 ++++
 5 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a4bb449737a6..fceb6b618b04 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -465,10 +465,17 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
+    /*
+     * tdx_pre_create_vcpu() may call cpu_x86_cpuid(). It in turn may call
+     * kvm_vm_ioctl(). Set cpu->kvm_state in advance to avoid NULL pointer
+     * dereference.
+     */
+    cpu->kvm_state = s;
     ret = kvm_arch_pre_create_vcpu(cpu);
     if (ret < 0) {
         error_setg_errno(errp, -ret,
                          "kvm_init_vcpu: kvm_arch_pre_create_vcpu() failed");
+        cpu->kvm_state = NULL;
         goto err;
     }
 
@@ -476,11 +483,11 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
     if (ret < 0) {
         error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
                          kvm_arch_vcpu_id(cpu));
+        cpu->kvm_state = NULL;
         goto err;
     }
 
     cpu->kvm_fd = ret;
-    cpu->kvm_state = s;
     cpu->vcpu_dirty = true;
     cpu->dirty_pages = 0;
 
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 02849f6ef142..f2d71359b59d 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2112,6 +2112,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 int kvm_arch_pre_create_vcpu(CPUState *cpu)
 {
+    if (is_tdx_vm())
+        return tdx_pre_create_vcpu(cpu);
+
     return 0;
 }
 
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
index 1df24735201e..2871de9d7b56 100644
--- a/target/i386/kvm/tdx-stub.c
+++ b/target/i386/kvm/tdx-stub.c
@@ -7,3 +7,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
 {
     return -EINVAL;
 }
+
+int tdx_pre_create_vcpu(CPUState *cpu)
+{
+    return -EINVAL;
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index e4ee55f30c79..a5cc187edbde 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -18,6 +18,7 @@
 #include "sysemu/kvm.h"
 
 #include "hw/i386/x86.h"
+#include "kvm_i386.h"
 #include "tdx.h"
 
 static TdxGuest *tdx_guest;
@@ -149,6 +150,52 @@ void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
     }
 }
 
+int tdx_pre_create_vcpu(CPUState *cpu)
+{
+    struct {
+        struct kvm_cpuid2 cpuid;
+        struct kvm_cpuid_entry2 entries[KVM_MAX_CPUID_ENTRIES];
+    } cpuid_data;
+
+    /*
+     * The kernel defines these structs with padding fields so there
+     * should be no extra padding in our cpuid_data struct.
+     */
+    QEMU_BUILD_BUG_ON(sizeof(cpuid_data) !=
+                      sizeof(struct kvm_cpuid2) +
+                      sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
+
+    MachineState *ms = MACHINE(qdev_get_machine());
+    X86CPU *x86cpu = X86_CPU(cpu);
+    CPUX86State *env = &x86cpu->env;
+    struct kvm_tdx_init_vm init_vm;
+    int r = 0;
+
+    qemu_mutex_lock(&tdx_guest->lock);
+    if (tdx_guest->initialized) {
+        goto out;
+    }
+
+    memset(&cpuid_data, 0, sizeof(cpuid_data));
+    cpuid_data.cpuid.nent = kvm_x86_arch_cpuid(env, cpuid_data.entries, 0);
+
+    init_vm.cpuid = (__u64)(&cpuid_data);
+    init_vm.max_vcpus = ms->smp.cpus;
+    init_vm.attributes = tdx_guest->attributes;
+
+    r = tdx_vm_ioctl(KVM_TDX_INIT_VM, 0, &init_vm);
+    if (r < 0) {
+        error_report("KVM_TDX_INIT_VM failed %s", strerror(-r));
+        goto out;
+    }
+
+    tdx_guest->initialized = true;
+
+out:
+    qemu_mutex_unlock(&tdx_guest->lock);
+    return r;
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
@@ -161,6 +208,8 @@ static void tdx_guest_init(Object *obj)
 {
     TdxGuest *tdx = TDX_GUEST(obj);
 
+    qemu_mutex_init(&tdx->lock);
+
     tdx->attributes = 0;
 }
 
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 06599b65b827..46a24ee8c7cc 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -17,6 +17,9 @@ typedef struct TdxGuestClass {
 typedef struct TdxGuest {
     ConfidentialGuestSupport parent_obj;
 
+    QemuMutex lock;
+
+    bool initialized;
     uint64_t attributes;    /* TD attributes */
 } TdxGuest;
 
@@ -29,5 +32,6 @@ bool is_tdx_vm(void);
 int tdx_kvm_init(MachineState *ms, Error **errp);
 void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
                              uint32_t *ret);
+int tdx_pre_create_vcpu(CPUState *cpu);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.27.0



* [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Add a sept-ve-disable property to the tdx-guest object. It configures
bit 28 (SEPT_VE_DISABLE) of the TD attributes.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
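Note (not part of the commit): a minimal sketch of how a boolean property
can be mapped onto a single bit of a 64-bit attributes word, as the
getter/setter pair in this patch does for bit 28. The helper names are
hypothetical and operate on a plain uint64_t rather than a TdxGuest
object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 28 of the TD attributes, as stated in the commit message. */
#define SEPT_VE_DISABLE_BIT (UINT64_C(1) << 28)

/* Report whether the bit is currently set. */
static bool get_sept_ve_disable(uint64_t attributes)
{
    return (attributes & SEPT_VE_DISABLE_BIT) != 0;
}

/* Return 'attributes' with the bit set or cleared; other bits are kept. */
static uint64_t set_sept_ve_disable(uint64_t attributes, bool value)
{
    return value ? (attributes | SEPT_VE_DISABLE_BIT)
                 : (attributes & ~SEPT_VE_DISABLE_BIT);
}
```

Because the property only touches its own bit, it composes with a raw
"attributes" value that was set separately.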
 qapi/qom.json         |  5 ++++-
 target/i386/kvm/tdx.c | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 1415ab22e531..fc380095a42c 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -792,10 +792,13 @@
 #
 # @attributes: TDX guest's attributes (default: 0)
 #
+# @sept-ve-disable: attributes.sept-ve-disable[bit 28] (default: 0)
+#
 # Since: 7.0
 ##
 { 'struct': 'TdxGuestProperties',
-  'data': { '*attributes': 'uint64' } }
+  'data': { '*attributes': 'uint64',
+            '*sept-ve-disable': 'bool' } }
 
 ##
 # @ObjectType:
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index a5cc187edbde..409526765304 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -21,6 +21,8 @@
 #include "kvm_i386.h"
 #include "tdx.h"
 
+#define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
+
 static TdxGuest *tdx_guest;
 
 /* It's valid after kvm_confidential_guest_init()->kvm_tdx_init() */
@@ -196,6 +198,24 @@ out:
     return r;
 }
 
+static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+
+    return !!(tdx->attributes & TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE);
+}
+
+static void tdx_guest_set_sept_ve_disable(Object *obj, bool value, Error **errp)
+{
+    TdxGuest *tdx = TDX_GUEST(obj);
+
+    if (value) {
+        tdx->attributes |= TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE;
+    } else {
+        tdx->attributes &= ~TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE;
+    }
+}
+
 /* tdx guest */
 OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
                                    tdx_guest,
@@ -211,6 +231,10 @@ static void tdx_guest_init(Object *obj)
     qemu_mutex_init(&tdx->lock);
 
     tdx->attributes = 0;
+
+    object_property_add_bool(obj, "sept-ve-disable",
+                             tdx_guest_get_sept_ve_disable,
+                             tdx_guest_set_sept_ve_disable);
 }
 
 static void tdx_guest_finalize(Object *obj)
-- 
2.27.0



* [RFC PATCH v3 13/36] i386/tdx: Wire CPU features up with attributes of TD guest
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

For QEMU VMs, PKS is configured via CPUID_7_0_ECX_PKS and PMU is
configured by x86cpu->enable_pmu. Reuse the existing configuration
interface for TDX VMs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
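Note (not part of the commit): a minimal sketch of deriving TD attribute
bits from the existing CPU configuration. The bit positions for PKS (30)
and PERFMON (63) are taken from this patch; CPUID.(EAX=7,ECX=0):ECX[31]
is the architectural PKS feature bit. derive_td_attributes() is a
hypothetical helper, not a QEMU function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_7_0_ECX_PKS_BIT (UINT32_C(1) << 31) /* CPUID.7.0:ECX[31] */
#define TD_ATTR_PKS           (UINT64_C(1) << 30)
#define TD_ATTR_PERFMON       (UINT64_C(1) << 63)

/*
 * OR the feature-derived bits into 'attributes'. Like the patch, this
 * only ever sets bits; it never clears a bit that was already present.
 */
static uint64_t derive_td_attributes(uint64_t attributes,
                                     uint32_t feat_7_0_ecx,
                                     bool enable_pmu)
{
    if (feat_7_0_ecx & CPUID_7_0_ECX_PKS_BIT) {
        attributes |= TD_ATTR_PKS;
    }
    if (enable_pmu) {
        attributes |= TD_ATTR_PERFMON;
    }
    return attributes;
}
```

One consequence of the OR-only form is that a PKS or PERFMON bit already
present in the user-supplied attributes stays set even when the matching
CPU feature is disabled.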
 target/i386/kvm/tdx.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 409526765304..de4146025995 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -22,6 +22,8 @@
 #include "tdx.h"
 
 #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
+#define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
+#define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
 
 static TdxGuest *tdx_guest;
 
@@ -152,6 +154,15 @@ void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
     }
 }
 
+static void setup_td_guest_attributes(X86CPU *x86cpu)
+{
+    CPUX86State *env = &x86cpu->env;
+
+    tdx_guest->attributes |= (env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS) ?
+                             TDX_TD_ATTRIBUTES_PKS : 0;
+    tdx_guest->attributes |= x86cpu->enable_pmu ? TDX_TD_ATTRIBUTES_PERFMON : 0;
+}
+
 int tdx_pre_create_vcpu(CPUState *cpu)
 {
     struct {
@@ -178,6 +189,8 @@ int tdx_pre_create_vcpu(CPUState *cpu)
         goto out;
     }
 
+    setup_td_guest_attributes(x86cpu);
+
     memset(&cpuid_data, 0, sizeof(cpuid_data));
     cpuid_data.cpuid.nent = kvm_x86_arch_cpuid(env, cpuid_data.entries, 0);
 
-- 
2.27.0



* [RFC PATCH v3 14/36] i386/tdx: Validate TD attributes
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Validate the TD attributes against tdx_caps: bits that are clear in
attrs_fixed0 must be zero, and bits that are set in attrs_fixed1 must be
set.

Also reject attribute bits that QEMU does not support yet, e.g. the debug
bit, which will be allowed once debug TD support lands in QEMU.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)
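
As a rough sketch of the check described above (helper names are
hypothetical, not QEMU functions): fixed0 acts as a mask of bits that are
allowed to be 1, and fixed1 as a mask of bits that must be 1. An
attribute value passes only if applying both masks leaves it unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Debug attribute bit as defined in this patch. */
#define TDX_TD_ATTRIBUTES_DEBUG  (1ULL << 0)

/*
 * Hypothetical helper: true if every bit that is clear in fixed0 is clear
 * in 'attributes' and every bit that is set in fixed1 is set in
 * 'attributes'. Same expression as in tdx_validate_attributes().
 */
static bool td_attributes_fixed_bits_ok(uint64_t attributes,
                                        uint64_t fixed0, uint64_t fixed1)
{
    return ((attributes & fixed0) | fixed1) == attributes;
}

/* Hypothetical helper: the debug bit is rejected for now. */
static bool td_attributes_supported(uint64_t attributes)
{
    return !(attributes & TDX_TD_ATTRIBUTES_DEBUG);
}
```

For example, with fixed0 = 0x50000001 and fixed1 = 0, the value
0x40000000 passes the fixed-bit check, while 0x20 fails because bit 5 is
clear in fixed0.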

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index de4146025995..a76c41fe5724 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -21,6 +21,7 @@
 #include "kvm_i386.h"
 #include "tdx.h"
 
+#define TDX_TD_ATTRIBUTES_DEBUG             BIT_ULL(0)
 #define TDX_TD_ATTRIBUTES_SEPT_VE_DISABLE   BIT_ULL(28)
 #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
 #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
@@ -154,13 +155,32 @@ void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
     }
 }
 
-static void setup_td_guest_attributes(X86CPU *x86cpu)
+static int tdx_validate_attributes(TdxGuest *tdx)
+{
+    if (((tdx->attributes & tdx_caps->attrs_fixed0) | tdx_caps->attrs_fixed1) !=
+        tdx->attributes) {
+        error_report("Invalid attributes 0x%lx for TDX VM (fixed0 0x%llx, fixed1 0x%llx)",
+                     tdx->attributes, tdx_caps->attrs_fixed0, tdx_caps->attrs_fixed1);
+        return -EINVAL;
+    }
+
+    if (tdx->attributes & TDX_TD_ATTRIBUTES_DEBUG) {
+        error_report("Current QEMU doesn't support attributes.debug[bit 0] for TDX VM");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static int setup_td_guest_attributes(X86CPU *x86cpu)
 {
     CPUX86State *env = &x86cpu->env;
 
     tdx_guest->attributes |= (env->features[FEAT_7_0_ECX] & CPUID_7_0_ECX_PKS) ?
                              TDX_TD_ATTRIBUTES_PKS : 0;
     tdx_guest->attributes |= x86cpu->enable_pmu ? TDX_TD_ATTRIBUTES_PERFMON : 0;
+
+    return tdx_validate_attributes(tdx_guest);
 }
 
 int tdx_pre_create_vcpu(CPUState *cpu)
@@ -189,7 +209,10 @@ int tdx_pre_create_vcpu(CPUState *cpu)
         goto out;
     }
 
-    setup_td_guest_attributes(x86cpu);
+    r = setup_td_guest_attributes(x86cpu);
+    if (r) {
+        goto out;
+    }
 
     memset(&cpuid_data, 0, sizeof(cpuid_data));
     cpuid_data.cpuid.nent = kvm_x86_arch_cpuid(env, cpuid_data.entries, 0);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 15/36] i386/tdx: Implement user specified tsc frequency
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Reuse "-cpu,tsc-frequency=" to get the TSC frequency requested by the user
and pass it to KVM_TDX_INIT_VM.

Also check that the TSC frequency is within the legal range and has the
legal granularity (both required by the TDX module).

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm.c |  8 ++++++++
 target/i386/kvm/tdx.c | 18 ++++++++++++++++++
 2 files changed, 26 insertions(+)
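
The range and granularity checks can be sketched in isolation as follows.
The helper name and the TDX_TSC_GRANULARITY_KHZ macro are hypothetical;
the limits (100 MHz to 10 GHz, multiples of 25 MHz) are the ones used in
the diff below. A value of 0 means the user did not specify a frequency
and is accepted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Limits as defined in this patch. */
#define TDX_MIN_TSC_FREQUENCY_KHZ   (100 * 1000)
#define TDX_MAX_TSC_FREQUENCY_KHZ   (10 * 1000 * 1000)
/* Hypothetical name for the 25 MHz granularity used in the patch. */
#define TDX_TSC_GRANULARITY_KHZ     (25 * 1000)

/*
 * Hypothetical helper: return 0 if tsc_khz is acceptable for a TD,
 * -EINVAL otherwise. 0 means "not specified" and is accepted.
 */
static int tdx_check_tsc_khz(int64_t tsc_khz)
{
    if (tsc_khz == 0) {
        return 0;
    }
    if (tsc_khz < TDX_MIN_TSC_FREQUENCY_KHZ ||
        tsc_khz > TDX_MAX_TSC_FREQUENCY_KHZ) {
        return -EINVAL;
    }
    if (tsc_khz % TDX_TSC_GRANULARITY_KHZ) {
        return -EINVAL;
    }
    return 0;
}
```

For instance, 2400000 kHz (2.4 GHz) is accepted because it is 96 * 25 MHz,
whereas 2410000 kHz is rejected by the granularity check.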

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f2d71359b59d..4a8b6e2c8797 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -806,6 +806,14 @@ static int kvm_arch_set_tsc_khz(CPUState *cs)
     int r, cur_freq;
     bool set_ioctl = false;
 
+    /*
+     * A TD guest's TSC is immutable: it cannot be set or changed via
+     * KVM_SET_TSC_KHZ, only initialized via KVM_TDX_INIT_VM.
+     */
+    if (is_tdx_vm()) {
+        return 0;
+    }
+
     if (!env->tsc_khz) {
         return 0;
     }
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index a76c41fe5724..94a9c1ea7e9c 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -26,6 +26,9 @@
 #define TDX_TD_ATTRIBUTES_PKS               BIT_ULL(30)
 #define TDX_TD_ATTRIBUTES_PERFMON           BIT_ULL(63)
 
+#define TDX_MIN_TSC_FREQUENCY_KHZ   (100 * 1000)
+#define TDX_MAX_TSC_FREQUENCY_KHZ   (10 * 1000 * 1000)
+
 static TdxGuest *tdx_guest;
 
 /* It's valid after kvm_confidential_guest_init()->kvm_tdx_init() */
@@ -209,6 +212,20 @@ int tdx_pre_create_vcpu(CPUState *cpu)
         goto out;
     }
 
+    r = -EINVAL;
+    if (env->tsc_khz && (env->tsc_khz < TDX_MIN_TSC_FREQUENCY_KHZ ||
+                         env->tsc_khz > TDX_MAX_TSC_FREQUENCY_KHZ)) {
+        error_report("Invalid TSC %ld kHz, tsc-frequency must be in range [%d, %d] kHz",
+                      env->tsc_khz, TDX_MIN_TSC_FREQUENCY_KHZ,
+                      TDX_MAX_TSC_FREQUENCY_KHZ);
+        goto out;
+    }
+
+    if (env->tsc_khz % (25 * 1000)) {
+        error_report("Invalid TSC %ld kHz, it must be a multiple of 25 MHz", env->tsc_khz);
+        goto out;
+    }
+
     r = setup_td_guest_attributes(x86cpu);
     if (r) {
         goto out;
@@ -219,6 +236,7 @@ int tdx_pre_create_vcpu(CPUState *cpu)
 
     init_vm.cpuid = (__u64)(&cpuid_data);
     init_vm.max_vcpus = ms->smp.cpus;
+    init_vm.tsc_khz = env->tsc_khz;
     init_vm.attributes = tdx_guest->attributes;
 
     r = tdx_vm_ioctl(KVM_TDX_INIT_VM, 0, &init_vm);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 16/36] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

TDX supports read-only memslots only for shared memory, not for private
memory.

QEMU has no way to tell whether a memslot backs shared or private memory,
so for simplicity just set kvm_readonly_mem_allowed to false for TDX VMs.

Note that pflash depends on the read-only memory capability of KVM, while
TDX wants to reuse the pflash interface to load TDVF (as OVMF). Exempt TDX
VMs from the read-only check in pflash.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/pc_sysfw.c    | 2 +-
 target/i386/kvm/tdx.c | 9 +++++++++
 2 files changed, 10 insertions(+), 1 deletion(-)
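
To make the resulting condition in pc_system_firmware_init() easier to
follow, here is a small truth-table style sketch. The helper name is
hypothetical; it only restates the boolean expression from the diff
below, with the three QEMU predicates passed in as plain flags.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when pflash must be refused, i.e. KVM is in
 * use, KVM lacks read-only memory support, and the VM is not a TDX VM.
 * Mirrors: kvm_enabled() && (!kvm_readonly_mem_enabled() && !is_tdx_vm())
 */
static bool pflash_refused(bool kvm_enabled, bool readonly_mem_enabled,
                           bool tdx_vm)
{
    return kvm_enabled && (!readonly_mem_enabled && !tdx_vm);
}
```

With this patch a TDX VM always reports read-only memory as disabled, so
the is_tdx_vm() term is what keeps pflash usable for TDVF.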

diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index c8b17af95353..75b34d02cb4f 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -245,7 +245,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
         /* Machine property pflash0 not set, use ROM mode */
         x86_bios_rom_init(MACHINE(pcms), "bios.bin", rom_memory, false);
     } else {
-        if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
+        if (kvm_enabled() && (!kvm_readonly_mem_enabled() && !is_tdx_vm())) {
             /*
              * Older KVM cannot execute from device memory. So, flash
              * memory cannot be used unless the readonly memory kvm
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 94a9c1ea7e9c..1bb8211e74e6 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -115,6 +115,15 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         get_tdx_capabilities();
     }
 
+    /*
+     * Set kvm_readonly_mem_allowed to false, because TDX only supports readonly
+     * memory for shared memory but not for private memory. Besides, whether a
+     * memslot is private or shared is not determined by QEMU.
+     *
+     * Thus, just mark readonly memory not supported for simplicity.
+     */
+    kvm_readonly_mem_allowed = false;
+
     tdx_guest = tdx;
 
     return 0;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

A TDX VM needs to boot with Trust Domain Virtual Firmware (TDVF). Unlike
OVMF, which is mapped as a ROM device, TDVF needs to be mapped as private
memory. This is because the TDX architecture doesn't provide a read-only
capability to the VMM, and it doesn't support instruction emulation
because guest memory and registers are not accessible to the VMM.

On the other hand, OVMF can work as TDVF, and it is usually configured
as a pflash device in QEMU. To keep the same usage (QEMU parameter),
introduce ram_mode to pflash for TDVF. When creating a TDX VM, ram_mode
is enabled automatically so that the firmware is mapped as RAM.

Note, this implies two things:
 1. TDVF (OVMF) is not read-only (write-protected).

 2. Non-volatile UEFI variables are not supported the way they are with
    regular pflash: changes to non-volatile UEFI variables are not
    synced back to the backing vars.fd file.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/block/pflash_cfi01.c | 25 ++++++++++++++++++-------
 hw/i386/pc_sysfw.c      | 14 +++++++++++---
 2 files changed, 29 insertions(+), 10 deletions(-)
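
In both the RAM and the ROM-device case the flash units are stacked
downwards from the 4 GiB boundary. The address arithmetic used in
pc_system_flash_map() can be sketched as below; the helper name, the
FLASH_TOP macro and the example sizes are hypothetical and only serve to
illustrate the calculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flash is mapped so that the first unit ends exactly at 4 GiB. */
#define FLASH_TOP  0x100000000ULL

/*
 * Hypothetical helper: compute the guest-physical base address of each
 * flash unit. Unit i is placed at FLASH_TOP minus the accumulated size
 * of units 0..i, matching "0x100000000ULL - total_size" in the patch.
 */
static void flash_unit_bases(const uint64_t *sizes, size_t n,
                             uint64_t *bases)
{
    uint64_t total_size = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        total_size += sizes[i];
        bases[i] = FLASH_TOP - total_size;
    }
}
```

With a 2 MiB unit 0 and a 128 KiB unit 1, unit 0 lands at 0xFFE00000 and
unit 1 directly below it at 0xFFDE0000.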

diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index 74c7190302bd..55e8bb2bd5ee 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -87,6 +87,7 @@ struct PFlashCFI01 {
     void *storage;
     VMChangeStateEntry *vmstate;
     bool old_multiple_chip_handling;
+    bool ram_mode;  /* if true, the flash is mapped as RAM */
 };
 
 static int pflash_post_load(void *opaque, int version_id);
@@ -818,17 +819,24 @@ static void pflash_cfi01_realize(DeviceState *dev, Error **errp)
 
     total_len = pfl->sector_len * pfl->nb_blocs;
 
-    memory_region_init_rom_device(
-        &pfl->mem, OBJECT(dev),
-        &pflash_cfi01_ops,
-        pfl,
-        pfl->name, total_len, errp);
+    if (pfl->ram_mode) {
+        memory_region_init_ram(&pfl->mem, OBJECT(dev), pfl->name, total_len, errp);
+    } else {
+        memory_region_init_rom_device(
+            &pfl->mem, OBJECT(dev),
+            &pflash_cfi01_ops,
+            pfl,
+            pfl->name, total_len, errp);
+    }
     if (*errp) {
         return;
     }
 
     pfl->storage = memory_region_get_ram_ptr(&pfl->mem);
-    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);
+
+    if (!pfl->ram_mode) {
+        sysbus_init_mmio(SYS_BUS_DEVICE(dev), &pfl->mem);
+    }
 
     if (pfl->blk) {
         uint64_t perm;
@@ -879,7 +887,9 @@ static void pflash_cfi01_system_reset(DeviceState *dev)
      */
     pfl->cmd = 0x00;
     pfl->wcycle = 0;
-    memory_region_rom_device_set_romd(&pfl->mem, true);
+    if (!pfl->ram_mode) {
+        memory_region_rom_device_set_romd(&pfl->mem, true);
+    }
     /*
      * The WSM ready timer occurs at most 150ns after system reset.
      * This model deliberately ignores this delay.
@@ -924,6 +934,7 @@ static Property pflash_cfi01_properties[] = {
     DEFINE_PROP_STRING("name", PFlashCFI01, name),
     DEFINE_PROP_BOOL("old-multiple-chip-handling", PFlashCFI01,
                      old_multiple_chip_handling, false),
+    DEFINE_PROP_BOOL("ram-mode", PFlashCFI01, ram_mode, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 75b34d02cb4f..03c84b5aaa32 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -38,6 +38,7 @@
 #include "hw/block/flash.h"
 #include "sysemu/kvm.h"
 #include "sev.h"
+#include "kvm/tdx.h"
 
 #define FLASH_SECTOR_SIZE 4096
 
@@ -184,12 +185,19 @@ static void pc_system_flash_map(PCMachineState *pcms,
         total_size += size;
         qdev_prop_set_uint32(DEVICE(system_flash), "num-blocks",
                              size / FLASH_SECTOR_SIZE);
+        qdev_prop_set_bit(DEVICE(system_flash), "ram-mode", is_tdx_vm());
         sysbus_realize_and_unref(SYS_BUS_DEVICE(system_flash), &error_fatal);
-        sysbus_mmio_map(SYS_BUS_DEVICE(system_flash), 0,
-                        0x100000000ULL - total_size);
+        flash_mem = pflash_cfi01_get_memory(system_flash);
+        if (is_tdx_vm()) {
+            memory_region_add_subregion(get_system_memory(),
+                                        0x100000000ULL - total_size,
+                                        flash_mem);
+        } else {
+            sysbus_mmio_map(SYS_BUS_DEVICE(system_flash), 0,
+                            0x100000000ULL - total_size);
+        }
 
         if (i == 0) {
-            flash_mem = pflash_cfi01_get_memory(system_flash);
             pc_isa_bios_init(rom_memory, flash_mem, size);
 
             /* Encrypt the pflash boot ROM */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 18/36] i386/tdvf: Introduce function to parse TDVF metadata
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

From: Isaku Yamahata <isaku.yamahata@intel.com>

A TDX VM needs to boot with its specialized firmware, Trust Domain
Virtual Firmware (TDVF). QEMU needs to parse TDVF and map it into TD
guest memory before running the TDX VM.

The TDVF metadata in a TDVF image describes the structure of the firmware.
QEMU uses it to set up memory for TDVF. Introduce the function
tdvf_parse_metadata() to parse the metadata from a TDVF image and store
the information for each TDVF section.

The TDX metadata is located through a TDX metadata offset block, which is
a GUID-ed structure. The data portion of the GUID structure contains
only a 4-byte field, which is the offset of the TDX metadata from the end
of the firmware file.

Select X86_FW_OVMF when TDX is enabled to leverage the existing functions
that parse and search OVMF's GUID-ed structures.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/Kconfig        |   1 +
 hw/i386/meson.build    |   1 +
 hw/i386/tdvf.c         | 196 +++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/tdvf.h |  51 +++++++++++
 4 files changed, 249 insertions(+)
 create mode 100644 hw/i386/tdvf.c
 create mode 100644 include/hw/i386/tdvf.h
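
The lookup described above can be sketched on a plain byte buffer as
follows. This is a simplified standalone illustration, not the code added
by this patch: the function names are hypothetical, the offset field is
passed in directly instead of being found via the OVMF GUID table, and
the bounds checks are slightly stricter than in tdvf_get_metadata().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TDVF_SIGNATURE_LE32     0x46564454 /* "TDVF" as little endian */
#define TDX_METADATA_VERSION    1
/* Signature + Length + Version + NumberOfSectionEntries, 4 bytes each. */
#define TDVF_METADATA_HDR_LEN   16

/* Read a little-endian 32-bit value regardless of host endianness. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: given the firmware image and the 4-byte offset
 * (counted from the end of the image) taken from the metadata offset
 * block, return the offset of the metadata header from the start of the
 * image, or -1 if the header is out of bounds or malformed.
 */
static int64_t tdvf_find_metadata(const uint8_t *image, uint32_t size,
                                  uint32_t offset_from_end)
{
    uint32_t offset, length;

    if (offset_from_end > size || offset_from_end < TDVF_METADATA_HDR_LEN) {
        return -1;
    }
    offset = size - offset_from_end;

    if (read_le32(image + offset) != TDVF_SIGNATURE_LE32) {
        return -1;
    }
    length = read_le32(image + offset + 4);
    if (length < TDVF_METADATA_HDR_LEN || length > size - offset) {
        return -1;
    }
    if (read_le32(image + offset + 8) != TDX_METADATA_VERSION) {
        return -1;
    }
    return offset;
}
```

Reading the fields byte by byte avoids both unaligned accesses and
in-place byte swapping of the firmware image.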

diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index 9e40ff79fc2d..0c3e3a464012 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -12,6 +12,7 @@ config SGX
 
 config TDX
     bool
+    select X86_FW_OVMF
     depends on KVM
 
 config PC
diff --git a/hw/i386/meson.build b/hw/i386/meson.build
index 213e2e82b3d7..97f3b50503b0 100644
--- a/hw/i386/meson.build
+++ b/hw/i386/meson.build
@@ -28,6 +28,7 @@ i386_ss.add(when: 'CONFIG_PC', if_true: files(
   'port92.c'))
 i386_ss.add(when: 'CONFIG_X86_FW_OVMF', if_true: files('pc_sysfw_ovmf.c'),
                                         if_false: files('pc_sysfw_ovmf-stubs.c'))
+i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c'))
 
 subdir('kvm')
 subdir('xen')
diff --git a/hw/i386/tdvf.c b/hw/i386/tdvf.c
new file mode 100644
index 000000000000..02da1d2c12dd
--- /dev/null
+++ b/hw/i386/tdvf.c
@@ -0,0 +1,196 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+
+ * Copyright (c) 2020 Intel Corporation
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/tdvf.h"
+#include "sysemu/kvm.h"
+
+#define TDX_METADATA_GUID "e47a6535-984a-4798-865e-4685a7bf8ec2"
+#define TDX_METADATA_VERSION    1
+#define TDVF_SIGNATURE_LE32     0x46564454 /* TDVF as little endian */
+
+typedef struct {
+    uint32_t DataOffset;
+    uint32_t RawDataSize;
+    uint64_t MemoryAddress;
+    uint64_t MemoryDataSize;
+    uint32_t Type;
+    uint32_t Attributes;
+} TdvfSectionEntry;
+
+typedef struct {
+    uint32_t Signature;
+    uint32_t Length;
+    uint32_t Version;
+    uint32_t NumberOfSectionEntries;
+    TdvfSectionEntry SectionEntries[];
+} TdvfMetadata;
+
+struct tdx_metadata_offset {
+    uint32_t offset;
+};
+
+static TdvfMetadata *tdvf_get_metadata(void *flash_ptr, int size)
+{
+    TdvfMetadata *metadata;
+    uint32_t offset = 0;
+    uint8_t *data;
+
+    if ((uint32_t) size != size) {
+        return NULL;
+    }
+
+    if (pc_system_ovmf_table_find(TDX_METADATA_GUID, &data, NULL)) {
+        offset = size - le32_to_cpu(((struct tdx_metadata_offset *)data)->offset);
+
+        if (offset + sizeof(*metadata) > size) {
+            return NULL;
+        }
+    } else {
+        error_report("Cannot find TDX_METADATA_GUID");
+        return NULL;
+    }
+
+    metadata = flash_ptr + offset;
+
+    /* Finally, verify the signature to determine if this is a TDVF image. */
+    if (le32_to_cpu(metadata->Signature) != TDVF_SIGNATURE_LE32) {
+        error_report("Invalid TDVF signature in metadata!");
+        return NULL;
+    }
+
+    /* Sanity check that the TDVF doesn't overlap its own metadata. */
+    metadata->Length = le32_to_cpu(metadata->Length);
+    if (offset + metadata->Length > size) {
+        return NULL;
+    }
+
+    /* Only version 1 is supported/defined. */
+    metadata->Version = le32_to_cpu(metadata->Version);
+    if (metadata->Version != TDX_METADATA_VERSION) {
+        return NULL;
+    }
+
+    return metadata;
+}
+
+static int tdvf_parse_section_entry(const TdvfSectionEntry *src,
+                                     TdxFirmwareEntry *entry)
+{
+    entry->data_offset = le32_to_cpu(src->DataOffset);
+    entry->data_len = le32_to_cpu(src->RawDataSize);
+    entry->address = le64_to_cpu(src->MemoryAddress);
+    entry->size = le64_to_cpu(src->MemoryDataSize);
+    entry->type = le32_to_cpu(src->Type);
+    entry->attributes = le32_to_cpu(src->Attributes);
+
+    /* sanity check */
+    if (entry->size < entry->data_len) {
+        error_report("Broken metadata RawDataSize 0x%x MemoryDataSize 0x%" PRIx64,
+                     entry->data_len, entry->size);
+        return -1;
+    }
+    if (!QEMU_IS_ALIGNED(entry->address, TARGET_PAGE_SIZE)) {
+        error_report("MemoryAddress 0x%" PRIx64 " not page aligned", entry->address);
+        return -1;
+    }
+    if (!QEMU_IS_ALIGNED(entry->size, TARGET_PAGE_SIZE)) {
+        error_report("MemoryDataSize 0x%" PRIx64 " not page aligned", entry->size);
+        return -1;
+    }
+
+    switch (entry->type) {
+    case TDVF_SECTION_TYPE_BFV:
+    case TDVF_SECTION_TYPE_CFV:
+        /* Sections that must be copied from the firmware image to TD memory */
+        if (entry->data_len == 0) {
+            error_report("%u section with RawDataSize == 0", entry->type);
+            return -1;
+        }
+        break;
+    case TDVF_SECTION_TYPE_TD_HOB:
+    case TDVF_SECTION_TYPE_TEMP_MEM:
+        /* Sections that need not be copied from the firmware image */
+        if (entry->data_len != 0) {
+            error_report("%u section with RawDataSize 0x%x != 0",
+                         entry->type, entry->data_len);
+            return -1;
+        }
+        break;
+    default:
+        error_report("TDVF contains unsupported section type %u", entry->type);
+        return -1;
+    }
+
+    return 0;
+}
+
+int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size)
+{
+    TdvfSectionEntry *sections;
+    TdvfMetadata *metadata;
+    ssize_t entries_size;
+    uint32_t len, i;
+
+    metadata = tdvf_get_metadata(flash_ptr, size);
+    if (!metadata) {
+        return -EINVAL;
+    }
+
+    /* Load and parse metadata entries */
+    fw->nr_entries = le32_to_cpu(metadata->NumberOfSectionEntries);
+    if (fw->nr_entries < 2) {
+        error_report("Invalid number of fw entries (%u) in TDVF", fw->nr_entries);
+        return -EINVAL;
+    }
+
+    len = metadata->Length; /* already converted by tdvf_get_metadata() */
+    entries_size = fw->nr_entries * sizeof(TdvfSectionEntry);
+    if (len != sizeof(*metadata) + entries_size) {
+        error_report("TDVF metadata len (0x%x) mismatch, expected (0x%x)",
+                     len, (uint32_t)(sizeof(*metadata) + entries_size));
+        return -EINVAL;
+    }
+
+    fw->entries = g_new(TdxFirmwareEntry, fw->nr_entries);
+    sections = g_new(TdvfSectionEntry, fw->nr_entries);
+
+    /*
+     * Work on a private copy of the section table rather than on flash.
+     */
+    memcpy(sections, metadata->SectionEntries, entries_size);
+
+    for (i = 0; i < fw->nr_entries; i++) {
+        if (tdvf_parse_section_entry(&sections[i], &fw->entries[i])) {
+            goto err;
+        }
+    }
+    g_free(sections);
+
+    return 0;
+
+err:
+    g_free(sections);
+    g_free(fw->entries);
+    fw->entries = NULL;
+    return -EINVAL;
+}
diff --git a/include/hw/i386/tdvf.h b/include/hw/i386/tdvf.h
new file mode 100644
index 000000000000..593341eb2e93
--- /dev/null
+++ b/include/hw/i386/tdvf.h
@@ -0,0 +1,51 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+
+ * Copyright (c) 2020 Intel Corporation
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_I386_TDVF_H
+#define HW_I386_TDVF_H
+
+#include "qemu/osdep.h"
+
+#define TDVF_SECTION_TYPE_BFV               0
+#define TDVF_SECTION_TYPE_CFV               1
+#define TDVF_SECTION_TYPE_TD_HOB            2
+#define TDVF_SECTION_TYPE_TEMP_MEM          3
+
+#define TDVF_SECTION_ATTRIBUTES_MR_EXTEND   (1U << 0)
+#define TDVF_SECTION_ATTRIBUTES_PAGE_AUG    (1U << 1)
+
+typedef struct TdxFirmwareEntry {
+    uint32_t data_offset;
+    uint32_t data_len;
+    uint64_t address;
+    uint64_t size;
+    uint32_t type;
+    uint32_t attributes;
+} TdxFirmwareEntry;
+
+typedef struct TdxFirmware {
+    uint32_t nr_entries;
+    TdxFirmwareEntry *entries;
+} TdxFirmware;
+
+int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size);
+
+#endif /* HW_I386_TDVF_H */
-- 
2.27.0
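
For readers following the lookup in tdvf_get_metadata() above, here is a
minimal standalone sketch of the same idea: take the 4-byte "offset from end
of file" value, locate the header, and check signature, length, version and
entry count. It is only an illustration under stated assumptions, not the
code in this series. The names tdvf_find_header(), TdvfHeader and read_le32()
are hypothetical, the sizes 16 and 32 are derived from the struct layouts in
the patch, and the GUID table lookup itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TDVF_SIGNATURE  0x46564454u /* "TDVF" read as a little-endian u32 */
#define TDVF_VERSION    1u
#define TDVF_HDR_SIZE   16u         /* Signature, Length, Version, entry count */
#define TDVF_ENTRY_SIZE 32u         /* one section entry in the table */

/* Hypothetical host-order view of the metadata header. */
typedef struct {
    uint32_t length;
    uint32_t version;
    uint32_t nr_entries;
    size_t   offset;     /* header position from the start of the image */
} TdvfHeader;

/* Read a little-endian u32 without assuming host byte order or alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * off_from_end is the 4-byte value carried by the GUID-ed offset block.
 * Returns 0 and fills *out on success, -1 on any inconsistency.
 */
static int tdvf_find_header(const uint8_t *img, size_t size,
                            uint32_t off_from_end, TdvfHeader *out)
{
    assert(img != NULL && out != NULL);

    /* The fixed-size header must lie fully inside the image. */
    if (off_from_end > size || off_from_end < TDVF_HDR_SIZE) {
        return -1;
    }
    size_t offset = size - off_from_end;
    const uint8_t *hdr = img + offset;

    if (read_le32(hdr) != TDVF_SIGNATURE) {
        return -1;
    }
    out->length = read_le32(hdr + 4);
    out->version = read_le32(hdr + 8);
    out->nr_entries = read_le32(hdr + 12);
    out->offset = offset;

    /* Header plus section table must not run past the end of the image. */
    if (out->length > off_from_end || out->version != TDVF_VERSION) {
        return -1;
    }
    /* The patch rejects images with fewer than two section entries. */
    if (out->nr_entries < 2) {
        return -1;
    }
    /* Length must match the entry count; 64-bit math avoids overflow. */
    if ((uint64_t)out->length !=
        TDVF_HDR_SIZE + (uint64_t)out->nr_entries * TDVF_ENTRY_SIZE) {
        return -1;
    }
    return 0;
}
```

Comparing off_from_end against size before subtracting keeps the offset
arithmetic from wrapping, and reading each field through read_le32() leaves
the image itself unmodified.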


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 19/36] i386/tdx: Parse TDVF metadata for TDX VM
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

When booting a TDX VM, parse the firmware as TDVF. Do this only when the
firmware is provided as flash, since that is the correct interface for
specifying firmware for a UEFI guest.

- When unified firmware is provided, there is only one pflash, pflash[0];

- When split images (CODE.fd and VARS.fd) are provided, the metadata is
  located in CODE.fd, which is pflash[0].

So parse TDVF on pflash[0].

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/pc_sysfw.c         | 21 ++++++++++++++-------
 target/i386/kvm/tdx-stub.c |  5 +++++
 target/i386/kvm/tdx.c      |  4 ++++
 target/i386/kvm/tdx.h      |  4 ++++
 4 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 03c84b5aaa32..bdec29fd9519 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -200,15 +200,16 @@ static void pc_system_flash_map(PCMachineState *pcms,
         if (i == 0) {
             pc_isa_bios_init(rom_memory, flash_mem, size);
 
+            flash_ptr = memory_region_get_ram_ptr(flash_mem);
+            flash_size = memory_region_size(flash_mem);
+            /*
+             * OVMF places a GUIDed structures in the flash, so
+             * search for them
+             */
+            pc_system_parse_ovmf_flash(flash_ptr, flash_size);
+
             /* Encrypt the pflash boot ROM */
             if (sev_enabled()) {
-                flash_ptr = memory_region_get_ram_ptr(flash_mem);
-                flash_size = memory_region_size(flash_mem);
-                /*
-                 * OVMF places a GUIDed structures in the flash, so
-                 * search for them
-                 */
-                pc_system_parse_ovmf_flash(flash_ptr, flash_size);
 
                 ret = sev_es_save_reset_vector(flash_ptr, flash_size);
                 if (ret) {
@@ -217,6 +218,12 @@ static void pc_system_flash_map(PCMachineState *pcms,
                 }
 
                 sev_encrypt_flash(flash_ptr, flash_size, &error_fatal);
+            } else if (is_tdx_vm()) {
+                ret = tdx_parse_tdvf(flash_ptr, flash_size);
+                if (ret) {
+                    error_report("failed to parse TDVF in pflash for TDX VM");
+                    exit(1);
+                }
             }
         }
     }
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
index 2871de9d7b56..395a59721266 100644
--- a/target/i386/kvm/tdx-stub.c
+++ b/target/i386/kvm/tdx-stub.c
@@ -12,3 +12,8 @@ int tdx_pre_create_vcpu(CPUState *cpu)
 {
     return -EINVAL;
 }
+
+int tdx_parse_tdvf(void *flash_ptr, int size)
+{
+    return -EINVAL;
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 1bb8211e74e6..7f34b14dc504 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -260,6 +260,10 @@ out:
     qemu_mutex_unlock(&tdx_guest->lock);
     return r;
 }
+int tdx_parse_tdvf(void *flash_ptr, int size)
+{
+    return tdvf_parse_metadata(&tdx_guest->tdvf, flash_ptr, size);
+}
 
 static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp)
 {
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 46a24ee8c7cc..12bcf25bb95b 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -6,6 +6,7 @@
 #endif
 
 #include "exec/confidential-guest-support.h"
+#include "hw/i386/tdvf.h"
 
 #define TYPE_TDX_GUEST "tdx-guest"
 #define TDX_GUEST(obj)  OBJECT_CHECK(TdxGuest, (obj), TYPE_TDX_GUEST)
@@ -21,6 +22,8 @@ typedef struct TdxGuest {
 
     bool initialized;
     uint64_t attributes;    /* TD attributes */
+
+    TdxFirmware tdvf;
 } TdxGuest;
 
 #ifdef CONFIG_TDX
@@ -33,5 +36,6 @@ int tdx_kvm_init(MachineState *ms, Error **errp);
 void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
                              uint32_t *ret);
 int tdx_pre_create_vcpu(CPUState *cpu);
+int tdx_parse_tdvf(void *flash_ptr, int size);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 20/36] i386/tdx: Get and store the mem_ptr of TDVF firmware
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

QEMU later needs to copy the contents of the TDVF firmware into guest
private memory. So get the mem_ptr of CODE.fd and VARS.fd and store them
in the tdx_guest object.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/pc_sysfw.c         | 20 ++++++++++++--------
 include/hw/i386/tdvf.h     |  4 ++++
 target/i386/kvm/tdx-stub.c |  5 +++++
 target/i386/kvm/tdx.c      |  7 +++++++
 target/i386/kvm/tdx.h      |  1 +
 5 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index bdec29fd9519..fbe3e42278cd 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -147,8 +147,8 @@ static void pc_system_flash_map(PCMachineState *pcms,
     int64_t size;
     PFlashCFI01 *system_flash;
     MemoryRegion *flash_mem;
-    void *flash_ptr;
-    int flash_size;
+    void *flash_ptr[2] = {NULL, NULL};
+    int flash_size[2];
     int ret;
 
     assert(PC_MACHINE_GET_CLASS(pcms)->pci_enabled);
@@ -197,29 +197,29 @@ static void pc_system_flash_map(PCMachineState *pcms,
                             0x100000000ULL - total_size);
         }
 
+        flash_ptr[i] = memory_region_get_ram_ptr(flash_mem);
+        flash_size[i] = memory_region_size(flash_mem);
         if (i == 0) {
             pc_isa_bios_init(rom_memory, flash_mem, size);
 
-            flash_ptr = memory_region_get_ram_ptr(flash_mem);
-            flash_size = memory_region_size(flash_mem);
             /*
              * OVMF places a GUIDed structures in the flash, so
              * search for them
              */
-            pc_system_parse_ovmf_flash(flash_ptr, flash_size);
+            pc_system_parse_ovmf_flash(flash_ptr[i], flash_size[i]);
 
             /* Encrypt the pflash boot ROM */
             if (sev_enabled()) {
 
-                ret = sev_es_save_reset_vector(flash_ptr, flash_size);
+                ret = sev_es_save_reset_vector(flash_ptr[i], flash_size[i]);
                 if (ret) {
                     error_report("failed to locate and/or save reset vector");
                     exit(1);
                 }
 
-                sev_encrypt_flash(flash_ptr, flash_size, &error_fatal);
+                sev_encrypt_flash(flash_ptr[i], flash_size[i], &error_fatal);
             } else if (is_tdx_vm()) {
-                ret = tdx_parse_tdvf(flash_ptr, flash_size);
+                ret = tdx_parse_tdvf(flash_ptr[i], flash_size[i]);
                 if (ret) {
                     error_report("failed to parse TDVF in pflash for TDX VM");
                     exit(1);
@@ -227,6 +227,10 @@ static void pc_system_flash_map(PCMachineState *pcms,
             }
         }
     }
+
+    if (is_tdx_vm()) {
+        tdx_set_code_vars_ptr(flash_ptr[0], flash_ptr[1]);
+    }
 }
 
 void pc_system_firmware_init(PCMachineState *pcms,
diff --git a/include/hw/i386/tdvf.h b/include/hw/i386/tdvf.h
index 593341eb2e93..773bd39a3bff 100644
--- a/include/hw/i386/tdvf.h
+++ b/include/hw/i386/tdvf.h
@@ -42,6 +42,10 @@ typedef struct TdxFirmwareEntry {
 } TdxFirmwareEntry;
 
 typedef struct TdxFirmware {
+    bool split_tdvf;
+    void *code_ptr;
+    void *vars_ptr;
+
     uint32_t nr_entries;
     TdxFirmwareEntry *entries;
 } TdxFirmware;
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
index 395a59721266..b548b4578276 100644
--- a/target/i386/kvm/tdx-stub.c
+++ b/target/i386/kvm/tdx-stub.c
@@ -17,3 +17,8 @@ int tdx_parse_tdvf(void *flash_ptr, int size)
 {
     return -EINVAL;
 }
+
+void tdx_set_code_vars_ptr(void *code_ptr, void *vars_ptr)
+{
+    g_assert_not_reached();
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 7f34b14dc504..cd88b6dfc280 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -265,6 +265,13 @@ int tdx_parse_tdvf(void *flash_ptr, int size)
     return tdvf_parse_metadata(&tdx_guest->tdvf, flash_ptr, size);
 }
 
+void tdx_set_code_vars_ptr(void *code_ptr, void *vars_ptr)
+{
+    tdx_guest->tdvf.code_ptr = code_ptr;
+    tdx_guest->tdvf.vars_ptr = vars_ptr;
+    tdx_guest->tdvf.split_tdvf = vars_ptr != NULL;
+}
+
 static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp)
 {
     TdxGuest *tdx = TDX_GUEST(obj);
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 12bcf25bb95b..b3cedd0d5d0c 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -37,5 +37,6 @@ void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
                              uint32_t *ret);
 int tdx_pre_create_vcpu(CPUState *cpu);
 int tdx_parse_tdvf(void *flash_ptr, int size);
+void tdx_set_code_vars_ptr(void *code_ptr, void *vars_ptr);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.27.0



* [RFC PATCH v3 20/36] i386/tdx: Get and store the mem_ptr of TDVF firmware
@ 2022-03-17 13:58   ` Xiaoyao Li
  0 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: isaku.yamahata, kvm, Connor Kuehl, seanjc, xiaoyao.li,
	qemu-devel, erdemaktas

QEMU later needs to copy the contents of the TDVF firmware into guest
private memory, so get the mem_ptr of CODE.fd and VARS.fd and store them
in the tdx_guest object.
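
For illustration only, the bookkeeping described above can be sketched
outside of QEMU as below. The struct and function names (fw_ptrs,
fw_set_ptrs) are hypothetical stand-ins, not the ones used in the patch,
and the sketch assumes that a code image is always present while the
vars image is optional.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the firmware pointer fields in this patch. */
struct fw_ptrs {
    bool split_tdvf;   /* true when a separate VARS image is mapped */
    void *code_ptr;    /* host mapping of CODE.fd (first pflash unit) */
    void *vars_ptr;    /* host mapping of VARS.fd, or NULL if absent */
};

/* Record both host pointers; the split flag follows from vars != NULL. */
static void fw_set_ptrs(struct fw_ptrs *fw, void *code, void *vars)
{
    /* Assumption of this sketch: the code image always exists. */
    assert(fw != NULL && code != NULL);

    fw->code_ptr = code;
    fw->vars_ptr = vars;
    fw->split_tdvf = (vars != NULL);
}
```

With a single combined image the second argument is NULL and split_tdvf
stays false; with separate CODE.fd and VARS.fd images both pointers are
kept and split_tdvf becomes true.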

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/pc_sysfw.c         | 20 ++++++++++++--------
 include/hw/i386/tdvf.h     |  4 ++++
 target/i386/kvm/tdx-stub.c |  5 +++++
 target/i386/kvm/tdx.c      |  7 +++++++
 target/i386/kvm/tdx.h      |  1 +
 5 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index bdec29fd9519..fbe3e42278cd 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -147,8 +147,8 @@ static void pc_system_flash_map(PCMachineState *pcms,
     int64_t size;
     PFlashCFI01 *system_flash;
     MemoryRegion *flash_mem;
-    void *flash_ptr;
-    int flash_size;
+    void *flash_ptr[2] = {NULL, NULL};
+    int flash_size[2];
     int ret;
 
     assert(PC_MACHINE_GET_CLASS(pcms)->pci_enabled);
@@ -197,29 +197,29 @@ static void pc_system_flash_map(PCMachineState *pcms,
                             0x100000000ULL - total_size);
         }
 
+        flash_ptr[i] = memory_region_get_ram_ptr(flash_mem);
+        flash_size[i] = memory_region_size(flash_mem);
         if (i == 0) {
             pc_isa_bios_init(rom_memory, flash_mem, size);
 
-            flash_ptr = memory_region_get_ram_ptr(flash_mem);
-            flash_size = memory_region_size(flash_mem);
             /*
              * OVMF places a GUIDed structures in the flash, so
              * search for them
              */
-            pc_system_parse_ovmf_flash(flash_ptr, flash_size);
+            pc_system_parse_ovmf_flash(flash_ptr[i], flash_size[i]);
 
             /* Encrypt the pflash boot ROM */
             if (sev_enabled()) {
 
-                ret = sev_es_save_reset_vector(flash_ptr, flash_size);
+                ret = sev_es_save_reset_vector(flash_ptr[i], flash_size[i]);
                 if (ret) {
                     error_report("failed to locate and/or save reset vector");
                     exit(1);
                 }
 
-                sev_encrypt_flash(flash_ptr, flash_size, &error_fatal);
+                sev_encrypt_flash(flash_ptr[i], flash_size[i], &error_fatal);
             } else if (is_tdx_vm()) {
-                ret = tdx_parse_tdvf(flash_ptr, flash_size);
+                ret = tdx_parse_tdvf(flash_ptr[i], flash_size[i]);
                 if (ret) {
                     error_report("failed to parse TDVF in pflash for TDX VM");
                     exit(1);
@@ -227,6 +227,10 @@ static void pc_system_flash_map(PCMachineState *pcms,
             }
         }
     }
+
+    if (is_tdx_vm()) {
+        tdx_set_code_vars_ptr(flash_ptr[0], flash_ptr[1]);
+    }
 }
 
 void pc_system_firmware_init(PCMachineState *pcms,
diff --git a/include/hw/i386/tdvf.h b/include/hw/i386/tdvf.h
index 593341eb2e93..773bd39a3bff 100644
--- a/include/hw/i386/tdvf.h
+++ b/include/hw/i386/tdvf.h
@@ -42,6 +42,10 @@ typedef struct TdxFirmwareEntry {
 } TdxFirmwareEntry;
 
 typedef struct TdxFirmware {
+    bool split_tdvf;
+    void *code_ptr;
+    void *vars_ptr;
+
     uint32_t nr_entries;
     TdxFirmwareEntry *entries;
 } TdxFirmware;
diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
index 395a59721266..b548b4578276 100644
--- a/target/i386/kvm/tdx-stub.c
+++ b/target/i386/kvm/tdx-stub.c
@@ -17,3 +17,8 @@ int tdx_parse_tdvf(void *flash_ptr, int size)
 {
     return -EINVAL;
 }
+
+void tdx_set_code_vars_ptr(void *code_ptr, void *vars_ptr)
+{
+    g_assert_not_reached();
+}
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 7f34b14dc504..cd88b6dfc280 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -265,6 +265,13 @@ int tdx_parse_tdvf(void *flash_ptr, int size)
     return tdvf_parse_metadata(&tdx_guest->tdvf, flash_ptr, size);
 }
 
+void tdx_set_code_vars_ptr(void *code_ptr, void *vars_ptr)
+{
+    tdx_guest->tdvf.code_ptr = code_ptr;
+    tdx_guest->tdvf.vars_ptr = vars_ptr;
+    tdx_guest->tdvf.split_tdvf = vars_ptr ? true : false;
+}
+
 static bool tdx_guest_get_sept_ve_disable(Object *obj, Error **errp)
 {
     TdxGuest *tdx = TDX_GUEST(obj);
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 12bcf25bb95b..b3cedd0d5d0c 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -37,5 +37,6 @@ void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
                              uint32_t *ret);
 int tdx_pre_create_vcpu(CPUState *cpu);
 int tdx_parse_tdvf(void *flash_ptr, int size);
+void tdx_set_code_vars_ptr(void *code_ptr, void *vars_ptr);
 
 #endif /* QEMU_I386_TDX_H */
-- 
2.27.0




* [RFC PATCH v3 21/36] i386/tdx: Track mem_ptr for each firmware entry of TDVF
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

For each TDVF section, QEMU needs to copy its content to guest private
memory via the KVM API in order to initialize it.

So add a field @mem_ptr to track the host pointer of each TDVF section.

BFV and CFV are firmware and are loaded as pflash.

TEMP_MEM and TD_HOB are always located in guest RAM below 4G,
specifically starting from 0x800000 (8M).
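
As a rough, standalone sketch of the mapping described above (not the
code in this patch): firmware-backed sections resolve to a pointer into
the firmware image, RAM-backed sections to a pointer into guest RAM. The
type and function names are hypothetical. The sketch assumes a single,
non-split firmware image and that guest RAM below 4G is mapped
contiguously from guest physical address 0 at ram_base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section types, mirroring the four kinds named above. */
enum sec_type { SEC_BFV, SEC_CFV, SEC_TD_HOB, SEC_TEMP_MEM };

struct section {
    uint32_t data_offset;   /* offset of the section in the firmware image */
    uint64_t address;       /* guest physical address of the section */
    enum sec_type type;
};

/* Return the host pointer that backs a section, or NULL if unknown. */
static uint8_t *section_host_ptr(const struct section *s,
                                 uint8_t *fw_image, uint8_t *ram_base)
{
    assert(s != NULL && fw_image != NULL && ram_base != NULL);

    switch (s->type) {
    case SEC_BFV:
    case SEC_CFV:
        /* Firmware volumes live in the (single) firmware image. */
        return fw_image + s->data_offset;
    case SEC_TD_HOB:
    case SEC_TEMP_MEM:
        /* RAM-backed sections are addressed by guest physical address. */
        return ram_base + s->address;
    }
    return NULL;
}
```

The split CODE.fd/VARS.fd case is deliberately left out here; in that
case each volume would start at the beginning of its own image.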

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 include/hw/i386/tdvf.h |  5 +++++
 target/i386/kvm/tdx.c  | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/include/hw/i386/tdvf.h b/include/hw/i386/tdvf.h
index 773bd39a3bff..ce28b7ec4543 100644
--- a/include/hw/i386/tdvf.h
+++ b/include/hw/i386/tdvf.h
@@ -39,6 +39,8 @@ typedef struct TdxFirmwareEntry {
     uint64_t size;
     uint32_t type;
     uint32_t attributes;
+
+    void *mem_ptr;
 } TdxFirmwareEntry;
 
 typedef struct TdxFirmware {
@@ -50,6 +52,9 @@ typedef struct TdxFirmware {
     TdxFirmwareEntry *entries;
 } TdxFirmware;
 
+#define for_each_tdx_fw_entry(fw, e)    \
+    for (e = (fw)->entries; e != (fw)->entries + (fw)->nr_entries; e++)
+
 int tdvf_parse_metadata(TdxFirmware *fw, void *flash_ptr, int size);
 
 #endif /* HW_I386_TDVF_H */
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index cd88b6dfc280..fe8554dcebb0 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -16,8 +16,10 @@
 #include "qom/object_interfaces.h"
 #include "standard-headers/asm-x86/kvm_para.h"
 #include "sysemu/kvm.h"
+#include "sysemu/sysemu.h"
 
 #include "hw/i386/x86.h"
+#include "hw/i386/tdvf.h"
 #include "kvm_i386.h"
 #include "tdx.h"
 
@@ -103,6 +105,44 @@ static void get_tdx_capabilities(void)
     tdx_caps = caps;
 }
 
+static void tdx_finalize_vm(Notifier *notifier, void *unused)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    void *base_ram_ptr = memory_region_get_ram_ptr(ms->ram);
+    TdxFirmware *tdvf = &tdx_guest->tdvf;
+    TdxFirmwareEntry *entry;
+
+    for_each_tdx_fw_entry(tdvf, entry) {
+        switch (entry->type) {
+        case TDVF_SECTION_TYPE_BFV:
+            if (tdvf->split_tdvf) {
+                entry->mem_ptr = tdvf->code_ptr;
+            } else {
+                entry->mem_ptr = tdvf->code_ptr + entry->data_offset;
+            }
+            break;
+        case TDVF_SECTION_TYPE_CFV:
+            if (tdvf->split_tdvf) {
+                entry->mem_ptr = tdvf->vars_ptr;
+            } else {
+                entry->mem_ptr = tdvf->code_ptr;
+            }
+            break;
+        case TDVF_SECTION_TYPE_TD_HOB:
+        case TDVF_SECTION_TYPE_TEMP_MEM:
+            entry->mem_ptr = base_ram_ptr + entry->address;
+            break;
+        default:
+            error_report("Unsupported TDVF section %d", entry->type);
+            exit(1);
+        }
+    }
+}
+
+static Notifier tdx_machine_done_notify = {
+    .notify = tdx_finalize_vm,
+};
+
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
     TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
@@ -124,6 +164,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
      */
     kvm_readonly_mem_allowed = false;
 
+    qemu_add_machine_init_done_notifier(&tdx_machine_done_notify);
+
     tdx_guest = tdx;
 
     return 0;
-- 
2.27.0



* [RFC PATCH v3 22/36] i386/tdx: Track RAM entries for TDX VM
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:58   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:58 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

The RAM of a TDX VM can be classified into two types:

 - TDX_RAM_UNACCEPTED: the default type of TDX memory, which needs to be
   accepted by the TDX guest before it can be used and will be all-zeros
   after being accepted.

 - TDX_RAM_ADDED: RAM that is ADD'ed to the TD guest before it runs and
   can be used directly without being accepted. It is used to initialize
   the TDVF TD HOB and TEMP MEM sections.

Maintain TdxRamEntries[], which takes the initial RAM info from the e820
table and marks each RAM range as the default type TDX_RAM_UNACCEPTED.

Then turn the ranges of TD HOB and TEMP MEM into TDX_RAM_ADDED, since these
ranges will be ADD'ed before the TD runs and do not need to be accepted at
runtime.

TdxRamEntries[] is later used to set up the memory resource HOBs in the
TD HOB that pass memory info from QEMU to TDVF.
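
The range bookkeeping described above can be sketched in isolation as
follows. This is a simplified illustration under stated assumptions, not
the code in this patch: the names (ram_map, map_mark_added, ...) are
hypothetical, the table is a fixed-size array instead of a growable one,
and address arithmetic is assumed not to overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum ram_type { RAM_UNACCEPTED, RAM_ADDED };

struct ram_entry { uint64_t addr, len; enum ram_type type; };

#define MAX_ENTRIES 16
struct ram_map { struct ram_entry e[MAX_ENTRIES]; unsigned n; };

static void map_push(struct ram_map *m, uint64_t addr, uint64_t len,
                     enum ram_type type)
{
    assert(m->n < MAX_ENTRIES);
    m->e[m->n++] = (struct ram_entry){ addr, len, type };
}

/*
 * Mark [addr, addr + len) as ADDED. The range must lie entirely inside
 * one UNACCEPTED entry; that entry is split into up to three pieces.
 * Returns 0 on success, -1 if no single entry contains the range.
 */
static int map_mark_added(struct ram_map *m, uint64_t addr, uint64_t len)
{
    uint64_t end = addr + len;

    if (len == 0) {
        return -1;
    }
    for (unsigned i = 0; i < m->n; i++) {
        struct ram_entry *e = &m->e[i];
        uint64_t e_addr = e->addr, e_end = e->addr + e->len;

        if (e->type != RAM_UNACCEPTED || addr < e_addr || end > e_end) {
            continue;
        }
        /* Reuse the slot for the ADDED part, append what is left over. */
        *e = (struct ram_entry){ addr, len, RAM_ADDED };
        if (addr > e_addr) {
            map_push(m, e_addr, addr - e_addr, RAM_UNACCEPTED);
        }
        if (e_end > end) {
            map_push(m, end, e_end - end, RAM_UNACCEPTED);
        }
        return 0;
    }
    return -1;
}

static int cmp_addr(const void *a, const void *b)
{
    const struct ram_entry *x = a, *y = b;
    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Splitting appends entries out of order, so sort by address afterwards. */
static void map_sort(struct ram_map *m)
{
    qsort(m->e, m->n, sizeof(m->e[0]), cmp_addr);
}
```

For example, marking a 128K range at 8M inside a single 2G UNACCEPTED
entry leaves three entries after sorting: UNACCEPTED below 8M, ADDED for
the 128K range, and UNACCEPTED for the remainder.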

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 99 +++++++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.h | 14 ++++++
 2 files changed, 113 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index fe8554dcebb0..59446ed10ce4 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -18,6 +18,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 
+#include "hw/i386/e820_memory_layout.h"
 #include "hw/i386/x86.h"
 #include "hw/i386/tdvf.h"
 #include "kvm_i386.h"
@@ -105,6 +106,98 @@ static void get_tdx_capabilities(void)
     tdx_caps = caps;
 }
 
+static void tdx_add_ram_entry(uint64_t address, uint64_t length, uint32_t type)
+{
+    uint32_t nr_entries = tdx_guest->nr_ram_entries;
+    tdx_guest->ram_entries = g_renew(TdxRamEntry, tdx_guest->ram_entries,
+                                     nr_entries + 1);
+
+    tdx_guest->ram_entries[nr_entries].address = address;
+    tdx_guest->ram_entries[nr_entries].length = length;
+    tdx_guest->ram_entries[nr_entries].type = type;
+    tdx_guest->nr_ram_entries++;
+}
+
+static int tdx_accept_ram_range(uint64_t address, uint64_t length)
+{
+    TdxRamEntry *e;
+    int i;
+
+    for (i = 0; i < tdx_guest->nr_ram_entries; i++) {
+        e = &tdx_guest->ram_entries[i];
+
+        if (address + length < e->address ||
+            e->address + e->length < address) {
+                continue;
+        }
+
+        if (e->address > address ||
+            e->address + e->length < address + length) {
+            return -EINVAL;
+        }
+
+        if (e->address == address && e->length == length) {
+            e->type = TDX_RAM_ADDED;
+        } else if (e->address == address) {
+            e->address += length;
+            e->length -= length;
+            tdx_add_ram_entry(address, length, TDX_RAM_ADDED);
+        } else if (e->address + e->length == address + length) {
+            e->length -= length;
+            tdx_add_ram_entry(address, length, TDX_RAM_ADDED);
+        } else {
+            TdxRamEntry tmp = {
+                .address = e->address,
+                .length = e->length,
+            };
+            e->length = address - tmp.address;
+
+            tdx_add_ram_entry(address, length, TDX_RAM_ADDED);
+            tdx_add_ram_entry(address + length,
+                              tmp.address + tmp.length - (address + length),
+                              TDX_RAM_UNACCEPTED);
+        }
+
+        return 0;
+    }
+
+    return -1;
+}
+
+static int tdx_ram_entry_compare(const void *lhs_, const void* rhs_)
+{
+    const TdxRamEntry *lhs = lhs_;
+    const TdxRamEntry *rhs = rhs_;
+
+    if (lhs->address == rhs->address) {
+        return 0;
+    }
+    if (lhs->address > rhs->address) {
+        return 1;
+    }
+    return -1;
+}
+
+static void tdx_init_ram_entries(void)
+{
+    unsigned i, j, nr_e820_entries;
+
+    nr_e820_entries = e820_get_num_entries();
+    tdx_guest->ram_entries = g_new(TdxRamEntry, nr_e820_entries);
+
+    for (i = 0, j = 0; i < nr_e820_entries; i++) {
+        uint64_t addr, len;
+
+        if (e820_get_entry(i, E820_RAM, &addr, &len)) {
+            tdx_guest->ram_entries[j].address = addr;
+            tdx_guest->ram_entries[j].length = len;
+            tdx_guest->ram_entries[j].type = TDX_RAM_UNACCEPTED;
+            j++;
+        }
+    }
+    tdx_guest->nr_ram_entries = j;
+}
+
 static void tdx_finalize_vm(Notifier *notifier, void *unused)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
@@ -112,6 +205,8 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
     TdxFirmware *tdvf = &tdx_guest->tdvf;
     TdxFirmwareEntry *entry;
 
+    tdx_init_ram_entries();
+
     for_each_tdx_fw_entry(tdvf, entry) {
         switch (entry->type) {
         case TDVF_SECTION_TYPE_BFV:
@@ -131,12 +226,16 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
         case TDVF_SECTION_TYPE_TD_HOB:
         case TDVF_SECTION_TYPE_TEMP_MEM:
             entry->mem_ptr = base_ram_ptr + entry->address;
+            tdx_accept_ram_range(entry->address, entry->size);
             break;
         default:
             error_report("Unsupported TDVF section %d", entry->type);
             exit(1);
         }
     }
+
+    qsort(tdx_guest->ram_entries, tdx_guest->nr_ram_entries,
+          sizeof(TdxRamEntry), &tdx_ram_entry_compare);
 }
 
 static Notifier tdx_machine_done_notify = {
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index b3cedd0d5d0c..1e2d5d6f2a24 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -15,6 +15,17 @@ typedef struct TdxGuestClass {
     ConfidentialGuestSupportClass parent_class;
 } TdxGuestClass;
 
+enum TdxRamType{
+    TDX_RAM_UNACCEPTED,
+    TDX_RAM_ADDED,
+};
+
+typedef struct TdxRamEntry {
+    uint64_t address;
+    uint64_t length;
+    uint32_t type;
+} TdxRamEntry;
+
 typedef struct TdxGuest {
     ConfidentialGuestSupport parent_obj;
 
@@ -24,6 +35,9 @@ typedef struct TdxGuest {
     uint64_t attributes;    /* TD attributes */
 
     TdxFirmware tdvf;
+
+    uint32_t nr_ram_entries;
+    TdxRamEntry *ram_entries;
 } TdxGuest;
 
 #ifdef CONFIG_TDX
-- 
2.27.0



* [RFC PATCH v3 23/36] i386/tdx: Create the TD HOB list upon machine init done
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

The TD HOB list is used to pass information from the VMM to TDVF. The TD
HOB must include a PHIT HOB and Resource Descriptor HOBs. More details can
be found in the TDVF specification and the PI specification.

Build the TD HOB in the machine_init_done callback.
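
To illustrate how such a list is laid out, here is a minimal standalone
sketch of appending 8-byte-aligned HOBs to a fixed buffer. It is not the
code in this patch: hob_buf, hob_alloc and hob_append are hypothetical
names, the header layout follows the generic HOB header of the PI
specification, and the sketch assumes a little-endian host and an
8-byte-aligned buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic HOB header: type, length, reserved (8 bytes in total). */
struct hob_header {
    uint16_t type;
    uint16_t length;
    uint32_t reserved;
};

#define HOB_TYPE_END_OF_LIST 0xFFFF  /* end-of-HOB-list marker */

/* Hypothetical bump allocator over the TD_HOB area. */
struct hob_buf {
    uint8_t *base;   /* start of the area, assumed 8-byte aligned */
    size_t size;     /* total size of the area in bytes */
    size_t used;     /* bytes handed out so far, always a multiple of 8 */
};

/* Reserve 'size' bytes rounded up to 8; NULL if the area would overrun. */
static void *hob_alloc(struct hob_buf *b, size_t size)
{
    size_t aligned = (size + 7) & ~(size_t)7;
    void *p;

    if (aligned < size || aligned > b->size - b->used) {
        return NULL;
    }
    p = b->base + b->used;
    memset(p, 0, aligned);
    b->used += aligned;
    return p;
}

/* Append a HOB of 'length' bytes (header included) and fill its header. */
static struct hob_header *hob_append(struct hob_buf *b, uint16_t type,
                                     size_t length)
{
    struct hob_header *h;

    assert(length >= sizeof(*h) && length <= UINT16_MAX);
    h = hob_alloc(b, length);
    if (h) {
        h->type = type;
        h->length = (uint16_t)length;
    }
    return h;
}
```

A caller would append the PHIT HOB first, then one resource descriptor
per memory or MMIO range, and finish with an end-of-list HOB; running
out of space shows up as a NULL return instead of a silent overrun.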

Co-developed-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/meson.build   |   2 +-
 hw/i386/tdvf-hob.c    | 212 ++++++++++++++++++++++++++++++++++++++++++
 hw/i386/tdvf-hob.h    |  25 +++++
 hw/i386/uefi.h        | 198 +++++++++++++++++++++++++++++++++++++++
 target/i386/kvm/tdx.c |  15 +++
 5 files changed, 451 insertions(+), 1 deletion(-)
 create mode 100644 hw/i386/tdvf-hob.c
 create mode 100644 hw/i386/tdvf-hob.h
 create mode 100644 hw/i386/uefi.h

diff --git a/hw/i386/meson.build b/hw/i386/meson.build
index 97f3b50503b0..b59e0d35bba3 100644
--- a/hw/i386/meson.build
+++ b/hw/i386/meson.build
@@ -28,7 +28,7 @@ i386_ss.add(when: 'CONFIG_PC', if_true: files(
   'port92.c'))
 i386_ss.add(when: 'CONFIG_X86_FW_OVMF', if_true: files('pc_sysfw_ovmf.c'),
                                         if_false: files('pc_sysfw_ovmf-stubs.c'))
-i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c'))
+i386_ss.add(when: 'CONFIG_TDX', if_true: files('tdvf.c', 'tdvf-hob.c'))
 
 subdir('kvm')
 subdir('xen')
diff --git a/hw/i386/tdvf-hob.c b/hw/i386/tdvf-hob.c
new file mode 100644
index 000000000000..31160e9f95c5
--- /dev/null
+++ b/hw/i386/tdvf-hob.c
@@ -0,0 +1,212 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+
+ * Copyright (c) 2020 Intel Corporation
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "e820_memory_layout.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/x86.h"
+#include "hw/pci/pcie_host.h"
+#include "sysemu/kvm.h"
+#include "tdvf-hob.h"
+#include "uefi.h"
+
+typedef struct TdvfHob {
+    hwaddr hob_addr;
+    void *ptr;
+    int size;
+
+    /* working area */
+    void *current;
+    void *end;
+} TdvfHob;
+
+static uint64_t tdvf_current_guest_addr(const TdvfHob *hob)
+{
+    return hob->hob_addr + (hob->current - hob->ptr);
+}
+
+static void tdvf_align(TdvfHob *hob, size_t align)
+{
+    hob->current = QEMU_ALIGN_PTR_UP(hob->current, align);
+}
+
+static void *tdvf_get_area(TdvfHob *hob, uint64_t size)
+{
+    void *ret;
+
+    if (hob->current + size > hob->end) {
+        error_report("TD_HOB overrun, size = 0x%" PRIx64, size);
+        exit(1);
+    }
+
+    ret = hob->current;
+    hob->current += size;
+    tdvf_align(hob, 8);
+    return ret;
+}
+
+static void tdvf_hob_add_mmio_resource(TdvfHob *hob, uint64_t start,
+                                       uint64_t end)
+{
+    EFI_HOB_RESOURCE_DESCRIPTOR *region;
+
+    if (!start) {
+        return;
+    }
+
+    region = tdvf_get_area(hob, sizeof(*region));
+    *region = (EFI_HOB_RESOURCE_DESCRIPTOR) {
+        .Header = {
+            .HobType = EFI_HOB_TYPE_RESOURCE_DESCRIPTOR,
+            .HobLength = cpu_to_le16(sizeof(*region)),
+            .Reserved = cpu_to_le32(0),
+        },
+        .Owner = EFI_HOB_OWNER_ZERO,
+        .ResourceType = cpu_to_le32(EFI_RESOURCE_MEMORY_MAPPED_IO),
+        .ResourceAttribute = cpu_to_le32(EFI_RESOURCE_ATTRIBUTE_TDVF_MMIO),
+        .PhysicalStart = cpu_to_le64(start),
+        .ResourceLength = cpu_to_le64(end - start),
+    };
+}
+
+static void tdvf_hob_add_mmio_resources(TdvfHob *hob)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    X86MachineState *x86ms = X86_MACHINE(ms);
+    PCIHostState *pci_host;
+    uint64_t start, end;
+    uint64_t mcfg_base, mcfg_size;
+    Object *host;
+
+    /* Effectively PCI hole + other MMIO devices. */
+    tdvf_hob_add_mmio_resource(hob, x86ms->below_4g_mem_size,
+                               APIC_DEFAULT_ADDRESS);
+
+    /* Stolen from acpi_get_i386_pci_host(), there's gotta be an easier way. */
+    pci_host = OBJECT_CHECK(PCIHostState,
+                            object_resolve_path("/machine/i440fx", NULL),
+                            TYPE_PCI_HOST_BRIDGE);
+    if (!pci_host) {
+        pci_host = OBJECT_CHECK(PCIHostState,
+                                object_resolve_path("/machine/q35", NULL),
+                                TYPE_PCI_HOST_BRIDGE);
+    }
+    g_assert(pci_host);
+
+    host = OBJECT(pci_host);
+
+    /* PCI hole above 4gb. */
+    start = object_property_get_uint(host, PCI_HOST_PROP_PCI_HOLE64_START,
+                                     NULL);
+    end = object_property_get_uint(host, PCI_HOST_PROP_PCI_HOLE64_END, NULL);
+    tdvf_hob_add_mmio_resource(hob, start, end);
+
+    /* MMCFG region */
+    mcfg_base = object_property_get_uint(host, PCIE_HOST_MCFG_BASE, NULL);
+    mcfg_size = object_property_get_uint(host, PCIE_HOST_MCFG_SIZE, NULL);
+    if (mcfg_base && mcfg_base != PCIE_BASE_ADDR_UNMAPPED && mcfg_size) {
+        tdvf_hob_add_mmio_resource(hob, mcfg_base, mcfg_base + mcfg_size);
+    }
+}
+
+static void tdvf_hob_add_memory_resources(TdxGuest *tdx, TdvfHob *hob)
+{
+    EFI_HOB_RESOURCE_DESCRIPTOR *region;
+    EFI_RESOURCE_ATTRIBUTE_TYPE attr;
+    EFI_RESOURCE_TYPE resource_type;
+
+    TdxRamEntry *e;
+    int i;
+
+    for (i = 0; i < tdx->nr_ram_entries; i++) {
+        e = &tdx->ram_entries[i];
+
+        if (e->type == TDX_RAM_UNACCEPTED) {
+            resource_type = EFI_RESOURCE_MEMORY_UNACCEPTED;
+            attr = EFI_RESOURCE_ATTRIBUTE_TDVF_UNACCEPTED;
+        } else if (e->type == TDX_RAM_ADDED) {
+            resource_type = EFI_RESOURCE_SYSTEM_MEMORY;
+            attr = EFI_RESOURCE_ATTRIBUTE_TDVF_PRIVATE;
+        } else {
+            error_report("unknown TdxRamEntry type %d", e->type);
+            exit(1);
+        }
+
+        region = tdvf_get_area(hob, sizeof(*region));
+        *region = (EFI_HOB_RESOURCE_DESCRIPTOR) {
+            .Header = {
+                .HobType = cpu_to_le16(EFI_HOB_TYPE_RESOURCE_DESCRIPTOR),
+                .HobLength = cpu_to_le16(sizeof(*region)),
+                .Reserved = cpu_to_le32(0),
+            },
+            .Owner = EFI_HOB_OWNER_ZERO,
+            .ResourceType = cpu_to_le32(resource_type),
+            .ResourceAttribute = cpu_to_le32(attr),
+            .PhysicalStart = cpu_to_le64(e->address),
+            .ResourceLength = cpu_to_le64(e->length),
+        };
+    }
+}
+
+void tdvf_hob_create(TdxGuest *tdx, TdxFirmwareEntry *td_hob)
+{
+    TdvfHob hob = {
+        .hob_addr = td_hob->address,
+        .size = td_hob->size,
+        .ptr = td_hob->mem_ptr,
+
+        .current = td_hob->mem_ptr,
+        .end = td_hob->mem_ptr + td_hob->size,
+    };
+
+    EFI_HOB_GENERIC_HEADER *last_hob;
+    EFI_HOB_HANDOFF_INFO_TABLE *hit;
+
+    /* Note, Efi{Free}Memory{Bottom,Top} are ignored, leave 'em zeroed. */
+    hit = tdvf_get_area(&hob, sizeof(*hit));
+    *hit = (EFI_HOB_HANDOFF_INFO_TABLE) {
+        .Header = {
+            .HobType = cpu_to_le16(EFI_HOB_TYPE_HANDOFF),
+            .HobLength = cpu_to_le16(sizeof(*hit)),
+            .Reserved = cpu_to_le32(0),
+        },
+        .Version = cpu_to_le32(EFI_HOB_HANDOFF_TABLE_VERSION),
+        .BootMode = cpu_to_le32(0),
+        .EfiMemoryTop = cpu_to_le64(0),
+        .EfiMemoryBottom = cpu_to_le64(0),
+        .EfiFreeMemoryTop = cpu_to_le64(0),
+        .EfiFreeMemoryBottom = cpu_to_le64(0),
+        .EfiEndOfHobList = cpu_to_le64(0), /* initialized later */
+    };
+
+    tdvf_hob_add_memory_resources(tdx, &hob);
+
+    tdvf_hob_add_mmio_resources(&hob);
+
+    last_hob = tdvf_get_area(&hob, sizeof(*last_hob));
+    *last_hob = (EFI_HOB_GENERIC_HEADER) {
+        .HobType = cpu_to_le16(EFI_HOB_TYPE_END_OF_HOB_LIST),
+        .HobLength = cpu_to_le16(sizeof(*last_hob)),
+        .Reserved = cpu_to_le32(0),
+    };
+    hit->EfiEndOfHobList = cpu_to_le64(tdvf_current_guest_addr(&hob));
+}
diff --git a/hw/i386/tdvf-hob.h b/hw/i386/tdvf-hob.h
new file mode 100644
index 000000000000..f0494e8c4af8
--- /dev/null
+++ b/hw/i386/tdvf-hob.h
@@ -0,0 +1,25 @@
+#ifndef HW_I386_TD_HOB_H
+#define HW_I386_TD_HOB_H
+
+#include "hw/i386/tdvf.h"
+#include "hw/i386/uefi.h"
+#include "target/i386/kvm/tdx.h"
+
+void tdvf_hob_create(TdxGuest *tdx, TdxFirmwareEntry *td_hob);
+
+#define EFI_RESOURCE_ATTRIBUTE_TDVF_PRIVATE     \
+    (EFI_RESOURCE_ATTRIBUTE_PRESENT |           \
+     EFI_RESOURCE_ATTRIBUTE_INITIALIZED |       \
+     EFI_RESOURCE_ATTRIBUTE_TESTED)
+
+#define EFI_RESOURCE_ATTRIBUTE_TDVF_UNACCEPTED  \
+    (EFI_RESOURCE_ATTRIBUTE_PRESENT |           \
+     EFI_RESOURCE_ATTRIBUTE_INITIALIZED |       \
+     EFI_RESOURCE_ATTRIBUTE_TESTED)
+
+#define EFI_RESOURCE_ATTRIBUTE_TDVF_MMIO        \
+    (EFI_RESOURCE_ATTRIBUTE_PRESENT     |       \
+     EFI_RESOURCE_ATTRIBUTE_INITIALIZED |       \
+     EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE)
+
+#endif
diff --git a/hw/i386/uefi.h b/hw/i386/uefi.h
new file mode 100644
index 000000000000..b15aba796156
--- /dev/null
+++ b/hw/i386/uefi.h
@@ -0,0 +1,198 @@
+/*
+ * Copyright (C) 2020 Intel Corporation
+ *
+ * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
+ *                        <isaku.yamahata at intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#ifndef HW_I386_UEFI_H
+#define HW_I386_UEFI_H
+
+/***************************************************************************/
+/*
+ * basic EFI definitions
+ * supplemented with UEFI Specification Version 2.8 (Errata A)
+ * released February 2020
+ */
+/* UEFI integers are little-endian */
+
+typedef struct {
+    uint32_t Data1;
+    uint16_t Data2;
+    uint16_t Data3;
+    uint8_t Data4[8];
+} EFI_GUID;
+
+typedef enum {
+    EfiReservedMemoryType,
+    EfiLoaderCode,
+    EfiLoaderData,
+    EfiBootServicesCode,
+    EfiBootServicesData,
+    EfiRuntimeServicesCode,
+    EfiRuntimeServicesData,
+    EfiConventionalMemory,
+    EfiUnusableMemory,
+    EfiACPIReclaimMemory,
+    EfiACPIMemoryNVS,
+    EfiMemoryMappedIO,
+    EfiMemoryMappedIOPortSpace,
+    EfiPalCode,
+    EfiPersistentMemory,
+    EfiUnacceptedMemoryType,
+    EfiMaxMemoryType
+} EFI_MEMORY_TYPE;
+
+#define EFI_HOB_HANDOFF_TABLE_VERSION 0x0009
+
+#define EFI_HOB_TYPE_HANDOFF              0x0001
+#define EFI_HOB_TYPE_MEMORY_ALLOCATION    0x0002
+#define EFI_HOB_TYPE_RESOURCE_DESCRIPTOR  0x0003
+#define EFI_HOB_TYPE_GUID_EXTENSION       0x0004
+#define EFI_HOB_TYPE_FV                   0x0005
+#define EFI_HOB_TYPE_CPU                  0x0006
+#define EFI_HOB_TYPE_MEMORY_POOL          0x0007
+#define EFI_HOB_TYPE_FV2                  0x0009
+#define EFI_HOB_TYPE_LOAD_PEIM_UNUSED     0x000A
+#define EFI_HOB_TYPE_UEFI_CAPSULE         0x000B
+#define EFI_HOB_TYPE_FV3                  0x000C
+#define EFI_HOB_TYPE_UNUSED               0xFFFE
+#define EFI_HOB_TYPE_END_OF_HOB_LIST      0xFFFF
+
+typedef struct {
+    uint16_t HobType;
+    uint16_t HobLength;
+    uint32_t Reserved;
+} EFI_HOB_GENERIC_HEADER;
+
+typedef uint64_t EFI_PHYSICAL_ADDRESS;
+typedef uint32_t EFI_BOOT_MODE;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    uint32_t Version;
+    EFI_BOOT_MODE BootMode;
+    EFI_PHYSICAL_ADDRESS EfiMemoryTop;
+    EFI_PHYSICAL_ADDRESS EfiMemoryBottom;
+    EFI_PHYSICAL_ADDRESS EfiFreeMemoryTop;
+    EFI_PHYSICAL_ADDRESS EfiFreeMemoryBottom;
+    EFI_PHYSICAL_ADDRESS EfiEndOfHobList;
+} EFI_HOB_HANDOFF_INFO_TABLE;
+
+#define EFI_RESOURCE_SYSTEM_MEMORY          0x00000000
+#define EFI_RESOURCE_MEMORY_MAPPED_IO       0x00000001
+#define EFI_RESOURCE_IO                     0x00000002
+#define EFI_RESOURCE_FIRMWARE_DEVICE        0x00000003
+#define EFI_RESOURCE_MEMORY_MAPPED_IO_PORT  0x00000004
+#define EFI_RESOURCE_MEMORY_RESERVED        0x00000005
+#define EFI_RESOURCE_IO_RESERVED            0x00000006
+#define EFI_RESOURCE_MEMORY_UNACCEPTED      0x00000007
+#define EFI_RESOURCE_MAX_MEMORY_TYPE        0x00000008
+
+#define EFI_RESOURCE_ATTRIBUTE_PRESENT                  0x00000001
+#define EFI_RESOURCE_ATTRIBUTE_INITIALIZED              0x00000002
+#define EFI_RESOURCE_ATTRIBUTE_TESTED                   0x00000004
+#define EFI_RESOURCE_ATTRIBUTE_SINGLE_BIT_ECC           0x00000008
+#define EFI_RESOURCE_ATTRIBUTE_MULTIPLE_BIT_ECC         0x00000010
+#define EFI_RESOURCE_ATTRIBUTE_ECC_RESERVED_1           0x00000020
+#define EFI_RESOURCE_ATTRIBUTE_ECC_RESERVED_2           0x00000040
+#define EFI_RESOURCE_ATTRIBUTE_READ_PROTECTED           0x00000080
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_PROTECTED          0x00000100
+#define EFI_RESOURCE_ATTRIBUTE_EXECUTION_PROTECTED      0x00000200
+#define EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE              0x00000400
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_COMBINEABLE        0x00000800
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_THROUGH_CACHEABLE  0x00001000
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_BACK_CACHEABLE     0x00002000
+#define EFI_RESOURCE_ATTRIBUTE_16_BIT_IO                0x00004000
+#define EFI_RESOURCE_ATTRIBUTE_32_BIT_IO                0x00008000
+#define EFI_RESOURCE_ATTRIBUTE_64_BIT_IO                0x00010000
+#define EFI_RESOURCE_ATTRIBUTE_UNCACHED_EXPORTED        0x00020000
+#define EFI_RESOURCE_ATTRIBUTE_READ_ONLY_PROTECTED      0x00040000
+#define EFI_RESOURCE_ATTRIBUTE_READ_ONLY_PROTECTABLE    0x00080000
+#define EFI_RESOURCE_ATTRIBUTE_READ_PROTECTABLE         0x00100000
+#define EFI_RESOURCE_ATTRIBUTE_WRITE_PROTECTABLE        0x00200000
+#define EFI_RESOURCE_ATTRIBUTE_EXECUTION_PROTECTABLE    0x00400000
+#define EFI_RESOURCE_ATTRIBUTE_PERSISTENT               0x00800000
+#define EFI_RESOURCE_ATTRIBUTE_PERSISTABLE              0x01000000
+#define EFI_RESOURCE_ATTRIBUTE_MORE_RELIABLE            0x02000000
+
+typedef uint32_t EFI_RESOURCE_TYPE;
+typedef uint32_t EFI_RESOURCE_ATTRIBUTE_TYPE;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_GUID Owner;
+    EFI_RESOURCE_TYPE ResourceType;
+    EFI_RESOURCE_ATTRIBUTE_TYPE ResourceAttribute;
+    EFI_PHYSICAL_ADDRESS PhysicalStart;
+    uint64_t ResourceLength;
+} EFI_HOB_RESOURCE_DESCRIPTOR;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_GUID Name;
+
+    /* guid specific data follows */
+} EFI_HOB_GUID_TYPE;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+} EFI_HOB_FIRMWARE_VOLUME;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+    EFI_GUID FvName;
+    EFI_GUID FileName;
+} EFI_HOB_FIRMWARE_VOLUME2;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+    uint32_t AuthenticationStatus;
+    bool ExtractedFv;
+    EFI_GUID FvName;
+    EFI_GUID FileName;
+} EFI_HOB_FIRMWARE_VOLUME3;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+    uint8_t SizeOfMemorySpace;
+    uint8_t SizeOfIoSpace;
+    uint8_t Reserved[6];
+} EFI_HOB_CPU;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+} EFI_HOB_MEMORY_POOL;
+
+typedef struct {
+    EFI_HOB_GENERIC_HEADER Header;
+
+    EFI_PHYSICAL_ADDRESS BaseAddress;
+    uint64_t Length;
+} EFI_HOB_UEFI_CAPSULE;
+
+#define EFI_HOB_OWNER_ZERO                                      \
+    ((EFI_GUID){ 0x00000000, 0x0000, 0x0000,                    \
+        { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 } })
+
+#endif
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 59446ed10ce4..f7a18f07a4df 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -21,6 +21,7 @@
 #include "hw/i386/e820_memory_layout.h"
 #include "hw/i386/x86.h"
 #include "hw/i386/tdvf.h"
+#include "hw/i386/tdvf-hob.h"
 #include "kvm_i386.h"
 #include "tdx.h"
 
@@ -106,6 +107,19 @@ static void get_tdx_capabilities(void)
     tdx_caps = caps;
 }
 
+static TdxFirmwareEntry *tdx_get_hob_entry(TdxGuest *tdx)
+{
+    TdxFirmwareEntry *entry;
+
+    for_each_tdx_fw_entry(&tdx->tdvf, entry) {
+        if (entry->type == TDVF_SECTION_TYPE_TD_HOB) {
+            return entry;
+        }
+    }
+    error_report("TDVF metadata doesn't specify TD_HOB location.");
+    exit(1);
+}
+
 static void tdx_add_ram_entry(uint64_t address, uint64_t length, uint32_t type)
 {
     uint32_t nr_entries = tdx_guest->nr_ram_entries;
@@ -236,6 +250,7 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
 
     qsort(tdx_guest->ram_entries, tdx_guest->nr_ram_entries,
           sizeof(TdxRamEntry), &tdx_ram_entry_compare);
+    tdvf_hob_create(tdx_guest, tdx_get_hob_entry(tdx_guest));
 }
 
 static Notifier tdx_machine_done_notify = {
-- 
2.27.0
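
For readers following the layout that tdvf_hob_create() produces: the list
is a PHIT HOB, then one resource descriptor HOB per RAM or MMIO range, then
an end-of-list marker. Each entry starts with the generic header, and
HobLength gives the distance to the next entry. Below is a minimal
standalone sketch of that chaining, not the patch's code. hob_append() and
hob_count() are hypothetical helpers, the constants mirror the values in
uefi.h, payloads are left zeroed, and a little-endian host is assumed so no
byte swapping is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values mirror hw/i386/uefi.h from the patch. */
#define HOB_TYPE_HANDOFF             0x0001
#define HOB_TYPE_RESOURCE_DESCRIPTOR 0x0003
#define HOB_TYPE_END_OF_HOB_LIST     0xFFFF

/* Same layout as EFI_HOB_GENERIC_HEADER. */
typedef struct {
    uint16_t HobType;
    uint16_t HobLength;
    uint32_t Reserved;
} HobHeader;

/*
 * Hypothetical helper: write a header followed by a zeroed payload.
 * Returns the offset of the next HOB, or 0 if the entry does not fit.
 */
static size_t hob_append(uint8_t *buf, size_t cap, size_t off,
                         uint16_t type, uint16_t len)
{
    HobHeader h = { type, len, 0 };

    /* A HOB covers at least its header and keeps 8-byte alignment. */
    assert(len >= sizeof(h) && len % 8 == 0);
    if (off > cap || len > cap - off) {
        return 0;
    }
    memset(buf + off, 0, len);
    memcpy(buf + off, &h, sizeof(h));
    return off + len;
}

/*
 * Hypothetical helper: walk the list via HobLength and count entries of
 * the given type. Returns -1 if the list is malformed or the end-of-list
 * marker is not found within cap bytes.
 */
static int hob_count(const uint8_t *buf, size_t cap, uint16_t type)
{
    size_t off = 0;
    int n = 0;

    while (cap - off >= sizeof(HobHeader)) {
        HobHeader h;

        memcpy(&h, buf + off, sizeof(h));
        if (h.HobType == HOB_TYPE_END_OF_HOB_LIST) {
            return n;
        }
        if (h.HobLength < sizeof(h) || h.HobLength > cap - off) {
            return -1;
        }
        if (h.HobType == type) {
            n++;
        }
        off += h.HobLength;
    }
    return -1;
}
```

With lengths of 56 (PHIT), 48 (resource descriptor) and 8 (end marker),
which is what the uefi.h structs come to on common ABIs, a list holding a
PHIT HOB, two descriptors and the end marker occupies 160 bytes.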


^ permalink raw reply related	[flat|nested] 154+ messages in thread
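
As a side note on tdvf_get_area() in the patch above: it is a bump
allocator over the TD_HOB section that hands out the current position and
then rounds the cursor up to 8 bytes. The sketch below shows the same idea
in standalone C under a few assumptions: HobArena and arena_get_area() are
hypothetical names, the buffer base is assumed to be 8-byte aligned (the
patch aligns the pointer itself), and an overrun returns NULL where the
patch calls error_report() and exit(1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the working area kept in TdvfHob. */
typedef struct {
    uint8_t *base;    /* start of the buffer, assumed 8-byte aligned */
    uint8_t *current; /* next free byte */
    uint8_t *end;     /* one past the last usable byte */
} HobArena;

/* Round an offset up to a power-of-two alignment. */
static size_t align_up(size_t off, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (off + align - 1) & ~(align - 1);
}

/*
 * Reserve size bytes and return their start, or NULL if they do not fit.
 * The cursor is aligned after the reservation, as in the patch, so the
 * next area starts on an 8-byte boundary.
 */
static void *arena_get_area(HobArena *a, size_t size)
{
    size_t used = (size_t)(a->current - a->base);
    size_t cap = (size_t)(a->end - a->base);
    size_t next;
    void *ret;

    if (size > cap - used) {
        return NULL;
    }

    ret = a->current;
    next = align_up(used + size, 8);
    if (next > cap) {
        next = cap; /* never move the cursor past the end of the buffer */
    }
    a->current = a->base + next;
    return ret;
}
```

Comparing sizes (size > cap - used) rather than pointers avoids forming an
out-of-range pointer, and clamping the aligned cursor keeps later bounds
checks meaningful when the buffer size is not a multiple of 8.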

* [RFC PATCH v3 24/36] i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

A TDX vcpu must be initialized with SEAMCALL(TDH.VP.INIT), and KVM
provides the vcpu-level ioctl KVM_TDX_INIT_VCPU for this.

KVM_TDX_INIT_VCPU takes the address of the TD HOB as input, so invoke it
for each vcpu after the HOB list has been created.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index f7a18f07a4df..f06a0895b77a 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -212,6 +212,22 @@ static void tdx_init_ram_entries(void)
     tdx_guest->nr_ram_entries = j;
 }
 
+static void tdx_post_init_vcpu(void)
+{
+    TdxFirmwareEntry *hob;
+    CPUState *cpu;
+    int r;
+
+    hob = tdx_get_hob_entry(tdx_guest);
+    CPU_FOREACH(cpu) {
+        r = tdx_vcpu_ioctl(cpu, KVM_TDX_INIT_VCPU, 0, (void *)hob->address);
+        if (r < 0) {
+            error_report("KVM_TDX_INIT_VCPU failed %s", strerror(-r));
+            exit(1);
+        }
+    }
+}
+
 static void tdx_finalize_vm(Notifier *notifier, void *unused)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
@@ -251,6 +267,8 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
     qsort(tdx_guest->ram_entries, tdx_guest->nr_ram_entries,
           sizeof(TdxRamEntry), &tdx_ram_entry_compare);
     tdvf_hob_create(tdx_guest, tdx_get_hob_entry(tdx_guest));
+
+    tdx_post_init_vcpu();
 }
 
 static Notifier tdx_machine_done_notify = {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 25/36] i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

From: Isaku Yamahata <isaku.yamahata@intel.com>

The TDVF firmware images (CODE and VARS), as well as the TD HOB and TEMP
memory, need to be added (copied) to the TD's private memory via
KVM_TDX_INIT_MEM_REGION.
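
As a rough illustration of how one TDVF section maps onto the arguments
of KVM_TDX_INIT_MEM_REGION, the standalone sketch below derives the page
count and the measurement flag for a section. The struct, the helper
names and both flag values are assumptions made for this example; they
are not taken from the QEMU or KVM headers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE        4096u
/* Hypothetical flag values, chosen only for this sketch. */
#define SKETCH_ATTR_MR_EXTEND   (1u << 0)
#define SKETCH_MEASURE_REGION   (1u << 0)

/* Hypothetical stand-in for the fields of a TDVF section entry used here. */
struct sketch_fw_entry {
    uint64_t address;     /* guest physical address of the section */
    uint64_t size;        /* section size in bytes */
    uint32_t attributes;  /* section attribute bits */
};

/* Number of 4 KiB pages in the section; the size must be page aligned. */
static uint64_t sketch_nr_pages(const struct sketch_fw_entry *e)
{
    assert(e->size % SKETCH_PAGE_SIZE == 0);
    return e->size / SKETCH_PAGE_SIZE;
}

/* Request measurement only for sections marked MR_EXTEND. */
static uint32_t sketch_measure_flag(const struct sketch_fw_entry *e)
{
    return (e->attributes & SKETCH_ATTR_MR_EXTEND) ? SKETCH_MEASURE_REGION : 0;
}
```

The alignment assertion is an addition of the sketch: a plain division
by 4096 would silently drop a trailing partial page.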

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index f06a0895b77a..fc03079571a1 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -234,6 +234,7 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
     void *base_ram_ptr = memory_region_get_ram_ptr(ms->ram);
     TdxFirmware *tdvf = &tdx_guest->tdvf;
     TdxFirmwareEntry *entry;
+    int r;
 
     tdx_init_ram_entries();
 
@@ -269,6 +270,23 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
     tdvf_hob_create(tdx_guest, tdx_get_hob_entry(tdx_guest));
 
     tdx_post_init_vcpu();
+
+    for_each_tdx_fw_entry(tdvf, entry) {
+        struct kvm_tdx_init_mem_region mem_region = {
+            .source_addr = (__u64)entry->mem_ptr,
+            .gpa = entry->address,
+            .nr_pages = entry->size / 4096,
+        };
+
+        __u32 metadata = entry->attributes & TDVF_SECTION_ATTRIBUTES_MR_EXTEND ?
+                         KVM_TDX_MEASURE_MEMORY_REGION : 0;
+
+        r = tdx_vm_ioctl(KVM_TDX_INIT_MEM_REGION, metadata, &mem_region);
+        if (r < 0) {
+             error_report("KVM_TDX_INIT_MEM_REGION failed %s", strerror(-r));
+             exit(1);
+        }
+    }
 }
 
 static Notifier tdx_machine_done_notify = {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 26/36] i386/tdx: Finalize TDX VM
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Invoke KVM_TDX_FINALIZE_VM to finalize the TD's measurement and make
the TD vCPUs runnable once machine initialization is complete.
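
The order in which tdx_finalize_vm() performs its steps at this point in
the series (HOB creation, vcpu initialization, adding firmware memory,
finalization) can be sketched as a small state machine. All names below
are hypothetical and exist only in this example; the sketch mirrors the
order used by the patches and does not claim which orderings KVM itself
would accept.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical build phases of a TD, in the order the series performs them. */
enum sketch_td_phase {
    SKETCH_TD_CREATED,      /* VM created, nothing initialized yet */
    SKETCH_TD_HOB_READY,    /* TD HOB list has been built */
    SKETCH_TD_VCPUS_READY,  /* every vcpu has been initialized */
    SKETCH_TD_MEM_ADDED,    /* firmware sections copied to private memory */
    SKETCH_TD_FINALIZED,    /* measurement finalized, vcpus may run */
};

struct sketch_td {
    enum sketch_td_phase phase;
};

/* Advance to 'next' only if it directly follows the current phase. */
static bool sketch_td_advance(struct sketch_td *td, enum sketch_td_phase next)
{
    if ((int)next != (int)td->phase + 1) {
        return false;   /* out-of-order step: refuse and keep the state */
    }
    td->phase = next;
    return true;
}

/* The TD may run only after the final phase has been reached. */
static bool sketch_td_runnable(const struct sketch_td *td)
{
    return td->phase == SKETCH_TD_FINALIZED;
}
```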

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index fc03079571a1..deb9634b27dc 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -287,6 +287,13 @@ static void tdx_finalize_vm(Notifier *notifier, void *unused)
              exit(1);
         }
     }
+
+    r = tdx_vm_ioctl(KVM_TDX_FINALIZE_VM, 0, NULL);
+    if (r < 0) {
+        error_report("KVM_TDX_FINALIZE_VM failed %s", strerror(-r));
+        exit(1);
+    }
+    tdx_guest->parent_obj.ready = true;
 }
 
 static Notifier tdx_machine_done_notify = {
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 27/36] i386/tdx: Disable SMM for TDX VMs
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

TDX doesn't support SMM, and the VMM cannot emulate SMM for TDX VMs
because it cannot manipulate a TDX VM's memory.

Disable SMM for TDX VMs and error out if the user requests to enable SMM.
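
The policy described above (treat "auto" as "off", reject an explicit
"on") can be sketched as a small helper. The enum and the function are
local to this example and only modelled on the on/off/auto machine
options; they are not QEMU definitions.

```c
#include <assert.h>
#include <errno.h>

/* Local three-state type for this sketch. */
enum sketch_on_off_auto {
    SKETCH_AUTO,
    SKETCH_ON,
    SKETCH_OFF,
};

/*
 * Resolve an option for a feature the platform cannot provide:
 * "auto" becomes "off", "off" is kept, and an explicit "on" is rejected.
 * Returns 0 on success or -EINVAL when the user asked for the feature.
 */
static int sketch_force_off(enum sketch_on_off_auto *opt)
{
    if (*opt == SKETCH_ON) {
        return -EINVAL;
    }
    *opt = SKETCH_OFF;
    return 0;
}
```

A helper of this shape would serve any machine option that has to be
forced off for a TD.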

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index deb9634b27dc..ec6f5d7a2e48 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -302,12 +302,25 @@ static Notifier tdx_machine_done_notify = {
 
 int tdx_kvm_init(MachineState *ms, Error **errp)
 {
+    X86MachineState *x86ms = X86_MACHINE(ms);
     TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
                                                     TYPE_TDX_GUEST);
     if (!tdx) {
         return -EINVAL;
     }
 
+    if (!kvm_enable_x2apic()) {
+        error_setg(errp, "Failed to enable x2apic in KVM");
+        return -EINVAL;
+    }
+
+    if (x86ms->smm == ON_OFF_AUTO_AUTO) {
+        x86ms->smm = ON_OFF_AUTO_OFF;
+    } else if (x86ms->smm == ON_OFF_AUTO_ON) {
+        error_setg(errp, "TDX VM doesn't support SMM");
+        return -EINVAL;
+    }
+
     if (!tdx_caps) {
         get_tdx_capabilities();
     }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 28/36] i386/tdx: Disable PIC for TDX VMs
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

The legacy PIC (8259) cannot be supported for TDX VMs since the TDX module
doesn't allow direct interrupt injection.  Using posted interrupts
for the PIC is not a viable option as the guest BIOS/kernel will not
do EOI for PIC IRQs, i.e. will leave the vIRR bit set.

Hence disable the PIC for TDX VMs and error out if the user requests it.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/tdx.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index ec6f5d7a2e48..6e9cb7178d25 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -321,6 +321,13 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         return -EINVAL;
     }
 
+    if (x86ms->pic == ON_OFF_AUTO_AUTO) {
+        x86ms->pic = ON_OFF_AUTO_OFF;
+    } else if (x86ms->pic == ON_OFF_AUTO_ON) {
+        error_setg(errp, "TDX VM doesn't support PIC");
+        return -EINVAL;
+    }
+
     if (!tdx_caps) {
         get_tdx_capabilities();
     }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 29/36] i386/tdx: Don't allow system reset for TDX VMs
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

TDX CPU state is protected, so vcpu state cannot be reset by the VMM.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4a8b6e2c8797..ccbafb4ca183 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5266,7 +5266,7 @@ bool kvm_has_waitpkg(void)
 
 bool kvm_arch_cpu_check_are_resettable(void)
 {
-    return !sev_es_enabled();
+    return !sev_es_enabled() && !is_tdx_vm();
 }
 
 #define ARCH_REQ_XCOMP_GUEST_PERM       0x1025
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 30/36] hw/i386: add eoi_intercept_unsupported member to X86MachineState
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Add a new bool member, eoi_intercept_unsupported, to X86MachineState
with a default value of false. Set it to true for TDX VMs.

Without the ability to intercept EOI, the VMM cannot emulate
re-injection of a level-triggered interrupt while the level is still
active, which affects interrupt controller emulation.
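
To make the dependency on EOI interception concrete, here is a minimal
model of one level-triggered line. It is a sketch under simplifying
assumptions (a single line, no priorities), and every name in it is
hypothetical rather than part of QEMU or KVM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one level-triggered interrupt line. */
struct sketch_level_irq {
    bool line_asserted;    /* device still holds the line active */
    bool in_service;       /* injected and not yet EOI'd by the guest */
    unsigned injections;   /* how many times the interrupt was delivered */
    bool eoi_intercepted;  /* can the VMM observe the guest's EOI? */
};

/* Inject if the line is active and no instance is currently in service. */
static void sketch_try_inject(struct sketch_level_irq *irq)
{
    if (irq->line_asserted && !irq->in_service) {
        irq->in_service = true;
        irq->injections++;
    }
}

/* Guest signals EOI. Only an intercepted EOI lets the VMM re-check the line. */
static void sketch_guest_eoi(struct sketch_level_irq *irq)
{
    if (!irq->eoi_intercepted) {
        return;   /* the VMM never learns about the EOI; state stays stale */
    }
    irq->in_service = false;
    sketch_try_inject(irq);   /* line still active: deliver again */
}
```

With interception the still-active line is delivered a second time after
the EOI; without it the model delivers the interrupt only once.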

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/x86.c         | 1 +
 include/hw/i386/x86.h | 1 +
 target/i386/kvm/tdx.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 10a88faf4c0e..03101f1ba1dc 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1347,6 +1347,7 @@ static void x86_machine_initfn(Object *obj)
     x86ms->oem_id = g_strndup(ACPI_BUILD_APPNAME6, 6);
     x86ms->oem_table_id = g_strndup(ACPI_BUILD_APPNAME8, 8);
     x86ms->bus_lock_ratelimit = 0;
+    x86ms->eoi_intercept_unsupported = false;
 }
 
 static void x86_machine_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index e903c69b32e0..ef863c2df625 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -58,6 +58,7 @@ struct X86MachineState {
 
     /* CPU and apic information: */
     bool apic_xrupt_override;
+    bool eoi_intercept_unsupported;
     unsigned pci_irq_mask;
     unsigned apic_id_limit;
     uint16_t boot_cpus;
diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
index 6e9cb7178d25..35e7c93de350 100644
--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -328,6 +328,8 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
         return -EINVAL;
     }
 
+    x86ms->eoi_intercept_unsupported = true;
+
     if (!tdx_caps) {
         get_tdx_capabilities();
     }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 31/36] hw/i386: add option to forcibly report edge trigger in acpi tables
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

From: Isaku Yamahata <isaku.yamahata@intel.com>

When level-triggered interrupts aren't supported on an x86 platform,
forcibly report edge trigger in the ACPI tables.
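
The selection that this patch repeats at each interrupt descriptor could
be expressed once as a helper. This is only a sketch: the enum is a local
stand-in for the ACPI trigger modes, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the ACPI interrupt trigger modes; values are arbitrary. */
enum sketch_trigger {
    SKETCH_TRIGGER_LEVEL,
    SKETCH_TRIGGER_EDGE,
};

/* Pick the trigger mode to advertise for a PCI interrupt link. */
static enum sketch_trigger sketch_link_trigger(bool level_trigger_unsupported)
{
    return level_trigger_unsupported ? SKETCH_TRIGGER_EDGE
                                     : SKETCH_TRIGGER_LEVEL;
}
```

Centralizing the choice keeps each call site to a single argument, at
the cost of one more function in the file.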

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/i386/acpi-build.c  | 99 ++++++++++++++++++++++++++++---------------
 hw/i386/acpi-common.c | 50 ++++++++++++++++------
 2 files changed, 104 insertions(+), 45 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4ad4d7286c89..a2323bad6e82 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -912,7 +912,8 @@ static void build_dbg_aml(Aml *table)
     aml_append(table, scope);
 }
 
-static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
+static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg,
+                           bool level_trigger_unsupported)
 {
     Aml *dev;
     Aml *crs;
@@ -924,7 +925,10 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
     aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
 
     crs = aml_resource_template();
-    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
+    aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                  level_trigger_unsupported ?
+                                  AML_EDGE : AML_LEVEL,
+                                  AML_ACTIVE_HIGH,
                                   AML_SHARED, irqs, ARRAY_SIZE(irqs)));
     aml_append(dev, aml_name_decl("_PRS", crs));
 
@@ -948,7 +952,8 @@ static Aml *build_link_dev(const char *name, uint8_t uid, Aml *reg)
     return dev;
  }
 
-static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
+static Aml *build_gsi_link_dev(const char *name, uint8_t uid,
+                               uint8_t gsi, bool level_trigger_unsupported)
 {
     Aml *dev;
     Aml *crs;
@@ -961,7 +966,10 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
 
     crs = aml_resource_template();
     irqs = gsi;
-    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
+    aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                  level_trigger_unsupported ?
+                                  AML_EDGE : AML_LEVEL,
+                                  AML_ACTIVE_HIGH,
                                   AML_SHARED, &irqs, 1));
     aml_append(dev, aml_name_decl("_PRS", crs));
 
@@ -980,7 +988,7 @@ static Aml *build_gsi_link_dev(const char *name, uint8_t uid, uint8_t gsi)
 }
 
 /* _CRS method - get current settings */
-static Aml *build_iqcr_method(bool is_piix4)
+static Aml *build_iqcr_method(bool is_piix4, bool level_trigger_unsupported)
 {
     Aml *if_ctx;
     uint32_t irqs;
@@ -988,7 +996,9 @@ static Aml *build_iqcr_method(bool is_piix4)
     Aml *crs = aml_resource_template();
 
     irqs = 0;
-    aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
+    aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                  level_trigger_unsupported ?
+                                  AML_EDGE : AML_LEVEL,
                                   AML_ACTIVE_HIGH, AML_SHARED, &irqs, 1));
     aml_append(method, aml_name_decl("PRR0", crs));
 
@@ -1022,7 +1032,7 @@ static Aml *build_irq_status_method(void)
     return method;
 }
 
-static void build_piix4_pci0_int(Aml *table)
+static void build_piix4_pci0_int(Aml *table, bool level_trigger_unsupported)
 {
     Aml *dev;
     Aml *crs;
@@ -1043,12 +1053,16 @@ static void build_piix4_pci0_int(Aml *table)
     aml_append(sb_scope, field);
 
     aml_append(sb_scope, build_irq_status_method());
-    aml_append(sb_scope, build_iqcr_method(true));
+    aml_append(sb_scope, build_iqcr_method(true, level_trigger_unsupported));
 
-    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0")));
-    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1")));
-    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2")));
-    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3")));
+    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQ0"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQ1"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQ2"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQ3"),
+                                        level_trigger_unsupported));
 
     dev = aml_device("LNKS");
     {
@@ -1057,7 +1071,9 @@ static void build_piix4_pci0_int(Aml *table)
 
         crs = aml_resource_template();
         irqs = 9;
-        aml_append(crs, aml_interrupt(AML_CONSUMER, AML_LEVEL,
+        aml_append(crs, aml_interrupt(AML_CONSUMER,
+                                      level_trigger_unsupported ?
+                                      AML_EDGE : AML_LEVEL,
                                       AML_ACTIVE_HIGH, AML_SHARED,
                                       &irqs, 1));
         aml_append(dev, aml_name_decl("_PRS", crs));
@@ -1143,7 +1159,7 @@ static Aml *build_q35_routing_table(const char *str)
     return pkg;
 }
 
-static void build_q35_pci0_int(Aml *table)
+static void build_q35_pci0_int(Aml *table, bool level_trigger_unsupported)
 {
     Aml *field;
     Aml *method;
@@ -1195,25 +1211,41 @@ static void build_q35_pci0_int(Aml *table)
     aml_append(sb_scope, field);
 
     aml_append(sb_scope, build_irq_status_method());
-    aml_append(sb_scope, build_iqcr_method(false));
+    aml_append(sb_scope, build_iqcr_method(false, level_trigger_unsupported));
 
-    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA")));
-    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB")));
-    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC")));
-    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD")));
-    aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE")));
-    aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF")));
-    aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG")));
-    aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH")));
+    aml_append(sb_scope, build_link_dev("LNKA", 0, aml_name("PRQA"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKB", 1, aml_name("PRQB"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKC", 2, aml_name("PRQC"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKD", 3, aml_name("PRQD"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKE", 4, aml_name("PRQE"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKF", 5, aml_name("PRQF"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKG", 6, aml_name("PRQG"),
+                                        level_trigger_unsupported));
+    aml_append(sb_scope, build_link_dev("LNKH", 7, aml_name("PRQH"),
+                                        level_trigger_unsupported));
 
-    aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10));
-    aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11));
-    aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12));
-    aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13));
-    aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14));
-    aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15));
-    aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16));
-    aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17));
+    aml_append(sb_scope, build_gsi_link_dev("GSIA", 0x10, 0x10,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIB", 0x11, 0x11,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIC", 0x12, 0x12,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSID", 0x13, 0x13,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIE", 0x14, 0x14,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIF", 0x15, 0x15,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIG", 0x16, 0x16,
+                                            level_trigger_unsupported));
+    aml_append(sb_scope, build_gsi_link_dev("GSIH", 0x17, 0x17,
+                                            level_trigger_unsupported));
 
     aml_append(table, sb_scope);
 }
@@ -1420,6 +1452,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
     X86MachineState *x86ms = X86_MACHINE(machine);
+    bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
     AcpiMcfgInfo mcfg;
     bool mcfg_valid = !!acpi_get_mcfg(&mcfg);
     uint32_t nr_mem = machine->ram_slots;
@@ -1454,7 +1487,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         if (pm->pcihp_bridge_en || pm->pcihp_root_en) {
             build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
         }
-        build_piix4_pci0_int(dsdt);
+        build_piix4_pci0_int(dsdt, level_trigger_unsupported);
     } else {
         sb_scope = aml_scope("_SB");
         dev = aml_device("PCI0");
@@ -1503,7 +1536,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         if (pm->pcihp_bridge_en) {
             build_x86_acpi_pci_hotplug(dsdt, pm->pcihp_io_base);
         }
-        build_q35_pci0_int(dsdt);
+        build_q35_pci0_int(dsdt, level_trigger_unsupported);
         if (pcms->smbus && !pcmc->do_not_add_smb_acpi) {
             build_smb0(dsdt, pcms->smbus, ICH9_SMB_DEV, ICH9_SMB_FUNC);
         }
diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c
index 4aaafbdd7b5d..485fc17816be 100644
--- a/hw/i386/acpi-common.c
+++ b/hw/i386/acpi-common.c
@@ -105,6 +105,7 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
     AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(adev);
     AcpiTable table = { .sig = "APIC", .rev = 1, .oem_id = oem_id,
                         .oem_table_id = oem_table_id };
+    bool level_trigger_unsupported = x86ms->eoi_intercept_unsupported;
 
     acpi_table_begin(&table, table_data);
     /* Local APIC Address */
@@ -124,18 +125,43 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
                      IO_APIC_SECONDARY_ADDRESS, IO_APIC_SECONDARY_IRQBASE);
     }
 
-    if (x86ms->apic_xrupt_override) {
-        build_xrupt_override(table_data, 0, 2,
-            0 /* Flags: Conforms to the specifications of the bus */);
-    }
-
-    for (i = 1; i < 16; i++) {
-        if (!(x86ms->pci_irq_mask & (1 << i))) {
-            /* No need for a INT source override structure. */
-            continue;
-        }
-        build_xrupt_override(table_data, i, i,
-            0xd /* Flags: Active high, Level Triggered */);
+    if (level_trigger_unsupported) {
+        /* Force edge trigger */
+        if (x86ms->apic_xrupt_override) {
+            build_xrupt_override(table_data, 0, 2,
+                                 /* Flags: active high, edge triggered */
+                                 1 | (1 << 2));
+        }
+
+        for (i = x86ms->apic_xrupt_override ? 1 : 0; i < 16; i++) {
+            build_xrupt_override(table_data, i, i,
+                                 /* Flags: active high, edge triggered */
+                                 1 | (1 << 2));
+        }
+
+        if (x86ms->ioapic2) {
+            for (i = 0; i < 16; i++) {
+                build_xrupt_override(table_data, IO_APIC_SECONDARY_IRQBASE + i,
+                                     IO_APIC_SECONDARY_IRQBASE + i,
+                                     /* Flags: active high, edge triggered */
+                                     1 | (1 << 2));
+            }
+        }
+    } else {
+        if (x86ms->apic_xrupt_override) {
+            build_xrupt_override(table_data, 0, 2,
+                                 0 /* Flags: Conforms to the specifications of the bus */);
+        }
+
+        for (i = 1; i < 16; i++) {
+            if (!(x86ms->pci_irq_mask & (1 << i))) {
+                /* No need for a INT source override structure. */
+                continue;
+            }
+            build_xrupt_override(table_data, i, i,
+                                 0xd /* Flags: Active high, Level Triggered */);
+
+        }
     }
 
     if (x2apic_mode) {
-- 
2.27.0
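
A note on the flag values used in acpi_build_madt() above: the last
argument of build_xrupt_override() is the MPS INTI flags field of a MADT
Interrupt Source Override entry, with polarity in bits 1:0 and trigger
mode in bits 3:2. The standalone sketch below is not part of the patch;
the enum names and both helper functions are hypothetical and only
illustrate how `1 | (1 << 2)` and `0xd` decode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MPS INTI flag fields of a MADT Interrupt Source Override entry. */
enum {
    INTI_POLARITY_MASK = 0x3,       /* bits 1:0 */
    INTI_POLARITY_BUS  = 0x0,       /* conforms to the bus specification */
    INTI_POLARITY_HIGH = 0x1,       /* active high */
    INTI_POLARITY_LOW  = 0x3,       /* active low */
    INTI_TRIGGER_MASK  = 0x3 << 2,  /* bits 3:2 */
    INTI_TRIGGER_BUS   = 0x0 << 2,  /* conforms to the bus specification */
    INTI_TRIGGER_EDGE  = 0x1 << 2,  /* edge triggered */
    INTI_TRIGGER_LEVEL = 0x3 << 2,  /* level triggered */
};

/*
 * Hypothetical helper: flags for an active-high override, forced to edge
 * trigger when level trigger is unsupported (as the patch does for TDX).
 */
uint16_t inti_flags_active_high(bool level_trigger_unsupported)
{
    uint16_t flags = INTI_POLARITY_HIGH |
        (level_trigger_unsupported ? INTI_TRIGGER_EDGE : INTI_TRIGGER_LEVEL);

    /* Only the polarity and trigger fields may be set. */
    assert((flags & ~(INTI_POLARITY_MASK | INTI_TRIGGER_MASK)) == 0);
    return flags;
}

/* Hypothetical helper: true if the flags describe an edge-triggered line. */
bool inti_is_edge(uint16_t flags)
{
    return (flags & INTI_TRIGGER_MASK) == INTI_TRIGGER_EDGE;
}
```

With these definitions, inti_flags_active_high(true) yields 0x5, the same
value as `1 | (1 << 2)` in the patch, and inti_flags_active_high(false)
yields the pre-existing 0xd.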


* [RFC PATCH v3 32/36] i386/tdx: Don't synchronize guest tsc for TDs
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

From: Isaku Yamahata <isaku.yamahata@intel.com>

The TSC of a TD is not accessible, and KVM doesn't allow access to
MSR_IA32_TSC for TDs. To avoid hitting the assert() in kvm_get_tsc(), make
kvm_synchronize_all_tsc() a no-op for TDs.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index ccbafb4ca183..f6024b723b70 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -273,7 +273,7 @@ void kvm_synchronize_all_tsc(void)
 {
     CPUState *cpu;
 
-    if (kvm_enabled()) {
+    if (kvm_enabled() && !is_tdx_vm()) {
         CPU_FOREACH(cpu) {
             run_on_cpu(cpu, do_kvm_synchronize_tsc, RUN_ON_CPU_NULL);
         }
-- 
2.27.0


* [RFC PATCH v3 33/36] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

For TDs, MSR_IA32_UCODE_REV is the only MSR set in kvm_init_msrs() that
the VMM can configure. The features enumerated/controlled by the other
MSRs set in kvm_init_msrs() are not under the VMM's control.

Only configure MSR_IA32_UCODE_REV for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm.c | 44 ++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f6024b723b70..480c05d6c969 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3055,32 +3055,34 @@ static void kvm_init_msrs(X86CPU *cpu)
     CPUX86State *env = &cpu->env;
 
     kvm_msr_buf_reset(cpu);
-    if (has_msr_arch_capabs) {
-        kvm_msr_entry_add(cpu, MSR_IA32_ARCH_CAPABILITIES,
-                          env->features[FEAT_ARCH_CAPABILITIES]);
-    }
-
-    if (has_msr_core_capabs) {
-        kvm_msr_entry_add(cpu, MSR_IA32_CORE_CAPABILITY,
-                          env->features[FEAT_CORE_CAPABILITY]);
-    }
-
-    if (has_msr_perf_capabs && cpu->enable_pmu) {
-        kvm_msr_entry_add_perf(cpu, env->features);
+
+    if (!is_tdx_vm()) {
+        if (has_msr_arch_capabs) {
+            kvm_msr_entry_add(cpu, MSR_IA32_ARCH_CAPABILITIES,
+                              env->features[FEAT_ARCH_CAPABILITIES]);
+        }
+
+        if (has_msr_core_capabs) {
+            kvm_msr_entry_add(cpu, MSR_IA32_CORE_CAPABILITY,
+                              env->features[FEAT_CORE_CAPABILITY]);
+        }
+
+        if (has_msr_perf_capabs && cpu->enable_pmu) {
+            kvm_msr_entry_add_perf(cpu, env->features);
+        }
+
+        /*
+         * Older kernels do not include VMX MSRs in KVM_GET_MSR_INDEX_LIST, but
+         * all kernels with MSR features should have them.
+         */
+        if (kvm_feature_msrs && cpu_has_vmx(env)) {
+            kvm_msr_entry_add_vmx(cpu, env->features);
+        }
     }
 
     if (has_msr_ucode_rev) {
         kvm_msr_entry_add(cpu, MSR_IA32_UCODE_REV, cpu->ucode_rev);
     }
-
-    /*
-     * Older kernels do not include VMX MSRs in KVM_GET_MSR_INDEX_LIST, but
-     * all kernels with MSR features should have them.
-     */
-    if (kvm_feature_msrs && cpu_has_vmx(env)) {
-        kvm_msr_entry_add_vmx(cpu, env->features);
-    }
-
     assert(kvm_buf_set_msrs(cpu) == 0);
 }
 
-- 
2.27.0
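
To make the resulting control flow easier to see outside the diff, here
is a minimal standalone sketch of the gating in kvm_init_msrs(). It is
not QEMU code: struct msr_caps and collect_init_msrs() are hypothetical
stand-ins for the has_msr_* globals and the kvm_msr_entry_add() calls,
and the perf and VMX MSR groups are left out for brevity. Only the MSR
indices are the architectural ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_UCODE_REV         0x8b
#define MSR_IA32_CORE_CAPABILITY   0xcf
#define MSR_IA32_ARCH_CAPABILITIES 0x10a

/* Hypothetical stand-in for QEMU's has_msr_* probe results. */
struct msr_caps {
    bool has_arch_capabs;
    bool has_core_capabs;
    bool has_ucode_rev;
};

/*
 * Collect the MSR indices that would be written at vCPU init.
 * For a TD, everything except MSR_IA32_UCODE_REV is skipped.
 * Returns the number of indices stored in out[].
 */
size_t collect_init_msrs(bool is_td, const struct msr_caps *caps,
                         uint32_t *out, size_t max)
{
    size_t n = 0;

    assert(caps != NULL && out != NULL);

    if (!is_td) {
        if (caps->has_arch_capabs && n < max) {
            out[n++] = MSR_IA32_ARCH_CAPABILITIES;
        }
        if (caps->has_core_capabs && n < max) {
            out[n++] = MSR_IA32_CORE_CAPABILITY;
        }
    }

    /* The only MSR in this group that the VMM may configure for a TD. */
    if (caps->has_ucode_rev && n < max) {
        out[n++] = MSR_IA32_UCODE_REV;
    }

    return n;
}
```

Called with is_td set and all capabilities present, the sketch returns a
single entry, MSR_IA32_UCODE_REV; for a non-TD guest it returns all three
in the order the patched function adds them.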


* [RFC PATCH v3 34/36] i386/tdx: Skip kvm_put_apicbase() for TDs
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

KVM doesn't allow writing to MSR_IA32_APICBASE for TDs.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 480c05d6c969..9c7eb3dea0a8 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2837,6 +2837,11 @@ void kvm_put_apicbase(X86CPU *cpu, uint64_t value)
 {
     int ret;
 
+    /* TODO: Allow accessing guest state for debug TDs. */
+    if (is_tdx_vm()) {
+        return;
+    }
+
     ret = kvm_put_one_msr(cpu, MSR_IA32_APICBASE, value);
     assert(ret == 1);
 }
-- 
2.27.0


* [RFC PATCH v3 35/36] i386/tdx: Don't get/put guest state for TDX VMs
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

From: Sean Christopherson <sean.j.christopherson@intel.com>

Don't get/put state of TDX VMs since accessing/mutating guest state of
production TDs is not supported.

Note that this will be allowed for debug TDs. The corresponding support
will be added when debug TD support is implemented.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/kvm/kvm.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 9c7eb3dea0a8..dafb63d4d2d7 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4395,6 +4395,11 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
+    /* TODO: Allow accessing guest state for debug TDs. */
+    if (is_tdx_vm()) {
+        return 0;
+    }
+
     /* must be before kvm_put_nested_state so that EFER.SVME is set */
     ret = has_sregs2 ? kvm_put_sregs2(x86_cpu) : kvm_put_sregs(x86_cpu);
     if (ret < 0) {
@@ -4489,6 +4494,12 @@ int kvm_arch_get_registers(CPUState *cs)
     if (ret < 0) {
         goto out;
     }
+
+    /* TODO: Allow accessing guest state for debug TDs. */
+    if (is_tdx_vm()) {
+        return 0;
+    }
+
     ret = kvm_getput_regs(cpu, 0);
     if (ret < 0) {
         goto out;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* [RFC PATCH v3 36/36] docs: Add TDX documentation
  2022-03-17 13:58 ` Xiaoyao Li
@ 2022-03-17 13:59   ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-17 13:59 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, xiaoyao.li, erdemaktas, kvm,
	qemu-devel, seanjc

Add docs/system/i386/tdx.rst for TDX support, and add tdx in
confidential-guest-support.rst

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 docs/system/confidential-guest-support.rst |   1 +
 docs/system/i386/tdx.rst                   | 103 +++++++++++++++++++++
 docs/system/target-i386.rst                |   1 +
 3 files changed, 105 insertions(+)
 create mode 100644 docs/system/i386/tdx.rst

diff --git a/docs/system/confidential-guest-support.rst b/docs/system/confidential-guest-support.rst
index 0c490dbda2b7..66129fbab64c 100644
--- a/docs/system/confidential-guest-support.rst
+++ b/docs/system/confidential-guest-support.rst
@@ -38,6 +38,7 @@ Supported mechanisms
 Currently supported confidential guest mechanisms are:
 
 * AMD Secure Encrypted Virtualization (SEV) (see :doc:`i386/amd-memory-encryption`)
+* Intel Trust Domain Extensions (TDX) (see :doc:`i386/tdx`)
 * POWER Protected Execution Facility (PEF) (see :ref:`power-papr-protected-execution-facility-pef`)
 * s390x Protected Virtualization (PV) (see :doc:`s390x/protvirt`)
 
diff --git a/docs/system/i386/tdx.rst b/docs/system/i386/tdx.rst
new file mode 100644
index 000000000000..b6c410202c77
--- /dev/null
+++ b/docs/system/i386/tdx.rst
@@ -0,0 +1,103 @@
+Intel Trust Domain Extensions (TDX)
+===================================
+
+Intel Trust Domain Extensions (TDX) refers to an Intel technology that extends
+Virtual Machine Extensions (VMX) and Multi-Key Total Memory Encryption (MKTME)
+with a new kind of virtual machine guest called a Trust Domain (TD). A TD runs
+in a CPU mode that is designed to protect the confidentiality of its memory
+contents and its CPU state from any other software, including the hosting
+Virtual Machine Monitor (VMM), unless explicitly shared by the TD itself.
+
+Prerequisites
+-------------
+
+To run a TD, the physical machine needs to have the TDX module loaded and
+initialized, and the KVM hypervisor needs to have TDX support. If those
+requirements are met, ``KVM_CAP_VM_TYPES`` reports support for ``KVM_X86_TDX_VM``.
+
+Trust Domain Virtual Firmware (TDVF)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Trust Domain Virtual Firmware (TDVF) is required to provide TD services to boot
+TD Guest OS. TDVF needs to be copied to guest private memory and measured before
+a TD boots.
+
+The VM scope ``MEMORY_ENCRYPT_OP`` ioctl provides command ``KVM_TDX_INIT_MEM_REGION``
+to copy the firmware image to TD's private memory space.
+
+OVMF is the open-source firmware that implements TDVF support. It can be used
+the same way as for a standard VM, mapped into the TDX VM via the pflash
+interface, so no usage model change is required. However, unlike for a standard
+VM, it is mapped as and behaves like RAM rather than flash (a ROM device).
+
+Feature Control
+---------------
+
+Unlike a non-TDX VM, a TD's CPU features (enumerated by CPUID or MSR) are not
+under full control of the VMM. The VMM can only configure a subset of them, via
+the ``KVM_TDX_INIT_VM`` command of the VM scope ``MEMORY_ENCRYPT_OP`` ioctl.
+
+The configurable features have three types:
+
+- Attributes:
+  - PKS (bit 30) controls whether Supervisor Protection Keys is exposed to TD,
+  which determines related CPUID bit and CR4 bit;
+  - PERFMON (bit 63) controls whether PMU is exposed to TD.
+
+- XSAVE related features (XFAM):
+  XFAM is a 64-bit mask, which has the same format as XCR0 or IA32_XSS MSR. It
+  determines the set of extended features available for use by the guest TD.
+
+- CPUID features:
+  Only some bits of some CPUID leaves are directly configurable by VMM.
+
+What features can be configured is reported via TDX capabilities.
+
+TDX capabilities
+~~~~~~~~~~~~~~~~
+
+The VM scope ``MEMORY_ENCRYPT_OP`` ioctl provides command ``KVM_TDX_CAPABILITIES``
+to get the TDX capabilities from KVM. It returns a
+``struct kvm_tdx_capabilities``, which describes the supported configuration of
+attributes, XFAM and CPUIDs.
+
+Launching a TD (TDX VM)
+-----------------------
+
+To launch a TDX guest:
+
+.. parsed-literal::
+
+    |qemu_system_x86| \\
+        -machine ...,confidential-guest-support=tdx0 \\
+        -object tdx-guest,id=tdx0,[sept-ve-disable=off] \\
+        -drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd \\
+        -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd
+
+Debugging
+---------
+
+Bit 0 of the TD attributes is the DEBUG bit, which decides whether the TD runs
+in off-TD debug mode. In off-TD debug mode, the TD's VCPU state and private
+memory are accessible via given SEAMCALLs. This requires KVM to expose APIs to
+invoke those SEAMCALLs, and corresponding QEMU changes.
+
+It's targeted as future work.
+
+Restrictions
+------------
+
+ - No readonly support for private memory;
+
+ - No SMM support: SMM support requires manipulating the guest register state,
+   which is not allowed;
+
+Live Migration
+--------------
+
+TODO
+
+References
+----------
+
+- `TDX Homepage <https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html>`__
diff --git a/docs/system/target-i386.rst b/docs/system/target-i386.rst
index 96bf54889a82..16dd4f1a8c80 100644
--- a/docs/system/target-i386.rst
+++ b/docs/system/target-i386.rst
@@ -29,6 +29,7 @@ Architectural features
    i386/kvm-pv
    i386/sgx
    i386/amd-memory-encryption
+   i386/tdx
 
 .. _pcsys_005freq:
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 05/36] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-18  2:07     ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-18  2:07 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas, isaku.yamahata

On Thu, Mar 17, 2022 at 09:58:42PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> Introduce tdx_kvm_init() and invoke it in kvm_confidential_guest_init()
> if it's a TDX VM. More initialization will be added later.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  target/i386/kvm/kvm.c       | 15 ++++++---------
>  target/i386/kvm/meson.build |  2 +-
>  target/i386/kvm/tdx-stub.c  |  9 +++++++++
>  target/i386/kvm/tdx.c       | 13 +++++++++++++
>  target/i386/kvm/tdx.h       |  2 ++
>  5 files changed, 31 insertions(+), 10 deletions(-)
>  create mode 100644 target/i386/kvm/tdx-stub.c
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 70454355f3bf..26ed5faf07b8 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -54,6 +54,7 @@
>  #include "migration/blocker.h"
>  #include "exec/memattrs.h"
>  #include "trace.h"
> +#include "tdx.h"
>  
>  //#define DEBUG_KVM
>  
> @@ -2360,6 +2361,8 @@ static int kvm_confidential_guest_init(MachineState *ms, Error **errp)
>  {
>      if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_SEV_GUEST)) {
>          return sev_kvm_init(ms->cgs, errp);
> +    } else if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
> +        return tdx_kvm_init(ms, errp);
>      }
>  
>      return 0;
> @@ -2374,16 +2377,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>      Error *local_err = NULL;
>  
>      /*
> -     * Initialize SEV context, if required
> +     * Initialize confidential guest (SEV/TDX) context, if required
>       *
> -     * If no memory encryption is requested (ms->cgs == NULL) this is
> -     * a no-op.
> -     *
> -     * It's also a no-op if a non-SEV confidential guest support
> -     * mechanism is selected.  SEV is the only mechanism available to
> -     * select on x86 at present, so this doesn't arise, but if new
> -     * mechanisms are supported in future (e.g. TDX), they'll need
> -     * their own initialization either here or elsewhere.
> +     * It's a no-op if a non-SEV/non-tdx confidential guest support
> +     * mechanism is selected, i.e., ms->cgs == NULL
>       */
>      ret = kvm_confidential_guest_init(ms, &local_err);
>      if (ret < 0) {
> diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
> index b2d7d41acde2..fd30b93ecec9 100644
> --- a/target/i386/kvm/meson.build
> +++ b/target/i386/kvm/meson.build
> @@ -9,7 +9,7 @@ i386_softmmu_kvm_ss.add(files(
>  
>  i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
>  
> -i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
> +i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'), if_false: files('tdx-stub.c'))
>  
>  i386_softmmu_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
>  
> diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
> new file mode 100644
> index 000000000000..1df24735201e
> --- /dev/null
> +++ b/target/i386/kvm/tdx-stub.c
> @@ -0,0 +1,9 @@
> +#include "qemu/osdep.h"
> +#include "qemu-common.h"
> +
> +#include "tdx.h"
> +
> +int tdx_kvm_init(MachineState *ms, Error **errp)
> +{
> +    return -EINVAL;
> +}
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index d3792d4a3d56..e3b94373b316 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -12,10 +12,23 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qapi/error.h"
>  #include "qom/object_interfaces.h"
>  
> +#include "hw/i386/x86.h"
>  #include "tdx.h"
>  
> +int tdx_kvm_init(MachineState *ms, Error **errp)
> +{
> +    TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
> +                                                    TYPE_TDX_GUEST);

The caller already checks the type, so this dynamic cast is redundant. Maybe
assert instead?


-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 06/36] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-18  2:08     ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-18  2:08 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas, isaku.yamahata

On Thu, Mar 17, 2022 at 09:58:43PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index e3b94373b316..bed337e5ba18 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -14,10 +14,77 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "qom/object_interfaces.h"
> +#include "sysemu/kvm.h"
>  
>  #include "hw/i386/x86.h"
>  #include "tdx.h"
>  
> +enum tdx_ioctl_level{
> +    TDX_VM_IOCTL,
> +    TDX_VCPU_IOCTL,
> +};
> +
> +static int __tdx_ioctl(void *state, enum tdx_ioctl_level level, int cmd_id,
> +                        __u32 metadata, void *data)
> +{
> +    struct kvm_tdx_cmd tdx_cmd;
> +    int r;
> +
> +    memset(&tdx_cmd, 0x0, sizeof(tdx_cmd));
> +
> +    tdx_cmd.id = cmd_id;
> +    tdx_cmd.metadata = metadata;
> +    tdx_cmd.data = (__u64)(unsigned long)data;
> +
> +    switch (level) {
> +    case TDX_VM_IOCTL:
> +        r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
> +        break;
> +    case TDX_VCPU_IOCTL:
> +        r = kvm_vcpu_ioctl(state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
> +        break;
> +    default:
> +        error_report("Invalid tdx_ioctl_level %d", level);
> +        exit(1);
> +    }
> +
> +    return r;
> +}
> +
> +#define tdx_vm_ioctl(cmd_id, metadata, data) \
> +        __tdx_ioctl(NULL, TDX_VM_IOCTL, cmd_id, metadata, data)
> +
> +#define tdx_vcpu_ioctl(cpu, cmd_id, metadata, data) \
> +        __tdx_ioctl(cpu, TDX_VCPU_IOCTL, cmd_id, metadata, data)

There's no point in using macros here. Normal (inline) functions work just as
well.
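
As a rough sketch of that suggestion, the two wrappers could be written as
static inline functions instead. This is only an illustration and not code
from the series: tdx_ioctl_stub() and last_call are hypothetical stand-ins for
the patch's __tdx_ioctl() so that the snippet builds outside QEMU, and the cpu
argument is a plain void * here, whereas in QEMU it would presumably be a
CPUState *.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum tdx_ioctl_level {
    TDX_VM_IOCTL,
    TDX_VCPU_IOCTL,
};

/* Records the most recent call so the wrappers can be exercised standalone. */
static struct {
    void *state;
    enum tdx_ioctl_level level;
    int cmd_id;
} last_call;

/*
 * Hypothetical stand-in for the patch's __tdx_ioctl(): the real helper fills
 * a struct kvm_tdx_cmd and issues KVM_MEMORY_ENCRYPT_OP; this stub only
 * records its arguments and returns 0.
 */
static int tdx_ioctl_stub(void *state, enum tdx_ioctl_level level, int cmd_id,
                          uint32_t metadata, void *data)
{
    assert(level == TDX_VM_IOCTL || level == TDX_VCPU_IOCTL);
    (void)metadata;
    (void)data;
    last_call.state = state;
    last_call.level = level;
    last_call.cmd_id = cmd_id;
    return 0;
}

/* VM-scope wrapper: no per-vCPU state is needed, so NULL is passed. */
static inline int tdx_vm_ioctl(int cmd_id, uint32_t metadata, void *data)
{
    return tdx_ioctl_stub(NULL, TDX_VM_IOCTL, cmd_id, metadata, data);
}

/* vCPU-scope wrapper: cpu would be a CPUState * in QEMU; void * in this sketch. */
static inline int tdx_vcpu_ioctl(void *cpu, int cmd_id, uint32_t metadata,
                                 void *data)
{
    return tdx_ioctl_stub(cpu, TDX_VCPU_IOCTL, cmd_id, metadata, data);
}
```

Call sites stay the same as with the macros, but the compiler now checks the
argument types at each call.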

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-18 14:07     ` Philippe Mathieu-Daudé
  -1 siblings, 0 replies; 154+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-03-18 14:07 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

Hi,

On 17/3/22 14:58, Xiaoyao Li wrote:
> TDX VM needs to boot with Trust Domain Virtual Firmware (TDVF). Unlike
> that OVMF is mapped as rom device, TDVF needs to be mapped as private
> memory. This is because TDX architecture doesn't provide read-only
> capability for VMM, and it doesn't support instruction emulation due
> to guest memory and registers are not accessible for VMM.
> 
> On the other hand, OVMF can work as TDVF, which is usually configured
> as pflash device in QEMU. To keep the same usage (QEMU parameter),
> introduce ram_mode to pflash for TDVF. When it's creating a TDX VM,
> ram_mode will be enabled automatically that map the firmware as RAM.
> 
> Note, this implies two things:
>   1. TDVF (OVMF) is not read-only (write-protected).
> 
>   2. It doesn't support non-volatile UEFI variables as what pflash
>      supports that the change to non-volatile UEFI variables won't get
>      synced back to backend vars.fd file.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>   hw/block/pflash_cfi01.c | 25 ++++++++++++++++++-------
>   hw/i386/pc_sysfw.c      | 14 +++++++++++---
>   2 files changed, 29 insertions(+), 10 deletions(-)

If you don't need a pflash device, don't use it: simply map your nvram
region as ram in your machine. No need to clutter the pflash model like
that.

NAcked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 08/36] i386/tdx: Adjust get_supported_cpuid() for TDX VM
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-18 16:55     ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-18 16:55 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas, isaku.yamahata

On Thu, Mar 17, 2022 at 09:58:45PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> For TDX, the allowable CPUID configuration differs from what KVM
> reports for KVM scope via KVM_GET_SUPPORTED_CPUID.
> 
> - Some CPUID bits are not supported for TDX VM while KVM reports the
>   support. Mask them off for TDX VM. e.g., CPUID_EXT_VMX, some PV
>   featues.
> 
> - The supported XCR0 and XSS bits needs to be caped by tdx_caps, because
>   KVM uses them to setup XFAM of TD.
> 
> Introduce tdx_get_supported_cpuid() to adjust the
> kvm_arch_get_supported_cpuid() for TDX VM.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  target/i386/cpu.h     |  5 +++++
>  target/i386/kvm/kvm.c |  4 ++++
>  target/i386/kvm/tdx.c | 39 +++++++++++++++++++++++++++++++++++++++
>  target/i386/kvm/tdx.h |  2 ++
>  4 files changed, 50 insertions(+)
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index 5e406088a91a..7fa30f4ed7db 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -566,6 +566,11 @@ typedef enum X86Seg {
>  #define ESA_FEATURE_XFD_MASK            (1U << ESA_FEATURE_XFD_BIT)
>  
>  
> +#define XCR0_MASK       (XSTATE_FP_MASK | XSTATE_SSE_MASK | XSTATE_YMM_MASK | \
> +                         XSTATE_BNDREGS_MASK | XSTATE_BNDCSR_MASK | \
> +                         XSTATE_OPMASK_MASK | XSTATE_ZMM_Hi256_MASK | \
> +                         XSTATE_Hi16_ZMM_MASK | XSTATE_PKRU_MASK)
> +
>  /* CPUID feature words */
>  typedef enum FeatureWord {
>      FEAT_1_EDX,         /* CPUID[1].EDX */
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 26ed5faf07b8..ddbe8f64fadb 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -486,6 +486,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
>          ret |= 1U << KVM_HINTS_REALTIME;
>      }
>  
> +    if (is_tdx_vm()) {
> +        tdx_get_supported_cpuid(function, index, reg, &ret);
> +    }
> +
>      return ret;
>  }
>  
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 846511b299f4..e4ee55f30c79 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -14,6 +14,7 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "qom/object_interfaces.h"
> +#include "standard-headers/asm-x86/kvm_para.h"
>  #include "sysemu/kvm.h"
>  
>  #include "hw/i386/x86.h"
> @@ -110,6 +111,44 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>      return 0;
>  }
>  
> +void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
> +                             uint32_t *ret)
> +{
> +    switch (function) {
> +    case 1:
> +        if (reg == R_ECX) {
> +            *ret &= ~CPUID_EXT_VMX;
> +        }
> +        break;
> +    case 0xd:
> +        if (index == 0) {
> +            if (reg == R_EAX) {
> +                *ret &= (uint32_t)tdx_caps->xfam_fixed0 & XCR0_MASK;
> +                *ret |= (uint32_t)tdx_caps->xfam_fixed1 & XCR0_MASK;
> +            } else if (reg == R_EDX) {
> +                *ret &= (tdx_caps->xfam_fixed0 & XCR0_MASK) >> 32;
> +                *ret |= (tdx_caps->xfam_fixed1 & XCR0_MASK) >> 32;
> +            }
> +        } else if (index == 1) {
> +            /* TODO: Adjust XSS when it's supported. */
> +        }
> +        break;
> +    case KVM_CPUID_FEATURES:
> +        if (reg == R_EAX) {
> +            *ret &= ~((1ULL << KVM_FEATURE_CLOCKSOURCE) |
> +                      (1ULL << KVM_FEATURE_CLOCKSOURCE2) |
> +                      (1ULL << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
> +                      (1ULL << KVM_FEATURE_ASYNC_PF) |
> +                      (1ULL << KVM_FEATURE_ASYNC_PF_VMEXIT) |
> +                      (1ULL << KVM_FEATURE_ASYNC_PF_INT));

Because new feature bits may be introduced in the future (unlikely, though),
an allow-list, *ret &= (supported_bits), is better than a deny-list,
*ret &= ~(unsupported_bits).
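
To illustrate the difference, here is a minimal standalone sketch. The
DEMO_* bit positions and both helper names are hypothetical and are not
taken from the patch or from the KVM headers; DEMO_FEATURE_FUTURE stands in
for a bit that does not exist yet.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
enum {
    DEMO_FEATURE_CLOCKSOURCE  = 0,
    DEMO_FEATURE_NOP_IO_DELAY = 1,
    DEMO_FEATURE_ASYNC_PF     = 4,
    DEMO_FEATURE_FUTURE       = 20,  /* stands in for a bit added later */
};

/* Deny-list: clears only the bits known to be unsupported today. */
static uint32_t demo_mask_denylist(uint32_t eax)
{
    return eax & ~((1U << DEMO_FEATURE_CLOCKSOURCE) |
                   (1U << DEMO_FEATURE_ASYNC_PF));
}

/* Allow-list: keeps only the bits known to be supported. */
static uint32_t demo_mask_allowlist(uint32_t eax)
{
    return eax & (1U << DEMO_FEATURE_NOP_IO_DELAY);
}
```

With the deny-list, DEMO_FEATURE_FUTURE passes through to the guest without
anyone having reviewed it; with the allow-list it stays masked until it is
added explicitly.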

Thanks,

> +        }
> +        break;
> +    default:
> +        /* TODO: Use tdx_caps to adjust CPUID leafs. */
> +        break;
> +    }
> +}
> +
>  /* tdx guest */
>  OBJECT_DEFINE_TYPE_WITH_INTERFACES(TdxGuest,
>                                     tdx_guest,
> diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
> index 4036ca2f3f99..06599b65b827 100644
> --- a/target/i386/kvm/tdx.h
> +++ b/target/i386/kvm/tdx.h
> @@ -27,5 +27,7 @@ bool is_tdx_vm(void);
>  #endif /* CONFIG_TDX */
>  
>  int tdx_kvm_init(MachineState *ms, Error **errp);
> +void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
> +                             uint32_t *ret);
>  
>  #endif /* QEMU_I386_TDX_H */
> -- 
> 2.27.0
> 
> 

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 09/36] KVM: Introduce kvm_arch_pre_create_vcpu()
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-18 16:56     ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-18 16:56 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas, isaku.yamahata

On Thu, Mar 17, 2022 at 09:58:46PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
> work prior to create any vcpu. This is for i386 TDX because it needs
> call TDX_INIT_VM before creating any vcpu.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  accel/kvm/kvm-all.c    | 7 +++++++
>  include/sysemu/kvm.h   | 1 +
>  target/arm/kvm64.c     | 5 +++++
>  target/i386/kvm/kvm.c  | 5 +++++
>  target/mips/kvm.c      | 5 +++++
>  target/ppc/kvm.c       | 5 +++++
>  target/s390x/kvm/kvm.c | 5 +++++
>  7 files changed, 33 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 27864dfaeaaa..a4bb449737a6 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -465,6 +465,13 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  
>      trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>  
> +    ret = kvm_arch_pre_create_vcpu(cpu);
> +    if (ret < 0) {
> +        error_setg_errno(errp, -ret,
> +                         "kvm_init_vcpu: kvm_arch_pre_create_vcpu() failed");
> +        goto err;
> +    }
> +
>      ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
> index a783c7886811..0e94031ab7c7 100644
> --- a/include/sysemu/kvm.h
> +++ b/include/sysemu/kvm.h
> @@ -373,6 +373,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level);
>  
>  int kvm_arch_init(MachineState *ms, KVMState *s);
>  
> +int kvm_arch_pre_create_vcpu(CPUState *cpu);
>  int kvm_arch_init_vcpu(CPUState *cpu);
>  int kvm_arch_destroy_vcpu(CPUState *cpu);
>  
> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
> index ccadfbbe72be..ae7336851c62 100644
> --- a/target/arm/kvm64.c
> +++ b/target/arm/kvm64.c
> @@ -935,6 +935,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
>      return kvm_arm_init_cpreg_list(cpu);
>  }
>  
> +int kvm_arch_pre_create_vcpu(CPUState *cpu)
> +{
> +    return 0;
> +}
> +

A weak symbol can be used to avoid updating every arch.
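
A rough sketch of that idea, assuming a GCC/Clang toolchain: the demo_*
names are hypothetical, and whether weak symbols behave the same on every
host object format QEMU is built for is something to check before relying
on this.

```c
/*
 * Generic default, as it might live in common code. It is marked weak so
 * that an arch file can supply a strong definition with the same name,
 * which the linker then prefers.
 */
__attribute__((weak)) int demo_arch_pre_create_vcpu(void *cpu)
{
    (void)cpu;          /* nothing to do by default */
    return 0;
}

/* Caller in common code: unchanged whether or not an arch overrides. */
int demo_init_vcpu(void *cpu)
{
    int ret = demo_arch_pre_create_vcpu(cpu);

    if (ret < 0) {
        return ret;     /* propagate the arch-specific failure */
    }
    return 0;
}
```

Only the arch that needs the hook (here, i386 for TDX) would then provide a
non-weak definition; the other arch files would need no change.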

Thanks,
-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 16/36] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-18 17:11     ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-18 17:11 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas, isaku.yamahata

On Thu, Mar 17, 2022 at 09:58:53PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> TDX only supports readonly for shared memory but not for private memory.
> 
> In the view of QEMU, it has no idea whether a memslot is used by shared
> memory of private. Thus just mark kvm_readonly_mem_enabled to false to
> TDX VM for simplicity.
> 
> Note, pflash has dependency on readonly capability from KVM while TDX
> wants to reuse pflash interface to load TDVF (as OVMF). Excuse TDX VM
> for readonly check in pflash.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  hw/i386/pc_sysfw.c    | 2 +-
>  target/i386/kvm/tdx.c | 9 +++++++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index c8b17af95353..75b34d02cb4f 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -245,7 +245,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
>          /* Machine property pflash0 not set, use ROM mode */
>          x86_bios_rom_init(MACHINE(pcms), "bios.bin", rom_memory, false);
>      } else {
> -        if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
> +        if (kvm_enabled() && (!kvm_readonly_mem_enabled() && !is_tdx_vm())) {

Is this called before tdx_kvm_init()?

Thanks,


>              /*
>               * Older KVM cannot execute from device memory. So, flash
>               * memory cannot be used unless the readonly memory kvm
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index 94a9c1ea7e9c..1bb8211e74e6 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -115,6 +115,15 @@ int tdx_kvm_init(MachineState *ms, Error **errp)
>          get_tdx_capabilities();
>      }
>  
> +    /*
> +     * Set kvm_readonly_mem_allowed to false, because TDX only supports readonly
> +     * memory for shared memory but not for private memory. Besides, whether a
> +     * memslot is private or shared is not determined by QEMU.
> +     *
> +     * Thus, just mark readonly memory not supported for simplicity.
> +     */
> +    kvm_readonly_mem_allowed = false;
> +
>      tdx_guest = tdx;
>  
>      return 0;
> -- 
> 2.27.0
> 
> 

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 18/36] i386/tdvf: Introduce function to parse TDVF metadata
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-18 17:19     ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-18 17:19 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas, isaku.yamahata

On Thu, Mar 17, 2022 at 09:58:55PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> diff --git a/hw/i386/tdvf.c b/hw/i386/tdvf.c
> new file mode 100644
> index 000000000000..02da1d2c12dd
> --- /dev/null
> +++ b/hw/i386/tdvf.c
> @@ -0,0 +1,196 @@
> +/*
> + * SPDX-License-Identifier: GPL-2.0-or-later
> +
> + * Copyright (c) 2020 Intel Corporation
> + * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
> + *                        <isaku.yamahata at intel.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> +
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> +
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/i386/pc.h"
> +#include "hw/i386/tdvf.h"
> +#include "sysemu/kvm.h"
> +
> +#define TDX_METADATA_GUID "e47a6535-984a-4798-865e-4685a7bf8ec2"
> +#define TDX_METADATA_VERSION    1
> +#define TDVF_SIGNATURE_LE32     0x46564454 /* TDVF as little endian */

The _LE32 suffix doesn't make sense.  QEMU doesn't provide a macro version
of the byteswap helpers, so let's convert at the point of use.


> +
> +typedef struct {
> +    uint32_t DataOffset;
> +    uint32_t RawDataSize;
> +    uint64_t MemoryAddress;
> +    uint64_t MemoryDataSize;
> +    uint32_t Type;
> +    uint32_t Attributes;
> +} TdvfSectionEntry;
> +
> +typedef struct {
> +    uint32_t Signature;
> +    uint32_t Length;
> +    uint32_t Version;
> +    uint32_t NumberOfSectionEntries;
> +    TdvfSectionEntry SectionEntries[];
> +} TdvfMetadata;
> +
> +struct tdx_metadata_offset {
> +    uint32_t offset;
> +};
> +
> +static TdvfMetadata *tdvf_get_metadata(void *flash_ptr, int size)
> +{
> +    TdvfMetadata *metadata;
> +    uint32_t offset = 0;
> +    uint8_t *data;
> +
> +    if ((uint32_t) size != size) {
> +        return NULL;
> +    }
> +
> +    if (pc_system_ovmf_table_find(TDX_METADATA_GUID, &data, NULL)) {
> +        offset = size - le32_to_cpu(((struct tdx_metadata_offset *)data)->offset);
> +
> +        if (offset + sizeof(*metadata) > size) {
> +            return NULL;
> +        }
> +    } else {
> +        error_report("Cannot find TDX_METADATA_GUID\n");
> +        return NULL;
> +    }
> +
> +    metadata = flash_ptr + offset;
> +
> +    /* Finally, verify the signature to determine if this is a TDVF image. */
> +   if (metadata->Signature != TDVF_SIGNATURE_LE32) {


For consistency, convert first:
metadata->Signature = le32_to_cpu(metadata->Signature);
and then compare with metadata->Signature != TDVF_SIGNATURE.
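
A standalone sketch of "convert at the point of use". demo_load_le32() is a
hypothetical stand-in for le32_to_cpu() so that the snippet builds outside
the QEMU tree, and DEMO_TDVF_SIGNATURE mirrors the value in the patch with
the _LE32 suffix dropped.

```c
#include <stdbool.h>
#include <stdint.h>

/* The bytes 'T','D','V','F' as stored in the image, read as LE32. */
#define DEMO_TDVF_SIGNATURE 0x46564454u

/* Decode a little-endian 32-bit value independent of host byte order. */
static uint32_t demo_load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Convert once at the point of use, then compare in host order. */
static bool demo_is_tdvf(const uint8_t *metadata)
{
    return demo_load_le32(metadata) == DEMO_TDVF_SIGNATURE;
}
```

The constant then always holds a host-order value, and the endianness
handling sits next to the read of the on-disk field.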

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 33/36] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  2022-03-17 13:59   ` Xiaoyao Li
@ 2022-03-18 17:31     ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-18 17:31 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas, isaku.yamahata

On Thu, Mar 17, 2022 at 09:59:10PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> For TDs, only MSR_IA32_UCODE_REV in kvm_init_msrs() can be configured
> by VMM, while the features enumerated/controlled by other MSRs except
> MSR_IA32_UCODE_REV in kvm_init_msrs() are not under control of VMM.
> 
> Only configure MSR_IA32_UCODE_REV for TDs.

non-TDs?
-- 
Isaku Yamahata <isaku.yamahata@gmail.com>

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 05/36] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context
  2022-03-18  2:07     ` Isaku Yamahata
@ 2022-03-21  5:35       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  5:35 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas

On 3/18/2022 10:07 AM, Isaku Yamahata wrote:
> On Thu, Mar 17, 2022 at 09:58:42PM +0800,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> Introduce tdx_kvm_init() and invoke it in kvm_confidential_guest_init()
>> if it's a TDX VM. More initialization will be added later.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   target/i386/kvm/kvm.c       | 15 ++++++---------
>>   target/i386/kvm/meson.build |  2 +-
>>   target/i386/kvm/tdx-stub.c  |  9 +++++++++
>>   target/i386/kvm/tdx.c       | 13 +++++++++++++
>>   target/i386/kvm/tdx.h       |  2 ++
>>   5 files changed, 31 insertions(+), 10 deletions(-)
>>   create mode 100644 target/i386/kvm/tdx-stub.c
>>
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index 70454355f3bf..26ed5faf07b8 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -54,6 +54,7 @@
>>   #include "migration/blocker.h"
>>   #include "exec/memattrs.h"
>>   #include "trace.h"
>> +#include "tdx.h"
>>   
>>   //#define DEBUG_KVM
>>   
>> @@ -2360,6 +2361,8 @@ static int kvm_confidential_guest_init(MachineState *ms, Error **errp)
>>   {
>>       if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_SEV_GUEST)) {
>>           return sev_kvm_init(ms->cgs, errp);
>> +    } else if (object_dynamic_cast(OBJECT(ms->cgs), TYPE_TDX_GUEST)) {
>> +        return tdx_kvm_init(ms, errp);
>>       }
>>   
>>       return 0;
>> @@ -2374,16 +2377,10 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>>       Error *local_err = NULL;
>>   
>>       /*
>> -     * Initialize SEV context, if required
>> +     * Initialize confidential guest (SEV/TDX) context, if required
>>        *
>> -     * If no memory encryption is requested (ms->cgs == NULL) this is
>> -     * a no-op.
>> -     *
>> -     * It's also a no-op if a non-SEV confidential guest support
>> -     * mechanism is selected.  SEV is the only mechanism available to
>> -     * select on x86 at present, so this doesn't arise, but if new
>> -     * mechanisms are supported in future (e.g. TDX), they'll need
>> -     * their own initialization either here or elsewhere.
>> +     * It's a no-op if a non-SEV/non-tdx confidential guest support
>> +     * mechanism is selected, i.e., ms->cgs == NULL
>>        */
>>       ret = kvm_confidential_guest_init(ms, &local_err);
>>       if (ret < 0) {
>> diff --git a/target/i386/kvm/meson.build b/target/i386/kvm/meson.build
>> index b2d7d41acde2..fd30b93ecec9 100644
>> --- a/target/i386/kvm/meson.build
>> +++ b/target/i386/kvm/meson.build
>> @@ -9,7 +9,7 @@ i386_softmmu_kvm_ss.add(files(
>>   
>>   i386_softmmu_kvm_ss.add(when: 'CONFIG_SEV', if_false: files('sev-stub.c'))
>>   
>> -i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'))
>> +i386_softmmu_kvm_ss.add(when: 'CONFIG_TDX', if_true: files('tdx.c'), if_false: files('tdx-stub.c'))
>>   
>>   i386_softmmu_ss.add(when: 'CONFIG_HYPERV', if_true: files('hyperv.c'), if_false: files('hyperv-stub.c'))
>>   
>> diff --git a/target/i386/kvm/tdx-stub.c b/target/i386/kvm/tdx-stub.c
>> new file mode 100644
>> index 000000000000..1df24735201e
>> --- /dev/null
>> +++ b/target/i386/kvm/tdx-stub.c
>> @@ -0,0 +1,9 @@
>> +#include "qemu/osdep.h"
>> +#include "qemu-common.h"
>> +
>> +#include "tdx.h"
>> +
>> +int tdx_kvm_init(MachineState *ms, Error **errp)
>> +{
>> +    return -EINVAL;
>> +}
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index d3792d4a3d56..e3b94373b316 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -12,10 +12,23 @@
>>    */
>>   
>>   #include "qemu/osdep.h"
>> +#include "qapi/error.h"
>>   #include "qom/object_interfaces.h"
>>   
>> +#include "hw/i386/x86.h"
>>   #include "tdx.h"
>>   
>> +int tdx_kvm_init(MachineState *ms, Error **errp)
>> +{
>> +    TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
>> +                                                    TYPE_TDX_GUEST);
> 
> The caller already checks it.  This is redundant. Maybe assert?

The cast is there to get the TdxGuest pointer for later use. I can move it to
the patch that actually uses the tdx pointer.
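
For reference, the assert variant suggested above could look roughly like the
sketch below. It is standalone C, not QEMU code: Object, dynamic_cast_sketch()
and tdx_kvm_init_sketch() are hypothetical stand-ins for the QOM object,
object_dynamic_cast() and tdx_kvm_init(), and the type string is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the QOM types; not the real QEMU definitions. */
typedef struct Object { const char *type; } Object;
typedef struct TdxGuest { Object parent; } TdxGuest;

#define TYPE_TDX_GUEST "tdx-guest" /* placeholder type name */

/* Returns obj if it has the requested type, NULL otherwise. */
static void *dynamic_cast_sketch(Object *obj, const char *type)
{
    return (obj && strcmp(obj->type, type) == 0) ? obj : NULL;
}

static int tdx_kvm_init_sketch(Object *cgs)
{
    TdxGuest *tdx = dynamic_cast_sketch(cgs, TYPE_TDX_GUEST);

    /*
     * The caller has already checked the type, so a mismatch here is a
     * programming error rather than a runtime condition to report.
     */
    assert(tdx);
    (void)tdx; /* would be used by later initialization steps */

    return 0;
}
```

With this shape the function no longer needs a separate -EINVAL path for a
case the caller has already ruled out.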

Thanks,
-Xiaoyao

> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 08/36] i386/tdx: Adjust get_supported_cpuid() for TDX VM
  2022-03-18 16:55     ` Isaku Yamahata
@ 2022-03-21  5:37       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  5:37 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas

On 3/19/2022 12:55 AM, Isaku Yamahata wrote:
> On Thu, Mar 17, 2022 at 09:58:45PM +0800,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
...
>> +void tdx_get_supported_cpuid(uint32_t function, uint32_t index, int reg,
>> +                             uint32_t *ret)
>> +{
>> +    switch (function) {
>> +    case 1:
>> +        if (reg == R_ECX) {
>> +            *ret &= ~CPUID_EXT_VMX;
>> +        }
>> +        break;
>> +    case 0xd:
>> +        if (index == 0) {
>> +            if (reg == R_EAX) {
>> +                *ret &= (uint32_t)tdx_caps->xfam_fixed0 & XCR0_MASK;
>> +                *ret |= (uint32_t)tdx_caps->xfam_fixed1 & XCR0_MASK;
>> +            } else if (reg == R_EDX) {
>> +                *ret &= (tdx_caps->xfam_fixed0 & XCR0_MASK) >> 32;
>> +                *ret |= (tdx_caps->xfam_fixed1 & XCR0_MASK) >> 32;
>> +            }
>> +        } else if (index == 1) {
>> +            /* TODO: Adjust XSS when it's supported. */
>> +        }
>> +        break;
>> +    case KVM_CPUID_FEATURES:
>> +        if (reg == R_EAX) {
>> +            *ret &= ~((1ULL << KVM_FEATURE_CLOCKSOURCE) |
>> +                      (1ULL << KVM_FEATURE_CLOCKSOURCE2) |
>> +                      (1ULL << KVM_FEATURE_CLOCKSOURCE_STABLE_BIT) |
>> +                      (1ULL << KVM_FEATURE_ASYNC_PF) |
>> +                      (1ULL << KVM_FEATURE_ASYNC_PF_VMEXIT) |
>> +                      (1ULL << KVM_FEATURE_ASYNC_PF_INT));
> 
> Because new feature bit may be introduced in future (it's unlikely though),
> *ret &= (supported_bits) is better than *ret &= ~(unsupported_bits)
> 

Good point, I will introduce supported_kvm_features for it.
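
To make the difference concrete, here is a small standalone sketch of the two
masking styles. The feature bit names, their positions, and the contents of
both masks are placeholders chosen for illustration; they are not the real set
of KVM features a TD supports.

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_CLOCKSOURCE   (1u << 0)
#define FEAT_NOP_IO_DELAY  (1u << 1)
#define FEAT_ASYNC_PF      (1u << 4)
#define FEAT_FUTURE        (1u << 20) /* a bit added after this code was written */

/* Denylist: everything not named here is passed through. */
static const uint32_t unsupported_features = FEAT_CLOCKSOURCE | FEAT_ASYNC_PF;

/* Allowlist: only what is named here is passed through. */
static const uint32_t supported_kvm_features = FEAT_NOP_IO_DELAY;

static uint32_t filter_denylist(uint32_t host_bits)
{
    return host_bits & ~unsupported_features;
}

static uint32_t filter_allowlist(uint32_t host_bits)
{
    return host_bits & supported_kvm_features;
}
```

If the host later reports FEAT_FUTURE, filter_denylist() exposes it to the
guest without anyone having reviewed it, while filter_allowlist() keeps it
masked until it is added explicitly.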


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 33/36] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() for TDs
  2022-03-18 17:31     ` Isaku Yamahata
@ 2022-03-21  6:08       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  6:08 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas

On 3/19/2022 1:31 AM, Isaku Yamahata wrote:
> On Thu, Mar 17, 2022 at 09:59:10PM +0800,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> For TDs, only MSR_IA32_UCODE_REV in kvm_init_msrs() can be configured
>> by VMM, while the features enumerated/controlled by other MSRs except
>> MSR_IA32_UCODE_REV in kvm_init_msrs() are not under control of VMM.
>>
>> Only configure MSR_IA32_UCODE_REV for TDs.
> 
> non-TDs?

No, it means exactly TDs.

For TDs, MSR_IA32_UCODE_REV is the only MSR in kvm_init_msrs() that the VMM can emulate.
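
Roughly, the intended behaviour is the following. This is a standalone sketch,
not the actual patch: struct msr_buf, msr_add() and init_msrs_sketch() are
hypothetical stand-ins for QEMU's MSR buffer handling, and
MSR_IA32_ARCH_CAPABILITIES stands for any of the other MSRs that
kvm_init_msrs() sets for a normal VM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_UCODE_REV          0x8b
#define MSR_IA32_ARCH_CAPABILITIES  0x10a

#define MSR_BUF_MAX 8

/* Hypothetical stand-in for the buffer of MSRs handed to KVM. */
struct msr_entry { uint32_t index; uint64_t value; };
struct msr_buf { struct msr_entry entries[MSR_BUF_MAX]; size_t n; };

static void msr_add(struct msr_buf *buf, uint32_t index, uint64_t value)
{
    if (buf->n < MSR_BUF_MAX) {
        buf->entries[buf->n].index = index;
        buf->entries[buf->n].value = value;
        buf->n++;
    }
}

static void init_msrs_sketch(struct msr_buf *buf, bool is_td,
                             uint64_t ucode_rev, uint64_t arch_caps)
{
    buf->n = 0;

    /* MSRs the VMM does not control for a TD are only set for normal VMs. */
    if (!is_td) {
        msr_add(buf, MSR_IA32_ARCH_CAPABILITIES, arch_caps);
    }

    /* The microcode revision is configured for TDs and normal VMs alike. */
    msr_add(buf, MSR_IA32_UCODE_REV, ucode_rev);
}
```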

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 18/36] i386/tdvf: Introduce function to parse TDVF metadata
  2022-03-18 17:19     ` Isaku Yamahata
@ 2022-03-21  6:11       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  6:11 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas

On 3/19/2022 1:19 AM, Isaku Yamahata wrote:
> On Thu, Mar 17, 2022 at 09:58:55PM +0800,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> diff --git a/hw/i386/tdvf.c b/hw/i386/tdvf.c
>> new file mode 100644
>> index 000000000000..02da1d2c12dd
>> --- /dev/null
>> +++ b/hw/i386/tdvf.c
>> @@ -0,0 +1,196 @@
>> +/*
>> + * SPDX-License-Identifier: GPL-2.0-or-later
>> +
>> + * Copyright (c) 2020 Intel Corporation
>> + * Author: Isaku Yamahata <isaku.yamahata at gmail.com>
>> + *                        <isaku.yamahata at intel.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> +
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> +
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "hw/i386/pc.h"
>> +#include "hw/i386/tdvf.h"
>> +#include "sysemu/kvm.h"
>> +
>> +#define TDX_METADATA_GUID "e47a6535-984a-4798-865e-4685a7bf8ec2"
>> +#define TDX_METADATA_VERSION    1
>> +#define TDVF_SIGNATURE_LE32     0x46564454 /* TDVF as little endian */
> 
> _LE32 doesn't make sense.  qemu doesn't provide macro version for byteswap.
> Let's convert at the usage point.

OK
>> +
>> +    /* Finally, verify the signature to determine if this is a TDVF image. */
>> +   if (metadata->Signature != TDVF_SIGNATURE_LE32) {
> 
> 
> metadata->Signature = le32_to_cpu(metadata->Signature);
> metadata->Signature != TDVF_SIGNATURE for consistency.
> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 27/36] i386/tdx: Disable SMM for TDX VMs
  2022-03-17 13:59   ` Xiaoyao Li
@ 2022-03-21  6:51     ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  6:51 UTC (permalink / raw)
  To: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

On 3/17/2022 9:59 PM, Xiaoyao Li wrote:
> TDX doesn't support SMM and VMM cannot emulate SMM for TDX VMs because
> VMM cannot manipulate TDX VM's memory.
> 
> Disable SMM for TDX VMs and error out if user requests to enable SMM.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>   target/i386/kvm/tdx.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
> index deb9634b27dc..ec6f5d7a2e48 100644
> --- a/target/i386/kvm/tdx.c
> +++ b/target/i386/kvm/tdx.c
> @@ -302,12 +302,25 @@ static Notifier tdx_machine_done_notify = {
>   
>   int tdx_kvm_init(MachineState *ms, Error **errp)
>   {
> +    X86MachineState *x86ms = X86_MACHINE(ms);
>       TdxGuest *tdx = (TdxGuest *)object_dynamic_cast(OBJECT(ms->cgs),
>                                                       TYPE_TDX_GUEST);
>       if (!tdx) {
>           return -EINVAL;
>       }
>   
> +    if (!kvm_enable_x2apic()) {
> +        error_setg(errp, "Failed to enable x2apic in KVM");
> +        return -EINVAL;
> +    }

The above change is not relevant to this patch; I will remove it in the next version.

> +
> +    if (x86ms->smm == ON_OFF_AUTO_AUTO) {
> +        x86ms->smm = ON_OFF_AUTO_OFF;
> +    } else if (x86ms->smm == ON_OFF_AUTO_ON) {
> +        error_setg(errp, "TDX VM doesn't support SMM");
> +        return -EINVAL;
> +    }
> +
>       if (!tdx_caps) {
>           get_tdx_capabilities();
>       }


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 06/36] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES
  2022-03-18  2:08     ` Isaku Yamahata
@ 2022-03-21  6:56       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  6:56 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas

On 3/18/2022 10:08 AM, Isaku Yamahata wrote:
> On Thu, Mar 17, 2022 at 09:58:43PM +0800,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> diff --git a/target/i386/kvm/tdx.c b/target/i386/kvm/tdx.c
>> index e3b94373b316..bed337e5ba18 100644
>> --- a/target/i386/kvm/tdx.c
>> +++ b/target/i386/kvm/tdx.c
>> @@ -14,10 +14,77 @@
>>   #include "qemu/osdep.h"
>>   #include "qapi/error.h"
>>   #include "qom/object_interfaces.h"
>> +#include "sysemu/kvm.h"
>>   
>>   #include "hw/i386/x86.h"
>>   #include "tdx.h"
>>   
>> +enum tdx_ioctl_level{
>> +    TDX_VM_IOCTL,
>> +    TDX_VCPU_IOCTL,
>> +};
>> +
>> +static int __tdx_ioctl(void *state, enum tdx_ioctl_level level, int cmd_id,
>> +                        __u32 metadata, void *data)
>> +{
>> +    struct kvm_tdx_cmd tdx_cmd;
>> +    int r;
>> +
>> +    memset(&tdx_cmd, 0x0, sizeof(tdx_cmd));
>> +
>> +    tdx_cmd.id = cmd_id;
>> +    tdx_cmd.metadata = metadata;
>> +    tdx_cmd.data = (__u64)(unsigned long)data;
>> +
>> +    switch (level) {
>> +    case TDX_VM_IOCTL:
>> +        r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
>> +        break;
>> +    case TDX_VCPU_IOCTL:
>> +        r = kvm_vcpu_ioctl(state, KVM_MEMORY_ENCRYPT_OP, &tdx_cmd);
>> +        break;
>> +    default:
>> +        error_report("Invalid tdx_ioctl_level %d", level);
>> +        exit(1);
>> +    }
>> +
>> +    return r;
>> +}
>> +
>> +#define tdx_vm_ioctl(cmd_id, metadata, data) \
>> +        __tdx_ioctl(NULL, TDX_VM_IOCTL, cmd_id, metadata, data)
>> +
>> +#define tdx_vcpu_ioctl(cpu, cmd_id, metadata, data) \
>> +        __tdx_ioctl(cpu, TDX_VCPU_IOCTL, cmd_id, metadata, data)
> 
> No point to use macro.  Normal (inline) function can works.
> 

OK. Will change them to inline functions.
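
Something along these lines, as a standalone sketch. tdx_ioctl_internal() is a
stub that only reports which level was requested; it stands in for the real
helper, which issues KVM_MEMORY_ENCRYPT_OP. The helper is renamed in this
sketch because identifiers beginning with two underscores are reserved in C;
the name is an assumption, not what the next version will necessarily use.

```c
#include <stddef.h>
#include <stdint.h>

enum tdx_ioctl_level {
    TDX_VM_IOCTL,
    TDX_VCPU_IOCTL,
};

/* Stub for illustration: returns the level instead of calling into KVM. */
static int tdx_ioctl_internal(void *state, enum tdx_ioctl_level level,
                              int cmd_id, uint32_t metadata, void *data)
{
    (void)state;
    (void)cmd_id;
    (void)metadata;
    (void)data;
    return (int)level;
}

/* Typed wrappers: arguments are checked by the compiler, unlike a macro. */
static inline int tdx_vm_ioctl(int cmd_id, uint32_t metadata, void *data)
{
    return tdx_ioctl_internal(NULL, TDX_VM_IOCTL, cmd_id, metadata, data);
}

static inline int tdx_vcpu_ioctl(void *vcpu, int cmd_id, uint32_t metadata,
                                 void *data)
{
    return tdx_ioctl_internal(vcpu, TDX_VCPU_IOCTL, cmd_id, metadata, data);
}
```

Call sites stay the same as with the macros, e.g.
tdx_vm_ioctl(cmd_id, 0, &caps) with hypothetical arguments.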

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 09/36] KVM: Introduce kvm_arch_pre_create_vcpu()
  2022-03-18 16:56     ` Isaku Yamahata
@ 2022-03-21  7:02       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  7:02 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas

On 3/19/2022 12:56 AM, Isaku Yamahata wrote:
> On Thu, Mar 17, 2022 at 09:58:46PM +0800,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> Introduce kvm_arch_pre_create_vcpu(), to perform arch-dependent
>> work prior to create any vcpu. This is for i386 TDX because it needs
>> call TDX_INIT_VM before creating any vcpu.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   accel/kvm/kvm-all.c    | 7 +++++++
>>   include/sysemu/kvm.h   | 1 +
>>   target/arm/kvm64.c     | 5 +++++
>>   target/i386/kvm/kvm.c  | 5 +++++
>>   target/mips/kvm.c      | 5 +++++
>>   target/ppc/kvm.c       | 5 +++++
>>   target/s390x/kvm/kvm.c | 5 +++++
>>   7 files changed, 33 insertions(+)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index 27864dfaeaaa..a4bb449737a6 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -465,6 +465,13 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>>   
>>       trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>>   
>> +    ret = kvm_arch_pre_create_vcpu(cpu);
>> +    if (ret < 0) {
>> +        error_setg_errno(errp, -ret,
>> +                         "kvm_init_vcpu: kvm_arch_pre_create_vcpu() failed");
>> +        goto err;
>> +    }
>> +
>>       ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>>       if (ret < 0) {
>>           error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
>> diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
>> index a783c7886811..0e94031ab7c7 100644
>> --- a/include/sysemu/kvm.h
>> +++ b/include/sysemu/kvm.h
>> @@ -373,6 +373,7 @@ int kvm_arch_put_registers(CPUState *cpu, int level);
>>   
>>   int kvm_arch_init(MachineState *ms, KVMState *s);
>>   
>> +int kvm_arch_pre_create_vcpu(CPUState *cpu);
>>   int kvm_arch_init_vcpu(CPUState *cpu);
>>   int kvm_arch_destroy_vcpu(CPUState *cpu);
>>   
>> diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
>> index ccadfbbe72be..ae7336851c62 100644
>> --- a/target/arm/kvm64.c
>> +++ b/target/arm/kvm64.c
>> @@ -935,6 +935,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
>>       return kvm_arm_init_cpreg_list(cpu);
>>   }
>>   
>> +int kvm_arch_pre_create_vcpu(CPUState *cpu)
>> +{
>> +    return 0;
>> +}
>> +
> 
> Weak symbol can be used to avoid update all the arch.

OK. Will use __attribute__((weak)).
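
A minimal sketch of the idea, assuming GCC or Clang on an ELF target. CPUState
is only forward-declared here and init_vcpu_sketch() is a hypothetical
stand-in for the caller in kvm_init_vcpu().

```c
#include <stddef.h>

typedef struct CPUState CPUState; /* opaque in this sketch */

/*
 * Generic default. An architecture that needs real work before vCPU
 * creation provides a normal (strong) definition of the same function
 * in its own file, and the linker picks that one instead.
 */
__attribute__((weak)) int kvm_arch_pre_create_vcpu(CPUState *cpu)
{
    (void)cpu;
    return 0;
}

/* Hypothetical caller: bail out before creating the vCPU on failure. */
static int init_vcpu_sketch(CPUState *cpu)
{
    int ret = kvm_arch_pre_create_vcpu(cpu);

    if (ret < 0) {
        return ret;
    }
    /* ... vCPU creation would follow here ... */
    return 0;
}
```

That way only target/i386 needs its own definition, and the arm, mips, ppc and
s390x files stay untouched.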

> Thanks,


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 16/36] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM
  2022-03-18 17:11     ` Isaku Yamahata
@ 2022-03-21  8:15       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  8:15 UTC (permalink / raw)
  To: Isaku Yamahata
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake,
	isaku.yamahata, kvm, Connor Kuehl, seanjc, qemu-devel,
	erdemaktas

On 3/19/2022 1:11 AM, Isaku Yamahata wrote:
> On Thu, Mar 17, 2022 at 09:58:53PM +0800,
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> TDX only supports readonly for shared memory but not for private memory.
>>
>> In the view of QEMU, it has no idea whether a memslot is used by shared
>> memory of private. Thus just mark kvm_readonly_mem_enabled to false to
>> TDX VM for simplicity.
>>
>> Note, pflash has dependency on readonly capability from KVM while TDX
>> wants to reuse pflash interface to load TDVF (as OVMF). Excuse TDX VM
>> for readonly check in pflash.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   hw/i386/pc_sysfw.c    | 2 +-
>>   target/i386/kvm/tdx.c | 9 +++++++++
>>   2 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
>> index c8b17af95353..75b34d02cb4f 100644
>> --- a/hw/i386/pc_sysfw.c
>> +++ b/hw/i386/pc_sysfw.c
>> @@ -245,7 +245,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
>>           /* Machine property pflash0 not set, use ROM mode */
>>           x86_bios_rom_init(MACHINE(pcms), "bios.bin", rom_memory, false);
>>       } else {
>> -        if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
>> +        if (kvm_enabled() && (!kvm_readonly_mem_enabled() && !is_tdx_vm())) {
> 
> Is this called before tdx_kvm_init()?

yes.

pc_init1()/ pc_q35_init()
  pc_memory_init()
     pc_system_firmware_init()

is called after configure_accelerator() to configure kvm.
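
To make the effect of the modified condition easier to see, here is a small
sketch with plain booleans standing in for kvm_enabled(),
kvm_readonly_mem_enabled() and is_tdx_vm(). The helper name pflash_rejected()
is made up for illustration, and it assumes the guarded branch is the one
that refuses pflash when KVM lacks read-only memslot support.

```c
#include <stdbool.h>

/*
 * Mirrors the condition from the quoted hunk:
 *   kvm_enabled() && (!kvm_readonly_mem_enabled() && !is_tdx_vm())
 * Returns true when the pflash configuration would be refused.
 */
bool pflash_rejected(bool kvm_on, bool readonly_mem, bool tdx_vm)
{
    return kvm_on && (!readonly_mem && !tdx_vm);
}
```

So a TDX VM, for which kvm_readonly_mem_enabled is forced to false, is
excused from the check, while a non-TDX KVM guest without read-only memslot
support is still refused.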

^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-18 14:07     ` Philippe Mathieu-Daudé
@ 2022-03-21  8:54       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-21  8:54 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann, Eric Blake
  Cc: Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

On 3/18/2022 10:07 PM, Philippe Mathieu-Daudé wrote:
> Hi,
> 
> On 17/3/22 14:58, Xiaoyao Li wrote:
>> TDX VM needs to boot with Trust Domain Virtual Firmware (TDVF). Unlike
>> that OVMF is mapped as rom device, TDVF needs to be mapped as private
>> memory. This is because TDX architecture doesn't provide read-only
>> capability for VMM, and it doesn't support instruction emulation due
>> to guest memory and registers are not accessible for VMM.
>>
>> On the other hand, OVMF can work as TDVF, which is usually configured
>> as pflash device in QEMU. To keep the same usage (QEMU parameter),
>> introduce ram_mode to pflash for TDVF. When it's creating a TDX VM,
>> ram_mode will be enabled automatically that map the firmware as RAM.
>>
>> Note, this implies two things:
>>   1. TDVF (OVMF) is not read-only (write-protected).
>>
>>   2. It doesn't support non-volatile UEFI variables as what pflash
>>      supports that the change to non-volatile UEFI variables won't get
>>      synced back to backend vars.fd file.
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   hw/block/pflash_cfi01.c | 25 ++++++++++++++++++-------
>>   hw/i386/pc_sysfw.c      | 14 +++++++++++---
>>   2 files changed, 29 insertions(+), 10 deletions(-)
> 
> If you don't need a pflash device, don't use it: simply map your nvram
> region as ram in your machine. No need to clutter the pflash model like
> that.

I know it's dirty to hack the pflash device. The purpose is to keep the
user interface unchanged, so that people can still use

        -drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd
        -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd

to create a TD guest.

I can go back to using the generic loader [1] to load TDVF in v2.

[1] 
https://lore.kernel.org/qemu-devel/acaf651389c3f407a9d6d0a2e943daf0a85bb5fc.1625704981.git.isaku.yamahata@intel.com/ 


> NAcked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-21  8:54       ` Xiaoyao Li
@ 2022-03-21 22:06         ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-21 22:06 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: isaku.yamahata, Marcelo Tosatti, Daniel P. Berrangé,
	kvm, Michael S. Tsirkin, Connor Kuehl, Eric Blake, Cornelia Huck,
	Richard Henderson, Philippe Mathieu-Daudé,
	qemu-devel, Philippe Mathieu-Daudé,
	Gerd Hoffmann, seanjc, erdemaktas, Paolo Bonzini, Laszlo Ersek,
	isaku.yamahata

On Mon, Mar 21, 2022 at 04:54:51PM +0800,
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> On 3/18/2022 10:07 PM, Philippe Mathieu-Daudé wrote:
> > Hi,
> > 
> > On 17/3/22 14:58, Xiaoyao Li wrote:
> > > TDX VM needs to boot with Trust Domain Virtual Firmware (TDVF). Unlike
> > > that OVMF is mapped as rom device, TDVF needs to be mapped as private
> > > memory. This is because TDX architecture doesn't provide read-only
> > > capability for VMM, and it doesn't support instruction emulation due
> > > to guest memory and registers are not accessible for VMM.
> > > 
> > > On the other hand, OVMF can work as TDVF, which is usually configured
> > > as pflash device in QEMU. To keep the same usage (QEMU parameter),
> > > introduce ram_mode to pflash for TDVF. When it's creating a TDX VM,
> > > ram_mode will be enabled automatically that map the firmware as RAM.
> > > 
> > > Note, this implies two things:
> > >   1. TDVF (OVMF) is not read-only (write-protected).
> > > 
> > >   2. It doesn't support non-volatile UEFI variables as what pflash
> > >      supports that the change to non-volatile UEFI variables won't get
> > >      synced back to backend vars.fd file.
> > > 
> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > ---
> > >   hw/block/pflash_cfi01.c | 25 ++++++++++++++++++-------
> > >   hw/i386/pc_sysfw.c      | 14 +++++++++++---
> > >   2 files changed, 29 insertions(+), 10 deletions(-)
> > 
> > If you don't need a pflash device, don't use it: simply map your nvram
> > region as ram in your machine. No need to clutter the pflash model like
> > that.
> 
> I know it's dirty to hack the pflash device. The purpose is to make the user
> interface unchanged that people can still use
> 
> 	-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd
>         -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd
> 
> to create TD guest.

For compatibility with the QEMU command line, you don't have to modify the
pflash device.  Don't instantiate pflash in pc_system_flash_create(); instead,
in pc_system_firmware_init(), retrieve the necessary parameters and then
populate memory.  Although it's still hacky, it would be a bit cleaner.
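
As a rough illustration of the "populate memory" step, the sketch below
copies a firmware image into a RAM buffer instead of going through a flash
device. Everything in it is hypothetical: struct fw_ram, fw_guest_base() and
fw_populate() are not QEMU APIs, and placing the image so that it ends just
below 4 GiB is an assumption based on the usual x86 firmware layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FOUR_GIB 0x100000000ULL

/* Hypothetical stand-in for a guest RAM region; not a QEMU type. */
struct fw_ram {
    uint64_t guest_base; /* guest-physical address of the first byte */
    uint8_t *host;       /* host buffer backing the region */
    uint64_t size;       /* capacity of the host buffer in bytes */
};

/* Place the image so that its last byte sits just below 4 GiB. */
uint64_t fw_guest_base(uint64_t image_size)
{
    return FOUR_GIB - image_size;
}

/* Copy the firmware image into RAM; returns 0 on success, -1 on bad size. */
int fw_populate(struct fw_ram *ram, const uint8_t *image, uint64_t image_size)
{
    if (image_size == 0 || image_size > ram->size || image_size > FOUR_GIB) {
        return -1;
    }
    ram->guest_base = fw_guest_base(image_size);
    memcpy(ram->host, image, (size_t)image_size);
    return 0;
}
```

In a real implementation the image contents and size would come from the
block backend named by the -drive if=pflash option, which is the parameter
retrieval mentioned above.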
-- 
Isaku Yamahata <isaku.yamahata@gmail.com>


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-17 13:58   ` Xiaoyao Li
@ 2022-03-22  9:02     ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-22  9:02 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Eric Blake, Connor Kuehl,
	isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

On Thu, Mar 17, 2022 at 09:58:49PM +0800, Xiaoyao Li wrote:
> Add sept-ve-disable property for tdx-guest object. It's used to
> configure bit 28 of TD attributes.

What is this?

> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -792,10 +792,13 @@
>  #
>  # @attributes: TDX guest's attributes (default: 0)
>  #
> +# @sept-ve-disable: attributes.sept-ve-disable[bit 28] (default: 0)

I'd suggest to document this here.
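
For readers following along, the only mechanics the patch description gives
are that the property maps to bit 28 of the 64-bit TD attributes. A minimal
sketch of that mapping is below; the macro and helper names are made up for
illustration and are not taken from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position from the patch description; the macro name is hypothetical. */
#define TD_ATTR_SEPT_VE_DISABLE (UINT64_C(1) << 28)

/* Return a copy of attributes with bit 28 set or cleared. */
uint64_t td_attr_set_sept_ve_disable(uint64_t attributes, bool disable)
{
    return disable ? (attributes | TD_ATTR_SEPT_VE_DISABLE)
                   : (attributes & ~TD_ATTR_SEPT_VE_DISABLE);
}

/* Report whether bit 28 is currently set. */
bool td_attr_sept_ve_disabled(uint64_t attributes)
{
    return (attributes & TD_ATTR_SEPT_VE_DISABLE) != 0;
}
```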

thanks,
  Gerd


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-21  8:54       ` Xiaoyao Li
@ 2022-03-22  9:21         ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-22  9:21 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Eric Blake, Connor Kuehl,
	isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

  Hi,

> > If you don't need a pflash device, don't use it: simply map your nvram
> > region as ram in your machine. No need to clutter the pflash model like
> > that.

Using the pflash device for something which isn't actually flash looks a
bit silly indeed.

> 
> I know it's dirty to hack the pflash device. The purpose is to make the user
> interface unchanged that people can still use
> 
> 	-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd
>         -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd
> 
> to create TD guest.

Well, if persistent vars are not supported anyway, there is little reason
to split the firmware into CODE and VARS files.  You can just use
OVMF.fd with a single pflash device.  libvirt recently got support for
that.

Just using -bios OVMF.fd might work too.  Daniel tried that recently for
sev, but ran into problems with wiring up ovmf metadata parsing for
-bios.  Don't remember the details though.

take care,
  Gerd


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-21  8:54       ` Xiaoyao Li
@ 2022-03-22  9:27         ` Daniel P. Berrangé
  -1 siblings, 0 replies; 154+ messages in thread
From: Daniel P. Berrangé @ 2022-03-22  9:27 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann,
	Eric Blake, Connor Kuehl, isaku.yamahata, erdemaktas, kvm,
	qemu-devel, seanjc

On Mon, Mar 21, 2022 at 04:54:51PM +0800, Xiaoyao Li wrote:
> On 3/18/2022 10:07 PM, Philippe Mathieu-Daudé wrote:
> > Hi,
> > 
> > On 17/3/22 14:58, Xiaoyao Li wrote:
> > > TDX VM needs to boot with Trust Domain Virtual Firmware (TDVF). Unlike
> > > that OVMF is mapped as rom device, TDVF needs to be mapped as private
> > > memory. This is because TDX architecture doesn't provide read-only
> > > capability for VMM, and it doesn't support instruction emulation due
> > > to guest memory and registers are not accessible for VMM.
> > > 
> > > On the other hand, OVMF can work as TDVF, which is usually configured
> > > as pflash device in QEMU. To keep the same usage (QEMU parameter),
> > > introduce ram_mode to pflash for TDVF. When it's creating a TDX VM,
> > > ram_mode will be enabled automatically that map the firmware as RAM.
> > > 
> > > Note, this implies two things:
> > >   1. TDVF (OVMF) is not read-only (write-protected).
> > > 
> > >   2. It doesn't support non-volatile UEFI variables as what pflash
> > >      supports that the change to non-volatile UEFI variables won't get
> > >      synced back to backend vars.fd file.
> > > 
> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > > ---
> > >   hw/block/pflash_cfi01.c | 25 ++++++++++++++++++-------
> > >   hw/i386/pc_sysfw.c      | 14 +++++++++++---
> > >   2 files changed, 29 insertions(+), 10 deletions(-)
> > 
> > If you don't need a pflash device, don't use it: simply map your nvram
> > region as ram in your machine. No need to clutter the pflash model like
> > that.
> 
> I know it's dirty to hack the pflash device. The purpose is to make the user
> interface unchanged that people can still use
> 
> 	-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd
>         -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd

Note, that in the default pflash config, libvirt will set the 'readonly=on'
flag for OVMF_CODE.fd ie, it will use

    -drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd,readonly=on
    -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd

IOW, we're requiring OVMF_CODE.fd is ROM, while OVMF_VARS.fd is NVRAM

IIUC, this patch here is changing the semantics of these args:

   - OVMF_CODE.fd is mapped as RAM, but IIUC, QEMU would still be
     prevented from writing to it due to readonly=on in the
     block layer

   - OVMF_VARS.fd is mapped as RAM, but IIUC you're saying that
     nonetheless, any writes don't propagate back into the file?



Dealing with OVMF_VARS.fd first, I really wonder why you want to have
a OVMF_VARS.fd file at all, if you don't have writes propagated into
it ? It has no reason to exist if you're not writing to it.

IMHO the AmdSev build for OVMF gets this right by entirely disabling
the split OVMF_CODE.fd vs OVMF_VARS.fd, and just having a single
OVMF.fd file that is exposed read-only to the guest.

This is further represented in $QEMU.git/docs/interop/firmware.json
by marking the firmware as 'stateless', which apps like libvirt will
use to figure out what QEMU command line to pick.

IOW, if you don't want OVMF_VARS.fd to be written to, then follow
what AmdSev has done, and get rid of the split files.


As for exposing OVMF_CODE.fd as RAM instead of ROM: that feels a
little odd, but as long as its backing store file on disk honours
the readonly=on request to -drive, that's not terrible IMHO.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-22  9:21         ` Gerd Hoffmann
@ 2022-03-22  9:29           ` Daniel P. Berrangé
  -1 siblings, 0 replies; 154+ messages in thread
From: Daniel P. Berrangé @ 2022-03-22  9:29 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Xiaoyao Li, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

On Tue, Mar 22, 2022 at 10:21:41AM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > > If you don't need a pflash device, don't use it: simply map your nvram
> > > region as ram in your machine. No need to clutter the pflash model like
> > > that.
> 
> Using the pflash device for something which isn't actually flash looks a
> bit silly indeed.
> 
> > 
> > I know it's dirty to hack the pflash device. The purpose is to make the user
> > interface unchanged that people can still use
> > 
> > 	-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd
> >         -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd
> > 
> > to create TD guest.
> 
> Well, if persistent vars are not supported anyway there is little reason
> to split the firmware into CODE and VARS files.  You can just use
> OVMF.fd with a single pflash device.  libvirt recently got support for
> that.

Agreed.

> Just using -bios OVMF.fd might work too.  Daniel tried that recently for
> sev, but ran into problems with wiring up ovmf metadata parsing for
> -bios.  Don't remember the details though.

It was related to the BIOS shadowing, whereby QEMU loads it at one
address, and then when CPUs start it is copied to another address.
This was not compatible with the way AMD SEV wants to do measurement
of the firmware. May or may not be relevant for TDX, I don't know
enough about TDX to say.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-22  9:29           ` Daniel P. Berrangé
@ 2022-03-22 10:35             ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-22 10:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Xiaoyao Li, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

  Hi,

> > Just using -bios OVMF.fd might work too.  Daniel tried that recently for
> > sev, but ran into problems with wiring up ovmf metadata parsing for
> > -bios.  Don't remember the details though.
> 
> It was related to the BIOS shadowing, whereby QEMU loads it at one
> address, and then when CPUs start it is copied to another address.

Is this the top 128k of the firmware being copied below 1M so the
firmware reset vector is available in real mode address space?
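
For reference, the layout this question refers to can be sketched in C as below. It assumes the usual x86 PC arrangement: the firmware image ends at the 4 GiB boundary, and its last 128 KiB (or the whole image, if it is smaller) is also visible just below 1 MiB. All names are hypothetical and this is not QEMU's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_KIB      UINT64_C(1024)
#define EXAMPLE_4G       (UINT64_C(1) << 32)
#define EXAMPLE_1M       (UINT64_C(1) << 20)
#define EXAMPLE_LOW_MAX  (128 * EXAMPLE_KIB)

/* Hypothetical description of where a firmware image shows up in guest memory. */
struct example_fw_layout {
    uint64_t high_base;   /* start of the full image, ending at 4 GiB */
    uint64_t low_base;    /* start of the window just below 1 MiB */
    uint64_t low_size;    /* bytes of the image visible below 1 MiB */
    uint64_t low_offset;  /* offset of those bytes within the image */
};

/* Fill 'out' for an image of 'fw_size' bytes; returns false for unusable sizes. */
static bool example_fw_layout(uint64_t fw_size, struct example_fw_layout *out)
{
    if (fw_size == 0 || fw_size >= EXAMPLE_4G) {
        return false;
    }
    /* Only the tail of the image (at most 128 KiB) fits below 1 MiB. */
    out->low_size = fw_size < EXAMPLE_LOW_MAX ? fw_size : EXAMPLE_LOW_MAX;
    out->high_base = EXAMPLE_4G - fw_size;
    out->low_base = EXAMPLE_1M - out->low_size;
    out->low_offset = fw_size - out->low_size;
    /* The low window must end exactly at the 1 MiB boundary. */
    assert(out->low_base + out->low_size == EXAMPLE_1M);
    return true;
}
```

With a 4 MiB image this gives a high mapping at 0xFFC00000 and a low window at 0xE0000-0xFFFFF, which covers the real-mode address F000:FFF0 (0xFFFF0).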

> This was not compatible with the way AMD SEV wants to do measurement
> of the firmware. May or may not be relevant for TDX, I don't know
> enough about TDX to say.

TDX boots in 32bit mode, so simply skipping any real mode compatibility
stuff shouldn't cause any problems here.

Not sure about SEV.  There is this SevProcessorReset entry in the ovmf
metadata block.  Is that the SEV reset vector?  If SEV cpu bringup
doesn't go through real mode either we maybe can just skip the BIOS
shadowing setup for confidential computing guests ...

take care,
  Gerd


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-22 10:35             ` Gerd Hoffmann
@ 2022-03-22 10:51               ` Daniel P. Berrangé
  -1 siblings, 0 replies; 154+ messages in thread
From: Daniel P. Berrangé @ 2022-03-22 10:51 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Xiaoyao Li, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

On Tue, Mar 22, 2022 at 11:35:18AM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > > Just using -bios OVMF.fd might work too.  Daniel tried that recently for
> > > sev, but ran into problems with wiring up ovmf metadata parsing for
> > > -bios.  Don't remember the details though.
> > 
> > It was related to the BIOS shadowing, whereby QEMU loads it at one
> > address, and then when CPUs start it is copied to another address.
> 
> Is this the top 128k of the firmware being copied below 1M so the
> firmware reset vector is available in real mode address space?

It was the 'rom_reset' method in hw/core/loader.c that was involved
in the root of the problem, copying the firmware from ROM to RAM.

At the time I did try a gross hack that (IIRC) disabled the
rom_reset logic, and munged x86_bios_rom_init so that it would
force load it straight at the RAM location. I couldn't figure
out an attractive way to make this into something supportable,
so abandoned the whole idea. Messing with this area of code is
somewhat beyond my level of understanding of the x86 boot process
anyway.

> > This was not compatible with the way AMD SEV wants to do measurement
> > of the firmware. May or may not be relevant for TDX, I don't know
> > enough about TDX to say.
> 
> TDX boots in 32bit mode, so simply skipping any real mode compatibility
> stuff shouldn't cause any problems here.
> 
> Not sure about SEV.  There is this SevProcessorReset entry in the ovmf
> metadata block.  Is that the SEV reset vector?  If SEV cpu bringup
> doesn't go through real mode either we maybe can just skip the BIOS
> shadowing setup for confidential computing guests ...


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-22 10:51               ` Daniel P. Berrangé
@ 2022-03-22 12:20                 ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-22 12:20 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Xiaoyao Li, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

  Hi,

> At the time I did try a gross hack that (IIRC) disabled the
> rom_reset logic, and munged x86_bios_rom_init so that it would
> force load it straight at the RAM location.

Sounds reasonable.  The whole rom logic exists to handle resets,
but with confidential guests we don't need that, we can't change
guest state to perform a reset anyway ...

take care,
  Gerd

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 4cf107baea34..169ef96682de 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1115,15 +1115,26 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
         goto bios_error;
     }
     bios = g_malloc(sizeof(*bios));
+
     memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
-    if (!isapc_ram_fw) {
-        memory_region_set_readonly(bios, true);
-    }
-    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
-    if (ret != 0) {
-    bios_error:
-        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
-        exit(1);
+    if (1 /* confidential computing */) {
+        /*
+         * The concept of a "reset" simply doesn't exist for
+         * confidential computing guests, we have to destroy and
+         * re-launch them instead.  So there is no need to register
+         * the firmware as rom to properly re-initialize on reset.
+         * Just go for a straight file load instead.
+         */
+        void *ptr = memory_region_get_ram_ptr(bios);
+        load_image_size(filename, ptr, bios_size);
+    } else {
+        if (!isapc_ram_fw) {
+            memory_region_set_readonly(bios, true);
+        }
+        ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
+        if (ret != 0) {
+            goto bios_error;
+        }
     }
     g_free(filename);
 
@@ -1144,6 +1155,11 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
     memory_region_add_subregion(rom_memory,
                                 (uint32_t)(-bios_size),
                                 bios);
+    return;
+
+bios_error:
+    fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
+    exit(1);
 }
 
 bool x86_machine_is_smm_enabled(const X86MachineState *x86ms)


^ permalink raw reply related	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-22  9:29           ` Daniel P. Berrangé
@ 2022-03-24  6:13             ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-24  6:13 UTC (permalink / raw)
  To: Daniel P. Berrangé, Gerd Hoffmann
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

On 3/22/2022 5:29 PM, Daniel P. Berrangé wrote:
> On Tue, Mar 22, 2022 at 10:21:41AM +0100, Gerd Hoffmann wrote:
>>    Hi,
>>
>>>> If you don't need a pflash device, don't use it: simply map your nvram
>>>> region as ram in your machine. No need to clutter the pflash model like
>>>> that.
>>
>> Using the pflash device for something which isn't actually flash looks a
>> bit silly indeed.
>>
>>>
>>> I know it's dirty to hack the pflash device. The purpose is to make the user
>>> interface unchanged that people can still use
>>>
>>> 	-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd
>>>          -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd
>>>
>>> to create TD guest.
>>
>> Well, if persistent vars are not supported anyway there is little reason
>> to split the firmware into CODE and VARS files.  You can just use
>> OVMF.fd with a single pflash device.  libvirt recently got support for
>> that.
> 
> Agreed.

The purpose of using split firmware is that people can share the same
code.fd while using different vars.fd files.




^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-22  9:02     ` Gerd Hoffmann
@ 2022-03-24  6:52       ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-24  6:52 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Eric Blake, Connor Kuehl,
	isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

On 3/22/2022 5:02 PM, Gerd Hoffmann wrote:
> On Thu, Mar 17, 2022 at 09:58:49PM +0800, Xiaoyao Li wrote:
>> Add sept-ve-disable property for tdx-guest object. It's used to
>> configure bit 28 of TD attributes.
> 
> What is this?

It seems this bit doesn't show up in the public spec yet.

Bit 28 (SEPT_VE_DISABLE): Disable EPT violation conversion to #VE ON 
guest TD ACCESS of PENDING pages.

The TDX architecture requires a private page to be accepted before it
is used. If the guest accesses a not-yet-accepted (pending) page, it
gets a #VE.

Some OSes, e.g. a Linux TD guest, don't want a #VE on pending pages,
so they want this bit set.
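
As a rough illustration of the bit described above, the following C sketch sets or clears bit 28 in a TD attributes value, assuming the attributes are held in a 64-bit integer. The macro and function names are made up for this example and are not taken from the QEMU or KVM sources; only the bit position (28) comes from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; only the bit position (28) comes from the thread. */
#define EXAMPLE_TD_ATTR_SEPT_VE_DISABLE (UINT64_C(1) << 28)

/* Return 'attributes' with bit 28 set or cleared; other bits are untouched. */
static uint64_t example_set_sept_ve_disable(uint64_t attributes, bool disable)
{
    if (disable) {
        return attributes | EXAMPLE_TD_ATTR_SEPT_VE_DISABLE;
    }
    return attributes & ~EXAMPLE_TD_ATTR_SEPT_VE_DISABLE;
}

/* Report whether bit 28 is set in 'attributes'. */
static bool example_sept_ve_disabled(uint64_t attributes)
{
    return (attributes & EXAMPLE_TD_ATTR_SEPT_VE_DISABLE) != 0;
}
```

Starting from the default attributes value of 0, enabling the property in this sketch yields 0x10000000.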

>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -792,10 +792,13 @@
>>   #
>>   # @attributes: TDX guest's attributes (default: 0)
>>   #
>> +# @sept-ve-disable: attributes.sept-ve-disable[bit 28] (default: 0)
> 
> I'd suggest to document this here.
> 
> thanks,
>    Gerd
> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-24  6:52       ` Xiaoyao Li
@ 2022-03-24  7:57         ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-24  7:57 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Eric Blake, Connor Kuehl,
	isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

On Thu, Mar 24, 2022 at 02:52:10PM +0800, Xiaoyao Li wrote:
> On 3/22/2022 5:02 PM, Gerd Hoffmann wrote:
> > On Thu, Mar 17, 2022 at 09:58:49PM +0800, Xiaoyao Li wrote:
> > > Add sept-ve-disable property for tdx-guest object. It's used to
> > > configure bit 28 of TD attributes.
> > 
> > What is this?
> 
> It seems this bit doesn't show up in the public spec yet.
> 
> Bit 28 (SEPT_VE_DISABLE): Disable EPT violation conversion to #VE ON guest
> TD ACCESS of PENDING pages.
> 
> The TDX architecture requires a private page to be accepted before using. If
> guest accesses a not-accepted (pending) page it will get #VE.
> 
> For some OS, e.g., Linux TD guest, it doesn't want the #VE on pending page
> so it will set this bit.

Hmm.  That looks rather pointless to me.  The TDX patches for OVMF add a
#VE handler, so I suspect every guest wants #VE exceptions if even the
firmware cares to install a handler ...

Also: What will happen instead? EPT fault delivered to the host?

take care,
  Gerd


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-24  6:13             ` Xiaoyao Li
@ 2022-03-24  7:58               ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-24  7:58 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

  Hi,

> > > Well, if persistent vars are not supported anyway there is little reason
> > > to split the firmware into CODE and VARS files.  You can just use
> > > OVMF.fd with a single pflash device.  libvirt recently got support for
> > > that.
> > 
> > Agreed.
> 
> The purpose of using split firmware is that people can share the same
> code.fd while using different vars.fd

Using different vars.fd files is pointless though when changes are never
written back ...

take care,
  Gerd


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-24  7:57         ` Gerd Hoffmann
@ 2022-03-24  8:08           ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-24  8:08 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Eric Blake, Connor Kuehl,
	isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

On 3/24/2022 3:57 PM, Gerd Hoffmann wrote:
> On Thu, Mar 24, 2022 at 02:52:10PM +0800, Xiaoyao Li wrote:
>> On 3/22/2022 5:02 PM, Gerd Hoffmann wrote:
>>> On Thu, Mar 17, 2022 at 09:58:49PM +0800, Xiaoyao Li wrote:
>>>> Add sept-ve-disable property for tdx-guest object. It's used to
>>>> configure bit 28 of TD attributes.
>>>
>>> What is this?
>>
>> It seems this bit doesn't show up in the public spec yet.
>>
>> Bit 28 (SEPT_VE_DISABLE): Disable EPT violation conversion to #VE ON guest
>> TD ACCESS of PENDING pages.
>>
>> The TDX architecture requires a private page to be accepted before using. If
>> guest accesses a not-accepted (pending) page it will get #VE.
>>
>> For some OS, e.g., Linux TD guest, it doesn't want the #VE on pending page
>> so it will set this bit.
> 
> Hmm.  That looks rather pointless to me.  The TDX patches for OVMF add a
> #VE handler, so I suspect every guest wants #VE exceptions if even the
> firmware cares to install a handler ...

#VE can be triggered in various situations, e.g., CPUID on some leaves
and RDMSR/WRMSR on some MSRs. #VE on a pending page is just one of the
sources. Linux wants to disable only this kind of #VE, to prevent an
unexpected #VE during the SYSCALL gap.

> Also: What will happen instead? EPT fault delivered to the host?

Yes.

> take care,
>    Gerd
> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-24  7:58               ` Gerd Hoffmann
@ 2022-03-24  8:18                 ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-24  8:18 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Daniel P. Berrangé, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

On 3/24/2022 3:58 PM, Gerd Hoffmann wrote:
>    Hi,
> 
>>>> Well, if persistent vars are not supported anyway there is little reason
>>>> to split the firmware into CODE and VARS files.  You can just use
>>>> OVMF.fd with a single pflash device.  libvirt recently got support for
>>>> that.
>>>
>>> Agreed.
>>
>> The purpose of using split firmware is that people can share the same
>> code.fd while using different vars.fd
> 
> Using different vars.fd files is pointless though when changes are never
> written back ...

Yes, I agree on this.

Off topic: if we really want to give the TDX guest NVRAM capability, we
can either 1) use the PV interface to issue MMIO writes in OVMF, as SEV
does in OVMF, or 2) map OVMF as shared, so that the existing pflash works
as is.

However, both options expose the content to the VMM, which loses
confidentiality.

> take care,
>    Gerd
> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-22 12:20                 ` Gerd Hoffmann
@ 2022-03-24  8:35                   ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-24  8:35 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Xiaoyao Li, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

On Tue, Mar 22, 2022 at 01:20:24PM +0100, Gerd Hoffmann wrote:
>   Hi,
> 
> > At the time I did try a gross hack that (IIRC) disabled the
> > rom_reset logic, and munged x86_bios_rom_init so that it would
> > force load it straight at the RAM location.
> 
> Sounds reasonable.  The whole rom logic exists to handle resets,
> but with confidential guests we don't need that, we can't change
> guest state to perform a reset anyway ...

Completed, cleaned up a bit, but untested:
  https://git.kraxel.org/cgit/qemu/log/?h=sirius/cc

Any chance you can give this a try?

thanks,
  Gerd


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-24  6:13             ` Xiaoyao Li
@ 2022-03-24  8:52               ` Daniel P. Berrangé
  -1 siblings, 0 replies; 154+ messages in thread
From: Daniel P. Berrangé @ 2022-03-24  8:52 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Gerd Hoffmann, Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

On Thu, Mar 24, 2022 at 02:13:53PM +0800, Xiaoyao Li wrote:
> On 3/22/2022 5:29 PM, Daniel P. Berrangé wrote:
> > On Tue, Mar 22, 2022 at 10:21:41AM +0100, Gerd Hoffmann wrote:
> > >    Hi,
> > > 
> > > > > If you don't need a pflash device, don't use it: simply map your nvram
> > > > > region as ram in your machine. No need to clutter the pflash model like
> > > > > that.
> > > 
> > > Using the pflash device for something which isn't actually flash looks a
> > > bit silly indeed.
> > > 
> > > > 
> > > > I know it's dirty to hack the pflash device. The purpose is to make the user
> > > > interface unchanged that people can still use
> > > > 
> > > > 	-drive if=pflash,format=raw,unit=0,file=/path/to/OVMF_CODE.fd
> > > >          -drive if=pflash,format=raw,unit=1,file=/path/to/OVMF_VARS.fd
> > > > 
> > > > to create TD guest.
> > > 
> > > Well, if persistent vars are not supported anyway there is little reason
> > > to split the firmware into CODE and VARS files.  You can just use
> > > OVMF.fd with a single pflash device.  libvirt recently got support for
> > > that.
> > 
> > Agreed.
> 
> The purpose of using split firmware is that people can share the same
> code.fd while using different vars.fd

That's fine for firmware that writes to vars.fd, but it was said earlier
that changes aren't written with TDX (nor are they written with SEV),
so a separate vars.fd serves no purpose in these cases.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-24  8:08           ` Xiaoyao Li
@ 2022-03-24  9:37             ` Gerd Hoffmann
  -1 siblings, 0 replies; 154+ messages in thread
From: Gerd Hoffmann @ 2022-03-24  9:37 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Eric Blake, Connor Kuehl,
	isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

  Hi,

> #VE can be triggered in various situations. e.g., CPUID on some leaves, and
> RD/WRMSR on some MSRs. #VE on pending page is just one of the sources, Linux
> just wants to disable this kind of #VE since it wants to prevent unexpected
> #VE during SYSCALL gap.

Linux guests can't disable those on their own?  Requiring this to be
configured on the host looks rather fragile to me ...

take care,
  Gerd


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-24  9:37             ` Gerd Hoffmann
@ 2022-03-24 14:36               ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-24 14:36 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Daniel P. Berrangé,
	Marcelo Tosatti, Laszlo Ersek, Eric Blake, Connor Kuehl,
	isaku.yamahata, erdemaktas, kvm, qemu-devel, seanjc

On 3/24/2022 5:37 PM, Gerd Hoffmann wrote:
>    Hi,
> 
>> #VE can be triggered in various situations. e.g., CPUID on some leaves, and
>> RD/WRMSR on some MSRs. #VE on pending page is just one of the sources, Linux
>> just wants to disable this kind of #VE since it wants to prevent unexpected
>> #VE during SYSCALL gap.
> 
> Linux guests can't disable those on their own?  Requiring this being
> configured on the host looks rather fragile to me ...

Right, they can't: the current TDX architecture doesn't allow a TD guest
to do so. It may be allowed in the future.

> take care,
>    Gerd
> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object
  2022-03-24  9:37             ` Gerd Hoffmann
@ 2022-03-25  1:35               ` Isaku Yamahata
  -1 siblings, 0 replies; 154+ messages in thread
From: Isaku Yamahata @ 2022-03-25  1:35 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: isaku.yamahata, Cornelia Huck, Daniel P. Berrangé,
	Laszlo Ersek, kvm, Michael S. Tsirkin, Connor Kuehl,
	Richard Henderson, Marcelo Tosatti, Xiaoyao Li,
	Philippe Mathieu-Daudé,
	qemu-devel, seanjc, erdemaktas, Paolo Bonzini, Eric Blake,
	isaku.yamahata

On Thu, Mar 24, 2022 at 10:37:25AM +0100,
Gerd Hoffmann <kraxel@redhat.com> wrote:

> > #VE can be triggered in various situations. e.g., CPUID on some leaves, and
> > RD/WRMSR on some MSRs. #VE on pending page is just one of the sources, Linux
> > just wants to disable this kind of #VE since it wants to prevent unexpected
> > #VE during SYSCALL gap.
> 
> Linux guests can't disable those on their own?  Requiring this being
> configured on the host looks rather fragile to me ...

The guest can read the attributes (but can't change them).  If the
attributes aren't what the guest expects, the guest can stop itself.
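
A minimal sketch of such a guest-side check, assuming the guest has already
read the 64-bit TD attributes value by some means not shown here. The macro
and function names are hypothetical; only the bit position (28) comes from
this thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical name; bit 28 (SEPT_VE_DISABLE) as described in this thread. */
#define TD_ATTR_SEPT_VE_DISABLE (UINT64_C(1) << 28)

/*
 * Returns 0 if the attributes match what the guest expects, -1 otherwise.
 * `attrs` is assumed to be the value the guest read back; the guest can
 * inspect it but cannot change it.
 */
int guest_check_td_attributes(uint64_t attrs)
{
    if (!(attrs & TD_ATTR_SEPT_VE_DISABLE)) {
        fprintf(stderr, "SEPT_VE_DISABLE not set, refusing to continue\n");
        return -1;
    }
    return 0;
}
```

A caller would treat a -1 return as the signal to stop early instead of
running with a configuration it did not expect.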
-- 
Isaku Yamahata <isaku.yamahata@gmail.com>


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-24  8:35                   ` Gerd Hoffmann
@ 2022-03-31  6:57                     ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-31  6:57 UTC (permalink / raw)
  To: Gerd Hoffmann, Daniel P. Berrangé
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Eric Blake,
	Connor Kuehl, isaku.yamahata, erdemaktas, kvm, qemu-devel,
	seanjc

On 3/24/2022 4:35 PM, Gerd Hoffmann wrote:
> On Tue, Mar 22, 2022 at 01:20:24PM +0100, Gerd Hoffmann wrote:
>>    Hi,
>>
>>> At the time I did try a gross hack that (IIRC) disabled the
>>> rom_reset logic, and munged x86_bios_rom_init so that it would
>>> force load it straight at the RAM location.
>>
>> Sounds reasonable.  The whole rom logic exists to handle resets,
>> but with confidential guests we don't need that, we can't change
>> guest state to perform a reset anyway ...
> 
> Completed, cleaned up a bit, but untested:
>    https://git.kraxel.org/cgit/qemu/log/?h=sirius/cc
> 
> Any chance you can give this a try?

Hi Gerd,

I refactored the TDX series on top of it to load TDVF via the "-bios" option.

No issues hit.

Thanks,
-Xiaoyao

> thanks,
>    Gerd
> 


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-22  9:27         ` Daniel P. Berrangé
@ 2022-03-31  8:51           ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-31  8:51 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann,
	Eric Blake, Connor Kuehl, isaku.yamahata, erdemaktas, kvm,
	qemu-devel, seanjc

On 3/22/2022 5:27 PM, Daniel P. Berrangé wrote:
...
> IMHO the AmdSev build for OVMF gets this right by entirely disabling
> the split OVMF_CODE.fd vs OVMF_VARS.fd, and just having a single
> OVMF.fd file that is exposed read-only to the guest.
> 
> This is further represented in $QEMU.git/docs/interop/firmware.json
> by marking the firmware as 'stateless', which apps like libvirt will
> use to figure out what QEMU command line to pick.

Hi Daniel,

I haven't worked with AMD SEV, and I'm not sure whether AMD SEV requires
a single OVMF.fd. But IIUC, from edk2

commit 437eb3f7a8db ("OvmfPkg/QemuFlashFvbServicesRuntimeDxe: Bypass
flash detection with SEV-ES")

AMD SEV(-ES) does support NVRAM via proactive VMGEXIT MMIO
QemuFlashWrite(). If so, AMD SEV seems able to support split OVMF,
right?

> IOW, if you don't want OVMF_VARS.fd to be written to, then follow
> what AmdSev has done, and get rid of the split files.
> 
> 
> With regards,
> Daniel


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-31  8:51           ` Xiaoyao Li
@ 2022-03-31  9:00             ` Daniel P. Berrangé
  -1 siblings, 0 replies; 154+ messages in thread
From: Daniel P. Berrangé @ 2022-03-31  9:00 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann,
	Eric Blake, Connor Kuehl, isaku.yamahata, erdemaktas, kvm,
	qemu-devel, seanjc

On Thu, Mar 31, 2022 at 04:51:27PM +0800, Xiaoyao Li wrote:
> On 3/22/2022 5:27 PM, Daniel P. Berrangé wrote:
> ...
> > IMHO the AmdSev build for OVMF gets this right by entirely disabling
> > the split OVMF_CODE.fd vs OVMF_VARS.fd, and just having a single
> > OVMF.fd file that is exposed read-only to the guest.
> > 
> > This is further represented in $QEMU.git/docs/interop/firmware.json
> > by marking the firmware as 'stateless', which apps like libvirt will
> > use to figure out what QEMU command line to pick.
> 
> Hi Daniel,
> 
> I don't play with AMD SEV and I'm not sure if AMD SEV requires only single
> OVMF.fd. But IIUC, from edk2
> 
> commit 437eb3f7a8db ("OvmfPkg/QemuFlashFvbServicesRuntimeDxe: Bypass flash
> detection with SEV-ES")
> 
> , AMD SEV(-ES) does support NVRAM via proactive VMGEXIT MMIO
> QemuFlashWrite(). If so, AMD SEV seems to be able to support split OVMF,
> right?

Note that while the traditional OvmfPkg build can be used with
SEV/SEV-ES, this is not viable for measured boot, as it uses
the NVRAM whose content is not measured.

I was specifically referring to the OvmfPkg/AmdSev build, which
doesn't use separate NVRAM and has no variable persistence.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 154+ messages in thread

* Re: [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF
  2022-03-31  9:00             ` Daniel P. Berrangé
@ 2022-03-31 14:50               ` Xiaoyao Li
  -1 siblings, 0 replies; 154+ messages in thread
From: Xiaoyao Li @ 2022-03-31 14:50 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Philippe Mathieu-Daudé,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin, Marcel Apfelbaum,
	Cornelia Huck, Marcelo Tosatti, Laszlo Ersek, Gerd Hoffmann,
	Eric Blake, Connor Kuehl, isaku.yamahata, erdemaktas, kvm,
	qemu-devel, seanjc

On 3/31/2022 5:00 PM, Daniel P. Berrangé wrote:
> On Thu, Mar 31, 2022 at 04:51:27PM +0800, Xiaoyao Li wrote:
>> On 3/22/2022 5:27 PM, Daniel P. Berrangé wrote:
>> ...
>>> IMHO the AmdSev build for OVMF gets this right by entirely disabling
>>> the split OVMF_CODE.fd vs OVMF_VARS.fd, and just having a single
>>> OVMF.fd file that is exposed read-only to the guest.
>>>
>>> This is further represented in $QEMU.git/docs/interop/firmware.json
>>> by marking the firmware as 'stateless', which apps like libvirt will
>>> use to figure out what QEMU command line to pick.
>>
>> Hi Daniel,
>>
>> I don't play with AMD SEV and I'm not sure if AMD SEV requires only single
>> OVMF.fd. But IIUC, from edk2
>>
>> commit 437eb3f7a8db ("OvmfPkg/QemuFlashFvbServicesRuntimeDxe: Bypass flash
>> detection with SEV-ES")
>>
>> , AMD SEV(-ES) does support NVRAM via proactive VMGEXIT MMIO
>> QemuFlashWrite(). If so, AMD SEV seems to be able to support split OVMF,
>> right?
> 
> Note that while the traditional OvmfPkg build can be used with
> SEV/SEV-ES, this is not viable for measured boot, as it uses
> the NVRAM whose content is not measured.
> 
> I was specifically referring to the OvmfPkg/AmdSev build which
> doesn't use separate NVRAM, and has no variables persistence.

Thanks for the info. It seems I need to learn more about those. I would
very much appreciate it if you could point me to some links.

> With regards,
> Daniel


^ permalink raw reply	[flat|nested] 154+ messages in thread

end of thread, other threads:[~2022-03-31 14:53 UTC | newest]

Thread overview: 154+ messages
2022-03-17 13:58 [RFC PATCH v3 00/36] TDX QEMU support Xiaoyao Li
2022-03-17 13:58 ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 01/36] *** HACK *** linux-headers: Update headers to pull in TDX API changes Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 02/36] i386: Introduce tdx-guest object Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 03/36] target/i386: Implement mc->kvm_type() to get VM type Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 04/36] target/i386: Introduce kvm_confidential_guest_init() Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 05/36] i386/tdx: Implement tdx_kvm_init() to initialize TDX VM context Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-18  2:07   ` Isaku Yamahata
2022-03-18  2:07     ` Isaku Yamahata
2022-03-21  5:35     ` Xiaoyao Li
2022-03-21  5:35       ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 06/36] i386/tdx: Get tdx_capabilities via KVM_TDX_CAPABILITIES Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-18  2:08   ` Isaku Yamahata
2022-03-18  2:08     ` Isaku Yamahata
2022-03-21  6:56     ` Xiaoyao Li
2022-03-21  6:56       ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 07/36] i386/tdx: Introduce is_tdx_vm() helper and cache tdx_guest object Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 08/36] i386/tdx: Adjust get_supported_cpuid() for TDX VM Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-18 16:55   ` Isaku Yamahata
2022-03-18 16:55     ` Isaku Yamahata
2022-03-21  5:37     ` Xiaoyao Li
2022-03-21  5:37       ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 09/36] KVM: Introduce kvm_arch_pre_create_vcpu() Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-18 16:56   ` Isaku Yamahata
2022-03-18 16:56     ` Isaku Yamahata
2022-03-21  7:02     ` Xiaoyao Li
2022-03-21  7:02       ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 10/36] i386/kvm: Move architectural CPUID leaf generation to separate helper Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 11/36] i386/tdx: Initialize TDX before creating TD vcpus Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 12/36] i386/tdx: Add property sept-ve-disable for tdx-guest object Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-22  9:02   ` Gerd Hoffmann
2022-03-22  9:02     ` Gerd Hoffmann
2022-03-24  6:52     ` Xiaoyao Li
2022-03-24  6:52       ` Xiaoyao Li
2022-03-24  7:57       ` Gerd Hoffmann
2022-03-24  7:57         ` Gerd Hoffmann
2022-03-24  8:08         ` Xiaoyao Li
2022-03-24  8:08           ` Xiaoyao Li
2022-03-24  9:37           ` Gerd Hoffmann
2022-03-24  9:37             ` Gerd Hoffmann
2022-03-24 14:36             ` Xiaoyao Li
2022-03-24 14:36               ` Xiaoyao Li
2022-03-25  1:35             ` Isaku Yamahata
2022-03-25  1:35               ` Isaku Yamahata
2022-03-17 13:58 ` [RFC PATCH v3 13/36] i386/tdx: Wire CPU features up with attributes of TD guest Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 14/36] i386/tdx: Validate TD attributes Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 15/36] i386/tdx: Implement user specified tsc frequency Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 16/36] i386/tdx: Set kvm_readonly_mem_enabled to false for TDX VM Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-18 17:11   ` Isaku Yamahata
2022-03-18 17:11     ` Isaku Yamahata
2022-03-21  8:15     ` Xiaoyao Li
2022-03-21  8:15       ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 17/36] pflash_cfi01/tdx: Introduce ram_mode of pflash for TDVF Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-18 14:07   ` Philippe Mathieu-Daudé
2022-03-18 14:07     ` Philippe Mathieu-Daudé
2022-03-21  8:54     ` Xiaoyao Li
2022-03-21  8:54       ` Xiaoyao Li
2022-03-21 22:06       ` Isaku Yamahata
2022-03-21 22:06         ` Isaku Yamahata
2022-03-22  9:21       ` Gerd Hoffmann
2022-03-22  9:21         ` Gerd Hoffmann
2022-03-22  9:29         ` Daniel P. Berrangé
2022-03-22  9:29           ` Daniel P. Berrangé
2022-03-22 10:35           ` Gerd Hoffmann
2022-03-22 10:35             ` Gerd Hoffmann
2022-03-22 10:51             ` Daniel P. Berrangé
2022-03-22 10:51               ` Daniel P. Berrangé
2022-03-22 12:20               ` Gerd Hoffmann
2022-03-22 12:20                 ` Gerd Hoffmann
2022-03-24  8:35                 ` Gerd Hoffmann
2022-03-24  8:35                   ` Gerd Hoffmann
2022-03-31  6:57                   ` Xiaoyao Li
2022-03-31  6:57                     ` Xiaoyao Li
2022-03-24  6:13           ` Xiaoyao Li
2022-03-24  6:13             ` Xiaoyao Li
2022-03-24  7:58             ` Gerd Hoffmann
2022-03-24  7:58               ` Gerd Hoffmann
2022-03-24  8:18               ` Xiaoyao Li
2022-03-24  8:18                 ` Xiaoyao Li
2022-03-24  8:52             ` Daniel P. Berrangé
2022-03-24  8:52               ` Daniel P. Berrangé
2022-03-22  9:27       ` Daniel P. Berrangé
2022-03-22  9:27         ` Daniel P. Berrangé
2022-03-31  8:51         ` Xiaoyao Li
2022-03-31  8:51           ` Xiaoyao Li
2022-03-31  9:00           ` Daniel P. Berrangé
2022-03-31  9:00             ` Daniel P. Berrangé
2022-03-31 14:50             ` Xiaoyao Li
2022-03-31 14:50               ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 18/36] i386/tdvf: Introduce function to parse TDVF metadata Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-18 17:19   ` Isaku Yamahata
2022-03-18 17:19     ` Isaku Yamahata
2022-03-21  6:11     ` Xiaoyao Li
2022-03-21  6:11       ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 19/36] i386/tdx: Parse TDVF metadata for TDX VM Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 20/36] i386/tdx: Get and store the mem_ptr of TDVF firmware Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 21/36] i386/tdx: Track mem_ptr for each firmware entry of TDVF Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:58 ` [RFC PATCH v3 22/36] i386/tdx: Track RAM entries for TDX VM Xiaoyao Li
2022-03-17 13:58   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 23/36] i386/tdx: Create the TD HOB list upon machine init done Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 24/36] i386/tdx: Call KVM_TDX_INIT_VCPU to initialize TDX vcpu Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 25/36] i386/tdx: Add TDVF memory via KVM_TDX_INIT_MEM_REGION Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 26/36] i386/tdx: Finalize TDX VM Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 27/36] i386/tdx: Disable SMM for TDX VMs Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-21  6:51   ` Xiaoyao Li
2022-03-21  6:51     ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 28/36] i386/tdx: Disable PIC " Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 29/36] i386/tdx: Don't allow system reset " Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 30/36] hw/i386: add eoi_intercept_unsupported member to X86MachineState Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 31/36] hw/i386: add option to forcibly report edge trigger in acpi tables Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 32/36] i386/tdx: Don't synchronize guest tsc for TDs Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 33/36] i386/tdx: Only configure MSR_IA32_UCODE_REV in kvm_init_msrs() " Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-18 17:31   ` Isaku Yamahata
2022-03-18 17:31     ` Isaku Yamahata
2022-03-21  6:08     ` Xiaoyao Li
2022-03-21  6:08       ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 34/36] i386/tdx: Skip kvm_put_apicbase() " Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 35/36] i386/tdx: Don't get/put guest state for TDX VMs Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
2022-03-17 13:59 ` [RFC PATCH v3 36/36] docs: Add TDX documentation Xiaoyao Li
2022-03-17 13:59   ` Xiaoyao Li
