linux-sgx.vger.kernel.org archive mirror
* [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
@ 2021-01-26  9:29 ` Kai Huang
  2021-01-26  9:30 ` [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features Kai Huang
                   ` (27 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:29 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jethro, b.thiel, jmattson, joro, vkuznets,
	wanpengli, corbet

--- Disclaimer ---

These patches were originally written by Sean Christopherson while at Intel.
Now that Sean has left Intel, I (Kai) have taken over getting them upstream.
This series needs more review before it can be merged.  It is being posted
publicly and under RFC so Sean and others can review it. Maintainers are safe
ignoring it for now.

------------------

Hi all,

This series adds KVM SGX virtualization support. The first 15 patches starting
with x86/sgx or x86/cpu.. are necessary changes to x86 and SGX core/driver to
support KVM SGX virtualization, while the rest are patches to KVM subsystem.

Please help to review this series. Any feedback is highly appreciated.
Please let me know if I forgot to CC anyone, or anyone wants to be removed from
CC. Thanks in advance!

This series is based on tip/x86/sgx. You can also get the code from the
upstream branch of the kvm-sgx repo on GitHub:

        https://github.com/intel/kvm-sgx.git upstream

It also requires QEMU changes to create a VM with SGX support. You can find the
QEMU repo here:

	https://github.com/intel/qemu-sgx.git upstream

Please refer to README.md of the above qemu-sgx repo for details on how to create
a guest with SGX support. In the meantime, for quick reference, you can use the
command below to create an SGX guest:

	#qemu-system-x86_64 -smp 4 -m 2G -drive file=<your_vm_image>,if=virtio \
		-cpu host,+sgx_provisionkey \
		-sgx-epc id=epc1,memdev=mem1 \
		-object memory-backend-epc,id=mem1,size=64M,prealloc

Please note that the SGX relevant part is:

		-cpu host,+sgx_provisionkey \
		-sgx-epc id=epc1,memdev=mem1 \
		-object memory-backend-epc,id=mem1,size=64M,prealloc

And you can change other parameters of your qemu command based on your needs.

=========
Changelog:

(Changelog here is for global changes. Please see each patch's changelog for
 changes made to specific patch.)

v2->v3:

 - Split original "x86/cpufeatures: Add SGX1 and SGX2 sub-features" patch into
   two patches, by moving the logic that adds the SGX_LC bit to the cpuid-deps
   table into a separate patch 2:
       [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
       [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
 - Changed from /dev/sgx_virt_epc to /dev/sgx_vepc, per Jarkko. And accordingly,
   changed prefix 'sgx_virt_epc_xx' to 'sgx_vepc_xx' in various functions and
   structures.
 - Changed CONFIG_X86_SGX_VIRTUALIZATION to CONFIG_X86_SGX_KVM, per Dave. A
   couple of x86 patches and KVM patches changed accordingly due to the renaming.

v1->v2:

 - Refined this cover letter by addressing comments from Dave and Jarkko.
 - The original patch which introduced the new X86_FEATURE_SGX1/SGX2 was replaced
   by 3 new patches from Sean, following Boris and Sean's discussion.
       [RFC PATCH v2 01/26] x86/cpufeatures: Add SGX1 and SGX2 sub-features
       [RFC PATCH v2 18/26] KVM: x86: Add support for reverse CPUID lookup of scattered features
       [RFC PATCH v2 19/26] KVM: x86: Add reverse-CPUID lookup support for scattered SGX features
 - The original patch 1
       x86/sgx: Split out adding EPC page to free list to separate helper
   was replaced with 2 new patches from Jarkko
       [RFC PATCH v2 02/26] x86/sgx: Remove a warn from sgx_free_epc_page()
       [RFC PATCH v2 03/26] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
   addressing Jarkko's comments.
 - Moved the sgx_init() change that always calls sgx_virt_epc_init() out of
   patch
       x86/sgx: Introduce virtual EPC for use by KVM guests
   to a separate patch:
       [RFC PATCH v2 07/26] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
   to address Dave's comment that patch ordering can be improved: before
   patch "Allow SGX virtualization without Launch Control support", all SGX,
   including SGX virtualization, is actually disabled when SGX LC is not
   present.

=========
KVM SGX virtualization Overview

- Virtual EPC

SGX enclave memory is special and is reserved specifically for enclave use.
In bare-metal SGX enclaves, the kernel allocates enclave pages, copies data
into the pages with privileged instructions, then allows the enclave to start.
In this scenario, only initialized pages already assigned to an enclave are
mapped to userspace.

In virtualized environments, the hypervisor still needs to do the physical
enclave page allocation.  The guest kernel is responsible for the data copying
(among other things).  This means that the job of starting an enclave is now
split between hypervisor and guest.

This series introduces a new misc device: /dev/sgx_vepc.  This device allows
the host to map *uninitialized* enclave memory into userspace, which can then
be passed into a guest.
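
For illustration only, the userspace side of this could look roughly like the
sketch below: open the device node, then mmap() the desired amount of raw EPC.
The helper name vepc_map() and its error convention are hypothetical and not
part of this series; the qemu-sgx repo above is the real consumer. Per the
fault handler in patch 06, EPC pages would be allocated lazily on first access
rather than at mmap() time.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map 'size' bytes of raw EPC from the device node at
 * 'path' (e.g. "/dev/sgx_vepc").  Returns 0 and stores the mapping in *addr,
 * or a negative errno value on failure.
 */
int vepc_map(const char *path, size_t size, void **addr)
{
	void *p;
	int fd, err;

	assert(path != NULL && addr != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -errno;

	/* A shared mapping, so the range can back a guest memory slot. */
	p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	err = errno;

	/* The mapping holds its own reference to the open file. */
	close(fd);

	if (p == MAP_FAILED)
		return -err;

	*addr = p;
	return 0;
}
```

A caller would pass the returned address range to KVM as the backing of a
memory slot and munmap() it when the guest goes away. On a host without the
device node the helper simply returns -ENOENT.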

While it might be *possible* to start a host-side enclave with /dev/sgx_enclave
and pass its memory into a guest, it would be wasteful and convoluted.

Implement the *raw* EPC allocation in the x86 core-SGX subsystem via
/dev/sgx_vepc rather than in KVM.  Doing so has two major advantages:

  - Does not require changes to KVM's uAPI, e.g. EPC gets handled as
    just another memory backend for guests.

  - EPC management is wholly contained in the SGX subsystem, e.g. SGX
    does not have to export any symbols, changes to reclaim flows don't
    need to be routed through KVM, SGX's dirty laundry doesn't have to
    get aired out for the world to see, and so on and so forth.

The virtual EPC pages allocated to guests are currently not reclaimable.
Reclaiming an EPC page used by an enclave requires a special reclaim mechanism
separate from normal page reclaim, and that mechanism is not supported
for virtual EPC pages.  Due to the complications of handling reclaim
conflicts between guest and host, reclaiming virtual EPC pages is
significantly more complex than basic support for SGX virtualization.

- Support SGX virtualization without SGX Flexible Launch Control

SGX hardware supports two "launch control" modes to limit which enclaves can
run.  In the "locked" mode, the hardware prevents enclaves from running unless
they are blessed by a third party.  In the unlocked mode, the kernel is in
full control of which enclaves can run.  The bare-metal SGX code refuses to
launch enclaves unless it is in the unlocked mode.

The sgx_vepc driver does not have such a restriction.  This allows guests
which are OK with the locked mode to use SGX, even if the host kernel refuses
to.
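
As a rough model of the policy difference described above (not the kernel
code): the two relevant bits of MSR_IA32_FEAT_CTL are SGX_ENABLED (bit 18) and
SGX_LC_ENABLED (bit 17). The function names below are hypothetical, and the
sketch assumes that the bare-metal driver needs both bits while virtual EPC
needs only SGX_ENABLED; other prerequisites such as VMX and KVM support are
ignored.

```c
#include <assert.h>	/* handy for the checks suggested below */
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined for MSR_IA32_FEAT_CTL. */
#define FEAT_CTL_SGX_LC_ENABLED	(1ULL << 17)
#define FEAT_CTL_SGX_ENABLED	(1ULL << 18)

/* Bare-metal driver: needs SGX enabled and launch control unlocked. */
bool sgx_driver_usable(uint64_t feat_ctl)
{
	return (feat_ctl & FEAT_CTL_SGX_ENABLED) &&
	       (feat_ctl & FEAT_CTL_SGX_LC_ENABLED);
}

/* Virtual EPC (assumption): launch control may stay locked. */
bool sgx_vepc_usable(uint64_t feat_ctl)
{
	return (feat_ctl & FEAT_CTL_SGX_ENABLED) != 0;
}
```

With only FEAT_CTL_SGX_ENABLED set, sgx_driver_usable() is false while
sgx_vepc_usable() is true, which is the "locked mode" case this series wants
to keep usable for guests.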

- Support exposing SGX2

For the same reason as above, SGX2 feature detection is added to core SGX code
to allow KVM to expose SGX2 to guests, even though the SGX driver currently
doesn't support SGX2, because SGX2 can work just fine in a guest without any
interaction with the host SGX driver.

- Restrict SGX guest access to provisioning key

To allow a guest to fully use SGX, the guest needs to be able to access the
provisioning key.  The provisioning key is sensitive, and access to it should
be restricted. In the bare-metal driver, an enclave's access to the
provisioning key is gated by the ability to open /dev/sgx_provision.

Add a new KVM_CAP_SGX_ATTRIBUTE to KVM uAPI to extend the above mechanism to KVM
guests as well.  When the userspace hypervisor creates a new VM, the new cap is
only added to the VM when the userspace hypervisor is able to open
/dev/sgx_provision, following the same rule as in the bare-metal driver.  KVM
then traps ECREATE from the guest, and only allows ECREATE with the
provisioning key bit to run when the VM has KVM_CAP_SGX_ATTRIBUTE.

Jarkko Sakkinen (2):
  x86/sgx: Remove a warn from sgx_free_epc_page()
  x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

Kai Huang (3):
  x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
  x86/sgx: Initialize virtual EPC driver even when SGX driver is
    disabled
  x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs

Sean Christopherson (22):
  x86/cpufeatures: Add SGX1 and SGX2 sub-features
  x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  x86/sgx: Introduce virtual EPC for use by KVM guests
  x86/cpu/intel: Allow SGX virtualization without Launch Control support
  x86/sgx: Expose SGX architectural definitions to the kernel
  x86/sgx: Move ENCLS leaf definitions to sgx_arch.h
  x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT)
  x86/sgx: Add encls_faulted() helper
  x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  x86/sgx: Move provisioning device creation out of SGX driver
  KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX)
  KVM: x86: Define new #PF SGX error code bit
  KVM: x86: Add support for reverse CPUID lookup of scattered features
  KVM: x86: Add reverse-CPUID lookup support for scattered SGX features
  KVM: VMX: Add basic handling of VM-Exit from SGX enclave
  KVM: VMX: Frame in ENCLS handler for SGX virtualization
  KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs
  KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC)
  KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC
  KVM: x86: Add capability to grant VM access to privileged SGX
    attribute

 Documentation/virt/kvm/api.rst                |  23 +
 arch/x86/Kconfig                              |  12 +
 arch/x86/include/asm/cpufeatures.h            |   2 +
 arch/x86/include/asm/kvm_host.h               |   5 +
 arch/x86/include/asm/sgx.h                    |  19 +
 .../cpu/sgx/arch.h => include/asm/sgx_arch.h} |  20 +
 arch/x86/include/asm/vmx.h                    |   1 +
 arch/x86/include/uapi/asm/vmx.h               |   1 +
 arch/x86/kernel/cpu/cpuid-deps.c              |   3 +
 arch/x86/kernel/cpu/feat_ctl.c                |  70 ++-
 arch/x86/kernel/cpu/scattered.c               |   2 +
 arch/x86/kernel/cpu/sgx/Makefile              |   1 +
 arch/x86/kernel/cpu/sgx/driver.c              |  17 -
 arch/x86/kernel/cpu/sgx/encl.c                |  15 +-
 arch/x86/kernel/cpu/sgx/encls.h               |  30 +-
 arch/x86/kernel/cpu/sgx/ioctl.c               |  23 +-
 arch/x86/kernel/cpu/sgx/main.c                |  87 +++-
 arch/x86/kernel/cpu/sgx/sgx.h                 |   4 +-
 arch/x86/kernel/cpu/sgx/virt.c                | 347 +++++++++++++
 arch/x86/kernel/cpu/sgx/virt.h                |  14 +
 arch/x86/kvm/Makefile                         |   2 +
 arch/x86/kvm/cpuid.c                          |  89 +++-
 arch/x86/kvm/cpuid.h                          |  50 +-
 arch/x86/kvm/vmx/nested.c                     |  70 ++-
 arch/x86/kvm/vmx/nested.h                     |   5 +
 arch/x86/kvm/vmx/sgx.c                        | 462 ++++++++++++++++++
 arch/x86/kvm/vmx/sgx.h                        |  34 ++
 arch/x86/kvm/vmx/vmcs12.c                     |   1 +
 arch/x86/kvm/vmx/vmcs12.h                     |   4 +-
 arch/x86/kvm/vmx/vmx.c                        | 171 +++++--
 arch/x86/kvm/vmx/vmx.h                        |  27 +-
 arch/x86/kvm/x86.c                            |  24 +
 include/uapi/linux/kvm.h                      |   1 +
 tools/testing/selftests/sgx/defines.h         |   2 +-
 34 files changed, 1482 insertions(+), 156 deletions(-)
 create mode 100644 arch/x86/include/asm/sgx.h
 rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx_arch.h} (96%)
 create mode 100644 arch/x86/kernel/cpu/sgx/virt.c
 create mode 100644 arch/x86/kernel/cpu/sgx/virt.h
 create mode 100644 arch/x86/kvm/vmx/sgx.c
 create mode 100644 arch/x86/kvm/vmx/sgx.h

-- 
2.29.2



* [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
  2021-01-26  9:29 ` Kai Huang
@ 2021-01-26  9:30 ` Kai Huang
  2021-01-26 15:34   ` Dave Hansen
  2021-01-30 13:11   ` Jarkko Sakkinen
  2021-01-26  9:30 ` [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit Kai Huang
                   ` (26 subsequent siblings)
  28 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:30 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <seanjc@google.com>

Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
features, since adding a new leaf for only two bits would be wasteful.
As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
guest, and to do so correctly needs to query hardware and kernel support
for SGX1 and SGX2.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

- Split moving SGX_LC to cpuid-deps table logic into separate patch.

---
 arch/x86/include/asm/cpufeatures.h | 2 ++
 arch/x86/kernel/cpu/cpuid-deps.c   | 2 ++
 arch/x86/kernel/cpu/scattered.c    | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 84b887825f12..18b2d0c8bbbe 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -292,6 +292,8 @@
 #define X86_FEATURE_FENCE_SWAPGS_KERNEL	(11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
 #define X86_FEATURE_SPLIT_LOCK_DETECT	(11*32+ 6) /* #AC for split lock */
 #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
+#define X86_FEATURE_SGX1		(11*32+ 8) /* Software Guard Extensions sub-feature SGX1 */
+#define X86_FEATURE_SGX2        	(11*32+ 9) /* Software Guard Extensions sub-feature SGX2 */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* AVX512 BFLOAT16 instructions */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 42af31b64c2c..5cf965580dd4 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -72,6 +72,8 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
 	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_PER_THREAD_MBA,		X86_FEATURE_MBA       },
+	{ X86_FEATURE_SGX1,			X86_FEATURE_SGX       },
+	{ X86_FEATURE_SGX2,			X86_FEATURE_SGX1      },
 	{}
 };
 
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 236924930bf0..fea0df867d18 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -36,6 +36,8 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_CDP_L2,		CPUID_ECX,  2, 0x00000010, 2 },
 	{ X86_FEATURE_MBA,		CPUID_EBX,  3, 0x00000010, 0 },
 	{ X86_FEATURE_PER_THREAD_MBA,	CPUID_ECX,  0, 0x00000010, 3 },
+	{ X86_FEATURE_SGX1,		CPUID_EAX,  0, 0x00000012, 0 },
+	{ X86_FEATURE_SGX2,		CPUID_EAX,  1, 0x00000012, 0 },
 	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
 	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
 	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
-- 
2.29.2



* [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
  2021-01-26  9:29 ` Kai Huang
  2021-01-26  9:30 ` [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features Kai Huang
@ 2021-01-26  9:30 ` Kai Huang
  2021-01-26 15:35   ` Dave Hansen
  2021-01-30 13:22   ` Jarkko Sakkinen
  2021-01-26  9:30 ` [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page() Kai Huang
                   ` (25 subsequent siblings)
  28 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:30 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

Move SGX_LC feature bit to CPUID dependency table as well, along with the
newly added SGX1 and SGX2 bits, to make clearing all SGX feature bits
easier. Also remove clear_sgx_caps() since it is now just a wrapper around
setup_clear_cpu_cap(X86_FEATURE_SGX).

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
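
To illustrate why a single clear of the SGX bit is enough once the table has
these rows, here is a small standalone model of dependency-driven clearing.
It is a simplified userspace sketch, not the kernel's implementation: the enum,
the table and clear_feature() are hypothetical stand-ins for the X86_FEATURE_*
bits, cpuid_deps[] and clear_cpu_cap().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum feat { FEAT_SGX, FEAT_SGX_LC, FEAT_SGX1, FEAT_SGX2, FEAT_MAX };

struct feat_dep {
	enum feat feature;	/* this feature ...            */
	enum feat depends;	/* ... requires this one       */
};

/* Models the SGX rows of cpuid_deps[] after this patch. */
static const struct feat_dep feat_deps[] = {
	{ FEAT_SGX_LC,	FEAT_SGX  },
	{ FEAT_SGX1,	FEAT_SGX  },
	{ FEAT_SGX2,	FEAT_SGX1 },
};

/* Clear 'feature' and, transitively, everything that depends on it. */
void clear_feature(bool caps[FEAT_MAX], enum feat feature)
{
	bool changed;
	size_t i;

	assert(feature < FEAT_MAX);
	caps[feature] = false;

	/* Repeat until a full pass over the table clears nothing new. */
	do {
		changed = false;
		for (i = 0; i < sizeof(feat_deps) / sizeof(feat_deps[0]); i++) {
			const struct feat_dep *d = &feat_deps[i];

			if (!caps[d->depends] && caps[d->feature]) {
				caps[d->feature] = false;
				changed = true;
			}
		}
	} while (changed);
}
```

Starting with all four bits set, clear_feature(caps, FEAT_SGX) leaves none of
them set, while clear_feature(caps, FEAT_SGX1) clears only SGX1 and SGX2.
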
 arch/x86/kernel/cpu/cpuid-deps.c |  1 +
 arch/x86/kernel/cpu/feat_ctl.c   | 12 +++---------
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 5cf965580dd4..defda61f372d 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -72,6 +72,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
 	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_PER_THREAD_MBA,		X86_FEATURE_MBA       },
+	{ X86_FEATURE_SGX_LC,			X86_FEATURE_SGX	      },
 	{ X86_FEATURE_SGX1,			X86_FEATURE_SGX       },
 	{ X86_FEATURE_SGX2,			X86_FEATURE_SGX1      },
 	{}
diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
index 3b1b01f2b248..27533a6e04fa 100644
--- a/arch/x86/kernel/cpu/feat_ctl.c
+++ b/arch/x86/kernel/cpu/feat_ctl.c
@@ -93,15 +93,9 @@ static void init_vmx_capabilities(struct cpuinfo_x86 *c)
 }
 #endif /* CONFIG_X86_VMX_FEATURE_NAMES */
 
-static void clear_sgx_caps(void)
-{
-	setup_clear_cpu_cap(X86_FEATURE_SGX);
-	setup_clear_cpu_cap(X86_FEATURE_SGX_LC);
-}
-
 static int __init nosgx(char *str)
 {
-	clear_sgx_caps();
+	setup_clear_cpu_cap(X86_FEATURE_SGX);
 
 	return 0;
 }
@@ -116,7 +110,7 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 
 	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
 		clear_cpu_cap(c, X86_FEATURE_VMX);
-		clear_sgx_caps();
+		clear_cpu_cap(c, X86_FEATURE_SGX);
 		return;
 	}
 
@@ -177,6 +171,6 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
 		if (enable_sgx)
 			pr_err_once("SGX disabled by BIOS\n");
-		clear_sgx_caps();
+		clear_cpu_cap(c, X86_FEATURE_SGX);
 	}
 }
-- 
2.29.2



* [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (2 preceding siblings ...)
  2021-01-26  9:30 ` [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit Kai Huang
@ 2021-01-26  9:30 ` Kai Huang
  2021-01-26 15:39   ` Dave Hansen
  2021-01-26  9:30 ` [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE " Kai Huang
                   ` (24 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:30 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Jarkko Sakkinen <jarkko@kernel.org>

Remove SGX_EPC_PAGE_RECLAIMER_TRACKED check and warning.  This cannot
happen, as enclave pages are freed only at the time when encl->refcount
triggers, i.e. when both VFS and the page reclaimer have given up on
their references.

Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 8df81a3ed945..f330abdb5bb1 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -605,8 +605,6 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
 	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
 	int ret;
 
-	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
-
 	ret = __eremove(sgx_get_epc_virt_addr(page));
 	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
 		return;
-- 
2.29.2



* [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (3 preceding siblings ...)
  2021-01-26  9:30 ` [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page() Kai Huang
@ 2021-01-26  9:30 ` Kai Huang
  2021-01-26 16:04   ` Dave Hansen
  2021-01-26  9:30 ` [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code Kai Huang
                   ` (23 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:30 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Jarkko Sakkinen <jarkko@kernel.org>

Encapsulate the snippet in sgx_free_epc_page() concerning EREMOVE into
sgx_reset_epc_page(), a static helper function for sgx_encl_release(),
which is the only existing function that deals with initialized
pages.

Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 13 +++++++++++++
 arch/x86/kernel/cpu/sgx/main.c | 10 ++++------
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index ee50a5010277..a78b71447771 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -389,6 +389,16 @@ const struct vm_operations_struct sgx_vm_ops = {
 	.access = sgx_vma_access,
 };
 
+
+static void sgx_reset_epc_page(struct sgx_epc_page *epc_page)
+{
+	int ret;
+
+	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
+	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
+		return;
+}
+
 /**
  * sgx_encl_release - Destroy an enclave instance
  * @kref:	address of a kref inside &sgx_encl
@@ -412,6 +422,7 @@ void sgx_encl_release(struct kref *ref)
 			if (sgx_unmark_page_reclaimable(entry->epc_page))
 				continue;
 
+			sgx_reset_epc_page(entry->epc_page);
 			sgx_free_epc_page(entry->epc_page);
 			encl->secs_child_cnt--;
 			entry->epc_page = NULL;
@@ -423,6 +434,7 @@ void sgx_encl_release(struct kref *ref)
 	xa_destroy(&encl->page_array);
 
 	if (!encl->secs_child_cnt && encl->secs.epc_page) {
+		sgx_reset_epc_page(encl->secs.epc_page);
 		sgx_free_epc_page(encl->secs.epc_page);
 		encl->secs.epc_page = NULL;
 	}
@@ -431,6 +443,7 @@ void sgx_encl_release(struct kref *ref)
 		va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
 					   list);
 		list_del(&va_page->list);
+		sgx_reset_epc_page(va_page->epc_page);
 		sgx_free_epc_page(va_page->epc_page);
 		kfree(va_page);
 	}
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index f330abdb5bb1..21c2ffa13870 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -598,16 +598,14 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
  * sgx_free_epc_page() - Free an EPC page
  * @page:	an EPC page
  *
- * Call EREMOVE for an EPC page and insert it back to the list of free pages.
+ * Put the EPC page back to the list of free pages. It's the caller's
+ * responsibility to make sure that the page is in uninitialized state. In
+ * other words, do EREMOVE, EWB or whatever operation is necessary before
+ * calling this function.
  */
 void sgx_free_epc_page(struct sgx_epc_page *page)
 {
 	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
-	int ret;
-
-	ret = __eremove(sgx_get_epc_virt_addr(page));
-	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
-		return;
 
 	spin_lock(&section->lock);
 	list_add_tail(&page->list, &section->page_list);
-- 
2.29.2



* [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (4 preceding siblings ...)
  2021-01-26  9:30 ` [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE " Kai Huang
@ 2021-01-26  9:30 ` Kai Huang
  2021-01-26 15:49   ` Dave Hansen
  2021-01-26  9:30 ` [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests Kai Huang
                   ` (22 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:30 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

SGX virtualization requires allocating "raw" EPC and using it as "virtual
EPC" for the SGX guest.  Unlike EPC used by the SGX driver, virtual EPC doesn't
track how EPC pages are used in the VM, e.g. (de)construction of enclaves,
so it cannot guarantee EREMOVE success, e.g. it doesn't have a priori
knowledge of which pages are SECS with non-zero child counts.

Add SGX_CHILD_PRESENT for use by SGX virtualization to assert EREMOVE
failures are expected, but only due to SGX_CHILD_PRESENT.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Changed from 'Enclave has child' to 'SECS has child', per Jarkko.

---
 arch/x86/kernel/cpu/sgx/arch.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/arch.h b/arch/x86/kernel/cpu/sgx/arch.h
index dd7602c44c72..abf99bb71fdc 100644
--- a/arch/x86/kernel/cpu/sgx/arch.h
+++ b/arch/x86/kernel/cpu/sgx/arch.h
@@ -26,12 +26,14 @@
  * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
  * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
  *				been completed yet.
+ * %SGX_CHILD_PRESENT:		SECS has child pages present in the EPC.
  * %SGX_INVALID_EINITTOKEN:	EINITTOKEN is invalid and enclave signer's
  *				public key does not match IA32_SGXLEPUBKEYHASH.
  * %SGX_UNMASKED_EVENT:		An unmasked event, e.g. INTR, was received
  */
 enum sgx_return_code {
 	SGX_NOT_TRACKED			= 11,
+	SGX_CHILD_PRESENT		= 13,
 	SGX_INVALID_EINITTOKEN		= 16,
 	SGX_UNMASKED_EVENT		= 128,
 };
-- 
2.29.2



* [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (5 preceding siblings ...)
  2021-01-26  9:30 ` [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code Kai Huang
@ 2021-01-26  9:30 ` Kai Huang
  2021-01-26 16:19   ` Dave Hansen
  2021-01-30 14:41   ` Jarkko Sakkinen
  2021-01-26  9:30 ` [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support Kai Huang
                   ` (21 subsequent siblings)
  28 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:30 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add a misc device /dev/sgx_vepc to allow userspace to allocate "raw" EPC
without an associated enclave.  The intended and only known use case for
raw EPC allocation is to expose EPC to a KVM guest, hence the 'vepc'
moniker, virt.{c,h} files and X86_SGX_KVM Kconfig.

More specifically, to allocate a virtual EPC instance with a particular
size, the userspace hypervisor opens the device node, and uses mmap()
with the intended size to get an address range of virtual EPC.  Then
it may use the address range to create one KVM memory slot as virtual
EPC for the guest.

Implement the "raw" EPC allocation in the x86 core-SGX subsystem via
/dev/sgx_vepc rather than in KVM. Doing so has two major advantages:

  - Does not require changes to KVM's uAPI, e.g. EPC gets handled as
    just another memory backend for guests.

  - EPC management is wholly contained in the SGX subsystem, e.g. SGX
    does not have to export any symbols, changes to reclaim flows don't
    need to be routed through KVM, SGX's dirty laundry doesn't have to
    get aired out for the world to see, and so on and so forth.

The virtual EPC pages allocated to guests are currently not reclaimable.
Reclaiming an EPC page used by an enclave requires a special reclaim mechanism
separate from normal page reclaim, and that mechanism is not supported
for virtual EPC pages.  Due to the complications of handling reclaim
conflicts between guest and host, reclaiming virtual EPC pages is
significantly more complex than basic support for SGX virtualization.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Changed from /dev/sgx_virt_epc to /dev/sgx_vepc, per Jarkko. Accordingly,
   renamed 'sgx_virt_epc_xx' to 'sgx_vepc_xx' for various functions and
   structures.
 - Changed CONFIG_X86_SGX_VIRTUALIZATION to CONFIG_X86_SGX_KVM, per Dave.

v1->v2:

 - Added one paragraph to explain fops of virtual EPC, per Jarkko's suggestion.
 - Moved change to sgx_init() out of this patch to a separate patch, as stated
   in cover letter.
 - In sgx_virt_epc_init(), return error if VMX is not supported, or
   CONFIG_KVM_INTEL is not enabled, because there's no point in creating
   /dev/sgx_virt_epc if KVM is not supported.
 - Removed 'struct mm_struct *mm' in 'struct sgx_virt_epc', and related logic in
   sgx_virt_epc_open/release/mmap(), per Dave's comment.
 - Renamed 'virtual_epc_zombie_pages' and 'virt_epc_lock' to 'zombie_secs_pages'
   and 'zombie_secs_pages_lock', per Dave's suggestion.
 - Changed __sgx_free_epc_page() to sgx_free_epc_page() because Jarkko's patch
   removes EREMOVE from sgx_free_epc_page().
 - Changed all struct sgx_virt_epc *epc to struct sgx_virt_epc *vepc.
 - In __sgx_virt_epc_fault(), changed comment to use WARN_ON() to make sure
   vepc->lock is already held, per Dave's suggestion.
 - In sgx_virt_epc_free_page(), added comments to explain SGX_ENCLAVE_ACT is not
   expected; and changed to use WARN_ONCE() to dump actual error code, per
   Dave's comment.
 - Removed NULL page check in sgx_virt_epc_free_page(), per Dave's comment.

---
 arch/x86/Kconfig                 |  12 ++
 arch/x86/kernel/cpu/sgx/Makefile |   1 +
 arch/x86/kernel/cpu/sgx/virt.c   | 254 +++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/virt.h   |  14 ++
 4 files changed, 281 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/sgx/virt.c
 create mode 100644 arch/x86/kernel/cpu/sgx/virt.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 21f851179ff0..ccb35d14c297 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1951,6 +1951,18 @@ config X86_SGX
 
 	  If unsure, say N.
 
+config X86_SGX_KVM
+	bool "Software Guard eXtensions (SGX) Virtualization"
+	depends on X86_SGX && KVM_INTEL
+	help
+
+	  Enables KVM guests to create SGX enclaves.
+
+	  This includes support to expose "raw" unreclaimable enclave memory to
+	  guests via a device node, e.g. /dev/sgx_vepc.
+
+	  If unsure, say N.
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile
index 91d3dc784a29..9c1656779b2a 100644
--- a/arch/x86/kernel/cpu/sgx/Makefile
+++ b/arch/x86/kernel/cpu/sgx/Makefile
@@ -3,3 +3,4 @@ obj-y += \
 	encl.o \
 	ioctl.o \
 	main.o
+obj-$(CONFIG_X86_SGX_KVM)	+= virt.o
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
new file mode 100644
index 000000000000..e1ad7856d878
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -0,0 +1,254 @@
+// SPDX-License-Identifier: GPL-2.0
+/*  Copyright(c) 2016-20 Intel Corporation. */
+
+#define pr_fmt(fmt)	"SGX virtual EPC: " fmt
+
+#include <linux/miscdevice.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/sched/mm.h>
+#include <linux/sched/signal.h>
+#include <linux/slab.h>
+#include <linux/xarray.h>
+#include <asm/sgx.h>
+#include <uapi/asm/sgx.h>
+
+#include "encls.h"
+#include "sgx.h"
+#include "virt.h"
+
+struct sgx_vepc {
+	struct xarray page_array;
+	struct mutex lock;
+};
+
+static struct mutex zombie_secs_pages_lock;
+static struct list_head zombie_secs_pages;
+
+static int __sgx_vepc_fault(struct sgx_vepc *vepc,
+			    struct vm_area_struct *vma, unsigned long addr)
+{
+	struct sgx_epc_page *epc_page;
+	unsigned long index, pfn;
+	int ret;
+
+	WARN_ON(!mutex_is_locked(&vepc->lock));
+
+	/* Calculate index of EPC page in virtual EPC's page_array */
+	index = vma->vm_pgoff + PFN_DOWN(addr - vma->vm_start);
+
+	epc_page = xa_load(&vepc->page_array, index);
+	if (epc_page)
+		return 0;
+
+	epc_page = sgx_alloc_epc_page(vepc, false);
+	if (IS_ERR(epc_page))
+		return PTR_ERR(epc_page);
+
+	ret = xa_err(xa_store(&vepc->page_array, index, epc_page, GFP_KERNEL));
+	if (ret)
+		goto err_free;
+
+	pfn = PFN_DOWN(sgx_get_epc_phys_addr(epc_page));
+
+	ret = vmf_insert_pfn(vma, addr, pfn);
+	if (ret != VM_FAULT_NOPAGE) {
+		ret = -EFAULT;
+		goto err_delete;
+	}
+
+	return 0;
+
+err_delete:
+	xa_erase(&vepc->page_array, index);
+err_free:
+	sgx_free_epc_page(epc_page);
+	return ret;
+}
+
+static vm_fault_t sgx_vepc_fault(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct sgx_vepc *vepc = vma->vm_private_data;
+	int ret;
+
+	mutex_lock(&vepc->lock);
+	ret = __sgx_vepc_fault(vepc, vma, vmf->address);
+	mutex_unlock(&vepc->lock);
+
+	if (!ret)
+		return VM_FAULT_NOPAGE;
+
+	if (ret == -EBUSY && (vmf->flags & FAULT_FLAG_ALLOW_RETRY)) {
+		mmap_read_unlock(vma->vm_mm);
+		return VM_FAULT_RETRY;
+	}
+
+	return VM_FAULT_SIGBUS;
+}
+
+const struct vm_operations_struct sgx_vepc_vm_ops = {
+	.fault = sgx_vepc_fault,
+};
+
+static int sgx_vepc_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct sgx_vepc *vepc = file->private_data;
+
+	if (!(vma->vm_flags & VM_SHARED))
+		return -EINVAL;
+
+	vma->vm_ops = &sgx_vepc_vm_ops;
+	/* Don't copy VMA in fork() */
+	vma->vm_flags |= VM_PFNMAP | VM_IO | VM_DONTDUMP | VM_DONTCOPY;
+	vma->vm_private_data = vepc;
+
+	return 0;
+}
+
+static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
+{
+	int ret;
+
+	/*
+	 * Take a previously guest-owned EPC page and return it to the
+	 * general EPC page pool.
+	 *
+	 * Guests can not be trusted to have left this page in a good
+	 * state, so run EREMOVE on the page unconditionally.  In the
+	 * case that a guest properly EREMOVE'd this page, a superfluous
+	 * EREMOVE is harmless.
+	 */
+	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
+	if (ret) {
+		/*
+		 * Only SGX_CHILD_PRESENT is expected, which results from
+		 * EREMOVE'ing an SECS that still has children.  That case is
+		 * handled by EREMOVE'ing the SECS again after all pages in
+		 * the virtual EPC have been EREMOVE'd.  See the comments
+		 * below in sgx_vepc_release().
+		 *
+		 * The user of virtual EPC (KVM) needs to guarantee that no
+		 * logical processor is still running in the enclave in the
+		 * guest, otherwise EREMOVE will get SGX_ENCLAVE_ACT, which
+		 * cannot be handled here.
+		 */
+		WARN_ONCE(ret != SGX_CHILD_PRESENT,
+			  "EREMOVE (EPC page 0x%lx): unexpected error: %d\n",
+			  sgx_get_epc_phys_addr(epc_page), ret);
+		return ret;
+	}
+
+	sgx_free_epc_page(epc_page);
+	return 0;
+}
+
+static int sgx_vepc_release(struct inode *inode, struct file *file)
+{
+	struct sgx_vepc *vepc = file->private_data;
+	struct sgx_epc_page *epc_page, *tmp, *entry;
+	unsigned long index;
+
+	LIST_HEAD(secs_pages);
+
+	xa_for_each(&vepc->page_array, index, entry) {
+		/*
+		 * Remove all normal, child pages.  sgx_vepc_free_page()
+		 * will fail if EREMOVE fails, but this is OK and expected on
+		 * SECS pages.  Those can only be EREMOVE'd *after* all their
+		 * child pages. Retries below will clean them up.
+		 */
+		if (sgx_vepc_free_page(entry))
+			continue;
+
+		xa_erase(&vepc->page_array, index);
+	}
+
+	/*
+	 * Retry EREMOVE'ing pages.  This will clean up any SECS pages that
+	 * only had children in this 'epc' area.
+	 */
+	xa_for_each(&vepc->page_array, index, entry) {
+		epc_page = entry;
+		/*
+		 * An EREMOVE failure here means that the SECS page still
+		 * has children.  But, since all children in this 'sgx_vepc'
+		 * have been removed, the SECS page must have a child on
+		 * another instance.
+		 */
+		if (sgx_vepc_free_page(epc_page))
+			list_add_tail(&epc_page->list, &secs_pages);
+
+		xa_erase(&vepc->page_array, index);
+	}
+
+	/*
+	 * SECS pages are "pinned" by child pages, and unpinned once all
+	 * children have been EREMOVE'd.  A child page in this instance
+	 * may have pinned an SECS page encountered in an earlier release(),
+	 * creating a zombie.  Since some children were EREMOVE'd above,
+	 * try to EREMOVE all zombies in the hopes that one was unpinned.
+	 */
+	mutex_lock(&zombie_secs_pages_lock);
+	list_for_each_entry_safe(epc_page, tmp, &zombie_secs_pages, list) {
+		/*
+		 * Speculatively remove the page from the list of zombies.
+		 * If the page is successfully EREMOVE'd, it will be added to
+		 * the list of free pages.  If EREMOVE fails, throw the page
+		 * on the local list, which will be spliced on at the end.
+		 */
+		list_del(&epc_page->list);
+
+		if (sgx_vepc_free_page(epc_page))
+			list_add_tail(&epc_page->list, &secs_pages);
+	}
+
+	if (!list_empty(&secs_pages))
+		list_splice_tail(&secs_pages, &zombie_secs_pages);
+	mutex_unlock(&zombie_secs_pages_lock);
+
+	kfree(vepc);
+
+	return 0;
+}
+
+static int sgx_vepc_open(struct inode *inode, struct file *file)
+{
+	struct sgx_vepc *vepc;
+
+	vepc = kzalloc(sizeof(struct sgx_vepc), GFP_KERNEL);
+	if (!vepc)
+		return -ENOMEM;
+	mutex_init(&vepc->lock);
+	xa_init(&vepc->page_array);
+
+	file->private_data = vepc;
+
+	return 0;
+}
+
+static const struct file_operations sgx_vepc_fops = {
+	.owner		= THIS_MODULE,
+	.open		= sgx_vepc_open,
+	.release	= sgx_vepc_release,
+	.mmap		= sgx_vepc_mmap,
+};
+
+static struct miscdevice sgx_vepc_dev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "sgx_vepc",
+	.nodename = "sgx_vepc",
+	.fops = &sgx_vepc_fops,
+};
+
+int __init sgx_vepc_init(void)
+{
+	/* SGX virtualization requires KVM to work */
+	if (!boot_cpu_has(X86_FEATURE_VMX) || !IS_ENABLED(CONFIG_KVM_INTEL))
+		return -ENODEV;
+
+	INIT_LIST_HEAD(&zombie_secs_pages);
+	mutex_init(&zombie_secs_pages_lock);
+
+	return misc_register(&sgx_vepc_dev);
+}
diff --git a/arch/x86/kernel/cpu/sgx/virt.h b/arch/x86/kernel/cpu/sgx/virt.h
new file mode 100644
index 000000000000..44d872380ca1
--- /dev/null
+++ b/arch/x86/kernel/cpu/sgx/virt.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+#ifndef _ASM_X86_SGX_VIRT_H
+#define _ASM_X86_SGX_VIRT_H
+
+#ifdef CONFIG_X86_SGX_KVM
+int __init sgx_vepc_init(void);
+#else
+static inline int __init sgx_vepc_init(void)
+{
+	return -ENODEV;
+}
+#endif
+
+#endif /* _ASM_X86_SGX_VIRT_H */
-- 
2.29.2



* [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (6 preceding siblings ...)
  2021-01-26  9:30 ` [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests Kai Huang
@ 2021-01-26  9:30 ` Kai Huang
  2021-01-26 16:26   ` Dave Hansen
  2021-01-30 14:42   ` Jarkko Sakkinen
  2021-01-26  9:31 ` [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled Kai Huang
                   ` (20 subsequent siblings)
  28 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:30 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jethro, b.thiel, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

The kernel will currently disable all SGX support if the hardware does
not support launch control.  Make it more permissive to allow SGX
virtualization on systems without Launch Control support.  This will
allow KVM to expose SGX to guests that have less-strict requirements on
the availability of flexible launch control.

Improve error message to distinguish between three cases.  There are two
cases where SGX support is completely disabled:
1) SGX has been disabled completely by the BIOS
2) SGX LC is locked by the BIOS.  Bare-metal support is disabled because
   of LC unavailability.  SGX virtualization is unavailable (because of
   Kconfig).
One where it is partially available:
3) SGX LC is locked by the BIOS.  Bare-metal support is disabled because
   of LC unavailability.  SGX virtualization is supported.
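
The three cases above reduce to a small decision table.  A minimal
userspace sketch follows; it is not the kernel code.  The enum, the
function name and the three boolean inputs are hypothetical stand-ins
for the FEAT_CTL bits and the Kconfig/VMX checks made in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome names; the kernel expresses these via CPU caps. */
enum demo_sgx_mode {
	DEMO_SGX_OFF,		/* cases 1 and 2: X86_FEATURE_SGX cleared */
	DEMO_SGX_KVM_ONLY,	/* case 3: only X86_FEATURE_SGX_LC cleared */
	DEMO_SGX_DRIVER,	/* LC usable: the bare-metal driver can run */
};

/*
 * sgx_enabled: FEAT_CTL_SGX_ENABLED is set
 * lc_enabled:  FEAT_CTL_SGX_LC_ENABLED is set
 * kvm_usable:  VMX is present and SGX virtualization is built in
 */
static enum demo_sgx_mode demo_classify_sgx(bool sgx_enabled, bool lc_enabled,
					    bool kvm_usable)
{
	if (!sgx_enabled)
		return DEMO_SGX_OFF;
	if (lc_enabled)
		return DEMO_SGX_DRIVER;
	return kvm_usable ? DEMO_SGX_KVM_ONLY : DEMO_SGX_OFF;
}
```

The sketch covers only the cases named in this message; it does not
model CPUs that lack the SGX_LC feature bit altogether.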

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Added 'enable_sgx_any', per Dave.
 - Changed to call clear_cpu_cap() directly, rather than using clear_sgx_caps()
   and clear_sgx_lc().
 - Changed to use CONFIG_X86_SGX_KVM, instead of CONFIG_X86_SGX_VIRTUALIZATION.

v1->v2:

 - Refined commit message per Dave's comments.
 - Added check to only enable SGX virtualization when VMX is supported, per
   Dave's comment.
 - Refined error msg print to explicitly call out SGX virtualization will be
   supported when LC is locked by BIOS, per Dave's comment.

---
 arch/x86/kernel/cpu/feat_ctl.c | 58 ++++++++++++++++++++++++++--------
 1 file changed, 45 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
index 27533a6e04fa..0fc202550fcc 100644
--- a/arch/x86/kernel/cpu/feat_ctl.c
+++ b/arch/x86/kernel/cpu/feat_ctl.c
@@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
 void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 {
 	bool tboot = tboot_enabled();
-	bool enable_sgx;
+	bool enable_vmx;
+	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;
 	u64 msr;
 
 	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
@@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 		return;
 	}
 
+	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
+		     IS_ENABLED(CONFIG_KVM_INTEL);
+
 	/*
-	 * Enable SGX if and only if the kernel supports SGX and Launch Control
-	 * is supported, i.e. disable SGX if the LE hash MSRs can't be written.
+	 * Enable SGX if and only if the kernel supports SGX.  Require Launch
+	 * Control support if SGX virtualization is *not* supported, i.e.
+	 * disable SGX if the LE hash MSRs can't be written and SGX can't be
+	 * exposed to a KVM guest (which might support non-LC configurations).
 	 */
-	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
-		     cpu_has(c, X86_FEATURE_SGX_LC) &&
-		     IS_ENABLED(CONFIG_X86_SGX);
+	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
+			 cpu_has(c, X86_FEATURE_SGX1) &&
+			 IS_ENABLED(CONFIG_X86_SGX);
+	enable_sgx_driver = enable_sgx_any &&
+			    cpu_has(c, X86_FEATURE_SGX_LC);
+	enable_sgx_kvm = enable_sgx_any && enable_vmx &&
+			  IS_ENABLED(CONFIG_X86_SGX_KVM);
 
 	if (msr & FEAT_CTL_LOCKED)
 		goto update_caps;
@@ -136,15 +146,18 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 	 * i.e. KVM is enabled, to avoid unnecessarily adding an attack vector
 	 * for the kernel, e.g. using VMX to hide malicious code.
 	 */
-	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
+	if (enable_vmx) {
 		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
 
 		if (tboot)
 			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
 	}
 
-	if (enable_sgx)
-		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
+	if (enable_sgx_kvm || enable_sgx_driver) {
+		msr |= FEAT_CTL_SGX_ENABLED;
+		if (enable_sgx_driver)
+			msr |= FEAT_CTL_SGX_LC_ENABLED;
+	}
 
 	wrmsrl(MSR_IA32_FEAT_CTL, msr);
 
@@ -167,10 +180,29 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
 	}
 
 update_sgx:
-	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
-	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
-		if (enable_sgx)
-			pr_err_once("SGX disabled by BIOS\n");
+	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
+		if (enable_sgx_kvm || enable_sgx_driver)
+			pr_err_once("SGX disabled by BIOS.\n");
 		clear_cpu_cap(c, X86_FEATURE_SGX);
+		return;
+	}
+
+	/*
+	 * VMX feature bit may be cleared due to being disabled in BIOS,
+	 * in which case SGX virtualization cannot be supported either.
+	 */
+	if (!cpu_has(c, X86_FEATURE_VMX) && enable_sgx_kvm) {
+		pr_err_once("SGX virtualization disabled due to lack of VMX.\n");
+		enable_sgx_kvm = 0;
+	}
+
+	if (!(msr & FEAT_CTL_SGX_LC_ENABLED) && enable_sgx_driver) {
+		if (!enable_sgx_kvm) {
+			pr_err_once("SGX Launch Control is locked. Disable SGX.\n");
+			clear_cpu_cap(c, X86_FEATURE_SGX);
+		} else {
+			pr_err_once("SGX Launch Control is locked. Support SGX virtualization only.\n");
+			clear_cpu_cap(c, X86_FEATURE_SGX_LC);
+		}
 	}
 }
-- 
2.29.2



* [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (7 preceding siblings ...)
  2021-01-26  9:30 ` [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26 17:03   ` Dave Hansen
  2021-01-30 14:45   ` Jarkko Sakkinen
  2021-01-26  9:31 ` [RFC PATCH v3 09/27] x86/sgx: Expose SGX architectural definitions to the kernel Kai Huang
                   ` (19 subsequent siblings)
  28 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

Modify sgx_init() to always try to initialize the virtual EPC driver,
even if the bare-metal SGX driver is disabled.  The bare-metal driver
might be disabled if SGX Launch Control is in locked mode, or not
supported in the hardware at all.  This allows (non-Linux) guests that
support non-LC configurations to use SGX.
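
The resulting policy is "fail only if both drivers fail".  Below is a
minimal userspace sketch of the expression used in the diff, assuming
two stub init functions (demo_init_ok() and demo_init_nodev() are
hypothetical and only count how often they are called).

```c
#include <assert.h>
#include <errno.h>

typedef int (*demo_init_fn)(void);

static int demo_calls;	/* how many init functions have run */

static int demo_init_ok(void)
{
	demo_calls++;
	return 0;
}

static int demo_init_nodev(void)
{
	demo_calls++;
	return -ENODEV;
}

/*
 * Same shape as the expression in sgx_init(): each result is reduced to
 * 0 or 1, and bitwise '&' (unlike '&&') does not short-circuit, so both
 * init functions always run.  The result is non-zero only if both fail.
 */
static int demo_init_both(demo_init_fn native_init, demo_init_fn vepc_init)
{
	return !!native_init() & !!vepc_init();
}
```

Note that the combined value is 0 or 1 rather than an errno, so a caller
can only tell that both initializations failed, not why.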

Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Changed from sgx_virt_epc_init() to sgx_vepc_init().

---
 arch/x86/kernel/cpu/sgx/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 21c2ffa13870..93d249f7bff3 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -12,6 +12,7 @@
 #include "driver.h"
 #include "encl.h"
 #include "encls.h"
+#include "virt.h"
 
 struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
 static int sgx_nr_epc_sections;
@@ -712,7 +713,8 @@ static int __init sgx_init(void)
 		goto err_page_cache;
 	}
 
-	ret = sgx_drv_init();
+	/* Success if the native *or* virtual EPC driver initialized cleanly. */
+	ret = !!sgx_drv_init() & !!sgx_vepc_init();
 	if (ret)
 		goto err_kthread;
 
-- 
2.29.2



* [RFC PATCH v3 09/27] x86/sgx: Expose SGX architectural definitions to the kernel
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (8 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-30 14:46   ` Jarkko Sakkinen
  2021-01-26  9:31 ` [RFC PATCH v3 10/27] x86/sgx: Move ENCLS leaf definitions to sgx_arch.h Kai Huang
                   ` (18 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Expose SGX architectural structures, as KVM will use many of the
architectural constants and structs to virtualize SGX.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Added "Expose SGX architectural structures, as..." to commit message,
   per Jarkko.

---
 arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx_arch.h} | 0
 arch/x86/kernel/cpu/sgx/encl.c                             | 2 +-
 arch/x86/kernel/cpu/sgx/sgx.h                              | 2 +-
 tools/testing/selftests/sgx/defines.h                      | 2 +-
 4 files changed, 3 insertions(+), 3 deletions(-)
 rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx_arch.h} (100%)

diff --git a/arch/x86/kernel/cpu/sgx/arch.h b/arch/x86/include/asm/sgx_arch.h
similarity index 100%
rename from arch/x86/kernel/cpu/sgx/arch.h
rename to arch/x86/include/asm/sgx_arch.h
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index a78b71447771..68941c349cfe 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -7,7 +7,7 @@
 #include <linux/shmem_fs.h>
 #include <linux/suspend.h>
 #include <linux/sched/mm.h>
-#include "arch.h"
+#include <asm/sgx_arch.h>
 #include "encl.h"
 #include "encls.h"
 #include "sgx.h"
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 5fa42d143feb..509f2af33e1d 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -8,7 +8,7 @@
 #include <linux/rwsem.h>
 #include <linux/types.h>
 #include <asm/asm.h>
-#include "arch.h"
+#include <asm/sgx_arch.h>
 
 #undef pr_fmt
 #define pr_fmt(fmt) "sgx: " fmt
diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
index 592c1ccf4576..4dd39a003f40 100644
--- a/tools/testing/selftests/sgx/defines.h
+++ b/tools/testing/selftests/sgx/defines.h
@@ -14,7 +14,7 @@
 #define __aligned(x) __attribute__((__aligned__(x)))
 #define __packed __attribute__((packed))
 
-#include "../../../../arch/x86/kernel/cpu/sgx/arch.h"
+#include "../../../../arch/x86/include/asm/sgx_arch.h"
 #include "../../../../arch/x86/include/asm/enclu.h"
 #include "../../../../arch/x86/include/uapi/asm/sgx.h"
 
-- 
2.29.2



* [RFC PATCH v3 10/27] x86/sgx: Move ENCLS leaf definitions to sgx_arch.h
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (9 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 09/27] x86/sgx: Expose SGX architectural definitions to the kernel Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 11/27] x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT) Kai Huang
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Move the ENCLS leaf definitions to sgx_arch.h so that they can be used
by KVM.  And because they're architectural.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/include/asm/sgx_arch.h | 15 +++++++++++++++
 arch/x86/kernel/cpu/sgx/encls.h | 15 ---------------
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/sgx_arch.h b/arch/x86/include/asm/sgx_arch.h
index abf99bb71fdc..3dbe7aacf552 100644
--- a/arch/x86/include/asm/sgx_arch.h
+++ b/arch/x86/include/asm/sgx_arch.h
@@ -22,6 +22,21 @@
 /* The bitmask for the EPC section type. */
 #define SGX_CPUID_EPC_MASK	GENMASK(3, 0)
 
+enum sgx_encls_function {
+	ECREATE	= 0x00,
+	EADD	= 0x01,
+	EINIT	= 0x02,
+	EREMOVE	= 0x03,
+	EDGBRD	= 0x04,
+	EDGBWR	= 0x05,
+	EEXTEND	= 0x06,
+	ELDU	= 0x08,
+	EBLOCK	= 0x09,
+	EPA	= 0x0A,
+	EWB	= 0x0B,
+	ETRACK	= 0x0C,
+};
+
 /**
  * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
  * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 443188fe7e70..be5c49689980 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -11,21 +11,6 @@
 #include <asm/traps.h>
 #include "sgx.h"
 
-enum sgx_encls_function {
-	ECREATE	= 0x00,
-	EADD	= 0x01,
-	EINIT	= 0x02,
-	EREMOVE	= 0x03,
-	EDGBRD	= 0x04,
-	EDGBWR	= 0x05,
-	EEXTEND	= 0x06,
-	ELDU	= 0x08,
-	EBLOCK	= 0x09,
-	EPA	= 0x0A,
-	EWB	= 0x0B,
-	ETRACK	= 0x0C,
-};
-
 /**
  * ENCLS_FAULT_FLAG - flag signifying an ENCLS return code is a trapnr
  *
-- 
2.29.2



* [RFC PATCH v3 11/27] x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT)
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (10 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 10/27] x86/sgx: Move ENCLS leaf definitions to sgx_arch.h Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 12/27] x86/sgx: Add encls_faulted() helper Kai Huang
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Define the ENCLS leafs that are available with SGX2, also referred to as
Enclave Dynamic Memory Management (EDMM).  The leafs will be used by KVM
to conditionally expose SGX2 capabilities to guests.
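
As an illustration of "conditionally expose", here is a minimal
userspace sketch of gating the three new leafs on a guest SGX2
capability.  The leaf numbers are the ones added by this patch; the
helper names and the policy itself are hypothetical and are not taken
from the KVM patches in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Leaf numbers as defined in sgx_arch.h by this series. */
enum demo_encls_leaf {
	DEMO_ETRACK	= 0x0C,
	DEMO_EAUG	= 0x0D,
	DEMO_EMODPR	= 0x0E,
	DEMO_EMODT	= 0x0F,
};

/* True for the ENCLS leafs that this patch adds for SGX2. */
static bool demo_is_sgx2_leaf(unsigned int leaf)
{
	return leaf == DEMO_EAUG || leaf == DEMO_EMODPR || leaf == DEMO_EMODT;
}

/* Hypothetical policy: SGX2 leafs require SGX2 to be exposed to the guest. */
static bool demo_guest_may_run_leaf(unsigned int leaf, bool guest_has_sgx2)
{
	return !demo_is_sgx2_leaf(leaf) || guest_has_sgx2;
}
```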

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/include/asm/sgx_arch.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/sgx_arch.h b/arch/x86/include/asm/sgx_arch.h
index 3dbe7aacf552..756c8dacc52a 100644
--- a/arch/x86/include/asm/sgx_arch.h
+++ b/arch/x86/include/asm/sgx_arch.h
@@ -35,6 +35,9 @@ enum sgx_encls_function {
 	EPA	= 0x0A,
 	EWB	= 0x0B,
 	ETRACK	= 0x0C,
+	EAUG	= 0x0D,
+	EMODPR	= 0x0E,
+	EMODT	= 0x0F,
 };
 
 /**
-- 
2.29.2



* [RFC PATCH v3 12/27] x86/sgx: Add encls_faulted() helper
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (11 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 11/27] x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT) Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-30 14:48   ` Jarkko Sakkinen
  2021-01-26  9:31 ` [RFC PATCH v3 13/27] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs Kai Huang
                   ` (15 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add a helper to extract the fault indicator from an encoded ENCLS return
value.  SGX virtualization will also need to detect ENCLS faults.
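
For readers unfamiliar with the encoding, a minimal userspace sketch
follows.  The DEMO_ constants mirror what I understand the kernel's
ENCLS_FAULT_FLAG, ENCLS_TRAPNR() and x86 trap numbers to be; treat the
concrete values as assumptions, since they are not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_ENCLS_FAULT_FLAG	0x40000000	/* assumed flag bit */
#define DEMO_ENCLS_TRAPNR(r)	((r) & ~DEMO_ENCLS_FAULT_FLAG)
#define DEMO_TRAP_GP		13		/* assumed #GP vector */
#define DEMO_TRAP_PF		14		/* assumed #PF vector */

/* True if the encoded return value carries a trap number. */
static bool demo_encls_faulted(int ret)
{
	return ret & DEMO_ENCLS_FAULT_FLAG;
}

/* Same logic as encls_failed(): a #PF is not treated as a failure. */
static bool demo_encls_failed(int ret)
{
	if (demo_encls_faulted(ret))
		return DEMO_ENCLS_TRAPNR(ret) != DEMO_TRAP_PF;

	return !!ret;
}
```

A value without the flag set is either 0 (success) or an SGX error code,
which is why demo_encls_failed() falls back to !!ret.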

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Changed commenting style for return value, per Jarkko.

---
 arch/x86/kernel/cpu/sgx/encls.h | 15 ++++++++++++++-
 arch/x86/kernel/cpu/sgx/ioctl.c |  2 +-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index be5c49689980..3219d011ee28 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -40,6 +40,19 @@
 	} while (0);							  \
 }
 
+/*
+ * encls_faulted() - Check if an ENCLS leaf faulted given an error code
+ * @ret:	the return value of an ENCLS leaf function call
+ *
+ * Return:
+ * - true:	ENCLS leaf faulted.
+ * - false:	Otherwise.
+ */
+static inline bool encls_faulted(int ret)
+{
+	return ret & ENCLS_FAULT_FLAG;
+}
+
 /**
  * encls_failed() - Check if an ENCLS function failed
  * @ret:	the return value of an ENCLS function call
@@ -50,7 +63,7 @@
  */
 static inline bool encls_failed(int ret)
 {
-	if (ret & ENCLS_FAULT_FLAG)
+	if (encls_faulted(ret))
 		return ENCLS_TRAPNR(ret) != X86_TRAP_PF;
 
 	return !!ret;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 90a5caf76939..e5977752c7be 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -568,7 +568,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
 		}
 	}
 
-	if (ret & ENCLS_FAULT_FLAG) {
+	if (encls_faulted(ret)) {
 		if (encls_failed(ret))
 			ENCLS_WARN(ret, "EINIT");
 
-- 
2.29.2



* [RFC PATCH v3 13/27] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (12 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 12/27] x86/sgx: Add encls_faulted() helper Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-30 14:49   ` Jarkko Sakkinen
  2021-01-26  9:31 ` [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM Kai Huang
                   ` (14 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

Add a helper to update SGX_LEPUBKEYHASHn MSRs.  SGX virtualization also
needs to update those MSRs based on guest's "virtual" SGX_LEPUBKEYHASHn
before EINIT from guest.
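
To illustrate why both users must rewrite the MSRs immediately before
EINIT, here is a minimal userspace sketch in which the four hash MSRs
are modelled as a plain array.  demo_wrmsrl() and the MSR base index
0x8C are assumptions made for the sketch, not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MSR_LEPUBKEYHASH0	0x8C	/* assumed index of the first MSR */
#define DEMO_NR_HASH_MSRS	4

/* Stand-in for the hardware MSRs, shared by host and guest users. */
static uint64_t demo_msrs[DEMO_NR_HASH_MSRS];

/* Hypothetical replacement for wrmsrl(): store into the array. */
static void demo_wrmsrl(uint32_t msr, uint64_t val)
{
	assert(msr >= DEMO_MSR_LEPUBKEYHASH0 &&
	       msr < DEMO_MSR_LEPUBKEYHASH0 + DEMO_NR_HASH_MSRS);
	demo_msrs[msr - DEMO_MSR_LEPUBKEYHASH0] = val;
}

/* Same loop as sgx_update_lepubkeyhash(): four consecutive MSRs. */
static void demo_update_lepubkeyhash(const uint64_t *lepubkeyhash)
{
	int i;

	for (i = 0; i < DEMO_NR_HASH_MSRS; i++)
		demo_wrmsrl(DEMO_MSR_LEPUBKEYHASH0 + i, lepubkeyhash[i]);
}
```

Because the registers are shared, whichever user wrote last wins, so
neither the bare-metal path nor the KVM path can assume its own values
are still in place when it runs EINIT.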

Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Added comment for sgx_update_lepubkeyhash(), per Jarkko and Dave.

---
 arch/x86/kernel/cpu/sgx/ioctl.c |  5 ++---
 arch/x86/kernel/cpu/sgx/main.c  | 15 +++++++++++++++
 arch/x86/kernel/cpu/sgx/sgx.h   |  2 ++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index e5977752c7be..1bae754268d1 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -495,7 +495,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
 			 void *token)
 {
 	u64 mrsigner[4];
-	int i, j, k;
+	int i, j;
 	void *addr;
 	int ret;
 
@@ -544,8 +544,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
 
 			preempt_disable();
 
-			for (k = 0; k < 4; k++)
-				wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + k, mrsigner[k]);
+			sgx_update_lepubkeyhash(mrsigner);
 
 			ret = __einit(sigstruct, token, addr);
 
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 93d249f7bff3..b456899a9532 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -697,6 +697,21 @@ static bool __init sgx_page_cache_init(void)
 	return true;
 }
 
+
+/*
+ * Update the SGX_LEPUBKEYHASH MSRs to the values specified by the caller.
+ * The bare-metal driver must update them to the hash of the enclave's
+ * signer before EINIT.  KVM must update them to the guest's virtual MSR
+ * values before doing EINIT on behalf of the guest.
+ */
+void sgx_update_lepubkeyhash(u64 *lepubkeyhash)
+{
+	int i;
+
+	for (i = 0; i < 4; i++)
+		wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, lepubkeyhash[i]);
+}
+
 static int __init sgx_init(void)
 {
 	int ret;
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 509f2af33e1d..ccd4f145c464 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -83,4 +83,6 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
 
+void sgx_update_lepubkeyhash(u64 *lepubkeyhash);
+
 #endif /* _X86_SGX_H */
-- 
2.29.2



* [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (13 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 13/27] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-30 14:51   ` Jarkko Sakkinen
  2021-02-04  3:53   ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 15/27] x86/sgx: Move provisioning device creation out of SGX driver Kai Huang
                   ` (13 subsequent siblings)
  28 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

The bare-metal kernel must intercept ECREATE to be able to impose policies
on guests.  When it does this, the bare-metal kernel runs ECREATE against
the userspace mapping of the virtualized EPC.

Provide wrappers around __ecreate() and __einit() to hide the ugliness
of overloading the ENCLS return value to encode multiple error formats
in a single int.  KVM will trap-and-execute ECREATE and EINIT as part
of SGX virtualization, and on an exception, KVM needs the trapnr so that
it can inject the correct fault into the guest.
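
The contract the ECREATE wrapper offers to its caller can be sketched
in userspace as follows.  demo_decode_ecreate() is a hypothetical name
that models only the decoding step at the end of sgx_virt_ecreate(),
and the flag value is an assumption about the kernel's
ENCLS_FAULT_FLAG.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_ENCLS_FAULT_FLAG	0x40000000	/* assumed flag bit */
#define DEMO_ENCLS_TRAPNR(r)	((r) & ~DEMO_ENCLS_FAULT_FLAG)

/*
 * @raw stands for the encoded value returned by __ecreate().  On a
 * fault, the trap number is split out into @trapnr and the caller gets
 * a plain -EFAULT; otherwise @trapnr is left untouched.
 */
static int demo_decode_ecreate(int raw, int *trapnr)
{
	if (raw & DEMO_ENCLS_FAULT_FLAG) {
		*trapnr = DEMO_ENCLS_TRAPNR(raw);
		return -EFAULT;
	}

	return 0;
}
```

A caller such as KVM would then check for -EFAULT and inject the vector
stored in trapnr into the guest, without ever seeing the combined
encoding.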

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Added kdoc for sgx_virt_ecreate() and sgx_virt_einit(), per Jarkko.
 - Changed to use CONFIG_X86_SGX_KVM.

---
 arch/x86/include/asm/sgx.h     | 16 ++++++
 arch/x86/kernel/cpu/sgx/virt.c | 93 ++++++++++++++++++++++++++++++++++
 2 files changed, 109 insertions(+)
 create mode 100644 arch/x86/include/asm/sgx.h

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
new file mode 100644
index 000000000000..8a3ea3e1efbe
--- /dev/null
+++ b/arch/x86/include/asm/sgx.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_SGX_H
+#define _ASM_X86_SGX_H
+
+#include <linux/types.h>
+
+#ifdef CONFIG_X86_SGX_KVM
+struct sgx_pageinfo;
+
+int sgx_virt_ecreate(struct sgx_pageinfo *pageinfo, void __user *secs,
+		     int *trapnr);
+int sgx_virt_einit(void __user *sigstruct, void __user *token,
+		   void __user *secs, u64 *lepubkeyhash, int *trapnr);
+#endif
+
+#endif /* _ASM_X86_SGX_H */
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index e1ad7856d878..0f5b0e4e33dd 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -252,3 +252,96 @@ int __init sgx_vepc_init(void)
 
 	return misc_register(&sgx_vepc_dev);
 }
+
+/**
+ * sgx_virt_ecreate() - Run ECREATE on behalf of guest
+ * @pageinfo:	Pointer to PAGEINFO structure
+ * @secs:	Userspace pointer to SECS page
+ * @trapnr:	trap number injected to guest in case of ECREATE error
+ *
+ * Run ECREATE on behalf of guest after KVM traps ECREATE for the purpose
+ * of enforcing policies of guest's enclaves, and return the trap number
+ * which should be injected to guest in case of any ECREATE error.
+ *
+ * Return:
+ * - 0: 	ECREATE was successful.
+ * - -EFAULT:	ECREATE returned error.
+ */
+int sgx_virt_ecreate(struct sgx_pageinfo *pageinfo, void __user *secs,
+		     int *trapnr)
+{
+	int ret;
+
+	/*
+	 * @secs is a userspace address, and it's not guaranteed that @secs
+	 * points at an actual EPC page. It's also possible to generate a
+	 * kernel mapping to the physical EPC page by resolving the PFN, but
+	 * using __uaccess_xx() is simpler.
+	 */
+	__uaccess_begin();
+	ret = __ecreate(pageinfo, (void *)secs);
+	__uaccess_end();
+
+	if (encls_faulted(ret)) {
+		*trapnr = ENCLS_TRAPNR(ret);
+		return -EFAULT;
+	}
+
+	/* ECREATE doesn't return an error code, it faults or succeeds. */
+	WARN_ON_ONCE(ret);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(sgx_virt_ecreate);
+
+static int __sgx_virt_einit(void __user *sigstruct, void __user *token,
+			    void __user *secs)
+{
+	int ret;
+
+	__uaccess_begin();
+	ret =  __einit((void *)sigstruct, (void *)token, (void *)secs);
+	__uaccess_end();
+	return ret;
+}
+
+/**
+ * sgx_virt_einit() - Run EINIT on behalf of guest
+ * @sigstruct:		Userspace pointer to SIGSTRUCT structure
+ * @token:		Userspace pointer to EINITTOKEN structure
+ * @secs:		Userspace pointer to SECS page
+ * @lepubkeyhash:	Pointer to guest's *virtual* SGX_LEPUBKEYHASH MSR
+ * 			values
+ * @trapnr:		trap number injected to guest in case of EINIT error
+ *
+ * Run EINIT on behalf of guest after KVM traps EINIT. If SGX_LC is available
+ * in host, bare-metal driver may rewrite the hardware values, therefore KVM
+ * needs to update hardware values to guest's virtual MSR values in order to
+ * ensure EINIT is executed with expected hardware values.
+ *
+ * Return:
+ * - 0: 	EINIT was successful.
+ * - -EFAULT:	EINIT returned error.
+ */
+int sgx_virt_einit(void __user *sigstruct, void __user *token,
+		   void __user *secs, u64 *lepubkeyhash, int *trapnr)
+{
+	int ret;
+
+	if (!boot_cpu_has(X86_FEATURE_SGX_LC)) {
+		ret = __sgx_virt_einit(sigstruct, token, secs);
+	} else {
+		preempt_disable();
+
+		sgx_update_lepubkeyhash(lepubkeyhash);
+
+		ret = __sgx_virt_einit(sigstruct, token, secs);
+		preempt_enable();
+	}
+
+	if (encls_faulted(ret)) {
+		*trapnr = ENCLS_TRAPNR(ret);
+		return -EFAULT;
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(sgx_virt_einit);
-- 
2.29.2



* [RFC PATCH v3 15/27] x86/sgx: Move provisioning device creation out of SGX driver
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (14 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-30 14:52   ` Jarkko Sakkinen
  2021-01-26  9:31 ` [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union Kai Huang
                   ` (12 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Extract sgx_set_attribute() out of sgx_ioc_enclave_provision() and
export it for KVM to use.

The provisioning key is sensitive.  The SGX driver only allows creating
an enclave that can access the provisioning key when the enclave creator
has permission to open /dev/sgx_provision.  The same restriction should
apply to VMs: the provisioning key is platform specific, so an
unrestricted VM could also potentially compromise it.

Move provisioning device creation out of sgx_drv_init() to sgx_init() as
preparation for adding SGX virtualization support, so that even if the
SGX driver is not enabled because flexible launch control is not
available, SGX virtualization can still be enabled and can use the
device to restrict a VM's ability to access the provisioning key.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
v2->v3:

 - Added kdoc for sgx_set_attribute(), per Jarkko.

---
 arch/x86/include/asm/sgx.h       |  3 ++
 arch/x86/kernel/cpu/sgx/driver.c | 17 ----------
 arch/x86/kernel/cpu/sgx/ioctl.c  | 16 ++-------
 arch/x86/kernel/cpu/sgx/main.c   | 58 +++++++++++++++++++++++++++++++-
 4 files changed, 62 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index 8a3ea3e1efbe..d67afb051db3 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -4,6 +4,9 @@
 
 #include <linux/types.h>
 
+int sgx_set_attribute(unsigned long *allowed_attributes,
+		      unsigned int attribute_fd);
+
 #ifdef CONFIG_X86_SGX_KVM
 struct sgx_pageinfo;
 
diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index f2eac41bb4ff..4f3241109bda 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -133,10 +133,6 @@ static const struct file_operations sgx_encl_fops = {
 	.get_unmapped_area	= sgx_get_unmapped_area,
 };
 
-const struct file_operations sgx_provision_fops = {
-	.owner			= THIS_MODULE,
-};
-
 static struct miscdevice sgx_dev_enclave = {
 	.minor = MISC_DYNAMIC_MINOR,
 	.name = "sgx_enclave",
@@ -144,13 +140,6 @@ static struct miscdevice sgx_dev_enclave = {
 	.fops = &sgx_encl_fops,
 };
 
-static struct miscdevice sgx_dev_provision = {
-	.minor = MISC_DYNAMIC_MINOR,
-	.name = "sgx_provision",
-	.nodename = "sgx_provision",
-	.fops = &sgx_provision_fops,
-};
-
 int __init sgx_drv_init(void)
 {
 	unsigned int eax, ebx, ecx, edx;
@@ -184,11 +173,5 @@ int __init sgx_drv_init(void)
 	if (ret)
 		return ret;
 
-	ret = misc_register(&sgx_dev_provision);
-	if (ret) {
-		misc_deregister(&sgx_dev_enclave);
-		return ret;
-	}
-
 	return 0;
 }
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 1bae754268d1..4714de12422d 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -2,6 +2,7 @@
 /*  Copyright(c) 2016-20 Intel Corporation. */
 
 #include <asm/mman.h>
+#include <asm/sgx.h>
 #include <linux/mman.h>
 #include <linux/delay.h>
 #include <linux/file.h>
@@ -664,24 +665,11 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
 static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
 {
 	struct sgx_enclave_provision params;
-	struct file *file;
 
 	if (copy_from_user(&params, arg, sizeof(params)))
 		return -EFAULT;
 
-	file = fget(params.fd);
-	if (!file)
-		return -EINVAL;
-
-	if (file->f_op != &sgx_provision_fops) {
-		fput(file);
-		return -EINVAL;
-	}
-
-	encl->attributes_mask |= SGX_ATTR_PROVISIONKEY;
-
-	fput(file);
-	return 0;
+	return sgx_set_attribute(&encl->attributes_mask, params.fd);
 }
 
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index b456899a9532..fba3eaf2ae26 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -1,14 +1,18 @@
 // SPDX-License-Identifier: GPL-2.0
 /*  Copyright(c) 2016-20 Intel Corporation. */
 
+#include <linux/file.h>
 #include <linux/freezer.h>
 #include <linux/highmem.h>
 #include <linux/kthread.h>
+#include <linux/miscdevice.h>
 #include <linux/pagemap.h>
 #include <linux/ratelimit.h>
 #include <linux/sched/mm.h>
 #include <linux/sched/signal.h>
 #include <linux/slab.h>
+#include <asm/sgx_arch.h>
+#include <asm/sgx.h>
 #include "driver.h"
 #include "encl.h"
 #include "encls.h"
@@ -712,6 +716,51 @@ void sgx_update_lepubkeyhash(u64 *lepubkeyhash)
 		wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, lepubkeyhash[i]);
 }
 
+const struct file_operations sgx_provision_fops = {
+	.owner			= THIS_MODULE,
+};
+
+static struct miscdevice sgx_dev_provision = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "sgx_provision",
+	.nodename = "sgx_provision",
+	.fops = &sgx_provision_fops,
+};
+
+/**
+ * sgx_set_attribute() - Update allowed attributes given file descriptor
+ * @allowed_attributes: 	Pointer to allowed enclave attributes
+ * @attribute_fd:		File descriptor for specific attribute
+ *
+ * Append enclave attribute indicated by file descriptor to allowed
+ * attributes. Currently only SGX_ATTR_PROVISIONKEY indicated by
+ * /dev/sgx_provision is supported.
+ *
+ * Return:
+ * - 0:	SGX_ATTR_PROVISIONKEY is appended to allowed_attributes
+ * - -EINVAL:	Invalid, or not supported file descriptor
+ */
+int sgx_set_attribute(unsigned long *allowed_attributes,
+		      unsigned int attribute_fd)
+{
+	struct file *file;
+
+	file = fget(attribute_fd);
+	if (!file)
+		return -EINVAL;
+
+	if (file->f_op != &sgx_provision_fops) {
+		fput(file);
+		return -EINVAL;
+	}
+
+	*allowed_attributes |= SGX_ATTR_PROVISIONKEY;
+
+	fput(file);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(sgx_set_attribute);
+
 static int __init sgx_init(void)
 {
 	int ret;
@@ -728,13 +777,20 @@ static int __init sgx_init(void)
 		goto err_page_cache;
 	}
 
+	ret = misc_register(&sgx_dev_provision);
+	if (ret)
+		goto err_kthread;
+
 	/* Success if the native *or* virtual EPC driver initialized cleanly. */
 	ret = !!sgx_drv_init() & !!sgx_vepc_init();
 	if (ret)
-		goto err_kthread;
+		goto err_provision;
 
 	return 0;
 
+err_provision:
+	misc_deregister(&sgx_dev_provision);
+
 err_kthread:
 	kthread_stop(ksgxd_tsk);
 
-- 
2.29.2



* [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (15 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 15/27] x86/sgx: Move provisioning device creation out of SGX driver Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-30 15:00   ` Jarkko Sakkinen
  2021-01-26  9:31 ` [RFC PATCH v3 17/27] KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX) Kai Huang
                   ` (11 subsequent siblings)
  28 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
bits 15:0, and single-bit modifiers in bits 31:16.

Historically, KVM has only had to worry about handling the "failed
VM-Entry" modifier, which could only be set in very specific flows and
required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
bit was a somewhat viable approach.  But even with only a single bit to
worry about, KVM has had several bugs related to comparing a basic exit
reason against the full exit reason stored in vcpu_vmx.

Upcoming Intel features, e.g. SGX, will add new modifier bits that can
be set on more or less any VM-Exit, as opposed to the significantly more
restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
flows isn't scalable.  Tracking exit reason in a union forces code to
explicitly choose between consuming the full exit reason and the basic
exit reason, and is a convenient way to document and access the
modifiers.
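
As a rough userspace sketch of the bug class described above (not code from
this patch): exit_reason_model is a reduced, hypothetical version of the
union, and the bit-field layout shown assumes a little-endian target built
with GCC or Clang, where the first bit-field occupies the low bits.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 of the exit reason: the "failed VM-Entry" modifier. */
#define FAILED_VMENTRY_FLAG 0x80000000u

/* Reduced model of the union: only the fields needed for the example. */
union exit_reason_model {
	struct {
		uint32_t basic          : 16;	/* bits 15:0 */
		uint32_t reserved       : 15;	/* bits 30:16, modifiers elided */
		uint32_t failed_vmentry : 1;	/* bit 31 */
	};
	uint32_t full;
};

static_assert(sizeof(union exit_reason_model) == sizeof(uint32_t),
	      "the union must stay 32 bits wide");

/* The old pattern: compares the full value, so any modifier bit breaks it. */
static inline int buggy_match(uint32_t full, uint16_t basic)
{
	return full == basic;
}

/* With the union, the caller has to name the field it wants to compare. */
static inline int union_match(union exit_reason_model reason, uint16_t basic)
{
	return reason.basic == basic;
}
```

With a modifier bit set, buggy_match() misses a basic reason that
union_match() still finds, which is the mismatch the union is meant to make
hard to write by accident.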

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/vmx/nested.c | 42 +++++++++++++++---------
 arch/x86/kvm/vmx/vmx.c    | 68 ++++++++++++++++++++-------------------
 arch/x86/kvm/vmx/vmx.h    | 25 +++++++++++++-
 3 files changed, 86 insertions(+), 49 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 0fbb46990dfc..f112c2482887 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3311,7 +3311,11 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 	enum vm_entry_failure_code entry_failure_code;
 	bool evaluate_pending_interrupts;
-	u32 exit_reason, failed_index;
+	u32 failed_index;
+	union vmx_exit_reason exit_reason = {
+		.basic = -1,
+		.failed_vmentry = 1,
+	};
 
 	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
 		kvm_vcpu_flush_tlb_current(vcpu);
@@ -3363,7 +3367,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 
 		if (nested_vmx_check_guest_state(vcpu, vmcs12,
 						 &entry_failure_code)) {
-			exit_reason = EXIT_REASON_INVALID_STATE;
+			exit_reason.basic = EXIT_REASON_INVALID_STATE;
 			vmcs12->exit_qualification = entry_failure_code;
 			goto vmentry_fail_vmexit;
 		}
@@ -3374,7 +3378,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 		vcpu->arch.tsc_offset += vmcs12->tsc_offset;
 
 	if (prepare_vmcs02(vcpu, vmcs12, &entry_failure_code)) {
-		exit_reason = EXIT_REASON_INVALID_STATE;
+		exit_reason.basic = EXIT_REASON_INVALID_STATE;
 		vmcs12->exit_qualification = entry_failure_code;
 		goto vmentry_fail_vmexit_guest_mode;
 	}
@@ -3384,7 +3388,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 						   vmcs12->vm_entry_msr_load_addr,
 						   vmcs12->vm_entry_msr_load_count);
 		if (failed_index) {
-			exit_reason = EXIT_REASON_MSR_LOAD_FAIL;
+			exit_reason.basic = EXIT_REASON_MSR_LOAD_FAIL;
 			vmcs12->exit_qualification = failed_index;
 			goto vmentry_fail_vmexit_guest_mode;
 		}
@@ -3452,7 +3456,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 		return NVMX_VMENTRY_VMEXIT;
 
 	load_vmcs12_host_state(vcpu, vmcs12);
-	vmcs12->vm_exit_reason = exit_reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
+	vmcs12->vm_exit_reason = exit_reason.full;
 	if (enable_shadow_vmcs || vmx->nested.hv_evmcs)
 		vmx->nested.need_vmcs12_to_shadow_sync = true;
 	return NVMX_VMENTRY_VMEXIT;
@@ -5540,7 +5544,12 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
 	return kvm_skip_emulated_instruction(vcpu);
 
 fail:
-	nested_vmx_vmexit(vcpu, vmx->exit_reason,
+	/*
+	 * This is effectively a reflected VM-Exit, as opposed to a synthesized
+	 * nested VM-Exit.  Pass the original exit reason, i.e. don't hardcode
+	 * EXIT_REASON_VMFUNC as the exit reason.
+	 */
+	nested_vmx_vmexit(vcpu, vmx->exit_reason.full,
 			  vmx_get_intr_info(vcpu),
 			  vmx_get_exit_qual(vcpu));
 	return 1;
@@ -5608,7 +5617,8 @@ static bool nested_vmx_exit_handled_io(struct kvm_vcpu *vcpu,
  * MSR bitmap. This may be the case even when L0 doesn't use MSR bitmaps.
  */
 static bool nested_vmx_exit_handled_msr(struct kvm_vcpu *vcpu,
-	struct vmcs12 *vmcs12, u32 exit_reason)
+					struct vmcs12 *vmcs12,
+					union vmx_exit_reason exit_reason)
 {
 	u32 msr_index = kvm_rcx_read(vcpu);
 	gpa_t bitmap;
@@ -5622,7 +5632,7 @@ static bool nested_vmx_exit_handled_msr(struct kvm_vcpu *vcpu,
 	 * First we need to figure out which of the four to use:
 	 */
 	bitmap = vmcs12->msr_bitmap;
-	if (exit_reason == EXIT_REASON_MSR_WRITE)
+	if (exit_reason.basic == EXIT_REASON_MSR_WRITE)
 		bitmap += 2048;
 	if (msr_index >= 0xc0000000) {
 		msr_index -= 0xc0000000;
@@ -5759,11 +5769,12 @@ static bool nested_vmx_exit_handled_mtf(struct vmcs12 *vmcs12)
  * Return true if L0 wants to handle an exit from L2 regardless of whether or not
  * L1 wants the exit.  Only call this when in is_guest_mode (L2).
  */
-static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
+static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
+				     union vmx_exit_reason exit_reason)
 {
 	u32 intr_info;
 
-	switch ((u16)exit_reason) {
+	switch (exit_reason.basic) {
 	case EXIT_REASON_EXCEPTION_NMI:
 		intr_info = vmx_get_intr_info(vcpu);
 		if (is_nmi(intr_info))
@@ -5819,12 +5830,13 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
  * Return 1 if L1 wants to intercept an exit from L2.  Only call this when in
  * is_guest_mode (L2).
  */
-static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
+static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
+				     union vmx_exit_reason exit_reason)
 {
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 	u32 intr_info;
 
-	switch ((u16)exit_reason) {
+	switch (exit_reason.basic) {
 	case EXIT_REASON_EXCEPTION_NMI:
 		intr_info = vmx_get_intr_info(vcpu);
 		if (is_nmi(intr_info))
@@ -5943,7 +5955,7 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
 bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	u32 exit_reason = vmx->exit_reason;
+	union vmx_exit_reason exit_reason = vmx->exit_reason;
 	unsigned long exit_qual;
 	u32 exit_intr_info;
 
@@ -5962,7 +5974,7 @@ bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
 		goto reflect_vmexit;
 	}
 
-	trace_kvm_nested_vmexit(exit_reason, vcpu, KVM_ISA_VMX);
+	trace_kvm_nested_vmexit(exit_reason.full, vcpu, KVM_ISA_VMX);
 
 	/* If L0 (KVM) wants the exit, it trumps L1's desires. */
 	if (nested_vmx_l0_wants_exit(vcpu, exit_reason))
@@ -5988,7 +6000,7 @@ bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
 	exit_qual = vmx_get_exit_qual(vcpu);
 
 reflect_vmexit:
-	nested_vmx_vmexit(vcpu, exit_reason, exit_intr_info, exit_qual);
+	nested_vmx_vmexit(vcpu, exit_reason.full, exit_intr_info, exit_qual);
 	return true;
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2af05d3b0590..746b87375aff 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1577,7 +1577,7 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	 * i.e. we end up advancing IP with some random value.
 	 */
 	if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
-	    to_vmx(vcpu)->exit_reason != EXIT_REASON_EPT_MISCONFIG) {
+	    to_vmx(vcpu)->exit_reason.basic != EXIT_REASON_EPT_MISCONFIG) {
 		orig_rip = kvm_rip_read(vcpu);
 		rip = orig_rip + vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
 #ifdef CONFIG_X86_64
@@ -5667,7 +5667,7 @@ static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2,
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
 	*info1 = vmx_get_exit_qual(vcpu);
-	if (!(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) {
+	if (!vmx->exit_reason.failed_vmentry) {
 		*info2 = vmx->idt_vectoring_info;
 		*intr_info = vmx_get_intr_info(vcpu);
 		if (is_exception_with_error_code(*intr_info))
@@ -5911,8 +5911,9 @@ void dump_vmcs(void)
 static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	u32 exit_reason = vmx->exit_reason;
+	union vmx_exit_reason exit_reason = vmx->exit_reason;
 	u32 vectoring_info = vmx->idt_vectoring_info;
+	u16 exit_handler_index;
 
 	/*
 	 * Flush logged GPAs PML buffer, this will make dirty_bitmap more
@@ -5954,11 +5955,11 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 			return 1;
 	}
 
-	if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
+	if (exit_reason.failed_vmentry) {
 		dump_vmcs();
 		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
 		vcpu->run->fail_entry.hardware_entry_failure_reason
-			= exit_reason;
+			= exit_reason.full;
 		vcpu->run->fail_entry.cpu = vcpu->arch.last_vmentry_cpu;
 		return 0;
 	}
@@ -5980,18 +5981,18 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	 * will cause infinite loop.
 	 */
 	if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
-			(exit_reason != EXIT_REASON_EXCEPTION_NMI &&
-			exit_reason != EXIT_REASON_EPT_VIOLATION &&
-			exit_reason != EXIT_REASON_PML_FULL &&
-			exit_reason != EXIT_REASON_APIC_ACCESS &&
-			exit_reason != EXIT_REASON_TASK_SWITCH)) {
+	    (exit_reason.basic != EXIT_REASON_EXCEPTION_NMI &&
+	     exit_reason.basic != EXIT_REASON_EPT_VIOLATION &&
+	     exit_reason.basic != EXIT_REASON_PML_FULL &&
+	     exit_reason.basic != EXIT_REASON_APIC_ACCESS &&
+	     exit_reason.basic != EXIT_REASON_TASK_SWITCH)) {
 		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV;
 		vcpu->run->internal.ndata = 3;
 		vcpu->run->internal.data[0] = vectoring_info;
-		vcpu->run->internal.data[1] = exit_reason;
+		vcpu->run->internal.data[1] = exit_reason.full;
 		vcpu->run->internal.data[2] = vcpu->arch.exit_qualification;
-		if (exit_reason == EXIT_REASON_EPT_MISCONFIG) {
+		if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG) {
 			vcpu->run->internal.ndata++;
 			vcpu->run->internal.data[3] =
 				vmcs_read64(GUEST_PHYSICAL_ADDRESS);
@@ -6023,38 +6024,39 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	if (exit_fastpath != EXIT_FASTPATH_NONE)
 		return 1;
 
-	if (exit_reason >= kvm_vmx_max_exit_handlers)
+	if (exit_reason.basic >= kvm_vmx_max_exit_handlers)
 		goto unexpected_vmexit;
 #ifdef CONFIG_RETPOLINE
-	if (exit_reason == EXIT_REASON_MSR_WRITE)
+	if (exit_reason.basic == EXIT_REASON_MSR_WRITE)
 		return kvm_emulate_wrmsr(vcpu);
-	else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
+	else if (exit_reason.basic == EXIT_REASON_PREEMPTION_TIMER)
 		return handle_preemption_timer(vcpu);
-	else if (exit_reason == EXIT_REASON_INTERRUPT_WINDOW)
+	else if (exit_reason.basic == EXIT_REASON_INTERRUPT_WINDOW)
 		return handle_interrupt_window(vcpu);
-	else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+	else if (exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT)
 		return handle_external_interrupt(vcpu);
-	else if (exit_reason == EXIT_REASON_HLT)
+	else if (exit_reason.basic == EXIT_REASON_HLT)
 		return kvm_emulate_halt(vcpu);
-	else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
+	else if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG)
 		return handle_ept_misconfig(vcpu);
 #endif
 
-	exit_reason = array_index_nospec(exit_reason,
-					 kvm_vmx_max_exit_handlers);
-	if (!kvm_vmx_exit_handlers[exit_reason])
+	exit_handler_index = array_index_nospec((u16)exit_reason.basic,
+						kvm_vmx_max_exit_handlers);
+	if (!kvm_vmx_exit_handlers[exit_handler_index])
 		goto unexpected_vmexit;
 
-	return kvm_vmx_exit_handlers[exit_reason](vcpu);
+	return kvm_vmx_exit_handlers[exit_handler_index](vcpu);
 
 unexpected_vmexit:
-	vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n", exit_reason);
+	vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
+		    exit_reason.full);
 	dump_vmcs();
 	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
 	vcpu->run->internal.suberror =
 			KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
 	vcpu->run->internal.ndata = 2;
-	vcpu->run->internal.data[0] = exit_reason;
+	vcpu->run->internal.data[0] = exit_reason.full;
 	vcpu->run->internal.data[1] = vcpu->arch.last_vmentry_cpu;
 	return 0;
 }
@@ -6373,9 +6375,9 @@ static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
-	if (vmx->exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+	if (vmx->exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT)
 		handle_external_interrupt_irqoff(vcpu);
-	else if (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI)
+	else if (vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI)
 		handle_exception_nmi_irqoff(vmx);
 }
 
@@ -6567,7 +6569,7 @@ void noinstr vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
 
 static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 {
-	switch (to_vmx(vcpu)->exit_reason) {
+	switch (to_vmx(vcpu)->exit_reason.basic) {
 	case EXIT_REASON_MSR_WRITE:
 		return handle_fastpath_set_msr_irqoff(vcpu);
 	case EXIT_REASON_PREEMPTION_TIMER:
@@ -6766,17 +6768,17 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	vmx->idt_vectoring_info = 0;
 
 	if (unlikely(vmx->fail)) {
-		vmx->exit_reason = 0xdead;
+		vmx->exit_reason.full = 0xdead;
 		return EXIT_FASTPATH_NONE;
 	}
 
-	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
-	if (unlikely((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY))
+	vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
+	if (unlikely(vmx->exit_reason.basic == EXIT_REASON_MCE_DURING_VMENTRY))
 		kvm_machine_check();
 
-	trace_kvm_exit(vmx->exit_reason, vcpu, KVM_ISA_VMX);
+	trace_kvm_exit(vmx->exit_reason.full, vcpu, KVM_ISA_VMX);
 
-	if (unlikely(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
+	if (unlikely(vmx->exit_reason.failed_vmentry))
 		return EXIT_FASTPATH_NONE;
 
 	vmx->loaded_vmcs->launched = 1;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 9d3a557949ac..903f246b5abd 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -70,6 +70,29 @@ struct pt_desc {
 	struct pt_ctx guest;
 };
 
+union vmx_exit_reason {
+	struct {
+		u32	basic			: 16;
+		u32	reserved16		: 1;
+		u32	reserved17		: 1;
+		u32	reserved18		: 1;
+		u32	reserved19		: 1;
+		u32	reserved20		: 1;
+		u32	reserved21		: 1;
+		u32	reserved22		: 1;
+		u32	reserved23		: 1;
+		u32	reserved24		: 1;
+		u32	reserved25		: 1;
+		u32	reserved26		: 1;
+		u32	sgx_enclave_mode	: 1;
+		u32	smi_pending_mtf		: 1;
+		u32	smi_from_vmx_root	: 1;
+		u32	reserved30		: 1;
+		u32	failed_vmentry		: 1;
+	};
+	u32 full;
+};
+
 /*
  * The nested_vmx structure is part of vcpu_vmx, and holds information we need
  * for correct emulation of VMX (i.e., nested VMX) on this vcpu.
@@ -244,7 +267,7 @@ struct vcpu_vmx {
 	int vpid;
 	bool emulation_required;
 
-	u32 exit_reason;
+	union vmx_exit_reason exit_reason;
 
 	/* Posted interrupt descriptor */
 	struct pi_desc pi_desc;
-- 
2.29.2



* [RFC PATCH v3 17/27] KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX)
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (16 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 18/27] KVM: x86: Define new #PF SGX error code bit Kai Huang
                   ` (10 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Export the gva_to_gpa() helpers for use by SGX virtualization when
executing ENCLS[ECREATE] and ENCLS[EINIT] on behalf of the guest.
To execute ECREATE and EINIT, KVM must obtain the GPA of the target
Secure Enclave Control Structure (SECS) in order to get its
corresponding HVA.

Because the SECS must reside in the Enclave Page Cache (EPC), copying
the SECS's data to a host-controlled buffer via existing exported
helpers is not a viable option as the EPC is not readable or writable
by the kernel.

SGX virtualization will also use gva_to_gpa() to obtain HVAs for
non-EPC pages in order to pass user pointers directly to ECREATE and
EINIT, which avoids having to copy pages worth of data into the kernel.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/x86.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9a8969a6dd06..5ca7b181a3ae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5891,6 +5891,7 @@ gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva,
 	u32 access = (kvm_x86_ops.get_cpl(vcpu) == 3) ? PFERR_USER_MASK : 0;
 	return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, access, exception);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_gva_to_gpa_read);
 
  gpa_t kvm_mmu_gva_to_gpa_fetch(struct kvm_vcpu *vcpu, gva_t gva,
 				struct x86_exception *exception)
@@ -5907,6 +5908,7 @@ gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva,
 	access |= PFERR_WRITE_MASK;
 	return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, access, exception);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_gva_to_gpa_write);
 
 /* uses this to access any guest's mapped memory without checking CPL */
 gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
-- 
2.29.2



* [RFC PATCH v3 18/27] KVM: x86: Define new #PF SGX error code bit
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (17 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 17/27] KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX) Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 19/27] KVM: x86: Add support for reverse CPUID lookup of scattered features Kai Huang
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Page faults that are signaled by the SGX Enclave Page Cache Map (EPCM),
as opposed to the traditional IA32/EPT page tables, set an SGX bit in
the error code to indicate that the #PF was induced by SGX.  KVM will
need to emulate this behavior as part of its trap-and-execute scheme for
virtualizing SGX Launch Control, e.g. to inject SGX-induced #PFs if
EINIT faults in the host, and to support live migration.
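
A small userspace sketch of how the new bit composes with the existing error
code bits.  The PRESENT and WRITE definitions mirror the existing PFERR_*
macros; pferr_mark_sgx() and pferr_is_sgx() are hypothetical helpers made up
for this example and are not added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PFERR_PRESENT_BIT 0
#define PFERR_WRITE_BIT   1
#define PFERR_SGX_BIT     15

#define PFERR_PRESENT_MASK (1U << PFERR_PRESENT_BIT)
#define PFERR_WRITE_MASK   (1U << PFERR_WRITE_BIT)
#define PFERR_SGX_MASK     (1U << PFERR_SGX_BIT)

/* Hypothetical helper: tag an error code as an EPCM-induced #PF. */
static inline uint32_t pferr_mark_sgx(uint32_t error_code)
{
	return error_code | PFERR_SGX_MASK;
}

/* Hypothetical helper: was this #PF signaled by the EPCM? */
static inline bool pferr_is_sgx(uint32_t error_code)
{
	return error_code & PFERR_SGX_MASK;
}
```

For example, a write to a present page that is rejected by the EPCM would
carry PRESENT | WRITE | SGX, i.e. 0x8003.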

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/include/asm/kvm_host.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3d6616f6f6ef..9581f81e62a4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -216,6 +216,7 @@ enum x86_intercept_stage;
 #define PFERR_RSVD_BIT 3
 #define PFERR_FETCH_BIT 4
 #define PFERR_PK_BIT 5
+#define PFERR_SGX_BIT 15
 #define PFERR_GUEST_FINAL_BIT 32
 #define PFERR_GUEST_PAGE_BIT 33
 
@@ -225,6 +226,7 @@ enum x86_intercept_stage;
 #define PFERR_RSVD_MASK (1U << PFERR_RSVD_BIT)
 #define PFERR_FETCH_MASK (1U << PFERR_FETCH_BIT)
 #define PFERR_PK_MASK (1U << PFERR_PK_BIT)
+#define PFERR_SGX_MASK (1U << PFERR_SGX_BIT)
 #define PFERR_GUEST_FINAL_MASK (1ULL << PFERR_GUEST_FINAL_BIT)
 #define PFERR_GUEST_PAGE_MASK (1ULL << PFERR_GUEST_PAGE_BIT)
 
-- 
2.29.2



* [RFC PATCH v3 19/27] KVM: x86: Add support for reverse CPUID lookup of scattered features
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (18 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 18/27] KVM: x86: Define new #PF SGX error code bit Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 20/27] KVM: x86: Add reverse-CPUID lookup support for scattered SGX features Kai Huang
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <seanjc@google.com>

Introduce a scheme that allows KVM's CPUID magic to support features
that are scattered in the kernel's feature words.  To advertise and/or
query guest support for CPUID-based features, KVM requires the bit
number of an X86_FEATURE_* to match the bit number in its associated
CPUID entry.  For scattered features, this does not hold true.

Add a framework to allow defining KVM-only words, stored in
kvm_cpu_caps after the shared kernel caps, that can be used to gather
the scattered feature bits by translating X86_FEATURE_* flags into their
KVM-defined feature.

Note, because reverse_cpuid_check() effectively forces kvm_cpu_caps
lookups to be resolved at compile time, there is no runtime cost for
translating from kernel-defined to kvm-defined features.
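
As a rough illustration of the layout described above, the sketch below
models kvm_cpu_caps as the shared kernel words followed by KVM-only words.
The sizes and the helper names caps_reset(), cap_mask() and cap_init() are
hypothetical stand-ins, and the raw CPUID value is passed in as a parameter
instead of being read from hardware:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes; the real NCAPINTS comes from the kernel headers. */
#define NCAPINTS        4u  /* words shared with the kernel's x86_capability */
#define NKVMCAPINTS     1u  /* KVM-only words appended after the shared ones */
#define NR_KVM_CPU_CAPS (NCAPINTS + NKVMCAPINTS)

static uint32_t kvm_cpu_caps[NR_KVM_CPU_CAPS];

/* Seed only the shared words from the kernel's view; KVM-only words stay 0. */
static void caps_reset(const uint32_t kernel_caps[NCAPINTS])
{
	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
	memcpy(kvm_cpu_caps, kernel_caps, NCAPINTS * sizeof(kvm_cpu_caps[0]));
}

/* Shared word: narrow the kernel's bits to KVM's mask, then to raw CPUID. */
static void cap_mask(unsigned int word, uint32_t kvm_mask, uint32_t raw_cpuid)
{
	assert(word < NCAPINTS);
	kvm_cpu_caps[word] &= kvm_mask;
	kvm_cpu_caps[word] &= raw_cpuid;
}

/* KVM-only word: there are no kernel bits to start from, so assign first. */
static void cap_init(unsigned int word, uint32_t kvm_mask, uint32_t raw_cpuid)
{
	assert(word >= NCAPINTS && word < NR_KVM_CPU_CAPS);
	kvm_cpu_caps[word] = kvm_mask;
	kvm_cpu_caps[word] &= raw_cpuid;
}
```

The run-time asserts here only stand in for the BUILD_BUG_ON() checks in the
patch, which reject a mismatched helper at compile time.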

More details here:  https://lkml.kernel.org/r/X/jxCOLG+HUO4QlZ@google.com

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/cpuid.c | 32 +++++++++++++++++++++++++++-----
 arch/x86/kvm/cpuid.h | 39 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 13036cf0b912..f8037fab8950 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -28,7 +28,7 @@
  * Unlike "struct cpuinfo_x86.x86_capability", kvm_cpu_caps doesn't need to be
  * aligned to sizeof(unsigned long) because it's not accessed via bitops.
  */
-u32 kvm_cpu_caps[NCAPINTS] __read_mostly;
+u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_cpu_caps);
 
 static u32 xstate_required_size(u64 xstate_bv, bool compacted)
@@ -53,6 +53,7 @@ static u32 xstate_required_size(u64 xstate_bv, bool compacted)
 }
 
 #define F feature_bit
+#define SF(name) (boot_cpu_has(X86_FEATURE_##name) ? F(name) : 0)
 
 static inline struct kvm_cpuid_entry2 *cpuid_entry2_find(
 	struct kvm_cpuid_entry2 *entries, int nent, u32 function, u32 index)
@@ -331,13 +332,13 @@ int kvm_vcpu_ioctl_get_cpuid2(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
+/* Mask kvm_cpu_caps for @leaf with the raw CPUID capabilities of this CPU. */
+static __always_inline void __kvm_cpu_cap_mask(enum cpuid_leafs leaf)
 {
 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);
 	struct kvm_cpuid_entry2 entry;
 
 	reverse_cpuid_check(leaf);
-	kvm_cpu_caps[leaf] &= mask;
 
 	cpuid_count(cpuid.function, cpuid.index,
 		    &entry.eax, &entry.ebx, &entry.ecx, &entry.edx);
@@ -345,6 +346,26 @@ static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
 	kvm_cpu_caps[leaf] &= *__cpuid_entry_get_reg(&entry, cpuid.reg);
 }
 
+static __always_inline void kvm_cpu_cap_mask(enum cpuid_leafs leaf, u32 mask)
+{
+	/* Use the "init" variant for scattered leafs. */
+	BUILD_BUG_ON(leaf >= NCAPINTS);
+
+	kvm_cpu_caps[leaf] &= mask;
+
+	__kvm_cpu_cap_mask(leaf);
+}
+
+static __always_inline void kvm_cpu_cap_init(enum cpuid_leafs leaf, u32 mask)
+{
+	/* Use the "mask" variant for hardware-defined leafs. */
+	BUILD_BUG_ON(leaf < NCAPINTS);
+
+	kvm_cpu_caps[leaf] = mask;
+
+	__kvm_cpu_cap_mask(leaf);
+}
+
 void kvm_set_cpu_caps(void)
 {
 	unsigned int f_nx = is_efer_nx() ? F(NX) : 0;
@@ -355,12 +376,13 @@ void kvm_set_cpu_caps(void)
 	unsigned int f_gbpages = 0;
 	unsigned int f_lm = 0;
 #endif
+	memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
 
-	BUILD_BUG_ON(sizeof(kvm_cpu_caps) >
+	BUILD_BUG_ON(sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)) >
 		     sizeof(boot_cpu_data.x86_capability));
 
 	memcpy(&kvm_cpu_caps, &boot_cpu_data.x86_capability,
-	       sizeof(kvm_cpu_caps));
+	       sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)));
 
 	kvm_cpu_cap_mask(CPUID_1_ECX,
 		/*
diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index dc921d76e42e..2041e2f07347 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -7,7 +7,20 @@
 #include <asm/processor.h>
 #include <uapi/asm/kvm_para.h>
 
-extern u32 kvm_cpu_caps[NCAPINTS] __read_mostly;
+/*
+ * Hardware-defined CPUID leafs that are scattered in the kernel, but need to
+ * be directly used by KVM.  Note, these word values conflict with the kernel's
+ * "bug" caps, but KVM doesn't use those.
+ */
+enum kvm_only_cpuid_leafs {
+	NR_KVM_CPU_CAPS = NCAPINTS,
+
+	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
+};
+
+#define X86_KVM_FEATURE(w, f)		((w)*32 + (f))
+
+extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 void kvm_set_cpu_caps(void);
 
 void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu);
@@ -83,6 +96,20 @@ static __always_inline void reverse_cpuid_check(unsigned int x86_leaf)
 	BUILD_BUG_ON(reverse_cpuid[x86_leaf].function == 0);
 }
 
+/*
+ * Translate feature bits that are scattered in the kernel's cpufeatures word
+ * into KVM feature words that align with hardware's definitions.
+ */
+static __always_inline u32 __feature_translate(int x86_feature)
+{
+	return x86_feature;
+}
+
+static __always_inline u32 __feature_leaf(int x86_feature)
+{
+	return __feature_translate(x86_feature) / 32;
+}
+
 /*
  * Retrieve the bit mask from an X86_FEATURE_* definition.  Features contain
  * the hardware defined bit number (stored in bits 4:0) and a software defined
@@ -91,6 +118,8 @@ static __always_inline void reverse_cpuid_check(unsigned int x86_leaf)
  */
 static __always_inline u32 __feature_bit(int x86_feature)
 {
+	x86_feature = __feature_translate(x86_feature);
+
 	reverse_cpuid_check(x86_feature / 32);
 	return 1 << (x86_feature & 31);
 }
@@ -99,7 +128,7 @@ static __always_inline u32 __feature_bit(int x86_feature)
 
 static __always_inline struct cpuid_reg x86_feature_cpuid(unsigned int x86_feature)
 {
-	unsigned int x86_leaf = x86_feature / 32;
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
 	reverse_cpuid_check(x86_leaf);
 	return reverse_cpuid[x86_leaf];
@@ -291,7 +320,7 @@ static inline bool cpuid_fault_enabled(struct kvm_vcpu *vcpu)
 
 static __always_inline void kvm_cpu_cap_clear(unsigned int x86_feature)
 {
-	unsigned int x86_leaf = x86_feature / 32;
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
 	reverse_cpuid_check(x86_leaf);
 	kvm_cpu_caps[x86_leaf] &= ~__feature_bit(x86_feature);
@@ -299,7 +328,7 @@ static __always_inline void kvm_cpu_cap_clear(unsigned int x86_feature)
 
 static __always_inline void kvm_cpu_cap_set(unsigned int x86_feature)
 {
-	unsigned int x86_leaf = x86_feature / 32;
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
 	reverse_cpuid_check(x86_leaf);
 	kvm_cpu_caps[x86_leaf] |= __feature_bit(x86_feature);
@@ -307,7 +336,7 @@ static __always_inline void kvm_cpu_cap_set(unsigned int x86_feature)
 
 static __always_inline u32 kvm_cpu_cap_get(unsigned int x86_feature)
 {
-	unsigned int x86_leaf = x86_feature / 32;
+	unsigned int x86_leaf = __feature_leaf(x86_feature);
 
 	reverse_cpuid_check(x86_leaf);
 	return kvm_cpu_caps[x86_leaf] & __feature_bit(x86_feature);
-- 
2.29.2



* [RFC PATCH v3 20/27] KVM: x86: Add reverse-CPUID lookup support for scattered SGX features
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (19 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 19/27] KVM: x86: Add support for reverse CPUID lookup of scattered features Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 21/27] KVM: VMX: Add basic handling of VM-Exit from SGX enclave Kai Huang
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <seanjc@google.com>

Define a new KVM-only feature word for advertising and querying SGX
sub-features in CPUID.0x12.0x0.EAX.  Because SGX1 and SGX2 are scattered
in the kernel's feature word, they need to be translated so that the
bit numbers match those of hardware.
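
A minimal sketch of the translation, assuming a value for NCAPINTS and
assuming that SGX1/SGX2 live at bits 8 and 9 of a Linux-defined word (both
are illustrative, not taken from this patch).  The names feature_translate(),
feature_leaf() and feature_bit() are simplified stand-ins for the helpers in
cpuid.h:

```c
#include <assert.h>
#include <stdint.h>

#define NCAPINTS          19u            /* assumed number of kernel words */
#define CPUID_12_EAX      NCAPINTS       /* first KVM-only word */
#define FEATURE(w, b)     ((w) * 32u + (b))

/* Assumed positions in a Linux-defined (scattered) word. */
#define X86_FEATURE_SGX1  FEATURE(11u, 8u)
#define X86_FEATURE_SGX2  FEATURE(11u, 9u)

/* Hardware positions: CPUID.0x12.0x0:EAX bits 0 and 1. */
#define KVM_FEATURE_SGX1  FEATURE(CPUID_12_EAX, 0u)
#define KVM_FEATURE_SGX2  FEATURE(CPUID_12_EAX, 1u)

/* Map a scattered kernel feature to its KVM-only, hardware-aligned twin. */
static inline uint32_t feature_translate(uint32_t f)
{
	if (f == X86_FEATURE_SGX1)
		return KVM_FEATURE_SGX1;
	if (f == X86_FEATURE_SGX2)
		return KVM_FEATURE_SGX2;
	return f;
}

static inline uint32_t feature_leaf(uint32_t f)
{
	return feature_translate(f) / 32u;
}

static inline uint32_t feature_bit(uint32_t f)
{
	/* Run-time stand-in for the compile-time reverse_cpuid_check(). */
	assert(feature_leaf(f) <= CPUID_12_EAX);
	return 1u << (feature_translate(f) & 31u);
}
```

With these definitions, a lookup of X86_FEATURE_SGX2 lands in the
CPUID_12_EAX word with mask 0x2, which is the bit position hardware reports.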

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/cpuid.h | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 2041e2f07347..f55701ef58fc 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -13,13 +13,18 @@
  * "bug" caps, but KVM doesn't use those.
  */
 enum kvm_only_cpuid_leafs {
-	NR_KVM_CPU_CAPS = NCAPINTS,
+	CPUID_12_EAX	 = NCAPINTS,
+	NR_KVM_CPU_CAPS,
 
 	NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
 };
 
 #define X86_KVM_FEATURE(w, f)		((w)*32 + (f))
 
+/* Intel-defined SGX sub-features, CPUID level 0x12 (EAX). */
+#define __X86_FEATURE_SGX1		X86_KVM_FEATURE(CPUID_12_EAX, 0)
+#define __X86_FEATURE_SGX2		X86_KVM_FEATURE(CPUID_12_EAX, 1)
+
 extern u32 kvm_cpu_caps[NR_KVM_CPU_CAPS] __read_mostly;
 void kvm_set_cpu_caps(void);
 
@@ -76,6 +81,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
 	[CPUID_8000_0007_EBX] = {0x80000007, 0, CPUID_EBX},
 	[CPUID_7_EDX]         = {         7, 0, CPUID_EDX},
 	[CPUID_7_1_EAX]       = {         7, 1, CPUID_EAX},
+	[CPUID_12_EAX]        = {0x00000012, 0, CPUID_EAX},
 };
 
 /*
@@ -102,6 +108,11 @@ static __always_inline void reverse_cpuid_check(unsigned int x86_leaf)
  */
 static __always_inline u32 __feature_translate(int x86_feature)
 {
+	if (x86_feature == X86_FEATURE_SGX1)
+		return __X86_FEATURE_SGX1;
+	else if (x86_feature == X86_FEATURE_SGX2)
+		return __X86_FEATURE_SGX2;
+
 	return x86_feature;
 }
 
-- 
2.29.2



* [RFC PATCH v3 21/27] KVM: VMX: Add basic handling of VM-Exit from SGX enclave
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (20 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 20/27] KVM: x86: Add reverse-CPUID lookup support for scattered SGX features Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 22/27] KVM: VMX: Frame in ENCLS handler for SGX virtualization Kai Huang
                   ` (6 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add support for handling VM-Exits that originate from a guest SGX
enclave.  In SGX, an "enclave" is a new CPL3-only execution environment,
wherein the CPU and memory state is protected by hardware to make the
state inaccessible to code running outside of the enclave.  When exiting
an enclave due to an asynchronous event (from the perspective of the
enclave), e.g. exceptions, interrupts, and VM-Exits, the enclave's state
is automatically saved and scrubbed (the CPU loads synthetic state), and
then reloaded when re-entering the enclave.  E.g. after an instruction
based VM-Exit from an enclave, vmcs.GUEST_RIP will not contain the RIP
of the enclave instruction that triggered VM-Exit, but will instead point
to a RIP in the enclave's untrusted runtime (the guest userspace code
that coordinates entry/exit to/from the enclave).

To help a VMM recognize and handle exits from enclaves, SGX adds bits to
existing VMCS fields, VM_EXIT_REASON.VMX_EXIT_REASON_FROM_ENCLAVE and
GUEST_INTERRUPTIBILITY_INFO.GUEST_INTR_STATE_ENCLAVE_INTR.  Define the
new architectural bits, and add a boolean to struct vcpu_vmx to cache
VMX_EXIT_REASON_FROM_ENCLAVE.  Clear the bit in exit_reason so that
checks against exit_reason do not need to account for SGX, e.g.
"if (exit_reason == EXIT_REASON_EXCEPTION_NMI)" continues to work.

KVM is largely a passive observer of the new bits, e.g. KVM needs to
account for the bits when propagating information to a nested VMM, but
otherwise doesn't need to act differently for the majority of VM-Exits
from enclaves.
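
The caching idea can be sketched outside the kernel as follows.  The struct
and function names are hypothetical; the patch itself keeps the flag in
vcpu_vmx's exit_reason, and only the bit value mirrors the define added in
the diff below:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXIT_REASON_EXCEPTION_NMI          0u
#define VMX_EXIT_REASONS_SGX_ENCLAVE_MODE  0x08000000u  /* bit 27 */

struct cached_exit {
	uint32_t reason;        /* exit reason with the enclave bit cleared */
	bool     from_enclave;  /* cached copy of the enclave bit */
};

/* Split the raw VMCS value so plain equality checks on 'reason' still work. */
static struct cached_exit cache_exit_reason(uint32_t raw)
{
	struct cached_exit e;

	e.from_enclave = (raw & VMX_EXIT_REASONS_SGX_ENCLAVE_MODE) != 0;
	e.reason = raw & ~VMX_EXIT_REASONS_SGX_ENCLAVE_MODE;
	return e;
}

/* Nested case: put the bit back when reflecting the exit to the L1 VMM. */
static uint32_t reason_for_nested(struct cached_exit e)
{
	assert((e.reason & VMX_EXIT_REASONS_SGX_ENCLAVE_MODE) == 0);
	return e.from_enclave ?
	       (e.reason | VMX_EXIT_REASONS_SGX_ENCLAVE_MODE) : e.reason;
}
```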

The one scenario that is directly impacted is emulation, which is for
all intents and purposes impossible[1] since KVM does not have access to
the RIP or instruction stream that triggered the VM-Exit.  The inability
to emulate is a non-issue for KVM, as most instructions that might
trigger VM-Exit unconditionally #UD in an enclave (before the VM-Exit
check).  For the few instructions that conditionally #UD, KVM either never
sets the exiting control, e.g. PAUSE_EXITING[2], or sets it if and only
if the feature is not exposed to the guest in order to inject a #UD,
e.g. RDRAND_EXITING.

But, because it is still possible for a guest to trigger emulation,
e.g. MMIO, inject a #UD if KVM ever attempts emulation after a VM-Exit
from an enclave.  This is architecturally accurate for instruction
VM-Exits, and for MMIO it's the least bad choice, e.g. it's preferable
to killing the VM.  In practice, only broken or particularly stupid
guests should ever encounter this behavior.

Add a WARN in skip_emulated_instruction to detect any attempt to
modify the guest's RIP during an SGX enclave VM-Exit as all such flows
should either be unreachable or must handle exits from enclaves before
getting to skip_emulated_instruction.
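
The decision made when skipping an instruction can be modelled as below.
This is a simplified sketch with hypothetical names; it ignores the EPT
misconfig special case and the 32-bit RIP masking, and reports the WARN
condition as a flag instead of logging:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct skip_result {
	uint64_t rip;     /* RIP after the (possible) skip */
	bool     warned;  /* true where the patch would WARN */
};

static struct skip_result skip_insn(uint64_t rip, uint32_t instr_len,
				    bool from_enclave)
{
	struct skip_result r = { .rip = rip, .warned = false };

	/* x86 instructions are at most 15 bytes long. */
	assert(instr_len <= 15);

	/* Zero length: nothing to skip, e.g. INT3 in a debug enclave. */
	if (instr_len == 0)
		return r;

	/* A non-zero length after an enclave exit means a flow was missed. */
	r.warned = from_enclave;
	r.rip = rip + instr_len;
	return r;
}
```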

[1] Impossible for all practical purposes.  Not truly impossible
    since KVM could implement some form of para-virtualization scheme.

[2] PAUSE_LOOP_EXITING only affects CPL0 and enclaves exist only at
    CPL3, so we also don't need to worry about that interaction.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/include/asm/vmx.h      |  1 +
 arch/x86/include/uapi/asm/vmx.h |  1 +
 arch/x86/kvm/vmx/nested.c       |  2 ++
 arch/x86/kvm/vmx/vmx.c          | 38 +++++++++++++++++++++++++++++++--
 4 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 38ca445a8429..e99021a00eb9 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -372,6 +372,7 @@ enum vmcs_field {
 #define GUEST_INTR_STATE_MOV_SS		0x00000002
 #define GUEST_INTR_STATE_SMI		0x00000004
 #define GUEST_INTR_STATE_NMI		0x00000008
+#define GUEST_INTR_STATE_ENCLAVE_INTR	0x00000010
 
 /* GUEST_ACTIVITY_STATE flags */
 #define GUEST_ACTIVITY_ACTIVE		0
diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index ada955c5ebb6..c7a18eb2a074 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -27,6 +27,7 @@
 
 
 #define VMX_EXIT_REASONS_FAILED_VMENTRY         0x80000000
+#define VMX_EXIT_REASONS_SGX_ENCLAVE_MODE	0x08000000
 
 #define EXIT_REASON_EXCEPTION_NMI       0
 #define EXIT_REASON_EXTERNAL_INTERRUPT  1
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f112c2482887..562eab7b0a51 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4126,6 +4126,8 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 {
 	/* update exit information fields: */
 	vmcs12->vm_exit_reason = vm_exit_reason;
+	if (to_vmx(vcpu)->exit_reason.sgx_enclave_mode)
+		vmcs12->vm_exit_reason |= VMX_EXIT_REASONS_SGX_ENCLAVE_MODE;
 	vmcs12->exit_qualification = exit_qualification;
 	vmcs12->vm_exit_intr_info = exit_intr_info;
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 746b87375aff..4cb8a3f1374c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1561,12 +1561,18 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
 
 static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int insn_len)
 {
+	if (to_vmx(vcpu)->exit_reason.sgx_enclave_mode) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+		return false;
+	}
 	return true;
 }
 
 static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
+	union vmx_exit_reason exit_reason = to_vmx(vcpu)->exit_reason;
 	unsigned long rip, orig_rip;
+	u32 instr_len;
 
 	/*
 	 * Using VMCS.VM_EXIT_INSTRUCTION_LEN on EPT misconfig depends on
@@ -1577,9 +1583,33 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	 * i.e. we end up advancing IP with some random value.
 	 */
 	if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
-	    to_vmx(vcpu)->exit_reason.basic != EXIT_REASON_EPT_MISCONFIG) {
+	    exit_reason.basic != EXIT_REASON_EPT_MISCONFIG) {
+		instr_len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
+
+		/*
+		 * Emulating an enclave's instructions isn't supported as KVM
+		 * cannot access the enclave's memory or its true RIP, e.g. the
+		 * vmcs.GUEST_RIP points at the exit point of the enclave, not
+		 * the RIP that actually triggered the VM-Exit.  But, because
+		 * most instructions that cause VM-Exit will #UD in an enclave,
+		 * most instruction-based VM-Exits simply do not occur.
+		 *
+		 * There are a few exceptions, notably the debug instructions
+		 * INT1ICEBRK and INT3, as they are allowed in debug enclaves
+		 * and generate #DB/#BP as expected, which KVM might intercept.
+		 * But again, the CPU does the dirty work and saves an instr
+		 * length of zero so VMMs don't shoot themselves in the foot.
+		 * WARN if KVM tries to skip a non-zero length instruction on
+		 * a VM-Exit from an enclave.
+		 */
+		if (!instr_len)
+			goto rip_updated;
+
+		WARN(exit_reason.sgx_enclave_mode,
+		     "KVM: skipping instruction after SGX enclave VM-Exit");
+
 		orig_rip = kvm_rip_read(vcpu);
-		rip = orig_rip + vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
+		rip = orig_rip + instr_len;
 #ifdef CONFIG_X86_64
 		/*
 		 * We need to mask out the high 32 bits of RIP if not in 64-bit
@@ -1595,6 +1625,7 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
 			return 0;
 	}
 
+rip_updated:
 	/* skipping an emulated instruction also counts */
 	vmx_set_interrupt_shadow(vcpu, 0);
 
@@ -5341,6 +5372,9 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
 {
 	gpa_t gpa;
 
+	if (!vmx_can_emulate_instruction(vcpu, NULL, 0))
+		return 1;
+
 	/*
 	 * A nested guest cannot optimize MMIO vmexits, because we have an
 	 * nGPA here instead of the required GPA.
-- 
2.29.2



* [RFC PATCH v3 22/27] KVM: VMX: Frame in ENCLS handler for SGX virtualization
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (21 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 21/27] KVM: VMX: Add basic handling of VM-Exit from SGX enclave Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions Kai Huang
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Introduce sgx.c and sgx.h, along with the framework for handling ENCLS
VM-Exits.  Add a bool, enable_sgx, that will eventually be wired up to a
module param to control whether or not SGX virtualization is enabled at
runtime.
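
The exit handling framed in by this patch boils down to a three-way decision,
sketched here as a pure function.  The struct guest_sgx and the enum
encls_action are hypothetical, the leaf numbers and FEAT_CTL bit positions
are written out from memory of the architectural definitions and should be
treated as assumptions, and the enable_sgx check is omitted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed ENCLS leaf numbers (EAX on entry). */
#define ECREATE 0x00u
#define ETRACK  0x0cu
#define EAUG    0x0du
#define EMODT   0x0fu

/* Assumed IA32_FEATURE_CONTROL bits. */
#define FEAT_CTL_LOCKED       (1ull << 0)
#define FEAT_CTL_SGX_ENABLED  (1ull << 18)

enum encls_action { INJECT_UD, INJECT_GP, HANDLE_LEAF };

struct guest_sgx {
	bool     sgx, sgx1, sgx2;   /* guest CPUID feature bits */
	uint64_t feature_control;   /* guest view of IA32_FEATURE_CONTROL */
};

static bool leaf_enabled(const struct guest_sgx *g, uint32_t leaf)
{
	if (!g->sgx)
		return false;
	if (leaf <= ETRACK)                 /* ECREATE..ETRACK need SGX1 */
		return g->sgx1;
	if (leaf >= EAUG && leaf <= EMODT)  /* EAUG..EMODT need SGX2 */
		return g->sgx2;
	return false;
}

static enum encls_action classify_encls(const struct guest_sgx *g,
					uint32_t leaf)
{
	const uint64_t bits = FEAT_CTL_SGX_ENABLED | FEAT_CTL_LOCKED;

	assert(g);
	if (!leaf_enabled(g, leaf))
		return INJECT_UD;           /* leaf not exposed to the guest */
	if ((g->feature_control & bits) != bits)
		return INJECT_GP;           /* SGX not enabled by guest BIOS */
	return HANDLE_LEAF;
}
```

In this patch the HANDLE_LEAF case is still the "unexpected exit" path; the
following patches add the per-leaf handlers.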

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/Makefile  |  2 ++
 arch/x86/kvm/vmx/sgx.c | 51 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/sgx.h | 15 +++++++++++++
 arch/x86/kvm/vmx/vmx.c |  9 +++++---
 4 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/kvm/vmx/sgx.c
 create mode 100644 arch/x86/kvm/vmx/sgx.h

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 4bd14ab01323..5c86edc73b72 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -21,6 +21,8 @@ kvm-y			+= x86.o emulate.o i8259.o irq.o lapic.o \
 
 kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
 			   vmx/evmcs.o vmx/nested.o vmx/posted_intr.o
+kvm-intel-$(CONFIG_X86_SGX_KVM)	+= vmx/sgx.o
+
 kvm-amd-y		+= svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o svm/sev.o
 
 obj-$(CONFIG_KVM)	+= kvm.o
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
new file mode 100644
index 000000000000..693bf7735308
--- /dev/null
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+/*  Copyright(c) 2016-20 Intel Corporation. */
+
+#include <asm/sgx.h>
+#include <asm/sgx_arch.h>
+
+#include "cpuid.h"
+#include "kvm_cache_regs.h"
+#include "sgx.h"
+#include "vmx.h"
+#include "x86.h"
+
+bool __read_mostly enable_sgx;
+
+static inline bool encls_leaf_enabled_in_guest(struct kvm_vcpu *vcpu, u32 leaf)
+{
+	if (!enable_sgx || !guest_cpuid_has(vcpu, X86_FEATURE_SGX))
+		return false;
+
+	if (leaf >= ECREATE && leaf <= ETRACK)
+		return guest_cpuid_has(vcpu, X86_FEATURE_SGX1);
+
+	if (leaf >= EAUG && leaf <= EMODT)
+		return guest_cpuid_has(vcpu, X86_FEATURE_SGX2);
+
+	return false;
+}
+
+static inline bool sgx_enabled_in_guest_bios(struct kvm_vcpu *vcpu)
+{
+	const u64 bits = FEAT_CTL_SGX_ENABLED | FEAT_CTL_LOCKED;
+
+	return (to_vmx(vcpu)->msr_ia32_feature_control & bits) == bits;
+}
+
+int handle_encls(struct kvm_vcpu *vcpu)
+{
+	u32 leaf = (u32)vcpu->arch.regs[VCPU_REGS_RAX];
+
+	if (!encls_leaf_enabled_in_guest(vcpu, leaf)) {
+		kvm_queue_exception(vcpu, UD_VECTOR);
+	} else if (!sgx_enabled_in_guest_bios(vcpu)) {
+		kvm_inject_gp(vcpu, 0);
+	} else {
+		WARN(1, "KVM: unexpected exit on ENCLS[%u]", leaf);
+		vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
+		vcpu->run->hw.hardware_exit_reason = EXIT_REASON_ENCLS;
+		return 0;
+	}
+	return 1;
+}
diff --git a/arch/x86/kvm/vmx/sgx.h b/arch/x86/kvm/vmx/sgx.h
new file mode 100644
index 000000000000..6e17ecd4aca3
--- /dev/null
+++ b/arch/x86/kvm/vmx/sgx.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __KVM_X86_SGX_H
+#define __KVM_X86_SGX_H
+
+#include <linux/kvm_host.h>
+
+#ifdef CONFIG_X86_SGX_KVM
+extern bool __read_mostly enable_sgx;
+
+int handle_encls(struct kvm_vcpu *vcpu);
+#else
+#define enable_sgx 0
+#endif
+
+#endif /* __KVM_X86_SGX_H */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4cb8a3f1374c..dbe585329842 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include "mmu.h"
 #include "nested.h"
 #include "pmu.h"
+#include "sgx.h"
 #include "trace.h"
 #include "vmcs.h"
 #include "vmcs12.h"
@@ -5623,16 +5624,18 @@ static int handle_vmx_instruction(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+#ifndef CONFIG_X86_SGX_KVM
 static int handle_encls(struct kvm_vcpu *vcpu)
 {
 	/*
-	 * SGX virtualization is not yet supported.  There is no software
-	 * enable bit for SGX, so we have to trap ENCLS and inject a #UD
-	 * to prevent the guest from executing ENCLS.
+	 * SGX virtualization is disabled.  There is no software enable bit for
+	 * SGX, so KVM intercepts all ENCLS leafs and injects a #UD to prevent
+	 * the guest from executing ENCLS (when SGX is supported by hardware).
 	 */
 	kvm_queue_exception(vcpu, UD_VECTOR);
 	return 1;
 }
+#endif /* CONFIG_X86_SGX_KVM */
 
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
-- 
2.29.2



* [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (22 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 22/27] KVM: VMX: Frame in ENCLS handler for SGX virtualization Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-02-03  0:52   ` Edgecombe, Rick P
  2021-02-03 18:47   ` Edgecombe, Rick P
  2021-01-26  9:31 ` [RFC PATCH v3 24/27] KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs Kai Huang
                   ` (4 subsequent siblings)
  28 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add an ECREATE handler that will be used to intercept ECREATE for the
purpose of enforcing an enclave's MISCSELECT, ATTRIBUTES and XFRM, i.e.
to allow userspace to restrict SGX features via CPUID.  ECREATE will be
intercepted when any of the aforementioned masks diverges from hardware
in order to enforce the desired CPUID model, i.e. inject #GP if the
guest attempts to set a bit that hasn't been enumerated as allowed-1 in
CPUID.
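
The allowed-1 checks can be sketched as a standalone predicate.  The struct
cpuid_regs and the function ecreate_allowed() are hypothetical; l0 and l1
stand for the guest's CPUID.0x12 sub-leaf 0 and 1 entries, and the
SGX_ATTR_MODE64BIT bit position is an assumption.  The sketch also guards the
shift against a size exponent of 64 or more, which the patch does not do:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SGX_ATTR_MODE64BIT (1ull << 2)  /* assumed bit position */

struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };

/* Returns false where the handler would inject #GP into the guest. */
static bool ecreate_allowed(const struct cpuid_regs *l0,
			    const struct cpuid_regs *l1,
			    uint32_t miscselect, uint64_t attributes,
			    uint64_t xfrm, uint64_t size)
{
	uint8_t max_size_log2;

	assert(l0 && l1);

	/* Every requested bit must be enumerated as allowed-1 in CPUID. */
	if (miscselect & ~l0->ebx)
		return false;
	if (((uint32_t)attributes & ~l1->eax) ||
	    ((uint32_t)(attributes >> 32) & ~l1->ebx))
		return false;
	if (((uint32_t)xfrm & ~l1->ecx) ||
	    ((uint32_t)(xfrm >> 32) & ~l1->edx))
		return false;

	/* EDX[15:8] bounds 64-bit enclaves, EDX[7:0] bounds 32-bit ones. */
	max_size_log2 = (attributes & SGX_ATTR_MODE64BIT) ?
			(uint8_t)(l0->edx >> 8) : (uint8_t)l0->edx;
	if (max_size_log2 < 64 && size >= (1ull << max_size_log2))
		return false;

	return true;
}
```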

Note, access to the PROVISIONKEY is not yet supported.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/include/asm/kvm_host.h |   3 +
 arch/x86/kvm/vmx/sgx.c          | 243 ++++++++++++++++++++++++++++++++
 2 files changed, 246 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9581f81e62a4..cd71f30fbdd1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1000,6 +1000,9 @@ struct kvm_arch {
 		struct msr_bitmap_range ranges[16];
 	} msr_filter;
 
+	/* Guest can access the SGX PROVISIONKEY. */
+	bool sgx_provisioning_allowed;
+
 	struct kvm_pmu_event_filter *pmu_event_filter;
 	struct task_struct *nx_lpage_recovery_thread;
 
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 693bf7735308..4281045318ac 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -12,6 +12,247 @@
 
 bool __read_mostly enable_sgx;
 
+/*
+ * ENCLS's memory operands use a fixed segment (DS) and a fixed
+ * address size based on the mode.  Related prefixes are ignored.
+ */
+static int sgx_get_encls_gva(struct kvm_vcpu *vcpu, unsigned long offset,
+			     int size, int alignment, gva_t *gva)
+{
+	struct kvm_segment s;
+	bool fault;
+
+	/* Skip vmcs.GUEST_DS retrieval for 64-bit mode to avoid VMREADs. */
+	*gva = offset;
+	if (!is_long_mode(vcpu)) {
+		vmx_get_segment(vcpu, &s, VCPU_SREG_DS);
+		*gva += s.base;
+	}
+
+	if (!IS_ALIGNED(*gva, alignment)) {
+		fault = true;
+	} else if (likely(is_long_mode(vcpu))) {
+		fault = is_noncanonical_address(*gva, vcpu);
+	} else {
+		*gva &= 0xffffffff;
+		fault = (s.unusable) ||
+			(s.type != 2 && s.type != 3) ||
+			(*gva > s.limit) ||
+			((s.base != 0 || s.limit != 0xffffffff) &&
+			(((u64)*gva + size - 1) > s.limit + 1));
+	}
+	if (fault)
+		kvm_inject_gp(vcpu, 0);
+	return fault ? -EINVAL : 0;
+}
+
+static void sgx_handle_emulation_failure(struct kvm_vcpu *vcpu, u64 addr,
+					 unsigned int size)
+{
+	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+	vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
+	vcpu->run->internal.ndata = 2;
+	vcpu->run->internal.data[0] = addr;
+	vcpu->run->internal.data[1] = size;
+}
+
+static int sgx_read_hva(struct kvm_vcpu *vcpu, unsigned long hva, void *data,
+			unsigned int size)
+{
+	if (__copy_from_user(data, (void __user *)hva, size)) {
+		sgx_handle_emulation_failure(vcpu, hva, size);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int sgx_gva_to_gpa(struct kvm_vcpu *vcpu, gva_t gva, bool write,
+			  gpa_t *gpa)
+{
+	struct x86_exception ex;
+
+	if (write)
+		*gpa = kvm_mmu_gva_to_gpa_write(vcpu, gva, &ex);
+	else
+		*gpa = kvm_mmu_gva_to_gpa_read(vcpu, gva, &ex);
+
+	if (*gpa == UNMAPPED_GVA) {
+		kvm_inject_emulated_page_fault(vcpu, &ex);
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int sgx_gpa_to_hva(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long *hva)
+{
+	*hva = kvm_vcpu_gfn_to_hva(vcpu, PFN_DOWN(gpa));
+	if (kvm_is_error_hva(*hva)) {
+		sgx_handle_emulation_failure(vcpu, gpa, 1);
+		return -EFAULT;
+	}
+
+	*hva |= gpa & ~PAGE_MASK;
+
+	return 0;
+}
+
+static int sgx_inject_fault(struct kvm_vcpu *vcpu, gva_t gva, int trapnr)
+{
+	struct x86_exception ex;
+
+	/*
+	 * A non-EPCM #PF indicates a bad userspace HVA.  This *should* check
+	 * for PFEC.SGX and not assume any #PF on SGX2 originated in the EPC,
+	 * but the error code isn't (yet) plumbed through the ENCLS helpers.
+	 */
+	if (trapnr == PF_VECTOR && !boot_cpu_has(X86_FEATURE_SGX2)) {
+		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
+		vcpu->run->internal.ndata = 0;
+		return 0;
+	}
+
+	/*
+	 * If the guest thinks it's running on SGX2 hardware, inject an SGX
+	 * #PF if the fault matches an EPCM fault signature (#GP on SGX1,
+	 * #PF on SGX2).  The assumption is that EPCM faults are much more
+	 * likely than a bad userspace address.
+	 */
+	if ((trapnr == PF_VECTOR || !boot_cpu_has(X86_FEATURE_SGX2)) &&
+	    guest_cpuid_has(vcpu, X86_FEATURE_SGX2)) {
+		memset(&ex, 0, sizeof(ex));
+		ex.vector = PF_VECTOR;
+		ex.error_code = PFERR_PRESENT_MASK | PFERR_WRITE_MASK |
+				PFERR_SGX_MASK;
+		ex.address = gva;
+		ex.error_code_valid = true;
+		ex.nested_page_fault = false;
+		kvm_inject_page_fault(vcpu, &ex);
+	} else {
+		kvm_inject_gp(vcpu, 0);
+	}
+	return 1;
+}
+
+static int handle_encls_ecreate(struct kvm_vcpu *vcpu)
+{
+	unsigned long a_hva, m_hva, x_hva, s_hva, secs_hva;
+	struct kvm_cpuid_entry2 *sgx_12_0, *sgx_12_1;
+	gpa_t metadata_gpa, contents_gpa, secs_gpa;
+	struct sgx_pageinfo pageinfo;
+	gva_t pageinfo_gva, secs_gva;
+	u64 attributes, xfrm, size;
+	struct x86_exception ex;
+	u8 max_size_log2;
+	u32 miscselect;
+	int trapnr, r;
+
+	sgx_12_0 = kvm_find_cpuid_entry(vcpu, 0x12, 0);
+	sgx_12_1 = kvm_find_cpuid_entry(vcpu, 0x12, 1);
+	if (!sgx_12_0 || !sgx_12_1) {
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	if (sgx_get_encls_gva(vcpu, kvm_rbx_read(vcpu), 32, 32, &pageinfo_gva) ||
+	    sgx_get_encls_gva(vcpu, kvm_rcx_read(vcpu), 4096, 4096, &secs_gva))
+		return 1;
+
+	/*
+	 * Copy the PAGEINFO to local memory, its pointers need to be
+	 * translated, i.e. we need to do a deep copy/translate.
+	 */
+	r = kvm_read_guest_virt(vcpu, pageinfo_gva, &pageinfo,
+				sizeof(pageinfo), &ex);
+	if (r == X86EMUL_PROPAGATE_FAULT) {
+		kvm_inject_emulated_page_fault(vcpu, &ex);
+		return 1;
+	} else if (r != X86EMUL_CONTINUE) {
+		sgx_handle_emulation_failure(vcpu, pageinfo_gva, sizeof(pageinfo));
+		return 0;
+	}
+
+	/*
+	 * Verify alignment early.  This conveniently avoids having to worry
+	 * about page splits on userspace addresses.
+	 */
+	if (!IS_ALIGNED(pageinfo.metadata, 64) ||
+	    !IS_ALIGNED(pageinfo.contents, 4096)) {
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	/*
+	 * Translate the SECINFO, SOURCE and SECS pointers from GVA to GPA.
+	 * Resume the guest on failure to inject a #PF.
+	 */
+	if (sgx_gva_to_gpa(vcpu, pageinfo.metadata, false, &metadata_gpa) ||
+	    sgx_gva_to_gpa(vcpu, pageinfo.contents, false, &contents_gpa) ||
+	    sgx_gva_to_gpa(vcpu, secs_gva, true, &secs_gpa))
+		return 1;
+
+	/*
+	 * ...and then to HVA.  The order of accesses isn't architectural, i.e.
+	 * KVM doesn't have to fully process one address at a time.  Exit to
+	 * userspace if a GPA is invalid.
+	 */
+	if (sgx_gpa_to_hva(vcpu, metadata_gpa,
+			   (unsigned long *)&pageinfo.metadata) ||
+	    sgx_gpa_to_hva(vcpu, contents_gpa,
+			   (unsigned long *)&pageinfo.contents) ||
+	    sgx_gpa_to_hva(vcpu, secs_gpa, &secs_hva))
+		return 0;
+
+	/*
+	 * Read out select portions of the input SECS to enforce userspace
+	 * restrictions on MISCSELECT, ATTRIBUTES, etc...  Note, 'contents' is
+	 * page aligned, i.e. no need to worry about page splits.
+	 */
+	m_hva = pageinfo.contents + offsetof(struct sgx_secs, miscselect);
+	a_hva = pageinfo.contents + offsetof(struct sgx_secs, attributes);
+	x_hva = pageinfo.contents + offsetof(struct sgx_secs, xfrm);
+	s_hva = pageinfo.contents + offsetof(struct sgx_secs, size);
+
+	/* Exit to userspace if copying from a host userspace address fails. */
+	if (sgx_read_hva(vcpu, m_hva, &miscselect, sizeof(miscselect)) ||
+	    sgx_read_hva(vcpu, a_hva, &attributes, sizeof(attributes)) ||
+	    sgx_read_hva(vcpu, x_hva, &xfrm, sizeof(xfrm)) ||
+	    sgx_read_hva(vcpu, s_hva, &size, sizeof(size)))
+		return 0;
+
+	/* Enforce restriction of access to the PROVISIONKEY. */
+	if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
+	    (attributes & SGX_ATTR_PROVISIONKEY)) {
+		if (sgx_12_1->eax & SGX_ATTR_PROVISIONKEY)
+			pr_warn_once("KVM: SGX PROVISIONKEY advertised but not allowed\n");
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	/* Enforce CPUID restrictions on MISCSELECT, ATTRIBUTES and XFRM. */
+	if ((u32)miscselect & ~sgx_12_0->ebx ||
+	    (u32)attributes & ~sgx_12_1->eax ||
+	    (u32)(attributes >> 32) & ~sgx_12_1->ebx ||
+	    (u32)xfrm & ~sgx_12_1->ecx ||
+	    (u32)(xfrm >> 32) & ~sgx_12_1->edx) {
+		kvm_inject_gp(vcpu, 0);
+		return 1;
+	}
+
+	/* Enforce CPUID restriction on max enclave size. */
+	max_size_log2 = (attributes & SGX_ATTR_MODE64BIT) ? sgx_12_0->edx >> 8 :
+							    sgx_12_0->edx;
+	if (size >= BIT_ULL(max_size_log2))
+		kvm_inject_gp(vcpu, 0);
+
+	if (sgx_virt_ecreate(&pageinfo, (void __user *)secs_hva, &trapnr))
+		return sgx_inject_fault(vcpu, secs_gva, trapnr);
+
+	return kvm_skip_emulated_instruction(vcpu);
+}
+
 static inline bool encls_leaf_enabled_in_guest(struct kvm_vcpu *vcpu, u32 leaf)
 {
 	if (!enable_sgx || !guest_cpuid_has(vcpu, X86_FEATURE_SGX))
@@ -42,6 +283,8 @@ int handle_encls(struct kvm_vcpu *vcpu)
 	} else if (!sgx_enabled_in_guest_bios(vcpu)) {
 		kvm_inject_gp(vcpu, 0);
 	} else {
+		if (leaf == ECREATE)
+			return handle_encls_ecreate(vcpu);
 		WARN(1, "KVM: unexpected exit on ENCLS[%u]", leaf);
 		vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
 		vcpu->run->hw.hardware_exit_reason = EXIT_REASON_ENCLS;
-- 
2.29.2


^ permalink raw reply	[flat|nested] 156+ messages in thread

* [RFC PATCH v3 24/27] KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (23 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 25/27] KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC) Kai Huang
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Emulate the four Launch Enclave public key hash MSRs (LE hash MSRs) that
exist on CPUs that support SGX Launch Control (LC).  SGX LC modifies the
behavior of ENCLS[EINIT] to use the LE hash MSRs when verifying the key
used to sign an enclave.  On CPUs without LC support, the LE hash is
hardwired into the CPU to an Intel controlled key (the Intel key is also
the reset value of the LE hash MSRs). Track the guest's desired hash so
that a future patch can stuff the hash into the hardware MSRs when
executing EINIT on behalf of the guest, when those MSRs are writable in
the host.

Note, KVM allows writes to the LE hash MSRs if IA32_FEATURE_CONTROL is
unlocked.  This is technically not architectural behavior, but it's
roughly equivalent to the arch behavior of the MSRs being writable prior
to activating SGX[1].  Emulating SGX activation is feasible, but adds no
tangible benefits and would just create extra work for KVM and guest
firmware.

[1] SGX related bits in IA32_FEATURE_CONTROL cannot be set until SGX
    is activated, e.g. by firmware.  SGX activation is triggered by
    setting bit 0 in MSR 0x7a.  Until SGX is activated, the LE hash
    MSRs are writable, e.g. to allow firmware to lock down the LE
    root key with a non-Intel value.
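
The write-permission rule described above can be sketched in isolation as
follows.  This is only an illustration, not the code in this patch: the
helper name le_hash_msr_write_allowed() and its boolean parameters are
hypothetical, and the FEAT_CTL_* bit positions are assumed to match the
kernel's definitions for IA32_FEATURE_CONTROL.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed IA32_FEATURE_CONTROL bit positions (lock bit, SGX LC enable). */
#define FEAT_CTL_LOCKED         (1ULL << 0)
#define FEAT_CTL_SGX_LC_ENABLED (1ULL << 17)

/*
 * Hypothetical helper: decide whether a write to one of the virtual LE hash
 * MSRs is accepted.  Host-initiated (userspace) writes always succeed.  Guest
 * writes require SGX_LC in guest CPUID and are rejected once FEATURE_CONTROL
 * is locked without the LC enable bit set.
 */
static inline bool le_hash_msr_write_allowed(bool host_initiated,
					     bool guest_has_sgx_lc,
					     uint64_t feat_ctl)
{
	if (host_initiated)
		return true;

	if (!guest_has_sgx_lc)
		return false;

	/* Locked with LC disabled: the MSRs are read-only to the guest. */
	if ((feat_ctl & FEAT_CTL_LOCKED) &&
	    !(feat_ctl & FEAT_CTL_SGX_LC_ENABLED))
		return false;

	/* Unlocked, or locked with LC enabled: writable. */
	return true;
}
```

A rejected write corresponds to the "return 1" path in vmx_set_msr(),
which makes the guest's WRMSR fault.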

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/vmx/sgx.c | 35 +++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/sgx.h |  6 ++++++
 arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h |  2 ++
 4 files changed, 63 insertions(+)

diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 4281045318ac..6ad6a24c4e93 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -12,6 +12,9 @@
 
 bool __read_mostly enable_sgx;
 
+/* Initial value of guest's virtual SGX_LEPUBKEYHASHn MSRs */
+static u64 sgx_pubkey_hash[4] __ro_after_init;
+
 /*
  * ENCLS's memory operands use a fixed segment (DS) and a fixed
  * address size based on the mode.  Related prefixes are ignored.
@@ -292,3 +295,35 @@ int handle_encls(struct kvm_vcpu *vcpu)
 	}
 	return 1;
 }
+
+void setup_default_sgx_lepubkeyhash(void)
+{
+	/*
+	 * Use Intel's default value for Skylake hardware if Launch Control is
+	 * not supported, i.e. Intel's hash is hardcoded into silicon, or if
+	 * Launch Control is supported and enabled, i.e. mimic the reset value
+	 * and let the guest write the MSRs at will.  If Launch Control is
+	 * supported but disabled, then use the current MSR values as the hash
+	 * MSRs exist but are read-only (locked and not writable).
+	 */
+	if (!enable_sgx || !boot_cpu_has(X86_FEATURE_SGX_LC) ||
+	    rdmsrl_safe(MSR_IA32_SGXLEPUBKEYHASH0, &sgx_pubkey_hash[0])) {
+		sgx_pubkey_hash[0] = 0xa6053e051270b7acULL;
+		sgx_pubkey_hash[1] = 0x6cfbe8ba8b3b413dULL;
+		sgx_pubkey_hash[2] = 0xc4916d99f2b3735dULL;
+		sgx_pubkey_hash[3] = 0xd4f8c05909f9bb3bULL;
+	} else {
+		/* MSR_IA32_SGXLEPUBKEYHASH0 is read above */
+		rdmsrl(MSR_IA32_SGXLEPUBKEYHASH1, sgx_pubkey_hash[1]);
+		rdmsrl(MSR_IA32_SGXLEPUBKEYHASH2, sgx_pubkey_hash[2]);
+		rdmsrl(MSR_IA32_SGXLEPUBKEYHASH3, sgx_pubkey_hash[3]);
+	}
+}
+
+void vcpu_setup_sgx_lepubkeyhash(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	memcpy(vmx->msr_ia32_sgxlepubkeyhash, sgx_pubkey_hash,
+	       sizeof(sgx_pubkey_hash));
+}
diff --git a/arch/x86/kvm/vmx/sgx.h b/arch/x86/kvm/vmx/sgx.h
index 6e17ecd4aca3..6502fa52c7e9 100644
--- a/arch/x86/kvm/vmx/sgx.h
+++ b/arch/x86/kvm/vmx/sgx.h
@@ -8,8 +8,14 @@
 extern bool __read_mostly enable_sgx;
 
 int handle_encls(struct kvm_vcpu *vcpu);
+
+void setup_default_sgx_lepubkeyhash(void);
+void vcpu_setup_sgx_lepubkeyhash(struct kvm_vcpu *vcpu);
 #else
 #define enable_sgx 0
+
+static inline void setup_default_sgx_lepubkeyhash(void) { }
+static inline void vcpu_setup_sgx_lepubkeyhash(struct kvm_vcpu *vcpu) { }
 #endif
 
 #endif /* __KVM_X86_SGX_H */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dbe585329842..349585f63c4d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1888,6 +1888,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_FEAT_CTL:
 		msr_info->data = vmx->msr_ia32_feature_control;
 		break;
+	case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
+		if (!msr_info->host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
+			return 1;
+		msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
+			[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
+		break;
 	case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
 		if (!nested_vmx_allowed(vcpu))
 			return 1;
@@ -2154,6 +2161,15 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if (msr_info->host_initiated && data == 0)
 			vmx_leave_nested(vcpu);
 		break;
+	case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
+		if (!msr_info->host_initiated &&
+		    (!guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC) ||
+		    ((vmx->msr_ia32_feature_control & FEAT_CTL_LOCKED) &&
+		    !(vmx->msr_ia32_feature_control & FEAT_CTL_SGX_LC_ENABLED))))
+			return 1;
+		vmx->msr_ia32_sgxlepubkeyhash
+			[msr_index - MSR_IA32_SGXLEPUBKEYHASH0] = data;
+		break;
 	case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
 		if (!msr_info->host_initiated)
 			return 1; /* they are read-only */
@@ -6957,6 +6973,8 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 	else
 		memset(&vmx->nested.msrs, 0, sizeof(vmx->nested.msrs));
 
+	vcpu_setup_sgx_lepubkeyhash(vcpu);
+
 	vmx->nested.posted_intr_nv = -1;
 	vmx->nested.current_vmptr = -1ull;
 
@@ -7907,6 +7925,8 @@ static __init int hardware_setup(void)
 	if (!enable_ept || !cpu_has_vmx_intel_pt())
 		pt_mode = PT_MODE_SYSTEM;
 
+	setup_default_sgx_lepubkeyhash();
+
 	if (nested) {
 		nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
 					   vmx_capability.ept);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 903f246b5abd..af4bced6c84b 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -299,6 +299,8 @@ struct vcpu_vmx {
 	 */
 	u64 msr_ia32_feature_control;
 	u64 msr_ia32_feature_control_valid_bits;
+	/* SGX Launch Control public key hash */
+	u64 msr_ia32_sgxlepubkeyhash[4];
 	u64 ept_pointer;
 
 	struct pt_desc pt_desc;
-- 
2.29.2


^ permalink raw reply	[flat|nested] 156+ messages in thread

* [RFC PATCH v3 25/27] KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC)
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (24 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 24/27] KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:31 ` [RFC PATCH v3 26/27] KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC Kai Huang
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add a VM-Exit handler to trap-and-execute EINIT when SGX LC is enabled
in the host.  When SGX LC is enabled, the host kernel may rewrite the
hardware values at will, e.g. to launch enclaves with different signers,
thus KVM needs to intercept EINIT to ensure it is executed with the
correct LE hash (even if the guest sees a hardwired hash).

Switching the LE hash MSRs on VM-Enter/VM-Exit is not a viable option as
writing the MSRs is prohibitively expensive, e.g. on SKL hardware each
WRMSR is ~400 cycles.  And because EINIT takes tens of thousands of
cycles to execute, the ~1500 cycle overhead to trap-and-execute EINIT is
unlikely to be noticed by the guest, let alone impact its overall SGX
performance.
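
Once EINIT has been executed on the guest's behalf, its result has to be
reflected in the guest's RFLAGS and RAX.  The sketch below shows just the
flag arithmetic as a standalone function; einit_result_rflags() is a
hypothetical name, the EFLAGS bit positions are the architectural ones,
and the -EFAULT case (which injects a fault instead) is deliberately left
out.

```c
#include <stdint.h>

/* Architectural EFLAGS bit positions. */
#define X86_EFLAGS_CF (1ULL << 0)
#define X86_EFLAGS_PF (1ULL << 2)
#define X86_EFLAGS_AF (1ULL << 4)
#define X86_EFLAGS_ZF (1ULL << 6)
#define X86_EFLAGS_SF (1ULL << 7)
#define X86_EFLAGS_OF (1ULL << 11)

/*
 * Hypothetical helper: compute the guest RFLAGS after an emulated EINIT.
 * CF, PF, AF, SF and OF are always cleared; ZF is set if and only if EINIT
 * returned a non-zero error code (which is also what lands in RAX).
 */
static inline uint64_t einit_result_rflags(uint64_t rflags, int ret)
{
	rflags &= ~(X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF |
		    X86_EFLAGS_SF | X86_EFLAGS_OF);

	if (ret)
		rflags |= X86_EFLAGS_ZF;
	else
		rflags &= ~X86_EFLAGS_ZF;

	return rflags;
}
```

Unrelated bits such as IF are passed through untouched, so the guest
observes the same flag state it would after a native EINIT.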

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/vmx/sgx.c | 55 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 6ad6a24c4e93..979d0597e4ac 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -256,6 +256,59 @@ static int handle_encls_ecreate(struct kvm_vcpu *vcpu)
 	return kvm_skip_emulated_instruction(vcpu);
 }
 
+static int handle_encls_einit(struct kvm_vcpu *vcpu)
+{
+	unsigned long sig_hva, secs_hva, token_hva, rflags;
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+	gva_t sig_gva, secs_gva, token_gva;
+	gpa_t sig_gpa, secs_gpa, token_gpa;
+	int ret, trapnr;
+
+	if (sgx_get_encls_gva(vcpu, kvm_rbx_read(vcpu), 1808, 4096, &sig_gva) ||
+	    sgx_get_encls_gva(vcpu, kvm_rcx_read(vcpu), 4096, 4096, &secs_gva) ||
+	    sgx_get_encls_gva(vcpu, kvm_rdx_read(vcpu), 304, 512, &token_gva))
+		return 1;
+
+	/*
+	 * Translate the SIGSTRUCT, SECS and TOKEN pointers from GVA to GPA.
+	 * Resume the guest on failure to inject a #PF.
+	 */
+	if (sgx_gva_to_gpa(vcpu, sig_gva, false, &sig_gpa) ||
+	    sgx_gva_to_gpa(vcpu, secs_gva, true, &secs_gpa) ||
+	    sgx_gva_to_gpa(vcpu, token_gva, false, &token_gpa))
+		return 1;
+
+	/*
+	 * ...and then to HVA.  The order of accesses isn't architectural, i.e.
+	 * KVM doesn't have to fully process one address at a time.  Exit to
+	 * userspace if a GPA is invalid.  Note, all structures are aligned and
+	 * cannot split pages.
+	 */
+	if (sgx_gpa_to_hva(vcpu, sig_gpa, &sig_hva) ||
+	    sgx_gpa_to_hva(vcpu, secs_gpa, &secs_hva) ||
+	    sgx_gpa_to_hva(vcpu, token_gpa, &token_hva))
+		return 0;
+
+	ret = sgx_virt_einit((void __user *)sig_hva, (void __user *)token_hva,
+			     (void __user *)secs_hva,
+			     vmx->msr_ia32_sgxlepubkeyhash, &trapnr);
+
+	if (ret == -EFAULT)
+		return sgx_inject_fault(vcpu, secs_gva, trapnr);
+
+	rflags = vmx_get_rflags(vcpu) & ~(X86_EFLAGS_CF | X86_EFLAGS_PF |
+					  X86_EFLAGS_AF | X86_EFLAGS_SF |
+					  X86_EFLAGS_OF);
+	if (ret)
+		rflags |= X86_EFLAGS_ZF;
+	else
+		rflags &= ~X86_EFLAGS_ZF;
+	vmx_set_rflags(vcpu, rflags);
+
+	kvm_rax_write(vcpu, ret);
+	return kvm_skip_emulated_instruction(vcpu);
+}
+
 static inline bool encls_leaf_enabled_in_guest(struct kvm_vcpu *vcpu, u32 leaf)
 {
 	if (!enable_sgx || !guest_cpuid_has(vcpu, X86_FEATURE_SGX))
@@ -288,6 +341,8 @@ int handle_encls(struct kvm_vcpu *vcpu)
 	} else {
 		if (leaf == ECREATE)
 			return handle_encls_ecreate(vcpu);
+		if (leaf == EINIT)
+			return handle_encls_einit(vcpu);
 		WARN(1, "KVM: unexpected exit on ENCLS[%u]", leaf);
 		vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
 		vcpu->run->hw.hardware_exit_reason = EXIT_REASON_ENCLS;
-- 
2.29.2


^ permalink raw reply	[flat|nested] 156+ messages in thread

* [RFC PATCH v3 26/27] KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (25 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 25/27] KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC) Kai Huang
@ 2021-01-26  9:31 ` Kai Huang
  2021-01-26  9:32 ` [RFC PATCH v3 27/27] KVM: x86: Add capability to grant VM access to privileged SGX attribute Kai Huang
  2021-02-02 22:21 ` [RFC PATCH v3 00/27] KVM SGX virtualization support Edgecombe, Rick P
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:31 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Enable SGX virtualization now that KVM has the VM-Exit handlers needed
to trap-and-execute ENCLS to ensure correctness and/or enforce the CPU
model exposed to the guest.  Add a KVM module param, "sgx", to allow an
admin to disable SGX virtualization independent of the kernel.

When supported in hardware and the kernel, advertise SGX1, SGX2 and SGX
LC to userspace via CPUID and wire up the ENCLS_EXITING bitmap based on
the guest's SGX capabilities, i.e. to allow ENCLS to be executed in an
SGX-enabled guest.  With the exception of the provision key, all SGX
attribute bits may be exposed to the guest.  Guest access to the
provision key, which is controlled via securityfs, will be added in a
future patch.

Note, KVM does not yet support exposing ENCLS_C leafs or ENCLV leafs.
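
As a rough illustration of how the ENCLS_EXITING bitmap follows from the
guest's capabilities, consider the standalone sketch below.  It is not the
function added by this patch: encls_exiting_bitmap() and genmask_ull() are
hypothetical names, the leaf numbers are the architectural ENCLS leaf
encodings, and the nested (vmcs12) contribution is omitted for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural ENCLS leaf numbers used to delimit the SGX1/SGX2 ranges. */
enum { ECREATE = 0x0, EINIT = 0x2, ETRACK = 0xc, EAUG = 0xd, EMODT = 0xf };

/* Set bits low..high inclusive, in the spirit of the kernel's GENMASK_ULL(). */
static inline uint64_t genmask_ull(unsigned int high, unsigned int low)
{
	return (~0ULL >> (63 - high)) & (~0ULL << low);
}

/*
 * Hypothetical helper: a set bit means "exit on this leaf".  Start by
 * trapping everything, then stop trapping the leaf ranges the guest is
 * allowed to execute natively.
 */
static inline uint64_t encls_exiting_bitmap(bool sgx_usable, bool has_sgx1,
					    bool has_sgx2,
					    bool intercept_ecreate,
					    bool host_has_lc)
{
	uint64_t bitmap = ~0ULL;

	/* SGX hidden from the guest or disabled by its BIOS: trap all. */
	if (!sgx_usable)
		return bitmap;

	if (has_sgx1) {
		bitmap &= ~genmask_ull(ETRACK, ECREATE);
		/* Re-trap ECREATE when CPUID restrictions must be enforced. */
		if (intercept_ecreate)
			bitmap |= 1ULL << ECREATE;
	}

	if (has_sgx2)
		bitmap &= ~genmask_ull(EMODT, EAUG);

	/* With host LC, EINIT must run with the guest's LE hash values. */
	if (host_has_lc)
		bitmap |= 1ULL << EINIT;

	return bitmap;
}
```

Starting from all-ones keeps unknown and future leafs trapped by default,
so only leafs that are explicitly enumerated to the guest run without an
exit.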

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 arch/x86/kvm/cpuid.c      | 57 +++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/nested.c | 26 +++++++++++--
 arch/x86/kvm/vmx/nested.h |  5 +++
 arch/x86/kvm/vmx/sgx.c    | 80 ++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/sgx.h    | 13 +++++++
 arch/x86/kvm/vmx/vmcs12.c |  1 +
 arch/x86/kvm/vmx/vmcs12.h |  4 +-
 arch/x86/kvm/vmx/vmx.c    | 38 ++++++++++++++++++-
 8 files changed, 215 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index f8037fab8950..04b2f5de2d7b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -18,6 +18,7 @@
 #include <asm/processor.h>
 #include <asm/user.h>
 #include <asm/fpu/xstate.h>
+#include <asm/sgx_arch.h>
 #include "cpuid.h"
 #include "lapic.h"
 #include "mmu.h"
@@ -171,6 +172,21 @@ static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 		vcpu->arch.guest_supported_xcr0 =
 			(best->eax | ((u64)best->edx << 32)) & supported_xcr0;
 
+	/*
+	 * Bits 127:64 of the allowed SECS.ATTRIBUTES (CPUID.0x12.0x1) enumerate
+	 * the supported XSAVE Feature Request Mask (XFRM), i.e. the enclave's
+	 * requested XCR0 value.  The enclave's XFRM must be a subset of XCR0
+	 * at the time of EENTER, thus adjust the allowed XFRM by the guest's
+	 * supported XCR0.  Similar to XCR0 handling, FP and SSE are forced to
+	 * '1' even on CPUs that don't support XSAVE.
+	 */
+	best = kvm_find_cpuid_entry(vcpu, 0x12, 0x1);
+	if (best) {
+		best->ecx &= vcpu->arch.guest_supported_xcr0 & 0xffffffff;
+		best->edx &= vcpu->arch.guest_supported_xcr0 >> 32;
+		best->ecx |= XFEATURE_MASK_FPSSE;
+	}
+
 	kvm_update_pv_runtime(vcpu);
 
 	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
@@ -413,7 +429,7 @@ void kvm_set_cpu_caps(void)
 	);
 
 	kvm_cpu_cap_mask(CPUID_7_0_EBX,
-		F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
+		F(FSGSBASE) | F(SGX) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
 		F(BMI2) | F(ERMS) | 0 /*INVPCID*/ | F(RTM) | 0 /*MPX*/ | F(RDSEED) |
 		F(ADX) | F(SMAP) | F(AVX512IFMA) | F(AVX512F) | F(AVX512PF) |
 		F(AVX512ER) | F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | F(AVX512DQ) |
@@ -424,7 +440,8 @@ void kvm_set_cpu_caps(void)
 		F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
 		F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
 		F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
+		F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
+		F(SGX_LC)
 	);
 	/* Set LA57 based on hardware capability. */
 	if (cpuid_ecx(7) & F(LA57))
@@ -463,6 +480,10 @@ void kvm_set_cpu_caps(void)
 		F(XSAVEOPT) | F(XSAVEC) | F(XGETBV1) | F(XSAVES)
 	);
 
+	kvm_cpu_cap_init(CPUID_12_EAX,
+		SF(SGX1) | SF(SGX2)
+	);
+
 	kvm_cpu_cap_mask(CPUID_8000_0001_ECX,
 		F(LAHF_LM) | F(CMP_LEGACY) | 0 /*SVM*/ | 0 /* ExtApicSpace */ |
 		F(CR8_LEGACY) | F(ABM) | F(SSE4A) | F(MISALIGNSSE) |
@@ -784,6 +805,38 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 			entry->edx = 0;
 		}
 		break;
+	case 0x12:
+		/* Intel SGX */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SGX)) {
+			entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
+			break;
+		}
+
+		/*
+		 * Index 0: Sub-features, MISCSELECT (a.k.a. extended features)
+		 * and max enclave sizes.  The SGX sub-features and MISCSELECT
+		 * are restricted by kernel and KVM capabilities (like most
+		 * feature flags), while enclave size is unrestricted.
+		 */
+		cpuid_entry_override(entry, CPUID_12_EAX);
+		entry->ebx &= SGX_MISC_EXINFO;
+
+		entry = do_host_cpuid(array, function, 1);
+		if (!entry)
+			goto out;
+
+		/*
+		 * Index 1: SECS.ATTRIBUTES.  ATTRIBUTES are restricted a la
+		 * feature flags.  Advertise all supported flags, including
+		 * privileged attributes that require explicit opt-in from
+		 * userspace.  ATTRIBUTES.XFRM is not adjusted as userspace is
+		 * expected to derive it from supported XCR0.
+		 */
+		entry->eax &= SGX_ATTR_DEBUG | SGX_ATTR_MODE64BIT |
+			      /* PROVISIONKEY | */ SGX_ATTR_EINITTOKENKEY |
+			      SGX_ATTR_KSS;
+		entry->ebx &= 0;
+		break;
 	/* Intel PT */
 	case 0x14:
 		if (!kvm_cpu_cap_has(X86_FEATURE_INTEL_PT)) {
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 562eab7b0a51..fca1f4c8cc5b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -11,6 +11,7 @@
 #include "mmu.h"
 #include "nested.h"
 #include "pmu.h"
+#include "sgx.h"
 #include "trace.h"
 #include "x86.h"
 
@@ -2318,6 +2319,9 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 		if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_UNRESTRICTED_GUEST))
 		    exec_control &= ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
 
+		if (exec_control & SECONDARY_EXEC_ENCLS_EXITING)
+			vmx_write_encls_bitmap(&vmx->vcpu, vmcs12);
+
 		secondary_exec_controls_set(vmx, exec_control);
 	}
 
@@ -5726,6 +5730,20 @@ static bool nested_vmx_exit_handled_cr(struct kvm_vcpu *vcpu,
 	return false;
 }
 
+static bool nested_vmx_exit_handled_encls(struct kvm_vcpu *vcpu,
+					  struct vmcs12 *vmcs12)
+{
+	u32 encls_leaf;
+
+	if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENCLS_EXITING))
+		return false;
+
+	encls_leaf = kvm_rax_read(vcpu);
+	if (encls_leaf > 62)
+		encls_leaf = 63;
+	return vmcs12->encls_exiting_bitmap & BIT_ULL(encls_leaf);
+}
+
 static bool nested_vmx_exit_handled_vmcs_access(struct kvm_vcpu *vcpu,
 	struct vmcs12 *vmcs12, gpa_t bitmap)
 {
@@ -5819,9 +5837,6 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
 	case EXIT_REASON_VMFUNC:
 		/* VM functions are emulated through L2->L0 vmexits. */
 		return true;
-	case EXIT_REASON_ENCLS:
-		/* SGX is never exposed to L1 */
-		return true;
 	default:
 		break;
 	}
@@ -5945,6 +5960,8 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
 	case EXIT_REASON_TPAUSE:
 		return nested_cpu_has2(vmcs12,
 			SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE);
+	case EXIT_REASON_ENCLS:
+		return nested_vmx_exit_handled_encls(vcpu, vmcs12);
 	default:
 		return true;
 	}
@@ -6517,6 +6534,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 		msrs->secondary_ctls_high |=
 			SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
 
+	if (enable_sgx)
+		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
+
 	/* miscellaneous data */
 	rdmsr(MSR_IA32_VMX_MISC,
 		msrs->misc_low,
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index 197148d76b8f..184418baeb3c 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -244,6 +244,11 @@ static inline bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
 		PIN_BASED_EXT_INTR_MASK;
 }
 
+static inline bool nested_cpu_has_encls_exit(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENCLS_EXITING);
+}
+
 /*
  * if fixed0[i] == 1: val[i] must be 1
  * if fixed1[i] == 0: val[i] must be 0
diff --git a/arch/x86/kvm/vmx/sgx.c b/arch/x86/kvm/vmx/sgx.c
index 979d0597e4ac..62c3f3ec960b 100644
--- a/arch/x86/kvm/vmx/sgx.c
+++ b/arch/x86/kvm/vmx/sgx.c
@@ -6,11 +6,13 @@
 
 #include "cpuid.h"
 #include "kvm_cache_regs.h"
+#include "nested.h"
 #include "sgx.h"
 #include "vmx.h"
 #include "x86.h"
 
-bool __read_mostly enable_sgx;
+bool __read_mostly enable_sgx = 1;
+module_param_named(sgx, enable_sgx, bool, 0444);
 
 /* Initial value of guest's virtual SGX_LEPUBKEYHASHn MSRs */
 static u64 sgx_pubkey_hash[4] __ro_after_init;
@@ -382,3 +384,79 @@ void vcpu_setup_sgx_lepubkeyhash(struct kvm_vcpu *vcpu)
 	memcpy(vmx->msr_ia32_sgxlepubkeyhash, sgx_pubkey_hash,
 	       sizeof(sgx_pubkey_hash));
 }
+
+/*
+ * ECREATE must be intercepted to enforce MISCSELECT, ATTRIBUTES and XFRM
+ * restrictions if the guest's allowed-1 settings diverge from hardware.
+ */
+static bool sgx_intercept_encls_ecreate(struct kvm_vcpu *vcpu)
+{
+	struct kvm_cpuid_entry2 *guest_cpuid;
+	u32 eax, ebx, ecx, edx;
+
+	if (!vcpu->kvm->arch.sgx_provisioning_allowed)
+		return true;
+
+	guest_cpuid = kvm_find_cpuid_entry(vcpu, 0x12, 0);
+	if (!guest_cpuid)
+		return true;
+
+	cpuid_count(0x12, 0, &eax, &ebx, &ecx, &edx);
+	if (guest_cpuid->ebx != ebx || guest_cpuid->edx != edx)
+		return true;
+
+	guest_cpuid = kvm_find_cpuid_entry(vcpu, 0x12, 1);
+	if (!guest_cpuid)
+		return true;
+
+	cpuid_count(0x12, 1, &eax, &ebx, &ecx, &edx);
+	if (guest_cpuid->eax != eax || guest_cpuid->ebx != ebx ||
+	    guest_cpuid->ecx != ecx || guest_cpuid->edx != edx)
+		return true;
+
+	return false;
+}
+
+void vmx_write_encls_bitmap(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
+{
+	/*
+	 * There is no software enable bit for SGX that is virtualized by
+	 * hardware, e.g. there's no CR4.SGXE, so when SGX is disabled in the
+	 * guest (either by the host or by the guest's BIOS) but enabled in the
+	 * host, trap all ENCLS leafs and inject #UD/#GP as needed to emulate
+	 * the expected system behavior for ENCLS.
+	 */
+	u64 bitmap = -1ull;
+
+	/* Nothing to do if hardware doesn't support SGX */
+	if (!cpu_has_vmx_encls_vmexit())
+		return;
+
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX) &&
+	    sgx_enabled_in_guest_bios(vcpu)) {
+		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX1)) {
+			bitmap &= ~GENMASK_ULL(ETRACK, ECREATE);
+			if (sgx_intercept_encls_ecreate(vcpu))
+				bitmap |= (1 << ECREATE);
+		}
+
+		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX2))
+			bitmap &= ~GENMASK_ULL(EMODT, EAUG);
+
+		/*
+		 * Trap and execute EINIT if launch control is enabled in the
+		 * host using the guest's values for launch control MSRs, even
+		 * if the guest's values are fixed to hardware default values.
+		 * The MSRs are not loaded/saved on VM-Enter/VM-Exit as writing
+		 * the MSRs is extraordinarily expensive.
+		 */
+		if (boot_cpu_has(X86_FEATURE_SGX_LC))
+			bitmap |= (1 << EINIT);
+
+		if (!vmcs12 && is_guest_mode(vcpu))
+			vmcs12 = get_vmcs12(vcpu);
+		if (vmcs12 && nested_cpu_has_encls_exit(vmcs12))
+			bitmap |= vmcs12->encls_exiting_bitmap;
+	}
+	vmcs_write64(ENCLS_EXITING_BITMAP, bitmap);
+}
diff --git a/arch/x86/kvm/vmx/sgx.h b/arch/x86/kvm/vmx/sgx.h
index 6502fa52c7e9..a400888b376d 100644
--- a/arch/x86/kvm/vmx/sgx.h
+++ b/arch/x86/kvm/vmx/sgx.h
@@ -4,6 +4,9 @@
 
 #include <linux/kvm_host.h>
 
+#include "capabilities.h"
+#include "vmx_ops.h"
+
 #ifdef CONFIG_X86_SGX_KVM
 extern bool __read_mostly enable_sgx;
 
@@ -11,11 +14,21 @@ int handle_encls(struct kvm_vcpu *vcpu);
 
 void setup_default_sgx_lepubkeyhash(void);
 void vcpu_setup_sgx_lepubkeyhash(struct kvm_vcpu *vcpu);
+
+void vmx_write_encls_bitmap(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12);
 #else
 #define enable_sgx 0
 
 static inline void setup_default_sgx_lepubkeyhash(void) { }
 static inline void vcpu_setup_sgx_lepubkeyhash(struct kvm_vcpu *vcpu) { }
+
+static inline void vmx_write_encls_bitmap(struct kvm_vcpu *vcpu,
+					  struct vmcs12 *vmcs12)
+{
+	/* Nothing to do if hardware doesn't support SGX */
+	if (cpu_has_vmx_encls_vmexit())
+		vmcs_write64(ENCLS_EXITING_BITMAP, -1ull);
+}
 #endif
 
 #endif /* __KVM_X86_SGX_H */
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index c8e51c004f78..034adb6404dc 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -50,6 +50,7 @@ const unsigned short vmcs_field_to_offset_table[] = {
 	FIELD64(VMREAD_BITMAP, vmread_bitmap),
 	FIELD64(VMWRITE_BITMAP, vmwrite_bitmap),
 	FIELD64(XSS_EXIT_BITMAP, xss_exit_bitmap),
+	FIELD64(ENCLS_EXITING_BITMAP, encls_exiting_bitmap),
 	FIELD64(GUEST_PHYSICAL_ADDRESS, guest_physical_address),
 	FIELD64(VMCS_LINK_POINTER, vmcs_link_pointer),
 	FIELD64(GUEST_IA32_DEBUGCTL, guest_ia32_debugctl),
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 80232daf00ff..13494956d0e9 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -69,7 +69,8 @@ struct __packed vmcs12 {
 	u64 vm_function_control;
 	u64 eptp_list_address;
 	u64 pml_address;
-	u64 padding64[3]; /* room for future expansion */
+	u64 encls_exiting_bitmap;
+	u64 padding64[2]; /* room for future expansion */
 	/*
 	 * To allow migration of L1 (complete with its L2 guests) between
 	 * machines of different natural widths (32 or 64 bit), we cannot have
@@ -256,6 +257,7 @@ static inline void vmx_check_vmcs12_offsets(void)
 	CHECK_OFFSET(vm_function_control, 296);
 	CHECK_OFFSET(eptp_list_address, 304);
 	CHECK_OFFSET(pml_address, 312);
+	CHECK_OFFSET(encls_exiting_bitmap, 320);
 	CHECK_OFFSET(cr0_guest_host_mask, 344);
 	CHECK_OFFSET(cr4_guest_host_mask, 352);
 	CHECK_OFFSET(cr0_read_shadow, 360);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 349585f63c4d..9a2293a39e37 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2160,6 +2160,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		vmx->msr_ia32_feature_control = data;
 		if (msr_info->host_initiated && data == 0)
 			vmx_leave_nested(vcpu);
+
+		/* SGX may be enabled/disabled by guest's firmware */
+		vmx_write_encls_bitmap(vcpu, NULL);
 		break;
 	case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
 		if (!msr_info->host_initiated &&
@@ -4317,6 +4320,15 @@ static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
 	vmx_adjust_sec_exec_control(vmx, &exec_control, waitpkg, WAITPKG,
 				    ENABLE_USR_WAIT_PAUSE, false);
 
+	if (cpu_has_vmx_encls_vmexit() && nested) {
+		if (guest_cpuid_has(vcpu, X86_FEATURE_SGX))
+			vmx->nested.msrs.secondary_ctls_high |=
+				SECONDARY_EXEC_ENCLS_EXITING;
+		else
+			vmx->nested.msrs.secondary_ctls_high &=
+				~SECONDARY_EXEC_ENCLS_EXITING;
+	}
+
 	vmx->secondary_exec_control = exec_control;
 }
 
@@ -4416,8 +4428,7 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 		vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
 	}
 
-	if (cpu_has_vmx_encls_vmexit())
-		vmcs_write64(ENCLS_EXITING_BITMAP, -1ull);
+	vmx_write_encls_bitmap(&vmx->vcpu, NULL);
 
 	if (vmx_pt_mode_is_host_guest()) {
 		memset(&vmx->pt_desc, 0, sizeof(vmx->pt_desc));
@@ -7301,6 +7312,22 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	set_cr4_guest_host_mask(vmx);
 
+	vmx_write_encls_bitmap(vcpu, NULL);
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX))
+		vmx->msr_ia32_feature_control_valid_bits |= FEAT_CTL_SGX_ENABLED;
+	else
+		vmx->msr_ia32_feature_control_valid_bits &= ~FEAT_CTL_SGX_ENABLED;
+	/*
+	 * Only allow the guest to write its virtual SGX_LEPUBKEYHASHn MSRs when
+	 * the host's MSRs are writable; otherwise the writes are meaningless.
+	 */
+	if (guest_cpuid_has(vcpu, X86_FEATURE_SGX_LC))
+		vmx->msr_ia32_feature_control_valid_bits |=
+			FEAT_CTL_SGX_LC_ENABLED;
+	else
+		vmx->msr_ia32_feature_control_valid_bits &=
+			~FEAT_CTL_SGX_LC_ENABLED;
+
 	/* Refresh #PF interception to account for MAXPHYADDR changes. */
 	update_exception_bitmap(vcpu);
 }
@@ -7321,6 +7348,13 @@ static __init void vmx_set_cpu_caps(void)
 	if (vmx_pt_mode_is_host_guest())
 		kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
 
+	if (!enable_sgx) {
+		kvm_cpu_cap_clear(X86_FEATURE_SGX);
+		kvm_cpu_cap_clear(X86_FEATURE_SGX_LC);
+		kvm_cpu_cap_clear(X86_FEATURE_SGX1);
+		kvm_cpu_cap_clear(X86_FEATURE_SGX2);
+	}
+
 	if (vmx_umip_emulated())
 		kvm_cpu_cap_set(X86_FEATURE_UMIP);
 
-- 
2.29.2


^ permalink raw reply	[flat|nested] 156+ messages in thread

* [RFC PATCH v3 27/27] KVM: x86: Add capability to grant VM access to privileged SGX attribute
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (26 preceding siblings ...)
  2021-01-26  9:31 ` [RFC PATCH v3 26/27] KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC Kai Huang
@ 2021-01-26  9:32 ` Kai Huang
  2021-02-02 22:21 ` [RFC PATCH v3 00/27] KVM SGX virtualization support Edgecombe, Rick P
  28 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26  9:32 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jmattson, joro, vkuznets, wanpengli, corbet,
	Andy Lutomirski, Kai Huang

From: Sean Christopherson <sean.j.christopherson@intel.com>

Add a capability, KVM_CAP_SGX_ATTRIBUTE, that can be used by userspace
to grant a VM access to a privileged attribute, with args[0] holding a
file handle to a valid SGX attribute file.

The SGX subsystem restricts access to a subset of enclave attributes to
provide additional security for an uncompromised kernel, e.g. to prevent
malware from using the PROVISIONKEY to ensure its nodes are running
inside a genuine SGX enclave and/or to obtain a stable fingerprint.

To prevent userspace from circumventing such restrictions by running an
enclave in a VM, KVM restricts guest access to privileged attributes by
default.
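
The default-deny behavior can be illustrated with a small standalone
check modeled on the ECREATE handler earlier in this series.  The helper
name ecreate_check_attributes() and the verdict enum are hypothetical;
SGX_ATTR_PROVISIONKEY is assumed to be bit 4 of SECS.ATTRIBUTES, and
allowed_lo/allowed_hi stand in for the guest's CPUID.0x12.0x1 EAX/EBX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of PROVISIONKEY in SECS.ATTRIBUTES. */
#define SGX_ATTR_PROVISIONKEY (1ULL << 4)

enum ecreate_verdict { ECREATE_ALLOW, ECREATE_INJECT_GP };

/*
 * Hypothetical helper: reject an enclave whose requested ATTRIBUTES include
 * PROVISIONKEY unless the VM was explicitly granted access, then reject any
 * attribute bit that is not enumerated in the guest's allowed-1 masks.
 */
static inline enum ecreate_verdict
ecreate_check_attributes(uint64_t attributes, uint32_t allowed_lo,
			 uint32_t allowed_hi, bool provisioning_allowed)
{
	if (!provisioning_allowed && (attributes & SGX_ATTR_PROVISIONKEY))
		return ECREATE_INJECT_GP;

	if (((uint32_t)attributes & ~allowed_lo) ||
	    ((uint32_t)(attributes >> 32) & ~allowed_hi))
		return ECREATE_INJECT_GP;

	return ECREATE_ALLOW;
}
```

In this sketch, enabling the capability only flips provisioning_allowed
for the VM; the CPUID-based mask check still applies afterwards.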

Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---
 Documentation/virt/kvm/api.rst | 23 +++++++++++++++++++++++
 arch/x86/kvm/cpuid.c           |  2 +-
 arch/x86/kvm/x86.c             | 22 ++++++++++++++++++++++
 include/uapi/linux/kvm.h       |  1 +
 4 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index c136e254b496..47c7c7c33025 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6037,6 +6037,29 @@ KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications which user space
 can then handle to implement model specific MSR handling and/or user notifications
 to inform a user that an MSR was not handled.
 
+7.22 KVM_CAP_SGX_ATTRIBUTE
+--------------------------
+
+:Architectures: x86
+:Target: VM
+:Parameters: args[0] is a file handle of an SGX attribute file in securityfs
+:Returns: 0 on success, -EINVAL if the file handle is invalid or if a requested
+          attribute is not supported by KVM.
+
+KVM_CAP_SGX_ATTRIBUTE enables a userspace VMM to grant a VM access to one or
+more privileged enclave attributes.  args[0] must hold a file handle to a valid
+SGX attribute file corresponding to an attribute that is supported/restricted
+by KVM (currently only PROVISIONKEY).
+
+The SGX subsystem restricts access to a subset of enclave attributes to provide
+additional security for an uncompromised kernel, e.g. use of the PROVISIONKEY
+is restricted to deter malware from using the PROVISIONKEY to obtain a stable
+system fingerprint.  To prevent userspace from circumventing such restrictions
+by running an enclave in a VM, KVM prevents access to privileged attributes by
+default.
+
+See Documentation/x86/sgx/2.Kernel-internals.rst for more details.
+
 8. Other capabilities.
 ======================
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 04b2f5de2d7b..ad00a1af1545 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -833,7 +833,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		 * expected to derive it from supported XCR0.
 		 */
 		entry->eax &= SGX_ATTR_DEBUG | SGX_ATTR_MODE64BIT |
-			      /* PROVISIONKEY | */ SGX_ATTR_EINITTOKENKEY |
+			      SGX_ATTR_PROVISIONKEY | SGX_ATTR_EINITTOKENKEY |
 			      SGX_ATTR_KSS;
 		entry->ebx &= 0;
 		break;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5ca7b181a3ae..3d1b4113a57b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -74,6 +74,8 @@
 #include <asm/tlbflush.h>
 #include <asm/intel_pt.h>
 #include <asm/emulate_prefix.h>
+#include <asm/sgx.h>
+#include <asm/sgx_arch.h>
 #include <clocksource/hyperv_timer.h>
 
 #define CREATE_TRACE_POINTS
@@ -3767,6 +3769,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_X86_USER_SPACE_MSR:
 	case KVM_CAP_X86_MSR_FILTER:
 	case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
+#ifdef CONFIG_X86_SGX_KVM
+	case KVM_CAP_SGX_ATTRIBUTE:
+#endif
 		r = 1;
 		break;
 	case KVM_CAP_SYNC_REGS:
@@ -5295,6 +5300,23 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		kvm->arch.user_space_msr_mask = cap->args[0];
 		r = 0;
 		break;
+#ifdef CONFIG_X86_SGX_KVM
+	case KVM_CAP_SGX_ATTRIBUTE: {
+		unsigned long allowed_attributes = 0;
+
+		r = sgx_set_attribute(&allowed_attributes, cap->args[0]);
+		if (r)
+			break;
+
+		/* KVM only supports the PROVISIONKEY privileged attribute. */
+		if ((allowed_attributes & SGX_ATTR_PROVISIONKEY) &&
+		    !(allowed_attributes & ~SGX_ATTR_PROVISIONKEY))
+			kvm->arch.sgx_provisioning_allowed = true;
+		else
+			r = -EINVAL;
+		break;
+	}
+#endif
 	default:
 		r = -EINVAL;
 		break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 374c67875cdb..e17bda18a9b4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1058,6 +1058,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 190
 #define KVM_CAP_SYS_HYPERV_CPUID 191
 #define KVM_CAP_DIRTY_LOG_RING 192
+#define KVM_CAP_SGX_ATTRIBUTE 200
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.29.2


^ permalink raw reply	[flat|nested] 156+ messages in thread

* [RFC PATCH v3 00/27] KVM SGX virtualization support
@ 2021-01-26 10:10 Kai Huang
  2021-01-26  9:29 ` Kai Huang
                   ` (28 more replies)
  0 siblings, 29 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26 10:10 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa, jethro, b.thiel, jmattson, joro, vkuznets,
	wanpengli, corbet

--- Disclaimer ---

These patches were originally written by Sean Christopherson while at Intel.
Now that Sean has left Intel, I (Kai) have taken over getting them upstream.
This series needs more review before it can be merged.  It is being posted
publicly and under RFC so Sean and others can review it. Maintainers are safe
ignoring it for now.

------------------

Hi all,

This series adds KVM SGX virtualization support. The first 15 patches, whose
subjects start with x86/sgx or x86/cpu, are the changes to x86 and the SGX
core/driver needed to support KVM SGX virtualization, while the rest are
patches to the KVM subsystem.

Please help to review this series. Any feedback is highly appreciated.
Please let me know if I forgot to CC anyone, or anyone wants to be removed from
CC. Thanks in advance!

This series is based on tip/x86/sgx. You can also get the code from the
upstream branch of the kvm-sgx repo on GitHub:

        https://github.com/intel/kvm-sgx.git upstream

QEMU changes are also required to create a VM with SGX support. You can find
the QEMU repo here:

	https://github.com/intel/qemu-sgx.git upstream

Please refer to README.md of the above qemu-sgx repo for details on how to
create a guest with SGX support. In the meantime, for quick reference, you can
use the command below to create an SGX guest:

	#qemu-system-x86_64 -smp 4 -m 2G -drive file=<your_vm_image>,if=virtio \
		-cpu host,+sgx_provisionkey \
		-sgx-epc id=epc1,memdev=mem1 \
		-object memory-backend-epc,id=mem1,size=64M,prealloc

Please note that the SGX relevant part is:

		-cpu host,+sgx_provisionkey \
		-sgx-epc id=epc1,memdev=mem1 \
		-object memory-backend-epc,id=mem1,size=64M,prealloc

And you can change other parameters of your qemu command based on your needs.

=========
Changelog:

(Changelog here is for global changes. Please see each patch's changelog for
 changes made to specific patch.)

v2->v3:

 - Split the original "x86/cpufeatures: Add SGX1 and SGX2 sub-features" patch
   into two patches, by moving the logic that adds the SGX_LC bit to the
   cpuid-deps table into a separate patch 2:
       [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
       [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
 - Changed from /dev/sgx_virt_epc to /dev/sgx_vepc, per Jarkko. And accordingly,
   changed prefix 'sgx_virt_epc_xx' to 'sgx_vepc_xx' in various functions and
   structures.
 - Changed CONFIG_X86_SGX_VIRTUALIZATION to CONFIG_X86_SGX_KVM, per Dave. A
   couple of x86 patches and KVM patches changed as well due to the renaming.

v1->v2:

 - Refined this cover letter by addressing comments from Dave and Jarkko.
 - The original patch which introduced the new X86_FEATURE_SGX1/SGX2 was
   replaced by 3 new patches from Sean, following Boris and Sean's discussion.
       [RFC PATCH v2 01/26] x86/cpufeatures: Add SGX1 and SGX2 sub-features
       [RFC PATCH v2 18/26] KVM: x86: Add support for reverse CPUID lookup of scattered features
       [RFC PATCH v2 19/26] KVM: x86: Add reverse-CPUID lookup support for scattered SGX features
 - The original patch 1
       x86/sgx: Split out adding EPC page to free list to separate helper
   was replaced with 2 new patches from Jarkko
       [RFC PATCH v2 02/26] x86/sgx: Remove a warn from sgx_free_epc_page()
       [RFC PATCH v2 03/26] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
   addressing Jarkko's comments.
 - Moved modifying sgx_init() to always initialize sgx_virt_epc_init() out of
   patch
       x86/sgx: Introduce virtual EPC for use by KVM guests
   to a separate patch:
       [RFC PATCH v2 07/26] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
   to address Dave's comment that the patch ordering can be improved: before
   the patch "Allow SGX virtualization without Launch Control support", all
   of SGX, including SGX virtualization, is actually disabled when SGX LC is
   not present.

=========
KVM SGX virtualization Overview

- Virtual EPC

SGX enclave memory is special and is reserved specifically for enclave use.
In bare-metal SGX enclaves, the kernel allocates enclave pages, copies data
into the pages with privileged instructions, then allows the enclave to start.
In this scenario, only initialized pages already assigned to an enclave are
mapped to userspace.

In virtualized environments, the hypervisor still needs to do the physical
enclave page allocation.  The guest kernel is responsible for the data copying
(among other things).  This means that the job of starting an enclave is now
split between hypervisor and guest.

This series introduces a new misc device: /dev/sgx_vepc.  This device allows
the host to map *uninitialized* enclave memory into userspace, which can then
be passed into a guest.

While it might be *possible* to start a host-side enclave with /dev/sgx_enclave
and pass its memory into a guest, it would be wasteful and convoluted.
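
As a rough illustration of the intended usage, the sketch below shows how a
userspace VMM might obtain raw EPC from the new device.  vepc_map() is a
hypothetical helper written for this cover letter, not code from the series or
from QEMU, and the device path is passed in by the caller (normally
/dev/sgx_vepc).

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `size` bytes of raw, uninitialized EPC from the
 * device node at `path`.  Returns the mapping, or NULL on failure.
 */
static void *vepc_map(const char *path, size_t size)
{
	void *epc;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;

	/* MAP_SHARED: the pages come from the device, not anonymous memory. */
	epc = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* The mapping holds its own reference to the file. */
	close(fd);

	return epc == MAP_FAILED ? NULL : epc;
}
```

The returned region would then be handed to the guest like any other memory
backend; how the VMM wires it into the guest physical address space is outside
the scope of this sketch.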

Implement the *raw* EPC allocation in the x86 core-SGX subsystem via
/dev/sgx_vepc rather than in KVM.  Doing so has two major advantages:

  - Does not require changes to KVM's uAPI, e.g. EPC gets handled as
    just another memory backend for guests.

  - EPC management is wholly contained in the SGX subsystem, e.g. SGX
    does not have to export any symbols, changes to reclaim flows don't
    need to be routed through KVM, SGX's dirty laundry doesn't have to
    get aired out for the world to see, and so on and so forth.

The virtual EPC pages allocated to guests are currently not reclaimable.
Reclaiming an EPC page used by an enclave requires a special reclaim mechanism
separate from normal page reclaim, and that mechanism is not supported
for virtual EPC pages.  Due to the complications of handling reclaim
conflicts between guest and host, reclaiming virtual EPC pages is
significantly more complex than basic support for SGX virtualization.

- Support SGX virtualization without SGX Flexible Launch Control

SGX hardware supports two "launch control" modes to limit which enclaves can
run.  In the "locked" mode, the hardware prevents enclaves from running unless
they are blessed by a third party.  In the unlocked mode, the kernel is in
full control of which enclaves can run.  The bare-metal SGX code refuses to
launch enclaves unless it is in the unlocked mode.

The sgx_vepc driver does not have such a restriction.  This allows guests
which are OK with the locked mode to use SGX, even if the host kernel refuses
to.
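
The resulting enable logic can be sketched as a pure function.  This is a
simplified model of the feat_ctl.c change in patch 07 for the case where the
feature control MSR is not yet locked; struct sgx_caps and sgx_feat_ctl_bits()
are names made up for this illustration, while the two bit definitions follow
the kernel's FEAT_CTL_* macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_FEATURE_CONTROL bits, named after the kernel's FEAT_CTL_* macros. */
#define FEAT_CTL_SGX_LC_ENABLED	(1u << 17)
#define FEAT_CTL_SGX_ENABLED	(1u << 18)

/* Hypothetical container for the inputs of the decision. */
struct sgx_caps {
	bool sgx, sgx1, sgx_lc, vmx;			/* CPU features */
	bool cfg_sgx, cfg_sgx_kvm, cfg_kvm_intel;	/* Kconfig options */
};

/* Returns the SGX-related bits the kernel would try to set in the MSR. */
static uint64_t sgx_feat_ctl_bits(const struct sgx_caps *c)
{
	bool vmx_available = c->vmx && c->cfg_kvm_intel;
	bool sgx_any = c->sgx && c->sgx1 && c->cfg_sgx;
	bool sgx_driver = sgx_any && c->sgx_lc;
	bool sgx_kvm = sgx_any && vmx_available && c->cfg_sgx_kvm;
	uint64_t msr = 0;

	if (sgx_kvm || sgx_driver) {
		msr |= FEAT_CTL_SGX_ENABLED;
		/* Launch Control is only enabled when the driver can use it. */
		if (sgx_driver)
			msr |= FEAT_CTL_SGX_LC_ENABLED;
	}

	return msr;
}
```

The interesting case is a CPU without Launch Control: the bare-metal driver
stays off, but SGX itself is still enabled so that KVM can expose it.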

- Support exposing SGX2

For the same reason as above, SGX2 feature detection is added to the core SGX
code to allow KVM to expose SGX2 to guests even though the SGX driver doesn't
currently support SGX2: SGX2 works just fine in a guest without any interaction
with the host SGX driver.

- Restrict SGX guest access to provisioning key

For a guest to make full use of SGX, it needs to be able to access the
provisioning key.  The provisioning key is sensitive, and access to it should
be restricted.  In the bare-metal driver, an enclave is allowed to access the
provisioning key only if its creator is able to open /dev/sgx_provision.

Add a new KVM_CAP_SGX_ATTRIBUTE to the KVM uAPI to extend the above mechanism
to KVM guests as well.  When the userspace hypervisor creates a new VM, the
new cap is only enabled for the VM if the userspace hypervisor is able to open
/dev/sgx_provision, following the same rule as the bare-metal driver.  KVM then
traps ECREATE from the guest, and only allows ECREATE with the provisioning key
bit set to run when KVM_CAP_SGX_ATTRIBUTE has been enabled for the VM.
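
The two halves of this mechanism can be modelled in a few lines.  The sketch
below is a simplified illustration, not the code from the series: struct
vm_sgx_state and both functions are hypothetical names, and the real ECREATE
handler also enforces the guest's CPUID restrictions, which are left out here.
The attribute mask check mirrors the one in the x86.c hunk of the
KVM_CAP_SGX_ATTRIBUTE patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SGX_ATTR_PROVISIONKEY	(1ull << 4)	/* bit 4 of SECS.ATTRIBUTES */

/* Hypothetical per-VM state; the series keeps a similar flag in kvm->arch. */
struct vm_sgx_state {
	bool provisioning_allowed;
};

/* Enable path: accept the grant only if it is PROVISIONKEY and nothing else. */
static int vm_enable_sgx_attribute(struct vm_sgx_state *vm, uint64_t allowed)
{
	if ((allowed & SGX_ATTR_PROVISIONKEY) &&
	    !(allowed & ~SGX_ATTR_PROVISIONKEY)) {
		vm->provisioning_allowed = true;
		return 0;
	}

	return -EINVAL;
}

/* ECREATE path: PROVISIONKEY in the SECS is only permitted after a grant. */
static bool ecreate_attributes_permitted(const struct vm_sgx_state *vm,
					 uint64_t secs_attributes)
{
	if (secs_attributes & SGX_ATTR_PROVISIONKEY)
		return vm->provisioning_allowed;

	return true;
}
```

Enclaves that do not request the provisioning key are unaffected by the cap;
only the PROVISIONKEY bit is gated.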

Jarkko Sakkinen (2):
  x86/sgx: Remove a warn from sgx_free_epc_page()
  x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

Kai Huang (3):
  x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
  x86/sgx: Initialize virtual EPC driver even when SGX driver is
    disabled
  x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs

Sean Christopherson (22):
  x86/cpufeatures: Add SGX1 and SGX2 sub-features
  x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  x86/sgx: Introduce virtual EPC for use by KVM guests
  x86/cpu/intel: Allow SGX virtualization without Launch Control support
  x86/sgx: Expose SGX architectural definitions to the kernel
  x86/sgx: Move ENCLS leaf definitions to sgx_arch.h
  x86/sgx: Add SGX2 ENCLS leaf definitions (EAUG, EMODPR and EMODT)
  x86/sgx: Add encls_faulted() helper
  x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  x86/sgx: Move provisioning device creation out of SGX driver
  KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  KVM: x86: Export kvm_mmu_gva_to_gpa_{read,write}() for SGX (VMX)
  KVM: x86: Define new #PF SGX error code bit
  KVM: x86: Add support for reverse CPUID lookup of scattered features
  KVM: x86: Add reverse-CPUID lookup support for scattered SGX features
  KVM: VMX: Add basic handling of VM-Exit from SGX enclave
  KVM: VMX: Frame in ENCLS handler for SGX virtualization
  KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  KVM: VMX: Add emulation of SGX Launch Control LE hash MSRs
  KVM: VMX: Add ENCLS[EINIT] handler to support SGX Launch Control (LC)
  KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC
  KVM: x86: Add capability to grant VM access to privileged SGX
    attribute

 Documentation/virt/kvm/api.rst                |  23 +
 arch/x86/Kconfig                              |  12 +
 arch/x86/include/asm/cpufeatures.h            |   2 +
 arch/x86/include/asm/kvm_host.h               |   5 +
 arch/x86/include/asm/sgx.h                    |  19 +
 .../cpu/sgx/arch.h => include/asm/sgx_arch.h} |  20 +
 arch/x86/include/asm/vmx.h                    |   1 +
 arch/x86/include/uapi/asm/vmx.h               |   1 +
 arch/x86/kernel/cpu/cpuid-deps.c              |   3 +
 arch/x86/kernel/cpu/feat_ctl.c                |  70 ++-
 arch/x86/kernel/cpu/scattered.c               |   2 +
 arch/x86/kernel/cpu/sgx/Makefile              |   1 +
 arch/x86/kernel/cpu/sgx/driver.c              |  17 -
 arch/x86/kernel/cpu/sgx/encl.c                |  15 +-
 arch/x86/kernel/cpu/sgx/encls.h               |  30 +-
 arch/x86/kernel/cpu/sgx/ioctl.c               |  23 +-
 arch/x86/kernel/cpu/sgx/main.c                |  87 +++-
 arch/x86/kernel/cpu/sgx/sgx.h                 |   4 +-
 arch/x86/kernel/cpu/sgx/virt.c                | 347 +++++++++++++
 arch/x86/kernel/cpu/sgx/virt.h                |  14 +
 arch/x86/kvm/Makefile                         |   2 +
 arch/x86/kvm/cpuid.c                          |  89 +++-
 arch/x86/kvm/cpuid.h                          |  50 +-
 arch/x86/kvm/vmx/nested.c                     |  70 ++-
 arch/x86/kvm/vmx/nested.h                     |   5 +
 arch/x86/kvm/vmx/sgx.c                        | 462 ++++++++++++++++++
 arch/x86/kvm/vmx/sgx.h                        |  34 ++
 arch/x86/kvm/vmx/vmcs12.c                     |   1 +
 arch/x86/kvm/vmx/vmcs12.h                     |   4 +-
 arch/x86/kvm/vmx/vmx.c                        | 171 +++++--
 arch/x86/kvm/vmx/vmx.h                        |  27 +-
 arch/x86/kvm/x86.c                            |  24 +
 include/uapi/linux/kvm.h                      |   1 +
 tools/testing/selftests/sgx/defines.h         |   2 +-
 34 files changed, 1482 insertions(+), 156 deletions(-)
 create mode 100644 arch/x86/include/asm/sgx.h
 rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx_arch.h} (96%)
 create mode 100644 arch/x86/kernel/cpu/sgx/virt.c
 create mode 100644 arch/x86/kernel/cpu/sgx/virt.h
 create mode 100644 arch/x86/kvm/vmx/sgx.c
 create mode 100644 arch/x86/kvm/vmx/sgx.h

-- 
2.29.2


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-01-26  9:30 ` [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features Kai Huang
@ 2021-01-26 15:34   ` Dave Hansen
  2021-01-26 23:18     ` Kai Huang
  2021-01-30 13:11   ` Jarkko Sakkinen
  1 sibling, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 15:34 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 1:30 AM, Kai Huang wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
> features, since adding a new leaf for only two bits would be wasteful.
> As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
> guest, and to do so correctly needs to query hardware and kernel support
> for SGX1 and SGX2.

It's also not _just_ exposing the CPUID leaves.  There are some checks
here when KVM is emulating some SGX instructions too, right?

> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 84b887825f12..18b2d0c8bbbe 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -292,6 +292,8 @@
>  #define X86_FEATURE_FENCE_SWAPGS_KERNEL	(11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
>  #define X86_FEATURE_SPLIT_LOCK_DETECT	(11*32+ 6) /* #AC for split lock */
>  #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
> +#define X86_FEATURE_SGX1		(11*32+ 8) /* Software Guard Extensions sub-feature SGX1 */
> +#define X86_FEATURE_SGX2        	(11*32+ 9) /* Software Guard Extensions sub-feature SGX2 */

FWIW, I'm not sure how valuable it is to spell the SGX acronym out three
times.  Can't we use those bytes to put something more useful in that
comment?

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
  2021-01-26  9:30 ` [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit Kai Huang
@ 2021-01-26 15:35   ` Dave Hansen
  2021-01-30 13:22   ` Jarkko Sakkinen
  1 sibling, 0 replies; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 15:35 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 1:30 AM, Kai Huang wrote:
> Move SGX_LC feature bit to CPUID dependency table as well, along with
> new added SGX1 and SGX2 bit, to make clearing all SGX feature bits
> easier. Also remove clear_sgx_caps() since it is just a wrapper of
> setup_clear_cpu_cap(X86_FEATURE_SGX) now.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>

Looks good:

Acked-by: Dave Hansen <dave.hansen@intel.com>

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-01-26  9:30 ` [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page() Kai Huang
@ 2021-01-26 15:39   ` Dave Hansen
  2021-01-26 16:30     ` Sean Christopherson
  2021-01-27  1:08     ` Kai Huang
  0 siblings, 2 replies; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 15:39 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 1:30 AM, Kai Huang wrote:
> Remove SGX_EPC_PAGE_RECLAIMER_TRACKED check and warning.  This cannot
> happen, as enclave pages are freed only at the time when encl->refcount
> triggers, i.e. when both VFS and the page reclaimer have given up on
> their references.
> 
> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 8df81a3ed945..f330abdb5bb1 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -605,8 +605,6 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
>  	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
>  	int ret;
>  
> -	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);

I'm all for cleaning up silly, useless warnings.  But, don't we usually
put warnings in for things that we don't expect to be able to happen?

In other words, I'm fine with removing this if it hasn't been a valuable
warning and we don't expect it to become a valuable warning.  But, the
changelog doesn't say that.  It also doesn't explain what this patch is
doing in this series.

Why is this here?

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  2021-01-26  9:30 ` [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code Kai Huang
@ 2021-01-26 15:49   ` Dave Hansen
  2021-01-27  0:00     ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 15:49 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 1:30 AM, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> SGX virtualization requires to allocate "raw" EPC and use it as "virtual
> EPC" for SGX guest.  Unlike EPC used by SGX driver, virtual EPC doesn't
> track how EPC pages are used in VM, e.g. (de)construction of enclaves,
> so it cannot guarantee EREMOVE success, e.g. it doesn't have a priori
> knowledge of which pages are SECS with non-zero child counts.

The grammar there is a bit questionable in spots.  Here's a rewrite:

SGX can accurately track how bare-metal enclave pages are used.  This
enables SECS to be specifically targeted and EREMOVE'd only after all
child pages have been EREMOVE'd.  This ensures that bare-metal SGX will
never encounter SGX_CHILD_PRESENT in normal operation.

Virtual EPC is different.  The host does not track how EPC pages are
used by the guest, so it cannot guarantee EREMOVE success.  It might,
for instance, encounter a SECS with a non-zero child count.

Aside: Would it be *possible* for the host to figure out where the SECS
pages are?  If not, we can say "host can not track" versus what I said:
"host does not track".

> Add SGX_CHILD_PRESENT for use by SGX virtualization to assert EREMOVE
> failures are expected, but only due to SGX_CHILD_PRESENT.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
> Signed-off-by: Kai Huang <kai.huang@intel.com>

With the improved changelog:

Acked-by: Dave Hansen <dave.hansen@intel.com>

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
  2021-01-26  9:30 ` [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE " Kai Huang
@ 2021-01-26 16:04   ` Dave Hansen
  2021-01-27  1:25     ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 16:04 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 1:30 AM, Kai Huang wrote:
> From: Jarkko Sakkinen <jarkko@kernel.org>
> 
> Encapsulate the snippet in sgx_free_epc_page() concerning EREMOVE to
> sgx_reset_epc_page(), which is a static helper function for
> sgx_encl_release().  It's the only function existing, which deals with
> initialized pages.

Yikes.  I have no idea what that is saying.  Here's a rewrite:

EREMOVE takes a page and removes any association between that page and
an enclave.  It must be run on a page before it can be added into
another enclave.  Currently, EREMOVE is run as part of pages being freed
into the SGX page allocator.  It is not expected to fail.

KVM does not track how guest pages are used, which means that SGX
virtualization use of EREMOVE might fail.

Break out the EREMOVE call from the SGX page allocator.  This will allow
the SGX virtualization code to use the allocator directly.  (SGX/KVM
will also introduce a more permissive EREMOVE helper).

> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index ee50a5010277..a78b71447771 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -389,6 +389,16 @@ const struct vm_operations_struct sgx_vm_ops = {
>  	.access = sgx_vma_access,
>  };
>  
> +
> +static void sgx_reset_epc_page(struct sgx_epc_page *epc_page)
> +{
> +	int ret;
> +
> +	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
> +	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
> +		return;
> +}
> +
>  /**
>   * sgx_encl_release - Destroy an enclave instance
>   * @kref:	address of a kref inside &sgx_encl
> @@ -412,6 +422,7 @@ void sgx_encl_release(struct kref *ref)
>  			if (sgx_unmark_page_reclaimable(entry->epc_page))
>  				continue;
>  
> +			sgx_reset_epc_page(entry->epc_page);
>  			sgx_free_epc_page(entry->epc_page);
>  			encl->secs_child_cnt--;
>  			entry->epc_page = NULL;
> @@ -423,6 +434,7 @@ void sgx_encl_release(struct kref *ref)
>  	xa_destroy(&encl->page_array);
>  
>  	if (!encl->secs_child_cnt && encl->secs.epc_page) {
> +		sgx_reset_epc_page(encl->secs.epc_page);
>  		sgx_free_epc_page(encl->secs.epc_page);
>  		encl->secs.epc_page = NULL;
>  	}
> @@ -431,6 +443,7 @@ void sgx_encl_release(struct kref *ref)
>  		va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
>  					   list);
>  		list_del(&va_page->list);
> +		sgx_reset_epc_page(va_page->epc_page);
>  		sgx_free_epc_page(va_page->epc_page);
>  		kfree(va_page);
>  	}
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index f330abdb5bb1..21c2ffa13870 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -598,16 +598,14 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
>   * sgx_free_epc_page() - Free an EPC page
>   * @page:	an EPC page
>   *
> - * Call EREMOVE for an EPC page and insert it back to the list of free pages.
> + * Put the EPC page back to the list of free pages. It's the callers

"caller's"

> + * responsibility to make sure that the page is in uninitialized state In other

Period after "state", please.

> + * words, do EREMOVE, EWB or whatever operation is necessary before calling
> + * this function.
>   */

OK, so if you're going to say "the caller must put the page in
uninitialized state", let's also add a comment to the places that *DO*
that, like the shiny new sgx_reset_epc_page().

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests
  2021-01-26  9:30 ` [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests Kai Huang
@ 2021-01-26 16:19   ` Dave Hansen
  2021-01-27  0:16     ` Kai Huang
  2021-01-30 14:41   ` Jarkko Sakkinen
  1 sibling, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 16:19 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

I'd also like to see some comments about code sharing between this and
the main driver.  For instance, this *could* try to share 99% of the
->fault function.  Why doesn't it?  I'm sure there's a good reason.

> diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> new file mode 100644
> index 000000000000..e1ad7856d878
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/sgx/virt.c
> @@ -0,0 +1,254 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*  Copyright(c) 2016-20 Intel Corporation. */
> +
> +#define pr_fmt(fmt)	"SGX virtual EPC: " fmt

Does this actually get used anywhere?  Also, isn't this a bit long?  Maybe:

#define pr_fmt(fmt)	"sgx/virt: " fmt

Also, a one-line summary about what's in here would be nice next to the
copyright (which needs to be updated).

/*
 * Device driver to expose SGX enclave memory to KVM guests.
 *
 * Copyright(c) 2016-20 Intel Corporation.
 */


> +#include <linux/miscdevice.h>
> +#include <linux/mm.h>
> +#include <linux/mman.h>
> +#include <linux/sched/mm.h>
> +#include <linux/sched/signal.h>
> +#include <linux/slab.h>
> +#include <linux/xarray.h>
> +#include <asm/sgx.h>
> +#include <uapi/asm/sgx.h>
> +
> +#include "encls.h"
> +#include "sgx.h"
> +#include "virt.h"
> +
> +struct sgx_vepc {
> +	struct xarray page_array;
> +	struct mutex lock;
> +};
> +
> +static struct mutex zombie_secs_pages_lock;
> +static struct list_head zombie_secs_pages;

Comments would be nice for this random lock and list.

The main core functions (fault, etc...) are looking OK to me.

...
> +int __init sgx_vepc_init(void)
> +{
> +	/* SGX virtualization requires KVM to work */
> +	if (!boot_cpu_has(X86_FEATURE_VMX) || !IS_ENABLED(CONFIG_KVM_INTEL))
> +		return -ENODEV;

Can this even be built without IS_ENABLED(CONFIG_KVM_INTEL)?

> +	INIT_LIST_HEAD(&zombie_secs_pages);
> +	mutex_init(&zombie_secs_pages_lock);
> +
> +	return misc_register(&sgx_vepc_dev);
> +}
> diff --git a/arch/x86/kernel/cpu/sgx/virt.h b/arch/x86/kernel/cpu/sgx/virt.h
> new file mode 100644
> index 000000000000..44d872380ca1
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/sgx/virt.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
> +#ifndef _ASM_X86_SGX_VIRT_H
> +#define _ASM_X86_SGX_VIRT_H
> +
> +#ifdef CONFIG_X86_SGX_KVM
> +int __init sgx_vepc_init(void);
> +#else
> +static inline int __init sgx_vepc_init(void)
> +{
> +	return -ENODEV;
> +}
> +#endif
> +
> +#endif /* _ASM_X86_SGX_VIRT_H */

Is more going to go in this header?  It's a little sparse as-is.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-26  9:30 ` [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support Kai Huang
@ 2021-01-26 16:26   ` Dave Hansen
  2021-01-26 17:00     ` Sean Christopherson
  2021-01-26 23:56     ` Kai Huang
  2021-01-30 14:42   ` Jarkko Sakkinen
  1 sibling, 2 replies; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 16:26 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo,
	hpa, jethro, b.thiel

On 1/26/21 1:30 AM, Kai Huang wrote:
> --- a/arch/x86/kernel/cpu/feat_ctl.c
> +++ b/arch/x86/kernel/cpu/feat_ctl.c
> @@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
>  void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  {
>  	bool tboot = tboot_enabled();
> -	bool enable_sgx;
> +	bool enable_vmx;
> +	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;
>  	u64 msr;
>  
>  	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
> @@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  		return;
>  	}
>  
> +	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
> +		     IS_ENABLED(CONFIG_KVM_INTEL);

The reason it's called 'enable_sgx' below is because this code is
actually going to "enable sgx".  This code does not "enable vmx".  That
makes this a badly-named variable.  "vmx_enabled" or "vmx_available"
would be better.

>  	/*
> -	 * Enable SGX if and only if the kernel supports SGX and Launch Control
> -	 * is supported, i.e. disable SGX if the LE hash MSRs can't be written.
> +	 * Enable SGX if and only if the kernel supports SGX.  Require Launch
> +	 * Control support if SGX virtualization is *not* supported, i.e.
> +	 * disable SGX if the LE hash MSRs can't be written and SGX can't be
> +	 * exposed to a KVM guest (which might support non-LC configurations).
>  	 */

I hate this comment.

	/*
	 * Separate out bare-metal SGX enabling from KVM.  This allows
	 * KVM guests to use SGX even if the kernel refuses to use it on
	 * bare-metal.  This happens if Flexible Launch Control is not
	 * available.
	 *

> -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> -		     IS_ENABLED(CONFIG_X86_SGX);
> +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> +			 cpu_has(c, X86_FEATURE_SGX1) &&
> +			 IS_ENABLED(CONFIG_X86_SGX);

The X86_FEATURE_SGX1 check seems to have snuck in here.  Why?

> +	enable_sgx_driver = enable_sgx_any &&
> +			    cpu_has(c, X86_FEATURE_SGX_LC);
> +	enable_sgx_kvm = enable_sgx_any && enable_vmx &&
> +			  IS_ENABLED(CONFIG_X86_SGX_KVM);
>  
>  	if (msr & FEAT_CTL_LOCKED)
>  		goto update_caps;
> @@ -136,15 +146,18 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  	 * i.e. KVM is enabled, to avoid unnecessarily adding an attack vector
>  	 * for the kernel, e.g. using VMX to hide malicious code.
>  	 */
> -	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
> +	if (enable_vmx) {
>  		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
>  
>  		if (tboot)
>  			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
>  	}
>  
> -	if (enable_sgx)
> -		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
> +	if (enable_sgx_kvm || enable_sgx_driver) {
> +		msr |= FEAT_CTL_SGX_ENABLED;
> +		if (enable_sgx_driver)
> +			msr |= FEAT_CTL_SGX_LC_ENABLED;
> +	}
>  
>  	wrmsrl(MSR_IA32_FEAT_CTL, msr);
>  
> @@ -167,10 +180,29 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  	}
>  
>  update_sgx:
> -	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
> -	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
> -		if (enable_sgx)
> -			pr_err_once("SGX disabled by BIOS\n");
> +	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
> +		if (enable_sgx_kvm || enable_sgx_driver)
> +			pr_err_once("SGX disabled by BIOS.\n");
>  		clear_cpu_cap(c, X86_FEATURE_SGX);
> +		return;
> +	}


Isn't there a pr_fmt here already?  Won't these just look like:

	sgx: SGX disabled by BIOS.

That seems a bit silly.


* Re: [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-01-26 15:39   ` Dave Hansen
@ 2021-01-26 16:30     ` Sean Christopherson
  2021-01-27  1:08     ` Kai Huang
  1 sibling, 0 replies; 156+ messages in thread
From: Sean Christopherson @ 2021-01-26 16:30 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kai Huang, linux-sgx, kvm, x86, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021, Dave Hansen wrote:
> On 1/26/21 1:30 AM, Kai Huang wrote:
> > Remove SGX_EPC_PAGE_RECLAIMER_TRACKED check and warning.  This cannot
> > happen, as enclave pages are freed only at the time when encl->refcount
> > triggers, i.e. when both VFS and the page reclaimer have given up on
> > their references.
> > 
> > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > ---
> >  arch/x86/kernel/cpu/sgx/main.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index 8df81a3ed945..f330abdb5bb1 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -605,8 +605,6 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
> >  	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
> >  	int ret;
> >  
> > -	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> 
> I'm all for cleaning up silly, useless warnings.  But, don't we usually
> put warnings in for things that we don't expect to be able to happen?
> 
> In other words, I'm fine with removing this if it hasn't been a valuable
> warning and we don't expect it to become a valuable warning.

Ya, I don't understand the motivation for removing this warning.  I tripped it
more than once in the past during one of the many rebases of the virtual EPC
and EPC cgroup branches.


* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-26 16:26   ` Dave Hansen
@ 2021-01-26 17:00     ` Sean Christopherson
  2021-01-26 23:54       ` Kai Huang
  2021-01-26 23:56     ` Kai Huang
  1 sibling, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-01-26 17:00 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kai Huang, linux-sgx, kvm, x86, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On Tue, Jan 26, 2021, Dave Hansen wrote:
> > -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> > -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> > -		     IS_ENABLED(CONFIG_X86_SGX);
> > +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> > +			 cpu_has(c, X86_FEATURE_SGX1) &&
> > +			 IS_ENABLED(CONFIG_X86_SGX);
> 
> The X86_FEATURE_SGX1 check seems to have snuck in here.  Why?

It's a best effort check to handle the scenario where SGX is enabled by BIOS,
but was disabled by hardware in response to a machine check bank being disabled.
Adding a check on SGX1 should be in a different patch.  I thought we had a
discussion about why the check was omitted in the merge of bare metal support,
but I can't find any such thread.

> > +	enable_sgx_driver = enable_sgx_any &&
> > +			    cpu_has(c, X86_FEATURE_SGX_LC);
> > +	enable_sgx_kvm = enable_sgx_any && enable_vmx &&
> > +			  IS_ENABLED(CONFIG_X86_SGX_KVM);
> >  
> >  	if (msr & FEAT_CTL_LOCKED)
> >  		goto update_caps;
> > @@ -136,15 +146,18 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  	 * i.e. KVM is enabled, to avoid unnecessarily adding an attack vector
> >  	 * for the kernel, e.g. using VMX to hide malicious code.
> >  	 */
> > -	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
> > +	if (enable_vmx) {
> >  		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
> >  
> >  		if (tboot)
> >  			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
> >  	}
> >  
> > -	if (enable_sgx)
> > -		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
> > +	if (enable_sgx_kvm || enable_sgx_driver) {
> > +		msr |= FEAT_CTL_SGX_ENABLED;
> > +		if (enable_sgx_driver)
> > +			msr |= FEAT_CTL_SGX_LC_ENABLED;
> > +	}
> >  
> >  	wrmsrl(MSR_IA32_FEAT_CTL, msr);
> >  
> > @@ -167,10 +180,29 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  	}
> >  
> >  update_sgx:
> > -	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
> > -	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
> > -		if (enable_sgx)
> > -			pr_err_once("SGX disabled by BIOS\n");
> > +	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
> > +		if (enable_sgx_kvm || enable_sgx_driver)
> > +			pr_err_once("SGX disabled by BIOS.\n");
> >  		clear_cpu_cap(c, X86_FEATURE_SGX);
> > +		return;
> > +	}
> 
> 
> Isn't there a pr_fmt here already?  Won't these just look like:
> 
> 	sgx: SGX disabled by BIOS.
> 
> That seems a bit silly.

Eh, I like the explicit "SGX" to clarify that the hardware feature was disabled.


* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-01-26  9:31 ` [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled Kai Huang
@ 2021-01-26 17:03   ` Dave Hansen
  2021-01-26 18:10     ` Andy Lutomirski
  2021-01-30 14:45   ` Jarkko Sakkinen
  1 sibling, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-26 17:03 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 1:31 AM, Kai Huang wrote:
> Modify sgx_init() to always try to initialize the virtual EPC driver,
> even if the bare-metal SGX driver is disabled.  The bare-metal driver
> might be disabled if SGX Launch Control is in locked mode, or not
> supported in the hardware at all.  This allows (non-Linux) guests that
> support non-LC configurations to use SGX.

One thing worth calling out *somewhere* (which is entirely my fault):
"bare-metal" in the context of this patch set refers to true bare-metal,
but *ALSO* covers the plain SGX driver running inside a guest.

So, perhaps "bare-metal" isn't the best term to use.  Again, my bad.
Better nomenclature suggestions are welcome.


* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-01-26 17:03   ` Dave Hansen
@ 2021-01-26 18:10     ` Andy Lutomirski
  2021-01-26 23:25       ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Andy Lutomirski @ 2021-01-26 18:10 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kai Huang, linux-sgx, kvm, x86, seanjc, jarkko, luto,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa



> On Jan 26, 2021, at 9:03 AM, Dave Hansen <dave.hansen@intel.com> wrote:
> 
> On 1/26/21 1:31 AM, Kai Huang wrote:
>> Modify sgx_init() to always try to initialize the virtual EPC driver,
>> even if the bare-metal SGX driver is disabled.  The bare-metal driver
>> might be disabled if SGX Launch Control is in locked mode, or not
>> supported in the hardware at all.  This allows (non-Linux) guests that
>> support non-LC configurations to use SGX.
> 
> One thing worth calling out *somewhere* (which is entirely my fault):
> "bare-metal" in the context of this patch set refers to true bare-metal,
> but *ALSO* covers the plain SGX driver running inside a guest.
> 
> So, perhaps "bare-metal" isn't the best term to use.  Again, my bad.
> Better nomenclature suggestions are welcome.


How about just SGX?  We can have an SGX driver and a virtual EPC driver.


* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-01-26 15:34   ` Dave Hansen
@ 2021-01-26 23:18     ` Kai Huang
  2021-01-30 13:20       ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26 23:18 UTC (permalink / raw)
  To: Dave Hansen, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Tue, 2021-01-26 at 07:34 -0800, Dave Hansen wrote:
> On 1/26/21 1:30 AM, Kai Huang wrote:
> > From: Sean Christopherson <seanjc@google.com>
> > 
> > Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
> > features, since adding a new leaf for only two bits would be wasteful.
> > As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
> > guest, and to do so correctly needs to query hardware and kernel support
> > for SGX1 and SGX2.
> 
> It's also not _just_ exposing the CPUID leaves.  There are some checks
> here when KVM is emulating some SGX instructions too, right?

I would say trapping rather than emulating, but yes, KVM will do more. However,
those are fine details, and I don't think we should put many of them here. Or
perhaps we can use 'for instance' to keep the description brief:

As part of virtualizing SGX, KVM will need to use the two flags, for instance to
expose them to the guest.

?

> 
> > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > index 84b887825f12..18b2d0c8bbbe 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arch/x86/include/asm/cpufeatures.h
> > @@ -292,6 +292,8 @@
> >  #define X86_FEATURE_FENCE_SWAPGS_KERNEL	(11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
> >  #define X86_FEATURE_SPLIT_LOCK_DETECT	(11*32+ 6) /* #AC for split lock */
> >  #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
> > +#define X86_FEATURE_SGX1		(11*32+ 8) /* Software Guard Extensions sub-feature SGX1 */
> > +#define X86_FEATURE_SGX2        	(11*32+ 9) /* Software Guard Extensions sub-feature SGX2 */
> 
> FWIW, I'm not sure how valuable it is to spell the SGX acronym out three
> times.  Can't we use those bytes to put something more useful in that
> comment?

I think we can remove the comment for SGX1, since it is basically SGX.

For SGX2, how about the below?

/* SGX Enclave Dynamic Memory Management */
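For reference, a minimal userspace sketch of what "scattered" means for these
two flags. The X86_FEATURE_* values are the ones proposed in the patch, and
CPUID.(EAX=0x12,ECX=0):EAX reports SGX1 in bit 0 and SGX2 in bit 1. The helper
names (cap_set, cap_test, sgx_scatter) are hypothetical and are not the
kernel's own code; the real work is done by the kernel's scattered-feature
table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature numbers as proposed in the patch: capability word 11, bits 8 and 9. */
#define X86_FEATURE_SGX1 (11 * 32 + 8)
#define X86_FEATURE_SGX2 (11 * 32 + 9)

/* CPUID.(EAX=0x12, ECX=0):EAX bit positions for the SGX sub-features. */
#define SGX_CPUID_EAX_SGX1 (UINT32_C(1) << 0)
#define SGX_CPUID_EAX_SGX2 (UINT32_C(1) << 1)

/* Hypothetical helper: set one "word * 32 + bit" feature in a capability array. */
void cap_set(uint32_t caps[], unsigned int nwords, unsigned int feature)
{
	assert(feature / 32 < nwords);
	caps[feature / 32] |= UINT32_C(1) << (feature % 32);
}

/* Hypothetical helper: test one feature bit. */
bool cap_test(const uint32_t caps[], unsigned int nwords, unsigned int feature)
{
	assert(feature / 32 < nwords);
	return (caps[feature / 32] >> (feature % 32)) & 1;
}

/* Copy the two CPUID bits into the Linux-defined word instead of adding a leaf. */
void sgx_scatter(uint32_t caps[], unsigned int nwords, uint32_t cpuid_12_0_eax)
{
	if (cpuid_12_0_eax & SGX_CPUID_EAX_SGX1)
		cap_set(caps, nwords, X86_FEATURE_SGX1);
	if (cpuid_12_0_eax & SGX_CPUID_EAX_SGX2)
		cap_set(caps, nwords, X86_FEATURE_SGX2);
}
```

With a 12-word array and EAX = 0x3, both flags end up in word 11, which is
what lets KVM query them through the usual feature-bit checks.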




* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-01-26 18:10     ` Andy Lutomirski
@ 2021-01-26 23:25       ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26 23:25 UTC (permalink / raw)
  To: Andy Lutomirski, Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 2021-01-26 at 10:10 -0800, Andy Lutomirski wrote:
> 
> > On Jan 26, 2021, at 9:03 AM, Dave Hansen <dave.hansen@intel.com> wrote:
> > 
> > On 1/26/21 1:31 AM, Kai Huang wrote:
> > > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > > might be disabled if SGX Launch Control is in locked mode, or not
> > > supported in the hardware at all.  This allows (non-Linux) guests that
> > > support non-LC configurations to use SGX.
> > 
> > One thing worth calling out *somewhere* (which is entirely my fault):
> > "bare-metal" in the context of this patch set refers to true bare-metal,
> > but *ALSO* covers the plain SGX driver running inside a guest.
> > 
> > So, perhaps "bare-metal" isn't the best term to use.  Again, my bad.
> > Better nomenclature suggestions are welcome.
> 
> 
> How about just SGX?  We can have an SGX driver and a virtual EPC driver.

Thanks. If no one has a better idea, I'll change 'bare-metal driver' to
'SGX driver' throughout the series.




* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-26 17:00     ` Sean Christopherson
@ 2021-01-26 23:54       ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-26 23:54 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Dave Hansen, linux-sgx, kvm, x86, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On Tue, 26 Jan 2021 09:00:45 -0800 Sean Christopherson wrote:
> On Tue, Jan 26, 2021, Dave Hansen wrote:
> > > -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> > > -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> > > -		     IS_ENABLED(CONFIG_X86_SGX);
> > > +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> > > +			 cpu_has(c, X86_FEATURE_SGX1) &&
> > > +			 IS_ENABLED(CONFIG_X86_SGX);
> > 
> > The X86_FEATURE_SGX1 check seems to have snuck in here.  Why?
> 
> It's a best effort check to handle the scenario where SGX is enabled by BIOS,
> but was disabled by hardware in response to a machine check bank being disabled.
> Adding a check on SGX1 should be in a different patch.  I thought we had a
> discussion about why the check was omitted in the merge of bare metal support,
> but I can't find any such thread.

Hi Dave,

This is the thread where we discussed this for RFC v1. It should give some
background on why SGX1 is checked here.

https://www.spinics.net/lists/linux-sgx/msg03990.html

And Dave, Sean,

If we want a separate patch to add the SGX1 check, I'd like to let Sean or
Jarkko do that, since it is not closely related to KVM SGX virtualization.
I can remove the SGX1 check here if you all agree.

Comments?

> 
> > > +	enable_sgx_driver = enable_sgx_any &&
> > > +			    cpu_has(c, X86_FEATURE_SGX_LC);
> > > +	enable_sgx_kvm = enable_sgx_any && enable_vmx &&
> > > +			  IS_ENABLED(CONFIG_X86_SGX_KVM);
> > >  
> > >  	if (msr & FEAT_CTL_LOCKED)
> > >  		goto update_caps;
> > > @@ -136,15 +146,18 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> > >  	 * i.e. KVM is enabled, to avoid unnecessarily adding an attack vector
> > >  	 * for the kernel, e.g. using VMX to hide malicious code.
> > >  	 */
> > > -	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
> > > +	if (enable_vmx) {
> > >  		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
> > >  
> > >  		if (tboot)
> > >  			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
> > >  	}
> > >  
> > > -	if (enable_sgx)
> > > -		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
> > > +	if (enable_sgx_kvm || enable_sgx_driver) {
> > > +		msr |= FEAT_CTL_SGX_ENABLED;
> > > +		if (enable_sgx_driver)
> > > +			msr |= FEAT_CTL_SGX_LC_ENABLED;
> > > +	}
> > >  
> > >  	wrmsrl(MSR_IA32_FEAT_CTL, msr);
> > >  
> > > @@ -167,10 +180,29 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> > >  	}
> > >  
> > >  update_sgx:
> > > -	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
> > > -	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
> > > -		if (enable_sgx)
> > > -			pr_err_once("SGX disabled by BIOS\n");
> > > +	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
> > > +		if (enable_sgx_kvm || enable_sgx_driver)
> > > +			pr_err_once("SGX disabled by BIOS.\n");
> > >  		clear_cpu_cap(c, X86_FEATURE_SGX);
> > > +		return;
> > > +	}
> > 
> > 
> > Isn't there a pr_fmt here already?  Won't these just look like:
> > 
> > 	sgx: SGX disabled by BIOS.
> > 
> > That seems a bit silly.
> 
> Eh, I like the explicit "SGX" to clarify that the hardware feature was disabled.

Hi Dave,

The pr_fmt is:

#undef pr_fmt
#define pr_fmt(fmt)     "x86/cpu: " fmt

So the message will read: x86/cpu: SGX disabled by BIOS.
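
A minimal userspace sketch of the prefixing, assuming only the pr_fmt
definition quoted above. The format_sgx_message() helper is hypothetical and
stands in for the kernel's pr_err_once(), which is not available outside the
kernel. pr_fmt() works by compile-time string literal concatenation, so the
prefix is decided by the file that defines it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same definition as in feat_ctl.c. */
#undef pr_fmt
#define pr_fmt(fmt)     "x86/cpu: " fmt

/*
 * Hypothetical stand-in for pr_err_once(): formats the message into buf
 * rather than the kernel log.  Adjacent string literals are concatenated
 * by the compiler, which is all pr_fmt() relies on.
 */
int format_sgx_message(char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, pr_fmt("SGX disabled by BIOS.\n"));
}
```

Had the message been emitted from a file whose pr_fmt is "sgx: ", the result
would have been the doubled-up "sgx: SGX disabled by BIOS." that Dave was
worried about.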


* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-26 16:26   ` Dave Hansen
  2021-01-26 17:00     ` Sean Christopherson
@ 2021-01-26 23:56     ` Kai Huang
  2021-01-27  0:18       ` Dave Hansen
  1 sibling, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-26 23:56 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On Tue, 26 Jan 2021 08:26:21 -0800 Dave Hansen wrote:
> On 1/26/21 1:30 AM, Kai Huang wrote:
> > --- a/arch/x86/kernel/cpu/feat_ctl.c
> > +++ b/arch/x86/kernel/cpu/feat_ctl.c
> > @@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
> >  void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  {
> >  	bool tboot = tboot_enabled();
> > -	bool enable_sgx;
> > +	bool enable_vmx;
> > +	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;
> >  	u64 msr;
> >  
> >  	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
> > @@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  		return;
> >  	}
> >  
> > +	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
> > +		     IS_ENABLED(CONFIG_KVM_INTEL);
> 
> The reason it's called 'enable_sgx' below is because this code is
> actually going to "enable sgx".  This code does not "enable vmx".  That
> makes this a badly-named variable.  "vmx_enabled" or "vmx_available"
> would be better.

It will also try to enable VMX if the feature control MSR is not locked by BIOS.
Please see the code below:

"
> > -	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
> > +	if (enable_vmx) {
> >  		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
> >  
> >  		if (tboot)
> >  			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
> >  	}
"

And if the feature control MSR is locked, the kernel cannot truly enable
anything; it can only print a message when BIOS has disabled VMX, SGX, or
SGX_LC and the kernel wants to use that feature.

Does this make sense to you?
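
To make the two cases concrete, here is a rough userspace sketch of the bit
computation. The bit positions are the ones from the kernel's msr-index.h.
The struct and function names are hypothetical, and starting from
FEAT_CTL_LOCKED in the unlocked case is an assumption based on the existing
feat_ctl.c code that the quoted hunks do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_FEATURE_CONTROL bits, as defined in the kernel's msr-index.h. */
#define FEAT_CTL_LOCKED                  (UINT64_C(1) << 0)
#define FEAT_CTL_VMX_ENABLED_INSIDE_SMX  (UINT64_C(1) << 1)
#define FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX (UINT64_C(1) << 2)
#define FEAT_CTL_SGX_LC_ENABLED          (UINT64_C(1) << 17)
#define FEAT_CTL_SGX_ENABLED             (UINT64_C(1) << 18)

/* Hypothetical bundle of the enable_* booleans computed in the patch. */
struct feat_ctl_wants {
	bool vmx;
	bool sgx_kvm;
	bool sgx_driver;
	bool tboot;
};

/* Returns the MSR value the kernel would end up with. */
uint64_t feat_ctl_compute(uint64_t msr, struct feat_ctl_wants w)
{
	/* In the patch, enable_sgx_kvm is only true when enable_vmx is. */
	assert(!w.sgx_kvm || w.vmx);

	/* Locked by BIOS: nothing can be enabled, only reported. */
	if (msr & FEAT_CTL_LOCKED)
		return msr;

	/* Assumption: start clean and lock, ignoring what BIOS left behind. */
	msr = FEAT_CTL_LOCKED;

	if (w.vmx) {
		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
		if (w.tboot)
			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
	}

	/* SGX is enabled for either user; Launch Control only for the driver. */
	if (w.sgx_kvm || w.sgx_driver) {
		msr |= FEAT_CTL_SGX_ENABLED;
		if (w.sgx_driver)
			msr |= FEAT_CTL_SGX_LC_ENABLED;
	}

	return msr;
}
```

With only vmx and sgx_kvm set, the result has SGX_ENABLED but not
SGX_LC_ENABLED, which is the new KVM-only configuration this patch allows.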

> 
> >  	/*
> > -	 * Enable SGX if and only if the kernel supports SGX and Launch Control
> > -	 * is supported, i.e. disable SGX if the LE hash MSRs can't be written.
> > +	 * Enable SGX if and only if the kernel supports SGX.  Require Launch
> > +	 * Control support if SGX virtualization is *not* supported, i.e.
> > +	 * disable SGX if the LE hash MSRs can't be written and SGX can't be
> > +	 * exposed to a KVM guest (which might support non-LC configurations).
> >  	 */
> 
> I hate this comment.
> 
> 	/*
> 	 * Separate out bare-metal SGX enabling from KVM.  This allows
> 	 * KVM guests to use SGX even if the kernel refuses to use it on
> 	 * bare-metal.  This happens if flexible Launch Control is not
> 	 * available.
> 	 *

Thanks.

> 
> > -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> > -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> > -		     IS_ENABLED(CONFIG_X86_SGX);
> > +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> > +			 cpu_has(c, X86_FEATURE_SGX1) &&
> > +			 IS_ENABLED(CONFIG_X86_SGX);
> 
> The X86_FEATURE_SGX1 check seems to have snuck in here.  Why?

Please see my reply to Sean's reply.

> 
> > +	enable_sgx_driver = enable_sgx_any &&
> > +			    cpu_has(c, X86_FEATURE_SGX_LC);
> > +	enable_sgx_kvm = enable_sgx_any && enable_vmx &&
> > +			  IS_ENABLED(CONFIG_X86_SGX_KVM);
> >  
> >  	if (msr & FEAT_CTL_LOCKED)
> >  		goto update_caps;
> > @@ -136,15 +146,18 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  	 * i.e. KVM is enabled, to avoid unnecessarily adding an attack vector
> >  	 * for the kernel, e.g. using VMX to hide malicious code.
> >  	 */
> > -	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
> > +	if (enable_vmx) {
> >  		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
> >  
> >  		if (tboot)
> >  			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
> >  	}
> >  
> > -	if (enable_sgx)
> > -		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
> > +	if (enable_sgx_kvm || enable_sgx_driver) {
> > +		msr |= FEAT_CTL_SGX_ENABLED;
> > +		if (enable_sgx_driver)
> > +			msr |= FEAT_CTL_SGX_LC_ENABLED;
> > +	}
> >  
> >  	wrmsrl(MSR_IA32_FEAT_CTL, msr);
> >  
> > @@ -167,10 +180,29 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  	}
> >  
> >  update_sgx:
> > -	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
> > -	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
> > -		if (enable_sgx)
> > -			pr_err_once("SGX disabled by BIOS\n");
> > +	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
> > +		if (enable_sgx_kvm || enable_sgx_driver)
> > +			pr_err_once("SGX disabled by BIOS.\n");
> >  		clear_cpu_cap(c, X86_FEATURE_SGX);
> > +		return;
> > +	}
> 
> 
> Isn't there a pr_fmt here already?  Won't these just look like:
> 
> 	sgx: SGX disabled by BIOS.
> 
> That seems a bit silly.

Please see my reply to Sean's reply.


* Re: [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  2021-01-26 15:49   ` Dave Hansen
@ 2021-01-27  0:00     ` Kai Huang
  2021-01-27  0:21       ` Dave Hansen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-27  0:00 UTC (permalink / raw)
  To: Dave Hansen, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Tue, 2021-01-26 at 07:49 -0800, Dave Hansen wrote:
> On 1/26/21 1:30 AM, Kai Huang wrote:
> > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > 
> > SGX virtualization requires to allocate "raw" EPC and use it as "virtual
> > EPC" for SGX guest.  Unlike EPC used by SGX driver, virtual EPC doesn't
> > track how EPC pages are used in VM, e.g. (de)construction of enclaves,
> > so it cannot guarantee EREMOVE success, e.g. it doesn't have a priori
> > knowledge of which pages are SECS with non-zero child counts.
> 
> The grammar there is a bit questionable in spots.  Here's a rewrite:
> 
> SGX can accurately track how bare-metal enclave pages are used.  This
> enables SECS to be specifically targeted and EREMOVE'd only after all
> child pages have been EREMOVE'd.  This ensures that bare-metal SGX will
> never encounter SGX_CHILD_PRESENT in normal operation.

How about:

"SGX driver can accurate track how enclave pages are used. This enables..."

In another email you said we should stop calling it the 'bare-metal' driver,
and Andy suggested we just say 'SGX driver'.

> 
> Virtual EPC is different.  The host does not track how EPC pages are
> used by the guest, so it cannot guarantee EREMOVE success.  It might,
> for instance, encounter a SECS with a non-zero child count.
> 
> Aside: Would it be *possible* for the host to figure out where the SECS
> pages are?  If not, we can say "host can not track" versus what I said:
> "host does not track".

Technically it is possible, so "host does not track" is more accurate.
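
A toy model of the ordering problem, for illustration only. The toy_* names
and the struct are hypothetical and no real ENCLS instruction is involved;
the value 13 for SGX_CHILD_PRESENT is the hardware error code this patch adds.

```c
#include <assert.h>
#include <stdbool.h>

#define SGX_CHILD_PRESENT 13	/* error code added by this patch */

/* Toy SECS page: hardware keeps a count of child pages that still exist. */
struct toy_secs {
	unsigned int child_count;
	bool removed;
};

/* Toy EREMOVE of one child page belonging to this SECS. */
int toy_eremove_child(struct toy_secs *secs)
{
	assert(secs->child_count > 0);
	secs->child_count--;
	return 0;
}

/* Toy EREMOVE of the SECS itself: fails while any child remains. */
int toy_eremove_secs(struct toy_secs *secs)
{
	if (secs->child_count)
		return SGX_CHILD_PRESENT;
	secs->removed = true;
	return 0;
}
```

The SGX driver knows which page is the SECS and removes it last, so it never
sees the error. Virtual EPC removes pages without that knowledge, so a SECS
may be hit first and has to be retried after its children are gone.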

> 
> > Add SGX_CHILD_PRESENT for use by SGX virtualization to assert EREMOVE
> > failures are expected, but only due to SGX_CHILD_PRESENT.
> > 
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> 
> With the improved changelog:
> 
> Acked-by: Dave Hansen <dave.hansen@intel.com>

Thanks.



* Re: [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests
  2021-01-26 16:19   ` Dave Hansen
@ 2021-01-27  0:16     ` Kai Huang
  2021-01-27  0:27       ` Dave Hansen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-27  0:16 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 26 Jan 2021 08:19:25 -0800 Dave Hansen wrote:
> I'd also like to see some comments about code sharing between this and
> the main driver.  For instance, this *could* try to share 99% of the
> ->fault function.  Why doesn't it?  I'm sure there's a good reason.
> 
> > diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> > new file mode 100644
> > index 000000000000..e1ad7856d878
> > --- /dev/null
> > +++ b/arch/x86/kernel/cpu/sgx/virt.c
> > @@ -0,0 +1,254 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*  Copyright(c) 2016-20 Intel Corporation. */
> > +
> > +#define pr_fmt(fmt)	"SGX virtual EPC: " fmt
> 
> Does this actually get used anywhere?  Also, isn't this a bit long?  Maybe:
> 
> #define pr_fmt(fmt)	"sgx/virt: " fmt

It is not used. My bad. I'll remove it.

And yes "sgx/virt: " is better. 

> 
> Also, a one-line summary about what's in here would be nice next to the
> copyright (which needs to be updated).
> 
> /*
>  * Device driver to expose SGX enclave memory to KVM guests.
>  *
>  * Copyright(c) 2016-20 Intel Corporation.
>  */

Will do. However the year should not be 2016-20, but should be 2021, right?

I think it has been ignored since the day Sean wrote the file.

> 
> 
> > +#include <linux/miscdevice.h>
> > +#include <linux/mm.h>
> > +#include <linux/mman.h>
> > +#include <linux/sched/mm.h>
> > +#include <linux/sched/signal.h>
> > +#include <linux/slab.h>
> > +#include <linux/xarray.h>
> > +#include <asm/sgx.h>
> > +#include <uapi/asm/sgx.h>
> > +
> > +#include "encls.h"
> > +#include "sgx.h"
> > +#include "virt.h"
> > +
> > +struct sgx_vepc {
> > +	struct xarray page_array;
> > +	struct mutex lock;
> > +};
> > +
> > +static struct mutex zombie_secs_pages_lock;
> > +static struct list_head zombie_secs_pages;
> 
> Comments would be nice for this random lock and list.
> 
> The main core functions (fault, etc...) are looking OK to me.

Thanks. How about the comment below?

/*
 * List to temporarily hold SECS pages that cannot be EREMOVE'd due to
 * having children in other virtual EPC instances, and the lock to protect it.
 */

> 
> ...
> > +int __init sgx_vepc_init(void)
> > +{
> > +	/* SGX virtualization requires KVM to work */
> > +	if (!boot_cpu_has(X86_FEATURE_VMX) || !IS_ENABLED(CONFIG_KVM_INTEL))
> > +		return -ENODEV;
> 
> Can this even be built without IS_ENABLED(CONFIG_KVM_INTEL)?

I don't think so. Thanks. I'll remove IS_ENABLED(CONFIG_KVM_INTEL).

> 
> > +	INIT_LIST_HEAD(&zombie_secs_pages);
> > +	mutex_init(&zombie_secs_pages_lock);
> > +
> > +	return misc_register(&sgx_vepc_dev);
> > +}
> > diff --git a/arch/x86/kernel/cpu/sgx/virt.h b/arch/x86/kernel/cpu/sgx/virt.h
> > new file mode 100644
> > index 000000000000..44d872380ca1
> > --- /dev/null
> > +++ b/arch/x86/kernel/cpu/sgx/virt.h
> > @@ -0,0 +1,14 @@
> > +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
> > +#ifndef _ASM_X86_SGX_VIRT_H
> > +#define _ASM_X86_SGX_VIRT_H
> > +
> > +#ifdef CONFIG_X86_SGX_KVM
> > +int __init sgx_vepc_init(void);
> > +#else
> > +static inline int __init sgx_vepc_init(void)
> > +{
> > +	return -ENODEV;
> > +}
> > +#endif
> > +
> > +#endif /* _ASM_X86_SGX_VIRT_H */
> 
> Is more going to go in this header?  It's a little sparse as-is.

No, nothing more will be added. The sgx_vepc_init() declaration needs to be
here because sgx/main.c uses it.

What would you suggest?
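
For context, a small userspace sketch of the stub pattern the header
implements. The toy_sgx_init() caller and its "succeed if either driver came
up" policy are hypothetical and only illustrate why the caller needs a
declaration in both configurations; __init is dropped because it is a
kernel-only annotation.

```c
#include <assert.h>
#include <errno.h>

/* Build with -DCONFIG_X86_SGX_KVM to use a real implementation instead. */
#ifdef CONFIG_X86_SGX_KVM
int sgx_vepc_init(void);	/* would be provided by virt.c */
#else
static inline int sgx_vepc_init(void)
{
	return -ENODEV;		/* virtual EPC support compiled out */
}
#endif

/*
 * Hypothetical caller: always tries virtual EPC, and fails only when both
 * the SGX driver (driver_ret) and the virtual EPC driver failed.
 */
int toy_sgx_init(int driver_ret)
{
	int vepc_ret;

	assert(driver_ret <= 0);	/* errno-style: 0 or negative */

	vepc_ret = sgx_vepc_init();
	if (driver_ret && vepc_ret)
		return driver_ret;
	return 0;
}
```

The stub keeps the call site free of #ifdefs, which is the only reason the
header exists; the declaration could equally live in sgx.h if a separate file
seems too sparse.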



* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-26 23:56     ` Kai Huang
@ 2021-01-27  0:18       ` Dave Hansen
  2021-01-27  2:02         ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-27  0:18 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On 1/26/21 3:56 PM, Kai Huang wrote:
> On Tue, 26 Jan 2021 08:26:21 -0800 Dave Hansen wrote:
>> On 1/26/21 1:30 AM, Kai Huang wrote:
>>> --- a/arch/x86/kernel/cpu/feat_ctl.c
>>> +++ b/arch/x86/kernel/cpu/feat_ctl.c
>>> @@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
>>>  void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>>>  {
>>>  	bool tboot = tboot_enabled();
>>> -	bool enable_sgx;
>>> +	bool enable_vmx;
>>> +	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;
>>>  	u64 msr;
>>>  
>>>  	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
>>> @@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>>>  		return;
>>>  	}
>>>  
>>> +	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
>>> +		     IS_ENABLED(CONFIG_KVM_INTEL);
>>
>> The reason it's called 'enable_sgx' below is because this code is
>> actually going to "enable sgx".  This code does not "enable vmx".  That
>> makes this a badly-named variable.  "vmx_enabled" or "vmx_available"
>> would be better.
> 
> It will also try to enable VMX if feature control MSR is not locked by BIOS.
> Please see below code:

Ahh, I forgot this is non-SGX code.  It's mucking with all kinds of
other stuff in the same MSR.  Oh, well, I guess that's what you get for
dumping a bunch of refactoring in the same patch as the new code.


>>> -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
>>> -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
>>> -		     IS_ENABLED(CONFIG_X86_SGX);
>>> +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
>>> +			 cpu_has(c, X86_FEATURE_SGX1) &&
>>> +			 IS_ENABLED(CONFIG_X86_SGX);
>>
>> The X86_FEATURE_SGX1 check seems to have snuck in here.  Why?
> 
> Please see my reply to Sean's reply.

... yes, so you're breaking out the fix into a separate patch.

>>>  update_sgx:
>>> -	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
>>> -	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
>>> -		if (enable_sgx)
>>> -			pr_err_once("SGX disabled by BIOS\n");
>>> +	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
>>> +		if (enable_sgx_kvm || enable_sgx_driver)
>>> +			pr_err_once("SGX disabled by BIOS.\n");
>>>  		clear_cpu_cap(c, X86_FEATURE_SGX);
>>> +		return;
>>> +	}
>>
>>
>> Isn't there a pr_fmt here already?  Won't these just look like:
>>
>> 	sgx: SGX disabled by BIOS.
>>
>> That seems a bit silly.
> 
> Please see my reply to Sean's reply.

Got it.  I was thinking this was in the SGX code, not in the generic CPU
setup code.


* Re: [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  2021-01-27  0:00     ` Kai Huang
@ 2021-01-27  0:21       ` Dave Hansen
  2021-01-27  0:52         ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-27  0:21 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 4:00 PM, Kai Huang wrote:
> On Tue, 2021-01-26 at 07:49 -0800, Dave Hansen wrote:
>> On 1/26/21 1:30 AM, Kai Huang wrote:
>>> From: Sean Christopherson <sean.j.christopherson@intel.com>
>>>
>>> SGX virtualization requires to allocate "raw" EPC and use it as "virtual
>>> EPC" for SGX guest.  Unlike EPC used by SGX driver, virtual EPC doesn't
>>> track how EPC pages are used in VM, e.g. (de)construction of enclaves,
>>> so it cannot guarantee EREMOVE success, e.g. it doesn't have a priori
>>> knowledge of which pages are SECS with non-zero child counts.
>>
>> The grammar there is a bit questionable in spots.  Here's a rewrite:
>>
>> SGX can accurately track how bare-metal enclave pages are used.  This
>> enables SECS to be specifically targeted and EREMOVE'd only after all
>> child pages have been EREMOVE'd.  This ensures that bare-metal SGX will
>> never encounter SGX_CHILD_PRESENT in normal operation.
> 
> How about:
> 
> "SGX driver can accurate track how enclave pages are used. This enables..."
> 
> Since in another email, you mentioned that we should get rid of bare-metal driver,
> and Andy suggested we can just use SGX driver?

<sigh>

Sure, but with correct grammar, please.

"SGX driver can accurately track how enclave pages are used. This
enables..."

Seriously, if you just paste the sentences into Word, it will highlight
this and tell you.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests
  2021-01-27  0:16     ` Kai Huang
@ 2021-01-27  0:27       ` Dave Hansen
  2021-01-27  0:48         ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-27  0:27 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On 1/26/21 4:16 PM, Kai Huang wrote:
> On Tue, 26 Jan 2021 08:19:25 -0800 Dave Hansen wrote:
>> Also, a one-line summary about what's in here would be nice next to the
>> copyright (which needs to be updated).
>>
>> /*
>>  * Device driver to expose SGX enclave memory to KVM guests.
>>  *
>>  * Copyright(c) 2016-20 Intel Corporation.
>>  */
> 
> Will do. However the year should not be 2016-20, but should be 2021, right?
> 
> I think it has been ignored since the day Sean wrote the file.

Yes, should be 2021.  Also, there shouldn't be *ANY* parts of these
files which you, the submitter and newly-minted effective maintainer,
have ignored.

It sounds like you owe us some homework to give every line of these a
once-over.

...
>>> +struct sgx_vepc {
>>> +	struct xarray page_array;
>>> +	struct mutex lock;
>>> +};
>>> +
>>> +static struct mutex zombie_secs_pages_lock;
>>> +static struct list_head zombie_secs_pages;
>>
>> Comments would be nice for this random lock and list.
>>
>> The main core functions (fault, etc...) are looking OK to me.
> 
> Thanks. How about below comment?
> 
> /*
>  * List to temporarily hold SECS pages that cannot be EREMOVE'd due to
>  * having child in other virtual EPC instances, and the lock to protect it.
>  */

Fine.  It's just a bit silly to say that it's a list.  It's also not so
temporary.  Pages can live on here forever.

>>> +	INIT_LIST_HEAD(&zombie_secs_pages);
>>> +	mutex_init(&zombie_secs_pages_lock);
>>> +
>>> +	return misc_register(&sgx_vepc_dev);
>>> +}
>>> diff --git a/arch/x86/kernel/cpu/sgx/virt.h b/arch/x86/kernel/cpu/sgx/virt.h
>>> new file mode 100644
>>> index 000000000000..44d872380ca1
>>> --- /dev/null
>>> +++ b/arch/x86/kernel/cpu/sgx/virt.h
>>> @@ -0,0 +1,14 @@
>>> +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
>>> +#ifndef _ASM_X86_SGX_VIRT_H
>>> +#define _ASM_X86_SGX_VIRT_H
>>> +
>>> +#ifdef CONFIG_X86_SGX_KVM
>>> +int __init sgx_vepc_init(void);
>>> +#else
>>> +static inline int __init sgx_vepc_init(void)
>>> +{
>>> +	return -ENODEV;
>>> +}
>>> +#endif
>>> +
>>> +#endif /* _ASM_X86_SGX_VIRT_H */
>>
>> Is more going to go in this header?  It's a little sparse as-is.
> 
> No there's no more. The sgx_vepc_init() function declaration needs to be here
> since sgx/main.c needs to use it.
> 
> May I know your suggestion?

I'd toss it in some other existing header that has more meat in it.  I'm
lazy.


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests
  2021-01-27  0:27       ` Dave Hansen
@ 2021-01-27  0:48         ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-27  0:48 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 26 Jan 2021 16:27:25 -0800 Dave Hansen wrote:
> On 1/26/21 4:16 PM, Kai Huang wrote:
> > On Tue, 26 Jan 2021 08:19:25 -0800 Dave Hansen wrote:
> >> Also, a one-line summary about what's in here would be nice next to the
> >> copyright (which needs to be updated).
> >>
> >> /*
> >>  * Device driver to expose SGX enclave memory to KVM guests.
> >>  *
> >>  * Copyright(c) 2016-20 Intel Corporation.
> >>  */
> > 
> > Will do. However the year should not be 2016-20, but should be 2021, right?
> > 
> > I think it has been ignored since the day Sean wrote the file.
> 
> Yes, should be 2021.  Also, there shouldn't be *ANY* parts of these
> files which you, the submitter and newly-minted effective maintainer,
> have ignored.

Yes agreed.

> 
> It sounds like you owe us some homework to give every line of these a
> once-over.

I'll also check other files. Thanks.

> 
> ...
> >>> +struct sgx_vepc {
> >>> +	struct xarray page_array;
> >>> +	struct mutex lock;
> >>> +};
> >>> +
> >>> +static struct mutex zombie_secs_pages_lock;
> >>> +static struct list_head zombie_secs_pages;
> >>
> >> Comments would be nice for this random lock and list.
> >>
> >> The main core functions (fault, etc...) are looking OK to me.
> > 
> > Thanks. How about below comment?
> > 
> > /*
> >  * List to temporarily hold SECS pages that cannot be EREMOVE'd due to
> >  * having child in other virtual EPC instances, and the lock to protect it.
> >  */
> 
> Fine.  It's just a bit silly to say that it's a list.  It's also not so
> temporary.  Pages can live on here forever.

I'll remove the 'List':

/* SECS pages that cannot be EREMOVE'd due to... */

The list should be empty once all of a VM's virtual EPC instances have been
released. If a page stays on the list forever, the WARN_ONCE() in
sgx_vepc_free_page() will catch it, and that indicates a bug.
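
To make the intended lifecycle concrete, here is a minimal user-space model,
not the kernel code. All names (model_page, model_eremove, model_free_page,
model_drain_zombies) are hypothetical, and the error value is only
illustrative. It assumes EREMOVE on a SECS fails while the SECS still has
child pages, and that such a SECS is parked on the zombie list and retried
later:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the SGX_CHILD_PRESENT hardware error code. */
#define MODEL_CHILD_PRESENT 13

struct model_page {
	bool is_secs;                   /* true for a SECS page */
	bool valid;                     /* still associated with an enclave */
	int child_cnt;                  /* SECS only: live child pages */
	struct model_page *secs;        /* child only: owning SECS */
	struct model_page *next_zombie; /* link on the zombie list */
};

/* SECS pages whose EREMOVE failed because children were still present. */
static struct model_page *zombie_head;

/* Model of EREMOVE: a SECS with live children cannot be removed. */
static int model_eremove(struct model_page *p)
{
	if (!p->valid)
		return 0;
	if (p->is_secs && p->child_cnt > 0)
		return MODEL_CHILD_PRESENT;
	if (!p->is_secs && p->secs)
		p->secs->child_cnt--;
	p->valid = false;
	return 0;
}

/* Free one page; park a SECS that still has children. True if freed. */
static bool model_free_page(struct model_page *p)
{
	if (model_eremove(p) == MODEL_CHILD_PRESENT) {
		p->next_zombie = zombie_head;
		zombie_head = p;
		return false;
	}
	return true;
}

/* Retry every parked SECS; returns how many are still on the list. */
static size_t model_drain_zombies(void)
{
	struct model_page **link = &zombie_head;
	size_t remaining = 0;

	while (*link) {
		struct model_page *p = *link;

		if (model_eremove(p) == 0) {
			*link = p->next_zombie;
			p->next_zombie = NULL;
		} else {
			link = &p->next_zombie;
			remaining++;
		}
	}
	return remaining;
}
```

In this model a SECS leaves the list only after every child has been removed,
so a non-empty list after all instances are gone would mean a leaked child.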

> 
> >>> +	INIT_LIST_HEAD(&zombie_secs_pages);
> >>> +	mutex_init(&zombie_secs_pages_lock);
> >>> +
> >>> +	return misc_register(&sgx_vepc_dev);
> >>> +}
> >>> diff --git a/arch/x86/kernel/cpu/sgx/virt.h b/arch/x86/kernel/cpu/sgx/virt.h
> >>> new file mode 100644
> >>> index 000000000000..44d872380ca1
> >>> --- /dev/null
> >>> +++ b/arch/x86/kernel/cpu/sgx/virt.h
> >>> @@ -0,0 +1,14 @@
> >>> +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
> >>> +#ifndef _ASM_X86_SGX_VIRT_H
> >>> +#define _ASM_X86_SGX_VIRT_H
> >>> +
> >>> +#ifdef CONFIG_X86_SGX_KVM
> >>> +int __init sgx_vepc_init(void);
> >>> +#else
> >>> +static inline int __init sgx_vepc_init(void)
> >>> +{
> >>> +	return -ENODEV;
> >>> +}
> >>> +#endif
> >>> +
> >>> +#endif /* _ASM_X86_SGX_VIRT_H */
> >>
> >> Is more going to go in this header?  It's a little sparse as-is.
> > 
> > No there's no more. The sgx_vepc_init() function declaration needs to be here
> > since sgx/main.c needs to use it.
> > 
> > May I know your suggestion?
> 
> I'd toss it in some other existing header that has more meat in it.  I'm
> lazy.
> 

I can put it into arch/x86/kernel/cpu/sgx/sgx.h, if that works for you.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 05/27] x86/sgx: Add SGX_CHILD_PRESENT hardware error code
  2021-01-27  0:21       ` Dave Hansen
@ 2021-01-27  0:52         ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-27  0:52 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 26 Jan 2021 16:21:36 -0800 Dave Hansen wrote:
> On 1/26/21 4:00 PM, Kai Huang wrote:
> > On Tue, 2021-01-26 at 07:49 -0800, Dave Hansen wrote:
> >> On 1/26/21 1:30 AM, Kai Huang wrote:
> >>> From: Sean Christopherson <sean.j.christopherson@intel.com>
> >>>
> >>> SGX virtualization requires to allocate "raw" EPC and use it as "virtual
> >>> EPC" for SGX guest.  Unlike EPC used by SGX driver, virtual EPC doesn't
> >>> track how EPC pages are used in VM, e.g. (de)construction of enclaves,
> >>> so it cannot guarantee EREMOVE success, e.g. it doesn't have a priori
> >>> knowledge of which pages are SECS with non-zero child counts.
> >>
> >> The grammar there is a bit questionable in spots.  Here's a rewrite:
> >>
> >> SGX can accurately track how bare-metal enclave pages are used.  This
> >> enables SECS to be specifically targeted and EREMOVE'd only after all
> >> child pages have been EREMOVE'd.  This ensures that bare-metal SGX will
> >> never encounter SGX_CHILD_PRESENT in normal operation.
> > 
> > How about:
> > 
> > "SGX driver can accurate track how enclave pages are used. This enables..."
> > 
> > Since in another email, you mentioned that we should get rid of bare-metal driver,
> > and Andy suggested we can just use SGX driver?
> 
> <sigh>
> 
> Sure, but with correct grammar, please.
> 
> "SGX driver can accurately track how enclave pages are used. This
> enables..."
> 
> Seriously, if you just paste the sentences into Word, it will highlight
> this and tell you.

Thanks. My fault.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-01-26 15:39   ` Dave Hansen
  2021-01-26 16:30     ` Sean Christopherson
@ 2021-01-27  1:08     ` Kai Huang
  2021-01-27  1:12       ` Dave Hansen
  1 sibling, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-27  1:08 UTC (permalink / raw)
  To: Dave Hansen, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Tue, 2021-01-26 at 07:39 -0800, Dave Hansen wrote:
> On 1/26/21 1:30 AM, Kai Huang wrote:
> > Remove SGX_EPC_PAGE_RECLAIMER_TRACKED check and warning.  This cannot
> > happen, as enclave pages are freed only at the time when encl->refcount
> > triggers, i.e. when both VFS and the page reclaimer have given up on
> > their references.
> > 
> > Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > ---
> >  arch/x86/kernel/cpu/sgx/main.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index 8df81a3ed945..f330abdb5bb1 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -605,8 +605,6 @@ void sgx_free_epc_page(struct sgx_epc_page *page)
> >  	struct sgx_epc_section *section = &sgx_epc_sections[page->section];
> >  	int ret;
> >  
> > 
> > -	WARN_ON_ONCE(page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> 
> I'm all for cleaning up silly, useless warnings.  But, don't we usually
> put warnings in for things that we don't expect to be able to happen?
> 
> In other words, I'm fine with removing this if it hasn't been a valuable
> warning and we don't expect it to become a valuable warning.  But, the
> changelog doesn't say that.  It also doesn't explain what this patch is
> doing in this series.
> 
> Why is this here?

Hi Jarkko,

I don't have deep understanding of SGX driver. Would you help to answer?


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-01-27  1:08     ` Kai Huang
@ 2021-01-27  1:12       ` Dave Hansen
  2021-01-27  1:26         ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-01-27  1:12 UTC (permalink / raw)
  To: Kai Huang, linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, haitao.huang, pbonzini, bp, tglx, mingo, hpa

On 1/26/21 5:08 PM, Kai Huang wrote:
> I don't have deep understanding of SGX driver. Would you help to answer?

Kai, as the patch submitter, you are expected to be able to at least
minimally explain what the patch is doing.  Please endeavor to obtain
this understanding before sending patches in the future.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
  2021-01-26 16:04   ` Dave Hansen
@ 2021-01-27  1:25     ` Kai Huang
  2021-02-02 18:00       ` Paolo Bonzini
  2021-02-02 19:02       ` Dave Hansen
  0 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-01-27  1:25 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 26 Jan 2021 08:04:35 -0800 Dave Hansen wrote:
> On 1/26/21 1:30 AM, Kai Huang wrote:
> > From: Jarkko Sakkinen <jarkko@kernel.org>
> > 
> > Encapsulate the snippet in sgx_free_epc_page() concerning EREMOVE to
> > sgx_reset_epc_page(), which is a static helper function for
> > sgx_encl_release().  It's the only function existing, which deals with
> > initialized pages.
> 
> Yikes.  I have no idea what that is saying.  Here's a rewrite:
> 
> EREMOVE takes a page and removes any association between that page and
> an enclave.  It must be run on a page before it can be added into
> another enclave.  Currently, EREMOVE is run as part of pages being freed
> into the SGX page allocator.  It is not expected to fail.
> 
> KVM does not track how guest pages are used, which means that SGX
> virtualization use of EREMOVE might fail.
> 
> Break out the EREMOVE call from the SGX page allocator.  This will allow
> the SGX virtualization code to use the allocator directly.  (SGX/KVM
> will also introduce a more permissive EREMOVE helper).

Thanks.

Hi Jarkko,

Do you want me to update your patch directly, or do you want to take the
change, and send me the patch again?

> 
> > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > index ee50a5010277..a78b71447771 100644
> > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > @@ -389,6 +389,16 @@ const struct vm_operations_struct sgx_vm_ops = {
> >  	.access = sgx_vma_access,
> >  };
> >  
> > +
> > +static void sgx_reset_epc_page(struct sgx_epc_page *epc_page)
> > +{
> > +	int ret;
> > +
> > +	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
> > +	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
> > +		return;
> > +}
> > +
> >  /**
> >   * sgx_encl_release - Destroy an enclave instance
> >   * @kref:	address of a kref inside &sgx_encl
> > @@ -412,6 +422,7 @@ void sgx_encl_release(struct kref *ref)
> >  			if (sgx_unmark_page_reclaimable(entry->epc_page))
> >  				continue;
> >  
> > +			sgx_reset_epc_page(entry->epc_page);
> >  			sgx_free_epc_page(entry->epc_page);
> >  			encl->secs_child_cnt--;
> >  			entry->epc_page = NULL;
> > @@ -423,6 +434,7 @@ void sgx_encl_release(struct kref *ref)
> >  	xa_destroy(&encl->page_array);
> >  
> >  	if (!encl->secs_child_cnt && encl->secs.epc_page) {
> > +		sgx_reset_epc_page(encl->secs.epc_page);
> >  		sgx_free_epc_page(encl->secs.epc_page);
> >  		encl->secs.epc_page = NULL;
> >  	}
> > @@ -431,6 +443,7 @@ void sgx_encl_release(struct kref *ref)
> >  		va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
> >  					   list);
> >  		list_del(&va_page->list);
> > +		sgx_reset_epc_page(va_page->epc_page);
> >  		sgx_free_epc_page(va_page->epc_page);
> >  		kfree(va_page);
> >  	}
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index f330abdb5bb1..21c2ffa13870 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -598,16 +598,14 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
> >   * sgx_free_epc_page() - Free an EPC page
> >   * @page:	an EPC page
> >   *
> > - * Call EREMOVE for an EPC page and insert it back to the list of free pages.
> > + * Put the EPC page back to the list of free pages. It's the callers
> 
> "caller's"
> 
> > + * responsibility to make sure that the page is in uninitialized state In other
> 
> Period after "state", please.
> 
> > + * words, do EREMOVE, EWB or whatever operation is necessary before calling
> > + * this function.
> >   */
> 
> OK, so if you're going to say "the caller must put the page in
> uninitialized state", let's also add a comment to the place that *DO*
> that, like the shiny new sgx_reset_epc_page().

Hi Dave,

Sorry, I am a little bit confused here. Do you mean we should add a comment in
sgx_reset_epc_page() to say, for instance, that sgx_free_epc_page() requires
the EPC page to have already been EREMOVE'd?

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-01-27  1:12       ` Dave Hansen
@ 2021-01-27  1:26         ` Kai Huang
  2021-02-01  0:11           ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-27  1:26 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 26 Jan 2021 17:12:12 -0800 Dave Hansen wrote:
> On 1/26/21 5:08 PM, Kai Huang wrote:
> > I don't have deep understanding of SGX driver. Would you help to answer?
> 
> Kai, as the patch submitter, you are expected to be able to at least
> minimally explain what the patch is doing.  Please endeavor to obtain
> this understanding before sending patches in the future.

I see. Thanks.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-27  0:18       ` Dave Hansen
@ 2021-01-27  2:02         ` Kai Huang
  2021-01-27 17:13           ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-01-27  2:02 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On Tue, 26 Jan 2021 16:18:31 -0800 Dave Hansen wrote:
> On 1/26/21 3:56 PM, Kai Huang wrote:
> > On Tue, 26 Jan 2021 08:26:21 -0800 Dave Hansen wrote:
> >> On 1/26/21 1:30 AM, Kai Huang wrote:
> >>> --- a/arch/x86/kernel/cpu/feat_ctl.c
> >>> +++ b/arch/x86/kernel/cpu/feat_ctl.c
> >>> @@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
> >>>  void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >>>  {
> >>>  	bool tboot = tboot_enabled();
> >>> -	bool enable_sgx;
> >>> +	bool enable_vmx;
> >>> +	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;
> >>>  	u64 msr;
> >>>  
> >>>  	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
> >>> @@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >>>  		return;
> >>>  	}
> >>>  
> >>> +	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
> >>> +		     IS_ENABLED(CONFIG_KVM_INTEL);
> >>
> >> The reason it's called 'enable_sgx' below is because this code is
> >> actually going to "enable sgx".  This code does not "enable vmx".  That
> >> makes this a badly-named variable.  "vmx_enabled" or "vmx_available"
> >> would be better.
> > 
> > It will also try to enable VMX if feature control MSR is not locked by BIOS.
> > Please see below code:
> 
> Ahh, I forgot this is non-SGX code.  It's mucking with all kinds of
> other stuff in the same MSR.  Oh, well, I guess that's what you get for
> dumping a bunch of refactoring in the same patch as the new code.
> 
> 
> >>> -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> >>> -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> >>> -		     IS_ENABLED(CONFIG_X86_SGX);
> >>> +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> >>> +			 cpu_has(c, X86_FEATURE_SGX1) &&
> >>> +			 IS_ENABLED(CONFIG_X86_SGX);
> >>
> >> The X86_FEATURE_SGX1 check seems to have snuck in here.  Why?
> > 
> > Please see my reply to Sean's reply.
> 
> ... yes, so you're breaking out the fix into a separate patch.

For the separate patch to fix SGX1 check, if I understand correctly, SGX driver
should be changed too. I feel I am not the best person to do it. Jarkko or Sean
is. 

So I'll remove SGX1 here in the next version, but I won't include another
patch to fix the SGX1 logic. If Jarkko or Sean sent out that patch, and it is
merged quickly, I can rebase on top of that.

Does this make sense?

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-27  2:02         ` Kai Huang
@ 2021-01-27 17:13           ` Sean Christopherson
  0 siblings, 0 replies; 156+ messages in thread
From: Sean Christopherson @ 2021-01-27 17:13 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, linux-sgx, kvm, x86, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On Wed, Jan 27, 2021, Kai Huang wrote:
> On Tue, 26 Jan 2021 16:18:31 -0800 Dave Hansen wrote:
> > On 1/26/21 3:56 PM, Kai Huang wrote:
> > > On Tue, 26 Jan 2021 08:26:21 -0800 Dave Hansen wrote:
> > >> On 1/26/21 1:30 AM, Kai Huang wrote:
> > >>> --- a/arch/x86/kernel/cpu/feat_ctl.c
> > >>> +++ b/arch/x86/kernel/cpu/feat_ctl.c
> > >>> @@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
> > >>>  void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> > >>>  {
> > >>>  	bool tboot = tboot_enabled();
> > >>> -	bool enable_sgx;
> > >>> +	bool enable_vmx;
> > >>> +	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;
> > >>>  	u64 msr;
> > >>>  
> > >>>  	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
> > >>> @@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> > >>>  		return;
> > >>>  	}
> > >>>  
> > >>> +	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
> > >>> +		     IS_ENABLED(CONFIG_KVM_INTEL);
> > >>
> > >> The reason it's called 'enable_sgx' below is because this code is
> > >> actually going to "enable sgx".  This code does not "enable vmx".  That
> > >> makes this a badly-named variable.  "vmx_enabled" or "vmx_available"
> > >> would be better.
> > > 
> > > It will also try to enable VMX if feature control MSR is not locked by BIOS.
> > > Please see below code:
> > 
> > Ahh, I forgot this is non-SGX code.  It's mucking with all kinds of
> > other stuff in the same MSR.  Oh, well, I guess that's what you get for
> > dumping a bunch of refactoring in the same patch as the new code.
> > 
> > 
> > >>> -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> > >>> -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> > >>> -		     IS_ENABLED(CONFIG_X86_SGX);
> > >>> +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> > >>> +			 cpu_has(c, X86_FEATURE_SGX1) &&
> > >>> +			 IS_ENABLED(CONFIG_X86_SGX);
> > >>
> > >> The X86_FEATURE_SGX1 check seems to have snuck in here.  Why?
> > > 
> > > Please see my reply to Sean's reply.
> > 
> > ... yes, so you're breaking out the fix into a separate patch.
> 
> For the separate patch to fix SGX1 check, if I understand correctly, SGX driver
> should be changed too. I feel I am not the best person to do it. Jarkko or Sean
> is. 

SGX driver doesn't need to be changed, just this core feat_ctl.c code.

> So I'll remove SGX1 here in the next version, but I won't include another
> patch to fix the SGX1 logic. If Jarkko or Sean sent out that patch, and it is
> merged quickly, I can rebase on top of that.
> 
> Does this make sense?

Yep, adding a check on SGX1 is definitely not mandatory for this series.
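
For reference, a minimal user-space sketch of how the flags discussed above
could be derived once the SGX1 check is dropped. The enable_vmx and
enable_sgx_any expressions follow the quoted hunk; how enable_sgx_driver and
enable_sgx_kvm are derived is an assumption based on the patch subject
(virtualization allowed without Launch Control), and the struct and function
names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Inputs that the real code reads via cpu_has() and IS_ENABLED(). */
struct sgx_inputs {
	bool has_sgx;       /* X86_FEATURE_SGX */
	bool has_sgx_lc;    /* X86_FEATURE_SGX_LC */
	bool has_vmx;       /* X86_FEATURE_VMX */
	bool cfg_sgx;       /* CONFIG_X86_SGX */
	bool cfg_sgx_kvm;   /* CONFIG_X86_SGX_KVM */
	bool cfg_kvm_intel; /* CONFIG_KVM_INTEL */
};

struct sgx_enables {
	bool vmx;
	bool sgx_any;
	bool sgx_driver;
	bool sgx_kvm;
};

static struct sgx_enables compute_sgx_enables(const struct sgx_inputs *in)
{
	struct sgx_enables e;

	/* From the quoted hunk. */
	e.vmx = in->has_vmx && in->cfg_kvm_intel;
	/* From the quoted hunk, minus the SGX1 check. */
	e.sgx_any = in->has_sgx && in->cfg_sgx;
	/* Assumption: only the native driver needs Launch Control. */
	e.sgx_driver = e.sgx_any && in->has_sgx_lc;
	/* Assumption: KVM needs VMX and its own Kconfig option instead. */
	e.sgx_kvm = e.sgx_any && e.vmx && in->cfg_sgx_kvm;
	return e;
}
```

Under these assumptions, a machine with SGX and VMX but without SGX_LC ends
up with sgx_kvm set and sgx_driver clear.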

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-01-26  9:30 ` [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features Kai Huang
  2021-01-26 15:34   ` Dave Hansen
@ 2021-01-30 13:11   ` Jarkko Sakkinen
  1 sibling, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 13:11 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:30:16PM +1300, Kai Huang wrote:
> From: Sean Christopherson <seanjc@google.com>
> 
> Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
> features, since adding a new leaf for only two bits would be wasteful.
> As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
> guest, and to do so correctly needs to query hardware and kernel support
> for SGX1 and SGX2.
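
As a quick illustration of the two bits in question, here is a small
user-space sketch. It decodes a raw CPUID.0x12.0x0.EAX value (bit 0 is SGX1,
bit 1 is SGX2) and splits the scattered feature numbers from the patch into a
word index and a bit index. The helper and macro names are hypothetical, not
kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=0x12, ECX=0):EAX sub-feature bits. */
#define MODEL_SGX_EAX_SGX1	(1u << 0)
#define MODEL_SGX_EAX_SGX2	(1u << 1)

/* Scattered feature numbers as defined in the patch. */
#define MODEL_FEATURE_SGX1	(11 * 32 + 8)
#define MODEL_FEATURE_SGX2	(11 * 32 + 9)

static bool eax_has_sgx1(uint32_t eax)
{
	return (eax & MODEL_SGX_EAX_SGX1) != 0;
}

static bool eax_has_sgx2(uint32_t eax)
{
	return (eax & MODEL_SGX_EAX_SGX2) != 0;
}

/* Index of the 32-bit capability word that holds a feature number. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of a feature number inside its capability word. */
static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}
```

Both flags therefore live in word 11, next to the other scattered features,
rather than in a dedicated word for the leaf.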
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
 
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-01-26 23:18     ` Kai Huang
@ 2021-01-30 13:20       ` Jarkko Sakkinen
  2021-02-01  0:01         ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 13:20 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Wed, Jan 27, 2021 at 12:18:32PM +1300, Kai Huang wrote:
> On Tue, 2021-01-26 at 07:34 -0800, Dave Hansen wrote:
> > On 1/26/21 1:30 AM, Kai Huang wrote:
> > > From: Sean Christopherson <seanjc@google.com>
> > > 
> > > Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
> > > features, since adding a new leaf for only two bits would be wasteful.
> > > As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
> > > guest, and to do so correctly needs to query hardware and kernel support
> > > for SGX1 and SGX2.
> > 
> > It's also not _just_ exposing the CPUID leaves.  There are some checks
> > here when KVM is emulating some SGX instructions too, right?
> 
> I would say trapping instead of emulating, but yes KVM will do more. However those
> are quite details, and I don't think we should put lots of details here. Or perhaps
> we can use 'for instance' as brief description:
> 
> As part of virtualizing SGX, KVM will need to use the two flags, for instance, to
> expose them to guest.
> 
> ?
> 
> > 
> > > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > > index 84b887825f12..18b2d0c8bbbe 100644
> > > --- a/arch/x86/include/asm/cpufeatures.h
> > > +++ b/arch/x86/include/asm/cpufeatures.h
> > > @@ -292,6 +292,8 @@
> > >  #define X86_FEATURE_FENCE_SWAPGS_KERNEL	(11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
> > >  #define X86_FEATURE_SPLIT_LOCK_DETECT	(11*32+ 6) /* #AC for split lock */
> > >  #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
> > > +#define X86_FEATURE_SGX1		(11*32+ 8) /* Software Guard Extensions sub-feature SGX1 */
> > > +#define X86_FEATURE_SGX2        	(11*32+ 9) /* Software Guard Extensions sub-feature SGX2 */
> > 
> > FWIW, I'm not sure how valuable it is to spell the SGX acronym out three
> > times.  Can't we use those bytes to put something more useful in that
> > comment?
> 
> I think we can remove comment for SGX1, since it is basically SGX.
> 
> For SGX2, how about below?
> 
> /* SGX Enclave Dynamic Memory Management */

(EDMM)

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
  2021-01-26  9:30 ` [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit Kai Huang
  2021-01-26 15:35   ` Dave Hansen
@ 2021-01-30 13:22   ` Jarkko Sakkinen
  2021-02-01  0:08     ` Kai Huang
  1 sibling, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 13:22 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:30:17PM +1300, Kai Huang wrote:
> Move SGX_LC feature bit to CPUID dependency table as well, along with
> the newly added SGX1 and SGX2 bits, to make clearing all SGX feature bits
> easier. Also remove clear_sgx_caps() since it is just a wrapper of
> setup_clear_cpu_cap(X86_FEATURE_SGX) now.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

So could this be an improvement to the existing code? If so, then
this should be the first patch. Also, I think that then this can
be merged independently from rest of the patch set.

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests
  2021-01-26  9:30 ` [RFC PATCH v3 06/27] x86/sgx: Introduce virtual EPC for use by KVM guests Kai Huang
  2021-01-26 16:19   ` Dave Hansen
@ 2021-01-30 14:41   ` Jarkko Sakkinen
  1 sibling, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:41 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:30:21PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> Add a misc device /dev/sgx_vepc to allow userspace to allocate "raw" EPC
> without an associated enclave.  The intended and only known use case for
> raw EPC allocation is to expose EPC to a KVM guest, hence the 'vepc'
> moniker, virt.{c,h} files and X86_SGX_KVM Kconfig.
> 
> More specifically, to allocate a virtual EPC instance with particular
> size, the userspace hypervisor opens the device node, and uses mmap()
> with the intended size to get an address range of virtual EPC.  Then
> it may use the address range to create one KVM memory slot as virtual
> EPC for guest.
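
The open-then-mmap() flow described above might look roughly like this from
userspace. This is only a sketch: the helper name vepc_map is hypothetical,
and the protection and mapping flags (PROT_READ | PROT_WRITE, MAP_SHARED) are
assumptions, not something this patch description specifies:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Open the device node at @path and map @size bytes of virtual EPC.
 * Returns the mapping and stores the open fd in @fd_out on success,
 * or returns MAP_FAILED and leaves @fd_out untouched on failure.
 */
static void *vepc_map(const char *path, size_t size, int *fd_out)
{
	void *addr;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return MAP_FAILED;

	/* Assumed flags: shared, read/write mapping of the device. */
	addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return MAP_FAILED;
	}

	*fd_out = fd;
	return addr;
}
```

A caller would pass "/dev/sgx_vepc" and the desired EPC size, then hand the
returned address range to KVM as the backing for a memory slot.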
> 
> Implement the "raw" EPC allocation in the x86 core-SGX subsystem via
> /dev/sgx_vepc rather than in KVM. Doing so has two major advantages:
> 
>   - Does not require changes to KVM's uAPI, e.g. EPC gets handled as
>     just another memory backend for guests.
> 
>   - EPC management is wholly contained in the SGX subsystem, e.g. SGX
>     does not have to export any symbols, changes to reclaim flows don't
>     need to be routed through KVM, SGX's dirty laundry doesn't have to
>     get aired out for the world to see, and so on and so forth.
> 
> The virtual EPC pages allocated to guests are currently not reclaimable.
> Reclaiming EPC page used by enclave requires a special reclaim mechanism
> separate from normal page reclaim, and that mechanism is not supported
> for virtual EPC pages.  Due to the complications of handling reclaim
> conflicts between guest and host, reclaiming virtual EPC pages is
> significantly more complex than basic support for SGX virtualization.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Co-developed-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
> v2->v3:
> 
>  - Changed from /dev/sgx_virt_epc to /dev/sgx_vepc, per Jarkko. Accordingly,
>    renamed 'sgx_virt_epc_xx' to 'sgx_vepc_xx' for various functions and
>    structures.
>  - Changed CONFIG_X86_SGX_VIRTUALIZATION to CONFIG_X86_SGX_KVM, per Dave.
> 
> v1->v2:
> 
>  - Added one paragraph to explain fops of virtual EPC, per Jarkko's suggestion.
>  - Moved change to sgx_init() out of this patch to a separate patch, as stated
>    in cover letter.
>  - In sgx_virt_epc_init(), return error if VMX is not supported, or
>    CONFIG_KVM_INTEL is not enabled, because there's no point to create
>    /dev/sgx_virt_epc if KVM is not supported.
>  - Removed 'struct mm_struct *mm' in 'struct sgx_virt_epc', and related logic in
>    sgx_virt_epc_open/release/mmap(), per Dave's comment.
>  - Renamed 'virtual_epc_zombie_pages' and 'virt_epc_lock' to 'zombie_secs_pages'
>    'zombie_secs_pages_lock', per Dave's suggestion.
>  - Changed __sgx_free_epc_page() to sgx_free_epc_page() due to Jarkko's patch
>    removes EREMOVE in sgx_free_epc_page().
>  - Changed all struct sgx_virt_epc *epc to struct sgx_virt_epc *vepc.
>  - In __sgx_virt_epc_fault(), changed comment to use WARN_ON() to make sure
>    vepc->lock has already been held, per Dave's suggestion.
>  - In sgx_virt_epc_free_page(), added comments to explain SGX_ENCLAVE_ACT is not
>    expected; and changed to use WARN_ONCE() to dump actual error code, per
>    Dave's comment.
>  - Removed NULL page check in sgx_virt_epc_free_page(), per Dave's comment.
> 
> ---
>  arch/x86/Kconfig                 |  12 ++
>  arch/x86/kernel/cpu/sgx/Makefile |   1 +
>  arch/x86/kernel/cpu/sgx/virt.c   | 254 +++++++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/sgx/virt.h   |  14 ++
>  4 files changed, 281 insertions(+)
>  create mode 100644 arch/x86/kernel/cpu/sgx/virt.c
>  create mode 100644 arch/x86/kernel/cpu/sgx/virt.h
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 21f851179ff0..ccb35d14c297 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1951,6 +1951,18 @@ config X86_SGX
>  
>  	  If unsure, say N.
>  
> +config X86_SGX_KVM
> +	bool "Software Guard eXtensions (SGX) Virtualization"
> +	depends on X86_SGX && KVM_INTEL
> +	help
> +
> +	  Enables KVM guests to create SGX enclaves.
> +
> +	  This includes support to expose "raw" unreclaimable enclave memory to
> +	  guests via a device node, e.g. /dev/sgx_vepc.
> +
> +	  If unsure, say N.
> +
>  config EFI
>  	bool "EFI runtime service support"
>  	depends on ACPI
> diff --git a/arch/x86/kernel/cpu/sgx/Makefile b/arch/x86/kernel/cpu/sgx/Makefile
> index 91d3dc784a29..9c1656779b2a 100644
> --- a/arch/x86/kernel/cpu/sgx/Makefile
> +++ b/arch/x86/kernel/cpu/sgx/Makefile
> @@ -3,3 +3,4 @@ obj-y += \
>  	encl.o \
>  	ioctl.o \
>  	main.o
> +obj-$(CONFIG_X86_SGX_KVM)	+= virt.o
> diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> new file mode 100644
> index 000000000000..e1ad7856d878
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/sgx/virt.c
> @@ -0,0 +1,254 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*  Copyright(c) 2016-20 Intel Corporation. */
> +
> +#define pr_fmt(fmt)	"SGX virtual EPC: " fmt

Remove this. It's fine to use "sgx:" for these messages as well, and it
makes grepping easier.
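
As a rough user-space illustration of the point about a shared prefix (a
sketch only: pr_fmt() here is a local stand-in for the kernel macro, and
sgx_format_msg() is a hypothetical helper that formats into a buffer instead
of the kernel log):

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for the kernel's pr_fmt(): one prefix for every message. */
#define pr_fmt(fmt) "sgx: " fmt

/*
 * Hypothetical helper: build a log line the way a pr_*() call would,
 * relying on string-literal concatenation to prepend the prefix.
 */
static int sgx_format_msg(char *buf, size_t size, const char *what, int code)
{
	return snprintf(buf, size, pr_fmt("%s: %d\n"), what, code);
}
```

With a single prefix, one search for "sgx:" in the log finds messages from
both the native driver and the virtual EPC code.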

> +
> +#include <linux/miscdevice.h>
> +#include <linux/mm.h>
> +#include <linux/mman.h>
> +#include <linux/sched/mm.h>
> +#include <linux/sched/signal.h>
> +#include <linux/slab.h>
> +#include <linux/xarray.h>
> +#include <asm/sgx.h>
> +#include <uapi/asm/sgx.h>
> +
> +#include "encls.h"
> +#include "sgx.h"
> +#include "virt.h"
> +
> +struct sgx_vepc {
> +	struct xarray page_array;
> +	struct mutex lock;
> +};
> +
> +static struct mutex zombie_secs_pages_lock;
> +static struct list_head zombie_secs_pages;
> +
> +static int __sgx_vepc_fault(struct sgx_vepc *vepc,
> +			    struct vm_area_struct *vma, unsigned long addr)
> +{
> +	struct sgx_epc_page *epc_page;
> +	unsigned long index, pfn;
> +	int ret;
> +
> +	WARN_ON(!mutex_is_locked(&vepc->lock));
> +
> +	/* Calculate index of EPC page in virtual EPC's page_array */
> +	index = vma->vm_pgoff + PFN_DOWN(addr - vma->vm_start);
> +
> +	epc_page = xa_load(&vepc->page_array, index);
> +	if (epc_page)
> +		return 0;
> +
> +	epc_page = sgx_alloc_epc_page(vepc, false);
> +	if (IS_ERR(epc_page))
> +		return PTR_ERR(epc_page);
> +
> +	ret = xa_err(xa_store(&vepc->page_array, index, epc_page, GFP_KERNEL));
> +	if (ret)
> +		goto err_free;
> +
> +	pfn = PFN_DOWN(sgx_get_epc_phys_addr(epc_page));
> +
> +	ret = vmf_insert_pfn(vma, addr, pfn);
> +	if (ret != VM_FAULT_NOPAGE) {
> +		ret = -EFAULT;
> +		goto err_delete;
> +	}
> +
> +	return 0;
> +
> +err_delete:
> +	xa_erase(&vepc->page_array, index);
> +err_free:
> +	sgx_free_epc_page(epc_page);
> +	return ret;
> +}
> +
> +static vm_fault_t sgx_vepc_fault(struct vm_fault *vmf)
> +{
> +	struct vm_area_struct *vma = vmf->vma;
> +	struct sgx_vepc *vepc = vma->vm_private_data;
> +	int ret;
> +
> +	mutex_lock(&vepc->lock);
> +	ret = __sgx_vepc_fault(vepc, vma, vmf->address);
> +	mutex_unlock(&vepc->lock);
> +
> +	if (!ret)
> +		return VM_FAULT_NOPAGE;
> +
> +	if (ret == -EBUSY && (vmf->flags & FAULT_FLAG_ALLOW_RETRY)) {
> +		mmap_read_unlock(vma->vm_mm);
> +		return VM_FAULT_RETRY;
> +	}
> +
> +	return VM_FAULT_SIGBUS;
> +}
> +
> +const struct vm_operations_struct sgx_vepc_vm_ops = {
> +	.fault = sgx_vepc_fault,
> +};
> +
> +static int sgx_vepc_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +	struct sgx_vepc *vepc = file->private_data;
> +
> +	if (!(vma->vm_flags & VM_SHARED))
> +		return -EINVAL;
> +
> +	vma->vm_ops = &sgx_vepc_vm_ops;
> +	/* Don't copy VMA in fork() */
> +	vma->vm_flags |= VM_PFNMAP | VM_IO | VM_DONTDUMP | VM_DONTCOPY;
> +	vma->vm_private_data = vepc;
> +
> +	return 0;
> +}
> +
> +static int sgx_vepc_free_page(struct sgx_epc_page *epc_page)
> +{
> +	int ret;
> +
> +	/*
> +	 * Take a previously guest-owned EPC page and return it to the
> +	 * general EPC page pool.
> +	 *
> +	 * Guests can not be trusted to have left this page in a good
> +	 * state, so run EREMOVE on the page unconditionally.  In the
> +	 * case that a guest properly EREMOVE'd this page, a superfluous
> +	 * EREMOVE is harmless.
> +	 */
> +	ret = __eremove(sgx_get_epc_virt_addr(epc_page));
> +	if (ret) {
> +		/*
> +		 * Only SGX_CHILD_PRESENT is expected, which is because of
> +		 * EREMOVE'ing an SECS still with child, in which case it can
> +		 * be handled by EREMOVE'ing the SECS again after all pages in
> +		 * virtual EPC have been EREMOVE'd. See comments below in
> +		 * sgx_vepc_release().
> +		 *
> +		 * The user of virtual EPC (KVM) needs to guarantee there's no
> +		 * logical processor is still running in the enclave in guest,
> +		 * otherwise EREMOVE will get SGX_ENCLAVE_ACT which cannot be
> +		 * handled here.
> +		 */
> +		WARN_ONCE(ret != SGX_CHILD_PRESENT,
> +			  "EREMOVE (EPC page 0x%lx): unexpected error: %d\n",
> +			  sgx_get_epc_phys_addr(epc_page), ret);
> +		return ret;
> +	}
> +
> +	sgx_free_epc_page(epc_page);
> +	return 0;
> +}
> +
> +static int sgx_vepc_release(struct inode *inode, struct file *file)
> +{
> +	struct sgx_vepc *vepc = file->private_data;
> +	struct sgx_epc_page *epc_page, *tmp, *entry;
> +	unsigned long index;
> +
> +	LIST_HEAD(secs_pages);
> +
> +	xa_for_each(&vepc->page_array, index, entry) {
> +		/*
> +		 * Remove all normal, child pages.  sgx_vepc_free_page()
> +		 * will fail if EREMOVE fails, but this is OK and expected on
> +		 * SECS pages.  Those can only be EREMOVE'd *after* all their
> +		 * child pages. Retries below will clean them up.
> +		 */
> +		if (sgx_vepc_free_page(entry))
> +			continue;
> +
> +		xa_erase(&vepc->page_array, index);
> +	}
> +
> +	/*
> +	 * Retry EREMOVE'ing pages.  This will clean up any SECS pages that
> +	 * only had children in this 'epc' area.
> +	 */
> +	xa_for_each(&vepc->page_array, index, entry) {
> +		epc_page = entry;
> +		/*
> +		 * An EREMOVE failure here means that the SECS page still
> +		 * has children.  But, since all children in this 'sgx_vepc'
> +		 * have been removed, the SECS page must have a child on
> +		 * another instance.
> +		 */
> +		if (sgx_vepc_free_page(epc_page))
> +			list_add_tail(&epc_page->list, &secs_pages);
> +
> +		xa_erase(&vepc->page_array, index);
> +	}
> +
> +	/*
> +	 * SECS pages are "pinned" by child pages, and unpinned once all
> +	 * children have been EREMOVE'd.  A child page in this instance
> +	 * may have pinned an SECS page encountered in an earlier release(),
> +	 * creating a zombie.  Since some children were EREMOVE'd above,
> +	 * try to EREMOVE all zombies in the hopes that one was unpinned.
> +	 */
> +	mutex_lock(&zombie_secs_pages_lock);
> +	list_for_each_entry_safe(epc_page, tmp, &zombie_secs_pages, list) {
> +		/*
> +		 * Speculatively remove the page from the list of zombies,
> +		 * if the page is successfully EREMOVE it will be added to
> +		 * the list of free pages.  If EREMOVE fails, throw the page
> +		 * on the local list, which will be spliced on at the end.
> +		 */
> +		list_del(&epc_page->list);
> +
> +		if (sgx_vepc_free_page(epc_page))
> +			list_add_tail(&epc_page->list, &secs_pages);
> +	}
> +
> +	if (!list_empty(&secs_pages))
> +		list_splice_tail(&secs_pages, &zombie_secs_pages);
> +	mutex_unlock(&zombie_secs_pages_lock);
> +
> +	kfree(vepc);
> +
> +	return 0;
> +}
> +
> +static int sgx_vepc_open(struct inode *inode, struct file *file)
> +{
> +	struct sgx_vepc *vepc;
> +
> +	vepc = kzalloc(sizeof(struct sgx_vepc), GFP_KERNEL);
> +	if (!vepc)
> +		return -ENOMEM;
> +	mutex_init(&vepc->lock);
> +	xa_init(&vepc->page_array);
> +
> +	file->private_data = vepc;
> +
> +	return 0;
> +}
> +
> +static const struct file_operations sgx_vepc_fops = {
> +	.owner		= THIS_MODULE,
> +	.open		= sgx_vepc_open,
> +	.release	= sgx_vepc_release,
> +	.mmap		= sgx_vepc_mmap,
> +};
> +
> +static struct miscdevice sgx_vepc_dev = {
> +	.minor = MISC_DYNAMIC_MINOR,
> +	.name = "sgx_vepc",
> +	.nodename = "sgx_vepc",
> +	.fops = &sgx_vepc_fops,
> +};
> +
> +int __init sgx_vepc_init(void)
> +{
> +	/* SGX virtualization requires KVM to work */
> +	if (!boot_cpu_has(X86_FEATURE_VMX) || !IS_ENABLED(CONFIG_KVM_INTEL))
> +		return -ENODEV;
> +
> +	INIT_LIST_HEAD(&zombie_secs_pages);
> +	mutex_init(&zombie_secs_pages_lock);
> +
> +	return misc_register(&sgx_vepc_dev);
> +}
> diff --git a/arch/x86/kernel/cpu/sgx/virt.h b/arch/x86/kernel/cpu/sgx/virt.h
> new file mode 100644
> index 000000000000..44d872380ca1
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/sgx/virt.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
> +#ifndef _ASM_X86_SGX_VIRT_H
> +#define _ASM_X86_SGX_VIRT_H
> +
> +#ifdef CONFIG_X86_SGX_KVM
> +int __init sgx_vepc_init(void);
> +#else
> +static inline int __init sgx_vepc_init(void)
> +{
> +	return -ENODEV;
> +}
> +#endif
> +
> +#endif /* _ASM_X86_SGX_VIRT_H */
> -- 
> 2.29.2
> 
> 

Other than that, this is starting to take shape.

/Jarkko
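
For readers following the release path, here is a small user-space model of
the two-pass EREMOVE logic described in the comments of sgx_vepc_release().
It is only a sketch under stated assumptions: struct mock_page,
mock_eremove() and mock_release() are hypothetical names, and the "child
present" rule is reduced to a scan over an array.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_PARENT (-1)

/* Hypothetical model of an EPC page; not the kernel's struct sgx_epc_page. */
struct mock_page {
	int parent;		/* index of the owning SECS, NO_PARENT for an SECS */
	bool present;		/* page has not been EREMOVE'd yet */
	bool in_this_vepc;	/* page belongs to the instance being released */
};

/* Mock EREMOVE: an SECS page cannot be removed while a child is present. */
static int mock_eremove(struct mock_page *pages, size_t n, size_t idx)
{
	size_t i;

	if (pages[idx].parent == NO_PARENT) {
		for (i = 0; i < n; i++)
			if (pages[i].present && pages[i].parent == (int)idx)
				return 1;	/* stands in for SGX_CHILD_PRESENT */
	}

	pages[idx].present = false;
	return 0;
}

/*
 * Two passes over this instance's pages: the first removes children (and
 * may fail on SECS pages), the second retries what is left.  Returns the
 * number of pages that would end up on the zombie list.
 */
static size_t mock_release(struct mock_page *pages, size_t n)
{
	size_t i, zombies = 0;
	int pass;

	for (pass = 0; pass < 2; pass++)
		for (i = 0; i < n; i++)
			if (pages[i].in_this_vepc && pages[i].present)
				mock_eremove(pages, n, i);

	for (i = 0; i < n; i++)
		if (pages[i].in_this_vepc && pages[i].present)
			zombies++;

	return zombies;
}
```

An SECS whose children all live in the released instance is freed on the
second pass; an SECS with a child in another instance stays behind as a
zombie until a later release() unpins it.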

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-26  9:30 ` [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support Kai Huang
  2021-01-26 16:26   ` Dave Hansen
@ 2021-01-30 14:42   ` Jarkko Sakkinen
  2021-02-01  5:38     ` Kai Huang
  1 sibling, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:42 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On Tue, Jan 26, 2021 at 10:30:54PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> The kernel will currently disable all SGX support if the hardware does
> not support launch control.  Make it more permissive to allow SGX
> virtualization on systems without Launch Control support.  This will
> allow KVM to expose SGX to guests that have less-strict requirements on
> the availability of flexible launch control.
> 
> Improve error message to distinguish between three cases.  There are two
> cases where SGX support is completely disabled:
> 1) SGX has been disabled completely by the BIOS
> 2) SGX LC is locked by the BIOS.  Bare-metal support is disabled because
>    of LC unavailability.  SGX virtualization is unavailable (because of
>    Kconfig).
> One where it is partially available:
> 3) SGX LC is locked by the BIOS.  Bare-metal support is disabled because
>    of LC unavailability.  SGX virtualization is supported.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Co-developed-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
> v2->v3:
> 
>  - Added to use 'enable_sgx_any', per Dave.
>  - Changed to call clear_cpu_cap() directly, rather than using clear_sgx_caps()
>    and clear_sgx_lc().
>  - Changed to use CONFIG_X86_SGX_KVM, instead of CONFIG_X86_SGX_VIRTUALIZATION.
> 
> v1->v2:
> 
>  - Refined commit message per Dave's comments.
>  - Added check to only enable SGX virtualization when VMX is supported, per
>    Dave's comment.
>  - Refined error msg print to explicitly call out SGX virtualization will be
>    supported when LC is locked by BIOS, per Dave's comment.
> 
> ---
>  arch/x86/kernel/cpu/feat_ctl.c | 58 ++++++++++++++++++++++++++--------
>  1 file changed, 45 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
> index 27533a6e04fa..0fc202550fcc 100644
> --- a/arch/x86/kernel/cpu/feat_ctl.c
> +++ b/arch/x86/kernel/cpu/feat_ctl.c
> @@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
>  void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  {
>  	bool tboot = tboot_enabled();
> -	bool enable_sgx;
> +	bool enable_vmx;
> +	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;

Move the declaration first (reverse christmas tree).

>  	u64 msr;
>  
>  	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
> @@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  		return;
>  	}
>  
> +	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
> +		     IS_ENABLED(CONFIG_KVM_INTEL);
> +
>  	/*
> -	 * Enable SGX if and only if the kernel supports SGX and Launch Control
> -	 * is supported, i.e. disable SGX if the LE hash MSRs can't be written.
> +	 * Enable SGX if and only if the kernel supports SGX.  Require Launch
> +	 * Control support if SGX virtualization is *not* supported, i.e.
> +	 * disable SGX if the LE hash MSRs can't be written and SGX can't be
> +	 * exposed to a KVM guest (which might support non-LC configurations).
>  	 */
> -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> -		     IS_ENABLED(CONFIG_X86_SGX);
> +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> +			 cpu_has(c, X86_FEATURE_SGX1) &&
> +			 IS_ENABLED(CONFIG_X86_SGX);
> +	enable_sgx_driver = enable_sgx_any &&
> +			    cpu_has(c, X86_FEATURE_SGX_LC);
> +	enable_sgx_kvm = enable_sgx_any && enable_vmx &&
> +			  IS_ENABLED(CONFIG_X86_SGX_KVM);
>  
>  	if (msr & FEAT_CTL_LOCKED)
>  		goto update_caps;
> @@ -136,15 +146,18 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  	 * i.e. KVM is enabled, to avoid unnecessarily adding an attack vector
>  	 * for the kernel, e.g. using VMX to hide malicious code.
>  	 */
> -	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
> +	if (enable_vmx) {
>  		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
>  
>  		if (tboot)
>  			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
>  	}
>  
> -	if (enable_sgx)
> -		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
> +	if (enable_sgx_kvm || enable_sgx_driver) {
> +		msr |= FEAT_CTL_SGX_ENABLED;
> +		if (enable_sgx_driver)
> +			msr |= FEAT_CTL_SGX_LC_ENABLED;
> +	}
>  
>  	wrmsrl(MSR_IA32_FEAT_CTL, msr);
>  
> @@ -167,10 +180,29 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
>  	}
>  
>  update_sgx:
> -	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
> -	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
> -		if (enable_sgx)
> -			pr_err_once("SGX disabled by BIOS\n");
> +	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
> +		if (enable_sgx_kvm || enable_sgx_driver)
> +			pr_err_once("SGX disabled by BIOS.\n");
>  		clear_cpu_cap(c, X86_FEATURE_SGX);
> +		return;
> +	}
> +
> +	/*
> +	 * VMX feature bit may be cleared due to being disabled in BIOS,
> +	 * in which case SGX virtualization cannot be supported either.
> +	 */
> +	if (!cpu_has(c, X86_FEATURE_VMX) && enable_sgx_kvm) {
> +		pr_err_once("SGX virtualization disabled due to lack of VMX.\n");
> +		enable_sgx_kvm = 0;
> +	}
> +
> +	if (!(msr & FEAT_CTL_SGX_LC_ENABLED) && enable_sgx_driver) {
> +		if (!enable_sgx_kvm) {
> +			pr_err_once("SGX Launch Control is locked. Disable SGX.\n");
> +			clear_cpu_cap(c, X86_FEATURE_SGX);
> +		} else {
> +			pr_err_once("SGX Launch Control is locked. Support SGX virtualization only.\n");
> +			clear_cpu_cap(c, X86_FEATURE_SGX_LC);
> +		}
>  	}
>  }
> -- 
> 2.29.2
> 
> 

/Jarkko
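
The three cases from the commit message can be modelled in user space as
below. This is a simplified sketch, not the kernel code: struct sgx_inputs,
struct sgx_outcome and sgx_resolve() are hypothetical, the MSR bits are
treated as already locked, and the late "VMX cleared by BIOS" check is folded
into the cpu_vmx input.

```c
#include <stdbool.h>

/* Hypothetical, simplified inputs; not the kernel's real data structures. */
struct sgx_inputs {
	bool cpu_sgx;		/* CPU reports SGX and SGX1 */
	bool cpu_sgx_lc;	/* CPU reports Launch Control */
	bool cpu_vmx;		/* CPU reports VMX */
	bool cfg_sgx;		/* CONFIG_X86_SGX */
	bool cfg_sgx_kvm;	/* CONFIG_X86_SGX_KVM */
	bool cfg_kvm_intel;	/* CONFIG_KVM_INTEL */
	bool msr_sgx_enabled;	/* FEAT_CTL_SGX_ENABLED set in the MSR */
	bool msr_lc_enabled;	/* FEAT_CTL_SGX_LC_ENABLED set in the MSR */
};

struct sgx_outcome {
	bool driver;	/* bare-metal SGX driver usable */
	bool kvm;	/* SGX virtualization usable */
};

static struct sgx_outcome sgx_resolve(const struct sgx_inputs *in)
{
	bool any = in->cpu_sgx && in->cfg_sgx;
	bool vmx = in->cpu_vmx && in->cfg_kvm_intel;
	struct sgx_outcome out = {
		.driver = any && in->cpu_sgx_lc,
		.kvm = any && vmx && in->cfg_sgx_kvm,
	};

	/* Case 1: SGX disabled by BIOS, nothing is available. */
	if (!in->msr_sgx_enabled) {
		out.driver = false;
		out.kvm = false;
		return out;
	}

	/* Cases 2 and 3: LC locked, only virtualization (if any) survives. */
	if (!in->msr_lc_enabled)
		out.driver = false;

	return out;
}
```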

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-01-26  9:31 ` [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled Kai Huang
  2021-01-26 17:03   ` Dave Hansen
@ 2021-01-30 14:45   ` Jarkko Sakkinen
  2021-02-01  5:40     ` Kai Huang
  1 sibling, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:45 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> Modify sgx_init() to always try to initialize the virtual EPC driver,
> even if the bare-metal SGX driver is disabled.  The bare-metal driver
> might be disabled if SGX Launch Control is in locked mode, or not
> supported in the hardware at all.  This allows (non-Linux) guests that
> support non-LC configurations to use SGX.
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
> v2->v3:
> 
>  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> 
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 21c2ffa13870..93d249f7bff3 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -12,6 +12,7 @@
>  #include "driver.h"
>  #include "encl.h"
>  #include "encls.h"
> +#include "virt.h"
>  
>  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
>  static int sgx_nr_epc_sections;
> @@ -712,7 +713,8 @@ static int __init sgx_init(void)
>  		goto err_page_cache;
>  	}
>  
> -	ret = sgx_drv_init();
> +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> +	ret = !!sgx_drv_init() & !!sgx_vepc_init();

I would write more dumb code and just add:

ret = sgx_vepc_init();
if (ret)
        goto err_kthread;
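
To make the difference concrete, here is a user-space sketch of the two
shapes. The helper names init_either() and init_both() are hypothetical, and
init_both() is only one possible reading of the suggestion (every initializer
must succeed); the return values of the two init functions are passed in as
plain ints.

```c
#include <errno.h>

/*
 * Shape used in the patch: the result is non-zero only when *both*
 * initializers failed, and it is 1 rather than a real errno.
 */
static int init_either(int drv_ret, int vepc_ret)
{
	return !!drv_ret & !!vepc_ret;
}

/*
 * Plain sequential checks: the first failure is propagated with its
 * original error code.
 */
static int init_both(int drv_ret, int vepc_ret)
{
	int ret;

	ret = drv_ret;
	if (ret)
		return ret;

	ret = vepc_ret;
	if (ret)
		return ret;

	return 0;
}
```

Note that the two are not equivalent: the first tolerates one of the drivers
failing, while the second does not.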

>  	if (ret)
>  		goto err_kthread;
>  
> -- 
> 2.29.2
> 

/Jarkko
> 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 09/27] x86/sgx: Expose SGX architectural definitions to the kernel
  2021-01-26  9:31 ` [RFC PATCH v3 09/27] x86/sgx: Expose SGX architectural definitions to the kernel Kai Huang
@ 2021-01-30 14:46   ` Jarkko Sakkinen
  0 siblings, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:46 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:31:01PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> Expose SGX architectural structures, as KVM will use many of the
> architectural constants and structs to virtualize SGX.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>


Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

> ---
> v2->v3:
> 
>  - Added "Expose SGX architectural structures, as..." to commit message,
>    per Jarkko.
> 
> ---
>  arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx_arch.h} | 0
>  arch/x86/kernel/cpu/sgx/encl.c                             | 2 +-
>  arch/x86/kernel/cpu/sgx/sgx.h                              | 2 +-
>  tools/testing/selftests/sgx/defines.h                      | 2 +-
>  4 files changed, 3 insertions(+), 3 deletions(-)
>  rename arch/x86/{kernel/cpu/sgx/arch.h => include/asm/sgx_arch.h} (100%)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/arch.h b/arch/x86/include/asm/sgx_arch.h
> similarity index 100%
> rename from arch/x86/kernel/cpu/sgx/arch.h
> rename to arch/x86/include/asm/sgx_arch.h
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index a78b71447771..68941c349cfe 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -7,7 +7,7 @@
>  #include <linux/shmem_fs.h>
>  #include <linux/suspend.h>
>  #include <linux/sched/mm.h>
> -#include "arch.h"
> +#include <asm/sgx_arch.h>
>  #include "encl.h"
>  #include "encls.h"
>  #include "sgx.h"
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index 5fa42d143feb..509f2af33e1d 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -8,7 +8,7 @@
>  #include <linux/rwsem.h>
>  #include <linux/types.h>
>  #include <asm/asm.h>
> -#include "arch.h"
> +#include <asm/sgx_arch.h>
>  
>  #undef pr_fmt
>  #define pr_fmt(fmt) "sgx: " fmt
> diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
> index 592c1ccf4576..4dd39a003f40 100644
> --- a/tools/testing/selftests/sgx/defines.h
> +++ b/tools/testing/selftests/sgx/defines.h
> @@ -14,7 +14,7 @@
>  #define __aligned(x) __attribute__((__aligned__(x)))
>  #define __packed __attribute__((packed))
>  
> -#include "../../../../arch/x86/kernel/cpu/sgx/arch.h"
> +#include "../../../../arch/x86/include/asm/sgx_arch.h"
>  #include "../../../../arch/x86/include/asm/enclu.h"
>  #include "../../../../arch/x86/include/uapi/asm/sgx.h"
>  
> -- 
> 2.29.2
> 
> 

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 12/27] x86/sgx: Add encls_faulted() helper
  2021-01-26  9:31 ` [RFC PATCH v3 12/27] x86/sgx: Add encls_faulted() helper Kai Huang
@ 2021-01-30 14:48   ` Jarkko Sakkinen
  0 siblings, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:48 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:31:04PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> Add a helper to extract the fault indicator from an encoded ENCLS return
> value.  SGX virtualization will also need to detect ENCLS faults.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Acked-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>


Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

/Jarkko

> ---
> v2->v3:
> 
>  - Changed commenting style for return value, per Jarkko.
> 
> ---
>  arch/x86/kernel/cpu/sgx/encls.h | 15 ++++++++++++++-
>  arch/x86/kernel/cpu/sgx/ioctl.c |  2 +-
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index be5c49689980..3219d011ee28 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -40,6 +40,19 @@
>  	} while (0);							  \
>  }
>  
> +/*
> + * encls_faulted() - Check if an ENCLS leaf faulted given an error code
> + * @ret 	the return value of an ENCLS leaf function call
> + *
> + * Return:
> + * - true:	ENCLS leaf faulted.
> + * - false:	Otherwise.
> + */
> +static inline bool encls_faulted(int ret)
> +{
> +	return ret & ENCLS_FAULT_FLAG;
> +}
> +
>  /**
>   * encls_failed() - Check if an ENCLS function failed
>   * @ret:	the return value of an ENCLS function call
> @@ -50,7 +63,7 @@
>   */
>  static inline bool encls_failed(int ret)
>  {
> -	if (ret & ENCLS_FAULT_FLAG)
> +	if (encls_faulted(ret))
>  		return ENCLS_TRAPNR(ret) != X86_TRAP_PF;
>  
>  	return !!ret;
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 90a5caf76939..e5977752c7be 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -568,7 +568,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
>  		}
>  	}
>  
> -	if (ret & ENCLS_FAULT_FLAG) {
> +	if (encls_faulted(ret)) {
>  		if (encls_failed(ret))
>  			ENCLS_WARN(ret, "EINIT");
>  
> -- 
> 2.29.2
> 
> 
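
For reference, a self-contained user-space sketch of how the encoded return
value decodes. The values of ENCLS_FAULT_FLAG and X86_TRAP_PF and the
ENCLS_TRAPNR() definition are assumptions copied in for illustration; the
authoritative definitions are in encls.h and the x86 trap headers.

```c
#include <stdbool.h>

/* Assumed values, reproduced here so the example stands alone. */
#define ENCLS_FAULT_FLAG	0x40000000
#define ENCLS_TRAPNR(r)		((r) & ~ENCLS_FAULT_FLAG)
#define X86_TRAP_PF		14

/* True if the ENCLS leaf faulted (as opposed to returning an error code). */
static inline bool encls_faulted(int ret)
{
	return ret & ENCLS_FAULT_FLAG;
}

/* A page fault is not treated as a failure; any other fault or code is. */
static inline bool encls_failed(int ret)
{
	if (encls_faulted(ret))
		return ENCLS_TRAPNR(ret) != X86_TRAP_PF;

	return !!ret;
}
```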

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 13/27] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs
  2021-01-26  9:31 ` [RFC PATCH v3 13/27] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs Kai Huang
@ 2021-01-30 14:49   ` Jarkko Sakkinen
  2021-02-01  1:17     ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:49 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:31:05PM +1300, Kai Huang wrote:
> Add a helper to update SGX_LEPUBKEYHASHn MSRs.  SGX virtualization also
> needs to update those MSRs based on guest's "virtual" SGX_LEPUBKEYHASHn
> before EINIT from guest.
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>


Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

/Jarkko

> ---
> v2->v3:
> 
>  - Added comment for sgx_update_lepubkeyhash(), per Jarkko and Dave.
> 
> ---
>  arch/x86/kernel/cpu/sgx/ioctl.c |  5 ++---
>  arch/x86/kernel/cpu/sgx/main.c  | 15 +++++++++++++++
>  arch/x86/kernel/cpu/sgx/sgx.h   |  2 ++
>  3 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index e5977752c7be..1bae754268d1 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -495,7 +495,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
>  			 void *token)
>  {
>  	u64 mrsigner[4];
> -	int i, j, k;
> +	int i, j;
>  	void *addr;
>  	int ret;
>  
> @@ -544,8 +544,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
>  
>  			preempt_disable();
>  
> -			for (k = 0; k < 4; k++)
> -				wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + k, mrsigner[k]);
> +			sgx_update_lepubkeyhash(mrsigner);
>  
>  			ret = __einit(sigstruct, token, addr);
>  
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 93d249f7bff3..b456899a9532 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -697,6 +697,21 @@ static bool __init sgx_page_cache_init(void)
>  	return true;
>  }
>  
> +
> +/*
> + * Update the SGX_LEPUBKEYHASH MSRs to the values specified by caller.
> + * Bare-metal driver requires to update them to hash of enclave's signer
> + * before EINIT. KVM needs to update them to guest's virtual MSR values
> + * before doing EINIT from guest.
> + */
> +void sgx_update_lepubkeyhash(u64 *lepubkeyhash)
> +{
> +	int i;
> +
> +	for (i = 0; i < 4; i++)
> +		wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, lepubkeyhash[i]);
> +}
> +
>  static int __init sgx_init(void)
>  {
>  	int ret;
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index 509f2af33e1d..ccd4f145c464 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -83,4 +83,6 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
>  int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
>  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
>  
> +void sgx_update_lepubkeyhash(u64 *lepubkeyhash);
> +
>  #endif /* _X86_SGX_H */
> -- 
> 2.29.2
> 
> 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-01-26  9:31 ` [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM Kai Huang
@ 2021-01-30 14:51   ` Jarkko Sakkinen
  2021-02-01  0:17     ` Kai Huang
  2021-02-04  3:53   ` Kai Huang
  1 sibling, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:51 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:31:06PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> The bare-metal kernel must intercept ECREATE to be able to impose policies
> on guests.  When it does this, the bare-metal kernel runs ECREATE against
> the userspace mapping of the virtualized EPC.

I guess Andy's earlier comment applies here, i.e. SGX driver?

> 
> Provide wrappers around __ecreate() and __einit() to hide the ugliness
> of overloading the ENCLS return value to encode multiple error formats
> in a single int.  KVM will trap-and-execute ECREATE and EINIT as part
> of SGX virtualization, and on an exception, KVM needs the trapnr so that
> it can inject the correct fault into the guest.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
> v2->v3:
> 
>  - Added kdoc for sgx_virt_ecreate() and sgx_virt_einit(), per Jarkko.
>  - Changed to use CONFIG_X86_SGX_KVM.
> 
> ---
>  arch/x86/include/asm/sgx.h     | 16 ++++++
>  arch/x86/kernel/cpu/sgx/virt.c | 93 ++++++++++++++++++++++++++++++++++
>  2 files changed, 109 insertions(+)
>  create mode 100644 arch/x86/include/asm/sgx.h
> 
> diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> new file mode 100644
> index 000000000000..8a3ea3e1efbe
> --- /dev/null
> +++ b/arch/x86/include/asm/sgx.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_X86_SGX_H
> +#define _ASM_X86_SGX_H
> +
> +#include <linux/types.h>
> +
> +#ifdef CONFIG_X86_SGX_KVM
> +struct sgx_pageinfo;
> +
> +int sgx_virt_ecreate(struct sgx_pageinfo *pageinfo, void __user *secs,
> +		     int *trapnr);
> +int sgx_virt_einit(void __user *sigstruct, void __user *token,
> +		   void __user *secs, u64 *lepubkeyhash, int *trapnr);
> +#endif
> +
> +#endif /* _ASM_X86_SGX_H */
> diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> index e1ad7856d878..0f5b0e4e33dd 100644
> --- a/arch/x86/kernel/cpu/sgx/virt.c
> +++ b/arch/x86/kernel/cpu/sgx/virt.c
> @@ -252,3 +252,96 @@ int __init sgx_vepc_init(void)
>  
>  	return misc_register(&sgx_vepc_dev);
>  }
> +
> +/**
> + * sgx_virt_ecreate() - Run ECREATE on behalf of guest
> + * @pageinfo:	Pointer to PAGEINFO structure
> + * @secs:	Userspace pointer to SECS page
> + * @trapnr:	trap number injected to guest in case of ECREATE error
> + *
> + * Run ECREATE on behalf of guest after KVM traps ECREATE for the purpose
> + * of enforcing policies of guest's enclaves, and return the trap number
> + * which should be injected to guest in case of any ECREATE error.
> + *
> + * Return:
> + * - 0: 	ECREATE was successful.
> + * - -EFAULT:	ECREATE returned error.
> + */
> +int sgx_virt_ecreate(struct sgx_pageinfo *pageinfo, void __user *secs,
> +		     int *trapnr)
> +{
> +	int ret;
> +
> +	/*
> +	 * @secs is userspace address, and it's not guaranteed @secs points at
> +	 * an actual EPC page. It's also possible to generate a kernel mapping
> +	 * to physical EPC page by resolving PFN but using __uaccess_xx() is
> +	 * simpler.
> +	 */
> +	__uaccess_begin();
> +	ret = __ecreate(pageinfo, (void *)secs);
> +	__uaccess_end();
> +
> +	if (encls_faulted(ret)) {
> +		*trapnr = ENCLS_TRAPNR(ret);
> +		return -EFAULT;
> +	}
> +
> +	/* ECREATE doesn't return an error code, it faults or succeeds. */
> +	WARN_ON_ONCE(ret);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(sgx_virt_ecreate);
> +
> +static int __sgx_virt_einit(void __user *sigstruct, void __user *token,
> +			    void __user *secs)
> +{
> +	int ret;
> +
> +	__uaccess_begin();
> +	ret =  __einit((void *)sigstruct, (void *)token, (void *)secs);
> +	__uaccess_end();
> +	return ret;
> +}
> +
> +/**
> + * sgx_virt_einit() - Run EINIT on behalf of guest
> + * @sigstruct:		Userspace pointer to SIGSTRUCT structure
> + * @token:		Userspace pointer to EINITTOKEN structure
> + * @secs:		Userspace pointer to SECS page
> + * @lepubkeyhash:	Pointer to guest's *virtual* SGX_LEPUBKEYHASH MSR
> + * 			values
> + * @trapnr:		trap number injected to guest in case of EINIT error
> + *
> + * Run EINIT on behalf of guest after KVM traps EINIT. If SGX_LC is available
> + * in host, bare-metal driver may rewrite the hardware values, therefore KVM
> + * needs to update hardware values to guest's virtual MSR values in order to
> + * ensure EINIT is executed with expected hardware values.
> + *
> + * Return:
> + * - 0: 	EINIT was successful.
> + * - -EFAULT:	EINIT returned error.
> + */
> +int sgx_virt_einit(void __user *sigstruct, void __user *token,
> +		   void __user *secs, u64 *lepubkeyhash, int *trapnr)
> +{
> +	int ret;
> +
> +	if (!boot_cpu_has(X86_FEATURE_SGX_LC)) {
> +		ret = __sgx_virt_einit(sigstruct, token, secs);
> +	} else {
> +		preempt_disable();
> +
> +		sgx_update_lepubkeyhash(lepubkeyhash);
> +
> +		ret = __sgx_virt_einit(sigstruct, token, secs);
> +		preempt_enable();
> +	}
> +
> +	if (encls_faulted(ret)) {
> +		*trapnr = ENCLS_TRAPNR(ret);
> +		return -EFAULT;
> +	}

Add an empty line here before the return. This also applies to
sgx_virt_ecreate().

> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(sgx_virt_einit);
> -- 
> 2.29.2

Great work. I think this patch set is shaping up.

/Jarkko
> 
> 
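
The error convention of the wrappers can be sketched in user space as
follows. mock_virt_encls() is a hypothetical stand-in that takes an already
encoded ENCLS return value, and the macro definitions are assumptions
repeated here so the block compiles on its own.

```c
#include <errno.h>

/* Assumed definitions, reproduced so the example stands alone. */
#define ENCLS_FAULT_FLAG	0x40000000
#define ENCLS_TRAPNR(r)		((r) & ~ENCLS_FAULT_FLAG)

/*
 * Split one overloaded int into two channels: a fault becomes -EFAULT plus
 * a trap number for the caller to inject, anything else passes through.
 */
static int mock_virt_encls(int encls_ret, int *trapnr)
{
	if (encls_ret & ENCLS_FAULT_FLAG) {
		*trapnr = ENCLS_TRAPNR(encls_ret);
		return -EFAULT;
	}

	return encls_ret;
}
```

A caller such as KVM would then see three outcomes: 0 for success, -EFAULT
with a trap number to inject into the guest, or a positive SGX error code to
hand back to the guest as the instruction's result.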

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 15/27] x86/sgx: Move provisioning device creation out of SGX driver
  2021-01-26  9:31 ` [RFC PATCH v3 15/27] x86/sgx: Move provisioning device creation out of SGX driver Kai Huang
@ 2021-01-30 14:52   ` Jarkko Sakkinen
  0 siblings, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 14:52 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Jan 26, 2021 at 10:31:07PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> And extract sgx_set_attribute() out of sgx_ioc_enclave_provision() and
> export it as symbol for KVM to use.
> 
> The provisioning key is sensitive. The SGX driver only allows creating an
> enclave that can access the provisioning key when the enclave creator has
> permission to open /dev/sgx_provision.  The same should apply to VMs: the
> provisioning key is platform specific, so an unrestricted VM could also
> potentially compromise it.
> 
> Move provisioning device creation out of sgx_drv_init() to sgx_init() as
> preparation for adding SGX virtualization support, so that even if the SGX
> driver is not enabled because flexible launch control is not available,
> SGX virtualization can still be enabled and can use the device to restrict
> a VM's ability to access the provisioning key.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>


Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

/Jarkko

> ---
> v2->v3:
> 
>  - Added kdoc for sgx_set_attribute(), per Jarkko.
> 
> ---
>  arch/x86/include/asm/sgx.h       |  3 ++
>  arch/x86/kernel/cpu/sgx/driver.c | 17 ----------
>  arch/x86/kernel/cpu/sgx/ioctl.c  | 16 ++-------
>  arch/x86/kernel/cpu/sgx/main.c   | 58 +++++++++++++++++++++++++++++++-
>  4 files changed, 62 insertions(+), 32 deletions(-)
> 
> diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> index 8a3ea3e1efbe..d67afb051db3 100644
> --- a/arch/x86/include/asm/sgx.h
> +++ b/arch/x86/include/asm/sgx.h
> @@ -4,6 +4,9 @@
>  
>  #include <linux/types.h>
>  
> +int sgx_set_attribute(unsigned long *allowed_attributes,
> +		      unsigned int attribute_fd);
> +
>  #ifdef CONFIG_X86_SGX_KVM
>  struct sgx_pageinfo;
>  
> diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
> index f2eac41bb4ff..4f3241109bda 100644
> --- a/arch/x86/kernel/cpu/sgx/driver.c
> +++ b/arch/x86/kernel/cpu/sgx/driver.c
> @@ -133,10 +133,6 @@ static const struct file_operations sgx_encl_fops = {
>  	.get_unmapped_area	= sgx_get_unmapped_area,
>  };
>  
> -const struct file_operations sgx_provision_fops = {
> -	.owner			= THIS_MODULE,
> -};
> -
>  static struct miscdevice sgx_dev_enclave = {
>  	.minor = MISC_DYNAMIC_MINOR,
>  	.name = "sgx_enclave",
> @@ -144,13 +140,6 @@ static struct miscdevice sgx_dev_enclave = {
>  	.fops = &sgx_encl_fops,
>  };
>  
> -static struct miscdevice sgx_dev_provision = {
> -	.minor = MISC_DYNAMIC_MINOR,
> -	.name = "sgx_provision",
> -	.nodename = "sgx_provision",
> -	.fops = &sgx_provision_fops,
> -};
> -
>  int __init sgx_drv_init(void)
>  {
>  	unsigned int eax, ebx, ecx, edx;
> @@ -184,11 +173,5 @@ int __init sgx_drv_init(void)
>  	if (ret)
>  		return ret;
>  
> -	ret = misc_register(&sgx_dev_provision);
> -	if (ret) {
> -		misc_deregister(&sgx_dev_enclave);
> -		return ret;
> -	}
> -
>  	return 0;
>  }
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 1bae754268d1..4714de12422d 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -2,6 +2,7 @@
>  /*  Copyright(c) 2016-20 Intel Corporation. */
>  
>  #include <asm/mman.h>
> +#include <asm/sgx.h>
>  #include <linux/mman.h>
>  #include <linux/delay.h>
>  #include <linux/file.h>
> @@ -664,24 +665,11 @@ static long sgx_ioc_enclave_init(struct sgx_encl *encl, void __user *arg)
>  static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
>  {
>  	struct sgx_enclave_provision params;
> -	struct file *file;
>  
>  	if (copy_from_user(&params, arg, sizeof(params)))
>  		return -EFAULT;
>  
> -	file = fget(params.fd);
> -	if (!file)
> -		return -EINVAL;
> -
> -	if (file->f_op != &sgx_provision_fops) {
> -		fput(file);
> -		return -EINVAL;
> -	}
> -
> -	encl->attributes_mask |= SGX_ATTR_PROVISIONKEY;
> -
> -	fput(file);
> -	return 0;
> +	return sgx_set_attribute(&encl->attributes_mask, params.fd);
>  }
>  
>  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index b456899a9532..fba3eaf2ae26 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -1,14 +1,18 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /*  Copyright(c) 2016-20 Intel Corporation. */
>  
> +#include <linux/file.h>
>  #include <linux/freezer.h>
>  #include <linux/highmem.h>
>  #include <linux/kthread.h>
> +#include <linux/miscdevice.h>
>  #include <linux/pagemap.h>
>  #include <linux/ratelimit.h>
>  #include <linux/sched/mm.h>
>  #include <linux/sched/signal.h>
>  #include <linux/slab.h>
> +#include <asm/sgx_arch.h>
> +#include <asm/sgx.h>
>  #include "driver.h"
>  #include "encl.h"
>  #include "encls.h"
> @@ -712,6 +716,51 @@ void sgx_update_lepubkeyhash(u64 *lepubkeyhash)
>  		wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, lepubkeyhash[i]);
>  }
>  
> +const struct file_operations sgx_provision_fops = {
> +	.owner			= THIS_MODULE,
> +};
> +
> +static struct miscdevice sgx_dev_provision = {
> +	.minor = MISC_DYNAMIC_MINOR,
> +	.name = "sgx_provision",
> +	.nodename = "sgx_provision",
> +	.fops = &sgx_provision_fops,
> +};
> +
> +/**
> + * sgx_set_attribute() - Update allowed attributes given file descriptor
> + * @allowed_attributes: 	Pointer to allowed enclave attributes
> + * @attribute_fd:		File descriptor for specific attribute
> + *
> + * Append enclave attribute indicated by file descriptor to allowed
> + * attributes. Currently only SGX_ATTR_PROVISIONKEY indicated by
> + * /dev/sgx_provision is supported.
> + *
> + * Return:
> + * -0:		SGX_ATTR_PROVISIONKEY is appended to allowed_attributes
> + * -EINVAL:	Invalid, or not supported file descriptor
> + */
> +int sgx_set_attribute(unsigned long *allowed_attributes,
> +		      unsigned int attribute_fd)
> +{
> +	struct file *file;
> +
> +	file = fget(attribute_fd);
> +	if (!file)
> +		return -EINVAL;
> +
> +	if (file->f_op != &sgx_provision_fops) {
> +		fput(file);
> +		return -EINVAL;
> +	}
> +
> +	*allowed_attributes |= SGX_ATTR_PROVISIONKEY;
> +
> +	fput(file);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(sgx_set_attribute);
> +
>  static int __init sgx_init(void)
>  {
>  	int ret;
> @@ -728,13 +777,20 @@ static int __init sgx_init(void)
>  		goto err_page_cache;
>  	}
>  
> +	ret = misc_register(&sgx_dev_provision);
> +	if (ret)
> +		goto err_kthread;
> +
>  	/* Success if the native *or* virtual EPC driver initialized cleanly. */
>  	ret = !!sgx_drv_init() & !!sgx_vepc_init();
>  	if (ret)
> -		goto err_kthread;
> +		goto err_provision;
>  
>  	return 0;
>  
> +err_provision:
> +	misc_deregister(&sgx_dev_provision);
> +
>  err_kthread:
>  	kthread_stop(ksgxd_tsk);
>  
> -- 
> 2.29.2
> 
> 
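For readers skimming the diff, the check that sgx_set_attribute() performs can
be modelled in plain userspace C. This is only a sketch under stated
assumptions: every model_* name below is hypothetical, the attribute bit value
is illustrative, and no real kernel API is used. The point it shows is that
the file descriptor acts as a capability: the attribute is granted only when
the file's operations table is, by pointer identity, the provisioning device's
table.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file_operations and struct file. */
struct model_fops { int unused; };
struct model_file { const struct model_fops *f_op; };

/* Two distinct operation tables; only the first one grants the attribute. */
static const struct model_fops model_provision_fops = { 0 };
static const struct model_fops model_other_fops = { 0 };

/* Illustrative bit value, not taken from the patch. */
#define MODEL_ATTR_PROVISIONKEY (1UL << 4)

/* Models opening the provisioning device node. */
static struct model_file model_open_provision(void)
{
	struct model_file f = { .f_op = &model_provision_fops };
	return f;
}

/* Models opening any unrelated file. */
static struct model_file model_open_other(void)
{
	struct model_file f = { .f_op = &model_other_fops };
	return f;
}

/*
 * Same shape as the helper in the patch: reject a missing file or a file
 * of the wrong type, otherwise OR the attribute into the allowed mask.
 */
static int model_set_attribute(unsigned long *allowed_attributes,
			       const struct model_file *file)
{
	if (!file)
		return -EINVAL;

	if (file->f_op != &model_provision_fops)
		return -EINVAL;

	*allowed_attributes |= MODEL_ATTR_PROVISIONKEY;

	return 0;
}
```

Because the comparison is on the address of the operations table rather than
on a path name, a caller cannot satisfy the check with a look-alike file; it
has to have been able to open the real device.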

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-01-26  9:31 ` [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union Kai Huang
@ 2021-01-30 15:00   ` Jarkko Sakkinen
  2021-02-01  0:32     ` Kai Huang
  2021-02-01 17:12     ` Sean Christopherson
  0 siblings, 2 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-01-30 15:00 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> bits 15:0, and single-bit modifiers in bits 31:16.
> 
> Historically, KVM has only had to worry about handling the "failed
> VM-Entry" modifier, which could only be set in very specific flows and
> required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> bit was a somewhat viable approach.  But even with only a single bit to
> worry about, KVM has had several bugs related to comparing a basic exit
> reason against the full exit reason stored in vcpu_vmx.
> 
> Upcoming Intel features, e.g. SGX, will add new modifier bits that can
> be set on more or less any VM-Exit, as opposed to the significantly more
> restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> flows isn't scalable.  Tracking exit reason in a union forces code to
> explicitly choose between consuming the full exit reason and the basic
> exit, and is a convenient way to document and access the modifiers.

I *believe* that the change is correct, but I got lost in the last paragraph,
most likely only because of my lack of expertise in this area.

I ask the most basic question: why will SGX add new modifier bits?

/Jarkko

> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 42 +++++++++++++++---------
>  arch/x86/kvm/vmx/vmx.c    | 68 ++++++++++++++++++++-------------------
>  arch/x86/kvm/vmx/vmx.h    | 25 +++++++++++++-
>  3 files changed, 86 insertions(+), 49 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 0fbb46990dfc..f112c2482887 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3311,7 +3311,11 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>  	enum vm_entry_failure_code entry_failure_code;
>  	bool evaluate_pending_interrupts;
> -	u32 exit_reason, failed_index;
> +	u32 failed_index;
> +	union vmx_exit_reason exit_reason = {
> +		.basic = -1,
> +		.failed_vmentry = 1,
> +	};
>  
>  	if (kvm_check_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu))
>  		kvm_vcpu_flush_tlb_current(vcpu);
> @@ -3363,7 +3367,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>  
>  		if (nested_vmx_check_guest_state(vcpu, vmcs12,
>  						 &entry_failure_code)) {
> -			exit_reason = EXIT_REASON_INVALID_STATE;
> +			exit_reason.basic = EXIT_REASON_INVALID_STATE;
>  			vmcs12->exit_qualification = entry_failure_code;
>  			goto vmentry_fail_vmexit;
>  		}
> @@ -3374,7 +3378,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>  		vcpu->arch.tsc_offset += vmcs12->tsc_offset;
>  
>  	if (prepare_vmcs02(vcpu, vmcs12, &entry_failure_code)) {
> -		exit_reason = EXIT_REASON_INVALID_STATE;
> +		exit_reason.basic = EXIT_REASON_INVALID_STATE;
>  		vmcs12->exit_qualification = entry_failure_code;
>  		goto vmentry_fail_vmexit_guest_mode;
>  	}
> @@ -3384,7 +3388,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>  						   vmcs12->vm_entry_msr_load_addr,
>  						   vmcs12->vm_entry_msr_load_count);
>  		if (failed_index) {
> -			exit_reason = EXIT_REASON_MSR_LOAD_FAIL;
> +			exit_reason.basic = EXIT_REASON_MSR_LOAD_FAIL;
>  			vmcs12->exit_qualification = failed_index;
>  			goto vmentry_fail_vmexit_guest_mode;
>  		}
> @@ -3452,7 +3456,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
>  		return NVMX_VMENTRY_VMEXIT;
>  
>  	load_vmcs12_host_state(vcpu, vmcs12);
> -	vmcs12->vm_exit_reason = exit_reason | VMX_EXIT_REASONS_FAILED_VMENTRY;
> +	vmcs12->vm_exit_reason = exit_reason.full;
>  	if (enable_shadow_vmcs || vmx->nested.hv_evmcs)
>  		vmx->nested.need_vmcs12_to_shadow_sync = true;
>  	return NVMX_VMENTRY_VMEXIT;
> @@ -5540,7 +5544,12 @@ static int handle_vmfunc(struct kvm_vcpu *vcpu)
>  	return kvm_skip_emulated_instruction(vcpu);
>  
>  fail:
> -	nested_vmx_vmexit(vcpu, vmx->exit_reason,
> +	/*
> +	 * This is effectively a reflected VM-Exit, as opposed to a synthesized
> +	 * nested VM-Exit.  Pass the original exit reason, i.e. don't hardcode
> +	 * EXIT_REASON_VMFUNC as the exit reason.
> +	 */
> +	nested_vmx_vmexit(vcpu, vmx->exit_reason.full,
>  			  vmx_get_intr_info(vcpu),
>  			  vmx_get_exit_qual(vcpu));
>  	return 1;
> @@ -5608,7 +5617,8 @@ static bool nested_vmx_exit_handled_io(struct kvm_vcpu *vcpu,
>   * MSR bitmap. This may be the case even when L0 doesn't use MSR bitmaps.
>   */
>  static bool nested_vmx_exit_handled_msr(struct kvm_vcpu *vcpu,
> -	struct vmcs12 *vmcs12, u32 exit_reason)
> +					struct vmcs12 *vmcs12,
> +					union vmx_exit_reason exit_reason)
>  {
>  	u32 msr_index = kvm_rcx_read(vcpu);
>  	gpa_t bitmap;
> @@ -5622,7 +5632,7 @@ static bool nested_vmx_exit_handled_msr(struct kvm_vcpu *vcpu,
>  	 * First we need to figure out which of the four to use:
>  	 */
>  	bitmap = vmcs12->msr_bitmap;
> -	if (exit_reason == EXIT_REASON_MSR_WRITE)
> +	if (exit_reason.basic == EXIT_REASON_MSR_WRITE)
>  		bitmap += 2048;
>  	if (msr_index >= 0xc0000000) {
>  		msr_index -= 0xc0000000;
> @@ -5759,11 +5769,12 @@ static bool nested_vmx_exit_handled_mtf(struct vmcs12 *vmcs12)
>   * Return true if L0 wants to handle an exit from L2 regardless of whether or not
>   * L1 wants the exit.  Only call this when in is_guest_mode (L2).
>   */
> -static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
> +static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
> +				     union vmx_exit_reason exit_reason)
>  {
>  	u32 intr_info;
>  
> -	switch ((u16)exit_reason) {
> +	switch (exit_reason.basic) {
>  	case EXIT_REASON_EXCEPTION_NMI:
>  		intr_info = vmx_get_intr_info(vcpu);
>  		if (is_nmi(intr_info))
> @@ -5819,12 +5830,13 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
>   * Return 1 if L1 wants to intercept an exit from L2.  Only call this when in
>   * is_guest_mode (L2).
>   */
> -static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
> +static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
> +				     union vmx_exit_reason exit_reason)
>  {
>  	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>  	u32 intr_info;
>  
> -	switch ((u16)exit_reason) {
> +	switch (exit_reason.basic) {
>  	case EXIT_REASON_EXCEPTION_NMI:
>  		intr_info = vmx_get_intr_info(vcpu);
>  		if (is_nmi(intr_info))
> @@ -5943,7 +5955,7 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu, u32 exit_reason)
>  bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> -	u32 exit_reason = vmx->exit_reason;
> +	union vmx_exit_reason exit_reason = vmx->exit_reason;
>  	unsigned long exit_qual;
>  	u32 exit_intr_info;
>  
> @@ -5962,7 +5974,7 @@ bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
>  		goto reflect_vmexit;
>  	}
>  
> -	trace_kvm_nested_vmexit(exit_reason, vcpu, KVM_ISA_VMX);
> +	trace_kvm_nested_vmexit(exit_reason.full, vcpu, KVM_ISA_VMX);
>  
>  	/* If L0 (KVM) wants the exit, it trumps L1's desires. */
>  	if (nested_vmx_l0_wants_exit(vcpu, exit_reason))
> @@ -5988,7 +6000,7 @@ bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
>  	exit_qual = vmx_get_exit_qual(vcpu);
>  
>  reflect_vmexit:
> -	nested_vmx_vmexit(vcpu, exit_reason, exit_intr_info, exit_qual);
> +	nested_vmx_vmexit(vcpu, exit_reason.full, exit_intr_info, exit_qual);
>  	return true;
>  }
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2af05d3b0590..746b87375aff 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1577,7 +1577,7 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
>  	 * i.e. we end up advancing IP with some random value.
>  	 */
>  	if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
> -	    to_vmx(vcpu)->exit_reason != EXIT_REASON_EPT_MISCONFIG) {
> +	    to_vmx(vcpu)->exit_reason.basic != EXIT_REASON_EPT_MISCONFIG) {
>  		orig_rip = kvm_rip_read(vcpu);
>  		rip = orig_rip + vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
>  #ifdef CONFIG_X86_64
> @@ -5667,7 +5667,7 @@ static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2,
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
>  	*info1 = vmx_get_exit_qual(vcpu);
> -	if (!(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY)) {
> +	if (!vmx->exit_reason.failed_vmentry) {
>  		*info2 = vmx->idt_vectoring_info;
>  		*intr_info = vmx_get_intr_info(vcpu);
>  		if (is_exception_with_error_code(*intr_info))
> @@ -5911,8 +5911,9 @@ void dump_vmcs(void)
>  static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> -	u32 exit_reason = vmx->exit_reason;
> +	union vmx_exit_reason exit_reason = vmx->exit_reason;
>  	u32 vectoring_info = vmx->idt_vectoring_info;
> +	u16 exit_handler_index;
>  
>  	/*
>  	 * Flush logged GPAs PML buffer, this will make dirty_bitmap more
> @@ -5954,11 +5955,11 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  			return 1;
>  	}
>  
> -	if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
> +	if (exit_reason.failed_vmentry) {
>  		dump_vmcs();
>  		vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
>  		vcpu->run->fail_entry.hardware_entry_failure_reason
> -			= exit_reason;
> +			= exit_reason.full;
>  		vcpu->run->fail_entry.cpu = vcpu->arch.last_vmentry_cpu;
>  		return 0;
>  	}
> @@ -5980,18 +5981,18 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  	 * will cause infinite loop.
>  	 */
>  	if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
> -			(exit_reason != EXIT_REASON_EXCEPTION_NMI &&
> -			exit_reason != EXIT_REASON_EPT_VIOLATION &&
> -			exit_reason != EXIT_REASON_PML_FULL &&
> -			exit_reason != EXIT_REASON_APIC_ACCESS &&
> -			exit_reason != EXIT_REASON_TASK_SWITCH)) {
> +	    (exit_reason.basic != EXIT_REASON_EXCEPTION_NMI &&
> +	     exit_reason.basic != EXIT_REASON_EPT_VIOLATION &&
> +	     exit_reason.basic != EXIT_REASON_PML_FULL &&
> +	     exit_reason.basic != EXIT_REASON_APIC_ACCESS &&
> +	     exit_reason.basic != EXIT_REASON_TASK_SWITCH)) {
>  		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>  		vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV;
>  		vcpu->run->internal.ndata = 3;
>  		vcpu->run->internal.data[0] = vectoring_info;
> -		vcpu->run->internal.data[1] = exit_reason;
> +		vcpu->run->internal.data[1] = exit_reason.full;
>  		vcpu->run->internal.data[2] = vcpu->arch.exit_qualification;
> -		if (exit_reason == EXIT_REASON_EPT_MISCONFIG) {
> +		if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG) {
>  			vcpu->run->internal.ndata++;
>  			vcpu->run->internal.data[3] =
>  				vmcs_read64(GUEST_PHYSICAL_ADDRESS);
> @@ -6023,38 +6024,39 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  	if (exit_fastpath != EXIT_FASTPATH_NONE)
>  		return 1;
>  
> -	if (exit_reason >= kvm_vmx_max_exit_handlers)
> +	if (exit_reason.basic >= kvm_vmx_max_exit_handlers)
>  		goto unexpected_vmexit;
>  #ifdef CONFIG_RETPOLINE
> -	if (exit_reason == EXIT_REASON_MSR_WRITE)
> +	if (exit_reason.basic == EXIT_REASON_MSR_WRITE)
>  		return kvm_emulate_wrmsr(vcpu);
> -	else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
> +	else if (exit_reason.basic == EXIT_REASON_PREEMPTION_TIMER)
>  		return handle_preemption_timer(vcpu);
> -	else if (exit_reason == EXIT_REASON_INTERRUPT_WINDOW)
> +	else if (exit_reason.basic == EXIT_REASON_INTERRUPT_WINDOW)
>  		return handle_interrupt_window(vcpu);
> -	else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> +	else if (exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT)
>  		return handle_external_interrupt(vcpu);
> -	else if (exit_reason == EXIT_REASON_HLT)
> +	else if (exit_reason.basic == EXIT_REASON_HLT)
>  		return kvm_emulate_halt(vcpu);
> -	else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
> +	else if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG)
>  		return handle_ept_misconfig(vcpu);
>  #endif
>  
> -	exit_reason = array_index_nospec(exit_reason,
> -					 kvm_vmx_max_exit_handlers);
> -	if (!kvm_vmx_exit_handlers[exit_reason])
> +	exit_handler_index = array_index_nospec((u16)exit_reason.basic,
> +						kvm_vmx_max_exit_handlers);
> +	if (!kvm_vmx_exit_handlers[exit_handler_index])
>  		goto unexpected_vmexit;
>  
> -	return kvm_vmx_exit_handlers[exit_reason](vcpu);
> +	return kvm_vmx_exit_handlers[exit_handler_index](vcpu);
>  
>  unexpected_vmexit:
> -	vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n", exit_reason);
> +	vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
> +		    exit_reason.full);
>  	dump_vmcs();
>  	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
>  	vcpu->run->internal.suberror =
>  			KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
>  	vcpu->run->internal.ndata = 2;
> -	vcpu->run->internal.data[0] = exit_reason;
> +	vcpu->run->internal.data[0] = exit_reason.full;
>  	vcpu->run->internal.data[1] = vcpu->arch.last_vmentry_cpu;
>  	return 0;
>  }
> @@ -6373,9 +6375,9 @@ static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
> -	if (vmx->exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> +	if (vmx->exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT)
>  		handle_external_interrupt_irqoff(vcpu);
> -	else if (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI)
> +	else if (vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI)
>  		handle_exception_nmi_irqoff(vmx);
>  }
>  
> @@ -6567,7 +6569,7 @@ void noinstr vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
>  
>  static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
>  {
> -	switch (to_vmx(vcpu)->exit_reason) {
> +	switch (to_vmx(vcpu)->exit_reason.basic) {
>  	case EXIT_REASON_MSR_WRITE:
>  		return handle_fastpath_set_msr_irqoff(vcpu);
>  	case EXIT_REASON_PREEMPTION_TIMER:
> @@ -6766,17 +6768,17 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>  	vmx->idt_vectoring_info = 0;
>  
>  	if (unlikely(vmx->fail)) {
> -		vmx->exit_reason = 0xdead;
> +		vmx->exit_reason.full = 0xdead;
>  		return EXIT_FASTPATH_NONE;
>  	}
>  
> -	vmx->exit_reason = vmcs_read32(VM_EXIT_REASON);
> -	if (unlikely((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY))
> +	vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
> +	if (unlikely(vmx->exit_reason.basic == EXIT_REASON_MCE_DURING_VMENTRY))
>  		kvm_machine_check();
>  
> -	trace_kvm_exit(vmx->exit_reason, vcpu, KVM_ISA_VMX);
> +	trace_kvm_exit(vmx->exit_reason.full, vcpu, KVM_ISA_VMX);
>  
> -	if (unlikely(vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
> +	if (unlikely(vmx->exit_reason.failed_vmentry))
>  		return EXIT_FASTPATH_NONE;
>  
>  	vmx->loaded_vmcs->launched = 1;
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 9d3a557949ac..903f246b5abd 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -70,6 +70,29 @@ struct pt_desc {
>  	struct pt_ctx guest;
>  };
>  
> +union vmx_exit_reason {
> +	struct {
> +		u32	basic			: 16;
> +		u32	reserved16		: 1;
> +		u32	reserved17		: 1;
> +		u32	reserved18		: 1;
> +		u32	reserved19		: 1;
> +		u32	reserved20		: 1;
> +		u32	reserved21		: 1;
> +		u32	reserved22		: 1;
> +		u32	reserved23		: 1;
> +		u32	reserved24		: 1;
> +		u32	reserved25		: 1;
> +		u32	reserved26		: 1;
> +		u32	sgx_enclave_mode	: 1;
> +		u32	smi_pending_mtf		: 1;
> +		u32	smi_from_vmx_root	: 1;
> +		u32	reserved30		: 1;
> +		u32	failed_vmentry		: 1;
> +	};
> +	u32 full;
> +};
> +
>  /*
>   * The nested_vmx structure is part of vcpu_vmx, and holds information we need
>   * for correct emulation of VMX (i.e., nested VMX) on this vcpu.
> @@ -244,7 +267,7 @@ struct vcpu_vmx {
>  	int vpid;
>  	bool emulation_required;
>  
> -	u32 exit_reason;
> +	union vmx_exit_reason exit_reason;
>  
>  	/* Posted interrupt descriptor */
>  	struct pi_desc pi_desc;
> -- 
> 2.29.2
> 
> 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-01-30 13:20       ` Jarkko Sakkinen
@ 2021-02-01  0:01         ` Kai Huang
  2021-02-02 17:17           ` Jarkko Sakkinen
  2021-02-02 17:56           ` Paolo Bonzini
  0 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-01  0:01 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Sat, 30 Jan 2021 15:20:54 +0200 Jarkko Sakkinen wrote:
> On Wed, Jan 27, 2021 at 12:18:32PM +1300, Kai Huang wrote:
> > On Tue, 2021-01-26 at 07:34 -0800, Dave Hansen wrote:
> > > On 1/26/21 1:30 AM, Kai Huang wrote:
> > > > From: Sean Christopherson <seanjc@google.com>
> > > > 
> > > > Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
> > > > features, since adding a new leaf for only two bits would be wasteful.
> > > > As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
> > > > guest, and to do so correctly needs to query hardware and kernel support
> > > > for SGX1 and SGX2.
> > > 
> > > It's also not _just_ exposing the CPUID leaves.  There are some checks
> > > here when KVM is emulating some SGX instructions too, right?
> > 
> > > I would say trapping instead of emulating, but yes, KVM will do more. However, those
> > > are details, and I don't think we should put lots of details here. Or perhaps
> > > we can use 'for instance' as a brief description:
> > > 
> > > As part of virtualizing SGX, KVM will need to use the two flags, for instance, to
> > > expose them to the guest.
> > 
> > ?
> > 
> > > 
> > > > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > > > index 84b887825f12..18b2d0c8bbbe 100644
> > > > --- a/arch/x86/include/asm/cpufeatures.h
> > > > +++ b/arch/x86/include/asm/cpufeatures.h
> > > > @@ -292,6 +292,8 @@
> > > >  #define X86_FEATURE_FENCE_SWAPGS_KERNEL	(11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
> > > >  #define X86_FEATURE_SPLIT_LOCK_DETECT	(11*32+ 6) /* #AC for split lock */
> > > >  #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
> > > > +#define X86_FEATURE_SGX1		(11*32+ 8) /* Software Guard Extensions sub-feature SGX1 */
> > > > +#define X86_FEATURE_SGX2        	(11*32+ 9) /* Software Guard Extensions sub-feature SGX2 */
> > > 
> > > FWIW, I'm not sure how valuable it is to spell the SGX acronym out three
> > > times.  Can't we use those bytes to put something more useful in that
> > > comment?
> > 
> > I think we can remove comment for SGX1, since it is basically SGX.
> > 
> > For SGX2, how about below?
> > 
> > /* SGX Enclave Dynamic Memory Management */
> 
> (EDMM)

Is EDMM obvious to everyone, or should the comment explicitly say Enclave
Dynamic Memory Management?

Also, do you think we need a comment for the SGX1 bit? I can add /* Basic SGX */,
but I am not sure whether it is required.
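
As background for the comment wording: per the SDM, SGX1 and SGX2 are reported
in bits 0 and 1 of CPUID.(EAX=0x12,ECX=0):EAX. A minimal userspace sketch of
the decoding follows. The struct and function names are hypothetical and are
not kernel API; the caller is assumed to have obtained the raw EAX value from
the CPUID leaf already.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in CPUID.(EAX=0x12,ECX=0):EAX, per the SDM. */
#define SGX_LEAF_EAX_SGX1 (1u << 0)	/* basic SGX instruction set */
#define SGX_LEAF_EAX_SGX2 (1u << 1)	/* Enclave Dynamic Memory Management */

/* Hypothetical container for the two sub-feature flags. */
struct sgx_subfeatures {
	bool sgx1;
	bool sgx2;
};

/* Decode the two sub-feature bits from a raw EAX value of the SGX leaf. */
static struct sgx_subfeatures sgx_decode_leaf_eax(uint32_t eax)
{
	struct sgx_subfeatures f = {
		.sgx1 = (eax & SGX_LEAF_EAX_SGX1) != 0,
		.sgx2 = (eax & SGX_LEAF_EAX_SGX2) != 0,
	};

	return f;
}
```

In the kernel the two bits end up as scattered feature flags instead of a
dedicated leaf word, as the patch description says; the sketch only shows
where the bits come from.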

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 02/27] x86/cpufeatures: Make SGX_LC feature bit depend on SGX bit
  2021-01-30 13:22   ` Jarkko Sakkinen
@ 2021-02-01  0:08     ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-01  0:08 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Sat, 30 Jan 2021 15:22:49 +0200 Jarkko Sakkinen wrote:
> On Tue, Jan 26, 2021 at 10:30:17PM +1300, Kai Huang wrote:
> > Move the SGX_LC feature bit to the CPUID dependency table as well, along
> > with the newly added SGX1 and SGX2 bits, to make clearing all SGX feature
> > bits easier. Also remove clear_sgx_caps() since it is now just a wrapper
> > around setup_clear_cpu_cap(X86_FEATURE_SGX).
> > 
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> 
> Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
> 
> So could this be an improvement to the existing code? If so, then
> this should be the first patch. Also, I think that then this can
> be merged independently from rest of the patch set.

Without SGX1/SGX2, I don't know whether it is worth putting SGX_LC into the
CPUID dependency table and killing clear_sgx_caps(). And since both you and
Dave have already provided Acked-by, I am a little reluctant to switch the
order (because obviously both patches would be different).

Let me know if you still want me to switch the patch order.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-01-27  1:26         ` Kai Huang
@ 2021-02-01  0:11           ` Kai Huang
  2021-02-03 10:03             ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-01  0:11 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, jarkko, luto,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Wed, 27 Jan 2021 14:26:52 +1300 Kai Huang wrote:
> On Tue, 26 Jan 2021 17:12:12 -0800 Dave Hansen wrote:
> > On 1/26/21 5:08 PM, Kai Huang wrote:
> > > > I don't have a deep understanding of the SGX driver. Would you help to answer?
> > 
> > Kai, as the patch submitter, you are expected to be able to at least
> > minimally explain what the patch is doing.  Please endeavor to obtain
> > this understanding before sending patches in the future.
> 
> I see. Thanks.

Hi Jarkko,

I think I'll remove this patch in the next version, since it is not related to
KVM SGX. And I'll rebase your second patch onto the current tip/x86/sgx. You
may send out this patch independently. Let me know if you have any comments.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-01-30 14:51   ` Jarkko Sakkinen
@ 2021-02-01  0:17     ` Kai Huang
  2021-02-02 17:20       ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-01  0:17 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Sat, 30 Jan 2021 16:51:41 +0200 Jarkko Sakkinen wrote:
> On Tue, Jan 26, 2021 at 10:31:06PM +1300, Kai Huang wrote:
> > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > 
> > The bare-metal kernel must intercept ECREATE to be able to impose policies
> > on guests.  When it does this, the bare-metal kernel runs ECREATE against
> > the userspace mapping of the virtualized EPC.
> 
> I guess Andy's earlier comment applies here, i.e. SGX driver?

Sure.

[...]

> > +	}
> > +
> > +	if (encls_faulted(ret)) {
> > +		*trapnr = ENCLS_TRAPNR(ret);
> > +		return -EFAULT;
> > +	}
> 
> Empty line here before return. Applies also to sgx_virt_ecreate().

Yes, I can remove it, but I am just curious: isn't "having an empty line before
return" good coding style? Do you have a reference to the guideline?
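
To make the layout under discussion concrete, here is the tail of the function
written as a standalone sketch, with a blank line between the error block and
the final return, which is how I read the request. All model_* names and the
fault-flag value are assumptions for illustration only; they stand in for
encls_faulted() and ENCLS_TRAPNR() and are not copied from encls.h.

```c
#include <errno.h>

/* Illustrative encoding: one high bit marks "the instruction faulted". */
#define MODEL_FAULT_FLAG 0x40000000

/* Stand-in for encls_faulted(): was the fault flag set in the result? */
static int model_faulted(int ret)
{
	return (ret & MODEL_FAULT_FLAG) != 0;
}

/* Stand-in for ENCLS_TRAPNR(): strip the flag to recover the trap number. */
static int model_trapnr(int ret)
{
	return ret & ~MODEL_FAULT_FLAG;
}

/*
 * Tail of the helper: a fault is reported as -EFAULT with the trap number
 * passed back through @trapnr; anything else (0 or an SGX error code) is
 * returned unchanged.
 */
static int model_virt_einit_tail(int ret, int *trapnr)
{
	if (model_faulted(ret)) {
		*trapnr = model_trapnr(ret);
		return -EFAULT;
	}

	return ret;
}
```

Note that in this shape a non-faulting, non-zero result is passed through to
the caller, so the caller sees three outcomes: 0, a positive error code, or
-EFAULT with a valid trap number.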

> 
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(sgx_virt_einit);
> > -- 
> > 2.29.2
> 
> Great work. I think this patch set is shaping up.
> 
> /Jarkko
> > 
> > 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-01-30 15:00   ` Jarkko Sakkinen
@ 2021-02-01  0:32     ` Kai Huang
  2021-02-02 17:24       ` Jarkko Sakkinen
  2021-02-01 17:12     ` Sean Christopherson
  1 sibling, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-01  0:32 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Sat, 30 Jan 2021 17:00:46 +0200 Jarkko Sakkinen wrote:
> On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > 
> > Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> > full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> > bits 15:0, and single-bit modifiers in bits 31:16.
> > 
> > Historically, KVM has only had to worry about handling the "failed
> > VM-Entry" modifier, which could only be set in very specific flows and
> > required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> > bit was a somewhat viable approach.  But even with only a single bit to
> > worry about, KVM has had several bugs related to comparing a basic exit
> > reason against the full exit reason stored in vcpu_vmx.
> > 
> > Upcoming Intel features, e.g. SGX, will add new modifier bits that can
> > be set on more or less any VM-Exit, as opposed to the significantly more
> > restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> > flows isn't scalable.  Tracking exit reason in a union forces code to
> > explicitly choose between consuming the full exit reason and the basic
> > exit, and is a convenient way to document and access the modifiers.
> 
> I *believe* that the change is correct but I dropped in the last paragraph
> - most likely only because of lack of expertise in this area.
> 
> I ask the most basic question: why SGX will add new modifier bits?

Not 100% sure about your question. Assuming you are asking about SGX hardware
behavior: the SGX architecture adds a new modifier bit (27) to the Exit Reason,
similar to the new #PF.SGX bit.

Please refer to SDM Volume 3, Chapter 27.2.1 Basic VM-Exit Information.

Sean's commit message already provides significant motivation for the change in
this patch.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 13/27] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs
  2021-01-30 14:49   ` Jarkko Sakkinen
@ 2021-02-01  1:17     ` Kai Huang
  2021-02-01 21:22       ` Dave Hansen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-01  1:17 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Sat, 30 Jan 2021 16:49:20 +0200 Jarkko Sakkinen wrote:
> On Tue, Jan 26, 2021 at 10:31:05PM +1300, Kai Huang wrote:
> > Add a helper to update SGX_LEPUBKEYHASHn MSRs.  SGX virtualization also
> > needs to update those MSRs based on guest's "virtual" SGX_LEPUBKEYHASHn
> > before EINIT from guest.
> > 
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> 
> 
> Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

Thanks Jarkko.

Hi Dave,

This patch originally had your Acked-by, but since I added a comment, I removed
it. May I still have your Acked-by?

> 
> /Jarkko
> 
> > ---
> > v2->v3:
> > 
> >  - Added comment for sgx_update_lepubkeyhash(), per Jarkko and Dave.
> > 
> > ---
> >  arch/x86/kernel/cpu/sgx/ioctl.c |  5 ++---
> >  arch/x86/kernel/cpu/sgx/main.c  | 15 +++++++++++++++
> >  arch/x86/kernel/cpu/sgx/sgx.h   |  2 ++
> >  3 files changed, 19 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > index e5977752c7be..1bae754268d1 100644
> > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > @@ -495,7 +495,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
> >  			 void *token)
> >  {
> >  	u64 mrsigner[4];
> > -	int i, j, k;
> > +	int i, j;
> >  	void *addr;
> >  	int ret;
> >  
> > @@ -544,8 +544,7 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
> >  
> >  			preempt_disable();
> >  
> > -			for (k = 0; k < 4; k++)
> > -				wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + k, mrsigner[k]);
> > +			sgx_update_lepubkeyhash(mrsigner);
> >  
> >  			ret = __einit(sigstruct, token, addr);
> >  
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index 93d249f7bff3..b456899a9532 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -697,6 +697,21 @@ static bool __init sgx_page_cache_init(void)
> >  	return true;
> >  }
> >  
> > +
> > +/*
> > + * Update the SGX_LEPUBKEYHASH MSRs to the values specified by caller.
> > + * Bare-metal driver requires to update them to hash of enclave's signer
> > + * before EINIT. KVM needs to update them to guest's virtual MSR values
> > + * before doing EINIT from guest.
> > + */
> > +void sgx_update_lepubkeyhash(u64 *lepubkeyhash)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < 4; i++)
> > +		wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, lepubkeyhash[i]);
> > +}
> > +
> >  static int __init sgx_init(void)
> >  {
> >  	int ret;
> > diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> > index 509f2af33e1d..ccd4f145c464 100644
> > --- a/arch/x86/kernel/cpu/sgx/sgx.h
> > +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> > @@ -83,4 +83,6 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
> >  int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
> >  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
> >  
> > +void sgx_update_lepubkeyhash(u64 *lepubkeyhash);
> > +
> >  #endif /* _X86_SGX_H */
> > -- 
> > 2.29.2
> > 
> > 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 07/27] x86/cpu/intel: Allow SGX virtualization without Launch Control support
  2021-01-30 14:42   ` Jarkko Sakkinen
@ 2021-02-01  5:38     ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-01  5:38 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jethro, b.thiel

On Sat, 30 Jan 2021 16:42:56 +0200 Jarkko Sakkinen wrote:
> On Tue, Jan 26, 2021 at 10:30:54PM +1300, Kai Huang wrote:
> > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > 
> > The kernel will currently disable all SGX support if the hardware does
> > not support launch control.  Make it more permissive to allow SGX
> > virtualization on systems without Launch Control support.  This will
> > allow KVM to expose SGX to guests that have less-strict requirements on
> > the availability of flexible launch control.
> > 
> > Improve error message to distinguish between three cases.  There are two
> > cases where SGX support is completely disabled:
> > 1) SGX has been disabled completely by the BIOS
> > 2) SGX LC is locked by the BIOS.  Bare-metal support is disabled because
> >    of LC unavailability.  SGX virtualization is unavailable (because of
> >    Kconfig).
> > One where it is partially available:
> > 3) SGX LC is locked by the BIOS.  Bare-metal support is disabled because
> >    of LC unavailability.  SGX virtualization is supported.
> > 
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > Co-developed-by: Kai Huang <kai.huang@intel.com>
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > ---
> > v2->v3:
> > 
> >  - Added to use 'enable_sgx_any', per Dave.
> >  - Changed to call clear_cpu_cap() directly, rather than using clear_sgx_caps()
> >    and clear_sgx_lc().
> >  - Changed to use CONFIG_X86_SGX_KVM, instead of CONFIG_X86_SGX_VIRTUALIZATION.
> > 
> > v1->v2:
> > 
> >  - Refined commit message per Dave's comments.
> >  - Added check to only enable SGX virtualization when VMX is supported, per
> >    Dave's comment.
> >  - Refined error msg print to explicitly call out SGX virtualization will be
> >    supported when LC is locked by BIOS, per Dave's comment.
> > 
> > ---
> >  arch/x86/kernel/cpu/feat_ctl.c | 58 ++++++++++++++++++++++++++--------
> >  1 file changed, 45 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/feat_ctl.c b/arch/x86/kernel/cpu/feat_ctl.c
> > index 27533a6e04fa..0fc202550fcc 100644
> > --- a/arch/x86/kernel/cpu/feat_ctl.c
> > +++ b/arch/x86/kernel/cpu/feat_ctl.c
> > @@ -105,7 +105,8 @@ early_param("nosgx", nosgx);
> >  void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  {
> >  	bool tboot = tboot_enabled();
> > -	bool enable_sgx;
> > +	bool enable_vmx;
> > +	bool enable_sgx_any, enable_sgx_kvm, enable_sgx_driver;
> 
> Move the declaration first (reverse christmas tree).

Will do. Thanks.

> 
> >  	u64 msr;
> >  
> >  	if (rdmsrl_safe(MSR_IA32_FEAT_CTL, &msr)) {
> > @@ -114,13 +115,22 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  		return;
> >  	}
> >  
> > +	enable_vmx = cpu_has(c, X86_FEATURE_VMX) &&
> > +		     IS_ENABLED(CONFIG_KVM_INTEL);
> > +
> >  	/*
> > -	 * Enable SGX if and only if the kernel supports SGX and Launch Control
> > -	 * is supported, i.e. disable SGX if the LE hash MSRs can't be written.
> > +	 * Enable SGX if and only if the kernel supports SGX.  Require Launch
> > +	 * Control support if SGX virtualization is *not* supported, i.e.
> > +	 * disable SGX if the LE hash MSRs can't be written and SGX can't be
> > +	 * exposed to a KVM guest (which might support non-LC configurations).
> >  	 */
> > -	enable_sgx = cpu_has(c, X86_FEATURE_SGX) &&
> > -		     cpu_has(c, X86_FEATURE_SGX_LC) &&
> > -		     IS_ENABLED(CONFIG_X86_SGX);
> > +	enable_sgx_any = cpu_has(c, X86_FEATURE_SGX) &&
> > +			 cpu_has(c, X86_FEATURE_SGX1) &&
> > +			 IS_ENABLED(CONFIG_X86_SGX);
> > +	enable_sgx_driver = enable_sgx_any &&
> > +			    cpu_has(c, X86_FEATURE_SGX_LC);
> > +	enable_sgx_kvm = enable_sgx_any && enable_vmx &&
> > +			  IS_ENABLED(CONFIG_X86_SGX_KVM);
> >  
> >  	if (msr & FEAT_CTL_LOCKED)
> >  		goto update_caps;
> > @@ -136,15 +146,18 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  	 * i.e. KVM is enabled, to avoid unnecessarily adding an attack vector
> >  	 * for the kernel, e.g. using VMX to hide malicious code.
> >  	 */
> > -	if (cpu_has(c, X86_FEATURE_VMX) && IS_ENABLED(CONFIG_KVM_INTEL)) {
> > +	if (enable_vmx) {
> >  		msr |= FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX;
> >  
> >  		if (tboot)
> >  			msr |= FEAT_CTL_VMX_ENABLED_INSIDE_SMX;
> >  	}
> >  
> > -	if (enable_sgx)
> > -		msr |= FEAT_CTL_SGX_ENABLED | FEAT_CTL_SGX_LC_ENABLED;
> > +	if (enable_sgx_kvm || enable_sgx_driver) {
> > +		msr |= FEAT_CTL_SGX_ENABLED;
> > +		if (enable_sgx_driver)
> > +			msr |= FEAT_CTL_SGX_LC_ENABLED;
> > +	}
> >  
> >  	wrmsrl(MSR_IA32_FEAT_CTL, msr);
> >  
> > @@ -167,10 +180,29 @@ void init_ia32_feat_ctl(struct cpuinfo_x86 *c)
> >  	}
> >  
> >  update_sgx:
> > -	if (!(msr & FEAT_CTL_SGX_ENABLED) ||
> > -	    !(msr & FEAT_CTL_SGX_LC_ENABLED) || !enable_sgx) {
> > -		if (enable_sgx)
> > -			pr_err_once("SGX disabled by BIOS\n");
> > +	if (!(msr & FEAT_CTL_SGX_ENABLED)) {
> > +		if (enable_sgx_kvm || enable_sgx_driver)
> > +			pr_err_once("SGX disabled by BIOS.\n");
> >  		clear_cpu_cap(c, X86_FEATURE_SGX);
> > +		return;
> > +	}
> > +
> > +	/*
> > +	 * VMX feature bit may be cleared due to being disabled in BIOS,
> > +	 * in which case SGX virtualization cannot be supported either.
> > +	 */
> > +	if (!cpu_has(c, X86_FEATURE_VMX) && enable_sgx_kvm) {
> > +		pr_err_once("SGX virtualization disabled due to lack of VMX.\n");
> > +		enable_sgx_kvm = 0;
> > +	}
> > +
> > +	if (!(msr & FEAT_CTL_SGX_LC_ENABLED) && enable_sgx_driver) {
> > +		if (!enable_sgx_kvm) {
> > +			pr_err_once("SGX Launch Control is locked. Disable SGX.\n");
> > +			clear_cpu_cap(c, X86_FEATURE_SGX);
> > +		} else {
> > +			pr_err_once("SGX Launch Control is locked. Support SGX virtualization only.\n");
> > +			clear_cpu_cap(c, X86_FEATURE_SGX_LC);
> > +		}
> >  	}
> >  }
> > -- 
> > 2.29.2
> > 
> > 
> 
> /Jarkko
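
For quick reference, the three cases in the commit message can be condensed
into a small decision function.  This is only a simplified userspace sketch,
not the code in feat_ctl.c: the enum, the function name and the three boolean
inputs are hypothetical, and it assumes the CPU itself supports both SGX and
Launch Control, so only the BIOS settings and KVM availability vary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome names, used only for this illustration. */
enum sgx_mode {
	SGX_OFF,	/* cases 1 and 2: no SGX at all */
	SGX_VIRT_ONLY,	/* case 3: KVM guests only, no bare-metal driver */
	SGX_DRIVER,	/* LC is writable, bare-metal driver can be used */
};

/*
 * bios_sgx:   BIOS left SGX enabled in the feature control MSR
 * bios_lc:    BIOS left the LE hash MSRs writable (LC not locked)
 * kvm_usable: VMX is available and SGX virtualization is configured
 */
static enum sgx_mode sgx_mode_sketch(bool bios_sgx, bool bios_lc,
				     bool kvm_usable)
{
	if (!bios_sgx)
		return SGX_OFF;		/* case 1: disabled by BIOS */

	if (bios_lc)
		return SGX_DRIVER;

	/* LC is locked: only SGX virtualization can still work. */
	return kvm_usable ? SGX_VIRT_ONLY : SGX_OFF;
}

/* Walks the three cases listed in the commit message. */
static void sgx_mode_selftest(void)
{
	assert(sgx_mode_sketch(false, true, true) == SGX_OFF);
	assert(sgx_mode_sketch(true, false, false) == SGX_OFF);
	assert(sgx_mode_sketch(true, false, true) == SGX_VIRT_ONLY);
}
```

The real function additionally has to cope with CPUs that lack the SGX_LC
feature bit and with VMX being cleared late, which the sketch leaves out.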

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-01-30 14:45   ` Jarkko Sakkinen
@ 2021-02-01  5:40     ` Kai Huang
  2021-02-01 15:25       ` Dave Hansen
  2021-02-02 17:32       ` Jarkko Sakkinen
  0 siblings, 2 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-01  5:40 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Sat, 30 Jan 2021 16:45:43 +0200 Jarkko Sakkinen wrote:
> On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > might be disabled if SGX Launch Control is in locked mode, or not
> > supported in the hardware at all.  This allows (non-Linux) guests that
> > support non-LC configurations to use SGX.
> > 
> > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > ---
> > v2->v3:
> > 
> >  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> > 
> > ---
> >  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > index 21c2ffa13870..93d249f7bff3 100644
> > --- a/arch/x86/kernel/cpu/sgx/main.c
> > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > @@ -12,6 +12,7 @@
> >  #include "driver.h"
> >  #include "encl.h"
> >  #include "encls.h"
> > +#include "virt.h"
> >  
> >  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> >  static int sgx_nr_epc_sections;
> > @@ -712,7 +713,8 @@ static int __init sgx_init(void)
> >  		goto err_page_cache;
> >  	}
> >  
> > -	ret = sgx_drv_init();
> > +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> 
> I would create more dumb code and just add
> 
> ret = sgx_vepc_init()
> if (ret)
>         goto err_kthread;

Do you mean you want the below?

	ret = sgx_drv_init();
	ret = sgx_vepc_init();
	if (ret)
		goto err_kthread;

This was Sean's original code, but Dave didn't like it.

Sean/Dave,

Please let me know which way you prefer.

> 
> >  	if (ret)
> >  		goto err_kthread;
> >  
> > -- 
> > 2.29.2
> > 
> 
> /Jarkko
> > 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-01  5:40     ` Kai Huang
@ 2021-02-01 15:25       ` Dave Hansen
  2021-02-01 17:23         ` Sean Christopherson
  2021-02-02 23:07         ` Jarkko Sakkinen
  2021-02-02 17:32       ` Jarkko Sakkinen
  1 sibling, 2 replies; 156+ messages in thread
From: Dave Hansen @ 2021-02-01 15:25 UTC (permalink / raw)
  To: Kai Huang, Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa

On 1/31/21 9:40 PM, Kai Huang wrote:
>>> -	ret = sgx_drv_init();
>>> +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
>>> +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
>> I would create more dumb code and just add
>>
>> ret = sgx_vepc_init()
>> if (ret)
>>         goto err_kthread;

Jarkko, I'm not sure I understand this suggestion.

> Do you mean you want below?
> 
> 	ret = sgx_drv_init();
> 	ret = sgx_vepc_init();
> 	if (ret)
> 		goto err_kthread;
> 
> This was Sean's original code, but Dave didn't like it.

Are you sure?  I remember the !!&!! abomination being Sean's doing. :)

> Sean/Dave,
> 
> Please let me know which way you prefer.

Kai, I don't really know what you are saying here.  In the end,
sgx_vepc_init() has to run regardless of whether sgx_drv_init() is
successful or not.  Also, we only want to 'goto err_kthread' if *BOTH*
fail.  The code you have above will, for instance, 'goto err_kthread' if
sgx_drv_init() succeeds but sgx_vepc_init() fails.  It entirely
disregards the sgx_drv_init() error code.


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-01-30 15:00   ` Jarkko Sakkinen
  2021-02-01  0:32     ` Kai Huang
@ 2021-02-01 17:12     ` Sean Christopherson
  2021-02-02 22:38       ` Jarkko Sakkinen
  1 sibling, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-01 17:12 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Kai Huang, linux-sgx, kvm, x86, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Sat, Jan 30, 2021, Jarkko Sakkinen wrote:
> On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > 
> > Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> > full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> > bits 15:0, and single-bit modifiers in bits 31:16.
> > 
> > Historically, KVM has only had to worry about handling the "failed
> > VM-Entry" modifier, which could only be set in very specific flows and
> > required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> > bit was a somewhat viable approach.  But even with only a single bit to
> > worry about, KVM has had several bugs related to comparing a basic exit
> > reason against the full exit reason stored in vcpu_vmx.
> > 
> > Upcoming Intel features, e.g. SGX, will add new modifier bits that can
> > be set on more or less any VM-Exit, as opposed to the significantly more
> > restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> > flows isn't scalable.  Tracking exit reason in a union forces code to
> > explicitly choose between consuming the full exit reason and the basic
> > exit, and is a convenient way to document and access the modifiers.
> 
> I *believe* that the change is correct but I dropped in the last paragraph
> - most likely only because of lack of expertise in this area.
> 
> I ask the most basic question: why SGX will add new modifier bits?

Register state is loaded with synthetic state and/or trampoline state on VM-Exit
from enclaves.  For all intents and purposes, emulation and other VMM/hypervisor
behavior that accesses vCPU state is impossible.  E.g. on a #UD VM-Exit, RIP
will point at the AEP; emulating would emulate some random instruction in the
untrusted runtime, not the instruction that faulted.

Hardware sets the "exit from enclave" modifier flag so that the VMM can try and
do something moderately sane, e.g. inject a #UD into the guest instead of
attempting to emulate random instructions.
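
As a rough illustration of why a union helps here, below is a userspace
sketch, not the code from the patch.  The type and helper names are
hypothetical, only the fields mentioned in this thread are spelled out, and
the layout assumes the compiler allocates bit-fields from bit 0 upwards, as
GCC and Clang do on x86.

```c
#include <assert.h>
#include <stdint.h>

union exit_reason_sketch {
	struct {
		uint32_t basic          : 16; /* bits 15:0, basic exit reason */
		uint32_t reserved_26_16 : 11; /* bits 26:16, not modeled here */
		uint32_t enclave_mode   : 1;  /* bit 27, exit from enclave */
		uint32_t reserved_30_28 : 3;  /* bits 30:28, not modeled here */
		uint32_t failed_vmentry : 1;  /* bit 31, failed VM-Entry */
	};
	uint32_t full;
};

static_assert(sizeof(union exit_reason_sketch) == sizeof(uint32_t),
	      "exit reason must stay 32 bits wide");

/*
 * Returns nonzero when the exit came from an enclave, i.e. when register
 * state is synthetic and emulating at RIP would hit the wrong instruction.
 */
static int must_not_emulate(uint32_t raw)
{
	union exit_reason_sketch r = { .full = raw };

	return r.enclave_mode;
}
```

With this shape, a comparison against a basic exit reason has to go through
.basic, so a set modifier bit can no longer make it silently fail the way a
comparison against the raw 32-bit value would.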

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-01 15:25       ` Dave Hansen
@ 2021-02-01 17:23         ` Sean Christopherson
  2021-02-02  0:12           ` Kai Huang
  2021-02-02 23:07         ` Jarkko Sakkinen
  1 sibling, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-01 17:23 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kai Huang, Jarkko Sakkinen, linux-sgx, kvm, x86, luto,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Mon, Feb 01, 2021, Dave Hansen wrote:
> On 1/31/21 9:40 PM, Kai Huang wrote:
> >>> -	ret = sgx_drv_init();
> >>> +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> >>> +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> >> I would create more dumb code and just add
> >>
> >> ret = sgx_vepc_init()
> >> if (ret)
> >>         goto err_kthread;
> 
> Jarkko, I'm not sure I understand this suggestion.
> 
> > Do you mean you want below?
> > 
> > 	ret = sgx_drv_init();
> > 	ret = sgx_vepc_init();
> > 	if (ret)
> > 		goto err_kthread;
> > 
> > This was Sean's original code, but Dave didn't like it.

The problem is it's wrong.  That snippet would incorrectly bail if drv_init()
succeeds but vepc_init() fails.

The alternative to the bitwise AND is to snapshot the result in two separate
variables:

	ret = sgx_drv_init();
	ret2 = sgx_vepc_init();
	if (ret && ret2)
		goto err_kthread;

or check the return from drv_init() _after_ vepc_init():

	ret = sgx_drv_init();
	if (sgx_vepc_init() && ret)
		goto err_kthread;


As evidenced by this thread, the behavior is subtle and easy to get wrong.  I
deliberately chose the option that was the weirdest specifically to reduce the
probability of someone incorrectly "cleaning up" the code.
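
To make the subtlety concrete, here is a small userspace sketch.  The fake_*
functions and counters are hypothetical stand-ins for sgx_drv_init() and
sgx_vepc_init(), assumed to return 0 on success and a negative errno on
failure; none of this is the kernel code itself.

```c
#include <assert.h>
#include <errno.h>

static int drv_calls, vepc_calls;

/* Stand-ins that count calls and return whatever they are told to. */
static int fake_drv_init(int ret)  { drv_calls++;  return ret; }
static int fake_vepc_init(int ret) { vepc_calls++; return ret; }

/*
 * Form used in the patch: each result is normalized to 0 or 1, and the
 * bitwise AND evaluates both operands, so both init functions always run.
 * The result is nonzero only if both failed.
 */
static int init_bitwise(int drv_ret, int vepc_ret)
{
	return !!fake_drv_init(drv_ret) & !!fake_vepc_init(vepc_ret);
}

/*
 * Tempting "cleanup" with logical AND: it short-circuits, so the vEPC
 * init is skipped whenever the driver init succeeds.
 */
static int init_logical(int drv_ret, int vepc_ret)
{
	return fake_drv_init(drv_ret) && fake_vepc_init(vepc_ret);
}

static void init_selftest(void)
{
	assert(init_bitwise(0, -ENODEV) == 0);		/* one is enough */
	assert(init_bitwise(-ENODEV, 0) == 0);
	assert(init_bitwise(-ENODEV, -ENODEV) != 0);	/* both failed */
}
```

The logical-AND variant returns the same value in all four cases, which is
exactly why the difference is easy to miss: only the side effect (vEPC never
being initialized) changes.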

> Are you sure?  I remember the !!&!! abomination being Sean's doing. :)

Yep!  That 100% functionally correct horror is my doing.

> > Sean/Dave,
> > 
> > Please let me know which way you prefer.
> 
> Kai, I don't really know what you are saying here.  In the end,
> sgx_vepc_init() has to run regardless of whether sgx_drv_init() is
> successful or not.  Also, we only want to 'goto err_kthread' if *BOTH*
> fail.  The code you have above will, for instance, 'goto err_kthread' if
> sgx_drv_init() succeeds but sgx_vepc_init() fails.  It entirely
> disregards the sgx_drv_init() error code.


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 13/27] x86/sgx: Add helper to update SGX_LEPUBKEYHASHn MSRs
  2021-02-01  1:17     ` Kai Huang
@ 2021-02-01 21:22       ` Dave Hansen
  0 siblings, 0 replies; 156+ messages in thread
From: Dave Hansen @ 2021-02-01 21:22 UTC (permalink / raw)
  To: Kai Huang, Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa

On 1/31/21 5:17 PM, Kai Huang wrote:
> On Sat, 30 Jan 2021 16:49:20 +0200 Jarkko Sakkinen wrote:
>> On Tue, Jan 26, 2021 at 10:31:05PM +1300, Kai Huang wrote:
>>> Add a helper to update SGX_LEPUBKEYHASHn MSRs.  SGX virtualization also
>>> needs to update those MSRs based on guest's "virtual" SGX_LEPUBKEYHASHn
>>> before EINIT from guest.
>>>
>>> Signed-off-by: Kai Huang <kai.huang@intel.com>
>>
>> Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
> Thanks Jarkko.
> 
> Hi Dave,
> 
> This patch originally had your Acked-by, but since I added a comment, I removed
> it. May I still have your Acked-by?

Yes, feel free to restore it.  This looks fine.


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-01 17:23         ` Sean Christopherson
@ 2021-02-02  0:12           ` Kai Huang
  2021-02-02 23:10             ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-02  0:12 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Dave Hansen, Jarkko Sakkinen, linux-sgx, kvm, x86, luto,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Mon, 1 Feb 2021 09:23:18 -0800 Sean Christopherson wrote:
> On Mon, Feb 01, 2021, Dave Hansen wrote:
> > On 1/31/21 9:40 PM, Kai Huang wrote:
> > >>> -	ret = sgx_drv_init();
> > >>> +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > >>> +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > >> I would create more dumb code and just add
> > >>
> > >> ret = sgx_vepc_init()
> > >> if (ret)
> > >>         goto err_kthread;
> > 
> > Jarkko, I'm not sure I understand this suggestion.
> > 
> > > Do you mean you want below?
> > > 
> > > 	ret = sgx_drv_init();
> > > 	ret = sgx_vepc_init();
> > > 	if (ret)
> > > 		goto err_kthread;
> > > 
> > > This was Sean's original code, but Dave didn't like it.
> 
> The problem is it's wrong.  That snippet would incorrectly bail if drv_init()
> succeeds but vepc_init() fails.
> 
> The alternative to the bitwise AND is to snapshot the result in two separate
> variables:
> 
> 	ret = sgx_drv_init();
> 	ret2 = sgx_vepc_init();
> 	if (ret && ret2)
> 		goto err_kthread;
> 
> or check the return from drv_init() _after_ vepc_init():
> 
> 	ret = sgx_drv_init();
> 	if (sgx_vepc_init() && ret)
> 		goto err_kthread;
> 
> 
> As evidenced by this thread, the behavior is subtle and easy to get wrong.  I
> deliberately chose the option that was the weirdest specifically to reduce the
> probability of someone incorrectly "cleaning up" the code.
> 
> > Are you sure?  I remember the !!&!! abomination being Sean's doing. :)
> 
> Yep!  That 100% functionally correct horror is my doing.
> 
> > > Sean/Dave,
> > > 
> > > Please let me know which way you prefer.
> > 
> > Kai, I don't really know what you are saying here.  In the end,
> > sgx_vepc_init() has to run regardless of whether sgx_drv_init() is
> > successful or not.  Also, we only want to 'goto err_kthread' if *BOTH*
> > fail.  The code you have above will, for instance, 'goto err_kthread' if
> > sgx_drv_init() succeeds but sgx_vepc_init() fails.  It entirely
> > disregards the sgx_drv_init() error code.
> 

Hi Dave, Sean,

Yeah, sorry, my bad. The example I provided won't work. So I'd like to keep the
!!&!! :)

Thanks.

Jarkko, please let us know if you still have concerns.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-02-01  0:01         ` Kai Huang
@ 2021-02-02 17:17           ` Jarkko Sakkinen
  2021-02-03  1:09             ` Kai Huang
  2021-02-02 17:56           ` Paolo Bonzini
  1 sibling, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 17:17 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Mon, Feb 01, 2021 at 01:01:51PM +1300, Kai Huang wrote:
> On Sat, 30 Jan 2021 15:20:54 +0200 Jarkko Sakkinen wrote:
> > On Wed, Jan 27, 2021 at 12:18:32PM +1300, Kai Huang wrote:
> > > On Tue, 2021-01-26 at 07:34 -0800, Dave Hansen wrote:
> > > > On 1/26/21 1:30 AM, Kai Huang wrote:
> > > > > From: Sean Christopherson <seanjc@google.com>
> > > > > 
> > > > > Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
> > > > > features, since adding a new leaf for only two bits would be wasteful.
> > > > > As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
> > > > > guest, and to do so correctly needs to query hardware and kernel support
> > > > > for SGX1 and SGX2.
> > > > 
> > > > It's also not _just_ exposing the CPUID leaves.  There are some checks
> > > > here when KVM is emulating some SGX instructions too, right?
> > > 
> > > > I would say trapping instead of emulating, but yes KVM will do more. However those
> > > > are fine details, and I don't think we should put lots of details here. Or perhaps
> > > we can use 'for instance' as brief description:
> > > 
> > > As part of virtualizing SGX, KVM will need to use the two flags, for instance, to
> > > expose them to guest.
> > > 
> > > ?
> > > 
> > > > 
> > > > > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > > > > index 84b887825f12..18b2d0c8bbbe 100644
> > > > > --- a/arch/x86/include/asm/cpufeatures.h
> > > > > +++ b/arch/x86/include/asm/cpufeatures.h
> > > > > @@ -292,6 +292,8 @@
> > > > >  #define X86_FEATURE_FENCE_SWAPGS_KERNEL	(11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
> > > > >  #define X86_FEATURE_SPLIT_LOCK_DETECT	(11*32+ 6) /* #AC for split lock */
> > > > >  #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
> > > > > +#define X86_FEATURE_SGX1		(11*32+ 8) /* Software Guard Extensions sub-feature SGX1 */
> > > > > +#define X86_FEATURE_SGX2        	(11*32+ 9) /* Software Guard Extensions sub-feature SGX2 */
> > > > 
> > > > FWIW, I'm not sure how valuable it is to spell the SGX acronym out three
> > > > times.  Can't we use those bytes to put something more useful in that
> > > > comment?
> > > 
> > > I think we can remove comment for SGX1, since it is basically SGX.
> > > 
> > > For SGX2, how about below?
> > > 
> > > /* SGX Enclave Dynamic Memory Management */
> > 
> > (EDMM)
> 
> Is EDMM obvious to everyone, compared to explicitly saying Enclave Dynamic
> Memory Management?
> 
> Also do you think we need a comment for SGX1 bit? I can add /* Basic SGX */,
> but I am not sure whether it is required.

I would write the whole thing out and put EDMM in parentheses.

For SGX1 I would put "Basic SGX features for enclave construction".

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-02-01  0:17     ` Kai Huang
@ 2021-02-02 17:20       ` Jarkko Sakkinen
  2021-02-02 20:35         ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 17:20 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Mon, Feb 01, 2021 at 01:17:44PM +1300, Kai Huang wrote:
> On Sat, 30 Jan 2021 16:51:41 +0200 Jarkko Sakkinen wrote:
> > On Tue, Jan 26, 2021 at 10:31:06PM +1300, Kai Huang wrote:
> > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > 
> > > The bare-metal kernel must intercept ECREATE to be able to impose policies
> > > on guests.  When it does this, the bare-metal kernel runs ECREATE against
> > > the userspace mapping of the virtualized EPC.
> > 
> > I guess Andy's earlier comment applies here, i.e. SGX driver?
> 
> Sure.
> 
> [...]
> 
> > > +	}
> > > +
> > > +	if (encls_faulted(ret)) {
> > > +		*trapnr = ENCLS_TRAPNR(ret);

An empty line is needed here as well.

> > > +		return -EFAULT;
> > > +	}
> > 
> > Empty line here before return. Applies also to sgx_virt_ecreate().
> 
> Yes I can remove, but I am just curious: isn't "having an empty line before return"
> a good coding style? Do you have a reference to the guideline?

In the initial SGX patch set, this was the review feedback that I got
from Boris, so I would presume it is tip tree convention. Also, looking
at a random selection of files under arch/x86, it is commonly done this
way.

> 
> > 
> > > +	return ret;
> > > +}
> > > +EXPORT_SYMBOL_GPL(sgx_virt_einit);
> > > -- 
> > > 2.29.2
> > 
> > Great work. I think this patch set is shaping up.
> > 
> > /Jarkko
> > > 
> > > 
> 

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-02-01  0:32     ` Kai Huang
@ 2021-02-02 17:24       ` Jarkko Sakkinen
  2021-02-02 19:23         ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 17:24 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Mon, Feb 01, 2021 at 01:32:59PM +1300, Kai Huang wrote:
> On Sat, 30 Jan 2021 17:00:46 +0200 Jarkko Sakkinen wrote:
> > On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > 
> > > Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> > > full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> > > bits 15:0, and single-bit modifiers in bits 31:16.
> > > 
> > > Historically, KVM has only had to worry about handling the "failed
> > > VM-Entry" modifier, which could only be set in very specific flows and
> > > required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> > > bit was a somewhat viable approach.  But even with only a single bit to
> > > worry about, KVM has had several bugs related to comparing a basic exit
> > > reason against the full exit reason stored in vcpu_vmx.
> > > 
> > > Upcoming Intel features, e.g. SGX, will add new modifier bits that can

BTW, SGX is not an upcoming CPU feature.

Also, speaking broadly of upcoming features is not the right thing to do.
Better to just scope this down to SGX. Theoretically, upcoming CPU features
can do pretty much anything. This change is first and foremost done
for SGX.

> > > be set on more or less any VM-Exit, as opposed to the significantly more
> > > restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> > > flows isn't scalable.  Tracking exit reason in a union forces code to
> > > explicitly choose between consuming the full exit reason and the basic
> > > exit, and is a convenient way to document and access the modifiers.
> > 
> > I *believe* that the change is correct but I dropped in the last paragraph
> > - most likely only because of lack of expertise in this area.
> > 
> > I ask the most basic question: why SGX will add new modifier bits?
> 
> Not 100% sure about your question. Assuming you are asking SGX hardware
> behavior, SGX architecture adds a new modifier bit (27) to Exit Reason, similar
> to new #PF.SGX bit. 
> 
> Please refer to SDM Volume 3, Chapter 27.2.1 Basic VM-Exit Information.
> 
> Sean's commit msg already provides significant motivation of the change in this
> patch.

Just describe why SGX requires this. That's all.

/Jarkko
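
For reference, the layout the commit message describes can be sketched as a
standalone C union. This is only an illustration, not the patch's code: the
type and field names are made up, only the bits mentioned in this thread
(basic reason in 15:0, bit 27 for SGX) plus the failed VM-Entry bit 31 are
modeled, and the bit-field layout assumes GCC/Clang on little-endian x86.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: names are assumptions, and the layout relies on the
 * compiler allocating bit-fields LSB-first (GCC/Clang on x86).
 */
union toy_exit_reason {
	struct {
		uint32_t basic          : 16; /* bits 15:0, basic exit reason */
		uint32_t reserved_26_16 : 11; /* bits 26:16, not modeled here */
		uint32_t enclave_mode   : 1;  /* bit 27, exit from SGX enclave */
		uint32_t reserved_30_28 : 3;  /* bits 30:28, not modeled here */
		uint32_t failed_vmentry : 1;  /* bit 31, failed VM-Entry */
	};
	uint32_t full;
};

/* The union is meant to be a drop-in replacement for the old u32. */
static_assert(sizeof(union toy_exit_reason) == sizeof(uint32_t),
	      "union must stay the size of a u32");
```

With a type like this, code that wants the basic reason has to spell out
.basic, and code that wants everything has to spell out .full, so comparing a
basic exit reason against the full value is hard to do by accident.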

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-01  5:40     ` Kai Huang
  2021-02-01 15:25       ` Dave Hansen
@ 2021-02-02 17:32       ` Jarkko Sakkinen
  2021-02-02 18:20         ` Sean Christopherson
  2021-02-02 18:49         ` Kai Huang
  1 sibling, 2 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 17:32 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Mon, Feb 01, 2021 at 06:40:40PM +1300, Kai Huang wrote:
> On Sat, 30 Jan 2021 16:45:43 +0200 Jarkko Sakkinen wrote:
> > On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> > > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > > might be disabled if SGX Launch Control is in locked mode, or not
> > > supported in the hardware at all.  This allows (non-Linux) guests that
> > > support non-LC configurations to use SGX.
> > > 
> > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > ---
> > > v2->v3:
> > > 
> > >  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> > > 
> > > ---
> > >  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > index 21c2ffa13870..93d249f7bff3 100644
> > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > @@ -12,6 +12,7 @@
> > >  #include "driver.h"
> > >  #include "encl.h"
> > >  #include "encls.h"
> > > +#include "virt.h"
> > >  
> > >  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> > >  static int sgx_nr_epc_sections;
> > > @@ -712,7 +713,8 @@ static int __init sgx_init(void)
> > >  		goto err_page_cache;
> > >  	}
> > >  
> > > -	ret = sgx_drv_init();
> > > +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > > +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > 
> > I would create more dumb code and just add
> > 
> > ret = sgx_vepc_init()
> > if (ret)
> >         goto err_kthread;
> 
> Do you mean you want below?
> 
> 	ret = sgx_drv_init();
> 	ret = sgx_vepc_init();
> 	if (ret)
> 		goto err_kthread;
> 
> This was Sean's original code, but Dave didn't like it.

I think it should be like:

ret = sgx_drv_init();
if (ret)
        pr_warn("Driver initialization failed with %d\n", ret);

ret = sgx_vepc_init();
if (ret)
	goto err_kthread;

There is a problem here anyhow, i.e. the -ENODEV's from sgx_drv_init().  I
think driver.c should be changed to just return 0 in the places where it now
returns -ENODEV. Consider "not initialized" as a successful
initialization.

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-02-01  0:01         ` Kai Huang
  2021-02-02 17:17           ` Jarkko Sakkinen
@ 2021-02-02 17:56           ` Paolo Bonzini
  2021-02-02 18:00             ` Dave Hansen
  1 sibling, 1 reply; 156+ messages in thread
From: Paolo Bonzini @ 2021-02-02 17:56 UTC (permalink / raw)
  To: Kai Huang, Jarkko Sakkinen
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, luto, haitao.huang, bp,
	tglx, mingo, hpa

On 01/02/21 01:01, Kai Huang wrote:
>>> I think we can remove comment for SGX1, since it is basically SGX.
>>>
>>> For SGX2, how about below?
>>>
>>> /* SGX Enclave Dynamic Memory Management */
>> (EDMM)
> Is EDMM obvious to everyone, instead of explicitly saying Enclave Dynamic
> Memory Management?
> 
> Also do you think we need a comment for SGX1 bit? I can add /* Basic SGX */,
> but I am not sure whether it is required.
> 

Yes, please use

/* "" Basic SGX */
/* "" SGX Enclave Dynamic Memory Mgmt */

Paolo


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-02-02 17:56           ` Paolo Bonzini
@ 2021-02-02 18:00             ` Dave Hansen
  2021-02-02 18:03               ` Paolo Bonzini
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-02-02 18:00 UTC (permalink / raw)
  To: Paolo Bonzini, Kai Huang, Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, haitao.huang, bp, tglx, mingo, hpa

On 2/2/21 9:56 AM, Paolo Bonzini wrote:
> On 01/02/21 01:01, Kai Huang wrote:
>>>> I think we can remove comment for SGX1, since it is basically SGX.
>>>>
>>>> For SGX2, how about below?
>>>>
>>>> /* SGX Enclave Dynamic Memory Management */
>>> (EDMM)
>> Is EDMM obvious to everyone, instead of explicitly saying Enclave
>> Dynamic
>> Memory Management?
>>
>> Also do you think we need a comment for SGX1 bit? I can add /* Basic
>> SGX */,
>> but I am not sure whether it is required.
> 
> Yes, please use
> 
> /* "" Basic SGX */
> /* "" SGX Enclave Dynamic Memory Mgmt */

Do you actually want to suppress these from /proc/cpuinfo with the ""?

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
  2021-01-27  1:25     ` Kai Huang
@ 2021-02-02 18:00       ` Paolo Bonzini
  2021-02-02 19:25         ` Kai Huang
  2021-02-02 19:02       ` Dave Hansen
  1 sibling, 1 reply; 156+ messages in thread
From: Paolo Bonzini @ 2021-02-02 18:00 UTC (permalink / raw)
  To: Kai Huang, Dave Hansen
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang, bp,
	tglx, mingo, hpa

On 27/01/21 02:25, Kai Huang wrote:
> On Tue, 26 Jan 2021 08:04:35 -0800 Dave Hansen wrote:
>> On 1/26/21 1:30 AM, Kai Huang wrote:
>>> From: Jarkko Sakkinen <jarkko@kernel.org>
>>>
>>> Encapsulate the snippet in sgx_free_epc_page() concerning EREMOVE to
>>> sgx_reset_epc_page(), which is a static helper function for
>>> sgx_encl_release().  It's the only function existing, which deals with
>>> initialized pages.
>>
>> Yikes.  I have no idea what that is saying.  Here's a rewrite:
>>
>> EREMOVE takes a page and removes any association between that page and
>> an enclave.  It must be run on a page before it can be added into
>> another enclave.  Currently, EREMOVE is run as part of pages being freed
>> into the SGX page allocator.  It is not expected to fail.
>>
>> KVM does not track how guest pages are used, which means that SGX
>> virtualization use of EREMOVE might fail.
>>
>> Break out the EREMOVE call from the SGX page allocator.  This will allow
>> the SGX virtualization code to use the allocator directly.  (SGX/KVM
>> will also introduce a more permissive EREMOVE helper).
> 
> Thanks.
> 
> Hi Jarkko,
> 
> Do you want me to update your patch directly, or do you want to take the
> change, and send me the patch again?

I think you should treat all these 27 patches as yours now (including 
removing them, reordering them, adjusting commit message etc.).


>> OK, so if you're going to say "the caller must put the page in
>> uninitialized state", let's also add a comment to the place that *DO*
>> that, like the shiny new sgx_reset_epc_page().
> 
> Hi Dave,
> 
> Sorry I am a little bit confused here. Do you mean we should add a comment in
> sgx_reset_epc_page() to say, for instance: sgx_free_epc_page() requires the EPC
> page already been EREMOVE'd?

I also don't understand Dave's comment.  I would say

It's the caller's responsibility to make sure that the page is in 
uninitialized state with EREMOVE (sgx_reset_epc_page), EWB etc. before 
calling this function.

Paolo


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-02-02 18:00             ` Dave Hansen
@ 2021-02-02 18:03               ` Paolo Bonzini
  2021-02-02 18:42                 ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Paolo Bonzini @ 2021-02-02 18:03 UTC (permalink / raw)
  To: Dave Hansen, Kai Huang, Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, haitao.huang, bp, tglx, mingo, hpa

On 02/02/21 19:00, Dave Hansen wrote:
>> /* "" Basic SGX */
>> /* "" SGX Enclave Dynamic Memory Mgmt */
> Do you actually want to suppress these from /proc/cpuinfo with the ""?
> 

sgx1 yes.  However sgx2 can be useful to have there, I guess.

Paolo


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-02 17:32       ` Jarkko Sakkinen
@ 2021-02-02 18:20         ` Sean Christopherson
  2021-02-02 23:16           ` Jarkko Sakkinen
  2021-02-02 18:49         ` Kai Huang
  1 sibling, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-02 18:20 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Kai Huang, linux-sgx, kvm, x86, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Feb 02, 2021, Jarkko Sakkinen wrote:
> On Mon, Feb 01, 2021 at 06:40:40PM +1300, Kai Huang wrote:
> > On Sat, 30 Jan 2021 16:45:43 +0200 Jarkko Sakkinen wrote:
> > > On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> > > > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > > > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > > > might be disabled if SGX Launch Control is in locked mode, or not
> > > > supported in the hardware at all.  This allows (non-Linux) guests that
> > > > support non-LC configurations to use SGX.
> > > > 
> > > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > > ---
> > > > v2->v3:
> > > > 
> > > >  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> > > > 
> > > > ---
> > > >  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > > index 21c2ffa13870..93d249f7bff3 100644
> > > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > > @@ -12,6 +12,7 @@
> > > >  #include "driver.h"
> > > >  #include "encl.h"
> > > >  #include "encls.h"
> > > > +#include "virt.h"
> > > >  
> > > >  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> > > >  static int sgx_nr_epc_sections;
> > > > @@ -712,7 +713,8 @@ static int __init sgx_init(void)
> > > >  		goto err_page_cache;
> > > >  	}
> > > >  
> > > > -	ret = sgx_drv_init();
> > > > +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > > > +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > > 
> > > > I would create more dumb code and just add
> > > 
> > > ret = sgx_vepc_init()
> > > if (ret)
> > >         goto err_kthread;
> > 
> > Do you mean you want below?
> > 
> > 	ret = sgx_drv_init();
> > 	ret = sgx_vepc_init();
> > 	if (ret)
> > 		goto err_kthread;
> > 
> > This was Sean's original code, but Dave didn't like it.
> 
> I think it should be like:
> 
> ret = sgx_drv_init();
> if (ret)
>         pr_warn("Driver initialization failed with %d\n", ret);
> 
> ret = sgx_vepc_init();
> if (ret)
> 	goto err_kthread;

And that's wrong, it doesn't correctly handle the case where sgx_drv_init()
succeeds but sgx_vepc_init() fails.

> There is a problem here anyhow, i.e. the -ENODEV's from sgx_drv_init().  I
> think driver.c should be changed to just return 0 in the places where it now
> returns -ENODEV. Consider "not initialized" as a successful
> initialization.



^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-02-02 18:03               ` Paolo Bonzini
@ 2021-02-02 18:42                 ` Sean Christopherson
  2021-02-03  1:05                   ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-02 18:42 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Dave Hansen, Kai Huang, Jarkko Sakkinen, linux-sgx, kvm, x86,
	luto, haitao.huang, bp, tglx, mingo, hpa

On Tue, Feb 02, 2021, Paolo Bonzini wrote:
> On 02/02/21 19:00, Dave Hansen wrote:
> > > /* "" Basic SGX */
> > > /* "" SGX Enclave Dynamic Memory Mgmt */
> > Do you actually want to suppress these from /proc/cpuinfo with the ""?
> > 
> 
> sgx1 yes.  However sgx2 can be useful to have there, I guess.

Agreed, /proc/cpuinfo's sgx1 will always be in lockstep with sgx, so it won't
be useful for dealing with the fallout of hardware disabling SGX due to software
disabling a machine check bank via WRMSR(MCi_CTL).  I can't think of any other
use case for checking /proc/cpuinfo's sgx1.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-02 17:32       ` Jarkko Sakkinen
  2021-02-02 18:20         ` Sean Christopherson
@ 2021-02-02 18:49         ` Kai Huang
  2021-02-02 23:17           ` Jarkko Sakkinen
  1 sibling, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-02 18:49 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 2 Feb 2021 19:32:30 +0200 Jarkko Sakkinen wrote:
> On Mon, Feb 01, 2021 at 06:40:40PM +1300, Kai Huang wrote:
> > On Sat, 30 Jan 2021 16:45:43 +0200 Jarkko Sakkinen wrote:
> > > On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> > > > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > > > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > > > might be disabled if SGX Launch Control is in locked mode, or not
> > > > supported in the hardware at all.  This allows (non-Linux) guests that
> > > > support non-LC configurations to use SGX.
> > > > 
> > > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > > ---
> > > > v2->v3:
> > > > 
> > > >  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> > > > 
> > > > ---
> > > >  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > > index 21c2ffa13870..93d249f7bff3 100644
> > > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > > @@ -12,6 +12,7 @@
> > > >  #include "driver.h"
> > > >  #include "encl.h"
> > > >  #include "encls.h"
> > > > +#include "virt.h"
> > > >  
> > > >  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> > > >  static int sgx_nr_epc_sections;
> > > > @@ -712,7 +713,8 @@ static int __init sgx_init(void)
> > > >  		goto err_page_cache;
> > > >  	}
> > > >  
> > > > -	ret = sgx_drv_init();
> > > > +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > > > +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > > 
> > > > I would create more dumb code and just add
> > > 
> > > ret = sgx_vepc_init()
> > > if (ret)
> > >         goto err_kthread;
> > 
> > Do you mean you want below?
> > 
> > 	ret = sgx_drv_init();
> > 	ret = sgx_vepc_init();
> > 	if (ret)
> > 		goto err_kthread;
> > 
> > This was Sean's original code, but Dave didn't like it.
> 
> I think it should be like:
> 
> ret = sgx_drv_init();
> if (ret)
>         pr_warn("Driver initialization failed with %d\n", ret);
> 
> ret = sgx_vepc_init();
> if (ret)
> 	goto err_kthread;
> 
> There is a problem here anyhow, i.e. the -ENODEV's from sgx_drv_init().  I
> think driver.c should be changed to just return 0 in the places where it now
> returns -ENODEV. Consider "not initialized" as a successful
> initialization.

Hi Jarkko,

Dave already pointed out that the above code won't work. The problem is that a
failure to initialize vepc will just goto err_kthread, no matter whether the
driver has been initialized successfully or not.

I am sticking to the original way (!! & !!).

> 
> /Jarkko
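
To make the difference between the two proposals concrete, here is a small
standalone sketch. It is not the kernel code: the helper names are
hypothetical, and the two init results are passed in as plain integers rather
than coming from real sgx_drv_init()/sgx_vepc_init() calls.

```c
#include <assert.h>
#include <errno.h>

/* The (!! & !!) form: nonzero (failure) only when BOTH inits failed. */
static int toy_either_ok(int drv_ret, int vepc_ret)
{
	return !!drv_ret & !!vepc_ret;
}

/* The sequential form: the vepc result alone decides the outcome. */
static int toy_vepc_decides(int drv_ret, int vepc_ret)
{
	(void)drv_ret; /* only logged in that proposal, never acted on */
	return vepc_ret;
}

/* One successful init is enough for the combined form to report success. */
static_assert((!!(-ENODEV) & !!0) == 0, "a single success must yield 0");
```

Note that the combined form yields 0 or 1 rather than an errno value, and that
the bitwise & (unlike &&) evaluates both operands, so in the real expression
both init functions are always called.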

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
  2021-01-27  1:25     ` Kai Huang
  2021-02-02 18:00       ` Paolo Bonzini
@ 2021-02-02 19:02       ` Dave Hansen
  1 sibling, 0 replies; 156+ messages in thread
From: Dave Hansen @ 2021-02-02 19:02 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, jarkko, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On 1/26/21 5:25 PM, Kai Huang wrote:
>>
>>> + * responsibility to make sure that the page is in uninitialized state In other
>> Period after "state", please.
>>
>>> + * words, do EREMOVE, EWB or whatever operation is necessary before calling
>>> + * this function.
>>>   */
>> OK, so if you're going to say "the caller must put the page in
>> uninitialized state", let's also add a comment to the place that *DO*
>> that, like the shiny new sgx_reset_epc_page().
> Hi Dave,
> 
> Sorry I am a little bit confused here. Do you mean we should add a comment in
> sgx_reset_epc_page() to say, for instance: sgx_free_epc_page() requires the EPC
> page already been EREMOVE'd?

Yes.  You need to place a comment in sgx_reset_epc_page() which says
something like:

	/* Place the page in uninitialized state: */
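
The contract being discussed (reset first, then free) can be modeled in a few
lines of plain C. This is a toy model under stated assumptions, not the SGX
code: the types and toy_* functions are hypothetical, no ENCLS instruction is
involved, and the state check in the free helper exists only to make the
caller's obligation visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_page_state { TOY_PAGE_UNINITIALIZED, TOY_PAGE_IN_ENCLAVE };

struct toy_epc_page {
	enum toy_page_state state;
	bool on_free_list;
};

/* Models sgx_reset_epc_page(): place the page in uninitialized state. */
static void toy_reset_epc_page(struct toy_epc_page *page)
{
	assert(page != NULL);
	page->state = TOY_PAGE_UNINITIALIZED; /* stands in for EREMOVE */
}

/* Models sgx_free_epc_page(): the caller must have reset the page first. */
static int toy_free_epc_page(struct toy_epc_page *page)
{
	if (page->state != TOY_PAGE_UNINITIALIZED)
		return -1; /* toy-only check that makes the contract visible */

	page->on_free_list = true;
	return 0;
}
```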

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-02-02 17:24       ` Jarkko Sakkinen
@ 2021-02-02 19:23         ` Kai Huang
  2021-02-02 22:41           ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-02 19:23 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Tue, 2 Feb 2021 19:24:42 +0200 Jarkko Sakkinen wrote:
> On Mon, Feb 01, 2021 at 01:32:59PM +1300, Kai Huang wrote:
> > On Sat, 30 Jan 2021 17:00:46 +0200 Jarkko Sakkinen wrote:
> > > On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> > > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > > 
> > > > Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> > > > full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> > > > bits 15:0, and single-bit modifiers in bits 31:16.
> > > > 
> > > > Historically, KVM has only had to worry about handling the "failed
> > > > VM-Entry" modifier, which could only be set in very specific flows and
> > > > required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> > > > bit was a somewhat viable approach.  But even with only a single bit to
> > > > worry about, KVM has had several bugs related to comparing a basic exit
> > > > reason against the full exit reason stored in vcpu_vmx.
> > > > 
> > > > Upcoming Intel features, e.g. SGX, will add new modifier bits that can
> 
> BTW, SGX is not an upcoming CPU feature.

Probably Sean was implying: "Upcoming CPU features that will be supported by
Linux". I don't see a big deal here.

> 
> Also, speaking broadly of upcoming features is not the right thing to do.
> Better to just scope this down to SGX. Theoretically, upcoming CPU features
> can do pretty much anything. This change is first and foremost done
> for SGX.
> 
> > > > be set on more or less any VM-Exit, as opposed to the significantly more
> > > > restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> > > > flows isn't scalable.  Tracking exit reason in a union forces code to
> > > > explicitly choose between consuming the full exit reason and the basic
> > > > exit, and is a convenient way to document and access the modifiers.
> > > 
> > > I *believe* that the change is correct but I dropped in the last paragraph
> > > - most likely only because of lack of expertise in this area.
> > > 
> > > I ask the most basic question: why SGX will add new modifier bits?
> > 
> > Not 100% sure about your question. Assuming you are asking SGX hardware
> > behavior, SGX architecture adds a new modifier bit (27) to Exit Reason, similar
> > to new #PF.SGX bit. 
> > 
> > Please refer to SDM Volume 3, Chapter 27.2.1 Basic VM-Exit Information.
> > 
> > Sean's commit msg already provides significant motivation of the change in this
> > patch.
> 
> Just describe why SGX requires this. That's all.

This patch changes the vmexit info from a u32 to a union, because at least one
additional modifier is going to be added, due to SGX. So the motivation of this
patch is the fact that "one or more additional modifier bits will be added",
and SGX is just an example.

So I don't think adding too much SGX background in *THIS* patch is needed.
And another patch:

[RFC PATCH v3 21/27] KVM: VMX: Add basic handling of VM-Exit from SGX enclave

already has enough information on "why a new modifier bit is added for SGX".
Sean also replied to you.

Please look at that patch and see whether it satisfies you.

> 
> /Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 04/27] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
  2021-02-02 18:00       ` Paolo Bonzini
@ 2021-02-02 19:25         ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-02 19:25 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, jarkko, luto,
	haitao.huang, bp, tglx, mingo, hpa

On Tue, 2 Feb 2021 19:00:48 +0100 Paolo Bonzini wrote:
> On 27/01/21 02:25, Kai Huang wrote:
> > On Tue, 26 Jan 2021 08:04:35 -0800 Dave Hansen wrote:
> >> On 1/26/21 1:30 AM, Kai Huang wrote:
> >>> From: Jarkko Sakkinen <jarkko@kernel.org>
> >>>
> >>> Encapsulate the snippet in sgx_free_epc_page() concerning EREMOVE to
> >>> sgx_reset_epc_page(), which is a static helper function for
> >>> sgx_encl_release().  It's the only function existing, which deals with
> >>> initialized pages.
> >>
> >> Yikes.  I have no idea what that is saying.  Here's a rewrite:
> >>
> >> EREMOVE takes a page and removes any association between that page and
> >> an enclave.  It must be run on a page before it can be added into
> >> another enclave.  Currently, EREMOVE is run as part of pages being freed
> >> into the SGX page allocator.  It is not expected to fail.
> >>
> >> KVM does not track how guest pages are used, which means that SGX
> >> virtualization use of EREMOVE might fail.
> >>
> >> Break out the EREMOVE call from the SGX page allocator.  This will allow
> >> the SGX virtualization code to use the allocator directly.  (SGX/KVM
> >> will also introduce a more permissive EREMOVE helper).
> > 
> > Thanks.
> > 
> > Hi Jarkko,
> > 
> > Do you want me to update your patch directly, or do you want to take the
> > change, and send me the patch again?
> 
> I think you should treat all these 27 patches as yours now (including 
> removing them, reordering them, adjusting commit message etc.).

Agreed. Thank you Paolo for starting to review this series :)

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-02-02 17:20       ` Jarkko Sakkinen
@ 2021-02-02 20:35         ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-02 20:35 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 2 Feb 2021 19:20:54 +0200 Jarkko Sakkinen wrote:
> On Mon, Feb 01, 2021 at 01:17:44PM +1300, Kai Huang wrote:
> > On Sat, 30 Jan 2021 16:51:41 +0200 Jarkko Sakkinen wrote:
> > > On Tue, Jan 26, 2021 at 10:31:06PM +1300, Kai Huang wrote:
> > > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > > 
> > > > The bare-metal kernel must intercept ECREATE to be able to impose policies
> > > > on guests.  When it does this, the bare-metal kernel runs ECREATE against
> > > > the userspace mapping of the virtualized EPC.
> > > 
> > > I guess Andy's earlier comment applies here, i.e. SGX driver?
> > 
> > Sure.
> > 
> > [...]
> > 
> > > > +	}
> > > > +
> > > > +	if (encls_faulted(ret)) {
> > > > +		*trapnr = ENCLS_TRAPNR(ret);
> 
> Also here is an empty line needed.

I honestly don't like putting a new line here, since it is just two lines of
code. Adding a new line makes it too sparse, I think.

> 
> > > > +		return -EFAULT;
> > > > +	}
> > > 
> > > Empty line here before return. Applies also to sgx_virt_ecreate().
> > 
> > Yes I can remove, but I am just curious: isn't "having empty line before return"
> > a good coding-style? Do you have any reference to the guideline?
> 
> In the initial SGX patch set, this was the review feedback that I got
> from Boris, so I would presume it is tip tree convention. Also, looking
> at a random selection of files under arch/x86, it is commonly done this
> way.

I'll add a new line here. Sorry I misunderstood your original comment.

> 
> > 
> > > 
> > > > +	return ret;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(sgx_virt_einit);
> > > > -- 
> > > > 2.29.2
> > > 
> > > Great work. I think this patch set is shaping up.
> > > 
> > > /Jarkko
> > > > 
> > > > 
> > 
> 
> /Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-01-26 10:10 [RFC PATCH v3 00/27] KVM SGX virtualization support Kai Huang
                   ` (27 preceding siblings ...)
  2021-01-26  9:32 ` [RFC PATCH v3 27/27] KVM: x86: Add capability to grant VM access to privileged SGX attribute Kai Huang
@ 2021-02-02 22:21 ` Edgecombe, Rick P
  2021-02-02 22:33   ` Sean Christopherson
  2021-02-02 22:36   ` Dave Hansen
  28 siblings, 2 replies; 156+ messages in thread
From: Edgecombe, Rick P @ 2021-02-02 22:21 UTC (permalink / raw)
  To: linux-sgx, kvm, Huang, Kai, x86
  Cc: corbet, luto, Hansen, Dave, jethro, wanpengli, seanjc, mingo,
	b.thiel, tglx, pbonzini, jarkko, joro, hpa, jmattson, vkuznets,
	bp, Huang, Haitao

On Tue, 2021-01-26 at 23:10 +1300, Kai Huang wrote:
> This series adds KVM SGX virtualization support. The first 15 patches
> starting
> with x86/sgx or x86/cpu.. are necessary changes to x86 and SGX
> core/driver to
> support KVM SGX virtualization, while the rest are patches to KVM
> subsystem.

Do we need to restrict normal KVM host kernel access to EPC (i.e. via
__kvm_map_gfn() and friends)? As best I can tell the exact behavior of
this kind of access is undefined. The concern would be if any HW ever
treated it as an error, the guest could subject the host kernel to it.
Is it worth a check in those?

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-02 22:21 ` [RFC PATCH v3 00/27] KVM SGX virtualization support Edgecombe, Rick P
@ 2021-02-02 22:33   ` Sean Christopherson
  2021-02-02 23:21     ` Dave Hansen
  2021-02-02 22:36   ` Dave Hansen
  1 sibling, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-02 22:33 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: linux-sgx, kvm, Huang, Kai, x86, corbet, luto, Hansen, Dave,
	jethro, wanpengli, mingo, b.thiel, tglx, pbonzini, jarkko, joro,
	hpa, jmattson, vkuznets, bp, Huang, Haitao

On Tue, Feb 02, 2021, Edgecombe, Rick P wrote:
> On Tue, 2021-01-26 at 23:10 +1300, Kai Huang wrote:
> > This series adds KVM SGX virtualization support. The first 15 patches
> > starting
> > with x86/sgx or x86/cpu.. are necessary changes to x86 and SGX
> > core/driver to
> > support KVM SGX virtualization, while the rest are patches to KVM
> > subsystem.
> 
> Do we need to restrict normal KVM host kernel access to EPC (i.e. via
> __kvm_map_gfn() and friends)? As best I can tell the exact behavior of
> this kind of access is undefined. The concern would be if any HW ever
> treated it as an error, the guest could subject the host kernel to it.
> Is it worth a check in those?

I don't think so.  The SDM does state that the exact behavior is uArch specific,
but it also explicitly states that the access will be altered, which IMO doesn't
leave any wiggle room for a future CPU to fault instead of using some form of
abort semantics.

  Attempts to execute, read, or write to linear addresses mapped to EPC pages
  when not inside an enclave will result in the processor altering the access to
  preserve the confidentiality and integrity of the enclave. The exact behavior
  may be different between implementations.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-02 22:21 ` [RFC PATCH v3 00/27] KVM SGX virtualization support Edgecombe, Rick P
  2021-02-02 22:33   ` Sean Christopherson
@ 2021-02-02 22:36   ` Dave Hansen
  1 sibling, 0 replies; 156+ messages in thread
From: Dave Hansen @ 2021-02-02 22:36 UTC (permalink / raw)
  To: Edgecombe, Rick P, linux-sgx, kvm, Huang, Kai, x86
  Cc: corbet, luto, jethro, wanpengli, seanjc, mingo, b.thiel, tglx,
	pbonzini, jarkko, joro, hpa, jmattson, vkuznets, bp, Huang,
	Haitao

On 2/2/21 2:21 PM, Edgecombe, Rick P wrote:
> On Tue, 2021-01-26 at 23:10 +1300, Kai Huang wrote:
>> This series adds KVM SGX virtualization support. The first 15 patches
>> starting
>> with x86/sgx or x86/cpu.. are necessary changes to x86 and SGX
>> core/driver to
>> support KVM SGX virtualization, while the rest are patches to KVM
>> subsystem.
> 
> Do we need to restrict normal KVM host kernel access to EPC (i.e. via
> __kvm_map_gfn() and friends)? As best I can tell the exact behavior of
> this kind of access is undefined. The concern would be if any HW ever
> treated it as an error, the guest could subject the host kernel to it.
> Is it worth a check in those?

Geez, you're right.  It's not even a page fault we can recover from.

SDM, Vol. 3D 37-1, 37.3 ACCESS-CONTROL REQUIREMENTS, says:

"Non-enclave accesses to EPC memory result in undefined behavior"

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-02-01 17:12     ` Sean Christopherson
@ 2021-02-02 22:38       ` Jarkko Sakkinen
  0 siblings, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 22:38 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Kai Huang, linux-sgx, kvm, x86, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Mon, Feb 01, 2021 at 09:12:47AM -0800, Sean Christopherson wrote:
> On Sat, Jan 30, 2021, Jarkko Sakkinen wrote:
> > On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > 
> > > Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> > > full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> > > bits 15:0, and single-bit modifiers in bits 31:16.
> > > 
> > > Historically, KVM has only had to worry about handling the "failed
> > > VM-Entry" modifier, which could only be set in very specific flows and
> > > required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> > > bit was a somewhat viable approach.  But even with only a single bit to
> > > worry about, KVM has had several bugs related to comparing a basic exit
> > > reason against the full exit reason stored in vcpu_vmx.
> > > 
> > > Upcoming Intel features, e.g. SGX, will add new modifier bits that can
> > > be set on more or less any VM-Exit, as opposed to the significantly more
> > > restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> > > flows isn't scalable.  Tracking exit reason in a union forces code to
> > > explicitly choose between consuming the full exit reason and the basic
> > > exit, and is a convenient way to document and access the modifiers.
> > 
> > I *believe* that the change is correct but I dropped in the last paragraph
> > - most likely only because of lack of expertise in this area.
> > 
> > I ask the most basic question: why SGX will add new modifier bits?
> 
> Register state is loaded with synthetic state and/or trampoline state on VM-Exit
> from enclaves.  For all intents and purposes, emulation and other VMM/hypervisor
> behavior that accesses vCPU state is impossible.  E.g. on a #UD VM-Exit, RIP
> will point at the AEP; emulating would emulate some random instruction in the
> untrusted runtime, not the instruction that faulted.
> 
> Hardware sets the "exit from enclave" modifier flag so that the VMM can try and
> do something moderately sane, e.g. inject a #UD into the guest instead of
> attempting to emulate random instructions.

OK, thanks for the explanation! I think this would be a great addition to
the commit message (as a reminder).

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread
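
For readers following the thread, the layout described in the commit message
(16-bit basic exit reason in bits 15:0, single-bit modifiers above it, the SGX
"exit from enclave" modifier at bit 27) can be sketched in userspace C roughly
as below. This is only an illustration, not the code from the patch: the type
and helper names are made up, the reserved fields are lumped together, and the
bit-field layout assumes GCC or Clang on a little-endian target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the exit reason; not the kernel's type. */
union exit_reason_sketch {
	struct {
		uint32_t basic          : 16; /* bits 15:0, basic exit reason */
		uint32_t reserved_16_26 : 11; /* bits 26:16, not modelled here */
		uint32_t enclave_mode   : 1;  /* bit 27, "exit from enclave" */
		uint32_t reserved_28_30 : 3;  /* bits 30:28, not modelled here */
		uint32_t failed_vmentry : 1;  /* bit 31, "failed VM-Entry" */
	};
	uint32_t full;                        /* the raw 32-bit field */
};

_Static_assert(sizeof(union exit_reason_sketch) == sizeof(uint32_t),
	       "the union must stay 32 bits wide");

/*
 * Register state is synthetic after an exit from an enclave, so a caller
 * should not try to emulate the instruction at RIP in that case.
 */
static inline int exit_allows_emulation(union exit_reason_sketch r)
{
	return !r.enclave_mode;
}

/* Small self-check: comparing .full against a basic reason is the bug class. */
static inline void exit_reason_selfcheck(void)
{
	union exit_reason_sketch r = { .full = (1u << 27) | 10u };

	assert(r.basic == 10);          /* basic reason survives the modifier */
	assert(r.full != 10u);          /* the full value does not match it */
	assert(!exit_allows_emulation(r));
}
```

The point of the union is that a caller has to write either .basic or .full,
so a comparison of a basic exit reason against the full value stands out in
review instead of hiding behind a plain u32.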

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-02-02 19:23         ` Kai Huang
@ 2021-02-02 22:41           ` Jarkko Sakkinen
  2021-02-03  0:42             ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 22:41 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Wed, Feb 03, 2021 at 08:23:40AM +1300, Kai Huang wrote:
> On Tue, 2 Feb 2021 19:24:42 +0200 Jarkko Sakkinen wrote:
> > On Mon, Feb 01, 2021 at 01:32:59PM +1300, Kai Huang wrote:
> > > On Sat, 30 Jan 2021 17:00:46 +0200 Jarkko Sakkinen wrote:
> > > > On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> > > > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > > > 
> > > > > Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> > > > > full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> > > > > bits 15:0, and single-bit modifiers in bits 31:16.
> > > > > 
> > > > > Historically, KVM has only had to worry about handling the "failed
> > > > > VM-Entry" modifier, which could only be set in very specific flows and
> > > > > required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> > > > > bit was a somewhat viable approach.  But even with only a single bit to
> > > > > worry about, KVM has had several bugs related to comparing a basic exit
> > > > > reason against the full exit reason stored in vcpu_vmx.
> > > > > 
> > > > > Upcoming Intel features, e.g. SGX, will add new modifier bits that can
> > 
> > BTW, SGX is not an upcoming CPU feature.
> 
> Probably Sean was implying: "Upcoming CPU features that will be supported by
> Linux". I don't see big deal here.
> 
> > 
> > Also, broadly speaking of upcoming features is not the right thing to do.
> > Better just to scope this down to SGX. Theoretically upcoming CPU features
> > can do pretty much anything. This change is first and foremost done
> > for SGX.
> > 
> > > > > be set on more or less any VM-Exit, as opposed to the significantly more
> > > > > restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> > > > > flows isn't scalable.  Tracking exit reason in a union forces code to
> > > > > explicitly choose between consuming the full exit reason and the basic
> > > > > exit, and is a convenient way to document and access the modifiers.
> > > > 
> > > > I *believe* that the change is correct but I dropped in the last paragraph
> > > > - most likely only because of lack of expertise in this area.
> > > > 
> > > > I ask the most basic question: why SGX will add new modifier bits?
> > > 
> > > Not 100% sure about your question. Assuming you are asking SGX hardware
> > > behavior, SGX architecture adds a new modifier bit (27) to Exit Reason, similar
> > > to new #PF.SGX bit. 
> > > 
> > > Please refer to SDM Volume 3, Chapter 27.2.1 Basic VM-Exit Information.
> > > 
> > > Sean's commit msg already provides significant motivation of the change in this
> > > patch.
> > 
> > Just describe why SGX requires this. That's all.
> 
> This patch is to change vmexit info from u32 to union, because at least one
> additional modifier is going to be added, due to SGX. So the motivation of this
> patch is the fact that "one or more additional modifier bits will be added",
> and SGX is just example. 
> 
> So I don't think adding too much SGX background in *THIS* patch is needed.
> And another patch: 
> 
> [RFC PATCH v3 21/27] KVM: VMX: Add basic handling of VM-Exit from SGX enclave
> 
> already has enough information of "why new modifier bit is added for SGX".
> Sean also replied to you. 

Well it comes after this patch. So you either need to provide the context
here or reorder patches. If the latter is impossible, I would just add the
couple of paragraphs that Sean wrote.

> Please look at that patch and see whether it satisfies you.

Well there needs to be causality in patches. I should be able to review
this patch as if patches 17 onward did not exist.


> 
> > 
> > /Jarkko
> 

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-01 15:25       ` Dave Hansen
  2021-02-01 17:23         ` Sean Christopherson
@ 2021-02-02 23:07         ` Jarkko Sakkinen
  1 sibling, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 23:07 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kai Huang, linux-sgx, kvm, x86, seanjc, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Mon, Feb 01, 2021 at 07:25:41AM -0800, Dave Hansen wrote:
> On 1/31/21 9:40 PM, Kai Huang wrote:
> >>> -	ret = sgx_drv_init();
> >>> +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> >>> +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> >> I would create more dumb code and just add
> >>
> >> ret = sgx_vepc_init()
> >> if (ret)
> >>         goto err_kthread;
> 
> Jarkko, I'm not sure I understand this suggestion.

I refined it in my 2nd response to Kai:

https://lore.kernel.org/linux-sgx/YBmMrqxlTxClg9Eb@kernel.org/

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-02  0:12           ` Kai Huang
@ 2021-02-02 23:10             ` Jarkko Sakkinen
  0 siblings, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 23:10 UTC (permalink / raw)
  To: Kai Huang
  Cc: Sean Christopherson, Dave Hansen, linux-sgx, kvm, x86, luto,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Tue, Feb 02, 2021 at 01:12:07PM +1300, Kai Huang wrote:
> On Mon, 1 Feb 2021 09:23:18 -0800 Sean Christopherson wrote:
> > On Mon, Feb 01, 2021, Dave Hansen wrote:
> > > On 1/31/21 9:40 PM, Kai Huang wrote:
> > > >>> -	ret = sgx_drv_init();
> > > >>> +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > > >>> +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > > >> I would create more dumb code and just add
> > > >>
> > > >> ret = sgx_vepc_init()
> > > >> if (ret)
> > > >>         goto err_kthread;
> > > 
> > > Jarkko, I'm not sure I understand this suggestion.
> > > 
> > > > Do you mean you want below?
> > > > 
> > > > 	ret = sgx_drv_init();
> > > > 	ret = sgx_vepc_init();
> > > > 	if (ret)
> > > > 		goto err_kthread;
> > > > 
> > > > This was Sean's original code, but Dave didn't like it.
> > 
> > The problem is it's wrong.  That snippet would incorrectly bail if drv_init()
> > succeeds but vepc_init() fails.
> > 
> > The alternative to the bitwise AND is to snapshot the result in two separate
> > variables:
> > 
> > 	ret = sgx_drv_init();
> > 	ret2 = sgx_vepc_init();
> > 	if (ret && ret2)
> > 		goto err_kthread;
> > 
> > or check the return from drv_init() _after_ vepc_init():
> > 
> > 	ret = sgx_drv_init();
> > 	if (sgx_vepc_init() && ret)
> > 		goto err_kthread;
> > 
> > 
> > As evidenced by this thread, the behavior is subtle and easy to get wrong.  I
> > deliberately chose the option that was the weirdest specifically to reduce the
> > probability of someone incorrectly "cleaning up" the code.
> > 
> > > Are you sure?  I remember the !!&!! abomination being Sean's doing. :)
> > 
> > Yep!  That 100% functionally correct horror is my doing.
> > 
> > > > Sean/Dave,
> > > > 
> > > > Please let me know which way you prefer.
> > > 
> > > Kai, I don't really know what you are saying here.  In the end,
> > > sgx_vepc_init() has to run regardless of whether sgx_drv_init() is
> > > successful or not.  Also, we only want to 'goto err_kthread' if *BOTH*
> > > fail.  The code you have above will, for instance, 'goto err_kthread' if
> > > sgx_drv_init() succeeds but sgx_vepc_init() fails.  It entirely
> > > disregards the sgx_drv_init() error code.
> > 
> 
> Hi Dave, Sean,
> 
> Yeah sorry my bad. The example I provided won't work. So I'd like to keep the
> !! & !! :)
> 
> Thanks. 
> 
> Jarkko, please let us know if you still have concern.

I'll just review the next version. I disagree with this packing though.

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread
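
The three forms argued over in this sub-thread can be compared side by side
with a small userspace sketch. The helper names and SKETCH_ERR are made up for
illustration; each helper takes the two return codes as plain ints and reports
whether sgx_init() would take the 'goto err_kthread' path.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for any negative errno value; hypothetical, for this sketch only. */
#define SKETCH_ERR (-1)

/* Form used in the patch: bail only when both return codes are nonzero. */
static inline int bail_bitwise(int drv_ret, int vepc_ret)
{
	return !!drv_ret & !!vepc_ret;
}

/* Two-variable alternative from the thread: same truth table as above. */
static inline bool bail_two_vars(int drv_ret, int vepc_ret)
{
	return drv_ret && vepc_ret;
}

/* Overwrite form: the native driver's result is discarded entirely. */
static inline bool bail_overwrite(int drv_ret, int vepc_ret)
{
	(void)drv_ret;		/* lost when 'ret' is reassigned */
	return vepc_ret != 0;
}

/* Walk the cases that matter for the discussion. */
static inline void sgx_init_forms_selfcheck(void)
{
	/* Both drivers failed: every form bails. */
	assert(bail_bitwise(SKETCH_ERR, SKETCH_ERR) == 1);
	assert(bail_two_vars(SKETCH_ERR, SKETCH_ERR));
	assert(bail_overwrite(SKETCH_ERR, SKETCH_ERR));

	/* Native driver up, vEPC failed: only the overwrite form bails. */
	assert(bail_bitwise(0, SKETCH_ERR) == 0);
	assert(!bail_two_vars(0, SKETCH_ERR));
	assert(bail_overwrite(0, SKETCH_ERR));
}
```

One detail the helpers cannot show because they take already-computed values:
in the real expression the operands are function calls, and the bitwise '&'
evaluates both of them, whereas a logical '&&' written directly over the two
calls would skip the second call whenever the first returned 0.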

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-02 18:20         ` Sean Christopherson
@ 2021-02-02 23:16           ` Jarkko Sakkinen
  2021-02-03  0:49             ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 23:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Kai Huang, linux-sgx, kvm, x86, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, Feb 02, 2021 at 10:20:47AM -0800, Sean Christopherson wrote:
> On Tue, Feb 02, 2021, Jarkko Sakkinen wrote:
> > On Mon, Feb 01, 2021 at 06:40:40PM +1300, Kai Huang wrote:
> > > On Sat, 30 Jan 2021 16:45:43 +0200 Jarkko Sakkinen wrote:
> > > > On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> > > > > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > > > > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > > > > might be disabled if SGX Launch Control is in locked mode, or not
> > > > > supported in the hardware at all.  This allows (non-Linux) guests that
> > > > > support non-LC configurations to use SGX.
> > > > > 
> > > > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > > > ---
> > > > > v2->v3:
> > > > > 
> > > > >  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> > > > > 
> > > > > ---
> > > > >  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
> > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > > > index 21c2ffa13870..93d249f7bff3 100644
> > > > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > > > @@ -12,6 +12,7 @@
> > > > >  #include "driver.h"
> > > > >  #include "encl.h"
> > > > >  #include "encls.h"
> > > > > +#include "virt.h"
> > > > >  
> > > > >  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> > > > >  static int sgx_nr_epc_sections;
> > > > > @@ -712,7 +713,8 @@ static int __init sgx_init(void)
> > > > >  		goto err_page_cache;
> > > > >  	}
> > > > >  
> > > > > -	ret = sgx_drv_init();
> > > > > +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > > > > +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > > > 
> > > > I would create more dumb code and just add
> > > > 
> > > > ret = sgx_vepc_init()
> > > > if (ret)
> > > >         goto err_kthread;
> > > 
> > > Do you mean you want below?
> > > 
> > > 	ret = sgx_drv_init();
> > > 	ret = sgx_vepc_init();
> > > 	if (ret)
> > > 		goto err_kthread;
> > > 
> > > This was Sean's original code, but Dave didn't like it.
> > 
> > I think it should be like:
> > 
> > ret = sgx_drv_init();
> > if (ret)
> >         pr_warn("Driver initialization failed with %d\n", ret);
> > 
> > ret = sgx_vepc_init();
> > if (ret)
> > 	goto err_kthread;
> 
> And that's wrong, it doesn't correctly handle the case where sgx_drv_init()
> succeeds but sgx_vepc_init() fails.

After reading all of this, I think that the only acceptable way to
manage this is to

ret = sgx_drv_init();
if (ret && ret != -ENODEV)
        goto err_kthread;

ret = sgx_vepc_init();
if (ret)
	goto err_kthread;

Anything else would be a bad idea.

We do support allowing KVM when the driver does not *support* SGX,
not when something is working incorrectly. In that case it is a bad
idea to allow any SGX related initialization to continue.

Agreed that my earlier example is incorrect but so is the condition
in the original patch.

/Jarkko 

^ permalink raw reply	[flat|nested] 156+ messages in thread
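
To make the disagreement concrete, here is a small userspace sketch of the two
policies: the one in the posted patch (give up only if both drivers failed)
and the one proposed in the message above (tolerate only -ENODEV from the
native driver, and treat any vEPC failure as fatal). The helper names are
hypothetical and the return codes are passed in as plain ints.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Policy of the posted patch: bail only if both initializations failed. */
static inline bool bail_patch(int drv_ret, int vepc_ret)
{
	return drv_ret && vepc_ret;
}

/*
 * Policy proposed in the reply: a native driver failure other than -ENODEV
 * is fatal, and so is any vEPC failure.
 */
static inline bool bail_enodev(int drv_ret, int vepc_ret)
{
	if (drv_ret && drv_ret != -ENODEV)
		return true;
	return vepc_ret != 0;
}

/* The two inputs on which the policies disagree. */
static inline void sgx_init_policy_selfcheck(void)
{
	/* Native driver hit a real error, vEPC is fine. */
	assert(!bail_patch(-ENOMEM, 0));
	assert(bail_enodev(-ENOMEM, 0));

	/* Native driver is fine, vEPC failed. */
	assert(!bail_patch(0, -ENOMEM));
	assert(bail_enodev(0, -ENOMEM));
}
```

The policies agree when both succeed, when the native driver returns -ENODEV
and vEPC succeeds, and when both fail. They differ exactly in the two cases
checked above, which is what the rest of the thread argues about.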

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-02 18:49         ` Kai Huang
@ 2021-02-02 23:17           ` Jarkko Sakkinen
  0 siblings, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-02 23:17 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Wed, Feb 03, 2021 at 07:49:45AM +1300, Kai Huang wrote:
> On Tue, 2 Feb 2021 19:32:30 +0200 Jarkko Sakkinen wrote:
> > On Mon, Feb 01, 2021 at 06:40:40PM +1300, Kai Huang wrote:
> > > On Sat, 30 Jan 2021 16:45:43 +0200 Jarkko Sakkinen wrote:
> > > > On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> > > > > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > > > > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > > > > might be disabled if SGX Launch Control is in locked mode, or not
> > > > > supported in the hardware at all.  This allows (non-Linux) guests that
> > > > > support non-LC configurations to use SGX.
> > > > > 
> > > > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > > > ---
> > > > > v2->v3:
> > > > > 
> > > > >  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> > > > > 
> > > > > ---
> > > > >  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
> > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > > > index 21c2ffa13870..93d249f7bff3 100644
> > > > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > > > @@ -12,6 +12,7 @@
> > > > >  #include "driver.h"
> > > > >  #include "encl.h"
> > > > >  #include "encls.h"
> > > > > +#include "virt.h"
> > > > >  
> > > > >  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> > > > >  static int sgx_nr_epc_sections;
> > > > > @@ -712,7 +713,8 @@ static int __init sgx_init(void)
> > > > >  		goto err_page_cache;
> > > > >  	}
> > > > >  
> > > > > -	ret = sgx_drv_init();
> > > > > +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > > > > +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > > > 
> > > > I would create more dumb code and just add
> > > > 
> > > > ret = sgx_vepc_init()
> > > > if (ret)
> > > >         goto err_kthread;
> > > 
> > > Do you mean you want below?
> > > 
> > > 	ret = sgx_drv_init();
> > > 	ret = sgx_vepc_init();
> > > 	if (ret)
> > > 		goto err_kthread;
> > > 
> > > This was Sean's original code, but Dave didn't like it.
> > 
> > I think it should be like:
> > 
> > ret = sgx_drv_init();
> > if (ret)
> >         pr_warn("Driver initialization failed with %d\n", ret);
> > 
> > ret = sgx_vepc_init();
> > if (ret)
> > 	goto err_kthread;
> > 
> > There is problem here anyhow. I.e. -ENODEV's from sgx_drv_init().  I think
> > how driver.c should be changed would be just to return 0 in the places
> > where it now return -ENODEV. Consider "not initialized" as a successful
> > initialization.
> 
> Hi Jarkko,
> 
> Dave already pointed out above code won't work. The problem is failure to
> initialize vepc will just goto err_kthread, no matter whether driver has been
> initialized successfully or not. 
> 
> I am sticking to the original way (!! & !!).

I think it is wrong, as it is not in line with the conditions when KVM SGX
support is allowed. It's exactly allowed when SGX is not supported by the
driver. Not when things are not behaving right. Would be insane to allow
anything to initialize in that situation.

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-02 22:33   ` Sean Christopherson
@ 2021-02-02 23:21     ` Dave Hansen
  2021-02-02 23:56       ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-02-02 23:21 UTC (permalink / raw)
  To: Sean Christopherson, Edgecombe, Rick P
  Cc: linux-sgx, kvm, Huang, Kai, x86, corbet, luto, jethro, wanpengli,
	mingo, b.thiel, tglx, pbonzini, jarkko, joro, hpa, jmattson,
	vkuznets, bp, Huang, Haitao

On 2/2/21 2:33 PM, Sean Christopherson wrote:
>> Do we need to restrict normal KVM host kernel access to EPC (i.e. via
>> __kvm_map_gfn() and friends)? As best I can tell the exact behavior of
>> this kind of access is undefined. The concern would be if any HW ever
>> treated it as an error, the guest could subject the host kernel to it.
>> Is it worth a check in those?
> I don't think so.  The SDM does state that the exact behavior is uArch specific,
> but it also explicitly states that the access will be altered, which IMO doesn't
> leave any wiggle room for a future CPU to fault instead of using some form of
> abort semantics.
> 
>   Attempts to execute, read, or write to linear addresses mapped to EPC pages
>   when not inside an enclave will result in the processor altering the access to
>   preserve the confidentiality and integrity of the enclave. The exact behavior
>   may be different between implementations.

I seem to remember much stronger language in the SDM about this.  I've
always thought of SGX as a big unrecoverable machine-check party waiting
to happen.

I'll ask around internally at Intel and see what folks say.  Basically,
should we be afraid of a big bad EPC access?

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-02 23:21     ` Dave Hansen
@ 2021-02-02 23:56       ` Sean Christopherson
  2021-02-03  0:43         ` Dave Hansen
  2021-02-03 15:10         ` Dave Hansen
  0 siblings, 2 replies; 156+ messages in thread
From: Sean Christopherson @ 2021-02-02 23:56 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Edgecombe, Rick P, linux-sgx, kvm, Huang, Kai, x86, corbet, luto,
	jethro, wanpengli, mingo, b.thiel, tglx, pbonzini, jarkko, joro,
	hpa, jmattson, vkuznets, bp, Huang, Haitao

On Tue, Feb 02, 2021, Dave Hansen wrote:
> On 2/2/21 2:33 PM, Sean Christopherson wrote:
> >> Do we need to restrict normal KVM host kernel access to EPC (i.e. via
> >> __kvm_map_gfn() and friends)? As best I can tell the exact behavior of
> >> this kind of access is undefined. The concern would be if any HW ever
> >> treated it as an error, the guest could subject the host kernel to it.
> >> Is it worth a check in those?
> > I don't think so.  The SDM does state that the exact behavior is uArch specific,
> > but it also explicitly states that the access will be altered, which IMO doesn't
> > leave any wiggle room for a future CPU to fault instead of using some form of
> > abort semantics.
> > 
> >   Attempts to execute, read, or write to linear addresses mapped to EPC pages
> >   when not inside an enclave will result in the processor altering the access to
> >   preserve the confidentiality and integrity of the enclave. The exact behavior
> >   may be different between implementations.
> 
> I seem to remember much stronger language in the SDM about this.  I've
> always thought of SGX as a big unrecoverable machine-check party waiting
> to happen.
>
> I'll ask around internally at Intel and see what folks say.  Basically,
> should we be afraid of a big bad EPC access?

If bad accesses to the EPC can cause machine checks, then EPC should never be
mapped into userspace, i.e. the SGX driver should never have been merged.

The SGX architecture is predicated on using isolation to protect enclaves from
software, not by poisoning memory, a la TDX.  E.g. SGX on ICX's MKTME wouldn't
be a thing if that weren't the case.

A physical attack on DRAM can trigger #MC on systems that use the MEE as
opposed to MKTME, but that obviously doesn't require a guest to coerce KVM into
accessing the EPC.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 16/27] KVM: VMX: Convert vcpu_vmx.exit_reason to a union
  2021-02-02 22:41           ` Jarkko Sakkinen
@ 2021-02-03  0:42             ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-03  0:42 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-sgx, kvm, x86, seanjc, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa, jmattson, joro, vkuznets,
	wanpengli

On Wed, 3 Feb 2021 00:41:00 +0200 Jarkko Sakkinen wrote:
> On Wed, Feb 03, 2021 at 08:23:40AM +1300, Kai Huang wrote:
> > On Tue, 2 Feb 2021 19:24:42 +0200 Jarkko Sakkinen wrote:
> > > On Mon, Feb 01, 2021 at 01:32:59PM +1300, Kai Huang wrote:
> > > > On Sat, 30 Jan 2021 17:00:46 +0200 Jarkko Sakkinen wrote:
> > > > > On Tue, Jan 26, 2021 at 10:31:37PM +1300, Kai Huang wrote:
> > > > > > From: Sean Christopherson <sean.j.christopherson@intel.com>
> > > > > > 
> > > > > > Convert vcpu_vmx.exit_reason from a u32 to a union (of size u32).  The
> > > > > > full VM_EXIT_REASON field is comprised of a 16-bit basic exit reason in
> > > > > > bits 15:0, and single-bit modifiers in bits 31:16.
> > > > > > 
> > > > > > Historically, KVM has only had to worry about handling the "failed
> > > > > > VM-Entry" modifier, which could only be set in very specific flows and
> > > > > > required dedicated handling.  I.e. manually stripping the FAILED_VMENTRY
> > > > > > bit was a somewhat viable approach.  But even with only a single bit to
> > > > > > worry about, KVM has had several bugs related to comparing a basic exit
> > > > > > reason against the full exit reason stored in vcpu_vmx.
> > > > > > 
> > > > > > Upcoming Intel features, e.g. SGX, will add new modifier bits that can
> > > 
> > > BTW, SGX is not an upcoming CPU feature.
> > 
> > Probably Sean was implying: "Upcoming CPU features that will be supported by
> > Linux". I don't see big deal here.
> > 
> > > 
> > > > Also, broadly speaking of upcoming features is not the right thing to do.
> > > > Better just to scope this down to SGX. Theoretically upcoming CPU features
> > > > can do pretty much anything. This change is first and foremost done
> > > > for SGX.
> > > 
> > > > > > be set on more or less any VM-Exit, as opposed to the significantly more
> > > > > > restricted FAILED_VMENTRY, i.e. correctly handling everything in one-off
> > > > > > flows isn't scalable.  Tracking exit reason in a union forces code to
> > > > > > explicitly choose between consuming the full exit reason and the basic
> > > > > > exit, and is a convenient way to document and access the modifiers.
> > > > > 
> > > > > I *believe* that the change is correct but I dropped in the last paragraph
> > > > > - most likely only because of lack of expertise in this area.
> > > > > 
> > > > > I ask the most basic question: why SGX will add new modifier bits?
> > > > 
> > > > Not 100% sure about your question. Assuming you are asking SGX hardware
> > > > behavior, SGX architecture adds a new modifier bit (27) to Exit Reason, similar
> > > > to new #PF.SGX bit. 
> > > > 
> > > > Please refer to SDM Volume 3, Chapter 27.2.1 Basic VM-Exit Information.
> > > > 
> > > > Sean's commit msg already provides significant motivation of the change in this
> > > > patch.
> > > 
> > > Just describe why SGX requires this. That's all.
> > 
> > This patch is to change vmexit info from u32 to union, because at least one
> > additional modifier is going to be added, due to SGX. So the motivation of this
> > patch is the fact that "one or more additional modifier bits will be added",
> > and SGX is just example. 
> > 
> > > So I don't think adding too much SGX background in *THIS* patch is needed.
> > And another patch: 
> > 
> > [RFC PATCH v3 21/27] KVM: VMX: Add basic handling of VM-Exit from SGX enclave
> > 
> > > already has enough information of "why new modifier bit is added for SGX".
> > Sean also replied to you. 
> 
> Well it comes after this patch. So you either need to provide the context
> here or reorder patches. If latter is impossible, I would just add those
> couple of paragraphs that Sean wrote.

As I explained, to me the motivation for this patch is "adding an additional
modifier bit", not "adding an additional modifier bit *due to SGX*".

For instance, if we remove SGX from the commit msg, this patch still stands.
Correct?

Sean's paragraph is about why *SGX* adds one additional modifier bit, which
needs to be in another patch, and logically, that patch comes later.

> 
> > Please look at that patch and see whether it satisfies you.
> 
> Well there needs to be causality in patches. I should be able to review
> the patches if 17-> did not exist.
> 
> 
> > 
> > > 
> > > /Jarkko
> > 
> 
> /Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-02 23:56       ` Sean Christopherson
@ 2021-02-03  0:43         ` Dave Hansen
  2021-02-03 15:10         ` Dave Hansen
  1 sibling, 0 replies; 156+ messages in thread
From: Dave Hansen @ 2021-02-03  0:43 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, Huang, Kai, x86, corbet, luto,
	jethro, wanpengli, mingo, b.thiel, tglx, pbonzini, jarkko, joro,
	hpa, jmattson, vkuznets, bp, Huang, Haitao

On 2/2/21 3:56 PM, Sean Christopherson wrote:
>> I'll ask around internally at Intel and see what folks say.  Basically,
>> should we be afraid of a big bad EPC access?
> If bad accesses to the EPC can cause machine checks, then EPC should never be
> mapped into userspace, i.e. the SGX driver should never have been merged.

That's a good point.  However, I've learned not to assume too much about
the SGX architecture.

Either way, I think we need some architectural clarification.  If it
can't *possibly* be harmful, then the architecture docs should at least
put a stake in the ground and say so.  I'll go rattle some cages.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-02 23:16           ` Jarkko Sakkinen
@ 2021-02-03  0:49             ` Kai Huang
  2021-02-03 22:02               ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-03  0:49 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, dave.hansen,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Wed, 3 Feb 2021 01:16:20 +0200 Jarkko Sakkinen wrote:
> On Tue, Feb 02, 2021 at 10:20:47AM -0800, Sean Christopherson wrote:
> > On Tue, Feb 02, 2021, Jarkko Sakkinen wrote:
> > > On Mon, Feb 01, 2021 at 06:40:40PM +1300, Kai Huang wrote:
> > > > On Sat, 30 Jan 2021 16:45:43 +0200 Jarkko Sakkinen wrote:
> > > > > On Tue, Jan 26, 2021 at 10:31:00PM +1300, Kai Huang wrote:
> > > > > > Modify sgx_init() to always try to initialize the virtual EPC driver,
> > > > > > even if the bare-metal SGX driver is disabled.  The bare-metal driver
> > > > > > might be disabled if SGX Launch Control is in locked mode, or not
> > > > > > supported in the hardware at all.  This allows (non-Linux) guests that
> > > > > > support non-LC configurations to use SGX.
> > > > > > 
> > > > > > Signed-off-by: Kai Huang <kai.huang@intel.com>
> > > > > > ---
> > > > > > v2->v3:
> > > > > > 
> > > > > >  - Changed from sgx_virt_epc_init() to sgx_vepc_init().
> > > > > > 
> > > > > > ---
> > > > > >  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
> > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > > > > index 21c2ffa13870..93d249f7bff3 100644
> > > > > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > > > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > > > > @@ -12,6 +12,7 @@
> > > > > >  #include "driver.h"
> > > > > >  #include "encl.h"
> > > > > >  #include "encls.h"
> > > > > > +#include "virt.h"
> > > > > >  
> > > > > >  struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> > > > > >  static int sgx_nr_epc_sections;
> > > > > > @@ -712,7 +713,8 @@ static int __init sgx_init(void)
> > > > > >  		goto err_page_cache;
> > > > > >  	}
> > > > > >  
> > > > > > -	ret = sgx_drv_init();
> > > > > > +	/* Success if the native *or* virtual EPC driver initialized cleanly. */
> > > > > > +	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > > > > 
> > > > > I would create more dumb code and just add
> > > > > 
> > > > > ret = sgx_vepc_init()
> > > > > if (ret)
> > > > >         goto err_kthread;
> > > > 
> > > > Do you mean you want below?
> > > > 
> > > > 	ret = sgx_drv_init();
> > > > 	ret = sgx_vepc_init();
> > > > 	if (ret)
> > > > 		goto err_kthread;
> > > > 
> > > > This was Sean's original code, but Dave didn't like it.
> > > 
> > > I think it should be like:
> > > 
> > > ret = sgx_drv_init();
> > > if (ret)
> > >         pr_warn("Driver initialization failed with %d\n", ret);
> > > 
> > > ret = sgx_vepc_init();
> > > if (ret)
> > > 	goto err_kthread;
> > 
> > And that's wrong, it doesn't correctly handle the case where sgx_drv_init()
> > succeeds but sgx_vepc_init() fails.
> 
> After reading all of this, I think that the only acceptable way to
> to manage this is to
> 
> ret = sgx_drv_init();
> if (ret && ret != -ENODEV)
>         goto err_kthread;

Why? From SGX virtualization's perspective, it doesn't matter which error code
caused the driver to fail to initialize. In fact, it doesn't even care whether
driver initialization succeeded at all.

> 
> ret = sgx_vepc_init();
> if (ret)
> 	goto err_kthread;
> 
> Anything else would be a bad idea.
> 
> We do support allowing KVM when the driver does not *support* SGX,
> not when something is working incorrectly. 

What is working *incorrectly* that is related to SGX virtualization? The things
SGX virtualization requires (basically just raw EPC allocation) are all in
sgx/main.c.

> In that case it is a bad
> idea to allow any SGX related initialization to continue.
> 
> Agreed that my earlier example is incorrect but so is the condition
> in the original patch.
> 
> /Jarkko 

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-01-26  9:31 ` [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions Kai Huang
@ 2021-02-03  0:52   ` Edgecombe, Rick P
  2021-02-03  1:36     ` Sean Christopherson
  2021-02-03 18:47   ` Edgecombe, Rick P
  1 sibling, 1 reply; 156+ messages in thread
From: Edgecombe, Rick P @ 2021-02-03  0:52 UTC (permalink / raw)
  To: linux-sgx, kvm, Huang, Kai, x86
  Cc: Huang, Haitao, luto, jarkko, seanjc, Hansen, Dave, vkuznets, bp,
	mingo, tglx, hpa, pbonzini, joro, wanpengli, jmattson

On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> +static int handle_encls_ecreate(struct kvm_vcpu *vcpu)
> +{
> +       unsigned long a_hva, m_hva, x_hva, s_hva, secs_hva;
> +       struct kvm_cpuid_entry2 *sgx_12_0, *sgx_12_1;
> +       gpa_t metadata_gpa, contents_gpa, secs_gpa;
> +       struct sgx_pageinfo pageinfo;
> +       gva_t pageinfo_gva, secs_gva;
> +       u64 attributes, xfrm, size;
> +       struct x86_exception ex;
> +       u8 max_size_log2;
> +       u32 miscselect;
> +       int trapnr, r;
> +
> +       sgx_12_0 = kvm_find_cpuid_entry(vcpu, 0x12, 0);
> +       sgx_12_1 = kvm_find_cpuid_entry(vcpu, 0x12, 1);
> +       if (!sgx_12_0 || !sgx_12_1) {
> +               kvm_inject_gp(vcpu, 0);
> +               return 1;
> +       }
> +
> +       if (sgx_get_encls_gva(vcpu, kvm_rbx_read(vcpu), 32, 32, &pageinfo_gva) ||
> +           sgx_get_encls_gva(vcpu, kvm_rcx_read(vcpu), 4096, 4096, &secs_gva))
> +               return 1;
> +
> +       /*
> +        * Copy the PAGEINFO to local memory, its pointers need to be
> +        * translated, i.e. we need to do a deep copy/translate.
> +        */
> +       r = kvm_read_guest_virt(vcpu, pageinfo_gva, &pageinfo,
> +                               sizeof(pageinfo), &ex);
> +       if (r == X86EMUL_PROPAGATE_FAULT) {
> +               kvm_inject_emulated_page_fault(vcpu, &ex);
> +               return 1;
> +       } else if (r != X86EMUL_CONTINUE) {
> +               sgx_handle_emulation_failure(vcpu, pageinfo_gva, size);
> +               return 0;
> +       }
> +
> +       /*
> +        * Verify alignment early.  This conveniently avoids having to worry
> +        * about page splits on userspace addresses.
> +        */
> +       if (!IS_ALIGNED(pageinfo.metadata, 64) ||
> +           !IS_ALIGNED(pageinfo.contents, 4096)) {
> +               kvm_inject_gp(vcpu, 0);
> +               return 1;
> +       }
> +
> +       /*
> +        * Translate the SECINFO, SOURCE and SECS pointers from GVA to GPA.
> +        * Resume the guest on failure to inject a #PF.
> +        */
> +       if (sgx_gva_to_gpa(vcpu, pageinfo.metadata, false, &metadata_gpa) ||
> +           sgx_gva_to_gpa(vcpu, pageinfo.contents, false, &contents_gpa) ||
> +           sgx_gva_to_gpa(vcpu, secs_gva, true, &secs_gpa))
> +               return 1;
> +

Do pageinfo.metadata and pageinfo.contents need canonical checks here?

I was noticing the other day that the guest walker could access host
memory slightly outside of a memslot if it ever got passed a gva with
bits higher than the va bits. Or at least it appeared that way. I
didn't fully wade into the bit math because all callers from the guest
did canonical checks.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-02-02 18:42                 ` Sean Christopherson
@ 2021-02-03  1:05                   ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-03  1:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Dave Hansen, Jarkko Sakkinen, linux-sgx, kvm, x86,
	luto, haitao.huang, bp, tglx, mingo, hpa

On Tue, 2 Feb 2021 10:42:05 -0800 Sean Christopherson wrote:
> On Tue, Feb 02, 2021, Paolo Bonzini wrote:
> > On 02/02/21 19:00, Dave Hansen wrote:
> > > > /* "" Basic SGX */
> > > > /* "" SGX Enclave Dynamic Memory Mgmt */
> > > Do you actually want to suppress these from /proc/cpuinfo with the ""?
> > > 
> > 
> > sgx1 yes.  However sgx2 can be useful to have there, I guess.
> 
> Agreed, /proc/cpuinfo's sgx1 will always be in lockstep with sgx, so it won't
> be useful for dealing with the fallout of hardware disabling SGX due to software
> disabling a machine check bank via WRMSR(MCi_CTL).  I can't think of any other
> use case for checking /proc/cpuinfo's sgx1.

So combining all the feedback, I'll put:

/* "" Basic SGX */
/* SGX Enclave Dynamic Memory Management (EDMM) */

Let me know if you have any concerns.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 01/27] x86/cpufeatures: Add SGX1 and SGX2 sub-features
  2021-02-02 17:17           ` Jarkko Sakkinen
@ 2021-02-03  1:09             ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-03  1:09 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Tue, 2 Feb 2021 19:17:56 +0200 Jarkko Sakkinen wrote:
> On Mon, Feb 01, 2021 at 01:01:51PM +1300, Kai Huang wrote:
> > On Sat, 30 Jan 2021 15:20:54 +0200 Jarkko Sakkinen wrote:
> > > On Wed, Jan 27, 2021 at 12:18:32PM +1300, Kai Huang wrote:
> > > > On Tue, 2021-01-26 at 07:34 -0800, Dave Hansen wrote:
> > > > > On 1/26/21 1:30 AM, Kai Huang wrote:
> > > > > > From: Sean Christopherson <seanjc@google.com>
> > > > > > 
> > > > > > Add SGX1 and SGX2 feature flags, via CPUID.0x12.0x0.EAX, as scattered
> > > > > > features, since adding a new leaf for only two bits would be wasteful.
> > > > > > As part of virtualizing SGX, KVM will expose the SGX CPUID leafs to its
> > > > > > guest, and to do so correctly needs to query hardware and kernel support
> > > > > > for SGX1 and SGX2.
> > > > > 
> > > > > It's also not _just_ exposing the CPUID leaves.  There are some checks
> > > > > here when KVM is emulating some SGX instructions too, right?
> > > > 
> > > > I would say trapping instead of emulating, but yes KVM will do more. However, those
> > > > are fine details, and I don't think we should put lots of details here. Or perhaps
> > > > we can use 'for instance' as brief description:
> > > > 
> > > > As part of virtualizing SGX, KVM will need to use the two flags, for instance, to
> > > > expose them to guest.
> > > > 
> > > > ?
> > > > 
> > > > > 
> > > > > > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > > > > > index 84b887825f12..18b2d0c8bbbe 100644
> > > > > > --- a/arch/x86/include/asm/cpufeatures.h
> > > > > > +++ b/arch/x86/include/asm/cpufeatures.h
> > > > > > @@ -292,6 +292,8 @@
> > > > > >  #define X86_FEATURE_FENCE_SWAPGS_KERNEL	(11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */
> > > > > >  #define X86_FEATURE_SPLIT_LOCK_DETECT	(11*32+ 6) /* #AC for split lock */
> > > > > >  #define X86_FEATURE_PER_THREAD_MBA	(11*32+ 7) /* "" Per-thread Memory Bandwidth Allocation */
> > > > > > +#define X86_FEATURE_SGX1		(11*32+ 8) /* Software Guard Extensions sub-feature SGX1 */
> > > > > > +#define X86_FEATURE_SGX2        	(11*32+ 9) /* Software Guard Extensions sub-feature SGX2 */
> > > > > 
> > > > > FWIW, I'm not sure how valuable it is to spell the SGX acronym out three
> > > > > times.  Can't we use those bytes to put something more useful in that
> > > > > comment?
> > > > 
> > > > I think we can remove comment for SGX1, since it is basically SGX.
> > > > 
> > > > For SGX2, how about below?
> > > > 
> > > > /* SGX Enclave Dynamic Memory Management */
> > > 
> > > (EDMM)
> > 
> > Is EDMM obvious to everyone, instead of explicitly saying Enclave Dynamic
> > Memory Management?
> > 
> > Also do you think we need a comment for SGX1 bit? I can add /* Basic SGX */,
> > but I am not sure whether it is required.
> 
> I would write the whole thing down and put EDMM in parentheses.

Sounds like a good idea to me. Will do.

> 
> For SGX1 I would put "Basic SGX features for enclave construction".

I think "Basic SGX" should be enough, since it already implies the "enclave
construction" part (plus other things). For someone who doesn't care about SGX,
having "enclave construction" or not doesn't matter; someone who has some
knowledge of SGX knows what "Basic SGX" means.

> 
> /Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03  0:52   ` Edgecombe, Rick P
@ 2021-02-03  1:36     ` Sean Christopherson
  2021-02-03  9:11       ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03  1:36 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: linux-sgx, kvm, Huang, Kai, x86, Huang, Haitao, luto, jarkko,
	Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini, joro,
	wanpengli, jmattson

On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > +static int handle_encls_ecreate(struct kvm_vcpu *vcpu)
> > +{
> > +       unsigned long a_hva, m_hva, x_hva, s_hva, secs_hva;
> > +       struct kvm_cpuid_entry2 *sgx_12_0, *sgx_12_1;
> > +       gpa_t metadata_gpa, contents_gpa, secs_gpa;
> > +       struct sgx_pageinfo pageinfo;
> > +       gva_t pageinfo_gva, secs_gva;
> > +       u64 attributes, xfrm, size;
> > +       struct x86_exception ex;
> > +       u8 max_size_log2;
> > +       u32 miscselect;
> > +       int trapnr, r;
> > +
> > +       sgx_12_0 = kvm_find_cpuid_entry(vcpu, 0x12, 0);
> > +       sgx_12_1 = kvm_find_cpuid_entry(vcpu, 0x12, 1);
> > +       if (!sgx_12_0 || !sgx_12_1) {
> > +               kvm_inject_gp(vcpu, 0);
> > +               return 1;
> > +       }
> > +
> > +       if (sgx_get_encls_gva(vcpu, kvm_rbx_read(vcpu), 32, 32, &pageinfo_gva) ||
> > +           sgx_get_encls_gva(vcpu, kvm_rcx_read(vcpu), 4096, 4096, &secs_gva))
> > +               return 1;
> > +
> > +       /*
> > +        * Copy the PAGEINFO to local memory, its pointers need to be
> > +        * translated, i.e. we need to do a deep copy/translate.
> > +        */
> > +       r = kvm_read_guest_virt(vcpu, pageinfo_gva, &pageinfo,
> > +                               sizeof(pageinfo), &ex);
> > +       if (r == X86EMUL_PROPAGATE_FAULT) {
> > +               kvm_inject_emulated_page_fault(vcpu, &ex);
> > +               return 1;
> > +       } else if (r != X86EMUL_CONTINUE) {
> > +               sgx_handle_emulation_failure(vcpu, pageinfo_gva, size);
> > +               return 0;
> > +       }
> > +
> > +       /*
> > +        * Verify alignment early.  This conveniently avoids having to worry
> > +        * about page splits on userspace addresses.
> > +        */
> > +       if (!IS_ALIGNED(pageinfo.metadata, 64) ||
> > +           !IS_ALIGNED(pageinfo.contents, 4096)) {
> > +               kvm_inject_gp(vcpu, 0);
> > +               return 1;
> > +       }
> > +
> > +       /*
> > +        * Translate the SECINFO, SOURCE and SECS pointers from GVA to GPA.
> > +        * Resume the guest on failure to inject a #PF.
> > +        */
> > +       if (sgx_gva_to_gpa(vcpu, pageinfo.metadata, false, &metadata_gpa) ||
> > +           sgx_gva_to_gpa(vcpu, pageinfo.contents, false, &contents_gpa) ||
> > +           sgx_gva_to_gpa(vcpu, secs_gva, true, &secs_gpa))
> > +               return 1;
> > +
> 
> Do pageinfo.metadata and pageinfo.contents need canonical checks here?

Bugger, yes.  So much boilerplate needed in this code :-/

Maybe add yet another helper to do alignment+canonical checks, up where the
IS_ALIGNED() calls are?
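
As a rough illustration of such a helper, here is a minimal userspace sketch
that folds the alignment and canonical checks together. The names
addr_is_canonical(), encls_operand_is_bad() and GUEST_VA_BITS are hypothetical,
and the fixed 48-bit width is an assumption; real KVM code would use the vCPU's
actual virtual address width.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 virtual address bits (no LA57). Hypothetical constant. */
#define GUEST_VA_BITS 48

/* Canonical means bits 63:(va_bits - 1) are all zeros or all ones. */
static bool addr_is_canonical(uint64_t addr, unsigned int va_bits)
{
	uint64_t upper = addr >> (va_bits - 1);

	return upper == 0 || upper == (UINT64_MAX >> (va_bits - 1));
}

/*
 * Hypothetical helper: returns true when the operand fails either check,
 * i.e. when the caller would inject a #GP.  'alignment' must be a power
 * of two.
 */
static bool encls_operand_is_bad(uint64_t gva, uint64_t alignment)
{
	if (gva & (alignment - 1))
		return true;

	return !addr_is_canonical(gva, GUEST_VA_BITS);
}
```

A caller would use it where the IS_ALIGNED() pair sits today, once per pointer,
with 64 for the metadata pointer and 4096 for the contents pointer.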

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03  1:36     ` Sean Christopherson
@ 2021-02-03  9:11       ` Kai Huang
  2021-02-03 17:07         ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-03  9:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Tue, 2 Feb 2021 17:36:12 -0800 Sean Christopherson wrote:
> On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > +static int handle_encls_ecreate(struct kvm_vcpu *vcpu)
> > > +{
> > > +       unsigned long a_hva, m_hva, x_hva, s_hva, secs_hva;
> > > +       struct kvm_cpuid_entry2 *sgx_12_0, *sgx_12_1;
> > > +       gpa_t metadata_gpa, contents_gpa, secs_gpa;
> > > +       struct sgx_pageinfo pageinfo;
> > > +       gva_t pageinfo_gva, secs_gva;
> > > +       u64 attributes, xfrm, size;
> > > +       struct x86_exception ex;
> > > +       u8 max_size_log2;
> > > +       u32 miscselect;
> > > +       int trapnr, r;
> > > +
> > > +       sgx_12_0 = kvm_find_cpuid_entry(vcpu, 0x12, 0);
> > > +       sgx_12_1 = kvm_find_cpuid_entry(vcpu, 0x12, 1);
> > > +       if (!sgx_12_0 || !sgx_12_1) {
> > > +               kvm_inject_gp(vcpu, 0);
> > > +               return 1;
> > > +       }
> > > +
> > > > +       if (sgx_get_encls_gva(vcpu, kvm_rbx_read(vcpu), 32, 32, &pageinfo_gva) ||
> > > > +           sgx_get_encls_gva(vcpu, kvm_rcx_read(vcpu), 4096, 4096, &secs_gva))
> > > +               return 1;
> > > +
> > > +       /*
> > > +        * Copy the PAGEINFO to local memory, its pointers need to be
> > > +        * translated, i.e. we need to do a deep copy/translate.
> > > +        */
> > > +       r = kvm_read_guest_virt(vcpu, pageinfo_gva, &pageinfo,
> > > +                               sizeof(pageinfo), &ex);
> > > +       if (r == X86EMUL_PROPAGATE_FAULT) {
> > > +               kvm_inject_emulated_page_fault(vcpu, &ex);
> > > +               return 1;
> > > +       } else if (r != X86EMUL_CONTINUE) {
> > > > +               sgx_handle_emulation_failure(vcpu, pageinfo_gva, size);
> > > +               return 0;
> > > +       }
> > > +
> > > +       /*
> > > > +        * Verify alignment early.  This conveniently avoids having to worry
> > > > +        * about page splits on userspace addresses.
> > > +        */
> > > +       if (!IS_ALIGNED(pageinfo.metadata, 64) ||
> > > +           !IS_ALIGNED(pageinfo.contents, 4096)) {
> > > +               kvm_inject_gp(vcpu, 0);
> > > +               return 1;
> > > +       }
> > > +
> > > +       /*
> > > > +        * Translate the SECINFO, SOURCE and SECS pointers from GVA to GPA.
> > > > +        * Resume the guest on failure to inject a #PF.
> > > > +        */
> > > > +       if (sgx_gva_to_gpa(vcpu, pageinfo.metadata, false, &metadata_gpa) ||
> > > > +           sgx_gva_to_gpa(vcpu, pageinfo.contents, false, &contents_gpa) ||
> > > +           sgx_gva_to_gpa(vcpu, secs_gva, true, &secs_gpa))
> > > +               return 1;
> > > +
> > 
> > Do pageinfo.metadata and pageinfo.contents need canonical checks here?
> 
> Bugger, yes.  So much boilerplate needed in this code :-/
> 
> Maybe add yet another helper to do alignment+canonical checks, up where the
> IS_ALIGNED() calls are?

sgx_get_encls_gva() already does the canonical check. Couldn't we just use it?

For instance:

	if (sgx_get_encls_gva(vcpu, pageinfo.metadata, 64, 64, &metadata_gva) ||
	    sgx_get_encls_gva(vcpu, pageinfo.contents, 4096, 4096,
			      &contents_gva))
		return 1;

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 03/27] x86/sgx: Remove a warn from sgx_free_epc_page()
  2021-02-01  0:11           ` Kai Huang
@ 2021-02-03 10:03             ` Jarkko Sakkinen
  0 siblings, 0 replies; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-03 10:03 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, linux-sgx, kvm, x86, seanjc, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Mon, Feb 01, 2021 at 01:11:10PM +1300, Kai Huang wrote:
> On Wed, 27 Jan 2021 14:26:52 +1300 Kai Huang wrote:
> > On Tue, 26 Jan 2021 17:12:12 -0800 Dave Hansen wrote:
> > > On 1/26/21 5:08 PM, Kai Huang wrote:
> > > > I don't have a deep understanding of the SGX driver. Would you help to answer?
> > > 
> > > Kai, as the patch submitter, you are expected to be able to at least
> > > minimally explain what the patch is doing.  Please endeavor to obtain
> > > this understanding before sending patches in the future.
> > 
> > I see. Thanks.
> 
> Hi Jarkko,
> 
> I think I'll remove this patch in the next version, since it is not related to
> KVM SGX. And I'll rebase your second patch based on current tip/x86/sgx. You may
> send out this patch independently. Let me know if you have any comments.

I don't like to pre-ack changes.

My main concern is not to introduce multiple disjoint versions
of sgx_free_epc_page(). It is just not sane because you can do
an implementation where those don't exist.

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-02 23:56       ` Sean Christopherson
  2021-02-03  0:43         ` Dave Hansen
@ 2021-02-03 15:10         ` Dave Hansen
  2021-02-03 17:36           ` Sean Christopherson
  1 sibling, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-02-03 15:10 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, Huang, Kai, x86, corbet, luto,
	jethro, wanpengli, mingo, b.thiel, tglx, pbonzini, jarkko, joro,
	hpa, jmattson, vkuznets, bp, Huang, Haitao

On 2/2/21 3:56 PM, Sean Christopherson wrote:
>> I seem to remember much stronger language in the SDM about this.  I've
>> always thought of SGX as a big unrecoverable machine-check party waiting
>> to happen.
>>
>> I'll ask around internally at Intel and see what folks say.  Basically,
>> should we be afraid of a big bad EPC access?
> If bad accesses to the EPC can cause machine checks, then EPC should never be
> mapped into userspace, i.e. the SGX driver should never have been merged.

The SDM doesn't define the behavior well enough.  I'll try to get that
fixed.

But, there is some documentation of the abort page semantics:

> https://download.01.org/intel-sgx/sgx-linux/2.10/docs/Intel_SGX_Developer_Reference_Linux_2.10_Open_Source.pdf

Basically, writes get dropped and reads get all 1's on all the
implementations in the wild.  I actually would have much rather gotten a
fault, but oh well.

It sounds like we need to at least modify KVM to make sure not to map
and access EPC addresses.  We might even want to add some VM_WARN_ON()s
in the code that creates kernel mappings to catch these mappings if they
happen anywhere else.

EPC mappings seem like (silent) trouble waiting to happen.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03  9:11       ` Kai Huang
@ 2021-02-03 17:07         ` Sean Christopherson
  2021-02-03 23:11           ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03 17:07 UTC (permalink / raw)
  To: Kai Huang
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Wed, Feb 03, 2021, Kai Huang wrote:
> On Tue, 2 Feb 2021 17:36:12 -0800 Sean Christopherson wrote:
> > On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > > +       /*
> > > > +        * Verify alignment early.  This conveniently avoids having to worry
> > > > +        * about page splits on userspace addresses.
> > > > +        */
> > > > +       if (!IS_ALIGNED(pageinfo.metadata, 64) ||
> > > > +           !IS_ALIGNED(pageinfo.contents, 4096)) {
> > > > +               kvm_inject_gp(vcpu, 0);
> > > > +               return 1;
> > > > +       }
> > > > +
> > > > +       /*
> > > > +        * Translate the SECINFO, SOURCE and SECS pointers from GVA to GPA.
> > > > +        * Resume the guest on failure to inject a #PF.
> > > > +        */
> > > > +       if (sgx_gva_to_gpa(vcpu, pageinfo.metadata, false, &metadata_gpa) ||
> > > > +           sgx_gva_to_gpa(vcpu, pageinfo.contents, false, &contents_gpa) ||
> > > > +           sgx_gva_to_gpa(vcpu, secs_gva, true, &secs_gpa))
> > > > +               return 1;
> > > > +
> > > 
> > > Do pageinfo.metadata and pageinfo.contents need canonical checks here?
> > 
> > Bugger, yes.  So much boilerplate needed in this code :-/
> > 
> > Maybe add yet another helper to do alignment+canonical checks, up where the
> > IS_ALIGNED() calls are?
> 
> sgx_get_encls_gva() already does the canonical check. Couldn't we just use it?

After rereading the SDM for the bajillionth time, yes, these should indeed use
sgx_get_encls_gva().  Originally I was thinking they were linear addresses, but
they are effective addresses that use DS, i.e. not using the helper to avoid the
DS.base adjustment for 32-bit mode was also wrong.
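
To make the distinction concrete, here is a simplified userspace model of
turning a DS-relative effective address into a linear address. struct guest_seg
and encls_effective_to_linear() are hypothetical names, the canonical check
assumes 48 virtual address bits, and the 32-bit limit check is deliberately
simplified (no expand-down or unusable segments). It is a sketch under those
assumptions, not what sgx_get_encls_gva() does line for line.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, pared-down view of the DS segment state. */
struct guest_seg {
	uint64_t base;
	uint32_t limit;
};

/*
 * Convert an ENCLS memory operand (effective address, DS-relative) into a
 * linear address.  Returns false when the access would take a #GP.
 * 'alignment' must be a power of two.
 */
static bool encls_effective_to_linear(bool long_mode,
				      const struct guest_seg *ds,
				      uint64_t offset, uint32_t size,
				      uint64_t alignment, uint64_t *linear)
{
	uint64_t la;

	if (long_mode) {
		/* 64-bit mode: DS.base is treated as zero; check canonicality. */
		uint64_t upper = offset >> 47;

		if (upper != 0 && upper != (UINT64_MAX >> 47))
			return false;
		la = offset;
	} else {
		/* 32-bit mode: check the segment limit, then add DS.base. */
		uint64_t off = offset & 0xffffffffULL;

		if (size == 0 || off > ds->limit ||
		    ds->limit - off < (uint64_t)size - 1)
			return false;
		la = (ds->base + off) & 0xffffffffULL;
	}

	if (la & (alignment - 1))
		return false;

	*linear = la;
	return true;
}
```

The point of the model is that skipping the helper for the PAGEINFO pointers
loses the DS.base addition in the 32-bit branch as well as the canonical check
in the 64-bit branch.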

> For instance:
> 
> 	if (sgx_get_encls_gva(vcpu, pageinfo.metadata, 64, 64, &metadata_gva) ||
> 	    sgx_get_encls_gva(vcpu, pageinfo.contents, 4096, 4096,
> 			      &contents_gva))
> 		return 1;

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-03 15:10         ` Dave Hansen
@ 2021-02-03 17:36           ` Sean Christopherson
  2021-02-03 17:43             ` Paolo Bonzini
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03 17:36 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Edgecombe, Rick P, linux-sgx, kvm, Huang, Kai, x86, corbet, luto,
	jethro, wanpengli, mingo, b.thiel, tglx, pbonzini, jarkko, joro,
	hpa, jmattson, vkuznets, bp, Huang, Haitao

On Wed, Feb 03, 2021, Dave Hansen wrote:
> On 2/2/21 3:56 PM, Sean Christopherson wrote:
> >> I seem to remember much stronger language in the SDM about this.  I've
> >> always thought of SGX as a big unrecoverable machine-check party waiting
> >> to happen.
> >>
> >> I'll ask around internally at Intel and see what folks say.  Basically,
> >> should we be afraid of a big bad EPC access?
> > If bad accesses to the EPC can cause machine checks, then EPC should never be
> > mapped into userspace, i.e. the SGX driver should never have been merged.
> 
> The SDM doesn't define the behavior well enough.  I'll try to get that
> fixed.
> 
> But, there is some documentation of the abort page semantics:
> 
> > https://download.01.org/intel-sgx/sgx-linux/2.10/docs/Intel_SGX_Developer_Reference_Linux_2.10_Open_Source.pdf
> 
> Basically, writes get dropped and reads get all 1's on all the
> implementations in the wild.  I actually would have much rather gotten a
> fault, but oh well.
> 
> It sounds like we need to at least modify KVM to make sure not to map
> and access EPC addresses.

Why?  KVM will read garbage, but KVM needs to be careful with the data it reads,
irrespective of SGX, because the data is user/guest controlled even in the happy
case.

I'm not at all opposed to preventing KVM from accessing EPC, but I really don't
want to add a special check in KVM to avoid reading EPC.  KVM generally isn't
aware of physical backings, and the relevant KVM code is shared between all
architectures.

> We might even want to add some VM_WARN_ON()s in the code that creates kernel
> mappings to catch these mappings if they happen anywhere else.

One thought for handling this would be to extend __ioremap_check_other() to flag
EPC in some way, and then disallow memremap() to EPC.  A clever way to do that
without disallowing SGX's initial memremap() would be to tap into SGX's
sgx_epc_sections array, as the per-section check would activate only after each
section is initialized/mapped by SGX.

Disallowing memremap(), without warning, would address the KVM flow (the
memremap() in __kvm_map_gfn()) without forcing KVM to explicitly check for EPC.

E.g. something like:

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index c519fc5f6948..f263f3554f27 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -26,6 +26,19 @@ static LIST_HEAD(sgx_active_page_list);

 static DEFINE_SPINLOCK(sgx_reclaimer_lock);

+bool sgx_is_epc(resource_size_t addr, unsigned long size)
+{
+       resource_size_t end = addr + size - 1;
+       int i;
+
+       for (i = 0; i < sgx_nr_epc_sections; i++) {
+               if (<check for overlap with sgx_epc_sections[i]>)
+                       return true;
+       }
+
+       return false;
+}
+
 /*
  * Reset dirty EPC pages to uninitialized state. Laundry can be left with SECS
  * pages whose child pages blocked EREMOVE.
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 9e5ccc56f8e0..145fc6fc6bc5 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -34,6 +34,7 @@
  */
 struct ioremap_desc {
        unsigned int flags;
+       bool sgx_epc;
 };

 /*
@@ -110,8 +111,14 @@ static unsigned int __ioremap_check_encrypted(struct resource *res)
  * The EFI runtime services data area is not covered by walk_mem_res(), but must
  * be mapped encrypted when SEV is active.
  */
-static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *desc)
+static void __ioremap_check_other(resource_size_t addr, unsigned long size,
+                                 struct ioremap_desc *desc)
 {
+       if (sgx_is_epc(addr, size)) {
+               desc->sgx_epc = true;
+               return;
+       }
+
        if (!sev_active())
                return;

@@ -155,7 +162,7 @@ static void __ioremap_check_mem(resource_size_t addr, unsigned long size,

        walk_mem_res(start, end, desc, __ioremap_collect_map_flags);

-       __ioremap_check_other(addr, desc);
+       __ioremap_check_other(addr, size, desc);
 }

 /*
@@ -210,6 +217,13 @@ __ioremap_caller(resource_size_t phys_addr, unsigned long size,
                return NULL;
        }

+       /*
+        * Don't allow mapping SGX EPC, it's not accessible via normal reads
+        * and writes.
+        */
+       if (io_desc.sgx_epc)
+               return NULL;
+
        /*
         * Mappings have to be page-aligned
         */
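
The overlap check left as a placeholder in the sketch above is the usual
closed-interval test. A standalone version, assuming each section records a
physical base and a size (struct epc_section and range_is_epc() are
hypothetical names, and the real struct sgx_epc_section may not carry a size
field), could look like this:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section descriptor, for illustration only. */
struct epc_section {
	uint64_t phys_addr;
	uint64_t size;
};

/* True if [addr, addr + size - 1] overlaps any of the first 'nr' sections. */
static bool range_is_epc(const struct epc_section *sections, size_t nr,
			 uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size - 1;
	size_t i;

	for (i = 0; i < nr; i++) {
		uint64_t s_start = sections[i].phys_addr;
		uint64_t s_end = s_start + sections[i].size - 1;

		/* Two closed intervals overlap iff each starts before the other ends. */
		if (addr <= s_end && end >= s_start)
			return true;
	}

	return false;
}
```

Passing the count of initialized sections as 'nr' gives the behavior described
above: a section is only matched once SGX has set it up, so SGX's own initial
memremap() is not rejected.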


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-03 17:36           ` Sean Christopherson
@ 2021-02-03 17:43             ` Paolo Bonzini
  2021-02-03 17:46               ` Dave Hansen
  0 siblings, 1 reply; 156+ messages in thread
From: Paolo Bonzini @ 2021-02-03 17:43 UTC (permalink / raw)
  To: Sean Christopherson, Dave Hansen
  Cc: Edgecombe, Rick P, linux-sgx, kvm, Huang, Kai, x86, corbet, luto,
	jethro, wanpengli, mingo, b.thiel, tglx, jarkko, joro, hpa,
	jmattson, vkuznets, bp, Huang, Haitao

On 03/02/21 18:36, Sean Christopherson wrote:
> I'm not at all opposed to preventing KVM from accessing EPC, but I 
> really don't want to add a special check in KVM to avoid reading EPC. 
> KVM generally isn't aware of physical backings, and the relevant KVM 
> code is shared between all architectures.

Yeah, special casing KVM is almost always the wrong thing to do. 
Anything that KVM can do, other subsystems will do as well.

Paolo


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-03 17:43             ` Paolo Bonzini
@ 2021-02-03 17:46               ` Dave Hansen
  2021-02-03 23:09                 ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-02-03 17:46 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, Huang, Kai, x86, corbet, luto,
	jethro, wanpengli, mingo, b.thiel, tglx, jarkko, joro, hpa,
	jmattson, vkuznets, bp, Huang, Haitao

On 2/3/21 9:43 AM, Paolo Bonzini wrote:
> On 03/02/21 18:36, Sean Christopherson wrote:
>> I'm not at all opposed to preventing KVM from accessing EPC, but I
>> really don't want to add a special check in KVM to avoid reading EPC.
>> KVM generally isn't aware of physical backings, and the relevant KVM
>> code is shared between all architectures.
> 
> Yeah, special casing KVM is almost always the wrong thing to do.
> Anything that KVM can do, other subsystems will do as well.

Agreed.  Thwarting ioremap itself seems like the right way to go.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-01-26  9:31 ` [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions Kai Huang
  2021-02-03  0:52   ` Edgecombe, Rick P
@ 2021-02-03 18:47   ` Edgecombe, Rick P
  2021-02-03 19:36     ` Sean Christopherson
  1 sibling, 1 reply; 156+ messages in thread
From: Edgecombe, Rick P @ 2021-02-03 18:47 UTC (permalink / raw)
  To: linux-sgx, kvm, Huang, Kai, x86
  Cc: Huang, Haitao, luto, jarkko, seanjc, Hansen, Dave, vkuznets, bp,
	mingo, tglx, hpa, pbonzini, joro, wanpengli, jmattson

On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> +       /* Exit to userspace if copying from a host userspace address fails. */
> +       if (sgx_read_hva(vcpu, m_hva, &miscselect, sizeof(miscselect)) ||
> +           sgx_read_hva(vcpu, a_hva, &attributes, sizeof(attributes)) ||
> +           sgx_read_hva(vcpu, x_hva, &xfrm, sizeof(xfrm)) ||
> +           sgx_read_hva(vcpu, s_hva, &size, sizeof(size)))
> +               return 0;
> +
> +       /* Enforce restriction of access to the PROVISIONKEY. */
> +       if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
> +           (attributes & SGX_ATTR_PROVISIONKEY)) {
> +               if (sgx_12_1->eax & SGX_ATTR_PROVISIONKEY)
> +                       pr_warn_once("KVM: SGX PROVISIONKEY advertised but not allowed\n");
> +               kvm_inject_gp(vcpu, 0);
> +               return 1;
> +       }
> +
> +       /* Enforce CPUID restrictions on MISCSELECT, ATTRIBUTES and XFRM. */
> +       if ((u32)miscselect & ~sgx_12_0->ebx ||
> +           (u32)attributes & ~sgx_12_1->eax ||
> +           (u32)(attributes >> 32) & ~sgx_12_1->ebx ||
> +           (u32)xfrm & ~sgx_12_1->ecx ||
> +           (u32)(xfrm >> 32) & ~sgx_12_1->edx) {
> +               kvm_inject_gp(vcpu, 0);
> +               return 1;
> +       }

Don't you need to deep copy the pageinfo.contents struct as well?
Otherwise the guest could change these after they were checked.

But it seems it is checked by the HW and something is caught that would
inject a GP anyway? Can you elaborate on the importance of these
checks?


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03 18:47   ` Edgecombe, Rick P
@ 2021-02-03 19:36     ` Sean Christopherson
  2021-02-03 23:29       ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03 19:36 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: linux-sgx, kvm, Huang, Kai, x86, Huang, Haitao, luto, jarkko,
	Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini, joro,
	wanpengli, jmattson

On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > +       /* Exit to userspace if copying from a host userspace address fails. */
> > +       if (sgx_read_hva(vcpu, m_hva, &miscselect, sizeof(miscselect)) ||
> > +           sgx_read_hva(vcpu, a_hva, &attributes, sizeof(attributes)) ||
> > +           sgx_read_hva(vcpu, x_hva, &xfrm, sizeof(xfrm)) ||
> > +           sgx_read_hva(vcpu, s_hva, &size, sizeof(size)))
> > +               return 0;
> > +
> > +       /* Enforce restriction of access to the PROVISIONKEY. */
> > +       if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
> > +           (attributes & SGX_ATTR_PROVISIONKEY)) {
> > +               if (sgx_12_1->eax & SGX_ATTR_PROVISIONKEY)
> > +                       pr_warn_once("KVM: SGX PROVISIONKEY advertised but not allowed\n");
> > +               kvm_inject_gp(vcpu, 0);
> > +               return 1;
> > +       }
> > +
> > +       /* Enforce CPUID restrictions on MISCSELECT, ATTRIBUTES and XFRM. */
> > +       if ((u32)miscselect & ~sgx_12_0->ebx ||
> > +           (u32)attributes & ~sgx_12_1->eax ||
> > +           (u32)(attributes >> 32) & ~sgx_12_1->ebx ||
> > +           (u32)xfrm & ~sgx_12_1->ecx ||
> > +           (u32)(xfrm >> 32) & ~sgx_12_1->edx) {
> > +               kvm_inject_gp(vcpu, 0);
> > +               return 1;
> > +       }
> 
> Don't you need to deep copy the pageinfo.contents struct as well?
> Otherwise the guest could change these after they were checked.
> 
> But it seems it is checked by the HW and something is caught that would
> inject a GP anyway? Can you elaborate on the importance of these
> checks?

Argh, yes.  These checks are to allow migration between systems with different
SGX capabilities, and more importantly to prevent userspace from doing an end
around on the restricted access to PROVISIONKEY.

IIRC, earlier versions did do a deep copy, but then I got clever.  Anyways, yeah,
sadly the entire pageinfo.contents page will need to be copied.
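
For readers following along, the CPUID restriction check in the quoted hunk can be sketched as a standalone userspace function. This is only an illustration: the struct and function names below are hypothetical, and the register-to-field mapping is the one used by the quoted patch (CPUID.0x12.0 EBX for MISCSELECT, CPUID.0x12.1 EAX/EBX for ATTRIBUTES, ECX/EDX for XFRM).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest's CPUID.0x12.1 output registers. */
struct sgx_cpuid_12_1 {
	uint32_t eax;	/* allowed ATTRIBUTES bits 31:0  */
	uint32_t ebx;	/* allowed ATTRIBUTES bits 63:32 */
	uint32_t ecx;	/* allowed XFRM bits 31:0        */
	uint32_t edx;	/* allowed XFRM bits 63:32       */
};

/*
 * Return true if the requested SECS fields set any bit that the guest's
 * CPUID model does not advertise.  allowed_miscselect stands in for
 * CPUID.0x12.0 EBX.  Each 64-bit field is split into 32-bit halves so it
 * can be compared against the matching 32-bit register.
 */
bool sgx_secs_exceeds_cpuid(uint32_t miscselect, uint64_t attributes,
			    uint64_t xfrm, uint32_t allowed_miscselect,
			    const struct sgx_cpuid_12_1 *c)
{
	return (miscselect & ~allowed_miscselect) ||
	       ((uint32_t)attributes & ~c->eax) ||
	       ((uint32_t)(attributes >> 32) & ~c->ebx) ||
	       ((uint32_t)xfrm & ~c->ecx) ||
	       ((uint32_t)(xfrm >> 32) & ~c->edx);
}
```

Because the mask comes from the guest's CPUID model rather than from the host CPU, the same guest image should keep seeing the same set of permitted bits after migrating to a host with more SGX features, which is the migration point made above.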

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-03  0:49             ` Kai Huang
@ 2021-02-03 22:02               ` Jarkko Sakkinen
  2021-02-03 22:59                 ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-03 22:02 UTC (permalink / raw)
  To: Kai Huang
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, dave.hansen,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> What working *incorrectly* thing is related to SGX virtualization? The things
> SGX virtualization requires (basically just raw EPC allocation) are all in
> sgx/main.c. 

States:

A. SGX driver is unsupported.
B. SGX driver is supported and initialized correctly.
C. SGX driver is supported and failed to initialize.

I just thought that KVM should support SGX when we are either in states A
or B.  Even the short summary implies this. It is expected that SGX driver
initializes correctly if it is supported in the first place. If it doesn't,
something is probably seriously wrong. That is something we don't expect in
a legit system behavior.

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-03 22:02               ` Jarkko Sakkinen
@ 2021-02-03 22:59                 ` Sean Christopherson
  2021-02-04  1:39                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03 22:59 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Kai Huang, linux-sgx, kvm, x86, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Thu, Feb 04, 2021, Jarkko Sakkinen wrote:
> On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> > What working *incorrectly* thing is related to SGX virtualization? The things
> > SGX virtualization requires (basically just raw EPC allocation) are all in
> > sgx/main.c. 
> 
> States:
> 
> A. SGX driver is unsupported.
> B. SGX driver is supported and initialized correctly.
> C. SGX driver is supported and failed to initialize.
> 
> I just thought that KVM should support SGX when we are either in states A
> or B.  Even the short summary implies this. It is expected that SGX driver
> initializes correctly if it is supported in the first place. If it doesn't,
> something is probably seriously wrong. That is something we don't expect in
> a legit system behavior.

It's legit behavior, and something we (you?) explicitly want to support.  See
patch 05, x86/cpu/intel: Allow SGX virtualization without Launch Control support.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-03 17:46               ` Dave Hansen
@ 2021-02-03 23:09                 ` Kai Huang
  2021-02-03 23:32                   ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-03 23:09 UTC (permalink / raw)
  To: Dave Hansen, Paolo Bonzini, Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, corbet, luto, jethro,
	wanpengli, mingo, b.thiel, tglx, jarkko, joro, hpa, jmattson,
	vkuznets, bp, Huang, Haitao

On Wed, 2021-02-03 at 09:46 -0800, Dave Hansen wrote:
> On 2/3/21 9:43 AM, Paolo Bonzini wrote:
> > On 03/02/21 18:36, Sean Christopherson wrote:
> > > I'm not at all opposed to preventing KVM from accessing EPC, but I
> > > really don't want to add a special check in KVM to avoid reading EPC.
> > > KVM generally isn't aware of physical backings, and the relevant KVM
> > > code is shared between all architectures.
> > 
> > Yeah, special casing KVM is almost always the wrong thing to do.
> > Anything that KVM can do, other subsystems will do as well.
> 
> Agreed.  Thwarting ioremap itself seems like the right way to go.

This sounds irrelevant to KVM SGX, so I won't include it in the KVM SGX series.


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03 17:07         ` Sean Christopherson
@ 2021-02-03 23:11           ` Kai Huang
  0 siblings, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-03 23:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Wed, 2021-02-03 at 09:07 -0800, Sean Christopherson wrote:
> On Wed, Feb 03, 2021, Kai Huang wrote:
> > On Tue, 2 Feb 2021 17:36:12 -0800 Sean Christopherson wrote:
> > > On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > > > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > > > +       /*
> > > > > +        * Verify alignment early.  This conveniently avoids having
> > > > > to worry
> > > > > +        * about page splits on userspace addresses.
> > > > > +        */
> > > > > +       if (!IS_ALIGNED(pageinfo.metadata, 64) ||
> > > > > +           !IS_ALIGNED(pageinfo.contents, 4096)) {
> > > > > +               kvm_inject_gp(vcpu, 0);
> > > > > +               return 1;
> > > > > +       }
> > > > > +
> > > > > +       /*
> > > > > +        * Translate the SECINFO, SOURCE and SECS pointers from GVA
> > > > > to GPA.
> > > > > +        * Resume the guest on failure to inject a #PF.
> > > > > +        */
> > > > > +       if (sgx_gva_to_gpa(vcpu, pageinfo.metadata, false,
> > > > > &metadata_gpa) ||
> > > > > +           sgx_gva_to_gpa(vcpu, pageinfo.contents, false,
> > > > > &contents_gpa) ||
> > > > > +           sgx_gva_to_gpa(vcpu, secs_gva, true, &secs_gpa))
> > > > > +               return 1;
> > > > > +
> > > > 
> > > > Do pageinfo.metadata and pageinfo.contents need canonical checks here?
> > > 
> > > Bugger, yes.  So much boilerplate needed in this code :-/
> > > 
> > > Maybe add yet another helper to do alignment+canonical checks, up where the
> > > IS_ALIGNED() calls are?
> > 
> > sgx_get_encls_gva() already does canonical check. Couldn't we just use it?
> 
> After rereading the SDM for the bajillionth time, yes, these should indeed use
> sgx_get_encls_gva().  Originally I was thinking they were linear addresses, but
> they are effective addresses that use DS, i.e. not using the helper to avoid the
> DS.base adjustment for 32-bit mode was also wrong.

Agreed.

> 
> > For instance:
> > 
> > 	if (sgx_get_encls_gva(vcpu, pageinfo.metadata, 64, 64, &metadata_gva) ||
> > 	    sgx_get_encls_gva(vcpu, pageinfo.contents, 4096, 4096,
> >                              &contents_gva))
> > 		return 1;
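
As a rough illustration of the combined alignment and canonical check being discussed, here is a minimal userspace sketch. The function names are hypothetical, 48-bit virtual addresses are assumed, and the DS.base adjustment for 32-bit mode mentioned above is deliberately not modeled, so this is not a substitute for sgx_get_encls_gva().

```c
#include <stdbool.h>
#include <stdint.h>

/* align must be a power of two, e.g. 64 for SECINFO or 4096 for a page. */
bool is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}

/*
 * With 48-bit virtual addresses, an address is canonical when bits 63:47
 * are all zeros or all ones (17 bits, hence 0x1ffff).
 */
bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Hypothetical helper: reject operands that are misaligned or non-canonical. */
bool encls_operand_ok(uint64_t gva, uint64_t align)
{
	return is_aligned(gva, align) && is_canonical_48(gva);
}
```

In the handler, a failure of either condition would lead to injecting #GP into the guest, as the IS_ALIGNED() path in the quoted patch already does.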



^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03 19:36     ` Sean Christopherson
@ 2021-02-03 23:29       ` Kai Huang
  2021-02-03 23:36         ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-03 23:29 UTC (permalink / raw)
  To: Sean Christopherson, Edgecombe, Rick P
  Cc: linux-sgx, kvm, x86, Huang, Haitao, luto, jarkko, Hansen, Dave,
	vkuznets, bp, mingo, tglx, hpa, pbonzini, joro, wanpengli,
	jmattson

On Wed, 2021-02-03 at 11:36 -0800, Sean Christopherson wrote:
> On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > +       /* Exit to userspace if copying from a host userspace address
> > > fails. */
> > > +       if (sgx_read_hva(vcpu, m_hva, &miscselect,
> > > sizeof(miscselect)) ||
> > > +           sgx_read_hva(vcpu, a_hva, &attributes,
> > > sizeof(attributes)) ||
> > > +           sgx_read_hva(vcpu, x_hva, &xfrm, sizeof(xfrm)) ||
> > > +           sgx_read_hva(vcpu, s_hva, &size, sizeof(size)))
> > > +               return 0;
> > > +
> > > +       /* Enforce restriction of access to the PROVISIONKEY. */
> > > +       if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
> > > +           (attributes & SGX_ATTR_PROVISIONKEY)) {
> > > +               if (sgx_12_1->eax & SGX_ATTR_PROVISIONKEY)
> > > +                       pr_warn_once("KVM: SGX PROVISIONKEY
> > > advertised but not allowed\n");
> > > +               kvm_inject_gp(vcpu, 0);
> > > +               return 1;
> > > +       }
> > > +
> > > +       /* Enforce CPUID restrictions on MISCSELECT, ATTRIBUTES and
> > > XFRM. */
> > > +       if ((u32)miscselect & ~sgx_12_0->ebx ||
> > > +           (u32)attributes & ~sgx_12_1->eax ||
> > > +           (u32)(attributes >> 32) & ~sgx_12_1->ebx ||
> > > +           (u32)xfrm & ~sgx_12_1->ecx ||
> > > +           (u32)(xfrm >> 32) & ~sgx_12_1->edx) {
> > > +               kvm_inject_gp(vcpu, 0);
> > > +               return 1;
> > > +       }
> > 
> > Don't you need to deep copy the pageinfo.contents struct as well?
> > Otherwise the guest could change these after they were checked.
> > 
> > But it seems it is checked by the HW and something is caught that would
> > inject a GP anyway? Can you elaborate on the importance of these
> > checks?
> 
> Argh, yes.  These checks are to allow migration between systems with different
> SGX capabilities, and more importantly to prevent userspace from doing an end
> around on the restricted access to PROVISIONKEY.
> 
> IIRC, earlier versions did do a deep copy, but then I got clever.  Anyways, yeah,
> sadly the entire pageinfo.contents page will need to be copied.

I don't fully understand the problem. Are you worried about the contents being
updated by other vcpus during the trap?

And I don't see how a copy can avoid this problem. Even if you do copy, the
contents can still be modified afterwards, correct? So what's the point of
copying? It looks like a better solution is to kick all vcpus and put them into
a blocked state while KVM is doing ENCLS for the guest.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-03 23:09                 ` Kai Huang
@ 2021-02-03 23:32                   ` Sean Christopherson
  2021-02-03 23:37                     ` Dave Hansen
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03 23:32 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm,
	x86, corbet, luto, jethro, wanpengli, mingo, b.thiel, tglx,
	jarkko, joro, hpa, jmattson, vkuznets, bp, Huang, Haitao

On Thu, Feb 04, 2021, Kai Huang wrote:
> On Wed, 2021-02-03 at 09:46 -0800, Dave Hansen wrote:
> > On 2/3/21 9:43 AM, Paolo Bonzini wrote:
> > > On 03/02/21 18:36, Sean Christopherson wrote:
> > > > I'm not at all opposed to preventing KVM from accessing EPC, but I
> > > > really don't want to add a special check in KVM to avoid reading EPC.
> > > > KVM generally isn't aware of physical backings, and the relevant KVM
> > > > code is shared between all architectures.
> > > 
> > > Yeah, special casing KVM is almost always the wrong thing to do.
> > > Anything that KVM can do, other subsystems will do as well.
> > 
> > Agreed.  Thwarting ioremap itself seems like the right way to go.
> 
> This sounds irrelevant to KVM SGX, thus I won't include it to KVM SGX series.

I would say it's relevant, but a pre-existing bug.  Same net effect on what's
needed for this series..

I say it's a pre-existing bug, because I'm pretty sure KVM can be coerced into
accessing the EPC by handing KVM a memslot that's backed by an enclave that was
created by host userspace (via /dev/sgx_enclave).

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03 23:29       ` Kai Huang
@ 2021-02-03 23:36         ` Sean Christopherson
  2021-02-03 23:45           ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03 23:36 UTC (permalink / raw)
  To: Kai Huang
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Thu, Feb 04, 2021, Kai Huang wrote:
> On Wed, 2021-02-03 at 11:36 -0800, Sean Christopherson wrote:
> > On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > > +       /* Exit to userspace if copying from a host userspace address
> > > > fails. */
> > > > +       if (sgx_read_hva(vcpu, m_hva, &miscselect,
> > > > sizeof(miscselect)) ||
> > > > +           sgx_read_hva(vcpu, a_hva, &attributes,
> > > > sizeof(attributes)) ||
> > > > +           sgx_read_hva(vcpu, x_hva, &xfrm, sizeof(xfrm)) ||
> > > > +           sgx_read_hva(vcpu, s_hva, &size, sizeof(size)))
> > > > +               return 0;
> > > > +
> > > > +       /* Enforce restriction of access to the PROVISIONKEY. */
> > > > +       if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
> > > > +           (attributes & SGX_ATTR_PROVISIONKEY)) {
> > > > +               if (sgx_12_1->eax & SGX_ATTR_PROVISIONKEY)
> > > > +                       pr_warn_once("KVM: SGX PROVISIONKEY
> > > > advertised but not allowed\n");
> > > > +               kvm_inject_gp(vcpu, 0);
> > > > +               return 1;
> > > > +       }
> > > > +
> > > > +       /* Enforce CPUID restrictions on MISCSELECT, ATTRIBUTES and
> > > > XFRM. */
> > > > +       if ((u32)miscselect & ~sgx_12_0->ebx ||
> > > > +           (u32)attributes & ~sgx_12_1->eax ||
> > > > +           (u32)(attributes >> 32) & ~sgx_12_1->ebx ||
> > > > +           (u32)xfrm & ~sgx_12_1->ecx ||
> > > > +           (u32)(xfrm >> 32) & ~sgx_12_1->edx) {
> > > > +               kvm_inject_gp(vcpu, 0);
> > > > +               return 1;
> > > > +       }
> > > 
> > > Don't you need to deep copy the pageinfo.contents struct as well?
> > > Otherwise the guest could change these after they were checked.
> > > 
> > > But it seems it is checked by the HW and something is caught that would
> > > inject a GP anyway? Can you elaborate on the importance of these
> > > checks?
> > 
> > Argh, yes.  These checks are to allow migration between systems with different
> > SGX capabilities, and more importantly to prevent userspace from doing an end
> > around on the restricted access to PROVISIONKEY.
> > 
> > IIRC, earlier versions did do a deep copy, but then I got clever.  Anyways, yeah,
> > sadly the entire pageinfo.contents page will need to be copied.
> 
> I don't fully understand the problem. Are you worried about contents being updated by
> other vcpus during the trap? 
> 
> And I don't see how copy can avoid this problem. Even you do copy, the content can
> still be modified afterwards, correct? So what's the point of copying?

The goal isn't correctness, it's to prevent a TOCTOU bug.  E.g. the guest could
do ECREATE w/ SECS.SGX_ATTR_PROVISIONKEY=0, and simultaneously set
SGX_ATTR_PROVISIONKEY to bypass the above check.

> Looks a better solution is to kick all vcpus and put them into block state
> while KVM is doing ENCLS for guest.

No.  (a) it won't work, as the memory is writable from host userspace.  (b) that
does not scale, at all.
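
The copy-then-check pattern described above can be sketched in plain userspace C. This is an assumption-laden illustration, not the eventual KVM code: struct fake_secs is a made-up, simplified layout, snapshot_and_check() is a hypothetical name, and ATTR_PROVISIONKEY mirrors the kernel's SGX_ATTR_PROVISIONKEY bit only for demonstration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ATTR_PROVISIONKEY (1ULL << 4)	/* illustrative stand-in */

/* Hypothetical, simplified stand-in for the SECS source page. */
struct fake_secs {
	uint64_t size;
	uint64_t attributes;
	uint64_t xfrm;
};

/*
 * Take a private snapshot of the guest-controlled structure, then validate
 * the snapshot.  On success the caller must pass *snap, never *guest_mem,
 * to the operation being guarded, so that later writes to guest memory
 * cannot change what was checked.
 */
bool snapshot_and_check(const struct fake_secs *guest_mem,
			struct fake_secs *snap, bool provisioning_allowed)
{
	memcpy(snap, guest_mem, sizeof(*snap));

	if (!provisioning_allowed && (snap->attributes & ATTR_PROVISIONKEY))
		return false;

	return true;
}
```

The guest (or host userspace) can still scribble on its own copy afterwards, but that no longer matters: the value that was checked and the value that is consumed are the same bytes.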

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-03 23:32                   ` Sean Christopherson
@ 2021-02-03 23:37                     ` Dave Hansen
  2021-02-04  0:04                       ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-02-03 23:37 UTC (permalink / raw)
  To: Sean Christopherson, Kai Huang
  Cc: Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm, x86, corbet,
	luto, jethro, wanpengli, mingo, b.thiel, tglx, jarkko, joro, hpa,
	jmattson, vkuznets, bp, Huang, Haitao

On 2/3/21 3:32 PM, Sean Christopherson wrote:
>>>> Yeah, special casing KVM is almost always the wrong thing to do.
>>>> Anything that KVM can do, other subsystems will do as well.
>>> Agreed.  Thwarting ioremap itself seems like the right way to go.
>> This sounds irrelevant to KVM SGX, thus I won't include it to KVM SGX series.
> I would say it's relevant, but a pre-existing bug.  Same net effect on what's
> needed for this series..
> 
> I say it's a pre-existing bug, because I'm pretty sure KVM can be coerced into
> accessing the EPC by handing KVM a memslot that's backed by an enclave that was
> created by host userspace (via /dev/sgx_enclave).

Dang, you beat me to it.  I was composing another email that said the
exact same thing.

I guess we need to take a closer look at the KVM fallout from this.
There are a few spots where KVM knew it might be consuming garbage.  It
just gets extra weird stinky garbage now.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03 23:36         ` Sean Christopherson
@ 2021-02-03 23:45           ` Kai Huang
  2021-02-03 23:59             ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-03 23:45 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Wed, 2021-02-03 at 15:36 -0800, Sean Christopherson wrote:
> On Thu, Feb 04, 2021, Kai Huang wrote:
> > On Wed, 2021-02-03 at 11:36 -0800, Sean Christopherson wrote:
> > > On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > > > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > > > +       /* Exit to userspace if copying from a host userspace address
> > > > > fails. */
> > > > > +       if (sgx_read_hva(vcpu, m_hva, &miscselect,
> > > > > sizeof(miscselect)) ||
> > > > > +           sgx_read_hva(vcpu, a_hva, &attributes,
> > > > > sizeof(attributes)) ||
> > > > > +           sgx_read_hva(vcpu, x_hva, &xfrm, sizeof(xfrm)) ||
> > > > > +           sgx_read_hva(vcpu, s_hva, &size, sizeof(size)))
> > > > > +               return 0;
> > > > > +
> > > > > +       /* Enforce restriction of access to the PROVISIONKEY. */
> > > > > +       if (!vcpu->kvm->arch.sgx_provisioning_allowed &&
> > > > > +           (attributes & SGX_ATTR_PROVISIONKEY)) {
> > > > > +               if (sgx_12_1->eax & SGX_ATTR_PROVISIONKEY)
> > > > > +                       pr_warn_once("KVM: SGX PROVISIONKEY
> > > > > advertised but not allowed\n");
> > > > > +               kvm_inject_gp(vcpu, 0);
> > > > > +               return 1;
> > > > > +       }
> > > > > +
> > > > > +       /* Enforce CPUID restrictions on MISCSELECT, ATTRIBUTES and
> > > > > XFRM. */
> > > > > +       if ((u32)miscselect & ~sgx_12_0->ebx ||
> > > > > +           (u32)attributes & ~sgx_12_1->eax ||
> > > > > +           (u32)(attributes >> 32) & ~sgx_12_1->ebx ||
> > > > > +           (u32)xfrm & ~sgx_12_1->ecx ||
> > > > > +           (u32)(xfrm >> 32) & ~sgx_12_1->edx) {
> > > > > +               kvm_inject_gp(vcpu, 0);
> > > > > +               return 1;
> > > > > +       }
> > > > 
> > > > Don't you need to deep copy the pageinfo.contents struct as well?
> > > > Otherwise the guest could change these after they were checked.
> > > > 
> > > > But it seems it is checked by the HW and something is caught that would
> > > > inject a GP anyway? Can you elaborate on the importance of these
> > > > checks?
> > > 
> > > Argh, yes.  These checks are to allow migration between systems with different
> > > SGX capabilities, and more importantly to prevent userspace from doing an end
> > > around on the restricted access to PROVISIONKEY.
> > > 
> > > IIRC, earlier versions did do a deep copy, but then I got clever.  Anyways, yeah,
> > > sadly the entire pageinfo.contents page will need to be copied.
> > 
> > I don't fully understand the problem. Are you worried about contents being updated by
> > other vcpus during the trap? 
> > 
> > And I don't see how copy can avoid this problem. Even you do copy, the content can
> > still be modified afterwards, correct? So what's the point of copying?
> 
> The goal isn't correctness, it's to prevent a TOCTOU bug.  E.g. the guest could
> do ECREATE w/ SECS.SGX_ATTR_PROVISIONKEY=0, and simultaneously set
> SGX_ATTR_PROVISIONKEY to bypass the above check.

Oh ok. Agreed.

However, such an attack would require precise timing. I'm not sure whether it is
feasible in practice.

> 
> > Looks a better solution is to kick all vcpus and put them into block state
> > while KVM is doing ENCLS for guest.
> 
> No.  (a) it won't work, as the memory is writable from host userspace.  (b) that
> does not scale, at all.

Good point. Agreed.


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03 23:45           ` Kai Huang
@ 2021-02-03 23:59             ` Sean Christopherson
  2021-02-04  0:11               ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-03 23:59 UTC (permalink / raw)
  To: Kai Huang
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Thu, Feb 04, 2021, Kai Huang wrote:
> On Wed, 2021-02-03 at 15:36 -0800, Sean Christopherson wrote:
> > On Thu, Feb 04, 2021, Kai Huang wrote:
> > > On Wed, 2021-02-03 at 11:36 -0800, Sean Christopherson wrote:
> > > > On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > > > > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > > > Don't you need to deep copy the pageinfo.contents struct as well?
> > > > > Otherwise the guest could change these after they were checked.
> > > > > 
> > > > > But it seems it is checked by the HW and something is caught that would
> > > > > inject a GP anyway? Can you elaborate on the importance of these
> > > > > checks?
> > > > 
> > > > Argh, yes.  These checks are to allow migration between systems with different
> > > > SGX capabilities, and more importantly to prevent userspace from doing an end
> > > > around on the restricted access to PROVISIONKEY.
> > > > 
> > > > IIRC, earlier versions did do a deep copy, but then I got clever.  Anyways, yeah,
> > > > sadly the entire pageinfo.contents page will need to be copied.
> > > 
> > > I don't fully understand the problem. Are you worried about contents being updated by
> > > other vcpus during the trap? 
> > > 
> > > And I don't see how copy can avoid this problem. Even you do copy, the content can
> > > still be modified afterwards, correct? So what's the point of copying?
> > 
> > The goal isn't correctness, it's to prevent a TOCTOU bug.  E.g. the guest could
> > do ECREATE w/ SECS.SGX_ATTR_PROVISIONKEY=0, and simultaneously set
> > SGX_ATTR_PROVISIONKEY to bypass the above check.
> 
> Oh ok. Agreed.
> 
> However, such attack would require precise timing. Not sure whether it is feasible in
> practice.

It's very feasible.  XOR the bit in a tight loop, build the enclave on a
separate thread.  Do that until EINIT succeeds.  Compared to other timing
attacks, I doubt it'd take all that long to get a successful result.

Regardless, the difficulty in exploiting the bug is irrelevant, it's a glaring
flaw that needs to be fixed.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-03 23:37                     ` Dave Hansen
@ 2021-02-04  0:04                       ` Kai Huang
  2021-02-04  0:28                         ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-04  0:04 UTC (permalink / raw)
  To: Dave Hansen, Sean Christopherson
  Cc: Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm, x86, corbet,
	luto, jethro, wanpengli, mingo, b.thiel, tglx, jarkko, joro, hpa,
	jmattson, vkuznets, bp, Huang, Haitao

On Wed, 2021-02-03 at 15:37 -0800, Dave Hansen wrote:
> On 2/3/21 3:32 PM, Sean Christopherson wrote:
> > > > > Yeah, special casing KVM is almost always the wrong thing to do.
> > > > > Anything that KVM can do, other subsystems will do as well.
> > > > Agreed.  Thwarting ioremap itself seems like the right way to go.
> > > This sounds irrelevant to KVM SGX, thus I won't include it to KVM SGX series.
> > I would say it's relevant, but a pre-existing bug.  Same net effect on what's
> > needed for this series..
> > 
> > I say it's a pre-existing bug, because I'm pretty sure KVM can be coerced into
> > accessing the EPC by handing KVM a memslot that's backed by an enclave that was
> > created by host userspace (via /dev/sgx_enclave).
> 
> Dang, you beat me to it.  I was composing another email that said the
> exact same thing.
> 
> I guess we need to take a closer look at the KVM fallout from this.
> It's a few spots where it KVM knew it might be consuming garbage.  It
> just get extra weird stinky garbage now.

I don't quite understand why KVM would need to access an EPC memslot. It is the
*guest*, not KVM, that can read EPC from outside an enclave. And if I understand
correctly, there is no place where KVM uses a kernel address of EPC to access it.
To KVM, it makes no difference whether the EPC backend is /dev/sgx_enclave or
/dev/sgx_vepc. And we really cannot prevent the guest from doing anything.

So how is memremap() of an EPC section related to KVM SGX? For instance, does the
implementation of this series need to be modified because of this?


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-03 23:59             ` Sean Christopherson
@ 2021-02-04  0:11               ` Kai Huang
  2021-02-04  2:01                 ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-04  0:11 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Wed, 2021-02-03 at 15:59 -0800, Sean Christopherson wrote:
> On Thu, Feb 04, 2021, Kai Huang wrote:
> > On Wed, 2021-02-03 at 15:36 -0800, Sean Christopherson wrote:
> > > On Thu, Feb 04, 2021, Kai Huang wrote:
> > > > On Wed, 2021-02-03 at 11:36 -0800, Sean Christopherson wrote:
> > > > > On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > > > > > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > > > > Don't you need to deep copy the pageinfo.contents struct as well?
> > > > > > Otherwise the guest could change these after they were checked.
> > > > > > 
> > > > > > But it seems it is checked by the HW and something is caught that would
> > > > > > inject a GP anyway? Can you elaborate on the importance of these
> > > > > > checks?
> > > > > 
> > > > > Argh, yes.  These checks are to allow migration between systems with different
> > > > > SGX capabilities, and more importantly to prevent userspace from doing an end
> > > > > around on the restricted access to PROVISIONKEY.
> > > > > 
> > > > > IIRC, earlier versions did do a deep copy, but then I got clever.  Anyways, yeah,
> > > > > sadly the entire pageinfo.contents page will need to be copied.
> > > > 
> > > > I don't fully understand the problem. Are you worried about contents being updated by
> > > > other vcpus during the trap? 
> > > > 
> > > > And I don't see how copy can avoid this problem. Even you do copy, the content can
> > > > still be modified afterwards, correct? So what's the point of copying?
> > > 
> > > The goal isn't correctness, it's to prevent a TOCTOU bug.  E.g. the guest could
> > > do ECREATE w/ SECS.SGX_ATTR_PROVISIONKEY=0, and simultaneously set
> > > SGX_ATTR_PROVISIONKEY to bypass the above check.
> > 
> > Oh ok. Agreed.
> > 
> > However, such attack would require precise timing. Not sure whether it is feasible in
> > practice.
> 
> It's very feasible.  XOR the bit in a tight loop, build the enclave on a
> separate thread.  Do that until EINIT succeeds.  Compared to other timing
> attacks, I doubt it'd take all that long to get a successful result.

How does it work? The PROVISIONKEY bit needs to be set after KVM checks the SECS
attributes and before KVM actually does ECREATE, right?

> 
> Regardless, the difficulty in exploiting the bug is irrelevant, it's a glaring
> flaw that needs to be fixed.

Sure.



^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-04  0:04                       ` Kai Huang
@ 2021-02-04  0:28                         ` Sean Christopherson
  2021-02-04  3:18                           ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-04  0:28 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm,
	x86, corbet, luto, jethro, wanpengli, mingo, b.thiel, tglx,
	jarkko, joro, hpa, jmattson, vkuznets, bp, Huang, Haitao

On Thu, Feb 04, 2021, Kai Huang wrote:
> On Wed, 2021-02-03 at 15:37 -0800, Dave Hansen wrote:
> > On 2/3/21 3:32 PM, Sean Christopherson wrote:
> > > > > > Yeah, special casing KVM is almost always the wrong thing to do.
> > > > > > Anything that KVM can do, other subsystems will do as well.
> > > > > Agreed.  Thwarting ioremap itself seems like the right way to go.
> > > > This sounds irrelevant to KVM SGX, thus I won't include it to KVM SGX series.
> > > I would say it's relevant, but a pre-existing bug.  Same net effect on what's
> > > needed for this series..
> > > 
> > > I say it's a pre-existing bug, because I'm pretty sure KVM can be coerced into
> > > accessing the EPC by handing KVM a memslot that's backed by an enclave that was
> > > created by host userspace (via /dev/sgx_enclave).
> > 
> > Dang, you beat me to it.  I was composing another email that said the
> > exact same thing.
> > 
> > I guess we need to take a closer look at the KVM fallout from this.
> > It's a few spots where it KVM knew it might be consuming garbage.  It
> > just get extra weird stinky garbage now.
> 
> I don't quite understand how KVM will need to access EPC memslot. It is *guest*, but
> not KVM, who can read EPC from non-enclave. And if I understand correctly, there will
> be no place for KVM to use kernel address of EPC to access it. To KVM, there's no
> difference, whether EPC backend is from /dev/sgx_enclave, or /dev/sgx_vepc. And we
> really cannot prevent guest from doing anything.
> 
> So how memremap() of EPC section is related to KVM SGX? For instance, the
> implementation of this series needs to be modified due to this?

See kvm_vcpu_map() -> __kvm_map_gfn(), which blindly uses memremap() when the
resulting pfn isn't a "valid" pfn.  KVM doesn't need access to an EPC memslot,
we're talking about the case where a malicious userspace/guest hands KVM a GPA
that resolves to the EPC.  E.g. nested VM-Enter with the L1->L2 MSR bitmap
pointing at EPC.  L0 KVM will intercept VM-Enter and then read L1's bitmap to
merge its desires with L0 KVM's requirements.  That read will hit the EPC, and
thankfully for KVM, return garbage.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-03 22:59                 ` Sean Christopherson
@ 2021-02-04  1:39                   ` Jarkko Sakkinen
  2021-02-04  2:59                     ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-04  1:39 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Kai Huang, linux-sgx, kvm, x86, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Wed, Feb 03, 2021 at 02:59:47PM -0800, Sean Christopherson wrote:
> On Thu, Feb 04, 2021, Jarkko Sakkinen wrote:
> > On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> > > What working *incorrectly* thing is related to SGX virtualization? The things
> > > SGX virtualization requires (basically just raw EPC allocation) are all in
> > > sgx/main.c. 
> > 
> > States:
> > 
> > A. SGX driver is unsupported.
> > B. SGX driver is supported and initialized correctly.
> > C. SGX driver is supported and failed to initialize.
> > 
> > I just thought that KVM should support SGX when we are either in states A
> > or B.  Even the short summary implies this. It is expected that SGX driver
> > initializes correctly if it is supported in the first place. If it doesn't,
> > something is probaly seriously wrong. That is something we don't expect in
> > a legit system behavior.
> 
> It's legit behavior, and something we (you?) explicitly want to support.  See
> patch 05, x86/cpu/intel: Allow SGX virtualization without Launch Control support.

What I think would be sane behavior is to allow KVM when
sgx_drv_init() returns -ENODEV (case A). This happens when LC is
not enabled:

	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
		return -ENODEV;

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 23/27] KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions
  2021-02-04  0:11               ` Kai Huang
@ 2021-02-04  2:01                 ` Sean Christopherson
  0 siblings, 0 replies; 156+ messages in thread
From: Sean Christopherson @ 2021-02-04  2:01 UTC (permalink / raw)
  To: Kai Huang
  Cc: Edgecombe, Rick P, linux-sgx, kvm, x86, Huang, Haitao, luto,
	jarkko, Hansen, Dave, vkuznets, bp, mingo, tglx, hpa, pbonzini,
	joro, wanpengli, jmattson

On Thu, Feb 04, 2021, Kai Huang wrote:
> On Wed, 2021-02-03 at 15:59 -0800, Sean Christopherson wrote:
> > On Thu, Feb 04, 2021, Kai Huang wrote:
> > > On Wed, 2021-02-03 at 15:36 -0800, Sean Christopherson wrote:
> > > > On Thu, Feb 04, 2021, Kai Huang wrote:
> > > > > On Wed, 2021-02-03 at 11:36 -0800, Sean Christopherson wrote:
> > > > > > On Wed, Feb 03, 2021, Edgecombe, Rick P wrote:
> > > > > > > On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> > > > > > > Don't you need to deep copy the pageinfo.contents struct as well?
> > > > > > > Otherwise the guest could change these after they were checked.
> > > > > > > 
> > > > > > > But it seems it is checked by the HW and something is caught that would
> > > > > > > inject a GP anyway? Can you elaborate on the importance of these
> > > > > > > checks?
> > > > > > 
> > > > > > Argh, yes.  These checks are to allow migration between systems with different
> > > > > > SGX capabilities, and more importantly to prevent userspace from doing an end
> > > > > > around on the restricted access to PROVISIONKEY.
> > > > > > 
> > > > > > IIRC, earlier versions did do a deep copy, but then I got clever.  Anyways, yeah,
> > > > > > sadly the entire pageinfo.contents page will need to be copied.
> > > > > 
> > > > > I don't fully understand the problem. Are you worried about contents being updated by
> > > > > other vcpus during the trap? 
> > > > > 
> > > > > And I don't see how copy can avoid this problem. Even you do copy, the content can
> > > > > still be modified afterwards, correct? So what's the point of copying?
> > > > 
> > > > The goal isn't correctness, it's to prevent a TOCTOU bug.  E.g. the guest could
> > > > do ECREATE w/ SECS.SGX_ATTR_PROVISIONKEY=0, and simultaneously set
> > > > SGX_ATTR_PROVISIONKEY to bypass the above check.
> > > 
> > > Oh ok. Agreed.
> > > 
> > > However, such attack would require precise timing. Not sure whether it is feasible in
> > > practice.
> > 
> > It's very feasible.  XOR the bit in a tight loop, build the enclave on a
> > separate thread.  Do that until EINIT succeeds.  Compared to other timing
> > attacks, I doubt it'd take all that long to get a successful result.
> 
> How does it work? The setting PROVISION bit needs to be set after KVM checks SECS's
> attribute, and before KVM actually does ECREATE, right?

Yep.  More precisely, toggled between KVM's read into its local copy and final
execution of ECREATE.  It's actually a huge window when you consider how many
uops ENCLS has to churn through before it reads 'contents'.  The success rate
would probably be 25%: 50% chance KVM's read sees the 'good' value, 50% chance
the CPU sees the 'bad' value in the same exit.
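
To make the window concrete, here is a minimal userspace sketch of the
snapshot-then-check pattern that a deep copy provides.  It is not the KVM
code: struct model_secs, model_checked_ecreate() and the MODEL_ constant are
hypothetical names for illustration, and the bit position is an assumption.
The point is only that the check and the consumer must see the same bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position is an assumption for this sketch, not taken from the patch. */
#define MODEL_ATTR_PROVISIONKEY (1ULL << 4)

/* Hypothetical stand-in for the part of the SECS that the check cares about. */
struct model_secs {
	uint64_t attributes;
};

/*
 * Snapshot guest memory once, check the snapshot, and hand the snapshot
 * (never the guest pointer) to the consumer.  Returns 0 on success and
 * -1 if the snapshot requests a key the guest is not allowed to use.
 */
int model_checked_ecreate(const volatile struct model_secs *guest,
			  bool allow_provisionkey,
			  struct model_secs *consumed)
{
	struct model_secs snapshot;

	/* Single read of guest-controlled memory. */
	snapshot.attributes = guest->attributes;

	if (!allow_provisionkey &&
	    (snapshot.attributes & MODEL_ATTR_PROVISIONKEY))
		return -1;

	/* Later guest writes cannot change what the consumer sees. */
	*consumed = snapshot;
	return 0;
}
```

With this shape, a thread flipping the bit in guest memory can still make the
check fail, but it can no longer make the check pass and the consumer see a
different value.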

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04  1:39                   ` Jarkko Sakkinen
@ 2021-02-04  2:59                     ` Kai Huang
  2021-02-04  3:05                       ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-04  2:59 UTC (permalink / raw)
  To: Jarkko Sakkinen, Sean Christopherson
  Cc: linux-sgx, kvm, x86, luto, dave.hansen, haitao.huang, pbonzini,
	bp, tglx, mingo, hpa

On Thu, 2021-02-04 at 03:39 +0200, Jarkko Sakkinen wrote:
> On Wed, Feb 03, 2021 at 02:59:47PM -0800, Sean Christopherson wrote:
> > On Thu, Feb 04, 2021, Jarkko Sakkinen wrote:
> > > On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> > > > What working *incorrectly* thing is related to SGX virtualization? The things
> > > > SGX virtualization requires (basically just raw EPC allocation) are all in
> > > > sgx/main.c. 
> > > 
> > > States:
> > > 
> > > A. SGX driver is unsupported.
> > > B. SGX driver is supported and initialized correctly.
> > > C. SGX driver is supported and failed to initialize.
> > > 
> > > I just thought that KVM should support SGX when we are either in states A
> > > or B.  Even the short summary implies this. It is expected that SGX driver
> > > initializes correctly if it is supported in the first place. If it doesn't,
> > > something is probaly seriously wrong. That is something we don't expect in
> > > a legit system behavior.
> > 
> > It's legit behavior, and something we (you?) explicitly want to support.  See
> > patch 05, x86/cpu/intel: Allow SGX virtualization without Launch Control support.
> 
> What I think would be a sane behavior, would be to allow KVM when
> sgx_drv_init() returns -ENODEV (case A). This happens when LC is
> not enabled:
> 
> 	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
> 		return -ENODEV;
> 
> /Jarkko

I really don't understand what the difference between A and C is. When "SGX driver is
supported and failed to initialize" happens, it just means "SGX driver is
unsupported". If that is not the case, can you explicitly point out what the
problem will be?


^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04  2:59                     ` Kai Huang
@ 2021-02-04  3:05                       ` Jarkko Sakkinen
  2021-02-04  3:09                         ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-04  3:05 UTC (permalink / raw)
  To: Kai Huang
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, dave.hansen,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Thu, Feb 04, 2021 at 03:59:20PM +1300, Kai Huang wrote:
> On Thu, 2021-02-04 at 03:39 +0200, Jarkko Sakkinen wrote:
> > On Wed, Feb 03, 2021 at 02:59:47PM -0800, Sean Christopherson wrote:
> > > On Thu, Feb 04, 2021, Jarkko Sakkinen wrote:
> > > > On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> > > > > What working *incorrectly* thing is related to SGX virtualization? The things
> > > > > SGX virtualization requires (basically just raw EPC allocation) are all in
> > > > > sgx/main.c. 
> > > > 
> > > > States:
> > > > 
> > > > A. SGX driver is unsupported.
> > > > B. SGX driver is supported and initialized correctly.
> > > > C. SGX driver is supported and failed to initialize.
> > > > 
> > > > I just thought that KVM should support SGX when we are either in states A
> > > > or B.  Even the short summary implies this. It is expected that SGX driver
> > > > initializes correctly if it is supported in the first place. If it doesn't,
> > > > something is probaly seriously wrong. That is something we don't expect in
> > > > a legit system behavior.
> > > 
> > > It's legit behavior, and something we (you?) explicitly want to support.  See
> > > patch 05, x86/cpu/intel: Allow SGX virtualization without Launch Control support.
> > 
> > What I think would be a sane behavior, would be to allow KVM when
> > sgx_drv_init() returns -ENODEV (case A). This happens when LC is
> > not enabled:
> > 
> > 	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
> > 		return -ENODEV;
> > 
> > /Jarkko
> 
> I really don't understand what's the difference between A and C. When "SGX driver is
> supported and failed to initialize" happens, it just means "SGX driver is
> unsupported". If it is not the case, can you explicitly point out what will be the
> problem?

ret != 0 && ret != -ENODEV

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04  3:05                       ` Jarkko Sakkinen
@ 2021-02-04  3:09                         ` Jarkko Sakkinen
  2021-02-04  3:20                           ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-04  3:09 UTC (permalink / raw)
  To: Kai Huang
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, dave.hansen,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Thu, Feb 04, 2021 at 05:05:56AM +0200, Jarkko Sakkinen wrote:
> On Thu, Feb 04, 2021 at 03:59:20PM +1300, Kai Huang wrote:
> > On Thu, 2021-02-04 at 03:39 +0200, Jarkko Sakkinen wrote:
> > > On Wed, Feb 03, 2021 at 02:59:47PM -0800, Sean Christopherson wrote:
> > > > On Thu, Feb 04, 2021, Jarkko Sakkinen wrote:
> > > > > On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> > > > > > What working *incorrectly* thing is related to SGX virtualization? The things
> > > > > > SGX virtualization requires (basically just raw EPC allocation) are all in
> > > > > > sgx/main.c. 
> > > > > 
> > > > > States:
> > > > > 
> > > > > A. SGX driver is unsupported.
> > > > > B. SGX driver is supported and initialized correctly.
> > > > > C. SGX driver is supported and failed to initialize.
> > > > > 
> > > > > I just thought that KVM should support SGX when we are either in states A
> > > > > or B.  Even the short summary implies this. It is expected that SGX driver
> > > > > initializes correctly if it is supported in the first place. If it doesn't,
> > > > > something is probaly seriously wrong. That is something we don't expect in
> > > > > a legit system behavior.
> > > > 
> > > > It's legit behavior, and something we (you?) explicitly want to support.  See
> > > > patch 05, x86/cpu/intel: Allow SGX virtualization without Launch Control support.
> > > 
> > > What I think would be a sane behavior, would be to allow KVM when
> > > sgx_drv_init() returns -ENODEV (case A). This happens when LC is
> > > not enabled:
> > > 
> > > 	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
> > > 		return -ENODEV;
> > > 
> > > /Jarkko
> > 
> > I really don't understand what's the difference between A and C. When "SGX driver is
> > supported and failed to initialize" happens, it just means "SGX driver is
> > unsupported". If it is not the case, can you explicitly point out what will be the
> > problem?

This is as explicit as I can ever possibly get:

A: ret == -ENODEV
B: ret == 0
C: ret != 0 && ret != -ENODEV
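
The same three-way split, written as a small standalone C sketch.  The enum
and function names are hypothetical and exist only to restate the mapping
above; they are not part of the series.

```c
#include <errno.h>

/* Hypothetical names for the three states listed above. */
enum model_drv_state {
	MODEL_DRV_UNSUPPORTED,	/* A: ret == -ENODEV */
	MODEL_DRV_READY,	/* B: ret == 0 */
	MODEL_DRV_FAILED,	/* C: any other non-zero value */
};

/* Classify the return value of a driver init function. */
enum model_drv_state model_classify_drv_init(int ret)
{
	if (ret == -ENODEV)
		return MODEL_DRV_UNSUPPORTED;
	if (ret == 0)
		return MODEL_DRV_READY;
	return MODEL_DRV_FAILED;
}
```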

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-04  0:28                         ` Sean Christopherson
@ 2021-02-04  3:18                           ` Kai Huang
  2021-02-04 16:28                             ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-04  3:18 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Dave Hansen, Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm,
	x86, corbet, luto, jethro, wanpengli, mingo, b.thiel, tglx,
	jarkko, joro, hpa, jmattson, vkuznets, bp, Huang, Haitao

On Wed, 2021-02-03 at 16:28 -0800, Sean Christopherson wrote:
> On Thu, Feb 04, 2021, Kai Huang wrote:
> > On Wed, 2021-02-03 at 15:37 -0800, Dave Hansen wrote:
> > > On 2/3/21 3:32 PM, Sean Christopherson wrote:
> > > > > > > Yeah, special casing KVM is almost always the wrong thing to do.
> > > > > > > Anything that KVM can do, other subsystems will do as well.
> > > > > > Agreed.  Thwarting ioremap itself seems like the right way to go.
> > > > > This sounds irrelevant to KVM SGX, thus I won't include it to KVM SGX series.
> > > > I would say it's relevant, but a pre-existing bug.  Same net effect on what's
> > > > needed for this series..
> > > > 
> > > > I say it's a pre-existing bug, because I'm pretty sure KVM can be coerced into
> > > > accessing the EPC by handing KVM a memslot that's backed by an enclave that was
> > > > created by host userspace (via /dev/sgx_enclave).
> > > 
> > > Dang, you beat me to it.  I was composing another email that said the
> > > exact same thing.
> > > 
> > > I guess we need to take a closer look at the KVM fallout from this.
> > > It's a few spots where it KVM knew it might be consuming garbage.  It
> > > just get extra weird stinky garbage now.
> > 
> > I don't quite understand how KVM will need to access EPC memslot. It is *guest*, but
> > not KVM, who can read EPC from non-enclave. And if I understand correctly, there will
> > be no place for KVM to use kernel address of EPC to access it. To KVM, there's no
> > difference, whether EPC backend is from /dev/sgx_enclave, or /dev/sgx_vepc. And we
> > really cannot prevent guest from doing anything.
> > 
> > So how memremap() of EPC section is related to KVM SGX? For instance, the
> > implementation of this series needs to be modified due to this?
> 
> See kvm_vcpu_map() -> __kvm_map_gfn(), which blindly uses memremap() when the
> resulting pfn isn't a "valid" pfn.  KVM doesn't need access to an EPC memslot,
> we're talking the case where a malicious userspace/guest hands KVM a GPA that
> resolves to the EPC.  E.g. nested VM-Enter with the L1->L2 MSR bitmap pointing
> at EPC.  L0 KVM will intercept VM-Enter and then read L1's bitmap to merge it's
> desires with L0 KVM's requirements.  That read will hit the EPC, and thankfully
> for KVM, return garbage.

Right. I missed __kvm_map_gfn(). 

I am not quite sure that returning all ones can be treated as garbage, since a one
can mean true for a boolean, or a set bit in a bitmap as you said. But since this only
happens when guest/userspace is malicious, causing misbehavior to the guest is fine?
Do we see any security risk here?

And I also agree that denying memremap() for EPC is desirable. But I am not sure
whether this should be addressed before the KVM SGX series, or whether the KVM SGX
series should take care of it.



^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04  3:09                         ` Jarkko Sakkinen
@ 2021-02-04  3:20                           ` Kai Huang
  2021-02-04 14:51                             ` Jarkko Sakkinen
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-04  3:20 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, dave.hansen,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Thu, 2021-02-04 at 05:09 +0200, Jarkko Sakkinen wrote:
> On Thu, Feb 04, 2021 at 05:05:56AM +0200, Jarkko Sakkinen wrote:
> > On Thu, Feb 04, 2021 at 03:59:20PM +1300, Kai Huang wrote:
> > > On Thu, 2021-02-04 at 03:39 +0200, Jarkko Sakkinen wrote:
> > > > On Wed, Feb 03, 2021 at 02:59:47PM -0800, Sean Christopherson wrote:
> > > > > On Thu, Feb 04, 2021, Jarkko Sakkinen wrote:
> > > > > > On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> > > > > > > What working *incorrectly* thing is related to SGX virtualization? The things
> > > > > > > SGX virtualization requires (basically just raw EPC allocation) are all in
> > > > > > > sgx/main.c. 
> > > > > > 
> > > > > > States:
> > > > > > 
> > > > > > A. SGX driver is unsupported.
> > > > > > B. SGX driver is supported and initialized correctly.
> > > > > > C. SGX driver is supported and failed to initialize.
> > > > > > 
> > > > > > I just thought that KVM should support SGX when we are either in states A
> > > > > > or B.  Even the short summary implies this. It is expected that SGX driver
> > > > > > initializes correctly if it is supported in the first place. If it doesn't,
> > > > > > something is probaly seriously wrong. That is something we don't expect in
> > > > > > a legit system behavior.
> > > > > 
> > > > > It's legit behavior, and something we (you?) explicitly want to support.  See
> > > > > patch 05, x86/cpu/intel: Allow SGX virtualization without Launch Control support.
> > > > 
> > > > What I think would be a sane behavior, would be to allow KVM when
> > > > sgx_drv_init() returns -ENODEV (case A). This happens when LC is
> > > > not enabled:
> > > > 
> > > > 	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
> > > > 		return -ENODEV;
> > > > 
> > > > /Jarkko
> > > 
> > > I really don't understand what's the difference between A and C. When "SGX driver is
> > > supported and failed to initialize" happens, it just means "SGX driver is
> > > unsupported". If it is not the case, can you explicitly point out what will be the
> > > problem?
> 
> This is as explicit as I can ever possibly get:
> 
> A: ret == -ENODEV
> B: ret == 0
> C: ret != 0 && ret != -ENODEV

Let me try again: 

Why should A and C be treated differently? What will behave incorrectly in case of
C?

> 
> /Jarkko



^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-01-26  9:31 ` [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM Kai Huang
  2021-01-30 14:51   ` Jarkko Sakkinen
@ 2021-02-04  3:53   ` Kai Huang
  2021-02-05  0:32     ` Sean Christopherson
  1 sibling, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-04  3:53 UTC (permalink / raw)
  To: linux-sgx, kvm, x86
  Cc: seanjc, jarkko, luto, dave.hansen, haitao.huang, pbonzini, bp,
	tglx, mingo, hpa

Hi Sean,

Do you think it is reasonable to move this patch to KVM? sgx_virt_ecreate() can be
merged into the patch that handles ECREATE, and sgx_virt_einit() can be merged into
the patch that handles EINIT. Without the context of those two patches, it doesn't
make much sense to have them standalone under x86 here, I think. And nobody except
KVM will use them.

On Tue, 2021-01-26 at 22:31 +1300, Kai Huang wrote:
> From: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> The bare-metal kernel must intercept ECREATE to be able to impose policies
> on guests.  When it does this, the bare-metal kernel runs ECREATE against
> the userspace mapping of the virtualized EPC.
> 
> Provide wrappers around __ecreate() and __einit() to hide the ugliness
> of overloading the ENCLS return value to encode multiple error formats
> in a single int.  KVM will trap-and-execute ECREATE and EINIT as part
> of SGX virtualization, and on an exception, KVM needs the trapnr so that
> it can inject the correct fault into the guest.
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> ---
> v2->v3:
> 
>  - Added kdoc for sgx_virt_ecreate() and sgx_virt_einit(), per Jarkko.
>  - Changed to use CONFIG_X86_SGX_KVM.
> 
> ---
>  arch/x86/include/asm/sgx.h     | 16 ++++++
>  arch/x86/kernel/cpu/sgx/virt.c | 93 ++++++++++++++++++++++++++++++++++
>  2 files changed, 109 insertions(+)
>  create mode 100644 arch/x86/include/asm/sgx.h
> 
> diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> new file mode 100644
> index 000000000000..8a3ea3e1efbe
> --- /dev/null
> +++ b/arch/x86/include/asm/sgx.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_X86_SGX_H
> +#define _ASM_X86_SGX_H
> +
> +#include <linux/types.h>
> +
> +#ifdef CONFIG_X86_SGX_KVM
> +struct sgx_pageinfo;
> +
> +int sgx_virt_ecreate(struct sgx_pageinfo *pageinfo, void __user *secs,
> +		     int *trapnr);
> +int sgx_virt_einit(void __user *sigstruct, void __user *token,
> +		   void __user *secs, u64 *lepubkeyhash, int *trapnr);
> +#endif
> +
> +#endif /* _ASM_X86_SGX_H */
> diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> index e1ad7856d878..0f5b0e4e33dd 100644
> --- a/arch/x86/kernel/cpu/sgx/virt.c
> +++ b/arch/x86/kernel/cpu/sgx/virt.c
> @@ -252,3 +252,96 @@ int __init sgx_vepc_init(void)
>  
> 
>  	return misc_register(&sgx_vepc_dev);
>  }
> +
> +/**
> + * sgx_virt_ecreate() - Run ECREATE on behalf of guest
> + * @pageinfo:	Pointer to PAGEINFO structure
> + * @secs:	Userspace pointer to SECS page
> + * @trapnr:	trap number injected to guest in case of ECREATE error
> + *
> + * Run ECREATE on behalf of guest after KVM traps ECREATE for the purpose
> + * of enforcing policies of guest's enclaves, and return the trap number
> + * which should be injected to guest in case of any ECREATE error.
> + *
> + * Return:
> + * - 0: 	ECREATE was successful.
> + * - -EFAULT:	ECREATE returned error.
> + */
> +int sgx_virt_ecreate(struct sgx_pageinfo *pageinfo, void __user *secs,
> +		     int *trapnr)
> +{
> +	int ret;
> +
> +	/*
> +	 * @secs is userspace address, and it's not guaranteed @secs points at
> +	 * an actual EPC page. It's also possible to generate a kernel mapping
> +	 * to physical EPC page by resolving PFN but using __uaccess_xx() is
> +	 * simpler.
> +	 */
> +	__uaccess_begin();
> +	ret = __ecreate(pageinfo, (void *)secs);
> +	__uaccess_end();
> +
> +	if (encls_faulted(ret)) {
> +		*trapnr = ENCLS_TRAPNR(ret);
> +		return -EFAULT;
> +	}
> +
> +	/* ECREATE doesn't return an error code, it faults or succeeds. */
> +	WARN_ON_ONCE(ret);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(sgx_virt_ecreate);
> +
> +static int __sgx_virt_einit(void __user *sigstruct, void __user *token,
> +			    void __user *secs)
> +{
> +	int ret;
> +
> +	__uaccess_begin();
> +	ret =  __einit((void *)sigstruct, (void *)token, (void *)secs);
> +	__uaccess_end();
> +	return ret;
> +}
> +
> +/**
> + * sgx_virt_einit() - Run EINIT on behalf of guest
> + * @sigstruct:		Userspace pointer to SIGSTRUCT structure
> + * @token:		Userspace pointer to EINITTOKEN structure
> + * @secs:		Userspace pointer to SECS page
> + * @lepubkeyhash:	Pointer to guest's *virtual* SGX_LEPUBKEYHASH MSR
> + * 			values
> + * @trapnr:		trap number injected to guest in case of EINIT error
> + *
> + * Run EINIT on behalf of guest after KVM traps EINIT. If SGX_LC is available
> + * in host, bare-metal driver may rewrite the hardware values, therefore KVM
> + * needs to update hardware values to guest's virtual MSR values in order to
> + * ensure EINIT is executed with expected hardware values.
> + *
> + * Return:
> + * - 0: 	EINIT was successful.
> + * - -EFAULT:	EINIT returned error.
> + */
> +int sgx_virt_einit(void __user *sigstruct, void __user *token,
> +		   void __user *secs, u64 *lepubkeyhash, int *trapnr)
> +{
> +	int ret;
> +
> +	if (!boot_cpu_has(X86_FEATURE_SGX_LC)) {
> +		ret = __sgx_virt_einit(sigstruct, token, secs);
> +	} else {
> +		preempt_disable();
> +
> +		sgx_update_lepubkeyhash(lepubkeyhash);
> +
> +		ret = __sgx_virt_einit(sigstruct, token, secs);
> +		preempt_enable();
> +	}
> +
> +	if (encls_faulted(ret)) {
> +		*trapnr = ENCLS_TRAPNR(ret);
> +		return -EFAULT;
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(sgx_virt_einit);
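
For readers without the encls.h context, below is a minimal userspace model
of the "multiple error formats in a single int" idea that these wrappers
hide.  All MODEL_/model_ names are hypothetical, and the flag value is an
assumption chosen for illustration; only the shape (a flag bit marks a fault,
the remaining bits carry the trap number) is meant to mirror what the commit
message describes.

```c
#include <errno.h>

/* Assumed flag bit marking "the instruction faulted"; illustrative only. */
#define MODEL_ENCLS_FAULT_FLAG	0x40000000
#define MODEL_TRAP_GP		13	/* #GP vector */
#define MODEL_TRAP_PF		14	/* #PF vector */

/* Pack a trap number into the overloaded return value. */
int model_encode_fault(int trapnr)
{
	return MODEL_ENCLS_FAULT_FLAG | trapnr;
}

/* Non-zero if the overloaded value describes a fault. */
int model_encls_faulted(int ret)
{
	return (ret & MODEL_ENCLS_FAULT_FLAG) != 0;
}

/*
 * Split the overloaded value the way the wrappers do: on a fault, report
 * the trap number through @trapnr and return -EFAULT; otherwise pass the
 * instruction's own result through unchanged.
 */
int model_virt_result(int encls_ret, int *trapnr)
{
	if (model_encls_faulted(encls_ret)) {
		*trapnr = encls_ret & ~MODEL_ENCLS_FAULT_FLAG;
		return -EFAULT;
	}
	return encls_ret;
}
```

The caller then only has to distinguish -EFAULT (inject *trapnr into the
guest) from everything else, instead of decoding the raw value itself.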



^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04  3:20                           ` Kai Huang
@ 2021-02-04 14:51                             ` Jarkko Sakkinen
  2021-02-04 22:41                               ` Dave Hansen
  0 siblings, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-04 14:51 UTC (permalink / raw)
  To: Kai Huang
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, dave.hansen,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Thu, Feb 04, 2021 at 04:20:49PM +1300, Kai Huang wrote:
> On Thu, 2021-02-04 at 05:09 +0200, Jarkko Sakkinen wrote:
> > On Thu, Feb 04, 2021 at 05:05:56AM +0200, Jarkko Sakkinen wrote:
> > > On Thu, Feb 04, 2021 at 03:59:20PM +1300, Kai Huang wrote:
> > > > On Thu, 2021-02-04 at 03:39 +0200, Jarkko Sakkinen wrote:
> > > > > On Wed, Feb 03, 2021 at 02:59:47PM -0800, Sean Christopherson wrote:
> > > > > > On Thu, Feb 04, 2021, Jarkko Sakkinen wrote:
> > > > > > > On Wed, Feb 03, 2021 at 01:49:06PM +1300, Kai Huang wrote:
> > > > > > > > What working *incorrectly* thing is related to SGX virtualization? The things
> > > > > > > > SGX virtualization requires (basically just raw EPC allocation) are all in
> > > > > > > > sgx/main.c. 
> > > > > > > 
> > > > > > > States:
> > > > > > > 
> > > > > > > A. SGX driver is unsupported.
> > > > > > > B. SGX driver is supported and initialized correctly.
> > > > > > > C. SGX driver is supported and failed to initialize.
> > > > > > > 
> > > > > > > I just thought that KVM should support SGX when we are either in states A
> > > > > > > or B.  Even the short summary implies this. It is expected that SGX driver
> > > > > > > initializes correctly if it is supported in the first place. If it doesn't,
> > > > > > > something is probaly seriously wrong. That is something we don't expect in
> > > > > > > a legit system behavior.
> > > > > > 
> > > > > > It's legit behavior, and something we (you?) explicitly want to support.  See
> > > > > > patch 05, x86/cpu/intel: Allow SGX virtualization without Launch Control support.
> > > > > 
> > > > > What I think would be a sane behavior, would be to allow KVM when
> > > > > sgx_drv_init() returns -ENODEV (case A). This happens when LC is
> > > > > not enabled:
> > > > > 
> > > > > 	if (!cpu_feature_enabled(X86_FEATURE_SGX_LC))
> > > > > 		return -ENODEV;
> > > > > 
> > > > > /Jarkko
> > > > 
> > > > I really don't understand what's the difference between A and C. When "SGX driver is
> > > > supported and failed to initialize" happens, it just means "SGX driver is
> > > > unsupported". If it is not the case, can you explicitly point out what will be the
> > > > problem?
> > 
> > This is as explicit as I can ever possibly get:
> > 
> > A: ret == -ENODEV
> > B: ret == 0
> > C: ret != 0 && ret != -ENODEV
> 
> Let me try again: 
> 
> Why A and C should be treated differently? What will behave incorrectly, in case of
> C?

So you don't know what different error codes mean?

/Jarkko

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-04  3:18                           ` Kai Huang
@ 2021-02-04 16:28                             ` Sean Christopherson
  2021-02-04 16:48                               ` Dave Hansen
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-04 16:28 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm,
	x86, corbet, luto, jethro, wanpengli, mingo, b.thiel, tglx,
	jarkko, joro, hpa, jmattson, vkuznets, bp, Huang, Haitao

On Thu, Feb 04, 2021, Kai Huang wrote:
> On Wed, 2021-02-03 at 16:28 -0800, Sean Christopherson wrote:
> > On Thu, Feb 04, 2021, Kai Huang wrote:
> > > On Wed, 2021-02-03 at 15:37 -0800, Dave Hansen wrote:
> > > > On 2/3/21 3:32 PM, Sean Christopherson wrote:
> > > > > > > > Yeah, special casing KVM is almost always the wrong thing to do.
> > > > > > > > Anything that KVM can do, other subsystems will do as well.
> > > > > > > Agreed.  Thwarting ioremap itself seems like the right way to go.
> > > > > > This sounds irrelevant to KVM SGX, thus I won't include it to KVM SGX series.
> > > > > I would say it's relevant, but a pre-existing bug.  Same net effect on what's
> > > > > needed for this series..
> > > > > 
> > > > > I say it's a pre-existing bug, because I'm pretty sure KVM can be coerced into
> > > > > accessing the EPC by handing KVM a memslot that's backed by an enclave that was
> > > > > created by host userspace (via /dev/sgx_enclave).
> > > > 
> > > > Dang, you beat me to it.  I was composing another email that said the
> > > > exact same thing.
> > > > 
> > > > I guess we need to take a closer look at the KVM fallout from this.
> > > > It's a few spots where it KVM knew it might be consuming garbage.  It
> > > > just get extra weird stinky garbage now.
> > > 
> > > I don't quite understand how KVM will need to access EPC memslot. It is *guest*, but
> > > not KVM, who can read EPC from non-enclave. And if I understand correctly, there will
> > > be no place for KVM to use kernel address of EPC to access it. To KVM, there's no
> > > difference, whether EPC backend is from /dev/sgx_enclave, or /dev/sgx_vepc. And we
> > > really cannot prevent guest from doing anything.
> > > 
> > > So how memremap() of EPC section is related to KVM SGX? For instance, the
> > > implementation of this series needs to be modified due to this?
> > 
> > See kvm_vcpu_map() -> __kvm_map_gfn(), which blindly uses memremap() when the
> > resulting pfn isn't a "valid" pfn.  KVM doesn't need access to an EPC memslot,
> > we're talking the case where a malicious userspace/guest hands KVM a GPA that
> > resolves to the EPC.  E.g. nested VM-Enter with the L1->L2 MSR bitmap pointing
> > at EPC.  L0 KVM will intercept VM-Enter and then read L1's bitmap to merge it's
> > desires with L0 KVM's requirements.  That read will hit the EPC, and thankfully
> > for KVM, return garbage.
> 
> Right. I missed __kvm_map_gfn(). 
> 
> I am not quite sure returning all ones can be treated as garbage, since one can means
> true for a boolean, or one bit in bitmap as you said. But since this only happens
> when guest/userspace is malicious, so causing misbehavior to the guest is fine?

Yes, it's fine.  It's really the guest causing misbehavior for itself.

> Do we see any security risk here?

Not with current CPUs, which drop writes and read all ones.  If future CPUs take
creative liberties with the SDM, then we could have a problem, but that's why
Dave is trying to get stronger guarantees into the SDM.
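
A simplified userspace model of why "reads all ones" is benign for the MSR
bitmap example: if a set bit means "intercept" and the merged bitmap
intercepts whenever either side asks for it, an all-ones read from L1 can
only add intercepts.  This is a sketch under those assumptions, not the KVM
merge code, and every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of a read that hits the abort page: every byte reads as 0xff. */
void model_read_abort_page(uint64_t *dst, size_t words)
{
	memset(dst, 0xff, words * sizeof(*dst));
}

/* A set bit means "intercept this MSR"; either party's request wins. */
void model_merge_msr_bitmaps(const uint64_t *l0, const uint64_t *l1,
			     uint64_t *merged, size_t words)
{
	size_t i;

	for (i = 0; i < words; i++)
		merged[i] = l0[i] | l1[i];
}

/* Returns 1 if every bit in the bitmap is set, 0 otherwise. */
int model_all_intercepted(const uint64_t *bitmap, size_t words)
{
	size_t i;

	for (i = 0; i < words; i++)
		if (bitmap[i] != UINT64_MAX)
			return 0;
	return 1;
}
```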

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-04 16:28                             ` Sean Christopherson
@ 2021-02-04 16:48                               ` Dave Hansen
  2021-02-05 12:32                                 ` Kai Huang
  0 siblings, 1 reply; 156+ messages in thread
From: Dave Hansen @ 2021-02-04 16:48 UTC (permalink / raw)
  To: Sean Christopherson, Kai Huang
  Cc: Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm, x86, corbet,
	luto, jethro, wanpengli, mingo, b.thiel, tglx, jarkko, joro, hpa,
	jmattson, vkuznets, bp, Huang, Haitao

On 2/4/21 8:28 AM, Sean Christopherson wrote:
>> Do we see any security risk here?
> Not with current CPUs, which drop writes and read all ones.  If future CPUs take
> creatives liberties with the SDM, then we could have a problem, but that's why
> Dave is trying to get stronger guarantees into the SDM.

I really don't like the idea of the abort page being used by code that
doesn't know what it's dealing with.  It just seems like trouble (aka.
security risk) waiting to happen.

^ permalink raw reply	[flat|nested] 156+ messages in thread

* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04 14:51                             ` Jarkko Sakkinen
@ 2021-02-04 22:41                               ` Dave Hansen
  2021-02-04 22:56                                 ` Kai Huang
  2021-02-05  2:08                                 ` Jarkko Sakkinen
  0 siblings, 2 replies; 156+ messages in thread
From: Dave Hansen @ 2021-02-04 22:41 UTC (permalink / raw)
  To: Jarkko Sakkinen, Kai Huang
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On 2/4/21 6:51 AM, Jarkko Sakkinen wrote:
>>> A: ret == -ENODEV
>>> B: ret == 0
>>> C: ret != 0 && ret != -ENODEV
>> Let me try again: 
>>
>> Why should A and C be treated differently? What will behave incorrectly in case of
>> C?
> So you don't know what different error codes mean?

How about we just leave the check in place as Sean wrote it, and add a
nice comment to explain what it is doing:
	
	/*
	 * Always try to initialize the native *and* KVM drivers.
	 * The KVM driver is less picky than the native one and
	 * can function if the native one is not supported on the
	 * current system or fails to initialize.
	 *
	 * Error out only if both fail to initialize.
	 */
	ret = !!sgx_drv_init() & !!sgx_vepc_init();
	if (ret)
		goto err_kthread;
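
For anyone wondering why the check uses bitwise & rather than logical &&, here
is a small user-space sketch of the difference. The names fake_init(),
both_failed() and both_failed_short_circuit() are hypothetical stand-ins made
up for illustration; they are not the kernel functions, and the real
sgx_drv_init()/sgx_vepc_init() obviously do much more. The point is only that
& evaluates both operands, so both drivers always get a chance to initialize,
and the result is nonzero only when both calls returned an error.

```c
#include <errno.h>

/* Counts how many init attempts were made (illustration only). */
static int calls;

/* Hypothetical stand-in for sgx_drv_init() / sgx_vepc_init(). */
static int fake_init(int result)
{
	calls++;
	return result;
}

/*
 * Mirrors the proposed check: !! normalizes each return value to 0 or 1,
 * and bitwise & has no short-circuit, so both calls always run.
 * The result is 1 only if both attempts failed.
 */
static int both_failed(int drv_ret, int vepc_ret)
{
	calls = 0;
	return !!fake_init(drv_ret) & !!fake_init(vepc_ret);
}

/*
 * Same truth table, but logical && skips the second call whenever the
 * first one succeeded (returned 0), so the second driver would never
 * be initialized in that case.
 */
static int both_failed_short_circuit(int drv_ret, int vepc_ret)
{
	calls = 0;
	return fake_init(drv_ret) && fake_init(vepc_ret);
}
```

With these stand-ins, both_failed(-ENODEV, 0) returns 0 after two calls, while
both_failed_short_circuit(0, 0) stops after the first call.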



* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04 22:41                               ` Dave Hansen
@ 2021-02-04 22:56                                 ` Kai Huang
  2021-02-05  2:08                                 ` Jarkko Sakkinen
  1 sibling, 0 replies; 156+ messages in thread
From: Kai Huang @ 2021-02-04 22:56 UTC (permalink / raw)
  To: Dave Hansen, Jarkko Sakkinen
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Thu, 2021-02-04 at 14:41 -0800, Dave Hansen wrote:
> On 2/4/21 6:51 AM, Jarkko Sakkinen wrote:
> > > > A: ret == -ENODEV
> > > > B: ret == 0
> > > > C: ret != 0 && ret != -ENODEV
> > > Let me try again: 
> > > 
> > > > Why should A and C be treated differently? What will behave incorrectly in case of
> > > > C?
> > So you don't know what different error codes mean?
> 
> How about we just leave the check in place as Sean wrote it, and add a
> nice comment to explain what it is doing:
> 	
> 	/*
> 	 * Always try to initialize the native *and* KVM drivers.
> 	 * The KVM driver is less picky than the native one and
> 	 * can function if the native one is not supported on the
> 	 * current system or fails to initialize.
> 	 *
> 	 * Error out only if both fail to initialize.
> 	 */
> 	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> 	if (ret)
> 		goto err_kthread;
> 

Perfect to me. Thanks.



* Re: [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-02-04  3:53   ` Kai Huang
@ 2021-02-05  0:32     ` Sean Christopherson
  2021-02-05  1:39       ` Huang, Kai
  0 siblings, 1 reply; 156+ messages in thread
From: Sean Christopherson @ 2021-02-05  0:32 UTC (permalink / raw)
  To: Kai Huang
  Cc: linux-sgx, kvm, x86, jarkko, luto, dave.hansen, haitao.huang,
	pbonzini, bp, tglx, mingo, hpa

On Thu, Feb 04, 2021, Kai Huang wrote:
> Hi Sean,
> 
> Do you think it is reasonable to move this patch to KVM? sgx_virt_ecreate() can be
> merged into the patch that handles ECREATE, and sgx_virt_einit() can be merged into
> the patch that handles EINIT. W/o the context of those two patches, it doesn't make
> much sense to have them standalone under x86 here, I think. And nobody except KVM
> will use them.

Short answer, no.  To do that, nearly all of arch/x86/kernel/cpu/sgx/encls.h
would need to be exposed via asm/sgx.h.  The macro insanity and fault/error code
shenanigans really should be kept as private crud in SGX.  That's the primary
motivation for putting these in sgx/virt.c instead of KVM, my changelog just did
a really poor job of explaining that.


* RE: [RFC PATCH v3 14/27] x86/sgx: Add helpers to expose ECREATE and EINIT to KVM
  2021-02-05  0:32     ` Sean Christopherson
@ 2021-02-05  1:39       ` Huang, Kai
  0 siblings, 0 replies; 156+ messages in thread
From: Huang, Kai @ 2021-02-05  1:39 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: linux-sgx, kvm, x86, jarkko, luto, Hansen, Dave, Huang, Haitao,
	pbonzini, bp, tglx, mingo, hpa

> On Thu, Feb 04, 2021, Kai Huang wrote:
> > Hi Sean,
> >
> > Do you think it is reasonable to move this patch to KVM?
> > sgx_virt_ecreate() can be merged into the patch that handles ECREATE, and
> > sgx_virt_einit() can be merged into the patch that handles EINIT. W/o the
> > context of those two patches, it doesn't make much sense to have them
> > standalone under x86 here, I think. And nobody except KVM will use them.
> 
> Short answer, no.  To do that, nearly all of arch/x86/kernel/cpu/sgx/encls.h
> would need to be exposed via asm/sgx.h.  The macro insanity and fault/error
> code shenanigans really should be kept as private crud in SGX.  That's the
> primary motivation for putting these in sgx/virt.c instead of KVM, my changelog
> just did a really poor job of explaining that.

OK.


* Re: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-04 22:41                               ` Dave Hansen
  2021-02-04 22:56                                 ` Kai Huang
@ 2021-02-05  2:08                                 ` Jarkko Sakkinen
  2021-02-05  3:00                                   ` Huang, Kai
  1 sibling, 1 reply; 156+ messages in thread
From: Jarkko Sakkinen @ 2021-02-05  2:08 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kai Huang, Sean Christopherson, linux-sgx, kvm, x86, luto,
	haitao.huang, pbonzini, bp, tglx, mingo, hpa

On Thu, Feb 04, 2021 at 02:41:57PM -0800, Dave Hansen wrote:
> On 2/4/21 6:51 AM, Jarkko Sakkinen wrote:
> >>> A: ret == -ENODEV
> >>> B: ret == 0
> >>> C: ret != 0 && ret != -ENODEV
> >> Let me try again: 
> >>
> >> Why should A and C be treated differently? What will behave incorrectly in case of
> >> C?
> > So you don't know what different error codes mean?
> 
> How about we just leave the check in place as Sean wrote it, and add a
> nice comment to explain what it is doing:
> 	
> 	/*
> 	 * Always try to initialize the native *and* KVM drivers.
> 	 * The KVM driver is less picky than the native one and
> 	 * can function if the native one is not supported on the
> 	 * current system or fails to initialize.
> 	 *
> 	 * Error out only if both fail to initialize.
> 	 */
> 	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> 	if (ret)
> 		goto err_kthread;

WFM, I can go along, as long as there is a remark. There is a semantic
difference between "not supported" and "failure to initialize". The
driving point is that this should not be hidden. I was first thinking of
a note in the commit message, but an inline comment is actually a better
idea. Thanks!

I can ack the next version, as long as this comment is included.

/Jarkko


* RE: [RFC PATCH v3 08/27] x86/sgx: Initialize virtual EPC driver even when SGX driver is disabled
  2021-02-05  2:08                                 ` Jarkko Sakkinen
@ 2021-02-05  3:00                                   ` Huang, Kai
  0 siblings, 0 replies; 156+ messages in thread
From: Huang, Kai @ 2021-02-05  3:00 UTC (permalink / raw)
  To: Jarkko Sakkinen, Hansen, Dave
  Cc: Sean Christopherson, linux-sgx, kvm, x86, luto, Huang, Haitao,
	pbonzini, bp, tglx, mingo, hpa

> On Thu, Feb 04, 2021 at 02:41:57PM -0800, Dave Hansen wrote:
> > On 2/4/21 6:51 AM, Jarkko Sakkinen wrote:
> > >>> A: ret == -ENODEV
> > >>> B: ret == 0
> > >>> C: ret != 0 && ret != -ENODEV
> > >> Let me try again:
> > >>
> > >> Why should A and C be treated differently? What will behave
> > >> incorrectly in case of C?
> > > So you don't know what different error codes mean?
> >
> > How about we just leave the check in place as Sean wrote it, and add a
> > nice comment to explain what it is doing:
> >
> > 	/*
> > 	 * Always try to initialize the native *and* KVM drivers.
> > 	 * The KVM driver is less picky than the native one and
> > 	 * can function if the native one is not supported on the
> > 	 * current system or fails to initialize.
> > 	 *
> > 	 * Error out only if both fail to initialize.
> > 	 */
> > 	ret = !!sgx_drv_init() & !!sgx_vepc_init();
> > 	if (ret)
> > 		goto err_kthread;
> 
> WFM, I can go along, as long as there is a remark. There is a semantic
> difference between "not supported" and "failure to initialize". The driving point
> is that this should not be hidden. I was first thinking of a note in the commit
> message, but an inline comment is actually a better idea. Thanks!
> 
> I can ack the next version, as long as this comment is included.

Sure. Thanks Dave and Jarkko.

> 
> /Jarkko


* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-04 16:48                               ` Dave Hansen
@ 2021-02-05 12:32                                 ` Kai Huang
  2021-02-05 16:51                                   ` Sean Christopherson
  0 siblings, 1 reply; 156+ messages in thread
From: Kai Huang @ 2021-02-05 12:32 UTC (permalink / raw)
  To: Dave Hansen, Sean Christopherson
  Cc: Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm, x86, corbet,
	luto, jethro, wanpengli, mingo, b.thiel, tglx, jarkko, joro, hpa,
	jmattson, vkuznets, bp, Huang, Haitao

On Thu, 2021-02-04 at 08:48 -0800, Dave Hansen wrote:
> On 2/4/21 8:28 AM, Sean Christopherson wrote:
> > > Do we see any security risk here?
> > Not with current CPUs, which drop writes and read all ones.  If future CPUs take
> > creative liberties with the SDM, then we could have a problem, but that's why
> > Dave is trying to get stronger guarantees into the SDM.
> 
> I really don't like the idea of the abort page being used by code that
> doesn't know what it's dealing with.  It just seems like trouble (aka.
> security risk) waiting to happen.

Hi Dave,

Just to confirm, you want this (disallowing ioremap() for EPC) fixed in the upstream
kernel before KVM SGX can be merged, correct?

If so, and since it seems you also agreed that the better solution is to modify
ioremap() to refuse to map EPC, what do you think of the sample code Sean put in his
previous reply?

https://www.spinics.net/lists/kvm/msg234754.html

IMHO adding 'bool sgx_epc' to ioremap_desc is not ideal, since it's not generic.
Instead, we could define a new flag here, and ioremap_desc->flags can simply carry
it.
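
To make the flag idea concrete, below is a rough user-space model, not kernel
code. Everything in it is an assumption made for illustration: the MAP_* bits
(only loosely modeled on the kernel's IORES_MAP_* flags), struct mem_range,
the address ranges in the table, and the helpers collect_flags() and
remap_allowed(). In particular MAP_SGX_EPC is a made-up name for the new flag.
The sketch only shows the shape of the approach: the range walk ORs in a
generic flag for EPC, and the caller refuses the mapping when that flag is set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; MAP_SGX_EPC is the assumed new flag. */
#define MAP_SYSTEM_RAM	(1u << 0)
#define MAP_ENCRYPTED	(1u << 1)
#define MAP_SGX_EPC	(1u << 2)

/* A physical range [start, end) tagged with the flags it implies. */
struct mem_range {
	unsigned long start;
	unsigned long end;
	unsigned int flags;
};

/* Made-up memory layout, for illustration only. */
static const struct mem_range ranges[] = {
	{ 0x00000000UL, 0x80000000UL, MAP_SYSTEM_RAM },
	{ 0x90000000UL, 0x98000000UL, MAP_SGX_EPC },
};

/* OR together the flags of every range overlapping [addr, addr + size). */
static unsigned int collect_flags(unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;
	unsigned int flags = 0;
	size_t i;

	for (i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
		if (addr < ranges[i].end && ranges[i].start < end)
			flags |= ranges[i].flags;
	}
	return flags;
}

/* Refuse any mapping request that touches a range flagged as EPC. */
static bool remap_allowed(unsigned long addr, unsigned long size)
{
	return !(collect_flags(addr, size) & MAP_SGX_EPC);
}
```

A flag keeps the descriptor generic: another special range type would only
need another bit, not another bool member.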

Btw, as Sean already pointed out, SGX code uses memremap() to initialize EPC sections,
and we could choose to still allow this to avoid changing the SGX driver. But that
seems like a bit of a hack. What's your opinion?

Hi Sean,

If we all agree the fix is needed here, do you want to work on the patch (since you
already shared your thoughts), or do you want me to do it, with Suggested-by you?



* Re: [RFC PATCH v3 00/27] KVM SGX virtualization support
  2021-02-05 12:32                                 ` Kai Huang
@ 2021-02-05 16:51                                   ` Sean Christopherson
  0 siblings, 0 replies; 156+ messages in thread
From: Sean Christopherson @ 2021-02-05 16:51 UTC (permalink / raw)
  To: Kai Huang
  Cc: Dave Hansen, Paolo Bonzini, Edgecombe, Rick P, linux-sgx, kvm,
	x86, corbet, luto, jethro, wanpengli, mingo, b.thiel, tglx,
	jarkko, joro, hpa, jmattson, vkuznets, bp, Huang, Haitao

On Sat, Feb 06, 2021, Kai Huang wrote:
> Hi Sean,
> 
> If we all agree the fix is needed here, do you want to work on the patch (since you
> already shared your thoughts), or do you want me to do it, with Suggested-by you?

Nope, all yours.

