* [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior
@ 2022-06-22  9:37 Zhiquan Li
  2022-06-22  9:37 ` [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page Zhiquan Li
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Zhiquan Li @ 2022-06-22  9:37 UTC (permalink / raw)
  To: linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang, zhiquan1.li

V4: https://lore.kernel.org/linux-sgx/20220608032654.1764936-1-zhiquan1.li@intel.com/T/#t

Changes since V4:
- Switch the order of the two variables in patch 02 so that all
  variables are in reverse Christmas tree order.
- Do not initialize "ret" because it is unconditionally overwritten by
  the return value of force_sig_mceerr().
- Add Co-developed-by and Signed-off-by from Cathy Zhang to patch 01.
- Add Acked-by from Kai Huang to patch 01.

V3: https://lore.kernel.org/linux-sgx/41704e5d4c03b49fcda12e695595211d950cfb08.camel@kernel.org/T/#t

Changes since V3:
- Take the definition of the EPC page flag SGX_EPC_PAGE_KVM_GUEST from
  the third patch of Cathy Zhang's SGX rebootless recovery patch set,
  but discard the irrelevant portion, since that patch set might need
  some time to be reworked and the two are different features.
  Link: https://lore.kernel.org/linux-sgx/41704e5d4c03b49fcda12e695595211d950cfb08.camel@kernel.org/T/#m9782d23496cacecb7da07a67daa79f4b322ae170

V2: https://lore.kernel.org/linux-sgx/694234d7-6a0d-e85f-f2f9-e52b4a61e1ec@intel.com/T/#t

Changes since V2:
- Repurpose the owner field as the virtual address of the virtual EPC
  page.
- Remove struct sgx_vepc_page and the related code.
- Remove patch 01 as the changes are not necessary in the new design.
- Rework patch 02 as suggested by Jarkko.
- Adapt patches 03 and 04 since struct sgx_vepc_page was discarded.
- Replace the EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicates.
  Link: https://lore.kernel.org/linux-sgx/eb95b32ecf3d44a695610cf7f2816785@intel.com/T/#u

V1: https://lore.kernel.org/linux-sgx/443cb425-009c-2784-56f4-5e707122de76@intel.com/T/#t

Changes since V1:
- Updated the cover letter and commit messages, adding valuable
  information from Jarkko's, Tony's and Kai's comments.
- Added documentation for struct sgx_vepc and struct sgx_vepc_page.

Hi everyone,

This series contains a few patches that make SGX MCA behavior finer
grained.

When a VM guest accesses an SGX EPC page with a memory failure, the
current behavior is to kill the entire guest, whereas the expected
behavior is to kill only the SGX application inside it.

To fix this, send SIGBUS with code BUS_MCEERR_AR along with the extra
information the hypervisor needs to inject #MC into the guest, which
is helpful in the SGX virtualization case.

The rest happens on the guest side. A hypervisor such as QEMU already
has a mature facility to convert the HVA to a GPA and inject #MC into
the guest OS.
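
As a rough illustration of the HVA-to-GPA step mentioned above, the
sketch below models a hypervisor's memory-slot table in plain C. The
struct mem_slot and the function hva_to_gpa() are hypothetical names
made up for this example; they are not QEMU's or KVM's actual API, and
the real lookup is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of one guest memory slot: a contiguous HVA range
 * backing a contiguous GPA range. Not a real QEMU/KVM structure.
 */
struct mem_slot {
	uint64_t hva_start;
	uint64_t gpa_start;
	uint64_t size;
};

/*
 * Translate a host virtual address to a guest physical address.
 * Returns true and stores the result in *gpa when a slot covers hva.
 */
bool hva_to_gpa(const struct mem_slot *slots, size_t nr_slots,
		uint64_t hva, uint64_t *gpa)
{
	assert(slots != NULL && gpa != NULL);

	for (size_t i = 0; i < nr_slots; i++) {
		const struct mem_slot *s = &slots[i];

		/* Written to avoid overflow in hva_start + size. */
		if (hva >= s->hva_start && hva - s->hva_start < s->size) {
			*gpa = s->gpa_start + (hva - s->hva_start);
			return true;
		}
	}
	return false;
}
```

The address delivered with the signal would be passed in as hva; a
failed lookup means the address does not belong to guest memory.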

The solution is then extended to the normal SGX case, so that the task
has the opportunity to decide how to proceed when an EPC page has a
memory failure.
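
For illustration, here is a minimal sketch of how a userspace task (or
a hypervisor) might decode such a signal. si_code, si_addr and
si_addr_lsb are the standard Linux siginfo_t fields; the struct
mce_report and the function decode_sigbus_mce() are names invented for
this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for what was learned from the signal. */
struct mce_report {
	uintptr_t start;	/* first byte of the poisoned range */
	size_t len;		/* length of the poisoned range in bytes */
};

/*
 * Decode a SIGBUS siginfo. Returns true only for an "action required"
 * machine check (BUS_MCEERR_AR); other SIGBUS codes are not handled.
 */
bool decode_sigbus_mce(const siginfo_t *info, struct mce_report *out)
{
	assert(info != NULL && out != NULL);

	if (info->si_signo != SIGBUS || info->si_code != BUS_MCEERR_AR)
		return false;

	/* si_addr_lsb is log2 of the affected granule (PAGE_SHIFT here). */
	out->len = (size_t)1 << info->si_addr_lsb;
	out->start = (uintptr_t)info->si_addr & ~((uintptr_t)out->len - 1);
	return true;
}
```

A real program would install a handler with sigaction() and
SA_SIGINFO and call this helper from it; a hypervisor would then
translate "start" to a GPA before injecting #MC.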

However, when a page triggers a machine check, only the PFN is
reported. But in order for the hypervisor to inject #MC into the
guest, the virtual address is required. Therefore, repurpose the
"owner" field to hold the virtual address of the virtual EPC page so
that arch_memory_failure() can easily retrieve it.

Add a new EPC page flag, SGX_EPC_PAGE_KVM_GUEST, to indicate how the
field should be interpreted.

Suppose an enclave is shared by multiple processes. When an enclave
page triggers a machine check, the enclave is disabled so that it
cannot be entered again. Killing the other processes that have the
same enclave mapped would perhaps be overkill, but they will find that
the enclave is "dead" the next time they try to use it. Thanks to
Jarkko for the heads-up and to Tony for the clarification on this
point.

Our intention is to provide additional information so that the
application has more choices. The current behavior is gentle, and we
don't want to change it.

If you expect the other processes to be informed in such a case, then
you are looking for an MCA "early kill" feature, which deserves a
separate patch set.

Unlike host enclaves, a virtual EPC instance cannot be shared by
multiple VMs, because how enclaves are created is entirely up to the
guest. Sharing a virtual EPC instance would very likely break enclaves
in all VMs unexpectedly.

The SGX virtual EPC driver doesn't explicitly prevent a virtual EPC
instance from being shared by multiple VMs via fork(). However, KVM
doesn't support running a VM across multiple mm structures, and the
de facto userspace hypervisor (QEMU) doesn't use fork() to create a
new VM, so in practice this should not happen.

This series is based on tip/x86/sgx.

Tests:
1. MCE injection test for SGX in a VM.
   As expected, the application was killed and the VM stayed alive.
2. MCE injection test for SGX on the host.
   As expected, the application received SIGBUS with extra info.
3. Kernel selftest/sgx: PASS
4. Internal SGX stress test: PASS
5. kmemleak test: No memory leak detected.

Your feedback is much appreciated.

Best Regards,
Zhiquan

Zhiquan Li (3):
  x86/sgx: Repurpose the owner field as the virtual address of virtual
    EPC page
  x86/sgx: Fine grained SGX MCA behavior for virtualization
  x86/sgx: Fine grained SGX MCA behavior for normal case

 arch/x86/kernel/cpu/sgx/main.c | 27 +++++++++++++++++++++++++--
 arch/x86/kernel/cpu/sgx/sgx.h  |  2 ++
 arch/x86/kernel/cpu/sgx/virt.c |  4 +++-
 3 files changed, 30 insertions(+), 3 deletions(-)

-- 
2.25.1



* [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page
  2022-06-22  9:37 [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Zhiquan Li
@ 2022-06-22  9:37 ` Zhiquan Li
  2022-07-21 16:42   ` Dave Hansen
  2022-06-22  9:37 ` [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization Zhiquan Li
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Zhiquan Li @ 2022-06-22  9:37 UTC (permalink / raw)
  To: linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang, zhiquan1.li

When a page triggers a machine check, only the physical address of
the EPC page is reported. But in order for the hypervisor to inject
a #MC into the guest, the virtual address is required. Therefore,
repurpose the "owner" field to hold the virtual address of the
virtual EPC page so that arch_memory_failure() can easily retrieve it.

Add a new EPC page flag, SGX_EPC_PAGE_KVM_GUEST, to indicate how the
field should be interpreted.

Co-developed-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>

---
Changes since V4:
- Add Co-developed-by and Signed-off-by from Cathy Zhang, as she had
  fully discussed the flag name with Jarkko.
  Link: https://lore.kernel.org/all/df92395ade424401ac3c6322de568720@intel.com/
- Add Acked-by from Kai Huang
  Link: https://lore.kernel.org/linux-sgx/0676cd4e-d94b-e904-81ae-ca1c05d37070@intel.com/T/#mccfb11df30698dbd060f2b6f06383cda7f154ef3

Changes since V3:
- Take the definition of the EPC page flag SGX_EPC_PAGE_KVM_GUEST from
  the third patch of Cathy Zhang's SGX rebootless recovery patch set,
  but discard the irrelevant portion, since that patch set might need
  some time to be reworked and the two are different features.
  Link: https://lore.kernel.org/linux-sgx/41704e5d4c03b49fcda12e695595211d950cfb08.camel@kernel.org/T/#m9782d23496cacecb7da07a67daa79f4b322ae170

Changes since V2:
- Rework the patch as suggested by Jarkko.
- Remove struct sgx_vepc_page and the related code.
- Remove the definition of the new EPC page flag SGX_EPC_PAGE_IS_VEPC
  as it is a duplicate of SGX_EPC_PAGE_KVM_GUEST.
  Link: https://lore.kernel.org/linux-sgx/eb95b32ecf3d44a695610cf7f2816785@intel.com/T/#u

Changes since V1:
- Add documentation suggested by Jarkko.
---
 arch/x86/kernel/cpu/sgx/sgx.h  | 2 ++
 arch/x86/kernel/cpu/sgx/virt.c | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 0f17def9fe6f..b43582da1bcf 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -28,6 +28,8 @@
 
 /* Pages on free list */
 #define SGX_EPC_PAGE_IS_FREE		BIT(1)
+/* Pages allocated for KVM guest */
+#define SGX_EPC_PAGE_KVM_GUEST		BIT(2)
 
 struct sgx_epc_page {
 	unsigned int section;
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 6a77a14eee38..776ae5c1c032 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -46,10 +46,12 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc,
 	if (epc_page)
 		return 0;
 
-	epc_page = sgx_alloc_epc_page(vepc, false);
+	epc_page = sgx_alloc_epc_page((void *)addr, false);
 	if (IS_ERR(epc_page))
 		return PTR_ERR(epc_page);
 
+	epc_page->flags |= SGX_EPC_PAGE_KVM_GUEST;
+
 	ret = xa_err(xa_store(&vepc->page_array, index, epc_page, GFP_KERNEL));
 	if (ret)
 		goto err_free;
-- 
2.25.1



* [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization
  2022-06-22  9:37 [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Zhiquan Li
  2022-06-22  9:37 ` [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page Zhiquan Li
@ 2022-06-22  9:37 ` Zhiquan Li
  2022-07-21 16:54   ` Dave Hansen
  2022-06-22  9:37 ` [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case Zhiquan Li
  2022-06-26  6:04 ` [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Jarkko Sakkinen
  3 siblings, 1 reply; 11+ messages in thread
From: Zhiquan Li @ 2022-06-22  9:37 UTC (permalink / raw)
  To: linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang, zhiquan1.li

When a VM guest accesses an SGX EPC page with a memory failure, the
current behavior is to kill the entire guest, whereas the expected
behavior is to kill only the SGX application inside it.

To fix this, send SIGBUS with code BUS_MCEERR_AR along with the extra
information the hypervisor needs to inject #MC into the guest, which
is helpful in the SGX case.

The rest happens on the guest side. A hypervisor such as QEMU already
has a mature facility to convert the HVA to a GPA and inject #MC into
the guest OS.

Unlike host enclaves, a virtual EPC instance cannot be shared by
multiple VMs, because how enclaves are created is entirely up to the
guest. Sharing a virtual EPC instance would very likely break enclaves
in all VMs unexpectedly.

The SGX virtual EPC driver doesn't explicitly prevent a virtual EPC
instance from being shared by multiple VMs via fork().  However, KVM
doesn't support running a VM across multiple mm structures, and the
de facto userspace hypervisor (QEMU) doesn't use fork() to create a
new VM, so in practice this should not happen.

Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/linux-sgx/443cb425-009c-2784-56f4-5e707122de76@intel.com/T/#m1d1f4098f4fad78034e8706a60e4d79c119db407
---
Changes since V4:
- Switch the order of the two variables so that all variables are in
  reverse Christmas tree order.
- Do not initialize "ret" because it will be overridden by the return
  value of force_sig_mceerr() unconditionally.

Changes since V2:
- Retrieve virtual address from "owner" field of struct sgx_epc_page,
  instead of struct sgx_vepc_page.
- Replace EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicated.

Changes since V1:
- Add Acked-by from Kai Huang.
- Add Kai's excellent explanation of why there is no need to consider
  one virtual EPC being shared by two guests.
---
 arch/x86/kernel/cpu/sgx/main.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index ab4ec54bbdd9..4507c2302348 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -715,6 +715,8 @@ int arch_memory_failure(unsigned long pfn, int flags)
 	struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT);
 	struct sgx_epc_section *section;
 	struct sgx_numa_node *node;
+	unsigned long vaddr;
+	int ret;
 
 	/*
 	 * mm/memory-failure.c calls this routine for all errors
@@ -731,8 +733,26 @@ int arch_memory_failure(unsigned long pfn, int flags)
 	 * error. The signal may help the task understand why the
 	 * enclave is broken.
 	 */
-	if (flags & MF_ACTION_REQUIRED)
-		force_sig(SIGBUS);
+	if (flags & MF_ACTION_REQUIRED) {
+		/*
+		 * Provide extra info to the task so that it can make further
+		 * decision but not simply kill it. This is quite useful for
+		 * virtualization case.
+		 */
+		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
+			/*
+			 * The "owner" field is repurposed as the virtual address
+			 * of virtual EPC page.
+			 */
+			vaddr = (unsigned long)page->owner & PAGE_MASK;
+			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
+					PAGE_SHIFT);
+			if (ret < 0)
+				pr_err("Memory failure: Error sending signal to %s:%d: %d\n",
+					current->comm, current->pid, ret);
+		} else
+			force_sig(SIGBUS);
+	}
 
 	section = &sgx_epc_sections[page->section];
 	node = section->node;
-- 
2.25.1



* [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case
  2022-06-22  9:37 [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Zhiquan Li
  2022-06-22  9:37 ` [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page Zhiquan Li
  2022-06-22  9:37 ` [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization Zhiquan Li
@ 2022-06-22  9:37 ` Zhiquan Li
  2022-07-21 16:57   ` Dave Hansen
  2022-06-26  6:04 ` [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Jarkko Sakkinen
  3 siblings, 1 reply; 11+ messages in thread
From: Zhiquan Li @ 2022-06-22  9:37 UTC (permalink / raw)
  To: linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang, zhiquan1.li

When an application accesses an SGX EPC page with a memory failure,
the task receives a SIGBUS signal without any extra information,
unless the EPC page has the SGX_EPC_PAGE_KVM_GUEST flag set. However,
in some cases SGX is only used in a sub-task, and the entire task
group should not be killed just because an SGX EPC page used by that
sub-task has a memory failure.

To fix this, extend the solution to the normal case. That is, a
regular SGX EPC page with a memory failure triggers a SIGBUS signal
with code BUS_MCEERR_AR and additional information, so that the user
has the opportunity to decide how to proceed.

Suppose an enclave is shared by multiple processes. When an enclave
page triggers a machine check, the enclave is disabled so that it
cannot be entered again. Killing the other processes that have the
same enclave mapped would perhaps be overkill, but they will find that
the enclave is "dead" the next time they try to use it. Thanks to
Jarkko for the heads-up and to Tony for the clarification on this
point.

Our intention is to provide additional information so that the
application has more choices. The current behavior is gentle, and we
don't want to change it.
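
To make the resulting decision in arch_memory_failure() easier to
follow, here is a small userspace model of it. It is only a sketch:
the model_ types, the MODEL_ macros and pick_signal() are invented for
illustration, and the layout of the "desc" field (page address in the
upper bits, flag bits below PAGE_SHIFT) is an assumption about struct
sgx_encl_page rather than something stated in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12
#define MODEL_PAGE_MASK		(~((1UL << MODEL_PAGE_SHIFT) - 1))
#define MODEL_KVM_GUEST		(1U << 2)	/* mirrors SGX_EPC_PAGE_KVM_GUEST */

/* Stand-in for struct sgx_encl_page; only the field used here. */
struct model_encl_page {
	unsigned long desc;	/* assumed: address | low flag bits */
};

/* Stand-in for struct sgx_epc_page as used by this series. */
struct model_epc_page {
	unsigned int flags;
	void *owner;	/* enclave page pointer, or a vaddr for vEPC */
};

/*
 * Returns true and sets *vaddr when SIGBUS with BUS_MCEERR_AR and an
 * address should be sent; returns false for a plain SIGBUS.
 */
bool pick_signal(const struct model_epc_page *page, unsigned long *vaddr)
{
	assert(page != NULL && vaddr != NULL);

	if (!page->owner)
		return false;

	if (page->flags & MODEL_KVM_GUEST) {
		/* vEPC page: "owner" holds the host virtual address. */
		*vaddr = (unsigned long)(uintptr_t)page->owner & MODEL_PAGE_MASK;
	} else {
		/* Regular enclave page: the address lives in owner->desc. */
		const struct model_encl_page *encl = page->owner;

		*vaddr = encl->desc & MODEL_PAGE_MASK;
	}
	return true;
}
```

In the model, a page with no owner falls back to the plain SIGBUS,
which corresponds to the "else" branch kept from patch 2.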

Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>
---
No changes since V4.

Changes since V2:
- Adapted the code since struct sgx_vepc_page was discarded.
- Replace EPC page flag SGX_EPC_PAGE_IS_VEPC with
  SGX_EPC_PAGE_KVM_GUEST as they are duplicated.

Changes since V1:
- Add valuable information from Jarkko and Tony into the commit
  message.
---
 arch/x86/kernel/cpu/sgx/main.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 4507c2302348..7c55dcdb2b7c 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -739,12 +739,15 @@ int arch_memory_failure(unsigned long pfn, int flags)
 		 * decision but not simply kill it. This is quite useful for
 		 * virtualization case.
 		 */
-		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
+		if (page->owner) {
 			/*
 			 * The "owner" field is repurposed as the virtual address
 			 * of virtual EPC page.
 			 */
-			vaddr = (unsigned long)page->owner & PAGE_MASK;
+			if (page->flags & SGX_EPC_PAGE_KVM_GUEST)
+				vaddr = (unsigned long)page->owner & PAGE_MASK;
+			else
+				vaddr = (unsigned long)page->owner->desc & PAGE_MASK;
 			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
 					PAGE_SHIFT);
 			if (ret < 0)
-- 
2.25.1



* Re: [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior
  2022-06-22  9:37 [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Zhiquan Li
                   ` (2 preceding siblings ...)
  2022-06-22  9:37 ` [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case Zhiquan Li
@ 2022-06-26  6:04 ` Jarkko Sakkinen
  3 siblings, 0 replies; 11+ messages in thread
From: Jarkko Sakkinen @ 2022-06-26  6:04 UTC (permalink / raw)
  To: Zhiquan Li
  Cc: linux-sgx, tony.luck, dave.hansen, seanjc, kai.huang, fan.du,
	cathy.zhang

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko


* Re: [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page
  2022-06-22  9:37 ` [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page Zhiquan Li
@ 2022-07-21 16:42   ` Dave Hansen
  2022-07-21 23:27     ` Kai Huang
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2022-07-21 16:42 UTC (permalink / raw)
  To: Zhiquan Li, linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang

On 6/22/22 02:37, Zhiquan Li wrote:
> diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> index 6a77a14eee38..776ae5c1c032 100644
> --- a/arch/x86/kernel/cpu/sgx/virt.c
> +++ b/arch/x86/kernel/cpu/sgx/virt.c
> @@ -46,10 +46,12 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc,
>  	if (epc_page)
>  		return 0;
>  
> -	epc_page = sgx_alloc_epc_page(vepc, false);
> +	epc_page = sgx_alloc_epc_page((void *)addr, false);
>  	if (IS_ERR(epc_page))
>  		return PTR_ERR(epc_page);

Was the 'vepc' value simply unused before?


* Re: [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization
  2022-06-22  9:37 ` [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization Zhiquan Li
@ 2022-07-21 16:54   ` Dave Hansen
  2022-07-22 16:21     ` Zhiquan Li
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2022-07-21 16:54 UTC (permalink / raw)
  To: Zhiquan Li, linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang

On 6/22/22 02:37, Zhiquan Li wrote:
> When VM guest access a SGX EPC page with memory failure, current
> behavior will kill the guest, expected only kill the SGX application
> inside it.

Can we please clean this up?  This is generally readable, but _hard_ to
read.  Perhaps:

	Today, if a guest accesses an SGX EPC page with memory failure,
	the kernel will kill the entire guest.  This blast radius is
	too large.  It would be ideal to kill only the SGX application
	inside the guest.

> To fix it we send SIGBUS with code BUS_MCEERR_AR and some extra

	    ^ No "we's".

> information for hypervisor to inject #MC information to guest, which is
> helpful in SGX case.

To fix this, send a SIGBUS to host userspace (like QEMU) which can
follow up by injecting a #MC to the guest.

> The rest of things are guest side. Currently the hypervisor like Qemu
> already has mature facility to convert HVA to GPA and inject #MC to
> the guest OS.
> 
> Unlike host enclaves, virtual EPC instance cannot be shared by multiple
> VMs.  It is because how enclaves are created is totally up to the guest.
> Sharing virtual EPC instance will be very likely to unexpectedly break
> enclaves in all VMs.

I'm not sure why this is here or why it is important to this patch.

> SGX virtual EPC driver doesn't explicitly prevent virtual EPC instance
> being shared by multiple VMs via fork().  However KVM doesn't support
> running a VM across multiple mm structures, and the de facto userspace
> hypervisor (Qemu) doesn't use fork() to create a new VM, so in practice
> this should not happen.


> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index ab4ec54bbdd9..4507c2302348 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -715,6 +715,8 @@ int arch_memory_failure(unsigned long pfn, int flags)
>  	struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT);
>  	struct sgx_epc_section *section;
>  	struct sgx_numa_node *node;
> +	unsigned long vaddr;
> +	int ret;
>  
>  	/*
>  	 * mm/memory-failure.c calls this routine for all errors
> @@ -731,8 +733,26 @@ int arch_memory_failure(unsigned long pfn, int flags)
>  	 * error. The signal may help the task understand why the
>  	 * enclave is broken.
>  	 */
> -	if (flags & MF_ACTION_REQUIRED)
> -		force_sig(SIGBUS);
> +	if (flags & MF_ACTION_REQUIRED) {
> +		/*
> +		 * Provide extra info to the task so that it can make further
> +		 * decision but not simply kill it. This is quite useful for
> +		 * virtualization case.
> +		 */
> +		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
> +			/*
> +			 * The "owner" field is repurposed as the virtual address
> +			 * of virtual EPC page.
> +			 */
> +			vaddr = (unsigned long)page->owner & PAGE_MASK;

I really don't like repurposing page->owner like this.  It requires
casting on *both* sides of a type that we have full control over.

	struct sgx_epc_page {
	        unsigned int section;
	        u16 flags;
	        u16 poison;
		union {
		        struct sgx_encl_page *encl_owner;
			// Use when SGX_EPC_PAGE_KVM_GUEST
			// set in ->flags:
		        void __user *vepc_vaddr;
		};
	        struct list_head list;
	};

There is zero reason to play casting games instead of doing that ^
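
A self-contained userspace sketch of how such a union could be used
without casts follows. It is an illustration only: the demo_ names and
both helper functions are hypothetical, __user is dropped because it
is a kernel-only annotation, and the flag bit is copied from patch 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EPC_PAGE_KVM_GUEST	(1U << 2)	/* same bit as SGX_EPC_PAGE_KVM_GUEST */

struct demo_encl_page;	/* opaque; stands in for struct sgx_encl_page */

struct demo_epc_page {
	unsigned int section;
	uint16_t flags;
	uint16_t poison;
	union {
		struct demo_encl_page *encl_owner;
		/* Valid only when DEMO_EPC_PAGE_KVM_GUEST is set in flags. */
		void *vepc_vaddr;
	};
};

/* Allocation side: record the vaddr and set the flag together. */
void demo_epc_page_set_vepc(struct demo_epc_page *page, void *vaddr)
{
	assert(page != NULL);
	page->vepc_vaddr = vaddr;
	page->flags |= DEMO_EPC_PAGE_KVM_GUEST;
}

/* Failure side: return the vEPC address, or NULL for non-guest pages. */
void *demo_epc_page_vaddr(const struct demo_epc_page *page)
{
	assert(page != NULL);
	if (!(page->flags & DEMO_EPC_PAGE_KVM_GUEST))
		return NULL;
	return page->vepc_vaddr;
}
```

Keeping the flag update and the union write in one helper means the
two cannot drift apart, which is the property the flag check on the
memory-failure side relies on.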

> +			ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)vaddr,
> +					PAGE_SHIFT);
> +			if (ret < 0)
> +				pr_err("Memory failure: Error sending signal to %s:%d: %d\n",
> +					current->comm, current->pid, ret);
> +		} else
> +			force_sig(SIGBUS);
> +	}




* Re: [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case
  2022-06-22  9:37 ` [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case Zhiquan Li
@ 2022-07-21 16:57   ` Dave Hansen
  2022-07-22 17:28     ` Zhiquan Li
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2022-07-21 16:57 UTC (permalink / raw)
  To: Zhiquan Li, linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang

On 6/22/22 02:37, Zhiquan Li wrote:
> When the application accesses a SGX EPC page with memory failure, the
> task will receive a SIGBUS signal without any extra info, unless the
> EPC page has SGX_EPC_PAGE_KVM_GUEST flag. However, in some cases,
> we only use SGX in sub-task and we don't expect the entire task group
> be killed due to a SGX EPC page for a sub-task has memory failure.
> 
> To fix it, we extend the solution for normal case. That is, the SGX
> regular EPC page with memory failure will trigger a SIGBUS signal with
> code BUS_MCEERR_AR and additional info, so that the user has opportunity
> to make further decision.
> 
> Suppose an enclave is shared by multiple processes, when an enclave page
> triggers a machine check, the enclave will be disabled so that it
> couldn't be entered again. Killing other processes with the same enclave
> mapped would perhaps be overkill, but they are going to find that the
> enclave is "dead" next time they try to use it. Thanks for Jarkko's head
> up and Tony's clarification on this point.
> 
> Our intension is to provide additional info so that the application has
> more choices. Current behavior looks gently, and we don't want to change
> it.
> 
> Signed-off-by: Zhiquan Li <zhiquan1.li@intel.com>

I honestly have zero idea what this patch is doing.


* Re: [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page
  2022-07-21 16:42   ` Dave Hansen
@ 2022-07-21 23:27     ` Kai Huang
  0 siblings, 0 replies; 11+ messages in thread
From: Kai Huang @ 2022-07-21 23:27 UTC (permalink / raw)
  To: Dave Hansen, Zhiquan Li, linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, fan.du, cathy.zhang

On Thu, 2022-07-21 at 09:42 -0700, Dave Hansen wrote:
> On 6/22/22 02:37, Zhiquan Li wrote:
> > diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
> > index 6a77a14eee38..776ae5c1c032 100644
> > --- a/arch/x86/kernel/cpu/sgx/virt.c
> > +++ b/arch/x86/kernel/cpu/sgx/virt.c
> > @@ -46,10 +46,12 @@ static int __sgx_vepc_fault(struct sgx_vepc *vepc,
> >  	if (epc_page)
> >  		return 0;
> >  
> > -	epc_page = sgx_alloc_epc_page(vepc, false);
> > +	epc_page = sgx_alloc_epc_page((void *)addr, false);
> >  	if (IS_ERR(epc_page))
> >  		return PTR_ERR(epc_page);
> 
> Was the 'vepc' value simply unused before?

Yes, for EPC pages assigned to a KVM guest it was unused before this
series.

-- 
Thanks,
-Kai




* Re: [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization
  2022-07-21 16:54   ` Dave Hansen
@ 2022-07-22 16:21     ` Zhiquan Li
  0 siblings, 0 replies; 11+ messages in thread
From: Zhiquan Li @ 2022-07-22 16:21 UTC (permalink / raw)
  To: Dave Hansen, linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang


On 2022/7/22 00:54, Dave Hansen wrote:
> On 6/22/22 02:37, Zhiquan Li wrote:
>> When VM guest access a SGX EPC page with memory failure, current
>> behavior will kill the guest, expected only kill the SGX application
>> inside it.
> Can we please clean this up?  This is generally readable, but _hard_ to
> read.  Perhaps:
> 
> 	Today, if a guest accesses an SGX EPC page with memory failure,
> 	the kernel will behavior will kill the entire guest.  This blast
> 	radius is too large.  It would be idea to kill only the SGX
> 	application inside the guest.
> 
>> To fix it we send SIGBUS with code BUS_MCEERR_AR and some extra
> 	    ^ No "we's".
> 
>> information for hypervisor to inject #MC information to guest, which is
>> helpful in SGX case.
> To fix this, send a SIGBUS to host userspace (like QEMU) which can
> follow up by injecting a #MC to the guest.
> 
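For illustration, the host-userspace side of the flow described above could be sketched roughly as follows. This is a minimal sketch, not code from this series or from QEMU: `struct mce_report` and `decode_sigbus()` are hypothetical names, and the HVA-to-GPA translation and the actual #MC injection are left out because they are hypervisor-specific. It only relies on the standard Linux `siginfo_t` fields `si_signo`, `si_code` and `si_addr`.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for what a SIGBUS handler would pass on. */
struct mce_report {
	bool action_required;	/* true for BUS_MCEERR_AR */
	uintptr_t hva;		/* host virtual address taken from si_addr */
};

/*
 * Decode a SIGBUS siginfo into a report.  Returns false when the signal
 * is not a machine-check style SIGBUS, so the caller can fall back to
 * its default handling.  A real hypervisor would next translate
 * report->hva to a guest physical address and inject #MC.
 */
static bool decode_sigbus(const siginfo_t *info, struct mce_report *out)
{
	assert(out != NULL);

	if (info == NULL || info->si_signo != SIGBUS)
		return false;
	if (info->si_code != BUS_MCEERR_AR && info->si_code != BUS_MCEERR_AO)
		return false;

	out->action_required = (info->si_code == BUS_MCEERR_AR);
	out->hva = (uintptr_t)info->si_addr;
	return true;
}
```

A handler installed with sigaction() and SA_SIGINFO would call decode_sigbus() on the siginfo_t it receives. A plain force_sig(SIGBUS) carries no BUS_MCEERR_* code, so the handler would have nothing to forward to the guest.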
>> The rest of things are guest side. Currently the hypervisor like Qemu
>> already has mature facility to convert HVA to GPA and inject #MC to
>> the guest OS.
>>
>> Unlike host enclaves, virtual EPC instance cannot be shared by multiple
>> VMs.  It is because how enclaves are created is totally up to the guest.
>> Sharing virtual EPC instance will be very likely to unexpectedly break
>> enclaves in all VMs.
> I'm not sure why this is here or why it is important to this patch.
> 
>> SGX virtual EPC driver doesn't explicitly prevent virtual EPC instance
>> being shared by multiple VMs via fork().  However KVM doesn't support
>> running a VM across multiple mm structures, and the de facto userspace
>> hypervisor (Qemu) doesn't use fork() to create a new VM, so in practice
>> this should not happen.
> 
>> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
>> index ab4ec54bbdd9..4507c2302348 100644
>> --- a/arch/x86/kernel/cpu/sgx/main.c
>> +++ b/arch/x86/kernel/cpu/sgx/main.c
>> @@ -715,6 +715,8 @@ int arch_memory_failure(unsigned long pfn, int flags)
>>  	struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT);
>>  	struct sgx_epc_section *section;
>>  	struct sgx_numa_node *node;
>> +	unsigned long vaddr;
>> +	int ret;
>>  
>>  	/*
>>  	 * mm/memory-failure.c calls this routine for all errors
>> @@ -731,8 +733,26 @@ int arch_memory_failure(unsigned long pfn, int flags)
>>  	 * error. The signal may help the task understand why the
>>  	 * enclave is broken.
>>  	 */
>> -	if (flags & MF_ACTION_REQUIRED)
>> -		force_sig(SIGBUS);
>> +	if (flags & MF_ACTION_REQUIRED) {
>> +		/*
>> +		 * Provide extra info to the task so that it can make further
>> +		 * decision but not simply kill it. This is quite useful for
>> +		 * virtualization case.
>> +		 */
>> +		if (page->flags & SGX_EPC_PAGE_KVM_GUEST) {
>> +			/*
>> +			 * The "owner" field is repurposed as the virtual address
>> +			 * of virtual EPC page.
>> +			 */
>> +			vaddr = (unsigned long)page->owner & PAGE_MASK;
> I really don't like repurposing page->owner like this.  It requires
> casting on *both* sides of a type that we have full control over.
> 
> 	struct sgx_epc_page {
> 	        unsigned int section;
> 	        u16 flags;
> 	        u16 poison;
> 		union {
> 		        struct sgx_encl_page *encl_owner;
> 			// Use when SGX_EPC_PAGE_KVM_GUEST
> 			// set in ->flags:
> 		        void __user *vepc_vaddr;
> 		};
> 	        struct list_head list;
> 	};
> 
> There is zero reason to play casting games instead of doing that ^
> 
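To make the suggestion above concrete, here is a small userspace sketch of how the union could be consumed. It is only an illustration under stated assumptions: the flag value, the 4 KiB `PAGE_MASK`, the stand-in `struct list_head`, and the helper `epc_page_fault_vaddr()` are hypothetical, and the kernel's `__user` annotation is dropped so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the kernel defines its own. */
#define SGX_EPC_PAGE_KVM_GUEST	0x1
#define PAGE_MASK		(~((uintptr_t)4096 - 1))

struct sgx_encl_page;				/* opaque in this sketch */
struct list_head { struct list_head *next, *prev; };

struct sgx_epc_page {
	unsigned int section;
	uint16_t flags;
	uint16_t poison;
	union {
		struct sgx_encl_page *encl_owner;
		/* Valid only when SGX_EPC_PAGE_KVM_GUEST is set in ->flags */
		void *vepc_vaddr;
	};
	struct list_head list;
};

/*
 * Hypothetical helper: return the page-aligned virtual address of a
 * virtual EPC page, or 0 if the page does not belong to a KVM guest.
 * The union member is read with its own type; the only conversion left
 * is pointer-to-integer for the mask.
 */
static uintptr_t epc_page_fault_vaddr(const struct sgx_epc_page *page)
{
	assert(page != NULL);

	if (!(page->flags & SGX_EPC_PAGE_KVM_GUEST))
		return 0;
	return (uintptr_t)page->vepc_vaddr & PAGE_MASK;
}
```

With this layout the allocation side can store the address directly in `vepc_vaddr`, and the flag check documents which union member is live.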

Many thanks for your review, Dave.
I will send a V6 patch set as per your suggestion.

Best Regards,
Zhiquan


* Re: [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case
  2022-07-21 16:57   ` Dave Hansen
@ 2022-07-22 17:28     ` Zhiquan Li
  0 siblings, 0 replies; 11+ messages in thread
From: Zhiquan Li @ 2022-07-22 17:28 UTC (permalink / raw)
  To: Dave Hansen, linux-sgx, tony.luck, jarkko, dave.hansen
  Cc: seanjc, kai.huang, fan.du, cathy.zhang


On 2022/7/22 00:57, Dave Hansen wrote:
> I honestly have zero idea what this patch is doing.

OK, let's drop it unless we have a better reason for it in the future.

Best Regards,
Zhiquan


Thread overview: 11+ messages
2022-06-22  9:37 [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Zhiquan Li
2022-06-22  9:37 ` [PATCH v5 1/3] x86/sgx: Repurpose the owner field as the virtual address of virtual EPC page Zhiquan Li
2022-07-21 16:42   ` Dave Hansen
2022-07-21 23:27     ` Kai Huang
2022-06-22  9:37 ` [PATCH v5 2/3] x86/sgx: Fine grained SGX MCA behavior for virtualization Zhiquan Li
2022-07-21 16:54   ` Dave Hansen
2022-07-22 16:21     ` Zhiquan Li
2022-06-22  9:37 ` [PATCH v5 3/3] x86/sgx: Fine grained SGX MCA behavior for normal case Zhiquan Li
2022-07-21 16:57   ` Dave Hansen
2022-07-22 17:28     ` Zhiquan Li
2022-06-26  6:04 ` [PATCH v5 0/3] x86/sgx: fine grained SGX MCA behavior Jarkko Sakkinen
