linux-sgx.vger.kernel.org archive mirror
* [RFC PATCH v2 0/4]  x86/sgx: implement support for MADV_WILLNEED
@ 2022-10-27 19:45 Haitao Huang
  2022-10-27 19:45 ` [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page Haitao Huang
  0 siblings, 1 reply; 8+ messages in thread
From: Haitao Huang @ 2022-10-27 19:45 UTC (permalink / raw)
  To: linux-sgx, jarkko, dave.hansen, reinette.chatre, vijay.dhanraj

V1: https://lore.kernel.org/linux-sgx/20221019191413.48752-1-haitao.huang@linux.intel.com/T/#t
Changes since V1:
- Separate patch for exporting sgx_encl_eaug_page
- Move return code changes for sgx_encl_eaug_page to the same patch
  implementing sgx_fadvise
- Small improvements to commit messages and the cover letter

Hi Everybody,

The current SGX2 (EDMM) implementation in the kernel only adds an EPC
page when a page fault is triggered on an address that has no EPC page
allocated yet. Although this is adequate for smaller allocations or
ranges with sparse access patterns, it is inefficient for cases in
which a large number of EPC pages need to be added and accessed
immediately afterwards.

We previously attempted [1] to address this issue by implementing
support for the semantics of the MAP_POPULATE flag passed to mmap().
However, some mm maintainers had concerns about adding a new callback
to fops [2].

This series adopts the MADV_WILLNEED alternative suggested by Dave in
previous discussions [3]. The SGX driver implements fops->fadvise() so
that user space can use madvise(..., MADV_WILLNEED) to instruct the
kernel to EAUG pages for a given range as soon as possible.

Compared to the MAP_POPULATE approach, this alternative requires an
additional madvise() call after mmap() from user space, but it does
not need any kernel changes outside the SGX driver. The separate
madvise() call also gives user space the flexibility to specify a
subrange of an enclosing VMA to EAUG.
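
For illustration, here is a minimal user-space sketch of that flow. It
mirrors what the selftest in patch 4/4 does; encl_fd, encl_base,
heap_offset and len are hypothetical placeholders for an already built
and initialized enclave:

  #include <err.h>
  #include <stdint.h>
  #include <sys/mman.h>

  /*
   * Map a dynamic range of an initialized enclave and ask the kernel to
   * EAUG it up front instead of relying purely on demand paging.
   */
  static void *map_and_prefault(int encl_fd, uint64_t encl_base,
                                uint64_t heap_offset, size_t len)
  {
          void *addr = mmap((void *)(uintptr_t)(encl_base + heap_offset),
                            len, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_FIXED, encl_fd, 0);
          if (addr == MAP_FAILED)
                  err(1, "mmap");

          /*
           * Best effort: any page not added here is still EAUG'ed lazily
           * on its first access, and EACCEPT from inside the enclave is
           * still required either way.
           */
          if (madvise(addr, len, MADV_WILLNEED))
                  err(1, "madvise");

          return addr;
  }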

The core implementation is in the second patch; the first patch only
exports the function that handles EAUG on a page fault so it can be
reused.

The last two patches add a microbenchmark to the SGX selftests to
measure the performance difference. The following speedups for various
allocation sizes were observed when I ran it on a platform with 4G of
EPC. They indicate that the change roughly halves the run time until
EPC swapping is activated, at which point EAUG for madvise stops.
-------------------------
  Alloc. size: Speedup
-------------------------
      1 page : 75%
      2 pages: 48%
      4 pages: 55%
      8 pages: 58%
     16 pages: 62%
     32 pages: 62%
     64 pages: 62%
    128 pages: 62%
    256 pages: 73%
    512 pages: 62%
   1024 pages: 62%
   2048 pages: 62%
   4096 pages: 61%
   8192 pages: 61%
  16384 pages: 61%
  32768 pages: 71%
  65536 pages: 61%
 131072 pages: 62%
 262144 pages: 62%
 524288 pages: 62%
1048576 pages: 55%
2097152 pages: 19%
-------------------------
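
For reference, the speedup percentages above are presumably computed
the same way as in the selftest added by patch 4/4, i.e. the relative
reduction in EACCEPT wall-clock time when the range is first touched
via madvise():

  /*
   * t_fault:   time to EACCEPT the range relying on #PF-driven EAUG only
   * t_madvise: time for madvise(MADV_WILLNEED) plus EACCEPT of the range
   */
  static int speedup_percent(double t_fault, double t_madvise)
  {
          return (int)((t_fault - t_madvise) / t_fault * 100);
  }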

Thank you very much for your attention and any comments/feedback.

Haitao

[1] https://lore.kernel.org/all/20220308112833.262805-1-jarkko@kernel.org/
[2] https://lore.kernel.org/linux-sgx/20220306021534.83553-1-jarkko@kernel.org/
[3] https://lore.kernel.org/linux-sgx/c3083144-bfc1-3260-164c-e59b2d110df8@intel.com/

Haitao Huang (4):
  x86/sgx: Export sgx_encl_eaug_page
  x86/sgx: Implement support for MADV_WILLNEED
  selftests/sgx: add len field for EACCEPT op
  selftests/sgx: Add test for madvise(..., WILLNEED)

 arch/x86/kernel/cpu/sgx/driver.c        |  81 ++++++++++++
 arch/x86/kernel/cpu/sgx/encl.c          |  46 ++++---
 arch/x86/kernel/cpu/sgx/encl.h          |   3 +-
 tools/testing/selftests/sgx/defines.h   |   1 +
 tools/testing/selftests/sgx/main.c      | 167 ++++++++++++++++++++++++
 tools/testing/selftests/sgx/test_encl.c |  20 ++-
 6 files changed, 295 insertions(+), 23 deletions(-)

-- 
2.25.1


* [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page
  2022-10-27 19:45 [RFC PATCH v2 0/4] x86/sgx: implement support for MADV_WILLNEED Haitao Huang
@ 2022-10-27 19:45 ` Haitao Huang
  2022-10-27 19:45   ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Haitao Huang
  2022-11-01  1:12   ` [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page Jarkko Sakkinen
  0 siblings, 2 replies; 8+ messages in thread
From: Haitao Huang @ 2022-10-27 19:45 UTC (permalink / raw)
  To: linux-sgx, jarkko, dave.hansen, reinette.chatre, vijay.dhanraj

This function will be called by both the page fault handler
and the fops->fadvise callback.

Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 4 ++--
 arch/x86/kernel/cpu/sgx/encl.h | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 8bdeae2fc309..1abc5e7f2660 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -308,8 +308,8 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
  * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
  * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
  */
-static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
-				     struct sgx_encl *encl, unsigned long addr)
+vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
+			      struct sgx_encl *encl, unsigned long addr)
 {
 	vm_fault_t vmret = VM_FAULT_SIGBUS;
 	struct sgx_pageinfo pginfo = {0};
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index a65a952116fd..500437981161 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -127,5 +127,6 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 					 unsigned long addr);
 struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim);
 void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
-
+vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
+			      struct sgx_encl *encl, unsigned long addr)
 #endif /* _X86_ENCL_H */
-- 
2.25.1


* [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED
  2022-10-27 19:45 ` [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page Haitao Huang
@ 2022-10-27 19:45   ` Haitao Huang
  2022-10-27 19:45     ` [RFC PATCH v2 3/4] selftests/sgx: add len field for EACCEPT op Haitao Huang
  2022-11-01  1:21     ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Jarkko Sakkinen
  2022-11-01  1:12   ` [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page Jarkko Sakkinen
  1 sibling, 2 replies; 8+ messages in thread
From: Haitao Huang @ 2022-10-27 19:45 UTC (permalink / raw)
  To: linux-sgx, jarkko, dave.hansen, reinette.chatre, vijay.dhanraj

Support madvise(..., MADV_WILLNEED) by adding EPC pages with EAUG in
the newly added fops->fadvise() callback implementation, sgx_fadvise().

Change the return type and values of the sgx_encl_eaug_page function
so that more specific error codes are returned for different treatment
by the page fault handler and the fadvise callback.
On any error, sgx_fadvise() discontinues further operations and
returns normally. The page fault handler allows the fault to be
retried by returning VM_FAULT_NOPAGE when sgx_encl_eaug_page returns
-EBUSY.

Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
 arch/x86/kernel/cpu/sgx/driver.c | 81 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.c   | 46 +++++++++++-------
 arch/x86/kernel/cpu/sgx/encl.h   |  4 +-
 3 files changed, 113 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index aa9b8b868867..54b24897605b 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -2,6 +2,7 @@
 /*  Copyright(c) 2016-20 Intel Corporation. */
 
 #include <linux/acpi.h>
+#include <linux/fadvise.h>
 #include <linux/miscdevice.h>
 #include <linux/mman.h>
 #include <linux/security.h>
@@ -9,6 +10,7 @@
 #include <asm/traps.h>
 #include "driver.h"
 #include "encl.h"
+#include "encls.h"
 
 u64 sgx_attributes_reserved_mask;
 u64 sgx_xfrm_reserved_mask = ~0x3;
@@ -97,10 +99,88 @@ static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
 	vma->vm_ops = &sgx_vm_ops;
 	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
 	vma->vm_private_data = encl;
+	/* Anchor vm_pgoff to the enclave base.
+	 * So offset passed back to sgx_fadvise hook
+	 * is relative to the enclave base
+	 */
+	vma->vm_pgoff = (vma->vm_start - encl->base) >> PAGE_SHIFT;
 
 	return 0;
 }
 
+/*
+ * Add new pages to the enclave sequentially with ENCLS[EAUG] for the WILLNEED advice.
+ * Only do this to existing VMAs in the same enclave and reject the request.
+ * Returns:	0 if EAUG done with best effort, -EINVAL if any sub-range given
+ * is not in the enclave, or enclave is not initialized..
+ */
+static int sgx_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
+{
+	struct sgx_encl *encl = file->private_data;
+	unsigned long start, end, pos;
+	int ret = -EINVAL;
+	struct vm_area_struct *vma = NULL;
+
+	/* Only support WILLNEED */
+	if (advice != POSIX_FADV_WILLNEED)
+		return -EINVAL;
+	if (!encl)
+		return -EINVAL;
+	if (!cpu_feature_enabled(X86_FEATURE_SGX2))
+		return -EINVAL;
+
+	if (offset + len < offset)
+		return -EINVAL;
+	if (encl->base + offset < encl->base)
+		return -EINVAL;
+	start  = offset + encl->base;
+	end = start + len;
+	if (end < start)
+		return -EINVAL;
+	if (end > encl->base + encl->size)
+		return -EINVAL;
+
+	/* EAUG works only for initialized enclaves. */
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return -EINVAL;
+
+	mmap_read_lock(current->mm);
+
+	vma = find_vma(current->mm, start);
+	if (!vma)
+		goto unlock;
+	if (vma->vm_private_data != encl)
+		goto unlock;
+
+	pos = start;
+	if (pos < vma->vm_start || end > vma->vm_end) {
+		/* Don't allow any gaps */
+		goto unlock;
+	}
+	/* Here: vm_start <= pos < end <= vm_end */
+	while (pos < end) {
+		if (xa_load(&encl->page_array, PFN_DOWN(pos)))
+			continue;
+		if (signal_pending(current)) {
+			if (pos == start)
+				ret = -ERESTARTSYS;
+			else
+				ret = -EINTR;
+			goto unlock;
+		}
+		ret = sgx_encl_eaug_page(vma, encl, pos);
+		/* It's OK to not finish */
+		if (ret)
+			break;
+		pos = pos + PAGE_SIZE;
+		cond_resched();
+	}
+	ret = 0;
+unlock:
+	mmap_read_unlock(current->mm);
+	return ret;
+}
+
 static unsigned long sgx_get_unmapped_area(struct file *file,
 					   unsigned long addr,
 					   unsigned long len,
@@ -133,6 +213,7 @@ static const struct file_operations sgx_encl_fops = {
 	.compat_ioctl		= sgx_compat_ioctl,
 #endif
 	.mmap			= sgx_mmap,
+	.fadvise		= sgx_fadvise,
 	.get_unmapped_area	= sgx_get_unmapped_area,
 };
 
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 1abc5e7f2660..c57e60d5a0aa 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -305,11 +305,11 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
  * on a SGX2 system then the EPC can be added dynamically via the SGX2
  * ENCLS[EAUG] instruction.
  *
- * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
- * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
+ * Returns: 0 when PTE was installed successfully, -EBUSY for waiting on
+ * reclaimer to free EPC, -ENOMEM for out of RAM, -EFAULT as error otherwise.
  */
-vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
-			      struct sgx_encl *encl, unsigned long addr)
+int sgx_encl_eaug_page(struct vm_area_struct *vma,
+		       struct sgx_encl *encl, unsigned long addr)
 {
 	vm_fault_t vmret = VM_FAULT_SIGBUS;
 	struct sgx_pageinfo pginfo = {0};
@@ -318,10 +318,10 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 	struct sgx_va_page *va_page;
 	unsigned long phys_addr;
 	u64 secinfo_flags;
-	int ret;
+	int ret = -EFAULT;
 
 	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
-		return VM_FAULT_SIGBUS;
+		return -EFAULT;
 
 	/*
 	 * Ignore internal permission checking for dynamically added pages.
@@ -332,21 +332,21 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 	secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
 	encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
 	if (IS_ERR(encl_page))
-		return VM_FAULT_OOM;
+		return -ENOMEM;
 
 	mutex_lock(&encl->lock);
 
 	epc_page = sgx_alloc_epc_page(encl_page, false);
 	if (IS_ERR(epc_page)) {
 		if (PTR_ERR(epc_page) == -EBUSY)
-			vmret =  VM_FAULT_NOPAGE;
+			ret =  -EBUSY;
 		goto err_out_unlock;
 	}
 
 	va_page = sgx_encl_grow(encl, false);
 	if (IS_ERR(va_page)) {
 		if (PTR_ERR(va_page) == -EBUSY)
-			vmret = VM_FAULT_NOPAGE;
+			ret = -EBUSY;
 		goto err_out_epc;
 	}
 
@@ -359,16 +359,20 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 	 * If ret == -EBUSY then page was created in another flow while
 	 * running without encl->lock
 	 */
-	if (ret)
+	if (ret) {
+		ret = -EFAULT;
 		goto err_out_shrink;
+	}
 
 	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
 	pginfo.addr = encl_page->desc & PAGE_MASK;
 	pginfo.metadata = 0;
 
 	ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
-	if (ret)
+	if (ret) {
+		ret = -EFAULT;
 		goto err_out;
+	}
 
 	encl_page->encl = encl;
 	encl_page->epc_page = epc_page;
@@ -385,10 +389,10 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 	vmret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
 	if (vmret != VM_FAULT_NOPAGE) {
 		mutex_unlock(&encl->lock);
-		return VM_FAULT_SIGBUS;
+		return -EFAULT;
 	}
 	mutex_unlock(&encl->lock);
-	return VM_FAULT_NOPAGE;
+	return 0;
 
 err_out:
 	xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
@@ -401,7 +405,7 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 	mutex_unlock(&encl->lock);
 	kfree(encl_page);
 
-	return vmret;
+	return ret;
 }
 
 static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
@@ -431,8 +435,18 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	 * enclave that will be checked for right away.
 	 */
 	if (cpu_feature_enabled(X86_FEATURE_SGX2) &&
-	    (!xa_load(&encl->page_array, PFN_DOWN(addr))))
-		return sgx_encl_eaug_page(vma, encl, addr);
+	    (!xa_load(&encl->page_array, PFN_DOWN(addr)))) {
+		switch (sgx_encl_eaug_page(vma, encl, addr)) {
+		case 0:
+		case -EBUSY:
+			return VM_FAULT_NOPAGE;
+		case -ENOMEM:
+			return VM_FAULT_OOM;
+		case -EFAULT:
+		default:
+			return VM_FAULT_SIGBUS;
+		}
+	}
 
 	mutex_lock(&encl->lock);
 
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 500437981161..36059d35e1bc 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -127,6 +127,6 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 					 unsigned long addr);
 struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim);
 void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
-vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
-			      struct sgx_encl *encl, unsigned long addr)
+int sgx_encl_eaug_page(struct vm_area_struct *vma,
+		       struct sgx_encl *encl, unsigned long addr);
 #endif /* _X86_ENCL_H */
-- 
2.25.1


* [RFC PATCH v2 3/4] selftests/sgx: add len field for EACCEPT op
  2022-10-27 19:45   ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Haitao Huang
@ 2022-10-27 19:45     ` Haitao Huang
  2022-10-27 19:45       ` [RFC PATCH v2 4/4] selftests/sgx: Add test for madvise(..., WILLNEED) Haitao Huang
  2022-11-01  1:21     ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Jarkko Sakkinen
  1 sibling, 1 reply; 8+ messages in thread
From: Haitao Huang @ 2022-10-27 19:45 UTC (permalink / raw)
  To: linux-sgx, jarkko, dave.hansen, reinette.chatre, vijay.dhanraj

This allows EACCEPT'ing multiple pages inside the enclave without an
EEXIT, in preparation for testing MADV_WILLNEED on ranges larger than
a single page.

Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
 tools/testing/selftests/sgx/defines.h   |  1 +
 tools/testing/selftests/sgx/test_encl.c | 20 ++++++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
index d8587c971941..8578e773d3d8 100644
--- a/tools/testing/selftests/sgx/defines.h
+++ b/tools/testing/selftests/sgx/defines.h
@@ -60,6 +60,7 @@ struct encl_op_eaccept {
 	struct encl_op_header header;
 	uint64_t epc_addr;
 	uint64_t flags;
+	uint64_t len;
 	uint64_t ret;
 };
 
diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c
index c0d6397295e3..fc797385200b 100644
--- a/tools/testing/selftests/sgx/test_encl.c
+++ b/tools/testing/selftests/sgx/test_encl.c
@@ -35,14 +35,22 @@ static void do_encl_eaccept(void *_op)
 	struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
 	struct encl_op_eaccept *op = _op;
 	int rax;
+	if (op->len == 0)
+		op->len = 4096;
 
 	secinfo.flags = op->flags;
-
-	asm volatile(".byte 0x0f, 0x01, 0xd7"
-				: "=a" (rax)
-				: "a" (EACCEPT),
-				  "b" (&secinfo),
-				  "c" (op->epc_addr));
+	for (uint64_t addr = op->epc_addr;
+			addr < op->epc_addr + op->len; addr += 4096) {
+		asm volatile(".byte 0x0f, 0x01, 0xd7"
+					: "=a" (rax)
+					: "a" (EACCEPT),
+					  "b" (&secinfo),
+					  "c" (addr));
+		if (rax) {
+			op->ret = rax;
+			return;
+		}
+	}
 
 	op->ret = rax;
 }
-- 
2.25.1


* [RFC PATCH v2 4/4] selftests/sgx: Add test for madvise(..., WILLNEED)
  2022-10-27 19:45     ` [RFC PATCH v2 3/4] selftests/sgx: add len field for EACCEPT op Haitao Huang
@ 2022-10-27 19:45       ` Haitao Huang
  0 siblings, 0 replies; 8+ messages in thread
From: Haitao Huang @ 2022-10-27 19:45 UTC (permalink / raw)
  To: linux-sgx, jarkko, dave.hansen, reinette.chatre, vijay.dhanraj

Measure and compare the run time of EAUG'ing different numbers of EPC
pages with and without a madvise(..., MADV_WILLNEED) call.

Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
 tools/testing/selftests/sgx/main.c | 167 +++++++++++++++++++++++++++++
 1 file changed, 167 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 48976bb7bd79..7b5f6705716d 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -10,6 +10,7 @@
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
+#include <time.h>
 #include <unistd.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
@@ -1358,6 +1359,172 @@ TEST_F_TIMEOUT(enclave, augment_via_eaccept_long, TIMEOUT_DEFAULT)
 	munmap(addr, ENCL_DYNAMIC_SIZE_LONG);
 }
 
+static int eaccept_range(struct _test_data_enclave *self, void *addr,
+			 unsigned long size, uint64_t flags,
+						struct __test_metadata *_metadata)
+{
+	struct encl_op_eaccept eaccept_op;
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/*
+	 * Run EACCEPT on every page to trigger the #PF->EAUG->EACCEPT(again
+	 * without a #PF). All should be transparent to userspace.
+	 */
+	eaccept_op.flags = flags;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+	eaccept_op.len = size;
+	eaccept_op.epc_addr = (uint64_t)(addr);
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	ASSERT_EQ(eaccept_op.ret, 0);
+	ASSERT_EQ(self->run.function, EEXIT);
+
+	return 0;
+}
+
+static int trim_remove_range(struct _test_data_enclave *self, void *addr,
+			     unsigned long size, struct __test_metadata *_metadata)
+{
+	int ret, errno_save;
+	struct sgx_enclave_remove_pages remove_ioc;
+	struct sgx_enclave_modify_types modt_ioc;
+	unsigned long offset;
+	unsigned long count;
+
+	if ((uint64_t)addr <= self->encl.encl_base)
+		return -1;
+	offset = (uint64_t)addr - self->encl.encl_base;
+
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	modt_ioc.offset = offset;
+	modt_ioc.length = size;
+	modt_ioc.page_type = SGX_PAGE_TYPE_TRIM;
+	count = 0;
+	do {
+		ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
+
+		errno_save = ret == -1 ? errno : 0;
+		if (errno_save != EAGAIN)
+			break;
+		EXPECT_EQ(modt_ioc.result, 0);
+
+		count += modt_ioc.count;
+		modt_ioc.offset += modt_ioc.count;
+		modt_ioc.length -= modt_ioc.count;
+		modt_ioc.result = 0;
+		modt_ioc.count = 0;
+	} while (modt_ioc.length != 0);
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	count += modt_ioc.count;
+	EXPECT_EQ(count, size);
+
+	EXPECT_EQ(eaccept_range(self, addr, size,
+				SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED,
+				   _metadata), 0);
+
+	/* Complete page removal. */
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+	remove_ioc.offset = offset;
+	remove_ioc.length = size;
+	count = 0;
+	do {
+		ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_REMOVE_PAGES, &remove_ioc);
+
+		errno_save = ret == -1 ? errno : 0;
+		if (errno_save != EAGAIN)
+			break;
+
+		count += remove_ioc.count;
+		remove_ioc.offset += remove_ioc.count;
+		remove_ioc.length -= remove_ioc.count;
+		remove_ioc.count = 0;
+	} while (remove_ioc.length != 0);
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	count += remove_ioc.count;
+	EXPECT_EQ(count, size);
+
+	return 0;
+}
+
+/*
+ * Compare performance with and without madvise call before EACCEPT'ing
+ * different size of regions.
+ */
+TEST_F_TIMEOUT(enclave, augment_via_madvise, TIMEOUT_DEFAULT)
+{
+	unsigned long advise_size = PAGE_SIZE;
+	unsigned long max_advise_size = get_total_epc_mem() * 3UL;
+	int speed_up_percent;
+	clock_t start;
+	double time_used1, time_used2;
+	size_t total_size = 0;
+	unsigned long i;
+	void *addr;
+
+	if (!sgx2_supported())
+		SKIP(return, "SGX2 not supported");
+
+	ASSERT_TRUE(setup_test_encl_dynamic(ENCL_HEAP_SIZE_DEFAULT,
+					    max_advise_size, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	for (i = 0; i < self->encl.nr_segments; i++) {
+		struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+		total_size += seg->size;
+	}
+
+	for (i = 1; i < 52 && advise_size < max_advise_size; i++) {
+		addr = mmap((void *)self->encl.encl_base + total_size, advise_size,
+			    PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
+					self->encl.fd, 0);
+		EXPECT_NE(addr, MAP_FAILED);
+
+		start = clock();
+		EXPECT_EQ(eaccept_range(self, addr, advise_size,
+					SGX_SECINFO_R | SGX_SECINFO_W
+								| SGX_SECINFO_REG
+								| SGX_SECINFO_PENDING,
+								_metadata), 0);
+		time_used1 = (double)clock() - start;
+
+		EXPECT_EQ(trim_remove_range(self, addr, advise_size, _metadata), 0);
+
+		start = clock();
+		EXPECT_EQ(madvise(addr, advise_size, MADV_WILLNEED), 0);
+		EXPECT_EQ(eaccept_range(self, addr, advise_size,
+					SGX_SECINFO_R | SGX_SECINFO_W
+								| SGX_SECINFO_REG
+								| SGX_SECINFO_PENDING,
+								_metadata), 0);
+		time_used2 = (double)clock() - start;
+
+		speed_up_percent = (int)((time_used1 - time_used2) / time_used1 * 100);
+		TH_LOG("madvise speed up for eaug'ing %10ld pages: %d%%",
+		       advise_size / PAGE_SIZE, speed_up_percent);
+		EXPECT_GE(speed_up_percent, 0);
+		EXPECT_EQ(trim_remove_range(self, addr, advise_size, _metadata), 0);
+	    munmap(addr, advise_size);
+		advise_size = (advise_size << 1UL);
+	}
+	encl_delete(&self->encl);
+}
+
 /*
  * SGX2 page type modification test in two phases:
  * Phase 1:
-- 
2.25.1


* Re: [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page
  2022-10-27 19:45 ` [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page Haitao Huang
  2022-10-27 19:45   ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Haitao Huang
@ 2022-11-01  1:12   ` Jarkko Sakkinen
  1 sibling, 0 replies; 8+ messages in thread
From: Jarkko Sakkinen @ 2022-11-01  1:12 UTC (permalink / raw)
  To: Haitao Huang; +Cc: linux-sgx, dave.hansen, reinette.chatre, vijay.dhanraj

On Thu, Oct 27, 2022 at 12:45:29PM -0700, Haitao Huang wrote:
> This function will be called by both the page fault handler
> and the fops->fadvise callback.
> 
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/encl.c | 4 ++--
>  arch/x86/kernel/cpu/sgx/encl.h | 3 ++-
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 8bdeae2fc309..1abc5e7f2660 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -308,8 +308,8 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>   * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
>   * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
>   */
> -static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> -				     struct sgx_encl *encl, unsigned long addr)
> +vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> +			      struct sgx_encl *encl, unsigned long addr)
>  {
>  	vm_fault_t vmret = VM_FAULT_SIGBUS;
>  	struct sgx_pageinfo pginfo = {0};
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index a65a952116fd..500437981161 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -127,5 +127,6 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>  					 unsigned long addr);
>  struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim);
>  void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
> -

Spurious change.

> +vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> +			      struct sgx_encl *encl, unsigned long addr)
>  #endif /* _X86_ENCL_H */
> -- 
> 2.25.1
> 

BR, Jarkko

* Re: [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED
  2022-10-27 19:45   ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Haitao Huang
  2022-10-27 19:45     ` [RFC PATCH v2 3/4] selftests/sgx: add len field for EACCEPT op Haitao Huang
@ 2022-11-01  1:21     ` Jarkko Sakkinen
  2022-11-01 18:24       ` Haitao Huang
  1 sibling, 1 reply; 8+ messages in thread
From: Jarkko Sakkinen @ 2022-11-01  1:21 UTC (permalink / raw)
  To: Haitao Huang; +Cc: linux-sgx, dave.hansen, reinette.chatre, vijay.dhanraj

On Thu, Oct 27, 2022 at 12:45:30PM -0700, Haitao Huang wrote:
> Support madvise(..., MADV_WILLNEED) by adding EPC pages with EAUG in
> the newly added fops->fadvise() callback implementation, sgx_fadvise().
> 
> Change the return type and values of the sgx_encl_eaug_page function
> so that more specific error codes are returned for different treatment
> by the page fault handler and the fadvise callback.
> On any error, sgx_fadvise() discontinues further operations and
> returns normally. The page fault handler allows the fault to be
> retried by returning VM_FAULT_NOPAGE when sgx_encl_eaug_page returns
> -EBUSY.
> 
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/driver.c | 81 ++++++++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/sgx/encl.c   | 46 +++++++++++-------
>  arch/x86/kernel/cpu/sgx/encl.h   |  4 +-
>  3 files changed, 113 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
> index aa9b8b868867..54b24897605b 100644
> --- a/arch/x86/kernel/cpu/sgx/driver.c
> +++ b/arch/x86/kernel/cpu/sgx/driver.c
> @@ -2,6 +2,7 @@
>  /*  Copyright(c) 2016-20 Intel Corporation. */
>  
>  #include <linux/acpi.h>
> +#include <linux/fadvise.h>
>  #include <linux/miscdevice.h>
>  #include <linux/mman.h>
>  #include <linux/security.h>
> @@ -9,6 +10,7 @@
>  #include <asm/traps.h>
>  #include "driver.h"
>  #include "encl.h"
> +#include "encls.h"
>  
>  u64 sgx_attributes_reserved_mask;
>  u64 sgx_xfrm_reserved_mask = ~0x3;
> @@ -97,10 +99,88 @@ static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
>  	vma->vm_ops = &sgx_vm_ops;
>  	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
>  	vma->vm_private_data = encl;
> +	/* Anchor vm_pgoff to the enclave base.
> +	 * So offset passed back to sgx_fadvise hook
> +	 * is relative to the enclave base
> +	 */

This comment only convolutes code because it does not tell anything
that is non-intuitive to understand from the code.

> +	vma->vm_pgoff = (vma->vm_start - encl->base) >> PAGE_SHIFT;

vma->vm_pgoff = PFN_DOWN(vma->vm_start - encl->base);

>  
>  	return 0;
>  }
>  
> +/*
> + * Add new pages to the enclave sequentially with ENCLS[EAUG] for the WILLNEED advice.
> + * Only do this to existing VMAs in the same enclave and reject the request.

Keep this.

> + * Returns:	0 if EAUG done with best effort, -EINVAL if any sub-range given
> + * is not in the enclave, or enclave is not initialized..
> + */

Please remove this.

It is useless and also incorrect, given that the function can also return
other return values.

> +static int sgx_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
> +{
> +	struct sgx_encl *encl = file->private_data;
> +	unsigned long start, end, pos;
> +	int ret = -EINVAL;
> +	struct vm_area_struct *vma = NULL;

Reverse christmas tree order.
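
For readers unfamiliar with the convention: reverse Christmas tree
order sorts local variable declarations from the longest line to the
shortest. Applied to the declarations quoted above, the reordering
would presumably look like:

	struct sgx_encl *encl = file->private_data;
	struct vm_area_struct *vma = NULL;
	unsigned long start, end, pos;
	int ret = -EINVAL;
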

> +
> +	/* Only support WILLNEED */
> +	if (advice != POSIX_FADV_WILLNEED)
> +		return -EINVAL;
> +	if (!encl)
> +		return -EINVAL;
> +	if (!cpu_feature_enabled(X86_FEATURE_SGX2))
> +		return -EINVAL;
> +
> +	if (offset + len < offset)
> +		return -EINVAL;
> +	if (encl->base + offset < encl->base)
> +		return -EINVAL;
> +	start  = offset + encl->base;
> +	end = start + len;
> +	if (end < start)
> +		return -EINVAL;
> +	if (end > encl->base + encl->size)
> +		return -EINVAL;
> +
> +	/* EAUG works only for initialized enclaves. */

Please, remove this comment.

> +	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> +		return -EINVAL;
> +
> +	mmap_read_lock(current->mm);
> +
> +	vma = find_vma(current->mm, start);
> +	if (!vma)
> +		goto unlock;
> +	if (vma->vm_private_data != encl)
> +		goto unlock;
> +
> +	pos = start;
> +	if (pos < vma->vm_start || end > vma->vm_end) {
> +		/* Don't allow any gaps */
> +		goto unlock;
> +	}

Empty line.

> +	/* Here: vm_start <= pos < end <= vm_end */



> +	while (pos < end) {
> +		if (xa_load(&encl->page_array, PFN_DOWN(pos)))
> +			continue;
> +		if (signal_pending(current)) {
> +			if (pos == start)
> +				ret = -ERESTARTSYS;
> +			else
> +				ret = -EINTR;
> +			goto unlock;
> +		}
> +		ret = sgx_encl_eaug_page(vma, encl, pos);
> +		/* It's OK to not finish */
> +		if (ret)
> +			break;
> +		pos = pos + PAGE_SIZE;
> +		cond_resched();
> +	}
> +	ret = 0;

Empty line.

> +unlock:
> +	mmap_read_unlock(current->mm);
> +	return ret;
> +}
> +
>  static unsigned long sgx_get_unmapped_area(struct file *file,
>  					   unsigned long addr,
>  					   unsigned long len,
> @@ -133,6 +213,7 @@ static const struct file_operations sgx_encl_fops = {
>  	.compat_ioctl		= sgx_compat_ioctl,
>  #endif
>  	.mmap			= sgx_mmap,
> +	.fadvise		= sgx_fadvise,
>  	.get_unmapped_area	= sgx_get_unmapped_area,
>  };
>  
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 1abc5e7f2660..c57e60d5a0aa 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -305,11 +305,11 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>   * on a SGX2 system then the EPC can be added dynamically via the SGX2
>   * ENCLS[EAUG] instruction.
>   *
> - * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
> - * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
> + * Returns: 0 when PTE was installed successfully, -EBUSY for waiting on
> + * reclaimer to free EPC, -ENOMEM for out of RAM, -EFAULT as error otherwise.
>   */
> -vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> -			      struct sgx_encl *encl, unsigned long addr)
> +int sgx_encl_eaug_page(struct vm_area_struct *vma,
> +		       struct sgx_encl *encl, unsigned long addr)
>  {
>  	vm_fault_t vmret = VM_FAULT_SIGBUS;
>  	struct sgx_pageinfo pginfo = {0};
> @@ -318,10 +318,10 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>  	struct sgx_va_page *va_page;
>  	unsigned long phys_addr;
>  	u64 secinfo_flags;
> -	int ret;
> +	int ret = -EFAULT;
>  
>  	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> -		return VM_FAULT_SIGBUS;
> +		return -EFAULT;
>  
>  	/*
>  	 * Ignore internal permission checking for dynamically added pages.
> @@ -332,21 +332,21 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>  	secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
>  	encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
>  	if (IS_ERR(encl_page))
> -		return VM_FAULT_OOM;
> +		return -ENOMEM;
>  
>  	mutex_lock(&encl->lock);
>  
>  	epc_page = sgx_alloc_epc_page(encl_page, false);
>  	if (IS_ERR(epc_page)) {
>  		if (PTR_ERR(epc_page) == -EBUSY)
> -			vmret =  VM_FAULT_NOPAGE;
> +			ret =  -EBUSY;
>  		goto err_out_unlock;
>  	}
>  
>  	va_page = sgx_encl_grow(encl, false);
>  	if (IS_ERR(va_page)) {
>  		if (PTR_ERR(va_page) == -EBUSY)
> -			vmret = VM_FAULT_NOPAGE;
> +			ret = -EBUSY;
>  		goto err_out_epc;
>  	}
>  
> @@ -359,16 +359,20 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>  	 * If ret == -EBUSY then page was created in another flow while
>  	 * running without encl->lock
>  	 */
> -	if (ret)
> +	if (ret) {
> +		ret = -EFAULT;
>  		goto err_out_shrink;
> +	}
>  
>  	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
>  	pginfo.addr = encl_page->desc & PAGE_MASK;
>  	pginfo.metadata = 0;
>  
>  	ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
> -	if (ret)
> +	if (ret) {
> +		ret = -EFAULT;
>  		goto err_out;
> +	}
>  
>  	encl_page->encl = encl;
>  	encl_page->epc_page = epc_page;
> @@ -385,10 +389,10 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>  	vmret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
>  	if (vmret != VM_FAULT_NOPAGE) {
>  		mutex_unlock(&encl->lock);
> -		return VM_FAULT_SIGBUS;
> +		return -EFAULT;
>  	}
>  	mutex_unlock(&encl->lock);
> -	return VM_FAULT_NOPAGE;
> +	return 0;
>  
>  err_out:
>  	xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
> @@ -401,7 +405,7 @@ vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>  	mutex_unlock(&encl->lock);
>  	kfree(encl_page);
>  
> -	return vmret;
> +	return ret;
>  }
>  
>  static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
> @@ -431,8 +435,18 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>  	 * enclave that will be checked for right away.
>  	 */
>  	if (cpu_feature_enabled(X86_FEATURE_SGX2) &&
> -	    (!xa_load(&encl->page_array, PFN_DOWN(addr))))
> -		return sgx_encl_eaug_page(vma, encl, addr);
> +	    (!xa_load(&encl->page_array, PFN_DOWN(addr)))) {
> +		switch (sgx_encl_eaug_page(vma, encl, addr)) {
> +		case 0:
> +		case -EBUSY:
> +			return VM_FAULT_NOPAGE;
> +		case -ENOMEM:
> +			return VM_FAULT_OOM;
> +		case -EFAULT:
> +		default:
> +			return VM_FAULT_SIGBUS;
> +		}
> +	}
>  
>  	mutex_lock(&encl->lock);
>  
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index 500437981161..36059d35e1bc 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -127,6 +127,6 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>  					 unsigned long addr);
>  struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim);
>  void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
> -vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> -			      struct sgx_encl *encl, unsigned long addr)
> +int sgx_encl_eaug_page(struct vm_area_struct *vma,
> +		       struct sgx_encl *encl, unsigned long addr);
>  #endif /* _X86_ENCL_H */
> -- 
> 2.25.1
> 

BR, Jarkko

* Re: [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED
  2022-11-01  1:21     ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Jarkko Sakkinen
@ 2022-11-01 18:24       ` Haitao Huang
  0 siblings, 0 replies; 8+ messages in thread
From: Haitao Huang @ 2022-11-01 18:24 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx, dave.hansen, reinette.chatre, vijay.dhanraj

Hi Jarkko,

On Mon, 31 Oct 2022 20:21:16 -0500, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Thu, Oct 27, 2022 at 12:45:30PM -0700, Haitao Huang wrote:
>> Support madvise(..., MADV_WILLNEED) by adding EPC pages with EAUG in
>> the newly added fops->fadvise() callback implementation, sgx_fadvise().
>>
>> Change the return type and values of the sgx_encl_eaug_page function
>> so that more specific error codes are returned for different treatment
>> by the page fault handler and the fadvise callback.
>> On any error, sgx_fadvise() discontinues further operations and
>> returns normally. The page fault handler allows the fault to be
>> retried by returning VM_FAULT_NOPAGE when sgx_encl_eaug_page returns
>> -EBUSY.
>>
>> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
>> ---
>>  arch/x86/kernel/cpu/sgx/driver.c | 81 ++++++++++++++++++++++++++++++++
>>  arch/x86/kernel/cpu/sgx/encl.c   | 46 +++++++++++-------
>>  arch/x86/kernel/cpu/sgx/encl.h   |  4 +-
>>  3 files changed, 113 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/sgx/driver.c  
>> b/arch/x86/kernel/cpu/sgx/driver.c
>> index aa9b8b868867..54b24897605b 100644
>> --- a/arch/x86/kernel/cpu/sgx/driver.c
>> +++ b/arch/x86/kernel/cpu/sgx/driver.c
>> @@ -2,6 +2,7 @@
>>  /*  Copyright(c) 2016-20 Intel Corporation. */
>>
>>  #include <linux/acpi.h>
>> +#include <linux/fadvise.h>
>>  #include <linux/miscdevice.h>
>>  #include <linux/mman.h>
>>  #include <linux/security.h>
>> @@ -9,6 +10,7 @@
>>  #include <asm/traps.h>
>>  #include "driver.h"
>>  #include "encl.h"
>> +#include "encls.h"
>>
>>  u64 sgx_attributes_reserved_mask;
>>  u64 sgx_xfrm_reserved_mask = ~0x3;
>> @@ -97,10 +99,88 @@ static int sgx_mmap(struct file *file, struct  
>> vm_area_struct *vma)
>>  	vma->vm_ops = &sgx_vm_ops;
>>  	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
>>  	vma->vm_private_data = encl;
>> +	/* Anchor vm_pgoff to the enclave base.
>> +	 * So offset passed back to sgx_fadvise hook
>> +	 * is relative to the enclave base
>> +	 */
>
> This comment only convolutes code because it does not tell anything
> that is non-intuitive to understand from the code.
>
>> +	vma->vm_pgoff = (vma->vm_start - encl->base) >> PAGE_SHIFT;
>
> vma->vm_pgoff = PFN_DOWN(vma->vm_start - encl->base);
>
>>
>>  	return 0;
>>  }
>>
>> +/*
>> + * Add new pages to the enclave sequentially with ENCLS[EAUG] for the  
>> WILLNEED advice.
>> + * Only do this to existing VMAs in the same enclave and reject the  
>> request.
>
> Keep this.
>
>> + * Returns:	0 if EAUG done with best effort, -EINVAL if any sub-range  
>> given
>> + * is not in the enclave, or enclave is not initialized..
>> + */
>
> Please remove this.
>
> It is useless and also incorrect, given that the function can also return
> other return values.
>
>> +static int sgx_fadvise(struct file *file, loff_t offset, loff_t len,  
>> int advice)
>> +{
>> +	struct sgx_encl *encl = file->private_data;
>> +	unsigned long start, end, pos;
>> +	int ret = -EINVAL;
>> +	struct vm_area_struct *vma = NULL;
>
> Reverse christmas tree order.
>
>> +
>> +	/* Only support WILLNEED */
>> +	if (advice != POSIX_FADV_WILLNEED)
>> +		return -EINVAL;
>> +	if (!encl)
>> +		return -EINVAL;
...
>> +
>> +	/* EAUG works only for initialized enclaves. */
>
> Please, remove this comment.
>
>> +	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
>> +		return -EINVAL;
...
>> +		/* Don't allow any gaps */
>> +		goto unlock;
>> +	}
>
> Empty line.
>
>> +	/* Here: vm_start <= pos < end <= vm_end */
>
...
>> +		if (ret)
>> +			break;
>> +		pos = pos + PAGE_SIZE;
>> +		cond_resched();
>> +	}
>> +	ret = 0;
>
> Empty line.
>

Thanks for the review. Will fix all above in next version.
BR
Haitao

end of thread

Thread overview: 8+ messages
2022-10-27 19:45 [RFC PATCH v2 0/4] x86/sgx: implement support for MADV_WILLNEED Haitao Huang
2022-10-27 19:45 ` [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page Haitao Huang
2022-10-27 19:45   ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Haitao Huang
2022-10-27 19:45     ` [RFC PATCH v2 3/4] selftests/sgx: add len field for EACCEPT op Haitao Huang
2022-10-27 19:45       ` [RFC PATCH v2 4/4] selftests/sgx: Add test for madvise(..., WILLNEED) Haitao Huang
2022-11-01  1:21     ` [RFC PATCH v2 2/4] x86/sgx: Implement support for MADV_WILLNEED Jarkko Sakkinen
2022-11-01 18:24       ` Haitao Huang
2022-11-01  1:12   ` [RFC PATCH v2 1/4] x86/sgx: Export sgx_encl_eaug_page Jarkko Sakkinen
