* [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2
@ 2021-12-01 19:22 Reinette Chatre
  2021-12-01 19:22 ` [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers Reinette Chatre
                   ` (25 more replies)
  0 siblings, 26 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:22 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Everybody,

The current Linux kernel support for SGX includes support for SGX1 that
requires that an enclave be created with properties that accommodate all
usages over its (the enclave's) lifetime. This includes properties such
as permissions of enclave pages, the number of enclave pages, and the
number of threads supported by the enclave.

Consequences of this requirement to have the enclave be created to
accommodate all usages include:
* pages needing to support relocated code are required to have RWX
  permissions for their entire lifetime,
* an enclave needs to be created with the maximum stack and heap
  projected to be needed during the enclave's entire lifetime which
  can be longer than the processes running within it,
* an enclave needs to be created with support for the maximum number
  of threads projected to run in the enclave.

Since SGX1 a few more instructions were introduced, collectively called
SGX2, that support modifications to an initialized enclave. Hardware
supporting these instructions is already available as listed on
https://github.com/ayeks/SGX-hardware

This series adds support for SGX2, also referred to as Enclave Dynamic
Memory Management (EDMM). This includes:

* Support modifying permissions of regular enclave pages belonging to an
  initialized enclave. New permissions are not allowed to exceed the
  originally vetted permissions. Modifying permissions is accomplished
  with a new ioctl SGX_IOC_PAGE_MODP.

* Support dynamic addition of regular enclave pages to an initialized
  enclave. Pages are added with RW permissions as their "originally
  vetted permissions" (see previous bullet) and thus not allowed to
  be made executable at this time. Enabling dynamically added pages
  to obtain executable permissions requires integration with user space
  policy that is deferred until the core SGX2 enabling is complete.
  Pages are dynamically added to an initialized enclave from the SGX
  page fault handler.

* Support expanding an initialized enclave to accommodate more threads.
  More threads can be accommodated by an enclave with the addition of
  Thread Control Structure (TCS) pages. TCS pages are added by changing
  the type of regular enclave pages to TCS pages using a new ioctl
  SGX_IOC_PAGE_MODT.

* Support removing regular and TCS pages from an initialized enclave.
  Removing pages is accomplished in two stages as supported by two new
  ioctls SGX_IOC_PAGE_MODT (same ioctl as mentioned in previous bullet)
  and SGX_IOC_PAGE_REMOVE.

* Tests covering all the new flows, some edge cases, and one
  comprehensive stress scenario.

No additional work is needed to support SGX2 in a virtualized
environment. The tests included in this series can also be run from
a guest and were tested with the recent QEMU release based on 6.2.0
that supports SGX.

Patches 1 to 9 prepare the existing code for SGX2 support by
introducing the SGX2 instructions, making sure pages remain accessible
after their enclave permissions are changed, and tracking enclave page
types as well as runtime permissions as needed by SGX2.

Patches 10 through 25 are a mix of x86/sgx and selftests/sgx patches
that follow a pattern in which an SGX2 feature is first enabled and
then followed by tests of the new feature and/or tests of scenarios
that combine the SGX2 features enabled up to that point.

In two cases (patches 14 and 24) code in support of SGX2 is separated
out with detailed motivation to support the review.

This series is based on commit 5c16f7ee03c0 ("Merge branch
'x86/urgent' into x86/sgx, to resolve conflict") as
found on the x86/sgx branch of the tip repo at
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

Your feedback will be greatly appreciated.

Regards,

Reinette

Reinette Chatre (25):
  x86/sgx: Add shortlog descriptions to ENCLS wrappers
  x86/sgx: Add wrappers for SGX2 functions
  x86/sgx: Support VMA permissions exceeding enclave permissions
  x86/sgx: Add pfn_mkwrite() handler for present PTEs
  x86/sgx: Introduce runtime protection bits
  x86/sgx: Use more generic name for enclave cpumask function
  x86/sgx: Move PTE zap code to separate function
  x86/sgx: Make SGX IPI callback available internally
  x86/sgx: Keep record of SGX page type
  x86/sgx: Support enclave page permission changes
  selftests/sgx: Add test for EPCM permission changes
  selftests/sgx: Add test for TCS page permission changes
  x86/sgx: Support adding of pages to initialized enclave
  x86/sgx: Tighten accessible memory range after enclave initialization
  selftests/sgx: Test two different SGX2 EAUG flows
  x86/sgx: Support modifying SGX page type
  x86/sgx: Support complete page removal
  selftests/sgx: Introduce dynamic entry point
  selftests/sgx: Introduce TCS initialization enclave operation
  selftests/sgx: Test complete changing of page type flow
  selftests/sgx: Test faulty enclave behavior
  selftests/sgx: Test invalid access to removed enclave page
  selftests/sgx: Test reclaiming of untouched page
  x86/sgx: Free up EPC pages directly to support large page ranges
  selftests/sgx: Page removal stress test

 arch/x86/include/asm/sgx.h                    |    8 +
 arch/x86/include/uapi/asm/sgx.h               |   60 +
 arch/x86/kernel/cpu/sgx/encl.c                |  333 +++-
 arch/x86/kernel/cpu/sgx/encl.h                |   12 +-
 arch/x86/kernel/cpu/sgx/encls.h               |   30 +
 arch/x86/kernel/cpu/sgx/ioctl.c               |  647 +++++++-
 arch/x86/kernel/cpu/sgx/main.c                |   70 +-
 arch/x86/kernel/cpu/sgx/sgx.h                 |    3 +
 tools/testing/selftests/sgx/defines.h         |   23 +
 tools/testing/selftests/sgx/load.c            |   41 +
 tools/testing/selftests/sgx/main.c            | 1450 +++++++++++++++++
 tools/testing/selftests/sgx/main.h            |    1 +
 tools/testing/selftests/sgx/test_encl.c       |   68 +
 .../selftests/sgx/test_encl_bootstrap.S       |    6 +
 14 files changed, 2667 insertions(+), 85 deletions(-)


base-commit: 5c16f7ee03c011b0c6cd4c6deccaf0b269d054b2
-- 
2.25.1



* [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
@ 2021-12-01 19:22 ` Reinette Chatre
  2021-12-04 18:30   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions Reinette Chatre
                   ` (24 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:22 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The SGX ENCLS instruction uses EAX to specify an SGX function and
may require additional registers, depending on the SGX function.
ENCLS invokes the specified privileged SGX function for managing
and debugging enclaves. Macros are used to wrap the ENCLS
functionality and several wrappers are used to wrap the macros to
make the different SGX functions accessible in the code.

The wrappers of the supported SGX functions are cryptic. Add a short
description of each wrapper as a comment.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encls.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 9b204843b78d..241b766265d3 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -162,57 +162,68 @@ static inline bool encls_failed(int ret)
 	ret;						\
 	})
 
+/* Create an SECS page in the Enclave Page Cache (EPC) */
 static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
 {
 	return __encls_2(ECREATE, pginfo, secs);
 }
 
+/* Extend uninitialized enclave measurement */
 static inline int __eextend(void *secs, void *addr)
 {
 	return __encls_2(EEXTEND, secs, addr);
 }
 
+/* Add a page to an uninitialized enclave */
 static inline int __eadd(struct sgx_pageinfo *pginfo, void *addr)
 {
 	return __encls_2(EADD, pginfo, addr);
 }
 
+/* Initialize an enclave for execution */
 static inline int __einit(void *sigstruct, void *token, void *secs)
 {
 	return __encls_ret_3(EINIT, sigstruct, secs, token);
 }
 
+/* Remove a page from the Enclave Page Cache (EPC) */
 static inline int __eremove(void *addr)
 {
 	return __encls_ret_1(EREMOVE, addr);
 }
 
+/* Write to a debug enclave */
 static inline int __edbgwr(void *addr, unsigned long *data)
 {
 	return __encls_2(EDGBWR, *data, addr);
 }
 
+/* Read from a debug enclave */
 static inline int __edbgrd(void *addr, unsigned long *data)
 {
 	return __encls_1_1(EDGBRD, *data, addr);
 }
 
+/* Track threads operating inside the enclave */
 static inline int __etrack(void *addr)
 {
 	return __encls_ret_1(ETRACK, addr);
 }
 
+/* Load, verify, and unblock an Enclave Page Cache (EPC) page */
 static inline int __eldu(struct sgx_pageinfo *pginfo, void *addr,
 			 void *va)
 {
 	return __encls_ret_3(ELDU, pginfo, addr, va);
 }
 
+/* Mark an Enclave Page Cache (EPC) page as blocked */
 static inline int __eblock(void *addr)
 {
 	return __encls_ret_1(EBLOCK, addr);
 }
 
+/* Add a Version Array (VA) page to the Enclave Page Cache (EPC) */
 static inline int __epa(void *addr)
 {
 	unsigned long rbx = SGX_PAGE_TYPE_VA;
@@ -220,6 +231,7 @@ static inline int __epa(void *addr)
 	return __encls_2(EPA, rbx, addr);
 }
 
+/* Invalidate an EPC page and write it out to main memory */
 static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
 			void *va)
 {
-- 
2.25.1



* [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
  2021-12-01 19:22 ` [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 22:04   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions Reinette Chatre
                   ` (23 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The SGX ENCLS instruction uses EAX to specify an SGX function and
may require additional registers, depending on the SGX function.
ENCLS invokes the specified privileged SGX function for managing
and debugging enclaves. Several macros are used to wrap the ENCLS
functionality.

Add ENCLS wrappers for the SGX2 EMODPR, EMODT, and EAUG functions
that can make changes to pages of an initialized SGX enclave. The
EMODPR function is used to restrict enclave page permissions
as maintained within the enclave (Enclave Page Cache Map (EPCM)
permissions). The EMODT function is used to change the type of an
enclave page. The EAUG function is used to dynamically add enclave
pages to an initialized enclave.

EMODPR and EMODT accept two parameters and can fault as well as return
an SGX error code. EAUG also accepts two parameters but does not return
an SGX error code. Use existing macros for all new functions.

Expand enum sgx_return_code with the possible EMODPR and EMODT
return codes.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/include/asm/sgx.h      |  5 +++++
 arch/x86/kernel/cpu/sgx/encls.h | 18 ++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index 05f3e21f01a7..ebae2a153c66 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -47,17 +47,22 @@ enum sgx_encls_function {
 
 /**
  * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
+ * %SGX_EPC_PAGE_CONFLICT:	Page is being written by other ENCLS function.
  * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
  *				been completed yet.
  * %SGX_CHILD_PRESENT		SECS has child pages present in the EPC.
  * %SGX_INVALID_EINITTOKEN:	EINITTOKEN is invalid and enclave signer's
  *				public key does not match IA32_SGXLEPUBKEYHASH.
+ * %SGX_PAGE_NOT_MODIFIABLE:	The EPC page cannot be modified because it
+ *				is in the PENDING or MODIFIED state.
  * %SGX_UNMASKED_EVENT:		An unmasked event, e.g. INTR, was received
  */
 enum sgx_return_code {
+	SGX_EPC_PAGE_CONFLICT		= 7,
 	SGX_NOT_TRACKED			= 11,
 	SGX_CHILD_PRESENT		= 13,
 	SGX_INVALID_EINITTOKEN		= 16,
+	SGX_PAGE_NOT_MODIFIABLE		= 20,
 	SGX_UNMASKED_EVENT		= 128,
 };
 
diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 241b766265d3..243c30301ddb 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -238,4 +238,22 @@ static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
 	return __encls_ret_3(EWB, pginfo, addr, va);
 }
 
+/* Restrict the permissions of an Enclave Page Cache (EPC) page */
+static inline int __emodpr(struct sgx_secinfo *secinfo, void *addr)
+{
+	return __encls_ret_2(EMODPR, secinfo, addr);
+}
+
+/* Change the type of an Enclave Page Cache (EPC) page */
+static inline int __emodt(struct sgx_secinfo *secinfo, void *addr)
+{
+	return __encls_ret_2(EMODT, secinfo, addr);
+}
+
+/* Add a page to an initialized enclave */
+static inline int __eaug(struct sgx_pageinfo *pginfo, void *addr)
+{
+	return __encls_2(EAUG, pginfo, addr);
+}
+
 #endif /* _X86_ENCLS_H */
-- 
2.25.1



* [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
  2021-12-01 19:22 ` [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers Reinette Chatre
  2021-12-01 19:23 ` [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 22:25   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs Reinette Chatre
                   ` (22 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

=== Summary ===

An SGX VMA can only be created if its permissions are the same or
weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
creation this rule continues to be enforced by the page fault handler.

With SGX2 the EPCM permissions of a page can change after VMA
creation resulting in the VMA exceeding the EPCM permissions and the
page fault handler incorrectly blocking access.

Enable the VMA's pages to remain accessible while ensuring that
the page table entries are installed to match the EPCM permissions
without exceeding the VMA permissions.

=== Full Changelog ===

An SGX enclave is an area of memory where parts of an application
can reside. First an enclave is created and loaded (from
non-enclave memory) with the code and data of an application,
then user space can map (mmap()) the enclave memory to
be able to enter the enclave at its defined entry points for
execution within it.

The hardware maintains a secure structure, the Enclave Page Cache Map
(EPCM), that tracks the contents of the enclave. Of interest here is
its tracking of the enclave page permissions. When a page is loaded
into the enclave its permissions are specified and recorded in the
EPCM. In parallel the OS maintains the page table permissions and
the rule is that page table permissions are never allowed to exceed
EPCM permissions.

A new mapping (mmap()) of enclave memory can only succeed if the
mapping has the same or weaker permissions than the permissions that
were vetted during enclave creation. This is enforced by
sgx_encl_may_map() that is called on the mmap() as well as mprotect()
paths. This permission verification remains.

One feature of SGX2 is to support the modification of enclave page
permissions after enclave creation. Enclave pages may thus already be
mapped at the time their enclave permissions are changed resulting
in the VMA's permissions potentially exceeding the enclave page
permissions.

Enable permissions of existing VMAs to exceed enclave page permissions
in preparation for dynamic enclave page permission changes.
New VMAs that attempt to exceed enclave page permissions continue to be
unsupported.

Reasons why permissions of existing VMAs are allowed to exceed enclave
page permissions instead of dynamically changing VMA permissions when
enclave page permissions change are:
1) Changing VMA permissions involves splitting VMAs which is an operation
   that can fail. Additionally the actual changing of page permissions
   of a range of pages could also fail on any of the pages involved.
   Handling these error cases causes problems. For example, if an
   enclave page permission change fails and the VMA has already been
   split then it is not possible to undo the VMA split nor possible to
   undo the enclave page permission changes that did succeed before the
   failure.
2) The OS has little insight into user space, from where EPCM
   permissions are controlled. For example, a RW page may be made RO
   just before it is made RX, so splitting the VMAs when they may
   change again soon is unnecessary.

Remove the extra permission check called on a page fault
(vm_operations_struct->fault) or during debugging
(vm_operations_struct->access) when loading the enclave page from swap
that ensures that the VMA permissions do not exceed the enclave
permissions. Since a VMA could only exist if it passed the original
permission checks during mmap() and a VMA may indeed exceed the page
permissions this extra permission check is no longer appropriate.

With the permission check removed, ensure that page table entries do
not blindly inherit the VMA permissions but instead the permissions
that the VMA and enclave agree on. PTEs for writable pages (from VMA and
enclave perspective) are installed with the writable bit set, reducing
the need for this additional flow to the permission mismatch cases
handled next.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 38 ++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..20e97d3abdce 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -91,10 +91,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 }
 
 static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
-						unsigned long addr,
-						unsigned long vm_flags)
+						unsigned long addr)
 {
-	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl_page *entry;
 
@@ -102,14 +100,6 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 	if (!entry)
 		return ERR_PTR(-EFAULT);
 
-	/*
-	 * Verify that the faulted page has equal or higher build time
-	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
-	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
-	 */
-	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
-		return ERR_PTR(-EFAULT);
-
 	/* Entry successfully located. */
 	if (entry->epc_page) {
 		if (entry->desc & SGX_ENCL_PAGE_BEING_RECLAIMED)
@@ -138,7 +128,9 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 {
 	unsigned long addr = (unsigned long)vmf->address;
 	struct vm_area_struct *vma = vmf->vma;
+	unsigned long page_prot_bits;
 	struct sgx_encl_page *entry;
+	unsigned long vm_prot_bits;
 	unsigned long phys_addr;
 	struct sgx_encl *encl;
 	vm_fault_t ret;
@@ -155,7 +147,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 
 	mutex_lock(&encl->lock);
 
-	entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
+	entry = sgx_encl_load_page(encl, addr);
 	if (IS_ERR(entry)) {
 		mutex_unlock(&encl->lock);
 
@@ -167,7 +159,19 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 
 	phys_addr = sgx_get_epc_phys_addr(entry->epc_page);
 
-	ret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
+	/*
+	 * Insert PTE to match the EPCM page permissions ensured to not
+	 * exceed the VMA permissions.
+	 */
+	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
+	/*
+	 * Add VM_SHARED so that PTE is made writable right away if VMA
+	 * and EPCM are writable (no COW in SGX).
+	 */
+	page_prot_bits |= (vma->vm_flags & VM_SHARED);
+	ret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
+				  vm_get_page_prot(page_prot_bits));
 	if (ret != VM_FAULT_NOPAGE) {
 		mutex_unlock(&encl->lock);
 
@@ -295,15 +299,14 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag
  * Load an enclave page to EPC if required, and take encl->lock.
  */
 static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
-						   unsigned long addr,
-						   unsigned long vm_flags)
+						   unsigned long addr)
 {
 	struct sgx_encl_page *entry;
 
 	for ( ; ; ) {
 		mutex_lock(&encl->lock);
 
-		entry = sgx_encl_load_page(encl, addr, vm_flags);
+		entry = sgx_encl_load_page(encl, addr);
 		if (PTR_ERR(entry) != -EBUSY)
 			break;
 
@@ -339,8 +342,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
 		return -EFAULT;
 
 	for (i = 0; i < len; i += cnt) {
-		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK,
-					      vma->vm_flags);
+		entry = sgx_encl_reserve_page(encl, (addr + i) & PAGE_MASK);
 		if (IS_ERR(entry)) {
 			ret = PTR_ERR(entry);
 			break;
-- 
2.25.1



* [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (2 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 22:43   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 05/25] x86/sgx: Introduce runtime protection bits Reinette Chatre
                   ` (21 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

By default a write page fault on a present PTE inherits the permissions
of the VMA. Enclave page permissions maintained in the hardware's
Enclave Page Cache Map (EPCM) may change after a VMA accessing the page
is created. A VMA's permissions may thus exceed the enclave page
permissions even though the VMA was originally created not to exceed
the enclave page permissions. Following the default behavior during
a page fault on a present PTE while the VMA permissions exceed the
enclave page permissions would result in the PTE for an enclave page
being writable even though the page is not writable according to the
enclave's permissions.

Consider the following scenario:
* An enclave page exists with RW EPCM permissions.
* A RW VMA maps the range spanning the enclave page.
* The enclave page's EPCM permissions are changed to read-only.
* There is no page table entry for the enclave page.

Q.
 What will user space observe when an attempt is made to write to the
 enclave page from within the enclave?

A.
 Initially the page table entry is not present so the following is
 observed:
 1) Instruction writing to enclave page is run from within the enclave.
 2) A page fault with second and third bits set (0x6) is encountered
    and handled by the SGX handler sgx_vma_fault() that installs a
    read-only page table entry following previous patch that installs
    page table entry with permissions that VMA and enclave agree on
    (read-only in this case).
 3) Instruction writing to enclave page is re-attempted.
 4) A page fault with first three bits set (0x7) is encountered and
    transparently (from SGX and user space perspective) handled by the
    OS with the page table entry made writable because the VMA is
    writable.
 5) Instruction writing to enclave page is re-attempted.
 6) Since the EPCM permissions prevent writing to the page a new page
    fault is encountered, this time with the SGX flag set in the error
    code (0x8007). No action is taken by OS for this page fault and
    execution returns to user space.
 7) Typically such a fault will be passed on to an application with a
    signal but if the enclave is entered with the vDSO function provided
    by the kernel then user space does not receive a signal but instead
    the vDSO function returns successfully with exception information
    (vector=14, error code=0x8007, and address) within the exception
    fields within the vDSO function's struct sgx_enclave_run.

As can be observed it is not possible for user space to write to an
enclave page if that page's enclave page permissions do not allow it,
no matter what the VMA or PTE allows.

Even so, the OS should not allow writing to a page if that page is not
writable. Thus the page table entry should accurately reflect the
enclave page permissions.

Do not blindly accept VMA permissions on a page fault due to a write
attempt to a present PTE. Install a pfn_mkwrite() handler that ensures
that the VMA permissions agree with the enclave permissions in this
regard.

Considering the same scenario as above after this change results in
the following behavior change:

Q.
 What will user space observe when an attempt is made to write to the
 enclave page from within the enclave?

A.
 Initially the page table entry is not present so the following is
 observed:
 1) Instruction writing to enclave page is run from within the enclave.
 2) A page fault with second and third bits set (0x6) is encountered
    and handled by the SGX handler sgx_vma_fault() that installs a
    read-only page table entry following previous patch that installs
    page table entry with permissions that VMA and enclave agree on
    (read-only in this case).
 3) Instruction writing to enclave page is re-attempted.
 4) A page fault with first three bits set (0x7) is encountered and
    passed to the pfn_mkwrite() handler for consideration. The handler
    determines that the page should not be writable and returns SIGBUS.
 5) Typically such a fault will be passed on to an application with a
    signal but if the enclave is entered with the vDSO function provided
    by the kernel then user space does not receive a signal but instead
    the vDSO function returns successfully with exception information
    (vector=14, error code=0x7, and address) within the exception fields
    within the vDSO function's struct sgx_enclave_run.

The accurate exception information supports the SGX runtime, which is
virtually always implemented inside a shared library, by providing
accurate information in support of its management of the SGX enclave.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 42 ++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 20e97d3abdce..60afa8eaf979 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -184,6 +184,47 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	return VM_FAULT_NOPAGE;
 }
 
+/*
+ * A fault occurred while writing to a present enclave PTE. Since PTE is
+ * present this will not be handled by sgx_vma_fault(). VMA may allow
+ * writing to the page while enclave does not. Do not follow the default
+ * of inheriting VMA permissions in this regard, ensure enclave also allows
+ * writing to the page.
+ */
+static vm_fault_t sgx_vma_pfn_mkwrite(struct vm_fault *vmf)
+{
+	unsigned long addr = (unsigned long)vmf->address;
+	struct vm_area_struct *vma = vmf->vma;
+	struct sgx_encl_page *entry;
+	struct sgx_encl *encl;
+	vm_fault_t ret = 0;
+
+	encl = vma->vm_private_data;
+
+	/*
+	 * It's very unlikely but possible that allocating memory for the
+	 * mm_list entry of a forked process failed in sgx_vma_open(). When
+	 * this happens, vm_private_data is set to NULL.
+	 */
+	if (unlikely(!encl))
+		return VM_FAULT_SIGBUS;
+
+	mutex_lock(&encl->lock);
+
+	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
+	if (!entry) {
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
+
+	if (!(entry->vm_max_prot_bits & VM_WRITE))
+		ret = VM_FAULT_SIGBUS;
+
+out:
+	mutex_unlock(&encl->lock);
+	return ret;
+}
+
 static void sgx_vma_open(struct vm_area_struct *vma)
 {
 	struct sgx_encl *encl = vma->vm_private_data;
@@ -381,6 +422,7 @@ const struct vm_operations_struct sgx_vm_ops = {
 	.mprotect = sgx_vma_mprotect,
 	.open = sgx_vma_open,
 	.access = sgx_vma_access,
+	.pfn_mkwrite = sgx_vma_pfn_mkwrite,
 };
 
 /**
-- 
2.25.1



* [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (3 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-03 19:28   ` Andy Lutomirski
  2021-12-04 22:50   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function Reinette Chatre
                   ` (20 subsequent siblings)
  25 siblings, 2 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Enclave creators declare their paging permission intent at the time
the pages are added to the enclave. These paging permissions are
vetted when pages are added to the enclave and stashed off
(in sgx_encl_page->vm_max_prot_bits) for later comparison with
enclave PTEs.

Current permission support assumes that enclave page permissions
remain static for the lifetime of the enclave. This is about to change
with the addition of support for SGX2 where the permissions of enclave
pages belonging to an initialized enclave may be changed during the
enclave's lifetime.

Introduce runtime protection bits in preparation for support of
enclave page permission changes. These bits reflect the active
permissions of an enclave page and are not to exceed the maximum
protection bits that passed scrutiny during enclave creation.

Associate runtime protection bits with each enclave page. Initialize
the runtime protection bits to the vetted maximum protection bits
on page creation. Use the runtime protection bits for any access
checks.

struct sgx_encl_page hosting this information is maintained for each
enclave page so the space consumed by the struct is important.
The existing vm_max_prot_bits is already unsigned long while only using
three bits. Transition to a bitfield for the two members containing
protection bits.
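
The space argument can be checked with a short sketch. The two struct
names below are hypothetical and only model the protection members, not
the full struct sgx_encl_page. The result assumes a mainstream ABI such
as x86-64 with GCC or Clang, where bitfields of type unsigned long are
packed into a single unsigned long.

```c
#include <stddef.h>

/* Two full-width members, as a naive second field would require. */
struct prot_two_longs {
	unsigned long vm_max_prot_bits;
	unsigned long vm_run_prot_bits;
};

/* Both members packed as bitfields, as done in this patch. */
struct prot_bitfield {
	unsigned long vm_max_prot_bits:8;
	unsigned long vm_run_prot_bits:8;
};

/* Bytes saved per enclave page by using the bitfield layout. */
static size_t prot_bytes_saved(void)
{
	return sizeof(struct prot_two_longs) - sizeof(struct prot_bitfield);
}
```

Under the stated assumption the bitfield variant occupies one unsigned
long, the same as the single member it replaces.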

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c  | 6 +++---
 arch/x86/kernel/cpu/sgx/encl.h  | 3 ++-
 arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 60afa8eaf979..6fec68896e1b 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -164,7 +164,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	 * exceed the VMA permissions.
 	 */
 	vm_prot_bits = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
-	page_prot_bits = entry->vm_max_prot_bits & vm_prot_bits;
+	page_prot_bits = entry->vm_run_prot_bits & vm_prot_bits;
 	/*
 	 * Add VM_SHARED so that PTE is made writable right away if VMA
 	 * and EPCM are writable (no COW in SGX).
@@ -217,7 +217,7 @@ static vm_fault_t sgx_vma_pfn_mkwrite(struct vm_fault *vmf)
 		goto out;
 	}
 
-	if (!(entry->vm_max_prot_bits & VM_WRITE))
+	if (!(entry->vm_run_prot_bits & VM_WRITE))
 		ret = VM_FAULT_SIGBUS;
 
 out:
@@ -280,7 +280,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 	mutex_lock(&encl->lock);
 	xas_lock(&xas);
 	xas_for_each(&xas, page, PFN_DOWN(end - 1)) {
-		if (~page->vm_max_prot_bits & vm_prot_bits) {
+		if (~page->vm_run_prot_bits & vm_prot_bits) {
 			ret = -EACCES;
 			break;
 		}
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index fec43ca65065..dc262d843411 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -27,7 +27,8 @@
 
 struct sgx_encl_page {
 	unsigned long desc;
-	unsigned long vm_max_prot_bits;
+	unsigned long vm_max_prot_bits:8;
+	unsigned long vm_run_prot_bits:8;
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl *encl;
 	struct sgx_va_page *va_page;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 83df20e3e633..7e0819a89532 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -197,6 +197,12 @@ static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
 	/* Calculate maximum of the VM flags for the page. */
 	encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
 
+	/*
+	 * At time of allocation, the runtime protection bits are the same
+	 * as the maximum protection bits.
+	 */
+	encl_page->vm_run_prot_bits = encl_page->vm_max_prot_bits;
+
 	return encl_page;
 }
 
-- 
2.25.1



* [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (4 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 05/25] x86/sgx: Introduce runtime protection bits Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 22:56   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 07/25] x86/sgx: Move PTE zap code to separate function Reinette Chatre
                   ` (19 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Knowing which CPUs might have executed an enclave is useful to
ensure that TLBs are cleared when changes are made to enclave pages.

Knowing which CPUs might have executed an enclave is currently only used
within the reclaimer when an enclave page is evicted. In support of this
the function that determines the CPUs is called sgx_encl_ewb_cpumask() -
a name that is specific to the current usage.

Rename sgx_encl_ewb_cpumask() to sgx_encl_cpumask() in preparation for its
usage in more flows. Care should be taken to ensure that any
future usage maintains the current context requirement that ETRACK has
been called first. To highlight this the existing comment regarding
this context is expanded and moved to a more prominent location.

With the usage of this function no longer unique to the reclaimer it is
relocated to be with the rest of the enclave code in encl.c.
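
What the function computes can be modeled in userspace as the union of
the per-mm CPU masks, skipping any mm that is already being torn down.
The sketch below is a simplification under stated assumptions: struct
mm_model and encl_cpumask_model() are hypothetical names, a 64-bit
integer stands in for cpumask_t, and no locking or SRCU is modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an mm: one bit per CPU, up to 64 CPUs. */
struct mm_model {
	uint64_t cpumask;	/* CPUs that may have run this mm */
	int users;		/* 0 means the mm is being torn down */
};

/* Return the union of the CPU masks of all mms that are still alive. */
static uint64_t encl_cpumask_model(const struct mm_model *mms, size_t n)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Mirrors the mmget_not_zero() check: skip dying mms. */
		if (!mms[i].users)
			continue;
		mask |= mms[i].cpumask;
	}

	return mask;
}
```

The resulting mask is a superset of the CPUs currently inside the
enclave, which is sufficient because an IPI to a CPU that is not in the
enclave is harmless.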

No functional change.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 67 ++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h |  1 +
 arch/x86/kernel/cpu/sgx/main.c | 31 +---------------
 3 files changed, 69 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 6fec68896e1b..c29e10541d12 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -595,6 +595,73 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 	return 0;
 }
 
+/**
+ * sgx_encl_cpumask - Query which CPUs might be accessing the enclave
+ * @encl: the enclave
+ *
+ * Some SGX functions require that no cached linear-to-physical address
+ * mappings are present before they can succeed. For example, ENCLS[EWB]
+ * copies a page from the enclave page cache to regular main memory but
+ * it fails if it cannot ensure that there are no cached
+ * linear-to-physical address mappings referring to the page.
+ *
+ * SGX hardware flushes all cached linear-to-physical mappings on a CPU
+ * when an enclave is exited via ENCLU[EEXIT] or an Asynchronous Enclave
+ * Exit (AEX). Exiting an enclave will thus ensure cached linear-to-physical
+ * address mappings are cleared but coordination with the tracking done within
+ * the SGX hardware is needed to support the SGX functions that depend on this
+ * cache clearing.
+ *
+ * When the ENCLS[ETRACK] function is issued on an enclave the hardware
+ * tracks threads operating inside the enclave at that time. The SGX
+ * hardware tracking requires that all the identified threads must have
+ * exited the enclave in order to flush the mappings before a function such
+ * as ENCLS[EWB] will be permitted.
+ *
+ * The following flow is used to support SGX functions that require that
+ * no cached linear-to-physical address mappings are present:
+ * 1) Execute ENCLS[ETRACK] to initiate hardware tracking.
+ * 2) Use this function (sgx_encl_cpumask()) to query which CPUs might be
+ *    accessing the enclave.
+ * 3) Send IPI to identified CPUs, kicking them out of the enclave and
+ *    thus flushing all locally cached linear-to-physical address mappings.
+ * 4) Execute SGX function.
+ *
+ * Context: It is required to call this function after ENCLS[ETRACK].
+ *          This will ensure that if any new mm appears (racing with
+ *          sgx_encl_mm_add()) then the new mm will enter into the
+ *          enclave with fresh linear-to-physical address mappings.
+ *
+ *          It is required that all IPIs are completed before a new
+ *          ENCLS[ETRACK] is issued so be sure to protect steps 1 to 3
+ *          of the above flow with the enclave's mutex.
+ *
+ * Return: cpumask of CPUs that might be accessing @encl
+ */
+const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl)
+{
+	cpumask_t *cpumask = &encl->cpumask;
+	struct sgx_encl_mm *encl_mm;
+	int idx;
+
+	cpumask_clear(cpumask);
+
+	idx = srcu_read_lock(&encl->srcu);
+
+	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
+		if (!mmget_not_zero(encl_mm->mm))
+			continue;
+
+		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
+
+		mmput_async(encl_mm->mm);
+	}
+
+	srcu_read_unlock(&encl->srcu, idx);
+
+	return cpumask;
+}
+
 static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
 					      pgoff_t index)
 {
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index dc262d843411..becb68503baa 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -106,6 +106,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 
 void sgx_encl_release(struct kref *ref);
 int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
+const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl);
 int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
 			 struct sgx_backing *backing);
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 6036328de255..e5be992897f4 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -201,35 +201,6 @@ static void sgx_ipi_cb(void *info)
 {
 }
 
-static const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
-{
-	cpumask_t *cpumask = &encl->cpumask;
-	struct sgx_encl_mm *encl_mm;
-	int idx;
-
-	/*
-	 * Can race with sgx_encl_mm_add(), but ETRACK has already been
-	 * executed, which means that the CPUs running in the new mm will enter
-	 * into the enclave with a fresh epoch.
-	 */
-	cpumask_clear(cpumask);
-
-	idx = srcu_read_lock(&encl->srcu);
-
-	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
-		if (!mmget_not_zero(encl_mm->mm))
-			continue;
-
-		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
-
-		mmput_async(encl_mm->mm);
-	}
-
-	srcu_read_unlock(&encl->srcu, idx);
-
-	return cpumask;
-}
-
 /*
  * Swap page to the regular memory transformed to the blocked state by using
  * EBLOCK, which means that it can no longer be referenced (no new TLB entries).
@@ -276,7 +247,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page,
 			 * miss cpus that entered the enclave between
 			 * generating the mask and incrementing epoch.
 			 */
-			on_each_cpu_mask(sgx_encl_ewb_cpumask(encl),
+			on_each_cpu_mask(sgx_encl_cpumask(encl),
 					 sgx_ipi_cb, NULL, 1);
 			ret = __sgx_encl_ewb(epc_page, va_slot, backing);
 		}
-- 
2.25.1



* [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (5 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 22:59   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally Reinette Chatre
                   ` (18 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The SGX reclaimer removes page table entries pointing to pages that are
moved to swap. SGX2 enables changes to pages belonging to an initialized
enclave, for example changing page permissions. Supporting SGX2 requires
this ability to remove page table entries that is available in the
SGX reclaimer code.

Factor out the code removing page table entries to a separate function,
fixing accuracy of comments in the process, and make it available to other
areas within the SGX code.

Since the code will no longer be unique to the reclaimer it is relocated
to be with the rest of the enclave code in encl.c interacting with the
page table.
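
The code being factored out walks the list of mms and repeats the walk
if the list changed underneath it, detected via mm_list_version. A
single-threaded sketch of that retry pattern follows. All names in it
are hypothetical, and the memory barriers and SRCU protection that the
kernel code relies on are deliberately not modeled.

```c
#include <stddef.h>

/* Hypothetical model of a list with a version bumped on every addition. */
struct list_model {
	unsigned long version;
	size_t nr_entries;
};

typedef void (*visit_fn)(struct list_model *list, size_t idx, void *ctx);

/* Visit every entry and repeat until a full pass saw a stable version. */
static unsigned int walk_until_stable(struct list_model *list,
				      visit_fn visit, void *ctx)
{
	unsigned int passes = 0;
	unsigned long seen;
	size_t i;

	do {
		seen = list->version;

		for (i = 0; i < list->nr_entries; i++)
			visit(list, i, ctx);

		passes++;
	} while (list->version != seen);

	return passes;
}

/* Demo visitor: simulates one concurrent addition during the first visit. */
static void visit_and_add_once(struct list_model *list, size_t idx, void *ctx)
{
	int *added = ctx;

	(void)idx;
	if (!*added) {
		list->nr_entries++;
		list->version++;
		*added = 1;
	}
}
```

With the demo visitor the first pass observes a version change and a
second pass is made, so an entry added during the walk is not missed.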

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 45 +++++++++++++++++++++++++++++++++-
 arch/x86/kernel/cpu/sgx/encl.h |  2 +-
 arch/x86/kernel/cpu/sgx/main.c | 31 ++---------------------
 3 files changed, 47 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c29e10541d12..ba39186d5a28 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -587,7 +587,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 
 	spin_lock(&encl->mm_lock);
 	list_add_rcu(&encl_mm->list, &encl->mm_list);
-	/* Pairs with smp_rmb() in sgx_reclaimer_block(). */
+	/* Pairs with smp_rmb() in sgx_zap_enclave_ptes(). */
 	smp_wmb();
 	encl->mm_list_version++;
 	spin_unlock(&encl->mm_lock);
@@ -776,6 +776,49 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm,
 	return ret;
 }
 
+/**
+ * sgx_zap_enclave_ptes - remove PTEs mapping the address from enclave
+ * @encl: the enclave
+ * @addr: page aligned pointer to single page for which PTEs will be removed
+ *
+ * Multiple VMAs may have an enclave page mapped. Remove the PTE mapping
+ * @addr from each VMA. Ensure that page fault handler is ready to handle
+ * new mappings of @addr before calling this function.
+ */
+void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
+{
+	unsigned long mm_list_version;
+	struct sgx_encl_mm *encl_mm;
+	struct vm_area_struct *vma;
+	int idx, ret;
+
+	do {
+		mm_list_version = encl->mm_list_version;
+
+		/* Pairs with smp_wmb() in sgx_encl_mm_add(). */
+		smp_rmb();
+
+		idx = srcu_read_lock(&encl->srcu);
+
+		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
+			if (!mmget_not_zero(encl_mm->mm))
+				continue;
+
+			mmap_read_lock(encl_mm->mm);
+
+			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
+			if (!ret && encl == vma->vm_private_data)
+				zap_vma_ptes(vma, addr, PAGE_SIZE);
+
+			mmap_read_unlock(encl_mm->mm);
+
+			mmput_async(encl_mm->mm);
+		}
+
+		srcu_read_unlock(&encl->srcu, idx);
+	} while (unlikely(encl->mm_list_version != mm_list_version));
+}
+
 /**
  * sgx_alloc_va_page() - Allocate a Version Array (VA) page
  *
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index becb68503baa..82e21088e68b 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -112,7 +112,7 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
 int sgx_encl_test_and_clear_young(struct mm_struct *mm,
 				  struct sgx_encl_page *page);
-
+void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
 struct sgx_epc_page *sgx_alloc_va_page(void);
 unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
 void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index e5be992897f4..9b96f4e0a17a 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -135,36 +135,9 @@ static void sgx_reclaimer_block(struct sgx_epc_page *epc_page)
 	struct sgx_encl_page *page = epc_page->owner;
 	unsigned long addr = page->desc & PAGE_MASK;
 	struct sgx_encl *encl = page->encl;
-	unsigned long mm_list_version;
-	struct sgx_encl_mm *encl_mm;
-	struct vm_area_struct *vma;
-	int idx, ret;
-
-	do {
-		mm_list_version = encl->mm_list_version;
-
-		/* Pairs with smp_rmb() in sgx_encl_mm_add(). */
-		smp_rmb();
-
-		idx = srcu_read_lock(&encl->srcu);
-
-		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
-			if (!mmget_not_zero(encl_mm->mm))
-				continue;
-
-			mmap_read_lock(encl_mm->mm);
-
-			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
-			if (!ret && encl == vma->vm_private_data)
-				zap_vma_ptes(vma, addr, PAGE_SIZE);
-
-			mmap_read_unlock(encl_mm->mm);
-
-			mmput_async(encl_mm->mm);
-		}
+	int ret;
 
-		srcu_read_unlock(&encl->srcu, idx);
-	} while (unlikely(encl->mm_list_version != mm_list_version));
+	sgx_zap_enclave_ptes(encl, addr);
 
 	mutex_lock(&encl->lock);
 
-- 
2.25.1



* [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (6 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 07/25] x86/sgx: Move PTE zap code to separate function Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 23:00   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 09/25] x86/sgx: Keep record of SGX page type Reinette Chatre
                   ` (17 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The ETRACK instruction followed by an IPI to all CPUs within an enclave
is a common pattern with more frequent use in support of SGX2.

Make the (empty) IPI callback function available internally in
preparation for more usages.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 2 +-
 arch/x86/kernel/cpu/sgx/sgx.h  | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 9b96f4e0a17a..887648ce6084 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -170,7 +170,7 @@ static int __sgx_encl_ewb(struct sgx_epc_page *epc_page, void *va_slot,
 	return ret;
 }
 
-static void sgx_ipi_cb(void *info)
+void sgx_ipi_cb(void *info)
 {
 }
 
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 9ec3136c7800..ca89d625aa74 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -89,6 +89,8 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
 
+void sgx_ipi_cb(void *info);
+
 #ifdef CONFIG_X86_SGX_KVM
 int __init sgx_vepc_init(void);
 #else
-- 
2.25.1



* [PATCH 09/25] x86/sgx: Keep record of SGX page type
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (7 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 23:03   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
                   ` (16 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

SGX2 functions are not allowed on all page types. For example,
ENCLS[EMODPR] is only allowed on regular SGX enclave pages and
ENCLS[EMODPT] is only allowed on TCS and regular pages. If these
functions are attempted on another type of page the hardware would
trigger a fault.

Keep a record of the SGX page type so that there is more
certainty whether an SGX2 instruction can succeed and faults
can be treated as real failures.

The page type is made to be a property of struct sgx_encl_page
and thus does not cover the VA page type. VA pages are maintained
in separate structures and thus their type can be determined in
a different way. The SGX2 instructions being supported do not
operate on VA pages and this is thus not a scenario needing to
be covered at this time.

With the protection bits consuming 16 bits of the unsigned long
there is room available in the bitfield to include the page type
information without increasing the space consumed by the struct.
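
As an illustration, the page type can be decoded from the SECINFO flags
and then used to decide which SGX2 function may be attempted. In the
sketch below the enum values and the position of the page type field
(bits 15:8) are assumed to mirror enum sgx_page_type and
SGX_SECINFO_PAGE_TYPE_MASK. All identifiers are hypothetical userspace
stand-ins.

```c
#include <stdint.h>

/* Assumed to mirror the order of enum sgx_page_type. */
enum page_type_model { PT_SECS, PT_TCS, PT_REG, PT_VA, PT_TRIM };

/* Assumed position of the page type field within the SECINFO flags. */
#define SECINFO_PAGE_TYPE_SHIFT	8
#define SECINFO_PAGE_TYPE_MASK	(UINT64_C(0xff) << SECINFO_PAGE_TYPE_SHIFT)

/* Extract the page type from the SECINFO flags. */
static unsigned int secinfo_page_type(uint64_t flags)
{
	return (unsigned int)((flags & SECINFO_PAGE_TYPE_MASK) >>
			      SECINFO_PAGE_TYPE_SHIFT);
}

/* ENCLS[EMODPR] is only allowed on regular pages. */
static int emodpr_allowed(unsigned int type)
{
	return type == PT_REG;
}

/* ENCLS[EMODPT] is only allowed on TCS and regular pages. */
static int emodpt_allowed(unsigned int type)
{
	return type == PT_TCS || type == PT_REG;
}
```

Checking the recorded type before issuing the instruction is what
allows a subsequent fault to be treated as a real failure.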

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/include/asm/sgx.h      | 3 +++
 arch/x86/kernel/cpu/sgx/encl.h  | 1 +
 arch/x86/kernel/cpu/sgx/ioctl.c | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index ebae2a153c66..8b6cbedada96 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -221,6 +221,9 @@ struct sgx_pageinfo {
  * %SGX_PAGE_TYPE_REG:	a regular page
  * %SGX_PAGE_TYPE_VA:	a VA page
  * %SGX_PAGE_TYPE_TRIM:	a page in trimmed state
+ *
+ * Make sure when making changes to this enum that its values can still fit
+ * in the bitfield within &struct sgx_encl_page
  */
 enum sgx_page_type {
 	SGX_PAGE_TYPE_SECS,
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 82e21088e68b..cb9f16d457ac 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -29,6 +29,7 @@ struct sgx_encl_page {
 	unsigned long desc;
 	unsigned long vm_max_prot_bits:8;
 	unsigned long vm_run_prot_bits:8;
+	enum sgx_page_type type:16;
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl *encl;
 	struct sgx_va_page *va_page;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 7e0819a89532..491d2700a54d 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -107,6 +107,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
 		set_bit(SGX_ENCL_DEBUG, &encl->flags);
 
 	encl->secs.encl = encl;
+	encl->secs.type = SGX_PAGE_TYPE_SECS;
 	encl->base = secs->base;
 	encl->size = secs->size;
 	encl->attributes = secs->attributes;
@@ -350,6 +351,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
 	 */
 	encl_page->encl = encl;
 	encl_page->epc_page = epc_page;
+	encl_page->type = (secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;
 	encl->secs_child_cnt++;
 
 	if (flags & SGX_PAGE_MEASURE) {
-- 
2.25.1



* [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (8 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 09/25] x86/sgx: Keep record of SGX page type Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-02 23:48   ` Dave Hansen
                     ` (4 more replies)
  2021-12-01 19:23 ` [PATCH 11/25] selftests/sgx: Add test for EPCM " Reinette Chatre
                   ` (15 subsequent siblings)
  25 siblings, 5 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

In the initial (SGX1) version of SGX, pages in an enclave need to be
created with permissions that support all usages of the pages, from the
time the enclave is initialized until it is unloaded. For example,
pages used by a JIT compiler or when code needs to otherwise be
relocated need to always have RWX permissions.

SGX2 includes two functions that can be used to modify the enclave page
permissions of regular enclave pages within an initialized enclave.
ENCLS[EMODPR] is run from the OS and used to restrict enclave page
permissions while ENCLU[EMODPE] is run from within the enclave to
extend enclave page permissions.

Enclave page permission changes need to be approached with care and
for this reason this initial support is to allow enclave page
permission changes _only_ if the new permissions are the same or
more restrictive than the permissions originally vetted at the time the
pages were added to the enclave. Support for extending enclave page
permissions beyond what was originally vetted is deferred.

Whether enclave page permissions are restricted or extended it
is necessary to ensure that the page table entries and enclave page
permissions are in sync. Introduce a new ioctl, SGX_IOC_PAGE_MODP, to
support enclave page permission changes. Since the OS has no insight
in how permissions may have been extended from within the enclave all
page permission requests are treated as permission restrictions. This
ioctl is used when enclave page permissions need to be restricted via
the OS as well as after enclave page permissions have been extended
from within the enclave (to ensure correct page table entries are
generated). With this ioctl the user specifies a page range and the
permissions to be applied to all pages in the provided range. The ioctl
itself can return an error code based on failures encountered by the OS.
It is also possible for SGX specific failures to be encountered. Add a
result output parameter to communicate the SGX return code. It is
possible for the permission change request to fail on a particular
page. To support partial success the ioctl will return the number
of bytes (a multiple of the page size) that were successfully changed.
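
A caller could handle partial success along the following lines. This
is a sketch only and rests on assumptions that this message does not
spell out: that the count member is reported back to the caller even
when the request fails, and that a busy page is reported as -EAGAIN.
The issue hook is a hypothetical stand-in for
ioctl(fd, SGX_IOC_PAGE_MODP, &req), and fake_issue() exists purely to
exercise the loop.

```c
#include <errno.h>
#include <stdint.h>

/* Layout copied from struct sgx_page_modp in this patch. */
struct page_modp_req {
	uint64_t offset;
	uint64_t length;
	uint64_t prot;
	uint64_t result;
	uint64_t count;
};

/* Hypothetical hook standing in for the SGX_IOC_PAGE_MODP ioctl. */
typedef int (*modp_issue_fn)(struct page_modp_req *req);

/* Apply @prot to a range, resuming after the bytes already changed. */
static int modp_range(modp_issue_fn issue, uint64_t offset, uint64_t length,
		      uint64_t prot, unsigned int max_tries)
{
	struct page_modp_req req;
	int ret = -EAGAIN;

	while (length && max_tries--) {
		req = (struct page_modp_req){
			.offset = offset, .length = length, .prot = prot,
		};

		ret = issue(&req);
		if (req.count > length)
			return -EIO;	/* inconsistent count reported */

		/* Skip what was changed, whether or not the call failed. */
		offset += req.count;
		length -= req.count;

		if (ret && ret != -EAGAIN)
			return ret;
	}

	return length ? ret : 0;
}

/* Demo hook: changes at most one 4096 byte page per call. */
static int fake_issue(struct page_modp_req *req)
{
	req->count = req->length > 4096 ? 4096 : req->length;
	return req->count < req->length ? -EAGAIN : 0;
}
```

Bounding the number of attempts keeps a persistently busy page from
turning the retry into an endless loop.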

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/include/uapi/asm/sgx.h |  20 +++
 arch/x86/kernel/cpu/sgx/encl.c  |   4 +-
 arch/x86/kernel/cpu/sgx/encl.h  |   3 +
 arch/x86/kernel/cpu/sgx/ioctl.c | 235 ++++++++++++++++++++++++++++++++
 4 files changed, 260 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index f4b81587e90b..24bebc31e336 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -29,6 +29,8 @@ enum sgx_page_flags {
 	_IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
 #define SGX_IOC_VEPC_REMOVE_ALL \
 	_IO(SGX_MAGIC, 0x04)
+#define SGX_IOC_PAGE_MODP \
+	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
@@ -76,6 +78,24 @@ struct sgx_enclave_provision {
 	__u64 fd;
 };
 
+/**
+ * struct sgx_page_modp - parameter structure for the %SGX_IOC_PAGE_MODP ioctl
+ * @offset:	starting page offset (page aligned relative to enclave base
+ *		address defined in SECS)
+ * @length:	length of memory (multiple of the page size)
+ * @prot:	new protection bits of pages in range described by @offset
+ *		and @length
+ * @result:	SGX result code of ENCLS[EMODPR] function
+ * @count:	bytes successfully changed (multiple of page size)
+ */
+struct sgx_page_modp {
+	__u64 offset;
+	__u64 length;
+	__u64 prot;
+	__u64 result;
+	__u64 count;
+};
+
 struct sgx_enclave_run;
 
 /**
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index ba39186d5a28..03c4d7e00b44 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -90,8 +90,8 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 	return epc_page;
 }
 
-static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
-						unsigned long addr)
+struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
+					 unsigned long addr)
 {
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl_page *entry;
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index cb9f16d457ac..848a28d28d3d 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -120,4 +120,7 @@ void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
 bool sgx_va_page_full(struct sgx_va_page *va_page);
 void sgx_encl_free_epc_page(struct sgx_epc_page *page);
 
+struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
+					 unsigned long addr);
+
 #endif /* _X86_ENCL_H */
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 491d2700a54d..5dddb3c9f742 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -682,6 +682,238 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
 	return sgx_set_attribute(&encl->attributes_mask, params.fd);
 }
 
+/**
+ * sgx_page_modp - Align enclave (EPCM) and OS (PTE) view of page permission
+ * @encl:	Enclave to which the pages belong.
+ * @modp:	Checked parameters from user on which pages need modifying
+ *		and their new permissions.
+ *
+ * SGX2 distinguishes between extending and restricting the enclave page
+ * permissions maintained by the hardware (EPCM permissions) of pages
+ * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
+ *
+ * EPCM permissions can be extended anytime directly from the enclave with
+ * no visibility from the OS. This is accomplished with ENCLU[EMODPE]
+ * run from within enclave. Accessing pages with the new, extended,
+ * permissions requires the OS to update the PTE to handle the subsequent
+ * #PF correctly.
+ *
+ * EPCM permissions cannot be restricted from within the enclave. The enclave
+ * requires the OS to run the privileged level 0 instructions ENCLS[EMODPR]
+ * and ENCLS[ETRACK] to achieve this.
+ *
+ * Since OS does not have insight into enclave's ENCLU[EMODPE] calls all
+ * EPCM permission changes are treated as restricting of (EPCM) permissions.
+ * Page table entries are cleared to ensure that the fault handler installs
+ * new entries with correct permissions.
+ *
+ * Return:
+ * - 0:		Success.
+ * - -errno:	Otherwise.
+ */
+static long sgx_page_modp(struct sgx_encl *encl, struct sgx_page_modp *modp)
+{
+	unsigned long vm_prot, run_prot_restore;
+	struct sgx_encl_page *entry;
+	struct sgx_secinfo secinfo;
+	unsigned long addr;
+	u64 secinfo_perm;
+	unsigned long c;
+	void *epc_virt;
+	int ret;
+
+	secinfo_perm = modp->prot & SGX_SECINFO_PERMISSION_MASK;
+
+	if ((secinfo_perm & SGX_SECINFO_W) && !(secinfo_perm & SGX_SECINFO_R))
+		return -EINVAL;
+
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = secinfo_perm;
+
+	vm_prot = _calc_vm_trans(secinfo.flags, SGX_SECINFO_R, PROT_READ)  |
+		  _calc_vm_trans(secinfo.flags, SGX_SECINFO_W, PROT_WRITE) |
+		  _calc_vm_trans(secinfo.flags, SGX_SECINFO_X, PROT_EXEC);
+	vm_prot = calc_vm_prot_bits(vm_prot, 0);
+
+	for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
+		addr = encl->base + modp->offset + c;
+
+		mutex_lock(&encl->lock);
+
+		entry = sgx_encl_load_page(encl, addr);
+		if (IS_ERR(entry)) {
+			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
+			goto out_unlock;
+		}
+
+		/*
+		 * Changing EPCM permissions is only supported on regular
+		 * SGX pages. Attempting this change on other pages will
+		 * result in #PF.
+		 */
+		if (entry->type != SGX_PAGE_TYPE_REG) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+
+		/*
+		 * Do not verify if current runtime protection bits are what
+		 * is being requested. The enclave may have done some
+		 * permission extending calls without letting OS know and
+		 * thus permission restriction may still be needed even if
+		 * from OS perspective the permissions are unchanged.
+		 */
+
+		/* Do not exceed permissions that have been vetted. */
+		if ((entry->vm_max_prot_bits & vm_prot) != vm_prot) {
+			ret = -EPERM;
+			goto out_unlock;
+		}
+
+		/* Make sure page stays around while releasing mutex. */
+		if (sgx_unmark_page_reclaimable(entry->epc_page)) {
+			ret = -EAGAIN;
+			goto out_unlock;
+		}
+
+		/*
+		 * Change runtime protection before zapping PTEs to ensure
+		 * any new #PF uses new permissions. EPCM permissions not
+		 * changed yet.
+		 */
+		run_prot_restore = entry->vm_run_prot_bits;
+		entry->vm_run_prot_bits = vm_prot;
+
+		mutex_unlock(&encl->lock);
+		/*
+		 * Do not keep encl->lock because of dependency on
+		 * mmap_lock acquired in sgx_zap_enclave_ptes().
+		 */
+		sgx_zap_enclave_ptes(encl, addr);
+
+		mutex_lock(&encl->lock);
+
+		/* Change EPCM permissions. */
+		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
+		ret = __emodpr(&secinfo, epc_virt);
+		if (encls_faulted(ret)) {
+			/*
+			 * All possible faults should be avoidable:
+			 * parameters have been checked, will only change
+			 * permissions of a regular page, and no concurrent
+			 * SGX1/SGX2 ENCLS instructions since these
+			 * are protected with mutex.
+			 */
+			pr_err_once("EMODPR encountered exception %d\n",
+				    ENCLS_TRAPNR(ret));
+			ret = -EFAULT;
+			goto out_prot_restore;
+		}
+		if (encls_failed(ret)) {
+			modp->result = ret;
+			ret = -EFAULT;
+			goto out_prot_restore;
+		}
+
+		epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
+		ret = __etrack(epc_virt);
+		if (ret) {
+			/*
+			 * ETRACK only fails when there is an OS issue. For
+			 * example, two consecutive ETRACKs were issued without
+			 * a completed IPI in between.
+			 */
+			pr_err_once("ETRACK returned %d (0x%x)\n", ret, ret);
+			/*
+			 * Send IPIs to kick CPUs out of the enclave and
+			 * try ETRACK again.
+			 */
+			on_each_cpu_mask(sgx_encl_cpumask(encl),
+					 sgx_ipi_cb, NULL, 1);
+			ret = __etrack(epc_virt);
+			if (ret) {
+				pr_err_once("ETRACK repeat returned %d (0x%x)\n",
+					    ret, ret);
+				ret = -EFAULT;
+				goto out_reclaim;
+			}
+		}
+		on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
+
+		sgx_mark_page_reclaimable(entry->epc_page);
+		mutex_unlock(&encl->lock);
+	}
+
+	ret = 0;
+	goto out;
+
+out_prot_restore:
+	entry->vm_run_prot_bits = run_prot_restore;
+out_reclaim:
+	sgx_mark_page_reclaimable(entry->epc_page);
+out_unlock:
+	mutex_unlock(&encl->lock);
+out:
+	modp->count = c;
+
+	return ret;
+}
+
+/**
+ * sgx_ioc_page_modp() - handler for %SGX_IOC_PAGE_MODP
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to a &struct sgx_page_modp instance
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_ioc_page_modp(struct sgx_encl *encl, void __user *arg)
+{
+	struct sgx_page_modp params;
+	long ret;
+
+	/*
+	 * Ensure that there is a chance this could succeed: (1) SGX2
+	 * is required, and (2) only pages in an initialized enclave could
+	 * be modified.
+	 */
+	if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
+		return -ENODEV;
+
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return -EINVAL;
+
+	/*
+	 * Obtain parameters from user and perform sanity checks.
+	 */
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (!IS_ALIGNED(params.offset, PAGE_SIZE))
+		return -EINVAL;
+
+	if (!params.length || params.length & (PAGE_SIZE - 1))
+		return -EINVAL;
+
+	if (params.offset + params.length - PAGE_SIZE >= encl->size)
+		return -EINVAL;
+
+	if (params.prot & ~SGX_SECINFO_PERMISSION_MASK)
+		return -EINVAL;
+
+	if (params.result || params.count)
+		return -EINVAL;
+
+	ret = sgx_page_modp(encl, &params);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -703,6 +935,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	case SGX_IOC_ENCLAVE_PROVISION:
 		ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
 		break;
+	case SGX_IOC_PAGE_MODP:
+		ret = sgx_ioc_page_modp(encl, (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;
-- 
2.25.1
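
For readers following the parameter checks in sgx_ioc_page_modp() above,
the same rules can be sketched as a stand-alone user-space helper. This is
only an illustration and not part of the patch: modp_params_valid(),
MODP_PAGE_SIZE and MODP_PERM_MASK are hypothetical names, the 4 KiB page
size and the R|W|X mask value are assumptions, and the enclave size is
assumed to be a multiple of the page size. The bounds check is written in
a subtraction form that cannot wrap for very large offsets, which differs
from the expression used in the patch. The checks on the result and count
fields are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the kernel headers. */
#define MODP_PAGE_SIZE 4096ULL
#define MODP_PERM_MASK 0x7ULL /* R | W | X */

/*
 * Hypothetical helper mirroring the sanity checks of the ioctl handler:
 * page-aligned offset, non-zero page-multiple length, range inside the
 * enclave, and no bits outside the permission mask.
 */
static bool modp_params_valid(uint64_t offset, uint64_t length,
			      uint64_t prot, uint64_t encl_size)
{
	if (offset & (MODP_PAGE_SIZE - 1))
		return false;
	if (!length || (length & (MODP_PAGE_SIZE - 1)))
		return false;
	/* Overflow-safe form of "offset + length <= encl_size". */
	if (offset > encl_size || length > encl_size - offset)
		return false;
	if (prot & ~MODP_PERM_MASK)
		return false;
	return true;
}
```

A caller could run such a check before issuing the ioctl to tell a
malformed request apart from a failure reported by the kernel.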



* [PATCH 11/25] selftests/sgx: Add test for EPCM permission changes
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (9 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 12/25] selftests/sgx: Add test for TCS page " Reinette Chatre
                   ` (14 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

EPCM permissions can be changed from within the enclave (to extend
permissions) or from outside it (to restrict permissions). OS
support is needed when permissions are restricted, both to call
the privileged ENCLS[EMODPR] instruction and to ensure that PTEs
allowing the permissions being removed are flushed. EPCM permissions
can be extended via ENCLU[EMODPE] from within the enclave.

Add a test that exercises a few of the enclave page permission flows:
1) Test starts with a RW (from enclave and OS perspective) enclave page
   that is mapped via a RW VMA.
2) The SGX_IOC_PAGE_MODP ioctl is used to restrict the enclave (EPCM)
   page permissions to read-only (OS removes page table entry in the
   process).
3) Run ENCLU[EACCEPT] from within the enclave to accept the new page
   permissions.
4) Attempt to write to the enclave page from within the enclave - this
   should fail with a page fault on the page table entry since the page
   table entry accurately reflects the EPCM permissions.
5) Restore EPCM permissions to RW by running ENCLU[EMODPE] from within
   the enclave.
6) Attempt to write to the enclave page from within the enclave - this
   should fail again with a page fault because even though the EPCM
   permissions are RW the page table entries do not yet reflect that.
7) The SGX_IOC_PAGE_MODP ioctl is used to inform OS of new page
   permissions and page table entries will accurately reflect RW EPCM
   permissions.
8) Writing to enclave page from within enclave succeeds.
9) Ensure EPCM.PR is clear by running ENCLU[EACCEPT].
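
The nine steps above can be traced with a small user-space model. This is
a toy sketch, not the selftest and not kernel code: struct page_model and
the model_*() helpers are hypothetical names. It assumes, as described in
this series, that a write succeeds only when both the OS view (used for
PTEs) and the EPCM allow it, that EMODPR can only restrict EPCM
permissions and always sets PR, and that PR by itself does not block
access.

```c
#include <assert.h>
#include <stdbool.h>

#define P_R 0x1u
#define P_W 0x2u

/* Toy model of one enclave page; every name here is hypothetical. */
struct page_model {
	unsigned int epcm;	/* permissions recorded in the EPCM */
	unsigned int os_prot;	/* run-time permissions the OS uses for PTEs */
	bool pr;		/* EPCM.PR: change awaiting EACCEPT */
};

/* SGX_IOC_PAGE_MODP: update the OS view; EMODPR can only restrict EPCM. */
static void model_modp(struct page_model *p, unsigned int prot)
{
	p->os_prot = prot;
	p->epcm &= prot;
	p->pr = true;
}

/* ENCLU[EMODPE] from inside the enclave: extends EPCM, OS is not told. */
static void model_emodpe(struct page_model *p, unsigned int prot)
{
	p->epcm |= prot;
}

/* ENCLU[EACCEPT]: succeeds when the expected state matches, clears PR. */
static bool model_eaccept(struct page_model *p, unsigned int prot)
{
	if (!p->pr || p->epcm != prot)
		return false;
	p->pr = false;
	return true;
}

/* A write needs both the OS view and the EPCM to allow it. */
static bool model_can_write(const struct page_model *p)
{
	return (p->os_prot & P_W) && (p->epcm & P_W);
}
```

Walking the model through steps 1 to 9 gives: writable, restricted and
accepted, not writable, EPCM extended, still not writable, OS informed,
writable, PR cleared by the final accept.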

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/defines.h   |  15 ++
 tools/testing/selftests/sgx/main.c      | 264 ++++++++++++++++++++++++
 tools/testing/selftests/sgx/test_encl.c |  38 ++++
 3 files changed, 317 insertions(+)

diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
index 02d775789ea7..b638eb98c80c 100644
--- a/tools/testing/selftests/sgx/defines.h
+++ b/tools/testing/selftests/sgx/defines.h
@@ -24,6 +24,8 @@ enum encl_op_type {
 	ENCL_OP_PUT_TO_ADDRESS,
 	ENCL_OP_GET_FROM_ADDRESS,
 	ENCL_OP_NOP,
+	ENCL_OP_EACCEPT,
+	ENCL_OP_EMODPE,
 	ENCL_OP_MAX,
 };
 
@@ -53,4 +55,17 @@ struct encl_op_get_from_addr {
 	uint64_t addr;
 };
 
+struct encl_op_eaccept {
+	struct encl_op_header header;
+	uint64_t epc_addr;
+	uint64_t flags;
+	uint64_t ret;
+};
+
+struct encl_op_emodpe {
+	struct encl_op_header header;
+	uint64_t epc_addr;
+	uint64_t flags;
+};
+
 #endif /* DEFINES_H */
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 7e912db4c6c5..dbd071ba03fe 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -24,6 +24,18 @@ static const uint64_t MAGIC = 0x1122334455667788ULL;
 static const uint64_t MAGIC2 = 0x8877665544332211ULL;
 vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave;
 
+/*
+ * Security Information (SECINFO) data structure needed by a few SGX
+ * instructions (eg. ENCLU[EACCEPT] and ENCLU[EMODPE]) holds meta-data
+ * about an enclave page. &enum sgx_secinfo_page_state specifies the
+ * secinfo flags used for page state.
+ */
+enum sgx_secinfo_page_state {
+	SGX_SECINFO_PENDING = (1 << 3),
+	SGX_SECINFO_MODIFIED = (1 << 4),
+	SGX_SECINFO_PR = (1 << 5),
+};
+
 struct vdso_symtab {
 	Elf64_Sym *elf_symtab;
 	const char *elf_symstrtab;
@@ -555,4 +567,256 @@ TEST_F(enclave, pte_permissions)
 	EXPECT_EQ(self->run.exception_addr, 0);
 }
 
+/*
+ * Enclave page permission test.
+ *
+ * Modify and restore enclave page's EPCM (enclave) permissions from
+ * outside enclave (ENCLS[EMODPR] via OS) as well as from within enclave (via
+ * ENCLU[EMODPE]). Kernel should ensure PTE permissions are the same as
+ * the EPCM permissions so check for page fault if VMA allows access but
+ * EPCM and PTE do not.
+ */
+TEST_F(enclave, epcm_permissions)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_eaccept eaccept_op;
+	struct encl_op_emodpe emodpe_op;
+	unsigned long data_start;
+	struct sgx_page_modp ioc;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Ensure kernel supports needed ioctl and system supports needed
+	 * commands.
+	 */
+	memset(&ioc, 0, sizeof(ioc));
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODP ioctl");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Page that will have its permissions changed is the second data
+	 * page in the .data segment. This forms part of the local encl_buffer
+	 * within the enclave.
+	 *
+	 * At start of test @data_start should have EPCM as well as PTE
+	 * permissions of RW.
+	 */
+
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before making
+	 * any changes to page permissions.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that
+	 * page is writable.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Change EPCM permissions to read-only, PTE entry flushed by OS in
+	 * the process.
+	 */
+	memset(&ioc, 0, sizeof(ioc));
+
+	ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	ioc.length = PAGE_SIZE;
+	ioc.prot = PROT_READ;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 4096);
+
+	/*
+	 * EPCM permissions changed from OS, need to EACCEPT from enclave.
+	 */
+	eaccept_op.epc_addr = data_start;
+	eaccept_op.flags = PROT_READ | SGX_SECINFO_REG | SGX_SECINFO_PR;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * EPCM permissions of page are now read-only, expect #PF
+	 * on PTE (not EPCM) when attempting to write to page from
+	 * within enclave.
+	 */
+	put_addr_op.value = MAGIC2;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_error_code, 0x7);
+	EXPECT_EQ(self->run.exception_addr, data_start);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/*
+	 * Received AEX but cannot return to enclave at same entrypoint,
+	 * need different TCS from where EPCM permission can be made writable
+	 * again.
+	 */
+	self->run.tcs = self->encl.encl_base + PAGE_SIZE;
+
+	/*
+	 * Enter enclave at new TCS to change EPCM permissions to be
+	 * writable again and thus fix the page fault that triggered the
+	 * AEX.
+	 */
+
+	emodpe_op.epc_addr = data_start;
+	emodpe_op.flags = PROT_READ | PROT_WRITE;
+	emodpe_op.header.type = ENCL_OP_EMODPE;
+
+	EXPECT_EQ(ENCL_CALL(&emodpe_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Attempt to return to main TCS to resume execution at faulting
+	 * instruction, but PTE should still prevent writing to the page.
+	 */
+	self->run.tcs = self->encl.encl_base;
+
+	EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0,
+					 ERESUME, 0, 0,
+					 &self->run),
+		  0);
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_error_code, 0x7);
+	EXPECT_EQ(self->run.exception_addr, data_start);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+	/*
+	 * Inform OS about new permissions to have PTEs match EPCM.
+	 */
+	memset(&ioc, 0, sizeof(ioc));
+
+	ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	ioc.length = PAGE_SIZE;
+	ioc.prot = PROT_READ | PROT_WRITE;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 4096);
+
+	/*
+	 * Wrong page permissions that caused original fault have
+	 * now been fixed via EPCM permissions as well as PTE.
+	 * Resume execution in main TCS to re-attempt the memory access.
+	 */
+	self->run.tcs = self->encl.encl_base;
+
+	EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0,
+					 ERESUME, 0, 0,
+					 &self->run),
+		  0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	get_addr_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC2);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.user_data, 0);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * The SGX_IOC_PAGE_MODP runs ENCLS[EMODPR] that sets EPCM.PR even
+	 * if permissions are not actually restricted. The previous memory
+	 * access succeeding shows that the PR flag does not prevent
+	 * access. Even so, include an ENCLU[EACCEPT] as reference
+	 * implementation to ensure EPCM does not have a dangling PR bit set.
+	 */
+
+	eaccept_op.epc_addr = data_start;
+	eaccept_op.flags = PROT_READ | PROT_WRITE | SGX_SECINFO_REG | SGX_SECINFO_PR;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+}
+
 TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c
index 4fca01cfd898..5b6c65331527 100644
--- a/tools/testing/selftests/sgx/test_encl.c
+++ b/tools/testing/selftests/sgx/test_encl.c
@@ -11,6 +11,42 @@
  */
 static uint8_t encl_buffer[8192] = { 1 };
 
+enum sgx_enclu_function {
+	EACCEPT = 0x5,
+	EMODPE = 0x6,
+};
+
+static void do_encl_emodpe(void *_op)
+{
+	struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
+	struct encl_op_emodpe *op = _op;
+
+	secinfo.flags = op->flags;
+
+	asm volatile(".byte 0x0f, 0x01, 0xd7"
+				:
+				: "a" (EMODPE),
+				  "b" (&secinfo),
+				  "c" (op->epc_addr));
+}
+
+static void do_encl_eaccept(void *_op)
+{
+	struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
+	struct encl_op_eaccept *op = _op;
+	int rax;
+
+	secinfo.flags = op->flags;
+
+	asm volatile(".byte 0x0f, 0x01, 0xd7"
+				: "=a" (rax)
+				: "a" (EACCEPT),
+				  "b" (&secinfo),
+				  "c" (op->epc_addr));
+
+	op->ret = rax;
+}
+
 static void *memcpy(void *dest, const void *src, size_t n)
 {
 	size_t i;
@@ -62,6 +98,8 @@ void encl_body(void *rdi,  void *rsi)
 		do_encl_op_put_to_addr,
 		do_encl_op_get_from_addr,
 		do_encl_op_nop,
+		do_encl_eaccept,
+		do_encl_emodpe,
 	};
 
 	struct encl_op_header *op = (struct encl_op_header *)rdi;
-- 
2.25.1



* [PATCH 12/25] selftests/sgx: Add test for TCS page permission changes
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (10 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 11/25] selftests/sgx: Add test for EPCM " Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave Reinette Chatre
                   ` (13 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Kernel should not allow permission changes on TCS pages. Add test to
confirm this behavior.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/main.c | 70 ++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index dbd071ba03fe..c7c50d05e246 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -120,6 +120,24 @@ static Elf64_Sym *vdso_symtab_get(struct vdso_symtab *symtab, const char *name)
 	return NULL;
 }
 
+/*
+ * Return the offset in the enclave where the TCS segment can be found.
+ * The first RW segment loaded is the TCS.
+ */
+static off_t encl_get_tcs_offset(struct encl *encl)
+{
+	int i;
+
+	for (i = 0; i < encl->nr_segments; i++) {
+		struct encl_segment *seg = &encl->segment_tbl[i];
+
+		if (i == 0 && seg->prot == (PROT_READ | PROT_WRITE))
+			return seg->offset;
+	}
+
+	return -1;
+}
+
 /*
  * Return the offset in the enclave where the data segment can be found.
  * The first RW segment loaded is the TCS, skip that to get info on the
@@ -567,6 +585,58 @@ TEST_F(enclave, pte_permissions)
 	EXPECT_EQ(self->run.exception_addr, 0);
 }
 
+/*
+ * Modifying permissions of TCS page should not be possible.
+ */
+TEST_F(enclave, tcs_permissions)
+{
+	struct sgx_page_modp ioc;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	memset(&ioc, 0, sizeof(ioc));
+
+	/*
+	 * Ensure kernel supports needed ioctl and system supports needed
+	 * commands.
+	 */
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODP ioctl");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Attempt to make TCS page read-only. This is not allowed and
+	 * should be prevented by OS.
+	 */
+	ioc.offset = encl_get_tcs_offset(&self->encl);
+	ioc.length = PAGE_SIZE;
+	ioc.prot = PROT_READ;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODP, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, -1);
+	EXPECT_EQ(errno_save, EINVAL);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 0);
+}
+
 /*
  * Enclave page permission test.
  *
-- 
2.25.1



* [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (11 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 12/25] selftests/sgx: Add test for TCS page " Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-03  0:38   ` Dave Hansen
                     ` (2 more replies)
  2021-12-01 19:23 ` [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
                   ` (12 subsequent siblings)
  25 siblings, 3 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

With SGX1 an enclave needs to be created with its maximum memory demands
allocated. Pages cannot be added to an enclave after it is initialized.
SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
pages to an initialized enclave. With SGX2 the enclave still needs to
set aside address space for its maximum memory demands during enclave
creation, but all pages need not be added before enclave initialization.
Pages can be added during enclave runtime.

Add support for dynamically adding pages to an initialized enclave,
architecturally limited to RW permission. Add pages via the page fault
handler at the time an enclave address without a backing enclave page
is accessed, potentially directly reclaiming pages if no free pages
are available.

The enclave is still required to run ENCLU[EACCEPT] on the page before
it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
on an uninitialized address. This will trigger the page fault handler
that will add the enclave page and return execution to the enclave to
repeat the ENCLU[EACCEPT] instruction, this time successful.

If the enclave accesses an uninitialized address in another way, for
example by expanding the enclave stack to a page that has not yet been
added, then the page fault handler adds the page on the first write.
Upon returning to the enclave the instruction that triggered the page
fault is repeated and, since ENCLU[EACCEPT] has not been run yet, it
triggers a second page fault, this time with the SGX flag set in the
page fault error code. The only way to recover is to enter the enclave
again and directly run the ENCLU[EACCEPT] instruction on the now
initialized address.

Accessing an uninitialized address from outside the enclave also triggers
this flow but the page will remain in PENDING state until accepted from
within the enclave.
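
The two in-enclave flows described above can be summarized with a small
state model. It is a sketch under stated assumptions, not the fault
handler: enum page_state, enum access_kind and model_enclave_access() are
hypothetical names, and the model only counts page faults and tracks
whether the page is absent, PENDING (added by EAUG but not yet accepted)
or usable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page states and access kinds used only by this model. */
enum page_state { PAGE_NONE, PAGE_PENDING, PAGE_VALID };
enum access_kind { ACCESS_EACCEPT, ACCESS_RW };

struct access_result {
	int page_faults;	/* faults taken during this attempt */
	bool completed;		/* true if the instruction finally succeeded */
};

static struct access_result model_enclave_access(enum page_state *state,
						  enum access_kind kind)
{
	struct access_result res = { 0, false };

	/* No backing page: #PF, handler runs EAUG, page becomes PENDING. */
	if (*state == PAGE_NONE) {
		res.page_faults++;
		*state = PAGE_PENDING;
	}

	/* The faulting instruction is now repeated. */
	if (kind == ACCESS_EACCEPT) {
		/* EACCEPT only has something to accept on a PENDING page. */
		if (*state == PAGE_PENDING) {
			*state = PAGE_VALID;
			res.completed = true;
		}
		return res;
	}

	if (*state == PAGE_PENDING) {
		/* Second #PF (SGX flag set); a separate EACCEPT is needed. */
		res.page_faults++;
		return res;
	}

	res.completed = true;
	return res;
}
```

In this model an ENCLU[EACCEPT] on an unbacked address finishes after one
fault, while a plain read or write takes two faults and leaves the page
PENDING until a later ENCLU[EACCEPT] is run.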

The page is added with the architecturally constrained RW permissions
as both its runtime and its maximum allowed permissions. It is understood
that some use cases, for example code relocation, require RWX maximum
permissions. Supporting these use cases requires guidance from user
space policy before such maximum permissions can be allowed. Integration
with user policy is deferred to a follow-up series.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c  | 133 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h  |   2 +
 arch/x86/kernel/cpu/sgx/ioctl.c |   4 +-
 3 files changed, 137 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 03c4d7e00b44..342b97dd4c33 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -124,6 +124,128 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 	return entry;
 }
 
+/**
+ * sgx_encl_eaug_page - Dynamically add page to initialized enclave
+ * @vma:	VMA obtained from fault info from where page is accessed
+ * @encl:	enclave accessing the page
+ * @addr:	address that triggered the page fault
+ *
+ * When an initialized enclave accesses a page with no backing EPC page
+ * on an SGX2 system then the EPC page can be added dynamically via the
+ * SGX2 ENCLS[EAUG] instruction.
+ *
+ * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
+ * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
+ */
+static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
+				     struct sgx_encl *encl, unsigned long addr)
+{
+	struct sgx_pageinfo pginfo = {0};
+	struct sgx_encl_page *encl_page;
+	struct sgx_epc_page *epc_page;
+	struct sgx_va_page *va_page;
+	unsigned long phys_addr;
+	unsigned long prot;
+	vm_fault_t vmret;
+	int ret;
+
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return VM_FAULT_SIGBUS;
+
+	encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
+	if (!encl_page)
+		return VM_FAULT_OOM;
+
+	encl_page->desc = addr;
+	encl_page->encl = encl;
+
+	/*
+	 * Adding a regular page that is architecturally allowed to only
+	 * be created with RW permissions.
+	 * TBD: Interface with user space policy to support max permissions
+	 * of RWX.
+	 */
+	prot = PROT_READ | PROT_WRITE;
+	encl_page->vm_run_prot_bits = calc_vm_prot_bits(prot, 0);
+	encl_page->vm_max_prot_bits = encl_page->vm_run_prot_bits;
+
+	epc_page = sgx_alloc_epc_page(encl_page, true);
+	if (IS_ERR(epc_page)) {
+		kfree(encl_page);
+		return VM_FAULT_SIGBUS;
+	}
+
+	va_page = sgx_encl_grow(encl);
+	if (IS_ERR(va_page)) {
+		ret = PTR_ERR(va_page);
+		goto err_out_free;
+	}
+
+	mutex_lock(&encl->lock);
+
+	/*
+	 * Copy comment from sgx_encl_add_page() to maintain guidance in
+	 * this similar flow:
+	 * Adding to encl->va_pages must be done under encl->lock.  Ditto for
+	 * deleting (via sgx_encl_shrink()) in the error path.
+	 */
+	if (va_page)
+		list_add(&va_page->list, &encl->va_pages);
+
+	ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
+			encl_page, GFP_KERNEL);
+	/*
+	 * If ret == -EBUSY then page was created in another flow while
+	 * running without encl->lock
+	 */
+	if (ret)
+		goto err_out_unlock;
+
+	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
+	pginfo.addr = encl_page->desc & PAGE_MASK;
+	pginfo.metadata = 0;
+
+	ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
+	if (ret)
+		goto err_out;
+
+	encl_page->encl = encl;
+	encl_page->epc_page = epc_page;
+	encl_page->type = SGX_PAGE_TYPE_REG;
+	encl->secs_child_cnt++;
+
+	sgx_mark_page_reclaimable(encl_page->epc_page);
+
+	phys_addr = sgx_get_epc_phys_addr(epc_page);
+	/*
+	 * Do not undo everything when creating PTE entry fails - next #PF
+	 * would find page ready for a PTE.
+	 * PAGE_SHARED because protection is forced to be RW above and COW
+	 * is not supported.
+	 */
+	vmret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
+				    PAGE_SHARED);
+	if (vmret != VM_FAULT_NOPAGE) {
+		mutex_unlock(&encl->lock);
+		return VM_FAULT_SIGBUS;
+	}
+	mutex_unlock(&encl->lock);
+	return VM_FAULT_NOPAGE;
+
+err_out:
+	xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
+
+err_out_unlock:
+	sgx_encl_shrink(encl, va_page);
+	mutex_unlock(&encl->lock);
+
+err_out_free:
+	sgx_encl_free_epc_page(epc_page);
+	kfree(encl_page);
+
+	return VM_FAULT_SIGBUS;
+}
+
 static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 {
 	unsigned long addr = (unsigned long)vmf->address;
@@ -145,6 +267,17 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	if (unlikely(!encl))
 		return VM_FAULT_SIGBUS;
 
+	/*
+	 * The page_array keeps track of all enclave pages, whether they
+	 * are swapped out or not. If there is no entry for this page and
+	 * the system supports SGX2 then it is possible to dynamically add
+	 * a new enclave page. This is only possible for an initialized
+	 * enclave, which is checked for right away.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_SGX2) &&
+	    (!xa_load(&encl->page_array, PFN_DOWN(addr))))
+		return sgx_encl_eaug_page(vma, encl, addr);
+
 	mutex_lock(&encl->lock);
 
 	entry = sgx_encl_load_page(encl, addr);
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 848a28d28d3d..1b6ce1da7c92 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -123,4 +123,6 @@ void sgx_encl_free_epc_page(struct sgx_epc_page *page);
 struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 					 unsigned long addr);
 
+struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl);
+void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
 #endif /* _X86_ENCL_H */
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 5dddb3c9f742..de0bf68ee842 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -17,7 +17,7 @@
 #include "encl.h"
 #include "encls.h"
 
-static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
+struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
 {
 	struct sgx_va_page *va_page = NULL;
 	void *err;
@@ -43,7 +43,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
 	return va_page;
 }
 
-static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
+void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
 {
 	encl->page_cnt--;
 
-- 
2.25.1



* [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (12 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 23:14   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 15/25] selftests/sgx: Test two different SGX2 EAUG flows Reinette Chatre
                   ` (11 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Before an enclave is initialized the enclave's memory range is unknown.
The enclave's memory range is learned at the time it is created via the
SGX_IOC_ENCLAVE_CREATE ioctl where the provided memory range is obtained
from an earlier mmap() of the sgx_enclave device. After an enclave is
initialized its memory can be mapped into user space (mmap()) from where
it can be entered at its defined entry points.

With the enclave's memory range known after it is initialized there is
no reason why it should be possible to map memory outside this range.

Lock down access to the initialized enclave's memory range by denying
any attempt to map memory outside its memory range.

Locking down the memory range also makes adding pages to an initialized
enclave more efficient. Pages are added to an initialized enclave by
accessing memory that belongs to the enclave's memory range but not yet
backed by an enclave page. If it is possible for user space to map
memory that does not form part of the enclave then an access to this
memory would eventually fail. The failure is a prompt general
protection fault if the access was an ENCLU[EACCEPT] from within the
enclave, a page fault via the vDSO if it was another access from
within the enclave, or a SIGBUS (also resulting from a page fault) if
the access was from outside the enclave.

Disallowing invalid memory to be mapped in the first place avoids
preventable failures.
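
The added test can be read in isolation as a simple range check. The
sketch below restates it in user-space C for illustration only: struct
encl_range and model_may_map() are hypothetical stand-ins for the fields
of struct sgx_encl and for the first test in sgx_encl_may_map(), and the
remaining checks of the real function are not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the struct sgx_encl fields used by the check. */
struct encl_range {
	unsigned long base;
	unsigned long size;
	bool initialized;
};

/* Returns 0 if [start, end) passes the range test, -EACCES otherwise. */
static int model_may_map(const struct encl_range *encl,
			 unsigned long start, unsigned long end)
{
	/* Only an initialized enclave has its range enforced. */
	if (encl->initialized &&
	    (start < encl->base || end > encl->base + encl->size))
		return -EACCES;

	return 0;
}
```

A mapping that starts below the enclave base or ends past base + size is
rejected once the enclave is initialized; before initialization the range
test does not apply.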

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 342b97dd4c33..37203da382f8 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -403,6 +403,10 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 
 	XA_STATE(xas, &encl->page_array, PFN_DOWN(start));
 
+	if (test_bit(SGX_ENCL_INITIALIZED, &encl->flags) &&
+	    (start < encl->base || end > encl->base + encl->size))
+		return -EACCES;
+
 	/*
 	 * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might
 	 * conflict with the enclave page permissions.
-- 
2.25.1



* [PATCH 15/25] selftests/sgx: Test two different SGX2 EAUG flows
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (13 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 16/25] x86/sgx: Support modifying SGX page type Reinette Chatre
                   ` (10 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Enclave pages can be added to an initialized enclave when an address
belonging to the enclave but without a backing page is accessed from
within the enclave.

Memory without a backing enclave page can be accessed from within an
enclave in different ways:
1) Pre-emptively run ENCLU[EACCEPT]. The addition of a page always needs
   to be accepted by the enclave via ENCLU[EACCEPT], which makes this
   flow efficient: the first execution of ENCLU[EACCEPT] triggers the
   addition of the page and, when execution returns to the same
   instruction, the second execution succeeds as the acceptance of the
   page.

2) A direct read or write. When a direct read or write triggers the
   page addition, execution cannot resume from the instruction
   (read/write) that triggered the fault. Instead the enclave needs to
   be entered at a different entry point to run the needed
   ENCLU[EACCEPT] before execution can return to the original entry
   point and the read/write instruction that faulted.
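
The tests in this patch tell the outcomes apart by the page fault error
code reported in the run structure (0x8007, 0x6 and 4). A minimal
sketch of how those values decode, assuming the architectural x86 #PF
error code layout; the macro and helper names are made up for this
example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page fault error code bits used by the tests. */
#define PF_PRESENT	(1u << 0)	/* fault on a present page */
#define PF_WRITE	(1u << 1)	/* access was a write */
#define PF_USER		(1u << 2)	/* access came from user mode */
#define PF_SGX		(1u << 15)	/* SGX access control violation */

/* Hypothetical helpers, not part of the selftest. */
static bool pf_is_present(uint32_t code) { return code & PF_PRESENT; }
static bool pf_is_write(uint32_t code)   { return code & PF_WRITE; }
static bool pf_is_user(uint32_t code)    { return code & PF_USER; }
static bool pf_is_sgx(uint32_t code)     { return code & PF_SGX; }
```

Under this decoding 0x8007 is a user mode write to a present page that
hardware refused for SGX reasons (the page is pending EACCEPT), 0x6 is
a user mode write to a page that is not present, and 4 is a user mode
read of a page that is not present.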

Add tests for both flows.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/main.c | 260 +++++++++++++++++++++++++++++
 1 file changed, 260 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index c7c50d05e246..bc8c7d06d74c 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -85,6 +85,30 @@ static bool vdso_get_symtab(void *addr, struct vdso_symtab *symtab)
 	return true;
 }
 
+static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
+			   unsigned int *ecx, unsigned int *edx)
+{
+	asm volatile("cpuid"
+	    : "=a" (*eax),
+	      "=b" (*ebx),
+	      "=c" (*ecx),
+	      "=d" (*edx)
+	    : "0" (*eax), "2" (*ecx)
+	    : "memory");
+}
+
+static inline int sgx2_supported(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	eax = SGX_CPUID;
+	ecx = 0x0;
+
+	__cpuid(&eax, &ebx, &ecx, &edx);
+
+	return eax & 0x2;
+}
+
 static unsigned long elf_sym_hash(const char *name)
 {
 	unsigned long h = 0, high;
@@ -889,4 +913,240 @@ TEST_F(enclave, epcm_permissions)
 	EXPECT_EQ(eaccept_op.ret, 0);
 }
 
+/*
+ * Test the addition of pages to an initialized enclave via writing to
+ * a page that belongs to the enclave's address space but was not added
+ * during enclave creation.
+ */
+TEST_F(enclave, augment)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_eaccept eaccept_op;
+	size_t total_size = 0;
+	void *addr;
+	int i;
+
+	if (!sgx2_supported())
+		SKIP(return, "SGX2 not supported");
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	for (i = 0; i < self->encl.nr_segments; i++) {
+		struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+		total_size += seg->size;
+	}
+
+	/*
+	 * Actual enclave size is expected to be larger than the loaded
+	 * test enclave since enclave size must be a power of 2 in bytes
+	 * and test_encl does not consume it all.
+	 */
+	EXPECT_LT(total_size + PAGE_SIZE, self->encl.encl_size);
+
+	/*
+	 * Create memory mapping for the page that will be added. New
+	 * memory mapping is for one page right after all existing
+	 * mappings.
+	 */
+	addr = mmap((void *)self->encl.encl_base + total_size, PAGE_SIZE,
+		    PROT_READ | PROT_WRITE | PROT_EXEC,
+		    MAP_SHARED | MAP_FIXED, self->encl.fd, 0);
+	EXPECT_NE(addr, MAP_FAILED);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/*
+	 * Attempt to write to the new page from within enclave.
+	 * Expected to fail since page is not (yet) part of the enclave.
+	 * The first #PF will trigger the addition of the page to the
+	 * enclave, but since the new page needs an EACCEPT from within the
+	 * enclave before it can be used it would not be possible
+	 * to successfully return to the failing instruction. This is the
+	 * cause of the second #PF captured here having the SGX bit set,
+	 * it is from hardware preventing the page from being used.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = (unsigned long)addr;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_addr, (unsigned long)addr);
+
+	if (self->run.exception_error_code == 0x6) {
+		munmap(addr, PAGE_SIZE);
+		SKIP(return, "Kernel does not support adding pages to initialized enclave");
+	}
+
+	EXPECT_EQ(self->run.exception_error_code, 0x8007);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/* Handle AEX by running EACCEPT from new entry point. */
+	self->run.tcs = self->encl.encl_base + PAGE_SIZE;
+
+	eaccept_op.epc_addr = self->encl.encl_base + total_size;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Can now return to main TCS to resume execution. */
+	self->run.tcs = self->encl.encl_base;
+
+	EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0,
+					 ERESUME, 0, 0,
+					 &self->run),
+		  0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present. Only change two test
+	 * parameters, rest are same as previous test.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = (unsigned long)addr;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	munmap(addr, PAGE_SIZE);
+}
+
+/*
+ * Test for the addition of pages to an initialized enclave via a
+ * pre-emptive run of EACCEPT on page to be added.
+ */
+TEST_F(enclave, augment_via_eaccept)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_eaccept eaccept_op;
+	size_t total_size = 0;
+	void *addr;
+	int i;
+
+	if (!sgx2_supported())
+		SKIP(return, "SGX2 not supported");
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	for (i = 0; i < self->encl.nr_segments; i++) {
+		struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+		total_size += seg->size;
+	}
+
+	/*
+	 * Actual enclave size is expected to be larger than the loaded
+	 * test enclave since enclave size must be a power of 2 in bytes while
+	 * test_encl does not consume it all.
+	 */
+	EXPECT_LT(total_size + PAGE_SIZE, self->encl.encl_size);
+
+	/*
+	 * mmap() a page at end of existing enclave to be used for dynamic
+	 * EPC page.
+	 */
+
+	addr = mmap((void *)self->encl.encl_base + total_size, PAGE_SIZE,
+		    PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED | MAP_FIXED,
+		    self->encl.fd, 0);
+	EXPECT_NE(addr, MAP_FAILED);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/*
+	 * Run EACCEPT on new page to trigger the #PF->EAUG->EACCEPT(again
+	 * without a #PF). All should be transparent to userspace.
+	 */
+	eaccept_op.epc_addr = self->encl.encl_base + total_size;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	if (self->run.exception_vector == 14 &&
+	    self->run.exception_error_code == 4 &&
+	    self->run.exception_addr == self->encl.encl_base + total_size) {
+		munmap(addr, PAGE_SIZE);
+		SKIP(return, "Kernel does not support adding pages to initialized enclave");
+	}
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * New page should be accessible from within enclave - attempt to
+	 * write to it.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = (unsigned long)addr;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present. Only change two test
+	 * parameters, rest are same as previous test.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = (unsigned long)addr;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	munmap(addr, PAGE_SIZE);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1



* [PATCH 16/25] x86/sgx: Support modifying SGX page type
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (14 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 15/25] selftests/sgx: Test two different SGX2 EAUG flows Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 23:45   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 17/25] x86/sgx: Support complete page removal Reinette Chatre
                   ` (9 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Every enclave contains one or more Thread Control Structures (TCS). The
TCS contains meta-data used by the hardware to save and restore thread
specific information when entering/exiting the enclave. With SGX1 an
enclave needs to be created with enough TCSs to support the largest
number of threads expected to use the enclave and enough enclave pages
to meet all its anticipated memory demands. In SGX1 all pages remain in
the enclave until the enclave is unloaded.

Earlier changes added support for the SGX2 feature where pages can be
added dynamically to an initialized enclave.

SGX2 introduces a new function, ENCLS[EMODT], that is used to change
the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave
page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a
regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS)
page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later
removal).
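
The transitions described above can be summarised in a small sketch.
The enum below is a local stand-in for the kernel's enum sgx_page_type
(the numeric values are stated here as an assumption about that enum),
and both helpers are hypothetical names used only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's page types (values assumed). */
enum page_type {
	PT_SECS = 0,
	PT_TCS  = 1,
	PT_REG  = 2,
	PT_VA   = 3,
	PT_TRIM = 4,
};

/* True if ENCLS[EMODT] may change a page from @cur to @new_type. */
static bool modt_allowed(enum page_type cur, enum page_type new_type)
{
	/* The only new page types are TCS and TRIM. */
	if (new_type != PT_TCS && new_type != PT_TRIM)
		return false;

	/* Regular pages may become TCS or TRIM, TCS pages only TRIM. */
	return cur == PT_REG || (cur == PT_TCS && new_type == PT_TRIM);
}

/* The page type occupies bits 15:8 of the SECINFO flags field. */
static uint64_t modt_secinfo_flags(enum page_type new_type)
{
	return (uint64_t)new_type << 8;
}
```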

With the existing support of dynamically adding regular enclave pages
to an initialized enclave and changing the page type to TCS it is
possible to dynamically increase the number of threads supported by an
enclave.

Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step
of dynamically removing pages from an initialized enclave. The complete
page removal flow is:
1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
   using the ioctl introduced here.
2) Approve the page removal by running ENCLU[EACCEPT] from within
   the enclave.
3) Initiate actual page removal using the new ioctl introduced in the
   following patch.

Support changing SGX enclave page types with a new ioctl. With this
ioctl the user specifies a page range and the enclave page type to be
applied to all pages in the provided range. The ioctl itself can return
an error code based on failures encountered by the OS. It is also
possible for SGX specific failures to be encountered. Add a result
output parameter to communicate the SGX return code. It is
possible for the enclave page type change request to fail on any page
within the provided range. Support partial success by returning
the number of bytes (a multiple of the page size) that were
successfully changed.
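
A rough sketch of how user space might prepare such a request and read
back the partial success information. The struct is a local copy of the
layout proposed in this patch, the 4096 byte page size is assumed, and
both helpers are hypothetical; the ioctl call itself is left out so the
example does not depend on SGX hardware:

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096ULL	/* assumed page size */

/* Local copy of the proposed struct sgx_page_modt layout. */
struct page_modt_req {
	uint64_t offset;	/* page aligned, relative to enclave base */
	uint64_t length;	/* multiple of the page size */
	uint64_t type;		/* new page type */
	uint64_t result;	/* out: SGX result code of ENCLS[EMODT] */
	uint64_t count;		/* out: bytes successfully changed */
};

/* Fill in a request; returns -1 if the range is not page granular. */
static int modt_req_init(struct page_modt_req *req, uint64_t offset,
			 uint64_t length, uint64_t type)
{
	if (offset % PAGE_SZ || !length || length % PAGE_SZ)
		return -1;

	/* result and count must be zero on input. */
	memset(req, 0, sizeof(*req));
	req->offset = offset;
	req->length = length;
	req->type = type;
	return 0;
}

/* Number of whole pages the kernel reported as changed. */
static uint64_t modt_pages_changed(const struct page_modt_req *req)
{
	return req->count / PAGE_SZ;
}
```

After the ioctl returns, a caller would compare count with length to
see whether the whole range was processed and consult result when the
failure came from ENCLS[EMODT] itself.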

After the page type is changed to SGX_PAGE_TYPE_TRIM the page continues
to be accessible from the OS perspective with page table entries and
internal state. The page may be moved to swap. Any invalid access
(any access except ENCLU[EACCEPT]) will encounter a page fault with
the SGX flag set in the error code until the page is removed. Removal of
trimmed enclave pages on user request will be supported in the following
patch. Trimmed enclave pages are also removed when the enclave is
unloaded.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/include/uapi/asm/sgx.h |  19 +++
 arch/x86/kernel/cpu/sgx/ioctl.c | 235 ++++++++++++++++++++++++++++++++
 2 files changed, 254 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 24bebc31e336..f70caccd166c 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -31,6 +31,8 @@ enum sgx_page_flags {
 	_IO(SGX_MAGIC, 0x04)
 #define SGX_IOC_PAGE_MODP \
 	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
+#define SGX_IOC_PAGE_MODT \
+	_IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
@@ -96,6 +98,23 @@ struct sgx_page_modp {
 	__u64 count;
 };
 
+/**
+ * struct sgx_page_modt - parameter structure for the %SGX_IOC_PAGE_MODT ioctl
+ * @offset:	starting page offset (page aligned relative to enclave base
+ *		address defined in SECS)
+ * @length:	length of memory (multiple of the page size)
+ * @type:	new type of pages in range described by @offset and @length
+ * @result:	SGX result code of ENCLS[EMODT] function
+ * @count:	bytes successfully changed (multiple of page size)
+ */
+struct sgx_page_modt {
+	__u64 offset;
+	__u64 length;
+	__u64 type;
+	__u64 result;
+	__u64 count;
+};
+
 struct sgx_enclave_run;
 
 /**
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index de0bf68ee842..a952d608ab35 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -914,6 +914,238 @@ static long sgx_ioc_page_modp(struct sgx_encl *encl, void __user *arg)
 	return ret;
 }
 
+/**
+ * sgx_page_modt - Modify type of SGX enclave pages
+ * @encl:	Enclave to which the pages belong.
+ * @modt:	Checked parameters from user about which pages need modifying
+ *		and their new type.
+ *
+ * Ability to change the enclave page type supports the following use cases:
+ * * It is possible to add TCS pages to enclave by changing the type of
+ * regular pages (SGX_PAGE_TYPE_REG) to TCS (SGX_PAGE_TYPE_TCS) pages. With
+ * this support the number of threads supported by an initialized enclave
+ * can be increased dynamically.
+ * * Regular or TCS pages can dynamically be removed from an initialized
+ * enclave by changing the page type to SGX_PAGE_TYPE_TRIM. Changing the
+ * page type to SGX_PAGE_TYPE_TRIM marks the page for removal with actual
+ * removal done by handler of %SGX_IOC_PAGE_REMOVE ioctl called after
+ * ENCLU[EACCEPT] is run on SGX_PAGE_TYPE_TRIM page from within the enclave.
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_page_modt(struct sgx_encl *encl, struct sgx_page_modt *modt)
+{
+	unsigned long max_prot_restore, run_prot_restore;
+	enum sgx_page_type page_type;
+	struct sgx_encl_page *entry;
+	struct sgx_secinfo secinfo;
+	unsigned long prot;
+	unsigned long addr;
+	unsigned long c;
+	void *epc_virt;
+	int ret;
+
+	page_type = modt->type & SGX_PAGE_TYPE_MASK;
+
+	/*
+	 * The only new page types allowed by hardware are PT_TCS and PT_TRIM.
+	 */
+	if (page_type != SGX_PAGE_TYPE_TCS && page_type != SGX_PAGE_TYPE_TRIM)
+		return -EINVAL;
+
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = page_type << 8;
+
+	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
+		addr = encl->base + modt->offset + c;
+
+		mutex_lock(&encl->lock);
+
+		entry = sgx_encl_load_page(encl, addr);
+		if (IS_ERR(entry)) {
+			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
+			goto out_unlock;
+		}
+
+		/*
+		 * Borrow the logic from the Intel SDM. Regular pages
+		 * (SGX_PAGE_TYPE_REG) can change type to SGX_PAGE_TYPE_TCS
+		 * or SGX_PAGE_TYPE_TRIM but TCS pages can only be trimmed.
+		 * CET pages not supported yet.
+		 */
+		if (!(entry->type == SGX_PAGE_TYPE_REG ||
+		      (entry->type == SGX_PAGE_TYPE_TCS &&
+		       page_type == SGX_PAGE_TYPE_TRIM))) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+
+		max_prot_restore = entry->vm_max_prot_bits;
+		run_prot_restore = entry->vm_run_prot_bits;
+
+		/*
+		 * Once a regular page becomes a TCS page it cannot be
+		 * changed back. So the maximum allowed protection reflects
+		 * the TCS page that is always RW from OS perspective but
+		 * will be inaccessible from within enclave. Before doing
+		 * so, do make sure that the new page type continues to
+		 * respect the originally vetted page permissions.
+		 */
+		if (entry->type == SGX_PAGE_TYPE_REG &&
+		    page_type == SGX_PAGE_TYPE_TCS) {
+			if (~entry->vm_max_prot_bits & (VM_READ | VM_WRITE)) {
+				ret = -EPERM;
+				goto out_unlock;
+			}
+			prot = PROT_READ | PROT_WRITE;
+			entry->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
+			entry->vm_run_prot_bits = entry->vm_max_prot_bits;
+
+			/*
+			 * Prevent page from being reclaimed while mutex
+			 * is released.
+			 */
+			if (sgx_unmark_page_reclaimable(entry->epc_page)) {
+				ret = -EAGAIN;
+				goto out_entry_changed;
+			}
+
+			/*
+			 * Do not keep encl->lock because of dependency on
+			 * mmap_lock acquired in sgx_zap_enclave_ptes().
+			 */
+			mutex_unlock(&encl->lock);
+
+			sgx_zap_enclave_ptes(encl, addr);
+
+			mutex_lock(&encl->lock);
+
+			sgx_mark_page_reclaimable(entry->epc_page);
+		}
+
+		/* Change EPC type */
+		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
+		ret = __emodt(&secinfo, epc_virt);
+		if (encls_faulted(ret)) {
+			/*
+			 * All possible faults should be avoidable:
+			 * parameters have been checked, will only change
+			 * valid page types, and no concurrent
+			 * SGX1/SGX2 ENCLS instructions since these are
+			 * protected with mutex.
+			 */
+			pr_err_once("EMODT encountered exception %d\n",
+				    ENCLS_TRAPNR(ret));
+			ret = -EFAULT;
+			goto out_entry_changed;
+		}
+		if (encls_failed(ret)) {
+			modt->result = ret;
+			ret = -EFAULT;
+			goto out_entry_changed;
+		}
+
+		epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
+		ret = __etrack(epc_virt);
+		if (ret) {
+			/*
+			 * ETRACK only fails when there is an OS issue. For
+			 * example, two consecutive ETRACKs were issued without
+			 * a completed IPI in between.
+			 */
+			pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
+			/*
+			 * Send IPIs to kick CPUs out of the enclave and
+			 * try ETRACK again.
+			 */
+			on_each_cpu_mask(sgx_encl_cpumask(encl),
+					 sgx_ipi_cb, NULL, 1);
+			ret = __etrack(epc_virt);
+			if (ret) {
+				pr_err_once("ETRACK repeat returned %d (0x%x)",
+					    ret, ret);
+				ret = -EFAULT;
+				goto out_unlock;
+			}
+		}
+		on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
+
+		entry->type = page_type;
+
+		mutex_unlock(&encl->lock);
+	}
+
+	ret = 0;
+	goto out;
+
+out_entry_changed:
+	entry->vm_max_prot_bits = max_prot_restore;
+	entry->vm_run_prot_bits = run_prot_restore;
+out_unlock:
+	mutex_unlock(&encl->lock);
+out:
+	modt->count = c;
+
+	return ret;
+}
+
+/**
+ * sgx_ioc_page_modt() - handler for %SGX_IOC_PAGE_MODT
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to a &struct sgx_page_modt instance
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_ioc_page_modt(struct sgx_encl *encl, void __user *arg)
+{
+	struct sgx_page_modt params;
+	long ret;
+
+	/*
+	 * Ensure that there is a chance the request could succeed:
+	 * (1) SGX2 is required.
+	 * (2) Only pages in an initialized enclave could be modified.
+	 */
+	if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
+		return -ENODEV;
+
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return -EINVAL;
+
+	/*
+	 * Obtain parameters from user and perform sanity checks.
+	 */
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (!IS_ALIGNED(params.offset, PAGE_SIZE))
+		return -EINVAL;
+
+	if (!params.length || params.length & (PAGE_SIZE - 1))
+		return -EINVAL;
+
+	if (params.offset + params.length - PAGE_SIZE >= encl->size)
+		return -EINVAL;
+
+	if (params.type & ~SGX_PAGE_TYPE_MASK)
+		return -EINVAL;
+
+	if (params.result || params.count)
+		return -EINVAL;
+
+	ret = sgx_page_modt(encl, &params);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -938,6 +1170,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	case SGX_IOC_PAGE_MODP:
 		ret = sgx_ioc_page_modp(encl, (void __user *)arg);
 		break;
+	case SGX_IOC_PAGE_MODT:
+		ret = sgx_ioc_page_modt(encl, (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;
-- 
2.25.1



* [PATCH 17/25] x86/sgx: Support complete page removal
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (15 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 16/25] x86/sgx: Support modifying SGX page type Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 23:45   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 18/25] selftests/sgx: Introduce dynamic entry point Reinette Chatre
                   ` (8 subsequent siblings)
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The SGX2 page removal flow was introduced in the previous patch and is
as follows:
1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
   using the ioctl introduced in the previous patch.
2) Approve the page removal by running ENCLU[EACCEPT] from within
   the enclave.
3) Initiate actual page removal using the new ioctl introduced here.

Support the final step of the SGX2 page removal flow with a new ioctl.
With this ioctl the user specifies a page range that should
be removed. At this time all pages in the provided range should have
the SGX_PAGE_TYPE_TRIM page type and the ioctl will fail with EPERM
(Operation not permitted) when it encounters a page that does not have
the correct type. Page removal can fail on any page within the
provided range. Support partial success by returning the number of bytes
(a multiple of the page size) that were successfully removed.
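
One way a caller could act on a partial success is to advance the
request past the bytes that were already removed before deciding how to
proceed. This is a sketch only: the struct is a local copy of the
layout proposed in this patch and remove_req_advance() is a
hypothetical helper, not something the series provides:

```c
#include <stdint.h>

/* Local copy of the proposed struct sgx_page_remove layout. */
struct page_remove_req {
	uint64_t offset;	/* page aligned, relative to enclave base */
	uint64_t length;	/* multiple of the page size */
	uint64_t count;		/* out: bytes successfully removed */
};

/*
 * Skip over the bytes the kernel reported as removed.
 * Returns 1 if part of the range is still outstanding, 0 if the whole
 * range was handled, and -1 if count is larger than length.
 */
static int remove_req_advance(struct page_remove_req *req)
{
	if (req->count > req->length)
		return -1;

	req->offset += req->count;
	req->length -= req->count;
	req->count = 0;		/* count must be zero on the next call */

	return req->length != 0;
}
```

When work is still outstanding, the page at the new offset is the one
that failed, so the caller would inspect the error (for example EPERM
for a page that is not trimmed or not yet accepted) rather than retry
blindly.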

Since actual page removal will succeed even if ENCLU[EACCEPT] was not
run from within the enclave, the ENCLS[EMODPR] instruction with RWX
permissions is used as a no-op mechanism to ensure ENCLU[EACCEPT] was
successfully run from within the enclave before the enclave page is
removed.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/include/uapi/asm/sgx.h |  21 +++++
 arch/x86/kernel/cpu/sgx/ioctl.c | 159 ++++++++++++++++++++++++++++++++
 2 files changed, 180 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index f70caccd166c..6648ded960f8 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -33,6 +33,8 @@ enum sgx_page_flags {
 	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
 #define SGX_IOC_PAGE_MODT \
 	_IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt)
+#define SGX_IOC_PAGE_REMOVE \
+	_IOWR(SGX_MAGIC, 0x07, struct sgx_page_remove)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
@@ -115,6 +117,25 @@ struct sgx_page_modt {
 	__u64 count;
 };
 
+/**
+ * struct sgx_page_remove - parameters for the %SGX_IOC_PAGE_REMOVE ioctl
+ * @offset:	starting page offset (page aligned relative to enclave base
+ *		address defined in SECS)
+ * @length:	length of memory (multiple of the page size)
+ * @count:	bytes successfully removed (multiple of page size)
+ *
+ * Regular (PT_REG) or TCS (PT_TCS) pages can be removed from an initialized
+ * enclave if the system supports SGX2. First, the %SGX_IOC_PAGE_MODT ioctl
+ * should be used to change the page type to PT_TRIM. After that succeeds
+ * ENCLU[EACCEPT] should be run from within the enclave and then this
+ * ioctl can be used to complete the page removal.
+ */
+struct sgx_page_remove {
+	__u64 offset;
+	__u64 length;
+	__u64 count;
+};
+
 struct sgx_enclave_run;
 
 /**
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index a952d608ab35..d11da6c53b26 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -1146,6 +1146,162 @@ static long sgx_ioc_page_modt(struct sgx_encl *encl, void __user *arg)
 	return ret;
 }
 
+/**
+ * sgx_page_remove - Remove trimmed pages from SGX enclave
+ * @encl:	Enclave to which the pages belong
+ * @params:	Checked parameters from user on which pages need to be removed
+ *
+ * Final step of the flow removing pages from an initialized enclave. The
+ * complete flow is:
+ * 1) User changes the type of the pages to be removed to %SGX_PAGE_TYPE_TRIM
+ *    using the %SGX_IOC_PAGE_MODT ioctl.
+ * 2) User approves the page removal by running ENCLU[EACCEPT] from within
+ *    the enclave.
+ * 3) User initiates actual page removal using the %SGX_IOC_PAGE_REMOVE
+ *    ioctl that is handled here.
+ *
+ * First remove any page table entries pointing to the page and then proceed
+ * with the actual removal of the enclave page and data in support of it.
+ *
+ * VA pages are not affected by this removal. It is thus possible that the
+ * enclave may end up with more VA pages than needed to support all its
+ * pages.
+ *
+ * Return:
+ * - 0:		Success.
+ * - -errno:	Otherwise.
+ */
+static long sgx_page_remove(struct sgx_encl *encl,
+			    struct sgx_page_remove *params)
+{
+	struct sgx_encl_page *entry;
+	struct sgx_secinfo secinfo;
+	unsigned long addr;
+	unsigned long c;
+	void *epc_virt;
+	int ret;
+
+	memset(&secinfo, 0, sizeof(secinfo));
+	secinfo.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
+
+	for (c = 0 ; c < params->length; c += PAGE_SIZE) {
+		addr = encl->base + params->offset + c;
+
+		mutex_lock(&encl->lock);
+
+		entry = sgx_encl_load_page(encl, addr);
+		if (IS_ERR(entry)) {
+			ret = -EFAULT;
+			goto out_unlock;
+		}
+
+		if (entry->type != SGX_PAGE_TYPE_TRIM) {
+			ret = -EPERM;
+			goto out_unlock;
+		}
+
+		/*
+		 * ENCLS[EMODPR] is a no-op instruction used to inform if
+		 * ENCLU[EACCEPT] was run from within the enclave. If
+		 * ENCLS[EMODPR] is run with RWX on a trimmed page that is
+		 * not yet accepted then it will return
+		 * %SGX_PAGE_NOT_MODIFIABLE, after the trimmed page is
+		 * accepted the instruction will encounter a page fault.
+		 */
+		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
+		ret = __emodpr(&secinfo, epc_virt);
+		if (!encls_faulted(ret) || ENCLS_TRAPNR(ret) != X86_TRAP_PF) {
+			ret = -EPERM;
+			goto out_unlock;
+		}
+
+		if (sgx_unmark_page_reclaimable(entry->epc_page)) {
+			ret = -EBUSY;
+			goto out_unlock;
+		}
+
+		/*
+		 * Do not keep encl->lock because of dependency on
+		 * mmap_lock acquired in sgx_zap_enclave_ptes().
+		 */
+		mutex_unlock(&encl->lock);
+
+		sgx_zap_enclave_ptes(encl, addr);
+
+		mutex_lock(&encl->lock);
+
+		sgx_encl_free_epc_page(entry->epc_page);
+		encl->secs_child_cnt--;
+		entry->epc_page = NULL;
+		xa_erase(&encl->page_array, PFN_DOWN(entry->desc));
+		sgx_encl_shrink(encl, NULL);
+		kfree(entry);
+
+		mutex_unlock(&encl->lock);
+	}
+
+	ret = 0;
+	goto out;
+
+out_unlock:
+	mutex_unlock(&encl->lock);
+out:
+	params->count = c;
+
+	return ret;
+}
+
+/**
+ * sgx_ioc_page_remove() - handler for %SGX_IOC_PAGE_REMOVE
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to a struct sgx_page_remove instance
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_ioc_page_remove(struct sgx_encl *encl, void __user *arg)
+{
+	struct sgx_page_remove params;
+	long ret;
+
+	/*
+	 * Ensure that there is a chance the request could succeed:
+	 * (1) SGX2 is required
+	 * (2) Pages can only be removed from an initialized enclave
+	 */
+	if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
+		return -ENODEV;
+
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return -EINVAL;
+
+	/*
+	 * Obtain parameters from user and perform sanity checks.
+	 */
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (!IS_ALIGNED(params.offset, PAGE_SIZE))
+		return -EINVAL;
+
+	if (!params.length || params.length & (PAGE_SIZE - 1))
+		return -EINVAL;
+
+	if (params.offset + params.length - PAGE_SIZE >= encl->size)
+		return -EINVAL;
+
+	if (params.count)
+		return -EINVAL;
+
+	ret = sgx_page_remove(encl, &params);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -1173,6 +1329,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	case SGX_IOC_PAGE_MODT:
 		ret = sgx_ioc_page_modt(encl, (void __user *)arg);
 		break;
+	case SGX_IOC_PAGE_REMOVE:
+		ret = sgx_ioc_page_remove(encl, (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;
-- 
2.25.1



* [PATCH 18/25] selftests/sgx: Introduce dynamic entry point
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (16 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 17/25] x86/sgx: Support complete page removal Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 19/25] selftests/sgx: Introduce TCS initialization enclave operation Reinette Chatre
                   ` (7 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The test enclave (test_encl.elf) is built with two initialized
Thread Control Structures (TCS) included in the binary. Both TCS are
initialized with the same entry point, encl_entry, that correctly
computes the absolute address of the stack based on the stack of each
TCS that is also built into the binary.

A new TCS can be added dynamically to the enclave and needs to be
initialized with an entry point used to enter the enclave. Since the
existing entry point, encl_entry, assumes that the TCS and its stack
exist at particular offsets within the binary it is not able to handle
a dynamically added TCS and its stack.

Introduce a new entry point, encl_dyn_entry, that initializes the absolute
address of that thread's stack to the address immediately preceding the
TCS itself. It is now possible to dynamically add a contiguous memory
region to the enclave with the new stack preceding the new TCS. With the
new TCS initialized with encl_dyn_entry as entry point the absolute address
of the stack is computed correctly on entry.
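
As a rough illustration (not part of this patch), the address computed
by encl_dyn_entry can be modeled in C. The helper names dyn_stack_top()
and in_page() are hypothetical, and the sketch assumes only what is
described above: %rbx holds the TCS address on entry and the stack page
directly precedes the TCS page.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/*
 * Hypothetical helper, not part of the selftest: models what
 * "lea -1(%rbx), %rax" computes in encl_dyn_entry, where %rbx holds
 * the address of the TCS used to enter the enclave.
 */
static uint64_t dyn_stack_top(uint64_t tcs_addr)
{
	/* A TCS occupies one page and is page aligned. */
	assert(tcs_addr % PAGE_SIZE == 0);

	/* Last byte of the page that directly precedes the TCS. */
	return tcs_addr - 1;
}

/* Returns 1 if @addr falls inside the page starting at @page. */
static int in_page(uint64_t addr, uint64_t page)
{
	return addr >= page && addr < page + PAGE_SIZE;
}
```

For a stack page at 0x10000 followed by a TCS page at 0x11000,
dyn_stack_top() yields 0x10fff, which lies within the stack page.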

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/test_encl_bootstrap.S | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/sgx/test_encl_bootstrap.S b/tools/testing/selftests/sgx/test_encl_bootstrap.S
index 82fb0dfcbd23..03ae0f57e29d 100644
--- a/tools/testing/selftests/sgx/test_encl_bootstrap.S
+++ b/tools/testing/selftests/sgx/test_encl_bootstrap.S
@@ -45,6 +45,12 @@ encl_entry:
 	# TCS #2. By adding the value of encl_stack to it, we get
 	# the absolute address for the stack.
 	lea	(encl_stack)(%rbx), %rax
+	jmp encl_entry_core
+encl_dyn_entry:
+	# Entry point for dynamically created TCS page expected to follow
+	# its stack directly.
+	lea -1(%rbx), %rax
+encl_entry_core:
 	xchg	%rsp, %rax
 	push	%rax
 
-- 
2.25.1



* [PATCH 19/25] selftests/sgx: Introduce TCS initialization enclave operation
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (17 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 18/25] selftests/sgx: Introduce dynamic entry point Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 20/25] selftests/sgx: Test complete changing of page type flow Reinette Chatre
                   ` (6 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The Thread Control Structure (TCS) contains meta-data used by the
hardware to save and restore thread specific information when
entering/exiting the enclave. A TCS can be added to an initialized
enclave by first adding a new regular enclave page, initializing the
content of the new page from within the enclave, and then changing that
page's type to a TCS.

Support the initialization of a TCS from within the enclave. The
information that must be provided from outside the enclave is the
address of the TCS, the address of the State Save Area (SSA), and the
entry point that the thread should use to enter the enclave. With this
information provided, all needed fields of a TCS can be initialized.
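
For reference, the TCS fields that are filled in can be sketched as a C
structure. This is a userspace sketch only: struct tcs_layout and
tcs_layout_init() are hypothetical names, while the field offsets and
the values used for NSSA (1) and FSLIMIT/GSLIMIT (0xFFFFFFFF) are the
ones written by the enclave operation in this patch. Unlike the enclave
code, the sketch relies on the C library for memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure mirroring the offsets used in this patch. */
struct tcs_layout {
	uint64_t state;		/* offset  0 */
	uint64_t flags;		/* offset  8 */
	uint64_t ossa;		/* offset 16 */
	uint32_t cssa;		/* offset 24 */
	uint32_t nssa;		/* offset 28 */
	uint64_t oentry;	/* offset 32 */
	uint64_t aep;		/* offset 40 */
	uint64_t ofsbase;	/* offset 48 */
	uint64_t ogsbase;	/* offset 56 */
	uint32_t fslimit;	/* offset 64 */
	uint32_t gslimit;	/* offset 68 */
	uint8_t reserved[4024];	/* offset 72, up to the end of the page */
};

static_assert(sizeof(struct tcs_layout) == 4096, "TCS occupies one page");
static_assert(offsetof(struct tcs_layout, ossa) == 16, "OSSA offset");
static_assert(offsetof(struct tcs_layout, oentry) == 32, "OENTRY offset");
static_assert(offsetof(struct tcs_layout, fslimit) == 64, "FSLIMIT offset");

/* Zero the whole page, then set the few fields that are not zero. */
static void tcs_layout_init(struct tcs_layout *tcs, uint64_t ssa,
			    uint64_t entry)
{
	memset(tcs, 0, sizeof(*tcs));
	tcs->ossa = ssa;
	tcs->nssa = 1;
	tcs->oentry = entry;
	tcs->fslimit = 0xFFFFFFFF;
	tcs->gslimit = 0xFFFFFFFF;
}
```

The static assertions document that the field-by-field writes in
do_encl_init_tcs_page() cover exactly one 4096 byte page.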

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/defines.h   |  8 +++++++
 tools/testing/selftests/sgx/test_encl.c | 30 +++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
index b638eb98c80c..d8587c971941 100644
--- a/tools/testing/selftests/sgx/defines.h
+++ b/tools/testing/selftests/sgx/defines.h
@@ -26,6 +26,7 @@ enum encl_op_type {
 	ENCL_OP_NOP,
 	ENCL_OP_EACCEPT,
 	ENCL_OP_EMODPE,
+	ENCL_OP_INIT_TCS_PAGE,
 	ENCL_OP_MAX,
 };
 
@@ -68,4 +69,11 @@ struct encl_op_emodpe {
 	uint64_t flags;
 };
 
+struct encl_op_init_tcs_page {
+	struct encl_op_header header;
+	uint64_t tcs_page;
+	uint64_t ssa;
+	uint64_t entry;
+};
+
 #endif /* DEFINES_H */
diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c
index 5b6c65331527..c0d6397295e3 100644
--- a/tools/testing/selftests/sgx/test_encl.c
+++ b/tools/testing/selftests/sgx/test_encl.c
@@ -57,6 +57,35 @@ static void *memcpy(void *dest, const void *src, size_t n)
 	return dest;
 }
 
+static void *memset(void *dest, int c, size_t n)
+{
+	size_t i;
+
+	for (i = 0; i < n; i++)
+		((char *)dest)[i] = c;
+
+	return dest;
+}
+
+static void do_encl_init_tcs_page(void *_op)
+{
+	struct encl_op_init_tcs_page *op = _op;
+	void *tcs = (void *)op->tcs_page;
+	uint32_t val_32;
+
+	memset(tcs, 0, 16);			/* STATE and FLAGS */
+	memcpy(tcs + 16, &op->ssa, 8);		/* OSSA */
+	memset(tcs + 24, 0, 4);			/* CSSA */
+	val_32 = 1;
+	memcpy(tcs + 28, &val_32, 4);		/* NSSA */
+	memcpy(tcs + 32, &op->entry, 8);	/* OENTRY */
+	memset(tcs + 40, 0, 24);		/* AEP, OFSBASE, OGSBASE */
+	val_32 = 0xFFFFFFFF;
+	memcpy(tcs + 64, &val_32, 4);		/* FSLIMIT */
+	memcpy(tcs + 68, &val_32, 4);		/* GSLIMIT */
+	memset(tcs + 72, 0, 4024);		/* Reserved */
+}
+
 static void do_encl_op_put_to_buf(void *op)
 {
 	struct encl_op_put_to_buf *op2 = op;
@@ -100,6 +129,7 @@ void encl_body(void *rdi,  void *rsi)
 		do_encl_op_nop,
 		do_encl_eaccept,
 		do_encl_emodpe,
+		do_encl_init_tcs_page,
 	};
 
 	struct encl_op_header *op = (struct encl_op_header *)rdi;
-- 
2.25.1



* [PATCH 20/25] selftests/sgx: Test complete changing of page type flow
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (18 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 19/25] selftests/sgx: Introduce TCS initialization enclave operation Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 21/25] selftests/sgx: Test faulty enclave behavior Reinette Chatre
                   ` (5 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Support for changing an enclave page's type enables an initialized
enclave to be expanded with support for more threads by changing the
type of a regular enclave page to that of a Thread Control Structure
(TCS).  Additionally, being able to change a TCS or regular enclave
page's type to be trimmed (SGX_PAGE_TYPE_TRIM) initiates the removal
of the page from the enclave.

Test changing page type to TCS as well as page removal flows
in two phases: In the first phase support for a new thread is
dynamically added to an initialized enclave and in the second phase
the pages associated with the new thread are removed from the enclave.
As an additional sanity check after the second phase, the page used as
a TCS page during the first phase is added back as a regular page and
the test ensures that it can be written to (which would not be possible
if it were still a TCS page).
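
The placement of the three new pages used by the test can be sketched
as below. This is only an illustration: struct dyn_thread_pages and
dyn_thread_place() are hypothetical names. The page order (stack, TCS,
SSA) and the check that the pages fit within the enclave size follow
the test added in this patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL

/* Hypothetical description of where the new thread's pages land. */
struct dyn_thread_pages {
	uint64_t stack;		/* regular page, lowest address */
	uint64_t tcs;		/* regular page, later changed to TCS */
	uint64_t ssa;		/* regular page, highest address */
	uint64_t tcs_offset;	/* TCS offset from the enclave base */
	uint64_t ssa_offset;	/* SSA offset from the enclave base */
};

/*
 * Place three pages directly after the loaded segments. Returns 0 on
 * success and -1 if the pages do not fit below the enclave size.
 */
static int dyn_thread_place(uint64_t encl_base, uint64_t encl_size,
			    uint64_t total_size, struct dyn_thread_pages *out)
{
	assert(out);

	if (total_size + 3 * PAGE_SIZE >= encl_size)
		return -1;

	out->stack = encl_base + total_size;
	out->tcs = encl_base + total_size + PAGE_SIZE;
	out->ssa = encl_base + total_size + 2 * PAGE_SIZE;
	out->tcs_offset = total_size + PAGE_SIZE;
	out->ssa_offset = total_size + 2 * PAGE_SIZE;

	return 0;
}
```

With the stack page directly below the TCS page, this layout matches
what the encl_dyn_entry entry point expects.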

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/load.c |  41 ++++
 tools/testing/selftests/sgx/main.c | 343 +++++++++++++++++++++++++++++
 tools/testing/selftests/sgx/main.h |   1 +
 3 files changed, 385 insertions(+)

diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
index 9d4322c946e2..41b9d2031799 100644
--- a/tools/testing/selftests/sgx/load.c
+++ b/tools/testing/selftests/sgx/load.c
@@ -129,6 +129,47 @@ static bool encl_ioc_add_pages(struct encl *encl, struct encl_segment *seg)
 	return true;
 }
 
+/*
+ * Parse the enclave code's symbol table to locate and return address of
+ * the provided symbol
+ */
+uint64_t encl_get_entry(struct encl *encl, const char *symbol)
+{
+	Elf64_Shdr *sections;
+	Elf64_Sym *symtab = NULL;
+	Elf64_Ehdr *ehdr;
+	char *sym_names = NULL;
+	int num_sym = 0;
+	int i;
+
+	ehdr = encl->bin;
+	sections = encl->bin + ehdr->e_shoff;
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (sections[i].sh_type == SHT_SYMTAB) {
+			symtab = (Elf64_Sym *)((char *)encl->bin + sections[i].sh_offset);
+			num_sym = sections[i].sh_size / sections[i].sh_entsize;
+			break;
+		}
+	}
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (sections[i].sh_type == SHT_STRTAB) {
+			sym_names = (char *)encl->bin + sections[i].sh_offset;
+			break;
+		}
+	}
+
+	for (i = 0; i < num_sym; i++) {
+		Elf64_Sym *sym = &symtab[i];
+
+		if (!strcmp(symbol, sym_names + sym->st_name))
+			return (uint64_t)sym->st_value;
+	}
+
+	return 0;
+}
+
 bool encl_load(const char *path, struct encl *encl, unsigned long heap_size)
 {
 	const char device_path[] = "/dev/sgx_enclave";
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index bc8c7d06d74c..d73ea2a02d4b 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1149,4 +1149,347 @@ TEST_F(enclave, augment_via_eaccept)
 	munmap(addr, PAGE_SIZE);
 }
 
+/*
+ * SGX2 page type modification test in two phases:
+ * Phase 1:
+ * Create a new TCS, consisting of three new pages (stack page with regular
+ * page type, SSA page with regular page type, and TCS page with TCS page
+ * type) in an initialized enclave and run a simple workload within it.
+ * Phase 2:
+ * Remove the three pages added in phase 1, add a new regular page at the
+ * same address that previously hosted the TCS page and verify that it can
+ * be modified.
+ */
+TEST_F(enclave, tcs_create)
+{
+	struct encl_op_init_tcs_page init_tcs_page_op;
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_get_from_buf get_buf_op;
+	struct encl_op_put_to_buf put_buf_op;
+	void *addr, *tcs, *stack_end, *ssa;
+	struct encl_op_eaccept eaccept_op;
+	struct sgx_page_remove remove_ioc;
+	struct sgx_page_modt modt_ioc;
+	size_t total_size = 0;
+	uint64_t val_64;
+	int errno_save;
+	int ret, i;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl,
+				    _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and OS support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Add three regular pages via EAUG: one will be the TCS stack, one
+	 * will be the TCS SSA, and one will be the new TCS. The stack and
+	 * SSA will remain as regular pages, the TCS page will need its
+	 * type changed after populated with needed data.
+	 */
+	for (i = 0; i < self->encl.nr_segments; i++) {
+		struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+		total_size += seg->size;
+	}
+
+	/*
+	 * Actual enclave size is expected to be larger than the loaded
+	 * test enclave since enclave size must be a power of 2 in bytes while
+	 * test_encl does not consume it all.
+	 */
+	EXPECT_LT(total_size + 3 * PAGE_SIZE, self->encl.encl_size);
+
+	/*
+	 * mmap() three pages at end of existing enclave to be used for the
+	 * three new pages.
+	 */
+	addr = mmap((void *)self->encl.encl_base + total_size, 3 * PAGE_SIZE,
+		    PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
+		    self->encl.fd, 0);
+	EXPECT_NE(addr, MAP_FAILED);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	stack_end = (void *)self->encl.encl_base + total_size;
+	tcs = (void *)self->encl.encl_base + total_size + PAGE_SIZE;
+	ssa = (void *)self->encl.encl_base + total_size + 2 * PAGE_SIZE;
+
+	/*
+	 * Run EACCEPT on each new page to trigger the
+	 * EACCEPT->(#PF)->EAUG->EACCEPT(again without a #PF) flow.
+	 */
+
+	eaccept_op.epc_addr = (unsigned long)stack_end;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	if (self->run.exception_vector == 14 &&
+	    self->run.exception_error_code == 4 &&
+	    self->run.exception_addr == (unsigned long)stack_end) {
+		munmap(addr, 3 * PAGE_SIZE);
+		SKIP(return, "Kernel does not support adding pages to initialized enclave");
+	}
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)ssa;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)tcs;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * Three new pages added to enclave. Now populate the TCS page with
+	 * needed data. This should be done from within enclave. Provide
+	 * the function that will do the actual data population with needed
+	 * data.
+	 */
+
+	/*
+	 * New TCS will use the "encl_dyn_entry" entrypoint that expects
+	 * stack to begin in page before TCS page.
+	 */
+	val_64 = encl_get_entry(&self->encl, "encl_dyn_entry");
+	EXPECT_NE(val_64, 0);
+
+	init_tcs_page_op.tcs_page = (unsigned long)tcs;
+	init_tcs_page_op.ssa = (unsigned long)total_size + 2 * PAGE_SIZE;
+	init_tcs_page_op.entry = val_64;
+	init_tcs_page_op.header.type = ENCL_OP_INIT_TCS_PAGE;
+
+	EXPECT_EQ(ENCL_CALL(&init_tcs_page_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Change TCS page type to TCS. */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+
+	modt_ioc.offset = total_size + PAGE_SIZE;
+	modt_ioc.length = PAGE_SIZE;
+	modt_ioc.type = SGX_PAGE_TYPE_TCS;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 4096);
+
+	/* EACCEPT new TCS page from enclave. */
+	eaccept_op.epc_addr = (unsigned long)tcs;
+	eaccept_op.flags = SGX_SECINFO_TCS | SGX_SECINFO_MODIFIED;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Run workload from new TCS. */
+	self->run.tcs = (unsigned long)tcs;
+
+	/*
+	 * Simple workload to write to data buffer and read value back.
+	 */
+	put_buf_op.header.type = ENCL_OP_PUT_TO_BUFFER;
+	put_buf_op.value = MAGIC;
+
+	EXPECT_EQ(ENCL_CALL(&put_buf_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	get_buf_op.header.type = ENCL_OP_GET_FROM_BUFFER;
+	get_buf_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_buf_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_buf_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Phase 2 of test:
+	 * Remove pages associated with new TCS, create a regular page
+	 * where TCS page used to be and verify it can be used as a regular
+	 * page.
+	 */
+
+	/* Start page removal by requesting change of page type to PT_TRIM. */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+
+	modt_ioc.offset = total_size;
+	modt_ioc.length = 3 * PAGE_SIZE;
+	modt_ioc.type = SGX_PAGE_TYPE_TRIM;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 3 * PAGE_SIZE);
+
+	/*
+	 * Enter enclave via TCS #1 and approve page removal by sending
+	 * EACCEPT for each of three removed pages.
+	 */
+	self->run.tcs = self->encl.encl_base;
+
+	eaccept_op.epc_addr = (unsigned long)stack_end;
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)tcs;
+	eaccept_op.ret = 0;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)ssa;
+	eaccept_op.ret = 0;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Send final ioctl to complete page removal. */
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = total_size;
+	remove_ioc.length = 3 * PAGE_SIZE;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(remove_ioc.count, 3 * PAGE_SIZE);
+
+	/*
+	 * Enter enclave via TCS #1 and access location where TCS #3 was to
+	 * trigger dynamic add of regular page at that location.
+	 */
+	eaccept_op.epc_addr = (unsigned long)tcs;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * New page should be accessible from within enclave - write to it.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = (unsigned long)tcs;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present. Only change two test
+	 * parameters, rest are same as previous test.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = (unsigned long)tcs;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	munmap(addr, 3 * PAGE_SIZE);
+}
+
 TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/sgx/main.h b/tools/testing/selftests/sgx/main.h
index b45c52ec7ab3..fc585be97e2f 100644
--- a/tools/testing/selftests/sgx/main.h
+++ b/tools/testing/selftests/sgx/main.h
@@ -38,6 +38,7 @@ void encl_delete(struct encl *ctx);
 bool encl_load(const char *path, struct encl *encl, unsigned long heap_size);
 bool encl_measure(struct encl *encl);
 bool encl_build(struct encl *encl);
+uint64_t encl_get_entry(struct encl *encl, const char *symbol);
 
 int sgx_enter_enclave(void *rdi, void *rsi, long rdx, u32 function, void *r8, void *r9,
 		      struct sgx_enclave_run *run);
-- 
2.25.1



* [PATCH 21/25] selftests/sgx: Test faulty enclave behavior
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (19 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 20/25] selftests/sgx: Test complete changing of page type flow Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 22/25] selftests/sgx: Test invalid access to removed enclave page Reinette Chatre
                   ` (4 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Removing a page from an initialized enclave involves three steps:
first the user requests changing the page type to SGX_PAGE_TYPE_TRIM
via an ioctl; on success the ENCLU[EACCEPT] instruction needs to be run
from within the enclave to accept the page removal; finally the
user requests page removal to be completed via an ioctl. Only after
acceptance (ENCLU[EACCEPT]) from within the enclave can the kernel
remove the page from a running enclave.

Test the behavior when the user's request to change the page type
succeeds, but the ENCLU[EACCEPT] instruction is not run before the
ioctl requesting page removal is run. This should not be permitted.
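
The ordering requirement can be illustrated with a toy state model.
This is not how the kernel tracks pages; the state names and model_*()
helpers are hypothetical. Only the -EPERM result for a removal request
without a prior EACCEPT reflects what the test expects; the -EINVAL
results for other out-of-order calls are an assumption of the model.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical states of one enclave page during removal. */
enum model_page_state {
	MODEL_PAGE_REG,			/* regular, usable page */
	MODEL_PAGE_TRIM_PENDING,	/* type changed, not yet accepted */
	MODEL_PAGE_TRIM_ACCEPTED,	/* EACCEPT run from within enclave */
	MODEL_PAGE_REMOVED,		/* removal completed */
};

struct model_page {
	enum model_page_state state;
};

/* Step 1: user requests change of page type to trimmed. */
static int model_modt_trim(struct model_page *page)
{
	assert(page);
	if (page->state != MODEL_PAGE_REG)
		return -EINVAL;
	page->state = MODEL_PAGE_TRIM_PENDING;
	return 0;
}

/* Step 2: enclave accepts the change. */
static int model_eaccept(struct model_page *page)
{
	assert(page);
	if (page->state != MODEL_PAGE_TRIM_PENDING)
		return -EINVAL;
	page->state = MODEL_PAGE_TRIM_ACCEPTED;
	return 0;
}

/* Step 3: user requests completion of the removal. */
static int model_remove(struct model_page *page)
{
	assert(page);
	if (page->state == MODEL_PAGE_TRIM_PENDING)
		return -EPERM;	/* EACCEPT was skipped */
	if (page->state != MODEL_PAGE_TRIM_ACCEPTED)
		return -EINVAL;
	page->state = MODEL_PAGE_REMOVED;
	return 0;
}
```

In this model, calling model_remove() directly after model_modt_trim()
fails with -EPERM and leaves the page in the pending state, which is
the scenario exercised by the new test.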

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/main.c | 114 +++++++++++++++++++++++++++++
 1 file changed, 114 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index d73ea2a02d4b..f71f943099fb 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1492,4 +1492,118 @@ TEST_F(enclave, tcs_create)
 	munmap(addr, 3 * PAGE_SIZE);
 }
 
+/*
+ * Ensure sane behavior if user requests page removal, does not run
+ * EACCEPT from within enclave but still attempts to finalize page removal
+ * with the SGX_IOC_PAGE_REMOVE ioctl. The latter should fail because the
+ * removal was not EACCEPTed from within the enclave.
+ */
+TEST_F(enclave, remove_added_page_no_eaccept)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct sgx_page_remove remove_ioc;
+	struct sgx_page_modt modt_ioc;
+	unsigned long data_start;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and OS support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Page that will be removed is the second data page in the .data
+	 * segment. This forms part of the local encl_buffer within the
+	 * enclave.
+	 */
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before
+	 * removing it.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present. Only change two test
+	 * parameters, rest are same as previous test.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Start page removal by requesting change of page type to PT_TRIM */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+
+	modt_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	modt_ioc.length = PAGE_SIZE;
+	modt_ioc.type = SGX_PAGE_TYPE_TRIM;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 4096);
+
+	/* Skip EACCEPT */
+
+	/* Send final ioctl to complete page removal */
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	remove_ioc.length = PAGE_SIZE;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	/* Operation not permitted since EACCEPT was omitted. */
+	EXPECT_EQ(ret, -1);
+	EXPECT_EQ(errno_save, EPERM);
+	EXPECT_EQ(remove_ioc.count, 0);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1



* [PATCH 22/25] selftests/sgx: Test invalid access to removed enclave page
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (20 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 21/25] selftests/sgx: Test faulty enclave behavior Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 23/25] selftests/sgx: Test reclaiming of untouched page Reinette Chatre
                   ` (3 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Removing a page from an initialized enclave involves three steps:
(1) the user requests changing the page type to SGX_PAGE_TYPE_TRIM
via the SGX_IOC_PAGE_MODT ioctl, (2) on success the ENCLU[EACCEPT]
instruction is run from within the enclave to accept the page removal,
(3) the user initiates the actual removal of the page via the
SGX_IOC_PAGE_REMOVE ioctl.

Test two possible invalid accesses during the page removal flow:
* Test the behavior when a request to remove the page by changing its
  type to SGX_PAGE_TYPE_TRIM completes successfully but instead of
  executing ENCLU[EACCEPT] from within the enclave the enclave attempts
  to read from the page. Even though the page is accessible from the
  page table entries its type is SGX_PAGE_TYPE_TRIM and thus not
  accessible according to SGX. The expected behavior is a page fault
  with the SGX flag set in the error code.
* Test the behavior when the page type is changed successfully and
  ENCLU[EACCEPT] was run from within the enclave. The final ioctl,
  SGX_IOC_PAGE_REMOVE, is omitted and replaced with an attempt to access
  the page. Even though the page is accessible from the page table
  entries its type is SGX_PAGE_TYPE_TRIM and thus not accessible
  according to SGX.  The expected behavior is a page fault with the
  SGX flag set in the error code.
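
The error code the tests check for, 0x8005, can be decomposed as shown
in the sketch below. The PF_* macro names and the helper are
hypothetical and used for illustration only; the bit positions are
those of the x86 page fault error code, with bit 15 being the SGX flag.

```c
#include <assert.h>
#include <stdint.h>

/* x86 page fault error code bits relevant here (names hypothetical). */
#define PF_PROT		(1u << 0)	/* fault on a present page */
#define PF_WRITE	(1u << 1)	/* access was a write */
#define PF_USER		(1u << 2)	/* access came from user mode */
#define PF_SGX		(1u << 15)	/* SGX access control violation */

static_assert((PF_SGX | PF_USER | PF_PROT) == 0x8005,
	      "error code expected by the tests");

/*
 * Returns 1 if @error_code describes a user mode read of a page that is
 * present in the page tables but rejected by SGX access control.
 */
static int pf_is_sgx_read_of_present_page(uint32_t error_code)
{
	return (error_code & PF_SGX) && (error_code & PF_PROT) &&
	       (error_code & PF_USER) && !(error_code & PF_WRITE);
}
```

The present bit being set matches the statement above that the page is
still accessible from the page table entries, while the SGX bit shows
that the access was denied by SGX itself.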

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/main.c | 243 +++++++++++++++++++++++++++++
 1 file changed, 243 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index f71f943099fb..3bee8fa557fd 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1606,4 +1606,247 @@ TEST_F(enclave, remove_added_page_no_eaccept)
 	EXPECT_EQ(remove_ioc.count, 0);
 }
 
+/*
+ * Request enclave page removal but instead of correctly following with
+ * EACCEPT a read attempt to page is made from within the enclave.
+ */
+TEST_F(enclave, remove_added_page_invalid_access)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	unsigned long data_start;
+	struct sgx_page_modt ioc;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and OS support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&ioc, 0, sizeof(ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Page that will be removed is the second data page in the .data
+	 * segment. This forms part of the local encl_buffer within the
+	 * enclave.
+	 */
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before
+	 * removing it.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present. Only change two test
+	 * parameters, rest are same as previous test.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Start page removal by requesting change of page type to PT_TRIM */
+	memset(&ioc, 0, sizeof(ioc));
+
+	ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	ioc.length = PAGE_SIZE;
+	ioc.type = SGX_PAGE_TYPE_TRIM;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 4096);
+
+	/*
+	 * Read from page whose type was just changed to PT_TRIM.
+	 */
+	get_addr_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	/*
+	 * From OS perspective the page is present but according to SGX the
+	 * page should not be accessible so a #PF with SGX bit set is
+	 * expected.
+	 */
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_error_code, 0x8005);
+	EXPECT_EQ(self->run.exception_addr, data_start);
+}
+
+/*
+ * Request enclave page removal and correctly follow with
+ * EACCEPT but do not follow with removal ioctl but instead a read attempt
+ * to removed page is made from within the enclave.
+ */
+TEST_F(enclave, remove_added_page_invalid_access_after_eaccept)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_eaccept eaccept_op;
+	unsigned long data_start;
+	struct sgx_page_modt ioc;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and OS support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&ioc, 0, sizeof(ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support test SGX_IOC_PAGE_MODT ioctl");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Page that will be removed is the second data page in the .data
+	 * segment. This forms part of the local encl_buffer within the
+	 * enclave.
+	 */
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before
+	 * removing it.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present. Only change two test
+	 * parameters, rest are same as previous test.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Start page removal by requesting change of page type to PT_TRIM */
+	memset(&ioc, 0, sizeof(ioc));
+
+	ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	ioc.length = PAGE_SIZE;
+	ioc.type = SGX_PAGE_TYPE_TRIM;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 4096);
+
+	eaccept_op.epc_addr = (unsigned long)data_start;
+	eaccept_op.ret = 0;
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Skip ioctl to remove page */
+
+	/*
+	 * Read from page that was just removed.
+	 */
+	get_addr_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	/*
+	 * From OS perspective the page is present but according to SGX the
+	 * page should not be accessible so a #PF with SGX bit set is
+	 * expected.
+	 */
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_error_code, 0x8005);
+	EXPECT_EQ(self->run.exception_addr, data_start);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 155+ messages in thread

* [PATCH 23/25] selftests/sgx: Test reclaiming of untouched page
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (21 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 22/25] selftests/sgx: Test invalid access to removed enclave page Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-01 19:23 ` [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
                   ` (2 subsequent siblings)
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Removing a page from an initialized enclave involves three steps: (1) the
user requests changing the page type to PT_TRIM via the SGX_IOC_PAGE_MODT
ioctl, (2) on success the ENCLU[EACCEPT] instruction is run from within
the enclave to accept the page removal, (3) the user initiates the actual
removal of the page via the SGX_IOC_PAGE_REMOVE ioctl.

Remove a page that has never been accessed. This means that when the first
ioctl requesting page removal arrives, there will be no page table entry,
yet a valid page table entry needs to exist for the ENCLU[EACCEPT] function
to succeed. In this test it is verified that a page table entry can still
be installed for a page that is in the process of being removed.

Suggested-by: Haitao Huang <haitao.huang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/main.c | 58 ++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 3bee8fa557fd..618f5ff0601b 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1849,4 +1849,62 @@ TEST_F(enclave, remove_added_page_invalid_access_after_eaccept)
 	EXPECT_EQ(self->run.exception_addr, data_start);
 }
 
+TEST_F(enclave, remove_untouched_page)
+{
+	struct encl_op_eaccept eaccept_op;
+	struct sgx_page_remove remove_ioc;
+	struct sgx_page_modt modt_ioc;
+	unsigned long data_start;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	data_start = self->encl.encl_base +
+			 encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+
+	modt_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	modt_ioc.length = PAGE_SIZE;
+	modt_ioc.type = SGX_PAGE_TYPE_TRIM;
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 4096);
+
+	/*
+	 * Enter enclave via TCS #1 and approve page removal by sending
+	 * EACCEPT for removed page.
+	 */
+
+	eaccept_op.epc_addr = data_start;
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	remove_ioc.length = PAGE_SIZE;
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(remove_ioc.count, 4096);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 155+ messages in thread

* [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (22 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 23/25] selftests/sgx: Test reclaiming of untouched page Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-04 23:47   ` Jarkko Sakkinen
  2021-12-01 19:23 ` [PATCH 25/25] selftests/sgx: Page removal stress test Reinette Chatre
  2021-12-02 18:30 ` [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Dave Hansen
  25 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The page reclaimer ensures availability of EPC pages across all
enclaves. In support of this it runs independently from the individual
enclaves in order to take locks from the different enclaves as it writes
pages to swap.

When needing to load a page from swap an EPC page needs to be available for
its contents to be loaded into. Loading an existing enclave page from swap
does not reclaim EPC pages directly if none are available; instead the
reclaimer is woken when the available EPC pages are found to be below a
watermark.

When iterating over a large number of pages in an oversubscribed
environment there is a race between the reclaimer being woken up and EPC
pages being reclaimed fast enough for the page operations to proceed.

Instead of tuning the race between the page operations and the reclaimer,
make the page operations themselves ensure that EPC pages are available.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
 arch/x86/kernel/cpu/sgx/main.c  | 6 ++++++
 arch/x86/kernel/cpu/sgx/sgx.h   | 1 +
 3 files changed, 13 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index d11da6c53b26..fc2737b3c7cc 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -739,6 +739,8 @@ static long sgx_page_modp(struct sgx_encl *encl, struct sgx_page_modp *modp)
 	for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
 		addr = encl->base + modp->offset + c;
 
+		sgx_direct_reclaim();
+
 		mutex_lock(&encl->lock);
 
 		entry = sgx_encl_load_page(encl, addr);
@@ -962,6 +964,8 @@ static long sgx_page_modt(struct sgx_encl *encl, struct sgx_page_modt *modt)
 	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
 		addr = encl->base + modt->offset + c;
 
+		sgx_direct_reclaim();
+
 		mutex_lock(&encl->lock);
 
 		entry = sgx_encl_load_page(encl, addr);
@@ -1187,6 +1191,8 @@ static long sgx_page_remove(struct sgx_encl *encl,
 	for (c = 0 ; c < params->length; c += PAGE_SIZE) {
 		addr = encl->base + params->offset + c;
 
+		sgx_direct_reclaim();
+
 		mutex_lock(&encl->lock);
 
 		entry = sgx_encl_load_page(encl, addr);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 887648ce6084..bc3fc57f5f08 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -376,6 +376,12 @@ static bool sgx_should_reclaim(unsigned long watermark)
 	       !list_empty(&sgx_active_page_list);
 }
 
+void sgx_direct_reclaim(void)
+{
+	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
+		sgx_reclaim_pages();
+}
+
 static int ksgxd(void *p)
 {
 	set_freezable();
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index ca89d625aa74..02af24acaacc 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -85,6 +85,7 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page)
 struct sgx_epc_page *__sgx_alloc_epc_page(void);
 void sgx_free_epc_page(struct sgx_epc_page *page);
 
+void sgx_direct_reclaim(void);
 void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 155+ messages in thread

* [PATCH 25/25] selftests/sgx: Page removal stress test
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (23 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
@ 2021-12-01 19:23 ` Reinette Chatre
  2021-12-02 18:30 ` [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Dave Hansen
  25 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-01 19:23 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Create enclave with additional heap that consumes all physical SGX memory
and then remove it.

Depending on the available SGX memory this test could take a significant
time to run (several minutes) as it (1) creates the enclave, (2)
changes the type of every page to be trimmed, (3) enters the enclave
once per page to run EACCEPT, before (4) the pages are finally removed.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 tools/testing/selftests/sgx/main.c | 98 ++++++++++++++++++++++++++++++
 1 file changed, 98 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 618f5ff0601b..c1495eaafe79 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -393,7 +393,105 @@ TEST_F(enclave, unclobbered_vdso_oversubscribed)
 	EXPECT_EQ(get_op.value, MAGIC);
 	EXPECT_EEXIT(&self->run);
 	EXPECT_EQ(self->run.user_data, 0);
+}
+
+TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900)
+{
+	struct encl_op_get_from_buf get_op;
+	struct encl_op_put_to_buf put_op;
+	struct sgx_page_modt modt_ioc;
+	struct encl_segment *heap;
+	unsigned long total_mem;
+	int ret, errno_save;
+	unsigned long addr;
+	struct encl_op_eaccept eaccept_op;
+	struct sgx_page_remove remove_ioc;
+	unsigned long i;
+
+	/*
+	 * Create enclave with additional heap that is as big as all
+	 * available physical SGX memory.
+	 */
+	total_mem = get_total_epc_mem();
+	ASSERT_NE(total_mem, 0);
+	TH_LOG("Creating an enclave with %lu bytes heap may take a while ...",
+	       total_mem);
+	ASSERT_TRUE(setup_test_encl(total_mem, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	heap = &self->encl.segment_tbl[self->encl.nr_segments - 1];
+
+	put_op.header.type = ENCL_OP_PUT_TO_BUFFER;
+	put_op.value = MAGIC;
+
+	EXPECT_EQ(ENCL_CALL(&put_op, &self->run, false), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.user_data, 0);
+
+	get_op.header.type = ENCL_OP_GET_FROM_BUFFER;
+	get_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_op, &self->run, false), 0);
+
+	EXPECT_EQ(get_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.user_data, 0);
+
+	/* Trim  entire heap */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+
+	modt_ioc.offset = heap->offset;
+	modt_ioc.length = heap->size;
+	modt_ioc.type = SGX_PAGE_TYPE_TRIM;
+
+	TH_LOG("Changing type of %zd bytes to trimmed may take a while ...",
+	       heap->size);
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_MODT, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, heap->size);
 
+	/* EACCEPT all removed pages */
+	addr = self->encl.encl_base + heap->offset;
+
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	TH_LOG("Entering enclave to run EACCEPT for each page of %zd bytes may take a while ...",
+	       heap->size);
+	for (i = 0; i < heap->size; i += 4096) {
+		eaccept_op.epc_addr = addr + i;
+		eaccept_op.ret = 0;
+
+		EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+		EXPECT_EEXIT(&self->run);
+		EXPECT_EQ(self->run.exception_vector, 0);
+		EXPECT_EQ(self->run.exception_error_code, 0);
+		EXPECT_EQ(self->run.exception_addr, 0);
+		EXPECT_EQ(eaccept_op.ret, 0);
+	}
+
+	/* Complete page removal */
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = heap->offset;
+	remove_ioc.length = heap->size;
+
+	TH_LOG("Removing %zd bytes from enclave may take a while ...",
+	       heap->size);
+	ret = ioctl(self->encl.fd, SGX_IOC_PAGE_REMOVE, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(remove_ioc.count, heap->size);
 }
 
 TEST_F(enclave, clobbered_vdso)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 155+ messages in thread

* Re: [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2
  2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (24 preceding siblings ...)
  2021-12-01 19:23 ` [PATCH 25/25] selftests/sgx: Page removal stress test Reinette Chatre
@ 2021-12-02 18:30 ` Dave Hansen
  2021-12-02 20:38   ` Nathaniel McCallum
  25 siblings, 1 reply; 155+ messages in thread
From: Dave Hansen @ 2021-12-02 18:30 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, luto, mingo,
	linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/1/21 11:22 AM, Reinette Chatre wrote:
> * Support modifying permissions of regular enclave pages belonging to an
>   initialized enclave. New permissions are not allowed to exceed the
>   originally vetted permissions. Modifying permissions is accomplished
>   with a new ioctl SGX_IOC_PAGE_MODP.

It's probably also worth noting that this effectively punts on the issue
of how to allow enclaves to relax the permissions on pages, like taking
a page from R=>RW, or R=>RX.  RX isn't allowed unless the page was
*added* originally with RX or RWX.

Since dynamically added pages start with initial RW permissions, they
can *never* be RX or RWX since they did not start with execute
permissions.  That's a limitation, of course, but it's one that can be
dealt with separately from this set.

Does that sound sane to everyone?

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2
  2021-12-02 18:30 ` [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Dave Hansen
@ 2021-12-02 20:38   ` Nathaniel McCallum
  0 siblings, 0 replies; 155+ messages in thread
From: Nathaniel McCallum @ 2021-12-02 20:38 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Reinette Chatre, dave.hansen, jarkko, tglx, bp, luto, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Thu, Dec 2, 2021 at 1:30 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 12/1/21 11:22 AM, Reinette Chatre wrote:
> > * Support modifying permissions of regular enclave pages belonging to an
> >   initialized enclave. New permissions are not allowed to exceed the
> >   originally vetted permissions. Modifying permissions is accomplished
> >   with a new ioctl SGX_IOC_PAGE_MODP.
>
> It's probably also worth noting that this effectively punts on the issue
> of how to allow enclaves to relax the permissions on pages, like taking
> a page from R=>RW, or R=>RX.  RX isn't allowed unless the page was
> *added* originally with RX or RWX.
>
> Since dynamically added pages start with initial RW permissions, they
> can *never* be RX or RWX since they did not start with execute
> permissions.  That's a limitation, of course, but it's one that can be
> dealt with separately from this set.
>
> Does that sound sane to everyone?

We (Enarx) need arbitrary permission modifications. But for now we can
just use this patch series and patch the original permissions to be
RWX on all new pages. I think that should be sufficient.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
@ 2021-12-02 23:48   ` Dave Hansen
  2021-12-03 18:18     ` Reinette Chatre
  2021-12-03  0:32   ` Dave Hansen
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 155+ messages in thread
From: Dave Hansen @ 2021-12-02 23:48 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, luto, mingo,
	linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/1/21 11:23 AM, Reinette Chatre wrote:
> + * EPCM permissions can be extended anytime directly from the enclave with
> + * no visibility from the OS. This is accomplished with ENCLU[EMODPE]
> + * run from within enclave. Accessing pages with the new, extended,
> + * permissions requires the OS to update the PTE to handle the subsequent
> + * #PF correctly.

Hi Reinette,

I really dislike the Intel nomenclature here.  I know the Intel docs are
all written around permission "extension", but I find it ambiguous.

I've been looking at these instructions literally for years now and
permission extension to me can mean either:
 1. The set of things you can do is extended
 2. The set of things you can *NOT* do is extended

I much rather prefer nomenclature like:

	EPCM permissions can be relaxed anytime directly from the
	enclave with no visibility from the OS. This is accomplished
	with ENCLU[EMODPE] run from within enclave. Accessing pages with
	the new, relaxed permissions requires the OS to update the PTE
	to handle the subsequent #PF correctly.

"Relax" is less ambiguous.  Relaxing a restriction and relaxing
permissions both mean doing things less strictly.  Extending
restrictions and extending what is allowed are opposites.

Maybe it's just me and I need to get this through my thick skull at some
point.  But, I do think it's OK to improve on the architecture names for
things when they go into the kernel.  The XSAVE XSTATE_BV->xfeatures
rename comes to mind.

Anyway, I'd appreciate it if you could keep this in mind and consider
changing it in a future revision, if one is needed and you agree it is
clearer.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
  2021-12-02 23:48   ` Dave Hansen
@ 2021-12-03  0:32   ` Dave Hansen
  2021-12-03 18:18     ` Reinette Chatre
  2021-12-03 18:14   ` Dave Hansen
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 155+ messages in thread
From: Dave Hansen @ 2021-12-03  0:32 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, luto, mingo,
	linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/1/21 11:23 AM, Reinette Chatre wrote:
> Whether enclave page permissions are restricted or extended it
> is necessary to ensure that the page table entries and enclave page
> permissions are in sync. Introduce a new ioctl,

These should be "ioctl()".

> SGX_IOC_PAGE_MODP, to support enclave page permission changes. Since
> the OS has no insight in how permissions may have been extended from
> within the enclave all page permission requests are treated as
> permission restrictions.

I'm trying to wrap my head around this a bit.  If this is only for
restricting permissions, should we be reflecting that in the naming?
SGX_IOC_PAGE_RESTRICT_PERM, perhaps?  Wouldn't that be more direct than
saying, "here's a permission change ioctl(), but it doesn't arbitrarily
change things, it treats all changes as restrictions"?

The pseudocode for EMODPR looks like this:

> (* Update EPCM permissions *)
> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;

so it makes total sense that it can only restrict permissions since it's
effectively:

	new_hw_perm = old_hw_perm & secinfo_perm;
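
To make the "can only restrict" property concrete, here is a minimal
userspace sketch of that AND. It is an illustration only, not kernel or
SDM code: the PERM_* names, emodpr_model() and the self-check helper are
made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical permission bits, named for this sketch only. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/*
 * Model of the EMODPR update quoted above: every permission bit is
 * ANDed with the requested bit, so a bit can be cleared but never set.
 */
static inline uint8_t emodpr_model(uint8_t old_hw_perm, uint8_t secinfo_perm)
{
	return old_hw_perm & secinfo_perm;
}

/* Small self-check; call it from main() or a test harness. */
void emodpr_model_selfcheck(void)
{
	/* RW restricted to R: the W bit is dropped. */
	assert(emodpr_model(PERM_R | PERM_W, PERM_R) == PERM_R);
	/* Asking for RWX on an R page changes nothing: still R. */
	assert(emodpr_model(PERM_R, PERM_R | PERM_W | PERM_X) == PERM_R);
}
```

Whatever is requested, the result is always a subset of the old
permissions, which is why "restrict" describes the operation better than
"modify".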

...
> +/**
> + * struct sgx_page_modp - parameter structure for the %SGX_IOC_PAGE_MODP ioctl
> + * @offset:	starting page offset (page aligned relative to enclave base
> + *		address defined in SECS)
> + * @length:	length of memory (multiple of the page size)
> + * @prot:	new protection bits of pages in range described by @offset
> + *		and @length
> + * @result:	SGX result code of ENCLS[EMODPR] function
> + * @count:	bytes successfully changed (multiple of page size)
> + */
> +struct sgx_page_modp {
> +	__u64 offset;
> +	__u64 length;
> +	__u64 prot;
> +	__u64 result;
> +	__u64 count;
> +};

Could we make it more explicit that offset/length/prot are inputs and
result/count are output?

..
> +	if (!params.length || params.length & (PAGE_SIZE - 1))
> +		return -EINVAL;

I find these a bit easier to read if they're:

	if (!params.length || !IS_ALIGNED(params.length, PAGE_SIZE))
		...
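
The two spellings should behave identically. A quick userspace sketch of
that claim follows; sketch_is_aligned() is only a stand-in for the
kernel's IS_ALIGNED(), and SKETCH_PAGE_SIZE and the function names are
assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/* Userspace stand-in for IS_ALIGNED(); valid for power-of-two @a only. */
static inline bool sketch_is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/* Length check in the form posted in the patch. */
static inline bool length_invalid_mask(uint64_t length)
{
	return !length || (length & (SKETCH_PAGE_SIZE - 1));
}

/* Length check in the suggested, more readable form. */
static inline bool length_invalid_aligned(uint64_t length)
{
	return !length || !sketch_is_aligned(length, SKETCH_PAGE_SIZE);
}

/* Small self-check; call it from main() or a test harness. */
void length_check_selfcheck(void)
{
	assert(length_invalid_mask(0) && length_invalid_aligned(0));
	assert(!length_invalid_mask(4096) && !length_invalid_aligned(4096));
	assert(length_invalid_mask(4097) && length_invalid_aligned(4097));
}
```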

> +	if (params.offset + params.length - PAGE_SIZE >= encl->size)
> +		return -EINVAL;

I hate boundary conditions. :)  But, I think this would be simpler
written as:

	if (params.offset + params.length > encl->size)

Please double-check me on that, though.  I've gotten these kinds of
checks wrong more times than I care to admit.
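
For what it's worth, a brute-force userspace sketch like the one below
can be used to compare the two forms. It assumes that offset, length and
the enclave size are all page aligned and that length is non-zero; the
function names and SKETCH_PAGE_SIZE are made up for this example, and
wraparound of offset + length is deliberately not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/* Range check as posted: does the last page start at or past the end? */
static inline bool range_invalid_posted(uint64_t offset, uint64_t length,
					uint64_t size)
{
	return offset + length - SKETCH_PAGE_SIZE >= size;
}

/* Range check in the simplified form. */
static inline bool range_invalid_simplified(uint64_t offset, uint64_t length,
					    uint64_t size)
{
	return offset + length > size;
}

/* Count disagreements over page-aligned inputs of up to @max_pages pages. */
unsigned long count_mismatches(uint64_t max_pages)
{
	unsigned long mismatches = 0;
	uint64_t size, offset, length;

	for (size = 1; size <= max_pages; size++) {
		for (offset = 0; offset <= max_pages; offset++) {
			for (length = 1; length <= max_pages; length++) {
				uint64_t s = size * SKETCH_PAGE_SIZE;
				uint64_t o = offset * SKETCH_PAGE_SIZE;
				uint64_t l = length * SKETCH_PAGE_SIZE;

				if (range_invalid_posted(o, l, s) !=
				    range_invalid_simplified(o, l, s))
					mismatches++;
			}
		}
	}

	return mismatches;
}

/* Small self-check; call it from main() or a test harness. */
void range_check_selfcheck(void)
{
	assert(count_mismatches(16) == 0);
}
```

The two forms only diverge for inputs that are not page aligned, for
example offset 0, length 4097 and size 4096, so the simplification
relies on the alignment checks having run first.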

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-01 19:23 ` [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave Reinette Chatre
@ 2021-12-03  0:38   ` Dave Hansen
  2021-12-03 18:47     ` Reinette Chatre
  2021-12-04 23:13   ` Jarkko Sakkinen
  2022-03-01 15:13   ` Jarkko Sakkinen
  2 siblings, 1 reply; 155+ messages in thread
From: Dave Hansen @ 2021-12-03  0:38 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, luto, mingo,
	linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/1/21 11:23 AM, Reinette Chatre wrote:
> +static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> +				     struct sgx_encl *encl, unsigned long addr)
> +{
> +	struct sgx_pageinfo pginfo = {0};
> +	struct sgx_encl_page *encl_page;
> +	struct sgx_epc_page *epc_page;
> +	struct sgx_va_page *va_page;
> +	unsigned long phys_addr;
> +	unsigned long prot;
> +	vm_fault_t vmret;
> +	int ret;
> +
> +	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> +		return VM_FAULT_SIGBUS;
> +
> +	encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
> +	if (!encl_page)
> +		return VM_FAULT_OOM;
> +
> +	encl_page->desc = addr;
> +	encl_page->encl = encl;
> +
> +	/*
> +	 * Adding a regular page that is architecturally allowed to only
> +	 * be created with RW permissions.
> +	 * TBD: Interface with user space policy to support max permissions
> +	 * of RWX.
> +	 */
> +	prot = PROT_READ | PROT_WRITE;
> +	encl_page->vm_run_prot_bits = calc_vm_prot_bits(prot, 0);
> +	encl_page->vm_max_prot_bits = encl_page->vm_run_prot_bits;
> +
> +	epc_page = sgx_alloc_epc_page(encl_page, true);
> +	if (IS_ERR(epc_page)) {
> +		kfree(encl_page);
> +		return VM_FAULT_SIGBUS;
> +	}
> +
> +	va_page = sgx_encl_grow(encl);
> +	if (IS_ERR(va_page)) {
> +		ret = PTR_ERR(va_page);
> +		goto err_out_free;
> +	}
> +
> +	mutex_lock(&encl->lock);
> +
> +	/*
> +	 * Copy comment from sgx_encl_add_page() to maintain guidance in
> +	 * this similar flow:
> +	 * Adding to encl->va_pages must be done under encl->lock.  Ditto for
> +	 * deleting (via sgx_encl_shrink()) in the error path.
> +	 */
> +	if (va_page)
> +		list_add(&va_page->list, &encl->va_pages);
> +
> +	ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
> +			encl_page, GFP_KERNEL);
> +	/*
> +	 * If ret == -EBUSY then page was created in another flow while
> +	 * running without encl->lock
> +	 */
> +	if (ret)
> +		goto err_out_unlock;
> +
> +	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
> +	pginfo.addr = encl_page->desc & PAGE_MASK;
> +	pginfo.metadata = 0;
> +
> +	ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
> +	if (ret)
> +		goto err_out;
> +
> +	encl_page->encl = encl;
> +	encl_page->epc_page = epc_page;
> +	encl_page->type = SGX_PAGE_TYPE_REG;
> +	encl->secs_child_cnt++;
> +
> +	sgx_mark_page_reclaimable(encl_page->epc_page);
> +
> +	phys_addr = sgx_get_epc_phys_addr(epc_page);
> +	/*
> +	 * Do not undo everything when creating PTE entry fails - next #PF
> +	 * would find page ready for a PTE.
> +	 * PAGE_SHARED because protection is forced to be RW above and COW
> +	 * is not supported.
> +	 */
> +	vmret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
> +				    PAGE_SHARED);
> +	if (vmret != VM_FAULT_NOPAGE) {
> +		mutex_unlock(&encl->lock);
> +		return VM_FAULT_SIGBUS;
> +	}
> +	mutex_unlock(&encl->lock);
> +	return VM_FAULT_NOPAGE;
> +
> +err_out:
> +	xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
> +
> +err_out_unlock:
> +	sgx_encl_shrink(encl, va_page);
> +	mutex_unlock(&encl->lock);
> +
> +err_out_free:
> +	sgx_encl_free_epc_page(epc_page);
> +	kfree(encl_page);
> +
> +	return VM_FAULT_SIGBUS;
> +}

There seems to be very little code sharing between this and the existing
page addition.  Are we confident that no refactoring here is in order?

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
  2021-12-02 23:48   ` Dave Hansen
  2021-12-03  0:32   ` Dave Hansen
@ 2021-12-03 18:14   ` Dave Hansen
  2021-12-03 18:49     ` Reinette Chatre
  2021-12-03 19:38   ` Andy Lutomirski
  2021-12-04 23:08   ` Jarkko Sakkinen
  4 siblings, 1 reply; 155+ messages in thread
From: Dave Hansen @ 2021-12-03 18:14 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, luto, mingo,
	linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/1/21 11:23 AM, Reinette Chatre wrote:
> Enclave page permission changes need to be approached with care and
> for this reason this initial support is to allow enclave page
> permission changes _only_ if the new permissions are the same or
> more restrictive than the permissions originally vetted at the time the
> pages were added to the enclave. Support for extending enclave page
> permissions beyond what was originally vetted is deferred.

It's probably worth adding a few examples here:

 * RWX => RW => RX => RW => R => RWX
 * RW => R => RW
 * RX => R => RX
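
A rough userspace sketch of how such sequences could be checked against
the originally vetted permissions is below. It treats the first entry of
each sequence as the vetted maximum, which is my reading of the examples
rather than anything taken from the patch; the PERM_* names and both
helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical permission bits, named for this sketch only. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* True if @requested contains no bit that is missing from @vetted. */
static inline bool perm_within_vetted(unsigned int requested,
				      unsigned int vetted)
{
	return (requested & ~vetted) == 0;
}

/*
 * seq[0] is the permission vetted when the page was added; every later
 * entry is a requested change. All of them must stay within seq[0].
 */
bool perm_sequence_allowed(const unsigned int *seq, size_t n)
{
	size_t i;

	if (n == 0)
		return false;

	for (i = 1; i < n; i++) {
		if (!perm_within_vetted(seq[i], seq[0]))
			return false;
	}

	return true;
}

/* Small self-check; call it from main() or a test harness. */
void perm_sequence_selfcheck(void)
{
	const unsigned int rw_r_rw[] = { PERM_R | PERM_W, PERM_R,
					 PERM_R | PERM_W };
	const unsigned int rw_rx[] = { PERM_R | PERM_W, PERM_R | PERM_X };

	assert(perm_sequence_allowed(rw_r_rw, 3));
	/* A page vetted as RW can never become RX. */
	assert(!perm_sequence_allowed(rw_rx, 2));
}
```

Note that the check is always against the vetted maximum and not against
the previous step, which is what allows R => RWX at the end of the first
example.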


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-02 23:48   ` Dave Hansen
@ 2021-12-03 18:18     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-03 18:18 UTC (permalink / raw)
  To: Dave Hansen, dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Dave,

On 12/2/2021 3:48 PM, Dave Hansen wrote:
> On 12/1/21 11:23 AM, Reinette Chatre wrote:
>> + * EPCM permissions can be extended anytime directly from the enclave with
>> + * no visibility from the OS. This is accomplished with ENCLU[EMODPE]
>> + * run from within enclave. Accessing pages with the new, extended,
>> + * permissions requires the OS to update the PTE to handle the subsequent
>> + * #PF correctly.
> 
> Hi Reinette,
> 
> I really dislike the Intel nomenclature here.  I know the Intel docs are
> all written around permission "extension", but I find it ambiguous.
> 
> I've been looking at these instructions literally for years now and
> permission extension to me can mean either:
>   1. The set of things you can do is extended
>   2. The set of things you can *NOT* do is extended
> 
> I much rather prefer nomenclature like:
> 
> 	EPCM permissions can be relaxed anytime directly from the
> 	enclave with no visibility from the OS. This is accomplished
> 	with ENCLU[EMODPE] run from within enclave. Accessing pages with
> 	the new, relaxed permissions requires the OS to update the PTE
> 	to handle the subsequent #PF correctly.
> 
> "Relax" is less ambiguous.  Relaxing a restriction and relaxing
> permissions both mean doing things less strictly.  Extending
> restrictions and extending what is allowed are opposites.

Very good point.

> Maybe it's just me and I need to get this through my thick skull at some
> point.  But, I do think it's OK to improve on the architecture names for
> things when they go into the kernel.  The XSAVE XSTATE_BV->xfeatures
> rename comes to mind.
> 
> Anyway, I'd appreciate if you could keep this in mind and consider
> changing it if a future revision is needed if you believe it is more clear.
> 

Will do. I see that there is opportunity to use this terminology in my 
reply to your other message in response to this patch. I'll do so and we 
can then further judge how it sounds.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-03  0:32   ` Dave Hansen
@ 2021-12-03 18:18     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-03 18:18 UTC (permalink / raw)
  To: Dave Hansen, dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Dave,

On 12/2/2021 4:32 PM, Dave Hansen wrote:
> On 12/1/21 11:23 AM, Reinette Chatre wrote:
>> Whether enclave page permissions are restricted or extended it
>> is necessary to ensure that the page table entries and enclave page
>> permissions are in sync. Introduce a new ioctl,
> 
> These should be "ioctl()".

Will fix.

> 
>> SGX_IOC_PAGE_MODP, to support enclave page permission changes. Since
>> the OS has no insight in how permissions may have been extended from
>> within the enclave all page permission requests are treated as
>> permission restrictions.
> I'm trying to wrap my head around this a bit.  If this is only for
> restricting permissions, should we be reflecting that in the naming?
> SGX_IOC_PAGE_RESTRICT_PERM, perhaps?  Wouldn't that be more direct than
> saying, "here's a permission change ioctl(), but it doesn't arbitrarily
> change things, it treats all changes as restrictions"?

The ioctl is named from the user space perspective as opposed to the OS 
perspective. While the OS treats all permission changes as permission 
restrictions, user space needs to call this ioctl() to support all 
enclave page permission changes:

* If the enclave page permissions are being restricted then this ioctl() 
would clear the page table entries and call ENCLS[EMODPR] that would 
have work to do to change the enclave page permissions.
* If the enclave page permissions are relaxed (should have been preceded 
by ENCLU[EMODPE] from within the enclave) then this ioctl() would do the 
same as in previous bullet (most importantly clear the page tables) but 
in this case ENCLS[EMODPR] would be a no-op as you indicate below.

Since user space needs OS support for both relaxing and restriction of 
permissions "SGX_IOC_PAGE_MODP" seemed appropriate.
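
To make the two cases concrete, here is a minimal user-space model (a 
sketch, not kernel code). It assumes EMODPR ANDs the requested bits into 
the EPCM permissions, as in the pseudocode quoted below, and that EMODPE 
ORs them in, which is my reading of the SDM. The SK_PERM_* constants and 
the sk_* helpers are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical permission bits for this sketch only. */
#define SK_PERM_R 0x1u
#define SK_PERM_W 0x2u
#define SK_PERM_X 0x4u

/* Model of ENCLU[EMODPE], run from within the enclave: bits can only be added. */
static inline uint8_t sk_emodpe(uint8_t epcm, uint8_t req)
{
	return epcm | req;
}

/* Model of ENCLS[EMODPR], run by the OS: bits can only be removed. */
static inline uint8_t sk_emodpr(uint8_t epcm, uint8_t req)
{
	return epcm & req;
}
```

In the restrict case sk_emodpr() changes the value. In the relax case the 
enclave has already run sk_emodpe() with the same request, so the 
following sk_emodpr() leaves the value unchanged and only the PTE 
clearing done by the ioctl() matters.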


> The pseudocode for EMODPR looks like this:
> 
>> (* Update EPCM permissions *)
>> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
>> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
>> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
> 
> so it makes total sense that it can only restrict permissions since it's
> effectively:
> 
> 	new_hw_perm = old_hw_perm & secinfo_perm;
> 
> ...
>> +/**
>> + * struct sgx_page_modp - parameter structure for the %SGX_IOC_PAGE_MODP ioctl
>> + * @offset:	starting page offset (page aligned relative to enclave base
>> + *		address defined in SECS)
>> + * @length:	length of memory (multiple of the page size)
>> + * @prot:	new protection bits of pages in range described by @offset
>> + *		and @length
>> + * @result:	SGX result code of ENCLS[EMODPR] function
>> + * @count:	bytes successfully changed (multiple of page size)
>> + */
>> +struct sgx_page_modp {
>> +	__u64 offset;
>> +	__u64 length;
>> +	__u64 prot;
>> +	__u64 result;
>> +	__u64 count;
>> +};
> 
> Could we make it more explicit that offset/length/prot are inputs and
> result/count are output?

This follows the pattern of existing struct sgx_enclave_add_pages. Could 
you please provide guidance or a reference of what you would like to 
see? I scanned all the files in arch/x86/include/uapi/asm/* defining RW 
ioctls and a few files in include/uapi/linux/* and I was not able to 
notice such a custom.

Would you like to see something like a "in_"/"out_" prefix? If so, would 
you like to see a preparatory patch that changes struct 
sgx_enclave_add_pages also? If needed, I am not sure how to handle the 
latter due to the possible user space impact.

> 
> ..
>> +	if (!params.length || params.length & (PAGE_SIZE - 1))
>> +		return -EINVAL;
> 
> I find these a bit easier to read if they're:
> 
> 	if (!params.length || !IS_ALIGNED(params.length, PAGE_SIZE))
> 		...
> 

I am not sure about this. First, (I understand this is not a reason to 
do things a particular way), this is re-using the vetted code from 
sgx_ioc_enclave_add_pages(). Second, my understanding of IS_ALIGNED is 
its use to indicate that a provided address/offset is on some boundary, 
in this case it is the length field being verified (not an address or 
offset) and it is required to be a multiple of the page size.

I understand that the code ends up being the same but I think that it 
may be hard to parse that a length field is required to be aligned.

No objection to changing this if you prefer using IS_ALIGNED and I will 
then also include a preparatory patch to change 
sgx_ioc_enclave_add_pages() and make the same change in the following 
patches.

Could you please let me know what you prefer?
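
For reference, the two forms of the length check can be compared in user 
space with the sketch below. SK_PAGE_SIZE and SK_IS_ALIGNED() are local 
stand-ins (assumptions for illustration) for the kernel's PAGE_SIZE and 
IS_ALIGNED(); the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096ULL
/* Local stand-in for the kernel macro; valid for power-of-two alignments. */
#define SK_IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Open-coded form, as in sgx_ioc_enclave_add_pages(): true if length is invalid. */
static inline bool sk_length_bad_mask(uint64_t length)
{
	return !length || (length & (SK_PAGE_SIZE - 1));
}

/* Suggested form using the alignment helper: true if length is invalid. */
static inline bool sk_length_bad_aligned(uint64_t length)
{
	return !length || !SK_IS_ALIGNED(length, SK_PAGE_SIZE);
}
```

Both helpers reject a zero length and any length that is not a multiple 
of the page size, so the choice is purely one of readability.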

>> +	if (params.offset + params.length - PAGE_SIZE >= encl->size)
>> +		return -EINVAL;
> 
> I hate boundary conditions. :)  But, I think this would be simpler
> written as:
> 
> 	if (params.offset + params.length > encl->size)
> 
> Please double-check me on that, though.  I've gotten these kinds of
> checks wrong more times than I care to admit.
> 

I am very cautious about boundary conditions and thus preferred to 
re-use the existing checks from sgx_ioc_enclave_add_pages(). Your 
suggestion is much simpler though and I will use it. Would you also like 
to see a preparatory patch that changes the existing check in 
sgx_ioc_enclave_add_pages()?
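
As a sanity check of the boundary condition, here is a small user-space 
sketch comparing the two forms. It assumes offset, length and the enclave 
size are all multiples of the page size and that length is nonzero, 
which the earlier checks are expected to guarantee. SK_PAGE_SIZE and the 
helper names are hypothetical, and neither form is meant to address 
overflow of offset + length.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096ULL

/* Existing form: start of the last page in the range lies at or past the end. */
static inline bool sk_range_bad_old(uint64_t offset, uint64_t length, uint64_t size)
{
	return offset + length - SK_PAGE_SIZE >= size;
}

/* Suggested form: end of the range lies past the end of the enclave. */
static inline bool sk_range_bad_new(uint64_t offset, uint64_t length, uint64_t size)
{
	return offset + length > size;
}
```

With page-aligned inputs the two agree, because offset + length and size 
can then only differ by whole pages. They can disagree for unaligned 
values, so the simpler form relies on the alignment checks being done 
first.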

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-03  0:38   ` Dave Hansen
@ 2021-12-03 18:47     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-03 18:47 UTC (permalink / raw)
  To: Dave Hansen, dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Dave,

On 12/2/2021 4:38 PM, Dave Hansen wrote:
> On 12/1/21 11:23 AM, Reinette Chatre wrote:
>> +static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
>> +				     struct sgx_encl *encl, unsigned long addr)
>> +{
>> +	struct sgx_pageinfo pginfo = {0};
>> +	struct sgx_encl_page *encl_page;
>> +	struct sgx_epc_page *epc_page;
>> +	struct sgx_va_page *va_page;
>> +	unsigned long phys_addr;
>> +	unsigned long prot;
>> +	vm_fault_t vmret;
>> +	int ret;
>> +
>> +	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
>> +		return VM_FAULT_SIGBUS;
>> +
>> +	encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
>> +	if (!encl_page)
>> +		return VM_FAULT_OOM;
>> +
>> +	encl_page->desc = addr;
>> +	encl_page->encl = encl;
>> +
>> +	/*
>> +	 * Adding a regular page that is architecturally allowed to only
>> +	 * be created with RW permissions.
>> +	 * TBD: Interface with user space policy to support max permissions
>> +	 * of RWX.
>> +	 */
>> +	prot = PROT_READ | PROT_WRITE;
>> +	encl_page->vm_run_prot_bits = calc_vm_prot_bits(prot, 0);
>> +	encl_page->vm_max_prot_bits = encl_page->vm_run_prot_bits;
>> +
>> +	epc_page = sgx_alloc_epc_page(encl_page, true);
>> +	if (IS_ERR(epc_page)) {
>> +		kfree(encl_page);
>> +		return VM_FAULT_SIGBUS;
>> +	}
>> +
>> +	va_page = sgx_encl_grow(encl);
>> +	if (IS_ERR(va_page)) {
>> +		ret = PTR_ERR(va_page);
>> +		goto err_out_free;
>> +	}
>> +
>> +	mutex_lock(&encl->lock);
>> +
>> +	/*
>> +	 * Copy comment from sgx_encl_add_page() to maintain guidance in
>> +	 * this similar flow:
>> +	 * Adding to encl->va_pages must be done under encl->lock.  Ditto for
>> +	 * deleting (via sgx_encl_shrink()) in the error path.
>> +	 */
>> +	if (va_page)
>> +		list_add(&va_page->list, &encl->va_pages);
>> +
>> +	ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
>> +			encl_page, GFP_KERNEL);
>> +	/*
>> +	 * If ret == -EBUSY then page was created in another flow while
>> +	 * running without encl->lock
>> +	 */
>> +	if (ret)
>> +		goto err_out_unlock;
>> +
>> +	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
>> +	pginfo.addr = encl_page->desc & PAGE_MASK;
>> +	pginfo.metadata = 0;
>> +
>> +	ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
>> +	if (ret)
>> +		goto err_out;
>> +
>> +	encl_page->encl = encl;
>> +	encl_page->epc_page = epc_page;
>> +	encl_page->type = SGX_PAGE_TYPE_REG;
>> +	encl->secs_child_cnt++;
>> +
>> +	sgx_mark_page_reclaimable(encl_page->epc_page);
>> +
>> +	phys_addr = sgx_get_epc_phys_addr(epc_page);
>> +	/*
>> +	 * Do not undo everything when creating PTE entry fails - next #PF
>> +	 * would find page ready for a PTE.
>> +	 * PAGE_SHARED because protection is forced to be RW above and COW
>> +	 * is not supported.
>> +	 */
>> +	vmret = vmf_insert_pfn_prot(vma, addr, PFN_DOWN(phys_addr),
>> +				    PAGE_SHARED);
>> +	if (vmret != VM_FAULT_NOPAGE) {
>> +		mutex_unlock(&encl->lock);
>> +		return VM_FAULT_SIGBUS;
>> +	}
>> +	mutex_unlock(&encl->lock);
>> +	return VM_FAULT_NOPAGE;
>> +
>> +err_out:
>> +	xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
>> +
>> +err_out_unlock:
>> +	sgx_encl_shrink(encl, va_page);
>> +	mutex_unlock(&encl->lock);
>> +
>> +err_out_free:
>> +	sgx_encl_free_epc_page(epc_page);
>> +	kfree(encl_page);
>> +
>> +	return VM_FAULT_SIGBUS;
>> +}
> 
> There seems to be very little code sharing between this and the existing
> page addition.  Are we confident that no refactoring here is in order?
> 

I can understand your concern here because this code looks similar to 
the page addition code. Primarily because it uses the same objects (an 
enclave page, an EPC page, and a VA page).

The flow is different though because (1) the enclave page needs to be 
created differently to handle its static (RW) permissions as opposed to 
the permissions from additional metadata, (2) a different instruction 
(ENCLS[EAUG] vs ENCLS[EADD]) is used, and (3) the page table entries are 
installed which does not form part of the original page addition.

A major complication to factoring out code is that there are (slightly 
different) allocations needed before the mutex is obtained (enclave 
page, EPC page, and VA page) and then different actions taken on these 
individual allocations with the mutex held. With the mutex taken between 
the differing allocations and the differing actions, it is not clear to me 
how to refactor this.

Please do let me know if you see any ways in which I can improve this code.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-03 18:14   ` Dave Hansen
@ 2021-12-03 18:49     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-03 18:49 UTC (permalink / raw)
  To: Dave Hansen, dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Dave,

On 12/3/2021 10:14 AM, Dave Hansen wrote:
> On 12/1/21 11:23 AM, Reinette Chatre wrote:
>> Enclave page permission changes need to be approached with care and
>> for this reason this initial support is to allow enclave page
>> permission changes _only_ if the new permissions are the same or
>> more restrictive than the permissions originally vetted at the time the
>> pages were added to the enclave. Support for extending enclave page
>> permissions beyond what was originally vetted is deferred.
> 
> It's probably worth adding a few examples here:
> 
>   * RWX => RW => RX => RW => R => RWX
>   * RW => R => RW
>   * RX => R => RX
> 

Indeed - that would make the implications of this change clear.

Will do. Thank you very much.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-01 19:23 ` [PATCH 05/25] x86/sgx: Introduce runtime protection bits Reinette Chatre
@ 2021-12-03 19:28   ` Andy Lutomirski
  2021-12-03 22:12     ` Reinette Chatre
  2021-12-04 23:57     ` Jarkko Sakkinen
  2021-12-04 22:50   ` Jarkko Sakkinen
  1 sibling, 2 replies; 155+ messages in thread
From: Andy Lutomirski @ 2021-12-03 19:28 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/1/21 11:23, Reinette Chatre wrote:
> Enclave creators declare their paging permission intent at the time
> the pages are added to the enclave. These paging permissions are
> vetted when pages are added to the enclave and stashed off
> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
> enclave PTEs.
> 

I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the 
EPCM permission bits however it likes with no oversight from the kernel. 
  So we end up with a whole bunch of permission masks:

The PTE: controlled by complex kernel policy

The VMA: with your series, this is entirely controlled by userspace.  I 
think I'm fine with that.

vm_max_prot_bits: populated from secinfo at setup time, unless I missed 
something that changes it later.  Maybe I'm confused or missed something 
in one of the patches.

vm_run_prot_bits: populated from some combination of ioctls.  I'm 
entirely lost as to what this is for.

EPCM bits: controlled by the guest.  basically useless for any host 
purpose on SGX2 hardware (with or without kernel support -- the enclave 
can do ENCLU[EMODPE] whether we like it or not, even on old kernels)

So I guess I don't understand the purpose of this patch or of the rules 
in the later patches, and I feel like this is getting more complicated 
than makes sense.


Could we perhaps make vm_max_prot_bits dynamic or at least controllable 
in some useful way?  My initial proposal (years ago) was for 
vm_max_prot_bits to be *separately* configured at initial load time 
instead of being inferred from secinfo with the intent being that the 
user untrusted runtime would set it appropriately.  I have no problem 
with allowing runtime changes as long as the security policy makes sense 
and it's kept consistent with PTEs.

Also, I think we need a changelog message or, even better, actual docs 
in kernel, explaining the actual final set of rules and invariants for 
all these masks.

--Andy

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
                     ` (2 preceding siblings ...)
  2021-12-03 18:14   ` Dave Hansen
@ 2021-12-03 19:38   ` Andy Lutomirski
  2021-12-03 22:34     ` Reinette Chatre
  2021-12-04 23:08   ` Jarkko Sakkinen
  4 siblings, 1 reply; 155+ messages in thread
From: Andy Lutomirski @ 2021-12-03 19:38 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/1/21 11:23, Reinette Chatre wrote:
> In the initial (SGX1) version of SGX, pages in an enclave need to be
> created with permissions that support all usages of the pages, from the
> time the enclave is initialized until it is unloaded. For example,
> pages used by a JIT compiler or when code needs to otherwise be
> relocated need to always have RWX permissions.
> 
> SGX2 includes two functions that can be used to modify the enclave page
> permissions of regular enclave pages within an initialized enclave.
> ENCLS[EMODPR] is run from the OS and used to restrict enclave page
> permissions while ENCLU[EMODPE] is run from within the enclave to
> extend enclave page permissions.
> 
> Enclave page permission changes need to be approached with care and
> for this reason this initial support is to allow enclave page
> permission changes _only_ if the new permissions are the same or
> more restrictive than the permissions originally vetted at the time the
> pages were added to the enclave. Support for extending enclave page
> permissions beyond what was originally vetted is deferred.
> 

I may well be missing something, but off the top of my head, literally 
the only reason that EMODPR needs CPL0 (i.e. ENCLS) is that it requires 
a TLB flush IPI to take effect.  (Score one for AMD for having 
superior hardware in this regard.)

Given that, I don't see any reason for the EMODPR operation to be 
treated as security sensitive -- it just needs to be implemented 
correctly.  I don't even see why the host should (or even can!) do any 
useful tracking of the EPCM state.

(But I am confused about one thing: to the extent an enclave actually 
needs EMODPR, is there anything in the hardware or anything that the 
enclave can do short of actually poking the page from all threads and 
confirming that a fault occurs to make sure the OS actually flushed the 
TLB?  ISTM a malicious host could attack an enclave by omitting the TLB 
flush and then exploiting an enclave but that would have been mitigated 
if the flush occurred.)

--Andy

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-03 19:28   ` Andy Lutomirski
@ 2021-12-03 22:12     ` Reinette Chatre
  2021-12-04  0:38       ` Andy Lutomirski
  2021-12-04 23:57     ` Jarkko Sakkinen
  1 sibling, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-03 22:12 UTC (permalink / raw)
  To: Andy Lutomirski, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Andy,

On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
> On 12/1/21 11:23, Reinette Chatre wrote:
>> Enclave creators declare their paging permission intent at the time
>> the pages are added to the enclave. These paging permissions are
>> vetted when pages are added to the enclave and stashed off
>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>> enclave PTEs.
>>
> 
> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the 
> EPCM permission bits however it likes with no oversight from the kernel. 
>   So we end up with a whole bunch of permission masks:

Before jumping to the permission masks I would like to step back and 
just confirm the context. We need to consider the following three 
permissions:

EPCM permissions: the enclave page permissions maintained in the SGX 
hardware. The OS is constrained here in that it cannot query the current 
EPCM permissions. Even so, the OS needs to ensure PTEs are installed 
appropriately (we do not want a RW PTE for a read-only enclave page) and 
thus the OS keeps its own record of EPCM permissions to support this.
As you note later, even in the current kernel the enclave can change these 
permissions without the OS knowing. EPCM permissions can only be relaxed 
without the OS's knowledge, though, so the OS's record of EPCM permissions 
can only ever be the same as or stricter than the actual EPCM permissions.

VMA permissions: Current behavior (not changed in this series) is that 
the OS enforces that a new VMA should have the same or weaker 
permissions than the EPCM permissions.

Page table entries: These should match the EPCM permissions without 
exceeding VMA permissions.

> The PTE: controlled by complex kernel policy
> 
> The VMA: with your series, this is entirely controlled by userspace.  I 
> think I'm fine with that.
> 
> vm_max_prot_bits: populated from secinfo at setup time, unless I missed 
> something that changes it later.  Maybe I'm confused or missed something 
> in one of the patches.

Yes, vm_max_prot_bits is currently and continues to be populated from 
secinfo for pages added before the enclave is initialized and in a later 
patch it would be hardcoded to RW for pages that are added after the 
enclave is initialized.  In the current implementation vm_max_prot_bits 
is the OS's record of the EPCM permissions used to guide VMA and PTE 
permissions.

On a higher level, the implementation decision is that vm_max_prot_bits 
is the static "vetted" permissions of a page - the maximum permissions a 
page is allowed to have during its entire lifetime. This matches the 
current implementation. In the current implementation permissions are 
only able to change via VMA and PTE ... for example a read-only VMA can 
access an enclave page with vm_max_prot_bits of RW. With the SGX2 
support permission changes are allowed to EPCM permissions - but in this 
implementation they are not allowed to exceed the originally vetted 
vm_max_prot_bits.

In this SGX2 implementation an enclave page could thus be added to an 
enclave with secinfo and vm_max_prot_bits of RW that would only allow 
that page to have R or RW permissions (VMA, PTE, and OS view of EPCM 
permissions) in its lifetime, never RX or RWX. Yes, it may be possible 
for the enclave to change the EPCM permissions from within the enclave 
using ENCLU[EMODPE] but to access the page the enclave would need the OS 
to install the appropriate PTE and the OS would not do so if 
vm_max_prot_bits does not allow it. Neither would the OS allow an 
executable VMA.

> 
> vm_run_prot_bits: populated from some combination of ioctls.  I'm 
> entirely lost as to what this is for.

With SGX2 it is possible to change the EPCM permissions of an enclave 
page after the enclave is initialized. vm_max_prot_bits would provide 
guidance to what permissions a page is allowed to have while 
vm_run_prot_bits contains the current view of EPCM permissions used by 
the OS to guide whether requested VMA permissions are allowed and guide 
what PTE permissions should be.

Consider this example how vm_max_prot_bits and vm_run_prot_bits are used:

(1) Add enclave page with secinfo of RW to uninitialized enclave
     vm_max_prot_bits = RW
     vm_run_prot_bits = RW

(2) User space runs SGX_IOC_PAGE_MODP to change the permissions to read-
     only. This is allowed because vm_max_prot_bits = RW. Now:
     vm_max_prot_bits = RW
     vm_run_prot_bits = R

     Now VMAs are created and PTEs installed based on value of
     vm_run_prot_bits - write access will not be allowed.

(3) User space runs SGX_IOC_PAGE_MODP to change the permissions to RX.
     This will be denied because vm_max_prot_bits = RW.

(4) User space runs SGX_IOC_PAGE_MODP to change the permissions to RW.
     This will be allowed because vm_max_prot_bits = RW.

     Now VMAs are created and PTEs installed based on value of
     vm_run_prot_bits - write access will again be allowed.
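
The same sequence can be written as a small user-space model. This is a 
sketch of the policy described above, not the kernel implementation: 
struct sk_encl_page, the sk_* helpers, the SK_PROT_* bits and the -1 
error value are all hypothetical names chosen for illustration.

```c
/* Hypothetical permission bits for this sketch only. */
#define SK_PROT_R 0x1u
#define SK_PROT_W 0x2u
#define SK_PROT_X 0x4u

struct sk_encl_page {
	unsigned int max_prot;	/* models vm_max_prot_bits: static, vetted */
	unsigned int run_prot;	/* models vm_run_prot_bits: OS view of EPCM */
};

/* Step (1): a page added before initialization starts with both masks equal. */
static inline void sk_page_add(struct sk_encl_page *p, unsigned int secinfo_prot)
{
	p->max_prot = secinfo_prot;
	p->run_prot = secinfo_prot;
}

/* Models SGX_IOC_PAGE_MODP: a request may never exceed the vetted maximum. */
static inline int sk_page_modp(struct sk_encl_page *p, unsigned int req)
{
	if (req & ~p->max_prot)
		return -1;	/* denied; the real error code is not modeled */
	p->run_prot = req;
	return 0;
}
```

Only run_prot ever changes after the page is added; max_prot stays fixed 
and is consulted on every request.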

> 
> EPCM bits: controlled by the guest.  basically useless for any host 
> purpose on SGX2 hardware (with or without kernel support -- the enclave 
> can do ENCLU[EMODPE] whether we like it or not, even on old kernels)

Indeed - permissions can only be relaxed without the OS's knowledge so the 
OS's view would always be the same as or stricter than the enclave's.

> So I guess I don't understand the purpose of this patch or of the 
> rules in the later patches, and I feel like this is getting more 
> complicated than makes sense.
> 
> 
> Could we perhaps make vm_max_prot_bits dynamic or at least controllable 
> in some useful way?  My initial proposal (years ago) was for 
> vm_max_prot_bits to be *separately* configured at initial load time 
> instead of being inferred from secinfo with the intent being that the 
> user untrusted runtime would set it appropriately.  I have no problem 
> with allowing runtime changes as long as the security policy makes sense 
> and it's kept consistent with PTEs.

This SGX2 enabling indeed builds on the current implementation where 
vm_max_prot_bits is inferred from secinfo. The intent is for 
vm_max_prot_bits to reflect the maximum allowed vetted permissions.

At this time vm_max_prot_bits is indeed static and thus creates the need 
for (dynamic) vm_run_prot_bits that reflects the current EPCM 
permissions and guides VMA and PTE permissions while vm_max_prot_bits 
guides new permission requests. From what I understand this 
implementation follows the current security policy - permissions are 
never allowed to exceed the originally vetted permissions. PTEs are kept 
consistent in that they match the (vetted, OS view of) EPCM permissions.

> Also, I think we need a changelog message or, even better, actual docs 
> in kernel, explaining the actual final set of rules and invariants for 
> all these masks.

I will add a section to Documentation/x86/sgx.rst.

Reinette


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-03 19:38   ` Andy Lutomirski
@ 2021-12-03 22:34     ` Reinette Chatre
  2021-12-04  0:42       ` Andy Lutomirski
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-03 22:34 UTC (permalink / raw)
  To: Andy Lutomirski, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Andy,

On 12/3/2021 11:38 AM, Andy Lutomirski wrote:
> On 12/1/21 11:23, Reinette Chatre wrote:
>> In the initial (SGX1) version of SGX, pages in an enclave need to be
>> created with permissions that support all usages of the pages, from the
>> time the enclave is initialized until it is unloaded. For example,
>> pages used by a JIT compiler or when code needs to otherwise be
>> relocated need to always have RWX permissions.
>>
>> SGX2 includes two functions that can be used to modify the enclave page
>> permissions of regular enclave pages within an initialized enclave.
>> ENCLS[EMODPR] is run from the OS and used to restrict enclave page
>> permissions while ENCLU[EMODPE] is run from within the enclave to
>> extend enclave page permissions.
>>
>> Enclave page permission changes need to be approached with care and
>> for this reason this initial support is to allow enclave page
>> permission changes _only_ if the new permissions are the same or
>> more restrictive than the permissions originally vetted at the time the
>> pages were added to the enclave. Support for extending enclave page
>> permissions beyond what was originally vetted is deferred.
>>
> 
> I may well be missing something, but off the top of my head, literally 
> the only reason that EMODPR needs CPL0 (i.e. ENCLS) is that it requires 
> a TLB flush IPI to take effect.  (Score one for AMD for having 
> superior hardware in this regard.)

My understanding also is that it is the need for a TLB flush that requires 
the privilege but I am trying to get more information here.

> 
> Given that, I don't see any reason for the EMODPR operation to be 
> treated as security sensitive -- it just needs to be implemented 
> correctly.  I don't even see why the host should (or even can!) do any 
> useful tracking of the EPCM state.

The OS needs to know the EPCM permissions to be able to install the 
appropriate PTEs. If the enclave chooses to change the enclave page 
permissions from within the enclave then user space needs to let the OS 
know via the SGX_IOC_PAGE_MODP ioctl to ensure that the OS can install 
correct PTEs in support of the permission change.


> (But I am confused about one thing: to the extent an enclave actually 
> needs EMODPR, is there anything in the hardware or anything that the 
> enclave can do short of actually poking the page from all threads and 
> confirming that a fault occurs to make sure the OS actually flushed the 
> TLB?  ISTM a malicious host could attack an enclave by omitting the TLB 
> flush and then exploiting an enclave but that would have been mitigated 
> if the flush occurred.)

When enclave page permissions are restricted it requires the enclave to 
accept the new permissions from within the enclave by running 
ENCLU[EACCEPT]. This instruction requires that (it will fail otherwise) 
the OS completed an ENCLS[ETRACK] on the affected page - essentially 
ENCLU[EACCEPT] can only succeed if no cached linear-to-physical address 
mappings are present. The ETRACK flow is elaborate and I attempted to 
document it in patch 06/25. Essentially, SGX hardware flushes all cached 
linear-to-physical mappings when an enclave is exited and with ETRACK it 
can be ensured that all threads that were in an enclave at the time the 
tracking started (in this case after ENCLS[EMODPR]), have exited.

Reinette



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-03 22:12     ` Reinette Chatre
@ 2021-12-04  0:38       ` Andy Lutomirski
  2021-12-04  1:14         ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Andy Lutomirski @ 2021-12-04  0:38 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/3/21 14:12, Reinette Chatre wrote:
> Hi Andy,
> 
> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>> On 12/1/21 11:23, Reinette Chatre wrote:
>>> Enclave creators declare their paging permission intent at the time
>>> the pages are added to the enclave. These paging permissions are
>>> vetted when pages are added to the enclave and stashed off
>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>> enclave PTEs.
>>>
>>
>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>> the EPCM permission bits however it likes with no oversight from the 
>> kernel.   So we end up with a whole bunch of permission masks:
> 
> Before jumping to the permission masks I would like to step back and 
> just confirm the context. We need to consider the following three 
> permissions:
> 
> EPCM permissions: the enclave page permissions maintained in the SGX 
> hardware. The OS is constrained here in that it cannot query the current 
> EPCM permissions. Even so, the OS needs to ensure PTEs are installed 
> appropriately (we do not want a RW PTE for a read-only enclave page)

Why not?  What's wrong with an RW PTE for a read-only enclave page?

If you convince me that this is actually important, then I'll read all 
the stuff below.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-03 22:34     ` Reinette Chatre
@ 2021-12-04  0:42       ` Andy Lutomirski
  2021-12-04  1:35         ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Andy Lutomirski @ 2021-12-04  0:42 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/3/21 14:34, Reinette Chatre wrote:
> Hi Andy,
> 
> On 12/3/2021 11:38 AM, Andy Lutomirski wrote:
>> On 12/1/21 11:23, Reinette Chatre wrote:
>>> In the initial (SGX1) version of SGX, pages in an enclave need to be
>>> created with permissions that support all usages of the pages, from the
>>> time the enclave is initialized until it is unloaded. For example,
>>> pages used by a JIT compiler or when code needs to otherwise be
>>> relocated need to always have RWX permissions.
>>>
>>> SGX2 includes two functions that can be used to modify the enclave page
>>> permissions of regular enclave pages within an initialized enclave.
>>> ENCLS[EMODPR] is run from the OS and used to restrict enclave page
>>> permissions while ENCLU[EMODPE] is run from within the enclave to
>>> extend enclave page permissions.
>>>
>>> Enclave page permission changes need to be approached with care and
>>> for this reason this initial support is to allow enclave page
>>> permission changes _only_ if the new permissions are the same or
>>> more restrictive that the permissions originally vetted at the time the
>>> pages were added to the enclave. Support for extending enclave page
>>> permissions beyond what was originally vetted is deferred.
>>>
>>
>> I may well be missing something, but off the top of my head, literally 
>> the only reason that EMODPR needs CPL0 (i.e. ENCLS) is that it 
>> requires a TLB flush IPI to take effect.  (Score one for AMD for being 
>> having superior hardware in this regard.)
> 
> My understanding also is that it is the need for TLB flush that require 
> the privilege but I am trying to get more information here.
> 
>>
>> Given that, I don't see any reason for the EMODPR operation to be 
>> treated as security sensitive -- it just needs to be implemented 
>> correctly.  I don't even see why the host should (or even can!) do any 
>> useful tracking of the EPCM state.
> 
> The OS needs to know the EPCM permissions to be able to install the 
> appropriate PTEs. If the enclave chooses to change the enclave page 
> permissions from within the enclave then user space needs to let the OS 
> know via the SGX_IOC_PAGE_MODP ioctl to ensure that the OS can install 
> correct PTEs in support of the permission change.
> 
> 
>> (But I am confused about one thing: to the extent an enclave actually 
>> needs EMODPR, is there anything in the hardware or anything that the 
>> enclave can do short of actually poking the page from all threads and 
>> confirming that a fault occurs to make sure the OS actually flushed 
>> the TLB?  ISTM a malicious host could attack an enclave by omitting 
>> the TLB flush and then exploiting an enclave but that would have been 
>> mitigated if the flush occurred.)
> 
> When enclave page permissions are restricted it requires the enclave to 
> accept the new permissions from within the enclave by running 
> ENCLU[EACCEPT]. This instruction requires that (it will fail otherwise) 
> the OS completed an ENCLS[ETRACK] on the affected page - essentially 
> ENCLU[EACCEPT] can only succeed if no cached linear-to-physical address 
> mappings are present. The ETRACK flow is elaborate and I attempted to 
> document it in patch 06/25. Essentially, SGX hardware flushes all cached 
> linear-to-physical mappings when an enclave is exited and with ETRACK it 
> can be ensured that all threads that were in an enclave at the time the 
> tracking started (in this case after ENCLS[EMODPR]), have exited.
> 

Does the enclave do something before asking for the ioctl to put the 
page in a state where the tracking is armed?  I read the SDM, but I 
probably read the wrong part of the SDM, and I may have missed this.

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-04  0:38       ` Andy Lutomirski
@ 2021-12-04  1:14         ` Reinette Chatre
  2021-12-04 17:56           ` Andy Lutomirski
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-04  1:14 UTC (permalink / raw)
  To: Andy Lutomirski, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Andy,

On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
> On 12/3/21 14:12, Reinette Chatre wrote:
>> Hi Andy,
>>
>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>> Enclave creators declare their paging permission intent at the time
>>>> the pages are added to the enclave. These paging permissions are
>>>> vetted when pages are added to the enclave and stashed off
>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>> enclave PTEs.
>>>>
>>>
>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>>> the EPCM permission bits however it likes with no oversight from the 
>>> kernel.   So we end up with a whole bunch of permission masks:
>>
>> Before jumping to the permission masks I would like to step back and 
>> just confirm the context. We need to consider the following three 
>> permissions:
>>
>> EPCM permissions: the enclave page permissions maintained in the SGX 
>> hardware. The OS is constrained here in that it cannot query the 
>> current EPCM permissions. Even so, the OS needs to ensure PTEs are 
>> installed appropriately (we do not want a RW PTE for a read-only 
>> enclave page)
> 
> Why not?  What's wrong with an RW PTE for a read-only enclave page?
> 
> If you convince me that this is actually important, then I'll read all 
> the stuff below.

Perhaps it is my misunderstanding/misinterpretation of the current 
implementation? From what I understand the current requirement, as 
enforced in the current mmap(), mprotect() as well as fault() hooks, is 
that mappings are required to have identical or weaker permission than 
the enclave permission.

Could you please elaborate how you envision PTEs should be managed in 
this implementation?

Thank you

Reinette

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-04  0:42       ` Andy Lutomirski
@ 2021-12-04  1:35         ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-04  1:35 UTC (permalink / raw)
  To: Andy Lutomirski, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Andy,

On 12/3/2021 4:42 PM, Andy Lutomirski wrote:
> On 12/3/21 14:34, Reinette Chatre wrote:
>> On 12/3/2021 11:38 AM, Andy Lutomirski wrote:
>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>> In the initial (SGX1) version of SGX, pages in an enclave need to be
>>>> created with permissions that support all usages of the pages, from the
>>>> time the enclave is initialized until it is unloaded. For example,
>>>> pages used by a JIT compiler or when code needs to otherwise be
>>>> relocated need to always have RWX permissions.
>>>>
>>>> SGX2 includes two functions that can be used to modify the enclave page
>>>> permissions of regular enclave pages within an initialized enclave.
>>>> ENCLS[EMODPR] is run from the OS and used to restrict enclave page
>>>> permissions while ENCLU[EMODPE] is run from within the enclave to
>>>> extend enclave page permissions.
>>>>
>>>> Enclave page permission changes need to be approached with care and
>>>> for this reason this initial support is to allow enclave page
>>>> permission changes _only_ if the new permissions are the same or
>>>> more restrictive that the permissions originally vetted at the time the
>>>> pages were added to the enclave. Support for extending enclave page
>>>> permissions beyond what was originally vetted is deferred.
>>>>
>>>
>>> I may well be missing something, but off the top of my head, 
>>> literally the only reason that EMODPR needs CPL0 (i.e. ENCLS) is that 
>>> it requires a TLB flush IPI to take effect.  (Score one for AMD for 
>>> being having superior hardware in this regard.)
>>
>> My understanding also is that it is the need for TLB flush that 
>> require the privilege but I am trying to get more information here.
>>
>>>
>>> Given that, I don't see any reason for the EMODPR operation to be 
>>> treated as security sensitive -- it just needs to be implemented 
>>> correctly.  I don't even see why the host should (or even can!) do 
>>> any useful tracking of the EPCM state.
>>
>> The OS needs to know the EPCM permissions to be able to install the 
>> appropriate PTEs. If the enclave chooses to change the enclave page 
>> permissions from within the enclave then user space needs to let the 
>> OS know via the SGX_IOC_PAGE_MODP ioctl to ensure that the OS can 
>> install correct PTEs in support of the permission change.
>>
>>
>>> (But I am confused about one thing: to the extent an enclave actually 
>>> needs EMODPR, is there anything in the hardware or anything that the 
>>> enclave can do short of actually poking the page from all threads and 
>>> confirming that a fault occurs to make sure the OS actually flushed 
>>> the TLB?  ISTM a malicious host could attack an enclave by omitting 
>>> the TLB flush and then exploiting an enclave but that would have been 
>>> mitigated if the flush occurred.)
>>
>> When enclave page permissions are restricted it requires the enclave 
>> to accept the new permissions from within the enclave by running 
>> ENCLU[EACCEPT]. This instruction requires that (it will fail 
>> otherwise) the OS completed an ENCLS[ETRACK] on the affected page - 
>> essentially ENCLU[EACCEPT] can only succeed if no cached 
>> linear-to-physical address mappings are present. The ETRACK flow is 
>> elaborate and I attempted to document it in patch 06/25. Essentially, 
>> SGX hardware flushes all cached linear-to-physical mappings when an 
>> enclave is exited and with ETRACK it can be ensured that all threads 
>> that were in an enclave at the time the tracking started (in this case 
>> after ENCLS[EMODPR]), have exited.
>>
> 
> Does the enclave do something before asking for the ioctl to put the 
> page in a state where the tracking is armed?  I read the SDM, but I 
> probably read the wrong part of the SDM, and I may have missed this.

No, when restricting permissions the enclave does not do anything 
special to the page before calling the ioctl.

The (non-enclave) user space calls the SGX_IOC_PAGE_MODP ioctl, which
runs ENCLS[EMODPR] to restrict the permissions and ENCLS[ETRACK] to
start the tracking, and then sends an IPI to all CPUs that may be
accessing the enclave. The enclave then runs ENCLU[EACCEPT] to accept
the permission changes, and this would fail if the host attempted to
omit the TLB flush.
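
To make the ordering concrete, here is a toy model of that flow. It is
only a sketch: struct toy_page, its fields, and the toy_* helpers are
hypothetical names made up for illustration and do not correspond to
kernel code or to real EPCM state.

```c
#include <stdbool.h>

/* Hypothetical per-page state; not real EPCM or kernel fields. */
struct toy_page {
	bool pr_pending;	/* set by the modeled EMODPR, cleared by EACCEPT */
	bool tracked;		/* a modeled ETRACK ran after EMODPR */
	int stale_threads;	/* threads inside the enclave when tracking began */
};

/* Host: restrict permissions; the change now awaits acceptance. */
static void toy_emodpr(struct toy_page *p)
{
	p->pr_pending = true;
	p->tracked = false;
}

/* Host: start tracking the threads currently inside the enclave. */
static void toy_etrack(struct toy_page *p, int threads_inside)
{
	p->tracked = true;
	p->stale_threads = threads_inside;
}

/* Host: the IPI forces every tracked thread out, dropping cached mappings. */
static void toy_ipi_all(struct toy_page *p)
{
	p->stale_threads = 0;
}

/* Enclave: returns 0 on success, nonzero if the tracking did not complete. */
static int toy_eaccept(struct toy_page *p)
{
	if (!p->pr_pending || !p->tracked || p->stale_threads > 0)
		return -1;
	p->pr_pending = false;
	return 0;
}
```

In this model a host that skips toy_ipi_all() leaves stale_threads
nonzero, so toy_eaccept() fails and the enclave learns that the flush
did not happen.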

You can see an example of EPCM permission changes from user space 
perspective in the form of a selftest found in the patch that follows 
this one.

Reinette


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-04  1:14         ` Reinette Chatre
@ 2021-12-04 17:56           ` Andy Lutomirski
  2021-12-04 23:55             ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Andy Lutomirski @ 2021-12-04 17:56 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On 12/3/21 17:14, Reinette Chatre wrote:
> Hi Andy,
> 
> On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
>> On 12/3/21 14:12, Reinette Chatre wrote:
>>> Hi Andy,
>>>
>>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>>> Enclave creators declare their paging permission intent at the time
>>>>> the pages are added to the enclave. These paging permissions are
>>>>> vetted when pages are added to the enclave and stashed off
>>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>>> enclave PTEs.
>>>>>
>>>>
>>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>>>> the EPCM permission bits however it likes with no oversight from the 
>>>> kernel.   So we end up with a whole bunch of permission masks:
>>>
>>> Before jumping to the permission masks I would like to step back and 
>>> just confirm the context. We need to consider the following three 
>>> permissions:
>>>
>>> EPCM permissions: the enclave page permissions maintained in the SGX 
>>> hardware. The OS is constrained here in that it cannot query the 
>>> current EPCM permissions. Even so, the OS needs to ensure PTEs are 
>>> installed appropriately (we do not want a RW PTE for a read-only 
>>> enclave page)
>>
>> Why not?  What's wrong with an RW PTE for a read-only enclave page?
>>
>> If you convince me that this is actually important, then I'll read all 
>> the stuff below.
> 
> Perhaps it is my misunderstanding/misinterpretation of the current 
> implementation? From what I understand the current requirement, as 
> enforced in the current mmap(), mprotect() as well as fault() hooks, is 
> that mappings are required to have identical or weaker permission than 
> the enclave permission.

The current implementation does require that, but for a perhaps 
counterintuitive reason.  If a SELinux-restricted (or similarly 
restricted) process that is *not* permitted to do JIT-like things loads 
an enclave, it's entirely okay for it to initialize RW enclave pages 
however it likes and it's entirely okay for it to initialize RX (or XO 
if that ever becomes a thing) enclave pages from appropriate files on 
disk.  But it's not okay for it to create RWX enclave pages or to 
initialize RX enclave pages from untrusted application memory. [0]

So we have a half-baked implementation right now: the permission to 
execute a page is decided based on secinfo (max permissions) when the 
enclave is set up, and it's enforced at the PTE level.  The PTE 
enforcement is because, on SGX2 hardware, the enclave can do EMODPE and 
bypass any supposed restrictions in the EPCM.

The only coupling between EPCM and PTE here is that the max_perm is 
initialized together with EPCM, but it didn't have to be that way.

An SGX2 implementation needs to be more fully baked, because in a 
dynamic environment enclaves need to be able to use EMODPE and actually 
end up with permissions that exceed the initial secinfo permissions.  So 
it needs to be possible to make a page that starts out R (or RW or 
whatever) but nonetheless has max_perm=RWX so that the enclave can use a 
combination of EMODPE and (ioctl-based) EMODPR to do JIT.  So I think 
you should make it possible to set up pages like this, but I see no 
reason to couple the PTE and the EPCM permissions.

> 
> Could you please elaborate how you envision PTEs should be managed in 
> this implementation?

As above: PTE permissions may not exceed max_perm, and EPCM is entirely 
separate except to the extent needed for ABI compatibility with SGX1 
runtimes.
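
Roughly, as a sketch (the TOY_PROT_* bits and toy_may_map() are made-up
names for illustration, not the kernel's):

```c
#include <errno.h>

/* Hypothetical permission bits for the sketch. */
#define TOY_PROT_R 0x1u
#define TOY_PROT_W 0x2u
#define TOY_PROT_X 0x4u

/*
 * A mapping is allowed only if every requested bit is within max_perm.
 * The EPCM permissions are deliberately not an input to this check.
 */
static int toy_may_map(unsigned int max_perm, unsigned int requested)
{
	if (requested & ~max_perm)
		return -EACCES;
	return 0;
}
```

A page with max_perm=RWX can then be mapped R, RW or RX regardless of
what the EPCM currently says, while a page with max_perm=R can never
get a writable PTE.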


[0] I'm not sure anyone actually has a system set up like this or that 
the necessary LSM support is in the kernel.  But it's supposed to be 
possible without changing the ABI.


* Re: [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers
  2021-12-01 19:22 ` [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers Reinette Chatre
@ 2021-12-04 18:30   ` Jarkko Sakkinen
  2021-12-06 21:13     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 18:30 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:22:59AM -0800, Reinette Chatre wrote:
> The SGX ENCLS instruction uses EAX to specify an SGX function and
> may require additional registers, depending on the SGX function.
> ENCLS invokes the specified privileged SGX function for managing
> and debugging enclaves. Macros are used to wrap the ENCLS
> functionality and several wrappers are used to wrap the macros to
> make the different SGX functions accessible in the code.
> 
> The wrappers of the supported SGX functions are cryptic. Add short
> changelog descriptions of each to a comment.

I think you are adding function descriptions.

> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/encls.h | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index 9b204843b78d..241b766265d3 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -162,57 +162,68 @@ static inline bool encls_failed(int ret)
>  	ret;						\
>  	})
>  
> +/* Create an SECS page in the Enclave Page Cache (EPC) */
>  static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
>  {
>  	return __encls_2(ECREATE, pginfo, secs);
>  }

You have:

* "Create an SECS page in the Enclave Page Cache (EPC)"
* "Add a Version Array (VA) page to the Enclave Page Cache (EPC)"

They should have similar descriptions, e.g.

* "Initialize an EPC page into SGX Enclave Control Structure (SECS) page."
* "Initialize an EPC page into Version Array (VA) page."

> +/* Extend uninitialized enclave measurement */
>  static inline int __eextend(void *secs, void *addr)
>  {
>  	return __encls_2(EEXTEND, secs, addr);
>  }

That description does not make __eextend any less cryptic.

Something like this would be already more informative:

/* Hash a 256 byte region of an enclave page to SECS:MRENCLAVE. */

This same remark applies to the rest of these comments. They should
provide a clue what the wrapper does rather than an English open coded
function name.

/Jarkko

* Re: [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions
  2021-12-01 19:23 ` [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions Reinette Chatre
@ 2021-12-04 22:04   ` Jarkko Sakkinen
  2021-12-06 21:15     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 22:04 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:00AM -0800, Reinette Chatre wrote:
> The SGX ENCLS instruction uses EAX to specify an SGX function and
> may require additional registers, depending on the SGX function.
> ENCLS invokes the specified privileged SGX function for managing
> and debugging enclaves. Several macros are used to wrap the ENCLS
> functionality.
> 
> Add ENCLS wrappers for the SGX2 EMODPR, EMODT, and EAUG functions
> that can make changes to pages of an initialized SGX enclave. The
> EMODPR function is used to restrict enclave page permissions
> as maintained within the enclave (Enclave Page Cache Map (EPCM)
> permissions). The EMODT function is used to change the type of an
> enclave page. The EAUG function is used to dynamically add enclave
> pages to an initialized enclave.
> 
> EMODPR and EMODT accepts two parameters and can fault as well as return
> an SGX error code. EAUG also accepts two parameters but does not return
> an SGX error code. Use existing macros for all new functions.
> 
> Expand enum sgx_return_code with the possible EMODPR and EMODT
> return codes.

These implementation details only obfuscate this commit message, and
it is way too high-level to be useful e.g. for kernel maintenance.

I'd replace it with something like:

"
Add wrappers for ENCLS leaf functions EAUG, EMODT and EMODPR,
which roughly take two steps:

1. EAUG creates a new EPCM entry.
   EMODT and EMODPR modify an existing EPCM entry.
2. Set either .PR = 1 (EMODPR), .MODIFY = 1 (EMODT) or .PENDING = 1 (EAUG).

The bit is reset by the enclave by invoking ENCLU leaf function EACCEPT
or EACCEPTCOPY, which results in the EPCM change becoming effective.
"
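
To illustrate the two steps, a toy model could look like this. All
names are hypothetical and the flag names follow the wording above, not
any kernel header or the SDM:

```c
#include <stdbool.h>

enum sketch_leaf { SKETCH_EAUG, SKETCH_EMODT, SKETCH_EMODPR };

/* Hypothetical EPCM entry reduced to the bits discussed above. */
struct sketch_epcm {
	bool valid;	/* an entry exists for the page */
	bool pending;	/* set by EAUG */
	bool modify;	/* set by EMODT */
	bool pr;	/* set by EMODPR */
};

/* Host side: create or modify the entry, then flag it for acceptance. */
static void sketch_encls(struct sketch_epcm *e, enum sketch_leaf leaf)
{
	switch (leaf) {
	case SKETCH_EAUG:
		e->valid = true;
		e->pending = true;
		break;
	case SKETCH_EMODT:
		e->modify = true;
		break;
	case SKETCH_EMODPR:
		e->pr = true;
		break;
	}
}

/* True while the enclave still has to accept the change. */
static bool sketch_awaits_accept(const struct sketch_epcm *e)
{
	return e->pending || e->modify || e->pr;
}

/* Enclave side: EACCEPT/EACCEPTCOPY reset the flag. */
static void sketch_eaccept(struct sketch_epcm *e)
{
	e->pending = false;
	e->modify = false;
	e->pr = false;
}
```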

The current commit message is also not addressing these:

1. What happens if an enclave accesses a memory address with either .PR,
   .MODIFY or .PENDING set in EPCM, other than by the means of EACCEPT
   or EACCEPTCOPY?
2. The calling conditions (e.g. concerning TLBs and the ETRACK/IPI/etc.
   dance related to it).


If this information were properly contained here, discussing the
following commits would be much easier.

/Jarkko

* Re: [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions
  2021-12-01 19:23 ` [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions Reinette Chatre
@ 2021-12-04 22:25   ` Jarkko Sakkinen
  2021-12-04 22:27     ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 22:25 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote:
> === Summary ===
> 
> An SGX VMA can only be created if its permissions are the same or
> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
> creation this rule continues to be enforced by the page fault handler.
> 
> With SGX2 the EPCM permissions of a page can change after VMA
> creation resulting in the VMA exceeding the EPCM permissions and the
> page fault handler incorrectly blocking access.
> 
> Enable the VMA's pages to remain accessible while ensuring that
> the page table entries are installed to match the EPCM permissions
> without exceeding the VMA permissions.

I don't understand what the short summary means in English, and the
commit message is way too bloated to draw any conclusions from. It
really needs a rewrite.

These were the questions I could not find an answer for:

1. Why would it be safe to remove a permission check?
2. Why is re-issuing mmap()s infeasible, i.e. closing the existing
   VMAs and mmap()ing new ones?

/Jarkko

* Re: [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions
  2021-12-04 22:25   ` Jarkko Sakkinen
@ 2021-12-04 22:27     ` Jarkko Sakkinen
  2021-12-06 21:16       ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 22:27 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote:
> > === Summary ===
> > 
> > An SGX VMA can only be created if its permissions are the same or
> > weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
> > creation this rule continues to be enforced by the page fault handler.
> > 
> > With SGX2 the EPCM permissions of a page can change after VMA
> > creation resulting in the VMA exceeding the EPCM permissions and the
> > page fault handler incorrectly blocking access.
> > 
> > Enable the VMA's pages to remain accessible while ensuring that
> > the page table entries are installed to match the EPCM permissions
> > without exceeding the VMA permissions.
> 
> I don't understand what the short summary means in English, and the
> commit message is way too bloated to make any conclusions. It really
> needs a rewrite.
> 
> These were the questions I could not find answer for:
> 
> 1. Why it would be by any means safe to remove a permission check?
> 2. Why not re-issuing mmap()'s is unfeasible? I.e. close existing
>    VMA's and mmap() new ones.

3. Isn't this an API/ABI break?

/Jarkko

* Re: [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs
  2021-12-01 19:23 ` [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs Reinette Chatre
@ 2021-12-04 22:43   ` Jarkko Sakkinen
  2021-12-06 21:18     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 22:43 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:02AM -0800, Reinette Chatre wrote:
> By default a write page fault on a present PTE inherits the permissions
> of the VMA. Enclave page permissions maintained in the hardware's
> Enclave Page Cache Map (EPCM) may change after a VMA accessing the page
> is created. A VMA's permissions may thus exceed the enclave page
> permissions even though the VMA was originally created not to exceed
> the enclave page permissions. Following the default behavior during
> a page fault on a present PTE while the VMA permissions exceed the
> enclave page permissions would result in the PTE for an enclave page
> to be writable even though the page is not writable according to the
> enclave's permissions.
> 
> Consider the following scenario:
> * An enclave page exists with RW EPCM permissions.
> * A RW VMA maps the range spanning the enclave page.
> * The enclave page's EPCM permissions are changed to read-only.

How could this happen in the existing mainline code?

> * There is no page table entry for the enclave page.
> 
> Q.
>  What will user space observe when an attempt is made to write to the
>  enclave page from within the enclave?
> 
> A.
>  Initially the page table entry is not present so the following is
>  observed:
>  1) Instruction writing to enclave page is run from within the enclave.
>  2) A page fault with second and third bits set (0x6) is encountered
>     and handled by the SGX handler sgx_vma_fault() that installs a
>     read-only page table entry following previous patch that installs
>     page table entry with permissions that VMA and enclave agree on
>     (read-only in this case).
>  3) Instruction writing to enclave page is re-attempted.
>  4) A page fault with first three bits set (0x7) is encountered and
>     transparently (from SGX and user space perspective) handled by the
>     OS with the page table entry made writable because the VMA is
>     writable.
>  5) Instruction writing to enclave page is re-attempted.
>  6) Since the EPCM permissions prevents writing to the page a new page
>     fault is encountered, this time with the SGX flag set in the error
>     code (0x8007). No action is taken by OS for this page fault and
>     execution returns to user space.
>  7) Typically such a fault will be passed on to an application with a
>     signal but if the enclave is entered with the vDSO function provided
>     by the kernel then user space does not receive a signal but instead
>     the vDSO function returns successfully with exception information
>     (vector=14, error code=0x8007, and address) within the exception
>     fields within the vDSO function's struct sgx_enclave_run.
> 
> As can be observed it is not possible for user space to write to an
> enclave page if that page's enclave page permissions do not allow so,
> no matter what the VMA or PTE allows.
> 
> Even so, the OS should not allow writing to a page if that page is not
> writable. Thus the page table entry should accurately reflect the
> enclave page permissions.
> 
> Do not blindly accept VMA permissions on a page fault due to a write
> attempt to a present PTE. Install a pfn_mkwrite() handler that ensures
> that the VMA permissions agree with the enclave permissions in this
> regard.
> 
> Considering the same scenario as above after this change results in
> the following behavior change:
> 
> Q.
>  What will user space observe when an attempt is made to write to the
>  enclave page from within the enclave?
> 
> A.
>  Initially the page table entry is not present so the following is
>  observed:
>  1) Instruction writing to enclave page is run from within the enclave.
>  2) A page fault with second and third bits set (0x6) is encountered
>     and handled by the SGX handler sgx_vma_fault() that installs a
>     read-only page table entry following previous patch that installs
>     page table entry with permissions that VMA and enclave agree on
>     (read-only in this case).
>  3) Instruction writing to enclave page is re-attempted.
>  4) A page fault with first three bits set (0x7) is encountered and
>     passed to the pfn_mkwrite() handler for consideration. The handler
>     determines that the page should not be writable and returns SIGBUS.
>  5) Typically such a fault will be passed on to an application with a
>     signal but if the enclave is entered with the vDSO function provided
>     by the kernel then user space does not receive a signal but instead
>     the vDSO function returns successfully with exception information
>     (vector=14, error code=0x7, and address) within the exception fields
>     within the vDSO function's struct sgx_enclave_run.
> 
> The accurate exception information supports the SGX runtime, which is
> virtually always implemented inside a shared library, by providing
> accurate information in support of its management of the SGX enclave.

This Q&A format is not a great idea, as it effectively dictates which
questions are legitimate to ask. You should describe what the patch does
and the reasons for doing it. Unfortunately, in its current form it is
very hard to get a grip on this patch.
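
For reference, the error codes quoted in the commit message (0x6, 0x7
and 0x8007) can be decoded with a few lines of C. This is a sketch: the
TOY_PF_* names and toy_pf_describe() are made up here, although the bit
positions are the architectural x86 page fault error code bits.

```c
#include <stdio.h>

#define TOY_PF_PROT	(1u << 0)	/* 0: page not present, 1: protection fault */
#define TOY_PF_WRITE	(1u << 1)	/* 0: read access, 1: write access */
#define TOY_PF_USER	(1u << 2)	/* 0: kernel mode, 1: user mode */
#define TOY_PF_SGX	(1u << 15)	/* fault due to SGX access control (EPCM) */

/* Format a page fault error code into buf and return buf. */
static const char *toy_pf_describe(unsigned int code, char *buf, size_t len)
{
	snprintf(buf, len, "%s %s %s%s",
		 (code & TOY_PF_PROT) ? "present" : "not-present",
		 (code & TOY_PF_WRITE) ? "write" : "read",
		 (code & TOY_PF_USER) ? "user" : "kernel",
		 (code & TOY_PF_SGX) ? " sgx" : "");
	return buf;
}
```

So 0x6 is a user write to a non-present page, 0x7 is a user write that
hit a present but non-writable PTE, and 0x8007 is the same access
rejected by the EPCM rather than by the page tables.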

/Jarkko

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-01 19:23 ` [PATCH 05/25] x86/sgx: Introduce runtime protection bits Reinette Chatre
  2021-12-03 19:28   ` Andy Lutomirski
@ 2021-12-04 22:50   ` Jarkko Sakkinen
  2021-12-06 21:28     ` Reinette Chatre
  1 sibling, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 22:50 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

What about:

"x86/sgx: Add encl_page->vm_run_prot_bits for dynamic permission changes"

On Wed, Dec 01, 2021 at 11:23:03AM -0800, Reinette Chatre wrote:
> Enclave creators declare their paging permission intent at the time
> the pages are added to the enclave. These paging permissions are
> vetted when pages are added to the enclave and stashed off
> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
> enclave PTEs.
> 
> Current permission support assume that enclave page permissions
> remain static for the lifetime of the enclave. This is about to change
> with the addition of support for SGX2 where the permissions of enclave
> pages belonging to an initialized enclave may be changed during the
> enclave's lifetime.
> 
> Introduce runtime protection bits in preparation for support of

By writing "Introduce runtime protection bits", instead of simply "Add
encl_page->vm_run_prot_bits", the only thing you are adding is obfuscation.

Try to refer to the "exact thing", instead of English rephrasing
whenever possible.

> enclave page permission changes. These bits reflect the active
> permissions of an enclave page and are not to exceed the maximum
> protection bits that passed scrutiny during enclave creation.
> 
> Associate runtime protection bits with each enclave page. Initialize
> the runtime protection bits to the vetted maximum protection bits
> on page creation. Use the runtime protection bits for any access
> checks.

I guess the first sentence in this paragraph is completely redundant
as the first sentence of the previous paragraph tells the exact
same story.

> struct sgx_encl_page hosting this information is maintained for each
> enclave page so the space consumed by the struct is important.
> The existing vm_max_prot_bits is already unsigned long while only using
> three bits. Transition to a bitfield for the two members containing
> protection bits.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

So this commit message leaves the most important thing unanswered,
or I missed it (which happens quite often): why two fields instead
of one? Why does vm_max_prot_bits need to stay constant?

It's something that should be clearly documented.
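
For illustration only, a minimal user-space sketch of how the two
fields relate according to the quoted commit message: the runtime bits
start out equal to the vetted maximum and may never exceed it. The
struct name, the PROT_BIT_* constants, the helper names and the 8-bit
field widths are assumptions made for this sketch, not the patch's
actual definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_READ/VM_WRITE/VM_EXEC flags. */
#define PROT_BIT_R 0x1UL
#define PROT_BIT_W 0x2UL
#define PROT_BIT_X 0x4UL

/* Simplified model of the two bitfield members discussed above. */
struct encl_page_model {
	unsigned long vm_max_prot_bits:8;	/* vetted when the page is added */
	unsigned long vm_run_prot_bits:8;	/* currently active permissions */
};

/* On page creation the runtime bits start at the vetted maximum. */
static void encl_page_init(struct encl_page_model *page, unsigned long vetted)
{
	page->vm_max_prot_bits = vetted;
	page->vm_run_prot_bits = vetted;
}

/* A runtime change is refused if it asks for more than was vetted. */
static bool encl_page_set_run_prot(struct encl_page_model *page,
				   unsigned long prot)
{
	unsigned long max = page->vm_max_prot_bits;

	if (prot & ~max)
		return false;

	page->vm_run_prot_bits = prot;
	assert(!(page->vm_run_prot_bits & ~max));
	return true;
}

/* Access checks consult the runtime bits, not the maximum. */
static bool encl_page_may_access(const struct encl_page_model *page,
				 unsigned long prot)
{
	unsigned long run = page->vm_run_prot_bits;

	return !(prot & ~run);
}
```

In this model a page vetted as RW can be restricted to R and later
restored to RW, but a request for X is always refused because
vm_max_prot_bits never changes.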

/Jarkko

* Re: [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function
  2021-12-01 19:23 ` [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function Reinette Chatre
@ 2021-12-04 22:56   ` Jarkko Sakkinen
  2021-12-06 21:29     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 22:56 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

What are "enclave cpumask" and "generic name"? I'd prefer to speak
about concrete things and not use weird rephrasings at all.

Also, renaming is not exporting.

You should split this into two patches:

1. x86/sgx: Export sgx_encl_ewb_cpumask()
2. x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask().

/Jarkko

* Re: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2021-12-01 19:23 ` [PATCH 07/25] x86/sgx: Move PTE zap code to separate function Reinette Chatre
@ 2021-12-04 22:59   ` Jarkko Sakkinen
  2021-12-06 21:30     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 22:59 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:05AM -0800, Reinette Chatre wrote:
> The SGX reclaimer removes page table entries pointing to pages that are
> moved to swap. SGX2 enables changes to pages belonging to an initialized
> enclave, for example changing page permissions. Supporting SGX2 requires
> this ability to remove page table entries that is available in the
> SGX reclaimer code.

Missing: why does SGX2 require this?

> Factor out the code removing page table entries to a separate function,
> fixing accuracy of comments in the process, and make it available to other
> areas within the SGX code.
> 
> Since the code will no longer be unique to the reclaimer it is relocated
> to be with the rest of the enclave code in encl.c interacting with the
> page table.

This last paragraph should be removed. It can be seen from the code change
and diffstat.

/Jarkko

* Re: [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally
  2021-12-01 19:23 ` [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally Reinette Chatre
@ 2021-12-04 23:00   ` Jarkko Sakkinen
  2021-12-06 21:36     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:00 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:06AM -0800, Reinette Chatre wrote:
> The ETRACK instruction followed by an IPI to all CPUs within an enclave
> is a common pattern with more frequent use in support of SGX2.
> 
> Make the (empty) IPI callback function available internally in
> preparation for more usages.

Please, just describe the usages that this is needed for so that
there is zero guesswork required.

/Jarkko

* Re: [PATCH 09/25] x86/sgx: Keep record of SGX page type
  2021-12-01 19:23 ` [PATCH 09/25] x86/sgx: Keep record of SGX page type Reinette Chatre
@ 2021-12-04 23:03   ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:03 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:07AM -0800, Reinette Chatre wrote:
> SGX2 functions are not allowed on all page types. For example,
> ENCLS[EMODPR] is only allowed on regular SGX enclave pages and
> ENCLS[EMODPT] is only allowed on TCS and regular pages. If these
> functions are attempted on another type of page the hardware would
> trigger a fault.
> 
> Keep a record of the SGX page type so that there is more
> certainty whether an SGX2 instruction can succeed and faults
> can be treated as real failures.
> 
> The page type is made to be a property of struct sgx_encl_page
> and thus does not cover the VA page type. VA pages are maintained
> in separate structures and thus their type can be determined in
> a different way. The SGX2 instructions being supported do not
> operate on VA pages and this is thus not a scenario needing to
> be covered at this time.
> 
> With the protection bits consuming 16 bits of the unsigned long
> there is room available in the bitfield to include the page type
> information without increasing the space consumed by the struct.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

I think this is needed for any formulation of these patches, and
I cannot foresee it being done any other way, so
 
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

/Jarkko

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
                     ` (3 preceding siblings ...)
  2021-12-03 19:38   ` Andy Lutomirski
@ 2021-12-04 23:08   ` Jarkko Sakkinen
  2021-12-06 20:19     ` Dave Hansen
  2021-12-06 21:42     ` Reinette Chatre
  4 siblings, 2 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:08 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:08AM -0800, Reinette Chatre wrote:
> In the initial (SGX1) version of SGX, pages in an enclave need to be
> created with permissions that support all usages of the pages, from the
> time the enclave is initialized until it is unloaded. For example,
> pages used by a JIT compiler or when code needs to otherwise be
> relocated need to always have RWX permissions.
> 
> SGX2 includes two functions that can be used to modify the enclave page
> permissions of regular enclave pages within an initialized enclave.
> ENCLS[EMODPR] is run from the OS and used to restrict enclave page
> permissions while ENCLU[EMODPE] is run from within the enclave to
> extend enclave page permissions.
> 
> Enclave page permission changes need to be approached with care and
> for this reason this initial support is to allow enclave page
> permission changes _only_ if the new permissions are the same or
> more restrictive that the permissions originally vetted at the time the
> pages were added to the enclave. Support for extending enclave page
> permissions beyond what was originally vetted is deferred.

This paragraph is out-of-scope for a commit message. You could have
this in the cover letter but not here. I would just remove it.

> Whether enclave page permissions are restricted or extended it
> is necessary to ensure that the page table entries and enclave page
> permissions are in sync. Introduce a new ioctl, SGX_IOC_PAGE_MODP, to

SGX_IOC_PAGE_MODP does not match the naming convention of these:

* SGX_IOC_ENCLAVE_CREATE
* SGX_IOC_ENCLAVE_ADD_PAGES
* SGX_IOC_ENCLAVE_INIT

A better name would be SGX_IOC_ENCLAVE_MOD_PROTECTIONS. It doesn't
do harm to be a bit more verbose.

/Jarkko

* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-01 19:23 ` [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave Reinette Chatre
  2021-12-03  0:38   ` Dave Hansen
@ 2021-12-04 23:13   ` Jarkko Sakkinen
  2021-12-06 21:44     ` Reinette Chatre
  2022-03-01 15:13   ` Jarkko Sakkinen
  2 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:13 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

"to initialize" -> "to an initialized"

On Wed, Dec 01, 2021 at 11:23:11AM -0800, Reinette Chatre wrote:
> With SGX1 an enclave needs to be created with its maximum memory demands
> allocated. Pages cannot be added to an enclave after it is initialized.
> SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
> pages to an initialized enclave. With SGX2 the enclave still needs to
> set aside address space for its maximum memory demands during enclave
> creation, but all pages need not be added before enclave initialization.
> Pages can be added during enclave runtime.
> 
> Add support for dynamically adding pages to an initialized enclave,
> architecturally limited to RW permission. Add pages via the page fault
> handler at the time an enclave address without a backing enclave page
> is accessed, potentially directly reclaiming pages if no free pages
> are available.
> 
> The enclave is still required to run ENCLU[EACCEPT] on the page before
> it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
> on an uninitialized address. This will trigger the page fault handler
> that will add the enclave page and return execution to the enclave to
> repeat the ENCLU[EACCEPT] instruction, this time successful.
> 
> If the enclave accesses an uninitialized address in another way, for
> example by expanding the enclave stack to a page that has not yet been
> added, then the page fault handler would add the page on the first
> write but upon returning to the enclave the instruction that triggered
> the page fault would be repeated and since ENCLU[EACCEPT] was not run
> yet it would trigger a second page fault, this time with the SGX flag
> set in the page fault error code. This can only be recovered by entering
> the enclave again and directly running the ENCLU[EACCEPT] instruction on
> the now initialized address.
> 
> Accessing an uninitialized address from outside the enclave also triggers
> this flow but the page will remain in PENDING state until accepted from
> within the enclave.

What does it mean to be in the PENDING state and, more importantly, what
is the PENDING state? What does a memory access from within the enclave
cause when it touches a page in this state?

I see a lot of text in the commit message but zero mentions of the EPCM
except this one sudden mention of the PENDING field, without attaching
it to anything concrete.
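
For illustration only, the flow described in the quoted commit message
can be modelled as a small state machine. This is a sketch under stated
assumptions: the enum names and the enclave_access() helper are
hypothetical, and the model only encodes what the commit message
states, not the full EPCM semantics.

```c
#include <assert.h>

/* Hypothetical model of one enclave address, as seen by the enclave. */
enum page_state { PAGE_UNBACKED, PAGE_PENDING, PAGE_ACCEPTED };
enum access_kind { ACCESS_EACCEPT, ACCESS_OTHER };
enum access_result { ACCESS_OK, ACCESS_RETRY, ACCESS_SGX_FAULT };

static enum access_result enclave_access(enum page_state *state,
					 enum access_kind kind)
{
	switch (*state) {
	case PAGE_UNBACKED:
		/*
		 * No backing page yet: the page fault handler adds the
		 * page (ENCLS[EAUG]), leaving it PENDING, and the
		 * faulting instruction is repeated.
		 */
		*state = PAGE_PENDING;
		return ACCESS_RETRY;
	case PAGE_PENDING:
		/* Only ENCLU[EACCEPT] may touch a PENDING page. */
		if (kind == ACCESS_EACCEPT) {
			*state = PAGE_ACCEPTED;
			return ACCESS_OK;
		}
		/* Any other access faults with the SGX flag set. */
		return ACCESS_SGX_FAULT;
	case PAGE_ACCEPTED:
		/*
		 * Assumption of this sketch: accesses to an accepted
		 * page are treated uniformly as successful.
		 */
		return ACCESS_OK;
	}

	assert(0 && "unknown page state");
	return ACCESS_SGX_FAULT;
}
```

Calling enclave_access() twice with ACCESS_EACCEPT on an unbacked page
models the "useful flow" from the commit message, while two
ACCESS_OTHER calls end in ACCESS_SGX_FAULT with the page still PENDING
until an ENCLU[EACCEPT] is run on it.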

/Jarkko


* Re: [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization
  2021-12-01 19:23 ` [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
@ 2021-12-04 23:14   ` Jarkko Sakkinen
  2021-12-06 21:45     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:14 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:12AM -0800, Reinette Chatre wrote:
> Before an enclave is initialized the enclave's memory range is unknown.
> The enclave's memory range is learned at the time it is created via the
> SGX_IOC_ENCLAVE_CREATE ioctl where the provided memory range is obtained
> from an earlier mmap() of the sgx_enclave device. After an enclave is
> initialized its memory can be mapped into user space (mmap()) from where
> it can be entered at its defined entry points.
> 
> With the enclave's memory range known after it is initialized there is
> no reason why it should be possible to map memory outside this range.
> 
> Lock down access to the initialized enclave's memory range by denying
> any attempt to map memory outside its memory range.
> 
> Locking down the memory range also makes adding pages to an initialized
> enclave more efficient. Pages are added to an initialized enclave by
> accessing memory that belongs to the enclave's memory range but not yet
> backed by an enclave page. If it is possible for user space to map
> memory that does not form part of the enclave then an access to this
> memory would eventually fail. Failures range from a prompt general
> protection fault if the access was an ENCLU[EACCEPT] from within the
> enclave, or a page fault via the vDSO if it was another access from
> within the enclave, or a SIGBUS (also resulting from a page fault) if
> the access was from outside the enclave.
> 
> Disallowing invalid memory to be mapped in the first place avoids
> preventable failures.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/encl.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 342b97dd4c33..37203da382f8 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -403,6 +403,10 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
>  
>  	XA_STATE(xas, &encl->page_array, PFN_DOWN(start));
>  

Please write a comment here.

> +	if (test_bit(SGX_ENCL_INITIALIZED, &encl->flags) &&
> +	    (start < encl->base || end > encl->base + encl->size))
> +		return -EACCES;
> +
>  	/*
>  	 * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might
>  	 * conflict with the enclave page permissions.
> -- 
> 2.25.1
> 

Otherwise, makes sense.
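
For illustration only, a stand-alone sketch of the range check from the
quoted hunk together with one possible wording for the requested
comment. The struct and function names are hypothetical, and the
comment text is a suggestion derived from the commit message rather
than anything taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of the enclave state needed for the check. */
struct encl_range_model {
	bool initialized;
	unsigned long base;
	unsigned long size;
};

static int encl_may_map_range(const struct encl_range_model *encl,
			      unsigned long start, unsigned long end)
{
	assert(start <= end);

	/*
	 * The enclave's address range is known once it is initialized.
	 * A mapping outside [base, base + size) can never be backed by
	 * an enclave page, so refuse it up front instead of letting a
	 * later access fail.
	 */
	if (encl->initialized &&
	    (start < encl->base || end > encl->base + encl->size))
		return -EACCES;

	return 0;
}
```

With base 0x10000 and size 0x4000, a request for [0x10000, 0x14000) is
accepted while one that starts at 0xf000 or ends at 0x15000 returns
-EACCES. Before initialization the check does not apply.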

/Jarkko

* Re: [PATCH 16/25] x86/sgx: Support modifying SGX page type
  2021-12-01 19:23 ` [PATCH 16/25] x86/sgx: Support modifying SGX page type Reinette Chatre
@ 2021-12-04 23:45   ` Jarkko Sakkinen
  2021-12-06 21:48     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:45 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:14AM -0800, Reinette Chatre wrote:
> Every enclave contains one or more Thread Control Structures (TCS). The
> TCS contains meta-data used by the hardware to save and restore thread
> specific information when entering/exiting the enclave. With SGX1 an
> enclave needs to be created with enough TCSs to support the largest
> number of threads expecting to use the enclave and enough enclave pages
> to meet all its anticipated memory demands. In SGX1 all pages remain in
> the enclave until the enclave is unloaded.
> 
> Earlier changes added support for the SGX2 feature where pages can be
> added dynamically to an initialized enclave.

Please remove this paragraph, i.e. do not tie the commit order like
this.
> 
> SGX2 introduces a new function, ENCLS[EMODT], that is used to change
> the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave
> page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a
> regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS)
> page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later
> removal).
> 
> With the existing support of dynamically adding regular enclave pages
> to an initialized enclave and changing the page type to TCS it is
> possible to dynamically increase the number of threads supported by an
> enclave.
> 
> Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step
> of dynamically removing pages from an initialized enclave. The complete
> page removal flow is:
> 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
>    using the ioctl introduced here.
> 2) Approve the page removal by running ENCLU[EACCEPT] from within
>    the enclave.
> 3) Initiate actual page removal using the new ioctl introduced in the
>    following patch.
> 
> Support changing SGX enclave page types with a new ioctl. With this

What is "a new ioctl"? Why not just write "Add <ioctl name>"?

> ioctl the user specifies a page range and the enclave page type to be
> applied to all pages in the provided range. The ioctl itself can return
> an error code based on failures encountered by the OS. It is also
> possible for SGX specific failures to be encountered.  Add a result
> output parameter to communicate the SGX return code. It is
> possible for the enclave page type change request to fail on any page
> within the provided range. Support partial success by returning
> the number of pages that were successfully changed.
> 
> After the page type is changed to SGX_PAGE_TYPE_TRIM the page continues
> to be accessible from the OS perspective with page table entries and
> internal state. The page may be moved to swap. Any invalid access
> (any access except ENCLU[EACCEPT]) will encounter a page fault with
> SGX flag set in error code until the page is removed. Removal of
> trimmed enclave pages on user request will be supported in following
> patch. Trimmed enclave pages are also removed when enclave is unloaded.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

This is lacking discussion of EPCM interaction, most importantly
the .MODIFY field of an EPCM entry.
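
For illustration only, the page type transitions listed in the quoted
commit message (regular to TCS, and regular or TCS to trimmed) can be
written as a small predicate. This is a sketch: the enum and function
names are hypothetical, the enumerator values do not correspond to the
hardware encoding, and the EPCM interaction asked about above is not
modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical page types; values are NOT the hardware encoding. */
enum page_type_model { PT_MODEL_REG, PT_MODEL_TCS, PT_MODEL_TRIM };

static bool modt_transition_allowed(enum page_type_model cur,
				    enum page_type_model next)
{
	/* The only new page types that can be requested are TCS and TRIM. */
	if (next != PT_MODEL_TCS && next != PT_MODEL_TRIM)
		return false;

	/* Regular pages can become TCS or TRIM. */
	if (cur == PT_MODEL_REG)
		return true;

	/* TCS pages can only be trimmed. */
	return cur == PT_MODEL_TCS && next == PT_MODEL_TRIM;
}

/* Apply a type change, or return -EINVAL and leave the type untouched. */
static int modt_apply(enum page_type_model *cur, enum page_type_model next)
{
	if (!modt_transition_allowed(*cur, next))
		return -EINVAL;

	*cur = next;
	assert(*cur == PT_MODEL_TCS || *cur == PT_MODEL_TRIM);
	return 0;
}
```

In this model a regular page can be turned into a TCS page and then
trimmed, but no page can return to the regular type and a trimmed page
accepts no further type change.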

> ---
>  arch/x86/include/uapi/asm/sgx.h |  19 +++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 235 ++++++++++++++++++++++++++++++++
>  2 files changed, 254 insertions(+)
> 
> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> index 24bebc31e336..f70caccd166c 100644
> --- a/arch/x86/include/uapi/asm/sgx.h
> +++ b/arch/x86/include/uapi/asm/sgx.h
> @@ -31,6 +31,8 @@ enum sgx_page_flags {
>  	_IO(SGX_MAGIC, 0x04)
>  #define SGX_IOC_PAGE_MODP \
>  	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
> +#define SGX_IOC_PAGE_MODT \
> +	_IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt)

I'd suggest renaming this to SGX_IOC_ENCLAVE_MODIFY_TYPE.

>  
>  /**
>   * struct sgx_enclave_create - parameter structure for the
> @@ -96,6 +98,23 @@ struct sgx_page_modp {
>  	__u64 count;
>  };
>  
> +/**
> + * struct sgx_page_modt - parameter structure for the %SGX_IOC_PAGE_MODT ioctl
> + * @offset:	starting page offset (page aligned relative to enclave base
> + *		address defined in SECS)
> + * @length:	length of memory (multiple of the page size)
> + * @type:	new type of pages in range described by @offset and @length
> + * @result:	SGX result code of ENCLS[EMODT] function
> + * @count:	bytes successfully changed (multiple of page size)
> + */
> +struct sgx_page_modt {
> +	__u64 offset;
> +	__u64 length;
> +	__u64 type;
> +	__u64 result;
> +	__u64 count;
> +};
> +
>  struct sgx_enclave_run;
>  
>  /**
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index de0bf68ee842..a952d608ab35 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -914,6 +914,238 @@ static long sgx_ioc_page_modp(struct sgx_encl *encl, void __user *arg)
>  	return ret;
>  }
>  
> +/**
> + * sgx_page_modt - Modify type of SGX enclave pages
> + * @encl:	Enclave to which the pages belong.
> + * @modt:	Checked parameters from user about which pages need modifying
> + *		and their new type.
> + *
> + * Ability to change the enclave page type supports the following use cases:
> + * * It is possible to add TCS pages to enclave by changing the type of
> + * regular pages (SGX_PAGE_TYPE_REG) to TCS (SGX_PAGE_TYPE_TCS) pages. With
> + * this support the number of threads supported by an initialized enclave
> + * can be increased dynamically.
> + * * Regular or TCS pages can dynamically be removed from an initialized
> + * enclave by changing the page type to SGX_PAGE_TYPE_TRIM. Changing the
> + * page type to SGX_PAGE_TYPE_TRIM marks the page for removal with actual
> + * removal done by handler of %SGX_IOC_PAGE_REMOVE ioctl called after
> + * ENCLU[EACCEPT] is run on SGX_PAGE_TYPE_TRIM page from within the enclave.
> + *
> + * Return:
> + * - 0:		Success
> + * - -errno:	Otherwise
> + */
> +static long sgx_page_modt(struct sgx_encl *encl, struct sgx_page_modt *modt)
> +{
> +	unsigned long max_prot_restore, run_prot_restore;
> +	enum sgx_page_type page_type;
> +	struct sgx_encl_page *entry;
> +	struct sgx_secinfo secinfo;
> +	unsigned long prot;
> +	unsigned long addr;
> +	unsigned long c;
> +	void *epc_virt;
> +	int ret;
> +
> +	page_type = modt->type & SGX_PAGE_TYPE_MASK;
> +
> +	/*
> +	 * The only new page types allowed by hardware are PT_TCS and PT_TRIM.
> +	 */
> +	if (page_type != SGX_PAGE_TYPE_TCS && page_type != SGX_PAGE_TYPE_TRIM)
> +		return -EINVAL;
> +
> +	memset(&secinfo, 0, sizeof(secinfo));
> +
> +	secinfo.flags = page_type << 8;
> +
> +	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
> +		addr = encl->base + modt->offset + c;
> +
> +		mutex_lock(&encl->lock);
> +
> +		entry = sgx_encl_load_page(encl, addr);
> +		if (IS_ERR(entry)) {
> +			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> +			goto out_unlock;
> +		}
> +
> +		/*
> +		 * Borrow the logic from the Intel SDM. Regular pages
> +		 * (SGX_PAGE_TYPE_REG) can change type to SGX_PAGE_TYPE_TCS
> +		 * or SGX_PAGE_TYPE_TRIM but TCS pages can only be trimmed.
> +		 * CET pages not supported yet.
> +		 */
> +		if (!(entry->type == SGX_PAGE_TYPE_REG ||
> +		      (entry->type == SGX_PAGE_TYPE_TCS &&
> +		       page_type == SGX_PAGE_TYPE_TRIM))) {
> +			ret = -EINVAL;
> +			goto out_unlock;
> +		}
> +
> +		max_prot_restore = entry->vm_max_prot_bits;
> +		run_prot_restore = entry->vm_run_prot_bits;
> +
> +		/*
> +		 * Once a regular page becomes a TCS page it cannot be
> +		 * changed back. So the maximum allowed protection reflects
> +		 * the TCS page that is always RW from OS perspective but
> +		 * will be inaccessible from within enclave. Before doing
> +		 * so, do make sure that the new page type continues to
> +		 * respect the originally vetted page permissions.
> +		 */
> +		if (entry->type == SGX_PAGE_TYPE_REG &&
> +		    page_type == SGX_PAGE_TYPE_TCS) {
> +			if (~entry->vm_max_prot_bits & (VM_READ | VM_WRITE)) {
> +				ret = -EPERM;
> +				goto out_unlock;
> +			}
> +			prot = PROT_READ | PROT_WRITE;
> +			entry->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
> +			entry->vm_run_prot_bits = entry->vm_max_prot_bits;
> +
> +			/*
> +			 * Prevent page from being reclaimed while mutex
> +			 * is released.
> +			 */
> +			if (sgx_unmark_page_reclaimable(entry->epc_page)) {
> +				ret = -EAGAIN;
> +				goto out_entry_changed;
> +			}
> +
> +			/*
> +			 * Do not keep encl->lock because of dependency on
> +			 * mmap_lock acquired in sgx_zap_enclave_ptes().
> +			 */
> +			mutex_unlock(&encl->lock);
> +
> +			sgx_zap_enclave_ptes(encl, addr);
> +
> +			mutex_lock(&encl->lock);
> +
> +			sgx_mark_page_reclaimable(entry->epc_page);
> +		}
> +
> +		/* Change EPC type */
> +		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> +		ret = __emodt(&secinfo, epc_virt);
> +		if (encls_faulted(ret)) {
> +			/*
> +			 * All possible faults should be avoidable:
> +			 * parameters have been checked, will only change
> +			 * valid page types, and no concurrent
> +			 * SGX1/SGX2 ENCLS instructions since these are
> +			 * protected with mutex.
> +			 */
> +			pr_err_once("EMODT encountered exception %d\n",
> +				    ENCLS_TRAPNR(ret));
> +			ret = -EFAULT;
> +			goto out_entry_changed;
> +		}
> +		if (encls_failed(ret)) {
> +			modt->result = ret;
> +			ret = -EFAULT;
> +			goto out_entry_changed;
> +		}
> +
> +		epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> +		ret = __etrack(epc_virt);
> +		if (ret) {
> +			/*
> +			 * ETRACK only fails when there is an OS issue. For
> +			 * example, two consecutive ETRACK was sent without
> +			 * completed IPI between.
> +			 */
> +			pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> +			/*
> +			 * Send IPIs to kick CPUs out of the enclave and
> +			 * try ETRACK again.
> +			 */
> +			on_each_cpu_mask(sgx_encl_cpumask(encl),
> +					 sgx_ipi_cb, NULL, 1);
> +			ret = __etrack(epc_virt);
> +			if (ret) {
> +				pr_err_once("ETRACK repeat returned %d (0x%x)",
> +					    ret, ret);
> +				ret = -EFAULT;
> +				goto out_unlock;
> +			}
> +		}
> +		on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> +
> +		entry->type = page_type;
> +
> +		mutex_unlock(&encl->lock);
> +	}
> +
> +	ret = 0;
> +	goto out;
> +
> +out_entry_changed:
> +	entry->vm_max_prot_bits = max_prot_restore;
> +	entry->vm_run_prot_bits = run_prot_restore;
> +out_unlock:
> +	mutex_unlock(&encl->lock);
> +out:
> +	modt->count = c;
> +
> +	return ret;
> +}
> +
> +/**
> + * sgx_ioc_page_modt() - handler for %SGX_IOC_PAGE_MODT
> + * @encl:	an enclave pointer
> + * @arg:	userspace pointer to a &struct sgx_page_modt instance
> + *
> + * Return:
> + * - 0:		Success
> + * - -errno:	Otherwise
> + */
> +static long sgx_ioc_page_modt(struct sgx_encl *encl, void __user *arg)
> +{
> +	struct sgx_page_modt params;
> +	long ret;
> +
> +	/*
> +	 * Ensure that there is a chance the request could succeed:
> +	 * (1) SGX2 is required.
> +	 * (2) Only pages in an initialized enclave could be modified.
> +	 */
> +	if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> +		return -ENODEV;
> +
> +	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> +		return -EINVAL;
> +
> +	/*
> +	 * Obtain parameters from user and perform sanity checks.
> +	 */
> +	if (copy_from_user(&params, arg, sizeof(params)))
> +		return -EFAULT;
> +
> +	if (!IS_ALIGNED(params.offset, PAGE_SIZE))
> +		return -EINVAL;
> +
> +	if (!params.length || params.length & (PAGE_SIZE - 1))
> +		return -EINVAL;
> +
> +	if (params.offset + params.length - PAGE_SIZE >= encl->size)
> +		return -EINVAL;
> +
> +	if (params.type & ~SGX_PAGE_TYPE_MASK)
> +		return -EINVAL;
> +
> +	if (params.result || params.count)
> +		return -EINVAL;
> +
> +	ret = sgx_page_modt(encl, &params);
> +
> +	if (copy_to_user(arg, &params, sizeof(params)))
> +		return -EFAULT;
> +
> +	return ret;
> +}
> +
>  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>  {
>  	struct sgx_encl *encl = filep->private_data;
> @@ -938,6 +1170,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>  	case SGX_IOC_PAGE_MODP:
>  		ret = sgx_ioc_page_modp(encl, (void __user *)arg);
>  		break;
> +	case SGX_IOC_PAGE_MODT:
> +		ret = sgx_ioc_page_modt(encl, (void __user *)arg);
> +		break;
>  	default:
>  		ret = -ENOIOCTLCMD;
>  		break;
> -- 
> 2.25.1
> 

/Jarkko

* Re: [PATCH 17/25] x86/sgx: Support complete page removal
  2021-12-01 19:23 ` [PATCH 17/25] x86/sgx: Support complete page removal Reinette Chatre
@ 2021-12-04 23:45   ` Jarkko Sakkinen
  2021-12-06 21:49     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:45 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:15AM -0800, Reinette Chatre wrote:
> The SGX2 page removal flow was introduced in previous patch and is
> as follows:
> 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
>    using the ioctl introduced in previous patch.
> 2) Approve the page removal by running ENCLU[EACCEPT] from within
>    the enclave.
> 3) Initiate actual page removal using the new ioctl introduced here.
> 
> Support the final step of the SGX2 page removal flow with a new ioctl.
> With this ioctl the user specifies a page range that should
> be removed. At this time all pages in the provided range should have
> the SGX_PAGE_TYPE_TRIM page type and the ioctl will fail with EPERM
> (Operation not permitted) when it encounters a page that does not have
> the correct type. Page removal can fail on any page within the
> provided range. Support partial success by returning the number of pages
> that were successfully removed.
> 
> Since actual page removal will succeed even if ENCLU[EACCEPT] was not
> run from within the enclave the ENCLU[EMODPR] instruction with RWX
> permissions is used as a no-op mechanism to ensure ENCLU[EACCEPT] was
> successfully run from within the enclave before the enclave page is
> removed.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  arch/x86/include/uapi/asm/sgx.h |  21 +++++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 159 ++++++++++++++++++++++++++++++++
>  2 files changed, 180 insertions(+)
> 
> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> index f70caccd166c..6648ded960f8 100644
> --- a/arch/x86/include/uapi/asm/sgx.h
> +++ b/arch/x86/include/uapi/asm/sgx.h
> @@ -33,6 +33,8 @@ enum sgx_page_flags {
>  	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
>  #define SGX_IOC_PAGE_MODT \
>  	_IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt)
> +#define SGX_IOC_PAGE_REMOVE \
> +	_IOWR(SGX_MAGIC, 0x07, struct sgx_page_remove)

Should be SGX_IOC_ENCLAVE_REMOVE_PAGES.
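
For illustration only, a user-space model of the final removal step as
the quoted commit message describes it: every page in the range must
already be trimmed, removal stops at the first page that does not
qualify, and the number of pages removed so far is reported. The struct
and function names are hypothetical. Returning -EPERM for a page that
was trimmed but not yet accepted is an assumption of this sketch; the
commit message only specifies EPERM for a wrong page type.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state for the removal flow. */
struct page_model {
	bool trimmed;	/* step 1: type changed to SGX_PAGE_TYPE_TRIM */
	bool accepted;	/* step 2: ENCLU[EACCEPT] run inside the enclave */
	bool present;	/* cleared by step 3: actual removal */
};

static int remove_pages(struct page_model *pages, size_t npages,
			size_t *removed)
{
	size_t i;

	assert(pages && removed);

	for (i = 0; i < npages; i++) {
		/* Wrong page type: fail with EPERM, as the commit says. */
		if (!pages[i].trimmed)
			break;
		/* Not accepted yet: error code is this sketch's choice. */
		if (!pages[i].accepted)
			break;
		pages[i].present = false;
	}

	/* Partial success: report how many pages were removed. */
	*removed = i;

	return i == npages ? 0 : -EPERM;
}
```

A caller that gets an error back can inspect the removed count, fix up
the first page that was not removed (for example by running
ENCLU[EACCEPT] on it) and retry from that page onwards.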

/Jarkko

* Re: [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges
  2021-12-01 19:23 ` [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
@ 2021-12-04 23:47   ` Jarkko Sakkinen
  2021-12-06 22:07     ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:47 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:22AM -0800, Reinette Chatre wrote:
> The page reclaimer ensures availability of EPC pages across all
> enclaves. In support of this it runs independently from the individual
> enclaves in order to take locks from the different enclaves as it writes
> pages to swap.
> 
> When needing to load a page from swap an EPC page needs to be available for
> its contents to be loaded into. Loading an existing enclave page from swap
> does not reclaim EPC pages directly if none are available, instead the
> reclaimer is woken when the available EPC pages are found to be below a
> watermark.
> 
> When iterating over a large number of pages in an oversubscribed
> environment there is a race between the reclaimer woken up and EPC pages
> reclaimed fast enough for the page operations to proceed.
> 
> Instead of tuning the race between the page operations and the reclaimer
> the page operations instead makes sure that there are EPC pages available.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>

Why does this need to be part of this patch set?

/Jarkko

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-04 17:56           ` Andy Lutomirski
@ 2021-12-04 23:55             ` Reinette Chatre
  2021-12-13 22:34               ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-04 23:55 UTC (permalink / raw)
  To: Andy Lutomirski, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Andy,

On 12/4/2021 9:56 AM, Andy Lutomirski wrote:
> On 12/3/21 17:14, Reinette Chatre wrote:
>> Hi Andy,
>>
>> On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
>>> On 12/3/21 14:12, Reinette Chatre wrote:
>>>> Hi Andy,
>>>>
>>>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>>>> Enclave creators declare their paging permission intent at the time
>>>>>> the pages are added to the enclave. These paging permissions are
>>>>>> vetted when pages are added to the enclave and stashed off
>>>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>>>> enclave PTEs.
>>>>>>
>>>>>
>>>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change 
>>>>> the EPCM permission bits however it likes with no oversight from 
>>>>> the kernel.   So we end up with a whole bunch of permission masks:
>>>>
>>>> Before jumping to the permission masks I would like to step back and 
>>>> just confirm the context. We need to consider the following three 
>>>> permissions:
>>>>
>>>> EPCM permissions: the enclave page permissions maintained in the SGX 
>>>> hardware. The OS is constrained here in that it cannot query the 
>>>> current EPCM permissions. Even so, the OS needs to ensure PTEs are 
>>>> installed appropriately (we do not want a RW PTE for a read-only 
>>>> enclave page)
>>>
>>> Why not?  What's wrong with an RW PTE for a read-only enclave page?
>>>
>>> If you convince me that this is actually important, then I'll read 
>>> all the stuff below.
>>
>> Perhaps it is my misunderstanding/misinterpretation of the current 
>> implementation? From what I understand the current requirement, as 
>> enforced in the current mmap(), mprotect() as well as fault() hooks, 
>> is that mappings are required to have identical or weaker permission 
>> than the enclave permission.
> 
> The current implementation does require that, but for a perhaps 
> counterintuitive reason.  If a SELinux-restricted (or similarly 
> restricted) process that is *not* permitted to do JIT-like things loads 
> an enclave, it's entirely okay for it to initialize RW enclave pages 
> however it likes and it's entirely okay for it to initialize RX (or XO 
> if that ever becomes a thing) enclave pages from appropriate files on 
> disk.  But it's not okay for it to create RWX enclave pages or to 
> initialize RX enclave pages from untrusted application memory. [0]
> 
> So we have a half-baked implementation right now: the permission to 
> execute a page is decided based on secinfo (max permissions) when the 
> enclave is set up, and it's enforced at the PTE level.  The PTE 
> enforcement is because, on SGX2 hardware, the enclave can do EMODPE and 
> bypass any supposed restrictions in the EPCM.
> 
> The only coupling between EPCM and PTE here is that the max_perm is 
> initialized together with EPCM, but it didn't have to be that way.
> 
> An SGX2 implementation needs to be more fully baked, because in a 
> dynamic environment enclaves need to be able to use EMODPE and actually 
> end up with permissions that exceed the initial secinfo permissions.  So 

Could you please elaborate why this is a requirement? In this 
implementation the secinfo of a page added before enclave initialization 
(via SGX_IOC_ENCLAVE_ADD_PAGES) would indicate the maximum permissions 
it may have during its lifetime. Pages needing to be writable and 
executable during their lifetime can be created with RWX secinfo and 
during the enclave runtime the pages could obtain all combinations of 
permissions: RWX, R, RW, RX. A page added with RW secinfo may have R or 
RW permissions during its lifetime but never RX or RWX.

So far the responses to our inquiries on whether this is acceptable have 
been positive, and this is also what Dave attempted to put a spotlight on in:
https://lore.kernel.org/lkml/94d8d631-5345-66c4-52a3-941e52500f84@intel.com/

This above is specific to pages added before enclave initialization. In 
this implementation pages added after enclave initialization, those 
needing the ENCLS[EAUG] SGX2 instruction, are added with max permissions 
of RW so could only have R or RW permissions during their lifetime. This 
is an understood limitation and it is understood that integration with 
user policy is required to support these pages obtaining executable 
permission. The plan is to handle user policy integration in a series 
that follows this core SGX2 enabling.
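
The rule described above can be sketched in a few lines of C. This is only
an illustration of the subset check, not code from the series: the
permission constants, EAUG_MAX_PERM and perm_within_max() are hypothetical
names chosen for this sketch.

```c
#include <stdbool.h>

/* Hypothetical permission bits for this sketch; not the kernel's definitions. */
enum {
	PERM_R = 1 << 0,
	PERM_W = 1 << 1,
	PERM_X = 1 << 2,
};

/*
 * Assumed maximum for pages added after enclave initialization
 * (ENCLS[EAUG]): RW, as described in the text above.
 */
#define EAUG_MAX_PERM (PERM_R | PERM_W)

/*
 * A requested permission set is acceptable only if every requested bit
 * is also present in the maximum permissions vetted for the page.
 */
static bool perm_within_max(unsigned int max_perm, unsigned int requested)
{
	return (requested & ~max_perm) == 0;
}
```

With this check a page vetted as RWX may move between R, RW, RX and RWX,
while a page vetted as RW (including every EAUG page in this sketch) can
never obtain X.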

> it needs to be possible to make a page that starts out R (or RW or 
> whatever) but nonetheless has max_perm=RWX so that the enclave can use a 
> combination of EMODPE and (ioctl-based) EMODPR to do JIT.  So I think 
> you should make it possible to set up pages like this, but I see no 
> reason to couple the PTE and the EPCM permissions.
> 
>>
>> Could you please elaborate how you envision PTEs should be managed in 
>> this implementation?
> 
> As above: PTE permissions may not exceed max_perm, and EPCM is entirely 
> separate except to the extent needed for ABI compatibility with SGX1 
> runtimes.

OK, so if I understand you correctly, since PTE permissions may not 
exceed max_perm and EPCM permissions are separate, this seems to get back 
to your previous question of "What's wrong with an RW PTE for a read-only 
enclave page?"

This is indeed something that we could allow, but not doing so (that is, 
keeping PTEs from exceeding EPCM permissions) would better support the SGX 
runtime. That is why I separated out the addition of the pfn_mkwrite() 
callback in the previous patch (04/25). In your example, a RW mapping of a 
read-only enclave page first results in a RW PTE for the read-only enclave 
page. A write would then result in a #PF with the SGX flag set (0x8007). 
If the PTE matches the enclave permissions the page fault would have the 
familiar 0x7 error code.

In either case user space would encounter a #PF so technically there is 
nothing "wrong" with allowing this - even so, as motivated in the 
previous patch: accurate exception information supports the SGX runtime, 
which is virtually always implemented inside a shared library, by 
providing accurate information in support of its management of the SGX 
enclave.
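
To make the two error codes concrete, here is a minimal sketch of how a
runtime might decode them. The bit positions follow the x86 page fault
error code layout (present, write, user, and the SGX bit at position 15);
the enum and helper names are local to this sketch and are not taken from
the series.

```c
#include <stdbool.h>

/* x86 page fault error code bits relevant to this discussion. */
enum {
	PF_PROT  = 1 << 0,	/* fault on a present page (protection violation) */
	PF_WRITE = 1 << 1,	/* the faulting access was a write */
	PF_USER  = 1 << 2,	/* the access originated in user mode */
	PF_SGX   = 1 << 15,	/* SGX access control (EPCM) violation */
};

/* True if the fault was raised by SGX access control (EPCM). */
static bool pf_is_sgx(unsigned long error_code)
{
	return (error_code & PF_SGX) != 0;
}

/* True if the fault was a write to a page with a present PTE. */
static bool pf_is_write_to_present(unsigned long error_code)
{
	unsigned long mask = PF_PROT | PF_WRITE;

	return (error_code & mask) == mask;
}
```

In this sketch 0x8007 decodes as an SGX fault on a write to a present
page, 0x7 as an ordinary write protection fault, and 0x6 as a write to a
page whose PTE is not present.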


> [0] I'm not sure anyone actually has a system set up like this or that 
> the necessary LSM support is in the kernel.  But it's supposed to be 
> possible without changing the ABI.
> 

Reinette


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-03 19:28   ` Andy Lutomirski
  2021-12-03 22:12     ` Reinette Chatre
@ 2021-12-04 23:57     ` Jarkko Sakkinen
  2021-12-06 21:20       ` Reinette Chatre
  1 sibling, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-04 23:57 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Reinette Chatre, dave.hansen, tglx, bp, mingo, linux-sgx, x86,
	seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Fri, Dec 03, 2021 at 11:28:04AM -0800, Andy Lutomirski wrote:
> On 12/1/21 11:23, Reinette Chatre wrote:
> > Enclave creators declare their paging permission intent at the time
> > the pages are added to the enclave. These paging permissions are
> > vetted when pages are added to the enclave and stashed off
> > (in sgx_encl_page->vm_max_prot_bits) for later comparison with
> > enclave PTEs.
> > 
> 
> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the EPCM
> permission bits however it likes with no oversight from the kernel.  So we
> end up with a whole bunch of permission masks:
> 
> The PTE: controlled by complex kernel policy
> 
> The VMA: with your series, this is entirely controlled by userspace.  I
> think I'm fine with that.
> 
> vm_max_prot_bits: populated from secinfo at setup time, unless I missed
> something that changes it later.  Maybe I'm confused or missed something in
> one of the patches,
> 
> vm_run_prot_bits: populated from some combination of ioctls.  I'm entirely
> lost as to what this is for.
> 
> EPCM bits: controlled by the guest.  basically useless for any host purpose
> on SGX2 hardware (with or without kernel support -- the enclave can do
> ENCLU[EMODPE] whether we like it or not, even on old kernels)
> 
> So I guess I don't understand the purpose of this patch or of the rules in
> the later patches, and I feel like this is getting more complicated than
> makes sense.
> 
> 
> Could we perhaps make vm_max_prot_bits dynamic or at least controllable in
> some useful way?  My initial proposal (years ago) was for vm_max_prot_bits
> to be *separately* configured at initial load time instead of being inferred
> from secinfo with the intent being that the user untrusted runtime would set
> it appropriately.  I have no problem with allowing runtime changes as long
> as the security policy makes sense and it's kept consistent with PTEs.

This is a valid question. Since EMODPE exists why not just make things for
EMODPE, and ignore EMODPR altogether?

> Also, I think we need a changelog message or, even better, actual docs in
> kernel, explaining the actual final set of rules and invariants for all
> these masks.
> 
> --Andy

/Jarkko


* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-04 23:08   ` Jarkko Sakkinen
@ 2021-12-06 20:19     ` Dave Hansen
  2021-12-11  5:17       ` Jarkko Sakkinen
  2021-12-06 21:42     ` Reinette Chatre
  1 sibling, 1 reply; 155+ messages in thread
From: Dave Hansen @ 2021-12-06 20:19 UTC (permalink / raw)
  To: Jarkko Sakkinen, Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On 12/4/21 3:08 PM, Jarkko Sakkinen wrote:
>> Enclave page permission changes need to be approached with care and
>> for this reason this initial support is to allow enclave page
>> permission changes _only_ if the new permissions are the same or
>> more restrictive than the permissions originally vetted at the time the
>> pages were added to the enclave. Support for extending enclave page
>> permissions beyond what was originally vetted is deferred.
> This paragraph is out-of-scope for a commit message. You could have
> this in the cover letter but not here. I would just remove it.

This does convey valuable information, though.  It tells the reader that
this is a sub-optimal implementation.  It also acknowledges that there
is further work to do.  Maybe saying that it is "deferred" is not quite
the verbiage I would use, but the concept is fine.



* Re: [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers
  2021-12-04 18:30   ` Jarkko Sakkinen
@ 2021-12-06 21:13     ` Reinette Chatre
  2021-12-11  5:28       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:13 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 10:30 AM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:22:59AM -0800, Reinette Chatre wrote:
>> The SGX ENCLS instruction uses EAX to specify an SGX function and
>> may require additional registers, depending on the SGX function.
>> ENCLS invokes the specified privileged SGX function for managing
>> and debugging enclaves. Macros are used to wrap the ENCLS
>> functionality and several wrappers are used to wrap the macros to
>> make the different SGX functions accessible in the code.
>>
>> The wrappers of the supported SGX functions are cryptic. Add short
>> changelog descriptions of each to a comment.
> 
> I think you are adding function descriptions.

Will change.

> 
>> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>>   arch/x86/kernel/cpu/sgx/encls.h | 12 ++++++++++++
>>   1 file changed, 12 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
>> index 9b204843b78d..241b766265d3 100644
>> --- a/arch/x86/kernel/cpu/sgx/encls.h
>> +++ b/arch/x86/kernel/cpu/sgx/encls.h
>> @@ -162,57 +162,68 @@ static inline bool encls_failed(int ret)
>>   	ret;						\
>>   	})
>>   
>> +/* Create an SECS page in the Enclave Page Cache (EPC) */
>>   static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
>>   {
>>   	return __encls_2(ECREATE, pginfo, secs);
>>   }
> 
> You have:
> 
> * "Create an SECS page in the Enclave Page Cache (EPC)"
> * "Add a Version Array (VA) page to the Enclave Page Cache (EPC)"
> 
> They should have similar descriptions, e.g.
> 
> * "Initialize an EPC page into SGX Enclave Control Structure (SECS) page."
> * "Initialize an EPC page into Version Array (VA) page."

Will do. Did you intentionally omit the articles or would you be ok if I 
change it to:

"Initialize an EPC page into an SGX Enclave Control Structure (SECS) page."
"Initialize an EPC page into a Version Array (VA) page."

I also notice that you prefer the comments to end with a period and I 
will do so for all in the next version.

>> +/* Extend uninitialized enclave measurement */
>>   static inline int __eextend(void *secs, void *addr)
>>   {
>>   	return __encls_2(EEXTEND, secs, addr);
>>   }
> 
> That description does not make __eextend any less cryptic.
> 
> Something like this would be already more informative:
> 
> /* Hash a 256 byte region of an enclave page to SECS:MRENCLAVE. */

Thank you, I will use this description.

> 
> This same remark applies to the rest of these comments. They should
> provide a clue what the wrapper does rather than an English open coded
> function name.

Please see below for another attempt that includes your proposed changes 
so far. What do you think?

__ecreate():
/* Initialize an EPC page into an SGX Enclave Control Structure (SECS) 
page. */

__eextend():
/* Hash a 256 byte region of an enclave page to SECS:MRENCLAVE. */

__eadd():
/* Copy a source page from non-enclave memory into the EPC. */

__einit():
/* Finalize enclave build, initialize enclave for user code execution. */

__eremove():
/* Disassociate EPC page from its enclave and mark it as unused. */

__edbgwr():
/* Copy data to an EPC page belonging to a debug enclave. */

__edbgrd():
/* Copy data from an EPC page belonging to a debug enclave. */

__etrack():
/* Track that software has completed the required TLB address clears. */

__eldu():
/* Load, verify, and unblock an Enclave Page Cache (EPC) page. */

__eblock():
/* Make EPC page inaccessible to enclave, ready to be written to memory. */

__epa():
/* Initialize an EPC page into a Version Array (VA) page. */

__ewb():
/* Invalidate an EPC page and write it out to main memory. */


Reinette


* Re: [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions
  2021-12-04 22:04   ` Jarkko Sakkinen
@ 2021-12-06 21:15     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:15 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 2:04 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:00AM -0800, Reinette Chatre wrote:
>> The SGX ENCLS instruction uses EAX to specify an SGX function and
>> may require additional registers, depending on the SGX function.
>> ENCLS invokes the specified privileged SGX function for managing
>> and debugging enclaves. Several macros are used to wrap the ENCLS
>> functionality.
>>
>> Add ENCLS wrappers for the SGX2 EMODPR, EMODT, and EAUG functions
>> that can make changes to pages of an initialized SGX enclave. The
>> EMODPR function is used to restrict enclave page permissions
>> as maintained within the enclave (Enclave Page Cache Map (EPCM)
>> permissions). The EMODT function is used to change the type of an
>> enclave page. The EAUG function is used to dynamically add enclave
>> pages to an initialized enclave.
>>
>> EMODPR and EMODT accept two parameters and can fault as well as return
>> an SGX error code. EAUG also accepts two parameters but does not return
>> an SGX error code. Use existing macros for all new functions.
>>
>> Expand enum sgx_return_code with the possible EMODPR and EMODT
>> return codes.
> 
> These implementation details only obfuscate this commit message, and
> it is way too high-level to be useful e.g. for kernel maintenance.

2c273671d0df ("x86/sgx: Add wrappers for ENCLS functions") seemed to be 
good enough for kernel maintenance, but ok.

> 
> I'd replace it with something like:
> 
> "
> Add wrappers for ENCLS leaf functions EAUG, EMODT and EMODPR,
> which roughly take two steps:
> 
> 1. EAUG creates a new EPCM entry.
>     EMODT and EMODPR modify an existing EPCM entry.
> 2. Set either .PR = 1 (EMODPR), .MODIFY = 1 (EMODT) or .PENDING = 1 (EAUG).
> 
> The bit is reset by the enclave by invoking ENCLU leaf function EACCEPT
> or EACCEPTCOPY, which will result in the EPCM change becoming effective.
> "

I can use this if the SGX2 functions continue to be introduced in a 
single patch but ...

> 
> The current commit message is also not addressing these:
> 
> 1. What happens if an enclave accesses a memory address with either .PR,
>     .MODIFY or .PENDING set in EPCM, other than by the means of EACCEPT
>     or EACCEPTCOPY?
> 2. The calling conditions (e.g. concerning TLB's and ETRACK/IPI/etc
>     dance related to it).

... adding this information for all three SGX functions would be too 
much for one patch so I think that I should rather split this into three 
patches, each introducing a single SGX2 function with all the details 
you require. But ...


... the intent of this patch was just to introduce the wrappers of the 
SGX2 functions. These details surrounding the flows when using these 
functions are addressed in the patches that use them. It sounds to me 
that you want to duplicate that information here where the wrappers are 
added. Looking ahead you do require the same information in the 
changelogs of the patches that use these wrappers so I would like to 
confirm if you would like to see three separate patches with the details 
duplicating the information provided later or if you would like to see a 
single patch with the three wrappers and the changelog that you recommend?

> If this information was properly contained here, discussing about the
> following commits would be much easier.

The commits using these functions should have clear content on the flows 
surrounding them. I see there is work to do and I will review them to 
ensure that.

Reinette



* Re: [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions
  2021-12-04 22:27     ` Jarkko Sakkinen
@ 2021-12-06 21:16       ` Reinette Chatre
  2021-12-11  5:39         ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:16 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 2:27 PM, Jarkko Sakkinen wrote:
> On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote:
>> On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote:
>>> === Summary ===
>>>
>>> An SGX VMA can only be created if its permissions are the same or
>>> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
>>> creation this rule continues to be enforced by the page fault handler.
>>>
>>> With SGX2 the EPCM permissions of a page can change after VMA
>>> creation resulting in the VMA exceeding the EPCM permissions and the
>>> page fault handler incorrectly blocking access.
>>>
>>> Enable the VMA's pages to remain accessible while ensuring that
>>> the page table entries are installed to match the EPCM permissions
>>> without exceeding the VMA permissions.
>>
>> I don't understand what the short summary means in English, and the
>> commit message is way too bloated to make any conclusions. It really
>> needs a rewrite.
>>
>> These were the questions I could not find answer for:
>>
>> 1. Why it would be by any means safe to remove a permission check?

The permission check is redundant for SGX1 and incorrect for SGX2.

In the current SGX1 implementation the permission check in 
sgx_encl_load_page() is redundant because an SGX VMA can only be created 
if its permissions are the same or weaker than the EPCM permissions.

In SGX2 a user is able to change EPCM permissions during runtime (while 
VMA has the memory mapped). A RW VMA may thus originally have mapped an 
enclave page with RW EPCM permissions but since then the enclave page 
may have its permissions changed to read-only. The VMA should still be 
able to read those enclave pages but the check in sgx_encl_load_page() 
will prevent that.
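
As a rough sketch of the intended behavior (hypothetical names, not the
code in the patch): rather than rejecting the access when the VMA exceeds
the enclave page permissions, the PTE is installed with the permissions on
which the VMA and the enclave page agree.

```c
/* Hypothetical permission bits for this sketch; not the kernel's definitions. */
enum {
	SKETCH_R = 1 << 0,
	SKETCH_W = 1 << 1,
	SKETCH_X = 1 << 2,
};

/*
 * Permissions to install in the PTE: the intersection of what the VMA
 * allows and what the enclave page (as tracked by the OS) allows. The
 * result can exceed neither of the two.
 */
static unsigned int pte_perm(unsigned int vma_perm, unsigned int enclave_perm)
{
	return vma_perm & enclave_perm;
}
```

For the scenario above, pte_perm(SKETCH_R | SKETCH_W, SKETCH_R) yields
SKETCH_R, so the RW VMA can still read the page after its enclave
permissions were reduced to read-only.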

>> 2. Why is re-issuing mmap()s infeasible? I.e. close existing
>>     VMAs and mmap() new ones.

The user is not prevented from closing existing VMAs and creating new ones.

> 3. Isn't this an API/ABI break?

Could you please elaborate where you see the API/ABI break? The rule 
that new VMAs cannot exceed EPCM permissions is untouched.

Reinette




* Re: [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs
  2021-12-04 22:43   ` Jarkko Sakkinen
@ 2021-12-06 21:18     ` Reinette Chatre
  2021-12-11  7:37       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:18 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 2:43 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:02AM -0800, Reinette Chatre wrote:
>> By default a write page fault on a present PTE inherits the permissions
>> of the VMA. Enclave page permissions maintained in the hardware's
>> Enclave Page Cache Map (EPCM) may change after a VMA accessing the page
>> is created. A VMA's permissions may thus exceed the enclave page
>> permissions even though the VMA was originally created not to exceed
>> the enclave page permissions. Following the default behavior during
>> a page fault on a present PTE while the VMA permissions exceed the
>> enclave page permissions would result in the PTE for an enclave page
>> to be writable even though the page is not writable according to the
>> enclave's permissions.
>>
>> Consider the following scenario:
>> * An enclave page exists with RW EPCM permissions.
>> * A RW VMA maps the range spanning the enclave page.
>> * The enclave page's EPCM permissions are changed to read-only.
> 
> How could this happen in the existing mainline code?

This is a preparatory patch for SGX2 support. Restricting the 
permissions of an enclave page would require OS support that is added in 
a later patch.

> 
>> * There is no page table entry for the enclave page.
>>
>> Q.
>>   What will user space observe when an attempt is made to write to the
>>   enclave page from within the enclave?
>>
>> A.
>>   Initially the page table entry is not present so the following is
>>   observed:
>>   1) Instruction writing to enclave page is run from within the enclave.
>>   2) A page fault with second and third bits set (0x6) is encountered
>>      and handled by the SGX handler sgx_vma_fault() that installs a
>>      read-only page table entry following previous patch that installs
>>      page table entry with permissions that VMA and enclave agree on
>>      (read-only in this case).
>>   3) Instruction writing to enclave page is re-attempted.
>>   4) A page fault with first three bits set (0x7) is encountered and
>>      transparently (from SGX and user space perspective) handled by the
>>      OS with the page table entry made writable because the VMA is
>>      writable.
>>   5) Instruction writing to enclave page is re-attempted.
>>   6) Since the EPCM permissions prevent writing to the page a new page
>>      fault is encountered, this time with the SGX flag set in the error
>>      code (0x8007). No action is taken by OS for this page fault and
>>      execution returns to user space.
>>   7) Typically such a fault will be passed on to an application with a
>>      signal but if the enclave is entered with the vDSO function provided
>>      by the kernel then user space does not receive a signal but instead
>>      the vDSO function returns successfully with exception information
>>      (vector=14, error code=0x8007, and address) within the exception
>>      fields within the vDSO function's struct sgx_enclave_run.
>>
>> As can be observed it is not possible for user space to write to an
>> enclave page if that page's enclave page permissions do not allow so,
>> no matter what the VMA or PTE allows.
>>
>> Even so, the OS should not allow writing to a page if that page is not
>> writable. Thus the page table entry should accurately reflect the
>> enclave page permissions.
>>
>> Do not blindly accept VMA permissions on a page fault due to a write
>> attempt to a present PTE. Install a pfn_mkwrite() handler that ensures
>> that the VMA permissions agree with the enclave permissions in this
>> regard.
>>
>> Considering the same scenario as above after this change results in
>> the following behavior change:
>>
>> Q.
>>   What will user space observe when an attempt is made to write to the
>>   enclave page from within the enclave?
>>
>> A.
>>   Initially the page table entry is not present so the following is
>>   observed:
>>   1) Instruction writing to enclave page is run from within the enclave.
>>   2) A page fault with second and third bits set (0x6) is encountered
>>      and handled by the SGX handler sgx_vma_fault() that installs a
>>      read-only page table entry following previous patch that installs
>>      page table entry with permissions that VMA and enclave agree on
>>      (read-only in this case).
>>   3) Instruction writing to enclave page is re-attempted.
>>   4) A page fault with first three bits set (0x7) is encountered and
>>      passed to the pfn_mkwrite() handler for consideration. The handler
>>      determines that the page should not be writable and returns SIGBUS.
>>   5) Typically such a fault will be passed on to an application with a
>>      signal but if the enclave is entered with the vDSO function provided
>>      by the kernel then user space does not receive a signal but instead
>>      the vDSO function returns successfully with exception information
>>      (vector=14, error code=0x7, and address) within the exception fields
>>      within the vDSO function's struct sgx_enclave_run.
>>
>> The accurate exception information supports the SGX runtime, which is
>> virtually always implemented inside a shared library, by providing
>> accurate information in support of its management of the SGX enclave.
> 
> This QA-format is not a great idea, as it kind of tells what are the legit
> questions to ask.

I will remove the QA-format and just describe the two (before/after) 
scenarios.

> You should describe what the patch does and what are the
> legit reasons for doing that. Unfortunately, in the current form it is very
> hard to get a grip on this patch.

That was the goal of the summary (the first paragraph) at the start of 
the changelog. Could you please elaborate how you would like me to 
improve it?

Reinette



* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-04 23:57     ` Jarkko Sakkinen
@ 2021-12-06 21:20       ` Reinette Chatre
  2021-12-11  7:42         ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:20 UTC (permalink / raw)
  To: Jarkko Sakkinen, Andy Lutomirski
  Cc: dave.hansen, tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang,
	cathy.zhang, cedric.xing, haitao.huang, mark.shanahan, hpa,
	linux-kernel

Hi Jarkko,

On 12/4/2021 3:57 PM, Jarkko Sakkinen wrote:
> On Fri, Dec 03, 2021 at 11:28:04AM -0800, Andy Lutomirski wrote:
>> On 12/1/21 11:23, Reinette Chatre wrote:
>>> Enclave creators declare their paging permission intent at the time
>>> the pages are added to the enclave. These paging permissions are
>>> vetted when pages are added to the enclave and stashed off
>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>> enclave PTEs.
>>>
>>
>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to change the EPCM
>> permission bits however it likes with no oversight from the kernel.  So we
>> end up with a whole bunch of permission masks:
>>
>> The PTE: controlled by complex kernel policy
>>
>> The VMA: with your series, this is entirely controlled by userspace.  I
>> think I'm fine with that.
>>
>> vm_max_prot_bits: populated from secinfo at setup time, unless I missed
>> something that changes it later.  Maybe I'm confused or missed something in
>> one of the patches,
>>
>> vm_run_prot_bits: populated from some combination of ioctls.  I'm entirely
>> lost as to what this is for.
>>
>> EPCM bits: controlled by the guest.  basically useless for any host purpose
>> on SGX2 hardware (with or without kernel support -- the enclave can do
>> ENCLU[EMODPE] whether we like it or not, even on old kernels)
>>
>> So I guess I don't understand the purpose of this patch or of the rules in
>> the later patches, and I feel like this is getting more complicated than
>> makes sense.
>>
>>
>> Could we perhaps make vm_max_prot_bits dynamic or at least controllable in
>> some useful way?  My initial proposal (years ago) was for vm_max_prot_bits
>> to be *separately* configured at initial load time instead of being inferred
>> from secinfo with the intent being that the user untrusted runtime would set
>> it appropriately.  I have no problem with allowing runtime changes as long
>> as the security policy makes sense and it's kept consistent with PTEs.
> 
> This is a valid question. Since EMODPE exists why not just make things for
> EMODPE, and ignore EMODPR altogether?
> 

I believe that we should support the principle of least privilege as a 
best practice: once a page no longer needs a particular permission there 
should be a way to remove that permission.

Reinette


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-04 22:50   ` Jarkko Sakkinen
@ 2021-12-06 21:28     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:28 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 2:50 PM, Jarkko Sakkinen wrote:
> What about:
> 
> "x86/sgx: Add encl_page->vm_run_prot_bits for dynamic permission changes"

Sure.

> 
> On Wed, Dec 01, 2021 at 11:23:03AM -0800, Reinette Chatre wrote:
>> Enclave creators declare their paging permission intent at the time
>> the pages are added to the enclave. These paging permissions are
>> vetted when pages are added to the enclave and stashed off
>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>> enclave PTEs.
>>
>> Current permission support assumes that enclave page permissions
>> remain static for the lifetime of the enclave. This is about to change
>> with the addition of support for SGX2 where the permissions of enclave
>> pages belonging to an initialized enclave may be changed during the
>> enclave's lifetime.
>>
>> Introduce runtime protection bits in preparation for support of
> 
> By writing "Introduce runtime protection bits", instead of simply "Add
> encl_page->vm_run_prot_bits", the only thing you are adding is obfuscation.
> 
> Try to refer to the "exact thing", instead of English rephrasing
> whenever possible.
> 
>> enclave page permission changes. These bits reflect the active
>> permissions of an enclave page and are not to exceed the maximum
>> protection bits that passed scrutiny during enclave creation.
>>
>> Associate runtime protection bits with each enclave page. Initialize
>> the runtime protection bits to the vetted maximum protection bits
>> on page creation. Use the runtime protection bits for any access
>> checks.
> 
> I guess the first sentence in this paragraph is completely redundant
> as the first sentence of the previous paragraph tells the exact
> same story.

The previous paragraph introduces what these bits are and what they mean 
and the second describes how they are used. I can merge the paragraphs.

> 
>> struct sgx_encl_page hosting this information is maintained for each
>> enclave page so the space consumed by the struct is important.
>> The existing vm_max_prot_bits is already unsigned long while only using
>> three bits. Transition to a bitfield for the two members containing
>> protection bits.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> 
> So this commit message left the most important thing unanswered,
> or I missed it (which happens quite often): why two fields instead
> of one? Why does vm_max_prot_bits need to stay constant?
> 
> It's something that should be clearly documented.

vm_max_prot_bits is the vetted EPCM permissions an enclave is allowed to 
have (for EADDed pages it is the value from secinfo). Permissions can be 
changed using SGX2 but they should never exceed vm_max_prot_bits.
vm_run_prot_bits reflects the current (from OS perspective) active EPCM 
permissions and replaces the current usages of vm_max_prot_bits in 
runtime (VMA and PTE) permission checks.

Consider this example of how vm_max_prot_bits and vm_run_prot_bits are used:

(1) Add enclave page with secinfo of RW to uninitialized enclave
     vm_max_prot_bits = RW
     vm_run_prot_bits = RW

(2) User space runs SGX_IOC_PAGE_MODP (renamed
     to SGX_IOC_ENCLAVE_MOD_PROTECTIONS) to change the permissions to
     read-only. This is allowed because vm_max_prot_bits = RW. Now:
     vm_max_prot_bits = RW
     vm_run_prot_bits = R

     At this point only new read-only VMAs would be allowed to access
     this page and PTEs would not allow write access ... this is guided
     by vm_run_prot_bits.

(3) User space runs SGX_IOC_PAGE_MODP (renamed
     to SGX_IOC_ENCLAVE_MOD_PROTECTIONS) to change the permissions to RX.
     This will be denied because vm_max_prot_bits = RW.

(4) User space runs SGX_IOC_PAGE_MODP (renamed
     to SGX_IOC_ENCLAVE_MOD_PROTECTIONS) to change the permissions to RW.
     This will be allowed because vm_max_prot_bits = RW.

If this is helpful I can add this example to the changelog.
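
The steps above can also be written as a small user-space model. This is
only a sketch: the ENCL_* bit values, struct encl_page_model and both
helpers are hypothetical names made up for illustration, not the kernel's
definitions. It shows that a request is accepted only when it asks for
nothing beyond vm_max_prot_bits, and that vm_max_prot_bits never changes.

```c
#include <stdbool.h>

/* Hypothetical permission bits for this sketch (not the kernel's VM_* values). */
enum { ENCL_R = 1u << 0, ENCL_W = 1u << 1, ENCL_X = 1u << 2 };

/* Hypothetical stand-in for the two per-page fields discussed above. */
struct encl_page_model {
	unsigned int vm_max_prot_bits;	/* vetted when the page was added */
	unsigned int vm_run_prot_bits;	/* currently active permissions */
};

/* Adding a page: both fields start out as the vetted secinfo permissions. */
static struct encl_page_model page_add(unsigned int secinfo_prot)
{
	struct encl_page_model page = { secinfo_prot, secinfo_prot };

	return page;
}

/*
 * Changing permissions: refuse any request containing a bit that is not
 * in vm_max_prot_bits, otherwise update only vm_run_prot_bits.
 */
static bool page_mod_prot(struct encl_page_model *page, unsigned int requested)
{
	if (requested & ~page->vm_max_prot_bits)
		return false;

	page->vm_run_prot_bits = requested;
	return true;
}
```

With a page added as RW, a request for R succeeds, a request for RX is
refused and leaves vm_run_prot_bits untouched, and a later request for RW
succeeds again.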

Reinette





^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function
  2021-12-04 22:56   ` Jarkko Sakkinen
@ 2021-12-06 21:29     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:29 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 2:56 PM, Jarkko Sakkinen wrote:
> What are "enclave cpumask" and "generic name"? I'd prefer to speak
> about concrete things and not use weird rephrasings at all.
> 
> Also, renaming is not exporting.
> 
> You should split this into two patches:
> 
> 1. x86/sgx: Export sgx_encl_ewb_cpumask()
> 2. x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask().

Will do.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2021-12-04 22:59   ` Jarkko Sakkinen
@ 2021-12-06 21:30     ` Reinette Chatre
  2021-12-11  7:52       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:30 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 2:59 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:05AM -0800, Reinette Chatre wrote:
>> The SGX reclaimer removes page table entries pointing to pages that are
>> moved to swap. SGX2 enables changes to pages belonging to an initialized
>> enclave, for example changing page permissions. Supporting SGX2 requires
>> this ability to remove page table entries that is available in the
>> SGX reclaimer code.
> 
> Missing: why SGX2 requires this?

The above paragraph states that SGX2 needs to remove page table entries 
because it modifies page permissions. Could you please elaborate what is 
missing?

> 
>> Factor out the code removing page table entries to a separate function,
>> fixing accuracy of comments in the process, and make it available to other
>> areas within the SGX code.
>>
>> Since the code will no longer be unique to the reclaimer it is relocated
>> to be with the rest of the enclave code in encl.c interacting with the
>> page table.
> 
> This last paragraph should be removed. It can be seen from the code change
> and diffstat.

I understand that the code movement can be seen from the diffstat but 
the reason for the move may not be obvious to everybody. If it is ok 
with you I'd rather keep this text.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally
  2021-12-04 23:00   ` Jarkko Sakkinen
@ 2021-12-06 21:36     ` Reinette Chatre
  2021-12-11  7:53       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:36 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 3:00 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:06AM -0800, Reinette Chatre wrote:
>> The ETRACK instruction followed by an IPI to all CPUs within an enclave
>> is a common pattern with more frequent use in support of SGX2.
>>
>> Make the (empty) IPI callback function available internally in
>> preparation for more usages.
> 
> Please, just describe the usages that this is needed for so that
> there is zero guesswork required.

The reader is not required to guess. The first paragraph states that 
SGX2 also uses the ETRACK flow that relies on this function. What if I 
replace "for more usages" by "for usage by SGX2"?

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-04 23:08   ` Jarkko Sakkinen
  2021-12-06 20:19     ` Dave Hansen
@ 2021-12-06 21:42     ` Reinette Chatre
  2021-12-11  7:57       ` Jarkko Sakkinen
  1 sibling, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:42 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 3:08 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:08AM -0800, Reinette Chatre wrote:
>> In the initial (SGX1) version of SGX, pages in an enclave need to be
>> created with permissions that support all usages of the pages, from the
>> time the enclave is initialized until it is unloaded. For example,
>> pages used by a JIT compiler or when code needs to otherwise be
>> relocated need to always have RWX permissions.
>>
>> SGX2 includes two functions that can be used to modify the enclave page
>> permissions of regular enclave pages within an initialized enclave.
>> ENCLS[EMODPR] is run from the OS and used to restrict enclave page
>> permissions while ENCLU[EMODPE] is run from within the enclave to
>> extend enclave page permissions.
>>
>> Enclave page permission changes need to be approached with care and
>> for this reason this initial support is to allow enclave page
>> permission changes _only_ if the new permissions are the same or
>> more restrictive than the permissions originally vetted at the time the
>> pages were added to the enclave. Support for extending enclave page
>> permissions beyond what was originally vetted is deferred.
> 
> This paragraph is out-of-scope for a commit message. You could have
> this in the cover letter but not here. I would just remove it.

I think this is essential information that is mentioned in the cover 
letter _and_ in this changelog. I will follow Dave's guidance and avoid 
"deferred" by just removing that last sentence.

> 
>> Whether enclave page permissions are restricted or extended it
>> is necessary to ensure that the page table entries and enclave page
>> permissions are in sync. Introduce a new ioctl, SGX_IOC_PAGE_MODP, to
> 
> SGX_IOC_PAGE_MODP does not match the naming convention of these:
> 
> * SGX_IOC_ENCLAVE_CREATE
> * SGX_IOC_ENCLAVE_ADD_PAGES
> * SGX_IOC_ENCLAVE_INIT

ah - my understanding was that the SGX_IOC_ENCLAVE prefix related to 
operations related to the entire enclave and thus I introduced the 
prefix SGX_IOC_PAGE to relate to operations on pages within an enclave.

> 
> A better name would be SGX_IOC_ENCLAVE_MOD_PROTECTIONS. It doesn't
> do harm to be more verbose.

Will do. I see later you propose SGX_IOC_ENCLAVE_MODIFY_TYPE - would you 
like them to be consistent wrt MOD/MODIFY?

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-04 23:13   ` Jarkko Sakkinen
@ 2021-12-06 21:44     ` Reinette Chatre
  2021-12-11  8:00       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:44 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 3:13 PM, Jarkko Sakkinen wrote:
> "to initialize" -> "to an initialized"

Will do.


> 
> On Wed, Dec 01, 2021 at 11:23:11AM -0800, Reinette Chatre wrote:
>> With SGX1 an enclave needs to be created with its maximum memory demands
>> allocated. Pages cannot be added to an enclave after it is initialized.
>> SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
>> pages to an initialized enclave. With SGX2 the enclave still needs to
>> set aside address space for its maximum memory demands during enclave
>> creation, but all pages need not be added before enclave initialization.
>> Pages can be added during enclave runtime.
>>
>> Add support for dynamically adding pages to an initialized enclave,
>> architecturally limited to RW permission. Add pages via the page fault
>> handler at the time an enclave address without a backing enclave page
>> is accessed, potentially directly reclaiming pages if no free pages
>> are available.
>>
>> The enclave is still required to run ENCLU[EACCEPT] on the page before
>> it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
>> on an uninitialized address. This will trigger the page fault handler
>> that will add the enclave page and return execution to the enclave to
>> repeat the ENCLU[EACCEPT] instruction, this time successful.
>>
>> If the enclave accesses an uninitialized address in another way, for
>> example by expanding the enclave stack to a page that has not yet been
>> added, then the page fault handler would add the page on the first
>> write but upon returning to the enclave the instruction that triggered
>> the page fault would be repeated and since ENCLU[EACCEPT] was not run
>> yet it would trigger a second page fault, this time with the SGX flag
>> set in the page fault error code. This can only be recovered by entering
>> the enclave again and directly running the ENCLU[EACCEPT] instruction on
>> the now initialized address.
>>
>> Accessing an uninitialized address from outside the enclave also triggers
>> this flow but the page will remain in PENDING state until accepted from
>> within the enclave.
> 
> What does it mean being in PENDING state, and more importantly, what is
> PENDING state? What does a memory access within enclave cause when it
> touches a page within this state?

The PENDING state is the enclave page state from the SGX hardware's 
perspective. The OS uses the ENCLS[EAUG] SGX2 function to add a new page 
to the enclave but from the SGX hardware's perspective it would be in a 
PENDING state until the enclave accepts the page. An access to the page 
in PENDING state would result in a page fault.
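
As a rough illustration of the above, the page states can be modeled as
follows. This is a sketch only: the enum and the model_*() helpers are
hypothetical names, the ABSENT and ACCEPTED states are invented for the
model (only PENDING is an SGX term used here), and other uses of
ENCLU[EACCEPT] are ignored.

```c
#include <stdbool.h>

/* Hypothetical page states; only PENDING is named in the discussion above. */
enum page_state { PAGE_ABSENT, PAGE_PENDING, PAGE_ACCEPTED };

/* OS side: ENCLS[EAUG] adds the page, which starts out PENDING. */
static void model_eaug(enum page_state *state)
{
	if (*state == PAGE_ABSENT)
		*state = PAGE_PENDING;
}

/* Enclave side: in this model ENCLU[EACCEPT] succeeds only on a PENDING page. */
static bool model_eaccept(enum page_state *state)
{
	if (*state != PAGE_PENDING)
		return false;

	*state = PAGE_ACCEPTED;
	return true;
}

/* A regular access faults until the page has been accepted. */
static bool model_access_faults(enum page_state state)
{
	return state != PAGE_ACCEPTED;
}
```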


> I see a lot of text in the commit message but zero mentions about EPCM
> except this one sudden mention about PENDING field without attaching
> it to anything concrete.

My apologies - I will add this to this changelog. This matches your 
request to describe the __eaug() wrapper introduced in patch 02/25. 
Would you like me to duplicate this information here and in that patch 
(a new patch dedicated to the __eaug() wrapper) or would you be ok if I 
introduce the wrappers all together briefly as in the example you 
provide and then detail the flows where the wrappers are used - like 
this patch?

Reinette



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization
  2021-12-04 23:14   ` Jarkko Sakkinen
@ 2021-12-06 21:45     ` Reinette Chatre
  2021-12-11  8:01       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:45 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 3:14 PM, Jarkko Sakkinen wrote:
>> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
>> index 342b97dd4c33..37203da382f8 100644
>> --- a/arch/x86/kernel/cpu/sgx/encl.c
>> +++ b/arch/x86/kernel/cpu/sgx/encl.c
>> @@ -403,6 +403,10 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
>>   
>>   	XA_STATE(xas, &encl->page_array, PFN_DOWN(start));
>>   
> 
> Please write a comment here.

Would the comment below suffice?

/* Disallow mapping outside enclave's address range. */
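
For reference, the condition that comment would describe (taken from the
quoted hunk) can be exercised in isolation like this. It is a sketch:
struct encl_model and map_out_of_range() are hypothetical stand-ins for
struct sgx_encl and the check inside sgx_encl_may_map(), and the
"initialized" member models the SGX_ENCL_INITIALIZED flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the struct sgx_encl members used by the check. */
struct encl_model {
	bool initialized;	/* models SGX_ENCL_INITIALIZED in encl->flags */
	unsigned long base;
	unsigned long size;
};

/* Return true if mapping [start, end) has to be refused (-EACCES in the patch). */
static bool map_out_of_range(const struct encl_model *encl,
			     unsigned long start, unsigned long end)
{
	return encl->initialized &&
	       (start < encl->base || end > encl->base + encl->size);
}
```

Note that the check only applies once the enclave is initialized; before
that the same range is not refused by this condition.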

> 
>> +	if (test_bit(SGX_ENCL_INITIALIZED, &encl->flags) &&
>> +	    (start < encl->base || end > encl->base + encl->size))
>> +		return -EACCES;
>> +
>>   	/*
>>   	 * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might
>>   	 * conflict with the enclave page permissions.
>> -- 
>> 2.25.1
>>
> 
> Otherwise, makes sense.
> 

Thank you

Reinette


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 16/25] x86/sgx: Support modifying SGX page type
  2021-12-04 23:45   ` Jarkko Sakkinen
@ 2021-12-06 21:48     ` Reinette Chatre
  2021-12-11  8:02       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:48 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 3:45 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:14AM -0800, Reinette Chatre wrote:
>> Every enclave contains one or more Thread Control Structures (TCS). The
>> TCS contains meta-data used by the hardware to save and restore thread
>> specific information when entering/exiting the enclave. With SGX1 an
>> enclave needs to be created with enough TCSs to support the largest
>> number of threads expecting to use the enclave and enough enclave pages
>> to meet all its anticipated memory demands. In SGX1 all pages remain in
>> the enclave until the enclave is unloaded.
>>
>> Earlier changes added support for the SGX2 feature where pages can be
>> added dynamically to an initialized enclave.
> 
> Please remove this paragraph, i.e. do not tie the commit order like
> this.

Will do.

>>
>> SGX2 introduces a new function, ENCLS[EMODT], that is used to change
>> the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave
>> page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a
>> regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS)
>> page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later
>> removal).
>>
>> With the existing support of dynamically adding regular enclave pages
>> to an initialized enclave and changing the page type to TCS it is
>> possible to dynamically increase the number of threads supported by an
>> enclave.
>>
>> Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step
>> of dynamically removing pages from an initialized enclave. The complete
>> page removal flow is:
>> 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
>>     using the ioctl introduced here.
>> 2) Approve the page removal by running ENCLU[EACCEPT] from within
>>     the enclave.
>> 3) Initiate actual page removal using the new ioctl introduced in the
>>     following patch.
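
The three steps quoted above can be pictured as a small state machine.
This is a sketch under assumptions: the enum and the step_*() helpers are
hypothetical names, and the model assumes each step is refused unless the
previous one has completed, which the quoted flow implies but does not
spell out.

```c
#include <stdbool.h>

/* Hypothetical states tracking where a page is in the removal flow. */
enum trim_state { PG_IN_USE, PG_TRIM_REQUESTED, PG_TRIM_ACCEPTED, PG_REMOVED };

/* Step 1: the OS changes the page type to SGX_PAGE_TYPE_TRIM. */
static bool step_trim(enum trim_state *s)
{
	if (*s != PG_IN_USE)
		return false;
	*s = PG_TRIM_REQUESTED;
	return true;
}

/* Step 2: the enclave approves the removal with ENCLU[EACCEPT]. */
static bool step_accept(enum trim_state *s)
{
	if (*s != PG_TRIM_REQUESTED)
		return false;
	*s = PG_TRIM_ACCEPTED;
	return true;
}

/* Step 3: the OS removes the page; assumed to fail if not yet accepted. */
static bool step_remove(enum trim_state *s)
{
	if (*s != PG_TRIM_ACCEPTED)
		return false;
	*s = PG_REMOVED;
	return true;
}
```
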
>>
>> Support changing SGX enclave page types with a new ioctl. With this
> 
> What is "a new ioctl"? Why not just write "Add <ioctl name>""?

I do so to reduce the changes required during the ioctl naming 
discussion churn.

>> ioctl the user specifies a page range and the enclave page type to be
>> applied to all pages in the provided range. The ioctl itself can return
>> an error code based on failures encountered by the OS. It is also
>> possible for SGX specific failures to be encountered.  Add a result
>> output parameter to communicate the SGX return code. It is
>> possible for the enclave page type change request to fail on any page
>> within the provided range. Support partial success by returning
>> the number of pages that were successfully changed.
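
A minimal sketch of the partial-success reporting described above. The
names are hypothetical: struct range_result is not the layout of struct
sgx_page_modt, page_op_fn stands in for the per-page type change, and
fail_on_third_page() with its error value 7 exists only to give the
example something to run.

```c
#include <stddef.h>

/* Hypothetical per-page operation: 0 on success, nonzero SGX-style code on failure. */
typedef int (*page_op_fn)(size_t page_index);

/* Hypothetical result layout, for illustration only. */
struct range_result {
	int result;	/* code returned for the first failing page, else 0 */
	size_t count;	/* number of pages successfully changed */
};

/* Apply op to each page in the range, stopping at the first failure. */
static struct range_result apply_to_range(size_t first, size_t nr_pages,
					  page_op_fn op)
{
	struct range_result res = { 0, 0 };

	for (size_t i = 0; i < nr_pages; i++) {
		int ret = op(first + i);

		if (ret) {
			res.result = ret;
			break;
		}
		res.count++;
	}

	return res;
}

/* Example operation: page index 2 fails with the made-up code 7. */
static int fail_on_third_page(size_t page_index)
{
	return page_index == 2 ? 7 : 0;
}
```

The caller can then resume from first + count once the cause of the
failure has been dealt with.
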
>>
>> After the page type is changed to SGX_PAGE_TYPE_TRIM the page continues
>> to be accessible from the OS perspective with page table entries and
>> internal state. The page may be moved to swap. Any invalid access
>> (any access except ENCLU[EACCEPT]) will encounter a page fault with
>> SGX flag set in error code until the page is removed. Removal of
>> trimmed enclave pages on user request will be supported in following
>> patch. Trimmed enclave pages are also removed when enclave is unloaded.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> 
> This is lacking discussion of EPCM interaction, most importantly
> .MODIFY field of an EPCM entry.

I will add that. I have the same question here as in EAUG patch - would 
you like a duplicate description in this patch and a new patch that 
introduces just the __emodt() wrapper or would you be ok with all new 
wrappers introduced together and the detailed description of their 
hardware supported flows only present in the patch that uses those wrappers?


>> ---
>>   arch/x86/include/uapi/asm/sgx.h |  19 +++
>>   arch/x86/kernel/cpu/sgx/ioctl.c | 235 ++++++++++++++++++++++++++++++++
>>   2 files changed, 254 insertions(+)
>>
>> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
>> index 24bebc31e336..f70caccd166c 100644
>> --- a/arch/x86/include/uapi/asm/sgx.h
>> +++ b/arch/x86/include/uapi/asm/sgx.h
>> @@ -31,6 +31,8 @@ enum sgx_page_flags {
>>   	_IO(SGX_MAGIC, 0x04)
>>   #define SGX_IOC_PAGE_MODP \
>>   	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
>> +#define SGX_IOC_PAGE_MODT \
>> +	_IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt)
> 
> I'd suggest to change this as SGX_IOC_ENCLAVE_MODIFY_TYPE.

How about SGX_IOC_ENCLAVE_MOD_TYPE to be consistent with your earlier 
suggestion of SGX_IOC_ENCLAVE_MOD_PROTECTIONS ?

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 17/25] x86/sgx: Support complete page removal
  2021-12-04 23:45   ` Jarkko Sakkinen
@ 2021-12-06 21:49     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 21:49 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 3:45 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:15AM -0800, Reinette Chatre wrote:

...

>> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
>> index f70caccd166c..6648ded960f8 100644
>> --- a/arch/x86/include/uapi/asm/sgx.h
>> +++ b/arch/x86/include/uapi/asm/sgx.h
>> @@ -33,6 +33,8 @@ enum sgx_page_flags {
>>   	_IOWR(SGX_MAGIC, 0x05, struct sgx_page_modp)
>>   #define SGX_IOC_PAGE_MODT \
>>   	_IOWR(SGX_MAGIC, 0x06, struct sgx_page_modt)
>> +#define SGX_IOC_PAGE_REMOVE \
>> +	_IOWR(SGX_MAGIC, 0x07, struct sgx_page_remove)
> 
> Should be SGX_IOC_ENCLAVE_REMOVE_PAGES.
> 

Will do.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges
  2021-12-04 23:47   ` Jarkko Sakkinen
@ 2021-12-06 22:07     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-06 22:07 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/4/2021 3:47 PM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:22AM -0800, Reinette Chatre wrote:
>> The page reclaimer ensures availability of EPC pages across all
>> enclaves. In support of this it runs independently from the individual
>> enclaves in order to take locks from the different enclaves as it writes
>> pages to swap.
>>
>> When needing to load a page from swap an EPC page needs to be available for
>> its contents to be loaded into. Loading an existing enclave page from swap
>> does not reclaim EPC pages directly if none are available, instead the
>> reclaimer is woken when the available EPC pages are found to be below a
>> watermark.
>>
>> When iterating over a large number of pages in an oversubscribed
>> environment there is a race between the reclaimer woken up and EPC pages
>> reclaimed fast enough for the page operations to proceed.
>>
>> Instead of tuning the race between the page operations and the reclaimer,
>> make the page operations ensure that there are EPC pages available.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> 
> Why this needs to be part of this patch set?

When pages are modified they are required to be in the EPC and thus 
potentially need to be loaded from swap. When needing to modify a large 
number of pages in an oversubscribed environment there is a problem with 
the reclaimer providing free EPC pages fast enough for all the page 
modification operations to proceed.

What that means is that if a user attempts to modify a large range of
pages in an oversubscribed environment, the operation is likely to stop
partway and report partial success for only as many pages as were on the
free list. This is because the reclaimer may not run fast enough to free
up sufficient EPC pages in a dynamic way.

This becomes complicated for user space. It could increase the priority 
of the reclaimer but that has been found to be insufficient*. There 
would still not be a guarantee that after one page modification call 
fails enough pages would have been freed up in support of a second page 
modification call.

With this change it is ensured that, when pages are being modified,
there are sufficient EPC pages available to support the modifications.

Reinette

* The test that follows this patch was used to explore this scenario.



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-06 20:19     ` Dave Hansen
@ 2021-12-11  5:17       ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  5:17 UTC (permalink / raw)
  To: Dave Hansen, Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 12:19 -0800, Dave Hansen wrote:
> On 12/4/21 3:08 PM, Jarkko Sakkinen wrote:
> > > Enclave page permission changes need to be approached with care and
> > > for this reason this initial support is to allow enclave page
> > > permission changes _only_ if the new permissions are the same or
> > > more restrictive than the permissions originally vetted at the time the
> > > pages were added to the enclave. Support for extending enclave page
> > > permissions beyond what was originally vetted is deferred.
> > This paragraph is out-of-scope for a commit message. You could have
> > this in the cover letter but not here. I would just remove it.
> 
> This does convey valuable information, though.  It tells the reader that
> this is a sub-optimal implementation.  It also acknowledges that there
> is further work to do.  Maybe saying that it is "deferred" is not quite
> the verbiage I would use, but the concept is fine.

BTW, should we consistently speak about protection bits instead of
permissions?

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers
  2021-12-06 21:13     ` Reinette Chatre
@ 2021-12-11  5:28       ` Jarkko Sakkinen
  2021-12-13 22:06         ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  5:28 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:13 -0800, Reinette Chatre wrote:
> > * "Create an SECS page in the Enclave Page Cache (EPC)"
> > * "Add a Version Array (VA) page to the Enclave Page Cache (EPC)"
> > 
> > They should have similar descriptions, e.g.
> > 
> > * "Initialize an EPC page into SGX Enclave Control Structure (SECS) page."
> > * "Initialize an EPC page into Version Array (VA) page."
> 
> Will do. Did you intentionally omit the articles or would you be ok if I 
> change it to:
> 
> "Initialize an EPC page into an SGX Enclave Control Structure (SECS) page."
> "Initialize an EPC page into a Version Array (VA) page."
> 
> I also notice that you prefer the comments to end with a period and I 
> will do so for all in the next version.

Looks fine to me.

> > > +/* Extend uninitialized enclave measurement */
> > >   static inline int __eextend(void *secs, void *addr)
> > >   {
> > >   	return __encls_2(EEXTEND, secs, addr);
> > >   }
> > 
> > That description does not make __eextend any less cryptic.
> > 
> > Something like this would be already more informative:
> > 
> > /* Hash a 256 byte region of an enclave page to SECS:MRENCLAVE. */
> 
> Thank you, I will use this description.
> 
> > 
> > This same remark applies to the rest of these comments. They should
> > provide a clue what the wrapper does rather than an English open coded
> > function name.
> 
> Please see below for another attempt that includes your proposed changes 
> so far. What do you think?
> 
> __ecreate():
> /* Initialize an EPC page into an SGX Enclave Control Structure (SECS) 
> page. */
> 
> __eextend():
> /* Hash a 256 byte region of an enclave page to SECS:MRENCLAVE. */
> 
> __eadd():
> /* Copy a source page from non-enclave memory into the EPC. */

Perhaps:

/* 
 * Associate an EPC page to an enclave either as a REG or TCS page
 * populated with the provided data.
 */

This is more aligned with your description for __eremove().

> 
> __einit():
> /* Finalize enclave build, initialize enclave for user code execution */
> 
> __eremove():
> /* Disassociate EPC page from its enclave and mark it as unused. */
> 
> __edbgwr():
> /* Copy data to an EPC page belonging to a debug enclave. */
> 
> __edbgrd():
> /* Copy data from an EPC page belonging to a debug enclave. */
> 
> __etrack():
> /* Track that software has completed the required TLB address clears. */
> 
> __eldu():
> /* Load, verify, and unblock an Enclave Page Cache (EPC) page. */
> 
> __eblock():
> /* Make EPC page inaccessible to enclave, ready to be written to memory. */
> 
> __epa():
> /* Initialize an EPC page into a Version Array (VA) page. */
> 
> __ewb():
> /* Invalidate an EPC page and write it out to main memory. */
> 
> 
> Reinette

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions
  2021-12-06 21:16       ` Reinette Chatre
@ 2021-12-11  5:39         ` Jarkko Sakkinen
  2021-12-13 22:08           ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  5:39 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:16 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/4/2021 2:27 PM, Jarkko Sakkinen wrote:
> > On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote:
> > > On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote:
> > > > === Summary ===
> > > > 
> > > > An SGX VMA can only be created if its permissions are the same or
> > > > weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
> > > > creation this rule continues to be enforced by the page fault handler.
> > > > 
> > > > With SGX2 the EPCM permissions of a page can change after VMA
> > > > creation resulting in the VMA exceeding the EPCM permissions and the
> > > > page fault handler incorrectly blocking access.
> > > > 
> > > > Enable the VMA's pages to remain accessible while ensuring that
> > > > the page table entries are installed to match the EPCM permissions
> > > > without exceeding the VMA permissions.
> > > 
> > > I don't understand what the short summary means in English, and the
> > > commit message is way too bloated to make any conclusions. It really
> > > needs a rewrite.
> > > 
> > > These were the questions I could not find answer for:
> > > 
> > > 1. Why it would be by any means safe to remove a permission check?
> 
> The permission check is redundant for SGX1 and incorrect for SGX2.
> 
> In the current SGX1 implementation the permission check in 
> sgx_encl_load_page() is redundant because an SGX VMA can only be created 
> if its permissions are the same or weaker than the EPCM permissions.
> 
> In SGX2 a user is able to change EPCM permissions during runtime (while 
> VMA has the memory mapped). A RW VMA may thus originally have mapped an 
> enclave page with RW EPCM permissions but since then the enclave page 
> may have its permissions changed to read-only. The VMA should still be 
> able to read those enclave pages but the check in sgx_encl_load_page() 
> will prevent that.
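
The relationship described in the quoted text can be sketched as two
small helpers. Everything here is hypothetical naming for illustration
(the P_* bits, vma_may_map() and pte_prot()); it is not the code of
sgx_encl_may_map() or of the fault handler. It assumes, as the summary
states, that PTEs follow the EPCM permissions without exceeding the VMA
permissions.

```c
#include <stdbool.h>

/* Hypothetical permission bits for this sketch. */
enum { P_R = 1u << 0, P_W = 1u << 1, P_X = 1u << 2 };

/* VMA creation: allowed only if the VMA asks for nothing beyond the EPCM permissions. */
static bool vma_may_map(unsigned int vma_prot, unsigned int epcm_prot)
{
	return !(vma_prot & ~epcm_prot);
}

/* Fault time: the PTE gets only what both the VMA and the EPCM currently allow. */
static unsigned int pte_prot(unsigned int vma_prot, unsigned int epcm_prot)
{
	return vma_prot & epcm_prot;
}
```

For an existing RW VMA whose page has since been restricted to read-only,
pte_prot() yields R, so reads keep working, while a new RW VMA over the
same page would be refused by vma_may_map().
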
> 
> > > 2. Why is re-issuing mmap()s infeasible? I.e. close existing
> > >     VMAs and mmap() new ones.
> 
> User is not prevented from closing existing VMAs and creating new ones.
> 
> > 3. Isn't this an API/ABI break?
> 
> Could you please elaborate where you see the API/ABI break? The rule 
> that new VMAs cannot exceed EPCM permissions is untouched.
> 
> Reinette

I just don't understand the description. There's a whole bunch of text
but it does not discuss in low-level detail what the patch does, e.g.
the use of vm_insert_pfn_prot(). I honestly do not get the story here...

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs
  2021-12-06 21:18     ` Reinette Chatre
@ 2021-12-11  7:37       ` Jarkko Sakkinen
  2021-12-13 22:09         ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  7:37 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:18 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/4/2021 2:43 PM, Jarkko Sakkinen wrote:
> > On Wed, Dec 01, 2021 at 11:23:02AM -0800, Reinette Chatre wrote:
> > > By default a write page fault on a present PTE inherits the permissions
> > > of the VMA. Enclave page permissions maintained in the hardware's
> > > Enclave Page Cache Map (EPCM) may change after a VMA accessing the page
> > > is created. A VMA's permissions may thus exceed the enclave page
> > > permissions even though the VMA was originally created not to exceed
> > > the enclave page permissions. Following the default behavior during
> > > a page fault on a present PTE while the VMA permissions exceed the
> > > enclave page permissions would result in the PTE for an enclave page
> > > to be writable even though the page is not writable according to the
> > > enclave's permissions.
> > > 
> > > Consider the following scenario:
> > > * An enclave page exists with RW EPCM permissions.
> > > * A RW VMA maps the range spanning the enclave page.
> > > * The enclave page's EPCM permissions are changed to read-only.
> > 
> > How could this happen in the existing mainline code?
> 
> This is a preparatory patch for SGX2 support. Restricting the 
> permissions of an enclave page would require OS support that is added in 
> a later patch.
> 
> > 
> > > * There is no page table entry for the enclave page.
> > > 
> > > Q.
> > >   What will user space observe when an attempt is made to write to the
> > >   enclave page from within the enclave?
> > > 
> > > A.
> > >   Initially the page table entry is not present so the following is
> > >   observed:
> > >   1) Instruction writing to enclave page is run from within the enclave.
> > >   2) A page fault with second and third bits set (0x6) is encountered
> > >      and handled by the SGX handler sgx_vma_fault() that installs a
> > >      read-only page table entry following previous patch that installs
> > >      page table entry with permissions that VMA and enclave agree on
> > >      (read-only in this case).
> > >   3) Instruction writing to enclave page is re-attempted.
> > >   4) A page fault with first three bits set (0x7) is encountered and
> > >      transparently (from SGX and user space perspective) handled by the
> > >      OS with the page table entry made writable because the VMA is
> > >      writable.
> > >   5) Instruction writing to enclave page is re-attempted.
> > > >   6) Since the EPCM permissions prevent writing to the page a new page
> > >      fault is encountered, this time with the SGX flag set in the error
> > >      code (0x8007). No action is taken by OS for this page fault and
> > >      execution returns to user space.
> > >   7) Typically such a fault will be passed on to an application with a
> > >      signal but if the enclave is entered with the vDSO function provided
> > >      by the kernel then user space does not receive a signal but instead
> > >      the vDSO function returns successfully with exception information
> > >      (vector=14, error code=0x8007, and address) within the exception
> > >      fields within the vDSO function's struct sgx_enclave_run.
> > > 
> > > As can be observed it is not possible for user space to write to an
> > > enclave page if that page's enclave page permissions do not allow so,
> > > no matter what the VMA or PTE allows.
> > > 
> > > Even so, the OS should not allow writing to a page if that page is not
> > > writable. Thus the page table entry should accurately reflect the
> > > enclave page permissions.
> > > 
> > > Do not blindly accept VMA permissions on a page fault due to a write
> > > attempt to a present PTE. Install a pfn_mkwrite() handler that ensures
> > > that the VMA permissions agree with the enclave permissions in this
> > > regard.
> > > 
> > > Considering the same scenario as above after this change results in
> > > the following behavior change:
> > > 
> > > Q.
> > >   What will user space observe when an attempt is made to write to the
> > >   enclave page from within the enclave?
> > > 
> > > A.
> > >   Initially the page table entry is not present so the following is
> > >   observed:
> > >   1) Instruction writing to enclave page is run from within the enclave.
> > >   2) A page fault with second and third bits set (0x6) is encountered
> > >      and handled by the SGX handler sgx_vma_fault() that installs a
> > >      read-only page table entry following previous patch that installs
> > >      page table entry with permissions that VMA and enclave agree on
> > >      (read-only in this case).
> > >   3) Instruction writing to enclave page is re-attempted.
> > >   4) A page fault with first three bits set (0x7) is encountered and
> > >      passed to the pfn_mkwrite() handler for consideration. The handler
> > >      determines that the page should not be writable and returns SIGBUS.
> > >   5) Typically such a fault will be passed on to an application with a
> > >      signal but if the enclave is entered with the vDSO function provided
> > >      by the kernel then user space does not receive a signal but instead
> > >      the vDSO function returns successfully with exception information
> > >      (vector=14, error code=0x7, and address) within the exception fields
> > >      within the vDSO function's struct sgx_enclave_run.
> > > 
> > > The accurate exception information supports the SGX runtime, which is
> > > virtually always implemented inside a shared library, by providing
> > > accurate information in support of its management of the SGX enclave.
> > 
> > This QA-format is not a great idea, as it kind of tells what are the legit
> > questions to ask.
> 
> I will remove the QA-format and just describe the two (before/after) 
> scenarios.
> 
> > You should describe what the patch does and what are the
> > legit reasons for doing that. Unfortunately, in the current form it is very
> > hard to get grip of this patch.
> 
> That was the goal of the summary (the first paragraph) at the start of 
> the changelog. Could you please elaborate how you would like me to 
> improve it?

If I do a search "mktme" through the commit message, it gives
me zero results.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-06 21:20       ` Reinette Chatre
@ 2021-12-11  7:42         ` Jarkko Sakkinen
  2021-12-13 22:10           ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  7:42 UTC (permalink / raw)
  To: Reinette Chatre, Andy Lutomirski
  Cc: dave.hansen, tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang,
	cathy.zhang, cedric.xing, haitao.huang, mark.shanahan, hpa,
	linux-kernel

On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
> > This is a valid question. Since EMODPE exists why not just make things for
> > EMODPE, and ignore EMODPR altogether?
> > 
> 
> I believe that we should support the best practice of principle of least 
> privilege - once a page no longer needs a particular permission there 
> should be a way to remove it (the unneeded permission).

What if EMODPR was not used at all, since EMODPE is there anyway?

This could be achieved e.g. by having ioctl to change protection
bits in encl->page_tree.

This would simplify things a lot given that there would be only
two, instead of three, EACCEPT code paths.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2021-12-06 21:30     ` Reinette Chatre
@ 2021-12-11  7:52       ` Jarkko Sakkinen
  2021-12-13 22:11         ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  7:52 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:30 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/4/2021 2:59 PM, Jarkko Sakkinen wrote:
> > On Wed, Dec 01, 2021 at 11:23:05AM -0800, Reinette Chatre wrote:
> > > The SGX reclaimer removes page table entries pointing to pages that are
> > > moved to swap. SGX2 enables changes to pages belonging to an initialized
> > > enclave, for example changing page permissions. Supporting SGX2 requires
> > > this ability to remove page table entries that is available in the
> > > SGX reclaimer code.
> > 
> > > Missing: why SGX2 requires this?
> 
> The above paragraph states that SGX2 needs to remove page table entries 
> because it modifies page permissions. Could you please elaborate what is 
> missing?

It does not say why SGX2 requires an ability to remove page table entries.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally
  2021-12-06 21:36     ` Reinette Chatre
@ 2021-12-11  7:53       ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  7:53 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:36 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/4/2021 3:00 PM, Jarkko Sakkinen wrote:
> > On Wed, Dec 01, 2021 at 11:23:06AM -0800, Reinette Chatre wrote:
> > > The ETRACK instruction followed by an IPI to all CPUs within an enclave
> > > is a common pattern with more frequent use in support of SGX2.
> > > 
> > > Make the (empty) IPI callback function available internally in
> > > preparation for more usages.
> > 
> > Please, just describe the usages that this is needed for so that
> > there is zero guesswork required.
> 
> The reader is not required to guess. The first paragraph states that 
> SGX2 also uses the ETRACK flow that relies on this function. What if I 
> replace "for more usages" by "for usage by SGX2"?

I think that'd be good enough.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-06 21:42     ` Reinette Chatre
@ 2021-12-11  7:57       ` Jarkko Sakkinen
  2021-12-13 22:12         ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  7:57 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:42 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/4/2021 3:08 PM, Jarkko Sakkinen wrote:
> > On Wed, Dec 01, 2021 at 11:23:08AM -0800, Reinette Chatre wrote:
> > > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > > created with permissions that support all usages of the pages, from the
> > > time the enclave is initialized until it is unloaded. For example,
> > > pages used by a JIT compiler or when code needs to otherwise be
> > > relocated need to always have RWX permissions.
> > > 
> > > SGX2 includes two functions that can be used to modify the enclave page
> > > permissions of regular enclave pages within an initialized enclave.
> > > ENCLS[EMODPR] is run from the OS and used to restrict enclave page
> > > permissions while ENCLU[EMODPE] is run from within the enclave to
> > > extend enclave page permissions.
> > > 
> > > Enclave page permission changes need to be approached with care and
> > > for this reason this initial support is to allow enclave page
> > > permission changes _only_ if the new permissions are the same or
> > > > more restrictive than the permissions originally vetted at the time the
> > > pages were added to the enclave. Support for extending enclave page
> > > permissions beyond what was originally vetted is deferred.
> > 
> > This paragraph is out-of-scope for a commit message. You could have
> > this in the cover letter but not here. I would just remove it.
> 
> I think this is essential information that is mentioned in the cover 
> letter _and_ in this changelog. I will follow Dave's guidance and avoid 
> "deferred" by just removing that last sentence.
> 
> > 
> > > Whether enclave page permissions are restricted or extended it
> > > is necessary to ensure that the page table entries and enclave page
> > > permissions are in sync. Introduce a new ioctl, SGX_IOC_PAGE_MODP, to
> > 
> > > SGX_IOC_PAGE_MODP does not match the naming convention of these:
> > 
> > * SGX_IOC_ENCLAVE_CREATE
> > * SGX_IOC_ENCLAVE_ADD_PAGES
> > * SGX_IOC_ENCLAVE_INIT
> 
> ah - my understanding was that the SGX_IOC_ENCLAVE prefix related to 
> operations related to the entire enclave and thus I introduced the 
> prefix SGX_IOC_PAGE to relate to operations on pages within an enclave.

SGX_IOC_ENCLAVE_ADD_PAGES is also an operation working on pages within an
enclave.

Also, to be aligned with SGX_IOC_ENCLAVE_ADD_PAGES, the new operations
should also take secinfo as input.

> 
> > 
> > > A better name would be SGX_IOC_ENCLAVE_MOD_PROTECTIONS. It doesn't
> > > do harm to be more verbose.
> 
> Will do. I see later you propose SGX_IOC_ENCLAVE_MODIFY_TYPE - would you 
> like them to be consistent wrt MOD/MODIFY?

I would consider introducing just one new ioctl:

  SGX_IOC_ENCLAVE_MODIFY_PAGES

and choosing between the operations based on e.g. a flag
(see the flags field of SGX_IOC_ENCLAVE_ADD_PAGES).

> Reinette

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-06 21:44     ` Reinette Chatre
@ 2021-12-11  8:00       ` Jarkko Sakkinen
  2021-12-13 22:12         ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  8:00 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:44 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/4/2021 3:13 PM, Jarkko Sakkinen wrote:
> > "to initialize" -> "to an initialized"
> 
> Will do.
> 
> 
> > 
> > On Wed, Dec 01, 2021 at 11:23:11AM -0800, Reinette Chatre wrote:
> > > With SGX1 an enclave needs to be created with its maximum memory demands
> > > allocated. Pages cannot be added to an enclave after it is initialized.
> > > SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
> > > pages to an initialized enclave. With SGX2 the enclave still needs to
> > > set aside address space for its maximum memory demands during enclave
> > > creation, but all pages need not be added before enclave initialization.
> > > Pages can be added during enclave runtime.
> > > 
> > > Add support for dynamically adding pages to an initialized enclave,
> > > architecturally limited to RW permission. Add pages via the page fault
> > > handler at the time an enclave address without a backing enclave page
> > > is accessed, potentially directly reclaiming pages if no free pages
> > > are available.
> > > 
> > > The enclave is still required to run ENCLU[EACCEPT] on the page before
> > > it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
> > > on an uninitialized address. This will trigger the page fault handler
> > > that will add the enclave page and return execution to the enclave to
> > > repeat the ENCLU[EACCEPT] instruction, this time successful.
> > > 
> > > If the enclave accesses an uninitialized address in another way, for
> > > example by expanding the enclave stack to a page that has not yet been
> > > added, then the page fault handler would add the page on the first
> > > write but upon returning to the enclave the instruction that triggered
> > > the page fault would be repeated and since ENCLU[EACCEPT] was not run
> > > yet it would trigger a second page fault, this time with the SGX flag
> > > set in the page fault error code. This can only be recovered by entering
> > > the enclave again and directly running the ENCLU[EACCEPT] instruction on
> > > the now initialized address.
> > > 
> > > Accessing an uninitialized address from outside the enclave also triggers
> > > this flow but the page will remain in PENDING state until accepted from
> > > within the enclave.
> > 
> > > What does it mean being in PENDING state, and more importantly, what is
> > > PENDING state? What does a memory access within enclave cause when it
> > > touches a page within this state?
> 
> The PENDING state is the enclave page state from the SGX hardware's 
> perspective. The OS uses the ENCLS[EAUG] SGX2 function to add a new page 
> to the enclave but from the SGX hardware's perspective it would be in a 
> PENDING state until the enclave accepts the page. An access to the page 
> in PENDING state would result in a page fault.
> 
> 
> > > I see a lot of text in the commit message but zero mentions about EPCM
> > > except this one sudden mention about PENDING field without attaching
> > > it to anything concrete.
> 
> My apologies - I will add this to this changelog. This matches your 
> request to describe the __eaug() wrapper introduced in patch 02/25. 
> Would you like me to duplicate this information here and in that patch 
> (a new patch dedicated to the __eaug() wrapper) or would you be ok if I 
> introduce the wrappers all together briefly as in the example you 
> provide and then detail the flows where the wrappers are used - like 
> this patch?

I think it would be a good place to describe these details in 02/25,
and skip them in rest of the patches.

/Jarkko


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization
  2021-12-06 21:45     ` Reinette Chatre
@ 2021-12-11  8:01       ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  8:01 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:45 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/4/2021 3:14 PM, Jarkko Sakkinen wrote:
> > > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > > index 342b97dd4c33..37203da382f8 100644
> > > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > > @@ -403,6 +403,10 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
> > >   
> > >   	XA_STATE(xas, &encl->page_array, PFN_DOWN(start));
> > >   
> > 
> > Please write a comment here.
> 
> Would the comment below suffice?
> 
> /* Disallow mapping outside enclave's address range. */

Yeah, looks good to me.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 16/25] x86/sgx: Support modifying SGX page type
  2021-12-06 21:48     ` Reinette Chatre
@ 2021-12-11  8:02       ` Jarkko Sakkinen
  2021-12-13 17:43         ` Dave Hansen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-11  8:02 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, 2021-12-06 at 13:48 -0800, Reinette Chatre wrote:
> > I'd suggest to change this as SGX_IOC_ENCLAVE_MODIFY_TYPE.
> 
> How about SGX_IOC_ENCLAVE_MOD_TYPE to be consistent with your earlier 
> suggestion of SGX_IOC_ENCLAVE_MOD_PROTECTIONS ?

I think it would be best to introduce only one new ioctl that would
be capable of doing either operation (and use secinfo as a vessel
for additional data).

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 16/25] x86/sgx: Support modifying SGX page type
  2021-12-11  8:02       ` Jarkko Sakkinen
@ 2021-12-13 17:43         ` Dave Hansen
  2021-12-21  8:52           ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Dave Hansen @ 2021-12-13 17:43 UTC (permalink / raw)
  To: Jarkko Sakkinen, Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On 12/11/21 12:02 AM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:48 -0800, Reinette Chatre wrote:
>>> I'd suggest to change this as SGX_IOC_ENCLAVE_MODIFY_TYPE.
>> How about SGX_IOC_ENCLAVE_MOD_TYPE to be consistent with your earlier 
>> suggestion of SGX_IOC_ENCLAVE_MOD_PROTECTIONS ?
> I think it would be best to introduce only one new ioctl that would
> be capable of doing either operation (and use secinfo as a vessel
> for additional data).

Why?

I don't think we should try to multiplex within an ioctl().  Just create
a second ioctl().
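
As a rough sketch of what two separate requests could look like (every
name, number and struct layout below is hypothetical and picked only for
illustration, none of it is taken from the series):

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical magic number, arbitrary for this sketch. */
#define EXAMPLE_MAGIC 0xE5

/* Hypothetical argument for a "restrict permissions" request. */
struct example_modify_perms {
	uint64_t offset;	/* start of range, relative to enclave base */
	uint64_t length;	/* length of range in bytes */
	uint64_t perms;		/* new permissions for the range */
};

/* Hypothetical argument for a "change page type" request. */
struct example_modify_type {
	uint64_t offset;	/* start of range, relative to enclave base */
	uint64_t length;	/* length of range in bytes */
	uint64_t page_type;	/* new page type for the range */
};

/* One request number and one argument struct per operation. */
#define EXAMPLE_IOC_MODIFY_PERMS \
	_IOWR(EXAMPLE_MAGIC, 0x10, struct example_modify_perms)
#define EXAMPLE_IOC_MODIFY_TYPE \
	_IOWR(EXAMPLE_MAGIC, 0x11, struct example_modify_type)
```

Each request then carries exactly the fields its operation needs and
there is no mode flag to validate.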

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers
  2021-12-11  5:28       ` Jarkko Sakkinen
@ 2021-12-13 22:06         ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:06 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/10/2021 9:28 PM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:13 -0800, Reinette Chatre wrote:

...

>>
>> __eadd():
>> /* Copy a source page from non-enclave memory into the EPC. */
> 
> Perhaps:
> 
> /*
>   * Associate an EPC page to an enclave either as a REG or TCS page
>   * populated with the provided data.
>   */
> 
> This is more aligned with your description for __eremove().

I was trying to keep the descriptions as concise one-liners. I'll use 
the text you provide if you are ok with its line length being an exception.

...

>>
>> __eremove():
>> /* Disassociate EPC page from its enclave and mark it as unused. */

...

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions
  2021-12-11  5:39         ` Jarkko Sakkinen
@ 2021-12-13 22:08           ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:08 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/10/2021 9:39 PM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:16 -0800, Reinette Chatre wrote:
>> On 12/4/2021 2:27 PM, Jarkko Sakkinen wrote:
>>> On Sun, Dec 05, 2021 at 12:25:59AM +0200, Jarkko Sakkinen wrote:
>>>> On Wed, Dec 01, 2021 at 11:23:01AM -0800, Reinette Chatre wrote:
>>>>> === Summary ===
>>>>>
>>>>> An SGX VMA can only be created if its permissions are the same or
>>>>> weaker than the Enclave Page Cache Map (EPCM) permissions. After VMA
>>>>> creation this rule continues to be enforced by the page fault handler.
>>>>>
>>>>> With SGX2 the EPCM permissions of a page can change after VMA
>>>>> creation resulting in the VMA exceeding the EPCM permissions and the
>>>>> page fault handler incorrectly blocking access.
>>>>>
>>>>> Enable the VMA's pages to remain accessible while ensuring that
>>>>> the page table entries are installed to match the EPCM permissions
>>>>> without exceeding the VMA permissions.
>>>>
>>>> I don't understand what the short summary means in English, and the
>>>> commit message is way too bloated to make any conclusions. It really
>>>> needs a rewrite.
>>>>
>>>> These were the questions I could not find answer for:
>>>>
>>>> 1. Why it would be by any means safe to remove a permission check?
>>
>> The permission check is redundant for SGX1 and incorrect for SGX2.
>>
>> In the current SGX1 implementation the permission check in
>> sgx_encl_load_page() is redundant because an SGX VMA can only be created
>> if its permissions are the same or weaker than the EPCM permissions.
>>
>> In SGX2 a user is able to change EPCM permissions during runtime (while
>> VMA has the memory mapped). A RW VMA may thus originally have mapped an
>> enclave page with RW EPCM permissions but since then the enclave page
>> may have its permissions changed to read-only. The VMA should still be
>> able to read those enclave pages but the check in sgx_encl_load_page()
>> will prevent that.
>>
>>>> 2. Why not re-issuing mmap()'s is unfeasible? I.e. close existing
>>>>      VMA's and mmap() new ones.
>>
>> User is not prevented from closing existing VMAs and creating new ones.
>>
>>> 3. Isn't this an API/ABI break?
>>
>> Could you please elaborate where you see the API/ABI break? The rule
>> that new VMAs cannot exceed EPCM permissions is untouched.
>>
>> Reinette
> 
> I just don't understand the description. There's a whole bunch of text,
> but it does not discuss in low-level detail what the patch does, e.g.
> the use of vmf_insert_pfn_prot(). I honestly do not get the story here...

vmf_insert_pfn_prot() replaces the existing call to vmf_insert_pfn().

Notice how:

vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
                           unsigned long pfn)
{
         return vmf_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot);
}

vmf_insert_pfn() is replaced with the function it would call anyway. It 
is done because the PTE being installed should no longer blindly inherit 
the VMA permission as is done in the current code, but it should also 
take the EPCM permissions into account. This is because the EPCM 
permissions can change after the VMA is created.

For example, consider a RW VMA created to map pages with RW EPCM permissions.
Since SGX1 does not allow EPCM permission changes it is ok to always 
install RW PTEs to access those pages and thus vmf_insert_pfn() is 
sufficient. In SGX2 the EPCM pages may become read-only and the PTEs 
should no longer be RW. This is made possible with the call to 
vmf_insert_pfn_prot() where the protection bits for the PTE can be 
provided (so that the PTE permissions do not exceed the EPCM permissions).
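
To illustrate the idea outside of the kernel, here is a minimal
user-space sketch. It is not the code from the patch: the PERM_* bits
and the pte_perms() helper are made up for this example and are not the
kernel's VM_* or SGX_SECINFO_* definitions. It only shows that the
permissions used for the PTE are what the VMA and the EPCM both allow.

```c
#include <stdio.h>

/* Hypothetical permission bits, for this sketch only. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/*
 * Permissions to use for the PTE: only what both the VMA and the
 * EPCM allow. The result can never exceed either of the two inputs.
 */
static inline unsigned int pte_perms(unsigned int vma_perms,
				     unsigned int epcm_perms)
{
	return vma_perms & epcm_perms;
}

/* Print a permission mask as an "rwx" style string. */
static inline void print_perms(const char *name, unsigned int perms)
{
	printf("%s: %c%c%c\n", name,
	       (perms & PERM_R) ? 'r' : '-',
	       (perms & PERM_W) ? 'w' : '-',
	       (perms & PERM_X) ? 'x' : '-');
}
```

With a RW VMA and an EPCM entry that was restricted to read-only,
pte_perms(PERM_R | PERM_W, PERM_R) yields PERM_R, which matches the
read-only PTE described above. While the EPCM permissions are still RW
the result is RW, the same as what vmf_insert_pfn() would have installed.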

Reinette



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs
  2021-12-11  7:37       ` Jarkko Sakkinen
@ 2021-12-13 22:09         ` Reinette Chatre
  2021-12-28 14:51           ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:09 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/10/2021 11:37 PM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:18 -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/4/2021 2:43 PM, Jarkko Sakkinen wrote:
>>> On Wed, Dec 01, 2021 at 11:23:02AM -0800, Reinette Chatre wrote:
>>>> By default a write page fault on a present PTE inherits the permissions
>>>> of the VMA. Enclave page permissions maintained in the hardware's
>>>> Enclave Page Cache Map (EPCM) may change after a VMA accessing the page
>>>> is created. A VMA's permissions may thus exceed the enclave page
>>>> permissions even though the VMA was originally created not to exceed
>>>> the enclave page permissions. Following the default behavior during
>>>> a page fault on a present PTE while the VMA permissions exceed the
>>>> enclave page permissions would result in the PTE for an enclave page
>>>> to be writable even though the page is not writable according to the
>>>> enclave's permissions.
>>>>
>>>> Consider the following scenario:
>>>> * An enclave page exists with RW EPCM permissions.
>>>> * A RW VMA maps the range spanning the enclave page.
>>>> * The enclave page's EPCM permissions are changed to read-only.
>>>
>>> How could this happen in the existing mainline code?
>>
>> This is a preparatory patch for SGX2 support. Restricting the
>> permissions of an enclave page would require OS support that is added in
>> a later patch.
>>
>>>
>>>> * There is no page table entry for the enclave page.
>>>>
>>>> Q.
>>>>    What will user space observe when an attempt is made to write to the
>>>>    enclave page from within the enclave?
>>>>
>>>> A.
>>>>    Initially the page table entry is not present so the following is
>>>>    observed:
>>>>    1) Instruction writing to enclave page is run from within the enclave.
>>>>    2) A page fault with second and third bits set (0x6) is encountered
>>>>       and handled by the SGX handler sgx_vma_fault() that installs a
>>>>       read-only page table entry following previous patch that installs
>>>>       page table entry with permissions that VMA and enclave agree on
>>>>       (read-only in this case).
>>>>    3) Instruction writing to enclave page is re-attempted.
>>>>    4) A page fault with first three bits set (0x7) is encountered and
>>>>       transparently (from SGX and user space perspective) handled by the
>>>>       OS with the page table entry made writable because the VMA is
>>>>       writable.
>>>>    5) Instruction writing to enclave page is re-attempted.
>>>>    6) Since the EPCM permissions prevent writing to the page a new page
>>>>       fault is encountered, this time with the SGX flag set in the error
>>>>       code (0x8007). No action is taken by OS for this page fault and
>>>>       execution returns to user space.
>>>>    7) Typically such a fault will be passed on to an application with a
>>>>       signal but if the enclave is entered with the vDSO function provided
>>>>       by the kernel then user space does not receive a signal but instead
>>>>       the vDSO function returns successfully with exception information
>>>>       (vector=14, error code=0x8007, and address) within the exception
>>>>       fields within the vDSO function's struct sgx_enclave_run.
>>>>
>>>> As can be observed it is not possible for user space to write to an
>>>> enclave page if that page's enclave page permissions do not allow so,
>>>> no matter what the VMA or PTE allows.
>>>>
>>>> Even so, the OS should not allow writing to a page if that page is not
>>>> writable. Thus the page table entry should accurately reflect the
>>>> enclave page permissions.
>>>>
>>>> Do not blindly accept VMA permissions on a page fault due to a write
>>>> attempt to a present PTE. Install a pfn_mkwrite() handler that ensures
>>>> that the VMA permissions agree with the enclave permissions in this
>>>> regard.
>>>>
>>>> Considering the same scenario as above after this change results in
>>>> the following behavior change:
>>>>
>>>> Q.
>>>>    What will user space observe when an attempt is made to write to the
>>>>    enclave page from within the enclave?
>>>>
>>>> A.
>>>>    Initially the page table entry is not present so the following is
>>>>    observed:
>>>>    1) Instruction writing to enclave page is run from within the enclave.
>>>>    2) A page fault with second and third bits set (0x6) is encountered
>>>>       and handled by the SGX handler sgx_vma_fault() that installs a
>>>>       read-only page table entry following previous patch that installs
>>>>       page table entry with permissions that VMA and enclave agree on
>>>>       (read-only in this case).
>>>>    3) Instruction writing to enclave page is re-attempted.
>>>>    4) A page fault with first three bits set (0x7) is encountered and
>>>>       passed to the pfn_mkwrite() handler for consideration. The handler
>>>>       determines that the page should not be writable and returns SIGBUS.
>>>>    5) Typically such a fault will be passed on to an application with a
>>>>       signal but if the enclave is entered with the vDSO function provided
>>>>       by the kernel then user space does not receive a signal but instead
>>>>       the vDSO function returns successfully with exception information
>>>>       (vector=14, error code=0x7, and address) within the exception fields
>>>>       within the vDSO function's struct sgx_enclave_run.
>>>>
>>>> The accurate exception information supports the SGX runtime, which is
>>>> virtually always implemented inside a shared library, by providing
>>>> accurate information in support of its management of the SGX enclave.
>>>
>>> This QA-format is not a great idea, as it kind of tells what are the legit
>>> questions to ask.
>>
>> I will remove the QA-format and just describe the two (before/after)
>> scenarios.
>>
>>> You should describe what the patch does and what are the
>>> legit reasons for doing that. Unfortunately, in the current form it is very
>>> hard to get grip of this patch.
>>
>> That was the goal of the summary (the first paragraph) at the start of
>> the changelog. Could you please elaborate how you would like me to
>> improve it?
> 
> If I do a search "mktme" through the commit message, it gives
> me zero results.

Could you please elaborate why you expect "mktme" to show up in the 
commit message?

Reinette



* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-11  7:42         ` Jarkko Sakkinen
@ 2021-12-13 22:10           ` Reinette Chatre
  2021-12-28 14:52             ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:10 UTC (permalink / raw)
  To: Jarkko Sakkinen, Andy Lutomirski
  Cc: dave.hansen, tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang,
	cathy.zhang, cedric.xing, haitao.huang, mark.shanahan, hpa,
	linux-kernel

Hi Jarkko,

On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
>>> This is a valid question. Since EMODPE exists why not just make things for
>>> EMODPE, and ignore EMODPR altogether?
>>>
>>
>> I believe that we should support the best practice of principle of least
>> privilege - once a page no longer needs a particular permission there
>> should be a way to remove it (the unneeded permission).
> 
> What if EMODPR was not used at all, since EMODPE is there anyway?

EMODPR and EMODPE are not equivalent.

EMODPE can only be used to "extend"/relax permissions while EMODPR can 
only be used to restrict permissions.

Notice in the EMODPE instruction reference of the SDM:

(* Update EPCM permissions *)
EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;

So, when using EMODPE it is only possible to add permissions, not remove 
permissions.

If a user wants to remove permissions from an EPCM page it is only 
possible when using EMODPR. Notice in its instruction reference found in 
the SDM how it in turn can only be used to restrict permissions:

(* Update EPCM permissions *)
EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
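
The two updates can be modeled in a few lines of C. This is only an
illustrative user-space sketch, not code from this series or from the SDM:
the PERM_* values and the model_emodpe()/model_emodpr() helpers are made-up
names, and the checks the real instructions perform are omitted.

```c
#include <stdint.h>

/* Illustrative permission bits; the values are assumptions for this sketch. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Models the EMODPE update: bits are OR-ed in, so they can only be added. */
static uint8_t model_emodpe(uint8_t epcm, uint8_t secinfo)
{
	return (uint8_t)(epcm | secinfo);
}

/* Models the EMODPR update: bits are AND-ed, so they can only be removed. */
static uint8_t model_emodpr(uint8_t epcm, uint8_t secinfo)
{
	return (uint8_t)(epcm & secinfo);
}
```

In this model, starting from RW, model_emodpe() with X yields RWX while
model_emodpr() with R yields R. Neither helper can do the other's job.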

> This could be achieved e.g. by having ioctl to change protection
> bits in encl->page_tree.
> 
> This would simplify things a lot given that there would be only
> two, instead of three, EACCEPT code paths.

Reinette


* Re: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2021-12-11  7:52       ` Jarkko Sakkinen
@ 2021-12-13 22:11         ` Reinette Chatre
  2021-12-28 14:55           ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/10/2021 11:52 PM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:30 -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/4/2021 2:59 PM, Jarkko Sakkinen wrote:
>>> On Wed, Dec 01, 2021 at 11:23:05AM -0800, Reinette Chatre wrote:
>>>> The SGX reclaimer removes page table entries pointing to pages that are
>>>> moved to swap. SGX2 enables changes to pages belonging to an initialized
>>>> enclave, for example changing page permissions. Supporting SGX2 requires
>>>> this ability to remove page table entries that is available in the
>>>> SGX reclaimer code.
>>>
>>> Missing: why SGX2 requires this?
>>
>> The above paragraph states that SGX2 needs to remove page table entries
>> because it modifies page permissions. Could you please elaborate what is
>> missing?
> 
> It does not say why SGX2 requires an ability to remove page table entries.

Are you saying that modification of EPCM page permissions is not a 
reason to remove page table entries pointing to those pages?

Reinette


* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-11  7:57       ` Jarkko Sakkinen
@ 2021-12-13 22:12         ` Reinette Chatre
  2021-12-28 14:56           ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:12 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/10/2021 11:57 PM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:42 -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/4/2021 3:08 PM, Jarkko Sakkinen wrote:
>>> On Wed, Dec 01, 2021 at 11:23:08AM -0800, Reinette Chatre wrote:
>>>> In the initial (SGX1) version of SGX, pages in an enclave need to be
>>>> created with permissions that support all usages of the pages, from the
>>>> time the enclave is initialized until it is unloaded. For example,
>>>> pages used by a JIT compiler or when code needs to otherwise be
>>>> relocated need to always have RWX permissions.
>>>>
>>>> SGX2 includes two functions that can be used to modify the enclave page
>>>> permissions of regular enclave pages within an initialized enclave.
>>>> ENCLS[EMODPR] is run from the OS and used to restrict enclave page
>>>> permissions while ENCLU[EMODPE] is run from within the enclave to
>>>> extend enclave page permissions.
>>>>
>>>> Enclave page permission changes need to be approached with care and
>>>> for this reason this initial support is to allow enclave page
>>>> permission changes _only_ if the new permissions are the same or
>>>> more restrictive than the permissions originally vetted at the time the
>>>> pages were added to the enclave. Support for extending enclave page
>>>> permissions beyond what was originally vetted is deferred.
>>>
>>> This paragraph is out-of-scope for a commit message. You could have
>>> this in the cover letter but not here. I would just remove it.
>>
>> I think this is essential information that is mentioned in the cover
>> letter _and_ in this changelog. I will follow Dave's guidance and avoid
>> "deferred" by just removing that last sentence.
>>
>>>
>>>> Whether enclave page permissions are restricted or extended it
>>>> is necessary to ensure that the page table entries and enclave page
>>>> permissions are in sync. Introduce a new ioctl, SGX_IOC_PAGE_MODP, to
>>>
>>> SGX_IOC_PAGE_MODP does not match the naming convention of these:
>>>
>>> * SGX_IOC_ENCLAVE_CREATE
>>> * SGX_IOC_ENCLAVE_ADD_PAGES
>>> * SGX_IOC_ENCLAVE_INIT
>>
>> ah - my understanding was that the SGX_IOC_ENCLAVE prefix related to
>> operations related to the entire enclave and thus I introduced the
>> prefix SGX_IOC_PAGE to relate to operations on pages within an enclave.
> 
> SGX_IOC_ENCLAVE_ADD_PAGES is also operation working on pages within an
> enclave.
> 
> Also, to be aligned with SGX_IOC_ENCLAVE_ADD_PAGES, the new operations
> should also take secinfo as input.

ok, will do.

> 
>>
>>>
>>> A better name would be SGX_IOC_ENCLAVE_MOD_PROTECTIONS. It doesn't
>>> do harm to be a bit more verbose.
>>
>> Will do. I see later you propose SGX_IOC_ENCLAVE_MODIFY_TYPE - would you
>> like them to be consistent wrt MOD/MODIFY?
> 
> I would consider introducing just one new ioctl:
> 
>    SGX_IOC_ENCLAVE_MODIFY_PAGES
> 
> and choose either operation based on e.g. a flag
> (see the flags field of SGX_IOC_ENCLAVE_ADD_PAGES).
> 

There seems to be a different opinion about the single ioctl(), as per:
https://lore.kernel.org/lkml/0fb14185-5cc3-a963-253d-2e119b4a52bb@intel.com/

I thus plan to proceed with the two ioctls, both taking secinfo as 
input. Would that be ok with you?

Reinette




* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-11  8:00       ` Jarkko Sakkinen
@ 2021-12-13 22:12         ` Reinette Chatre
  2021-12-28 14:57           ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:12 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/11/2021 12:00 AM, Jarkko Sakkinen wrote:
> On Mon, 2021-12-06 at 13:44 -0800, Reinette Chatre wrote:
>> On 12/4/2021 3:13 PM, Jarkko Sakkinen wrote:
>>> On Wed, Dec 01, 2021 at 11:23:11AM -0800, Reinette Chatre wrote:

...

>>>> Accessing an uninitialized address from outside the enclave also triggers
>>>> this flow but the page will remain in PENDING state until accepted from
>>>> within the enclave.
>>>
>>> What does it mean being in PENDING state, and more importantly, what is
>>> PENDING state? What does a memory access within enclave cause when it
>>> touches a page in this state?
>>
>> The PENDING state is the enclave page state from the SGX hardware's
>> perspective. The OS uses the ENCLS[EAUG] SGX2 function to add a new page
>> to the enclave but from the SGX hardware's perspective it would be in a
>> PENDING state until the enclave accepts the page. An access to the page
>> in PENDING state would result in a page fault.
>>
>>
>>> I see a lot of text in the commit message but zero mentions about EPCM
>>> except this one sudden mention about PENDING field without attaching
>>> it to anything concrete.
>>
>> My apologies - I will add this to this changelog. This matches your
>> request to describe the __eaug() wrapper introduced in patch 02/25.
>> Would you like me to duplicate this information here and in that patch
>> (a new patch dedicated to the __eaug() wrapper) or would you be ok if I
>> introduce the wrappers all together briefly as in the example you
>> provide and then detail the flows where the wrappers are used - like
>> this patch?
> 
> I think it would be a good place to describe these details in 02/25,
> and skip them in rest of the patches.
> 

Will do. I do think describing this amount of detail for the new SGX2 
functions would be too much for a single patch so I currently plan to 
split that (02/25) patch into a new patch per SGX2 instruction. Is that 
ok with you or would you like to keep it in a single patch?

Reinette



* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-04 23:55             ` Reinette Chatre
@ 2021-12-13 22:34               ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2021-12-13 22:34 UTC (permalink / raw)
  To: Andy Lutomirski, dave.hansen, jarkko, tglx, bp, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Andy,

I would like to check whether you have had some time to digest my
responses. I have added a few high-level questions below ...

On 12/4/2021 3:55 PM, Reinette Chatre wrote:
> On 12/4/2021 9:56 AM, Andy Lutomirski wrote:
>> On 12/3/21 17:14, Reinette Chatre wrote:
>>> On 12/3/2021 4:38 PM, Andy Lutomirski wrote:
>>>> On 12/3/21 14:12, Reinette Chatre wrote:
>>>>> On 12/3/2021 11:28 AM, Andy Lutomirski wrote:
>>>>>> On 12/1/21 11:23, Reinette Chatre wrote:
>>>>>>> Enclave creators declare their paging permission intent at the time
>>>>>>> the pages are added to the enclave. These paging permissions are
>>>>>>> vetted when pages are added to the enclave and stashed off
>>>>>>> (in sgx_encl_page->vm_max_prot_bits) for later comparison with
>>>>>>> enclave PTEs.
>>>>>>>
>>>>>>
>>>>>> I'm a bit confused here. ENCLU[EMODPE] allows the enclave to 
>>>>>> change the EPCM permission bits however it likes with no oversight 
>>>>>> from the kernel.   So we end up with a whole bunch of permission 
>>>>>> masks:
>>>>>
>>>>> Before jumping to the permission masks I would like to step back 
>>>>> and just confirm the context. We need to consider the following 
>>>>> three permissions:
>>>>>
>>>>> EPCM permissions: the enclave page permissions maintained in the 
>>>>> SGX hardware. The OS is constrained here in that it cannot query 
>>>>> the current EPCM permissions. Even so, the OS needs to ensure PTEs 
>>>>> are installed appropriately (we do not want a RW PTE for a 
>>>>> read-only enclave page)
>>>>
>>>> Why not?  What's wrong with an RW PTE for a read-only enclave page?
>>>>
>>>> If you convince me that this is actually important, then I'll read 
>>>> all the stuff below.
>>>
>>> Perhaps it is my misunderstanding/misinterpretation of the current 
>>> implementation? From what I understand the current requirement, as 
>>> enforced in the current mmap(), mprotect() as well as fault() hooks, 
>>> is that mappings are required to have identical or weaker permission 
>>> than the enclave permission.
>>
>> The current implementation does require that, but for a perhaps 
>> counterintuitive reason.  If a SELinux-restricted (or similarly 
>> restricted) process that is *not* permitted to do JIT-like things 
>> loads an enclave, it's entirely okay for it to initialize RW enclave 
>> pages however it likes and it's entirely okay for it to initialize RX 
>> (or XO if that ever becomes a thing) enclave pages from appropriately 
>> labeled files on disk.  But it's not okay for it to create RWX enclave pages 
>> or to initialize RX enclave pages from untrusted application memory. [0]
>>
>> So we have a half-baked implementation right now: the permission to 
>> execute a page is decided based on secinfo (max permissions) when the 
>> enclave is set up, and it's enforced at the PTE level.  The PTE 
>> enforcement is because, on SGX2 hardware, the enclave can do EMODPE 
>> and bypass any supposed restrictions in the EPCM.
>>
>> The only coupling between EPCM and PTE here is that the max_perm is 
>> initialized together with EPCM, but it didn't have to be that way.
>>
>> An SGX2 implementation needs to be more fully baked, because in a 
>> dynamic environment enclaves need to be able to use EMODPE and 
>> actually end up with permissions that exceed the initial secinfo 
>> permissions.  So 
> 
> Could you please elaborate why this is a requirement? In this 
> implementation the secinfo of a page added before enclave initialization 
> (via SGX_IOC_ENCLAVE_ADD_PAGES) would indicate the maximum permissions 
> it may have during its lifetime. Pages needing to be writable and 
> executable during their lifetime can be created with RWX secinfo and 
> during the enclave runtime the pages could obtain all combinations of 
> permissions: RWX, R, RW, RX. A page added with RW secinfo may have R or 
> RW permissions during its lifetime but never RX or RWX.
> 
> So far the responses to our inquiries on whether this is acceptable have 
> been positive, and this is also what Dave attempted to put a spotlight on in:
> https://lore.kernel.org/lkml/94d8d631-5345-66c4-52a3-941e52500f84@intel.com/ 
> 
> 
> This above is specific to pages added before enclave initialization. In 
> this implementation pages added after enclave initialization, those 
> needing the ENCLS[EAUG] SGX2 instruction, are added with max permissions 
> of RW so could only have R or RW permissions during their lifetime. This 
> is an understood limitation and it is understood that integration with 
> user policy is required to support these pages obtaining executable 
> permission. The plan is to handle user policy integration in a series 
> that follows this core SGX2 enabling.

Are you ok with the strategy to support modification of enclave page 
permissions?
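
The rule described in the quoted text above (a page may take any
permissions that do not exceed what was vetted when it was added) amounts
to a subset check. A minimal sketch, assuming a hypothetical helper
perm_change_allowed() and made-up PERM_* values; the actual check in the
series may look different:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative permission bits; the values are assumptions for this sketch. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/*
 * Hypothetical helper: returns true if every requested permission bit is
 * also present in the maximum permissions vetted when the page was added.
 */
static bool perm_change_allowed(uint8_t max_perm, uint8_t requested)
{
	return (requested & ~max_perm) == 0;
}
```

With max_perm of RWX the requests R, RW, RX and RWX all pass. With max_perm
of RW the requests R and RW pass, while RX and RWX are refused.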

> 
>> it needs to be possible to make a page that starts out R (or RW or 
>> whatever) but nonetheless has max_perm=RWX so that the enclave can use 
>> a combination of EMODPE and (ioctl-based) EMODPR to do JIT.  So I 
>> think you should make it possible to set up pages like this, but I see 
>> no reason to couple the PTE and the EPCM permissions.
>>
>>>
>>> Could you please elaborate how you envision PTEs should be managed in 
>>> this implementation?
>>
>> As above: PTE permissions may not exceed max_perm, and EPCM is 
>> entirely separate except to the extent needed for ABI compatibility 
>> with SGX1 runtimes.
> 
> ok, so if I understand you correctly, since PTE permissions may not 
> exceed max_perm and EPCM is separate, this seems to get back to your 
> previous question of "What's wrong with an RW PTE for a read-only 
> enclave page?"
> 
> This is indeed something that we could allow but not doing so (that is 
> PTEs not exceeding EPCM permissions) would better support the SGX 
> runtime. That is why I separated out the addition of the pfn_mkwrite() 
> callback in the previous patch (04/25). Like in your example, there is a 
> RW mapping of a read-only enclave page that first results in a RW PTE 
> for the read-only enclave page. That would result in a #PF with the SGX 
> flag set (0x8007). If the PTE matches the enclave permissions the page 
> fault would have familiar 0x7 error code.
> 
> In either case user space would encounter a #PF so technically there is 
> nothing "wrong" with allowing this - even so, as motivated in the 
> previous patch: accurate exception information supports the SGX runtime, 
> which is virtually always implemented inside a shared library, by 
> providing accurate information in support of its management of the SGX 
> enclave.

Are you ok with managing PTEs in this way? It matches your requirement 
that PTE permissions may not exceed max_perm and ABI is compatible with 
SGX1. Additionally, PTEs are not allowed to exceed EPCM permissions, 
which is not an ABI change since it was not a consideration during SGX1 
where EPCM permissions could not change.
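
To make the difference between the two error codes mentioned above
concrete, here is a small user-space sketch that decodes them. It is an
illustration only, not code from this series: the PF_* names and the two
helpers are made up for this example, with bit positions following the x86
page fault error code layout (bit 0 present, bit 1 write, bit 2 user,
bit 15 SGX).

```c
#include <stdbool.h>
#include <stdint.h>

/* Page fault error code bits; the names are local to this sketch. */
#define PF_PROT  (1u << 0)	/* fault on a present page */
#define PF_WRITE (1u << 1)	/* fault caused by a write */
#define PF_USER  (1u << 2)	/* fault raised in user mode */
#define PF_SGX   (1u << 15)	/* fault caused by SGX access control */

/* True if the fault was raised by the SGX (EPCM) permission checks. */
static bool pf_is_sgx(uint32_t error_code)
{
	return (error_code & PF_SGX) != 0;
}

/* True if the fault is a write to a page whose PTE is present. */
static bool pf_is_write_to_present(uint32_t error_code)
{
	return (error_code & (PF_PROT | PF_WRITE)) == (PF_PROT | PF_WRITE);
}
```

Both 0x7 and 0x8007 describe a user-mode write to a present page; only
0x8007 has the SGX bit set. 0x6 is a user-mode write to a page that has no
present PTE.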


>> [0] I'm not sure anyone actually has a system set up like this or that 
>> the necessary LSM support is in the kernel.  But it's supposed to be 
>> possible without changing the ABI.
>>

Thank you very much

Reinette


* Re: [PATCH 16/25] x86/sgx: Support modifying SGX page type
  2021-12-13 17:43         ` Dave Hansen
@ 2021-12-21  8:52           ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-21  8:52 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx,
	x86, seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Mon, Dec 13, 2021 at 09:43:38AM -0800, Dave Hansen wrote:
> On 12/11/21 12:02 AM, Jarkko Sakkinen wrote:
> > On Mon, 2021-12-06 at 13:48 -0800, Reinette Chatre wrote:
> >>> I'd suggest to change this as SGX_IOC_ENCLAVE_MODIFY_TYPE.
> >> How about SGX_IOC_ENCLAVE_MOD_TYPE to be consistent with your earlier 
> >> suggestion of SGX_IOC_ENCLAVE_MOD_PROTECTIONS ?
> > I think it would be best to introduce only one new ioctl that would
> > be capable of doing either operation (and use secinfo as a vessel
> > for additional data).
> 
> Why?
> 
> I don't think we should try to multiplex within an ioctl().  Just create
> a second ioctl().

I'm fine with 2 ioctls.

/Jarkko


* Re: [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs
  2021-12-13 22:09         ` Reinette Chatre
@ 2021-12-28 14:51           ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-28 14:51 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Dec 13, 2021 at 02:09:30PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/10/2021 11:37 PM, Jarkko Sakkinen wrote:
> > On Mon, 2021-12-06 at 13:18 -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 12/4/2021 2:43 PM, Jarkko Sakkinen wrote:
> > > > On Wed, Dec 01, 2021 at 11:23:02AM -0800, Reinette Chatre wrote:
> > > > > By default a write page fault on a present PTE inherits the permissions
> > > > > of the VMA. Enclave page permissions maintained in the hardware's
> > > > > Enclave Page Cache Map (EPCM) may change after a VMA accessing the page
> > > > > is created. A VMA's permissions may thus exceed the enclave page
> > > > > permissions even though the VMA was originally created not to exceed
> > > > > the enclave page permissions. Following the default behavior during
> > > > > a page fault on a present PTE while the VMA permissions exceed the
> > > > > enclave page permissions would result in the PTE for an enclave page
> > > > > to be writable even though the page is not writable according to the
> > > > > enclave's permissions.
> > > > > 
> > > > > Consider the following scenario:
> > > > > * An enclave page exists with RW EPCM permissions.
> > > > > * A RW VMA maps the range spanning the enclave page.
> > > > > * The enclave page's EPCM permissions are changed to read-only.
> > > > 
> > > > How could this happen in the existing mainline code?
> > > 
> > > This is a preparatory patch for SGX2 support. Restricting the
> > > permissions of an enclave page would require OS support that is added in
> > > a later patch.
> > > 
> > > > 
> > > > > * There is no page table entry for the enclave page.
> > > > > 
> > > > > Q.
> > > > >    What will user space observe when an attempt is made to write to the
> > > > >    enclave page from within the enclave?
> > > > > 
> > > > > A.
> > > > >    Initially the page table entry is not present so the following is
> > > > >    observed:
> > > > >    1) Instruction writing to enclave page is run from within the enclave.
> > > > >    2) A page fault with second and third bits set (0x6) is encountered
> > > > >       and handled by the SGX handler sgx_vma_fault() that installs a
> > > > >       read-only page table entry following previous patch that installs
> > > > >       page table entry with permissions that VMA and enclave agree on
> > > > >       (read-only in this case).
> > > > >    3) Instruction writing to enclave page is re-attempted.
> > > > >    4) A page fault with first three bits set (0x7) is encountered and
> > > > >       transparently (from SGX and user space perspective) handled by the
> > > > >       OS with the page table entry made writable because the VMA is
> > > > >       writable.
> > > > >    5) Instruction writing to enclave page is re-attempted.
> > > > >    6) Since the EPCM permissions prevent writing to the page a new page
> > > > >       fault is encountered, this time with the SGX flag set in the error
> > > > >       code (0x8007). No action is taken by OS for this page fault and
> > > > >       execution returns to user space.
> > > > >    7) Typically such a fault will be passed on to an application with a
> > > > >       signal but if the enclave is entered with the vDSO function provided
> > > > >       by the kernel then user space does not receive a signal but instead
> > > > >       the vDSO function returns successfully with exception information
> > > > >       (vector=14, error code=0x8007, and address) within the exception
> > > > >       fields within the vDSO function's struct sgx_enclave_run.
> > > > > 
> > > > > As can be observed it is not possible for user space to write to an
> > > > > enclave page if that page's enclave page permissions do not allow so,
> > > > > no matter what the VMA or PTE allows.
> > > > > 
> > > > > Even so, the OS should not allow writing to a page if that page is not
> > > > > writable. Thus the page table entry should accurately reflect the
> > > > > enclave page permissions.
> > > > > 
> > > > > Do not blindly accept VMA permissions on a page fault due to a write
> > > > > attempt to a present PTE. Install a pfn_mkwrite() handler that ensures
> > > > > that the VMA permissions agree with the enclave permissions in this
> > > > > regard.
> > > > > 
> > > > > Considering the same scenario as above after this change results in
> > > > > the following behavior change:
> > > > > 
> > > > > Q.
> > > > >    What will user space observe when an attempt is made to write to the
> > > > >    enclave page from within the enclave?
> > > > > 
> > > > > A.
> > > > >    Initially the page table entry is not present so the following is
> > > > >    observed:
> > > > >    1) Instruction writing to enclave page is run from within the enclave.
> > > > >    2) A page fault with second and third bits set (0x6) is encountered
> > > > >       and handled by the SGX handler sgx_vma_fault() that installs a
> > > > >       read-only page table entry following previous patch that installs
> > > > >       page table entry with permissions that VMA and enclave agree on
> > > > >       (read-only in this case).
> > > > >    3) Instruction writing to enclave page is re-attempted.
> > > > >    4) A page fault with first three bits set (0x7) is encountered and
> > > > >       passed to the pfn_mkwrite() handler for consideration. The handler
> > > > >       determines that the page should not be writable and returns SIGBUS.
> > > > >    5) Typically such a fault will be passed on to an application with a
> > > > >       signal but if the enclave is entered with the vDSO function provided
> > > > >       by the kernel then user space does not receive a signal but instead
> > > > >       the vDSO function returns successfully with exception information
> > > > >       (vector=14, error code=0x7, and address) within the exception fields
> > > > >       within the vDSO function's struct sgx_enclave_run.
> > > > > 
> > > > > The accurate exception information supports the SGX runtime, which is
> > > > > virtually always implemented inside a shared library, by providing
> > > > > accurate information in support of its management of the SGX enclave.
> > > > 
> > > > This QA-format is not a great idea, as it kind of tells what are the legit
> > > > questions to ask.
> > > 
> > > I will remove the QA-format and just describe the two (before/after)
> > > scenarios.
> > > 
> > > > You should describe what the patch does and what are the
> > > > legit reasons for doing that. Unfortunately, in the current form it is very
> > > > hard to get grip of this patch.
> > > 
> > > That was the goal of the summary (the first paragraph) at the start of
> > > the changelog. Could you please elaborate how you would like me to
> > > improve it?
> > 
> > If I do a search "mktme" through the commit message, it gives
> > me zero results.
> 
> Could you please elaborate why you expect "mktme" to show up in the commit
> message?

I'm sorry, my mistake doubled: I meant to write mkwrite, and yes, its use was
well described in the commit message.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-13 22:10           ` Reinette Chatre
@ 2021-12-28 14:52             ` Jarkko Sakkinen
  2022-01-06 17:46               ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-28 14:52 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Andy Lutomirski, dave.hansen, tglx, bp, mingo, linux-sgx, x86,
	seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
> > On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
> > > > This is a valid question. Since EMODPE exists why not just make things for
> > > > EMODPE, and ignore EMODPR altogether?
> > > > 
> > > 
> > > I believe that we should support the best practice of principle of least
> > > privilege - once a page no longer needs a particular permission there
> > > should be a way to remove it (the unneeded permission).
> > 
> > What if EMODPR was not used at all, since EMODPE is there anyway?
> 
> EMODPR and EMODPE are not equivalent.
> 
> EMODPE can only be used to "extend"/relax permissions while EMODPR can only
> be used to restrict permissions.
> 
> Notice in the EMODPE instruction reference of the SDM:
> 
> (* Update EPCM permissions *)
> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
> 
> So, when using EMODPE it is only possible to add permissions, not remove
> permissions.
> 
> If a user wants to remove permissions from an EPCM page it is only possible
> when using EMODPR. Notice in its instruction reference found in the SDM how
> it in turn can only be used to restrict permissions:
> 
> (* Update EPCM permissions *)
> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;

OK, so the question is: do we need both or would a mechanism just to extend
permissions be sufficient?

/Jarkko


* Re: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2021-12-13 22:11         ` Reinette Chatre
@ 2021-12-28 14:55           ` Jarkko Sakkinen
  2022-01-06 17:46             ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-28 14:55 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Dec 13, 2021 at 02:11:26PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/10/2021 11:52 PM, Jarkko Sakkinen wrote:
> > On Mon, 2021-12-06 at 13:30 -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 12/4/2021 2:59 PM, Jarkko Sakkinen wrote:
> > > > On Wed, Dec 01, 2021 at 11:23:05AM -0800, Reinette Chatre wrote:
> > > > > The SGX reclaimer removes page table entries pointing to pages that are
> > > > > moved to swap. SGX2 enables changes to pages belonging to an initialized
> > > > > enclave, for example changing page permissions. Supporting SGX2 requires
> > > > > this ability to remove page table entries that is available in the
> > > > > SGX reclaimer code.
> > > > 
> > > > > Missing: why SGX2 requires this?
> > > 
> > > The above paragraph states that SGX2 needs to remove page table entries
> > > because it modifies page permissions. Could you please elaborate what is
> > > missing?
> > 
> > It does not say why SGX2 requires an ability to remove page table entries.
> 
> Are you saying that modification of EPCM page permissions is not a reason to
> remove page table entries pointing to those pages?

So you have:

"Supporting SGX2 requires this ability to remove page table entries that is
available in the SGX reclaimer code"

Just write down where you need this ability (briefly).

/Jarkko

* Re: [PATCH 10/25] x86/sgx: Support enclave page permission changes
  2021-12-13 22:12         ` Reinette Chatre
@ 2021-12-28 14:56           ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-28 14:56 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Dec 13, 2021 at 02:12:44PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/10/2021 11:57 PM, Jarkko Sakkinen wrote:
> > On Mon, 2021-12-06 at 13:42 -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 12/4/2021 3:08 PM, Jarkko Sakkinen wrote:
> > > > On Wed, Dec 01, 2021 at 11:23:08AM -0800, Reinette Chatre wrote:
> > > > > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > > > > created with permissions that support all usages of the pages, from the
> > > > > time the enclave is initialized until it is unloaded. For example,
> > > > > pages used by a JIT compiler or when code needs to otherwise be
> > > > > relocated need to always have RWX permissions.
> > > > > 
> > > > > SGX2 includes two functions that can be used to modify the enclave page
> > > > > permissions of regular enclave pages within an initialized enclave.
> > > > > ENCLS[EMODPR] is run from the OS and used to restrict enclave page
> > > > > permissions while ENCLU[EMODPE] is run from within the enclave to
> > > > > extend enclave page permissions.
> > > > > 
> > > > > Enclave page permission changes need to be approached with care and
> > > > > for this reason this initial support is to allow enclave page
> > > > > permission changes _only_ if the new permissions are the same or
> > > > > more restrictive that the permissions originally vetted at the time the
> > > > > pages were added to the enclave. Support for extending enclave page
> > > > > permissions beyond what was originally vetted is deferred.
> > > > 
> > > > This paragraph is out-of-scope for a commit message. You could have
> > > > this in the cover letter but not here. I would just remove it.
> > > 
> > > I think this is essential information that is mentioned in the cover
> > > letter _and_ in this changelog. I will follow Dave's guidance and avoid
> > > "deferred" by just removing that last sentence.
> > > 
> > > > 
> > > > > Whether enclave page permissions are restricted or extended it
> > > > > is necessary to ensure that the page table entries and enclave page
> > > > > permissions are in sync. Introduce a new ioctl, SGX_IOC_PAGE_MODP, to
> > > > 
> > > > SGX_IOC_PAGE_MODP does not match the naming convetion of these:
> > > > 
> > > > * SGX_IOC_ENCLAVE_CREATE
> > > > * SGX_IOC_ENCLAVE_ADD_PAGES
> > > > * SGX_IOC_ENCLAVE_INIT
> > > 
> > > ah - my understanding was that the SGX_IOC_ENCLAVE prefix related to
> > > operations related to the entire enclave and thus I introduced the
> > > prefix SGX_IOC_PAGE to relate to operations on pages within an enclave.
> > 
> > SGX_IOC_ENCLAVE_ADD_PAGES is also operation working on pages within an
> > enclave.
> > 
> > Also, to be aligned with SGX_IOC_ENCLAVE_ADD_PAGES, the new operations
> > should also take secinfo as input.
> 
> ok, will do.
> 
> > 
> > > 
> > > > 
> > > > A better name would be SGX_IOC_ENCLAVE_MOD_PROTECTIONS. It doesn't
> > > > do harm to be a more verbose.
> > > 
> > > Will do. I see later you propose SGX_IOC_ENCLAVE_MODIFY_TYPE - would you
> > > like them to be consistent wrt MOD/MODIFY?
> > 
> > I would considering introducing just one new ioctl:
> > 
> >    SGX_IOC_ENCLAVE_MODIFY_PAGES
> > 
> > and choose either operations based on e.g. a flag
> > (see flags field SGX_IOC_ENCLAVE_ADD_PAGES).
> > 
> 
> There seems to be different opinion about the single ioctl() as per:https://lore.kernel.org/lkml/0fb14185-5cc3-a963-253d-2e119b4a52bb@intel.com/
> 
> I thus plan to proceed with the two ioctls, both taking secinfo as input.
> Would that be ok with you?

Yeah, let's continue with two ioctls for now, I agree.

/Jarkko

* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-13 22:12         ` Reinette Chatre
@ 2021-12-28 14:57           ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2021-12-28 14:57 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Dec 13, 2021 at 02:12:57PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/11/2021 12:00 AM, Jarkko Sakkinen wrote:
> > On Mon, 2021-12-06 at 13:44 -0800, Reinette Chatre wrote:
> > > On 12/4/2021 3:13 PM, Jarkko Sakkinen wrote:
> > > > On Wed, Dec 01, 2021 at 11:23:11AM -0800, Reinette Chatre wrote:
> 
> ...
> 
> > > > > Accessing an uninitialized address from outside the enclave also triggers
> > > > > this flow but the page will remain in PENDING state until accepted from
> > > > > within the enclave.
> > > > 
> > > > What does it mean being in PENDING state, and more imporantly, what is
> > > > PENDING state? What does a memory access within enclave cause when it
> > > > touch a page within this state?
> > > 
> > > The PENDING state is the enclave page state from the SGX hardware's
> > > perspective. The OS uses the ENCLS[EAUG] SGX2 function to add a new page
> > > to the enclave but from the SGX hardware's perspective it would be in a
> > > PENDING state until the enclave accepts the page. An access to the page
> > > in PENDING state would result in a page fault.
> > > 
> > > 
> > > > I see a lot of text in the commit message but zero mentions about EPCM
> > > > expect this one sudden mention about PENDING field without attaching
> > > > it to anything concrete.
> > > 
> > > My apologies - I will add this to this changelog. This matches your
> > > request to describe the __eaug() wrapper introduced in patch 02/25.
> > > Would you like me to duplicate this information here and in that patch
> > > (a new patch dedicated to the __eaug() wrapper) or would you be ok if I
> > > introduce the wrappers all together briefly as in the example you
> > > provide and then detail the flows where the wrappers are used - like
> > > this patch?
> > 
> > I think it would be a good place to describe these details in 02/25,
> > and skip them in rest of the patches.
> > 
> 
> Will do. I do think describing this amount of detail for the new SGX2
> functions would be too much for a single patch so I currently plan to split
> that (02/25) patch into a new patch per SGX2 instruction. Is that ok with
> you or would you like to keep it in a single patch?

It's ok for me.
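
For readers following the PENDING discussion quoted above, the state
transitions can be sketched as a toy model in C. This is only an
illustration under simplifying assumptions: the struct and the toy_*
helpers are hypothetical names, not kernel or SDM interfaces, and the real
EPCM holds more state than is shown here.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one EPCM entry. */
struct toy_pending_page {
	bool valid;	/* page has been added to the enclave */
	bool pending;	/* added with EAUG, not yet accepted by the enclave */
};

/* OS side: model of ENCLS[EAUG]. The new page starts out PENDING. */
static void toy_eaug(struct toy_pending_page *p)
{
	p->valid = true;
	p->pending = true;
}

/* Enclave side: model of ENCLU[EACCEPT]. Clears PENDING on success. */
static bool toy_eaccept_pending(struct toy_pending_page *p)
{
	if (!p->valid || !p->pending)
		return false;
	p->pending = false;
	return true;
}

/* Returns true if an access succeeds, false if it would page fault. */
static bool toy_access_ok(const struct toy_pending_page *p)
{
	return p->valid && !p->pending;
}
```

In this model an access between toy_eaug() and toy_eaccept_pending()
faults, which matches the description that a page stays unusable until the
enclave accepts it.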

/Jarkko

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2021-12-28 14:52             ` Jarkko Sakkinen
@ 2022-01-06 17:46               ` Reinette Chatre
  2022-01-07 12:16                 ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-06 17:46 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, dave.hansen, tglx, bp, mingo, linux-sgx, x86,
	seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 12/28/2021 6:52 AM, Jarkko Sakkinen wrote:
> On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
>>> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
>>>>> This is a valid question. Since EMODPE exists why not just make things for
>>>>> EMODPE, and ignore EMODPR altogether?
>>>>>
>>>>
>>>> I believe that we should support the best practice of principle of least
>>>> privilege - once a page no longer needs a particular permission there
>>>> should be a way to remove it (the unneeded permission).
>>>
>>> What if EMODPR was not used at all, since EMODPE is there anyway?
>>
>> EMODPR and EMODPE are not equivalent.
>>
>> EMODPE can only be used to "extend"/relax permissions while EMODPR can only
>> be used to restrict permissions.
>>
>> Notice in the EMODPE instruction reference of the SDM:
>>
>> (* Update EPCM permissions *)
>> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
>> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
>> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
>>
>> So, when using EMODPE it is only possible to add permissions, not remove
>> permissions.
>>
>> If a user wants to remove permissions from an EPCM page it is only possible
>> when using EMODPR. Notice in its instruction reference found in the SDM how
>> it in turn can only be used to restrict permissions:
>>
>> (* Update EPCM permissions *)
>> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
>> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
>> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
> 
> OK, so the question is: do we need both or would a mechanism just to extend
> permissions be sufficient?

I do believe that we need both in order to support pages having only
the permissions required to support their intended use during the time the
particular access is required. While technically it is possible to grant
pages all permissions they may need during their lifetime it is safer to
remove permissions when no longer required.
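
To make the difference between the two SDM snippets quoted above concrete,
here is a minimal sketch in C that models only the "Update EPCM
permissions" step of each instruction. The model_* function names are made
up for illustration, and all the other checks the instructions perform are
ignored. The bit positions follow SECINFO.FLAGS (R is bit 0, W is bit 1,
X is bit 2).

```c
#include <stdint.h>

#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Toy model of the EMODPE update: each EPCM bit is OR-ed with the request. */
static inline uint8_t model_emodpe(uint8_t epcm, uint8_t requested)
{
	return (uint8_t)(epcm | requested);
}

/* Toy model of the EMODPR update: each EPCM bit is AND-ed with the request. */
static inline uint8_t model_emodpr(uint8_t epcm, uint8_t requested)
{
	return (uint8_t)(epcm & requested);
}
```

With an RW page, model_emodpe() with a request of R leaves the page RW
(nothing can be removed), while model_emodpr() with a request of R|X
yields R only (nothing can be added).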

Reinette 


* Re: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2021-12-28 14:55           ` Jarkko Sakkinen
@ 2022-01-06 17:46             ` Reinette Chatre
  2022-01-07 12:26               ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-06 17:46 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 12/28/2021 6:55 AM, Jarkko Sakkinen wrote:
> On Mon, Dec 13, 2021 at 02:11:26PM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/10/2021 11:52 PM, Jarkko Sakkinen wrote:
>>> On Mon, 2021-12-06 at 13:30 -0800, Reinette Chatre wrote:
>>>> Hi Jarkko,
>>>>
>>>> On 12/4/2021 2:59 PM, Jarkko Sakkinen wrote:
>>>>> On Wed, Dec 01, 2021 at 11:23:05AM -0800, Reinette Chatre wrote:
>>>>>> The SGX reclaimer removes page table entries pointing to pages that are
>>>>>> moved to swap. SGX2 enables changes to pages belonging to an initialized
>>>>>> enclave, for example changing page permissions. Supporting SGX2 requires
>>>>>> this ability to remove page table entries that is available in the
>>>>>> SGX reclaimer code.
>>>>>
>>>>> Missing: why SGX2 requirest this?
>>>>
>>>> The above paragraph states that SGX2 needs to remove page table entries
>>>> because it modifies page permissions. Could you please elaborate what is
>>>> missing?
>>>
>>> It does not say why SGX2 requires an ability to remove page table entries.
>>
>> Are you saying that modification of EPCM page permissions is not a reason to
>> remove page table entries pointing to those pages?
> 
> So you have:
> 
> "Supporting SGX2 requires this ability to remove page table entries that is
> available in the SGX reclaimer code"
> 
> Just write down where you need this ability (briefly).

Will do. I will expand the current permission changing text and also add the need
for this ability when regular pages are changed to TCS pages. TCS pages may not
be accessed by enclave code so when a regular page becomes a TCS page any page
table entries pointing to it should be removed.
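
As a rough illustration of why the PTEs have to go when permissions change,
here is a toy model in C. It is not the kernel implementation: the struct
and the toy_* helpers are hypothetical, only the permission-restriction
case is modelled (not the TCS case), and the fault handler is assumed to
simply mirror the EPCM permissions into the PTE.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Hypothetical enclave page with hardware (EPCM) and page table state. */
struct toy_enclave_page {
	uint8_t epcm_perms;	/* permissions enforced by the SGX hardware */
	uint8_t pte_perms;	/* permissions in the modelled PTE */
	bool pte_present;
};

/* Remove the modelled PTE so that the next access faults. */
static void toy_zap_pte(struct toy_enclave_page *p)
{
	p->pte_present = false;
	p->pte_perms = 0;
}

/* Restrict EPCM permissions and drop the now stale PTE. */
static void toy_restrict_perms(struct toy_enclave_page *p, uint8_t requested)
{
	p->epcm_perms &= requested;
	toy_zap_pte(p);
}

/* Fault path: reinstall the PTE using the current permissions. */
static void toy_handle_fault(struct toy_enclave_page *p)
{
	if (!p->pte_present) {
		p->pte_perms = p->epcm_perms;
		p->pte_present = true;
	}
}
```

Without the zap in toy_restrict_perms() the PTE would keep advertising the
old, wider permissions, so the page table and the EPCM would be out of
sync until something else removed the entry.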

Reinette

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-06 17:46               ` Reinette Chatre
@ 2022-01-07 12:16                 ` Jarkko Sakkinen
  2022-01-07 16:14                   ` Haitao Huang
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-07 12:16 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Andy Lutomirski, dave.hansen, tglx, bp, mingo, linux-sgx, x86,
	seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Thu, Jan 06, 2022 at 09:46:06AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/28/2021 6:52 AM, Jarkko Sakkinen wrote:
> > On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> >>
> >> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
> >>> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
> >>>>> This is a valid question. Since EMODPE exists why not just make things for
> >>>>> EMODPE, and ignore EMODPR altogether?
> >>>>>
> >>>>
> >>>> I believe that we should support the best practice of principle of least
> >>>> privilege - once a page no longer needs a particular permission there
> >>>> should be a way to remove it (the unneeded permission).
> >>>
> >>> What if EMODPR was not used at all, since EMODPE is there anyway?
> >>
> >> EMODPR and EMODPE are not equivalent.
> >>
> >> EMODPE can only be used to "extend"/relax permissions while EMODPR can only
> >> be used to restrict permissions.
> >>
> >> Notice in the EMODPE instruction reference of the SDM:
> >>
> >> (* Update EPCM permissions *)
> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
> >>
> >> So, when using EMODPE it is only possible to add permissions, not remove
> >> permissions.
> >>
> >> If a user wants to remove permissions from an EPCM page it is only possible
> >> when using EMODPR. Notice in its instruction reference found in the SDM how
> >> it in turn can only be used to restrict permissions:
> >>
> >> (* Update EPCM permissions *)
> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
> > 
> > OK, so the question is: do we need both or would a mechanism just to extend
> > permissions be sufficient?
> 
> I do believe that we need both in order to support pages having only
> the permissions required to support their intended use during the time the
> particular access is required. While technically it is possible to grant
> pages all permissions they may need during their lifetime it is safer to
> remove permissions when no longer required.

So if we imagine a run-time: how would EMODPR be useful, and how would using
it make things safer?

/Jarkko

* Re: [PATCH 07/25] x86/sgx: Move PTE zap code to separate function
  2022-01-06 17:46             ` Reinette Chatre
@ 2022-01-07 12:26               ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-07 12:26 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Thu, Jan 06, 2022 at 09:46:35AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 12/28/2021 6:55 AM, Jarkko Sakkinen wrote:
> > On Mon, Dec 13, 2021 at 02:11:26PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> >>
> >> On 12/10/2021 11:52 PM, Jarkko Sakkinen wrote:
> >>> On Mon, 2021-12-06 at 13:30 -0800, Reinette Chatre wrote:
> >>>> Hi Jarkko,
> >>>>
> >>>> On 12/4/2021 2:59 PM, Jarkko Sakkinen wrote:
> >>>>> On Wed, Dec 01, 2021 at 11:23:05AM -0800, Reinette Chatre wrote:
> >>>>>> The SGX reclaimer removes page table entries pointing to pages that are
> >>>>>> moved to swap. SGX2 enables changes to pages belonging to an initialized
> >>>>>> enclave, for example changing page permissions. Supporting SGX2 requires
> >>>>>> this ability to remove page table entries that is available in the
> >>>>>> SGX reclaimer code.
> >>>>>
> >>>>> Missing: why SGX2 requirest this?
> >>>>
> >>>> The above paragraph states that SGX2 needs to remove page table entries
> >>>> because it modifies page permissions. Could you please elaborate what is
> >>>> missing?
> >>>
> >>> It does not say why SGX2 requires an ability to remove page table entries.
> >>
> >> Are you saying that modification of EPCM page permissions is not a reason to
> >> remove page table entries pointing to those pages?
> > 
> > So you have:
> > 
> > "Supporting SGX2 requires this ability to remove page table entries that is
> > available in the SGX reclaimer code"
> > 
> > Just write down where you need this ability (briefly).
> 
> Will do. I will expand the current permission changing text and also add the need
> for this ability when regular pages are changed to TCS pages. TCS pages may not
> be accessed by enclave code so when a regular page becomes a TCS page any page
> table entries pointing to it should be removed.

Thank you, sounds good.

BR,
Jarkko

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-07 12:16                 ` Jarkko Sakkinen
@ 2022-01-07 16:14                   ` Haitao Huang
  2022-01-08 15:45                     ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Haitao Huang @ 2022-01-07 16:14 UTC (permalink / raw)
  To: Reinette Chatre, Jarkko Sakkinen
  Cc: Andy Lutomirski, dave.hansen, tglx, bp, mingo, linux-sgx, x86,
	seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Fri, 07 Jan 2022 06:16:21 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Thu, Jan 06, 2022 at 09:46:06AM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 12/28/2021 6:52 AM, Jarkko Sakkinen wrote:
>> > On Mon, Dec 13, 2021 at 02:10:17PM -0800, Reinette Chatre wrote:
>> >> Hi Jarkko,
>> >>
>> >> On 12/10/2021 11:42 PM, Jarkko Sakkinen wrote:
>> >>> On Mon, 2021-12-06 at 13:20 -0800, Reinette Chatre wrote:
>> >>>>> This is a valid question. Since EMODPE exists why not just make  
>> things for
>> >>>>> EMODPE, and ignore EMODPR altogether?
>> >>>>>
>> >>>>
>> >>>> I believe that we should support the best practice of principle of  
>> least
>> >>>> privilege - once a page no longer needs a particular permission  
>> there
>> >>>> should be a way to remove it (the unneeded permission).
>> >>>
>> >>> What if EMODPR was not used at all, since EMODPE is there anyway?
>> >>
>> >> EMODPR and EMODPE are not equivalent.
>> >>
>> >> EMODPE can only be used to "extend"/relax permissions while EMODPR  
>> can only
>> >> be used to restrict permissions.
>> >>
>> >> Notice in the EMODPE instruction reference of the SDM:
>> >>
>> >> (* Update EPCM permissions *)
>> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R | SCRATCH_SECINFO.FLAGS.R;
>> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W | SCRATCH_SECINFO.FLAGS.W;
>> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X | SCRATCH_SECINFO.FLAGS.X;
>> >>
>> >> So, when using EMODPE it is only possible to add permissions, not  
>> remove
>> >> permissions.
>> >>
>> >> If a user wants to remove permissions from an EPCM page it is only  
>> possible
>> >> when using EMODPR. Notice in its instruction reference found in the  
>> SDM how
>> >> it in turn can only be used to restrict permissions:
>> >>
>> >> (* Update EPCM permissions *)
>> >> EPCM(DS:RCX).R := EPCM(DS:RCX).R & SCRATCH_SECINFO.FLAGS.R;
>> >> EPCM(DS:RCX).W := EPCM(DS:RCX).W & SCRATCH_SECINFO.FLAGS.W;
>> >> EPCM(DS:RCX).X := EPCM(DS:RCX).X & SCRATCH_SECINFO.FLAGS.X;
>> >
>> > OK, so the question is: do we need both or would a mechanism just to  
>> extend
>> > permissions be sufficient?
>>
>> I do believe that we need both in order to support pages having only
>> the permissions required to support their intended use during the time  
>> the
>> particular access is required. While technically it is possible to grant
>> pages all permissions they may need during their lifetime it is safer to
>> remove permissions when no longer required.
>
> So if we imagine a run-time: how EMODPR would be useful, and how using it
> would make things safer?
>
In scenarios of JIT compilers, once code is generated into RW pages,  
modifying both PTE and EPCM permissions to RX would be a good defensive  
measure. In that case, EMODPR is useful.

Haitao

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-07 16:14                   ` Haitao Huang
@ 2022-01-08 15:45                     ` Jarkko Sakkinen
  2022-01-08 15:51                       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-08 15:45 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > OK, so the question is: do we need both or would a mechanism just
> > > to extend
> > > > permissions be sufficient?
> > > 
> > > I do believe that we need both in order to support pages having only
> > > the permissions required to support their intended use during the
> > > time the
> > > particular access is required. While technically it is possible to grant
> > > pages all permissions they may need during their lifetime it is safer to
> > > remove permissions when no longer required.
> > 
> > So if we imagine a run-time: how EMODPR would be useful, and how using it
> > would make things safer?
> > 
> In scenarios of JIT compilers, once code is generated into RW pages,
> modifying both PTE and EPCM permissions to RX would be a good defensive
> measure. In that case, EMODPR is useful.

What is the exact threat we are talking about?

/Jarkko

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-08 15:45                     ` Jarkko Sakkinen
@ 2022-01-08 15:51                       ` Jarkko Sakkinen
  2022-01-08 16:22                         ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-08 15:51 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > OK, so the question is: do we need both or would a mechanism just
> > > > to extend
> > > > > permissions be sufficient?
> > > > 
> > > > I do believe that we need both in order to support pages having only
> > > > the permissions required to support their intended use during the
> > > > time the
> > > > particular access is required. While technically it is possible to grant
> > > > pages all permissions they may need during their lifetime it is safer to
> > > > remove permissions when no longer required.
> > > 
> > > So if we imagine a run-time: how EMODPR would be useful, and how using it
> > > would make things safer?
> > > 
> > In scenarios of JIT compilers, once code is generated into RW pages,
> > modifying both PTE and EPCM permissions to RX would be a good defensive
> > measure. In that case, EMODPR is useful.
> 
> What is the exact threat we are talking about?

To add: it should be a *significantly* critical threat, given that
supporting only EAUG would leave us with only one complex call pattern with
EACCEPT involvement.

I'd even go as far as to suggest leaving EMODPR out of the patch set, and
introducing it when there is PoC code for any of the existing run-times that
demonstrates the demand for it. Right now this is way too speculative.

Supporting EMODPE is IMHO by factors more critical.

/Jarkko

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-08 15:51                       ` Jarkko Sakkinen
@ 2022-01-08 16:22                         ` Jarkko Sakkinen
  2022-01-10 22:05                           ` Haitao Huang
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-08 16:22 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > OK, so the question is: do we need both or would a mechanism just
> > > > > to extend
> > > > > > permissions be sufficient?
> > > > > 
> > > > > I do believe that we need both in order to support pages having only
> > > > > the permissions required to support their intended use during the
> > > > > time the
> > > > > particular access is required. While technically it is possible to grant
> > > > > pages all permissions they may need during their lifetime it is safer to
> > > > > remove permissions when no longer required.
> > > > 
> > > > So if we imagine a run-time: how EMODPR would be useful, and how using it
> > > > would make things safer?
> > > > 
> > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > modifying both PTE and EPCM permissions to RX would be a good defensive
> > > measure. In that case, EMODPR is useful.
> > 
> > What is the exact threat we are talking about?
> 
> To add: it should be *significantly* critical thread, given that not
> supporting only EAUG would leave us only one complex call pattern with
> EACCEPT involvement.
> 
> I'd even go to suggest to leave EMODPR out of the patch set, and introduce
> it when there is PoC code for any of the existing run-time that
> demonstrates the demand for it. Right now this way too speculative.
> 
> Supporting EMODPE is IMHO by factors more critical.

At least it does not protect against enclave code, because an enclave can
always choose not to EACCEPT any of the EMODPR requests. I'm confused here
not only about the actual threat but also about the potential adversary and
target.

/Jarkko

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-08 16:22                         ` Jarkko Sakkinen
@ 2022-01-10 22:05                           ` Haitao Huang
  2022-01-11  1:53                             ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Haitao Huang @ 2022-01-10 22:05 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>> > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>> > > > > > OK, so the question is: do we need both or would a mechanism  
>> just
>> > > > > to extend
>> > > > > > permissions be sufficient?
>> > > > >
>> > > > > I do believe that we need both in order to support pages having  
>> only
>> > > > > the permissions required to support their intended use during  
>> the
>> > > > > time the
>> > > > > particular access is required. While technically it is possible  
>> to grant
>> > > > > pages all permissions they may need during their lifetime it is  
>> safer to
>> > > > > remove permissions when no longer required.
>> > > >
>> > > > So if we imagine a run-time: how EMODPR would be useful, and how  
>> using it
>> > > > would make things safer?
>> > > >
>> > > In scenarios of JIT compilers, once code is generated into RW pages,
>> > > modifying both PTE and EPCM permissions to RX would be a good  
>> defensive
>> > > measure. In that case, EMODPR is useful.
>> >
>> > What is the exact threat we are talking about?
>>
>> To add: it should be *significantly* critical thread, given that not
>> supporting only EAUG would leave us only one complex call pattern with
>> EACCEPT involvement.
>>
>> I'd even go to suggest to leave EMODPR out of the patch set, and  
>> introduce
>> it when there is PoC code for any of the existing run-time that
>> demonstrates the demand for it. Right now this way too speculative.
>>
>> Supporting EMODPE is IMHO by factors more critical.
>
> At least it does not protected against enclave code because an enclave  
> can
> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> confused here about the actual threat but also the potential adversary  
> and
> target.
>
I'm not sure I follow your thoughts here. The sequence should be for the
enclave to request EMODPR in the first place, through the runtime to the
kernel, and then to verify with EACCEPT that the OS has indeed done the
EMODPR. If the enclave does not verify with EACCEPT, then its own code has a
vulnerability. But this does not justify the OS not providing the mechanism
to request EMODPR.
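
The request-then-verify sequence described above can be sketched as a toy
model in C. This is an illustration under assumptions, not the real
instruction semantics: the struct and the toy_* helpers are hypothetical,
and EACCEPT is reduced to "the PR flag is set and the permissions match
what the enclave expects".

```c
#include <stdbool.h>
#include <stdint.h>

#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Hypothetical, simplified EPCM entry: permission bits plus the PR flag. */
struct toy_epcm {
	uint8_t perms;
	bool pr;	/* a permission restriction is awaiting EACCEPT */
};

/* OS side: model of ENCLS[EMODPR]. Restrict permissions and set PR. */
static void toy_emodpr(struct toy_epcm *e, uint8_t requested)
{
	e->perms &= requested;
	e->pr = true;
}

/*
 * Enclave side: model of ENCLU[EACCEPT]. Succeeds only if the OS really
 * restricted the page to the permissions the enclave asked for.
 */
static bool toy_eaccept(struct toy_epcm *e, uint8_t expected)
{
	if (!e->pr || e->perms != expected)
		return false;
	e->pr = false;
	return true;
}
```

If the OS skips toy_emodpr(), toy_eaccept() fails in this model, which is
the point of the verification step: the enclave learns whether the
restriction it asked for actually took place.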

Similar to how we don't want to have RWX code pages for a normal Linux
application, when an enclave loads code pages (either directly or JIT
compiled from high-level code) into an EAUG'd page (which has RW), we do not
want to leave the pages RWX for the code to be executable, hence the need to
request that the OS use EMODPR to reduce the permissions to RX once the code
is ready to execute.
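
For comparison, the normal Linux application case mentioned above usually
looks something like the following sketch, which uses the standard mmap()
and mprotect() calls. The helper name load_code_wx() is made up for this
example, and the code is deliberately never executed here.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Copy len bytes of generated code into a fresh anonymous mapping that is
 * writable but not executable, then switch the mapping to read+execute.
 * Returns the mapping (and its size in *map_len), or NULL on failure. The
 * mapping is never writable and executable at the same time.
 */
static void *load_code_wx(const void *code, size_t len, size_t *map_len)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t size;
	void *buf;

	if (page <= 0 || len == 0)
		return NULL;

	/* Round the length up to a whole number of pages. */
	size = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

	buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;

	memcpy(buf, code, len);

	/* Drop write permission before the code becomes executable. */
	if (mprotect(buf, size, PROT_READ | PROT_EXEC) != 0) {
		munmap(buf, size);
		return NULL;
	}

	*map_len = size;
	return buf;
}
```

The enclave case differs in that the page table permissions alone are not
enough: the EPCM permissions have to be reduced as well, which is where the
EMODPR request comes in.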

I believe this is needed for LibOS runtimes (e.g., Gramine) loading
unmodified app binaries, or an enclave with a JIT compiler (I think Enarx is
in this category?). Experts from those projects can confirm or contradict.
The Intel SDK currently also has an implementation to reduce the permissions
of RELRO sections in ELF binaries to read-only after relocation is done. In
our new EDMM user support [1] based on this patch series, we also support
flows to reduce permissions using EMODPR in a generic way.

[1] https://github.com/intel/linux-sgx/pull/751

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-10 22:05                           ` Haitao Huang
@ 2022-01-11  1:53                             ` Jarkko Sakkinen
  2022-01-11  1:55                               ` Jarkko Sakkinen
  2022-01-11 17:13                               ` Reinette Chatre
  0 siblings, 2 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-11  1:53 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
> 
> > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > OK, so the question is: do we need both or would a
> > > mechanism just
> > > > > > > to extend
> > > > > > > > permissions be sufficient?
> > > > > > >
> > > > > > > I do believe that we need both in order to support pages
> > > having only
> > > > > > > the permissions required to support their intended use
> > > during the
> > > > > > > time the
> > > > > > > particular access is required. While technically it is
> > > possible to grant
> > > > > > > pages all permissions they may need during their lifetime it
> > > is safer to
> > > > > > > remove permissions when no longer required.
> > > > > >
> > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > how using it
> > > > > > would make things safer?
> > > > > >
> > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > defensive
> > > > > measure. In that case, EMODPR is useful.
> > > >
> > > > What is the exact threat we are talking about?
> > > 
> > > To add: it should be *significantly* critical thread, given that not
> > > supporting only EAUG would leave us only one complex call pattern with
> > > EACCEPT involvement.
> > > 
> > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > introduce
> > > it when there is PoC code for any of the existing run-time that
> > > demonstrates the demand for it. Right now this way too speculative.
> > > 
> > > Supporting EMODPE is IMHO by factors more critical.
> > 
> > At least it does not protected against enclave code because an enclave
> > can
> > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > confused here about the actual threat but also the potential adversary
> > and
> > target.
> > 
> I'm not sure I follow your thoughts here. The sequence should be for enclave
> to request  EMODPR in the first place through runtime to kernel, then to
> verify with EACCEPT that the OS indeed has done EMODPR.
> If enclave does not verify with EACCEPT, then its own code has
> vulnerability. But this does not justify OS not providing the mechanism to
> request EMODPR.

The question is really simple: what is the threat scenario? In order to use
the word "vulnerability", you would need one.

Given the complexity of the whole dance with EMODPR it is mandatory to have
one, in order to ack it to the mainline.

> Similar to how we don't want have RWX code pages for normal Linux
> application, when an enclave loads code pages (either directly or JIT
> compiled from high level code ) into EAUG'd page (which has RW), we do not
> want leave pages to be RWX for code to be executable, hence the need of
> EMODPR request OS to reduce the permissions to RX once the code is ready to
> execute.

You cannot compare *enforced* permissions outside the enclave, and claim that
they would be equivalent to the permissions of the already sandboxed code
inside the enclave, with permissions that are not enforced but are based
on good will of the enclave code.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-11  1:53                             ` Jarkko Sakkinen
@ 2022-01-11  1:55                               ` Jarkko Sakkinen
  2022-01-11  2:03                                 ` Jarkko Sakkinen
  2022-01-11 17:13                               ` Reinette Chatre
  1 sibling, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-11  1:55 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > wrote:
> > 
> > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > > OK, so the question is: do we need both or would a
> > > > mechanism just
> > > > > > > > to extend
> > > > > > > > > permissions be sufficient?
> > > > > > > >
> > > > > > > > I do believe that we need both in order to support pages
> > > > having only
> > > > > > > > the permissions required to support their intended use
> > > > during the
> > > > > > > > time the
> > > > > > > > particular access is required. While technically it is
> > > > possible to grant
> > > > > > > > pages all permissions they may need during their lifetime it
> > > > is safer to
> > > > > > > > remove permissions when no longer required.
> > > > > > >
> > > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > > how using it
> > > > > > > would make things safer?
> > > > > > >
> > > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > > defensive
> > > > > > measure. In that case, EMODPR is useful.
> > > > >
> > > > > What is the exact threat we are talking about?
> > > > 
> > > > To add: it should be *significantly* critical thread, given that not
> > > > supporting only EAUG would leave us only one complex call pattern with
> > > > EACCEPT involvement.
> > > > 
> > > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > introduce
> > > > it when there is PoC code for any of the existing run-time that
> > > > demonstrates the demand for it. Right now this way too speculative.
> > > > 
> > > > Supporting EMODPE is IMHO by factors more critical.
> > > 
> > > At least it does not protected against enclave code because an enclave
> > > can
> > > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > confused here about the actual threat but also the potential adversary
> > > and
> > > target.
> > > 
> > I'm not sure I follow your thoughts here. The sequence should be for enclave
> > to request  EMODPR in the first place through runtime to kernel, then to
> > verify with EACCEPT that the OS indeed has done EMODPR.
> > If enclave does not verify with EACCEPT, then its own code has
> > vulnerability. But this does not justify OS not providing the mechanism to
> > request EMODPR.
> 
> The question is really simple: what is the threat scenario? In order to use
> the word "vulnerability", you would need one.
> 
> Given the complexity of the whole dance with EMODPR it is mandatory to have
> one, in order to ack it to the mainline.
> 
> > Similar to how we don't want have RWX code pages for normal Linux
> > application, when an enclave loads code pages (either directly or JIT
> > compiled from high level code ) into EAUG'd page (which has RW), we do not
> > want leave pages to be RWX for code to be executable, hence the need of
> > EMODPR request OS to reduce the permissions to RX once the code is ready to
> > execute.
> 
> You cannot compare *enforced* permissions outside the enclave, and claim that
> they would be equivalent to the permissions of the already sandboxed code
> inside the enclave, with permissions that are not enforced but are based
> on good will of the enclave code.

To add, you can already do "EMODPR" by simply adjusting VMA permissions to be
more restrictive. How would this be worse than this collaboration-based
thing?

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-11  1:55                               ` Jarkko Sakkinen
@ 2022-01-11  2:03                                 ` Jarkko Sakkinen
  2022-01-11  2:15                                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-11  2:03 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > wrote:
> > > 
> > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > > > OK, so the question is: do we need both or would a
> > > > > mechanism just
> > > > > > > > > to extend
> > > > > > > > > > permissions be sufficient?
> > > > > > > > >
> > > > > > > > > I do believe that we need both in order to support pages
> > > > > having only
> > > > > > > > > the permissions required to support their intended use
> > > > > during the
> > > > > > > > > time the
> > > > > > > > > particular access is required. While technically it is
> > > > > possible to grant
> > > > > > > > > pages all permissions they may need during their lifetime it
> > > > > is safer to
> > > > > > > > > remove permissions when no longer required.
> > > > > > > >
> > > > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > > > how using it
> > > > > > > > would make things safer?
> > > > > > > >
> > > > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > > > defensive
> > > > > > > measure. In that case, EMODPR is useful.
> > > > > >
> > > > > > What is the exact threat we are talking about?
> > > > > 
> > > > > To add: it should be *significantly* critical thread, given that not
> > > > > supporting only EAUG would leave us only one complex call pattern with
> > > > > EACCEPT involvement.
> > > > > 
> > > > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > > introduce
> > > > > it when there is PoC code for any of the existing run-time that
> > > > > demonstrates the demand for it. Right now this way too speculative.
> > > > > 
> > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > 
> > > > At least it does not protected against enclave code because an enclave
> > > > can
> > > > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > > confused here about the actual threat but also the potential adversary
> > > > and
> > > > target.
> > > > 
> > > I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > to request  EMODPR in the first place through runtime to kernel, then to
> > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > If enclave does not verify with EACCEPT, then its own code has
> > > vulnerability. But this does not justify OS not providing the mechanism to
> > > request EMODPR.
> > 
> > The question is really simple: what is the threat scenario? In order to use
> > the word "vulnerability", you would need one.
> > 
> > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > one, in order to ack it to the mainline.
> > 
> > > Similar to how we don't want have RWX code pages for normal Linux
> > > application, when an enclave loads code pages (either directly or JIT
> > > compiled from high level code ) into EAUG'd page (which has RW), we do not
> > > want leave pages to be RWX for code to be executable, hence the need of
> > > EMODPR request OS to reduce the permissions to RX once the code is ready to
> > > execute.
> > 
> > You cannot compare *enforced* permissions outside the enclave, and claim that
> > they would be equivalent to the permissions of the already sandboxed code
> > inside the enclave, with permissions that are not enforced but are based
> > on good will of the enclave code.
> 
> To add, you can already do "EMODPR" by simply adjusting VMA permissions to be
> more restrictive. How this would be worse than this collaboration based 
> thing?

... or you could even make a soft version of EMODPR without using that opcode
by writing an ioctl to update our xarray to allow lower permissions. That
ties the hands of the process that is doing the mmap() already.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-11  2:03                                 ` Jarkko Sakkinen
@ 2022-01-11  2:15                                   ` Jarkko Sakkinen
  2022-01-11  3:48                                     ` Haitao Huang
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-11  2:15 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > > wrote:
> > > > 
> > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > > > > > > > > OK, so the question is: do we need both or would a
> > > > > > mechanism just
> > > > > > > > > > to extend
> > > > > > > > > > > permissions be sufficient?
> > > > > > > > > >
> > > > > > > > > > I do believe that we need both in order to support pages
> > > > > > having only
> > > > > > > > > > the permissions required to support their intended use
> > > > > > during the
> > > > > > > > > > time the
> > > > > > > > > > particular access is required. While technically it is
> > > > > > possible to grant
> > > > > > > > > > pages all permissions they may need during their lifetime it
> > > > > > is safer to
> > > > > > > > > > remove permissions when no longer required.
> > > > > > > > >
> > > > > > > > > So if we imagine a run-time: how EMODPR would be useful, and
> > > > > > how using it
> > > > > > > > > would make things safer?
> > > > > > > > >
> > > > > > > > In scenarios of JIT compilers, once code is generated into RW pages,
> > > > > > > > modifying both PTE and EPCM permissions to RX would be a good
> > > > > > defensive
> > > > > > > > measure. In that case, EMODPR is useful.
> > > > > > >
> > > > > > > What is the exact threat we are talking about?
> > > > > > 
> > > > > > To add: it should be *significantly* critical thread, given that not
> > > > > > supporting only EAUG would leave us only one complex call pattern with
> > > > > > EACCEPT involvement.
> > > > > > 
> > > > > > I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > > > introduce
> > > > > > it when there is PoC code for any of the existing run-time that
> > > > > > demonstrates the demand for it. Right now this way too speculative.
> > > > > > 
> > > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > > 
> > > > > At least it does not protected against enclave code because an enclave
> > > > > can
> > > > > always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > > > confused here about the actual threat but also the potential adversary
> > > > > and
> > > > > target.
> > > > > 
> > > > I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > > to request  EMODPR in the first place through runtime to kernel, then to
> > > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > > If enclave does not verify with EACCEPT, then its own code has
> > > > vulnerability. But this does not justify OS not providing the mechanism to
> > > > request EMODPR.
> > > 
> > > The question is really simple: what is the threat scenario? In order to use
> > > the word "vulnerability", you would need one.
> > > 
> > > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > one, in order to ack it to the mainline.
> > > 
> > > > Similar to how we don't want have RWX code pages for normal Linux
> > > > application, when an enclave loads code pages (either directly or JIT
> > > > compiled from high level code ) into EAUG'd page (which has RW), we do not
> > > > want leave pages to be RWX for code to be executable, hence the need of
> > > > EMODPR request OS to reduce the permissions to RX once the code is ready to
> > > > execute.
> > > 
> > > You cannot compare *enforced* permissions outside the enclave, and claim that
> > > they would be equivalent to the permissions of the already sandboxed code
> > > inside the enclave, with permissions that are not enforced but are based
> > > on good will of the enclave code.
> > 
> > To add, you can already do "EMODPR" by simply adjusting VMA permissions to be
> > more restrictive. How this would be worse than this collaboration based 
> > thing?
> 
> ... or you could even make soft version of EMODPR without using that opcode
> by writing an ioctl to update our xarray to allow lower permissions. That
> ties the hands of the process who is doing the mmap() already. 

E.g. why not just

#define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_page_permissions)
#define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_extend_page_permissions)

struct sgx_enclave_restrict_page_permissions {
	__u64 src;
	__u64 offset;
	__u64 length;
	__u64 secinfo;
	__u64 count;
};
struct sgx_enclave_extend_page_permissions {
	__u64 src;
	__u64 offset;
	__u64 length;
	__u64 secinfo;
	__u64 count;
};

These would simply update the xarray and nothing else. I'd go with two
ioctls (with the necessary checks for secinfo) in order to provide hook
up points in the future for LSMs.

This leaves only EAUG and EMODT requiring the EACCEPT handshake.
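
The "update the xarray and nothing else" idea can be sketched as a small
user-space model. This is an assumption-laden illustration, not kernel code:
struct page_perm, restrict_perm() and extend_perm() are hypothetical names,
a plain struct stands in for the per-page xarray entry, and the "vetted
maximum" field reflects the cover letter's rule that new permissions may not
exceed the originally vetted ones.

```c
#include <assert.h>
#include <stdint.h>

#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Hypothetical per-page record standing in for an xarray entry. */
struct page_perm {
	uint8_t max_perm;	/* permissions vetted at enclave build time */
	uint8_t cur_perm;	/* permissions currently allowed for mmap() */
};

/* Restrict: the request may only drop bits from the current set. */
static int restrict_perm(struct page_perm *p, uint8_t req)
{
	assert(p);
	if (req & ~p->cur_perm)
		return -1;	/* would add a bit: not a restriction */
	p->cur_perm = req;
	return 0;
}

/* Extend: the request must keep current bits and stay within max. */
static int extend_perm(struct page_perm *p, uint8_t req)
{
	assert(p);
	if ((req & p->cur_perm) != p->cur_perm)
		return -1;	/* would drop a bit: not an extension */
	if (req & ~p->max_perm)
		return -1;	/* exceeds the vetted permissions */
	p->cur_perm = req;
	return 0;
}
```

Keeping the two directions in separate functions mirrors the suggestion of
two ioctls, so that each direction could get its own policy hook later.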

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-11  2:15                                   ` Jarkko Sakkinen
@ 2022-01-11  3:48                                     ` Haitao Huang
  2022-01-12 23:48                                       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Haitao Huang @ 2022-01-11  3:48 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
>> On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
>> > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
>> > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>> > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen  
>> <jarkko@kernel.org>
>> > > > wrote:
>> > > >
>> > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>> > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen  
>> wrote:
>> > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang  
>> wrote:
>> > > > > > > > > > > OK, so the question is: do we need both or would a
>> > > > > > mechanism just
>> > > > > > > > > > to extend
>> > > > > > > > > > > permissions be sufficient?
>> > > > > > > > > >
>> > > > > > > > > > I do believe that we need both in order to support  
>> pages
>> > > > > > having only
>> > > > > > > > > > the permissions required to support their intended use
>> > > > > > during the
>> > > > > > > > > > time the
>> > > > > > > > > > particular access is required. While technically it is
>> > > > > > possible to grant
>> > > > > > > > > > pages all permissions they may need during their  
>> lifetime it
>> > > > > > is safer to
>> > > > > > > > > > remove permissions when no longer required.
>> > > > > > > > >
>> > > > > > > > > So if we imagine a run-time: how EMODPR would be  
>> useful, and
>> > > > > > how using it
>> > > > > > > > > would make things safer?
>> > > > > > > > >
>> > > > > > > > In scenarios of JIT compilers, once code is generated  
>> into RW pages,
>> > > > > > > > modifying both PTE and EPCM permissions to RX would be a  
>> good
>> > > > > > defensive
>> > > > > > > > measure. In that case, EMODPR is useful.
>> > > > > > >
>> > > > > > > What is the exact threat we are talking about?
>> > > > > >
>> > > > > > To add: it should be *significantly* critical thread, given  
>> that not
>> > > > > > supporting only EAUG would leave us only one complex call  
>> pattern with
>> > > > > > EACCEPT involvement.
>> > > > > >
>> > > > > > I'd even go to suggest to leave EMODPR out of the patch set,  
>> and
>> > > > > > introduce
>> > > > > > it when there is PoC code for any of the existing run-time  
>> that
>> > > > > > demonstrates the demand for it. Right now this way too  
>> speculative.
>> > > > > >
>> > > > > > Supporting EMODPE is IMHO by factors more critical.
>> > > > >
>> > > > > At least it does not protected against enclave code because an  
>> enclave
>> > > > > can
>> > > > > always choose not to EACCEPT any of the EMODPR requests. I'm  
>> not only
>> > > > > confused here about the actual threat but also the potential  
>> adversary
>> > > > > and
>> > > > > target.
>> > > > >
>> > > > I'm not sure I follow your thoughts here. The sequence should be  
>> for enclave
>> > > > to request  EMODPR in the first place through runtime to kernel,  
>> then to
>> > > > verify with EACCEPT that the OS indeed has done EMODPR.
>> > > > If enclave does not verify with EACCEPT, then its own code has
>> > > > vulnerability. But this does not justify OS not providing the  
>> mechanism to
>> > > > request EMODPR.
>> > >
>> > > The question is really simple: what is the threat scenario? In  
>> order to use
>> > > the word "vulnerability", you would need one.
>> > >
>> > > Given the complexity of the whole dance with EMODPR it is mandatory  
>> to have
>> > > one, in order to ack it to the mainline.
>> > >
>> > > > Similar to how we don't want have RWX code pages for normal Linux
>> > > > application, when an enclave loads code pages (either directly or  
>> JIT
>> > > > compiled from high level code ) into EAUG'd page (which has RW),  
>> we do not
>> > > > want leave pages to be RWX for code to be executable, hence the  
>> need of
>> > > > EMODPR request OS to reduce the permissions to RX once the code  
>> is ready to
>> > > > execute.
>> > >
>> > > You cannot compare *enforced* permissions outside the enclave, and  
>> claim that
>> > > they would be equivalent to the permissions of the already  
>> sandboxed code
>> > > inside the enclave, with permissions that are not enforced but are  
>> based
>> > > on good will of the enclave code.
>> >
>> > To add, you can already do "EMODPR" by simply adjusting VMA  
>> permissions to be
>> > more restrictive. How this would be worse than this collaboration  
>> based
>> > thing?
>>
>> ... or you could even make soft version of EMODPR without using that  
>> opcode
>> by writing an ioctl to update our xarray to allow lower permissions.  
>> That
>> ties the hands of the process who is doing the mmap() already.
>
> E.g. why not just
>
> #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
> 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
> #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
> 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)
>
> struct sgx_enclave_restrict_page_permissions {
> 	__u64 src;
> 	__u64 offset;
> 	__u64 length;
> 	__u64 secinfo;
> 	__u64 count;
> };
> struct sgx_enclave_extend_page_permissions {
> 	__u64 src;
> 	__u64 offset;
> 	__u64 length;
> 	__u64 secinfo;
> 	__u64 count;
> };
>
> These would simply update the xarray and nothing else. I'd go with two
> ioctls (with the necessary checks for secinfo) in order to provide hook
> up points in the future for LSMs.
>
> This leaves only EAUG and EMODT requiring the EACCEPT handshake.
>
> /Jarkko

The trusted code base here is the enclave. It can't trust any code outside
for enforcement. There is also a need for a TLB shootdown.

To answer your earlier question about the threat: the threat is
malicious/compromised code inside the enclave. Yes, you can say the whole
thing is sandboxed, but the runtime inside the enclave could load complex
upper-layer code. Therefore the runtime needs a trusted mechanism to ensure
that code pages are not writable, so that there is less/no chance for
compromised or malicious enclave code to modify existing code pages. I still
consider it to be similar to a normal Linux ELF loader/dynamic linker
relying on mmap/mprotect and trusting the OS to enforce permissions, but here
the enclave runtime only trusts the HW-provided mechanisms: EMODPR to change
EPCM records and EACCEPT to verify.
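
The request/verify sequence described above can be illustrated with a toy
state model. This is a simplified sketch and not real SGX code: struct
epcm_entry, model_emodpr() and model_eaccept() are hypothetical names, the
single "pr" flag loosely stands for the EPCM state that EMODPR sets and
EACCEPT clears, and the TLB shootdown step is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EPCM_R 0x1u
#define EPCM_W 0x2u
#define EPCM_X 0x4u

/* Hypothetical, heavily simplified view of one EPCM record. */
struct epcm_entry {
	uint8_t perm;	/* current hardware-enforced permissions */
	bool pr;	/* restriction pending enclave acceptance */
};

/* OS side (models EMODPR): can only clear permission bits. */
static void model_emodpr(struct epcm_entry *e, uint8_t secinfo_perm)
{
	assert(e);
	e->perm &= secinfo_perm;
	e->pr = true;
}

/*
 * Enclave side (models EACCEPT): succeeds only if the OS really
 * applied the restriction the enclave asked for.
 */
static int model_eaccept(struct epcm_entry *e, uint8_t expected_perm)
{
	assert(e);
	if (!e->pr || e->perm != expected_perm)
		return -1;	/* OS did not do what was requested */
	e->pr = false;
	return 0;
}
```

In this model an enclave that asked for a restriction and then calls
model_eaccept() gets an error if the OS skipped the change or applied a
different one, which is the verification property the argument relies on.
Because the model only clears bits, adding execute permission is outside its
scope.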

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-11  1:53                             ` Jarkko Sakkinen
  2022-01-11  1:55                               ` Jarkko Sakkinen
@ 2022-01-11 17:13                               ` Reinette Chatre
  2022-01-12 23:50                                 ` Jarkko Sakkinen
  1 sibling, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-11 17:13 UTC (permalink / raw)
  To: Jarkko Sakkinen, Haitao Huang
  Cc: Andy Lutomirski, dave.hansen, tglx, bp, mingo, linux-sgx, x86,
	seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>> wrote:
>>
>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>> OK, so the question is: do we need both or would a
>>>> mechanism just
>>>>>>>> to extend
>>>>>>>>> permissions be sufficient?
>>>>>>>>
>>>>>>>> I do believe that we need both in order to support pages
>>>> having only
>>>>>>>> the permissions required to support their intended use
>>>> during the
>>>>>>>> time the
>>>>>>>> particular access is required. While technically it is
>>>> possible to grant
>>>>>>>> pages all permissions they may need during their lifetime it
>>>> is safer to
>>>>>>>> remove permissions when no longer required.
>>>>>>>
>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>> how using it
>>>>>>> would make things safer?
>>>>>>>
>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>> defensive
>>>>>> measure. In that case, EMODPR is useful.
>>>>>
>>>>> What is the exact threat we are talking about?
>>>>
>>>> To add: it should be *significantly* critical thread, given that not
>>>> supporting only EAUG would leave us only one complex call pattern with
>>>> EACCEPT involvement.
>>>>
>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>> introduce
>>>> it when there is PoC code for any of the existing run-time that
>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>
>>>> Supporting EMODPE is IMHO by factors more critical.
>>>
>>> At least it does not protected against enclave code because an enclave
>>> can
>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>> confused here about the actual threat but also the potential adversary
>>> and
>>> target.
>>>
>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>> to request  EMODPR in the first place through runtime to kernel, then to
>> verify with EACCEPT that the OS indeed has done EMODPR.
>> If enclave does not verify with EACCEPT, then its own code has
>> vulnerability. But this does not justify OS not providing the mechanism to
>> request EMODPR.
> 
> The question is really simple: what is the threat scenario? In order to use
> the word "vulnerability", you would need one.
> 
> Given the complexity of the whole dance with EMODPR it is mandatory to have
> one, in order to ack it to the mainline.
> 

Which complexity related to EMODPR are you concerned about? In a later message
you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
so it seems that you are perhaps concerned about the flow involving EACCEPT?
The OS does not require nor depend on EACCEPT being called as part of these flows
so a faulty or misbehaving user space omitting an EACCEPT call would not impact
these flows in the OS, but would of course impact the enclave.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-11  3:48                                     ` Haitao Huang
@ 2022-01-12 23:48                                       ` Jarkko Sakkinen
  2022-01-13  2:41                                         ` Haitao Huang
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-12 23:48 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Mon, Jan 10, 2022 at 09:48:15PM -0600, Haitao Huang wrote:
> On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
> 
> > On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
> > > On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> > > > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen
> > > <jarkko@kernel.org>
> > > > > > wrote:
> > > > > >
> > > > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen
> > > wrote:
> > > > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang
> > > wrote:
> > > > > > > > > > > > > OK, so the question is: do we need both or would a
> > > > > > > > mechanism just
> > > > > > > > > > > > to extend
> > > > > > > > > > > > > permissions be sufficient?
> > > > > > > > > > > >
> > > > > > > > > > > > I do believe that we need both in order to support
> > > pages
> > > > > > > > having only
> > > > > > > > > > > > the permissions required to support their intended use
> > > > > > > > during the
> > > > > > > > > > > > time the
> > > > > > > > > > > > particular access is required. While technically it is
> > > > > > > > possible to grant
> > > > > > > > > > > > pages all permissions they may need during their
> > > lifetime it
> > > > > > > > is safer to
> > > > > > > > > > > > remove permissions when no longer required.
> > > > > > > > > > >
> > > > > > > > > > > So if we imagine a run-time: how EMODPR would be
> > > useful, and
> > > > > > > > how using it
> > > > > > > > > > > would make things safer?
> > > > > > > > > > >
> > > > > > > > > > In scenarios of JIT compilers, once code is generated
> > > into RW pages,
> > > > > > > > > > modifying both PTE and EPCM permissions to RX would be
> > > a good
> > > > > > > > defensive
> > > > > > > > > > measure. In that case, EMODPR is useful.
> > > > > > > > >
> > > > > > > > > What is the exact threat we are talking about?
> > > > > > > >
> > > > > > > > To add: it should be *significantly* critical thread,
> > > given that not
> > > > > > > > supporting only EAUG would leave us only one complex call
> > > pattern with
> > > > > > > > EACCEPT involvement.
> > > > > > > >
> > > > > > > > I'd even go to suggest to leave EMODPR out of the patch
> > > set, and
> > > > > > > > introduce
> > > > > > > > it when there is PoC code for any of the existing run-time
> > > that
> > > > > > > > demonstrates the demand for it. Right now this way too
> > > speculative.
> > > > > > > >
> > > > > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > > > >
> > > > > > > At least it does not protected against enclave code because
> > > an enclave
> > > > > > > can
> > > > > > > always choose not to EACCEPT any of the EMODPR requests. I'm
> > > not only
> > > > > > > confused here about the actual threat but also the potential
> > > adversary
> > > > > > > and
> > > > > > > target.
> > > > > > >
> > > > > > I'm not sure I follow your thoughts here. The sequence should
> > > be for enclave
> > > > > > to request  EMODPR in the first place through runtime to
> > > kernel, then to
> > > > > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > > > > If enclave does not verify with EACCEPT, then its own code has
> > > > > > vulnerability. But this does not justify OS not providing the
> > > mechanism to
> > > > > > request EMODPR.
> > > > >
> > > > > The question is really simple: what is the threat scenario? In
> > > order to use
> > > > > the word "vulnerability", you would need one.
> > > > >
> > > > > Given the complexity of the whole dance with EMODPR it is
> > > mandatory to have
> > > > > one, in order to ack it to the mainline.
> > > > >
> > > > > > Similar to how we don't want have RWX code pages for normal Linux
> > > > > > application, when an enclave loads code pages (either directly
> > > or JIT
> > > > > > compiled from high level code ) into EAUG'd page (which has
> > > RW), we do not
> > > > > > want leave pages to be RWX for code to be executable, hence
> > > the need of
> > > > > > EMODPR request OS to reduce the permissions to RX once the
> > > code is ready to
> > > > > > execute.
> > > > >
> > > > > You cannot compare *enforced* permissions outside the enclave,
> > > and claim that
> > > > > they would be equivalent to the permissions of the already
> > > sandboxed code
> > > > > inside the enclave, with permissions that are not enforced but
> > > are based
> > > > > on good will of the enclave code.
> > > >
> > > > To add, you can already do "EMODPR" by simply adjusting VMA
> > > permissions to be
> > > > more restrictive. How this would be worse than this collaboration
> > > based
> > > > thing?
> > > 
> > > ... or you could even make soft version of EMODPR without using that
> > > opcode
> > > by writing an ioctl to update our xarray to allow lower permissions.
> > > That
> > > ties the hands of the process who is doing the mmap() already.
> > 
> > E.g. why not just
> > 
> > #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
> > 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
> > #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
> > 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)
> > 
> > struct sgx_enclave_restrict_page_permissions {
> > 	__u64 src;
> > 	__u64 offset;
> > 	__u64 length;
> > 	__u64 secinfo;
> > 	__u64 count;
> > };
> > struct sgx_enclave_extend_page_permissions {
> > 	__u64 src;
> > 	__u64 offset;
> > 	__u64 length;
> > 	__u64 secinfo;
> > 	__u64 count;
> > };
> > 
> > These would simply update the xarray and nothing else. I'd go with two
> > ioctls (with the necessary checks for secinfo) in order to provide hook
> > up points in the future for LSMs.
> > 
> > This leaves only EAUG and EMODT requiring the EACCEPT handshake.
> > 
> > /Jarkko
> The trusted code base here is the enclave. It can't trust any code outside
> for enforcement. There is also need for TLB shootdown.
> 
> To answer your earlier question about threat, the threat is
> malicious/compromised code inside enclave. Yes, you can say the whole thing
> is sand-boxed, but the runtime inside enclave could load complex upper layer
> code.  Therefore the runtime needs to have a trusted mechanism to ensure
> code pages not writable so that there is less/no chance for compromised
> malicious enclave to modify existing code pages. I still consider it to be
> similar to normal Linux elf-loader/dynamic linker relying on mmap/mprotect
> and trusting OS to enforce permissions, but here the enclave runtime only
> trust the HW provided mechanism: EMODPR to change EPCM records and EACCEPT
> to verify.

So what if:

1. User space does EMODPR ioctl.
2. Enclave does EACCEPT.
3. Enclave does EMODPE.

The problem here is the asymmetry of these operations. If EMODPE also
required EACCEPT from the run-time, EMODPR would also make sense.
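
To make the three steps above concrete, here is a minimal toy model in C. It is only a sketch: `struct toy_epcm` and the `toy_*()` helpers are hypothetical names invented for illustration, not kernel or SDK APIs, and the model ignores TLB shootdowns, PTEs and the exact hardware fault behaviour. It assumes only the direction of each operation as discussed in this thread: EMODPR removes permissions and is confirmed by EACCEPT, while EMODPE adds permissions from inside the enclave without any EACCEPT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative permission bits; the numeric values are an assumption. */
enum { PERM_R = 1u << 0, PERM_W = 1u << 1, PERM_X = 1u << 2 };

/* Hypothetical, simplified stand-in for one EPCM entry. */
struct toy_epcm {
	uint8_t perms;	/* current EPCM permissions */
	bool pr;	/* a permission restriction awaits EACCEPT */
};

/* Step 1: kernel side (EMODPR). Can only remove permissions. */
static void toy_emodpr(struct toy_epcm *e, uint8_t keep)
{
	e->perms &= keep;
	e->pr = true;
}

/* Step 2: enclave side (EACCEPT). Succeeds only if EPCM matches. */
static bool toy_eaccept(struct toy_epcm *e, uint8_t expected)
{
	if (!e->pr || e->perms != expected)
		return false;
	e->pr = false;
	return true;
}

/* Step 3: enclave side (EMODPE). Can only add permissions, no EACCEPT. */
static void toy_emodpe(struct toy_epcm *e, uint8_t add)
{
	e->perms |= add;
}

/* Runs the sequence on an RWX page; returns final perms, or 0 on failure. */
static uint8_t toy_sequence(void)
{
	struct toy_epcm page = { .perms = PERM_R | PERM_W | PERM_X, .pr = false };

	toy_emodpr(&page, PERM_R | PERM_X);
	if (!toy_eaccept(&page, PERM_R | PERM_X))
		return 0;
	toy_emodpe(&page, PERM_W);
	return page.perms;
}
```

In this model `toy_sequence()` ends with the EPCM entry back at RWX: the restriction needed kernel involvement plus an EACCEPT, but the enclave extended the permissions again on its own. The model deliberately leaves out the PTE side, which the kernel still controls.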

Please give a code example on how EMODPR improves trust.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-11 17:13                               ` Reinette Chatre
@ 2022-01-12 23:50                                 ` Jarkko Sakkinen
  2022-01-12 23:56                                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-12 23:50 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Haitao Huang, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >> wrote:
> >>
> >>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>> OK, so the question is: do we need both or would a
> >>>> mechanism just
> >>>>>>>> to extend
> >>>>>>>>> permissions be sufficient?
> >>>>>>>>
> >>>>>>>> I do believe that we need both in order to support pages
> >>>> having only
> >>>>>>>> the permissions required to support their intended use
> >>>> during the
> >>>>>>>> time the
> >>>>>>>> particular access is required. While technically it is
> >>>> possible to grant
> >>>>>>>> pages all permissions they may need during their lifetime it
> >>>> is safer to
> >>>>>>>> remove permissions when no longer required.
> >>>>>>>
> >>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>> how using it
> >>>>>>> would make things safer?
> >>>>>>>
> >>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>> defensive
> >>>>>> measure. In that case, EMODPR is useful.
> >>>>>
> >>>>> What is the exact threat we are talking about?
> >>>>
> >>>> To add: it should be *significantly* critical thread, given that not
> >>>> supporting only EAUG would leave us only one complex call pattern with
> >>>> EACCEPT involvement.
> >>>>
> >>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>> introduce
> >>>> it when there is PoC code for any of the existing run-time that
> >>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>
> >>>> Supporting EMODPE is IMHO by factors more critical.
> >>>
> >>> At least it does not protected against enclave code because an enclave
> >>> can
> >>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>> confused here about the actual threat but also the potential adversary
> >>> and
> >>> target.
> >>>
> >> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >> to request  EMODPR in the first place through runtime to kernel, then to
> >> verify with EACCEPT that the OS indeed has done EMODPR.
> >> If enclave does not verify with EACCEPT, then its own code has
> >> vulnerability. But this does not justify OS not providing the mechanism to
> >> request EMODPR.
> > 
> > The question is really simple: what is the threat scenario? In order to use
> > the word "vulnerability", you would need one.
> > 
> > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > one, in order to ack it to the mainline.
> > 
> 
> Which complexity related to EMODPR are you concerned about? In a later message
> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> The OS does not require nor depend on EACCEPT being called as part of these flows
> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> these flows in the OS, but would of course impact the enclave.

I'd say *any* complexity, because I see no benefit in supporting it. E.g.
the EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
EMODPR going to help with any sort of workload?

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-12 23:50                                 ` Jarkko Sakkinen
@ 2022-01-12 23:56                                   ` Jarkko Sakkinen
  2022-01-13 20:09                                     ` Nathaniel McCallum
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-12 23:56 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Haitao Huang, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > Hi Jarkko,
> > 
> > On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > >> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > >> wrote:
> > >>
> > >>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > >>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > >>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > >>>>>>>>> OK, so the question is: do we need both or would a
> > >>>> mechanism just
> > >>>>>>>> to extend
> > >>>>>>>>> permissions be sufficient?
> > >>>>>>>>
> > >>>>>>>> I do believe that we need both in order to support pages
> > >>>> having only
> > >>>>>>>> the permissions required to support their intended use
> > >>>> during the
> > >>>>>>>> time the
> > >>>>>>>> particular access is required. While technically it is
> > >>>> possible to grant
> > >>>>>>>> pages all permissions they may need during their lifetime it
> > >>>> is safer to
> > >>>>>>>> remove permissions when no longer required.
> > >>>>>>>
> > >>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > >>>> how using it
> > >>>>>>> would make things safer?
> > >>>>>>>
> > >>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > >>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > >>>> defensive
> > >>>>>> measure. In that case, EMODPR is useful.
> > >>>>>
> > >>>>> What is the exact threat we are talking about?
> > >>>>
> > >>>> To add: it should be *significantly* critical thread, given that not
> > >>>> supporting only EAUG would leave us only one complex call pattern with
> > >>>> EACCEPT involvement.
> > >>>>
> > >>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > >>>> introduce
> > >>>> it when there is PoC code for any of the existing run-time that
> > >>>> demonstrates the demand for it. Right now this way too speculative.
> > >>>>
> > >>>> Supporting EMODPE is IMHO by factors more critical.
> > >>>
> > >>> At least it does not protected against enclave code because an enclave
> > >>> can
> > >>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > >>> confused here about the actual threat but also the potential adversary
> > >>> and
> > >>> target.
> > >>>
> > >> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > >> to request  EMODPR in the first place through runtime to kernel, then to
> > >> verify with EACCEPT that the OS indeed has done EMODPR.
> > >> If enclave does not verify with EACCEPT, then its own code has
> > >> vulnerability. But this does not justify OS not providing the mechanism to
> > >> request EMODPR.
> > > 
> > > The question is really simple: what is the threat scenario? In order to use
> > > the word "vulnerability", you would need one.
> > > 
> > > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > one, in order to ack it to the mainline.
> > > 
> > 
> > Which complexity related to EMODPR are you concerned about? In a later message
> > you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > The OS does not require nor depend on EACCEPT being called as part of these flows
> > so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > these flows in the OS, but would of course impact the enclave.
> 
> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> EMODPR going to help with any sort of workload?

I've even started to think: should we just always allow mmap()? The worst
thing that can happen is that the enclave crashes. Does that matter all that
much? I'm asking because access control is the main theme of the SGX2 patch
set, and IMHO it should be thought through from the ground up. It really
"stress tests" that area. If we can settle on that, then the other things are
just technical details that we can surely sort out.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-12 23:48                                       ` Jarkko Sakkinen
@ 2022-01-13  2:41                                         ` Haitao Huang
  2022-01-14 21:36                                           ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Haitao Huang @ 2022-01-13  2:41 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Wed, 12 Jan 2022 17:48:48 -0600, Jarkko Sakkinen <jarkko@kernel.org>  
wrote:

> On Mon, Jan 10, 2022 at 09:48:15PM -0600, Haitao Huang wrote:
>> On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>> wrote:
>>
>> > On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
>> > > On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
>> > > > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
>> > > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>> > > > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen
>> > > <jarkko@kernel.org>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen  
>> wrote:
>> > > > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen
>> > > wrote:
>> > > > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang
>> > > wrote:
>> > > > > > > > > > > > > OK, so the question is: do we need both or  
>> would a
>> > > > > > > > mechanism just
>> > > > > > > > > > > > to extend
>> > > > > > > > > > > > > permissions be sufficient?
>> > > > > > > > > > > >
>> > > > > > > > > > > > I do believe that we need both in order to support
>> > > pages
>> > > > > > > > having only
>> > > > > > > > > > > > the permissions required to support their  
>> intended use
>> > > > > > > > during the
>> > > > > > > > > > > > time the
>> > > > > > > > > > > > particular access is required. While technically  
>> it is
>> > > > > > > > possible to grant
>> > > > > > > > > > > > pages all permissions they may need during their
>> > > lifetime it
>> > > > > > > > is safer to
>> > > > > > > > > > > > remove permissions when no longer required.
>> > > > > > > > > > >
>> > > > > > > > > > > So if we imagine a run-time: how EMODPR would be
>> > > useful, and
>> > > > > > > > how using it
>> > > > > > > > > > > would make things safer?
>> > > > > > > > > > >
>> > > > > > > > > > In scenarios of JIT compilers, once code is generated
>> > > into RW pages,
>> > > > > > > > > > modifying both PTE and EPCM permissions to RX would be
>> > > a good
>> > > > > > > > defensive
>> > > > > > > > > > measure. In that case, EMODPR is useful.
>> > > > > > > > >
>> > > > > > > > > What is the exact threat we are talking about?
>> > > > > > > >
>> > > > > > > > To add: it should be *significantly* critical thread,
>> > > given that not
>> > > > > > > > supporting only EAUG would leave us only one complex call
>> > > pattern with
>> > > > > > > > EACCEPT involvement.
>> > > > > > > >
>> > > > > > > > I'd even go to suggest to leave EMODPR out of the patch
>> > > set, and
>> > > > > > > > introduce
>> > > > > > > > it when there is PoC code for any of the existing run-time
>> > > that
>> > > > > > > > demonstrates the demand for it. Right now this way too
>> > > speculative.
>> > > > > > > >
>> > > > > > > > Supporting EMODPE is IMHO by factors more critical.
>> > > > > > >
>> > > > > > > At least it does not protected against enclave code because
>> > > an enclave
>> > > > > > > can
>> > > > > > > always choose not to EACCEPT any of the EMODPR requests. I'm
>> > > not only
>> > > > > > > confused here about the actual threat but also the potential
>> > > adversary
>> > > > > > > and
>> > > > > > > target.
>> > > > > > >
>> > > > > > I'm not sure I follow your thoughts here. The sequence should
>> > > be for enclave
>> > > > > > to request  EMODPR in the first place through runtime to
>> > > kernel, then to
>> > > > > > verify with EACCEPT that the OS indeed has done EMODPR.
>> > > > > > If enclave does not verify with EACCEPT, then its own code has
>> > > > > > vulnerability. But this does not justify OS not providing the
>> > > mechanism to
>> > > > > > request EMODPR.
>> > > > >
>> > > > > The question is really simple: what is the threat scenario? In
>> > > order to use
>> > > > > the word "vulnerability", you would need one.
>> > > > >
>> > > > > Given the complexity of the whole dance with EMODPR it is
>> > > mandatory to have
>> > > > > one, in order to ack it to the mainline.
>> > > > >
>> > > > > > Similar to how we don't want have RWX code pages for normal  
>> Linux
>> > > > > > application, when an enclave loads code pages (either directly
>> > > or JIT
>> > > > > > compiled from high level code ) into EAUG'd page (which has
>> > > RW), we do not
>> > > > > > want leave pages to be RWX for code to be executable, hence
>> > > the need of
>> > > > > > EMODPR request OS to reduce the permissions to RX once the
>> > > code is ready to
>> > > > > > execute.
>> > > > >
>> > > > > You cannot compare *enforced* permissions outside the enclave,
>> > > and claim that
>> > > > > they would be equivalent to the permissions of the already
>> > > sandboxed code
>> > > > > inside the enclave, with permissions that are not enforced but
>> > > are based
>> > > > > on good will of the enclave code.
>> > > >
>> > > > To add, you can already do "EMODPR" by simply adjusting VMA
>> > > permissions to be
>> > > > more restrictive. How this would be worse than this collaboration
>> > > based
>> > > > thing?
>> > >
>> > > ... or you could even make soft version of EMODPR without using that
>> > > opcode
>> > > by writing an ioctl to update our xarray to allow lower permissions.
>> > > That
>> > > ties the hands of the process who is doing the mmap() already.
>> >
>> > E.g. why not just
>> >
>> > #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
>> > 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_modify_page_permissions)
>> > #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
>> > 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_modify_page_permissions)
>> >
>> > struct sgx_enclave_restrict_page_permissions {
>> > 	__u64 src;
>> > 	__u64 offset;
>> > 	__u64 length;
>> > 	__u64 secinfo;
>> > 	__u64 count;
>> > };
>> > struct sgx_enclave_extend_page_permissions {
>> > 	__u64 src;
>> > 	__u64 offset;
>> > 	__u64 length;
>> > 	__u64 secinfo;
>> > 	__u64 count;
>> > };
>> >
>> > These would simply update the xarray and nothing else. I'd go with two
>> > ioctls (with the necessary checks for secinfo) in order to provide  
>> hook
>> > up points in the future for LSMs.
>> >
>> > This leaves only EAUG and EMODT requiring the EACCEPT handshake.
>> >
>> > /Jarkko
>> The trusted code base here is the enclave. It can't trust any code  
>> outside
>> for enforcement. There is also need for TLB shootdown.
>>
>> To answer your earlier question about threat, the threat is
>> malicious/compromised code inside enclave. Yes, you can say the whole  
>> thing
>> is sand-boxed, but the runtime inside enclave could load complex upper  
>> layer
>> code.  Therefore the runtime needs to have a trusted mechanism to ensure
>> code pages not writable so that there is less/no chance for compromised
>> malicious enclave to modify existing code pages. I still consider it to  
>> be
>> similar to normal Linux elf-loader/dynamic linker relying on  
>> mmap/mprotect
>> and trusting OS to enforce permissions, but here the enclave runtime  
>> only
>> trust the HW provided mechanism: EMODPR to change EPCM records and  
>> EACCEPT
>> to verify.
>
> So what if:
>
> 1. User space does EMODPR ioctl.
> 2. Enclave does EACCEPT.
> 3. Enclave does EMODPE.
>
Could you elaborate on your exact concern here? EMODPE won't be able to
restrict permissions, only add them, so there is no way to cancel what's
done by EMODPR, if that's your concern.

And EMODPE only affects the EPCM, not the PTE. So if the OS does not set a
PTE matching the EPCM, the enclave won't be able to use the page with the
added access.
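
As a rough illustration of the PTE/EPCM point, the sketch below treats an access as allowed only when both layers permit it. The names (`struct toy_page`, `toy_access_allowed()`, `toy_page_emodpe()`) are hypothetical and exist only for this example. It is a simplified model of the rule described above, not of the real page-fault path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative permission bits; the numeric values are an assumption. */
enum { PERM_R = 1u << 0, PERM_W = 1u << 1, PERM_X = 1u << 2 };

/* Hypothetical view of one enclave page with its two permission layers. */
struct toy_page {
	uint8_t pte_perms;	/* set by the (untrusted) OS in the page tables */
	uint8_t epcm_perms;	/* kept by hardware in the EPCM */
};

/* An access succeeds only if every requested bit is set in both layers. */
static bool toy_access_allowed(const struct toy_page *p, uint8_t access)
{
	return (p->pte_perms & p->epcm_perms & access) == access;
}

/* EMODPE as seen in this model: it changes the EPCM and nothing else. */
static void toy_page_emodpe(struct toy_page *p, uint8_t add)
{
	p->epcm_perms |= add;
}
```

With a page that starts as RX in both layers, calling `toy_page_emodpe(&p, PERM_W)` makes the EPCM entry RWX, yet `toy_access_allowed(&p, PERM_W)` stays false until the OS also adds W to `pte_perms`.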

> The problem here is the asymmetry of these operations. If EMODPE also
> required EACCEPT from the run-time, EMODPR would also make sense.
>

The asymmetry is on the user space side, as Reinette stated in her reply. I
do not see why this is a relevant concern for the kernel.

> Please give a code example on how EMODPR improves trust.
>
It's not that EMODPR itself improves trust. What I am trying to say is that
the enclave runtime can use EACCEPT to verify the EPCM permissions, which are
trusted, instead of relying on the PTE permissions, which are controlled by
the OS. It must do EACCEPT for EMODPR and for other ENCLS ops like EMODT,
EAUG, etc., as the enclave security model considers the OS untrusted.

EMODPR is the only way to restrict permissions in the EPCM for enclave pages.
So if it is not supported by the kernel, then there is no way for enclave
runtimes to support the use cases I stated previously. That would mean RWX
is required in the EPCM for dynamically loaded/JIT-compiled code pages.
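
A minimal sketch of the flow described here, written as a toy model rather than real enclave code: the runtime asks the untrusted side to drop W from a code page and runs the code only if an EACCEPT-like check confirms the EPCM now holds exactly RX. All names (`struct jit_page`, `os_restrict()`, `enclave_accept()`, `jit_publish()`) are hypothetical, and the RWX starting state is an assumption (an EAUG'd RW page to which the enclave has already added X).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative permission bits; the numeric values are an assumption. */
enum { PERM_R = 1u << 0, PERM_W = 1u << 1, PERM_X = 1u << 2 };

/* Hypothetical, simplified EPCM state of one JIT code page. */
struct jit_page {
	uint8_t epcm_perms;
	bool pr;	/* a permission restriction awaits EACCEPT */
};

/* Untrusted side: an honest OS performs EMODPR, a dishonest one does nothing. */
static void os_restrict(struct jit_page *p, uint8_t keep, bool honest)
{
	if (!honest)
		return;
	p->epcm_perms &= keep;
	p->pr = true;
}

/* Trusted side: EACCEPT-like check against what the enclave asked for. */
static bool enclave_accept(struct jit_page *p, uint8_t expected)
{
	if (!p->pr || p->epcm_perms != expected)
		return false;
	p->pr = false;
	return true;
}

/* Returns true only if the code page is verified to be RX, not writable. */
static bool jit_publish(bool os_is_honest)
{
	/* State after code generation; RWX is assumed for this sketch. */
	struct jit_page page = { .epcm_perms = PERM_R | PERM_W | PERM_X, .pr = false };

	os_restrict(&page, PERM_R | PERM_X, os_is_honest);
	return enclave_accept(&page, PERM_R | PERM_X);
}
```

In this model `jit_publish(true)` succeeds, while `jit_publish(false)` fails because the check sees that no restriction took place. The point of the sketch is that the decision rests on the EPCM state rather than on whatever the OS reports about the PTEs.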

Thanks
Haitao


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-12 23:56                                   ` Jarkko Sakkinen
@ 2022-01-13 20:09                                     ` Nathaniel McCallum
  2022-01-13 21:42                                       ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Nathaniel McCallum @ 2022-01-13 20:09 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Reinette Chatre, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > >
> > > On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > >> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > >> wrote:
> > > >>
> > > >>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > >>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > >>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > >>>>>>>>> OK, so the question is: do we need both or would a
> > > >>>> mechanism just
> > > >>>>>>>> to extend
> > > >>>>>>>>> permissions be sufficient?
> > > >>>>>>>>
> > > >>>>>>>> I do believe that we need both in order to support pages
> > > >>>> having only
> > > >>>>>>>> the permissions required to support their intended use
> > > >>>> during the
> > > >>>>>>>> time the
> > > >>>>>>>> particular access is required. While technically it is
> > > >>>> possible to grant
> > > >>>>>>>> pages all permissions they may need during their lifetime it
> > > >>>> is safer to
> > > >>>>>>>> remove permissions when no longer required.
> > > >>>>>>>
> > > >>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > >>>> how using it
> > > >>>>>>> would make things safer?
> > > >>>>>>>
> > > >>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > >>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > >>>> defensive
> > > >>>>>> measure. In that case, EMODPR is useful.
> > > >>>>>
> > > >>>>> What is the exact threat we are talking about?
> > > >>>>
> > > >>>> To add: it should be *significantly* critical thread, given that not
> > > >>>> supporting only EAUG would leave us only one complex call pattern with
> > > >>>> EACCEPT involvement.
> > > >>>>
> > > >>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > >>>> introduce
> > > >>>> it when there is PoC code for any of the existing run-time that
> > > >>>> demonstrates the demand for it. Right now this way too speculative.
> > > >>>>
> > > >>>> Supporting EMODPE is IMHO by factors more critical.
> > > >>>
> > > >>> At least it does not protected against enclave code because an enclave
> > > >>> can
> > > >>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > >>> confused here about the actual threat but also the potential adversary
> > > >>> and
> > > >>> target.
> > > >>>
> > > >> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > >> to request  EMODPR in the first place through runtime to kernel, then to
> > > >> verify with EACCEPT that the OS indeed has done EMODPR.
> > > >> If enclave does not verify with EACCEPT, then its own code has
> > > >> vulnerability. But this does not justify OS not providing the mechanism to
> > > >> request EMODPR.
> > > >
> > > > The question is really simple: what is the threat scenario? In order to use
> > > > the word "vulnerability", you would need one.
> > > >
> > > > Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > > one, in order to ack it to the mainline.
> > > >
> > >
> > > Which complexity related to EMODPR are you concerned about? In a later message
> > > you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > The OS does not require nor depend on EACCEPT being called as part of these flows
> > > so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > these flows in the OS, but would of course impact the enclave.
> >
> > I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > EMODPR going to help with any sort of workload?
>
> I've even started think should we just always allow mmap()?

I suspect this may be the most ergonomic way forward. Instructions
like EAUG/EMODPR/etc are really irrelevant implementation details to
what the enclave wants, which is a memory mapping in the enclave. Why
make the enclave runner do multiple context switches just to change
the memory map of an enclave?

> The worst thing
> that can happen is that the enclave crashes. Does that matter all that
> much? I'm asking because access control is the main theme in SGX2 patch set
> that IMHO should be considered to the ground. It really "stress tests" that
> area. If we can settle on  that, then other things are just technical details
> that we can surely sort out.
>
> /Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-13 20:09                                     ` Nathaniel McCallum
@ 2022-01-13 21:42                                       ` Reinette Chatre
  2022-01-14 21:53                                         ` Jarkko Sakkinen
  2022-01-17 13:27                                         ` Nathaniel McCallum
  0 siblings, 2 replies; 155+ messages in thread
From: Reinette Chatre @ 2022-01-13 21:42 UTC (permalink / raw)
  To: Nathaniel McCallum, Jarkko Sakkinen
  Cc: Haitao Huang, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Jarkko and Nathaniel,

On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>
>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
>>>> Hi Jarkko,
>>>>
>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>>>>>> wrote:
>>>>>>
>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>>>>>> OK, so the question is: do we need both or would a
>>>>>>>> mechanism just
>>>>>>>>>>>> to extend
>>>>>>>>>>>>> permissions be sufficient?
>>>>>>>>>>>>
>>>>>>>>>>>> I do believe that we need both in order to support pages
>>>>>>>> having only
>>>>>>>>>>>> the permissions required to support their intended use
>>>>>>>> during the
>>>>>>>>>>>> time the
>>>>>>>>>>>> particular access is required. While technically it is
>>>>>>>> possible to grant
>>>>>>>>>>>> pages all permissions they may need during their lifetime it
>>>>>>>> is safer to
>>>>>>>>>>>> remove permissions when no longer required.
>>>>>>>>>>>
>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>>>>>> how using it
>>>>>>>>>>> would make things safer?
>>>>>>>>>>>
>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>>>>>> defensive
>>>>>>>>>> measure. In that case, EMODPR is useful.
>>>>>>>>>
>>>>>>>>> What is the exact threat we are talking about?
>>>>>>>>
>>>>>>>> To add: it should be *significantly* critical thread, given that not
>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
>>>>>>>> EACCEPT involvement.
>>>>>>>>
>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>>>>>> introduce
>>>>>>>> it when there is PoC code for any of the existing run-time that
>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>>>>>
>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
>>>>>>>
>>>>>>> At least it does not protected against enclave code because an enclave
>>>>>>> can
>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>>>>>> confused here about the actual threat but also the potential adversary
>>>>>>> and
>>>>>>> target.
>>>>>>>
>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
>>>>>> If enclave does not verify with EACCEPT, then its own code has
>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
>>>>>> request EMODPR.
>>>>>
>>>>> The question is really simple: what is the threat scenario? In order to use
>>>>> the word "vulnerability", you would need one.
>>>>>
>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
>>>>> one, in order to ack it to the mainline.
>>>>>
>>>>
>>>> Which complexity related to EMODPR are you concerned about? In a later message
>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
>>>> these flows in the OS, but would of course impact the enclave.
>>>
>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
>>> EMODPR going to help with any sort of workload?
>>
>> I've even started think should we just always allow mmap()?
> 
> I suspect this may be the most ergonomic way forward. Instructions
> like EAUG/EMODPR/etc are really irrelevant implementation details to
> what the enclave wants, which is a memory mapping in the enclave. Why
> make the enclave runner do multiple context switches just to change
> the memory map of an enclave?

The enclave runner is not forced to make any changes to a memory mapping. To start,
this implementation supports and does not change the existing ABI where a new
memory mapping can only be created if its permissions are the same or weaker
than the EPCM permissions. After the memory mapping is created the EPCM permissions
can change (thanks to SGX2) and when they do there are no forced nor required
changes to the memory mapping - pages remain accessible where the memory mapping
and EPCM permissions agree. It is true that if an enclave chooses to relax the
permissions of an enclave page (EMODPE) then the memory mapping may need to be
changed, as should be expected when accessing a page with permissions that the
memory mapping did not previously allow.
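
To illustrate the two rules above, here is a minimal sketch in C. It is a toy
model and not the driver's code; the PERM_* bits and both helper names are
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical permission bits for this sketch, not the kernel's definitions. */
#define PERM_R   0x1u
#define PERM_W   0x2u
#define PERM_X   0x4u
#define PERM_ALL (PERM_R | PERM_W | PERM_X)

/* Existing ABI: a new mapping may not request more than the EPCM permissions. */
static bool mapping_allowed(unsigned int requested, unsigned int epcm)
{
	assert((requested & ~PERM_ALL) == 0);
	assert((epcm & ~PERM_ALL) == 0);
	return (requested & ~epcm) == 0;
}

/* A page is accessible only where the mapping and the EPCM permissions agree. */
static unsigned int effective_perms(unsigned int mapping, unsigned int epcm)
{
	return mapping & epcm;
}
```

In this model a request for an RX mapping is refused while the EPCM permissions
are RW, and after EMODPE relaxes the EPCM permissions to RWX an existing RW
mapping still only gives RW access until the mapping itself is changed.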

Are you saying that the permissions of a new memory mapping should now be allowed
to exceed EPCM permissions and thus the enclave runner would not need to modify a
memory mapping when EPCM permissions are relaxed? As mentioned above this may be
considered a change in ABI but something we could support on SGX2 systems.

I would also like to highlight Haitao's earlier comment that a foundation of SGX is
that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
and EMODPE to manage enclave page permissions.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-13  2:41                                         ` Haitao Huang
@ 2022-01-14 21:36                                           ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 21:36 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Reinette Chatre, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Wed, Jan 12, 2022 at 08:41:18PM -0600, Haitao Huang wrote:
> On Wed, 12 Jan 2022 17:48:48 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
> 
> > On Mon, Jan 10, 2022 at 09:48:15PM -0600, Haitao Huang wrote:
> > > On Mon, 10 Jan 2022 20:15:28 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > wrote:
> > > 
> > > > On Tue, Jan 11, 2022 at 04:03:32AM +0200, Jarkko Sakkinen wrote:
> > > > > On Tue, Jan 11, 2022 at 03:55:59AM +0200, Jarkko Sakkinen wrote:
> > > > > > On Tue, Jan 11, 2022 at 03:53:26AM +0200, Jarkko Sakkinen wrote:
> > > > > > > On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > > > > > On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen
> > > > > <jarkko@kernel.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko
> > > Sakkinen wrote:
> > > > > > > > > > On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen
> > > > > wrote:
> > > > > > > > > > > On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang
> > > > > wrote:
> > > > > > > > > > > > > > > OK, so the question is: do we need both or
> > > would a
> > > > > > > > > > mechanism just
> > > > > > > > > > > > > > to extend
> > > > > > > > > > > > > > > permissions be sufficient?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I do believe that we need both in order to support
> > > > > pages
> > > > > > > > > > having only
> > > > > > > > > > > > > > the permissions required to support their
> > > intended use
> > > > > > > > > > during the
> > > > > > > > > > > > > > time the
> > > > > > > > > > > > > > particular access is required. While
> > > technically it is
> > > > > > > > > > possible to grant
> > > > > > > > > > > > > > pages all permissions they may need during their
> > > > > lifetime it
> > > > > > > > > > is safer to
> > > > > > > > > > > > > > remove permissions when no longer required.
> > > > > > > > > > > > >
> > > > > > > > > > > > > So if we imagine a run-time: how EMODPR would be
> > > > > useful, and
> > > > > > > > > > how using it
> > > > > > > > > > > > > would make things safer?
> > > > > > > > > > > > >
> > > > > > > > > > > > In scenarios of JIT compilers, once code is generated
> > > > > into RW pages,
> > > > > > > > > > > > modifying both PTE and EPCM permissions to RX would be
> > > > > a good
> > > > > > > > > > defensive
> > > > > > > > > > > > measure. In that case, EMODPR is useful.
> > > > > > > > > > >
> > > > > > > > > > > What is the exact threat we are talking about?
> > > > > > > > > >
> > > > > > > > > > To add: it should be *significantly* critical thread,
> > > > > given that not
> > > > > > > > > > supporting only EAUG would leave us only one complex call
> > > > > pattern with
> > > > > > > > > > EACCEPT involvement.
> > > > > > > > > >
> > > > > > > > > > I'd even go to suggest to leave EMODPR out of the patch
> > > > > set, and
> > > > > > > > > > introduce
> > > > > > > > > > it when there is PoC code for any of the existing run-time
> > > > > that
> > > > > > > > > > demonstrates the demand for it. Right now this way too
> > > > > speculative.
> > > > > > > > > >
> > > > > > > > > > Supporting EMODPE is IMHO by factors more critical.
> > > > > > > > >
> > > > > > > > > At least it does not protected against enclave code because
> > > > > an enclave
> > > > > > > > > can
> > > > > > > > > always choose not to EACCEPT any of the EMODPR requests. I'm
> > > > > not only
> > > > > > > > > confused here about the actual threat but also the potential
> > > > > adversary
> > > > > > > > > and
> > > > > > > > > target.
> > > > > > > > >
> > > > > > > > I'm not sure I follow your thoughts here. The sequence should
> > > > > be for enclave
> > > > > > > > to request  EMODPR in the first place through runtime to
> > > > > kernel, then to
> > > > > > > > verify with EACCEPT that the OS indeed has done EMODPR.
> > > > > > > > If enclave does not verify with EACCEPT, then its own code has
> > > > > > > > vulnerability. But this does not justify OS not providing the
> > > > > mechanism to
> > > > > > > > request EMODPR.
> > > > > > >
> > > > > > > The question is really simple: what is the threat scenario? In
> > > > > order to use
> > > > > > > the word "vulnerability", you would need one.
> > > > > > >
> > > > > > > Given the complexity of the whole dance with EMODPR it is
> > > > > mandatory to have
> > > > > > > one, in order to ack it to the mainline.
> > > > > > >
> > > > > > > > Similar to how we don't want have RWX code pages for
> > > normal Linux
> > > > > > > > application, when an enclave loads code pages (either directly
> > > > > or JIT
> > > > > > > > compiled from high level code ) into EAUG'd page (which has
> > > > > RW), we do not
> > > > > > > > want leave pages to be RWX for code to be executable, hence
> > > > > the need of
> > > > > > > > EMODPR request OS to reduce the permissions to RX once the
> > > > > code is ready to
> > > > > > > > execute.
> > > > > > >
> > > > > > > You cannot compare *enforced* permissions outside the enclave,
> > > > > and claim that
> > > > > > > they would be equivalent to the permissions of the already
> > > > > sandboxed code
> > > > > > > inside the enclave, with permissions that are not enforced but
> > > > > are based
> > > > > > > on good will of the enclave code.
> > > > > >
> > > > > > To add, you can already do "EMODPR" by simply adjusting VMA
> > > > > permissions to be
> > > > > > more restrictive. How this would be worse than this collaboration
> > > > > based
> > > > > > thing?
> > > > >
> > > > > ... or you could even make soft version of EMODPR without using that
> > > > > opcode
> > > > > by writing an ioctl to update our xarray to allow lower permissions.
> > > > > That
> > > > > ties the hands of the process who is doing the mmap() already.
> > > >
> > > > E.g. why not just
> > > >
> > > > #define SGX_IOC_ENCLAVE_RESTRICT_PAGE_PERMISSIONS \
> > > > 	_IOW(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_page_permissions)
> > > > #define SGX_IOC_ENCLAVE_EXTEND_PAGE_PERMISSIONS \
> > > > 	_IOW(SGX_MAGIC, 0x06, struct sgx_enclave_extend_page_permissions)
> > > >
> > > > struct sgx_enclave_restrict_page_permissions {
> > > > 	__u64 src;
> > > > 	__u64 offset;
> > > > 	__u64 length;
> > > > 	__u64 secinfo;
> > > > 	__u64 count;
> > > > };
> > > > struct sgx_enclave_extend_page_permissions {
> > > > 	__u64 src;
> > > > 	__u64 offset;
> > > > 	__u64 length;
> > > > 	__u64 secinfo;
> > > > 	__u64 count;
> > > > };
> > > >
> > > > These would simply update the xarray and nothing else. I'd go with two
> > > > ioctls (with the necessary checks for secinfo) in order to provide
> > > hook
> > > > up points in the future for LSMs.
> > > >
> > > > This leaves only EAUG and EMODT requiring the EACCEPT handshake.
> > > >
> > > > /Jarkko
> > > The trusted code base here is the enclave. It can't trust any code
> > > outside
> > > for enforcement. There is also need for TLB shootdown.
> > > 
> > > To answer your earlier question about threat, the threat is
> > > malicious/compromised code inside enclave. Yes, you can say the
> > > whole thing
> > > is sand-boxed, but the runtime inside enclave could load complex
> > > upper layer
> > > code.  Therefore the runtime needs to have a trusted mechanism to ensure
> > > code pages not writable so that there is less/no chance for compromised
> > > malicious enclave to modify existing code pages. I still consider it
> > > to be
> > > similar to normal Linux elf-loader/dynamic linker relying on
> > > mmap/mprotect
> > > and trusting OS to enforce permissions, but here the enclave runtime
> > > only
> > > trust the HW provided mechanism: EMODPR to change EPCM records and
> > > EACCEPT
> > > to verify.
> > 
> > So what if:
> > 
> > 1. User space does EMODPR ioctl.
> > 2. Enclave does EACCEPT.
> > 3. Enclave does EMODPE.
> > 
> Could you elaborate on your exact concern here? EMODPE won't be able to
> restrict permissions, only add, so no way to cancel what's done by EMODPR if
> that's your concern.
> 
> And EMODPE would only affect EPCM not PTE. So if OS set PTE no matching
> EPCM, the enclave won't be able to use the page for added access.

The problem I see is clearly visible in your last sentence, if you think
about it. That is all I can add to this discussion for the moment.
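
A toy C model of the three-step sequence above may help. It is only a sketch:
the struct and helper names are hypothetical, the PERM_* bits are made up, and
EACCEPT is left out because it only verifies the result of EMODPR.

```c
#include <assert.h>

/* Hypothetical permission bits for this sketch, not the kernel's definitions. */
#define PERM_R   0x1u
#define PERM_W   0x2u
#define PERM_X   0x4u
#define PERM_ALL (PERM_R | PERM_W | PERM_X)

/* Toy page state: permissions recorded in the EPCM and in the OS page tables. */
struct toy_page {
	unsigned int epcm;
	unsigned int pte;
};

/* EMODPR (run by the OS): can only remove EPCM permissions. */
static void toy_emodpr(struct toy_page *p, unsigned int perms)
{
	assert((perms & ~PERM_ALL) == 0);
	p->epcm &= perms;
}

/* EMODPE (run inside the enclave): can only add EPCM permissions. */
static void toy_emodpe(struct toy_page *p, unsigned int perms)
{
	assert((perms & ~PERM_ALL) == 0);
	p->epcm |= perms;
}

/* Access the enclave actually gets: both EPCM and PTE must allow it. */
static unsigned int toy_access(const struct toy_page *p)
{
	return p->epcm & p->pte;
}
```

Starting from RWX in both places, EMODPR to RX (with the PTE also set to RX)
followed by EMODPE adding W leaves the EPCM at RWX again while the PTE still
says RX, so in this model the usable access is bounded by the PTE.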

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-13 21:42                                       ` Reinette Chatre
@ 2022-01-14 21:53                                         ` Jarkko Sakkinen
  2022-01-14 21:57                                           ` Jarkko Sakkinen
                                                             ` (2 more replies)
  2022-01-17 13:27                                         ` Nathaniel McCallum
  1 sibling, 3 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 21:53 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> Hi Jarkko and Nathaniel,
> 
> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>
> >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> >>>> Hi Jarkko,
> >>>>
> >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> >>>>>>>> mechanism just
> >>>>>>>>>>>> to extend
> >>>>>>>>>>>>> permissions be sufficient?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I do believe that we need both in order to support pages
> >>>>>>>> having only
> >>>>>>>>>>>> the permissions required to support their intended use
> >>>>>>>> during the
> >>>>>>>>>>>> time the
> >>>>>>>>>>>> particular access is required. While technically it is
> >>>>>>>> possible to grant
> >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> >>>>>>>> is safer to
> >>>>>>>>>>>> remove permissions when no longer required.
> >>>>>>>>>>>
> >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>>>>>> how using it
> >>>>>>>>>>> would make things safer?
> >>>>>>>>>>>
> >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>>>>>> defensive
> >>>>>>>>>> measure. In that case, EMODPR is useful.
> >>>>>>>>>
> >>>>>>>>> What is the exact threat we are talking about?
> >>>>>>>>
> >>>>>>>> To add: it should be *significantly* critical thread, given that not
> >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> >>>>>>>> EACCEPT involvement.
> >>>>>>>>
> >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>>>>>> introduce
> >>>>>>>> it when there is PoC code for any of the existing run-time that
> >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>>>>>
> >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> >>>>>>>
> >>>>>>> At least it does not protected against enclave code because an enclave
> >>>>>>> can
> >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>>>>>> confused here about the actual threat but also the potential adversary
> >>>>>>> and
> >>>>>>> target.
> >>>>>>>
> >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> >>>>>> If enclave does not verify with EACCEPT, then its own code has
> >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> >>>>>> request EMODPR.
> >>>>>
> >>>>> The question is really simple: what is the threat scenario? In order to use
> >>>>> the word "vulnerability", you would need one.
> >>>>>
> >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> >>>>> one, in order to ack it to the mainline.
> >>>>>
> >>>>
> >>>> Which complexity related to EMODPR are you concerned about? In a later message
> >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> >>>> these flows in the OS, but would of course impact the enclave.
> >>>
> >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> >>> EMODPR going to help with any sort of workload?
> >>
> >> I've even started think should we just always allow mmap()?
> > 
> > I suspect this may be the most ergonomic way forward. Instructions
> > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > what the enclave wants, which is a memory mapping in the enclave. Why
> > make the enclave runner do multiple context switches just to change
> > the memory map of an enclave?
> 
> The enclave runner is not forced to make any changes to a memory mapping. To start,
> this implementation supports and does not change the existing ABI where a new
> memory mapping can only be created if its permissions are the same or weaker
> than the EPCM permissions. After the memory mapping is created the EPCM permissions
> can change (thanks to SGX2) and when they do there are no forced nor required
> changes to the memory mapping - pages remain accessible where the memory mapping
> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> to an enclave page (EMODPE) then the memory mapping may need to be changed as
> should be expected to access a page with permissions that the memory mapping
> did not previously allow.
> 
> Are you saying that the permissions of a new memory mapping should now be allowed
> to exceed EPCM permissions and thus the enclave runner would not need to modify a
> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> considered a change in ABI but something we could support on SGX2 systems.
> 
> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> and EMODPE to manage enclave page permissions.

Thanks, this was a very informative response. I'll try to elaborate on why
EMODPR gives me headaches.

I'm having a hard time connecting the dots between OS mistrust and
restricting the enclave by changing EPCM permissions. To make EMODPR actually
legit, it really needs at least some sort of example of a scenario where the
mistrusted OS is the adversary and the enclave is the attack target. Otherwise,
we are just waving our hands.

Generally speaking, a restriction is not a restriction if it cannot be enforced.

I see two non-EMODPR options: you could relax this, *or* you could make it a
soft restriction by not doing EMODPR but instead just updating the internal
xarray. The 2nd option would be fully backwards compatible with the
existing invariant.
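
As a rough sketch of the 2nd option in C: a plain array stands in for the
xarray, and every name here (toy_encl, toy_restrict, toy_may_map, the PERM_*
bits) is hypothetical. It does not track existing mappings and is not a
proposal for the actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical permission bits for this sketch, not the kernel's definitions. */
#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u
#define TOY_NR_PAGES 4

/* Stand-in for the per-page maximum permissions recorded in the xarray. */
struct toy_encl {
	unsigned int max_perms[TOY_NR_PAGES];
};

/* Soft restriction: lower the recorded maximum; no EMODPR is executed. */
static int toy_restrict(struct toy_encl *encl, size_t idx, unsigned int perms)
{
	assert(idx < TOY_NR_PAGES);
	if (perms & ~encl->max_perms[idx])
		return -1;	/* would add permissions, so not a restriction */
	encl->max_perms[idx] = perms;
	return 0;
}

/* Invariant: a mapping may not exceed the maximum of any page it covers. */
static bool toy_may_map(const struct toy_encl *encl, size_t first, size_t count,
			unsigned int requested)
{
	size_t i;

	assert(first <= TOY_NR_PAGES && count <= TOY_NR_PAGES - first);
	for (i = first; i < first + count; i++) {
		if (requested & ~encl->max_perms[i])
			return false;
	}
	return true;
}
```

After toy_restrict() lowers one page to RX, a new RW mapping covering that page
is refused by the same check that exists today, which is what I mean by
backwards compatible.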

It's really hard to ACK or NAK the EMODPR patch without knowing how EMODPE is
or will be supported.

> Reinette
 
/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 21:53                                         ` Jarkko Sakkinen
@ 2022-01-14 21:57                                           ` Jarkko Sakkinen
  2022-01-14 22:00                                             ` Jarkko Sakkinen
  2022-01-14 22:17                                           ` Jarkko Sakkinen
  2022-01-14 23:05                                           ` Reinette Chatre
  2 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 21:57 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > Hi Jarkko and Nathaniel,
> > 
> > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >>
> > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > >>>> Hi Jarkko,
> > >>>>
> > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > >>>>>>>> mechanism just
> > >>>>>>>>>>>> to extend
> > >>>>>>>>>>>>> permissions be sufficient?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > >>>>>>>> having only
> > >>>>>>>>>>>> the permissions required to support their intended use
> > >>>>>>>> during the
> > >>>>>>>>>>>> time the
> > >>>>>>>>>>>> particular access is required. While technically it is
> > >>>>>>>> possible to grant
> > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > >>>>>>>> is safer to
> > >>>>>>>>>>>> remove permissions when no longer required.
> > >>>>>>>>>>>
> > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > >>>>>>>> how using it
> > >>>>>>>>>>> would make things safer?
> > >>>>>>>>>>>
> > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > >>>>>>>> defensive
> > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > >>>>>>>>>
> > >>>>>>>>> What is the exact threat we are talking about?
> > >>>>>>>>
> > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > >>>>>>>> EACCEPT involvement.
> > >>>>>>>>
> > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > >>>>>>>> introduce
> > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > >>>>>>>>
> > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > >>>>>>>
> > >>>>>>> At least it does not protected against enclave code because an enclave
> > >>>>>>> can
> > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > >>>>>>> confused here about the actual threat but also the potential adversary
> > >>>>>>> and
> > >>>>>>> target.
> > >>>>>>>
> > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > >>>>>> request EMODPR.
> > >>>>>
> > >>>>> The question is really simple: what is the threat scenario? In order to use
> > >>>>> the word "vulnerability", you would need one.
> > >>>>>
> > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > >>>>> one, in order to ack it to the mainline.
> > >>>>>
> > >>>>
> > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > >>>> these flows in the OS, but would of course impact the enclave.
> > >>>
> > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > >>> EMODPR going to help with any sort of workload?
> > >>
> > >> I've even started think should we just always allow mmap()?
> > > 
> > > I suspect this may be the most ergonomic way forward. Instructions
> > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > make the enclave runner do multiple context switches just to change
> > > the memory map of an enclave?
> > 
> > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > this implementation supports and does not change the existing ABI where a new
> > memory mapping can only be created if its permissions are the same or weaker
> > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > can change (thanks to SGX2) and when they do there are no forced nor required
> > changes to the memory mapping - pages remain accessible where the memory mapping
> > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > should be expected to access a page with permissions that the memory mapping
> > did not previously allow.
> > 
> > Are you saying that the permissions of a new memory mapping should now be allowed
> > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > considered a change in ABI but something we could support on SGX2 systems.
> > 
> > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > and EMODPE to manage enclave page permissions.
> 
> Thanks, this was very informative response. I'll try to elaborate why
> EMODPR gives me headaches.
> 
> I'm having hard time to connect the dots between OS mistrust and
> restricting enclave by changing EPCM permissions. To make EMODPR actually
> legit, it needs really at least some sort of example of a scenario where
> mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> we are just waving our hands.
> 
> Generally speaking a restriction is not a restriction if cannot be enforced. 
> 
> I see two non-EMODPR options: you could relax this,  *or* you could make it
> soft restriction by not doing EMODPR but instead just updating the internal
> xarray. The 2nd option would be fully backwards compatible with the
> existing invariant.
> 
> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> or will be supported.

Off-topic: I was able to compile a kernel with your SGX2 patches and run the
kselftests. I cannot give tested-by's before the design is locked in, but
in that sense I don't think we are far away from a solution. The EAUG side
looks pretty good to me.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 21:57                                           ` Jarkko Sakkinen
@ 2022-01-14 22:00                                             ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 22:00 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Fri, Jan 14, 2022 at 11:57:08PM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko and Nathaniel,
> > > 
> > > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > >>
> > > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>>
> > > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > > >>>>>>>> mechanism just
> > > >>>>>>>>>>>> to extend
> > > >>>>>>>>>>>>> permissions be sufficient?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > > >>>>>>>> having only
> > > >>>>>>>>>>>> the permissions required to support their intended use
> > > >>>>>>>> during the
> > > >>>>>>>>>>>> time the
> > > >>>>>>>>>>>> particular access is required. While technically it is
> > > >>>>>>>> possible to grant
> > > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > > >>>>>>>> is safer to
> > > >>>>>>>>>>>> remove permissions when no longer required.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > >>>>>>>> how using it
> > > >>>>>>>>>>> would make things safer?
> > > >>>>>>>>>>>
> > > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > >>>>>>>> defensive
> > > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > > >>>>>>>>>
> > > >>>>>>>>> What is the exact threat we are talking about?
> > > >>>>>>>>
> > > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > > >>>>>>>> EACCEPT involvement.
> > > >>>>>>>>
> > > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > >>>>>>>> introduce
> > > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > > >>>>>>>>
> > > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > > >>>>>>>
> > > >>>>>>> At least it does not protected against enclave code because an enclave
> > > >>>>>>> can
> > > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > >>>>>>> confused here about the actual threat but also the potential adversary
> > > >>>>>>> and
> > > >>>>>>> target.
> > > >>>>>>>
> > > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > > >>>>>> request EMODPR.
> > > >>>>>
> > > >>>>> The question is really simple: what is the threat scenario? In order to use
> > > >>>>> the word "vulnerability", you would need one.
> > > >>>>>
> > > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > >>>>> one, in order to ack it to the mainline.
> > > >>>>>
> > > >>>>
> > > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > >>>> these flows in the OS, but would of course impact the enclave.
> > > >>>
> > > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > > >>> EMODPR going to help with any sort of workload?
> > > >>
> > > >> I've even started think should we just always allow mmap()?
> > > > 
> > > > I suspect this may be the most ergonomic way forward. Instructions
> > > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > > make the enclave runner do multiple context switches just to change
> > > > the memory map of an enclave?
> > > 
> > > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > > this implementation supports and does not change the existing ABI where a new
> > > memory mapping can only be created if its permissions are the same or weaker
> > > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > > can change (thanks to SGX2) and when they do there are no forced nor required
> > > changes to the memory mapping - pages remain accessible where the memory mapping
> > > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > > should be expected to access a page with permissions that the memory mapping
> > > did not previously allow.
> > > 
> > > Are you saying that the permissions of a new memory mapping should now be allowed
> > > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > > considered a change in ABI but something we could support on SGX2 systems.
> > > 
> > > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > > and EMODPE to manage enclave page permissions.
> > 
> > Thanks, this was very informative response. I'll try to elaborate why
> > EMODPR gives me headaches.
> > 
> > I'm having hard time to connect the dots between OS mistrust and
> > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > legit, it needs really at least some sort of example of a scenario where
> > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > we are just waving our hands.
> > 
> > Generally speaking a restriction is not a restriction if cannot be enforced. 
> > 
> > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > soft restriction by not doing EMODPR but instead just updating the internal
> > xarray. The 2nd option would be fully backwards compatible with the
> > existing invariant.
> > 
> > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > or will be supported.
> 
> Off-topic: I was able to compile a kernel with your SGX2 patches and run
> kselftests. I cannot give tested-by's before the design is locked-in but
> in that sense I don't think we are far away of some solution. EAUG side
> looks pretty good to me.

Also, I'll add the missing ioctls to my man page patch before sending a
new version, so that it lacks only the SGX2 ioctls. I think you are right in
your review feedback that they should be part of the patch so that we keep
them in sync.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 21:53                                         ` Jarkko Sakkinen
  2022-01-14 21:57                                           ` Jarkko Sakkinen
@ 2022-01-14 22:17                                           ` Jarkko Sakkinen
  2022-01-14 22:23                                             ` Jarkko Sakkinen
  2022-01-14 23:05                                           ` Reinette Chatre
  2 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 22:17 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > Hi Jarkko and Nathaniel,
> > 
> > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >>
> > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > >>>> Hi Jarkko,
> > >>>>
> > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > >>>>>>>> mechanism just
> > >>>>>>>>>>>> to extend
> > >>>>>>>>>>>>> permissions be sufficient?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > >>>>>>>> having only
> > >>>>>>>>>>>> the permissions required to support their intended use
> > >>>>>>>> during the
> > >>>>>>>>>>>> time the
> > >>>>>>>>>>>> particular access is required. While technically it is
> > >>>>>>>> possible to grant
> > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > >>>>>>>> is safer to
> > >>>>>>>>>>>> remove permissions when no longer required.
> > >>>>>>>>>>>
> > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > >>>>>>>> how using it
> > >>>>>>>>>>> would make things safer?
> > >>>>>>>>>>>
> > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > >>>>>>>> defensive
> > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > >>>>>>>>>
> > >>>>>>>>> What is the exact threat we are talking about?
> > >>>>>>>>
> > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > >>>>>>>> EACCEPT involvement.
> > >>>>>>>>
> > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > >>>>>>>> introduce
> > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > >>>>>>>>
> > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > >>>>>>>
> > >>>>>>> At least it does not protected against enclave code because an enclave
> > >>>>>>> can
> > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > >>>>>>> confused here about the actual threat but also the potential adversary
> > >>>>>>> and
> > >>>>>>> target.
> > >>>>>>>
> > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > >>>>>> request EMODPR.
> > >>>>>
> > >>>>> The question is really simple: what is the threat scenario? In order to use
> > >>>>> the word "vulnerability", you would need one.
> > >>>>>
> > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > >>>>> one, in order to ack it to the mainline.
> > >>>>>
> > >>>>
> > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > >>>> these flows in the OS, but would of course impact the enclave.
> > >>>
> > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > >>> EMODPR going to help with any sort of workload?
> > >>
> > >> I've even started think should we just always allow mmap()?
> > > 
> > > I suspect this may be the most ergonomic way forward. Instructions
> > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > make the enclave runner do multiple context switches just to change
> > > the memory map of an enclave?
> > 
> > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > this implementation supports and does not change the existing ABI where a new
> > memory mapping can only be created if its permissions are the same or weaker
> > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > can change (thanks to SGX2) and when they do there are no forced nor required
> > changes to the memory mapping - pages remain accessible where the memory mapping
> > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > should be expected to access a page with permissions that the memory mapping
> > did not previously allow.
> > 
> > Are you saying that the permissions of a new memory mapping should now be allowed
> > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > considered a change in ABI but something we could support on SGX2 systems.
> > 
> > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > and EMODPE to manage enclave page permissions.
> 
> Thanks, this was very informative response. I'll try to elaborate why
> EMODPR gives me headaches.
> 
> I'm having hard time to connect the dots between OS mistrust and
> restricting enclave by changing EPCM permissions. To make EMODPR actually
> legit, it needs really at least some sort of example of a scenario where
> mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> we are just waving our hands.
> 
> Generally speaking a restriction is not a restriction if cannot be enforced. 
> 
> I see two non-EMODPR options: you could relax this,  *or* you could make it
> soft restriction by not doing EMODPR but instead just updating the internal
> xarray. The 2nd option would be fully backwards compatible with the
> existing invariant.
> 
> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> or will be supported.

I think I *might* have a supporting scenario for EMODPR.

An enclave might want to accept an EMODPR request because a bug in
functionality triggered through TCS entries might otherwise allow enclave
data to be rewritten, i.e. provide a write primitive to a party outside the
enclave. Combined with some other exploit that provides a read primitive,
an attacker would then have full access to the internal data of the enclave.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 22:17                                           ` Jarkko Sakkinen
@ 2022-01-14 22:23                                             ` Jarkko Sakkinen
  2022-01-14 22:34                                               ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 22:23 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 15, 2022 at 12:17:06AM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko and Nathaniel,
> > > 
> > > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > >>
> > > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>>
> > > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > > >>>>>>>> mechanism just
> > > >>>>>>>>>>>> to extend
> > > >>>>>>>>>>>>> permissions be sufficient?
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > > >>>>>>>> having only
> > > >>>>>>>>>>>> the permissions required to support their intended use
> > > >>>>>>>> during the
> > > >>>>>>>>>>>> time the
> > > >>>>>>>>>>>> particular access is required. While technically it is
> > > >>>>>>>> possible to grant
> > > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > > >>>>>>>> is safer to
> > > >>>>>>>>>>>> remove permissions when no longer required.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > >>>>>>>> how using it
> > > >>>>>>>>>>> would make things safer?
> > > >>>>>>>>>>>
> > > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > >>>>>>>> defensive
> > > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > > >>>>>>>>>
> > > >>>>>>>>> What is the exact threat we are talking about?
> > > >>>>>>>>
> > > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > > >>>>>>>> EACCEPT involvement.
> > > >>>>>>>>
> > > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > >>>>>>>> introduce
> > > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > > >>>>>>>>
> > > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > > >>>>>>>
> > > >>>>>>> At least it does not protected against enclave code because an enclave
> > > >>>>>>> can
> > > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > >>>>>>> confused here about the actual threat but also the potential adversary
> > > >>>>>>> and
> > > >>>>>>> target.
> > > >>>>>>>
> > > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > > >>>>>> request EMODPR.
> > > >>>>>
> > > >>>>> The question is really simple: what is the threat scenario? In order to use
> > > >>>>> the word "vulnerability", you would need one.
> > > >>>>>
> > > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > >>>>> one, in order to ack it to the mainline.
> > > >>>>>
> > > >>>>
> > > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > >>>> these flows in the OS, but would of course impact the enclave.
> > > >>>
> > > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > > >>> EMODPR going to help with any sort of workload?
> > > >>
> > > >> I've even started think should we just always allow mmap()?
> > > > 
> > > > I suspect this may be the most ergonomic way forward. Instructions
> > > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > > make the enclave runner do multiple context switches just to change
> > > > the memory map of an enclave?
> > > 
> > > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > > this implementation supports and does not change the existing ABI where a new
> > > memory mapping can only be created if its permissions are the same or weaker
> > > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > > can change (thanks to SGX2) and when they do there are no forced nor required
> > > changes to the memory mapping - pages remain accessible where the memory mapping
> > > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > > should be expected to access a page with permissions that the memory mapping
> > > did not previously allow.
> > > 
> > > Are you saying that the permissions of a new memory mapping should now be allowed
> > > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > > considered a change in ABI but something we could support on SGX2 systems.
> > > 
> > > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > > and EMODPE to manage enclave page permissions.
> > 
> > Thanks, this was very informative response. I'll try to elaborate why
> > EMODPR gives me headaches.
> > 
> > I'm having hard time to connect the dots between OS mistrust and
> > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > legit, it needs really at least some sort of example of a scenario where
> > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > we are just waving our hands.
> > 
> > Generally speaking a restriction is not a restriction if cannot be enforced. 
> > 
> > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > soft restriction by not doing EMODPR but instead just updating the internal
> > xarray. The 2nd option would be fully backwards compatible with the
> > existing invariant.
> > 
> > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > or will be supported.
> 
> I think I *might* have a supporting scenario for EMODPR.
> 
> Enclave might want to accept EMODPR request because a bug in functionality
> triggered with TCS entries might allow otherwise to rewrite enclave data,
> i.e. provide a write primitive outside the enclave. With some other way to
> exploit you could have a read primitive and thus have a full access to the
> internal data of the enclave.

I.e. because of this it would be a "for profit" case for the enclave not to
cancel the effect of EMODPR by applying EMODPE: by leaving the restriction
in place it can protect itself from malformed input data.

I get that the whole point is OS mistrust, but you really need to spell out
the rationale, i.e. what specifically you mean by it in the context of the
kernel patch. Otherwise anything would go just by saying that we do it
because of OS mistrust.
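
To make the EMODPR/EACCEPT/EMODPE interplay concrete, here is a minimal toy
model in Python. It is only a sketch under stated assumptions: the names
(Perm, Page, emodpr, eaccept, emodpe, effective) are hypothetical and are
not kernel or SDK APIs, and the hardware semantics are heavily simplified.
It assumes that access needs both the page tables and the EPCM to allow it,
that EMODPR can only remove EPCM bits, and that EMODPE can only add them.

```python
from dataclasses import dataclass
from enum import IntFlag


class Perm(IntFlag):
    NONE = 0
    R = 1
    W = 2
    X = 4


@dataclass
class Page:
    epcm: Perm             # permissions tracked by hardware in the EPCM
    pte: Perm              # permissions in the OS-controlled page tables
    pending: bool = False  # restriction not yet acknowledged by the enclave


def emodpr(page: Page, perms: Perm) -> None:
    """OS side (ENCLS leaf): restrict EPCM permissions, bits are only removed."""
    page.epcm &= perms
    page.pending = True


def eaccept(page: Page, expected: Perm) -> bool:
    """Enclave side (ENCLU leaf): check that the restriction was really applied."""
    if not page.pending or page.epcm != expected:
        return False
    page.pending = False
    return True


def emodpe(page: Page, perms: Perm) -> None:
    """Enclave side (ENCLU leaf): extend EPCM permissions, bits are only added."""
    page.epcm |= perms


def effective(page: Page) -> Perm:
    """Access is possible only where page tables and EPCM agree."""
    return page.pte & page.epcm


# A page holding data that should become read-only.
page = Page(epcm=Perm.R | Perm.W, pte=Perm.R | Perm.W)
emodpr(page, Perm.R)                 # requested by the enclave, done by the OS
assert eaccept(page, Perm.R)         # enclave verifies the restriction
assert Perm.W not in effective(page)

# The enclave itself can cancel the restriction again with EMODPE.
emodpe(page, Perm.R | Perm.W)
assert Perm.W in effective(page)
```

In this model the restriction only holds for as long as the enclave chooses
not to extend the permissions again, which is the "for profit" case above.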

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 22:23                                             ` Jarkko Sakkinen
@ 2022-01-14 22:34                                               ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 22:34 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 15, 2022 at 12:23:46AM +0200, Jarkko Sakkinen wrote:
> On Sat, Jan 15, 2022 at 12:17:06AM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 11:53:22PM +0200, Jarkko Sakkinen wrote:
> > > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> > > > Hi Jarkko and Nathaniel,
> > > > 
> > > > On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > > > > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > > >>
> > > > >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> > > > >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> > > > >>>> Hi Jarkko,
> > > > >>>>
> > > > >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> > > > >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> > > > >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> > > > >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> > > > >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> > > > >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> > > > >>>>>>>> mechanism just
> > > > >>>>>>>>>>>> to extend
> > > > >>>>>>>>>>>>> permissions be sufficient?
> > > > >>>>>>>>>>>>
> > > > >>>>>>>>>>>> I do believe that we need both in order to support pages
> > > > >>>>>>>> having only
> > > > >>>>>>>>>>>> the permissions required to support their intended use
> > > > >>>>>>>> during the
> > > > >>>>>>>>>>>> time the
> > > > >>>>>>>>>>>> particular access is required. While technically it is
> > > > >>>>>>>> possible to grant
> > > > >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> > > > >>>>>>>> is safer to
> > > > >>>>>>>>>>>> remove permissions when no longer required.
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> > > > >>>>>>>> how using it
> > > > >>>>>>>>>>> would make things safer?
> > > > >>>>>>>>>>>
> > > > >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> > > > >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> > > > >>>>>>>> defensive
> > > > >>>>>>>>>> measure. In that case, EMODPR is useful.
> > > > >>>>>>>>>
> > > > >>>>>>>>> What is the exact threat we are talking about?
> > > > >>>>>>>>
> > > > >>>>>>>> To add: it should be *significantly* critical thread, given that not
> > > > >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> > > > >>>>>>>> EACCEPT involvement.
> > > > >>>>>>>>
> > > > >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> > > > >>>>>>>> introduce
> > > > >>>>>>>> it when there is PoC code for any of the existing run-time that
> > > > >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> > > > >>>>>>>>
> > > > >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> > > > >>>>>>>
> > > > >>>>>>> At least it does not protected against enclave code because an enclave
> > > > >>>>>>> can
> > > > >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> > > > >>>>>>> confused here about the actual threat but also the potential adversary
> > > > >>>>>>> and
> > > > >>>>>>> target.
> > > > >>>>>>>
> > > > >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> > > > >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> > > > >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> > > > >>>>>> If enclave does not verify with EACCEPT, then its own code has
> > > > >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> > > > >>>>>> request EMODPR.
> > > > >>>>>
> > > > >>>>> The question is really simple: what is the threat scenario? In order to use
> > > > >>>>> the word "vulnerability", you would need one.
> > > > >>>>>
> > > > >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> > > > >>>>> one, in order to ack it to the mainline.
> > > > >>>>>
> > > > >>>>
> > > > >>>> Which complexity related to EMODPR are you concerned about? In a later message
> > > > >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> > > > >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> > > > >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> > > > >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> > > > >>>> these flows in the OS, but would of course impact the enclave.
> > > > >>>
> > > > >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> > > > >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> > > > >>> EMODPR going to help with any sort of workload?
> > > > >>
> > > > >> I've even started think should we just always allow mmap()?
> > > > > 
> > > > > I suspect this may be the most ergonomic way forward. Instructions
> > > > > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > > > > what the enclave wants, which is a memory mapping in the enclave. Why
> > > > > make the enclave runner do multiple context switches just to change
> > > > > the memory map of an enclave?
> > > > 
> > > > The enclave runner is not forced to make any changes to a memory mapping. To start,
> > > > this implementation supports and does not change the existing ABI where a new
> > > > memory mapping can only be created if its permissions are the same or weaker
> > > > than the EPCM permissions. After the memory mapping is created the EPCM permissions
> > > > can change (thanks to SGX2) and when they do there are no forced nor required
> > > > changes to the memory mapping - pages remain accessible where the memory mapping
> > > > and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> > > > to an enclave page (EMODPE) then the memory mapping may need to be changed as
> > > > should be expected to access a page with permissions that the memory mapping
> > > > did not previously allow.
> > > > 
> > > > Are you saying that the permissions of a new memory mapping should now be allowed
> > > > to exceed EPCM permissions and thus the enclave runner would not need to modify a
> > > > memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> > > > considered a change in ABI but something we could support on SGX2 systems.
> > > > 
> > > > I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> > > > that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> > > > and EMODPE to manage enclave page permissions.
> > > 
> > > Thanks, this was very informative response. I'll try to elaborate why
> > > EMODPR gives me headaches.
> > > 
> > > I'm having hard time to connect the dots between OS mistrust and
> > > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > > legit, it needs really at least some sort of example of a scenario where
> > > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > > we are just waving our hands.
> > > 
> > > Generally speaking a restriction is not a restriction if cannot be enforced. 
> > > 
> > > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > > soft restriction by not doing EMODPR but instead just updating the internal
> > > xarray. The 2nd option would be fully backwards compatible with the
> > > existing invariant.
> > > 
> > > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > > or will be supported.
> > 
> > I think I *might* have a supporting scenario for EMODPR.
> > 
> > Enclave might want to accept EMODPR request because a bug in functionality
> > triggered with TCS entries might allow otherwise to rewrite enclave data,
> > i.e. provide a write primitive outside the enclave. With some other way to
> > exploit you could have a read primitive and thus have a full access to the
> > internal data of the enclave.
> 
> I.e. because of this it would be "for profit case" for the enclave not to
> cancel the effect of EMODPR by applying EMODPE because it can protect
> itself by doing that from malformed input data.
> 
> I get that the whole point is the OS mistrust but you really need to bring
> up the rationale to the specifics what you mean by it in the context of the
> kernel patch. Otherwise, anything would go by saying that we do this
> because OS mistrust.

My scenario is not legitimate because:

1. An attacker can choose not to do EMODPR and still take advantage of the
   exploit, and get the write primitive.
2. The enclave has only a very theoretical chance to counter that because
   introspection is not possible; only the "mistrusted OS", i.e. the
   attacker, has that capability. ERDINFO is AFAIK an ENCLS leaf.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 21:53                                         ` Jarkko Sakkinen
  2022-01-14 21:57                                           ` Jarkko Sakkinen
  2022-01-14 22:17                                           ` Jarkko Sakkinen
@ 2022-01-14 23:05                                           ` Reinette Chatre
  2022-01-14 23:15                                             ` Jarkko Sakkinen
  2 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-14 23:05 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 1/14/2022 1:53 PM, Jarkko Sakkinen wrote:
> On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
>> Hi Jarkko and Nathaniel,
>>
>> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
>>> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>
>>>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
>>>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
>>>>>> Hi Jarkko,
>>>>>>
>>>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
>>>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>>>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>>>>>>>> OK, so the question is: do we need both or would a
>>>>>>>>>> mechanism just
>>>>>>>>>>>>>> to extend
>>>>>>>>>>>>>>> permissions be sufficient?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do believe that we need both in order to support pages
>>>>>>>>>> having only
>>>>>>>>>>>>>> the permissions required to support their intended use
>>>>>>>>>> during the
>>>>>>>>>>>>>> time the
>>>>>>>>>>>>>> particular access is required. While technically it is
>>>>>>>>>> possible to grant
>>>>>>>>>>>>>> pages all permissions they may need during their lifetime it
>>>>>>>>>> is safer to
>>>>>>>>>>>>>> remove permissions when no longer required.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>>>>>>>> how using it
>>>>>>>>>>>>> would make things safer?
>>>>>>>>>>>>>
>>>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>>>>>>>> defensive
>>>>>>>>>>>> measure. In that case, EMODPR is useful.
>>>>>>>>>>>
>>>>>>>>>>> What is the exact threat we are talking about?
>>>>>>>>>>
>>>>>>>>>> To add: it should be *significantly* critical thread, given that not
>>>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
>>>>>>>>>> EACCEPT involvement.
>>>>>>>>>>
>>>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>>>>>>>> introduce
>>>>>>>>>> it when there is PoC code for any of the existing run-time that
>>>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>>>>>>>
>>>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
>>>>>>>>>
>>>>>>>>> At least it does not protected against enclave code because an enclave
>>>>>>>>> can
>>>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>>>>>>>> confused here about the actual threat but also the potential adversary
>>>>>>>>> and
>>>>>>>>> target.
>>>>>>>>>
>>>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>>>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
>>>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
>>>>>>>> If enclave does not verify with EACCEPT, then its own code has
>>>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
>>>>>>>> request EMODPR.
>>>>>>>
>>>>>>> The question is really simple: what is the threat scenario? In order to use
>>>>>>> the word "vulnerability", you would need one.
>>>>>>>
>>>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
>>>>>>> one, in order to ack it to the mainline.
>>>>>>>
>>>>>>
>>>>>> Which complexity related to EMODPR are you concerned about? In a later message
>>>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
>>>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
>>>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
>>>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
>>>>>> these flows in the OS, but would of course impact the enclave.
>>>>>
>>>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
>>>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
>>>>> EMODPR going to help with any sort of workload?
>>>>
>>>> I've even started think should we just always allow mmap()?
>>>
>>> I suspect this may be the most ergonomic way forward. Instructions
>>> like EAUG/EMODPR/etc are really irrelevant implementation details to
>>> what the enclave wants, which is a memory mapping in the enclave. Why
>>> make the enclave runner do multiple context switches just to change
>>> the memory map of an enclave?
>>
>> The enclave runner is not forced to make any changes to a memory mapping. To start,
>> this implementation supports and does not change the existing ABI where a new
>> memory mapping can only be created if its permissions are the same or weaker
>> than the EPCM permissions. After the memory mapping is created the EPCM permissions
>> can change (thanks to SGX2) and when they do there are no forced nor required
>> changes to the memory mapping - pages remain accessible where the memory mapping
>> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
>> to an enclave page (EMODPE) then the memory mapping may need to be changed as
>> should be expected to access a page with permissions that the memory mapping
>> did not previously allow.
>>
>> Are you saying that the permissions of a new memory mapping should now be allowed
>> to exceed EPCM permissions and thus the enclave runner would not need to modify a
>> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
>> considered a change in ABI but something we could support on SGX2 systems.
>>
>> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
>> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
>> and EMODPE to manage enclave page permissions.
> 
> Thanks, this was very informative response. I'll try to elaborate why
> EMODPR gives me headaches.
> 
> I'm having hard time to connect the dots between OS mistrust and
> restricting enclave by changing EPCM permissions. To make EMODPR actually
> legit, it needs really at least some sort of example of a scenario where
> mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> we are just waving our hands.

The enclave itself should run in an environment that respects the foundational
security principle of least privilege. There are valid scenarios where an enclave
may need to write to a page and then later execute from it, but it should
ideally not have those permissions for its entire lifetime. As Haitao mentioned,
this is to protect the enclave from itself, which can mean bugs in its code or
maybe even malicious code, and is similar to how non-enclave code is treated
today. So I interpret your request to mean that the enclave is both the
adversary and the target.

At the same time, those running the enclaves do not trust the OS to manage
access to enclave pages, and that is why the SGX2 functions for changing the
EPCM permissions are needed.

> 
> Generally speaking a restriction is not a restriction if cannot be enforced. 

The EPCM permissions are enforced by the hardware and in addition
in this implementation the OS will not add PTEs that exceed the EPCM permissions.

> 
> I see two non-EMODPR options: you could relax this,  *or* you could make it
> soft restriction by not doing EMODPR but instead just updating the internal
> xarray. The 2nd option would be fully backwards compatible with the
> existing invariant.

Just updating OS structures would not be sufficient. For its security the enclave
needs its EPCM permissions updated and, in addition, needs to ensure that no
cached address translations remain in the TLB that may still allow access to
pages with permissions that were removed.

> 
> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> or will be supported.

EMODPE is currently supported and you can see an example of its use
in the testing code that forms part of this series. Please see
"[PATCH 11/25] selftests/sgx: Add test for EPCM permission changes" where you
will find support for calling ENCLU[EMODPE] added to the test enclave and the 
"epcm_permissions" test making use of it.

After running ENCLU[EMODPE], user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
to communicate the new permissions to the OS. Since the OS does not know when/if
ENCLU[EMODPE] has indeed been called, all permission changes are treated as
permission restrictions and will thus, in addition to removing the page table
entries, call ENCLS[EMODPR], which will be a no-op (apart from EPCM.PR being set)
if user space only called ENCLU[EMODPE]. The page remains accessible even though
EPCM.PR is set, but the current implementation follows a
SGX_IOC_ENCLAVE_MOD_PROTECTIONS call with an ENCLU[EACCEPT] with the PR bit set
to ensure that no dangling PR bit is left set in the EPCM.
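
To make the ordering concrete, here is a minimal user-space sketch of that
flow. It is only a toy model and not code from this series: the struct and
the model_*() helpers are hypothetical names that imitate the documented
effect of each instruction on a single EPCM entry (EMODPE can only add
permissions, EMODPR can only remove them and sets PR, EACCEPT with PR set
clears PR when the permissions match). PTE removal and ETRACK are
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Hypothetical stand-in for one EPCM entry: permission bits plus PR flag. */
struct epcm_model {
	unsigned int perms;
	bool pr;
};

/* Models ENCLU[EMODPE]: runs inside the enclave, can only add permissions. */
static void model_emodpe(struct epcm_model *e, unsigned int perms)
{
	assert(e != NULL);
	e->perms |= perms;
}

/* Models ENCLS[EMODPR]: run by the OS, can only remove permissions and
 * always sets PR, even when no permission is actually removed. */
static void model_emodpr(struct epcm_model *e, unsigned int perms)
{
	assert(e != NULL);
	e->perms &= perms;
	e->pr = true;
}

/* Models the ioctl: the OS cannot tell whether EMODPE ran, so it always
 * issues EMODPR with the requested permissions (PTE zapping not modeled). */
static void model_ioctl_mod_protections(struct epcm_model *e, unsigned int perms)
{
	model_emodpr(e, perms);
}

/* Models ENCLU[EACCEPT] with the PR bit set: returns 0 and clears PR when
 * the EPCM state matches what the enclave expects, -1 otherwise. */
static int model_eaccept_pr(struct epcm_model *e, unsigned int expected)
{
	assert(e != NULL);
	if (!e->pr || e->perms != expected)
		return -1;
	e->pr = false;
	return 0;
}

/* The relax flow described above: EMODPE, then the ioctl, then EACCEPT. */
static int relax_flow(struct epcm_model *e, unsigned int new_perms)
{
	model_emodpe(e, new_perms);
	model_ioctl_mod_protections(e, new_perms);
	return model_eaccept_pr(e, new_perms);
}
```

Starting from an RW page and calling relax_flow() with RWX leaves the model
with RWX and PR clear; skipping the final EACCEPT would leave PR set, which
is the dangling bit mentioned above.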

Reinette


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 23:05                                           ` Reinette Chatre
@ 2022-01-14 23:15                                             ` Jarkko Sakkinen
  2022-01-15  0:01                                               ` Reinette Chatre
  2022-01-15 16:49                                               ` Jarkko Sakkinen
  0 siblings, 2 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-14 23:15 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/14/2022 1:53 PM, Jarkko Sakkinen wrote:
> > On Thu, Jan 13, 2022 at 01:42:50PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko and Nathaniel,
> >>
> >> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> >>> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>>>
> >>>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> >>>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> >>>>>> Hi Jarkko,
> >>>>>>
> >>>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> >>>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >>>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>>>>>>>> OK, so the question is: do we need both or would a
> >>>>>>>>>> mechanism just
> >>>>>>>>>>>>>> to extend
> >>>>>>>>>>>>>>> permissions be sufficient?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I do believe that we need both in order to support pages
> >>>>>>>>>> having only
> >>>>>>>>>>>>>> the permissions required to support their intended use
> >>>>>>>>>> during the
> >>>>>>>>>>>>>> time the
> >>>>>>>>>>>>>> particular access is required. While technically it is
> >>>>>>>>>> possible to grant
> >>>>>>>>>>>>>> pages all permissions they may need during their lifetime it
> >>>>>>>>>> is safer to
> >>>>>>>>>>>>>> remove permissions when no longer required.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>>>>>>>> how using it
> >>>>>>>>>>>>> would make things safer?
> >>>>>>>>>>>>>
> >>>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>>>>>>>> defensive
> >>>>>>>>>>>> measure. In that case, EMODPR is useful.
> >>>>>>>>>>>
> >>>>>>>>>>> What is the exact threat we are talking about?
> >>>>>>>>>>
> >>>>>>>>>> To add: it should be *significantly* critical thread, given that not
> >>>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> >>>>>>>>>> EACCEPT involvement.
> >>>>>>>>>>
> >>>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>>>>>>>> introduce
> >>>>>>>>>> it when there is PoC code for any of the existing run-time that
> >>>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>>>>>>>
> >>>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> >>>>>>>>>
> >>>>>>>>> At least it does not protected against enclave code because an enclave
> >>>>>>>>> can
> >>>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>>>>>>>> confused here about the actual threat but also the potential adversary
> >>>>>>>>> and
> >>>>>>>>> target.
> >>>>>>>>>
> >>>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >>>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> >>>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> >>>>>>>> If enclave does not verify with EACCEPT, then its own code has
> >>>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> >>>>>>>> request EMODPR.
> >>>>>>>
> >>>>>>> The question is really simple: what is the threat scenario? In order to use
> >>>>>>> the word "vulnerability", you would need one.
> >>>>>>>
> >>>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> >>>>>>> one, in order to ack it to the mainline.
> >>>>>>>
> >>>>>>
> >>>>>> Which complexity related to EMODPR are you concerned about? In a later message
> >>>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> >>>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> >>>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> >>>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> >>>>>> these flows in the OS, but would of course impact the enclave.
> >>>>>
> >>>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> >>>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> >>>>> EMODPR going to help with any sort of workload?
> >>>>
> >>>> I've even started think should we just always allow mmap()?
> >>>
> >>> I suspect this may be the most ergonomic way forward. Instructions
> >>> like EAUG/EMODPR/etc are really irrelevant implementation details to
> >>> what the enclave wants, which is a memory mapping in the enclave. Why
> >>> make the enclave runner do multiple context switches just to change
> >>> the memory map of an enclave?
> >>
> >> The enclave runner is not forced to make any changes to a memory mapping. To start,
> >> this implementation supports and does not change the existing ABI where a new
> >> memory mapping can only be created if its permissions are the same or weaker
> >> than the EPCM permissions. After the memory mapping is created the EPCM permissions
> >> can change (thanks to SGX2) and when they do there are no forced nor required
> >> changes to the memory mapping - pages remain accessible where the memory mapping
> >> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> >> to an enclave page (EMODPE) then the memory mapping may need to be changed as
> >> should be expected to access a page with permissions that the memory mapping
> >> did not previously allow.
> >>
> >> Are you saying that the permissions of a new memory mapping should now be allowed
> >> to exceed EPCM permissions and thus the enclave runner would not need to modify a
> >> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> >> considered a change in ABI but something we could support on SGX2 systems.
> >>
> >> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> >> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> >> and EMODPE to manage enclave page permissions.
> > 
> > Thanks, this was very informative response. I'll try to elaborate why
> > EMODPR gives me headaches.
> > 
> > I'm having hard time to connect the dots between OS mistrust and
> > restricting enclave by changing EPCM permissions. To make EMODPR actually
> > legit, it needs really at least some sort of example of a scenario where
> > mistrusted OS is the adversary and enclave is the attack target. Otherwise,
> > we are just waving our hands.
> 
> The enclave itself should run in an environment that respects the foundational
> security principle of least privilege. There are valid scenarios where an enclave
> may need to write to a page and then later execute from it, but it should
> ideally not have those permissions for its entire lifetime. As Haitao mentioned
> this is to protect it from itself, which can be bugs in the code or maybe even
> malicious code and is similar to how non enclave code is treated today. So in your
> request I interpret this to mean that it is the enclave that is both the
> adversary and the target.
> 
> At the same time the ones running the enclaves do not trust the OS to manage
> access to enclave pages and that is how we ended up with the need for these SGX2
> permission functions needed to change the EPCM permissions.
> 
> > 
> > Generally speaking a restriction is not a restriction if cannot be enforced. 
> 
> The EPCM permissions are enforced by the hardware and in addition
> in this implementation the OS will not add PTEs that exceed the EPCM permissions.
> 
> > 
> > I see two non-EMODPR options: you could relax this,  *or* you could make it
> > soft restriction by not doing EMODPR but instead just updating the internal
> > xarray. The 2nd option would be fully backwards compatible with the
> > existing invariant.
> 
> Just updating OS structures would not be sufficient. In addition to needing to
> update its EPCM permissions for its security the enclave needs to ensure that
> there are no cached addresses in the TLB that may still allow access to pages
> with permissions that were removed.

How can an enclave check that the EPCM has the expected permissions for a
page range?

I'd expect an OS-mistrusting enclave to do such a check at the start of a TCS.

Otherwise, it cannot be sure whether EMODPR was ever requested, and thus
it plays zero part in the game.

You would get an equivalent level of security by just modifying the xarray,
and not doing EMODPR.

> > It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
> > or will be supported.
> 
> EMODPE is currently supported and you can see an example of its use
> in the testing code that forms part of this series. Please see
> "[PATCH 11/25] selftests/sgx: Add test for EPCM permission changes" where you
> will find support for calling ENCLU[EMODPE] added to the test enclave and the 
> "epcm_permissions" test making use of it.
> 
> After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS

OK, great.

A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 

> to communicate the new permissions to the OS. Since the OS does not know when/if 
> ENCLU[EMODPE] has indeed been called all permission changes are treated as
> permission restriction and will thus, in addition to removing the page table
> entries, call ENCLS[EMODPR], which will be a no-op (apart from EPCM.PR being set)
> if user space only did call ENCLU[EMODPE]. The page remains accessible even though
> EPCM.PR is set but the current implementations follows a
> SGX_IOC_ENCLAVE_MOD_PROTECTIONS call with an ENCLU[EACCEPT] with the PR bit set
> to ensure there are no dangling PR bit set in EPCM.
> 
> Reinette

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 23:15                                             ` Jarkko Sakkinen
@ 2022-01-15  0:01                                               ` Reinette Chatre
  2022-01-15  0:27                                                 ` Jarkko Sakkinen
  2022-01-15 16:49                                               ` Jarkko Sakkinen
  1 sibling, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-15  0:01 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
> 
> How enclave can check a page range that EPCM has the expected permissions?

The only way to change EPCM permissions from outside the enclave is to run
ENCLS[EMODPR], which needs to be accepted from within the enclave via
ENCLU[EACCEPT]. At that time the enclave provides the expected permissions, and
EACCEPT will fail if there is a mismatch with the EPCM permissions
(SGX_PAGE_ATTRIBUTES_MISMATCH).

> 
> I'd expect OS mistrusting enclave to do such check at the start of TCS.
> 
> Otherwise, it cannot be sure whether EMODPR was ever requested, and thus
> it plays zero part in the game.

The EPCM would always have a PR bit set after EMODPR was requested.

> 
> You would get equivalent level of security by just modifying the xarray,
> and not doing EMODPR.

The xarray is an internal SGX driver structure that can guide how the OS manages
page permissions (via VMA permissions or PTEs). This brings us back to the
fact that the OS is not trusted and a malicious OS may install PTEs that
allow full access to enclave pages and that is why the enclave needs/wants
to control its own permissions via the EPCM permissions that are managed
with the ENCLS[EMODPR] and ENCLU[EMODPE] instructions.

 
>>> It's really hard to ACK or NAK EMODPR patch without knowing how EMODPE is
>>> or will be supported.
>>
>> EMODPE is currently supported and you can see an example of its use
>> in the testing code that forms part of this series. Please see
>> "[PATCH 11/25] selftests/sgx: Add test for EPCM permission changes" where you
>> will find support for calling ENCLU[EMODPE] added to the test enclave and the 
>> "epcm_permissions" test making use of it.
>>
>> After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
> 
> OK, great.
> 
> A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 

Sure. (btw ... I was following guidance from:
https://lore.kernel.org/lkml/Yav0%2F3jeJsuT3yEq@iki.fi/ ).

Reinette


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-15  0:01                                               ` Reinette Chatre
@ 2022-01-15  0:27                                                 ` Jarkko Sakkinen
  2022-01-15  0:41                                                   ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-15  0:27 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> > 
> > How enclave can check a page range that EPCM has the expected permissions?
> 
> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> time the enclave provides the expected permissions and that will fail
> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).

This is a very valid point, but it makes the introspection possible
only at the time of EACCEPT.

It does not give the enclave tools to make sure that the EMODPR-ETRACK dance
was ever exercised.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-15  0:27                                                 ` Jarkko Sakkinen
@ 2022-01-15  0:41                                                   ` Reinette Chatre
  2022-01-15  1:18                                                     ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-15  0:41 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
>>>> Hi Jarkko,
>>>
>>> How enclave can check a page range that EPCM has the expected permissions?
>>
>> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
>> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
>> time the enclave provides the expected permissions and that will fail
>> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> 
> This is a very valid point but that does make the introspection possible
> only at the time of EACCEPT.
> 
> It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> was ever exercised.

Could you please elaborate? EACCEPT is available to the enclave as a tool
and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).

Here is the relevant snippet from the SDM from the section where it
describes EACCEPT:

IF (Tracking not correct)
    THEN
        RFLAGS.ZF := 1;
        RAX := SGX_NOT_TRACKED;
        GOTO DONE;
FI;
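
As a rough illustration of how the enclave can use EACCEPT as a check, below
is a toy user-space model of the two failure cases mentioned in this thread.
It is a sketch under assumptions, not the SDM algorithm: the struct, the
tp_*() helpers and the TP_* error names are made up for illustration, their
numeric values are arbitrary, and the order of the two checks in the model
is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Hypothetical result codes standing in for the SGX error codes. */
enum tp_err {
	TP_OK = 0,
	TP_NOT_TRACKED,		/* stands in for SGX_NOT_TRACKED */
	TP_ATTR_MISMATCH	/* stands in for SGX_PAGE_ATTRIBUTES_MISMATCH */
};

/* Hypothetical model of one EPCM entry plus its tracking state. */
struct tracked_page {
	unsigned int perms;
	bool pr;	/* set by EMODPR, cleared by a successful EACCEPT */
	bool tracked;	/* true once the ETRACK cycle has completed */
};

/* Models ENCLS[EMODPR]: restricts permissions, sets PR, and starts a new
 * period in which stale TLB entries may still exist. */
static void tp_emodpr(struct tracked_page *p, unsigned int perms)
{
	assert(p != NULL);
	p->perms &= perms;
	p->pr = true;
	p->tracked = false;
}

/* Models the OS finishing ENCLS[ETRACK] and getting all enclave threads
 * to exit so that stale translations are flushed. */
static void tp_etrack_done(struct tracked_page *p)
{
	assert(p != NULL);
	p->tracked = true;
}

/* Models ENCLU[EACCEPT] with PR set and the expected permissions. */
static enum tp_err tp_eaccept(struct tracked_page *p, unsigned int expected)
{
	assert(p != NULL);
	if (!p->pr || p->perms != expected)
		return TP_ATTR_MISMATCH;
	if (!p->tracked)
		return TP_NOT_TRACKED;
	p->pr = false;
	return TP_OK;
}
```

In this model an EACCEPT issued before any EMODPR reports a mismatch, one
issued after EMODPR but before tracking completes reports "not tracked", and
only the full sequence returns TP_OK.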

Reinette


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-15  0:41                                                   ` Reinette Chatre
@ 2022-01-15  1:18                                                     ` Jarkko Sakkinen
  2022-01-15 11:56                                                       ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-15  1:18 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> >>
> >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> >>>> Hi Jarkko,
> >>>
> >>> How enclave can check a page range that EPCM has the expected permissions?
> >>
> >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> >> time the enclave provides the expected permissions and that will fail
> >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > 
> > This is a very valid point but that does make the introspection possible
> > only at the time of EACCEPT.
> > 
> > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > was ever exercised.
> 
> Could you please elaborate? EACCEPT is available to the enclave as a tool
> and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> 
> Here is the relevant snippet from the SDM from the section where it
> describes EACCEPT:
> 
> IF (Tracking not correct)
>     THEN
>         RFLAGS.ZF := 1;
>         RAX := SGX_NOT_TRACKED;
>         GOTO DONE;
> FI;
> 
> Reinette

Yes, if the enclave calls EACCEPT it does the necessary introspection and makes
sure that ETRACK is completed. I have trouble understanding how the enclave
makes sure that EACCEPT was called.

/Jarkko


* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-15  1:18                                                     ` Jarkko Sakkinen
@ 2022-01-15 11:56                                                       ` Jarkko Sakkinen
  2022-01-15 11:59                                                         ` Jarkko Sakkinen
  2022-01-17 13:13                                                         ` Nathaniel McCallum
  0 siblings, 2 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-15 11:56 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > Hi Jarkko,
> > 
> > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > >> Hi Jarkko,
> > >>
> > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > >>>> Hi Jarkko,
> > >>>
> > >>> How enclave can check a page range that EPCM has the expected permissions?
> > >>
> > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > >> time the enclave provides the expected permissions and that will fail
> > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > 
> > > This is a very valid point but that does make the introspection possible
> > > only at the time of EACCEPT.
> > > 
> > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > was ever exercised.
> > 
> > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > 
> > Here is the relevant snippet from the SDM from the section where it
> > describes EACCEPT:
> > 
> > IF (Tracking not correct)
> >     THEN
> >         RFLAGS.ZF := 1;
> >         RAX := SGX_NOT_TRACKED;
> >         GOTO DONE;
> > FI;
> > 
> > Reinette
> 
> Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> sure that ETRACK is completed. I have trouble understanding how enclave
> makes sure that EACCEPT was called.

I'm not concerned about anything going wrong once EMODPR has been started.

The problem boils down to the fact that the whole EMODPR process is initiated
by an entity that is not trusted, so maybe this should be broken down further
into three roles:

1. Build process B.
2. Runner process R.
3. Enclave E.

And add the constraint that we trust B *more* than R. Once B has done all the
needed EMODPR calls it would send the file descriptor to R. Even if R would
have full access to /dev/sgx_enclave, it would not matter, since B has done
the EMODPR-EACCEPT dance with E.
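
The thread does not say how B would hand the descriptor over to R. One common
Linux mechanism is SCM_RIGHTS over an AF_UNIX socket, sketched below as an
assumption rather than a proposal. send_fd(), recv_fd() and demo_handover()
are hypothetical helper names, and the demo runs both ends in one process
with a pipe standing in for the enclave file descriptor.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_ctrl {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one descriptor over a connected AF_UNIX socket; 0 on success. */
static int send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fd >= 0);
	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd, or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Demo: "B" sends the read end of a pipe, "R" receives and reads from it. */
static int demo_handover(void)
{
	int sv[2], p[2], got;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) || pipe(p))
		return -1;
	if (send_fd(sv[0], p[0]))
		return -1;
	got = recv_fd(sv[1]);
	if (got < 0 || write(p[1], "x", 1) != 1 || read(got, &c, 1) != 1)
		return -1;
	close(got); close(p[0]); close(p[1]); close(sv[0]); close(sv[1]);
	return c == 'x' ? 0 : -1;
}
```

The received descriptor refers to the same open file as the one B holds, so
whatever B did through it before the handover stays in effect for R.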

So what you can achieve with EMODPR is not protection against a mistrusted
*OS*. There's absolutely no chance you could use it for that purpose,
because a mistrusted OS controls the whole process.

EMODPR helps to protect the enclave against a mistrusted *process*, i.e.
R in the above scenario.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-15 11:56                                                       ` Jarkko Sakkinen
@ 2022-01-15 11:59                                                         ` Jarkko Sakkinen
  2022-01-17 13:13                                                         ` Nathaniel McCallum
  1 sibling, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-15 11:59 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 15, 2022 at 01:56:55PM +0200, Jarkko Sakkinen wrote:
> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > >> Hi Jarkko,
> > > >>
> > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>
> > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > >>
> > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > >> time the enclave provides the expected permissions and that will fail
> > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > 
> > > > This is a very valid point but that does make the introspection possible
> > > > only at the time of EACCEPT.
> > > > 
> > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > was ever exercised.
> > > 
> > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > 
> > > Here is the relevant snippet from the SDM from the section where it
> > > describes EACCEPT:
> > > 
> > > IF (Tracking not correct)
> > >     THEN
> > >         RFLAGS.ZF := 1;
> > >         RAX := SGX_NOT_TRACKED;
> > >         GOTO DONE;
> > > FI;
> > > 
> > > Reinette
> > 
> > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > sure that ETRACK is completed. I have trouble understanding how enclave
> > makes sure that EACCEPT was called.
> 
> I'm not concerned of anything going wrong once EMODPR has been started.
> 
> The problem nails down to that the whole EMODPR process is spawned by
> the entity that is not trusted so maybe that should further broke down
> to three roles:
> 
> 1. Build process B
> 2. Runner process R.
> 3. Enclave E.
> 
> And to the constraint that we trust B *more* than R. Once B has done all the
> needed EMODPR calls it would send the file descriptor to R. Even if R would
> have full access to /dev/sgx_enclave, it would not matter, since B has done
> EMODPR-EACCEPT dance with E.
> 
> So what you can achieve with EMODPR is not protection against mistrusted
> *OS*. There's absolutely no chance you could use it for that purpose
> because mistrusted OS controls the whole process.
> 
> EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> in the above scenario R.

My suggestion is to make EMODPR either opt-in or opt-out with a flags
parameter; I don't care which way around. That way you can pick between
performance and an extra layer of hardening, if you care about the above
scenario.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-14 23:15                                             ` Jarkko Sakkinen
  2022-01-15  0:01                                               ` Reinette Chatre
@ 2022-01-15 16:49                                               ` Jarkko Sakkinen
  2022-01-18 21:18                                                 ` Reinette Chatre
  1 sibling, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-15 16:49 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 15, 2022 at 01:15:53AM +0200, Jarkko Sakkinen wrote:
> > After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
> 
> OK, great.
> 
> A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 

After looking through the test case and the ioctl, I'm not confident about
the EMODPE support, but I do not want to disturb this any further. A bunch
of things have been nailed down and I'm now running the code, which is
great.

The obviously wrong implementation choice in this ioctl is that it is
multi-function. It should just be split into two ioctls:
sgx_restrict_page_permissions and sgx_extend_page_permissions.

They are conceptually different flows, and I'm also basing this on earlier
discussion on this mailing list, from which I conclude that the consensus
is also not to have such multi-function ioctls.

It might sound clunky, but it is much easier to comprehend what is going
on "in the black box" by doing that split.
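
To illustrate why the two flows are conceptually different, here is a
minimal sketch in plain C of the argument check each half of the split
might perform. It is a toy model, not kernel code: the TOY_PROT_* bits and
both function names are hypothetical, and treating an out-of-range request
as invalid is an assumption made for the sketch.

```c
#include <stdbool.h>

/* Hypothetical permission bits, used only in this sketch. */
#define TOY_PROT_R 0x1u
#define TOY_PROT_W 0x2u
#define TOY_PROT_X 0x4u

/*
 * Restrict flow (EMODPR-like): the request is valid only if it adds no
 * permission bit that the page does not already have.
 */
static bool toy_restrict_ok(unsigned int cur, unsigned int req)
{
	return (req & ~cur) == 0;
}

/*
 * Extend flow (EMODPE-like): the request is valid only if it drops no
 * permission bit that the page already has.
 */
static bool toy_extend_ok(unsigned int cur, unsigned int req)
{
	return (cur & ~req) == 0;
}
```

The two checks are mirror images of each other, which is one way to see
that a single multi-function entry point has to guess the caller's intent.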

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-15 11:56                                                       ` Jarkko Sakkinen
  2022-01-15 11:59                                                         ` Jarkko Sakkinen
@ 2022-01-17 13:13                                                         ` Nathaniel McCallum
  2022-01-18  1:59                                                           ` Jarkko Sakkinen
  1 sibling, 1 reply; 155+ messages in thread
From: Nathaniel McCallum @ 2022-01-17 13:13 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Reinette Chatre, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>
> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > Hi Jarkko,
> > >
> > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > >> Hi Jarkko,
> > > >>
> > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > >>>> Hi Jarkko,
> > > >>>
> > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > >>
> > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > >> time the enclave provides the expected permissions and that will fail
> > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > >
> > > > This is a very valid point but that does make the introspection possible
> > > > only at the time of EACCEPT.
> > > >
> > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > was ever exercised.
> > >
> > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > >
> > > Here is the relevant snippet from the SDM from the section where it
> > > describes EACCEPT:
> > >
> > > IF (Tracking not correct)
> > >     THEN
> > >         RFLAGS.ZF := 1;
> > >         RAX := SGX_NOT_TRACKED;
> > >         GOTO DONE;
> > > FI;
> > >
> > > Reinette
> >
> > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > sure that ETRACK is completed. I have trouble understanding how enclave
> > makes sure that EACCEPT was called.
>
> I'm not concerned of anything going wrong once EMODPR has been started.
>
> The problem nails down to that the whole EMODPR process is spawned by
> the entity that is not trusted so maybe that should further broke down
> to three roles:
>
> 1. Build process B
> 2. Runner process R.
> 3. Enclave E.
>
> And to the constraint that we trust B *more* than R. Once B has done all the
> needed EMODPR calls it would send the file descriptor to R. Even if R would
> have full access to /dev/sgx_enclave, it would not matter, since B has done
> EMODPR-EACCEPT dance with E.
>
> So what you can achieve with EMODPR is not protection against mistrusted
> *OS*. There's absolutely no chance you could use it for that purpose
> because mistrusted OS controls the whole process.
>
> EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> in the above scenario R.

There are two general cases that I can see. Both are valid.

1. The OS moves from a trusted to an untrusted state. This could be
the multi-process system you've described. But it could also be that
the kernel becomes compromised after the enclave is fully initialized.

2. The OS is untrustworthy from the start.

The second case is the stronger one and if you can solve it, the first
one is solved implicitly. And our end goal is that if the OS does
anything malicious we will crash in a controlled way.

A defensive enclave will always want to hold the fewest privileges it
can, for maximum protection. Therefore, the enclave will want the OS to
call EMODPR. If that were all there was to it, the host could just lie.
But the enclave also verifies that the EMODPR operation was, in fact,
executed by doing EACCEPT. When the enclave calls EACCEPT, if the
kernel hasn't restricted permissions then we get a controlled crash.
Therefore, we have solved the second case.
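
The argument above can be sketched as a small state machine in plain C. It
is a toy model under stated assumptions, not the SDM flow or the kernel
ABI: every toy_* name and the enum values are hypothetical, and the order
of the two checks in toy_eaccept() is assumed. Only the two error
conditions (attribute mismatch, not tracked) come from this thread.

```c
#include <stdbool.h>

/* Hypothetical names and values: a toy model only. */
enum toy_status {
	TOY_OK,
	TOY_PAGE_ATTRIBUTES_MISMATCH,
	TOY_NOT_TRACKED,
};

struct toy_epcm {
	unsigned int perms;	/* bitmask: 1 = R, 2 = W, 4 = X */
	bool pr;		/* permission restriction awaiting EACCEPT */
	bool tracked;		/* ETRACK cycle completed since the EMODPR */
};

/* Untrusted side: what an honest OS does on behalf of the enclave. */
static void toy_emodpr(struct toy_epcm *e, unsigned int perms)
{
	e->perms &= perms;	/* permissions can only shrink */
	e->pr = true;
	e->tracked = false;
}

static void toy_etrack(struct toy_epcm *e)
{
	e->tracked = true;
}

/* Trusted side: the enclave states what it expects and checks the result. */
static enum toy_status toy_eaccept(struct toy_epcm *e, unsigned int expected)
{
	if (!e->pr || e->perms != expected)
		return TOY_PAGE_ATTRIBUTES_MISMATCH;
	if (!e->tracked)
		return TOY_NOT_TRACKED;
	e->pr = false;		/* request consumed */
	return TOY_OK;
}
```

In this model, an OS that only claims to have run EMODPR leaves pr clear,
so toy_eaccept() reports a mismatch and the enclave can stop in a
controlled way.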

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-13 21:42                                       ` Reinette Chatre
  2022-01-14 21:53                                         ` Jarkko Sakkinen
@ 2022-01-17 13:27                                         ` Nathaniel McCallum
  2022-01-18 21:11                                           ` Reinette Chatre
  1 sibling, 1 reply; 155+ messages in thread
From: Nathaniel McCallum @ 2022-01-17 13:27 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Jarkko Sakkinen, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Thu, Jan 13, 2022 at 4:43 PM Reinette Chatre
<reinette.chatre@intel.com> wrote:
>
> Hi Jarkko and Nathaniel,
>
> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
> > On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >>
> >> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
> >>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
> >>>> Hi Jarkko,
> >>>>
> >>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
> >>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
> >>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
> >>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
> >>>>>>>>>>>>> OK, so the question is: do we need both or would a
> >>>>>>>> mechanism just
> >>>>>>>>>>>> to extend
> >>>>>>>>>>>>> permissions be sufficient?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I do believe that we need both in order to support pages
> >>>>>>>> having only
> >>>>>>>>>>>> the permissions required to support their intended use
> >>>>>>>> during the
> >>>>>>>>>>>> time the
> >>>>>>>>>>>> particular access is required. While technically it is
> >>>>>>>> possible to grant
> >>>>>>>>>>>> pages all permissions they may need during their lifetime it
> >>>>>>>> is safer to
> >>>>>>>>>>>> remove permissions when no longer required.
> >>>>>>>>>>>
> >>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
> >>>>>>>> how using it
> >>>>>>>>>>> would make things safer?
> >>>>>>>>>>>
> >>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
> >>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
> >>>>>>>> defensive
> >>>>>>>>>> measure. In that case, EMODPR is useful.
> >>>>>>>>>
> >>>>>>>>> What is the exact threat we are talking about?
> >>>>>>>>
> >>>>>>>> To add: it should be *significantly* critical threat, given that not
> >>>>>>>> supporting only EAUG would leave us only one complex call pattern with
> >>>>>>>> EACCEPT involvement.
> >>>>>>>>
> >>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
> >>>>>>>> introduce
> >>>>>>>> it when there is PoC code for any of the existing run-time that
> >>>>>>>> demonstrates the demand for it. Right now this way too speculative.
> >>>>>>>>
> >>>>>>>> Supporting EMODPE is IMHO by factors more critical.
> >>>>>>>
> >>>>>>> At least it does not protect against enclave code because an enclave
> >>>>>>> can
> >>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
> >>>>>>> confused here about the actual threat but also the potential adversary
> >>>>>>> and
> >>>>>>> target.
> >>>>>>>
> >>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
> >>>>>> to request  EMODPR in the first place through runtime to kernel, then to
> >>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
> >>>>>> If enclave does not verify with EACCEPT, then its own code has
> >>>>>> vulnerability. But this does not justify OS not providing the mechanism to
> >>>>>> request EMODPR.
> >>>>>
> >>>>> The question is really simple: what is the threat scenario? In order to use
> >>>>> the word "vulnerability", you would need one.
> >>>>>
> >>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
> >>>>> one, in order to ack it to the mainline.
> >>>>>
> >>>>
> >>>> Which complexity related to EMODPR are you concerned about? In a later message
> >>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
> >>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
> >>>> The OS does not require nor depend on EACCEPT being called as part of these flows
> >>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
> >>>> these flows in the OS, but would of course impact the enclave.
> >>>
> >>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
> >>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
> >>> EMODPR going to help with any sort of workload?
> >>
> >> I've even started think should we just always allow mmap()?
> >
> > I suspect this may be the most ergonomic way forward. Instructions
> > like EAUG/EMODPR/etc are really irrelevant implementation details to
> > what the enclave wants, which is a memory mapping in the enclave. Why
> > make the enclave runner do multiple context switches just to change
> > the memory map of an enclave?
>
> The enclave runner is not forced to make any changes to a memory mapping. To start,
> this implementation supports and does not change the existing ABI where a new
> memory mapping can only be created if its permissions are the same or weaker
> than the EPCM permissions. After the memory mapping is created the EPCM permissions
> can change (thanks to SGX2) and when they do there are no forced nor required
> changes to the memory mapping - pages remain accessible where the memory mapping
> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
> to an enclave page (EMODPE) then the memory mapping may need to be changed as
> should be expected to access a page with permissions that the memory mapping
> did not previously allow.
>
> Are you saying that the permissions of a new memory mapping should now be allowed
> to exceed EPCM permissions and thus the enclave runner would not need to modify a
> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
> considered a change in ABI but something we could support on SGX2 systems.
>
> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
> and EMODPE to manage enclave page permissions.

As I understand the problem, there are two permission sets:

 * The EPCM permissions
 * The mmap() permissions

The mmap() permissions cannot exceed the EPCM permissions, for obvious reasons.

Hypothesis: there is no practical reason the EPCM permissions should
exceed mmap() permissions.

If the hypothesis is true, then userspace shouldn't have an API to
manage EPCM permissions distinct from mmap() permissions. Instead,
userspace should just call mmap() and the kernel should internally
adjust the EPCM permissions to match the mmap() permissions. This has
a performance advantage that every permissions change is one syscall
rather than two.

So what is the use case where an enclave would want to restrict mmap()
permissions but not restrict EPCM?
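
For reference, the relationship between the two permission sets as
described in this thread can be written down as three bitmask helpers. It
is a sketch in plain C with hypothetical names (TOY_PROT_*, toy_*); it
models only the rules quoted above (a mapping may not exceed the EPCM, and
a page is accessible where the two agree), not the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical permission bits, used only in this sketch. */
#define TOY_PROT_R 0x1u
#define TOY_PROT_W 0x2u
#define TOY_PROT_X 0x4u

/* A new mapping is allowed only if it asks for nothing the EPCM lacks. */
static bool toy_mmap_allowed(unsigned int epcm, unsigned int requested)
{
	return (requested & ~epcm) == 0;
}

/* Access succeeds only where the mapping and the EPCM agree. */
static unsigned int toy_effective(unsigned int epcm, unsigned int vma)
{
	return epcm & vma;
}

/* True when the EPCM grants something the mapping can never use. */
static bool toy_epcm_exceeds_vma(unsigned int epcm, unsigned int vma)
{
	return (epcm & ~vma) != 0;
}
```

The hypothesis amounts to saying that toy_epcm_exceeds_vma() never needs
to be true in practice, in which case one interface could drive both sets.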

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-17 13:13                                                         ` Nathaniel McCallum
@ 2022-01-18  1:59                                                           ` Jarkko Sakkinen
  2022-01-18  2:22                                                             ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-18  1:59 UTC (permalink / raw)
  To: Nathaniel McCallum
  Cc: Reinette Chatre, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> >
> > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > > Hi Jarkko,
> > > >
> > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > > >> Hi Jarkko,
> > > > >>
> > > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > > >>>> Hi Jarkko,
> > > > >>>
> > > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > > >>
> > > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > > >> time the enclave provides the expected permissions and that will fail
> > > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > >
> > > > > This is a very valid point but that does make the introspection possible
> > > > > only at the time of EACCEPT.
> > > > >
> > > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > > was ever exercised.
> > > >
> > > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > >
> > > > Here is the relevant snippet from the SDM from the section where it
> > > > describes EACCEPT:
> > > >
> > > > IF (Tracking not correct)
> > > >     THEN
> > > >         RFLAGS.ZF := 1;
> > > >         RAX := SGX_NOT_TRACKED;
> > > >         GOTO DONE;
> > > > FI;
> > > >
> > > > Reinette
> > >
> > > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > > sure that ETRACK is completed. I have trouble understanding how enclave
> > > makes sure that EACCEPT was called.
> >
> > I'm not concerned of anything going wrong once EMODPR has been started.
> >
> > The problem nails down to that the whole EMODPR process is spawned by
> > the entity that is not trusted so maybe that should further broke down
> > to three roles:
> >
> > 1. Build process B
> > 2. Runner process R.
> > 3. Enclave E.
> >
> > And to the constraint that we trust B *more* than R. Once B has done all the
> > needed EMODPR calls it would send the file descriptor to R. Even if R would
> > have full access to /dev/sgx_enclave, it would not matter, since B has done
> > EMODPR-EACCEPT dance with E.
> >
> > So what you can achieve with EMODPR is not protection against mistrusted
> > *OS*. There's absolutely no chance you could use it for that purpose
> > because mistrusted OS controls the whole process.
> >
> > EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> > in the above scenario R.
> 
> There are two general cases that I can see. Both are valid.
> 
> 1. The OS moves from a trusted to an untrusted state. This could be
> the multi-process system you've described. But it could also be that
> the kernel becomes compromised after the enclave is fully initialized.
> 
> 2. The OS is untrustworthy from the start.
> 
> The second case is the stronger one and if you can solve it, the first
> one is solved implicitly. And our end goal is that if the OS does
> anything malicious we will crash in a controlled way.
> 
> A defensive enclave will always want to have the least number of
> privileges for the maximum protection. Therefore, the enclave will
> want the OS to call EMODPR. If that were it, the host could just lie.
> But the enclave also verifies that the EMODPR operation was, in fact,
> executed by doing EACCEPT. When the enclave calls EACCEPT, if the
> kernel hasn't restricted permissions then we get a controlled crash.
> Therefore, we have solved the second case.

So you're referring to this part of the pseudocode in the SDM:

(* Check the destination EPC page for concurrency *)
IF ( EPC page in use )
    THEN #GP(0); FI;

I wonder whether "EPC page in use" triggers unconditionally when EACCEPT
is invoked for a page for which all of these conditions hold:

- .PR := 0 (no EMODPR in progress)
- .MODIFIED := 0 (no EMODT in progress)
- .PENDING := 0 (no EAUG in progress)

I don't know the exact scope and scale of "EPC page in use".

If so, then yes, EACCEPT could at least be used to validate that one of the
three operations above was requested. However, the enclave thread cannot
tell which one it was, so it is guesswork.

I guess that is still enough to quantitatively argue that EMODPR is an
operation that does improve the confidentiality properties of an
enclave, even if it has its flaws.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-18  1:59                                                           ` Jarkko Sakkinen
@ 2022-01-18  2:22                                                             ` Jarkko Sakkinen
  2022-01-18  3:31                                                               ` Jarkko Sakkinen
  2022-01-18 20:59                                                               ` Reinette Chatre
  0 siblings, 2 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-18  2:22 UTC (permalink / raw)
  To: Nathaniel McCallum
  Cc: Reinette Chatre, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
> > On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > >
> > > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > > > Hi Jarkko,
> > > > >
> > > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > > > >> Hi Jarkko,
> > > > > >>
> > > > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > > > >>>> Hi Jarkko,
> > > > > >>>
> > > > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > > > >>
> > > > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > > > >> time the enclave provides the expected permissions and that will fail
> > > > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > > >
> > > > > > This is a very valid point but that does make the introspection possible
> > > > > > only at the time of EACCEPT.
> > > > > >
> > > > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > > > was ever exercised.
> > > > >
> > > > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > > >
> > > > > Here is the relevant snippet from the SDM from the section where it
> > > > > describes EACCEPT:
> > > > >
> > > > > IF (Tracking not correct)
> > > > >     THEN
> > > > >         RFLAGS.ZF := 1;
> > > > >         RAX := SGX_NOT_TRACKED;
> > > > >         GOTO DONE;
> > > > > FI;
> > > > >
> > > > > Reinette
> > > >
> > > > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > > > sure that ETRACK is completed. I have trouble understanding how enclave
> > > > makes sure that EACCEPT was called.
> > >
> > > I'm not concerned of anything going wrong once EMODPR has been started.
> > >
> > > The problem nails down to that the whole EMODPR process is spawned by
> > > the entity that is not trusted so maybe that should further broke down
> > > to three roles:
> > >
> > > 1. Build process B
> > > 2. Runner process R.
> > > 3. Enclave E.
> > >
> > > And to the constraint that we trust B *more* than R. Once B has done all the
> > > needed EMODPR calls it would send the file descriptor to R. Even if R would
> > > have full access to /dev/sgx_enclave, it would not matter, since B has done
> > > EMODPR-EACCEPT dance with E.
> > >
> > > So what you can achieve with EMODPR is not protection against mistrusted
> > > *OS*. There's absolutely no chance you could use it for that purpose
> > > because mistrusted OS controls the whole process.
> > >
> > > EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> > > in the above scenario R.
> > 
> > There are two general cases that I can see. Both are valid.
> > 
> > 1. The OS moves from a trusted to an untrusted state. This could be
> > the multi-process system you've described. But it could also be that
> > the kernel becomes compromised after the enclave is fully initialized.
> > 
> > 2. The OS is untrustworthy from the start.
> > 
> > The second case is the stronger one and if you can solve it, the first
> > one is solved implicitly. And our end goal is that if the OS does
> > anything malicious we will crash in a controlled way.
> > 
> > A defensive enclave will always want to have the least number of
> > privileges for the maximum protection. Therefore, the enclave will
> > want the OS to call EMODPR. If that were it, the host could just lie.
> > But the enclave also verifies that the EMODPR operation was, in fact,
> > executed by doing EACCEPT. When the enclave calls EACCEPT, if the
> > kernel hasn't restricted permissions then we get a controlled crash.
> > Therefore, we have solved the second case.
> 
> So you're referring to this part of the pseudocode in the SDM:
> 
> (* Check the destination EPC page for concurrency *)
> IF ( EPC page in use )
>     THEN #GP(0); FI;
> 
> I wonder does "EPC page in use" unconditionally trigger when EACCEPT
> is invoked for a page for which all of these conditions hold:
> 
> - .PR := 0 (no EMODPR in progress)
> - .MODIFIED := 0 (no EMODT in progress)
> - .PENDING := 0 (no EAUG in progress)
> 
> I don't know the exact scope and scale of "EPC page in use".
> 
> Then, yes, EACCEPT could be at least used to validate that one of the
> three operations above was requested. However, enclave thread cannot say
> which one was it, so it is guesswork.

OK, I got it: this last paragraph is not true. The SECINFO given to EACCEPT
locks in the rest of the details and makes the operation deterministic.

The only open question then is the case where no requests are active.
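
As a rough illustration of how the SECINFO removes the guesswork, here is
a sketch in plain C. The flag encoding, the struct and the function name
are all hypothetical and do not follow the SDM layout; the sketch assumes
that acceptance requires an exact match between the state the enclave
expects and the state recorded for the page.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the SDM encoding. */
#define TOY_PENDING  0x1u	/* set by EAUG */
#define TOY_MODIFIED 0x2u	/* set by EMODT */
#define TOY_PR       0x4u	/* set by EMODPR */

struct toy_page_state {
	unsigned int flags;	/* TOY_PENDING / TOY_MODIFIED / TOY_PR */
	unsigned int perms;	/* bitmask: 1 = R, 2 = W, 4 = X */
};

/*
 * The enclave passes the state it expects (its SECINFO). Requiring an
 * exact match means a different outstanding request is never accepted
 * by accident.
 */
static bool toy_secinfo_matches(const struct toy_page_state *epcm,
				const struct toy_page_state *secinfo)
{
	return epcm->flags == secinfo->flags && epcm->perms == secinfo->perms;
}
```

In this toy model a page with no request outstanding also fails the
match. Whether hardware treats that case the same way is exactly the open
question, and the sketch does not answer it.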

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-18  2:22                                                             ` Jarkko Sakkinen
@ 2022-01-18  3:31                                                               ` Jarkko Sakkinen
  2022-01-18 20:59                                                               ` Reinette Chatre
  1 sibling, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-18  3:31 UTC (permalink / raw)
  To: Nathaniel McCallum
  Cc: Reinette Chatre, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Tue, Jan 18, 2022 at 04:22:45AM +0200, Jarkko Sakkinen wrote:
> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> > On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
> > > On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
> > > >
> > > > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
> > > > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
> > > > > > Hi Jarkko,
> > > > > >
> > > > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
> > > > > > >> Hi Jarkko,
> > > > > > >>
> > > > > > >> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > > > >>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
> > > > > > >>>> Hi Jarkko,
> > > > > > >>>
> > > > > > >>> How enclave can check a page range that EPCM has the expected permissions?
> > > > > > >>
> > > > > > >> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
> > > > > > >> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
> > > > > > >> time the enclave provides the expected permissions and that will fail
> > > > > > >> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > > > >
> > > > > > > This is a very valid point but that does make the introspection possible
> > > > > > > only at the time of EACCEPT.
> > > > > > >
> > > > > > > It does not give tools for enclave to make sure that EMODPR-ETRACK dance
> > > > > > > was ever exercised.
> > > > > >
> > > > > > Could you please elaborate? EACCEPT is available to the enclave as a tool
> > > > > > and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
> > > > > >
> > > > > > Here is the relevant snippet from the SDM from the section where it
> > > > > > describes EACCEPT:
> > > > > >
> > > > > > IF (Tracking not correct)
> > > > > >     THEN
> > > > > >         RFLAGS.ZF := 1;
> > > > > >         RAX := SGX_NOT_TRACKED;
> > > > > >         GOTO DONE;
> > > > > > FI;
> > > > > >
> > > > > > Reinette
> > > > >
> > > > > Yes, if enclave calls EACCEPT it does the necessary introspection and makes
> > > > > sure that ETRACK is completed. I have trouble understanding how enclave
> > > > > makes sure that EACCEPT was called.
> > > >
> > > > I'm not concerned of anything going wrong once EMODPR has been started.
> > > >
> > > > The problem nails down to that the whole EMODPR process is spawned by
> > > > the entity that is not trusted so maybe that should further broke down
> > > > to three roles:
> > > >
> > > > 1. Build process B
> > > > 2. Runner process R.
> > > > 3. Enclave E.
> > > >
> > > > And to the costraint that we trust B *more* than R. Once B has done all the
> > > > needed EMODPR calls it would send the file descriptor to R. Even if R would
> > > > have full access to /dev/sgx_enclave, it would not matter, since B has done
> > > > EMODPR-EACCEPT dance with E.
> > > >
> > > > So what you can achieve with EMODPR is not protection against mistrusted
> > > > *OS*. There's absolutely no chance you could use it for that purpose
> > > > because mistrusted OS controls the whole process.
> > > >
> > > > EMODPR is to help to protect enclave against mistrusted *process*, i.e.
> > > > in the above scenario R.
> > > 
> > > There are two general cases that I can see. Both are valid.
> > > 
> > > 1. The OS moves from a trusted to an untrusted state. This could be
> > > the multi-process system you've described. But it could also be that
> > > the kernel becomes compromised after the enclave is fully initialized.
> > > 
> > > 2. The OS is untrustworthy from the start.
> > > 
> > > The second case is the stronger one and if you can solve it, the first
> > > one is solved implicitly. And our end goal is that if the OS does
> > > anything malicious we will crash in a controlled way.
> > > 
> > > A defensive enclave will always want to have the least number of
> > > privileges for the maximum protection. Therefore, the enclave will
> > > want the OS to call EMODPR. If that were it, the host could just lie.
> > > But the enclave also verifies that the EMODPR operation was, in fact,
> > > executed by doing EACCEPT. When the enclave calls EACCEPT, if the
> > > kernel hasn't restricted permissions then we get a controlled crash.
> > > Therefore, we have solved the second case.
> > 
> > So you're referring to this part of the SDM pseude code in the SDM:
> > 
> > (* Check the destination EPC page for concurrency *)
> > IF ( EPC page in use )
> >     THEN #GP(0); FI;
> > 
> > I wonder does "EPC page in use" unconditionally trigger when EACCEPT
> > is invoked for a page for which all of these conditions hold:
> > 
> > - .PR := 0 (no EMODPR in progress)
> > - .MODIFIED := 0 (no EMODT in progress)
> > - .PENDING := 0 (no EMODPR in progress)
> > 
> > I don't know the exact scope and scale of "EPC page in use".
> > 
> > Then, yes, EACCEPT could be at least used to validate that one of the
> > three operations above was requested. However, enclave thread cannot say
> > which one was it, so it is guesswork.
> 
> OK, I got it, and this last paragraph is not true. SECINFO given EACCEPT
> will lock in rest of the details and make the operation deterministic.
> 
> The only question mark then is the condition when no requests are active.

This is the big-picture model of how I look at the SGX2 patches, and
perhaps it could bring some more common-sense idioms to this discussion.

There are not one but two passes of measurement:

1. A static pass.
2. A dynamic pass.

The static pass is the full process of doing ioctls that trigger ECREATE,
EADD and EEXTEND. It ends with EINIT, which yields a pass/fail result.

The second pass is the dynamic pass. It is the pass where the enclave
measures itself, and it consists of the full set of EACCEPT operations done
by the enclave. It ends either with all EACCEPT operations succeeding or
with any one of them failing (for example with #GP).

Confidentiality is the state established exactly when both passes are
completed.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-18  2:22                                                             ` Jarkko Sakkinen
  2022-01-18  3:31                                                               ` Jarkko Sakkinen
@ 2022-01-18 20:59                                                               ` Reinette Chatre
  2022-01-20 12:53                                                                 ` Jarkko Sakkinen
  1 sibling, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-18 20:59 UTC (permalink / raw)
  To: Jarkko Sakkinen, Nathaniel McCallum
  Cc: Haitao Huang, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
>> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum wrote:
>>> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>
>>>> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen wrote:
>>>>> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre wrote:
>>>>>> Hi Jarkko,
>>>>>>
>>>>>> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
>>>>>>> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette Chatre wrote:
>>>>>>>> Hi Jarkko,
>>>>>>>>
>>>>>>>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
>>>>>>>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette Chatre wrote:
>>>>>>>>>> Hi Jarkko,
>>>>>>>>>
>>>>>>>>> How enclave can check a page range that EPCM has the expected permissions?
>>>>>>>>
>>>>>>>> Only way to change EPCM permissions from outside enclave is to run ENCLS[EMODPR]
>>>>>>>> that needs to be accepted from within the enclave via ENCLU[EACCEPT]. At that
>>>>>>>> time the enclave provides the expected permissions and that will fail
>>>>>>>> if there is a mismatch with the EPCM permissions (SGX_PAGE_ATTRIBUTES_MISMATCH).
>>>>>>>
>>>>>>> This is a very valid point but that does make the introspection possible
>>>>>>> only at the time of EACCEPT.
>>>>>>>
>>>>>>> It does not give tools for enclave to make sure that EMODPR-ETRACK dance
>>>>>>> was ever exercised.
>>>>>>
>>>>>> Could you please elaborate? EACCEPT is available to the enclave as a tool
>>>>>> and it would fail if ETRACK was not completed (error SGX_NOT_TRACKED).
>>>>>>
>>>>>> Here is the relevant snippet from the SDM from the section where it
>>>>>> describes EACCEPT:
>>>>>>
>>>>>> IF (Tracking not correct)
>>>>>>     THEN
>>>>>>         RFLAGS.ZF := 1;
>>>>>>         RAX := SGX_NOT_TRACKED;
>>>>>>         GOTO DONE;
>>>>>> FI;
>>>>>>
>>>>>> Reinette
>>>>>
>>>>> Yes, if enclave calls EACCEPT it does the necessary introspection and makes
>>>>> sure that ETRACK is completed. I have trouble understanding how enclave
>>>>> makes sure that EACCEPT was called.
>>>>
>>>> I'm not concerned of anything going wrong once EMODPR has been started.
>>>>
>>>> The problem nails down to that the whole EMODPR process is spawned by
>>>> the entity that is not trusted so maybe that should further broke down
>>>> to three roles:
>>>>
>>>> 1. Build process B
>>>> 2. Runner process R.
>>>> 3. Enclave E.
>>>>
>>>> And to the costraint that we trust B *more* than R. Once B has done all the
>>>> needed EMODPR calls it would send the file descriptor to R. Even if R would
>>>> have full access to /dev/sgx_enclave, it would not matter, since B has done
>>>> EMODPR-EACCEPT dance with E.
>>>>
>>>> So what you can achieve with EMODPR is not protection against mistrusted
>>>> *OS*. There's absolutely no chance you could use it for that purpose
>>>> because mistrusted OS controls the whole process.
>>>>
>>>> EMODPR is to help to protect enclave against mistrusted *process*, i.e.
>>>> in the above scenario R.
>>>
>>> There are two general cases that I can see. Both are valid.
>>>
>>> 1. The OS moves from a trusted to an untrusted state. This could be
>>> the multi-process system you've described. But it could also be that
>>> the kernel becomes compromised after the enclave is fully initialized.
>>>
>>> 2. The OS is untrustworthy from the start.
>>>
>>> The second case is the stronger one and if you can solve it, the first
>>> one is solved implicitly. And our end goal is that if the OS does
>>> anything malicious we will crash in a controlled way.
>>>
>>> A defensive enclave will always want to have the least number of
>>> privileges for the maximum protection. Therefore, the enclave will
>>> want the OS to call EMODPR. If that were it, the host could just lie.
>>> But the enclave also verifies that the EMODPR operation was, in fact,
>>> executed by doing EACCEPT. When the enclave calls EACCEPT, if the
>>> kernel hasn't restricted permissions then we get a controlled crash.
>>> Therefore, we have solved the second case.
>>
>> So you're referring to this part of the SDM pseude code in the SDM:
>>
>> (* Check the destination EPC page for concurrency *)
>> IF ( EPC page in use )
>>     THEN #GP(0); FI;
>>
>> I wonder does "EPC page in use" unconditionally trigger when EACCEPT
>> is invoked for a page for which all of these conditions hold:
>>
>> - .PR := 0 (no EMODPR in progress)
>> - .MODIFIED := 0 (no EMODT in progress)
>> - .PENDING := 0 (no EMODPR in progress)
>>
>> I don't know the exact scope and scale of "EPC page in use".
>>
>> Then, yes, EACCEPT could be at least used to validate that one of the
>> three operations above was requested. However, enclave thread cannot say
>> which one was it, so it is guesswork.
> 
> OK, I got it, and this last paragraph is not true. SECINFO given EACCEPT
> will lock in rest of the details and make the operation deterministic.

Indeed - so the SDM pseudo code that is relevant here can be found under
the "(* Verify that accept request matches current EPC page settings *)"
comment. This is where the enclave verifies that all EPCM values are as
they should be, and EACCEPT fails with SGX_PAGE_ATTRIBUTES_MISMATCH if
anything is amiss.
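
As a rough illustration, the check can be modelled in plain user-space C.
This is only a sketch under stated assumptions: it is neither the SDM
pseudo code nor kernel code. The model_* names, the simplified EPCM entry,
the numeric result values and the order of the two checks are all
hypothetical; only the two error names come from the SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model result codes. The numeric values are illustrative only and are
 * NOT the hardware error encodings.
 */
enum model_result {
	MODEL_OK = 0,
	MODEL_PAGE_ATTRIBUTES_MISMATCH,	/* models SGX_PAGE_ATTRIBUTES_MISMATCH */
	MODEL_NOT_TRACKED,		/* models SGX_NOT_TRACKED */
};

#define MODEL_R		0x1u
#define MODEL_W		0x2u
#define MODEL_X		0x4u
#define MODEL_PR	0x8u	/* permission restriction awaiting EACCEPT */

/* Hypothetical, heavily simplified stand-in for one EPCM entry. */
struct model_epcm {
	uint8_t flags;	/* R/W/X permissions plus MODEL_PR */
	bool tracked;	/* has an ETRACK cycle completed? */
};

/* Models the OS running ENCLS[EMODPR]: permissions can only shrink. */
static void model_emodpr(struct model_epcm *epcm, uint8_t perms)
{
	uint8_t rwx = MODEL_R | MODEL_W | MODEL_X;

	epcm->flags = (uint8_t)((epcm->flags & perms & rwx) | MODEL_PR);
	epcm->tracked = false;
}

/* Models the OS completing the ETRACK cycle for the page. */
static void model_etrack(struct model_epcm *epcm)
{
	epcm->tracked = true;
}

/* Models ENCLU[EACCEPT]: the enclave states what it expects to find. */
static enum model_result model_eaccept(struct model_epcm *epcm,
				       uint8_t expected)
{
	/* The enclave's expectation must match the entry exactly. */
	if (epcm->flags != expected)
		return MODEL_PAGE_ATTRIBUTES_MISMATCH;

	/* The restriction only counts once tracking has completed. */
	if (!epcm->tracked)
		return MODEL_NOT_TRACKED;

	/* Accepted: the pending restriction becomes final. */
	epcm->flags &= (uint8_t)~MODEL_PR;
	return MODEL_OK;
}
```

In this model an enclave that expects a restricted page (MODEL_R plus
MODEL_PR) gets the mismatch result if the OS never ran model_emodpr(), the
not-tracked result if model_etrack() was skipped, and success only when
both steps were done.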

> 
> The only question mark then is the condition when no requests are active.

Could you please elaborate on what you mean by this question? If no request
is active then I understand that to mean that no request has started.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-17 13:27                                         ` Nathaniel McCallum
@ 2022-01-18 21:11                                           ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2022-01-18 21:11 UTC (permalink / raw)
  To: Nathaniel McCallum
  Cc: Jarkko Sakkinen, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Nathaniel,

On 1/17/2022 5:27 AM, Nathaniel McCallum wrote:
> On Thu, Jan 13, 2022 at 4:43 PM Reinette Chatre
> <reinette.chatre@intel.com> wrote:
>>
>> Hi Jarkko and Nathaniel,
>>
>> On 1/13/2022 12:09 PM, Nathaniel McCallum wrote:
>>> On Wed, Jan 12, 2022 at 6:56 PM Jarkko Sakkinen <jarkko@kernel.org> wrote:
>>>>
>>>> On Thu, Jan 13, 2022 at 01:50:13AM +0200, Jarkko Sakkinen wrote:
>>>>> On Tue, Jan 11, 2022 at 09:13:27AM -0800, Reinette Chatre wrote:
>>>>>> Hi Jarkko,
>>>>>>
>>>>>> On 1/10/2022 5:53 PM, Jarkko Sakkinen wrote:
>>>>>>> On Mon, Jan 10, 2022 at 04:05:21PM -0600, Haitao Huang wrote:
>>>>>>>> On Sat, 08 Jan 2022 10:22:30 -0600, Jarkko Sakkinen <jarkko@kernel.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Sat, Jan 08, 2022 at 05:51:46PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>> On Sat, Jan 08, 2022 at 05:45:44PM +0200, Jarkko Sakkinen wrote:
>>>>>>>>>>> On Fri, Jan 07, 2022 at 10:14:29AM -0600, Haitao Huang wrote:
>>>>>>>>>>>>>>> OK, so the question is: do we need both or would a
>>>>>>>>>> mechanism just
>>>>>>>>>>>>>> to extend
>>>>>>>>>>>>>>> permissions be sufficient?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I do believe that we need both in order to support pages
>>>>>>>>>> having only
>>>>>>>>>>>>>> the permissions required to support their intended use
>>>>>>>>>> during the
>>>>>>>>>>>>>> time the
>>>>>>>>>>>>>> particular access is required. While technically it is
>>>>>>>>>> possible to grant
>>>>>>>>>>>>>> pages all permissions they may need during their lifetime it
>>>>>>>>>> is safer to
>>>>>>>>>>>>>> remove permissions when no longer required.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So if we imagine a run-time: how EMODPR would be useful, and
>>>>>>>>>> how using it
>>>>>>>>>>>>> would make things safer?
>>>>>>>>>>>>>
>>>>>>>>>>>> In scenarios of JIT compilers, once code is generated into RW pages,
>>>>>>>>>>>> modifying both PTE and EPCM permissions to RX would be a good
>>>>>>>>>> defensive
>>>>>>>>>>>> measure. In that case, EMODPR is useful.
>>>>>>>>>>>
>>>>>>>>>>> What is the exact threat we are talking about?
>>>>>>>>>>
>>>>>>>>>> To add: it should be *significantly* critical thread, given that not
>>>>>>>>>> supporting only EAUG would leave us only one complex call pattern with
>>>>>>>>>> EACCEPT involvement.
>>>>>>>>>>
>>>>>>>>>> I'd even go to suggest to leave EMODPR out of the patch set, and
>>>>>>>>>> introduce
>>>>>>>>>> it when there is PoC code for any of the existing run-time that
>>>>>>>>>> demonstrates the demand for it. Right now this way too speculative.
>>>>>>>>>>
>>>>>>>>>> Supporting EMODPE is IMHO by factors more critical.
>>>>>>>>>
>>>>>>>>> At least it does not protected against enclave code because an enclave
>>>>>>>>> can
>>>>>>>>> always choose not to EACCEPT any of the EMODPR requests. I'm not only
>>>>>>>>> confused here about the actual threat but also the potential adversary
>>>>>>>>> and
>>>>>>>>> target.
>>>>>>>>>
>>>>>>>> I'm not sure I follow your thoughts here. The sequence should be for enclave
>>>>>>>> to request  EMODPR in the first place through runtime to kernel, then to
>>>>>>>> verify with EACCEPT that the OS indeed has done EMODPR.
>>>>>>>> If enclave does not verify with EACCEPT, then its own code has
>>>>>>>> vulnerability. But this does not justify OS not providing the mechanism to
>>>>>>>> request EMODPR.
>>>>>>>
>>>>>>> The question is really simple: what is the threat scenario? In order to use
>>>>>>> the word "vulnerability", you would need one.
>>>>>>>
>>>>>>> Given the complexity of the whole dance with EMODPR it is mandatory to have
>>>>>>> one, in order to ack it to the mainline.
>>>>>>>
>>>>>>
>>>>>> Which complexity related to EMODPR are you concerned about? In a later message
>>>>>> you mention "This leaves only EAUG and EMODT requiring the EACCEPT handshake"
>>>>>> so it seems that you are perhaps concerned about the flow involving EACCEPT?
>>>>>> The OS does not require nor depend on EACCEPT being called as part of these flows
>>>>>> so a faulty or misbehaving user space omitting an EACCEPT call would not impact
>>>>>> these flows in the OS, but would of course impact the enclave.
>>>>>
>>>>> I'd say *any* complexity because I see no benefit of supporting it. E.g.
>>>>> EMODPR/EACCEPT/EMODPE sequence I mentioned to Haitao concerns me. How is
>>>>> EMODPR going to help with any sort of workload?
>>>>
>>>> I've even started think should we just always allow mmap()?
>>>
>>> I suspect this may be the most ergonomic way forward. Instructions
>>> like EAUG/EMODPR/etc are really irrelevant implementation details to
>>> what the enclave wants, which is a memory mapping in the enclave. Why
>>> make the enclave runner do multiple context switches just to change
>>> the memory map of an enclave?
>>
>> The enclave runner is not forced to make any changes to a memory mapping. To start,
>> this implementation supports and does not change the existing ABI where a new
>> memory mapping can only be created if its permissions are the same or weaker
>> than the EPCM permissions. After the memory mapping is created the EPCM permissions
>> can change (thanks to SGX2) and when they do there are no forced nor required
>> changes to the memory mapping - pages remain accessible where the memory mapping
>> and EPCM permissions agree. It is true that if an enclave chooses to relax permissions
>> to an enclave page (EMODPE) then the memory mapping may need to be changed as
>> should be expected to access a page with permissions that the memory mapping
>> did not previously allow.
>>
>> Are you saying that the permissions of a new memory mapping should now be allowed
>> to exceed EPCM permissions and thus the enclave runner would not need to modify a
>> memory mapping when EPCM permissions are relaxed? As mentioned above this may be
>> considered a change in ABI but something we could support on SGX2 systems.
>>
>> I would also like to highlight Haitao's earlier comment that a foundation of SGX is
>> that the OS is untrusted. The enclave owner does not trust the OS and needs EMODPR
>> and EMODPE to manage enclave page permissions.
> 
> As I understand the problem, there are two permission sets:
> 
>  * The EPCM permissions
>  * The mmap() permissions

It may be easier to think of there being three permission sets:
* EPCM permissions
* VMA (the mmap() permissions)
* PTE permissions

> 
> The mmap() permissions cannot exceed the EPCM permissions, for obvious reasons.

That is the current rule - when a new memory mapping is created it cannot exceed
the EPCM permissions. This rule remains in SGX2 but there is a caveat that
the EPCM permissions may change during runtime while the memory is mapped and thus
the VMA permissions may indeed exceed the EPCM permissions. This is where the
importance of PTE permissions is highlighted.
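
To make the relationship between the three sets concrete, here is a small
user-space C model. It is a sketch only, not code from this series: the
struct, the helper names and the rule that the kernel installs a PTE no
wider than both the VMA and the EPCM permissions are simplifying
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERM_R	0x1u
#define PERM_W	0x2u
#define PERM_X	0x4u

/* Hypothetical per-page view of the three permission sets. */
struct page_perms {
	uint8_t epcm;	/* enforced by SGX hardware */
	uint8_t vma;	/* what mmap()/mprotect() asked for */
	uint8_t pte;	/* what the kernel actually installed */
};

/* Rule at mapping time: a new mapping may not exceed EPCM permissions. */
static bool may_map(uint8_t epcm, uint8_t requested)
{
	return (requested & ~epcm) == 0;
}

/*
 * Assumption of this model: the kernel (re)installs the PTE with only
 * the permissions that both the VMA and the EPCM currently allow.
 */
static void refresh_pte(struct page_perms *p)
{
	p->pte = p->vma & p->epcm;
}

/* An access succeeds only if the PTE and the EPCM both allow it. */
static bool access_allowed(const struct page_perms *p, uint8_t want)
{
	return (want & p->pte & p->epcm) == want;
}
```

With a page whose EPCM permissions are RW, may_map() accepts an RW mapping
and refuses an RWX one. If the EPCM permissions are later restricted to R
while the page stays mapped, the VMA still says RW and thus exceeds the
EPCM permissions, but after refresh_pte() a write is no longer allowed
while a read still is.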

You may ask - when EPCM permissions are changed, why not just change the VMA
permissions? Please see the commit message below that contains details about
this and the reasons why VMA permissions are allowed to exceed EPCM permissions:
https://lore.kernel.org/lkml/7e622156315c9c22c3ef84a7c0aeb01b5c001ff9.1638381245.git.reinette.chatre@intel.com/

> 
> Hypothesis: there is no practical reason the EPCM permissions should
> exceed mmap() permissions.
> 
> If the hypothesis is true, then userspace shouldn't have an API to
> manage EPCM permissions distinct from mmap() permissions. Instead,
> userspace should just call mmap() and the kernel should internally
> adjust the EPCM permissions to match the mmap() permissions. This has
> a performance advantage that every permissions change is one syscall
> rather than two.
> 
> So what is the use case where an enclave would want to restrict mmap()
> permissions but not restrict EPCM?

An enclave with its EPCM permissions can be used by different tasks, each
of which should access the enclave with only the least privileges it needs.
Multiple tasks may thus map a portion of an enclave with different
permissions.

This implementation also supports the workflow that if a portion of an enclave
is already mapped then it is possible to change the EPCM permissions without
requiring the memory to be remapped.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-15 16:49                                               ` Jarkko Sakkinen
@ 2022-01-18 21:18                                                 ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2022-01-18 21:18 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 1/15/2022 8:49 AM, Jarkko Sakkinen wrote:
> On Sat, Jan 15, 2022 at 01:15:53AM +0200, Jarkko Sakkinen wrote:
>>> After running ENCLU[EMODPE] user space uses SGX_IOC_ENCLAVE_MOD_PROTECTIONS
>>
>> OK, great.
>>
>> A minor nit: please call it SGX_IOC_ENCLAVE_MODIFY_PROTECTIONS. 
> 
> I'm not confident after looking through the test case and ioctl
> about EMODPE support but I do not want disturb this anymore. Bunch
> of things have been nailed and I'm now running the code, which is
> great.
> 
> The obviously wrong implementation choice in this ioctl is that
> it is multi-function. It should be just split it into two ioctls:
> sgx_restrict_page_permissions and sgx_extend_page_permissions.

Sure, I can move it to two ioctls.

To keep the naming consistent, what do you think of:
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
SGX_IOC_ENCLAVE_RELAX_PERMISSIONS

Please refer to the message below for the motivation behind the "relax" term:
https://lore.kernel.org/lkml/24447a03-139a-c7e0-9ad5-34e2019c4df5@intel.com/

> 
> They are conceptually different flows and I'm also basing this on earlier
> discussion in this mailing list from which I conclude that it is also
> consensus to not have such ioctls.
> 
> Might sound clanky but it is much easier to comprehend what is going
> on "in the blackbox" by doing that split.

Reinette

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-18 20:59                                                               ` Reinette Chatre
@ 2022-01-20 12:53                                                                 ` Jarkko Sakkinen
  2022-01-20 16:52                                                                   ` Reinette Chatre
  0 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-20 12:53 UTC (permalink / raw)
  To: Reinette Chatre, Nathaniel McCallum
  Cc: Haitao Huang, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

On Tue, 2022-01-18 at 12:59 -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
> > On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum
> > > wrote:
> > > > On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen
> > > > <jarkko@kernel.org> wrote:
> > > > > 
> > > > > On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen
> > > > > wrote:
> > > > > > On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre
> > > > > > wrote:
> > > > > > > Hi Jarkko,
> > > > > > > 
> > > > > > > On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> > > > > > > > On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette
> > > > > > > > Chatre wrote:
> > > > > > > > > Hi Jarkko,
> > > > > > > > > 
> > > > > > > > > On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> > > > > > > > > > On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette
> > > > > > > > > > Chatre wrote:
> > > > > > > > > > > Hi Jarkko,
> > > > > > > > > > 
> > > > > > > > > > How enclave can check a page range that EPCM has
> > > > > > > > > > the expected permissions?
> > > > > > > > > 
> > > > > > > > > Only way to change EPCM permissions from outside
> > > > > > > > > enclave is to run ENCLS[EMODPR]
> > > > > > > > > that needs to be accepted from within the enclave via
> > > > > > > > > ENCLU[EACCEPT]. At that
> > > > > > > > > time the enclave provides the expected permissions
> > > > > > > > > and that will fail
> > > > > > > > > if there is a mismatch with the EPCM permissions
> > > > > > > > > (SGX_PAGE_ATTRIBUTES_MISMATCH).
> > > > > > > > 
> > > > > > > > This is a very valid point but that does make the
> > > > > > > > introspection possible
> > > > > > > > only at the time of EACCEPT.
> > > > > > > > 
> > > > > > > > It does not give tools for enclave to make sure that
> > > > > > > > EMODPR-ETRACK dance
> > > > > > > > was ever exercised.
> > > > > > > 
> > > > > > > Could you please elaborate? EACCEPT is available to the
> > > > > > > enclave as a tool
> > > > > > > and it would fail if ETRACK was not completed (error
> > > > > > > SGX_NOT_TRACKED).
> > > > > > > 
> > > > > > > Here is the relevant snippet from the SDM from the
> > > > > > > section where it
> > > > > > > describes EACCEPT:
> > > > > > > 
> > > > > > > IF (Tracking not correct)
> > > > > > >     THEN
> > > > > > >         RFLAGS.ZF := 1;
> > > > > > >         RAX := SGX_NOT_TRACKED;
> > > > > > >         GOTO DONE;
> > > > > > > FI;
> > > > > > > 
> > > > > > > Reinette
> > > > > > 
> > > > > > Yes, if enclave calls EACCEPT it does the necessary
> > > > > > introspection and makes
> > > > > > sure that ETRACK is completed. I have trouble understanding
> > > > > > how enclave
> > > > > > makes sure that EACCEPT was called.
> > > > > 
> > > > > I'm not concerned of anything going wrong once EMODPR has
> > > > > been started.
> > > > > 
> > > > > The problem nails down to that the whole EMODPR process is
> > > > > spawned by
> > > > > the entity that is not trusted so maybe that should further
> > > > > broke down
> > > > > to three roles:
> > > > > 
> > > > > 1. Build process B
> > > > > 2. Runner process R.
> > > > > 3. Enclave E.
> > > > > 
> > > > > And to the costraint that we trust B *more* than R. Once B
> > > > > has done all the
> > > > > needed EMODPR calls it would send the file descriptor to R.
> > > > > Even if R would
> > > > > have full access to /dev/sgx_enclave, it would not matter,
> > > > > since B has done
> > > > > EMODPR-EACCEPT dance with E.
> > > > > 
> > > > > So what you can achieve with EMODPR is not protection against
> > > > > mistrusted
> > > > > *OS*. There's absolutely no chance you could use it for that
> > > > > purpose
> > > > > because mistrusted OS controls the whole process.
> > > > > 
> > > > > EMODPR is to help to protect enclave against mistrusted
> > > > > *process*, i.e.
> > > > > in the above scenario R.
> > > > 
> > > > There are two general cases that I can see. Both are valid.
> > > > 
> > > > 1. The OS moves from a trusted to an untrusted state. This
> > > > could be
> > > > the multi-process system you've described. But it could also be
> > > > that
> > > > the kernel becomes compromised after the enclave is fully
> > > > initialized.
> > > > 
> > > > 2. The OS is untrustworthy from the start.
> > > > 
> > > > The second case is the stronger one and if you can solve it,
> > > > the first
> > > > one is solved implicitly. And our end goal is that if the OS
> > > > does
> > > > anything malicious we will crash in a controlled way.
> > > > 
> > > > A defensive enclave will always want to have the least number
> > > > of
> > > > privileges for the maximum protection. Therefore, the enclave
> > > > will
> > > > want the OS to call EMODPR. If that were it, the host could
> > > > just lie.
> > > > But the enclave also verifies that the EMODPR operation was, in
> > > > fact,
> > > > executed by doing EACCEPT. When the enclave calls EACCEPT, if
> > > > the
> > > > kernel hasn't restricted permissions then we get a controlled
> > > > crash.
> > > > Therefore, we have solved the second case.
> > > 
> > > So you're referring to this part of the SDM pseude code in the
> > > SDM:
> > > 
> > > (* Check the destination EPC page for concurrency *)
> > > IF ( EPC page in use )
> > >     THEN #GP(0); FI;
> > > 
> > > I wonder does "EPC page in use" unconditionally trigger when
> > > EACCEPT
> > > is invoked for a page for which all of these conditions hold:
> > > 
> > > - .PR := 0 (no EMODPR in progress)
> > > - .MODIFIED := 0 (no EMODT in progress)
> > > - .PENDING := 0 (no EMODPR in progress)
> > > 
> > > I don't know the exact scope and scale of "EPC page in use".
> > > 
> > > Then, yes, EACCEPT could be at least used to validate that one of
> > > the
> > > three operations above was requested. However, enclave thread
> > > cannot say
> > > which one was it, so it is guesswork.
> > 
> > OK, I got it, and this last paragraph is not true. SECINFO given
> > EACCEPT
> > will lock in rest of the details and make the operation
> > deterministic.
> 
> Indeed - so the SDM pseudo code that is relevant here can be found
> under
> the "(* Verify that accept request matches current EPC page settings
> *)"
> comment where the enclave can verify that all EPCM values are as they
> should
> and would fail with SGX_PAGE_ATTRIBUTES_MISMATCH if there is anything
> amiss.
> 
> > 
> > The only question mark then is the condition when no requests are
> > active.
> 
> Could you please elaborate what you mean with this question? If no
> request
> is active then I understand that to mean that no request has started.

My issue was this. When all of these hold:

- .PR := 0 (no EMODPR in progress)
- .MODIFIED := 0 (no EMODT in progress)
- .PENDING := 0 (no EAUG in progress)

does calling EACCEPT trigger #GP?

I don't think the answer matters that much though, since if e.g. EMODPR was
never done and the enclave expected a change, #GP would trigger eventually
in SECINFO validation.

The way I look at EACCEPT is as a memory verification tool: it does at
run-time the same thing that EINIT does before run-time.

/Jarkko

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-20 12:53                                                                 ` Jarkko Sakkinen
@ 2022-01-20 16:52                                                                   ` Reinette Chatre
  2022-01-26 14:41                                                                     ` Jarkko Sakkinen
  0 siblings, 1 reply; 155+ messages in thread
From: Reinette Chatre @ 2022-01-20 16:52 UTC (permalink / raw)
  To: Jarkko Sakkinen, Nathaniel McCallum
  Cc: Haitao Huang, Andy Lutomirski, dave.hansen, tglx, bp, mingo,
	linux-sgx, x86, seanjc, kai.huang, cathy.zhang, cedric.xing,
	haitao.huang, mark.shanahan, hpa, linux-kernel

Hi Jarkko,

On 1/20/2022 4:53 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-01-18 at 12:59 -0800, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
>>> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
>>>> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum
>>>> wrote:
>>>>> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen
>>>>> <jarkko@kernel.org> wrote:
>>>>>>
>>>>>> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen
>>>>>> wrote:
>>>>>>> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre
>>>>>>> wrote:
>>>>>>>> Hi Jarkko,
>>>>>>>>
>>>>>>>> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
>>>>>>>>> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette
>>>>>>>>> Chatre wrote:
>>>>>>>>>> Hi Jarkko,
>>>>>>>>>>
>>>>>>>>>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
>>>>>>>>>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette
>>>>>>>>>>> Chatre wrote:
>>>>>>>>>>>> Hi Jarkko,
>>>>>>>>>>>
>>>>>>>>>>> How enclave can check a page range that EPCM has
>>>>>>>>>>> the expected permissions?
>>>>>>>>>>
>>>>>>>>>> Only way to change EPCM permissions from outside
>>>>>>>>>> enclave is to run ENCLS[EMODPR]
>>>>>>>>>> that needs to be accepted from within the enclave via
>>>>>>>>>> ENCLU[EACCEPT]. At that
>>>>>>>>>> time the enclave provides the expected permissions
>>>>>>>>>> and that will fail
>>>>>>>>>> if there is a mismatch with the EPCM permissions
>>>>>>>>>> (SGX_PAGE_ATTRIBUTES_MISMATCH).
>>>>>>>>>
>>>>>>>>> This is a very valid point but that does make the
>>>>>>>>> introspection possible
>>>>>>>>> only at the time of EACCEPT.
>>>>>>>>>
>>>>>>>>> It does not give tools for enclave to make sure that
>>>>>>>>> EMODPR-ETRACK dance
>>>>>>>>> was ever exercised.
>>>>>>>>
>>>>>>>> Could you please elaborate? EACCEPT is available to the
>>>>>>>> enclave as a tool
>>>>>>>> and it would fail if ETRACK was not completed (error
>>>>>>>> SGX_NOT_TRACKED).
>>>>>>>>
>>>>>>>> Here is the relevant snippet from the SDM from the
>>>>>>>> section where it
>>>>>>>> describes EACCEPT:
>>>>>>>>
>>>>>>>> IF (Tracking not correct)
>>>>>>>>     THEN
>>>>>>>>         RFLAGS.ZF := 1;
>>>>>>>>         RAX := SGX_NOT_TRACKED;
>>>>>>>>         GOTO DONE;
>>>>>>>> FI;
>>>>>>>>
>>>>>>>> Reinette
>>>>>>>
>>>>>>> Yes, if enclave calls EACCEPT it does the necessary
>>>>>>> introspection and makes
>>>>>>> sure that ETRACK is completed. I have trouble understanding
>>>>>>> how enclave
>>>>>>> makes sure that EACCEPT was called.
>>>>>>
>>>>>> I'm not concerned of anything going wrong once EMODPR has
>>>>>> been started.
>>>>>>
>>>>>> The problem nails down to that the whole EMODPR process is
>>>>>> spawned by
>>>>>> the entity that is not trusted so maybe that should be further
>>>>>> broken down
>>>>>> to three roles:
>>>>>>
>>>>>> 1. Build process B
>>>>>> 2. Runner process R.
>>>>>> 3. Enclave E.
>>>>>>
>>>>>> And to the constraint that we trust B *more* than R. Once B
>>>>>> has done all the
>>>>>> needed EMODPR calls it would send the file descriptor to R.
>>>>>> Even if R would
>>>>>> have full access to /dev/sgx_enclave, it would not matter,
>>>>>> since B has done
>>>>>> EMODPR-EACCEPT dance with E.
>>>>>>
>>>>>> So what you can achieve with EMODPR is not protection against
>>>>>> mistrusted
>>>>>> *OS*. There's absolutely no chance you could use it for that
>>>>>> purpose
>>>>>> because mistrusted OS controls the whole process.
>>>>>>
>>>>>> EMODPR is to help to protect enclave against mistrusted
>>>>>> *process*, i.e.
>>>>>> in the above scenario R.
>>>>>
>>>>> There are two general cases that I can see. Both are valid.
>>>>>
>>>>> 1. The OS moves from a trusted to an untrusted state. This
>>>>> could be
>>>>> the multi-process system you've described. But it could also be
>>>>> that
>>>>> the kernel becomes compromised after the enclave is fully
>>>>> initialized.
>>>>>
>>>>> 2. The OS is untrustworthy from the start.
>>>>>
>>>>> The second case is the stronger one and if you can solve it,
>>>>> the first
>>>>> one is solved implicitly. And our end goal is that if the OS
>>>>> does
>>>>> anything malicious we will crash in a controlled way.
>>>>>
>>>>> A defensive enclave will always want to have the least number
>>>>> of
>>>>> privileges for the maximum protection. Therefore, the enclave
>>>>> will
>>>>> want the OS to call EMODPR. If that were it, the host could
>>>>> just lie.
>>>>> But the enclave also verifies that the EMODPR operation was, in
>>>>> fact,
>>>>> executed by doing EACCEPT. When the enclave calls EACCEPT, if
>>>>> the
>>>>> kernel hasn't restricted permissions then we get a controlled
>>>>> crash.
>>>>> Therefore, we have solved the second case.
>>>>
>>>> So you're referring to this part of the pseudo code in the
>>>> SDM:
>>>>
>>>> (* Check the destination EPC page for concurrency *)
>>>> IF ( EPC page in use )
>>>>     THEN #GP(0); FI;
>>>>
>>>> I wonder does "EPC page in use" unconditionally trigger when
>>>> EACCEPT
>>>> is invoked for a page for which all of these conditions hold:
>>>>
>>>> - .PR := 0 (no EMODPR in progress)
>>>> - .MODIFIED := 0 (no EMODT in progress)
>>>> - .PENDING := 0 (no EAUG in progress)
>>>>
>>>> I don't know the exact scope and scale of "EPC page in use".
>>>>
>>>> Then, yes, EACCEPT could be at least used to validate that one of
>>>> the
>>>> three operations above was requested. However, enclave thread
>>>> cannot say
>>>> which one was it, so it is guesswork.
>>>
>>> OK, I got it, and this last paragraph is not true. SECINFO given
>>> EACCEPT
>>> will lock in rest of the details and make the operation
>>> deterministic.
>>
>> Indeed - so the SDM pseudo code that is relevant here can be found
>> under
>> the "(* Verify that accept request matches current EPC page settings
>> *)"
>> comment where the enclave can verify that all EPCM values are as they
>> should
>> and would fail with SGX_PAGE_ATTRIBUTES_MISMATCH if there is anything
>> amiss.
>>
>>>
>>> The only question mark then is the condition when no requests are
>>> active.
>>
>> Could you please elaborate what you mean with this question? If no
>> request
>> is active then I understand that to mean that no request has started.
> 
> My issue was that when:
> 
> - .PR := 0 (no EMODPR in progress)
> - .MODIFIED := 0 (no EMODT in progress)
> - .PENDING := 0 (no EAUG in progress)
> 
> Does this trigger #GP when you call EACCEPT?

From what I understand a #GP would be triggered if the EACCEPT does not
specify at least one of these. That would be a problem with the EACCEPT
instruction as opposed to the EPCM contents or OS flow though. This
can be found under the following comment in the SDM pseudo code:

(* Check that the combination of requested PT, PENDING and MODIFIED is legal *)

As far as the actual checking of EPCM values goes, it would not result
in a #GP but for an unexpected value of MODIFIED or PENDING the EACCEPT
will fail with SGX_PAGE_ATTRIBUTES_MISMATCH. EACCEPT does not enforce the PR
bit but it _does_ enforce the individual permission bits.

> I don't think the answer matters that much though since if e.g. EMODPR was never
> done, and enclave expected a change, #GP would trigger eventually in SECINFO
> validation.

Similar here as I understand it will not be a #GP but EACCEPT failure with
error SGX_PAGE_ATTRIBUTES_MISMATCH. The relevant pseudo-code in the SDM is
below and you can see how MODIFIED and PENDING are matched but PR not (while
the individual permission bits are):

(* Verify that accept request matches current EPC page settings *)
IF ( (EPCM(DS:RCX).ENCLAVEADDRESS ≠ DS:RCX) or (EPCM(DS:RCX).PENDING ≠ SCRATCH_SECINFO.FLAGS.PENDING) or
(EPCM(DS:RCX).MODIFIED ≠ SCRATCH_SECINFO.FLAGS.MODIFIED) or (EPCM(DS:RCX).R ≠ SCRATCH_SECINFO.FLAGS.R) or
(EPCM(DS:RCX).W ≠ SCRATCH_SECINFO.FLAGS.W) or (EPCM(DS:RCX).X ≠ SCRATCH_SECINFO.FLAGS.X) or
(EPCM(DS:RCX).PT ≠ SCRATCH_SECINFO.FLAGS.PT) )
THEN
     RFLAGS.ZF := 1;
     RAX := SGX_PAGE_ATTRIBUTES_MISMATCH;
     GOTO DONE;
FI;
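
To make the comparison above concrete, here is a minimal user-space sketch
of the same check in C. It is only a model of the quoted pseudo code, not
kernel code: struct page_state, accept_check() and the enum names are
hypothetical names made up for illustration, and the value 2 for a regular
page is an assumption of this model. The PR field is carried in the struct
but deliberately left out of the comparison, which is the point made above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the fields EACCEPT compares; not a kernel type. */
struct page_state {
	uint64_t enclave_address;	/* only meaningful on the EPCM side */
	bool pending;
	bool modified;
	bool pr;			/* carried along, but NOT compared */
	bool r, w, x;
	uint8_t pt;
};

/* Illustrative page type value assumed by this model. */
enum { MODEL_PT_REG = 2 };

enum accept_result {
	ACCEPT_OK = 0,
	ACCEPT_ATTRIBUTES_MISMATCH = 1,	/* stands in for SGX_PAGE_ATTRIBUTES_MISMATCH */
};

/*
 * Mirror of "(* Verify that accept request matches current EPC page
 * settings *)": any difference is reported as an error code, never as
 * a fault.
 */
static enum accept_result accept_check(const struct page_state *epcm,
				       const struct page_state *secinfo,
				       uint64_t addr)
{
	assert(epcm && secinfo);

	if (epcm->enclave_address != addr ||
	    epcm->pending != secinfo->pending ||
	    epcm->modified != secinfo->modified ||
	    epcm->r != secinfo->r ||
	    epcm->w != secinfo->w ||
	    epcm->x != secinfo->x ||
	    epcm->pt != secinfo->pt)
		return ACCEPT_ATTRIBUTES_MISMATCH;

	return ACCEPT_OK;
}
```

In this model, an enclave that expects a page to have been restricted from
RW to R passes a SECINFO with only R set. If the OS never ran EMODPR the
EPCM side still has W set, so accept_check() returns the mismatch value
instead of faulting. Once W is cleared on the EPCM side the check passes,
whether or not the PR fields agree.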


> 
> The way I look at EACCEPT is as a memory verification tool: it does the same at
> run-time as EINIT does before run-time.

Indeed.

Reinette



* Re: [PATCH 05/25] x86/sgx: Introduce runtime protection bits
  2022-01-20 16:52                                                                   ` Reinette Chatre
@ 2022-01-26 14:41                                                                     ` Jarkko Sakkinen
  0 siblings, 0 replies; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-01-26 14:41 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: Nathaniel McCallum, Haitao Huang, Andy Lutomirski, dave.hansen,
	tglx, bp, mingo, linux-sgx, x86, seanjc, kai.huang, cathy.zhang,
	cedric.xing, haitao.huang, mark.shanahan, hpa, linux-kernel

On Thu, Jan 20, 2022 at 08:52:28AM -0800, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 1/20/2022 4:53 AM, Jarkko Sakkinen wrote:
> > On Tue, 2022-01-18 at 12:59 -0800, Reinette Chatre wrote:
> >> Hi Jarkko,
> >>
> >> On 1/17/2022 6:22 PM, Jarkko Sakkinen wrote:
> >>> On Tue, Jan 18, 2022 at 03:59:29AM +0200, Jarkko Sakkinen wrote:
> >>>> On Mon, Jan 17, 2022 at 08:13:32AM -0500, Nathaniel McCallum
> >>>> wrote:
> >>>>> On Sat, Jan 15, 2022 at 6:57 AM Jarkko Sakkinen
> >>>>> <jarkko@kernel.org> wrote:
> >>>>>>
> >>>>>> On Sat, Jan 15, 2022 at 03:18:04AM +0200, Jarkko Sakkinen
> >>>>>> wrote:
> >>>>>>> On Fri, Jan 14, 2022 at 04:41:59PM -0800, Reinette Chatre
> >>>>>>> wrote:
> >>>>>>>> Hi Jarkko,
> >>>>>>>>
> >>>>>>>> On 1/14/2022 4:27 PM, Jarkko Sakkinen wrote:
> >>>>>>>>> On Fri, Jan 14, 2022 at 04:01:33PM -0800, Reinette
> >>>>>>>>> Chatre wrote:
> >>>>>>>>>> Hi Jarkko,
> >>>>>>>>>>
> >>>>>>>>>> On 1/14/2022 3:15 PM, Jarkko Sakkinen wrote:
> >>>>>>>>>>> On Fri, Jan 14, 2022 at 03:05:21PM -0800, Reinette
> >>>>>>>>>>> Chatre wrote:
> >>>>>>>>>>>> Hi Jarkko,
> >>>>>>>>>>>
> >>>>>>>>>>> How enclave can check a page range that EPCM has
> >>>>>>>>>>> the expected permissions?
> >>>>>>>>>>
> >>>>>>>>>> Only way to change EPCM permissions from outside
> >>>>>>>>>> enclave is to run ENCLS[EMODPR]
> >>>>>>>>>> that needs to be accepted from within the enclave via
> >>>>>>>>>> ENCLU[EACCEPT]. At that
> >>>>>>>>>> time the enclave provides the expected permissions
> >>>>>>>>>> and that will fail
> >>>>>>>>>> if there is a mismatch with the EPCM permissions
> >>>>>>>>>> (SGX_PAGE_ATTRIBUTES_MISMATCH).
> >>>>>>>>>
> >>>>>>>>> This is a very valid point but that does make the
> >>>>>>>>> introspection possible
> >>>>>>>>> only at the time of EACCEPT.
> >>>>>>>>>
> >>>>>>>>> It does not give tools for enclave to make sure that
> >>>>>>>>> EMODPR-ETRACK dance
> >>>>>>>>> was ever exercised.
> >>>>>>>>
> >>>>>>>> Could you please elaborate? EACCEPT is available to the
> >>>>>>>> enclave as a tool
> >>>>>>>> and it would fail if ETRACK was not completed (error
> >>>>>>>> SGX_NOT_TRACKED).
> >>>>>>>>
> >>>>>>>> Here is the relevant snippet from the SDM from the
> >>>>>>>> section where it
> >>>>>>>> describes EACCEPT:
> >>>>>>>>
> >>>>>>>> IF (Tracking not correct)
> >>>>>>>>     THEN
> >>>>>>>>         RFLAGS.ZF := 1;
> >>>>>>>>         RAX := SGX_NOT_TRACKED;
> >>>>>>>>         GOTO DONE;
> >>>>>>>> FI;
> >>>>>>>>
> >>>>>>>> Reinette
> >>>>>>>
> >>>>>>> Yes, if enclave calls EACCEPT it does the necessary
> >>>>>>> introspection and makes
> >>>>>>> sure that ETRACK is completed. I have trouble understanding
> >>>>>>> how enclave
> >>>>>>> makes sure that EACCEPT was called.
> >>>>>>
> >>>>>> I'm not concerned of anything going wrong once EMODPR has
> >>>>>> been started.
> >>>>>>
> >>>>>> The problem nails down to that the whole EMODPR process is
> >>>>>> spawned by
> >>>>>> the entity that is not trusted so maybe that should be further
> >>>>>> broken down
> >>>>>> to three roles:
> >>>>>>
> >>>>>> 1. Build process B
> >>>>>> 2. Runner process R.
> >>>>>> 3. Enclave E.
> >>>>>>
> >>>>>> And to the constraint that we trust B *more* than R. Once B
> >>>>>> has done all the
> >>>>>> needed EMODPR calls it would send the file descriptor to R.
> >>>>>> Even if R would
> >>>>>> have full access to /dev/sgx_enclave, it would not matter,
> >>>>>> since B has done
> >>>>>> EMODPR-EACCEPT dance with E.
> >>>>>>
> >>>>>> So what you can achieve with EMODPR is not protection against
> >>>>>> mistrusted
> >>>>>> *OS*. There's absolutely no chance you could use it for that
> >>>>>> purpose
> >>>>>> because mistrusted OS controls the whole process.
> >>>>>>
> >>>>>> EMODPR is to help to protect enclave against mistrusted
> >>>>>> *process*, i.e.
> >>>>>> in the above scenario R.
> >>>>>
> >>>>> There are two general cases that I can see. Both are valid.
> >>>>>
> >>>>> 1. The OS moves from a trusted to an untrusted state. This
> >>>>> could be
> >>>>> the multi-process system you've described. But it could also be
> >>>>> that
> >>>>> the kernel becomes compromised after the enclave is fully
> >>>>> initialized.
> >>>>>
> >>>>> 2. The OS is untrustworthy from the start.
> >>>>>
> >>>>> The second case is the stronger one and if you can solve it,
> >>>>> the first
> >>>>> one is solved implicitly. And our end goal is that if the OS
> >>>>> does
> >>>>> anything malicious we will crash in a controlled way.
> >>>>>
> >>>>> A defensive enclave will always want to have the least number
> >>>>> of
> >>>>> privileges for the maximum protection. Therefore, the enclave
> >>>>> will
> >>>>> want the OS to call EMODPR. If that were it, the host could
> >>>>> just lie.
> >>>>> But the enclave also verifies that the EMODPR operation was, in
> >>>>> fact,
> >>>>> executed by doing EACCEPT. When the enclave calls EACCEPT, if
> >>>>> the
> >>>>> kernel hasn't restricted permissions then we get a controlled
> >>>>> crash.
> >>>>> Therefore, we have solved the second case.
> >>>>
> >>>> So you're referring to this part of the pseudo code in the
> >>>> SDM:
> >>>>
> >>>> (* Check the destination EPC page for concurrency *)
> >>>> IF ( EPC page in use )
> >>>>     THEN #GP(0); FI;
> >>>>
> >>>> I wonder does "EPC page in use" unconditionally trigger when
> >>>> EACCEPT
> >>>> is invoked for a page for which all of these conditions hold:
> >>>>
> >>>> - .PR := 0 (no EMODPR in progress)
> >>>> - .MODIFIED := 0 (no EMODT in progress)
> >>>> - .PENDING := 0 (no EAUG in progress)
> >>>>
> >>>> I don't know the exact scope and scale of "EPC page in use".
> >>>>
> >>>> Then, yes, EACCEPT could be at least used to validate that one of
> >>>> the
> >>>> three operations above was requested. However, enclave thread
> >>>> cannot say
> >>>> which one was it, so it is guesswork.
> >>>
> >>> OK, I got it, and this last paragraph is not true. SECINFO given
> >>> EACCEPT
> >>> will lock in rest of the details and make the operation
> >>> deterministic.
> >>
> >> Indeed - so the SDM pseudo code that is relevant here can be found
> >> under
> >> the "(* Verify that accept request matches current EPC page settings
> >> *)"
> >> comment where the enclave can verify that all EPCM values are as they
> >> should
> >> and would fail with SGX_PAGE_ATTRIBUTES_MISMATCH if there is anything
> >> amiss.
> >>
> >>>
> >>> The only question mark then is the condition when no requests are
> >>> active.
> >>
> >> Could you please elaborate what you mean with this question? If no
> >> request
> >> is active then I understand that to mean that no request has started.
> > 
> > My issue was that when:
> > 
> > - .PR := 0 (no EMODPR in progress)
> > - .MODIFIED := 0 (no EMODT in progress)
> > - .PENDING := 0 (no EAUG in progress)
> > 
> > Does this trigger #GP when you call EACCEPT?
> 
> From what I understand a #GP would be triggered if the EACCEPT does not
> specify at least one of these. That would be a problem with the EACCEPT
> instruction as opposed to the EPCM contents or OS flow though. This
> can be found under the following comment in the SDM pseudo code:
> 
> (* Check that the combination of requested PT, PENDING and MODIFIED is legal *)
> 
> As far as the actual checking of EPCM values goes, it would not result
> in a #GP but for an unexpected value of MODIFIED or PENDING the EACCEPT
> will fail with SGX_PAGE_ATTRIBUTES_MISMATCH. EACCEPT does not enforce the PR
> bit but it _does_ enforce the individual permission bits.
> 
> > I don't think the answer matters that much though since if e.g. EMODPR was never
> > done, and enclave expected a change, #GP would trigger eventually in SECINFO
> > validation.
> 
> Similar here as I understand it will not be a #GP but EACCEPT failure with
> error SGX_PAGE_ATTRIBUTES_MISMATCH. The relevant pseudo-code in the SDM is
> below and you can see how MODIFIED and PENDING are matched but PR not (while
> the individual permission bits are):
> 
> (* Verify that accept request matches current EPC page settings *)
> IF ( (EPCM(DS:RCX).ENCLAVEADDRESS ≠ DS:RCX) or (EPCM(DS:RCX).PENDING ≠ SCRATCH_SECINFO.FLAGS.PENDING) or
> (EPCM(DS:RCX).MODIFIED ≠ SCRATCH_SECINFO.FLAGS.MODIFIED) or (EPCM(DS:RCX).R ≠ SCRATCH_SECINFO.FLAGS.R) or
> (EPCM(DS:RCX).W ≠ SCRATCH_SECINFO.FLAGS.W) or (EPCM(DS:RCX).X ≠ SCRATCH_SECINFO.FLAGS.X) or
> (EPCM(DS:RCX).PT ≠ SCRATCH_SECINFO.FLAGS.PT) )
> THEN
>      RFLAGS.ZF := 1;
>      RAX := SGX_PAGE_ATTRIBUTES_MISMATCH;
>      GOTO DONE;
> FI;
> 
> 
> > 
> > The way I look at EACCEPT is as a memory verification tool: it does the same at
> > run-time as EINIT does before run-time.
> 
> Indeed.

I think I got this now. Thank you anyway for further explanation :-)

> Reinette

/Jarkko


* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2021-12-01 19:23 ` [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave Reinette Chatre
  2021-12-03  0:38   ` Dave Hansen
  2021-12-04 23:13   ` Jarkko Sakkinen
@ 2022-03-01 15:13   ` Jarkko Sakkinen
  2022-03-01 17:08     ` Reinette Chatre
  2 siblings, 1 reply; 155+ messages in thread
From: Jarkko Sakkinen @ 2022-03-01 15:13 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Wed, Dec 01, 2021 at 11:23:11AM -0800, Reinette Chatre wrote:
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index 848a28d28d3d..1b6ce1da7c92 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -123,4 +123,6 @@ void sgx_encl_free_epc_page(struct sgx_epc_page *page);
>  struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>  					 unsigned long addr);
>  
> +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl);
> +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
>  #endif /* _X86_ENCL_H */
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 5dddb3c9f742..de0bf68ee842 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -17,7 +17,7 @@
>  #include "encl.h"
>  #include "encls.h"
>  
> -static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
> +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
>  {
>  	struct sgx_va_page *va_page = NULL;
>  	void *err;
> @@ -43,7 +43,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
>  	return va_page;
>  }
>  
> -static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
> +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
>  {
>  	encl->page_cnt--;

Nit: this should be a separate patch, e.g.

  x86/sgx: Export sgx_encl_{grow,shrink}()

  In order to use sgx_encl_{grow,shrink}() in the page augmentation code
  located in encl.c, export these functions.

BR, Jarkko


* Re: [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave
  2022-03-01 15:13   ` Jarkko Sakkinen
@ 2022-03-01 17:08     ` Reinette Chatre
  0 siblings, 0 replies; 155+ messages in thread
From: Reinette Chatre @ 2022-03-01 17:08 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 3/1/2022 7:13 AM, Jarkko Sakkinen wrote:
> On Wed, Dec 01, 2021 at 11:23:11AM -0800, Reinette Chatre wrote:
>> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
>> index 848a28d28d3d..1b6ce1da7c92 100644
>> --- a/arch/x86/kernel/cpu/sgx/encl.h
>> +++ b/arch/x86/kernel/cpu/sgx/encl.h
>> @@ -123,4 +123,6 @@ void sgx_encl_free_epc_page(struct sgx_epc_page *page);
>>  struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>>  					 unsigned long addr);
>>  
>> +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl);
>> +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
>>  #endif /* _X86_ENCL_H */
>> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
>> index 5dddb3c9f742..de0bf68ee842 100644
>> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
>> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
>> @@ -17,7 +17,7 @@
>>  #include "encl.h"
>>  #include "encls.h"
>>  
>> -static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
>> +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
>>  {
>>  	struct sgx_va_page *va_page = NULL;
>>  	void *err;
>> @@ -43,7 +43,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
>>  	return va_page;
>>  }
>>  
>> -static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
>> +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
>>  {
>>  	encl->page_cnt--;
> 
> Nit: this should be a separate patch, e.g.
> 
>   x86/sgx: Export sgx_encl_{grow,shrink}()
> 
>   In order to use sgx_encl_{grow,shrink}() in the page augmentation code
>   located in encl.c, export these functions.
> 

Sure, will do.

Reinette


end of thread, other threads:[~2022-03-01 17:08 UTC | newest]

Thread overview: 155+ messages
2021-12-01 19:22 [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
2021-12-01 19:22 ` [PATCH 01/25] x86/sgx: Add shortlog descriptions to ENCLS wrappers Reinette Chatre
2021-12-04 18:30   ` Jarkko Sakkinen
2021-12-06 21:13     ` Reinette Chatre
2021-12-11  5:28       ` Jarkko Sakkinen
2021-12-13 22:06         ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 02/25] x86/sgx: Add wrappers for SGX2 functions Reinette Chatre
2021-12-04 22:04   ` Jarkko Sakkinen
2021-12-06 21:15     ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 03/25] x86/sgx: Support VMA permissions exceeding enclave permissions Reinette Chatre
2021-12-04 22:25   ` Jarkko Sakkinen
2021-12-04 22:27     ` Jarkko Sakkinen
2021-12-06 21:16       ` Reinette Chatre
2021-12-11  5:39         ` Jarkko Sakkinen
2021-12-13 22:08           ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 04/25] x86/sgx: Add pfn_mkwrite() handler for present PTEs Reinette Chatre
2021-12-04 22:43   ` Jarkko Sakkinen
2021-12-06 21:18     ` Reinette Chatre
2021-12-11  7:37       ` Jarkko Sakkinen
2021-12-13 22:09         ` Reinette Chatre
2021-12-28 14:51           ` Jarkko Sakkinen
2021-12-01 19:23 ` [PATCH 05/25] x86/sgx: Introduce runtime protection bits Reinette Chatre
2021-12-03 19:28   ` Andy Lutomirski
2021-12-03 22:12     ` Reinette Chatre
2021-12-04  0:38       ` Andy Lutomirski
2021-12-04  1:14         ` Reinette Chatre
2021-12-04 17:56           ` Andy Lutomirski
2021-12-04 23:55             ` Reinette Chatre
2021-12-13 22:34               ` Reinette Chatre
2021-12-04 23:57     ` Jarkko Sakkinen
2021-12-06 21:20       ` Reinette Chatre
2021-12-11  7:42         ` Jarkko Sakkinen
2021-12-13 22:10           ` Reinette Chatre
2021-12-28 14:52             ` Jarkko Sakkinen
2022-01-06 17:46               ` Reinette Chatre
2022-01-07 12:16                 ` Jarkko Sakkinen
2022-01-07 16:14                   ` Haitao Huang
2022-01-08 15:45                     ` Jarkko Sakkinen
2022-01-08 15:51                       ` Jarkko Sakkinen
2022-01-08 16:22                         ` Jarkko Sakkinen
2022-01-10 22:05                           ` Haitao Huang
2022-01-11  1:53                             ` Jarkko Sakkinen
2022-01-11  1:55                               ` Jarkko Sakkinen
2022-01-11  2:03                                 ` Jarkko Sakkinen
2022-01-11  2:15                                   ` Jarkko Sakkinen
2022-01-11  3:48                                     ` Haitao Huang
2022-01-12 23:48                                       ` Jarkko Sakkinen
2022-01-13  2:41                                         ` Haitao Huang
2022-01-14 21:36                                           ` Jarkko Sakkinen
2022-01-11 17:13                               ` Reinette Chatre
2022-01-12 23:50                                 ` Jarkko Sakkinen
2022-01-12 23:56                                   ` Jarkko Sakkinen
2022-01-13 20:09                                     ` Nathaniel McCallum
2022-01-13 21:42                                       ` Reinette Chatre
2022-01-14 21:53                                         ` Jarkko Sakkinen
2022-01-14 21:57                                           ` Jarkko Sakkinen
2022-01-14 22:00                                             ` Jarkko Sakkinen
2022-01-14 22:17                                           ` Jarkko Sakkinen
2022-01-14 22:23                                             ` Jarkko Sakkinen
2022-01-14 22:34                                               ` Jarkko Sakkinen
2022-01-14 23:05                                           ` Reinette Chatre
2022-01-14 23:15                                             ` Jarkko Sakkinen
2022-01-15  0:01                                               ` Reinette Chatre
2022-01-15  0:27                                                 ` Jarkko Sakkinen
2022-01-15  0:41                                                   ` Reinette Chatre
2022-01-15  1:18                                                     ` Jarkko Sakkinen
2022-01-15 11:56                                                       ` Jarkko Sakkinen
2022-01-15 11:59                                                         ` Jarkko Sakkinen
2022-01-17 13:13                                                         ` Nathaniel McCallum
2022-01-18  1:59                                                           ` Jarkko Sakkinen
2022-01-18  2:22                                                             ` Jarkko Sakkinen
2022-01-18  3:31                                                               ` Jarkko Sakkinen
2022-01-18 20:59                                                               ` Reinette Chatre
2022-01-20 12:53                                                                 ` Jarkko Sakkinen
2022-01-20 16:52                                                                   ` Reinette Chatre
2022-01-26 14:41                                                                     ` Jarkko Sakkinen
2022-01-15 16:49                                               ` Jarkko Sakkinen
2022-01-18 21:18                                                 ` Reinette Chatre
2022-01-17 13:27                                         ` Nathaniel McCallum
2022-01-18 21:11                                           ` Reinette Chatre
2021-12-04 22:50   ` Jarkko Sakkinen
2021-12-06 21:28     ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 06/25] x86/sgx: Use more generic name for enclave cpumask function Reinette Chatre
2021-12-04 22:56   ` Jarkko Sakkinen
2021-12-06 21:29     ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 07/25] x86/sgx: Move PTE zap code to separate function Reinette Chatre
2021-12-04 22:59   ` Jarkko Sakkinen
2021-12-06 21:30     ` Reinette Chatre
2021-12-11  7:52       ` Jarkko Sakkinen
2021-12-13 22:11         ` Reinette Chatre
2021-12-28 14:55           ` Jarkko Sakkinen
2022-01-06 17:46             ` Reinette Chatre
2022-01-07 12:26               ` Jarkko Sakkinen
2021-12-01 19:23 ` [PATCH 08/25] x86/sgx: Make SGX IPI callback available internally Reinette Chatre
2021-12-04 23:00   ` Jarkko Sakkinen
2021-12-06 21:36     ` Reinette Chatre
2021-12-11  7:53       ` Jarkko Sakkinen
2021-12-01 19:23 ` [PATCH 09/25] x86/sgx: Keep record of SGX page type Reinette Chatre
2021-12-04 23:03   ` Jarkko Sakkinen
2021-12-01 19:23 ` [PATCH 10/25] x86/sgx: Support enclave page permission changes Reinette Chatre
2021-12-02 23:48   ` Dave Hansen
2021-12-03 18:18     ` Reinette Chatre
2021-12-03  0:32   ` Dave Hansen
2021-12-03 18:18     ` Reinette Chatre
2021-12-03 18:14   ` Dave Hansen
2021-12-03 18:49     ` Reinette Chatre
2021-12-03 19:38   ` Andy Lutomirski
2021-12-03 22:34     ` Reinette Chatre
2021-12-04  0:42       ` Andy Lutomirski
2021-12-04  1:35         ` Reinette Chatre
2021-12-04 23:08   ` Jarkko Sakkinen
2021-12-06 20:19     ` Dave Hansen
2021-12-11  5:17       ` Jarkko Sakkinen
2021-12-06 21:42     ` Reinette Chatre
2021-12-11  7:57       ` Jarkko Sakkinen
2021-12-13 22:12         ` Reinette Chatre
2021-12-28 14:56           ` Jarkko Sakkinen
2021-12-01 19:23 ` [PATCH 11/25] selftests/sgx: Add test for EPCM " Reinette Chatre
2021-12-01 19:23 ` [PATCH 12/25] selftests/sgx: Add test for TCS page " Reinette Chatre
2021-12-01 19:23 ` [PATCH 13/25] x86/sgx: Support adding of pages to initialized enclave Reinette Chatre
2021-12-03  0:38   ` Dave Hansen
2021-12-03 18:47     ` Reinette Chatre
2021-12-04 23:13   ` Jarkko Sakkinen
2021-12-06 21:44     ` Reinette Chatre
2021-12-11  8:00       ` Jarkko Sakkinen
2021-12-13 22:12         ` Reinette Chatre
2021-12-28 14:57           ` Jarkko Sakkinen
2022-03-01 15:13   ` Jarkko Sakkinen
2022-03-01 17:08     ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 14/25] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
2021-12-04 23:14   ` Jarkko Sakkinen
2021-12-06 21:45     ` Reinette Chatre
2021-12-11  8:01       ` Jarkko Sakkinen
2021-12-01 19:23 ` [PATCH 15/25] selftests/sgx: Test two different SGX2 EAUG flows Reinette Chatre
2021-12-01 19:23 ` [PATCH 16/25] x86/sgx: Support modifying SGX page type Reinette Chatre
2021-12-04 23:45   ` Jarkko Sakkinen
2021-12-06 21:48     ` Reinette Chatre
2021-12-11  8:02       ` Jarkko Sakkinen
2021-12-13 17:43         ` Dave Hansen
2021-12-21  8:52           ` Jarkko Sakkinen
2021-12-01 19:23 ` [PATCH 17/25] x86/sgx: Support complete page removal Reinette Chatre
2021-12-04 23:45   ` Jarkko Sakkinen
2021-12-06 21:49     ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 18/25] selftests/sgx: Introduce dynamic entry point Reinette Chatre
2021-12-01 19:23 ` [PATCH 19/25] selftests/sgx: Introduce TCS initialization enclave operation Reinette Chatre
2021-12-01 19:23 ` [PATCH 20/25] selftests/sgx: Test complete changing of page type flow Reinette Chatre
2021-12-01 19:23 ` [PATCH 21/25] selftests/sgx: Test faulty enclave behavior Reinette Chatre
2021-12-01 19:23 ` [PATCH 22/25] selftests/sgx: Test invalid access to removed enclave page Reinette Chatre
2021-12-01 19:23 ` [PATCH 23/25] selftests/sgx: Test reclaiming of untouched page Reinette Chatre
2021-12-01 19:23 ` [PATCH 24/25] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
2021-12-04 23:47   ` Jarkko Sakkinen
2021-12-06 22:07     ` Reinette Chatre
2021-12-01 19:23 ` [PATCH 25/25] selftests/sgx: Page removal stress test Reinette Chatre
2021-12-02 18:30 ` [PATCH 00/25] x86/sgx and selftests/sgx: Support SGX2 Dave Hansen
2021-12-02 20:38   ` Nathaniel McCallum
