linux-kernel.vger.kernel.org archive mirror
* [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2
@ 2022-04-04 16:49 Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 01/30] x86/sgx: Add short descriptions to ENCLS wrappers Reinette Chatre
                   ` (29 more replies)
  0 siblings, 30 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

V2: https://lore.kernel.org/lkml/cover.1644274683.git.reinette.chatre@intel.com/

Changes since V2 that directly impact user space:
- Maximum allowed permissions of dynamically added pages are RWX,
  previously limited to RW. (Jarkko)
  Dynamically added pages are initially created with architecturally
  limited EPCM permissions of RW. mmap() and mprotect() of these pages
  with RWX permissions are no longer blocked by the SGX driver. PROT_EXEC
  on dynamically added pages becomes possible after running ENCLU[EMODPE]
  from within the enclave with appropriate VMA permissions.

- The kernel no longer attempts to track the EPCM runtime permissions. (Jarkko)
  Consequences are:
  - Kernel does not modify PTEs to follow EPCM permissions. User space
    will receive #PF with SGX error code in cases where the V2
    implementation would have resulted in regular (non-SGX) page fault
    error code.
  - SGX_IOC_ENCLAVE_RELAX_PERMISSIONS is removed. This ioctl() was used
    to clear PTEs after permissions were modified from within the enclave
    and ensure correct PTEs are installed. Since PTEs no longer track
    EPCM permissions the changes in EPCM permissions would not impact PTEs.
    As long as new permissions are within the maximum vetted permissions
    (vm_max_prot_bits) only ENCLU[EMODPE] from within enclave is needed,
    as accompanied by appropriate VMA permissions.

- struct sgx_enclave_restrict_perm renamed to
     sgx_enclave_restrict_permissions (Jarkko)

- struct sgx_enclave_modt renamed to struct sgx_enclave_modify_type
  to be consistent with the verbose naming of other SGX uapi structs.

Details about changes since V2 that do not directly impact user space:
- Kernel no longer tracks the runtime EPCM permissions with the aim of
  installing accurate PTEs. (Jarkko)
  - In support of this change the following patches were removed:
    Documentation/x86: Document SGX permission details
    x86/sgx: Support VMA permissions more relaxed than enclave permissions
    x86/sgx: Add pfn_mkwrite() handler for present PTEs
    x86/sgx: Add sgx_encl_page->vm_run_prot_bits for dynamic permission changes
    x86/sgx: Support relaxing of enclave page permissions
  - No more handling of scenarios where VMA permissions may be more
    relaxed than what the EPCM allows. Enclaves are not prevented
    from accessing such pages and the EPCM permissions are entrusted
    to control access as supported by the SGX error code in page faults.
  - No more explicit setting of protection bits in page fault handler.
    Protection bits are inherited from VMA similar to SGX1 support.

- Selftest patches are moved to the end of the series. (Jarkko)

- New patch contributed by Jarkko to avoid duplicated code:
   x86/sgx: Export sgx_encl_page_alloc()

- New patch separating changes from existing patch. (Jarkko)
   x86/sgx: Export sgx_encl_{grow,shrink}()

- New patch to keep one required benefit from the (now removed) kernel
  EPCM permission tracking:
   x86/sgx: Support loading enclave page without VMA permissions check

- Updated cover letter to reflect architecture changes.

- Many smaller changes, please refer to individual patches.

V1: https://lore.kernel.org/linux-sgx/cover.1638381245.git.reinette.chatre@intel.com/

Changes since V1 that directly impact user space:
- SGX2 permission changes changed from a single ioctl() named
  SGX_IOC_PAGE_MODP to two new ioctl()s:
  SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and
  SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, supported by two different
  parameter structures (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS does
  not support a result output parameter) (Jarkko).

  User space flow impact: After user space runs ENCLU[EMODPE] it
  needs to call SGX_IOC_ENCLAVE_RELAX_PERMISSIONS to have PTEs
  updated. Previously, running SGX_IOC_PAGE_MODP in this scenario
  resulted in EPCM.PR being set. Calling
  SGX_IOC_ENCLAVE_RELAX_PERMISSIONS no longer sets EPCM.PR, so
  there is no need for an additional ENCLU[EACCEPT].

- SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and
  SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
  obtain new permissions from secinfo as parameter instead of
  the permissions directly (Jarkko).

- ioctl() supporting SGX2 page type change is renamed from
  SGX_IOC_PAGE_MODT to SGX_IOC_ENCLAVE_MODIFY_TYPE (Jarkko).

- SGX_IOC_ENCLAVE_MODIFY_TYPE obtains new page type from secinfo
  as parameter instead of the page type directly (Jarkko).

- ioctl() supporting SGX2 page removal is renamed from
  SGX_IOC_PAGE_REMOVE to SGX_IOC_ENCLAVE_REMOVE_PAGES (Jarkko).

- All ioctl() parameter structures have been renamed as a result of the
  ioctl() renaming:
  SGX_IOC_ENCLAVE_RELAX_PERMISSIONS => struct sgx_enclave_relax_perm
  SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS => struct sgx_enclave_restrict_perm
  SGX_IOC_ENCLAVE_MODIFY_TYPE => struct sgx_enclave_modt
  SGX_IOC_ENCLAVE_REMOVE_PAGES => struct sgx_enclave_remove_pages

Changes since V1 that do not directly impact user space:
- Number of patches in series increased from 25 to 32 primarily because
  of splitting the original submission:
  - Wrappers for the new SGX2 functions are introduced in three separate
    patches replacing the original "x86/sgx: Add wrappers for SGX2
    functions"
    (Jarkko).
  - Moving and renaming sgx_encl_ewb_cpumask() is done with two patches
    replacing the original "x86/sgx: Use more generic name for enclave
    cpumask function" (Jarkko).
  - Support for SGX2 EPCM permission changes is split into two ioctl()s,
    one for relaxing and one for restricting permissions, each introduced
    by a new patch replacing the original "x86/sgx: Support enclave page
    permission changes" (Jarkko).
  - Extracted code used by existing ioctl()s for use by new ioctl()s
    into a new utility in new patch "x86/sgx: Create utility to validate
    user provided offset and length" (Dave did not specifically ask for
    this but it addresses his review feedback).
  - Two new Documentation patches to support the SGX2 work
    ("Documentation/x86: Introduce enclave runtime management") and
    a dedicated section on the enclave permission management
    ("Documentation/x86: Document SGX permission details") (Andy).
- Most patches were reworked to improve the language by:
  * referring to the exact item instead of an English rephrasing (Jarkko).
  * using ioctl() instead of ioctl throughout (Dave).
  * using "relaxed" instead of "exceed" when referring to permissions
    (Dave).
- Improved documentation with several additions to
  Documentation/x86/sgx.rst.
- Many smaller changes, please refer to individual patches.

Hi Everybody,

The current Linux kernel support for SGX includes support for SGX1 that
requires that an enclave be created with properties that accommodate all
usages over its (the enclave's) lifetime. This includes properties such
as permissions of enclave pages, the number of enclave pages, and the
number of threads supported by the enclave.

Consequences of this requirement to have the enclave be created to
accommodate all usages include:
* pages needing to support relocated code are required to have RWX
  permissions for their entire lifetime,
* an enclave needs to be created with the maximum stack and heap
  projected to be needed during the enclave's entire lifetime which
  can be longer than the processes running within it,
* an enclave needs to be created with support for the maximum number
  of threads projected to run in the enclave.

Since SGX1 a few more functions were introduced, collectively called
SGX2, that support modifications to an initialized enclave. Hardware
supporting these functions is already available, as listed at
https://github.com/ayeks/SGX-hardware

This series adds support for SGX2, also referred to as Enclave Dynamic
Memory Management (EDMM). This includes:

* Support modifying EPCM permissions of regular enclave pages belonging
  to an initialized enclave. Only permission restriction is supported
  via a new ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS. Relaxing of
  EPCM permissions can only be done from within the enclave with
  ENCLU[EMODPE].

* Support dynamic addition of regular enclave pages to an initialized
  enclave. At creation new pages are architecturally limited to RW EPCM
  permissions but will be accessible with PROT_EXEC after the enclave
  runs ENCLU[EMODPE] to relax EPCM permissions to RWX.
  Pages are dynamically added to an initialized enclave from the SGX
  page fault handler.

* Support expanding an initialized enclave to accommodate more threads.
  More threads can be accommodated by an enclave with the addition of
  Thread Control Structure (TCS) pages, which is done by changing the
  type of regular enclave pages to TCS pages using a new ioctl()
  SGX_IOC_ENCLAVE_MODIFY_TYPE.

* Support removing regular and TCS pages from an initialized enclave.
  Removing pages is accomplished in two stages as supported by two new
  ioctl()s SGX_IOC_ENCLAVE_MODIFY_TYPE (same ioctl() as mentioned in
  previous bullet) and SGX_IOC_ENCLAVE_REMOVE_PAGES.

* Tests covering all the new flows, some edge cases, and one
  comprehensive stress scenario.

No additional work is needed to support SGX2 in a virtualized
environment. All tests included in this series passed when run from
a guest as tested with the recent QEMU release based on 6.2.0
that supports SGX.

Patches 1 to 13 prepare the existing code for SGX2 support by
introducing the SGX2 functions, refactoring code, and tracking enclave
page types.

Patches 14 through 20 enable the SGX2 features and include a
Documentation patch.

Patches 21 through 30 test several scenarios of all the enabled
SGX2 features.

This series is based on v5.17 with the following fixes that have already
been merged for inclusion into v5.18-rc1. These can be obtained from the
x86/sgx branch of tip.git.

commit 2d03861e0d1d ("selftests/sgx: Fix NULL-pointer-dereference upon early test failure")
commit fff36bcbfde1 ("selftests/sgx: Do not attempt enclave build without valid enclave")
commit 2db703fc3b15 ("selftests/sgx: Ensure enclave data available during debug print")
commit 5626de65f97a ("selftests/sgx: Remove extra newlines in test output")
commit b06e15ebd5bf ("selftests/x86: Add validity check and allow field splitting")
commit 6170abb21e23 ("selftests/sgx: Treat CC as one argument")

Your feedback will be greatly appreciated.

Regards,

Reinette

Jarkko Sakkinen (1):
  x86/sgx: Export sgx_encl_page_alloc()

Reinette Chatre (29):
  x86/sgx: Add short descriptions to ENCLS wrappers
  x86/sgx: Add wrapper for SGX2 EMODPR function
  x86/sgx: Add wrapper for SGX2 EMODT function
  x86/sgx: Add wrapper for SGX2 EAUG function
  x86/sgx: Support loading enclave page without VMA permissions check
  x86/sgx: Export sgx_encl_ewb_cpumask()
  x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask()
  x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes()
  x86/sgx: Make sgx_ipi_cb() available internally
  x86/sgx: Create utility to validate user provided offset and length
  x86/sgx: Keep record of SGX page type
  x86/sgx: Export sgx_encl_{grow,shrink}()
  x86/sgx: Support restricting of enclave page permissions
  x86/sgx: Support adding of pages to an initialized enclave
  x86/sgx: Tighten accessible memory range after enclave initialization
  x86/sgx: Support modifying SGX page type
  x86/sgx: Support complete page removal
  x86/sgx: Free up EPC pages directly to support large page ranges
  Documentation/x86: Introduce enclave runtime management section
  selftests/sgx: Add test for EPCM permission changes
  selftests/sgx: Add test for TCS page permission changes
  selftests/sgx: Test two different SGX2 EAUG flows
  selftests/sgx: Introduce dynamic entry point
  selftests/sgx: Introduce TCS initialization enclave operation
  selftests/sgx: Test complete changing of page type flow
  selftests/sgx: Test faulty enclave behavior
  selftests/sgx: Test invalid access to removed enclave page
  selftests/sgx: Test reclaiming of untouched page
  selftests/sgx: Page removal stress test

 Documentation/x86/sgx.rst                     |   15 +
 arch/x86/include/asm/sgx.h                    |    8 +
 arch/x86/include/uapi/asm/sgx.h               |   62 +
 arch/x86/kernel/cpu/sgx/encl.c                |  330 +++-
 arch/x86/kernel/cpu/sgx/encl.h                |   13 +-
 arch/x86/kernel/cpu/sgx/encls.h               |   33 +
 arch/x86/kernel/cpu/sgx/ioctl.c               |  668 +++++++-
 arch/x86/kernel/cpu/sgx/main.c                |   70 +-
 arch/x86/kernel/cpu/sgx/sgx.h                 |    3 +
 tools/testing/selftests/sgx/defines.h         |   23 +
 tools/testing/selftests/sgx/load.c            |   41 +
 tools/testing/selftests/sgx/main.c            | 1456 +++++++++++++++++
 tools/testing/selftests/sgx/main.h            |    1 +
 tools/testing/selftests/sgx/test_encl.c       |   68 +
 .../selftests/sgx/test_encl_bootstrap.S       |    6 +
 15 files changed, 2675 insertions(+), 122 deletions(-)


base-commit: f443e374ae131c168a065ea1748feac6b2e76613
prerequisite-patch-id: 986260c8bc4255eb61e2c4afa88d2b723e376423
prerequisite-patch-id: ba014a99fced2b57d5d9e2dfb9d80ddf4333c13e
prerequisite-patch-id: 65cbb72889b6353a5639b984615d12019136b270
prerequisite-patch-id: e3296a2f0345a77c8a7ca91f76697ae2e1dca21f
prerequisite-patch-id: 0e792adec49b53020ee788fd0126e8f015ff483d
prerequisite-patch-id: b8685cf66d49f89ed7444feafa0129aa6144a163
-- 
2.25.1



* [PATCH V3 01/30] x86/sgx: Add short descriptions to ENCLS wrappers
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:52   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 02/30] x86/sgx: Add wrapper for SGX2 EMODPR function Reinette Chatre
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The SGX ENCLS instruction uses EAX to specify an SGX function and
may require additional registers, depending on the SGX function.
ENCLS invokes the specified privileged SGX function for managing
and debugging enclaves. Macros are used to wrap the ENCLS
functionality and several wrappers are used to wrap the macros to
make the different SGX functions accessible in the code.

The wrappers of the supported SGX functions are cryptic. Add short
descriptions of each as a comment.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- Fix commit message and subject to not refer to descriptions as
"changelog descriptions" or "shortlog descriptions" (Jarkko).
- Improve all descriptions with guidance from Jarkko.

 arch/x86/kernel/cpu/sgx/encls.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index fa04a73daf9c..0e22fa8f77c5 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -136,57 +136,71 @@ static inline bool encls_failed(int ret)
 	ret;						\
 	})
 
+/* Initialize an EPC page into an SGX Enclave Control Structure (SECS) page. */
 static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
 {
 	return __encls_2(ECREATE, pginfo, secs);
 }
 
+/* Hash a 256 byte region of an enclave page to SECS:MRENCLAVE. */
 static inline int __eextend(void *secs, void *addr)
 {
 	return __encls_2(EEXTEND, secs, addr);
 }
 
+/*
+ * Associate an EPC page to an enclave either as a REG or TCS page
+ * populated with the provided data.
+ */
 static inline int __eadd(struct sgx_pageinfo *pginfo, void *addr)
 {
 	return __encls_2(EADD, pginfo, addr);
 }
 
+/* Finalize enclave build, initialize enclave for user code execution. */
 static inline int __einit(void *sigstruct, void *token, void *secs)
 {
 	return __encls_ret_3(EINIT, sigstruct, secs, token);
 }
 
+/* Disassociate EPC page from its enclave and mark it as unused. */
 static inline int __eremove(void *addr)
 {
 	return __encls_ret_1(EREMOVE, addr);
 }
 
+/* Copy data to an EPC page belonging to a debug enclave. */
 static inline int __edbgwr(void *addr, unsigned long *data)
 {
 	return __encls_2(EDGBWR, *data, addr);
 }
 
+/* Copy data from an EPC page belonging to a debug enclave. */
 static inline int __edbgrd(void *addr, unsigned long *data)
 {
 	return __encls_1_1(EDGBRD, *data, addr);
 }
 
+/* Track that software has completed the required TLB address clears. */
 static inline int __etrack(void *addr)
 {
 	return __encls_ret_1(ETRACK, addr);
 }
 
+/* Load, verify, and unblock an EPC page. */
 static inline int __eldu(struct sgx_pageinfo *pginfo, void *addr,
 			 void *va)
 {
 	return __encls_ret_3(ELDU, pginfo, addr, va);
 }
 
+/* Make EPC page inaccessible to enclave, ready to be written to memory. */
 static inline int __eblock(void *addr)
 {
 	return __encls_ret_1(EBLOCK, addr);
 }
 
+/* Initialize an EPC page into a Version Array (VA) page. */
 static inline int __epa(void *addr)
 {
 	unsigned long rbx = SGX_PAGE_TYPE_VA;
@@ -194,6 +208,7 @@ static inline int __epa(void *addr)
 	return __encls_2(EPA, rbx, addr);
 }
 
+/* Invalidate an EPC page and write it out to main memory. */
 static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
 			void *va)
 {
-- 
2.25.1



* [PATCH V3 02/30] x86/sgx: Add wrapper for SGX2 EMODPR function
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 01/30] x86/sgx: Add short descriptions to ENCLS wrappers Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:53   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 03/30] x86/sgx: Add wrapper for SGX2 EMODT function Reinette Chatre
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Add a wrapper for the EMODPR ENCLS leaf function used to
restrict enclave page permissions as maintained in the
SGX hardware's Enclave Page Cache Map (EPCM).

EMODPR:
1) Updates the EPCM permissions of an enclave page by treating
   the new permissions as a mask. Supplying a value that attempts
   to relax EPCM permissions leaves the EPCM permissions unchanged,
   but the PR bit (see below) is still set.
2) Sets the PR bit in the EPCM entry of the enclave page to
   indicate that permission restriction is in progress. The bit
   is reset by the enclave by invoking ENCLU leaf function
   EACCEPT or EACCEPTCOPY.

The enclave may continue to access the page throughout the entire
process, provided its accesses conform to the EPCM permissions of
the enclave page.
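
As an illustration only (not part of this patch), the EPCM effect of
EMODPR described above can be modelled in plain C. The struct, the
MODEL_PERM_* bits and the model_*() helpers are hypothetical names
invented for this sketch. The real EPCM is managed by hardware, and
EACCEPT performs additional checks that are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical permission bits, used by this model only. */
#define MODEL_PERM_R 0x1u
#define MODEL_PERM_W 0x2u
#define MODEL_PERM_X 0x4u

/* Hypothetical, simplified stand-in for the EPCM entry of one page. */
struct model_epcm_entry {
	uint8_t perms;	/* current EPCM permissions */
	bool pr;	/* permission restriction in progress */
};

/*
 * Model of EMODPR: the requested permissions act as a mask, so bits
 * can only be removed. PR is set even if no permission bit changes.
 */
static void model_emodpr(struct model_epcm_entry *e, uint8_t requested)
{
	assert(e);
	e->perms &= requested;
	e->pr = true;
}

/* Model of EACCEPT/EACCEPTCOPY as run by the enclave: clears PR. */
static void model_eaccept(struct model_epcm_entry *e)
{
	assert(e);
	e->pr = false;
}
```

In this model, restricting an RW page to R leaves the page with R and
PR set. A later request for RWX leaves the permissions at R and sets
PR again, matching point 1) above.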

After performing the permission restriction by issuing EMODPR
the kernel needs to collaborate with the hardware to ensure that
all logical processors see the new restricted permissions. This
is required for the enclave's EACCEPT/EACCEPTCOPY to succeed and
is accomplished with the ETRACK flow.

Expand enum sgx_return_code with the possible EMODPR return
values.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Add detail to changelog that the PR bit is set even when an attempt
  to relax permissions using EMODPR leaves the EPCM permissions unchanged.

Changes since V1:
- Split original patch ("x86/sgx: Add wrappers for SGX2 functions")
  in three to introduce the SGX2 functions separately (Jarkko).
- Rewrite commit message to include how the EPCM within the hardware
  is changed by the SGX2 function as well as the calling
  conditions (Jarkko).
- Make short description more specific to which permissions (EPCM
  permissions) the function modifies.

 arch/x86/include/asm/sgx.h      | 5 +++++
 arch/x86/kernel/cpu/sgx/encls.h | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index 3f9334ef67cd..d67810b50a81 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -65,17 +65,22 @@ enum sgx_encls_function {
 
 /**
  * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
+ * %SGX_EPC_PAGE_CONFLICT:	Page is being written by other ENCLS function.
  * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
  *				been completed yet.
  * %SGX_CHILD_PRESENT		SECS has child pages present in the EPC.
  * %SGX_INVALID_EINITTOKEN:	EINITTOKEN is invalid and enclave signer's
  *				public key does not match IA32_SGXLEPUBKEYHASH.
+ * %SGX_PAGE_NOT_MODIFIABLE:	The EPC page cannot be modified because it
+ *				is in the PENDING or MODIFIED state.
  * %SGX_UNMASKED_EVENT:		An unmasked event, e.g. INTR, was received
  */
 enum sgx_return_code {
+	SGX_EPC_PAGE_CONFLICT		= 7,
 	SGX_NOT_TRACKED			= 11,
 	SGX_CHILD_PRESENT		= 13,
 	SGX_INVALID_EINITTOKEN		= 16,
+	SGX_PAGE_NOT_MODIFIABLE		= 20,
 	SGX_UNMASKED_EVENT		= 128,
 };
 
diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 0e22fa8f77c5..2b091912f038 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -215,4 +215,10 @@ static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
 	return __encls_ret_3(EWB, pginfo, addr, va);
 }
 
+/* Restrict the EPCM permissions of an EPC page. */
+static inline int __emodpr(struct sgx_secinfo *secinfo, void *addr)
+{
+	return __encls_ret_2(EMODPR, secinfo, addr);
+}
+
 #endif /* _X86_ENCLS_H */
-- 
2.25.1



* [PATCH V3 03/30] x86/sgx: Add wrapper for SGX2 EMODT function
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 01/30] x86/sgx: Add short descriptions to ENCLS wrappers Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 02/30] x86/sgx: Add wrapper for SGX2 EMODPR function Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:53   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 04/30] x86/sgx: Add wrapper for SGX2 EAUG function Reinette Chatre
                   ` (26 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Add a wrapper for the EMODT ENCLS leaf function used to
change the type of an enclave page as maintained in the
SGX hardware's Enclave Page Cache Map (EPCM).

EMODT:
1) Updates the EPCM page type of the enclave page.
2) Sets the MODIFIED bit in the EPCM entry of the enclave page.
   This bit is reset by the enclave by invoking ENCLU leaf
   function EACCEPT or EACCEPTCOPY.

Access from within the enclave to the enclave page is not possible
while the MODIFIED bit is set.

After changing the enclave page type by issuing EMODT the kernel
needs to collaborate with the hardware to ensure that no logical
processor continues to hold a reference to the changed page. This
is required to ensure that no security checks are circumvented
and for the enclave's EACCEPT/EACCEPTCOPY to succeed.
Ensuring that no references to the changed page remain is
accomplished with the ETRACK flow.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- Split original patch ("x86/sgx: Add wrappers for SGX2 functions")
  in three to introduce the SGX2 functions separately (Jarkko).
- Rewrite commit message to include how the EPCM within the hardware
  is changed by the SGX2 function as well as the calling
  conditions (Jarkko).

 arch/x86/kernel/cpu/sgx/encls.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 2b091912f038..7a1ecf704ec1 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -221,4 +221,10 @@ static inline int __emodpr(struct sgx_secinfo *secinfo, void *addr)
 	return __encls_ret_2(EMODPR, secinfo, addr);
 }
 
+/* Change the type of an EPC page. */
+static inline int __emodt(struct sgx_secinfo *secinfo, void *addr)
+{
+	return __encls_ret_2(EMODT, secinfo, addr);
+}
+
 #endif /* _X86_ENCLS_H */
-- 
2.25.1



* [PATCH V3 04/30] x86/sgx: Add wrapper for SGX2 EAUG function
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (2 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 03/30] x86/sgx: Add wrapper for SGX2 EMODT function Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:54   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 05/30] x86/sgx: Support loading enclave page without VMA permissions check Reinette Chatre
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Add a wrapper for the EAUG ENCLS leaf function used to
add a page to an initialized enclave.

EAUG:
1) Stores all properties of the new enclave page in the SGX
   hardware's Enclave Page Cache Map (EPCM).
2) Sets the PENDING bit in the EPCM entry of the enclave page.
   This bit is cleared by the enclave by invoking ENCLU leaf
   function EACCEPT or EACCEPTCOPY.

Access from within the enclave to the new enclave page is not
possible until the PENDING bit is cleared.
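
As an illustration only (not part of this patch), the PENDING
handshake can be modelled in plain C. All names below are hypothetical
and exist only in this sketch. The initial RW permissions follow the
cover letter's statement that dynamically added pages are
architecturally limited to RW at creation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical permission bits, used by this model only. */
#define AUG_PERM_R 0x1u
#define AUG_PERM_W 0x2u
#define AUG_PERM_X 0x4u

/* Hypothetical, simplified stand-in for the EPCM entry of one page. */
struct aug_epcm_entry {
	bool valid;	/* page belongs to an enclave */
	bool pending;	/* set by EAUG, cleared by EACCEPT/EACCEPTCOPY */
	uint8_t perms;	/* EPCM permissions */
};

/* Model of EAUG: record the new page in the EPCM and mark it PENDING. */
static void aug_model_eaug(struct aug_epcm_entry *e)
{
	assert(e);
	e->valid = true;
	e->pending = true;
	e->perms = AUG_PERM_R | AUG_PERM_W;
}

/* Model of EACCEPT/EACCEPTCOPY as run by the enclave: clears PENDING. */
static void aug_model_eaccept(struct aug_epcm_entry *e)
{
	assert(e);
	e->pending = false;
}

/* Enclave access succeeds only once PENDING is clear and perms allow it. */
static bool aug_model_can_access(const struct aug_epcm_entry *e,
				 uint8_t wanted)
{
	assert(e);
	return e->valid && !e->pending && (e->perms & wanted) == wanted;
}
```

In this model a freshly augmented page rejects every enclave access
until aug_model_eaccept() has run, after which RW accesses succeed
and execute accesses are still refused.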

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- Split original patch ("x86/sgx: Add wrappers for SGX2 functions")
  in three to introduce the SGX2 functions separately (Jarkko).
- Rewrite commit message to include how the EPCM within the hardware
  is changed by the SGX2 function as well as any calling
  conditions (Jarkko).

 arch/x86/kernel/cpu/sgx/encls.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
index 7a1ecf704ec1..99004b02e2ed 100644
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -227,4 +227,10 @@ static inline int __emodt(struct sgx_secinfo *secinfo, void *addr)
 	return __encls_ret_2(EMODT, secinfo, addr);
 }
 
+/* Zero a page of EPC memory and add it to an initialized enclave. */
+static inline int __eaug(struct sgx_pageinfo *pginfo, void *addr)
+{
+	return __encls_2(EAUG, pginfo, addr);
+}
+
 #endif /* _X86_ENCLS_H */
-- 
2.25.1



* [PATCH V3 05/30] x86/sgx: Support loading enclave page without VMA permissions check
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (3 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 04/30] x86/sgx: Add wrapper for SGX2 EAUG function Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:56   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 06/30] x86/sgx: Export sgx_encl_ewb_cpumask() Reinette Chatre
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

sgx_encl_load_page() is used to find and load an enclave page into
enclave (EPC) memory, potentially loading it from the backing storage.
Both usages of sgx_encl_load_page() are during an access to the
enclave page from a VMA and thus the permissions of the VMA are
considered before the enclave page is loaded.

SGX2 functions operating on enclave pages belonging to an initialized
enclave require the page to be in the EPC. It is thus necessary to
support loading enclave pages into the EPC independently of a VMA.

Split the current sgx_encl_load_page() to support the two usages:
A new call, sgx_encl_load_page_in_vma(), behaves exactly like the
current sgx_encl_load_page() that takes VMA permissions into account,
while sgx_encl_load_page() just loads an enclave page into EPC.

VMA, PTE, and EPCM permissions would continue to dictate whether
the pages can be accessed from within an enclave.
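
As an illustration only (not part of this patch), the permission check
that remains in sgx_encl_load_page_in_vma() can be exercised outside
the kernel with the sketch below. MODEL_VM_* and
model_vma_prot_allowed() are hypothetical stand-ins for the kernel's
VM_READ, VM_WRITE, VM_EXEC and the inline check. Only the mask
comparison itself is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_READ, VM_WRITE, VM_EXEC. */
#define MODEL_VM_READ	0x1ul
#define MODEL_VM_WRITE	0x2ul
#define MODEL_VM_EXEC	0x4ul

/*
 * Return true if every protection bit requested by the VMA is also
 * present in the page's maximum vetted permissions. Other bits in
 * vm_flags are ignored, as in the patch.
 */
static bool model_vma_prot_allowed(unsigned long vm_max_prot_bits,
				   unsigned long vm_flags)
{
	unsigned long vm_prot_bits = vm_flags &
		(MODEL_VM_READ | MODEL_VM_WRITE | MODEL_VM_EXEC);

	return (vm_max_prot_bits & vm_prot_bits) == vm_prot_bits;
}
```

A page vetted for RW passes this check for a read-only or RW VMA and
fails it for any VMA that requests execute permission. The new
sgx_encl_load_page() skips this check entirely.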

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- New patch

 arch/x86/kernel/cpu/sgx/encl.c | 57 ++++++++++++++++++++++------------
 arch/x86/kernel/cpu/sgx/encl.h |  2 ++
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 7c63a1911fae..05ae1168391c 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -131,25 +131,10 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 	return epc_page;
 }
 
-static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
-						unsigned long addr,
-						unsigned long vm_flags)
+static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
+						  struct sgx_encl_page *entry)
 {
-	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
 	struct sgx_epc_page *epc_page;
-	struct sgx_encl_page *entry;
-
-	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
-	if (!entry)
-		return ERR_PTR(-EFAULT);
-
-	/*
-	 * Verify that the faulted page has equal or higher build time
-	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
-	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
-	 */
-	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
-		return ERR_PTR(-EFAULT);
 
 	/* Entry successfully located. */
 	if (entry->epc_page) {
@@ -175,6 +160,40 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 	return entry;
 }
 
+static struct sgx_encl_page *sgx_encl_load_page_in_vma(struct sgx_encl *encl,
+						       unsigned long addr,
+						       unsigned long vm_flags)
+{
+	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+	struct sgx_encl_page *entry;
+
+	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
+	if (!entry)
+		return ERR_PTR(-EFAULT);
+
+	/*
+	 * Verify that the page has equal or higher build time
+	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
+	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
+	 */
+	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
+		return ERR_PTR(-EFAULT);
+
+	return __sgx_encl_load_page(encl, entry);
+}
+
+struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
+					 unsigned long addr)
+{
+	struct sgx_encl_page *entry;
+
+	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
+	if (!entry)
+		return ERR_PTR(-EFAULT);
+
+	return __sgx_encl_load_page(encl, entry);
+}
+
 static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 {
 	unsigned long addr = (unsigned long)vmf->address;
@@ -196,7 +215,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 
 	mutex_lock(&encl->lock);
 
-	entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
+	entry = sgx_encl_load_page_in_vma(encl, addr, vma->vm_flags);
 	if (IS_ERR(entry)) {
 		mutex_unlock(&encl->lock);
 
@@ -344,7 +363,7 @@ static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
 	for ( ; ; ) {
 		mutex_lock(&encl->lock);
 
-		entry = sgx_encl_load_page(encl, addr, vm_flags);
+		entry = sgx_encl_load_page_in_vma(encl, addr, vm_flags);
 		if (PTR_ERR(entry) != -EBUSY)
 			break;
 
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index fec43ca65065..6b34efba1602 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -116,5 +116,7 @@ unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
 void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
 bool sgx_va_page_full(struct sgx_va_page *va_page);
 void sgx_encl_free_epc_page(struct sgx_epc_page *page);
+struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
+					 unsigned long addr);
 
 #endif /* _X86_ENCL_H */
-- 
2.25.1



* [PATCH V3 06/30] x86/sgx: Export sgx_encl_ewb_cpumask()
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (4 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 05/30] x86/sgx: Support loading enclave page without VMA permissions check Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:56   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 07/30] x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask() Reinette Chatre
                   ` (23 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Using sgx_encl_ewb_cpumask() to learn which CPUs might have executed
an enclave is useful to ensure that TLBs are cleared when changes are
made to enclave pages.

sgx_encl_ewb_cpumask() is used within the reclaimer when an enclave
page is evicted. The upcoming SGX2 support enables changes to be
made to enclave pages, which requires that TLBs no longer refer to
the changed pages, and thus also needs sgx_encl_ewb_cpumask().

Relocate sgx_encl_ewb_cpumask() to be with the rest of the enclave
code in encl.c now that it is no longer unique to the reclaimer.

Take care to ensure that any future usage maintains the
current context requirement that ETRACK has been called first.
Expand the existing comments to highlight this while moving them
to a more prominent location before the function.

No functional change.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- New patch split from original "x86/sgx: Use more generic name for
  enclave cpumask function" (Jarkko).
- Change subject line (Jarkko).
- Fixup kernel-doc to use brackets in function name.
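
For readers who want to see the mask construction in isolation, below
is a minimal user-space sketch of the union loop in
sgx_encl_ewb_cpumask(). It is not kernel code: struct toy_mm,
toy_encl_cpumask() and the 64-bit bitmap are hypothetical stand-ins
for struct sgx_encl_mm, the real function and cpumask_t, and the SRCU
protection and ETRACK ordering requirement are not modeled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mm that has the enclave mapped. */
struct toy_mm {
	uint64_t cpu_bitmap;	/* bit N set: the mm may have run on CPU N */
	int users;		/* 0 models an mm that is being torn down */
};

/*
 * Build the union of the CPU masks of all live mms. This mirrors the
 * cpumask_clear() + cpumask_or() loop of the patch, nothing more.
 */
static uint64_t toy_encl_cpumask(const struct toy_mm *mms, size_t n)
{
	uint64_t mask = 0;	/* models cpumask_clear() */
	size_t i;

	assert(mms != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* Models mmget_not_zero() failing: skip a dying mm. */
		if (mms[i].users == 0)
			continue;

		mask |= mms[i].cpu_bitmap;
	}

	return mask;
}
```

In the real flow the returned mask is only meaningful when it is built
after ENCLS[ETRACK], as the Context section of the kernel-doc states.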

 arch/x86/kernel/cpu/sgx/encl.c | 67 ++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h |  1 +
 arch/x86/kernel/cpu/sgx/main.c | 29 ---------------
 3 files changed, 68 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 05ae1168391c..c6525eba74e8 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -613,6 +613,73 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 	return 0;
 }
 
+/**
+ * sgx_encl_ewb_cpumask() - Query which CPUs might be accessing the enclave
+ * @encl: the enclave
+ *
+ * Some SGX functions require that no cached linear-to-physical address
+ * mappings are present before they can succeed. For example, ENCLS[EWB]
+ * copies a page from the enclave page cache to regular main memory but
+ * it fails if it cannot ensure that there are no cached
+ * linear-to-physical address mappings referring to the page.
+ *
+ * SGX hardware flushes all cached linear-to-physical mappings on a CPU
+ * when an enclave is exited via ENCLU[EEXIT] or an Asynchronous Enclave
+ * Exit (AEX). Exiting an enclave will thus ensure cached linear-to-physical
+ * address mappings are cleared but coordination with the tracking done within
+ * the SGX hardware is needed to support the SGX functions that depend on this
+ * cache clearing.
+ *
+ * When the ENCLS[ETRACK] function is issued on an enclave the hardware
+ * tracks threads operating inside the enclave at that time. The SGX
+ * hardware tracking requires that all the identified threads must have
+ * exited the enclave in order to flush the mappings before a function such
+ * as ENCLS[EWB] will be permitted.
+ *
+ * The following flow is used to support SGX functions that require that
+ * no cached linear-to-physical address mappings are present:
+ * 1) Execute ENCLS[ETRACK] to initiate hardware tracking.
+ * 2) Use this function (sgx_encl_ewb_cpumask()) to query which CPUs might be
+ *    accessing the enclave.
+ * 3) Send IPI to identified CPUs, kicking them out of the enclave and
+ *    thus flushing all locally cached linear-to-physical address mappings.
+ * 4) Execute SGX function.
+ *
+ * Context: It is required to call this function after ENCLS[ETRACK].
+ *          This will ensure that if any new mm appears (racing with
+ *          sgx_encl_mm_add()) then the new mm will enter into the
+ *          enclave with fresh linear-to-physical address mappings.
+ *
+ *          It is required that all IPIs are completed before a new
+ *          ENCLS[ETRACK] is issued so be sure to protect steps 1 to 3
+ *          of the above flow with the enclave's mutex.
+ *
+ * Return: cpumask of CPUs that might be accessing @encl
+ */
+const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
+{
+	cpumask_t *cpumask = &encl->cpumask;
+	struct sgx_encl_mm *encl_mm;
+	int idx;
+
+	cpumask_clear(cpumask);
+
+	idx = srcu_read_lock(&encl->srcu);
+
+	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
+		if (!mmget_not_zero(encl_mm->mm))
+			continue;
+
+		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
+
+		mmput_async(encl_mm->mm);
+	}
+
+	srcu_read_unlock(&encl->srcu, idx);
+
+	return cpumask;
+}
+
 static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
 					      pgoff_t index)
 {
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 6b34efba1602..d2acb4debde5 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -105,6 +105,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 
 void sgx_encl_release(struct kref *ref);
 int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
+const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl);
 int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
 			 struct sgx_backing *backing);
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 8e4bc6453d26..2de85f459492 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -203,35 +203,6 @@ static void sgx_ipi_cb(void *info)
 {
 }
 
-static const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
-{
-	cpumask_t *cpumask = &encl->cpumask;
-	struct sgx_encl_mm *encl_mm;
-	int idx;
-
-	/*
-	 * Can race with sgx_encl_mm_add(), but ETRACK has already been
-	 * executed, which means that the CPUs running in the new mm will enter
-	 * into the enclave with a fresh epoch.
-	 */
-	cpumask_clear(cpumask);
-
-	idx = srcu_read_lock(&encl->srcu);
-
-	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
-		if (!mmget_not_zero(encl_mm->mm))
-			continue;
-
-		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
-
-		mmput_async(encl_mm->mm);
-	}
-
-	srcu_read_unlock(&encl->srcu, idx);
-
-	return cpumask;
-}
-
 /*
  * Swap page to the regular memory transformed to the blocked state by using
  * EBLOCK, which means that it can no longer be referenced (no new TLB entries).
-- 
2.25.1



* [PATCH V3 07/30] x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask()
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (5 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 06/30] x86/sgx: Export sgx_encl_ewb_cpumask() Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:57   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 08/30] x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes() Reinette Chatre
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

sgx_encl_ewb_cpumask() is no longer unique to the reclaimer. There it
is used during the EWB ENCLS leaf function, when EPC pages are written
out to main memory, to learn which CPUs might have executed the enclave
and so ensure that TLBs are cleared.

Upcoming SGX2 enabling will use sgx_encl_ewb_cpumask() during the
EMODPR and EMODT ENCLS leaf functions that make changes to enclave
pages. The function is needed for the same reason it is used now: to
learn which CPUs might have executed the enclave to ensure that TLBs
no longer point to the changed pages.

Rename sgx_encl_ewb_cpumask() to sgx_encl_cpumask() to reflect the
broader usage.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- New patch split from original "x86/sgx: Use more generic name for
  enclave cpumask function" (Jarkko).

 arch/x86/kernel/cpu/sgx/encl.c | 6 +++---
 arch/x86/kernel/cpu/sgx/encl.h | 2 +-
 arch/x86/kernel/cpu/sgx/main.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c6525eba74e8..8de9bebc4d81 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -614,7 +614,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 }
 
 /**
- * sgx_encl_ewb_cpumask() - Query which CPUs might be accessing the enclave
+ * sgx_encl_cpumask() - Query which CPUs might be accessing the enclave
  * @encl: the enclave
  *
  * Some SGX functions require that no cached linear-to-physical address
@@ -639,7 +639,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
  * The following flow is used to support SGX functions that require that
  * no cached linear-to-physical address mappings are present:
  * 1) Execute ENCLS[ETRACK] to initiate hardware tracking.
- * 2) Use this function (sgx_encl_ewb_cpumask()) to query which CPUs might be
+ * 2) Use this function (sgx_encl_cpumask()) to query which CPUs might be
  *    accessing the enclave.
  * 3) Send IPI to identified CPUs, kicking them out of the enclave and
  *    thus flushing all locally cached linear-to-physical address mappings.
@@ -656,7 +656,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
  *
  * Return: cpumask of CPUs that might be accessing @encl
  */
-const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
+const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl)
 {
 	cpumask_t *cpumask = &encl->cpumask;
 	struct sgx_encl_mm *encl_mm;
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index d2acb4debde5..e59c2cbf71e2 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -105,7 +105,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 
 void sgx_encl_release(struct kref *ref);
 int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
-const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl);
+const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl);
 int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
 			 struct sgx_backing *backing);
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 2de85f459492..fa33922879bf 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -249,7 +249,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page,
 			 * miss cpus that entered the enclave between
 			 * generating the mask and incrementing epoch.
 			 */
-			on_each_cpu_mask(sgx_encl_ewb_cpumask(encl),
+			on_each_cpu_mask(sgx_encl_cpumask(encl),
 					 sgx_ipi_cb, NULL, 1);
 			ret = __sgx_encl_ewb(epc_page, va_slot, backing);
 		}
-- 
2.25.1



* [PATCH V3 08/30] x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes()
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (6 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 07/30] x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask() Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:59   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 09/30] x86/sgx: Make sgx_ipi_cb() available internally Reinette Chatre
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The SGX reclaimer removes page table entries pointing to pages that are
moved to swap.

SGX2 enables changes to pages belonging to an initialized enclave, thus
enclave pages may have their permission or type changed while the page
is being accessed by an enclave. Supporting SGX2 requires page table
entries to be removed so that any cached mappings to changed pages
are removed. For example, with the ability to change enclave page types
a regular enclave page may be changed to a Thread Control Structure
(TCS) page that may not be accessed by an enclave.

Factor out the code removing page table entries to a separate function
sgx_zap_enclave_ptes(), fixing accuracy of comments in the process,
and make it available to the upcoming SGX2 code.

Place sgx_zap_enclave_ptes() with the rest of the enclave code in
encl.c interacting with the page table since this code is no longer
unique to the reclaimer.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- Elaborate why SGX2 needs this ability (Jarkko).
- More specific subject.
- Fix kernel-doc to have brackets in function name.
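
The retry loop around mm_list_version may be easier to follow as a
single-threaded sketch. Everything below is hypothetical (struct
toy_encl, toy_mm_add(), toy_zap_ptes(), toy_racer()); it assumes that a
list addition bumps the version after publishing the entry, and it
simulates a racing sgx_encl_mm_add() with a callback instead of a real
second thread. The memory barriers and SRCU of the real code are left
out on purpose.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MMS 8

struct toy_encl {
	unsigned long mm_list_version;	/* bumped on every list addition */
	int mms[TOY_MAX_MMS];		/* hypothetical mm identifiers */
	size_t nr_mms;
	int zapped[TOY_MAX_MMS];	/* non-zero: mms[i] has been visited */
};

/* Models sgx_encl_mm_add(): publish the entry, then bump the version. */
static void toy_mm_add(struct toy_encl *encl, int mm)
{
	assert(encl->nr_mms < TOY_MAX_MMS);

	encl->mms[encl->nr_mms++] = mm;
	encl->mm_list_version++;
}

/*
 * Visit every mm and repeat the walk if the list version moved while
 * walking. @racer, when non-NULL, runs once after the first walk to
 * simulate a concurrent addition. Returns the number of passes made.
 */
static int toy_zap_ptes(struct toy_encl *encl,
			void (*racer)(struct toy_encl *))
{
	unsigned long version;
	int passes = 0;
	size_t i;

	do {
		version = encl->mm_list_version;

		for (i = 0; i < encl->nr_mms; i++)
			encl->zapped[i] = 1;	/* models zap_vma_ptes() */

		if (racer) {
			racer(encl);
			racer = NULL;
		}

		passes++;
	} while (encl->mm_list_version != version);

	return passes;
}

/* Simulated racing addition of a new mm. */
static void toy_racer(struct toy_encl *encl)
{
	toy_mm_add(encl, 42);
}
```

The point of the sketch is only that an mm added during a walk forces
one more pass, so its PTEs are not missed.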

 arch/x86/kernel/cpu/sgx/encl.c | 45 +++++++++++++++++++++++++++++++++-
 arch/x86/kernel/cpu/sgx/encl.h |  2 +-
 arch/x86/kernel/cpu/sgx/main.c | 31 ++---------------------
 3 files changed, 47 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 8de9bebc4d81..c77a62432862 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -605,7 +605,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
 
 	spin_lock(&encl->mm_lock);
 	list_add_rcu(&encl_mm->list, &encl->mm_list);
-	/* Pairs with smp_rmb() in sgx_reclaimer_block(). */
+	/* Pairs with smp_rmb() in sgx_zap_enclave_ptes(). */
 	smp_wmb();
 	encl->mm_list_version++;
 	spin_unlock(&encl->mm_lock);
@@ -792,6 +792,49 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm,
 	return ret;
 }
 
+/**
+ * sgx_zap_enclave_ptes() - remove PTEs mapping the address from enclave
+ * @encl: the enclave
+ * @addr: page aligned pointer to single page for which PTEs will be removed
+ *
+ * Multiple VMAs may have an enclave page mapped. Remove the PTE mapping
+ * @addr from each VMA. Ensure that page fault handler is ready to handle
+ * new mappings of @addr before calling this function.
+ */
+void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
+{
+	unsigned long mm_list_version;
+	struct sgx_encl_mm *encl_mm;
+	struct vm_area_struct *vma;
+	int idx, ret;
+
+	do {
+		mm_list_version = encl->mm_list_version;
+
+		/* Pairs with smp_wmb() in sgx_encl_mm_add(). */
+		smp_rmb();
+
+		idx = srcu_read_lock(&encl->srcu);
+
+		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
+			if (!mmget_not_zero(encl_mm->mm))
+				continue;
+
+			mmap_read_lock(encl_mm->mm);
+
+			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
+			if (!ret && encl == vma->vm_private_data)
+				zap_vma_ptes(vma, addr, PAGE_SIZE);
+
+			mmap_read_unlock(encl_mm->mm);
+
+			mmput_async(encl_mm->mm);
+		}
+
+		srcu_read_unlock(&encl->srcu, idx);
+	} while (unlikely(encl->mm_list_version != mm_list_version));
+}
+
 /**
  * sgx_alloc_va_page() - Allocate a Version Array (VA) page
  *
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index e59c2cbf71e2..1b15d22f6757 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -111,7 +111,7 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
 int sgx_encl_test_and_clear_young(struct mm_struct *mm,
 				  struct sgx_encl_page *page);
-
+void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
 struct sgx_epc_page *sgx_alloc_va_page(void);
 unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
 void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index fa33922879bf..ce9e87d5f8ec 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -137,36 +137,9 @@ static void sgx_reclaimer_block(struct sgx_epc_page *epc_page)
 	struct sgx_encl_page *page = epc_page->owner;
 	unsigned long addr = page->desc & PAGE_MASK;
 	struct sgx_encl *encl = page->encl;
-	unsigned long mm_list_version;
-	struct sgx_encl_mm *encl_mm;
-	struct vm_area_struct *vma;
-	int idx, ret;
-
-	do {
-		mm_list_version = encl->mm_list_version;
-
-		/* Pairs with smp_rmb() in sgx_encl_mm_add(). */
-		smp_rmb();
-
-		idx = srcu_read_lock(&encl->srcu);
-
-		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
-			if (!mmget_not_zero(encl_mm->mm))
-				continue;
-
-			mmap_read_lock(encl_mm->mm);
-
-			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
-			if (!ret && encl == vma->vm_private_data)
-				zap_vma_ptes(vma, addr, PAGE_SIZE);
-
-			mmap_read_unlock(encl_mm->mm);
-
-			mmput_async(encl_mm->mm);
-		}
+	int ret;
 
-		srcu_read_unlock(&encl->srcu, idx);
-	} while (unlikely(encl->mm_list_version != mm_list_version));
+	sgx_zap_enclave_ptes(encl, addr);
 
 	mutex_lock(&encl->lock);
 
-- 
2.25.1



* [PATCH V3 09/30] x86/sgx: Make sgx_ipi_cb() available internally
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (7 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 08/30] x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes() Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  6:59   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 10/30] x86/sgx: Create utility to validate user provided offset and length Reinette Chatre
                   ` (20 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The ETRACK function followed by an IPI to all CPUs within an enclave
is a common pattern with more frequent use in support of SGX2.

Make the (empty) IPI callback function available internally in
preparation for usage by SGX2.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- Replace "for more usages" by "for usage by SGX2" (Jarkko)

 arch/x86/kernel/cpu/sgx/main.c | 2 +-
 arch/x86/kernel/cpu/sgx/sgx.h  | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index ce9e87d5f8ec..6e2cb7564080 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -172,7 +172,7 @@ static int __sgx_encl_ewb(struct sgx_epc_page *epc_page, void *va_slot,
 	return ret;
 }
 
-static void sgx_ipi_cb(void *info)
+void sgx_ipi_cb(void *info)
 {
 }
 
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 0f17def9fe6f..b30cee4de903 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -90,6 +90,8 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
 
+void sgx_ipi_cb(void *info);
+
 #ifdef CONFIG_X86_SGX_KVM
 int __init sgx_vepc_init(void);
 #else
-- 
2.25.1



* [PATCH V3 10/30] x86/sgx: Create utility to validate user provided offset and length
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (8 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 09/30] x86/sgx: Make sgx_ipi_cb() available internally Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:00   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 11/30] x86/sgx: Keep record of SGX page type Reinette Chatre
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

User provided offset and length are validated when parsing the parameters
of the SGX_IOC_ENCLAVE_ADD_PAGES ioctl(). Extract this validation
into a utility that can be used by the SGX2 ioctl()s that will
also provide these values.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- New patch
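
As an illustration of which ranges the new helper accepts, here is a
user-space sketch of the same three checks. toy_validate_offset_length()
and TOY_PAGE_SIZE are hypothetical names, a 4096 byte page is assumed,
and the enclave size is passed directly instead of via struct sgx_encl.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/* Return 0 if [offset, offset + length) is a valid page range. */
static int toy_validate_offset_length(uint64_t encl_size,
				      uint64_t offset,
				      uint64_t length)
{
	assert(encl_size >= TOY_PAGE_SIZE);

	/* The offset must be page aligned. */
	if (offset & (TOY_PAGE_SIZE - 1))
		return -EINVAL;

	/* The length must be a non-zero multiple of the page size. */
	if (!length || (length & (TOY_PAGE_SIZE - 1)))
		return -EINVAL;

	/* The last page of the range must start inside the enclave. */
	if (offset + length - TOY_PAGE_SIZE >= encl_size)
		return -EINVAL;

	return 0;
}
```

With a four page enclave (size 0x4000), offset 0x3000 with length
0x1000 passes, while offset 0x1000 with length 0x4000 is rejected
because its last page would start at the enclave size.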

 arch/x86/kernel/cpu/sgx/ioctl.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 83df20e3e633..f487549bccba 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -372,6 +372,26 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
 	return ret;
 }
 
+/*
+ * Ensure user provided offset and length values are valid for
+ * an enclave.
+ */
+static int sgx_validate_offset_length(struct sgx_encl *encl,
+				      unsigned long offset,
+				      unsigned long length)
+{
+	if (!IS_ALIGNED(offset, PAGE_SIZE))
+		return -EINVAL;
+
+	if (!length || length & (PAGE_SIZE - 1))
+		return -EINVAL;
+
+	if (offset + length - PAGE_SIZE >= encl->size)
+		return -EINVAL;
+
+	return 0;
+}
+
 /**
  * sgx_ioc_enclave_add_pages() - The handler for %SGX_IOC_ENCLAVE_ADD_PAGES
  * @encl:       an enclave pointer
@@ -425,14 +445,10 @@ static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg)
 	if (copy_from_user(&add_arg, arg, sizeof(add_arg)))
 		return -EFAULT;
 
-	if (!IS_ALIGNED(add_arg.offset, PAGE_SIZE) ||
-	    !IS_ALIGNED(add_arg.src, PAGE_SIZE))
-		return -EINVAL;
-
-	if (!add_arg.length || add_arg.length & (PAGE_SIZE - 1))
+	if (!IS_ALIGNED(add_arg.src, PAGE_SIZE))
 		return -EINVAL;
 
-	if (add_arg.offset + add_arg.length - PAGE_SIZE >= encl->size)
+	if (sgx_validate_offset_length(encl, add_arg.offset, add_arg.length))
 		return -EINVAL;
 
 	if (copy_from_user(&secinfo, (void __user *)add_arg.secinfo,
-- 
2.25.1



* [PATCH V3 11/30] x86/sgx: Keep record of SGX page type
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (9 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 10/30] x86/sgx: Create utility to validate user provided offset and length Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:00   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 12/30] x86/sgx: Export sgx_encl_{grow,shrink}() Reinette Chatre
                   ` (18 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

SGX2 functions are not allowed on all page types. For example,
ENCLS[EMODPR] is only allowed on regular SGX enclave pages and
ENCLS[EMODT] is only allowed on TCS and regular pages. If these
functions are attempted on another type of page the hardware would
trigger a fault.

Keep a record of the SGX page type so that there is more
certainty whether an SGX2 instruction can succeed and faults
can be treated as real failures.

The page type is a property of struct sgx_encl_page
and thus does not cover the VA page type. VA pages are maintained
in separate structures and their type can be determined in
a different way. The SGX2 instructions needing the page type do not
operate on VA pages and this is thus not a scenario needing to
be covered at this time.

struct sgx_encl_page hosting this information is maintained for each
enclave page so the space consumed by the struct is important.
The existing sgx_encl_page->vm_max_prot_bits is already unsigned long
while only using three bits. Transition to a bitfield for the two
members to support the additional information without increasing
the space consumed by the struct.

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Update changelog to motivate transition to bitfield that
  was previously done when (now removed) vm_run_prot_bits was
  added.

Changes since V1:
- Add Acked-by from Jarkko.
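
The space argument can be checked with a small user-space sketch. The
structs below are hypothetical, trimmed-down stand-ins for struct
sgx_encl_page before and after this patch. The enum values and the
0xff00 mask are my assumptions about enum sgx_page_type and
SGX_SECINFO_PAGE_TYPE_MASK, and bit-fields of these types rely on
compiler behavior as provided by GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror enum sgx_page_type; treat the values as a guess. */
enum toy_page_type {
	TOY_PT_SECS,
	TOY_PT_TCS,
	TOY_PT_REG,
	TOY_PT_VA,
	TOY_PT_TRIM,
};

/* Layout before the patch: a full word for three permission bits. */
struct toy_page_before {
	unsigned long desc;
	unsigned long vm_max_prot_bits;
	void *epc_page;
};

/* Layout after the patch: both members share one storage unit. */
struct toy_page_after {
	unsigned long desc;
	unsigned long vm_max_prot_bits:8;
	enum toy_page_type type:16;
	void *epc_page;
};

#define TOY_SECINFO_PAGE_TYPE_MASK 0xff00UL	/* assumed: bits 15:8 */

/* Extract the page type the same way the patch does: mask, then >> 8. */
static enum toy_page_type toy_page_type_from_secinfo(unsigned long flags)
{
	unsigned long type = (flags & TOY_SECINFO_PAGE_TYPE_MASK) >> 8;

	assert(type <= TOY_PT_TRIM);

	return (enum toy_page_type)type;
}
```

The sketch only shows that the extra member need not grow the struct.
The real struct has more members than shown here.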

 arch/x86/include/asm/sgx.h      | 3 +++
 arch/x86/kernel/cpu/sgx/encl.h  | 3 ++-
 arch/x86/kernel/cpu/sgx/ioctl.c | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
index d67810b50a81..eae20fa52b93 100644
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -239,6 +239,9 @@ struct sgx_pageinfo {
  * %SGX_PAGE_TYPE_REG:	a regular page
  * %SGX_PAGE_TYPE_VA:	a VA page
  * %SGX_PAGE_TYPE_TRIM:	a page in trimmed state
+ *
+ * Make sure when making changes to this enum that its values can still fit
+ * in the bitfield within &struct sgx_encl_page.
  */
 enum sgx_page_type {
 	SGX_PAGE_TYPE_SECS,
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 1b15d22f6757..07abfc70c8e3 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -27,7 +27,8 @@
 
 struct sgx_encl_page {
 	unsigned long desc;
-	unsigned long vm_max_prot_bits;
+	unsigned long vm_max_prot_bits:8;
+	enum sgx_page_type type:16;
 	struct sgx_epc_page *epc_page;
 	struct sgx_encl *encl;
 	struct sgx_va_page *va_page;
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index f487549bccba..0c211af8e948 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -107,6 +107,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
 		set_bit(SGX_ENCL_DEBUG, &encl->flags);
 
 	encl->secs.encl = encl;
+	encl->secs.type = SGX_PAGE_TYPE_SECS;
 	encl->base = secs->base;
 	encl->size = secs->size;
 	encl->attributes = secs->attributes;
@@ -344,6 +345,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
 	 */
 	encl_page->encl = encl;
 	encl_page->epc_page = epc_page;
+	encl_page->type = (secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;
 	encl->secs_child_cnt++;
 
 	if (flags & SGX_PAGE_MEASURE) {
-- 
2.25.1



* [PATCH V3 12/30] x86/sgx: Export sgx_encl_{grow,shrink}()
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (10 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 11/30] x86/sgx: Keep record of SGX page type Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:04   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 13/30] x86/sgx: Export sgx_encl_page_alloc() Reinette Chatre
                   ` (17 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

In order to use sgx_encl_{grow,shrink}() in the page augmentation code
located in encl.c, export these functions.

Suggested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- New patch.

 arch/x86/kernel/cpu/sgx/encl.h  | 2 ++
 arch/x86/kernel/cpu/sgx/ioctl.c | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 07abfc70c8e3..9d673d9531f0 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -120,5 +120,7 @@ bool sgx_va_page_full(struct sgx_va_page *va_page);
 void sgx_encl_free_epc_page(struct sgx_epc_page *page);
 struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 					 unsigned long addr);
+struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl);
+void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
 
 #endif /* _X86_ENCL_H */
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 0c211af8e948..746acddbb774 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -17,7 +17,7 @@
 #include "encl.h"
 #include "encls.h"
 
-static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
+struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
 {
 	struct sgx_va_page *va_page = NULL;
 	void *err;
@@ -43,7 +43,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
 	return va_page;
 }
 
-static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
+void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
 {
 	encl->page_cnt--;
 
-- 
2.25.1



* [PATCH V3 13/30] x86/sgx: Export sgx_encl_page_alloc()
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (11 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 12/30] x86/sgx: Export sgx_encl_{grow,shrink}() Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions Reinette Chatre
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

From: Jarkko Sakkinen <jarkko@kernel.org>

Move sgx_encl_page_alloc() to encl.c and export it so that it can be
used in the implementation for support of adding pages to initialized
enclaves, which requires allocating new enclave pages.

Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- New patch
  Originally submitted at:
  https://lore.kernel.org/linux-sgx/20220308112833.262805-3-jarkko@kernel.org/

 arch/x86/kernel/cpu/sgx/encl.c  | 32 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h  |  3 +++
 arch/x86/kernel/cpu/sgx/ioctl.c | 32 --------------------------------
 3 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c77a62432862..546423753e4c 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -792,6 +792,38 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm,
 	return ret;
 }
 
+struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
+					  unsigned long offset,
+					  u64 secinfo_flags)
+{
+	struct sgx_encl_page *encl_page;
+	unsigned long prot;
+
+	encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
+	if (!encl_page)
+		return ERR_PTR(-ENOMEM);
+
+	encl_page->desc = encl->base + offset;
+	encl_page->encl = encl;
+
+	prot = _calc_vm_trans(secinfo_flags, SGX_SECINFO_R, PROT_READ)  |
+	       _calc_vm_trans(secinfo_flags, SGX_SECINFO_W, PROT_WRITE) |
+	       _calc_vm_trans(secinfo_flags, SGX_SECINFO_X, PROT_EXEC);
+
+	/*
+	 * TCS pages must always RW set for CPU access while the SECINFO
+	 * permissions are *always* zero - the CPU ignores the user provided
+	 * values and silently overwrites them with zero permissions.
+	 */
+	if ((secinfo_flags & SGX_SECINFO_PAGE_TYPE_MASK) == SGX_SECINFO_TCS)
+		prot |= PROT_READ | PROT_WRITE;
+
+	/* Calculate maximum of the VM flags for the page. */
+	encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
+
+	return encl_page;
+}
+
 /**
  * sgx_zap_enclave_ptes() - remove PTEs mapping the address from enclave
  * @encl: the enclave
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 9d673d9531f0..253ebdd1c5be 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -112,6 +112,9 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
 void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
 int sgx_encl_test_and_clear_young(struct mm_struct *mm,
 				  struct sgx_encl_page *page);
+struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
+					  unsigned long offset,
+					  u64 secinfo_flags);
 void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
 struct sgx_epc_page *sgx_alloc_va_page(void);
 unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 746acddbb774..0460fd224a05 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -169,38 +169,6 @@ static long sgx_ioc_enclave_create(struct sgx_encl *encl, void __user *arg)
 	return ret;
 }
 
-static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
-						 unsigned long offset,
-						 u64 secinfo_flags)
-{
-	struct sgx_encl_page *encl_page;
-	unsigned long prot;
-
-	encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
-	if (!encl_page)
-		return ERR_PTR(-ENOMEM);
-
-	encl_page->desc = encl->base + offset;
-	encl_page->encl = encl;
-
-	prot = _calc_vm_trans(secinfo_flags, SGX_SECINFO_R, PROT_READ)  |
-	       _calc_vm_trans(secinfo_flags, SGX_SECINFO_W, PROT_WRITE) |
-	       _calc_vm_trans(secinfo_flags, SGX_SECINFO_X, PROT_EXEC);
-
-	/*
-	 * TCS pages must always RW set for CPU access while the SECINFO
-	 * permissions are *always* zero - the CPU ignores the user provided
-	 * values and silently overwrites them with zero permissions.
-	 */
-	if ((secinfo_flags & SGX_SECINFO_PAGE_TYPE_MASK) == SGX_SECINFO_TCS)
-		prot |= PROT_READ | PROT_WRITE;
-
-	/* Calculate maximum of the VM flags for the page. */
-	encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
-
-	return encl_page;
-}
-
 static int sgx_validate_secinfo(struct sgx_secinfo *secinfo)
 {
 	u64 perm = secinfo->flags & SGX_SECINFO_PERMISSION_MASK;
-- 
2.25.1



* [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (12 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 13/30] x86/sgx: Export sgx_encl_page_alloc() Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  5:03   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave Reinette Chatre
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

In the initial (SGX1) version of SGX, pages in an enclave need to be
created with permissions that support all usages of the pages, from the
time the enclave is initialized until it is unloaded. For example,
pages used by a JIT compiler or when code needs to otherwise be
relocated need to always have RWX permissions.

SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
and can be used to restrict the EPCM permissions of regular enclave
pages within an initialized enclave.

Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
restricting EPCM permissions. With this ioctl() the user specifies
a page range and the EPCM permissions to be applied to all pages in
the provided range. ENCLS[EMODPR] is run to restrict the EPCM
permissions followed by the ENCLS[ETRACK] flow that will ensure
no cached linear-to-physical address mappings to the changed
pages remain.

It is possible for the permission change request to fail on any
page within the provided range, either with an error encountered
by the kernel or by the SGX hardware while running
ENCLS[EMODPR]. To support partial success the ioctl() returns an
error code based on failures encountered by the kernel as well
as two result output parameters: one for the number of pages
that were successfully changed and one for the SGX return code.
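
As a rough illustration of how user space might consume these outputs,
the sketch below resumes a range after a partial success. The struct
layout is copied from this patch; restrict_fn, restrict_range(),
fake_restrict(), PAGE_SZ and the retry limit of 16 are hypothetical and
exist only for this example. fake_restrict() stands in for the real
ioctl() (which reports errors via errno rather than a negative return
value) so that the loop can be exercised without SGX hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096ULL	/* assumed 4 KiB enclave page size */

/* Layout copied from the uapi struct added by this patch. */
struct sgx_enclave_restrict_permissions {
	uint64_t offset;
	uint64_t length;
	uint64_t secinfo;
	uint64_t result;
	uint64_t count;
};

/*
 * Hypothetical stand-in for
 * ioctl(fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, p): returns 0 or -errno.
 */
typedef int (*restrict_fn)(struct sgx_enclave_restrict_permissions *p);

/*
 * Fake "kernel" for demonstration only: changes at most two pages per
 * call and reports -EAGAIN when it stops early.
 */
static int fake_restrict(struct sgx_enclave_restrict_permissions *p)
{
	uint64_t budget = 2 * PAGE_SZ;

	if (p->result || p->count)
		return -EINVAL;	/* outputs must be zero on entry */

	p->count = p->length < budget ? p->length : budget;
	return p->count == p->length ? 0 : -EAGAIN;
}

/*
 * Restrict [offset, offset + length), resuming after the bytes reported
 * in @count whenever a call stops early with -EAGAIN. Returns 0 or a
 * -errno value; *sgx_result receives the SGX result code of the last call.
 */
static int restrict_range(restrict_fn do_ioctl, uint64_t offset,
			  uint64_t length, uint64_t secinfo,
			  uint64_t *sgx_result)
{
	uint64_t done = 0;
	int attempts = 0;

	assert(do_ioctl && sgx_result);
	*sgx_result = 0;

	while (done < length) {
		struct sgx_enclave_restrict_permissions p;
		int ret;

		memset(&p, 0, sizeof(p));	/* result and count start at zero */
		p.offset = offset + done;
		p.length = length - done;
		p.secinfo = secinfo;

		ret = do_ioctl(&p);
		done += p.count;		/* count is in bytes */
		*sgx_result = p.result;

		if (ret == -EAGAIN && ++attempts < 16)
			continue;
		if (ret)
			return ret;
		if (!p.count)
			return -EIO;		/* no progress: avoid spinning */
	}

	return 0;
}
```

With the fake callback, restrict_range(fake_restrict, 0, 5 * PAGE_SZ, 0,
&res) needs three calls (two, two, then one page) before it returns 0.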

The page table entry permissions are not impacted by the EPCM
permission changes. VMAs and PTEs will continue to allow the
maximum vetted permissions determined at the time the pages
are added to the enclave. The SGX error code in a page fault
will indicate if it was an EPCM permission check that prevented
an access attempt.

No checking is done to ensure that the permissions are actually
being restricted. This is because the enclave may have relaxed
the EPCM permissions from within the enclave without letting the
kernel know. An attempt to relax permissions using this call will
be ignored by the hardware.
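
The combined effect of the input validation and of ENCLS[EMODPR] can
be modelled in a few lines. This is only a sketch: model_restrict()
and the SECINFO_* macros are hypothetical names, the bit values are
assumed to match the kernel's SGX_SECINFO_{R,W,X}, and the bitwise AND
is an assumed model of "relaxing is ignored" rather than code from
this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed to mirror SGX_SECINFO_R, SGX_SECINFO_W and SGX_SECINFO_X. */
#define SECINFO_R		(1ULL << 0)
#define SECINFO_W		(1ULL << 1)
#define SECINFO_X		(1ULL << 2)
#define SECINFO_PERM_MASK	(SECINFO_R | SECINFO_W | SECINFO_X)

/*
 * Hypothetical model of a permission restriction request on one page.
 * @epcm_perm holds the page's current EPCM permissions and is updated
 * in place. Returns 0 on success or -EINVAL for a rejected request.
 */
static int model_restrict(uint64_t *epcm_perm, uint64_t requested)
{
	assert(epcm_perm);

	/* Only permission bits may be set in the request. */
	if (requested & ~SECINFO_PERM_MASK)
		return -EINVAL;

	/* The enclave needs read access for EMODPE and EACCEPT. */
	if (!(requested & SECINFO_R))
		return -EINVAL;

	/* Bits can only be removed: requesting a new bit has no effect. */
	*epcm_perm &= requested;

	return 0;
}
```

For a page that currently has RW, a request for RX leaves it with R
only: W is dropped and the X request is ignored.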

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Include the sgx_ioc_sgx2_ready() utility
  that previously was in "x86/sgx: Support relaxing of enclave page
  permissions" that is removed from the next version.
- A few renames requested by Jarkko:
  struct sgx_enclave_restrict_perm ->
         struct sgx_enclave_restrict_permissions
  sgx_enclave_restrict_perm()     ->
         sgx_enclave_restrict_permissions()
  sgx_ioc_enclave_restrict_perm() ->
         sgx_ioc_enclave_restrict_permissions()
- Make EPCM permissions independent from kernel view of
  permissions.  (Jarkko)
  - Remove attempt at runtime tracking of EPCM permissions
    (sgx_encl_page->vm_run_prot_bits).
  - Do not flush page table entries - they are no longer impacted by
    EPCM permission changes.
  - Modify changelog to reflect new architecture.
- Ensure at least PROT_READ is requested - enclave requires read
  access to the page for commands like EMODPE and EACCEPT. (Jarkko)

Changes since V1:
- Change terminology to use "relax" instead of "extend" to refer to
  the case when enclave page permissions are added (Dave).
- Use ioctl() in commit message (Dave).
- Add examples on what permissions would be allowed (Dave).
- Split enclave page permission changes into two ioctl()s, one for
  permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
  and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
  (Jarkko).
- In support of the ioctl() name change the following names have been
  changed:
  struct sgx_page_modp -> struct sgx_enclave_restrict_perm
  sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
  sgx_page_modp() -> sgx_enclave_restrict_perm()
- ioctl() takes entire secinfo as input instead of
  page permissions only (Jarkko).
- Fix kernel-doc to include () in function name.
- Create and use utility for the ETRACK flow.
- Fixups in comments
- Move kernel-doc to function that provides documentation for
  Documentation/x86/sgx.rst.
- Remove redundant comment.
- Make explicit which members of struct sgx_enclave_restrict_perm
  are for output (Dave).

 arch/x86/include/uapi/asm/sgx.h |  21 +++
 arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
 2 files changed, 263 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index f4b81587e90b..a0a24e94fb27 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -29,6 +29,8 @@ enum sgx_page_flags {
 	_IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
 #define SGX_IOC_VEPC_REMOVE_ALL \
 	_IO(SGX_MAGIC, 0x04)
+#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
+	_IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
@@ -76,6 +78,25 @@ struct sgx_enclave_provision {
 	__u64 fd;
 };
 
+/**
+ * struct sgx_enclave_restrict_permissions - parameters for ioctl
+ *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
+ * @offset:	starting page offset (page aligned relative to enclave base
+ *		address defined in SECS)
+ * @length:	length of memory (multiple of the page size)
+ * @secinfo:	address for the SECINFO data containing the new permission bits
+ *		for pages in range described by @offset and @length
+ * @result:	(output) SGX result code of ENCLS[EMODPR] function
+ * @count:	(output) bytes successfully changed (multiple of page size)
+ */
+struct sgx_enclave_restrict_permissions {
+	__u64 offset;
+	__u64 length;
+	__u64 secinfo;
+	__u64 result;
+	__u64 count;
+};
+
 struct sgx_enclave_run;
 
 /**
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 0460fd224a05..4d88bfd163e7 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
 	return sgx_set_attribute(&encl->attributes_mask, params.fd);
 }
 
+/*
+ * Ensure enclave is ready for SGX2 functions. Readiness is checked
+ * by ensuring the hardware supports SGX2 and the enclave is initialized
+ * and thus able to handle requests to modify pages within it.
+ */
+static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
+{
+	if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
+		return -ENODEV;
+
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return -EINVAL;
+
+	return 0;
+}
+
+/*
+ * Return valid permission fields from a secinfo structure provided by
+ * user space. The secinfo structure is required to only have bits in
+ * the permission fields set.
+ */
+static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
+{
+	struct sgx_secinfo secinfo;
+	u64 perm;
+
+	if (copy_from_user(&secinfo, (void __user *)_secinfo,
+			   sizeof(secinfo)))
+		return -EFAULT;
+
+	if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
+		return -EINVAL;
+
+	if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
+		return -EINVAL;
+
+	perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
+
+	/*
+	 * Read access is required for the enclave to be able to use the page.
+	 * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
+	 * read access.
+	 */
+	if (!(perm & SGX_SECINFO_R))
+		return -EINVAL;
+
+	*secinfo_perm = perm;
+
+	return 0;
+}
+
+/*
+ * Some SGX functions require that no cached linear-to-physical address
+ * mappings are present before they can succeed. Collaborate with
+ * hardware via ENCLS[ETRACK] to ensure that all cached
+ * linear-to-physical address mappings belonging to all threads of
+ * the enclave are cleared. See sgx_encl_cpumask() for details.
+ */
+static int sgx_enclave_etrack(struct sgx_encl *encl)
+{
+	void *epc_virt;
+	int ret;
+
+	epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
+	ret = __etrack(epc_virt);
+	if (ret) {
+		/*
+		 * ETRACK only fails when there is an OS issue. For
+		 * example, two consecutive ETRACKs were sent without
+		 * a completed IPI in between.
+		 */
+		pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
+		/*
+		 * Send IPIs to kick CPUs out of the enclave and
+		 * try ETRACK again.
+		 */
+		on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
+		ret = __etrack(epc_virt);
+		if (ret) {
+			pr_err_once("ETRACK repeat returned %d (0x%x)",
+				    ret, ret);
+			return -EFAULT;
+		}
+	}
+	on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
+
+	return 0;
+}
+
+/**
+ * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
+ * @encl:	Enclave to which the pages belong.
+ * @modp:	Checked parameters from user on which pages need modifying.
+ * @secinfo_perm: New (validated) permission bits.
+ *
+ * Return:
+ * - 0:		Success.
+ * - -errno:	Otherwise.
+ */
+static long
+sgx_enclave_restrict_permissions(struct sgx_encl *encl,
+				 struct sgx_enclave_restrict_permissions *modp,
+				 u64 secinfo_perm)
+{
+	struct sgx_encl_page *entry;
+	struct sgx_secinfo secinfo;
+	unsigned long addr;
+	unsigned long c;
+	void *epc_virt;
+	int ret;
+
+	memset(&secinfo, 0, sizeof(secinfo));
+	secinfo.flags = secinfo_perm;
+
+	for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
+		addr = encl->base + modp->offset + c;
+
+		mutex_lock(&encl->lock);
+
+		entry = sgx_encl_load_page(encl, addr);
+		if (IS_ERR(entry)) {
+			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
+			goto out_unlock;
+		}
+
+		/*
+		 * Changing EPCM permissions is only supported on regular
+		 * SGX pages. Attempting this change on other pages will
+		 * result in #PF.
+		 */
+		if (entry->type != SGX_PAGE_TYPE_REG) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+
+		/*
+		 * Do not verify the permission bits requested. Kernel
+		 * has no control over how EPCM permissions can be relaxed
+		 * from within the enclave. ENCLS[EMODPR] can only
+		 * remove existing EPCM permissions, attempting to set
+		 * new permissions will be ignored by the hardware.
+		 */
+
+		/* Change EPCM permissions. */
+		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
+		ret = __emodpr(&secinfo, epc_virt);
+		if (encls_faulted(ret)) {
+			/*
+			 * All possible faults should be avoidable:
+			 * parameters have been checked, will only change
+			 * permissions of a regular page, and no concurrent
+			 * SGX1/SGX2 ENCLS instructions since these
+			 * are protected with mutex.
+			 */
+			pr_err_once("EMODPR encountered exception %d\n",
+				    ENCLS_TRAPNR(ret));
+			ret = -EFAULT;
+			goto out_unlock;
+		}
+		if (encls_failed(ret)) {
+			modp->result = ret;
+			ret = -EFAULT;
+			goto out_unlock;
+		}
+
+		ret = sgx_enclave_etrack(encl);
+		if (ret) {
+			ret = -EFAULT;
+			goto out_unlock;
+		}
+
+		mutex_unlock(&encl->lock);
+	}
+
+	ret = 0;
+	goto out;
+
+out_unlock:
+	mutex_unlock(&encl->lock);
+out:
+	modp->count = c;
+
+	return ret;
+}
+
+/**
+ * sgx_ioc_enclave_restrict_permissions() - handler for
+ *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to a &struct sgx_enclave_restrict_permissions
+ *		instance
+ *
+ * SGX2 distinguishes between relaxing and restricting the enclave page
+ * permissions maintained by the hardware (EPCM permissions) of pages
+ * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
+ *
+ * EPCM permissions cannot be restricted from within the enclave, the enclave
+ * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
+ * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
+ * will be ignored by the hardware.
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
+						 void __user *arg)
+{
+	struct sgx_enclave_restrict_permissions params;
+	u64 secinfo_perm;
+	long ret;
+
+	ret = sgx_ioc_sgx2_ready(encl);
+	if (ret)
+		return ret;
+
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (sgx_validate_offset_length(encl, params.offset, params.length))
+		return -EINVAL;
+
+	ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
+					 &secinfo_perm);
+	if (ret)
+		return ret;
+
+	if (params.result || params.count)
+		return -EINVAL;
+
+	ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	case SGX_IOC_ENCLAVE_PROVISION:
 		ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
 		break;
+	case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
+		ret = sgx_ioc_enclave_restrict_permissions(encl,
+							   (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;
-- 
2.25.1



* [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (13 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  5:05   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 16/30] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

With SGX1 an enclave needs to be created with its maximum memory demands
allocated. Pages cannot be added to an enclave after it is initialized.
SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
pages to an initialized enclave. With SGX2 the enclave still needs to
set aside address space for its maximum memory demands during enclave
creation, but all pages need not be added before enclave initialization.
Pages can be added during enclave runtime.

Add support for dynamically adding pages to an initialized enclave,
architecturally limited to RW permission at creation but allowed to
obtain RWX permissions after enclave runs EMODPE. Add pages via the
page fault handler at the time an enclave address without a backing
enclave page is accessed, potentially directly reclaiming pages if
no free pages are available.

The enclave is still required to run ENCLU[EACCEPT] on the page before
it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
on an uninitialized address. This will trigger the page fault handler
that will add the enclave page and return execution to the enclave to
repeat the ENCLU[EACCEPT] instruction, this time successfully.

If the enclave accesses an uninitialized address in another way, for
example by expanding the enclave stack to a page that has not yet been
added, then the page fault handler would add the page on the first
write. Upon returning to the enclave the instruction that triggered
the page fault would be repeated and, since ENCLU[EACCEPT] has not yet
been run, it would trigger a second page fault, this time with the SGX
flag set in the page fault error code. This can only be recovered from
by entering the enclave again and directly running the ENCLU[EACCEPT]
instruction on the now initialized address.

Accessing an uninitialized address from outside the enclave also
triggers this flow but the page will remain inaccessible (access will
result in #PF) until accepted from within the enclave via
ENCLU[EACCEPT].
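
The two in-enclave flows described above can be summarized as a small
state model. This is a sketch for illustration only: the enum names
and enclave_access() are hypothetical, the PENDING state stands for a
page that was added with ENCLS[EAUG] but not yet accepted, and the
first page fault plus instruction retry is folded into a single call.

```c
#include <assert.h>

enum page_state { PAGE_ABSENT, PAGE_PENDING, PAGE_ACCEPTED };
enum access_kind { ACCESS_EACCEPT, ACCESS_REGULAR };
enum outcome { ACCESS_OK, ACCESS_SGX_FAULT, ACCESS_NOT_MODELLED };

/*
 * Hypothetical model of one access from within the enclave, including
 * the retry of the instruction after the first page fault.
 */
static enum outcome enclave_access(enum page_state *page,
				   enum access_kind kind)
{
	assert(page);

	/*
	 * First #PF: no enclave page backs the address, so the fault
	 * handler adds one and the faulting instruction is repeated.
	 */
	if (*page == PAGE_ABSENT)
		*page = PAGE_PENDING;

	if (kind == ACCESS_EACCEPT) {
		/* EACCEPT on an already accepted page is out of scope. */
		if (*page != PAGE_PENDING)
			return ACCESS_NOT_MODELLED;
		*page = PAGE_ACCEPTED;
		return ACCESS_OK;
	}

	/*
	 * Any other access to a page that has not been accepted faults
	 * again, this time with the SGX flag in the error code.
	 */
	return *page == PAGE_ACCEPTED ? ACCESS_OK : ACCESS_SGX_FAULT;
}
```

Starting from PAGE_ABSENT, an ACCESS_EACCEPT succeeds in one step,
whereas an ACCESS_REGULAR leaves the page in PAGE_PENDING and needs a
later ACCESS_EACCEPT before regular accesses succeed.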

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Remove runtime tracking of EPCM permissions
  (sgx_encl_page->vm_run_prot_bits) (Jarkko).
- Move export of sgx_encl_{grow,shrink}() to separate patch. (Jarkko)
- Use sgx_encl_page_alloc(). (Jarkko)
- Set max allowed permissions to be RWX (Jarkko). Update changelog
  to indicate the change and use comment in code as
  created by Jarkko in:
https://lore.kernel.org/linux-sgx/20220306053211.135762-4-jarkko@kernel.org
- Do not set protection bits but let it be inherited by VMA (Jarkko)

Changes since V1:
- Fix subject line "to initialized" -> "to an initialized" (Jarkko).
- Move text about hardware's PENDING state to the patch that introduces
  the ENCLS[EAUG] wrapper (Jarkko).
- Ensure kernel-doc uses brackets when referring to function.

 arch/x86/kernel/cpu/sgx/encl.c | 124 +++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 546423753e4c..fa4f947f8496 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -194,6 +194,119 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 	return __sgx_encl_load_page(encl, entry);
 }
 
+/**
+ * sgx_encl_eaug_page() - Dynamically add page to initialized enclave
+ * @vma:	VMA obtained from fault info from where page is accessed
+ * @encl:	enclave accessing the page
+ * @addr:	address that triggered the page fault
+ *
+ * When an initialized enclave accesses a page with no backing EPC page
+ * on an SGX2 system, the EPC page can be added dynamically via the SGX2
+ * ENCLS[EAUG] instruction.
+ *
+ * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
+ * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
+ */
+static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
+				     struct sgx_encl *encl, unsigned long addr)
+{
+	struct sgx_pageinfo pginfo = {0};
+	struct sgx_encl_page *encl_page;
+	struct sgx_epc_page *epc_page;
+	struct sgx_va_page *va_page;
+	unsigned long phys_addr;
+	u64 secinfo_flags;
+	vm_fault_t vmret;
+	int ret;
+
+	if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
+		return VM_FAULT_SIGBUS;
+
+	/*
+	 * Ignore internal permission checking for dynamically added pages.
+	 * They matter only for data added during the pre-initialization
+	 * phase. The enclave decides the permissions by the means of
+	 * EACCEPT, EACCEPTCOPY and EMODPE.
+	 */
+	secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
+	encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
+	if (IS_ERR(encl_page))
+		return VM_FAULT_OOM;
+
+	epc_page = sgx_alloc_epc_page(encl_page, true);
+	if (IS_ERR(epc_page)) {
+		kfree(encl_page);
+		return VM_FAULT_SIGBUS;
+	}
+
+	va_page = sgx_encl_grow(encl);
+	if (IS_ERR(va_page)) {
+		ret = PTR_ERR(va_page);
+		goto err_out_free;
+	}
+
+	mutex_lock(&encl->lock);
+
+	/*
+	 * Copy comment from sgx_encl_add_page() to maintain guidance in
+	 * this similar flow:
+	 * Adding to encl->va_pages must be done under encl->lock.  Ditto for
+	 * deleting (via sgx_encl_shrink()) in the error path.
+	 */
+	if (va_page)
+		list_add(&va_page->list, &encl->va_pages);
+
+	ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
+			encl_page, GFP_KERNEL);
+	/*
+	 * If ret == -EBUSY then page was created in another flow while
+	 * running without encl->lock
+	 */
+	if (ret)
+		goto err_out_unlock;
+
+	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
+	pginfo.addr = encl_page->desc & PAGE_MASK;
+	pginfo.metadata = 0;
+
+	ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
+	if (ret)
+		goto err_out;
+
+	encl_page->encl = encl;
+	encl_page->epc_page = epc_page;
+	encl_page->type = SGX_PAGE_TYPE_REG;
+	encl->secs_child_cnt++;
+
+	sgx_mark_page_reclaimable(encl_page->epc_page);
+
+	phys_addr = sgx_get_epc_phys_addr(epc_page);
+	/*
+	 * Do not undo everything when creating PTE entry fails - next #PF
+	 * would find page ready for a PTE.
+	 */
+	vmret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
+	if (vmret != VM_FAULT_NOPAGE) {
+		mutex_unlock(&encl->lock);
+		return VM_FAULT_SIGBUS;
+	}
+	mutex_unlock(&encl->lock);
+	return VM_FAULT_NOPAGE;
+
+err_out:
+	xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
+
+err_out_unlock:
+	sgx_encl_shrink(encl, va_page);
+	mutex_unlock(&encl->lock);
+
+err_out_free:
+	sgx_encl_free_epc_page(epc_page);
+	kfree(encl_page);
+
+	return VM_FAULT_SIGBUS;
+}
+
 static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 {
 	unsigned long addr = (unsigned long)vmf->address;
@@ -213,6 +326,17 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	if (unlikely(!encl))
 		return VM_FAULT_SIGBUS;
 
+	/*
+	 * The page_array keeps track of all enclave pages, whether they
+	 * are swapped out or not. If there is no entry for this page and
+	 * the system supports SGX2 then it is possible to dynamically add
+	 * a new enclave page. This is only possible for an initialized
+	 * enclave that will be checked for right away.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_SGX2) &&
+	    (!xa_load(&encl->page_array, PFN_DOWN(addr))))
+		return sgx_encl_eaug_page(vma, encl, addr);
+
 	mutex_lock(&encl->lock);
 
 	entry = sgx_encl_load_page_in_vma(encl, addr, vma->vm_flags);
-- 
2.25.1



* [PATCH V3 16/30] x86/sgx: Tighten accessible memory range after enclave initialization
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (14 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:05   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 17/30] x86/sgx: Support modifying SGX page type Reinette Chatre
                   ` (13 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Before an enclave is initialized the enclave's memory range is unknown.
The enclave's memory range is learned at the time it is created via the
SGX_IOC_ENCLAVE_CREATE ioctl() where the provided memory range is
obtained from an earlier mmap() of /dev/sgx_enclave. After an enclave
is initialized its memory can be mapped into user space (mmap()) from
where it can be entered at its defined entry points.

With the enclave's memory range known after it is initialized there is
no reason why it should be possible to map memory outside this range.

Lock down access to the initialized enclave's memory range by denying
any attempt to map memory outside its memory range.

Locking down the memory range also makes adding pages to an initialized
enclave more efficient. Pages are added to an initialized enclave by
accessing memory that belongs to the enclave's memory range but is not
yet backed by an enclave page. If it is possible for user space to map
memory that does not form part of the enclave then an access to this
memory would eventually fail. Failures range from a prompt general
protection fault if the access was an ENCLU[EACCEPT] from within the
enclave, to a page fault via the vDSO if it was another access from
within the enclave, to a SIGBUS (also resulting from a page fault) if
the access was from outside the enclave.

Disallowing invalid memory to be mapped in the first place avoids
preventable failures.
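
The new check can be exercised in isolation with a small user space
model. This is a sketch only: struct encl_model and may_map_range()
are hypothetical stand-ins for struct sgx_encl and the check added to
sgx_encl_may_map(), which performs further checks that are not
modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the enclave state used by the range check. */
struct encl_model {
	bool initialized;
	uint64_t base;
	uint64_t size;
};

/*
 * Return -EACCES if [start, end) is not fully inside the address range
 * of an initialized enclave, 0 otherwise. The range of an enclave that
 * is not yet initialized is not enforced.
 */
static int may_map_range(const struct encl_model *encl,
			 uint64_t start, uint64_t end)
{
	assert(encl && start <= end);

	if (encl->initialized &&
	    (start < encl->base || end > encl->base + encl->size))
		return -EACCES;

	return 0;
}
```

For an initialized enclave at base 0x10000 with size 0x4000, mapping
[0x10000, 0x14000) is allowed while [0x13000, 0x15000) is rejected.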

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since V1:
- Add comment (Jarkko).

 arch/x86/kernel/cpu/sgx/encl.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index fa4f947f8496..7909570736a0 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -409,6 +409,11 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
 
 	XA_STATE(xas, &encl->page_array, PFN_DOWN(start));
 
+	/* Disallow mapping outside enclave's address range. */
+	if (test_bit(SGX_ENCL_INITIALIZED, &encl->flags) &&
+	    (start < encl->base || end > encl->base + encl->size))
+		return -EACCES;
+
 	/*
 	 * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might
 	 * conflict with the enclave page permissions.
-- 
2.25.1



* [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (15 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 16/30] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:06   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 18/30] x86/sgx: Support complete page removal Reinette Chatre
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Every enclave contains one or more Thread Control Structures (TCS). The
TCS contains meta-data used by the hardware to save and restore thread
specific information when entering/exiting the enclave. With SGX1 an
enclave needs to be created with enough TCSs to support the largest
number of threads expected to use the enclave and enough enclave pages
to meet all its anticipated memory demands. In SGX1 all pages remain in
the enclave until the enclave is unloaded.

SGX2 introduces a new function, ENCLS[EMODT], that is used to change
the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave
page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a
regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS)
page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later
removal).
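
The type changes listed above can be written down as a small predicate.
This is a sketch for illustration: enum page_type and
emodt_transition_allowed() are hypothetical names, with TYPE_REG,
TYPE_TCS and TYPE_TRIM standing in for SGX_PAGE_TYPE_REG,
SGX_PAGE_TYPE_TCS and SGX_PAGE_TYPE_TRIM. It encodes only the
transitions named in this changelog.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SGX_PAGE_TYPE_{REG,TCS,TRIM}. */
enum page_type { TYPE_REG, TYPE_TCS, TYPE_TRIM };

/* Return true if the changelog lists @from -> @to as a valid change. */
static bool emodt_transition_allowed(enum page_type from, enum page_type to)
{
	assert(from <= TYPE_TRIM && to <= TYPE_TRIM);

	switch (to) {
	case TYPE_TCS:
		/* Only a regular page can become a TCS page. */
		return from == TYPE_REG;
	case TYPE_TRIM:
		/* Regular and TCS pages can be trimmed for later removal. */
		return from == TYPE_REG || from == TYPE_TCS;
	default:
		/* No change back to a regular page is listed. */
		return false;
	}
}
```

For example, TYPE_REG to TYPE_TCS is accepted while TYPE_TCS to
TYPE_REG is not.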

With the existing support of dynamically adding regular enclave pages
to an initialized enclave and changing the page type to TCS it is
possible to dynamically increase the number of threads supported by an
enclave.

Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step
of dynamically removing pages from an initialized enclave. The complete
page removal flow is:
1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
   using the SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl() introduced here.
2) Approve the page removal by running ENCLU[EACCEPT] from within
   the enclave.
3) Initiate actual page removal using the ioctl() introduced in the
   following patch.

Add ioctl() SGX_IOC_ENCLAVE_MODIFY_TYPE to support changing SGX
enclave page types within an initialized enclave. With
SGX_IOC_ENCLAVE_MODIFY_TYPE the user specifies a page range and the
enclave page type to be applied to all pages in the provided range.
The ioctl() itself can return an error code based on failures
encountered by the kernel. It is also possible for SGX specific
failures to be encountered.  Add a result output parameter to
communicate the SGX return code. It is possible for the enclave page
type change request to fail on any page within the provided range.
Support partial success by returning the number of pages that were
successfully changed.
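
As an illustration only, a user-space caller might prepare a request
for this ioctl() along the following lines. The local struct
definitions mirror the uapi additions in this patch so that the sketch
compiles without the patched header. prepare_modify_type(), PAGE_SZ
and the numeric page type values (assumed to match the kernel's
enum sgx_page_type) are hypothetical names and assumptions, not part
of this patch.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096ULL	/* assumed enclave page size */

/* Assumed to match enum sgx_page_type in the kernel. */
enum { PT_TCS = 1, PT_REG = 2, PT_TRIM = 4 };

/* Local mirror of the SECINFO layout: flags plus reserved bytes. */
struct secinfo {
	uint64_t flags;
	uint8_t reserved[56];
} __attribute__((aligned(64)));

/* Local mirror of struct sgx_enclave_modify_type from this patch. */
struct modify_type_req {
	uint64_t offset;
	uint64_t length;
	uint64_t secinfo;
	uint64_t result;	/* output, must be zero on entry */
	uint64_t count;		/* output, must be zero on entry */
};

/*
 * Hypothetical helper: fill in a request that changes the pages in
 * [offset, offset + length) to page_type. Returns 0 on success and
 * -1 if the kernel would be expected to reject the parameters.
 */
static int prepare_modify_type(struct modify_type_req *req,
			       struct secinfo *si, uint64_t offset,
			       uint64_t length, unsigned int page_type)
{
	/* Only TCS and TRIM are accepted as new page types. */
	if (page_type != PT_TCS && page_type != PT_TRIM)
		return -1;

	/* Offset and length must be page aligned. */
	if ((offset | length) & (PAGE_SZ - 1))
		return -1;

	memset(si, 0, sizeof(*si));
	si->flags = (uint64_t)page_type << 8;	/* page type is in bits 15:8 */

	memset(req, 0, sizeof(*req));
	req->offset = offset;
	req->length = length;
	req->secinfo = (uint64_t)(uintptr_t)si;

	return 0;
}
```

The prepared request would then be passed to
ioctl(enclave_fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &req). On failure,
req.count reports how many bytes were changed before the error and
req.result holds the SGX return code if ENCLS[EMODT] itself failed.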

After the page type is changed the page continues to be accessible
from the kernel perspective with page table entries and internal
state. The page may be moved to swap. Any access before ENCLU[EACCEPT]
is run will encounter a page fault with the SGX flag set in the error
code.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Adjust ioctl number after removal of SGX_IOC_ENCLAVE_RELAX_PERMISSIONS.
- Remove attempt at runtime tracking of EPCM permissions
  (sgx_encl_page->vm_run_prot_bits). (Jarkko)
- Change names to follow guidance of using detailed names (Jarkko):
  struct sgx_enclave_modt -> struct sgx_enclave_modify_type
  sgx_enclave_modt() -> sgx_enclave_modify_type()
  sgx_ioc_enclave_modt() -> sgx_ioc_enclave_modify_type()

Changes since V1:
- Remove the "Earlier changes ..." paragraph (Jarkko).
- Change "new ioctl" text to "Add SGX_IOC_ENCLAVE_MOD_TYPE" (Jarkko).
- Discussion about EPCM interaction and the EPCM MODIFIED bit is moved
  to new patch that introduces the ENCLS[EMODT] wrapper while keeping
  the higher level discussion on page accessibility in
  this commit log (Jarkko).
- Rename SGX_IOC_PAGE_MODT ioctl() to SGX_IOC_ENCLAVE_MODIFY_TYPE
  (Jarkko).
- Rename struct sgx_page_modt to struct sgx_enclave_modt in support
  of ioctl() rename.
- Rename sgx_page_modt() to sgx_enclave_modt() and sgx_ioc_page_modt()
  to sgx_ioc_enclave_modt() in support of ioctl() rename.
- Provide secinfo as parameter to ioctl() instead of just
  page type (Jarkko).
- Update comments to refer to new ioctl() names.
- Use new SGX2 checking helper.
- Use ETRACK flow utility.
- Move kernel-doc to function that provides documentation for
  Documentation/x86/sgx.rst.
- Remove redundant comment.
- Use offset/length validation utility.
- Make explicit which members of struct sgx_enclave_modt are for
  output (Dave).

 arch/x86/include/uapi/asm/sgx.h |  20 +++
 arch/x86/kernel/cpu/sgx/ioctl.c | 209 ++++++++++++++++++++++++++++++++
 2 files changed, 229 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index a0a24e94fb27..529f4ab28410 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -31,6 +31,8 @@ enum sgx_page_flags {
 	_IO(SGX_MAGIC, 0x04)
 #define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
 	_IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
+#define SGX_IOC_ENCLAVE_MODIFY_TYPE \
+	_IOWR(SGX_MAGIC, 0x06, struct sgx_enclave_modify_type)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
@@ -97,6 +99,24 @@ struct sgx_enclave_restrict_permissions {
 	__u64 count;
 };
 
+/**
+ * struct sgx_enclave_modify_type - parameters for %SGX_IOC_ENCLAVE_MODIFY_TYPE
+ * @offset:	starting page offset (page aligned relative to enclave base
+ *		address defined in SECS)
+ * @length:	length of memory (multiple of the page size)
+ * @secinfo:	address for the SECINFO data containing the new type
+ *		for pages in range described by @offset and @length
+ * @result:	(output) SGX result code of ENCLS[EMODT] function
+ * @count:	(output) bytes successfully changed (multiple of page size)
+ */
+struct sgx_enclave_modify_type {
+	__u64 offset;
+	__u64 length;
+	__u64 secinfo;
+	__u64 result;
+	__u64 count;
+};
+
 struct sgx_enclave_run;
 
 /**
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 4d88bfd163e7..6f769e67ec2d 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -898,6 +898,212 @@ static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
 	return ret;
 }
 
+/**
+ * sgx_enclave_modify_type() - Modify type of SGX enclave pages
+ * @encl:	Enclave to which the pages belong.
+ * @modt:	Checked parameters from user about which pages need modifying.
+ * @page_type:	New page type.
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_enclave_modify_type(struct sgx_encl *encl,
+				    struct sgx_enclave_modify_type *modt,
+				    enum sgx_page_type page_type)
+{
+	unsigned long max_prot_restore;
+	struct sgx_encl_page *entry;
+	struct sgx_secinfo secinfo;
+	unsigned long prot;
+	unsigned long addr;
+	unsigned long c;
+	void *epc_virt;
+	int ret;
+
+	/*
+	 * The only new page types allowed by hardware are PT_TCS and PT_TRIM.
+	 */
+	if (page_type != SGX_PAGE_TYPE_TCS && page_type != SGX_PAGE_TYPE_TRIM)
+		return -EINVAL;
+
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = page_type << 8;
+
+	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
+		addr = encl->base + modt->offset + c;
+
+		mutex_lock(&encl->lock);
+
+		entry = sgx_encl_load_page(encl, addr);
+		if (IS_ERR(entry)) {
+			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
+			goto out_unlock;
+		}
+
+		/*
+		 * Borrow the logic from the Intel SDM. Regular pages
+		 * (SGX_PAGE_TYPE_REG) can change type to SGX_PAGE_TYPE_TCS
+		 * or SGX_PAGE_TYPE_TRIM but TCS pages can only be trimmed.
+		 * CET pages not supported yet.
+		 */
+		if (!(entry->type == SGX_PAGE_TYPE_REG ||
+		      (entry->type == SGX_PAGE_TYPE_TCS &&
+		       page_type == SGX_PAGE_TYPE_TRIM))) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+
+		max_prot_restore = entry->vm_max_prot_bits;
+
+		/*
+		 * Once a regular page becomes a TCS page it cannot be
+		 * changed back. So the maximum allowed protection reflects
+		 * the TCS page that is always RW from kernel perspective but
+		 * will be inaccessible from within enclave. Before doing
+		 * so, do make sure that the new page type continues to
+		 * respect the originally vetted page permissions.
+		 */
+		if (entry->type == SGX_PAGE_TYPE_REG &&
+		    page_type == SGX_PAGE_TYPE_TCS) {
+			if (~entry->vm_max_prot_bits & (VM_READ | VM_WRITE)) {
+				ret = -EPERM;
+				goto out_unlock;
+			}
+			prot = PROT_READ | PROT_WRITE;
+			entry->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
+
+			/*
+			 * Prevent page from being reclaimed while mutex
+			 * is released.
+			 */
+			if (sgx_unmark_page_reclaimable(entry->epc_page)) {
+				ret = -EAGAIN;
+				goto out_entry_changed;
+			}
+
+			/*
+			 * Do not keep encl->lock because of dependency on
+			 * mmap_lock acquired in sgx_zap_enclave_ptes().
+			 */
+			mutex_unlock(&encl->lock);
+
+			sgx_zap_enclave_ptes(encl, addr);
+
+			mutex_lock(&encl->lock);
+
+			sgx_mark_page_reclaimable(entry->epc_page);
+		}
+
+		/* Change EPC type */
+		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
+		ret = __emodt(&secinfo, epc_virt);
+		if (encls_faulted(ret)) {
+			/*
+			 * All possible faults should be avoidable:
+			 * parameters have been checked, will only change
+			 * valid page types, and no concurrent
+			 * SGX1/SGX2 ENCLS instructions since these are
+			 * protected with mutex.
+			 */
+			pr_err_once("EMODT encountered exception %d\n",
+				    ENCLS_TRAPNR(ret));
+			ret = -EFAULT;
+			goto out_entry_changed;
+		}
+		if (encls_failed(ret)) {
+			modt->result = ret;
+			ret = -EFAULT;
+			goto out_entry_changed;
+		}
+
+		ret = sgx_enclave_etrack(encl);
+		if (ret) {
+			ret = -EFAULT;
+			goto out_unlock;
+		}
+
+		entry->type = page_type;
+
+		mutex_unlock(&encl->lock);
+	}
+
+	ret = 0;
+	goto out;
+
+out_entry_changed:
+	entry->vm_max_prot_bits = max_prot_restore;
+out_unlock:
+	mutex_unlock(&encl->lock);
+out:
+	modt->count = c;
+
+	return ret;
+}
+
+/**
+ * sgx_ioc_enclave_modify_type() - handler for %SGX_IOC_ENCLAVE_MODIFY_TYPE
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to a &struct sgx_enclave_modify_type instance
+ *
+ * Ability to change the enclave page type supports the following use cases:
+ *
+ * * It is possible to add TCS pages to an enclave by changing the type of
+ *   regular pages (%SGX_PAGE_TYPE_REG) to TCS (%SGX_PAGE_TYPE_TCS) pages.
+ *   With this support the number of threads supported by an initialized
+ *   enclave can be increased dynamically.
+ *
+ * * Regular or TCS pages can dynamically be removed from an initialized
+ *   enclave by changing the page type to %SGX_PAGE_TYPE_TRIM. Changing the
+ *   page type to %SGX_PAGE_TYPE_TRIM marks the page for removal with actual
+ *   removal done by handler of %SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl() called
+ *   after ENCLU[EACCEPT] is run on %SGX_PAGE_TYPE_TRIM page from within the
+ *   enclave.
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_ioc_enclave_modify_type(struct sgx_encl *encl, void __user *arg)
+{
+	struct sgx_enclave_modify_type params;
+	enum sgx_page_type page_type;
+	struct sgx_secinfo secinfo;
+	long ret;
+
+	ret = sgx_ioc_sgx2_ready(encl);
+	if (ret)
+		return ret;
+
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (sgx_validate_offset_length(encl, params.offset, params.length))
+		return -EINVAL;
+
+	if (copy_from_user(&secinfo, (void __user *)params.secinfo,
+			   sizeof(secinfo)))
+		return -EFAULT;
+
+	if (secinfo.flags & ~SGX_SECINFO_PAGE_TYPE_MASK)
+		return -EINVAL;
+
+	if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
+		return -EINVAL;
+
+	if (params.result || params.count)
+		return -EINVAL;
+
+	page_type = (secinfo.flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;
+	ret = sgx_enclave_modify_type(encl, &params, page_type);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -923,6 +1129,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 		ret = sgx_ioc_enclave_restrict_permissions(encl,
 							   (void __user *)arg);
 		break;
+	case SGX_IOC_ENCLAVE_MODIFY_TYPE:
+		ret = sgx_ioc_enclave_modify_type(encl, (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;
-- 
2.25.1



* [PATCH V3 18/30] x86/sgx: Support complete page removal
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (16 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 17/30] x86/sgx: Support modifying SGX page type Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:08   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The SGX2 page removal flow was introduced in the previous patch and is
as follows:
1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
   using the ioctl() SGX_IOC_ENCLAVE_MODIFY_TYPE introduced in the
   previous patch.
2) Approve the page removal by running ENCLU[EACCEPT] from within
   the enclave.
3) Initiate actual page removal using the ioctl()
   SGX_IOC_ENCLAVE_REMOVE_PAGES introduced here.

Support the final step of the SGX2 page removal flow with ioctl()
SGX_IOC_ENCLAVE_REMOVE_PAGES. With this ioctl() the user specifies
a page range that should be removed. All pages in the provided
range should have the SGX_PAGE_TYPE_TRIM page type and the request
will fail with EPERM (Operation not permitted) if a page that does
not have the correct type is encountered. Page removal can fail
on any page within the provided range. Support partial success by
returning the number of pages that were successfully removed.
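
The partial success reporting lets user space resume after a failure
instead of starting over. A minimal sketch of the bookkeeping,
assuming a local mirror of struct sgx_enclave_remove_pages and a
hypothetical helper advance_past_removed() that is not part of this
patch:

```c
#include <stdint.h>

/* Local mirror of struct sgx_enclave_remove_pages from this patch. */
struct remove_pages_req {
	uint64_t offset;
	uint64_t length;
	uint64_t count;	/* output: bytes successfully removed */
};

/*
 * Hypothetical helper: after a failed call, skip the bytes the kernel
 * reported as removed so that a retry covers only the remaining range.
 * Returns the number of bytes still to be removed.
 */
static uint64_t advance_past_removed(struct remove_pages_req *req)
{
	req->offset += req->count;
	req->length -= req->count;
	req->count = 0;	/* the kernel rejects a non-zero count on entry */

	return req->length;
}
```

Whether a retry makes sense depends on the error: EAGAIN indicates a
page that was temporarily busy, while EPERM indicates a page that is
not of type SGX_PAGE_TYPE_TRIM or has not been accepted from within
the enclave.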

Since actual page removal will succeed even if ENCLU[EACCEPT] was not
run from within the enclave, the ENCLS[EMODPR] instruction with RWX
permissions is used as a no-op mechanism to ensure ENCLU[EACCEPT] was
successfully run from within the enclave before the enclave page is
removed.
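
The check can be sketched as a small predicate on the ENCLS return
value. The macro definitions below are assumed to mirror the kernel's
ENCLS fault encoding and the SGX error code value, and
trimmed_page_accepted() is a hypothetical name used only for
illustration:

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's ENCLS fault encoding. */
#define ENCLS_FAULT_FLAG	0x40000000
#define ENCLS_TRAPNR(r)		((r) & ~ENCLS_FAULT_FLAG)
#define X86_TRAP_PF		14	/* page fault vector */
#define SGX_PAGE_NOT_MODIFIABLE	20	/* assumed SGX error code value */

/*
 * Sketch of the check: the trimmed page counts as accepted only if
 * the no-op permission restriction faulted and the fault was a page
 * fault. Any other outcome, including SGX_PAGE_NOT_MODIFIABLE, means
 * ENCLU[EACCEPT] has not been run on the page yet.
 */
static bool trimmed_page_accepted(int emodpr_ret)
{
	return (emodpr_ret & ENCLS_FAULT_FLAG) &&
	       ENCLS_TRAPNR(emodpr_ret) == X86_TRAP_PF;
}
```

In the patch itself the equivalent test is written in negated form and
results in -EPERM when the page has not been accepted.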

If the user omits running SGX_IOC_ENCLAVE_REMOVE_PAGES the pages will
still be removed when the enclave is unloaded.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Adjust ioctl number since removal of
  SGX_IOC_ENCLAVE_RELAX_PERMISSIONS.

Changes since V1:
- Update comments to refer to new ioctl() names SGX_IOC_PAGE_MODT ->
  SGX_IOC_ENCLAVE_MODIFY_TYPE.
- Fix kernel-doc to have () as part of function name.
- Change name of ioctl():
  SGX_IOC_PAGE_REMOVE -> SGX_IOC_ENCLAVE_REMOVE_PAGES (Jarkko).
- With the above name change the page removal ioctl() has its name
  aligned with existing SGX_IOC_ENCLAVE_ADD_PAGES ioctl(). Also align
  naming of struct and functions:
  struct sgx_page_remove -> struct sgx_enclave_remove_pages
  sgx_page_remove() -> sgx_encl_remove_pages()
  sgx_ioc_page_remove() -> sgx_ioc_enclave_remove_pages()
- Use new SGX2 checking helper.
- When loading enclave page, make error code consistent with other
  instances to help user distinguish between permanent and temporary
  failures.
- Move kernel-doc to function that provides documentation for
  Documentation/x86/sgx.rst.
- Remove redundant comment.
- Use offset/length validation utility.
- Make explicit which member of struct sgx_enclave_remove_pages is for
  output (Dave).

 arch/x86/include/uapi/asm/sgx.h |  21 +++++
 arch/x86/kernel/cpu/sgx/ioctl.c | 145 ++++++++++++++++++++++++++++++++
 2 files changed, 166 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 529f4ab28410..feda7f85b2ce 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -33,6 +33,8 @@ enum sgx_page_flags {
 	_IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
 #define SGX_IOC_ENCLAVE_MODIFY_TYPE \
 	_IOWR(SGX_MAGIC, 0x06, struct sgx_enclave_modify_type)
+#define SGX_IOC_ENCLAVE_REMOVE_PAGES \
+	_IOWR(SGX_MAGIC, 0x07, struct sgx_enclave_remove_pages)
 
 /**
  * struct sgx_enclave_create - parameter structure for the
@@ -117,6 +119,25 @@ struct sgx_enclave_modify_type {
 	__u64 count;
 };
 
+/**
+ * struct sgx_enclave_remove_pages - %SGX_IOC_ENCLAVE_REMOVE_PAGES parameters
+ * @offset:	starting page offset (page aligned relative to enclave base
+ *		address defined in SECS)
+ * @length:	length of memory (multiple of the page size)
+ * @count:	(output) bytes successfully removed (multiple of page size)
+ *
+ * Regular (PT_REG) or TCS (PT_TCS) pages can be removed from an initialized
+ * enclave if the system supports SGX2. First, the %SGX_IOC_ENCLAVE_MODIFY_TYPE
+ * ioctl() should be used to change the page type to PT_TRIM. After that
+ * succeeds ENCLU[EACCEPT] should be run from within the enclave and then
+ * %SGX_IOC_ENCLAVE_REMOVE_PAGES can be used to complete the page removal.
+ */
+struct sgx_enclave_remove_pages {
+	__u64 offset;
+	__u64 length;
+	__u64 count;
+};
+
 struct sgx_enclave_run;
 
 /**
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 6f769e67ec2d..515e1961cc02 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -1104,6 +1104,148 @@ static long sgx_ioc_enclave_modify_type(struct sgx_encl *encl, void __user *arg)
 	return ret;
 }
 
+/**
+ * sgx_encl_remove_pages() - Remove trimmed pages from SGX enclave
+ * @encl:	Enclave to which the pages belong
+ * @params:	Checked parameters from user on which pages need to be removed
+ *
+ * Return:
+ * - 0:		Success.
+ * - -errno:	Otherwise.
+ */
+static long sgx_encl_remove_pages(struct sgx_encl *encl,
+				  struct sgx_enclave_remove_pages *params)
+{
+	struct sgx_encl_page *entry;
+	struct sgx_secinfo secinfo;
+	unsigned long addr;
+	unsigned long c;
+	void *epc_virt;
+	int ret;
+
+	memset(&secinfo, 0, sizeof(secinfo));
+	secinfo.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
+
+	for (c = 0 ; c < params->length; c += PAGE_SIZE) {
+		addr = encl->base + params->offset + c;
+
+		mutex_lock(&encl->lock);
+
+		entry = sgx_encl_load_page(encl, addr);
+		if (IS_ERR(entry)) {
+			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
+			goto out_unlock;
+		}
+
+		if (entry->type != SGX_PAGE_TYPE_TRIM) {
+			ret = -EPERM;
+			goto out_unlock;
+		}
+
+		/*
+		 * ENCLS[EMODPR] is a no-op instruction used to inform if
+		 * ENCLU[EACCEPT] was run from within the enclave. If
+		 * ENCLS[EMODPR] is run with RWX on a trimmed page that is
+		 * not yet accepted then it will return
+		 * %SGX_PAGE_NOT_MODIFIABLE. After the trimmed page is
+		 * accepted the instruction will encounter a page fault.
+		 */
+		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
+		ret = __emodpr(&secinfo, epc_virt);
+		if (!encls_faulted(ret) || ENCLS_TRAPNR(ret) != X86_TRAP_PF) {
+			ret = -EPERM;
+			goto out_unlock;
+		}
+
+		if (sgx_unmark_page_reclaimable(entry->epc_page)) {
+			ret = -EBUSY;
+			goto out_unlock;
+		}
+
+		/*
+		 * Do not keep encl->lock because of dependency on
+		 * mmap_lock acquired in sgx_zap_enclave_ptes().
+		 */
+		mutex_unlock(&encl->lock);
+
+		sgx_zap_enclave_ptes(encl, addr);
+
+		mutex_lock(&encl->lock);
+
+		sgx_encl_free_epc_page(entry->epc_page);
+		encl->secs_child_cnt--;
+		entry->epc_page = NULL;
+		xa_erase(&encl->page_array, PFN_DOWN(entry->desc));
+		sgx_encl_shrink(encl, NULL);
+		kfree(entry);
+
+		mutex_unlock(&encl->lock);
+	}
+
+	ret = 0;
+	goto out;
+
+out_unlock:
+	mutex_unlock(&encl->lock);
+out:
+	params->count = c;
+
+	return ret;
+}
+
+/**
+ * sgx_ioc_enclave_remove_pages() - handler for %SGX_IOC_ENCLAVE_REMOVE_PAGES
+ * @encl:	an enclave pointer
+ * @arg:	userspace pointer to &struct sgx_enclave_remove_pages instance
+ *
+ * Final step of the flow removing pages from an initialized enclave. The
+ * complete flow is:
+ *
+ * 1) User changes the type of the pages to be removed to %SGX_PAGE_TYPE_TRIM
+ *    using the %SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl().
+ * 2) User approves the page removal by running ENCLU[EACCEPT] from within
+ *    the enclave.
+ * 3) User initiates actual page removal using the
+ *    %SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl() that is handled here.
+ *
+ * First remove any page table entries pointing to the page and then proceed
+ * with the actual removal of the enclave page and data in support of it.
+ *
+ * VA pages are not affected by this removal. It is thus possible that the
+ * enclave may end up with more VA pages than needed to support all its
+ * pages.
+ *
+ * Return:
+ * - 0:		Success
+ * - -errno:	Otherwise
+ */
+static long sgx_ioc_enclave_remove_pages(struct sgx_encl *encl,
+					 void __user *arg)
+{
+	struct sgx_enclave_remove_pages params;
+	long ret;
+
+	ret = sgx_ioc_sgx2_ready(encl);
+	if (ret)
+		return ret;
+
+	if (copy_from_user(&params, arg, sizeof(params)))
+		return -EFAULT;
+
+	if (sgx_validate_offset_length(encl, params.offset, params.length))
+		return -EINVAL;
+
+	if (params.count)
+		return -EINVAL;
+
+	ret = sgx_encl_remove_pages(encl, &params);
+
+	if (copy_to_user(arg, &params, sizeof(params)))
+		return -EFAULT;
+
+	return ret;
+}
+
 long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct sgx_encl *encl = filep->private_data;
@@ -1132,6 +1274,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
 	case SGX_IOC_ENCLAVE_MODIFY_TYPE:
 		ret = sgx_ioc_enclave_modify_type(encl, (void __user *)arg);
 		break;
+	case SGX_IOC_ENCLAVE_REMOVE_PAGES:
+		ret = sgx_ioc_enclave_remove_pages(encl, (void __user *)arg);
+		break;
 	default:
 		ret = -ENOIOCTLCMD;
 		break;
-- 
2.25.1



* [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (17 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 18/30] x86/sgx: Support complete page removal Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:11   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 20/30] Documentation/x86: Introduce enclave runtime management section Reinette Chatre
                   ` (10 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The page reclaimer ensures availability of EPC pages across all
enclaves. In support of this it runs independently from the
individual enclaves in order to take locks from the different
enclaves as it writes pages to swap.

When needing to load a page from swap an EPC page needs to be
available for its contents to be loaded into. Loading an existing
enclave page from swap does not reclaim EPC pages directly if
none are available; instead the reclaimer is woken when the
available EPC pages are found to be below a watermark.

When iterating over a large number of pages in an oversubscribed
environment there is a race between the reclaimer being woken up and
EPC pages being reclaimed fast enough for the page operations to
proceed.

Ensure there are EPC pages available before attempting to load
a page that may potentially be pulled from swap into an available
EPC page.
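
The effect can be illustrated with a toy model. This is not kernel
code: struct epc_pool, the batch size and NR_LOW_PAGES are assumptions
chosen for the example, and the model deliberately ignores the
background reclaimer to show the worst case of the race.

```c
#include <stdbool.h>

#define NR_LOW_PAGES 32	/* assumed stand-in for SGX_NR_LOW_PAGES */

/* Toy model of the EPC pool; not the kernel's data structures. */
struct epc_pool {
	unsigned long free_pages;	/* pages available for allocation */
	unsigned long reclaimable;	/* pages that could go to swap */
};

static bool should_reclaim(const struct epc_pool *p, unsigned long watermark)
{
	return p->free_pages < watermark && p->reclaimable > 0;
}

/* Models sgx_direct_reclaim(): reclaim synchronously when running low. */
static void direct_reclaim(struct epc_pool *p)
{
	if (should_reclaim(p, NR_LOW_PAGES)) {
		/* Reclaim a small batch, or whatever is left. */
		unsigned long batch = p->reclaimable < 16 ? p->reclaimable : 16;

		p->reclaimable -= batch;
		p->free_pages += batch;
	}
}

/* Load nr swapped-out pages; returns how many loads found a free page. */
static unsigned long load_pages(struct epc_pool *p, unsigned long nr,
				bool direct)
{
	unsigned long loaded = 0;

	while (nr--) {
		if (direct)
			direct_reclaim(p);
		if (!p->free_pages)
			continue;	/* the load would fail here */
		p->free_pages--;
		loaded++;
	}

	return loaded;
}
```

In this model, loading 50 pages from a pool with 2 free and 100
reclaimable pages succeeds for every page only when direct reclaim
runs before each load.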

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

Changes since v1:
- Reword commit message.

 arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
 arch/x86/kernel/cpu/sgx/main.c  | 6 ++++++
 arch/x86/kernel/cpu/sgx/sgx.h   | 1 +
 3 files changed, 13 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index 515e1961cc02..f88bc1236276 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -777,6 +777,8 @@ sgx_enclave_restrict_permissions(struct sgx_encl *encl,
 	for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
 		addr = encl->base + modp->offset + c;
 
+		sgx_direct_reclaim();
+
 		mutex_lock(&encl->lock);
 
 		entry = sgx_encl_load_page(encl, addr);
@@ -934,6 +936,8 @@ static long sgx_enclave_modify_type(struct sgx_encl *encl,
 	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
 		addr = encl->base + modt->offset + c;
 
+		sgx_direct_reclaim();
+
 		mutex_lock(&encl->lock);
 
 		entry = sgx_encl_load_page(encl, addr);
@@ -1129,6 +1133,8 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
 	for (c = 0 ; c < params->length; c += PAGE_SIZE) {
 		addr = encl->base + params->offset + c;
 
+		sgx_direct_reclaim();
+
 		mutex_lock(&encl->lock);
 
 		entry = sgx_encl_load_page(encl, addr);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 6e2cb7564080..545da16bb3ea 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -370,6 +370,12 @@ static bool sgx_should_reclaim(unsigned long watermark)
 	       !list_empty(&sgx_active_page_list);
 }
 
+void sgx_direct_reclaim(void)
+{
+	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
+		sgx_reclaim_pages();
+}
+
 static int ksgxd(void *p)
 {
 	set_freezable();
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index b30cee4de903..85cbf103b0dd 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -86,6 +86,7 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page)
 struct sgx_epc_page *__sgx_alloc_epc_page(void);
 void sgx_free_epc_page(struct sgx_epc_page *page);
 
+void sgx_direct_reclaim(void);
 void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
 struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
-- 
2.25.1



* [PATCH V3 20/30] Documentation/x86: Introduce enclave runtime management section
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (18 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes Reinette Chatre
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Enclave runtime management is introduced following the pattern
of the section describing enclave building. Provide a brief
summary of enclave runtime management, pointing to the functions
implementing the ioctl()s that will contain details within their
kernel-doc.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Remove references to ioctl() to relax permissions and update to reflect
  function renaming sgx_ioc_enclave_restrict_perm() ->
  sgx_ioc_enclave_restrict_permissions().
- Rename sgx_ioc_enclave_modt -> sgx_ioc_enclave_modify_type

Changes since V1:
- New patch.

 Documentation/x86/sgx.rst | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
index 265568a9292c..10287c558485 100644
--- a/Documentation/x86/sgx.rst
+++ b/Documentation/x86/sgx.rst
@@ -100,6 +100,21 @@ pages and establish enclave page permissions.
                sgx_ioc_enclave_init
                sgx_ioc_enclave_provision
 
+Enclave runtime management
+--------------------------
+
+Systems supporting SGX2 additionally support changes to initialized
+enclaves: modifying enclave page permissions and type, and dynamically
+adding and removing enclave pages. When an enclave accesses an address
+within its address range that does not have a backing page then a new
+regular page will be dynamically added to the enclave. The enclave is
+still required to run EACCEPT on the new page before it can be used.
+
+.. kernel-doc:: arch/x86/kernel/cpu/sgx/ioctl.c
+   :functions: sgx_ioc_enclave_restrict_permissions
+               sgx_ioc_enclave_modify_type
+               sgx_ioc_enclave_remove_pages
+
 Enclave vDSO
 ------------
 
-- 
2.25.1



* [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (19 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 20/30] Documentation/x86: Introduce enclave runtime management section Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-05  7:02   ` Jarkko Sakkinen
  2022-04-04 16:49 ` [PATCH V3 22/30] selftests/sgx: Add test for TCS page " Reinette Chatre
                   ` (8 subsequent siblings)
  29 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

EPCM permission changes can be made from within the enclave (to relax
permissions) or from outside it (to restrict permissions). Kernel
support is needed when permissions are restricted to be able to
call the privileged ENCLS[EMODPR] instruction. EPCM permissions
can be relaxed via ENCLU[EMODPE] from within the enclave but the
enclave still depends on the kernel to install PTEs with the needed
permissions.

Add a test that exercises a few of the enclave page permission flows:
1) Test starts with a RW (from enclave and kernel perspective)
   enclave page that is mapped via a RW VMA.
2) Use the SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl() to restrict
   the enclave (EPCM) page permissions to read-only.
3) Run ENCLU[EACCEPT] from within the enclave to accept the new page
   permissions.
4) Attempt to write to the enclave page from within the enclave - this
   should fail with a page fault on the EPCM permissions since the page
   table entry continues to allow RW access.
5) Restore EPCM permissions to RW by running ENCLU[EMODPE] from within
   the enclave.
6) Attempt to write to the enclave page from within the enclave - this
   should succeed since both EPCM and PTE permissions allow this access.
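
The relationship the test relies on can be summarized with a toy
model in which an enclave write needs both the PTE and the EPCM to
allow it. The struct and helper names below are hypothetical and only
model the behavior described above; they are not part of the selftest.

```c
#include <stdbool.h>

#define PERM_R 0x1
#define PERM_W 0x2

/* Toy model of one enclave page; not a kernel or hardware structure. */
struct page_model {
	unsigned int pte;	/* permissions in the page table entry */
	unsigned int epcm;	/* permissions in the EPCM */
};

/* An enclave write succeeds only if both the PTE and the EPCM allow it. */
static bool enclave_write_ok(const struct page_model *pg)
{
	return (pg->pte & PERM_W) && (pg->epcm & PERM_W);
}

/* Models ENCLS[EMODPR] followed by ENCLU[EACCEPT]: can only remove bits. */
static void restrict_epcm(struct page_model *pg, unsigned int perms)
{
	pg->epcm &= perms;
}

/* Models ENCLU[EMODPE]: can only add bits. */
static void extend_epcm(struct page_model *pg, unsigned int perms)
{
	pg->epcm |= perms;
}
```

Walking the model through the steps above: a RW page is writable,
restricting the EPCM to read-only makes the write fail while the PTE
is unchanged, and extending the EPCM back to RW makes the write
succeed again without any PTE update.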

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Modify test to support separation between EPCM and PTE/VMA permissions
  - Fix changelog and comments to reflect new relationship between
    EPCM and PTE/VMA permissions.
  - With EPCM permissions controlling access instead of PTE permissions,
    check for SGX error code now encountered in page fault.
  - Stop calling SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and ensure that
    only calling ENCLU[EMODPE] from within enclave is necessary to restore
    access to the enclave page.
- Update to use new struct name struct sgx_enclave_restrict_perm -> struct
  sgx_enclave_restrict_permissions. (Jarkko)

Changes since V1:
- Adapt test to the kernel interface changes: the ioctl() name change
  and providing entire secinfo as parameter.
- Remove the ENCLU[EACCEPT] call after permissions are relaxed since
  the new flow no longer results in the EPCM PR bit being set.
- Rewrite error path to reduce line lengths.

 tools/testing/selftests/sgx/defines.h   |  15 ++
 tools/testing/selftests/sgx/main.c      | 218 ++++++++++++++++++++++++
 tools/testing/selftests/sgx/test_encl.c |  38 +++++
 3 files changed, 271 insertions(+)

diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
index 02d775789ea7..b638eb98c80c 100644
--- a/tools/testing/selftests/sgx/defines.h
+++ b/tools/testing/selftests/sgx/defines.h
@@ -24,6 +24,8 @@ enum encl_op_type {
 	ENCL_OP_PUT_TO_ADDRESS,
 	ENCL_OP_GET_FROM_ADDRESS,
 	ENCL_OP_NOP,
+	ENCL_OP_EACCEPT,
+	ENCL_OP_EMODPE,
 	ENCL_OP_MAX,
 };
 
@@ -53,4 +55,17 @@ struct encl_op_get_from_addr {
 	uint64_t addr;
 };
 
+struct encl_op_eaccept {
+	struct encl_op_header header;
+	uint64_t epc_addr;
+	uint64_t flags;
+	uint64_t ret;
+};
+
+struct encl_op_emodpe {
+	struct encl_op_header header;
+	uint64_t epc_addr;
+	uint64_t flags;
+};
+
 #endif /* DEFINES_H */
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index dd74fa42302e..0e0bd1c4d702 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -25,6 +25,18 @@ static const uint64_t MAGIC = 0x1122334455667788ULL;
 static const uint64_t MAGIC2 = 0x8877665544332211ULL;
 vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave;
 
+/*
+ * Security Information (SECINFO) data structure needed by a few SGX
+ * instructions (eg. ENCLU[EACCEPT] and ENCLU[EMODPE]) holds meta-data
+ * about an enclave page. &enum sgx_secinfo_page_state specifies the
+ * secinfo flags used for page state.
+ */
+enum sgx_secinfo_page_state {
+	SGX_SECINFO_PENDING = (1 << 3),
+	SGX_SECINFO_MODIFIED = (1 << 4),
+	SGX_SECINFO_PR = (1 << 5),
+};
+
 struct vdso_symtab {
 	Elf64_Sym *elf_symtab;
 	const char *elf_symstrtab;
@@ -555,4 +567,210 @@ TEST_F(enclave, pte_permissions)
 	EXPECT_EQ(self->run.exception_addr, 0);
 }
 
+/*
+ * Enclave page permission test.
+ *
+ * Modify and restore enclave page's EPCM (enclave) permissions from
+ * outside enclave (ENCLS[EMODPR] via kernel) as well as from within
+ * enclave (via ENCLU[EMODPE]). Check for page fault if
+ * VMA allows access but EPCM permissions do not.
+ */
+TEST_F(enclave, epcm_permissions)
+{
+	struct sgx_enclave_restrict_permissions restrict_ioc;
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_eaccept eaccept_op;
+	struct encl_op_emodpe emodpe_op;
+	struct sgx_secinfo secinfo;
+	unsigned long data_start;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Ensure kernel supports needed ioctl() and system supports needed
+	 * commands.
+	 */
+	memset(&restrict_ioc, 0, sizeof(restrict_ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS,
+		    &restrict_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	ASSERT_EQ(ret, -1);
+
+	/* ret == -1 */
+	if (errno_save == ENOTTY)
+		SKIP(return,
+		     "Kernel does not support SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl()");
+	else if (errno_save == ENODEV)
+		SKIP(return, "System does not support SGX2");
+
+	/*
+	 * Page that will have its permissions changed is the second data
+	 * page in the .data segment. This forms part of the local encl_buffer
+	 * within the enclave.
+	 *
+	 * At start of test @data_start should have EPCM as well as PTE and
+	 * VMA permissions of RW.
+	 */
+
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before making
+	 * any changes to page permissions.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that
+	 * page is writable.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Change EPCM permissions to read-only. Kernel still considers
+	 * the page writable.
+	 */
+	memset(&restrict_ioc, 0, sizeof(restrict_ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = PROT_READ;
+	restrict_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	restrict_ioc.length = PAGE_SIZE;
+	restrict_ioc.secinfo = (unsigned long)&secinfo;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS,
+		    &restrict_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(restrict_ioc.result, 0);
+	EXPECT_EQ(restrict_ioc.count, 4096);
+
+	/*
+	 * EPCM permissions changed from kernel, need to EACCEPT from enclave.
+	 */
+	eaccept_op.epc_addr = data_start;
+	eaccept_op.flags = PROT_READ | SGX_SECINFO_REG | SGX_SECINFO_PR;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * EPCM permissions of page are now read-only, expect #PF
+	 * on EPCM when attempting to write to page from within enclave.
+	 */
+	put_addr_op.value = MAGIC2;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_error_code, 0x8007);
+	EXPECT_EQ(self->run.exception_addr, data_start);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/*
+	 * Received AEX but cannot return to enclave at same entrypoint,
+	 * need different TCS from where EPCM permission can be made writable
+	 * again.
+	 */
+	self->run.tcs = self->encl.encl_base + PAGE_SIZE;
+
+	/*
+	 * Enter enclave at new TCS to change EPCM permissions to be
+	 * writable again and thus fix the page fault that triggered the
+	 * AEX.
+	 */
+
+	emodpe_op.epc_addr = data_start;
+	emodpe_op.flags = PROT_READ | PROT_WRITE;
+	emodpe_op.header.type = ENCL_OP_EMODPE;
+
+	EXPECT_EQ(ENCL_CALL(&emodpe_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Attempt to return to main TCS to resume execution at faulting
+	 * instruction, PTE should continue to allow writing to the page.
+	 */
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Wrong page permissions that caused original fault have
+	 * now been fixed via EPCM permissions.
+	 * Resume execution in main TCS to re-attempt the memory access.
+	 */
+	self->run.tcs = self->encl.encl_base;
+
+	EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0,
+					 ERESUME, 0, 0,
+					 &self->run),
+		  0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	get_addr_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC2);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.user_data, 0);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+}
+
 TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c
index 4fca01cfd898..5b6c65331527 100644
--- a/tools/testing/selftests/sgx/test_encl.c
+++ b/tools/testing/selftests/sgx/test_encl.c
@@ -11,6 +11,42 @@
  */
 static uint8_t encl_buffer[8192] = { 1 };
 
+enum sgx_enclu_function {
+	EACCEPT = 0x5,
+	EMODPE = 0x6,
+};
+
+static void do_encl_emodpe(void *_op)
+{
+	struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
+	struct encl_op_emodpe *op = _op;
+
+	secinfo.flags = op->flags;
+
+	asm volatile(".byte 0x0f, 0x01, 0xd7"
+				:
+				: "a" (EMODPE),
+				  "b" (&secinfo),
+				  "c" (op->epc_addr));
+}
+
+static void do_encl_eaccept(void *_op)
+{
+	struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
+	struct encl_op_eaccept *op = _op;
+	int rax;
+
+	secinfo.flags = op->flags;
+
+	asm volatile(".byte 0x0f, 0x01, 0xd7"
+				: "=a" (rax)
+				: "a" (EACCEPT),
+				  "b" (&secinfo),
+				  "c" (op->epc_addr));
+
+	op->ret = rax;
+}
+
 static void *memcpy(void *dest, const void *src, size_t n)
 {
 	size_t i;
@@ -62,6 +98,8 @@ void encl_body(void *rdi,  void *rsi)
 		do_encl_op_put_to_addr,
 		do_encl_op_get_from_addr,
 		do_encl_op_nop,
+		do_encl_eaccept,
+		do_encl_emodpe,
 	};
 
 	struct encl_op_header *op = (struct encl_op_header *)rdi;
-- 
2.25.1


* [PATCH V3 22/30] selftests/sgx: Add test for TCS page permission changes
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (20 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 23/30] selftests/sgx: Test two different SGX2 EAUG flows Reinette Chatre
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Kernel should not allow permission changes on TCS pages. Add test to
confirm this behavior.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Update to use new struct name struct sgx_enclave_restrict_perm -> struct
  sgx_enclave_restrict_permissions. (Jarkko)

Changes since V1:
- Adapt test to the kernel interface changes: the ioctl() name change
  and providing entire secinfo as parameter.
- Rewrite error path to reduce line lengths.
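
For illustration only (not part of the patch): the support probe that
this test runs before the real request can be read as a classification
of the result of an ioctl() issued with a zeroed request. The sketch
below assumes the errno meanings the test itself relies on (ENOTTY:
kernel lacks the ioctl(), ENODEV: no SGX2). The enum and function names
are hypothetical.

```c
#include <errno.h>

/* Hypothetical names, not part of the selftest. */
enum probe_result {
	PROBE_NO_IOCTL,		/* kernel does not implement the ioctl() */
	PROBE_NO_SGX2,		/* system does not support SGX2 */
	PROBE_SUPPORTED,	/* request was parsed and rejected as invalid */
	PROBE_UNEXPECTED,	/* zeroed request did not fail */
};

/*
 * Classify the outcome of an ioctl() that was deliberately called with
 * invalid (all-zero) parameters. @saved_errno is errno as captured
 * right after the call.
 */
static enum probe_result classify_probe(int ret, int saved_errno)
{
	if (ret != -1)
		return PROBE_UNEXPECTED;
	if (saved_errno == ENOTTY)
		return PROBE_NO_IOCTL;
	if (saved_errno == ENODEV)
		return PROBE_NO_SGX2;
	return PROBE_SUPPORTED;
}
```

Only PROBE_SUPPORTED lets the test continue to the actual TCS request,
which is then expected to fail with EINVAL.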

 tools/testing/selftests/sgx/main.c | 74 ++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 0e0bd1c4d702..59573c1128c8 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -121,6 +121,24 @@ static Elf64_Sym *vdso_symtab_get(struct vdso_symtab *symtab, const char *name)
 	return NULL;
 }
 
+/*
+ * Return the offset in the enclave where the TCS segment can be found.
+ * The first RW segment loaded is the TCS.
+ */
+static off_t encl_get_tcs_offset(struct encl *encl)
+{
+	int i;
+
+	for (i = 0; i < encl->nr_segments; i++) {
+		struct encl_segment *seg = &encl->segment_tbl[i];
+
+		if (i == 0 && seg->prot == (PROT_READ | PROT_WRITE))
+			return seg->offset;
+	}
+
+	return -1;
+}
+
 /*
  * Return the offset in the enclave where the data segment can be found.
  * The first RW segment loaded is the TCS, skip that to get info on the
@@ -567,6 +585,62 @@ TEST_F(enclave, pte_permissions)
 	EXPECT_EQ(self->run.exception_addr, 0);
 }
 
+/*
+ * Modifying permissions of TCS page should not be possible.
+ */
+TEST_F(enclave, tcs_permissions)
+{
+	struct sgx_enclave_restrict_permissions ioc;
+	struct sgx_secinfo secinfo;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	memset(&ioc, 0, sizeof(ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	/*
+	 * Ensure kernel supports needed ioctl() and system supports needed
+	 * commands.
+	 */
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	ASSERT_EQ(ret, -1);
+
+	/* ret == -1 */
+	if (errno_save == ENOTTY)
+		SKIP(return,
+		     "Kernel does not support SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl()");
+	else if (errno_save == ENODEV)
+		SKIP(return, "System does not support SGX2");
+
+	/*
+	 * Attempt to make TCS page read-only. This is not allowed and
+	 * should be prevented by the kernel.
+	 */
+	secinfo.flags = PROT_READ;
+	ioc.offset = encl_get_tcs_offset(&self->encl);
+	ioc.length = PAGE_SIZE;
+	ioc.secinfo = (unsigned long)&secinfo;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, -1);
+	EXPECT_EQ(errno_save, EINVAL);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 0);
+}
+
 /*
  * Enclave page permission test.
  *
-- 
2.25.1


* [PATCH V3 23/30] selftests/sgx: Test two different SGX2 EAUG flows
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (21 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 22/30] selftests/sgx: Add test for TCS page " Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 24/30] selftests/sgx: Introduce dynamic entry point Reinette Chatre
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Enclave pages can be added to an initialized enclave when an address
belonging to the enclave but without a backing page is accessed from
within the enclave.

Accessing memory without a backing enclave page from within an enclave
can be done in different ways:
1) Pre-emptively run ENCLU[EACCEPT]. The addition of a page always
   needs to be accepted by the enclave via ENCLU[EACCEPT], which makes
   this flow efficient: the first execution of ENCLU[EACCEPT] triggers
   the addition of the page and, when execution returns to the same
   instruction, the second execution succeeds as the acceptance of
   the page.

2) A direct read or write. When a direct read or write triggers the
   page addition, execution cannot resume from the instruction
   (read/write) that triggered the fault. Instead, the enclave needs
   to be entered at a different entry point to run the needed
   ENCLU[EACCEPT] before execution can return to the original entry
   point and the read/write instruction that faulted.

Add tests for both flows.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Add inline comment to the mmap() call used in both EAUG tests
  to explain why the mmap() is expected to succeed. (Jarkko)

Changes since V1:
- Replace __cpuid() definition and usage with __cpuid_count(). (Reinette)
- Fix accuracy of comments.
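
For illustration only (not part of the patch): the two EAUG tests tell
their outcomes apart by the page fault error code reported in the run
struct (0x8007, 0x6 and 4 in the tests). A minimal sketch of how those
values decode, assuming the architectural x86 #PF error code bits. The
macro and helper names are local to this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local names for x86 #PF error code bits (illustrative). */
#define PF_PROT		(1u << 0)	/* 1: protection fault, 0: not present */
#define PF_WRITE	(1u << 1)	/* 1: write access, 0: read access */
#define PF_USER		(1u << 2)	/* 1: access from user mode */
#define PF_SGX		(1u << 15)	/* 1: SGX access control violation */

/* Fault raised by SGX access control, e.g. a page not yet accepted. */
static bool pf_is_sgx(uint32_t code)
{
	return (code & PF_SGX) != 0;
}

/* Fault on an address that has no present page table entry. */
static bool pf_is_not_present(uint32_t code)
{
	return (code & PF_PROT) == 0;
}

/* Fault caused by a write access. */
static bool pf_is_write(uint32_t code)
{
	return (code & PF_WRITE) != 0;
}
```

With these helpers 0x8007 reads as a user-mode write to a present page
that SGX rejected, while 0x6 and 4 read as a plain not-present fault on
a write and on a read respectively, which is what the tests treat as
"kernel does not support adding pages to initialized enclave".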

 tools/testing/selftests/sgx/main.c | 250 +++++++++++++++++++++++++++++
 1 file changed, 250 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 59573c1128c8..d52637eb5131 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -86,6 +86,15 @@ static bool vdso_get_symtab(void *addr, struct vdso_symtab *symtab)
 	return true;
 }
 
+static inline int sgx2_supported(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	__cpuid_count(SGX_CPUID, 0x0, eax, ebx, ecx, edx);
+
+	return eax & 0x2;
+}
+
 static unsigned long elf_sym_hash(const char *name)
 {
 	unsigned long h = 0, high;
@@ -847,4 +856,245 @@ TEST_F(enclave, epcm_permissions)
 	EXPECT_EQ(self->run.exception_addr, 0);
 }
 
+/*
+ * Test the addition of pages to an initialized enclave via writing to
+ * a page that belongs to the enclave's address space but was not added
+ * during enclave creation.
+ */
+TEST_F(enclave, augment)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_eaccept eaccept_op;
+	size_t total_size = 0;
+	void *addr;
+	int i;
+
+	if (!sgx2_supported())
+		SKIP(return, "SGX2 not supported");
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	for (i = 0; i < self->encl.nr_segments; i++) {
+		struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+		total_size += seg->size;
+	}
+
+	/*
+	 * Actual enclave size is expected to be larger than the loaded
+	 * test enclave since enclave size must be a power of 2 in bytes
+	 * and test_encl does not consume it all.
+	 */
+	EXPECT_LT(total_size + PAGE_SIZE, self->encl.encl_size);
+
+	/*
+	 * Create memory mapping for the page that will be added. New
+	 * memory mapping is for one page right after all existing
+	 * mappings.
+	 * Kernel will allow new mapping using any permissions if it
+	 * falls into the enclave's address range but is not backed
+	 * by existing enclave pages.
+	 */
+	addr = mmap((void *)self->encl.encl_base + total_size, PAGE_SIZE,
+		    PROT_READ | PROT_WRITE | PROT_EXEC,
+		    MAP_SHARED | MAP_FIXED, self->encl.fd, 0);
+	EXPECT_NE(addr, MAP_FAILED);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/*
+	 * Attempt to write to the new page from within enclave.
+	 * Expected to fail since page is not (yet) part of the enclave.
+	 * The first #PF will trigger the addition of the page to the
+	 * enclave, but since the new page needs an EACCEPT from within the
+	 * enclave before it can be used it would not be possible
+	 * to successfully return to the failing instruction. This is the
+	 * cause of the second #PF captured here having the SGX bit set:
+	 * it comes from hardware preventing the page from being used.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = (unsigned long)addr;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_addr, (unsigned long)addr);
+
+	if (self->run.exception_error_code == 0x6) {
+		munmap(addr, PAGE_SIZE);
+		SKIP(return, "Kernel does not support adding pages to initialized enclave");
+	}
+
+	EXPECT_EQ(self->run.exception_error_code, 0x8007);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/* Handle AEX by running EACCEPT from new entry point. */
+	self->run.tcs = self->encl.encl_base + PAGE_SIZE;
+
+	eaccept_op.epc_addr = self->encl.encl_base + total_size;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Can now return to main TCS to resume execution. */
+	self->run.tcs = self->encl.encl_base;
+
+	EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0,
+					 ERESUME, 0, 0,
+					 &self->run),
+		  0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory from newly added page that was just written to,
+	 * confirming that data previously written (MAGIC) is present.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = (unsigned long)addr;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	munmap(addr, PAGE_SIZE);
+}
+
+/*
+ * Test for the addition of pages to an initialized enclave via a
+ * pre-emptive run of EACCEPT on page to be added.
+ */
+TEST_F(enclave, augment_via_eaccept)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_eaccept eaccept_op;
+	size_t total_size = 0;
+	void *addr;
+	int i;
+
+	if (!sgx2_supported())
+		SKIP(return, "SGX2 not supported");
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	for (i = 0; i < self->encl.nr_segments; i++) {
+		struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+		total_size += seg->size;
+	}
+
+	/*
+	 * Actual enclave size is expected to be larger than the loaded
+	 * test enclave since enclave size must be a power of 2 in bytes while
+	 * test_encl does not consume it all.
+	 */
+	EXPECT_LT(total_size + PAGE_SIZE, self->encl.encl_size);
+
+	/*
+	 * mmap() a page at end of existing enclave to be used for dynamic
+	 * EPC page.
+	 *
+	 * Kernel will allow new mapping using any permissions if it
+	 * falls into the enclave's address range but is not backed
+	 * by existing enclave pages.
+	 */
+
+	addr = mmap((void *)self->encl.encl_base + total_size, PAGE_SIZE,
+		    PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED | MAP_FIXED,
+		    self->encl.fd, 0);
+	EXPECT_NE(addr, MAP_FAILED);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	/*
+	 * Run EACCEPT on new page to trigger the #PF->EAUG->EACCEPT (again
+	 * without a #PF) flow. All should be transparent to userspace.
+	 */
+	eaccept_op.epc_addr = self->encl.encl_base + total_size;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	if (self->run.exception_vector == 14 &&
+	    self->run.exception_error_code == 4 &&
+	    self->run.exception_addr == self->encl.encl_base + total_size) {
+		munmap(addr, PAGE_SIZE);
+		SKIP(return, "Kernel does not support adding pages to initialized enclave");
+	}
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * New page should be accessible from within enclave - attempt to
+	 * write to it.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = (unsigned long)addr;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory from newly added page that was just written to,
+	 * confirming that data previously written (MAGIC) is present.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = (unsigned long)addr;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	munmap(addr, PAGE_SIZE);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1


* [PATCH V3 24/30] selftests/sgx: Introduce dynamic entry point
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (22 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 23/30] selftests/sgx: Test two different SGX2 EAUG flows Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 25/30] selftests/sgx: Introduce TCS initialization enclave operation Reinette Chatre
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The test enclave (test_encl.elf) is built with two initialized
Thread Control Structures (TCS) included in the binary. Both TCS are
initialized with the same entry point, encl_entry, that correctly
computes the absolute address of the stack based on the stack of each
TCS that is also built into the binary.

A new TCS can be added dynamically to the enclave and must be
initialized with an entry point used to enter the enclave. Since the
existing entry point, encl_entry, assumes that the TCS and its stack
exist at particular offsets within the binary, it is not able to handle
a dynamically added TCS and its stack.

Introduce a new entry point, encl_dyn_entry, that initializes the
absolute address of that thread's stack to the address immediately
preceding the TCS itself. It is now possible to dynamically add a
contiguous memory region to the enclave with the new stack preceding
the new TCS. With the new TCS initialized with encl_dyn_entry as its
entry point, the absolute address of the stack is computed correctly on
entry.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

No changes since V1
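
For illustration only (not part of the patch): the address arithmetic
behind encl_dyn_entry, written out in C. The sketch assumes a
two-page region with the stack page directly followed by the TCS page.
The struct and helper names are hypothetical, and only the
"TCS address minus one" step mirrors the new lea instruction.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL

/* Hypothetical description of a dynamically added thread region. */
struct dyn_thread_layout {
	uint64_t stack_page;	/* lowest address of the stack page */
	uint64_t tcs_page;	/* TCS page, directly after the stack */
};

/* Place the stack page first and the TCS page right behind it. */
static struct dyn_thread_layout layout_dyn_thread(uint64_t region_base)
{
	struct dyn_thread_layout l = {
		.stack_page = region_base,
		.tcs_page = region_base + PAGE_SIZE,
	};

	return l;
}

/* Same computation as "lea -1(%rbx), %rax" with the TCS address in %rbx. */
static uint64_t dyn_entry_stack_addr(uint64_t tcs_addr)
{
	return tcs_addr - 1;
}

/* True if @addr lies within the page that starts at @page. */
static bool addr_in_page(uint64_t addr, uint64_t page)
{
	return addr >= page && addr < page + PAGE_SIZE;
}
```

Under this layout the computed address is the last byte of the stack
page, so the stack grows downward away from the TCS.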

 tools/testing/selftests/sgx/test_encl_bootstrap.S | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/sgx/test_encl_bootstrap.S b/tools/testing/selftests/sgx/test_encl_bootstrap.S
index 82fb0dfcbd23..03ae0f57e29d 100644
--- a/tools/testing/selftests/sgx/test_encl_bootstrap.S
+++ b/tools/testing/selftests/sgx/test_encl_bootstrap.S
@@ -45,6 +45,12 @@ encl_entry:
 	# TCS #2. By adding the value of encl_stack to it, we get
 	# the absolute address for the stack.
 	lea	(encl_stack)(%rbx), %rax
+	jmp encl_entry_core
+encl_dyn_entry:
+	# Entry point for dynamically created TCS page expected to follow
+	# its stack directly.
+	lea -1(%rbx), %rax
+encl_entry_core:
 	xchg	%rsp, %rax
 	push	%rax
 
-- 
2.25.1


* [PATCH V3 25/30] selftests/sgx: Introduce TCS initialization enclave operation
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (23 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 24/30] selftests/sgx: Introduce dynamic entry point Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 26/30] selftests/sgx: Test complete changing of page type flow Reinette Chatre
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

The Thread Control Structure (TCS) contains meta-data used by the
hardware to save and restore thread specific information when
entering/exiting the enclave. A TCS can be added to an initialized
enclave by first adding a new regular enclave page, initializing the
content of the new page from within the enclave, and then changing that
page's type to a TCS.

Support the initialization of a TCS from within the enclave.
The information that varies per TCS and must be provided from outside
the enclave is the address of the TCS, the address of the State Save
Area (SSA), and the entry point that the thread should use to enter the
enclave. With this information provided, all needed fields of a TCS
can be initialized.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
No changes since V2

No changes since V1
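
For illustration only (not part of the patch): the byte offsets used by
do_encl_init_tcs_page() correspond to the following page layout. The
struct below is a sketch with hypothetical field names. Only the
offsets, sizes and initial values are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct; field names are illustrative. */
struct tcs_layout {
	uint64_t state;		/* offset 0 */
	uint64_t flags;		/* offset 8 */
	uint64_t ossa;		/* offset 16 */
	uint32_t cssa;		/* offset 24 */
	uint32_t nssa;		/* offset 28 */
	uint64_t oentry;	/* offset 32 */
	uint64_t aep;		/* offset 40 */
	uint64_t ofsbase;	/* offset 48 */
	uint64_t ogsbase;	/* offset 56 */
	uint32_t fslimit;	/* offset 64 */
	uint32_t gslimit;	/* offset 68 */
	uint8_t reserved[4024];	/* offset 72 */
};

_Static_assert(offsetof(struct tcs_layout, ossa) == 16, "OSSA offset");
_Static_assert(offsetof(struct tcs_layout, nssa) == 28, "NSSA offset");
_Static_assert(offsetof(struct tcs_layout, oentry) == 32, "OENTRY offset");
_Static_assert(offsetof(struct tcs_layout, fslimit) == 64, "FSLIMIT offset");
_Static_assert(offsetof(struct tcs_layout, reserved) == 72, "reserved offset");
_Static_assert(sizeof(struct tcs_layout) == 4096, "TCS fills one page");

/* Same field values as the memset()/memcpy() sequence in the patch. */
static void tcs_layout_init(struct tcs_layout *tcs, uint64_t ssa,
			    uint64_t entry)
{
	memset(tcs, 0, sizeof(*tcs));
	tcs->ossa = ssa;
	tcs->nssa = 1;
	tcs->oentry = entry;
	tcs->fslimit = 0xFFFFFFFF;
	tcs->gslimit = 0xFFFFFFFF;
}
```

The enclave code itself uses raw offsets because it is built without
the C library, hence the local memset() added by this patch.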

 tools/testing/selftests/sgx/defines.h   |  8 +++++++
 tools/testing/selftests/sgx/test_encl.c | 30 +++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
index b638eb98c80c..d8587c971941 100644
--- a/tools/testing/selftests/sgx/defines.h
+++ b/tools/testing/selftests/sgx/defines.h
@@ -26,6 +26,7 @@ enum encl_op_type {
 	ENCL_OP_NOP,
 	ENCL_OP_EACCEPT,
 	ENCL_OP_EMODPE,
+	ENCL_OP_INIT_TCS_PAGE,
 	ENCL_OP_MAX,
 };
 
@@ -68,4 +69,11 @@ struct encl_op_emodpe {
 	uint64_t flags;
 };
 
+struct encl_op_init_tcs_page {
+	struct encl_op_header header;
+	uint64_t tcs_page;
+	uint64_t ssa;
+	uint64_t entry;
+};
+
 #endif /* DEFINES_H */
diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c
index 5b6c65331527..c0d6397295e3 100644
--- a/tools/testing/selftests/sgx/test_encl.c
+++ b/tools/testing/selftests/sgx/test_encl.c
@@ -57,6 +57,35 @@ static void *memcpy(void *dest, const void *src, size_t n)
 	return dest;
 }
 
+static void *memset(void *dest, int c, size_t n)
+{
+	size_t i;
+
+	for (i = 0; i < n; i++)
+		((char *)dest)[i] = c;
+
+	return dest;
+}
+
+static void do_encl_init_tcs_page(void *_op)
+{
+	struct encl_op_init_tcs_page *op = _op;
+	void *tcs = (void *)op->tcs_page;
+	uint32_t val_32;
+
+	memset(tcs, 0, 16);			/* STATE and FLAGS */
+	memcpy(tcs + 16, &op->ssa, 8);		/* OSSA */
+	memset(tcs + 24, 0, 4);			/* CSSA */
+	val_32 = 1;
+	memcpy(tcs + 28, &val_32, 4);		/* NSSA */
+	memcpy(tcs + 32, &op->entry, 8);	/* OENTRY */
+	memset(tcs + 40, 0, 24);		/* AEP, OFSBASE, OGSBASE */
+	val_32 = 0xFFFFFFFF;
+	memcpy(tcs + 64, &val_32, 4);		/* FSLIMIT */
+	memcpy(tcs + 68, &val_32, 4);		/* GSLIMIT */
+	memset(tcs + 72, 0, 4024);		/* Reserved */
+}
+
 static void do_encl_op_put_to_buf(void *op)
 {
 	struct encl_op_put_to_buf *op2 = op;
@@ -100,6 +129,7 @@ void encl_body(void *rdi,  void *rsi)
 		do_encl_op_nop,
 		do_encl_eaccept,
 		do_encl_emodpe,
+		do_encl_init_tcs_page,
 	};
 
 	struct encl_op_header *op = (struct encl_op_header *)rdi;
-- 
2.25.1


* [PATCH V3 26/30] selftests/sgx: Test complete changing of page type flow
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (24 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 25/30] selftests/sgx: Introduce TCS initialization enclave operation Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 27/30] selftests/sgx: Test faulty enclave behavior Reinette Chatre
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Support for changing an enclave page's type enables an initialized
enclave to be expanded with support for more threads by changing the
type of a regular enclave page to that of a Thread Control Structure
(TCS).  Additionally, being able to change a TCS or regular enclave
page's type to be trimmed (SGX_PAGE_TYPE_TRIM) initiates the removal
of the page from the enclave.

Test changing page type to TCS as well as page removal flows
in two phases: In the first phase support for a new thread is
dynamically added to an initialized enclave and in the second phase
the pages associated with the new thread are removed from the enclave.
As an additional sanity check after the second phase, the page used as
a TCS page during the first phase is added back as a regular page and
the test confirms that it can be written to (which would not be
possible if it were still a TCS page).

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Rename struct sgx_enclave_modt -> struct sgx_enclave_modify_type

Changes since V1:
- Update to support ioctl() name change (SGX_IOC_PAGE_MODT ->
  SGX_IOC_ENCLAVE_MODIFY_TYPE) and provide secinfo as parameter instead
  of just page type (Jarkko).
- Update test to reflect page removal ioctl() and struct name change:
  SGX_IOC_PAGE_REMOVE->SGX_IOC_ENCLAVE_REMOVE_PAGES,
  struct sgx_page_remove -> struct sgx_enclave_remove_pages (Jarkko).
- Use ioctl() instead of ioctl (Dave).
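
For illustration only (not part of the patch): every step of the flows
in this series ends with an ENCLU[EACCEPT] whose SECINFO flags must
describe the change being accepted. The sketch below shows how such
flag words are composed. The PENDING, MODIFIED and PR bits are the ones
defined earlier in this series. The R/W bits and the page type values
shifted into bits 15:8 are assumptions based on the SGX architecture,
and all names are local to this sketch.

```c
#include <stdint.h>

#define SECINFO_R		(1ULL << 0)
#define SECINFO_W		(1ULL << 1)
#define SECINFO_PENDING		(1ULL << 3)
#define SECINFO_MODIFIED	(1ULL << 4)
#define SECINFO_PR		(1ULL << 5)
#define SECINFO_PT_SHIFT	8

/* Assumed architectural page type values. */
enum page_type {
	PT_TCS = 1,
	PT_REG = 2,
	PT_TRIM = 4,
};

/* Page type as it appears in bits 15:8 of the SECINFO flags. */
static uint64_t secinfo_pt(enum page_type pt)
{
	return (uint64_t)pt << SECINFO_PT_SHIFT;
}

/* Accept a newly added (EAUG) regular page with RW permissions. */
static uint64_t eaccept_flags_new_page(void)
{
	return SECINFO_R | SECINFO_W | secinfo_pt(PT_REG) | SECINFO_PENDING;
}

/* Accept a restriction of a regular page to read-only. */
static uint64_t eaccept_flags_restrict_ro(void)
{
	return SECINFO_R | secinfo_pt(PT_REG) | SECINFO_PR;
}

/* Accept a page type change, e.g. to PT_TCS or PT_TRIM. */
static uint64_t eaccept_flags_type_change(enum page_type new_type)
{
	return secinfo_pt(new_type) | SECINFO_MODIFIED;
}
```

The first two helpers reproduce the flag combinations used by the
earlier EAUG and EPCM permission tests in this series.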

 tools/testing/selftests/sgx/load.c |  41 ++++
 tools/testing/selftests/sgx/main.c | 347 +++++++++++++++++++++++++++++
 tools/testing/selftests/sgx/main.h |   1 +
 3 files changed, 389 insertions(+)

diff --git a/tools/testing/selftests/sgx/load.c b/tools/testing/selftests/sgx/load.c
index 006b464c8fc9..94bdeac1cf04 100644
--- a/tools/testing/selftests/sgx/load.c
+++ b/tools/testing/selftests/sgx/load.c
@@ -130,6 +130,47 @@ static bool encl_ioc_add_pages(struct encl *encl, struct encl_segment *seg)
 	return true;
 }
 
+/*
+ * Parse the enclave code's symbol table to locate and return the address
+ * of the provided symbol. Returns 0 if the symbol cannot be found.
+ */
+uint64_t encl_get_entry(struct encl *encl, const char *symbol)
+{
+	Elf64_Shdr *sections;
+	Elf64_Sym *symtab = NULL;
+	Elf64_Ehdr *ehdr;
+	char *sym_names = NULL;
+	int num_sym = 0;
+	int i;
+
+	ehdr = encl->bin;
+	sections = encl->bin + ehdr->e_shoff;
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (sections[i].sh_type == SHT_SYMTAB) {
+			symtab = (Elf64_Sym *)((char *)encl->bin + sections[i].sh_offset);
+			num_sym = sections[i].sh_size / sections[i].sh_entsize;
+			break;
+		}
+	}
+
+	for (i = 0; i < ehdr->e_shnum; i++) {
+		if (sections[i].sh_type == SHT_STRTAB) {
+			sym_names = (char *)encl->bin + sections[i].sh_offset;
+			break;
+		}
+	}
+
+	for (i = 0; sym_names && i < num_sym; i++) {
+		Elf64_Sym *sym = &symtab[i];
+
+		if (!strcmp(symbol, sym_names + sym->st_name))
+			return (uint64_t)sym->st_value;
+	}
+
+	return 0;
+}
+
 bool encl_load(const char *path, struct encl *encl, unsigned long heap_size)
 {
 	const char device_path[] = "/dev/sgx_enclave";
diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index d52637eb5131..17ade940425b 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1097,4 +1097,351 @@ TEST_F(enclave, augment_via_eaccept)
 	munmap(addr, PAGE_SIZE);
 }
 
+/*
+ * SGX2 page type modification test in two phases:
+ * Phase 1:
+ * Create a new TCS, consisting of three new pages (stack page with regular
+ * page type, SSA page with regular page type, and TCS page with TCS page
+ * type) in an initialized enclave and run a simple workload within it.
+ * Phase 2:
+ * Remove the three pages added in phase 1, add a new regular page at the
+ * same address that previously hosted the TCS page and verify that it can
+ * be modified.
+ */
+TEST_F(enclave, tcs_create)
+{
+	struct encl_op_init_tcs_page init_tcs_page_op;
+	struct sgx_enclave_remove_pages remove_ioc;
+	struct encl_op_get_from_addr get_addr_op;
+	struct sgx_enclave_modify_type modt_ioc;
+	struct encl_op_put_to_addr put_addr_op;
+	struct encl_op_get_from_buf get_buf_op;
+	struct encl_op_put_to_buf put_buf_op;
+	void *addr, *tcs, *stack_end, *ssa;
+	struct encl_op_eaccept eaccept_op;
+	struct sgx_secinfo secinfo;
+	size_t total_size = 0;
+	uint64_t val_64;
+	int errno_save;
+	int ret, i;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl,
+				    _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and kernel support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl()");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Add three regular pages via EAUG: one will be the TCS stack, one
+	 * will be the TCS SSA, and one will be the new TCS. The stack and
+	 * SSA will remain as regular pages, the TCS page will need its
+	 * type changed after being populated with the needed data.
+	 */
+	for (i = 0; i < self->encl.nr_segments; i++) {
+		struct encl_segment *seg = &self->encl.segment_tbl[i];
+
+		total_size += seg->size;
+	}
+
+	/*
+	 * Actual enclave size is expected to be larger than the loaded
+	 * test enclave since enclave size must be a power of 2 in bytes while
+	 * test_encl does not consume it all.
+	 */
+	EXPECT_LT(total_size + 3 * PAGE_SIZE, self->encl.encl_size);
+
+	/*
+	 * mmap() three pages at end of existing enclave to be used for the
+	 * three new pages.
+	 */
+	addr = mmap((void *)self->encl.encl_base + total_size, 3 * PAGE_SIZE,
+		    PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
+		    self->encl.fd, 0);
+	EXPECT_NE(addr, MAP_FAILED);
+
+	self->run.exception_vector = 0;
+	self->run.exception_error_code = 0;
+	self->run.exception_addr = 0;
+
+	stack_end = (void *)self->encl.encl_base + total_size;
+	tcs = (void *)self->encl.encl_base + total_size + PAGE_SIZE;
+	ssa = (void *)self->encl.encl_base + total_size + 2 * PAGE_SIZE;
+
+	/*
+	 * Run EACCEPT on each new page to trigger the
+	 * EACCEPT->(#PF)->EAUG->EACCEPT(again without a #PF) flow.
+	 */
+
+	eaccept_op.epc_addr = (unsigned long)stack_end;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	if (self->run.exception_vector == 14 &&
+	    self->run.exception_error_code == 4 &&
+	    self->run.exception_addr == (unsigned long)stack_end) {
+		munmap(addr, 3 * PAGE_SIZE);
+		SKIP(return, "Kernel does not support adding pages to initialized enclave");
+	}
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)ssa;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)tcs;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * Three new pages added to enclave. Now populate the TCS page with
+	 * needed data. This should be done from within enclave. Provide
+	 * the function that will do the actual data population with needed
+	 * data.
+	 */
+
+	/*
+	 * New TCS will use the "encl_dyn_entry" entrypoint that expects
+	 * stack to begin in page before TCS page.
+	 */
+	val_64 = encl_get_entry(&self->encl, "encl_dyn_entry");
+	EXPECT_NE(val_64, 0);
+
+	init_tcs_page_op.tcs_page = (unsigned long)tcs;
+	init_tcs_page_op.ssa = (unsigned long)total_size + 2 * PAGE_SIZE;
+	init_tcs_page_op.entry = val_64;
+	init_tcs_page_op.header.type = ENCL_OP_INIT_TCS_PAGE;
+
+	EXPECT_EQ(ENCL_CALL(&init_tcs_page_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Change TCS page type to TCS. */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = SGX_PAGE_TYPE_TCS << 8;
+	modt_ioc.offset = total_size + PAGE_SIZE;
+	modt_ioc.length = PAGE_SIZE;
+	modt_ioc.secinfo = (unsigned long)&secinfo;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 4096);
+
+	/* EACCEPT new TCS page from enclave. */
+	eaccept_op.epc_addr = (unsigned long)tcs;
+	eaccept_op.flags = SGX_SECINFO_TCS | SGX_SECINFO_MODIFIED;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Run workload from new TCS. */
+	self->run.tcs = (unsigned long)tcs;
+
+	/*
+	 * Simple workload to write to data buffer and read value back.
+	 */
+	put_buf_op.header.type = ENCL_OP_PUT_TO_BUFFER;
+	put_buf_op.value = MAGIC;
+
+	EXPECT_EQ(ENCL_CALL(&put_buf_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	get_buf_op.header.type = ENCL_OP_GET_FROM_BUFFER;
+	get_buf_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_buf_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_buf_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Phase 2 of test:
+	 * Remove pages associated with new TCS, create a regular page
+	 * where TCS page used to be and verify it can be used as a regular
+	 * page.
+	 */
+
+	/* Start page removal by requesting change of page type to PT_TRIM. */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = SGX_PAGE_TYPE_TRIM << 8;
+	modt_ioc.offset = total_size;
+	modt_ioc.length = 3 * PAGE_SIZE;
+	modt_ioc.secinfo = (unsigned long)&secinfo;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 3 * PAGE_SIZE);
+
+	/*
+	 * Enter enclave via TCS #1 and approve page removal by sending
+	 * EACCEPT for each of three removed pages.
+	 */
+	self->run.tcs = self->encl.encl_base;
+
+	eaccept_op.epc_addr = (unsigned long)stack_end;
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)tcs;
+	eaccept_op.ret = 0;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	eaccept_op.epc_addr = (unsigned long)ssa;
+	eaccept_op.ret = 0;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Send final ioctl() to complete page removal. */
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = total_size;
+	remove_ioc.length = 3 * PAGE_SIZE;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_REMOVE_PAGES, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(remove_ioc.count, 3 * PAGE_SIZE);
+
+	/*
+	 * Enter enclave via TCS #1 and access location where TCS #3 was to
+	 * trigger dynamic add of regular page at that location.
+	 */
+	eaccept_op.epc_addr = (unsigned long)tcs;
+	eaccept_op.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_REG | SGX_SECINFO_PENDING;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/*
+	 * New page should be accessible from within enclave - write to it.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = (unsigned long)tcs;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory from newly added page that was just written to,
+	 * confirming that data previously written (MAGIC) is present.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = (unsigned long)tcs;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	munmap(addr, 3 * PAGE_SIZE);
+}
+
 TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/sgx/main.h b/tools/testing/selftests/sgx/main.h
index b45c52ec7ab3..fc585be97e2f 100644
--- a/tools/testing/selftests/sgx/main.h
+++ b/tools/testing/selftests/sgx/main.h
@@ -38,6 +38,7 @@ void encl_delete(struct encl *ctx);
 bool encl_load(const char *path, struct encl *encl, unsigned long heap_size);
 bool encl_measure(struct encl *encl);
 bool encl_build(struct encl *encl);
+uint64_t encl_get_entry(struct encl *encl, const char *symbol);
 
 int sgx_enter_enclave(void *rdi, void *rsi, long rdx, u32 function, void *r8, void *r9,
 		      struct sgx_enclave_run *run);
-- 
2.25.1



* [PATCH V3 27/30] selftests/sgx: Test faulty enclave behavior
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (25 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 26/30] selftests/sgx: Test complete changing of page type flow Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 28/30] selftests/sgx: Test invalid access to removed enclave page Reinette Chatre
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Removing a page from an initialized enclave involves three steps:
first, the user requests changing the page type to SGX_PAGE_TYPE_TRIM
via an ioctl(); on success, the ENCLU[EACCEPT] instruction needs to be
run from within the enclave to accept the page removal; finally, the
user requests page removal to be completed via an ioctl(). Only after
acceptance (ENCLU[EACCEPT]) from within the enclave can the kernel
remove the page from a running enclave.

Test the behavior when the user's request to change the page type
succeeds, but the ENCLU[EACCEPT] instruction is not run before the
ioctl() requesting page removal is run. This should not be permitted.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Rename struct sgx_enclave_modt -> struct sgx_enclave_modify_type

Changes since V1:
- Update to support ioctl() name change (SGX_IOC_PAGE_MODT ->
  SGX_IOC_ENCLAVE_MODIFY_TYPE) and provide secinfo as parameter instead
  of just page type (Jarkko).
- Update test to reflect page removal ioctl() and struct name change:
  SGX_IOC_PAGE_REMOVE->SGX_IOC_ENCLAVE_REMOVE_PAGES,
  struct sgx_page_remove -> struct sgx_enclave_remove_pages (Jarkko).
- Use ioctl() instead of ioctl in text (Dave).
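
The three-step removal flow described in the changelog above can be
sketched as a small state machine. This is only a user-space model meant
to illustrate the ordering rule that the new test exercises. The ex_
names are hypothetical, and the -EINVAL returns for out-of-order calls
are an assumption; only the -EPERM result for removal without EACCEPT
comes from the test itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space model of one enclave page; not kernel code. */
enum ex_page_state {
	EX_PAGE_REG,		/* regular, in-use page */
	EX_PAGE_TRIM_PENDING,	/* type changed to TRIM, awaiting EACCEPT */
	EX_PAGE_TRIM_ACCEPTED,	/* enclave ran EACCEPT on the trimmed page */
	EX_PAGE_REMOVED,	/* page released */
};

struct ex_page {
	enum ex_page_state state;
};

/* Step 1: models the SGX_IOC_ENCLAVE_MODIFY_TYPE request for TRIM. */
static inline int ex_modify_type_trim(struct ex_page *p)
{
	assert(p != NULL);

	if (p->state != EX_PAGE_REG)
		return -EINVAL;	/* assumed error for this model */

	p->state = EX_PAGE_TRIM_PENDING;
	return 0;
}

/* Step 2: models ENCLU[EACCEPT] run from within the enclave. */
static inline int ex_eaccept(struct ex_page *p)
{
	assert(p != NULL);

	if (p->state != EX_PAGE_TRIM_PENDING)
		return -EINVAL;	/* assumed error for this model */

	p->state = EX_PAGE_TRIM_ACCEPTED;
	return 0;
}

/* Step 3: models SGX_IOC_ENCLAVE_REMOVE_PAGES; refused without step 2. */
static inline int ex_remove_page(struct ex_page *p)
{
	assert(p != NULL);

	if (p->state != EX_PAGE_TRIM_ACCEPTED)
		return -EPERM;

	p->state = EX_PAGE_REMOVED;
	return 0;
}
```

Calling ex_modify_type_trim() followed directly by ex_remove_page()
models the sequence in remove_added_page_no_eaccept: the removal is
refused and the page stays in the pending state until ex_eaccept() runs.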

 tools/testing/selftests/sgx/main.c | 116 +++++++++++++++++++++++++++++
 1 file changed, 116 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 17ade940425b..f6a8e2dd4a23 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1444,4 +1444,120 @@ TEST_F(enclave, tcs_create)
 	munmap(addr, 3 * PAGE_SIZE);
 }
 
+/*
+ * Ensure sane behavior if user requests page removal, does not run
+ * EACCEPT from within enclave but still attempts to finalize page removal
+ * with the SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl(). The latter should fail
+ * because the removal was not EACCEPTed from within the enclave.
+ */
+TEST_F(enclave, remove_added_page_no_eaccept)
+{
+	struct sgx_enclave_remove_pages remove_ioc;
+	struct encl_op_get_from_addr get_addr_op;
+	struct sgx_enclave_modify_type modt_ioc;
+	struct encl_op_put_to_addr put_addr_op;
+	struct sgx_secinfo secinfo;
+	unsigned long data_start;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and kernel support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl()");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Page that will be removed is the second data page in the .data
+	 * segment. This forms part of the local encl_buffer within the
+	 * enclave.
+	 */
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before
+	 * removing it.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Start page removal by requesting change of page type to PT_TRIM */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = SGX_PAGE_TYPE_TRIM << 8;
+	modt_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	modt_ioc.length = PAGE_SIZE;
+	modt_ioc.secinfo = (unsigned long)&secinfo;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 4096);
+
+	/* Skip EACCEPT */
+
+	/* Send final ioctl() to complete page removal */
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	remove_ioc.length = PAGE_SIZE;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_REMOVE_PAGES, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	/* Operation not permitted since EACCEPT was omitted. */
+	EXPECT_EQ(ret, -1);
+	EXPECT_EQ(errno_save, EPERM);
+	EXPECT_EQ(remove_ioc.count, 0);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1



* [PATCH V3 28/30] selftests/sgx: Test invalid access to removed enclave page
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (26 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 27/30] selftests/sgx: Test faulty enclave behavior Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 29/30] selftests/sgx: Test reclaiming of untouched page Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 30/30] selftests/sgx: Page removal stress test Reinette Chatre
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Removing a page from an initialized enclave involves three steps:
(1) the user requests changing the page type to SGX_PAGE_TYPE_TRIM
via the SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl(), (2) on success the
ENCLU[EACCEPT] instruction is run from within the enclave to accept
the page removal, (3) the user initiates the actual removal of the
page via the SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl().

Test two possible invalid accesses during the page removal flow:
* Test the behavior when a request to remove the page by changing its
  type to SGX_PAGE_TYPE_TRIM completes successfully but instead of
  executing ENCLU[EACCEPT] from within the enclave the enclave attempts
  to read from the page. Even though the page is accessible from the
  page table entries its type is SGX_PAGE_TYPE_TRIM and thus not
  accessible according to SGX. The expected behavior is a page fault
  with the SGX flag set in the error code.
* Test the behavior when the page type is changed successfully and
  ENCLU[EACCEPT] was run from within the enclave. The final ioctl(),
  SGX_IOC_ENCLAVE_REMOVE_PAGES, is omitted and replaced with an
  attempt to access the page. Even though the page is accessible
  from the page table entries its type is SGX_PAGE_TYPE_TRIM and
  thus not accessible according to SGX.  The expected behavior is
  a page fault with the SGX flag set in the error code.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Rename struct sgx_enclave_modt -> struct sgx_enclave_modify_type

Changes since V1:
- Update to support ioctl() name change (SGX_IOC_PAGE_MODT ->
  SGX_IOC_ENCLAVE_MODIFY_TYPE) and provide secinfo as parameter instead
  of just page type (Jarkko).
- Use ioctl() instead of ioctl (Dave).
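
Both tests described above expect a page fault error code of 0x8005. A
minimal sketch of how that value decodes, assuming the usual x86 #PF
error code bit layout (bit 0 present/protection, bit 1 write, bit 2 user
mode, bit 15 SGX). The EX_ names and the helper functions are
illustrative only. The value 4 is included for comparison because the
earlier tcs_create test checks for it when a page is simply not present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits relevant here (names are illustrative). */
#define EX_PF_PROT	(1u << 0)	/* fault on a present page */
#define EX_PF_WRITE	(1u << 1)	/* access was a write */
#define EX_PF_USER	(1u << 2)	/* access came from user mode */
#define EX_PF_SGX	(1u << 15)	/* SGX access-control violation */

struct ex_pf_info {
	bool present;
	bool write;
	bool user;
	bool sgx;
};

/* Split a raw #PF error code into the flags of interest. */
static inline struct ex_pf_info ex_decode_pf(uint32_t error_code)
{
	struct ex_pf_info info = {
		.present = !!(error_code & EX_PF_PROT),
		.write   = !!(error_code & EX_PF_WRITE),
		.user    = !!(error_code & EX_PF_USER),
		.sgx     = !!(error_code & EX_PF_SGX),
	};

	return info;
}

/* Worked check of the two error codes that appear in the tests. */
static inline void ex_check_test_values(void)
{
	struct ex_pf_info trimmed = ex_decode_pf(0x8005);
	struct ex_pf_info missing = ex_decode_pf(4);

	/* 0x8005: user-mode read of a present page, rejected by SGX. */
	assert(trimmed.sgx && trimmed.user && trimmed.present && !trimmed.write);
	/* 4: user-mode read of a page with no page table entry. */
	assert(!missing.sgx && missing.user && !missing.present && !missing.write);
}
```

The decoded flags match the reasoning in the commit message: the page
table entry is present, so the fault is not a regular not-present fault,
and the SGX bit indicates that the access was refused by SGX itself.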

 tools/testing/selftests/sgx/main.c | 247 +++++++++++++++++++++++++++++
 1 file changed, 247 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index f6a8e2dd4a23..f9f8e3697fa6 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1560,4 +1560,251 @@ TEST_F(enclave, remove_added_page_no_eaccept)
 	EXPECT_EQ(remove_ioc.count, 0);
 }
 
+/*
+ * Request enclave page removal but, instead of correctly following up with
+ * EACCEPT, attempt to read the page from within the enclave.
+ */
+TEST_F(enclave, remove_added_page_invalid_access)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct sgx_enclave_modify_type ioc;
+	struct sgx_secinfo secinfo;
+	unsigned long data_start;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and kernel support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&ioc, 0, sizeof(ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl()");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Page that will be removed is the second data page in the .data
+	 * segment. This forms part of the local encl_buffer within the
+	 * enclave.
+	 */
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before
+	 * removing it.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Start page removal by requesting change of page type to PT_TRIM. */
+	memset(&ioc, 0, sizeof(ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = SGX_PAGE_TYPE_TRIM << 8;
+	ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	ioc.length = PAGE_SIZE;
+	ioc.secinfo = (unsigned long)&secinfo;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 4096);
+
+	/*
+	 * Read from page whose type was just changed to PT_TRIM.
+	 */
+	get_addr_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	/*
+	 * From kernel perspective the page is present but according to SGX the
+	 * page should not be accessible so a #PF with SGX bit set is
+	 * expected.
+	 */
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_error_code, 0x8005);
+	EXPECT_EQ(self->run.exception_addr, data_start);
+}
+
+/*
+ * Request enclave page removal and correctly follow with EACCEPT, but
+ * instead of completing the removal with the final ioctl() attempt to read
+ * the page from within the enclave.
+ */
+TEST_F(enclave, remove_added_page_invalid_access_after_eaccept)
+{
+	struct encl_op_get_from_addr get_addr_op;
+	struct encl_op_put_to_addr put_addr_op;
+	struct sgx_enclave_modify_type ioc;
+	struct encl_op_eaccept eaccept_op;
+	struct sgx_secinfo secinfo;
+	unsigned long data_start;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	/*
+	 * Hardware (SGX2) and kernel support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&ioc, 0, sizeof(ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl()");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/*
+	 * Page that will be removed is the second data page in the .data
+	 * segment. This forms part of the local encl_buffer within the
+	 * enclave.
+	 */
+	data_start = self->encl.encl_base +
+		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	/*
+	 * Sanity check that page at @data_start is writable before
+	 * removing it.
+	 *
+	 * Start by writing MAGIC to test page.
+	 */
+	put_addr_op.value = MAGIC;
+	put_addr_op.addr = data_start;
+	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/*
+	 * Read memory that was just written to, confirming that data
+	 * previously written (MAGIC) is present.
+	 */
+	get_addr_op.value = 0;
+	get_addr_op.addr = data_start;
+	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	EXPECT_EQ(get_addr_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+
+	/* Start page removal by requesting change of page type to PT_TRIM. */
+	memset(&ioc, 0, sizeof(ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = SGX_PAGE_TYPE_TRIM << 8;
+	ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	ioc.length = PAGE_SIZE;
+	ioc.secinfo = (unsigned long)&secinfo;
+
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(ioc.result, 0);
+	EXPECT_EQ(ioc.count, 4096);
+
+	eaccept_op.epc_addr = (unsigned long)data_start;
+	eaccept_op.ret = 0;
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	/* Skip ioctl() to remove page. */
+
+	/*
+	 * Read from page that was trimmed and EACCEPTed but not yet removed.
+	 */
+	get_addr_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
+
+	/*
+	 * From kernel perspective the page is present but according to SGX the
+	 * page should not be accessible so a #PF with SGX bit set is
+	 * expected.
+	 */
+
+	EXPECT_EQ(self->run.function, ERESUME);
+	EXPECT_EQ(self->run.exception_vector, 14);
+	EXPECT_EQ(self->run.exception_error_code, 0x8005);
+	EXPECT_EQ(self->run.exception_addr, data_start);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1



* [PATCH V3 29/30] selftests/sgx: Test reclaiming of untouched page
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (27 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 28/30] selftests/sgx: Test invalid access to removed enclave page Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  2022-04-04 16:49 ` [PATCH V3 30/30] selftests/sgx: Page removal stress test Reinette Chatre
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Removing a page from an initialized enclave involves three steps:
(1) the user requests changing the page type to PT_TRIM via the
    SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl()
(2) on success the ENCLU[EACCEPT] instruction is run from within
    the enclave to accept the page removal
(3) the user initiates the actual removal of the page via the
    SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl().

Remove a page that has never been accessed. This means that when the
first ioctl() requesting page removal arrives, there will be no page
table entry, yet a valid page table entry needs to exist for the
ENCLU[EACCEPT] function to succeed. In this test it is verified that
a page table entry can still be installed for a page that is in the
process of being removed.

Suggested-by: Haitao Huang <haitao.huang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Rename struct sgx_enclave_modt -> struct sgx_enclave_modify_type

Changes since V1:
- Update to support ioctl() name change (SGX_IOC_PAGE_MODT ->
  SGX_IOC_ENCLAVE_MODIFY_TYPE) and provide secinfo as parameter instead
  of just page type (Jarkko).
- Update test to reflect page removal ioctl() and struct name change:
  SGX_IOC_PAGE_REMOVE->SGX_IOC_ENCLAVE_REMOVE_PAGES,
  struct sgx_page_remove -> struct sgx_enclave_remove_pages (Jarkko).
- Ensure test is skipped when SGX2 not supported by kernel.
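
The scenario in the commit message above (EACCEPT on a page that has no
page table entry yet) can be illustrated with a toy model. This is a
sketch under stated assumptions and not how the kernel is structured:
the ex_ types and functions are hypothetical, and the fault handler is
reduced to installing the entry, which is the one property the test
depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an enclave page that was never touched. */
struct ex_epc_page {
	bool pte_present;	/* is a page table entry installed? */
	bool trim_pending;	/* type changed to TRIM, not yet accepted */
	unsigned int faults;	/* number of faults taken on this page */
};

enum ex_result { EX_OK, EX_FAULT, EX_ERROR };

/* Models the fault handler: installs a PTE even while removal is pending. */
static inline void ex_handle_fault(struct ex_epc_page *p)
{
	assert(p != NULL);

	p->pte_present = true;
	p->faults++;
}

/* Models a single EACCEPT attempt: needs a PTE and a pending TRIM. */
static inline enum ex_result ex_eaccept_once(struct ex_epc_page *p)
{
	assert(p != NULL);

	if (!p->pte_present)
		return EX_FAULT;
	if (!p->trim_pending)
		return EX_ERROR;

	p->trim_pending = false;
	return EX_OK;
}

/* Retry once after a fault, as happens when the enclave is re-entered. */
static inline enum ex_result ex_eaccept(struct ex_epc_page *p)
{
	enum ex_result res = ex_eaccept_once(p);

	if (res == EX_FAULT) {
		ex_handle_fault(p);
		res = ex_eaccept_once(p);
	}

	return res;
}
```

In this model a page that starts with pte_present == false and
trim_pending == true is accepted after exactly one fault, which is the
behavior remove_untouched_page is meant to confirm on real hardware.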

 tools/testing/selftests/sgx/main.c | 82 ++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index f9f8e3697fa6..82cc2283be03 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -1807,4 +1807,86 @@ TEST_F(enclave, remove_added_page_invalid_access_after_eaccept)
 	EXPECT_EQ(self->run.exception_addr, data_start);
 }
 
+TEST_F(enclave, remove_untouched_page)
+{
+	struct sgx_enclave_remove_pages remove_ioc;
+	struct sgx_enclave_modify_type modt_ioc;
+	struct encl_op_eaccept eaccept_op;
+	struct sgx_secinfo secinfo;
+	unsigned long data_start;
+	int ret, errno_save;
+
+	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
+
+	/*
+	 * Hardware (SGX2) and kernel support is needed for this test. Start
+	 * with check that test has a chance of succeeding.
+	 */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl()");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/* SGX2 is supported by kernel and hardware, test can proceed. */
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	data_start = self->encl.encl_base +
+			 encl_get_data_offset(&self->encl) + PAGE_SIZE;
+
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = SGX_PAGE_TYPE_TRIM << 8;
+	modt_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	modt_ioc.length = PAGE_SIZE;
+	modt_ioc.secinfo = (unsigned long)&secinfo;
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, 4096);
+
+	/*
+	 * Enter enclave via TCS #1 and approve page removal by sending
+	 * EACCEPT for removed page.
+	 */
+
+	eaccept_op.epc_addr = data_start;
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.ret = 0;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.exception_vector, 0);
+	EXPECT_EQ(self->run.exception_error_code, 0);
+	EXPECT_EQ(self->run.exception_addr, 0);
+	EXPECT_EQ(eaccept_op.ret, 0);
+
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
+	remove_ioc.length = PAGE_SIZE;
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_REMOVE_PAGES, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(remove_ioc.count, 4096);
+}
+
 TEST_HARNESS_MAIN
-- 
2.25.1



* [PATCH V3 30/30] selftests/sgx: Page removal stress test
  2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
                   ` (28 preceding siblings ...)
  2022-04-04 16:49 ` [PATCH V3 29/30] selftests/sgx: Test reclaiming of untouched page Reinette Chatre
@ 2022-04-04 16:49 ` Reinette Chatre
  29 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-04 16:49 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

Create an enclave with an additional heap that consumes all physical SGX
memory, then remove the heap.

Depending on the available SGX memory this test could take a
significant time to run (several minutes) as it (1) creates the
enclave, (2) changes the type of every page to be trimmed,
(3) enters the enclave once per page to run EACCEPT, before
(4) the pages are finally removed.
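
To give a feel for why step (3) dominates the run time, the number of
enclave entries can be estimated with the sketch below. The helper name
and MODEL_PAGE_SIZE are hypothetical; the 4096-byte stride is the one
used by the EACCEPT loop in the test.

```c
#include <assert.h>
#include <stdint.h>

/* Matches the 4096-byte stride of the EACCEPT loop in the test. */
#define MODEL_PAGE_SIZE 4096ULL

/*
 * Number of enclave entries needed by step (3): one ENCLU[EACCEPT]
 * per heap page. The heap size must be a multiple of the page size.
 */
static uint64_t eaccept_calls_needed(uint64_t heap_size)
{
	assert(heap_size % MODEL_PAGE_SIZE == 0);
	return heap_size / MODEL_PAGE_SIZE;
}
```

As an illustrative figure only, a 128 MiB heap would need
eaccept_calls_needed(128ULL << 20) = 32768 separate enclave entries.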

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- Rename struct sgx_enclave_modt -> struct sgx_enclave_modify_type

Changes since V1:
- Exit test completely on first failure of EACCEPT of a removed page. Since
  this is an oversubscribed test the number of pages on which this is
  attempted can be significant and in case of failure the per-page
  error logging would overwhelm the system.
- Update test to call renamed ioctl() (SGX_IOC_PAGE_MODT ->
  SGX_IOC_ENCLAVE_MODIFY_TYPE) and provide secinfo as parameter (Jarkko).
- Fixup definitions to be reverse xmas tree.
- Update test to reflect page removal ioctl() and struct name change:
  SGX_IOC_PAGE_REMOVE->SGX_IOC_ENCLAVE_REMOVE_PAGES,
  struct sgx_page_remove -> struct sgx_enclave_remove_pages (Jarkko).
- Ensure test is skipped when SGX2 not supported by kernel.
- Cleanup comments.

 tools/testing/selftests/sgx/main.c | 122 +++++++++++++++++++++++++++++
 1 file changed, 122 insertions(+)

diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
index 82cc2283be03..535f6cd72eb1 100644
--- a/tools/testing/selftests/sgx/main.c
+++ b/tools/testing/selftests/sgx/main.c
@@ -378,7 +378,129 @@ TEST_F(enclave, unclobbered_vdso_oversubscribed)
 	EXPECT_EQ(get_op.value, MAGIC);
 	EXPECT_EEXIT(&self->run);
 	EXPECT_EQ(self->run.user_data, 0);
+}
+
+TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900)
+{
+	struct sgx_enclave_remove_pages remove_ioc;
+	struct sgx_enclave_modify_type modt_ioc;
+	struct encl_op_get_from_buf get_op;
+	struct encl_op_eaccept eaccept_op;
+	struct encl_op_put_to_buf put_op;
+	struct sgx_secinfo secinfo;
+	struct encl_segment *heap;
+	unsigned long total_mem;
+	int ret, errno_save;
+	unsigned long addr;
+	unsigned long i;
+
+	/*
+	 * Create enclave with additional heap that is as big as all
+	 * available physical SGX memory.
+	 */
+	total_mem = get_total_epc_mem();
+	ASSERT_NE(total_mem, 0);
+	TH_LOG("Creating an enclave with %lu bytes heap may take a while ...",
+	       total_mem);
+	ASSERT_TRUE(setup_test_encl(total_mem, &self->encl, _metadata));
+
+	/*
+	 * Hardware (SGX2) and kernel support is needed for this test. Start
+	 * with a check that the test has a chance of succeeding.
+	 */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+
+	if (ret == -1) {
+		if (errno == ENOTTY)
+			SKIP(return, "Kernel does not support SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl()");
+		else if (errno == ENODEV)
+			SKIP(return, "System does not support SGX2");
+	}
+
+	/*
+	 * Invalid parameters were provided during sanity check,
+	 * expect command to fail.
+	 */
+	EXPECT_EQ(ret, -1);
+
+	/* SGX2 is supported by kernel and hardware, test can proceed. */
+	memset(&self->run, 0, sizeof(self->run));
+	self->run.tcs = self->encl.encl_base;
+
+	heap = &self->encl.segment_tbl[self->encl.nr_segments - 1];
+
+	put_op.header.type = ENCL_OP_PUT_TO_BUFFER;
+	put_op.value = MAGIC;
+
+	EXPECT_EQ(ENCL_CALL(&put_op, &self->run, false), 0);
+
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.user_data, 0);
+
+	get_op.header.type = ENCL_OP_GET_FROM_BUFFER;
+	get_op.value = 0;
+
+	EXPECT_EQ(ENCL_CALL(&get_op, &self->run, false), 0);
+
+	EXPECT_EQ(get_op.value, MAGIC);
+	EXPECT_EEXIT(&self->run);
+	EXPECT_EQ(self->run.user_data, 0);
+
+	/* Trim entire heap. */
+	memset(&modt_ioc, 0, sizeof(modt_ioc));
+	memset(&secinfo, 0, sizeof(secinfo));
+
+	secinfo.flags = SGX_PAGE_TYPE_TRIM << 8;
+	modt_ioc.offset = heap->offset;
+	modt_ioc.length = heap->size;
+	modt_ioc.secinfo = (unsigned long)&secinfo;
+
+	TH_LOG("Changing type of %zd bytes to trimmed may take a while ...",
+	       heap->size);
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPE, &modt_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(modt_ioc.result, 0);
+	EXPECT_EQ(modt_ioc.count, heap->size);
+
+	/* EACCEPT all removed pages. */
+	addr = self->encl.encl_base + heap->offset;
+
+	eaccept_op.flags = SGX_SECINFO_TRIM | SGX_SECINFO_MODIFIED;
+	eaccept_op.header.type = ENCL_OP_EACCEPT;
+
+	TH_LOG("Entering enclave to run EACCEPT for each page of %zd bytes may take a while ...",
+	       heap->size);
+	for (i = 0; i < heap->size; i += 4096) {
+		eaccept_op.epc_addr = addr + i;
+		eaccept_op.ret = 0;
 
+		EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
+
+		EXPECT_EQ(self->run.exception_vector, 0);
+		EXPECT_EQ(self->run.exception_error_code, 0);
+		EXPECT_EQ(self->run.exception_addr, 0);
+		ASSERT_EQ(eaccept_op.ret, 0);
+		ASSERT_EQ(self->run.function, EEXIT);
+	}
+
+	/* Complete page removal. */
+	memset(&remove_ioc, 0, sizeof(remove_ioc));
+
+	remove_ioc.offset = heap->offset;
+	remove_ioc.length = heap->size;
+
+	TH_LOG("Removing %zd bytes from enclave may take a while ...",
+	       heap->size);
+	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_REMOVE_PAGES, &remove_ioc);
+	errno_save = ret == -1 ? errno : 0;
+
+	EXPECT_EQ(ret, 0);
+	EXPECT_EQ(errno_save, 0);
+	EXPECT_EQ(remove_ioc.count, heap->size);
 }
 
 TEST_F(enclave, clobbered_vdso)
-- 
2.25.1



* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-04 16:49 ` [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions Reinette Chatre
@ 2022-04-05  5:03   ` Jarkko Sakkinen
  2022-04-05  5:07     ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  5:03 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> In the initial (SGX1) version of SGX, pages in an enclave need to be
> created with permissions that support all usages of the pages, from the
> time the enclave is initialized until it is unloaded. For example,
> pages used by a JIT compiler or when code needs to otherwise be
> relocated need to always have RWX permissions.
> 
> SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
> and can be used to restrict the EPCM permissions of regular enclave
> pages within an initialized enclave.
> 
> Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
> restricting EPCM permissions. With this ioctl() the user specifies
> a page range and the EPCM permissions to be applied to all pages in
> the provided range. ENCLS[EMODPR] is run to restrict the EPCM
> permissions followed by the ENCLS[ETRACK] flow that will ensure
> no cached linear-to-physical address mappings to the changed
> pages remain.
> 
> It is possible for the permission change request to fail on any
> page within the provided range, either with an error encountered
> by the kernel or by the SGX hardware while running
> ENCLS[EMODPR]. To support partial success the ioctl() returns an
> error code based on failures encountered by the kernel as well
> as two result output parameters: one for the number of pages
> that were successfully changed and one for the SGX return code.
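
One way user space might consume the partial-success outputs is sketched
below. This is an assumption-laden illustration: the struct is a local
mirror of struct sgx_enclave_restrict_permissions as defined in this
patch, and advance_past_completed() is a hypothetical helper, not part
of any API. It relies on two things visible in the patch: @count reports
the bytes that were changed, and the handler rejects requests whose
output fields are non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct sgx_enclave_restrict_permissions (this patch). */
struct restrict_perm_params {
	uint64_t offset;
	uint64_t length;
	uint64_t secinfo;
	uint64_t result;	/* output: SGX result code of ENCLS[EMODPR] */
	uint64_t count;		/* output: bytes successfully changed */
};

/*
 * Hypothetical helper: after an ioctl() attempt, skip the bytes reported
 * in @count so that the caller can retry the remainder of the range.
 * Returns the number of bytes still to be processed.
 */
static uint64_t advance_past_completed(struct restrict_perm_params *p)
{
	assert(p->count <= p->length);

	p->offset += p->count;
	p->length -= p->count;

	/* The handler returns -EINVAL if the output fields are not zero. */
	p->result = 0;
	p->count = 0;

	return p->length;
}
```

A caller would inspect the ioctl() return value, errno and @result first
and only then decide whether a retry of the remaining range makes sense,
for example after -EAGAIN.
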
> 
> The page table entry permissions are not impacted by the EPCM
> permission changes. VMAs and PTEs will continue to allow the
> maximum vetted permissions determined at the time the pages
> are added to the enclave. The SGX error code in a page fault
> will indicate if it was an EPCM permission check that prevented
> an access attempt.
> 
> No checking is done to ensure that the permissions are actually
> being restricted. This is because the enclave may have relaxed
> the EPCM permissions from within the enclave without letting the
> kernel know. An attempt to relax permissions using this call will
> be ignored by the hardware.
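
The "can only restrict" behaviour described above can be modelled as a
bitwise AND of the current EPCM permissions with the requested ones. The
sketch below is a user-space model, not kernel code: the MODEL_SECINFO_*
values are assumed to mirror the kernel's SGX_SECINFO_{R,W,X} bits, and
epcm_after_emodpr() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel's SGX_SECINFO_{R,W,X} bit values. */
#define MODEL_SECINFO_R	(1ULL << 0)
#define MODEL_SECINFO_W	(1ULL << 1)
#define MODEL_SECINFO_X	(1ULL << 2)
#define MODEL_PERM_MASK	(MODEL_SECINFO_R | MODEL_SECINFO_W | MODEL_SECINFO_X)

/*
 * Model of ENCLS[EMODPR]: permissions that are not requested are
 * removed, and requested permissions that are not currently set are
 * ignored instead of being added.
 */
static uint64_t epcm_after_emodpr(uint64_t epcm_perm, uint64_t requested)
{
	assert(!(requested & ~MODEL_PERM_MASK));
	return epcm_perm & requested;
}
```

For example, requesting RWX on a page that currently has only R leaves
the page at R, while requesting R on an RW page drops W.
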
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Include the sgx_ioc_sgx2_ready() utility
>   that previously was in "x86/sgx: Support relaxing of enclave page
>   permissions" that is removed from the next version.
> - Few renames requested by Jarkko:
>   struct sgx_enclave_restrict_perm ->
>          struct sgx_enclave_restrict_permissions
>   sgx_enclave_restrict_perm()     ->
>          sgx_enclave_restrict_permissions()
>   sgx_ioc_enclave_restrict_perm() ->
>          sgx_ioc_enclave_restrict_permissions()
> - Make EPCM permissions independent from kernel view of
>   permissions.  (Jarkko)
>   - Remove attempt at runtime tracking of EPCM permissions
>     (sgx_encl_page->vm_run_prot_bits).
>   - Do not flush page table entries - they are no longer impacted by
>     EPCM permission changes.
>   - Modify changelog to reflect new architecture.
> - Ensure at least PROT_READ is requested - enclave requires read
>   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
> 
> Changes since V1:
> - Change terminology to use "relax" instead of "extend" to refer to
>   the case when enclave page permissions are added (Dave).
> - Use ioctl() in commit message (Dave).
> - Add examples on what permissions would be allowed (Dave).
> - Split enclave page permission changes into two ioctl()s, one for
>   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
>   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
>   (Jarkko).
> - In support of the ioctl() name change the following names have been
>   changed:
>   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
>   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
>   sgx_page_modp() -> sgx_enclave_restrict_perm()
> - ioctl() takes entire secinfo as input instead of
>   page permissions only (Jarkko).
> - Fix kernel-doc to include () in function name.
> - Create and use utility for the ETRACK flow.
> - Fixups in comments
> - Move kernel-doc to function that provides documentation for
>   Documentation/x86/sgx.rst.
> - Remove redundant comment.
> - Make explicit which members of struct sgx_enclave_restrict_perm
>   are for output (Dave).
> 
>  arch/x86/include/uapi/asm/sgx.h |  21 +++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
>  2 files changed, 263 insertions(+)
> 
> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> index f4b81587e90b..a0a24e94fb27 100644
> --- a/arch/x86/include/uapi/asm/sgx.h
> +++ b/arch/x86/include/uapi/asm/sgx.h
> @@ -29,6 +29,8 @@ enum sgx_page_flags {
>         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
>  #define SGX_IOC_VEPC_REMOVE_ALL \
>         _IO(SGX_MAGIC, 0x04)
> +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
>  
>  /**
>   * struct sgx_enclave_create - parameter structure for the
> @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
>         __u64 fd;
>  };
>  
> +/**
> + * struct sgx_enclave_restrict_permissions - parameters for ioctl
> + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> + * @offset:    starting page offset (page aligned relative to enclave base
> + *             address defined in SECS)
> + * @length:    length of memory (multiple of the page size)
> + * @secinfo:   address for the SECINFO data containing the new permission bits
> + *             for pages in range described by @offset and @length
> + * @result:    (output) SGX result code of ENCLS[EMODPR] function
> + * @count:     (output) bytes successfully changed (multiple of page size)
> + */
> +struct sgx_enclave_restrict_permissions {
> +       __u64 offset;
> +       __u64 length;
> +       __u64 secinfo;
> +       __u64 result;
> +       __u64 count;
> +};
> +
>  struct sgx_enclave_run;
>  
>  /**
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 0460fd224a05..4d88bfd163e7 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
>         return sgx_set_attribute(&encl->attributes_mask, params.fd);
>  }
>  
> +/*
> + * Ensure enclave is ready for SGX2 functions. Readiness is checked
> + * by ensuring the hardware supports SGX2 and the enclave is initialized
> + * and thus able to handle requests to modify pages within it.
> + */
> +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
> +{
> +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> +               return -ENODEV;
> +
> +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> +               return -EINVAL;
> +
> +       return 0;
> +}
> +
> +/*
> + * Return valid permission fields from a secinfo structure provided by
> + * user space. The secinfo structure is required to only have bits in
> + * the permission fields set.
> + */
> +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> +{
> +       struct sgx_secinfo secinfo;
> +       u64 perm;
> +
> +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
> +                          sizeof(secinfo)))
> +               return -EFAULT;
> +
> +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> +               return -EINVAL;
> +
> +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> +               return -EINVAL;
> +
> +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> +
> +       /*
> +        * Read access is required for the enclave to be able to use the page.
> +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
> +        * read access.
> +        */
> +       if (!(perm & SGX_SECINFO_R))
> +               return -EINVAL;
> +
> +       *secinfo_perm = perm;
> +
> +       return 0;
> +}
> +
> +/*
> + * Some SGX functions require that no cached linear-to-physical address
> + * mappings are present before they can succeed. Collaborate with
> + * hardware via ENCLS[ETRACK] to ensure that all cached
> + * linear-to-physical address mappings belonging to all threads of
> + * the enclave are cleared. See sgx_encl_cpumask() for details.
> + */
> +static int sgx_enclave_etrack(struct sgx_encl *encl)
> +{
> +       void *epc_virt;
> +       int ret;
> +
> +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> +       ret = __etrack(epc_virt);
> +       if (ret) {
> +               /*
> +                * ETRACK only fails when there is an OS issue. For
> +                * example, two consecutive ETRACKs were sent without a
> +                * completed IPI in between.
> +                */
> +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> +               /*
> +                * Send IPIs to kick CPUs out of the enclave and
> +                * try ETRACK again.
> +                */
> +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> +               ret = __etrack(epc_virt);
> +               if (ret) {
> +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
> +                                   ret, ret);
> +                       return -EFAULT;
> +               }
> +       }
> +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> +
> +       return 0;
> +}
> +
> +/**
> + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
> + * @encl:      Enclave to which the pages belong.
> + * @modp:      Checked parameters from user on which pages need modifying.
> + * @secinfo_perm: New (validated) permission bits.
> + *
> + * Return:
> + * - 0:                Success.
> + * - -errno:   Otherwise.
> + */
> +static long
> +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> +                                struct sgx_enclave_restrict_permissions *modp,
> +                                u64 secinfo_perm)
> +{
> +       struct sgx_encl_page *entry;
> +       struct sgx_secinfo secinfo;
> +       unsigned long addr;
> +       unsigned long c;
> +       void *epc_virt;
> +       int ret;
> +
> +       memset(&secinfo, 0, sizeof(secinfo));
> +       secinfo.flags = secinfo_perm;
> +
> +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> +               addr = encl->base + modp->offset + c;
> +
> +               mutex_lock(&encl->lock);
> +
> +               entry = sgx_encl_load_page(encl, addr);
> +               if (IS_ERR(entry)) {
> +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> +                       goto out_unlock;
> +               }
> +
> +               /*
> +                * Changing EPCM permissions is only supported on regular
> +                * SGX pages. Attempting this change on other pages will
> +                * result in #PF.
> +                */
> +               if (entry->type != SGX_PAGE_TYPE_REG) {
> +                       ret = -EINVAL;
> +                       goto out_unlock;
> +               }
> +
> +               /*
> +                * Do not verify the permission bits requested. Kernel
> +                * has no control over how EPCM permissions can be relaxed
> +                * from within the enclave. ENCLS[EMODPR] can only
> +                * remove existing EPCM permissions, attempting to set
> +                * new permissions will be ignored by the hardware.
> +                */
> +
> +               /* Change EPCM permissions. */
> +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> +               ret = __emodpr(&secinfo, epc_virt);
> +               if (encls_faulted(ret)) {
> +                       /*
> +                        * All possible faults should be avoidable:
> +                        * parameters have been checked, will only change
> +                        * permissions of a regular page, and no concurrent
> +                        * SGX1/SGX2 ENCLS instructions since these
> +                        * are protected with mutex.
> +                        */
> +                       pr_err_once("EMODPR encountered exception %d\n",
> +                                   ENCLS_TRAPNR(ret));
> +                       ret = -EFAULT;
> +                       goto out_unlock;
> +               }
> +               if (encls_failed(ret)) {
> +                       modp->result = ret;
> +                       ret = -EFAULT;
> +                       goto out_unlock;
> +               }
> +
> +               ret = sgx_enclave_etrack(encl);
> +               if (ret) {
> +                       ret = -EFAULT;
> +                       goto out_unlock;
> +               }
> +
> +               mutex_unlock(&encl->lock);
> +       }
> +
> +       ret = 0;
> +       goto out;
> +
> +out_unlock:
> +       mutex_unlock(&encl->lock);
> +out:
> +       modp->count = c;
> +
> +       return ret;
> +}
> +
> +/**
> + * sgx_ioc_enclave_restrict_permissions() - handler for
> + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> + * @encl:      an enclave pointer
> + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
> + *             instance
> + *
> + * SGX2 distinguishes between relaxing and restricting the enclave page
> + * permissions maintained by the hardware (EPCM permissions) of pages
> + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
> + *
> + * EPCM permissions cannot be restricted from within the enclave, the enclave
> + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
> + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
> + * will be ignored by the hardware.
> + *
> + * Return:
> + * - 0:                Success
> + * - -errno:   Otherwise
> + */
> +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> +                                                void __user *arg)
> +{
> +       struct sgx_enclave_restrict_permissions params;
> +       u64 secinfo_perm;
> +       long ret;
> +
> +       ret = sgx_ioc_sgx2_ready(encl);
> +       if (ret)
> +               return ret;
> +
> +       if (copy_from_user(&params, arg, sizeof(params)))
> +               return -EFAULT;
> +
> +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> +               return -EINVAL;
> +
> +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
> +                                        &secinfo_perm);
> +       if (ret)
> +               return ret;
> +
> +       if (params.result || params.count)
> +               return -EINVAL;
> +
> +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
> +
> +       if (copy_to_user(arg, &params, sizeof(params)))
> +               return -EFAULT;
> +
> +       return ret;
> +}
> +
>  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>  {
>         struct sgx_encl *encl = filep->private_data;
> @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>         case SGX_IOC_ENCLAVE_PROVISION:
>                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
>                 break;
> +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
> +               ret = sgx_ioc_enclave_restrict_permissions(encl,
> +                                                          (void __user *)arg);
> +               break;
>         default:
>                 ret = -ENOIOCTLCMD;
>                 break;

I think this is a big improvement, all things considered. I just started
a kernel build and will see whether I can get this wired into our code:

https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943

I'll report my findings later on.

BR, Jarkko


* Re: [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave
  2022-04-04 16:49 ` [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave Reinette Chatre
@ 2022-04-05  5:05   ` Jarkko Sakkinen
  2022-04-05 10:03     ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  5:05 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> With SGX1 an enclave needs to be created with its maximum memory demands
> allocated. Pages cannot be added to an enclave after it is initialized.
> SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
> pages to an initialized enclave. With SGX2 the enclave still needs to
> set aside address space for its maximum memory demands during enclave
> creation, but all pages need not be added before enclave initialization.
> Pages can be added during enclave runtime.
> 
> Add support for dynamically adding pages to an initialized enclave,
> architecturally limited to RW permission at creation but allowed to
> obtain RWX permissions after enclave runs EMODPE. Add pages via the
> page fault handler at the time an enclave address without a backing
> enclave page is accessed, potentially directly reclaiming pages if
> no free pages are available.
> 
> The enclave is still required to run ENCLU[EACCEPT] on the page before
> it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
> on an uninitialized address. This will trigger the page fault handler
> that will add the enclave page and return execution to the enclave to
> repeat the ENCLU[EACCEPT] instruction, this time successful.
> 
> If the enclave accesses an uninitialized address in another way, for
> example by expanding the enclave stack to a page that has not yet been
> added, then the page fault handler would add the page on the first
> write but upon returning to the enclave the instruction that triggered
> the page fault would be repeated and since ENCLU[EACCEPT] was not run
> yet it would trigger a second page fault, this time with the SGX flag
> set in the page fault error code. This can only be recovered by entering
> the enclave again and directly running the ENCLU[EACCEPT] instruction on
> the now initialized address.
> 
> Accessing an uninitialized address from outside the enclave also
> triggers this flow but the page will remain inaccessible (access will
> result in #PF) until accepted from within the enclave via
> ENCLU[EACCEPT].
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Remove runtime tracking of EPCM permissions
>   (sgx_encl_page->vm_run_prot_bits) (Jarkko).
> - Move export of sgx_encl_{grow,shrink}() to separate patch. (Jarkko)
> - Use sgx_encl_page_alloc(). (Jarkko)
> - Set max allowed permissions to be RWX (Jarkko). Update changelog
>   to indicate the change and use comment in code as
>   created by Jarkko in:
> https://lore.kernel.org/linux-sgx/20220306053211.135762-4-jarkko@kernel.org
> - Do not set protection bits but let it be inherited by VMA (Jarkko)
> 
> Changes since V1:
> - Fix subject line "to initialized" -> "to an initialized" (Jarkko).
> - Move text about hardware's PENDING state to the patch that introduces
>   the ENCLS[EAUG] wrapper (Jarkko).
> - Ensure kernel-doc uses brackets when referring to function.
> 
>  arch/x86/kernel/cpu/sgx/encl.c | 124 +++++++++++++++++++++++++++++++++
>  1 file changed, 124 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 546423753e4c..fa4f947f8496 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -194,6 +194,119 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>         return __sgx_encl_load_page(encl, entry);
>  }
>  
> +/**
> + * sgx_encl_eaug_page() - Dynamically add page to initialized enclave
> + * @vma:       VMA obtained from fault info from where page is accessed
> + * @encl:      enclave accessing the page
> + * @addr:      address that triggered the page fault
> + *
> + * When an initialized enclave accesses a page with no backing EPC page
> + * on a SGX2 system then the EPC can be added dynamically via the SGX2
> + * ENCLS[EAUG] instruction.
> + *
> + * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
> + * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
> + */
> +static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> +                                    struct sgx_encl *encl, unsigned long addr)
> +{
> +       struct sgx_pageinfo pginfo = {0};
> +       struct sgx_encl_page *encl_page;
> +       struct sgx_epc_page *epc_page;
> +       struct sgx_va_page *va_page;
> +       unsigned long phys_addr;
> +       u64 secinfo_flags;
> +       vm_fault_t vmret;
> +       int ret;
> +
> +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> +               return VM_FAULT_SIGBUS;
> +
> +       /*
> +        * Ignore internal permission checking for dynamically added pages.
> +        * They matter only for data added during the pre-initialization
> +        * phase. The enclave decides the permissions by the means of
> +        * EACCEPT, EACCEPTCOPY and EMODPE.
> +        */
> +       secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
> +       encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
> +       if (IS_ERR(encl_page))
> +               return VM_FAULT_OOM;
> +
> +       epc_page = sgx_alloc_epc_page(encl_page, true);
> +       if (IS_ERR(epc_page)) {
> +               kfree(encl_page);
> +               return VM_FAULT_SIGBUS;
> +       }
> +
> +       va_page = sgx_encl_grow(encl);
> +       if (IS_ERR(va_page)) {
> +               ret = PTR_ERR(va_page);
> +               goto err_out_free;
> +       }
> +
> +       mutex_lock(&encl->lock);
> +
> +       /*
> +        * Copy comment from sgx_encl_add_page() to maintain guidance in
> +        * this similar flow:
> +        * Adding to encl->va_pages must be done under encl->lock.  Ditto for
> +        * deleting (via sgx_encl_shrink()) in the error path.
> +        */
> +       if (va_page)
> +               list_add(&va_page->list, &encl->va_pages);
> +
> +       ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
> +                       encl_page, GFP_KERNEL);
> +       /*
> +        * If ret == -EBUSY then page was created in another flow while
> +        * running without encl->lock
> +        */
> +       if (ret)
> +               goto err_out_unlock;
> +
> +       pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
> +       pginfo.addr = encl_page->desc & PAGE_MASK;
> +       pginfo.metadata = 0;
> +
> +       ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
> +       if (ret)
> +               goto err_out;
> +
> +       encl_page->encl = encl;
> +       encl_page->epc_page = epc_page;
> +       encl_page->type = SGX_PAGE_TYPE_REG;
> +       encl->secs_child_cnt++;
> +
> +       sgx_mark_page_reclaimable(encl_page->epc_page);
> +
> +       phys_addr = sgx_get_epc_phys_addr(epc_page);
> +       /*
> +        * Do not undo everything when creating PTE entry fails - next #PF
> +        * would find page ready for a PTE.
> +        */
> +       vmret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
> +       if (vmret != VM_FAULT_NOPAGE) {
> +               mutex_unlock(&encl->lock);
> +               return VM_FAULT_SIGBUS;
> +       }
> +       mutex_unlock(&encl->lock);
> +       return VM_FAULT_NOPAGE;
> +
> +err_out:
> +       xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
> +
> +err_out_unlock:
> +       sgx_encl_shrink(encl, va_page);
> +       mutex_unlock(&encl->lock);
> +
> +err_out_free:
> +       sgx_encl_free_epc_page(epc_page);
> +       kfree(encl_page);
> +
> +       return VM_FAULT_SIGBUS;
> +}
> +
>  static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>  {
>         unsigned long addr = (unsigned long)vmf->address;
> @@ -213,6 +326,17 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>         if (unlikely(!encl))
>                 return VM_FAULT_SIGBUS;
>  
> +       /*
> +        * The page_array keeps track of all enclave pages, whether they
> +        * are swapped out or not. If there is no entry for this page and
> +        * the system supports SGX2 then it is possible to dynamically add
> +        * a new enclave page. This is only possible for an initialized
> +        * enclave that will be checked for right away.
> +        */
> +       if (cpu_feature_enabled(X86_FEATURE_SGX2) &&
> +           (!xa_load(&encl->page_array, PFN_DOWN(addr))))
> +               return sgx_encl_eaug_page(vma, encl, addr);
> +
>         mutex_lock(&encl->lock);
>  
>         entry = sgx_encl_load_page_in_vma(encl, addr, vma->vm_flags);

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05  5:03   ` Jarkko Sakkinen
@ 2022-04-05  5:07     ` Jarkko Sakkinen
  2022-04-05 13:40       ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  5:07 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Tue, 2022-04-05 at 08:03 +0300, Jarkko Sakkinen wrote:
> On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > created with permissions that support all usages of the pages, from the
> > time the enclave is initialized until it is unloaded. For example,
> > pages used by a JIT compiler or when code needs to otherwise be
> > relocated need to always have RWX permissions.
> > 
> > SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
> > and can be used to restrict the EPCM permissions of regular enclave
> > pages within an initialized enclave.
> > 
> > Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
> > restricting EPCM permissions. With this ioctl() the user specifies
> > a page range and the EPCM permissions to be applied to all pages in
> > the provided range. ENCLS[EMODPR] is run to restrict the EPCM
> > permissions followed by the ENCLS[ETRACK] flow that will ensure
> > no cached linear-to-physical address mappings to the changed
> > pages remain.
> > 
> > It is possible for the permission change request to fail on any
> > page within the provided range, either with an error encountered
> > by the kernel or by the SGX hardware while running
> > ENCLS[EMODPR]. To support partial success the ioctl() returns an
> > error code based on failures encountered by the kernel as well
> > as two result output parameters: one for the number of pages
> > that were successfully changed and one for the SGX return code.
> > 
> > The page table entry permissions are not impacted by the EPCM
> > permission changes. VMAs and PTEs will continue to allow the
> > maximum vetted permissions determined at the time the pages
> > are added to the enclave. The SGX error code in a page fault
> > will indicate if it was an EPCM permission check that prevented
> > an access attempt.
> > 
> > No checking is done to ensure that the permissions are actually
> > being restricted. This is because the enclave may have relaxed
> > the EPCM permissions from within the enclave without letting the
> > kernel know. An attempt to relax permissions using this call will
> > be ignored by the hardware.
> > 
> > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > ---
> > Changes since V2:
> > - Include the sgx_ioc_sgx2_ready() utility
> >   that previously was in "x86/sgx: Support relaxing of enclave page
> >   permissions" that is removed from the next version.
> > - Few renames requested by Jarkko:
> >   struct sgx_enclave_restrict_perm ->
> >          struct sgx_enclave_restrict_permissions
> >   sgx_enclave_restrict_perm()     ->
> >          sgx_enclave_restrict_permissions()
> >   sgx_ioc_enclave_restrict_perm() ->
> >          sgx_ioc_enclave_restrict_permissions()
> > - Make EPCM permissions independent from kernel view of
> >   permissions.  (Jarkko)
> >   - Remove attempt at runtime tracking of EPCM permissions
> >     (sgx_encl_page->vm_run_prot_bits).
> >   - Do not flush page table entries - they are no longer impacted by
> >     EPCM permission changes.
> >   - Modify changelog to reflect new architecture.
> > - Ensure at least PROT_READ is requested - enclave requires read
> >   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
> > 
> > Changes since V1:
> > - Change terminology to use "relax" instead of "extend" to refer to
> >   the case when enclave page permissions are added (Dave).
> > - Use ioctl() in commit message (Dave).
> > - Add examples on what permissions would be allowed (Dave).
> > - Split enclave page permission changes into two ioctl()s, one for
> >   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
> >   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
> >   (Jarkko).
> > - In support of the ioctl() name change the following names have been
> >   changed:
> >   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
> >   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
> >   sgx_page_modp() -> sgx_enclave_restrict_perm()
> > - ioctl() takes entire secinfo as input instead of
> >   page permissions only (Jarkko).
> > - Fix kernel-doc to include () in function name.
> > - Create and use utility for the ETRACK flow.
> > - Fixups in comments
> > - Move kernel-doc to function that provides documentation for
> >   Documentation/x86/sgx.rst.
> > - Remove redundant comment.
> > - Make explicit which members of struct sgx_enclave_restrict_perm
> >   are for output (Dave).
> > 
> >  arch/x86/include/uapi/asm/sgx.h |  21 +++
> >  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
> >  2 files changed, 263 insertions(+)
> > 
> > diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> > index f4b81587e90b..a0a24e94fb27 100644
> > --- a/arch/x86/include/uapi/asm/sgx.h
> > +++ b/arch/x86/include/uapi/asm/sgx.h
> > @@ -29,6 +29,8 @@ enum sgx_page_flags {
> >         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
> >  #define SGX_IOC_VEPC_REMOVE_ALL \
> >         _IO(SGX_MAGIC, 0x04)
> > +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> > +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> >  
> >  /**
> >   * struct sgx_enclave_create - parameter structure for the
> > @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
> >         __u64 fd;
> >  };
> >  
> > +/**
> > + * struct sgx_enclave_restrict_permissions - parameters for ioctl
> > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > + * @offset:    starting page offset (page aligned relative to enclave base
> > + *             address defined in SECS)
> > + * @length:    length of memory (multiple of the page size)
> > + * @secinfo:   address for the SECINFO data containing the new permission bits
> > + *             for pages in range described by @offset and @length
> > + * @result:    (output) SGX result code of ENCLS[EMODPR] function
> > + * @count:     (output) bytes successfully changed (multiple of page size)
> > + */
> > +struct sgx_enclave_restrict_permissions {
> > +       __u64 offset;
> > +       __u64 length;
> > +       __u64 secinfo;
> > +       __u64 result;
> > +       __u64 count;
> > +};
> > +
> >  struct sgx_enclave_run;
> >  
> >  /**
> > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > index 0460fd224a05..4d88bfd163e7 100644
> > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
> >         return sgx_set_attribute(&encl->attributes_mask, params.fd);
> >  }
> >  
> > +/*
> > + * Ensure enclave is ready for SGX2 functions. Readiness is checked
> > + * by ensuring the hardware supports SGX2 and the enclave is initialized
> > + * and thus able to handle requests to modify pages within it.
> > + */
> > +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
> > +{
> > +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> > +               return -ENODEV;
> > +
> > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > +               return -EINVAL;
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * Return valid permission fields from a secinfo structure provided by
> > + * user space. The secinfo structure is required to only have bits in
> > + * the permission fields set.
> > + */
> > +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > +{
> > +       struct sgx_secinfo secinfo;
> > +       u64 perm;
> > +
> > +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
> > +                          sizeof(secinfo)))
> > +               return -EFAULT;
> > +
> > +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > +               return -EINVAL;
> > +
> > +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > +               return -EINVAL;
> > +
> > +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > +
> > +       /*
> > +        * Read access is required for the enclave to be able to use the page.
> > +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
> > +        * read access.
> > +        */
> > +       if (!(perm & SGX_SECINFO_R))
> > +               return -EINVAL;
> > +
> > +       *secinfo_perm = perm;
> > +
> > +       return 0;
> > +}
> > +
> > +/*
> > + * Some SGX functions require that no cached linear-to-physical address
> > + * mappings are present before they can succeed. Collaborate with
> > + * hardware via ENCLS[ETRACK] to ensure that all cached
> > + * linear-to-physical address mappings belonging to all threads of
> > + * the enclave are cleared. See sgx_encl_cpumask() for details.
> > + */
> > +static int sgx_enclave_etrack(struct sgx_encl *encl)
> > +{
> > +       void *epc_virt;
> > +       int ret;
> > +
> > +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> > +       ret = __etrack(epc_virt);
> > +       if (ret) {
> > +               /*
> > +                * ETRACK only fails when there is an OS issue. For
> > +                * example, two consecutive ETRACKs were sent without
> > +                * a completed IPI in between.
> > +                */
> > +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> > +               /*
> > +                * Send IPIs to kick CPUs out of the enclave and
> > +                * try ETRACK again.
> > +                */
> > +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > +               ret = __etrack(epc_virt);
> > +               if (ret) {
> > +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
> > +                                   ret, ret);
> > +                       return -EFAULT;
> > +               }
> > +       }
> > +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > +
> > +       return 0;
> > +}
> > +
> > +/**
> > + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
> > + * @encl:      Enclave to which the pages belong.
> > + * @modp:      Checked parameters from user on which pages need modifying.
> > + * @secinfo_perm: New (validated) permission bits.
> > + *
> > + * Return:
> > + * - 0:                Success.
> > + * - -errno:   Otherwise.
> > + */
> > +static long
> > +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> > +                                struct sgx_enclave_restrict_permissions *modp,
> > +                                u64 secinfo_perm)
> > +{
> > +       struct sgx_encl_page *entry;
> > +       struct sgx_secinfo secinfo;
> > +       unsigned long addr;
> > +       unsigned long c;
> > +       void *epc_virt;
> > +       int ret;
> > +
> > +       memset(&secinfo, 0, sizeof(secinfo));
> > +       secinfo.flags = secinfo_perm;
> > +
> > +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> > +               addr = encl->base + modp->offset + c;
> > +
> > +               mutex_lock(&encl->lock);
> > +
> > +               entry = sgx_encl_load_page(encl, addr);
> > +               if (IS_ERR(entry)) {
> > +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> > +                       goto out_unlock;
> > +               }
> > +
> > +               /*
> > +                * Changing EPCM permissions is only supported on regular
> > +                * SGX pages. Attempting this change on other pages will
> > +                * result in #PF.
> > +                */
> > +               if (entry->type != SGX_PAGE_TYPE_REG) {
> > +                       ret = -EINVAL;
> > +                       goto out_unlock;
> > +               }
> > +
> > +               /*
> > +                * Do not verify the permission bits requested. Kernel
> > +                * has no control over how EPCM permissions can be relaxed
> > +                * from within the enclave. ENCLS[EMODPR] can only
> > +                * remove existing EPCM permissions, attempting to set
> > +                * new permissions will be ignored by the hardware.
> > +                */
> > +
> > +               /* Change EPCM permissions. */
> > +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> > +               ret = __emodpr(&secinfo, epc_virt);
> > +               if (encls_faulted(ret)) {
> > +                       /*
> > +                        * All possible faults should be avoidable:
> > +                        * parameters have been checked, will only change
> > +                        * permissions of a regular page, and no concurrent
> > +                        * SGX1/SGX2 ENCLS instructions since these
> > +                        * are protected with mutex.
> > +                        */
> > +                       pr_err_once("EMODPR encountered exception %d\n",
> > +                                   ENCLS_TRAPNR(ret));
> > +                       ret = -EFAULT;
> > +                       goto out_unlock;
> > +               }
> > +               if (encls_failed(ret)) {
> > +                       modp->result = ret;
> > +                       ret = -EFAULT;
> > +                       goto out_unlock;
> > +               }
> > +
> > +               ret = sgx_enclave_etrack(encl);
> > +               if (ret) {
> > +                       ret = -EFAULT;
> > +                       goto out_unlock;
> > +               }
> > +
> > +               mutex_unlock(&encl->lock);
> > +       }
> > +
> > +       ret = 0;
> > +       goto out;
> > +
> > +out_unlock:
> > +       mutex_unlock(&encl->lock);
> > +out:
> > +       modp->count = c;
> > +
> > +       return ret;
> > +}
> > +
> > +/**
> > + * sgx_ioc_enclave_restrict_permissions() - handler for
> > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > + * @encl:      an enclave pointer
> > + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
> > + *             instance
> > + *
> > + * SGX2 distinguishes between relaxing and restricting the enclave page
> > + * permissions maintained by the hardware (EPCM permissions) of pages
> > + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
> > + *
> > + * EPCM permissions cannot be restricted from within the enclave; the enclave
> > + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
> > + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
> > + * will be ignored by the hardware.
> > + *
> > + * Return:
> > + * - 0:                Success
> > + * - -errno:   Otherwise
> > + */
> > +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> > +                                                void __user *arg)
> > +{
> > +       struct sgx_enclave_restrict_permissions params;
> > +       u64 secinfo_perm;
> > +       long ret;
> > +
> > +       ret = sgx_ioc_sgx2_ready(encl);
> > +       if (ret)
> > +               return ret;
> > +
> > +       if (copy_from_user(&params, arg, sizeof(params)))
> > +               return -EFAULT;
> > +
> > +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> > +               return -EINVAL;
> > +
> > +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
> > +                                        &secinfo_perm);
> > +       if (ret)
> > +               return ret;
> > +
> > +       if (params.result || params.count)
> > +               return -EINVAL;
> > +
> > +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
> > +
> > +       if (copy_to_user(arg, &params, sizeof(params)))
> > +               return -EFAULT;
> > +
> > +       return ret;
> > +}
> > +
> >  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> >  {
> >         struct sgx_encl *encl = filep->private_data;
> > @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> >         case SGX_IOC_ENCLAVE_PROVISION:
> >                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
> >                 break;
> > +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
> > +               ret = sgx_ioc_enclave_restrict_permissions(encl,
> > +                                                          (void __user *)arg);
> > +               break;
> >         default:
> >                 ret = -ENOIOCTLCMD;
> >                 break;
> 
> I think this is a big improvement, all things considered. I just started
> a kernel build and will see if I can get this wired to our code:
> 
> https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943
> 
> I'll report my findings later on.

I pulled the patches from the sgx2_submitted_v3_plus_rwx branch. Just
sanity checking: that is v3, correct?

BR, Jarkko
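
For reference while wiring this up, here is a minimal user-space sketch of
how the ioctl() quoted above might be called. It is an illustration under
assumptions, not tested against the series: the struct and ioctl number are
copied locally from the V3 patch (a later revision may change them),
SGX_MAGIC is assumed to be 0xA4, the fd is assumed to come from opening the
enclave device, and restrict_perms() is a hypothetical helper name.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local copies of the V3 uapi definitions quoted above (assumption). */
#define SGX_MAGIC 0xA4

struct sgx_enclave_restrict_permissions {
	uint64_t offset;
	uint64_t length;
	uint64_t secinfo;
	uint64_t result;
	uint64_t count;
};

#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
	_IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)

/* SECINFO: 8 bytes of flags followed by 56 reserved bytes, 64-byte aligned. */
struct secinfo {
	uint64_t flags;
	uint8_t reserved[56];
} __attribute__((aligned(64)));

#define SECINFO_R 0x1ULL
#define SECINFO_W 0x2ULL
#define SECINFO_X 0x4ULL
#define EXAMPLE_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper: returns 0 on success, -1 with errno set otherwise.
 * *done receives the number of bytes changed and *sgx_result the
 * ENCLS[EMODPR] result code, both of which may be set on partial success.
 */
static int restrict_perms(int encl_fd, uint64_t offset, uint64_t length,
			  uint64_t perm, uint64_t *done, uint64_t *sgx_result)
{
	struct sgx_enclave_restrict_permissions params;
	struct secinfo secinfo;
	int ret;

	*done = 0;
	*sgx_result = 0;

	/* Early sanity checks in the spirit of the kernel-side validation. */
	if (!length || ((offset | length) % EXAMPLE_PAGE_SIZE) ||
	    !(perm & SECINFO_R) ||
	    (perm & ~(SECINFO_R | SECINFO_W | SECINFO_X))) {
		errno = EINVAL;
		return -1;
	}

	/* Only permission bits may be set; reserved bytes must be zero. */
	memset(&secinfo, 0, sizeof(secinfo));
	secinfo.flags = perm;

	/* The output members (result, count) must be zero on input. */
	memset(&params, 0, sizeof(params));
	params.offset = offset;
	params.length = length;
	params.secinfo = (uint64_t)(uintptr_t)&secinfo;

	ret = ioctl(encl_fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS, &params);

	*done = params.count;
	*sgx_result = params.result;
	return ret;
}
```

On -1 with errno == EAGAIN a caller could retry from offset + *done, since
count reports how far the kernel got. After a successful call the enclave
still has to run ENCLU[EACCEPT] on each page to acknowledge the change.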


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 01/30] x86/sgx: Add short descriptions to ENCLS wrappers
  2022-04-04 16:49 ` [PATCH V3 01/30] x86/sgx: Add short descriptions to ENCLS wrappers Reinette Chatre
@ 2022-04-05  6:52   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:52 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:09AM -0700, Reinette Chatre wrote:
> The SGX ENCLS instruction uses EAX to specify an SGX function and
> may require additional registers, depending on the SGX function.
> ENCLS invokes the specified privileged SGX function for managing
> and debugging enclaves. Macros are used to wrap the ENCLS
> functionality and several wrappers are used to wrap the macros to
> make the different SGX functions accessible in the code.
> 
> The wrappers of the supported SGX functions are cryptic. Add short
> descriptions of each as a comment.
> 
> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - Fix commit message and subject to not refer to descriptions as
> "changelog descriptions" or "shortlog descriptions" (Jarkko).
> - Improve all descriptions with guidance from Jarkko.
> 
>  arch/x86/kernel/cpu/sgx/encls.h | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index fa04a73daf9c..0e22fa8f77c5 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -136,57 +136,71 @@ static inline bool encls_failed(int ret)
>  	ret;						\
>  	})
>  
> +/* Initialize an EPC page into an SGX Enclave Control Structure (SECS) page. */
>  static inline int __ecreate(struct sgx_pageinfo *pginfo, void *secs)
>  {
>  	return __encls_2(ECREATE, pginfo, secs);
>  }
>  
> +/* Hash a 256 byte region of an enclave page to SECS:MRENCLAVE. */
>  static inline int __eextend(void *secs, void *addr)
>  {
>  	return __encls_2(EEXTEND, secs, addr);
>  }
>  
> +/*
> + * Associate an EPC page to an enclave either as a REG or TCS page
> + * populated with the provided data.
> + */
>  static inline int __eadd(struct sgx_pageinfo *pginfo, void *addr)
>  {
>  	return __encls_2(EADD, pginfo, addr);
>  }
>  
> +/* Finalize enclave build, initialize enclave for user code execution. */
>  static inline int __einit(void *sigstruct, void *token, void *secs)
>  {
>  	return __encls_ret_3(EINIT, sigstruct, secs, token);
>  }
>  
> +/* Disassociate EPC page from its enclave and mark it as unused. */
>  static inline int __eremove(void *addr)
>  {
>  	return __encls_ret_1(EREMOVE, addr);
>  }
>  
> +/* Copy data to an EPC page belonging to a debug enclave. */
>  static inline int __edbgwr(void *addr, unsigned long *data)
>  {
>  	return __encls_2(EDGBWR, *data, addr);
>  }
>  
> +/* Copy data from an EPC page belonging to a debug enclave. */
>  static inline int __edbgrd(void *addr, unsigned long *data)
>  {
>  	return __encls_1_1(EDGBRD, *data, addr);
>  }
>  
> +/* Track that software has completed the required TLB address clears. */
>  static inline int __etrack(void *addr)
>  {
>  	return __encls_ret_1(ETRACK, addr);
>  }
>  
> +/* Load, verify, and unblock an EPC page. */
>  static inline int __eldu(struct sgx_pageinfo *pginfo, void *addr,
>  			 void *va)
>  {
>  	return __encls_ret_3(ELDU, pginfo, addr, va);
>  }
>  
> +/* Make EPC page inaccessible to enclave, ready to be written to memory. */
>  static inline int __eblock(void *addr)
>  {
>  	return __encls_ret_1(EBLOCK, addr);
>  }
>  
> +/* Initialize an EPC page into a Version Array (VA) page. */
>  static inline int __epa(void *addr)
>  {
>  	unsigned long rbx = SGX_PAGE_TYPE_VA;
> @@ -194,6 +208,7 @@ static inline int __epa(void *addr)
>  	return __encls_2(EPA, rbx, addr);
>  }
>  
> +/* Invalidate an EPC page and write it out to main memory. */
>  static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
>  			void *va)
>  {
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
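
As a reading aid for the wrappers above, the following sketch collects the
leaf functions touched by this patch into a lookup table keyed by the EAX
value. The short descriptions paraphrase the comments added in the patch.
The EAX values are not part of this patch; they are my assumption based on
the Intel SDM, and the table and encls_lookup() are hypothetical names for
illustration only.

```c
#include <stddef.h>

struct encls_leaf {
	unsigned int eax;	/* value loaded into EAX before ENCLS (assumed) */
	const char *name;
	const char *what;	/* paraphrased from the comments in the patch */
};

static const struct encls_leaf encls_leaves[] = {
	{ 0x00, "ECREATE", "initialize an EPC page into a SECS page" },
	{ 0x01, "EADD",    "associate an EPC page to an enclave as REG or TCS" },
	{ 0x02, "EINIT",   "finalize enclave build" },
	{ 0x03, "EREMOVE", "disassociate an EPC page from its enclave" },
	{ 0x04, "EDBGRD",  "copy data from a debug enclave page" },
	{ 0x05, "EDBGWR",  "copy data to a debug enclave page" },
	{ 0x06, "EEXTEND", "hash a 256 byte region into SECS:MRENCLAVE" },
	{ 0x08, "ELDU",    "load, verify, and unblock an EPC page" },
	{ 0x09, "EBLOCK",  "make an EPC page inaccessible to the enclave" },
	{ 0x0A, "EPA",     "initialize an EPC page into a VA page" },
	{ 0x0B, "EWB",     "invalidate an EPC page and write it out" },
	{ 0x0C, "ETRACK",  "track completion of the required TLB clears" },
};

/* Return the table entry for a leaf number, or NULL if it is not listed. */
static const struct encls_leaf *encls_lookup(unsigned int eax)
{
	size_t i;

	for (i = 0; i < sizeof(encls_leaves) / sizeof(encls_leaves[0]); i++)
		if (encls_leaves[i].eax == eax)
			return &encls_leaves[i];

	return NULL;
}
```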

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 02/30] x86/sgx: Add wrapper for SGX2 EMODPR function
  2022-04-04 16:49 ` [PATCH V3 02/30] x86/sgx: Add wrapper for SGX2 EMODPR function Reinette Chatre
@ 2022-04-05  6:53   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:53 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:10AM -0700, Reinette Chatre wrote:
> Add a wrapper for the EMODPR ENCLS leaf function used to
> restrict enclave page permissions as maintained in the
> SGX hardware's Enclave Page Cache Map (EPCM).
> 
> EMODPR:
> 1) Updates the EPCM permissions of an enclave page by treating
>    the new permissions as a mask. Supplying a value that attempts
>    to relax EPCM permissions has no effect on EPCM permissions
>    (PR bit, see below, is changed).
> 2) Sets the PR bit in the EPCM entry of the enclave page to
>    indicate that permission restriction is in progress. The bit
>    is reset by the enclave by invoking ENCLU leaf function
>    EACCEPT or EACCEPTCOPY.
> 
> The enclave may access the page throughout the entire process
> if conforming to the EPCM permissions for the enclave page.
> 
> After performing the permission restriction by issuing EMODPR
> the kernel needs to collaborate with the hardware to ensure that
> all logical processors see the new restricted permissions. This
> is required for the enclave's EACCEPT/EACCEPTCOPY to succeed and
> is accomplished with the ETRACK flow.
> 
> Expand enum sgx_return_code with the possible EMODPR return
> values.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Add detail to changelog that the PR bit is set even when the EPCM
>   permissions are not changed, as happens when relaxing of permissions
>   using EMODPR is attempted.
> 
> Changes since V1:
> - Split original patch ("x86/sgx: Add wrappers for SGX2 functions")
>   in three to introduce the SGX2 functions separately (Jarkko).
> - Rewrite commit message to include how the EPCM within the hardware
>   is changed by the SGX2 function as well as the calling
>   conditions (Jarkko).
> - Make short description more specific to which permissions (EPCM
>   permissions) the function modifies.
> 
>  arch/x86/include/asm/sgx.h      | 5 +++++
>  arch/x86/kernel/cpu/sgx/encls.h | 6 ++++++
>  2 files changed, 11 insertions(+)
> 
> diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> index 3f9334ef67cd..d67810b50a81 100644
> --- a/arch/x86/include/asm/sgx.h
> +++ b/arch/x86/include/asm/sgx.h
> @@ -65,17 +65,22 @@ enum sgx_encls_function {
>  
>  /**
>   * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
> + * %SGX_EPC_PAGE_CONFLICT:	Page is being written by other ENCLS function.
>   * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
>   *				been completed yet.
>   * %SGX_CHILD_PRESENT		SECS has child pages present in the EPC.
>   * %SGX_INVALID_EINITTOKEN:	EINITTOKEN is invalid and enclave signer's
>   *				public key does not match IA32_SGXLEPUBKEYHASH.
> + * %SGX_PAGE_NOT_MODIFIABLE:	The EPC page cannot be modified because it
> + *				is in the PENDING or MODIFIED state.
>   * %SGX_UNMASKED_EVENT:		An unmasked event, e.g. INTR, was received
>   */
>  enum sgx_return_code {
> +	SGX_EPC_PAGE_CONFLICT		= 7,
>  	SGX_NOT_TRACKED			= 11,
>  	SGX_CHILD_PRESENT		= 13,
>  	SGX_INVALID_EINITTOKEN		= 16,
> +	SGX_PAGE_NOT_MODIFIABLE		= 20,
>  	SGX_UNMASKED_EVENT		= 128,
>  };
>  
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index 0e22fa8f77c5..2b091912f038 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -215,4 +215,10 @@ static inline int __ewb(struct sgx_pageinfo *pginfo, void *addr,
>  	return __encls_ret_3(EWB, pginfo, addr, va);
>  }
>  
> +/* Restrict the EPCM permissions of an EPC page. */
> +static inline int __emodpr(struct sgx_secinfo *secinfo, void *addr)
> +{
> +	return __encls_ret_2(EMODPR, secinfo, addr);
> +}
> +
>  #endif /* _X86_ENCLS_H */
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
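
The mask semantics described in the changelog can be illustrated with a
small software model. This is only a sketch of the behaviour as the commit
message states it: the requested permissions act as a mask, a relax attempt
leaves the permissions alone but still sets PR, and EACCEPT clears PR. The
struct and function names are hypothetical, and the checks the hardware
does around ETRACK and the SECINFO passed to EACCEPT are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERM_R 0x1u
#define PERM_W 0x2u
#define PERM_X 0x4u

/* Return code from the patch: page is in the PENDING or MODIFIED state. */
#define MODEL_SGX_PAGE_NOT_MODIFIABLE 20

/* Simplified view of the EPCM fields relevant to EMODPR. */
struct epcm_model {
	uint8_t perm;	/* current EPCM permission bits */
	bool pr;	/* permission restriction in progress */
	bool pending;	/* set by EAUG until accepted */
	bool modified;	/* set by EMODT until accepted */
};

/* Model of ENCLS[EMODPR]: returns 0 or an SGX-style error code. */
static int model_emodpr(struct epcm_model *e, uint8_t requested)
{
	if (e->pending || e->modified)
		return MODEL_SGX_PAGE_NOT_MODIFIABLE;

	/* The new permissions are a mask: bits can only be removed. */
	e->perm &= requested;
	/* PR is set even if no permission bit actually changed. */
	e->pr = true;
	return 0;
}

/* Model of the enclave acknowledging the restriction via ENCLU[EACCEPT]. */
static void model_eaccept(struct epcm_model *e)
{
	e->pr = false;
}
```

For example, requesting RWX on a page that currently has RW leaves the page
at RW with PR set, while requesting R afterwards drops W.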

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 03/30] x86/sgx: Add wrapper for SGX2 EMODT function
  2022-04-04 16:49 ` [PATCH V3 03/30] x86/sgx: Add wrapper for SGX2 EMODT function Reinette Chatre
@ 2022-04-05  6:53   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:53 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:11AM -0700, Reinette Chatre wrote:
> Add a wrapper for the EMODT ENCLS leaf function used to
> change the type of an enclave page as maintained in the
> SGX hardware's Enclave Page Cache Map (EPCM).
> 
> EMODT:
> 1) Updates the EPCM page type of the enclave page.
> 2) Sets the MODIFIED bit in the EPCM entry of the enclave page.
>    This bit is reset by the enclave by invoking ENCLU leaf
>    function EACCEPT or EACCEPTCOPY.
> 
> Access from within the enclave to the enclave page is not possible
> while the MODIFIED bit is set.
> 
> After changing the enclave page type by issuing EMODT the kernel
> needs to collaborate with the hardware to ensure that no logical
> processor continues to hold a reference to the changed page. This
> is required to ensure no required security checks are circumvented
> and is required for the enclave's EACCEPT/EACCEPTCOPY to succeed.
> Ensuring that no references to the changed page remain is
> accomplished with the ETRACK flow.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - Split original patch ("x86/sgx: Add wrappers for SGX2 functions")
>   in three to introduce the SGX2 functions separately (Jarkko).
> - Rewrite commit message to include how the EPCM within the hardware
>   is changed by the SGX2 function as well as the calling
>   conditions (Jarkko).
> 
>  arch/x86/kernel/cpu/sgx/encls.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index 2b091912f038..7a1ecf704ec1 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -221,4 +221,10 @@ static inline int __emodpr(struct sgx_secinfo *secinfo, void *addr)
>  	return __encls_ret_2(EMODPR, secinfo, addr);
>  }
>  
> +/* Change the type of an EPC page. */
> +static inline int __emodt(struct sgx_secinfo *secinfo, void *addr)
> +{
> +	return __encls_ret_2(EMODT, secinfo, addr);
> +}
> +
>  #endif /* _X86_ENCLS_H */
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 04/30] x86/sgx: Add wrapper for SGX2 EAUG function
  2022-04-04 16:49 ` [PATCH V3 04/30] x86/sgx: Add wrapper for SGX2 EAUG function Reinette Chatre
@ 2022-04-05  6:54   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:54 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:12AM -0700, Reinette Chatre wrote:
> Add a wrapper for the EAUG ENCLS leaf function used to
> add a page to an initialized enclave.
> 
> EAUG:
> 1) Stores all properties of the new enclave page in the SGX
>    hardware's Enclave Page Cache Map (EPCM).
> 2) Sets the PENDING bit in the EPCM entry of the enclave page.
>    This bit is cleared by the enclave by invoking ENCLU leaf
>    function EACCEPT or EACCEPTCOPY.
> 
> Access from within the enclave to the new enclave page is not
> possible until the PENDING bit is cleared.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - Split original patch ("x86/sgx: Add wrappers for SGX2 functions")
>   in three to introduce the SGX2 functions separately (Jarkko).
> - Rewrite commit message to include how the EPCM within the hardware
>   is changed by the SGX2 function as well as any calling
>   conditions (Jarkko).
> 
>  arch/x86/kernel/cpu/sgx/encls.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encls.h b/arch/x86/kernel/cpu/sgx/encls.h
> index 7a1ecf704ec1..99004b02e2ed 100644
> --- a/arch/x86/kernel/cpu/sgx/encls.h
> +++ b/arch/x86/kernel/cpu/sgx/encls.h
> @@ -227,4 +227,10 @@ static inline int __emodt(struct sgx_secinfo *secinfo, void *addr)
>  	return __encls_ret_2(EMODT, secinfo, addr);
>  }
>  
> +/* Zero a page of EPC memory and add it to an initialized enclave. */
> +static inline int __eaug(struct sgx_pageinfo *pginfo, void *addr)
> +{
> +	return __encls_2(EAUG, pginfo, addr);
> +}
> +
>  #endif /* _X86_ENCLS_H */
> -- 
> 2.25.1
> 


Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
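
For readers following the thread, the PENDING handshake described in the quoted commit message can be sketched as a tiny user-space model. This is only an illustration under stated assumptions: struct epcm_model and the model_*() helpers are hypothetical names, and the real EPCM is maintained by hardware and carries more state than the two flags shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one EPCM entry; not a kernel or hardware structure. */
struct epcm_model {
	bool valid;	/* entry describes an enclave page */
	bool pending;	/* set by EAUG, cleared by EACCEPT/EACCEPTCOPY */
};

/* Models ENCLS[EAUG]: record the new page and mark it PENDING. */
static void model_eaug(struct epcm_model *e)
{
	e->valid = true;
	e->pending = true;
}

/*
 * Models ENCLU[EACCEPT] for a freshly added page (simplified): the
 * enclave accepts the page, which clears PENDING. Returns 0 on success.
 */
static int model_eaccept(struct epcm_model *e)
{
	if (!e->valid || !e->pending)
		return -1;
	e->pending = false;
	return 0;
}

/* Enclave access is possible only once the page is valid and not PENDING. */
static bool model_can_access(const struct epcm_model *e)
{
	return e->valid && !e->pending;
}
```

In this model a page goes from inaccessible, to added but still inaccessible after model_eaug(), to accessible after model_eaccept(), which matches the ordering the commit message describes.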

* Re: [PATCH V3 05/30] x86/sgx: Support loading enclave page without VMA permissions check
  2022-04-04 16:49 ` [PATCH V3 05/30] x86/sgx: Support loading enclave page without VMA permissions check Reinette Chatre
@ 2022-04-05  6:56   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:56 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:13AM -0700, Reinette Chatre wrote:
> sgx_encl_load_page() is used to find and load an enclave page into
> enclave (EPC) memory, potentially loading it from the backing storage.
> Both usages of sgx_encl_load_page() are during an access to the
> enclave page from a VMA and thus the permissions of the VMA are
> considered before the enclave page is loaded.
> 
> SGX2 functions operating on enclave pages belonging to an initialized
> enclave require the page to be in the EPC. It is thus required to
> support loading enclave pages into the EPC independent of a VMA.
> 
> Split the current sgx_encl_load_page() to support the two usages:
> A new call, sgx_encl_load_page_in_vma(), behaves exactly like the
> current sgx_encl_load_page() that takes VMA permissions into account,
> while sgx_encl_load_page() just loads an enclave page into EPC.
> 
> VMA, PTE, and EPCM permissions would continue to dictate whether
> the pages can be accessed from within an enclave.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - New patch
> 
>  arch/x86/kernel/cpu/sgx/encl.c | 57 ++++++++++++++++++++++------------
>  arch/x86/kernel/cpu/sgx/encl.h |  2 ++
>  2 files changed, 40 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 7c63a1911fae..05ae1168391c 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -131,25 +131,10 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  	return epc_page;
>  }
>  
> -static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
> -						unsigned long addr,
> -						unsigned long vm_flags)
> +static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
> +						  struct sgx_encl_page *entry)
>  {
> -	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
>  	struct sgx_epc_page *epc_page;
> -	struct sgx_encl_page *entry;
> -
> -	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
> -	if (!entry)
> -		return ERR_PTR(-EFAULT);
> -
> -	/*
> -	 * Verify that the faulted page has equal or higher build time
> -	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
> -	 * VM_WRITE, VM_EXECUTE} in vma->vm_flags).
> -	 */
> -	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
> -		return ERR_PTR(-EFAULT);
>  
>  	/* Entry successfully located. */
>  	if (entry->epc_page) {
> @@ -175,6 +160,40 @@ static struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>  	return entry;
>  }
>  
> +static struct sgx_encl_page *sgx_encl_load_page_in_vma(struct sgx_encl *encl,
> +						       unsigned long addr,
> +						       unsigned long vm_flags)
> +{
> +	unsigned long vm_prot_bits = vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
> +	struct sgx_encl_page *entry;
> +
> +	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
> +	if (!entry)
> +		return ERR_PTR(-EFAULT);
> +
> +	/*
> +	 * Verify that the page has equal or higher build time
> +	 * permissions than the VMA permissions (i.e. the subset of {VM_READ,
> +	 * VM_WRITE, VM_EXEC} in vma->vm_flags).
> +	 */
> +	if ((entry->vm_max_prot_bits & vm_prot_bits) != vm_prot_bits)
> +		return ERR_PTR(-EFAULT);
> +
> +	return __sgx_encl_load_page(encl, entry);
> +}
> +
> +struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
> +					 unsigned long addr)
> +{
> +	struct sgx_encl_page *entry;
> +
> +	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
> +	if (!entry)
> +		return ERR_PTR(-EFAULT);
> +
> +	return __sgx_encl_load_page(encl, entry);
> +}
> +
>  static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>  {
>  	unsigned long addr = (unsigned long)vmf->address;
> @@ -196,7 +215,7 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
>  
>  	mutex_lock(&encl->lock);
>  
> -	entry = sgx_encl_load_page(encl, addr, vma->vm_flags);
> +	entry = sgx_encl_load_page_in_vma(encl, addr, vma->vm_flags);
>  	if (IS_ERR(entry)) {
>  		mutex_unlock(&encl->lock);
>  
> @@ -344,7 +363,7 @@ static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
>  	for ( ; ; ) {
>  		mutex_lock(&encl->lock);
>  
> -		entry = sgx_encl_load_page(encl, addr, vm_flags);
> +		entry = sgx_encl_load_page_in_vma(encl, addr, vm_flags);
>  		if (PTR_ERR(entry) != -EBUSY)
>  			break;
>  
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index fec43ca65065..6b34efba1602 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -116,5 +116,7 @@ unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
>  void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
>  bool sgx_va_page_full(struct sgx_va_page *va_page);
>  void sgx_encl_free_epc_page(struct sgx_epc_page *page);
> +struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
> +					 unsigned long addr);
>  
>  #endif /* _X86_ENCL_H */
> -- 
> 2.25.1
> 


Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
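
The permission test that stays in sgx_encl_load_page_in_vma() is a plain subset check on bitmasks. A minimal user-space sketch follows. The MODEL_VM_* constants and model_vma_perms_allowed() are hypothetical names introduced only for illustration; the values are stand-ins for the kernel's VM_READ, VM_WRITE and VM_EXEC.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for VM_READ, VM_WRITE and VM_EXEC. */
#define MODEL_VM_READ	0x1UL
#define MODEL_VM_WRITE	0x2UL
#define MODEL_VM_EXEC	0x4UL

/*
 * Return true when every permission requested by the VMA is also present
 * in the page's maximum permissions. Other VMA flags are masked out
 * first, as in the quoted patch.
 */
static bool model_vma_perms_allowed(unsigned long vm_max_prot_bits,
				    unsigned long vm_flags)
{
	unsigned long vm_prot_bits = vm_flags &
		(MODEL_VM_READ | MODEL_VM_WRITE | MODEL_VM_EXEC);

	return (vm_max_prot_bits & vm_prot_bits) == vm_prot_bits;
}
```

With maximum permissions of RW, a read-only or read-write VMA passes the check while a VMA that also asks for exec does not. The new sgx_encl_load_page() skips this check entirely, so callers that use it have no VMA to compare against.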

* Re: [PATCH V3 06/30] x86/sgx: Export sgx_encl_ewb_cpumask()
  2022-04-04 16:49 ` [PATCH V3 06/30] x86/sgx: Export sgx_encl_ewb_cpumask() Reinette Chatre
@ 2022-04-05  6:56   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:56 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:14AM -0700, Reinette Chatre wrote:
> Using sgx_encl_ewb_cpumask() to learn which CPUs might have executed
> an enclave is useful to ensure that TLBs are cleared when changes are
> made to enclave pages.
> 
> sgx_encl_ewb_cpumask() is used within the reclaimer when an enclave
> page is evicted. The upcoming SGX2 support enables changes to be
> made to enclave pages. TLBs must then not refer to the changed
> pages, so the SGX2 code will also need sgx_encl_ewb_cpumask().
> 
> Relocate sgx_encl_ewb_cpumask() to be with the rest of the enclave
> code in encl.c now that it is no longer unique to the reclaimer.
> 
> Take care to ensure that any future usage maintains the
> current context requirement that ETRACK has been called first.
> Expand the existing comments to highlight this while moving them
> to a more prominent location before the function.
> 
> No functional change.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - New patch split from original "x86/sgx: Use more generic name for
>   enclave cpumask function" (Jarkko).
> - Change subject line (Jarkko).
> - Fixup kernel-doc to use brackets in function name.
> 
>  arch/x86/kernel/cpu/sgx/encl.c | 67 ++++++++++++++++++++++++++++++++++
>  arch/x86/kernel/cpu/sgx/encl.h |  1 +
>  arch/x86/kernel/cpu/sgx/main.c | 29 ---------------
>  3 files changed, 68 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 05ae1168391c..c6525eba74e8 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -613,6 +613,73 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
>  	return 0;
>  }
>  
> +/**
> + * sgx_encl_ewb_cpumask() - Query which CPUs might be accessing the enclave
> + * @encl: the enclave
> + *
> + * Some SGX functions require that no cached linear-to-physical address
> + * mappings are present before they can succeed. For example, ENCLS[EWB]
> + * copies a page from the enclave page cache to regular main memory but
> + * it fails if it cannot ensure that there are no cached
> + * linear-to-physical address mappings referring to the page.
> + *
> + * SGX hardware flushes all cached linear-to-physical mappings on a CPU
> + * when an enclave is exited via ENCLU[EEXIT] or an Asynchronous Enclave
> + * Exit (AEX). Exiting an enclave will thus ensure cached linear-to-physical
> + * address mappings are cleared but coordination with the tracking done within
> + * the SGX hardware is needed to support the SGX functions that depend on this
> + * cache clearing.
> + *
> + * When the ENCLS[ETRACK] function is issued on an enclave the hardware
> + * tracks threads operating inside the enclave at that time. The SGX
> + * hardware tracking requires that all the identified threads must have
> + * exited the enclave in order to flush the mappings before a function such
> + * as ENCLS[EWB] will be permitted.
> + *
> + * The following flow is used to support SGX functions that require that
> + * no cached linear-to-physical address mappings are present:
> + * 1) Execute ENCLS[ETRACK] to initiate hardware tracking.
> + * 2) Use this function (sgx_encl_ewb_cpumask()) to query which CPUs might be
> + *    accessing the enclave.
> + * 3) Send IPI to identified CPUs, kicking them out of the enclave and
> + *    thus flushing all locally cached linear-to-physical address mappings.
> + * 4) Execute SGX function.
> + *
> + * Context: It is required to call this function after ENCLS[ETRACK].
> + *          This will ensure that if any new mm appears (racing with
> + *          sgx_encl_mm_add()) then the new mm will enter into the
> + *          enclave with fresh linear-to-physical address mappings.
> + *
> + *          It is required that all IPIs are completed before a new
> + *          ENCLS[ETRACK] is issued so be sure to protect steps 1 to 3
> + *          of the above flow with the enclave's mutex.
> + *
> + * Return: cpumask of CPUs that might be accessing @encl
> + */
> +const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
> +{
> +	cpumask_t *cpumask = &encl->cpumask;
> +	struct sgx_encl_mm *encl_mm;
> +	int idx;
> +
> +	cpumask_clear(cpumask);
> +
> +	idx = srcu_read_lock(&encl->srcu);
> +
> +	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
> +		if (!mmget_not_zero(encl_mm->mm))
> +			continue;
> +
> +		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
> +
> +		mmput_async(encl_mm->mm);
> +	}
> +
> +	srcu_read_unlock(&encl->srcu, idx);
> +
> +	return cpumask;
> +}
> +
>  static struct page *sgx_encl_get_backing_page(struct sgx_encl *encl,
>  					      pgoff_t index)
>  {
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index 6b34efba1602..d2acb4debde5 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -105,6 +105,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
>  
>  void sgx_encl_release(struct kref *ref);
>  int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
> +const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl);
>  int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
>  			 struct sgx_backing *backing);
>  void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 8e4bc6453d26..2de85f459492 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -203,35 +203,6 @@ static void sgx_ipi_cb(void *info)
>  {
>  }
>  
> -static const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
> -{
> -	cpumask_t *cpumask = &encl->cpumask;
> -	struct sgx_encl_mm *encl_mm;
> -	int idx;
> -
> -	/*
> -	 * Can race with sgx_encl_mm_add(), but ETRACK has already been
> -	 * executed, which means that the CPUs running in the new mm will enter
> -	 * into the enclave with a fresh epoch.
> -	 */
> -	cpumask_clear(cpumask);
> -
> -	idx = srcu_read_lock(&encl->srcu);
> -
> -	list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
> -		if (!mmget_not_zero(encl_mm->mm))
> -			continue;
> -
> -		cpumask_or(cpumask, cpumask, mm_cpumask(encl_mm->mm));
> -
> -		mmput_async(encl_mm->mm);
> -	}
> -
> -	srcu_read_unlock(&encl->srcu, idx);
> -
> -	return cpumask;
> -}
> -
>  /*
>   * Swap page to the regular memory transformed to the blocked state by using
>   * EBLOCK, which means that it can no longer be referenced (no new TLB entries).
> -- 
> 2.25.1
> 


Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
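
The mask computation in the relocated function amounts to OR-ing together the CPU masks of every mm that is still alive. A rough user-space sketch, assuming a hypothetical struct mm_model with at most 64 CPUs represented as bits of a uint64_t (the kernel uses cpumask_t, SRCU and mmget_not_zero() instead):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an mm attached to the enclave. */
struct mm_model {
	int users;		/* 0 models an mm that is going away */
	uint64_t cpu_bits;	/* bit N set: CPU N may have run this mm */
};

/*
 * Union of the CPU masks of all live mms. In the real flow the caller
 * has already executed ENCLS[ETRACK] and sends IPIs to the result.
 */
static uint64_t model_encl_cpumask(const struct mm_model *mms, size_t nr)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!mms[i].users)	/* models mmget_not_zero() failing */
			continue;
		mask |= mms[i].cpu_bits;
	}

	return mask;
}
```

The result is deliberately conservative: a CPU appears in the mask if it might have run the enclave, not only if it is inside it right now, which is why the kernel-doc speaks of CPUs that "might be accessing" the enclave.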

* Re: [PATCH V3 07/30] x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask()
  2022-04-04 16:49 ` [PATCH V3 07/30] x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask() Reinette Chatre
@ 2022-04-05  6:57   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:57 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:15AM -0700, Reinette Chatre wrote:
> sgx_encl_ewb_cpumask() is no longer unique to the reclaimer. There it
> is used during the EWB ENCLS leaf function, when EPC pages are written
> out to main memory, to learn which CPUs might have executed the
> enclave so that TLBs are cleared.
> 
> Upcoming SGX2 enabling will use sgx_encl_ewb_cpumask() during the
> EMODPR and EMODT ENCLS leaf functions that make changes to enclave
> pages. The function is needed for the same reason it is used now: to
> learn which CPUs might have executed the enclave to ensure that TLBs
> no longer point to the changed pages.
> 
> Rename sgx_encl_ewb_cpumask() to sgx_encl_cpumask() to reflect the
> broader usage.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - New patch split from original "x86/sgx: Use more generic name for
>   enclave cpumask function" (Jarkko).
> 
>  arch/x86/kernel/cpu/sgx/encl.c | 6 +++---
>  arch/x86/kernel/cpu/sgx/encl.h | 2 +-
>  arch/x86/kernel/cpu/sgx/main.c | 2 +-
>  3 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index c6525eba74e8..8de9bebc4d81 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -614,7 +614,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
>  }
>  
>  /**
> - * sgx_encl_ewb_cpumask() - Query which CPUs might be accessing the enclave
> + * sgx_encl_cpumask() - Query which CPUs might be accessing the enclave
>   * @encl: the enclave
>   *
>   * Some SGX functions require that no cached linear-to-physical address
> @@ -639,7 +639,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
>   * The following flow is used to support SGX functions that require that
>   * no cached linear-to-physical address mappings are present:
>   * 1) Execute ENCLS[ETRACK] to initiate hardware tracking.
> - * 2) Use this function (sgx_encl_ewb_cpumask()) to query which CPUs might be
> + * 2) Use this function (sgx_encl_cpumask()) to query which CPUs might be
>   *    accessing the enclave.
>   * 3) Send IPI to identified CPUs, kicking them out of the enclave and
>   *    thus flushing all locally cached linear-to-physical address mappings.
> @@ -656,7 +656,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
>   *
>   * Return: cpumask of CPUs that might be accessing @encl
>   */
> -const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl)
> +const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl)
>  {
>  	cpumask_t *cpumask = &encl->cpumask;
>  	struct sgx_encl_mm *encl_mm;
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index d2acb4debde5..e59c2cbf71e2 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -105,7 +105,7 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
>  
>  void sgx_encl_release(struct kref *ref);
>  int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
> -const cpumask_t *sgx_encl_ewb_cpumask(struct sgx_encl *encl);
> +const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl);
>  int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
>  			 struct sgx_backing *backing);
>  void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 2de85f459492..fa33922879bf 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -249,7 +249,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page,
>  			 * miss cpus that entered the enclave between
>  			 * generating the mask and incrementing epoch.
>  			 */
> -			on_each_cpu_mask(sgx_encl_ewb_cpumask(encl),
> +			on_each_cpu_mask(sgx_encl_cpumask(encl),
>  					 sgx_ipi_cb, NULL, 1);
>  			ret = __sgx_encl_ewb(epc_page, va_slot, backing);
>  		}
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko


* Re: [PATCH V3 08/30] x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes()
  2022-04-04 16:49 ` [PATCH V3 08/30] x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes() Reinette Chatre
@ 2022-04-05  6:59   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:59 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:16AM -0700, Reinette Chatre wrote:
> The SGX reclaimer removes page table entries pointing to pages that are
> moved to swap.
> 
> SGX2 enables changes to pages belonging to an initialized enclave, thus
> enclave pages may have their permission or type changed while the page
> is being accessed by an enclave. Supporting SGX2 requires page table
> entries to be removed so that any cached mappings to changed pages
> are removed. For example, with the ability to change enclave page types
> a regular enclave page may be changed to a Thread Control Structure
> (TCS) page that may not be accessed by an enclave.
> 
> Factor out the code removing page table entries to a separate function
> sgx_zap_enclave_ptes(), fixing accuracy of comments in the process,
> and make it available to the upcoming SGX2 code.
> 
> Place sgx_zap_enclave_ptes() with the rest of the enclave code in
> encl.c interacting with the page table since this code is no longer
> unique to the reclaimer.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - Elaborate why SGX2 needs this ability (Jarkko).
> - More specific subject.
> - Fix kernel-doc to have brackets in function name.
> 
>  arch/x86/kernel/cpu/sgx/encl.c | 45 +++++++++++++++++++++++++++++++++-
>  arch/x86/kernel/cpu/sgx/encl.h |  2 +-
>  arch/x86/kernel/cpu/sgx/main.c | 31 ++---------------------
>  3 files changed, 47 insertions(+), 31 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 8de9bebc4d81..c77a62432862 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -605,7 +605,7 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
>  
>  	spin_lock(&encl->mm_lock);
>  	list_add_rcu(&encl_mm->list, &encl->mm_list);
> -	/* Pairs with smp_rmb() in sgx_reclaimer_block(). */
> +	/* Pairs with smp_rmb() in sgx_zap_enclave_ptes(). */
>  	smp_wmb();
>  	encl->mm_list_version++;
>  	spin_unlock(&encl->mm_lock);
> @@ -792,6 +792,49 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm,
>  	return ret;
>  }
>  
> +/**
> + * sgx_zap_enclave_ptes() - remove PTEs mapping the address from enclave
> + * @encl: the enclave
> + * @addr: page aligned pointer to single page for which PTEs will be removed
> + *
> + * Multiple VMAs may have an enclave page mapped. Remove the PTE mapping
> + * @addr from each VMA. Ensure that page fault handler is ready to handle
> + * new mappings of @addr before calling this function.
> + */
> +void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
> +{
> +	unsigned long mm_list_version;
> +	struct sgx_encl_mm *encl_mm;
> +	struct vm_area_struct *vma;
> +	int idx, ret;
> +
> +	do {
> +		mm_list_version = encl->mm_list_version;
> +
> +		/* Pairs with smp_wmb() in sgx_encl_mm_add(). */
> +		smp_rmb();
> +
> +		idx = srcu_read_lock(&encl->srcu);
> +
> +		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
> +			if (!mmget_not_zero(encl_mm->mm))
> +				continue;
> +
> +			mmap_read_lock(encl_mm->mm);
> +
> +			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
> +			if (!ret && encl == vma->vm_private_data)
> +				zap_vma_ptes(vma, addr, PAGE_SIZE);
> +
> +			mmap_read_unlock(encl_mm->mm);
> +
> +			mmput_async(encl_mm->mm);
> +		}
> +
> +		srcu_read_unlock(&encl->srcu, idx);
> +	} while (unlikely(encl->mm_list_version != mm_list_version));
> +}
> +
>  /**
>   * sgx_alloc_va_page() - Allocate a Version Array (VA) page
>   *
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index e59c2cbf71e2..1b15d22f6757 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -111,7 +111,7 @@ int sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_index,
>  void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write);
>  int sgx_encl_test_and_clear_young(struct mm_struct *mm,
>  				  struct sgx_encl_page *page);
> -
> +void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
>  struct sgx_epc_page *sgx_alloc_va_page(void);
>  unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
>  void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index fa33922879bf..ce9e87d5f8ec 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -137,36 +137,9 @@ static void sgx_reclaimer_block(struct sgx_epc_page *epc_page)
>  	struct sgx_encl_page *page = epc_page->owner;
>  	unsigned long addr = page->desc & PAGE_MASK;
>  	struct sgx_encl *encl = page->encl;
> -	unsigned long mm_list_version;
> -	struct sgx_encl_mm *encl_mm;
> -	struct vm_area_struct *vma;
> -	int idx, ret;
> -
> -	do {
> -		mm_list_version = encl->mm_list_version;
> -
> -		/* Pairs with smp_rmb() in sgx_encl_mm_add(). */
> -		smp_rmb();
> -
> -		idx = srcu_read_lock(&encl->srcu);
> -
> -		list_for_each_entry_rcu(encl_mm, &encl->mm_list, list) {
> -			if (!mmget_not_zero(encl_mm->mm))
> -				continue;
> -
> -			mmap_read_lock(encl_mm->mm);
> -
> -			ret = sgx_encl_find(encl_mm->mm, addr, &vma);
> -			if (!ret && encl == vma->vm_private_data)
> -				zap_vma_ptes(vma, addr, PAGE_SIZE);
> -
> -			mmap_read_unlock(encl_mm->mm);
> -
> -			mmput_async(encl_mm->mm);
> -		}
> +	int ret;
>  
> -		srcu_read_unlock(&encl->srcu, idx);
> -	} while (unlikely(encl->mm_list_version != mm_list_version));
> +	sgx_zap_enclave_ptes(encl, addr);
>  
>  	mutex_lock(&encl->lock);
>  
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

* Re: [PATCH V3 09/30] x86/sgx: Make sgx_ipi_cb() available internally
  2022-04-04 16:49 ` [PATCH V3 09/30] x86/sgx: Make sgx_ipi_cb() available internally Reinette Chatre
@ 2022-04-05  6:59   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  6:59 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:17AM -0700, Reinette Chatre wrote:
> The ETRACK function followed by an IPI to all CPUs within an enclave
> is a common pattern with more frequent use in support of SGX2.
> 
> Make the (empty) IPI callback function available internally in
> preparation for usage by SGX2.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - Replace "for more usages" by "for usage by SGX2" (Jarkko)
> 
>  arch/x86/kernel/cpu/sgx/main.c | 2 +-
>  arch/x86/kernel/cpu/sgx/sgx.h  | 2 ++
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index ce9e87d5f8ec..6e2cb7564080 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -172,7 +172,7 @@ static int __sgx_encl_ewb(struct sgx_epc_page *epc_page, void *va_slot,
>  	return ret;
>  }
>  
> -static void sgx_ipi_cb(void *info)
> +void sgx_ipi_cb(void *info)
>  {
>  }
>  
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index 0f17def9fe6f..b30cee4de903 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -90,6 +90,8 @@ void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
>  int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
>  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
>  
> +void sgx_ipi_cb(void *info);
> +
>  #ifdef CONFIG_X86_SGX_KVM
>  int __init sgx_vepc_init(void);
>  #else
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

* Re: [PATCH V3 10/30] x86/sgx: Create utility to validate user provided offset and length
  2022-04-04 16:49 ` [PATCH V3 10/30] x86/sgx: Create utility to validate user provided offset and length Reinette Chatre
@ 2022-04-05  7:00   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:00 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:18AM -0700, Reinette Chatre wrote:
> User provided offset and length are validated when parsing the parameters
> of the SGX_IOC_ENCLAVE_ADD_PAGES ioctl(). Extract this validation
> into a utility that can be used by the SGX2 ioctl()s that will
> also provide these values.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - New patch
> 
>  arch/x86/kernel/cpu/sgx/ioctl.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 83df20e3e633..f487549bccba 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -372,6 +372,26 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
>  	return ret;
>  }
>  
> +/*
> + * Ensure user provided offset and length values are valid for
> + * an enclave.
> + */
> +static int sgx_validate_offset_length(struct sgx_encl *encl,
> +				      unsigned long offset,
> +				      unsigned long length)
> +{
> +	if (!IS_ALIGNED(offset, PAGE_SIZE))
> +		return -EINVAL;
> +
> +	if (!length || length & (PAGE_SIZE - 1))
> +		return -EINVAL;
> +
> +	if (offset + length - PAGE_SIZE >= encl->size)
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +
>  /**
>   * sgx_ioc_enclave_add_pages() - The handler for %SGX_IOC_ENCLAVE_ADD_PAGES
>   * @encl:       an enclave pointer
> @@ -425,14 +445,10 @@ static long sgx_ioc_enclave_add_pages(struct sgx_encl *encl, void __user *arg)
>  	if (copy_from_user(&add_arg, arg, sizeof(add_arg)))
>  		return -EFAULT;
>  
> -	if (!IS_ALIGNED(add_arg.offset, PAGE_SIZE) ||
> -	    !IS_ALIGNED(add_arg.src, PAGE_SIZE))
> -		return -EINVAL;
> -
> -	if (!add_arg.length || add_arg.length & (PAGE_SIZE - 1))
> +	if (!IS_ALIGNED(add_arg.src, PAGE_SIZE))
>  		return -EINVAL;
>  
> -	if (add_arg.offset + add_arg.length - PAGE_SIZE >= encl->size)
> +	if (sgx_validate_offset_length(encl, add_arg.offset, add_arg.length))
>  		return -EINVAL;
>  
>  	if (copy_from_user(&secinfo, (void __user *)add_arg.secinfo,
> -- 
> 2.25.1
> 


Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
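
The three checks in the new helper can be tried out in user space. The sketch below mirrors the logic of sgx_validate_offset_length() from the quoted patch, with the assumptions that the enclave size is passed directly instead of through struct sgx_encl and that the page size is 4096 bytes. MODEL_PAGE_SIZE and model_validate_offset_length() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/*
 * Return 0 when [offset, offset + length) is a page-aligned, non-empty
 * range whose last page starts inside an enclave of encl_size bytes,
 * and -EINVAL otherwise.
 */
static int model_validate_offset_length(unsigned long encl_size,
					unsigned long offset,
					unsigned long length)
{
	if (offset & (MODEL_PAGE_SIZE - 1))
		return -EINVAL;

	if (!length || (length & (MODEL_PAGE_SIZE - 1)))
		return -EINVAL;

	/* offset + length - PAGE_SIZE is the start of the last page. */
	if (offset + length - MODEL_PAGE_SIZE >= encl_size)
		return -EINVAL;

	return 0;
}
```

For a four page enclave (0x4000 bytes), an offset of 0 with a length of 0x4000 is accepted, while an offset of 0x1000 with the same length is rejected because the last page would start at the enclave's end.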

* Re: [PATCH V3 11/30] x86/sgx: Keep record of SGX page type
  2022-04-04 16:49 ` [PATCH V3 11/30] x86/sgx: Keep record of SGX page type Reinette Chatre
@ 2022-04-05  7:00   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:00 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:19AM -0700, Reinette Chatre wrote:
> SGX2 functions are not allowed on all page types. For example,
> ENCLS[EMODPR] is only allowed on regular SGX enclave pages and
> ENCLS[EMODT] is only allowed on TCS and regular pages. If these
> functions are attempted on another type of page the hardware would
> trigger a fault.
> 
> Keep a record of the SGX page type so that there is more
> certainty whether an SGX2 instruction can succeed and faults
> can be treated as real failures.
> 
> The page type is a property of struct sgx_encl_page
> and thus does not cover the VA page type. VA pages are maintained
> in separate structures and their type can be determined in
> a different way. The SGX2 instructions needing the page type do not
> operate on VA pages and this is thus not a scenario needing to
> be covered at this time.
> 
> struct sgx_encl_page hosting this information is maintained for each
> enclave page so the space consumed by the struct is important.
> The existing sgx_encl_page->vm_max_prot_bits is already unsigned long
> while only using three bits. Transition to a bitfield for the two
> members to support the additional information without increasing
> the space consumed by the struct.
> 
> Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Update changelog to motivate transition to bitfield that
>   was previously done when (now removed) vm_run_prot_bits was
>   added.
> 
> Changes since V1:
> - Add Acked-by from Jarkko.
> 
>  arch/x86/include/asm/sgx.h      | 3 +++
>  arch/x86/kernel/cpu/sgx/encl.h  | 3 ++-
>  arch/x86/kernel/cpu/sgx/ioctl.c | 2 ++
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/sgx.h b/arch/x86/include/asm/sgx.h
> index d67810b50a81..eae20fa52b93 100644
> --- a/arch/x86/include/asm/sgx.h
> +++ b/arch/x86/include/asm/sgx.h
> @@ -239,6 +239,9 @@ struct sgx_pageinfo {
>   * %SGX_PAGE_TYPE_REG:	a regular page
>   * %SGX_PAGE_TYPE_VA:	a VA page
>   * %SGX_PAGE_TYPE_TRIM:	a page in trimmed state
> + *
> + * Make sure when making changes to this enum that its values can still fit
> + * in the bitfield within &struct sgx_encl_page
>   */
>  enum sgx_page_type {
>  	SGX_PAGE_TYPE_SECS,
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index 1b15d22f6757..07abfc70c8e3 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -27,7 +27,8 @@
>  
>  struct sgx_encl_page {
>  	unsigned long desc;
> -	unsigned long vm_max_prot_bits;
> +	unsigned long vm_max_prot_bits:8;
> +	enum sgx_page_type type:16;
>  	struct sgx_epc_page *epc_page;
>  	struct sgx_encl *encl;
>  	struct sgx_va_page *va_page;
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index f487549bccba..0c211af8e948 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -107,6 +107,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
>  		set_bit(SGX_ENCL_DEBUG, &encl->flags);
>  
>  	encl->secs.encl = encl;
> +	encl->secs.type = SGX_PAGE_TYPE_SECS;
>  	encl->base = secs->base;
>  	encl->size = secs->size;
>  	encl->attributes = secs->attributes;
> @@ -344,6 +345,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
>  	 */
>  	encl_page->encl = encl;
>  	encl_page->epc_page = epc_page;
> +	encl_page->type = (secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;
>  	encl->secs_child_cnt++;
>  
>  	if (flags & SGX_PAGE_MEASURE) {
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

* Re: [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes
  2022-04-04 16:49 ` [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes Reinette Chatre
@ 2022-04-05  7:02   ` Jarkko Sakkinen
  2022-04-05  7:03     ` Jarkko Sakkinen
  2022-04-05 17:28     ` Reinette Chatre
  0 siblings, 2 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:02 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:29AM -0700, Reinette Chatre wrote:
> EPCM permission changes can be made from within the enclave (to relax
> permissions) or from outside it (to restrict permissions). Kernel
> support is needed when permissions are restricted to be able to
> call the privileged ENCLS[EMODPR] instruction. EPCM permissions
> can be relaxed via ENCLU[EMODPE] from within the enclave but the
> enclave still depends on the kernel to install PTEs with the needed
> permissions.
> 
> Add a test that exercises a few of the enclave page permission flows:
> 1) Test starts with a RW (from enclave and kernel perspective)
>    enclave page that is mapped via a RW VMA.
> 2) Use the SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl() to restrict
>    the enclave (EPCM) page permissions to read-only.
> 3) Run ENCLU[EACCEPT] from within the enclave to accept the new page
>    permissions.
> 4) Attempt to write to the enclave page from within the enclave - this
>    should fail with a page fault on the EPCM permissions since the page
>    table entry continues to allow RW access.
> 5) Restore EPCM permissions to RW by running ENCLU[EMODPE] from within
>    the enclave.
> 6) Attempt to write to the enclave page from within the enclave - this
>    should succeed since both EPCM and PTE permissions allow this access.
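
The six steps above can be followed with a toy model in which a write
succeeds only if both the PTE and the EPCM allow it. This is a sketch for
reasoning about the flow, not how the hardware or the selftest is
implemented: all toy_* names are hypothetical, and the ENCLU[EACCEPT] step
is left out for brevity.

```c
#include <assert.h>

#define PERM_R 0x1u
#define PERM_W 0x2u

enum toy_result { TOY_OK, TOY_FAULT_PTE, TOY_FAULT_EPCM };

struct toy_page {
	unsigned int epcm;	/* permissions tracked by the EPCM */
	unsigned int pte;	/* permissions installed by the kernel */
};

/* Models ENCLS[EMODPR]: can only restrict EPCM permissions. */
static void toy_emodpr(struct toy_page *page, unsigned int perms)
{
	page->epcm &= perms;
}

/* Models ENCLU[EMODPE]: can only extend EPCM permissions. */
static void toy_emodpe(struct toy_page *page, unsigned int perms)
{
	page->epcm |= perms;
}

/* A write needs both the PTE and the EPCM to allow it. */
static enum toy_result toy_write(const struct toy_page *page)
{
	if (!(page->pte & PERM_W))
		return TOY_FAULT_PTE;
	if (!(page->epcm & PERM_W))
		return TOY_FAULT_EPCM;
	return TOY_OK;
}
```

Starting from a page with RW in both fields, toy_emodpr(&page, PERM_R)
makes toy_write() report TOY_FAULT_EPCM while the PTE is unchanged, and
toy_emodpe(&page, PERM_R | PERM_W) makes the write succeed again.
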
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Modify test to support separation between EPCM and PTE/VMA permissions
>   - Fix changelog and comments to reflect new relationship between
>     EPCM and PTE/VMA permissions.
>   - With EPCM permissions controlling access instead of PTE permissions,
>     check for SGX error code now encountered in page fault.
>   - Stop calling SGX_IOC_ENCLAVE_RELAX_PERMISSIONS and ensure that
>     only calling ENCLU[EMODPE] from within enclave is necessary to restore
>     access to the enclave page.
> - Update to use new struct name struct sgx_enclave_restrict_perm -> struct
>   sgx_enclave_restrict_permissions. (Jarkko)
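
For reference, the SGX error code the test checks for (0x8007) decomposes
into the usual x86 page fault error code bits plus the SGX bit. The sketch
below mirrors the X86_PF_* bit positions from
arch/x86/include/asm/trap_pf.h; the PF_* macro names and the helper
functions are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions mirror X86_PF_* in arch/x86/include/asm/trap_pf.h. */
#define PF_PROT		(1u << 0)	/* protection fault, page was present */
#define PF_WRITE	(1u << 1)	/* fault caused by a write */
#define PF_USER		(1u << 2)	/* fault from user mode */
#define PF_SGX		(1u << 15)	/* SGX access-control (EPCM) violation */

/* True if the fault was raised by SGX access control. */
static bool pf_is_sgx(uint32_t error_code)
{
	return (error_code & PF_SGX) != 0;
}

/* True for a user-mode write to a present page that the EPCM rejected. */
static bool pf_is_epcm_write_fault(uint32_t error_code)
{
	const uint32_t want = PF_SGX | PF_USER | PF_WRITE | PF_PROT;

	return (error_code & want) == want;
}
```

A regular write-protection fault from user space would carry 0x7, which is
why the SGX bit is what tells the two cases apart.
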
> 
> Changes since V1:
> - Adapt test to the kernel interface changes: the ioctl() name change
>   and providing entire secinfo as parameter.
> - Remove the ENCLU[EACCEPT] call after permissions are relaxed since
>   the new flow no longer results in the EPCM PR bit being set.
> - Rewrite error path to reduce line lengths.
> 
>  tools/testing/selftests/sgx/defines.h   |  15 ++
>  tools/testing/selftests/sgx/main.c      | 218 ++++++++++++++++++++++++
>  tools/testing/selftests/sgx/test_encl.c |  38 +++++
>  3 files changed, 271 insertions(+)
> 
> diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h
> index 02d775789ea7..b638eb98c80c 100644
> --- a/tools/testing/selftests/sgx/defines.h
> +++ b/tools/testing/selftests/sgx/defines.h
> @@ -24,6 +24,8 @@ enum encl_op_type {
>  	ENCL_OP_PUT_TO_ADDRESS,
>  	ENCL_OP_GET_FROM_ADDRESS,
>  	ENCL_OP_NOP,
> +	ENCL_OP_EACCEPT,
> +	ENCL_OP_EMODPE,
>  	ENCL_OP_MAX,
>  };
>  
> @@ -53,4 +55,17 @@ struct encl_op_get_from_addr {
>  	uint64_t addr;
>  };
>  
> +struct encl_op_eaccept {
> +	struct encl_op_header header;
> +	uint64_t epc_addr;
> +	uint64_t flags;
> +	uint64_t ret;
> +};
> +
> +struct encl_op_emodpe {
> +	struct encl_op_header header;
> +	uint64_t epc_addr;
> +	uint64_t flags;
> +};
> +
>  #endif /* DEFINES_H */
> diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
> index dd74fa42302e..0e0bd1c4d702 100644
> --- a/tools/testing/selftests/sgx/main.c
> +++ b/tools/testing/selftests/sgx/main.c
> @@ -25,6 +25,18 @@ static const uint64_t MAGIC = 0x1122334455667788ULL;
>  static const uint64_t MAGIC2 = 0x8877665544332211ULL;
>  vdso_sgx_enter_enclave_t vdso_sgx_enter_enclave;
>  
> +/*
> + * Security Information (SECINFO) data structure needed by a few SGX
> + * instructions (eg. ENCLU[EACCEPT] and ENCLU[EMODPE]) holds meta-data
> + * about an enclave page. &enum sgx_secinfo_page_state specifies the
> + * secinfo flags used for page state.
> + */
> +enum sgx_secinfo_page_state {
> +	SGX_SECINFO_PENDING = (1 << 3),
> +	SGX_SECINFO_MODIFIED = (1 << 4),
> +	SGX_SECINFO_PR = (1 << 5),
> +};
> +
>  struct vdso_symtab {
>  	Elf64_Sym *elf_symtab;
>  	const char *elf_symstrtab;
> @@ -555,4 +567,210 @@ TEST_F(enclave, pte_permissions)
>  	EXPECT_EQ(self->run.exception_addr, 0);
>  }
>  
> +/*
> + * Enclave page permission test.
> + *
> + * Modify and restore enclave page's EPCM (enclave) permissions from
> + * outside enclave (ENCLS[EMODPR] via kernel) as well as from within
> + * enclave (via ENCLU[EMODPE]). Check for page fault if
> + * VMA allows access but EPCM permissions do not.
> + */
> +TEST_F(enclave, epcm_permissions)
> +{
> +	struct sgx_enclave_restrict_permissions restrict_ioc;
> +	struct encl_op_get_from_addr get_addr_op;
> +	struct encl_op_put_to_addr put_addr_op;
> +	struct encl_op_eaccept eaccept_op;
> +	struct encl_op_emodpe emodpe_op;
> +	struct sgx_secinfo secinfo;
> +	unsigned long data_start;
> +	int ret, errno_save;
> +
> +	ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata));
> +
> +	memset(&self->run, 0, sizeof(self->run));
> +	self->run.tcs = self->encl.encl_base;
> +
> +	/*
> +	 * Ensure kernel supports needed ioctl() and system supports needed
> +	 * commands.
> +	 */
> +	memset(&restrict_ioc, 0, sizeof(restrict_ioc));
> +	memset(&secinfo, 0, sizeof(secinfo));
> +
> +	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS,
> +		    &restrict_ioc);
> +	errno_save = ret == -1 ? errno : 0;
> +
> +	/*
> +	 * Invalid parameters were provided during sanity check,
> +	 * expect command to fail.
> +	 */
> +	ASSERT_EQ(ret, -1);
> +
> +	/* ret == -1 */
> +	if (errno_save == ENOTTY)
> +		SKIP(return,
> +		     "Kernel does not support SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS ioctl()");
> +	else if (errno_save == ENODEV)
> +		SKIP(return, "System does not support SGX2");
> +
> +	/*
> +	 * Page that will have its permissions changed is the second data
> +	 * page in the .data segment. This forms part of the local encl_buffer
> +	 * within the enclave.
> +	 *
> +	 * At start of test @data_start should have EPCM as well as PTE and
> +	 * VMA permissions of RW.
> +	 */
> +
> +	data_start = self->encl.encl_base +
> +		     encl_get_data_offset(&self->encl) + PAGE_SIZE;
> +
> +	/*
> +	 * Sanity check that page at @data_start is writable before making
> +	 * any changes to page permissions.
> +	 *
> +	 * Start by writing MAGIC to test page.
> +	 */
> +	put_addr_op.value = MAGIC;
> +	put_addr_op.addr = data_start;
> +	put_addr_op.header.type = ENCL_OP_PUT_TO_ADDRESS;
> +
> +	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
> +
> +	EXPECT_EEXIT(&self->run);
> +	EXPECT_EQ(self->run.exception_vector, 0);
> +	EXPECT_EQ(self->run.exception_error_code, 0);
> +	EXPECT_EQ(self->run.exception_addr, 0);
> +
> +	/*
> +	 * Read memory that was just written to, confirming that
> +	 * page is writable.
> +	 */
> +	get_addr_op.value = 0;
> +	get_addr_op.addr = data_start;
> +	get_addr_op.header.type = ENCL_OP_GET_FROM_ADDRESS;
> +
> +	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
> +
> +	EXPECT_EQ(get_addr_op.value, MAGIC);
> +	EXPECT_EEXIT(&self->run);
> +	EXPECT_EQ(self->run.exception_vector, 0);
> +	EXPECT_EQ(self->run.exception_error_code, 0);
> +	EXPECT_EQ(self->run.exception_addr, 0);
> +
> +	/*
> +	 * Change EPCM permissions to read-only. Kernel still considers
> +	 * the page writable.
> +	 */
> +	memset(&restrict_ioc, 0, sizeof(restrict_ioc));
> +	memset(&secinfo, 0, sizeof(secinfo));
> +
> +	secinfo.flags = PROT_READ;
> +	restrict_ioc.offset = encl_get_data_offset(&self->encl) + PAGE_SIZE;
> +	restrict_ioc.length = PAGE_SIZE;
> +	restrict_ioc.secinfo = (unsigned long)&secinfo;
> +
> +	ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS,
> +		    &restrict_ioc);
> +	errno_save = ret == -1 ? errno : 0;
> +
> +	EXPECT_EQ(ret, 0);
> +	EXPECT_EQ(errno_save, 0);
> +	EXPECT_EQ(restrict_ioc.result, 0);
> +	EXPECT_EQ(restrict_ioc.count, 4096);
> +
> +	/*
> +	 * EPCM permissions changed from kernel, need to EACCEPT from enclave.
> +	 */
> +	eaccept_op.epc_addr = data_start;
> +	eaccept_op.flags = PROT_READ | SGX_SECINFO_REG | SGX_SECINFO_PR;
> +	eaccept_op.ret = 0;
> +	eaccept_op.header.type = ENCL_OP_EACCEPT;
> +
> +	EXPECT_EQ(ENCL_CALL(&eaccept_op, &self->run, true), 0);
> +
> +	EXPECT_EEXIT(&self->run);
> +	EXPECT_EQ(self->run.exception_vector, 0);
> +	EXPECT_EQ(self->run.exception_error_code, 0);
> +	EXPECT_EQ(self->run.exception_addr, 0);
> +	EXPECT_EQ(eaccept_op.ret, 0);
> +
> +	/*
> +	 * EPCM permissions of the page are now read-only, expect #PF
> +	 * on EPCM when attempting to write to page from within enclave.
> +	 */
> +	put_addr_op.value = MAGIC2;
> +
> +	EXPECT_EQ(ENCL_CALL(&put_addr_op, &self->run, true), 0);
> +
> +	EXPECT_EQ(self->run.function, ERESUME);
> +	EXPECT_EQ(self->run.exception_vector, 14);
> +	EXPECT_EQ(self->run.exception_error_code, 0x8007);
> +	EXPECT_EQ(self->run.exception_addr, data_start);
> +
> +	self->run.exception_vector = 0;
> +	self->run.exception_error_code = 0;
> +	self->run.exception_addr = 0;
> +
> +	/*
> +	 * Received AEX but cannot return to enclave at same entrypoint,
> +	 * need different TCS from where EPCM permission can be made writable
> +	 * again.
> +	 */
> +	self->run.tcs = self->encl.encl_base + PAGE_SIZE;
> +
> +	/*
> +	 * Enter enclave at new TCS to change EPCM permissions to be
> +	 * writable again and thus fix the page fault that triggered the
> +	 * AEX.
> +	 */
> +
> +	emodpe_op.epc_addr = data_start;
> +	emodpe_op.flags = PROT_READ | PROT_WRITE;
> +	emodpe_op.header.type = ENCL_OP_EMODPE;
> +
> +	EXPECT_EQ(ENCL_CALL(&emodpe_op, &self->run, true), 0);
> +
> +	EXPECT_EEXIT(&self->run);
> +	EXPECT_EQ(self->run.exception_vector, 0);
> +	EXPECT_EQ(self->run.exception_error_code, 0);
> +	EXPECT_EQ(self->run.exception_addr, 0);
> +
> +	/*
> +	 * Attempt to return to main TCS to resume execution at faulting
> +	 * instruction, PTE should continue to allow writing to the page.
> +	 */
> +	self->run.tcs = self->encl.encl_base;
> +
> +	/*
> +	 * Wrong page permissions that caused original fault have
> +	 * now been fixed via EPCM permissions.
> +	 * Resume execution in main TCS to re-attempt the memory access.
> +	 */
> +	self->run.tcs = self->encl.encl_base;
> +
> +	EXPECT_EQ(vdso_sgx_enter_enclave((unsigned long)&put_addr_op, 0, 0,
> +					 ERESUME, 0, 0,
> +					 &self->run),
> +		  0);
> +
> +	EXPECT_EEXIT(&self->run);
> +	EXPECT_EQ(self->run.exception_vector, 0);
> +	EXPECT_EQ(self->run.exception_error_code, 0);
> +	EXPECT_EQ(self->run.exception_addr, 0);
> +
> +	get_addr_op.value = 0;
> +
> +	EXPECT_EQ(ENCL_CALL(&get_addr_op, &self->run, true), 0);
> +
> +	EXPECT_EQ(get_addr_op.value, MAGIC2);
> +	EXPECT_EEXIT(&self->run);
> +	EXPECT_EQ(self->run.user_data, 0);
> +	EXPECT_EQ(self->run.exception_vector, 0);
> +	EXPECT_EQ(self->run.exception_error_code, 0);
> +	EXPECT_EQ(self->run.exception_addr, 0);
> +}
> +
>  TEST_HARNESS_MAIN
> diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c
> index 4fca01cfd898..5b6c65331527 100644
> --- a/tools/testing/selftests/sgx/test_encl.c
> +++ b/tools/testing/selftests/sgx/test_encl.c
> @@ -11,6 +11,42 @@
>   */
>  static uint8_t encl_buffer[8192] = { 1 };
>  
> +enum sgx_enclu_function {
> +	EACCEPT = 0x5,
> +	EMODPE = 0x6,
> +};
> +
> +static void do_encl_emodpe(void *_op)
> +{
> +	struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
> +	struct encl_op_emodpe *op = _op;
> +
> +	secinfo.flags = op->flags;
> +
> +	asm volatile(".byte 0x0f, 0x01, 0xd7"
> +				:
> +				: "a" (EMODPE),
> +				  "b" (&secinfo),
> +				  "c" (op->epc_addr));
> +}
> +
> +static void do_encl_eaccept(void *_op)
> +{
> +	struct sgx_secinfo secinfo __aligned(sizeof(struct sgx_secinfo)) = {0};
> +	struct encl_op_eaccept *op = _op;
> +	int rax;
> +
> +	secinfo.flags = op->flags;
> +
> +	asm volatile(".byte 0x0f, 0x01, 0xd7"
> +				: "=a" (rax)
> +				: "a" (EACCEPT),
> +				  "b" (&secinfo),
> +				  "c" (op->epc_addr));
> +
> +	op->ret = rax;
> +}
> +
>  static void *memcpy(void *dest, const void *src, size_t n)
>  {
>  	size_t i;
> @@ -62,6 +98,8 @@ void encl_body(void *rdi,  void *rsi)
>  		do_encl_op_put_to_addr,
>  		do_encl_op_get_from_addr,
>  		do_encl_op_nop,
> +		do_encl_eaccept,
> +		do_encl_emodpe,
>  	};
>  
>  	struct encl_op_header *op = (struct encl_op_header *)rdi;
> -- 
> 2.25.1
> 

Lacking from the recipients (from MAINTAINERS):

KERNEL SELFTEST FRAMEWORK
M:	Shuah Khan <shuah@kernel.org>
M:	Shuah Khan <skhan@linuxfoundation.org>
L:	linux-kselftest@vger.kernel.org
S:	Maintained
Q:	https://patchwork.kernel.org/project/linux-kselftest/list/
T:	git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
F:	Documentation/dev-tools/kselftest*
F:	tools/testing/selftests/

BR, Jarkko

* Re: [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes
  2022-04-05  7:02   ` Jarkko Sakkinen
@ 2022-04-05  7:03     ` Jarkko Sakkinen
  2022-04-05 17:28     ` Reinette Chatre
  1 sibling, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:03 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Tue, Apr 05, 2022 at 10:02:38AM +0300, Jarkko Sakkinen wrote:
> Lacking from the recipients (from MAINTAINERS):
> 
> KERNEL SELFTEST FRAMEWORK
> M:	Shuah Khan <shuah@kernel.org>
> M:	Shuah Khan <skhan@linuxfoundation.org>
> L:	linux-kselftest@vger.kernel.org
> S:	Maintained
> Q:	https://patchwork.kernel.org/project/linux-kselftest/list/
> T:	git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
> F:	Documentation/dev-tools/kselftest*
> F:	tools/testing/selftests/
> 
> BR, Jarkko

Anyway, you can add this to all of the selftests patches:

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

* Re: [PATCH V3 12/30] x86/sgx: Export sgx_encl_{grow,shrink}()
  2022-04-04 16:49 ` [PATCH V3 12/30] x86/sgx: Export sgx_encl_{grow,shrink}() Reinette Chatre
@ 2022-04-05  7:04   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:04 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:20AM -0700, Reinette Chatre wrote:
> In order to use sgx_encl_{grow,shrink}() in the page augmentation code
> located in encl.c, export these functions.
> 
> Suggested-by: Jarkko Sakkinen <jarkko@kernel.org>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - New patch.
> 
>  arch/x86/kernel/cpu/sgx/encl.h  | 2 ++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 4 ++--
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index 07abfc70c8e3..9d673d9531f0 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -120,5 +120,7 @@ bool sgx_va_page_full(struct sgx_va_page *va_page);
>  void sgx_encl_free_epc_page(struct sgx_epc_page *page);
>  struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
>  					 unsigned long addr);
> +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl);
> +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
>  
>  #endif /* _X86_ENCL_H */
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 0c211af8e948..746acddbb774 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -17,7 +17,7 @@
>  #include "encl.h"
>  #include "encls.h"
>  
> -static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
> +struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
>  {
>  	struct sgx_va_page *va_page = NULL;
>  	void *err;
> @@ -43,7 +43,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
>  	return va_page;
>  }
>  
> -static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
> +void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
>  {
>  	encl->page_cnt--;
>  
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

* Re: [PATCH V3 16/30] x86/sgx: Tighten accessible memory range after enclave initialization
  2022-04-04 16:49 ` [PATCH V3 16/30] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
@ 2022-04-05  7:05   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:05 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:24AM -0700, Reinette Chatre wrote:
> Before an enclave is initialized the enclave's memory range is unknown.
> The enclave's memory range is learned at the time it is created via the
> SGX_IOC_ENCLAVE_CREATE ioctl() where the provided memory range is
> obtained from an earlier mmap() of /dev/sgx_enclave. After an enclave
> is initialized its memory can be mapped into user space (mmap()) from
> where it can be entered at its defined entry points.
> 
> With the enclave's memory range known after it is initialized there is
> no reason why it should be possible to map memory outside this range.
> 
> Lock down access to the initialized enclave's memory range by denying
> any attempt to map memory outside its memory range.
> 
> Locking down the memory range also makes adding pages to an initialized
> enclave more efficient. Pages are added to an initialized enclave by
> accessing memory that belongs to the enclave's memory range but not yet
> backed by an enclave page. If it is possible for user space to map
> memory that does not form part of the enclave then an access to this
> memory would eventually fail. Failures range from a prompt general
> protection fault if the access was an ENCLU[EACCEPT] from within the
> enclave, or a page fault via the vDSO if it was another access from
> within the enclave, or a SIGBUS (also resulting from a page fault) if
> the access was from outside the enclave.
> 
> Disallowing invalid memory to be mapped in the first place avoids
> preventable failures.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since V1:
> - Add comment (Jarkko).
> 
>  arch/x86/kernel/cpu/sgx/encl.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index fa4f947f8496..7909570736a0 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -409,6 +409,11 @@ int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
>  
>  	XA_STATE(xas, &encl->page_array, PFN_DOWN(start));
>  
> +	/* Disallow mapping outside enclave's address range. */
> +	if (test_bit(SGX_ENCL_INITIALIZED, &encl->flags) &&
> +	    (start < encl->base || end > encl->base + encl->size))
> +		return -EACCES;
> +
>  	/*
>  	 * Disallow READ_IMPLIES_EXEC tasks as their VMA permissions might
>  	 * conflict with the enclave page permissions.
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
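
For readers following along, the added check can be modelled in user space
roughly as below. This is only a sketch: struct encl_range and
range_may_map() are hypothetical stand-ins for the relevant struct sgx_encl
fields and for the new test in sgx_encl_may_map(), not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct sgx_encl fields the check reads. */
struct encl_range {
	uint64_t base;		/* enclave base address */
	uint64_t size;		/* enclave size in bytes */
	bool initialized;	/* models the SGX_ENCL_INITIALIZED flag */
};

/*
 * Return true when [start, end) may be mapped. Before initialization the
 * range is not enforced; afterwards the mapping must lie entirely within
 * [base, base + size).
 */
static bool range_may_map(const struct encl_range *encl,
			  uint64_t start, uint64_t end)
{
	if (encl->initialized &&
	    (start < encl->base || end > encl->base + encl->size))
		return false;

	return true;
}
```

A mapping that starts below the base or ends past base + size is rejected
only once the enclave is initialized, which matches the -EACCES path in the
patch.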

BR, Jarkko

* Re: [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-04 16:49 ` [PATCH V3 17/30] x86/sgx: Support modifying SGX page type Reinette Chatre
@ 2022-04-05  7:06   ` Jarkko Sakkinen
  2022-04-05 15:34     ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:06 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:25AM -0700, Reinette Chatre wrote:
> Every enclave contains one or more Thread Control Structures (TCS). The
> TCS contains meta-data used by the hardware to save and restore thread
> specific information when entering/exiting the enclave. With SGX1 an
> enclave needs to be created with enough TCSs to support the largest
> number of threads expecting to use the enclave and enough enclave pages
> to meet all its anticipated memory demands. In SGX1 all pages remain in
> the enclave until the enclave is unloaded.
> 
> SGX2 introduces a new function, ENCLS[EMODT], that is used to change
> the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave
> page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a
> regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS)
> page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later
> removal).
> 
> With the existing support of dynamically adding regular enclave pages
> to an initialized enclave and changing the page type to TCS it is
> possible to dynamically increase the number of threads supported by an
> enclave.
> 
> Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step
> of dynamically removing pages from an initialized enclave. The complete
> page removal flow is:
> 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
>    using the SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl() introduced here.
> 2) Approve the page removal by running ENCLU[EACCEPT] from within
>    the enclave.
> 3) Initiate actual page removal using the ioctl() introduced in the
>    following patch.
> 
> Add ioctl() SGX_IOC_ENCLAVE_MODIFY_TYPE to support changing SGX
> enclave page types within an initialized enclave. With
> SGX_IOC_ENCLAVE_MODIFY_TYPE the user specifies a page range and the
> enclave page type to be applied to all pages in the provided range.
> The ioctl() itself can return an error code based on failures
> encountered by the kernel. It is also possible for SGX specific
> failures to be encountered.  Add a result output parameter to
> communicate the SGX return code. It is possible for the enclave page
> type change request to fail on any page within the provided range.
> Support partial success by returning the number of pages that were
> successfully changed.
> 
> After the page type is changed the page continues to be accessible
> from the kernel perspective with page table entries and internal
> state. The page may be moved to swap. Any access until ENCLU[EACCEPT]
> will encounter a page fault with SGX flag set in error code.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Adjust ioctl number after removal of SGX_IOC_ENCLAVE_RELAX_PERMISSIONS.
> - Remove attempt at runtime tracking of EPCM permissions
>   (sgx_encl_page->vm_run_prot_bits). (Jarkko)
> - Change names to follow guidance of using detailed names (Jarkko):
>   struct sgx_enclave_modt -> struct sgx_enclave_modify_type
>   sgx_enclave_modt() -> sgx_enclave_modify_type()
>   sgx_ioc_enclave_modt() -> sgx_ioc_enclave_modify_type()
> 
> Changes since V1:
> - Remove the "Earlier changes ..." paragraph (Jarkko).
> - Change "new ioctl" text to "Add SGX_IOC_ENCLAVE_MOD_TYPE" (Jarkko).
> - Discussion about EPCM interaction and the EPCM MODIFIED bit is moved
>   to new patch that introduces the ENCLS[EMODT] wrapper while keeping
>   the higher level discussion on page accessibility in
>   this commit log (Jarkko).
> - Rename SGX_IOC_PAGE_MODT ioctl() to SGX_IOC_ENCLAVE_MODIFY_TYPE
>   (Jarkko).
> - Rename struct sgx_page_modt to struct sgx_enclave_modt in support
>   of ioctl() rename.
> - Rename sgx_page_modt() to sgx_enclave_modt() and sgx_ioc_page_modt()
>   to sgx_ioc_enclave_modt() in support of ioctl() rename.
> - Provide secinfo as parameter to ioctl() instead of just
>   page type (Jarkko).
> - Update comments to refer to new ioctl() names.
> - Use new SGX2 checking helper().
> - Use ETRACK flow utility.
> - Move kernel-doc to function that provides documentation for
>   Documentation/x86/sgx.rst.
> - Remove redundant comment.
> - Use offset/length validation utility.
> - Make explicit which members of struct sgx_enclave_modt are for
>   output (Dave).
> 
>  arch/x86/include/uapi/asm/sgx.h |  20 +++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 209 ++++++++++++++++++++++++++++++++
>  2 files changed, 229 insertions(+)
> 
> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> index a0a24e94fb27..529f4ab28410 100644
> --- a/arch/x86/include/uapi/asm/sgx.h
> +++ b/arch/x86/include/uapi/asm/sgx.h
> @@ -31,6 +31,8 @@ enum sgx_page_flags {
>  	_IO(SGX_MAGIC, 0x04)
>  #define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
>  	_IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> +#define SGX_IOC_ENCLAVE_MODIFY_TYPE \
> +	_IOWR(SGX_MAGIC, 0x06, struct sgx_enclave_modify_type)
>  
>  /**
>   * struct sgx_enclave_create - parameter structure for the
> @@ -97,6 +99,24 @@ struct sgx_enclave_restrict_permissions {
>  	__u64 count;
>  };
>  
> +/**
> + * struct sgx_enclave_modify_type - parameters for %SGX_IOC_ENCLAVE_MODIFY_TYPE
> + * @offset:	starting page offset (page aligned relative to enclave base
> + *		address defined in SECS)
> + * @length:	length of memory (multiple of the page size)
> + * @secinfo:	address for the SECINFO data containing the new type
> + *		for pages in range described by @offset and @length
> + * @result:	(output) SGX result code of ENCLS[EMODT] function
> + * @count:	(output) bytes successfully changed (multiple of page size)
> + */
> +struct sgx_enclave_modify_type {
> +	__u64 offset;
> +	__u64 length;
> +	__u64 secinfo;
> +	__u64 result;
> +	__u64 count;
> +};
> +
>  struct sgx_enclave_run;
>  
>  /**
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 4d88bfd163e7..6f769e67ec2d 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -898,6 +898,212 @@ static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
>  	return ret;
>  }
>  
> +/**
> + * sgx_enclave_modify_type() - Modify type of SGX enclave pages
> + * @encl:	Enclave to which the pages belong.
> + * @modt:	Checked parameters from user about which pages need modifying.
> + * @page_type:	New page type.
> + *
> + * Return:
> + * - 0:		Success
> + * - -errno:	Otherwise
> + */
> +static long sgx_enclave_modify_type(struct sgx_encl *encl,
> +				    struct sgx_enclave_modify_type *modt,
> +				    enum sgx_page_type page_type)
> +{
> +	unsigned long max_prot_restore;
> +	struct sgx_encl_page *entry;
> +	struct sgx_secinfo secinfo;
> +	unsigned long prot;
> +	unsigned long addr;
> +	unsigned long c;
> +	void *epc_virt;
> +	int ret;
> +
> +	/*
> +	 * The only new page types allowed by hardware are PT_TCS and PT_TRIM.
> +	 */
> +	if (page_type != SGX_PAGE_TYPE_TCS && page_type != SGX_PAGE_TYPE_TRIM)
> +		return -EINVAL;
> +
> +	memset(&secinfo, 0, sizeof(secinfo));
> +
> +	secinfo.flags = page_type << 8;
> +
> +	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
> +		addr = encl->base + modt->offset + c;
> +
> +		mutex_lock(&encl->lock);
> +
> +		entry = sgx_encl_load_page(encl, addr);
> +		if (IS_ERR(entry)) {
> +			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> +			goto out_unlock;
> +		}
> +
> +		/*
> +		 * Borrow the logic from the Intel SDM. Regular pages
> +		 * (SGX_PAGE_TYPE_REG) can change type to SGX_PAGE_TYPE_TCS
> +		 * or SGX_PAGE_TYPE_TRIM but TCS pages can only be trimmed.
> +		 * CET pages not supported yet.
> +		 */
> +		if (!(entry->type == SGX_PAGE_TYPE_REG ||
> +		      (entry->type == SGX_PAGE_TYPE_TCS &&
> +		       page_type == SGX_PAGE_TYPE_TRIM))) {
> +			ret = -EINVAL;
> +			goto out_unlock;
> +		}
> +
> +		max_prot_restore = entry->vm_max_prot_bits;
> +
> +		/*
> +		 * Once a regular page becomes a TCS page it cannot be
> +		 * changed back. So the maximum allowed protection reflects
> +		 * the TCS page that is always RW from kernel perspective but
> +		 * will be inaccessible from within enclave. Before doing
> +		 * so, do make sure that the new page type continues to
> +		 * respect the originally vetted page permissions.
> +		 */
> +		if (entry->type == SGX_PAGE_TYPE_REG &&
> +		    page_type == SGX_PAGE_TYPE_TCS) {
> +			if (~entry->vm_max_prot_bits & (VM_READ | VM_WRITE)) {
> +				ret = -EPERM;
> +				goto out_unlock;
> +			}
> +			prot = PROT_READ | PROT_WRITE;
> +			entry->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
> +
> +			/*
> +			 * Prevent page from being reclaimed while mutex
> +			 * is released.
> +			 */
> +			if (sgx_unmark_page_reclaimable(entry->epc_page)) {
> +				ret = -EAGAIN;
> +				goto out_entry_changed;
> +			}
> +
> +			/*
> +			 * Do not keep encl->lock because of dependency on
> +			 * mmap_lock acquired in sgx_zap_enclave_ptes().
> +			 */
> +			mutex_unlock(&encl->lock);
> +
> +			sgx_zap_enclave_ptes(encl, addr);
> +
> +			mutex_lock(&encl->lock);
> +
> +			sgx_mark_page_reclaimable(entry->epc_page);
> +		}
> +
> +		/* Change EPC type */
> +		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> +		ret = __emodt(&secinfo, epc_virt);
> +		if (encls_faulted(ret)) {
> +			/*
> +			 * All possible faults should be avoidable:
> +			 * parameters have been checked, will only change
> +			 * valid page types, and no concurrent
> +			 * SGX1/SGX2 ENCLS instructions since these are
> +			 * protected with mutex.
> +			 */
> +			pr_err_once("EMODT encountered exception %d\n",
> +				    ENCLS_TRAPNR(ret));
> +			ret = -EFAULT;
> +			goto out_entry_changed;
> +		}
> +		if (encls_failed(ret)) {
> +			modt->result = ret;
> +			ret = -EFAULT;
> +			goto out_entry_changed;
> +		}
> +
> +		ret = sgx_enclave_etrack(encl);
> +		if (ret) {
> +			ret = -EFAULT;
> +			goto out_unlock;
> +		}
> +
> +		entry->type = page_type;
> +
> +		mutex_unlock(&encl->lock);
> +	}
> +
> +	ret = 0;
> +	goto out;
> +
> +out_entry_changed:
> +	entry->vm_max_prot_bits = max_prot_restore;
> +out_unlock:
> +	mutex_unlock(&encl->lock);
> +out:
> +	modt->count = c;
> +
> +	return ret;
> +}
> +
> +/**
> + * sgx_ioc_enclave_modify_type() - handler for %SGX_IOC_ENCLAVE_MODIFY_TYPE
> + * @encl:	an enclave pointer
> + * @arg:	userspace pointer to a &struct sgx_enclave_modify_type instance
> + *
> + * Ability to change the enclave page type supports the following use cases:
> + *
> + * * It is possible to add TCS pages to an enclave by changing the type of
> + *   regular pages (%SGX_PAGE_TYPE_REG) to TCS (%SGX_PAGE_TYPE_TCS) pages.
> + *   With this support the number of threads supported by an initialized
> + *   enclave can be increased dynamically.
> + *
> + * * Regular or TCS pages can dynamically be removed from an initialized
> + *   enclave by changing the page type to %SGX_PAGE_TYPE_TRIM. Changing the
> + *   page type to %SGX_PAGE_TYPE_TRIM marks the page for removal with actual
> + *   removal done by handler of %SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl() called
> + *   after ENCLU[EACCEPT] is run on %SGX_PAGE_TYPE_TRIM page from within the
> + *   enclave.
> + *
> + * Return:
> + * - 0:		Success
> + * - -errno:	Otherwise
> + */
> +static long sgx_ioc_enclave_modify_type(struct sgx_encl *encl, void __user *arg)
> +{
> +	struct sgx_enclave_modify_type params;
> +	enum sgx_page_type page_type;
> +	struct sgx_secinfo secinfo;
> +	long ret;
> +
> +	ret = sgx_ioc_sgx2_ready(encl);
> +	if (ret)
> +		return ret;
> +
> +	if (copy_from_user(&params, arg, sizeof(params)))
> +		return -EFAULT;
> +
> +	if (sgx_validate_offset_length(encl, params.offset, params.length))
> +		return -EINVAL;
> +
> +	if (copy_from_user(&secinfo, (void __user *)params.secinfo,
> +			   sizeof(secinfo)))
> +		return -EFAULT;
> +
> +	if (secinfo.flags & ~SGX_SECINFO_PAGE_TYPE_MASK)
> +		return -EINVAL;
> +
> +	if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> +		return -EINVAL;
> +
> +	if (params.result || params.count)
> +		return -EINVAL;
> +
> +	page_type = (secinfo.flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;
> +	ret = sgx_enclave_modify_type(encl, &params, page_type);
> +
> +	if (copy_to_user(arg, &params, sizeof(params)))
> +		return -EFAULT;
> +
> +	return ret;
> +}
> +
>  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>  {
>  	struct sgx_encl *encl = filep->private_data;
> @@ -923,6 +1129,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>  		ret = sgx_ioc_enclave_restrict_permissions(encl,
>  							   (void __user *)arg);
>  		break;
> +	case SGX_IOC_ENCLAVE_MODIFY_TYPE:
> +		ret = sgx_ioc_enclave_modify_type(encl, (void __user *)arg);
> +		break;
>  	default:
>  		ret = -ENOIOCTLCMD;
>  		break;
> -- 
> 2.25.1
> 

To be consistent with the other ioctl names (e.g. SGX_IOC_ENCLAVE_ADD_PAGES,
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS), this should be
SGX_IOC_ENCLAVE_MODIFY_TYPES.
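
As a side note for user space authors, the SECINFO encoding and the checks
the handler applies can be sketched as below. This is a user space model
under assumptions: the enum values are assumed to match enum sgx_page_type,
the mask is assumed to cover bits 15:8 of SECINFO.FLAGS, and all names here
are hypothetical rather than taken from the uapi header.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Values assumed to match enum sgx_page_type in the kernel sources. */
enum page_type { PT_SECS = 0, PT_TCS = 1, PT_REG = 2, PT_VA = 3, PT_TRIM = 4 };

#define SECINFO_PAGE_TYPE_MASK 0xff00ULL /* assumed: bits 15:8 of FLAGS */

/* Local model of the 64-byte SECINFO structure. */
struct secinfo {
	uint64_t flags;
	uint8_t reserved[56];
};

/* Build a SECINFO that requests only a page type, as the ioctl() expects. */
static void secinfo_for_type(struct secinfo *si, enum page_type type)
{
	memset(si, 0, sizeof(*si));
	si->flags = (uint64_t)type << 8;
}

/*
 * Model of the handler's checks: only page type bits may be set, reserved
 * bytes must be zero, and the new type must be TCS or TRIM. Stores the
 * decoded type in *type once the flag and reserved checks have passed.
 */
static int secinfo_check(const struct secinfo *si, enum page_type *type)
{
	size_t i;

	if (si->flags & ~SECINFO_PAGE_TYPE_MASK)
		return -EINVAL;

	for (i = 0; i < sizeof(si->reserved); i++)
		if (si->reserved[i])
			return -EINVAL;

	*type = (enum page_type)((si->flags & SECINFO_PAGE_TYPE_MASK) >> 8);
	if (*type != PT_TCS && *type != PT_TRIM)
		return -EINVAL;

	return 0;
}

/* REG may become TCS or TRIM; TCS may only be trimmed. */
static int type_change_allowed(enum page_type cur, enum page_type next)
{
	return cur == PT_REG || (cur == PT_TCS && next == PT_TRIM);
}
```

A caller would fill a SECINFO with secinfo_for_type(), point the secinfo
member of the ioctl() parameter struct at it, and leave result and count
zeroed, since the handler rejects non-zero output fields.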

BR, Jarkko

* Re: [PATCH V3 18/30] x86/sgx: Support complete page removal
  2022-04-04 16:49 ` [PATCH V3 18/30] x86/sgx: Support complete page removal Reinette Chatre
@ 2022-04-05  7:08   ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:08 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:26AM -0700, Reinette Chatre wrote:
> The SGX2 page removal flow was introduced in previous patch and is
> as follows:
> 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
>    using the ioctl() SGX_IOC_ENCLAVE_MODIFY_TYPE introduced in
>    previous patch.
> 2) Approve the page removal by running ENCLU[EACCEPT] from within
>    the enclave.
> 3) Initiate actual page removal using the ioctl()
>    SGX_IOC_ENCLAVE_REMOVE_PAGES introduced here.
> 
> Support the final step of the SGX2 page removal flow with ioctl()
> SGX_IOC_ENCLAVE_REMOVE_PAGES. With this ioctl() the user specifies
> a page range that should be removed. All pages in the provided
> range should have the SGX_PAGE_TYPE_TRIM page type and the request
> will fail with EPERM (Operation not permitted) if a page that does
> not have the correct type is encountered. Page removal can fail
> on any page within the provided range. Support partial success by
> returning the number of pages that were successfully removed.
> 
> Since actual page removal will succeed even if ENCLU[EACCEPT] was not
> run from within the enclave the ENCLS[EMODPR] instruction with RWX
> permissions is used as a no-op mechanism to ensure ENCLU[EACCEPT] was
> successfully run from within the enclave before the enclave page is
> removed.
> 
> If the user omits running SGX_IOC_ENCLAVE_REMOVE_PAGES the pages will
> still be removed when the enclave is unloaded.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - Adjust ioctl number since removal of
>   SGX_IOC_ENCLAVE_RELAX_PERMISSIONS.
> 
> Changes since V1:
> - Update comments to refer to new ioctl() names SGX_IOC_PAGE_MODT ->
>   SGX_IOC_ENCLAVE_MODIFY_TYPE.
> - Fix kernel-doc to have () as part of function name.
> - Change name of ioctl():
>   SGX_IOC_PAGE_REMOVE -> SGX_IOC_ENCLAVE_REMOVE_PAGES (Jarkko).
> - With the above name change the page removal ioctl() has its name
>   aligned with existing SGX_IOC_ENCLAVE_ADD_PAGES ioctl(). Also align
>   naming of struct and functions:
>   struct sgx_page_remove -> struct sgx_enclave_remove_pages
>   sgx_page_remove() -> sgx_encl_remove_pages()
>   sgx_ioc_page_remove() -> sgx_ioc_enclave_remove_pages()
> - Use new SGX2 checking helper.
> - When loading enclave page, make error code consistent with other
>   instances to help user distinguish between permanent and temporary
>   failures.
> - Move kernel-doc to function that provides documentation for
>   Documentation/x86/sgx.rst.
> - Remove redundant comment.
> - Use offset/length validation utility.
> - Make explicit which member of struct sgx_enclave_remove_pages is for
>   output (Dave).
> 
>  arch/x86/include/uapi/asm/sgx.h |  21 +++++
>  arch/x86/kernel/cpu/sgx/ioctl.c | 145 ++++++++++++++++++++++++++++++++
>  2 files changed, 166 insertions(+)
> 
> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> index 529f4ab28410..feda7f85b2ce 100644
> --- a/arch/x86/include/uapi/asm/sgx.h
> +++ b/arch/x86/include/uapi/asm/sgx.h
> @@ -33,6 +33,8 @@ enum sgx_page_flags {
>  	_IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
>  #define SGX_IOC_ENCLAVE_MODIFY_TYPE \
>  	_IOWR(SGX_MAGIC, 0x06, struct sgx_enclave_modify_type)
> +#define SGX_IOC_ENCLAVE_REMOVE_PAGES \
> +	_IOWR(SGX_MAGIC, 0x07, struct sgx_enclave_remove_pages)
>  
>  /**
>   * struct sgx_enclave_create - parameter structure for the
> @@ -117,6 +119,25 @@ struct sgx_enclave_modify_type {
>  	__u64 count;
>  };
>  
> +/**
> + * struct sgx_enclave_remove_pages - %SGX_IOC_ENCLAVE_REMOVE_PAGES parameters
> + * @offset:	starting page offset (page aligned relative to enclave base
> + *		address defined in SECS)
> + * @length:	length of memory (multiple of the page size)
> + * @count:	(output) bytes successfully changed (multiple of page size)
> + *
> + * Regular (PT_REG) or TCS (PT_TCS) pages can be removed from an initialized
> + * enclave if the system supports SGX2. First, the %SGX_IOC_ENCLAVE_MODIFY_TYPE
> + * ioctl() should be used to change the page type to PT_TRIM. After that
> + * succeeds ENCLU[EACCEPT] should be run from within the enclave and then
> + * %SGX_IOC_ENCLAVE_REMOVE_PAGES can be used to complete the page removal.
> + */
> +struct sgx_enclave_remove_pages {
> +	__u64 offset;
> +	__u64 length;
> +	__u64 count;
> +};
> +
>  struct sgx_enclave_run;
>  
>  /**
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 6f769e67ec2d..515e1961cc02 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -1104,6 +1104,148 @@ static long sgx_ioc_enclave_modify_type(struct sgx_encl *encl, void __user *arg)
>  	return ret;
>  }
>  
> +/**
> + * sgx_encl_remove_pages() - Remove trimmed pages from SGX enclave
> + * @encl:	Enclave to which the pages belong
> + * @params:	Checked parameters from user on which pages need to be removed
> + *
> + * Return:
> + * - 0:		Success.
> + * - -errno:	Otherwise.
> + */
> +static long sgx_encl_remove_pages(struct sgx_encl *encl,
> +				  struct sgx_enclave_remove_pages *params)
> +{
> +	struct sgx_encl_page *entry;
> +	struct sgx_secinfo secinfo;
> +	unsigned long addr;
> +	unsigned long c;
> +	void *epc_virt;
> +	int ret;
> +
> +	memset(&secinfo, 0, sizeof(secinfo));
> +	secinfo.flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
> +
> +	for (c = 0 ; c < params->length; c += PAGE_SIZE) {
> +		addr = encl->base + params->offset + c;
> +
> +		mutex_lock(&encl->lock);
> +
> +		entry = sgx_encl_load_page(encl, addr);
> +		if (IS_ERR(entry)) {
> +			ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> +			goto out_unlock;
> +		}
> +
> +		if (entry->type != SGX_PAGE_TYPE_TRIM) {
> +			ret = -EPERM;
> +			goto out_unlock;
> +		}
> +
> +		/*
> +		 * ENCLS[EMODPR] is a no-op instruction used to inform if
> +		 * ENCLU[EACCEPT] was run from within the enclave. If
> +		 * ENCLS[EMODPR] is run with RWX on a trimmed page that is
> +		 * not yet accepted then it will return
> +		 * %SGX_PAGE_NOT_MODIFIABLE, after the trimmed page is
> +		 * accepted the instruction will encounter a page fault.
> +		 */
> +		epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> +		ret = __emodpr(&secinfo, epc_virt);
> +		if (!encls_faulted(ret) || ENCLS_TRAPNR(ret) != X86_TRAP_PF) {
> +			ret = -EPERM;
> +			goto out_unlock;
> +		}
> +
> +		if (sgx_unmark_page_reclaimable(entry->epc_page)) {
> +			ret = -EBUSY;
> +			goto out_unlock;
> +		}
> +
> +		/*
> +		 * Do not keep encl->lock because of dependency on
> +		 * mmap_lock acquired in sgx_zap_enclave_ptes().
> +		 */
> +		mutex_unlock(&encl->lock);
> +
> +		sgx_zap_enclave_ptes(encl, addr);
> +
> +		mutex_lock(&encl->lock);
> +
> +		sgx_encl_free_epc_page(entry->epc_page);
> +		encl->secs_child_cnt--;
> +		entry->epc_page = NULL;
> +		xa_erase(&encl->page_array, PFN_DOWN(entry->desc));
> +		sgx_encl_shrink(encl, NULL);
> +		kfree(entry);
> +
> +		mutex_unlock(&encl->lock);
> +	}
> +
> +	ret = 0;
> +	goto out;
> +
> +out_unlock:
> +	mutex_unlock(&encl->lock);
> +out:
> +	params->count = c;
> +
> +	return ret;
> +}
> +
> +/**
> + * sgx_ioc_enclave_remove_pages() - handler for %SGX_IOC_ENCLAVE_REMOVE_PAGES
> + * @encl:	an enclave pointer
> + * @arg:	userspace pointer to &struct sgx_enclave_remove_pages instance
> + *
> + * Final step of the flow removing pages from an initialized enclave. The
> + * complete flow is:
> + *
> + * 1) User changes the type of the pages to be removed to %SGX_PAGE_TYPE_TRIM
> + *    using the %SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl().
> + * 2) User approves the page removal by running ENCLU[EACCEPT] from within
> + *    the enclave.
> + * 3) User initiates actual page removal using the
> + *    %SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl() that is handled here.
> + *
> + * First remove any page table entries pointing to the page and then proceed
> + * with the actual removal of the enclave page and data in support of it.
> + *
> + * VA pages are not affected by this removal. It is thus possible that the
> + * enclave may end up with more VA pages than needed to support all its
> + * pages.
> + *
> + * Return:
> + * - 0:		Success
> + * - -errno:	Otherwise
> + */
> +static long sgx_ioc_enclave_remove_pages(struct sgx_encl *encl,
> +					 void __user *arg)
> +{
> +	struct sgx_enclave_remove_pages params;
> +	long ret;
> +
> +	ret = sgx_ioc_sgx2_ready(encl);
> +	if (ret)
> +		return ret;
> +
> +	if (copy_from_user(&params, arg, sizeof(params)))
> +		return -EFAULT;
> +
> +	if (sgx_validate_offset_length(encl, params.offset, params.length))
> +		return -EINVAL;
> +
> +	if (params.count)
> +		return -EINVAL;
> +
> +	ret = sgx_encl_remove_pages(encl, &params);
> +
> +	if (copy_to_user(arg, &params, sizeof(params)))
> +		return -EFAULT;
> +
> +	return ret;
> +}
> +
>  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>  {
>  	struct sgx_encl *encl = filep->private_data;
> @@ -1132,6 +1274,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>  	case SGX_IOC_ENCLAVE_MODIFY_TYPE:
>  		ret = sgx_ioc_enclave_modify_type(encl, (void __user *)arg);
>  		break;
> +	case SGX_IOC_ENCLAVE_REMOVE_PAGES:
> +		ret = sgx_ioc_enclave_remove_pages(encl, (void __user *)arg);
> +		break;
>  	default:
>  		ret = -ENOIOCTLCMD;
>  		break;
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

These are easy to give quickly, as I've spent at least a month looking at
this code while working on support for the Enarx runtime.
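
For anyone consuming the partial-success semantics from user space, the
per-page loop can be modelled roughly as below. This is a sketch only: the
model_* names and the accepted flag are hypothetical, standing in for the
page type check and the EMODPR-based EACCEPT probe in the patch.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL

enum model_type { MODEL_REG, MODEL_TCS, MODEL_TRIM };

struct model_page {
	enum model_type type;
	int accepted;	/* nonzero once the enclave ran EACCEPT on the page */
	int present;	/* nonzero while the page belongs to the enclave */
};

/*
 * Remove npages pages starting at pages[0]. Stops at the first page that is
 * not trimmed or not yet accepted. *count receives the number of bytes
 * removed before the stop, so a caller can resume at offset + *count.
 */
static int model_remove_pages(struct model_page *pages, size_t npages,
			      unsigned long *count)
{
	size_t i;
	int ret = 0;

	for (i = 0; i < npages; i++) {
		if (pages[i].type != MODEL_TRIM || !pages[i].accepted) {
			ret = -EPERM;
			break;
		}
		pages[i].present = 0;
	}

	*count = i * MODEL_PAGE_SIZE;
	return ret;
}
```

On -EPERM the caller can run ENCLU[EACCEPT] on the offending page and retry
from the reported byte count rather than from the start of the range.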

BR, Jarkko

* Re: [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-04 16:49 ` [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
@ 2022-04-05  7:11   ` Jarkko Sakkinen
  2022-04-05 17:13     ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05  7:11 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Mon, Apr 04, 2022 at 09:49:27AM -0700, Reinette Chatre wrote:
> The page reclaimer ensures availability of EPC pages across all
> enclaves. In support of this it runs independently from the
> individual enclaves in order to take locks from the different
> enclaves as it writes pages to swap.
> 
> When needing to load a page from swap an EPC page needs to be
> available for its contents to be loaded into. Loading an existing
> enclave page from swap does not reclaim EPC pages directly if
> none are available, instead the reclaimer is woken when the
> available EPC pages are found to be below a watermark.
> 
> When iterating over a large number of pages in an oversubscribed
> environment there is a race between the reclaimer woken up and
> EPC pages reclaimed fast enough for the page operations to proceed.
> 
> Ensure there are EPC pages available before attempting to load
> a page that may potentially be pulled from swap into an available
> EPC page.
> 
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> No changes since V2
> 
> Changes since v1:
> - Reword commit message.
> 
>  arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
>  arch/x86/kernel/cpu/sgx/main.c  | 6 ++++++
>  arch/x86/kernel/cpu/sgx/sgx.h   | 1 +
>  3 files changed, 13 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 515e1961cc02..f88bc1236276 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -777,6 +777,8 @@ sgx_enclave_restrict_permissions(struct sgx_encl *encl,
>  	for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
>  		addr = encl->base + modp->offset + c;
>  
> +		sgx_direct_reclaim();
> +
>  		mutex_lock(&encl->lock);
>  
>  		entry = sgx_encl_load_page(encl, addr);
> @@ -934,6 +936,8 @@ static long sgx_enclave_modify_type(struct sgx_encl *encl,
>  	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
>  		addr = encl->base + modt->offset + c;
>  
> +		sgx_direct_reclaim();
> +
>  		mutex_lock(&encl->lock);
>  
>  		entry = sgx_encl_load_page(encl, addr);
> @@ -1129,6 +1133,8 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
>  	for (c = 0 ; c < params->length; c += PAGE_SIZE) {
>  		addr = encl->base + params->offset + c;
>  
> +		sgx_direct_reclaim();
> +
>  		mutex_lock(&encl->lock);
>  
>  		entry = sgx_encl_load_page(encl, addr);
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 6e2cb7564080..545da16bb3ea 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -370,6 +370,12 @@ static bool sgx_should_reclaim(unsigned long watermark)
>  	       !list_empty(&sgx_active_page_list);
>  }
>  
> +void sgx_direct_reclaim(void)
> +{
> +	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> +		sgx_reclaim_pages();
> +}

Please open code this at each call site instead - there is not enough
redundancy to be worth a new function, and it only causes unnecessary
cross-referencing when maintaining the code. Otherwise, I agree with the idea.
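
To illustrate what open coding looks like inside a per-page loop, here is a
small user space model. Everything in it is a stand-in: the counters, the
watermark of 32 and the batch of 16 pages per reclaim pass are illustrative
assumptions, and should_reclaim()/reclaim_pages() only imitate
sgx_should_reclaim()/sgx_reclaim_pages().

```c
#include <stdbool.h>

#define NR_LOW_PAGES 32UL	/* illustrative watermark */

static unsigned long free_pages;	/* models the free EPC page counter */
static unsigned long active_pages;	/* models the reclaimable page list */
static unsigned long reclaim_calls;	/* counts reclaim passes */

static bool should_reclaim(unsigned long watermark)
{
	return free_pages < watermark && active_pages > 0;
}

/* Models one reclaim pass: move up to 16 pages from active to free. */
static void reclaim_pages(void)
{
	unsigned long n = active_pages < 16 ? active_pages : 16;

	active_pages -= n;
	free_pages += n;
	reclaim_calls++;
}

/* Per-page loop with the check open coded instead of wrapped in a helper. */
static void process_pages(unsigned long npages)
{
	unsigned long i;

	for (i = 0; i < npages; i++) {
		if (should_reclaim(NR_LOW_PAGES))
			reclaim_pages();

		/* Loading the page may consume one free EPC page. */
		if (free_pages)
			free_pages--;
	}
}
```

The two-line check sits directly in the loop body, so a reader sees the
watermark and the reclaim call without jumping to a separate helper.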

BR, Jarkko


* Re: [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave
  2022-04-05  5:05   ` Jarkko Sakkinen
@ 2022-04-05 10:03     ` Jarkko Sakkinen
  2022-04-06  7:37       ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 10:03 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel

On Tue, 2022-04-05 at 08:05 +0300, Jarkko Sakkinen wrote:
> On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > With SGX1 an enclave needs to be created with its maximum memory demands
> > allocated. Pages cannot be added to an enclave after it is initialized.
> > SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
> > pages to an initialized enclave. With SGX2 the enclave still needs to
> > set aside address space for its maximum memory demands during enclave
> > creation, but all pages need not be added before enclave initialization.
> > Pages can be added during enclave runtime.
> > 
> > Add support for dynamically adding pages to an initialized enclave,
> > architecturally limited to RW permission at creation but allowed to
> > obtain RWX permissions after enclave runs EMODPE. Add pages via the
> > page fault handler at the time an enclave address without a backing
> > enclave page is accessed, potentially directly reclaiming pages if
> > no free pages are available.
> > 
> > The enclave is still required to run ENCLU[EACCEPT] on the page before
> > it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
> > on an uninitialized address. This will trigger the page fault handler
> > that will add the enclave page and return execution to the enclave to
> > repeat the ENCLU[EACCEPT] instruction, this time successful.
> > 
> > If the enclave accesses an uninitialized address in another way, for
> > example by expanding the enclave stack to a page that has not yet been
> > added, then the page fault handler would add the page on the first
> > write but upon returning to the enclave the instruction that triggered
> > the page fault would be repeated and since ENCLU[EACCEPT] was not run
> > yet it would trigger a second page fault, this time with the SGX flag
> > set in the page fault error code. This can only be recovered by entering
> > the enclave again and directly running the ENCLU[EACCEPT] instruction on
> > the now initialized address.
> > 
> > Accessing an uninitialized address from outside the enclave also
> > triggers this flow but the page will remain inaccessible (access will
> > result in #PF) until accepted from within the enclave via
> > ENCLU[EACCEPT].
> > 
> > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > ---
> > Changes since V2:
> > - Remove runtime tracking of EPCM permissions
> >   (sgx_encl_page->vm_run_prot_bits) (Jarkko).
> > - Move export of sgx_encl_{grow,shrink}() to separate patch. (Jarkko)
> > - Use sgx_encl_page_alloc(). (Jarkko)
> > - Set max allowed permissions to be RWX (Jarkko). Update changelog
> >   to indicate the change and use comment in code as
> >   created by Jarkko in:
> > https://lore.kernel.org/linux-sgx/20220306053211.135762-4-jarkko@kernel.org
> > - Do not set protection bits but let it be inherited by VMA (Jarkko)
> > 
> > Changes since V1:
> > - Fix subject line "to initialized" -> "to an initialized" (Jarkko).
> > - Move text about hardware's PENDING state to the patch that introduces
> >   the ENCLS[EAUG] wrapper (Jarkko).
> > - Ensure kernel-doc uses brackets when referring to function.
> > 
> >  arch/x86/kernel/cpu/sgx/encl.c | 124 +++++++++++++++++++++++++++++++++
> >  1 file changed, 124 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > index 546423753e4c..fa4f947f8496 100644
> > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > @@ -194,6 +194,119 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
> >         return __sgx_encl_load_page(encl, entry);
> >  }
> >  
> > +/**
> > + * sgx_encl_eaug_page() - Dynamically add page to initialized enclave
> > + * @vma:       VMA obtained from fault info from where page is accessed
> > + * @encl:      enclave accessing the page
> > + * @addr:      address that triggered the page fault
> > + *
> > + * When an initialized enclave accesses a page with no backing EPC page
> > + * on a SGX2 system then the EPC can be added dynamically via the SGX2
> > + * ENCLS[EAUG] instruction.
> > + *
> > + * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
> > + * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
> > + */
> > +static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> > +                                    struct sgx_encl *encl, unsigned long addr)
> > +{
> > +       struct sgx_pageinfo pginfo = {0};
> > +       struct sgx_encl_page *encl_page;
> > +       struct sgx_epc_page *epc_page;
> > +       struct sgx_va_page *va_page;
> > +       unsigned long phys_addr;
> > +       u64 secinfo_flags;
> > +       vm_fault_t vmret;
> > +       int ret;
> > +
> > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > +               return VM_FAULT_SIGBUS;
> > +
> > +       /*
> > +        * Ignore internal permission checking for dynamically added pages.
> > +        * They matter only for data added during the pre-initialization
> > +        * phase. The enclave decides the permissions by the means of
> > +        * EACCEPT, EACCEPTCOPY and EMODPE.
> > +        */
> > +       secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
> > +       encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
> > +       if (IS_ERR(encl_page))
> > +               return VM_FAULT_OOM;
> > +
> > +       epc_page = sgx_alloc_epc_page(encl_page, true);
> > +       if (IS_ERR(epc_page)) {
> > +               kfree(encl_page);
> > +               return VM_FAULT_SIGBUS;
> > +       }
> > +
> > +       va_page = sgx_encl_grow(encl);
> > +       if (IS_ERR(va_page)) {
> > +               ret = PTR_ERR(va_page);
> > +               goto err_out_free;
> > +       }
> > +
> > +       mutex_lock(&encl->lock);
> > +
> > +       /*
> > +        * Copy comment from sgx_encl_add_page() to maintain guidance in
> > +        * this similar flow:
> > +        * Adding to encl->va_pages must be done under encl->lock.  Ditto for
> > +        * deleting (via sgx_encl_shrink()) in the error path.
> > +        */
> > +       if (va_page)
> > +               list_add(&va_page->list, &encl->va_pages);
> > +
> > +       ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
> > +                       encl_page, GFP_KERNEL);
> > +       /*
> > +        * If ret == -EBUSY then page was created in another flow while
> > +        * running without encl->lock
> > +        */
> > +       if (ret)
> > +               goto err_out_unlock;
> > +
> > +       pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
> > +       pginfo.addr = encl_page->desc & PAGE_MASK;
> > +       pginfo.metadata = 0;
> > +
> > +       ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
> > +       if (ret)
> > +               goto err_out;
> > +
> > +       encl_page->encl = encl;
> > +       encl_page->epc_page = epc_page;
> > +       encl_page->type = SGX_PAGE_TYPE_REG;
> > +       encl->secs_child_cnt++;
> > +
> > +       sgx_mark_page_reclaimable(encl_page->epc_page);
> > +
> > +       phys_addr = sgx_get_epc_phys_addr(epc_page);
> > +       /*
> > +        * Do not undo everything when creating PTE entry fails - next #PF
> > +        * would find page ready for a PTE.
> > +        */
> > +       vmret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
> > +       if (vmret != VM_FAULT_NOPAGE) {
> > +               mutex_unlock(&encl->lock);
> > +               return VM_FAULT_SIGBUS;
> > +       }
> > +       mutex_unlock(&encl->lock);
> > +       return VM_FAULT_NOPAGE;
> > +
> > +err_out:
> > +       xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
> > +
> > +err_out_unlock:
> > +       sgx_encl_shrink(encl, va_page);
> > +       mutex_unlock(&encl->lock);
> > +
> > +err_out_free:
> > +       sgx_encl_free_epc_page(epc_page);
> > +       kfree(encl_page);
> > +
> > +       return VM_FAULT_SIGBUS;
> > +}
> > +
> >  static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
> >  {
> >         unsigned long addr = (unsigned long)vmf->address;
> > @@ -213,6 +326,17 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
> >         if (unlikely(!encl))
> >                 return VM_FAULT_SIGBUS;
> >  
> > +       /*
> > +        * The page_array keeps track of all enclave pages, whether they
> > +        * are swapped out or not. If there is no entry for this page and
> > +        * the system supports SGX2 then it is possible to dynamically add
> > +        * a new enclave page. This is only possible for an initialized
> > +        * enclave that will be checked for right away.
> > +        */
> > +       if (cpu_feature_enabled(X86_FEATURE_SGX2) &&
> > +           (!xa_load(&encl->page_array, PFN_DOWN(addr))))
> > +               return sgx_encl_eaug_page(vma, encl, addr);
> > +
> >         mutex_lock(&encl->lock);
> >  
> >         entry = sgx_encl_load_page_in_vma(encl, addr, vma->vm_flags);
> 
> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
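
The two fault flows described in the quoted changelog can be modeled as a
small state machine. This is only a toy user-space sketch: the enum names,
model_fault_handler() and enclave_access() are hypothetical, no SGX
instruction is executed, and the three page states are a simplification of
what the hardware tracks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model states: no backing page, added but not accepted, usable. */
enum page_state { PAGE_ABSENT, PAGE_PENDING, PAGE_ACCEPTED };
enum access_kind { ACCESS_EACCEPT, ACCESS_REGULAR };
enum outcome { OUTCOME_OK, OUTCOME_SGX_FAULT };

/* Models the #PF handler: add a page (ENCLS[EAUG]) if none backs the address. */
static void model_fault_handler(enum page_state *state)
{
	if (*state == PAGE_ABSENT)
		*state = PAGE_PENDING;
}

/*
 * Models one enclave access to an address. The faulting instruction is
 * retried after the handler has added the page. Only ENCLU[EACCEPT] can
 * complete on a page that was added but not yet accepted; any other access
 * faults a second time, in the model with the SGX flag set.
 */
static enum outcome enclave_access(enum page_state *state,
				   enum access_kind kind, int *faults)
{
	assert(state != NULL && faults != NULL);

	if (*state == PAGE_ABSENT) {
		(*faults)++;			/* first #PF: page gets added */
		model_fault_handler(state);
	}

	if (*state == PAGE_PENDING) {
		if (kind == ACCESS_EACCEPT) {
			*state = PAGE_ACCEPTED;	/* retried EACCEPT succeeds */
			return OUTCOME_OK;
		}
		(*faults)++;			/* second #PF, SGX flag set */
		return OUTCOME_SGX_FAULT;
	}

	return OUTCOME_OK;
}
```

In this model an ENCLU[EACCEPT] on an absent page costs one fault and leaves
the page usable, while a regular access costs two faults and leaves the page
waiting for an explicit ENCLU[EACCEPT] after re-entering the enclave.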

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05  5:07     ` Jarkko Sakkinen
@ 2022-04-05 13:40       ` Jarkko Sakkinen
  2022-04-05 14:19         ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 13:40 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

On Tue, 2022-04-05 at 08:07 +0300, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 08:03 +0300, Jarkko Sakkinen wrote:
> > On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > > created with permissions that support all usages of the pages, from the
> > > time the enclave is initialized until it is unloaded. For example,
> > > pages used by a JIT compiler or when code needs to otherwise be
> > > relocated need to always have RWX permissions.
> > > 
> > > SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
> > > and can be used to restrict the EPCM permissions of regular enclave
> > > pages within an initialized enclave.
> > > 
> > > Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
> > > restricting EPCM permissions. With this ioctl() the user specifies
> > > a page range and the EPCM permissions to be applied to all pages in
> > > the provided range. ENCLS[EMODPR] is run to restrict the EPCM
> > > permissions followed by the ENCLS[ETRACK] flow that will ensure
> > > no cached linear-to-physical address mappings to the changed
> > > pages remain.
> > > 
> > > It is possible for the permission change request to fail on any
> > > page within the provided range, either with an error encountered
> > > by the kernel or by the SGX hardware while running
> > > ENCLS[EMODPR]. To support partial success the ioctl() returns an
> > > error code based on failures encountered by the kernel as well
> > > > as two result output parameters: one for the number of bytes
> > > > that were successfully changed and one for the SGX return code.
> > > 
> > > The page table entry permissions are not impacted by the EPCM
> > > permission changes. VMAs and PTEs will continue to allow the
> > > maximum vetted permissions determined at the time the pages
> > > are added to the enclave. The SGX error code in a page fault
> > > will indicate if it was an EPCM permission check that prevented
> > > an access attempt.
> > > 
> > > No checking is done to ensure that the permissions are actually
> > > being restricted. This is because the enclave may have relaxed
> > > the EPCM permissions from within the enclave without letting the
> > > kernel know. An attempt to relax permissions using this call will
> > > be ignored by the hardware.
> > > 
> > > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > > ---
> > > Changes since V2:
> > > - Include the sgx_ioc_sgx2_ready() utility
> > >   that previously was in "x86/sgx: Support relaxing of enclave page
> > >   permissions" that is removed from the next version.
> > > - Few renames requested by Jarkko:
> > >   struct sgx_enclave_restrict_perm ->
> > >          struct sgx_enclave_restrict_permissions
> > >   sgx_enclave_restrict_perm()     ->
> > >          sgx_enclave_restrict_permissions()
> > >   sgx_ioc_enclave_restrict_perm() ->
> > >          sgx_ioc_enclave_restrict_permissions()
> > > - Make EPCM permissions independent from kernel view of
> > >   permissions.  (Jarkko)
> > >   - Remove attempt at runtime tracking of EPCM permissions
> > >     (sgx_encl_page->vm_run_prot_bits).
> > >   - Do not flush page table entries - they are no longer impacted by
> > >     EPCM permission changes.
> > >   - Modify changelog to reflect new architecture.
> > > - Ensure at least PROT_READ is requested - enclave requires read
> > >   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
> > > 
> > > Changes since V1:
> > > - Change terminology to use "relax" instead of "extend" to refer to
> > >   the case when enclave page permissions are added (Dave).
> > > - Use ioctl() in commit message (Dave).
> > > - Add examples on what permissions would be allowed (Dave).
> > > - Split enclave page permission changes into two ioctl()s, one for
> > >   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
> > >   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
> > >   (Jarkko).
> > > - In support of the ioctl() name change the following names have been
> > >   changed:
> > >   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
> > >   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
> > >   sgx_page_modp() -> sgx_enclave_restrict_perm()
> > > - ioctl() takes entire secinfo as input instead of
> > >   page permissions only (Jarkko).
> > > - Fix kernel-doc to include () in function name.
> > > - Create and use utility for the ETRACK flow.
> > > - Fixups in comments
> > > - Move kernel-doc to function that provides documentation for
> > >   Documentation/x86/sgx.rst.
> > > - Remove redundant comment.
> > > - Make explicit which members of struct sgx_enclave_restrict_perm
> > >   are for output (Dave).
> > > 
> > >  arch/x86/include/uapi/asm/sgx.h |  21 +++
> > >  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
> > >  2 files changed, 263 insertions(+)
> > > 
> > > diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> > > index f4b81587e90b..a0a24e94fb27 100644
> > > --- a/arch/x86/include/uapi/asm/sgx.h
> > > +++ b/arch/x86/include/uapi/asm/sgx.h
> > > @@ -29,6 +29,8 @@ enum sgx_page_flags {
> > >         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
> > >  #define SGX_IOC_VEPC_REMOVE_ALL \
> > >         _IO(SGX_MAGIC, 0x04)
> > > +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> > > +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> > >  
> > >  /**
> > >   * struct sgx_enclave_create - parameter structure for the
> > > @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
> > >         __u64 fd;
> > >  };
> > >  
> > > +/**
> > > + * struct sgx_enclave_restrict_permissions - parameters for ioctl
> > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > + * @offset:    starting page offset (page aligned relative to enclave base
> > > + *             address defined in SECS)
> > > + * @length:    length of memory (multiple of the page size)
> > > + * @secinfo:   address for the SECINFO data containing the new permission bits
> > > + *             for pages in range described by @offset and @length
> > > + * @result:    (output) SGX result code of ENCLS[EMODPR] function
> > > + * @count:     (output) bytes successfully changed (multiple of page size)
> > > + */
> > > +struct sgx_enclave_restrict_permissions {
> > > +       __u64 offset;
> > > +       __u64 length;
> > > +       __u64 secinfo;
> > > +       __u64 result;
> > > +       __u64 count;
> > > +};
> > > +
> > >  struct sgx_enclave_run;
> > >  
> > >  /**
> > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > index 0460fd224a05..4d88bfd163e7 100644
> > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
> > >         return sgx_set_attribute(&encl->attributes_mask, params.fd);
> > >  }
> > >  
> > > +/*
> > > + * Ensure enclave is ready for SGX2 functions. Readiness is checked
> > > + * by ensuring the hardware supports SGX2 and the enclave is initialized
> > > + * and thus able to handle requests to modify pages within it.
> > > + */
> > > +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
> > > +{
> > > +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> > > +               return -ENODEV;
> > > +
> > > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > > +               return -EINVAL;
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +/*
> > > + * Return valid permission fields from a secinfo structure provided by
> > > + * user space. The secinfo structure is required to only have bits in
> > > + * the permission fields set.
> > > + */
> > > +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > > +{
> > > +       struct sgx_secinfo secinfo;
> > > +       u64 perm;
> > > +
> > > +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
> > > +                          sizeof(secinfo)))
> > > +               return -EFAULT;
> > > +
> > > +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > > +               return -EINVAL;
> > > +
> > > +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > > +               return -EINVAL;
> > > +
> > > +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > > +
> > > +       /*
> > > +        * Read access is required for the enclave to be able to use the page.
> > > +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
> > > +        * read access.
> > > +        */
> > > +       if (!(perm & SGX_SECINFO_R))
> > > +               return -EINVAL;
> > > +
> > > +       *secinfo_perm = perm;
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +/*
> > > + * Some SGX functions require that no cached linear-to-physical address
> > > + * mappings are present before they can succeed. Collaborate with
> > > + * hardware via ENCLS[ETRACK] to ensure that all cached
> > > + * linear-to-physical address mappings belonging to all threads of
> > > + * the enclave are cleared. See sgx_encl_cpumask() for details.
> > > + */
> > > +static int sgx_enclave_etrack(struct sgx_encl *encl)
> > > +{
> > > +       void *epc_virt;
> > > +       int ret;
> > > +
> > > +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> > > +       ret = __etrack(epc_virt);
> > > +       if (ret) {
> > > +               /*
> > > +                * ETRACK only fails when there is an OS issue. For
> > > +                * example, two consecutive ETRACK was sent without
> > > +                * completed IPI between.
> > > +                */
> > > +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> > > +               /*
> > > +                * Send IPIs to kick CPUs out of the enclave and
> > > +                * try ETRACK again.
> > > +                */
> > > +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > +               ret = __etrack(epc_virt);
> > > +               if (ret) {
> > > +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
> > > +                                   ret, ret);
> > > +                       return -EFAULT;
> > > +               }
> > > +       }
> > > +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +/**
> > > + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
> > > + * @encl:      Enclave to which the pages belong.
> > > + * @modp:      Checked parameters from user on which pages need modifying.
> > > + * @secinfo_perm: New (validated) permission bits.
> > > + *
> > > + * Return:
> > > + * - 0:                Success.
> > > + * - -errno:   Otherwise.
> > > + */
> > > +static long
> > > +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> > > +                                struct sgx_enclave_restrict_permissions *modp,
> > > +                                u64 secinfo_perm)
> > > +{
> > > +       struct sgx_encl_page *entry;
> > > +       struct sgx_secinfo secinfo;
> > > +       unsigned long addr;
> > > +       unsigned long c;
> > > +       void *epc_virt;
> > > +       int ret;
> > > +
> > > +       memset(&secinfo, 0, sizeof(secinfo));
> > > +       secinfo.flags = secinfo_perm;
> > > +
> > > +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> > > +               addr = encl->base + modp->offset + c;
> > > +
> > > +               mutex_lock(&encl->lock);
> > > +
> > > +               entry = sgx_encl_load_page(encl, addr);
> > > +               if (IS_ERR(entry)) {
> > > +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> > > +                       goto out_unlock;
> > > +               }
> > > +
> > > +               /*
> > > +                * Changing EPCM permissions is only supported on regular
> > > +                * SGX pages. Attempting this change on other pages will
> > > +                * result in #PF.
> > > +                */
> > > +               if (entry->type != SGX_PAGE_TYPE_REG) {
> > > +                       ret = -EINVAL;
> > > +                       goto out_unlock;
> > > +               }
> > > +
> > > +               /*
> > > +                * Do not verify the permission bits requested. Kernel
> > > +                * has no control over how EPCM permissions can be relaxed
> > > +                * from within the enclave. ENCLS[EMODPR] can only
> > > +                * remove existing EPCM permissions, attempting to set
> > > +                * new permissions will be ignored by the hardware.
> > > +                */
> > > +
> > > +               /* Change EPCM permissions. */
> > > +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> > > +               ret = __emodpr(&secinfo, epc_virt);
> > > +               if (encls_faulted(ret)) {
> > > +                       /*
> > > +                        * All possible faults should be avoidable:
> > > +                        * parameters have been checked, will only change
> > > +                        * permissions of a regular page, and no concurrent
> > > +                        * SGX1/SGX2 ENCLS instructions since these
> > > +                        * are protected with mutex.
> > > +                        */
> > > +                       pr_err_once("EMODPR encountered exception %d\n",
> > > +                                   ENCLS_TRAPNR(ret));
> > > +                       ret = -EFAULT;
> > > +                       goto out_unlock;
> > > +               }
> > > +               if (encls_failed(ret)) {
> > > +                       modp->result = ret;
> > > +                       ret = -EFAULT;
> > > +                       goto out_unlock;
> > > +               }
> > > +
> > > +               ret = sgx_enclave_etrack(encl);
> > > +               if (ret) {
> > > +                       ret = -EFAULT;
> > > +                       goto out_unlock;
> > > +               }
> > > +
> > > +               mutex_unlock(&encl->lock);
> > > +       }
> > > +
> > > +       ret = 0;
> > > +       goto out;
> > > +
> > > +out_unlock:
> > > +       mutex_unlock(&encl->lock);
> > > +out:
> > > +       modp->count = c;
> > > +
> > > +       return ret;
> > > +}
> > > +
> > > +/**
> > > + * sgx_ioc_enclave_restrict_permissions() - handler for
> > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > + * @encl:      an enclave pointer
> > > + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
> > > + *             instance
> > > + *
> > > + * SGX2 distinguishes between relaxing and restricting the enclave page
> > > + * permissions maintained by the hardware (EPCM permissions) of pages
> > > + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
> > > + *
> > > + * EPCM permissions cannot be restricted from within the enclave, the enclave
> > > + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
> > > + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
> > > + * will be ignored by the hardware.
> > > + *
> > > + * Return:
> > > + * - 0:                Success
> > > + * - -errno:   Otherwise
> > > + */
> > > +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> > > +                                                void __user *arg)
> > > +{
> > > +       struct sgx_enclave_restrict_permissions params;
> > > +       u64 secinfo_perm;
> > > +       long ret;
> > > +
> > > +       ret = sgx_ioc_sgx2_ready(encl);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       if (copy_from_user(&params, arg, sizeof(params)))
> > > +               return -EFAULT;
> > > +
> > > +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> > > +               return -EINVAL;
> > > +
> > > +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
> > > +                                        &secinfo_perm);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       if (params.result || params.count)
> > > +               return -EINVAL;
> > > +
> > > +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
> > > +
> > > +       if (copy_to_user(arg, &params, sizeof(params)))
> > > +               return -EFAULT;
> > > +
> > > +       return ret;
> > > +}
> > > +
> > >  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > >  {
> > >         struct sgx_encl *encl = filep->private_data;
> > > @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > >         case SGX_IOC_ENCLAVE_PROVISION:
> > >                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
> > >                 break;
> > > +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
> > > +               ret = sgx_ioc_enclave_restrict_permissions(encl,
> > > +                                                          (void __user *)arg);
> > > +               break;
> > >         default:
> > >                 ret = -ENOIOCTLCMD;
> > >                 break;
> > 
> > > I think this is a big improvement, all things considered. I just started
> > > a kernel build and will see if I can get this wired to our code:
> > 
> > https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943
> > 
> > I'll report my findings later on.
> 
> I pulled the patches from sgx2_submitted_v3_plus_rwx branch. Just
> sanity checking that it is v3, correct?

I'm getting EINVAL with SECINFO that I think is legit:

let mut secinfo_buf: [u8; 64] = [0; 64]; // Initialize with zeros
secinfo_buf[0] = 1; // READ
secinfo_buf[1] = 2; // Regular

I made a small bpftrace script, and here's what happens:

$ cat sgx.bt
kretprobe:sgx_ioctl /retval != 0/
{
	printf("sgx_ioctl: %d\n", retval)
}

kretprobe:sgx_perm_from_user_secinfo.constprop.0 /retval/
{
	printf("sgx_perm_from_user_secinfo.constprop.0 %d\n", retval)
}

kretprobe:sgx_enclave_restrict_permissions /retval/
{
	printf("sgx_enclave_restrict_permissions: %d\n", retval)
}

$ sudo bpftrace sgx.bt
[sudo] password for jarkko: 
Attaching 3 probes...
sgx_perm_from_user_secinfo.constprop.0 -22
sgx_ioctl: -22

It could be that I'm doing something wrong, but I do not immediately see
anything obvious...
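
For reference, the checks in sgx_perm_from_user_secinfo() from the quoted
patch can be replayed in user space against the same 64 bytes. This is only
a sketch under stated assumptions: the SECINFO_* constants are assumed to
match the kernel's permission bits (R, W, X in bits 0-2), the flags field is
assumed to be the first eight bytes in little-endian order followed by 56
reserved bytes, and perm_from_secinfo() is a hypothetical helper, not a
kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values; the kernel's own definitions are authoritative. */
#define SECINFO_R		0x1ULL
#define SECINFO_W		0x2ULL
#define SECINFO_X		0x4ULL
#define SECINFO_PERMISSION_MASK	(SECINFO_R | SECINFO_W | SECINFO_X)

/* Hypothetical helper mirroring the order of checks in the quoted patch. */
static int perm_from_secinfo(const uint8_t buf[64], uint64_t *perm)
{
	uint64_t flags = 0;
	size_t i;

	assert(buf != NULL && perm != NULL);

	/* Decode the first eight bytes as a little-endian flags field. */
	for (i = 0; i < 8; i++)
		flags |= (uint64_t)buf[i] << (8 * i);

	/* Only permission bits may be set in flags. */
	if (flags & ~SECINFO_PERMISSION_MASK)
		return -EINVAL;

	/* The remaining 56 bytes are reserved and must be zero. */
	for (i = 8; i < 64; i++)
		if (buf[i])
			return -EINVAL;

	/* Read permission is mandatory. */
	if (!(flags & SECINFO_R))
		return -EINVAL;

	*perm = flags & SECINFO_PERMISSION_MASK;
	return 0;
}
```

If those assumptions hold, a buffer with byte 0 set to 1 and byte 1 set to 2
decodes to flags 0x201, and the first check rejects it because bit 9 lies
outside the permission mask. The same buffer with byte 1 left at zero passes
all three checks in this model.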

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 13:40       ` Jarkko Sakkinen
@ 2022-04-05 14:19         ` Jarkko Sakkinen
  2022-04-05 14:27           ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 14:19 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

On Tue, 2022-04-05 at 16:40 +0300, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 08:07 +0300, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 08:03 +0300, Jarkko Sakkinen wrote:
> > > On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > > > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > > > created with permissions that support all usages of the pages, from the
> > > > time the enclave is initialized until it is unloaded. For example,
> > > > pages used by a JIT compiler or when code needs to otherwise be
> > > > relocated need to always have RWX permissions.
> > > > 
> > > > SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
> > > > and can be used to restrict the EPCM permissions of regular enclave
> > > > pages within an initialized enclave.
> > > > 
> > > > Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
> > > > restricting EPCM permissions. With this ioctl() the user specifies
> > > > a page range and the EPCM permissions to be applied to all pages in
> > > > the provided range. ENCLS[EMODPR] is run to restrict the EPCM
> > > > permissions followed by the ENCLS[ETRACK] flow that will ensure
> > > > no cached linear-to-physical address mappings to the changed
> > > > pages remain.
> > > > 
> > > > It is possible for the permission change request to fail on any
> > > > page within the provided range, either with an error encountered
> > > > by the kernel or by the SGX hardware while running
> > > > ENCLS[EMODPR]. To support partial success the ioctl() returns an
> > > > error code based on failures encountered by the kernel as well
> > > > > as two result output parameters: one for the number of bytes
> > > > > that were successfully changed and one for the SGX return code.
> > > > 
> > > > The page table entry permissions are not impacted by the EPCM
> > > > permission changes. VMAs and PTEs will continue to allow the
> > > > maximum vetted permissions determined at the time the pages
> > > > are added to the enclave. The SGX error code in a page fault
> > > > will indicate if it was an EPCM permission check that prevented
> > > > an access attempt.
> > > > 
> > > > No checking is done to ensure that the permissions are actually
> > > > being restricted. This is because the enclave may have relaxed
> > > > the EPCM permissions from within the enclave without letting the
> > > > kernel know. An attempt to relax permissions using this call will
> > > > be ignored by the hardware.
> > > > 
> > > > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > > > ---
> > > > Changes since V2:
> > > > - Include the sgx_ioc_sgx2_ready() utility
> > > >   that previously was in "x86/sgx: Support relaxing of enclave page
> > > >   permissions" that is removed from the next version.
> > > > - Few renames requested by Jarkko:
> > > >   struct sgx_enclave_restrict_perm ->
> > > >          struct sgx_enclave_restrict_permissions
> > > >   sgx_enclave_restrict_perm()     ->
> > > >          sgx_enclave_restrict_permissions()
> > > >   sgx_ioc_enclave_restrict_perm() ->
> > > >          sgx_ioc_enclave_restrict_permissions()
> > > > - Make EPCM permissions independent from kernel view of
> > > >   permissions.  (Jarkko)
> > > >   - Remove attempt at runtime tracking of EPCM permissions
> > > >     (sgx_encl_page->vm_run_prot_bits).
> > > >   - Do not flush page table entries - they are no longer impacted by
> > > >     EPCM permission changes.
> > > >   - Modify changelog to reflect new architecture.
> > > > - Ensure at least PROT_READ is requested - enclave requires read
> > > >   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
> > > > 
> > > > Changes since V1:
> > > > - Change terminology to use "relax" instead of "extend" to refer to
> > > >   the case when enclave page permissions are added (Dave).
> > > > - Use ioctl() in commit message (Dave).
> > > > - Add examples on what permissions would be allowed (Dave).
> > > > - Split enclave page permission changes into two ioctl()s, one for
> > > >   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
> > > >   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
> > > >   (Jarkko).
> > > > - In support of the ioctl() name change the following names have been
> > > >   changed:
> > > >   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
> > > >   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
> > > >   sgx_page_modp() -> sgx_enclave_restrict_perm()
> > > > - ioctl() takes entire secinfo as input instead of
> > > >   page permissions only (Jarkko).
> > > > - Fix kernel-doc to include () in function name.
> > > > - Create and use utility for the ETRACK flow.
> > > > - Fixups in comments
> > > > - Move kernel-doc to function that provides documentation for
> > > >   Documentation/x86/sgx.rst.
> > > > - Remove redundant comment.
> > > > - Make explicit which members of struct sgx_enclave_restrict_perm
> > > >   are for output (Dave).
> > > > 
> > > >  arch/x86/include/uapi/asm/sgx.h |  21 +++
> > > >  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
> > > >  2 files changed, 263 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> > > > index f4b81587e90b..a0a24e94fb27 100644
> > > > --- a/arch/x86/include/uapi/asm/sgx.h
> > > > +++ b/arch/x86/include/uapi/asm/sgx.h
> > > > @@ -29,6 +29,8 @@ enum sgx_page_flags {
> > > >         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
> > > >  #define SGX_IOC_VEPC_REMOVE_ALL \
> > > >         _IO(SGX_MAGIC, 0x04)
> > > > +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> > > > +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> > > >  
> > > >  /**
> > > >   * struct sgx_enclave_create - parameter structure for the
> > > > @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
> > > >         __u64 fd;
> > > >  };
> > > >  
> > > > +/**
> > > > + * struct sgx_enclave_restrict_permissions - parameters for ioctl
> > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > + * @offset:    starting page offset (page aligned relative to enclave base
> > > > + *             address defined in SECS)
> > > > + * @length:    length of memory (multiple of the page size)
> > > > + * @secinfo:   address for the SECINFO data containing the new permission bits
> > > > + *             for pages in range described by @offset and @length
> > > > + * @result:    (output) SGX result code of ENCLS[EMODPR] function
> > > > + * @count:     (output) bytes successfully changed (multiple of page size)
> > > > + */
> > > > +struct sgx_enclave_restrict_permissions {
> > > > +       __u64 offset;
> > > > +       __u64 length;
> > > > +       __u64 secinfo;
> > > > +       __u64 result;
> > > > +       __u64 count;
> > > > +};
> > > > +
> > > >  struct sgx_enclave_run;
> > > >  
> > > >  /**
> > > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > index 0460fd224a05..4d88bfd163e7 100644
> > > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
> > > >         return sgx_set_attribute(&encl->attributes_mask, params.fd);
> > > >  }
> > > >  
> > > > +/*
> > > > + * Ensure enclave is ready for SGX2 functions. Readiness is checked
> > > > + * by ensuring the hardware supports SGX2 and the enclave is initialized
> > > > + * and thus able to handle requests to modify pages within it.
> > > > + */
> > > > +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
> > > > +{
> > > > +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> > > > +               return -ENODEV;
> > > > +
> > > > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > > > +               return -EINVAL;
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Return valid permission fields from a secinfo structure provided by
> > > > + * user space. The secinfo structure is required to only have bits in
> > > > + * the permission fields set.
> > > > + */
> > > > +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > > > +{
> > > > +       struct sgx_secinfo secinfo;
> > > > +       u64 perm;
> > > > +
> > > > +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
> > > > +                          sizeof(secinfo)))
> > > > +               return -EFAULT;
> > > > +
> > > > +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > > > +               return -EINVAL;
> > > > +
> > > > +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > > > +               return -EINVAL;
> > > > +
> > > > +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > > > +
> > > > +       /*
> > > > +        * Read access is required for the enclave to be able to use the page.
> > > > +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
> > > > +        * read access.
> > > > +        */
> > > > +       if (!(perm & SGX_SECINFO_R))
> > > > +               return -EINVAL;
> > > > +
> > > > +       *secinfo_perm = perm;
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * Some SGX functions require that no cached linear-to-physical address
> > > > + * mappings are present before they can succeed. Collaborate with
> > > > + * hardware via ENCLS[ETRACK] to ensure that all cached
> > > > + * linear-to-physical address mappings belonging to all threads of
> > > > + * the enclave are cleared. See sgx_encl_cpumask() for details.
> > > > + */
> > > > +static int sgx_enclave_etrack(struct sgx_encl *encl)
> > > > +{
> > > > +       void *epc_virt;
> > > > +       int ret;
> > > > +
> > > > +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> > > > +       ret = __etrack(epc_virt);
> > > > +       if (ret) {
> > > > +               /*
> > > > +                * ETRACK only fails when there is an OS issue. For
> > > > +                * example, two consecutive ETRACKs were sent without
> > > > +                * a completed IPI in between.
> > > > +                */
> > > > +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> > > > +               /*
> > > > +                * Send IPIs to kick CPUs out of the enclave and
> > > > +                * try ETRACK again.
> > > > +                */
> > > > +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > +               ret = __etrack(epc_virt);
> > > > +               if (ret) {
> > > > +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
> > > > +                                   ret, ret);
> > > > +                       return -EFAULT;
> > > > +               }
> > > > +       }
> > > > +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > +
> > > > +       return 0;
> > > > +}
> > > > +
> > > > +/**
> > > > + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
> > > > + * @encl:      Enclave to which the pages belong.
> > > > + * @modp:      Checked parameters from user on which pages need modifying.
> > > > + * @secinfo_perm: New (validated) permission bits.
> > > > + *
> > > > + * Return:
> > > > + * - 0:                Success.
> > > > + * - -errno:   Otherwise.
> > > > + */
> > > > +static long
> > > > +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > +                                struct sgx_enclave_restrict_permissions *modp,
> > > > +                                u64 secinfo_perm)
> > > > +{
> > > > +       struct sgx_encl_page *entry;
> > > > +       struct sgx_secinfo secinfo;
> > > > +       unsigned long addr;
> > > > +       unsigned long c;
> > > > +       void *epc_virt;
> > > > +       int ret;
> > > > +
> > > > +       memset(&secinfo, 0, sizeof(secinfo));
> > > > +       secinfo.flags = secinfo_perm;
> > > > +
> > > > +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> > > > +               addr = encl->base + modp->offset + c;
> > > > +
> > > > +               mutex_lock(&encl->lock);
> > > > +
> > > > +               entry = sgx_encl_load_page(encl, addr);
> > > > +               if (IS_ERR(entry)) {
> > > > +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> > > > +                       goto out_unlock;
> > > > +               }
> > > > +
> > > > +               /*
> > > > +                * Changing EPCM permissions is only supported on regular
> > > > +                * SGX pages. Attempting this change on other pages will
> > > > +                * result in #PF.
> > > > +                */
> > > > +               if (entry->type != SGX_PAGE_TYPE_REG) {
> > > > +                       ret = -EINVAL;
> > > > +                       goto out_unlock;
> > > > +               }
> > > > +
> > > > +               /*
> > > > +                * Do not verify the permission bits requested. Kernel
> > > > +                * has no control over how EPCM permissions can be relaxed
> > > > +                * from within the enclave. ENCLS[EMODPR] can only
> > > > +                * remove existing EPCM permissions, attempting to set
> > > > +                * new permissions will be ignored by the hardware.
> > > > +                */
> > > > +
> > > > +               /* Change EPCM permissions. */
> > > > +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> > > > +               ret = __emodpr(&secinfo, epc_virt);
> > > > +               if (encls_faulted(ret)) {
> > > > +                       /*
> > > > +                        * All possible faults should be avoidable:
> > > > +                        * parameters have been checked, will only change
> > > > +                        * permissions of a regular page, and no concurrent
> > > > +                        * SGX1/SGX2 ENCLS instructions since these
> > > > +                        * are protected with mutex.
> > > > +                        */
> > > > +                       pr_err_once("EMODPR encountered exception %d\n",
> > > > +                                   ENCLS_TRAPNR(ret));
> > > > +                       ret = -EFAULT;
> > > > +                       goto out_unlock;
> > > > +               }
> > > > +               if (encls_failed(ret)) {
> > > > +                       modp->result = ret;
> > > > +                       ret = -EFAULT;
> > > > +                       goto out_unlock;
> > > > +               }
> > > > +
> > > > +               ret = sgx_enclave_etrack(encl);
> > > > +               if (ret) {
> > > > +                       ret = -EFAULT;
> > > > +                       goto out_unlock;
> > > > +               }
> > > > +
> > > > +               mutex_unlock(&encl->lock);
> > > > +       }
> > > > +
> > > > +       ret = 0;
> > > > +       goto out;
> > > > +
> > > > +out_unlock:
> > > > +       mutex_unlock(&encl->lock);
> > > > +out:
> > > > +       modp->count = c;
> > > > +
> > > > +       return ret;
> > > > +}
> > > > +
> > > > +/**
> > > > + * sgx_ioc_enclave_restrict_permissions() - handler for
> > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > + * @encl:      an enclave pointer
> > > > + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
> > > > + *             instance
> > > > + *
> > > > + * SGX2 distinguishes between relaxing and restricting the enclave page
> > > > + * permissions maintained by the hardware (EPCM permissions) of pages
> > > > + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
> > > > + *
> > > > + * EPCM permissions cannot be restricted from within the enclave, the enclave
> > > > + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
> > > > + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
> > > > + * will be ignored by the hardware.
> > > > + *
> > > > + * Return:
> > > > + * - 0:                Success
> > > > + * - -errno:   Otherwise
> > > > + */
> > > > +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > +                                                void __user *arg)
> > > > +{
> > > > +       struct sgx_enclave_restrict_permissions params;
> > > > +       u64 secinfo_perm;
> > > > +       long ret;
> > > > +
> > > > +       ret = sgx_ioc_sgx2_ready(encl);
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       if (copy_from_user(&params, arg, sizeof(params)))
> > > > +               return -EFAULT;
> > > > +
> > > > +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> > > > +               return -EINVAL;
> > > > +
> > > > +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
> > > > +                                        &secinfo_perm);
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       if (params.result || params.count)
> > > > +               return -EINVAL;
> > > > +
> > > > +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
> > > > +
> > > > +       if (copy_to_user(arg, &params, sizeof(params)))
> > > > +               return -EFAULT;
> > > > +
> > > > +       return ret;
> > > > +}
> > > > +
> > > >  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > >  {
> > > >         struct sgx_encl *encl = filep->private_data;
> > > > @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > >         case SGX_IOC_ENCLAVE_PROVISION:
> > > >                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
> > > >                 break;
> > > > +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
> > > > +               ret = sgx_ioc_enclave_restrict_permissions(encl,
> > > > +                                                          (void __user *)arg);
> > > > +               break;
> > > >         default:
> > > >                 ret = -ENOIOCTLCMD;
> > > >                 break;
> > > 
> > > I think this is a big improvement, all things considered. I just started
> > > a kernel build and will see if I can get this wired to our code:
> > > 
> > > https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943
> > > 
> > > I'll report my findings later on.
> > 
> > I pulled the patches from sgx2_submitted_v3_plus_rwx branch. Just
> > sanity checking that it is v3, correct?
> 
> I'm getting EINVAL with SECINFO that I think is legit:
> 
> let mut secinfo_buf: [u8; 64] = [0; 64]; // Initialize with zeros
> secinfo_buf[0] = 1; // READ
> secinfo_buf[1] = 2; // Regular
> 
> I made a small bpftrace script, and here's what happens:
> 
> $ cat sgx.bt
> kretprobe:sgx_ioctl /retval != 0/
> {
>         printf("sgx_ioctl: %d\n", retval)
> }
> 
> kretprobe:sgx_perm_from_user_secinfo.constprop.0 /retval/
> {
>         printf("sgx_perm_from_user_secinfo.constprop.0 %d\n", retval)
> }
> 
> kretprobe:sgx_enclave_restrict_permissions /retval/
> {
>         printf("sgx_enclave_restrict_permissions: %d\n", retval)
> }
> 
> $ sudo bpftrace sgx.bt
> [sudo] password for jarkko: 
> Attaching 3 probes...
> sgx_perm_from_user_secinfo.constprop.0 -22
> sgx_ioctl: -22
> 
> Could be that I'm doing something wrong, but I do not immediately see
> anything obvious...

It was my bad: the page type must not be set in the SECINFO, i.e.

let mut secinfo_buf: [u8; 64] = [0; 64];
secinfo_buf[0] = 1;
secinfo_buf[1] = 0;
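
For reference, a minimal sketch of why the first attempt was rejected,
assuming the checks quoted above in sgx_perm_from_user_secinfo(). Byte 1
of the little-endian flags field holds the page type, so writing 2
(regular page) there sets a bit outside the R/W/X permission mask and
the helper returns -EINVAL (-22). check_secinfo() and the SECINFO_*
names are hypothetical userspace stand-ins, not kernel symbols, and the
mask value is an assumption based on R/W/X occupying bits 0-2.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: R/W/X occupy bits 0-2 of SECINFO.FLAGS. */
#define SECINFO_R         (1ULL << 0)
#define SECINFO_PERM_MASK 0x7ULL
#define SECINFO_SIZE      64

/*
 * Hypothetical userspace model of the checks done by
 * sgx_perm_from_user_secinfo(). Returns 0 on success or -EINVAL.
 */
static int check_secinfo(const uint8_t buf[SECINFO_SIZE], uint64_t *perm)
{
	uint64_t flags = 0;
	size_t i;

	assert(buf != NULL && perm != NULL);

	/* Assemble the little-endian 64-bit flags field from bytes 0-7. */
	for (i = 0; i < sizeof(flags); i++)
		flags |= (uint64_t)buf[i] << (8 * i);

	/* Anything outside R/W/X, including the page type, is rejected. */
	if (flags & ~SECINFO_PERM_MASK)
		return -EINVAL;

	/* Bytes 8-63 are reserved and must be zero. */
	for (i = sizeof(flags); i < SECINFO_SIZE; i++)
		if (buf[i])
			return -EINVAL;

	/* At least read permission must be requested. */
	if (!(flags & SECINFO_R))
		return -EINVAL;

	*perm = flags;
	return 0;
}
```

With the buffer from the first attempt (byte 0 = 1, byte 1 = 2) this
model returns -EINVAL; with byte 1 cleared it returns 0 and reports a
permission value of 1 (read only).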
 
BR, Jarkko



* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 14:19         ` Jarkko Sakkinen
@ 2022-04-05 14:27           ` Jarkko Sakkinen
  2022-04-05 14:52             ` Jarkko Sakkinen
  2022-04-05 16:40             ` Reinette Chatre
  0 siblings, 2 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 14:27 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

On Tue, 2022-04-05 at 17:19 +0300, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 16:40 +0300, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 08:07 +0300, Jarkko Sakkinen wrote:
> > > On Tue, 2022-04-05 at 08:03 +0300, Jarkko Sakkinen wrote:
> > > > On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > > > > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > > > > created with permissions that support all usages of the pages, from the
> > > > > time the enclave is initialized until it is unloaded. For example,
> > > > > pages used by a JIT compiler or when code needs to otherwise be
> > > > > relocated need to always have RWX permissions.
> > > > > 
> > > > > SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
> > > > > and can be used to restrict the EPCM permissions of regular enclave
> > > > > pages within an initialized enclave.
> > > > > 
> > > > > Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
> > > > > restricting EPCM permissions. With this ioctl() the user specifies
> > > > > a page range and the EPCM permissions to be applied to all pages in
> > > > > the provided range. ENCLS[EMODPR] is run to restrict the EPCM
> > > > > permissions followed by the ENCLS[ETRACK] flow that will ensure
> > > > > no cached linear-to-physical address mappings to the changed
> > > > > pages remain.
> > > > > 
> > > > > It is possible for the permission change request to fail on any
> > > > > page within the provided range, either with an error encountered
> > > > > by the kernel or by the SGX hardware while running
> > > > > ENCLS[EMODPR]. To support partial success the ioctl() returns an
> > > > > error code based on failures encountered by the kernel as well
> > > > > as two result output parameters: one for the number of pages
> > > > > that were successfully changed and one for the SGX return code.
> > > > > 
> > > > > The page table entry permissions are not impacted by the EPCM
> > > > > permission changes. VMAs and PTEs will continue to allow the
> > > > > maximum vetted permissions determined at the time the pages
> > > > > are added to the enclave. The SGX error code in a page fault
> > > > > will indicate if it was an EPCM permission check that prevented
> > > > > an access attempt.
> > > > > 
> > > > > No checking is done to ensure that the permissions are actually
> > > > > being restricted. This is because the enclave may have relaxed
> > > > > the EPCM permissions from within the enclave without letting the
> > > > > kernel know. An attempt to relax permissions using this call will
> > > > > be ignored by the hardware.
> > > > > 
> > > > > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > > > > ---
> > > > > Changes since V2:
> > > > > - Include the sgx_ioc_sgx2_ready() utility
> > > > >   that previously was in "x86/sgx: Support relaxing of enclave page
> > > > >   permissions" that is removed from the next version.
> > > > > - Few renames requested by Jarkko:
> > > > >   struct sgx_enclave_restrict_perm ->
> > > > >          struct sgx_enclave_restrict_permissions
> > > > >   sgx_enclave_restrict_perm()     ->
> > > > >          sgx_enclave_restrict_permissions()
> > > > >   sgx_ioc_enclave_restrict_perm() ->
> > > > >          sgx_ioc_enclave_restrict_permissions()
> > > > > - Make EPCM permissions independent from kernel view of
> > > > >   permissions.  (Jarkko)
> > > > >   - Remove attempt at runtime tracking of EPCM permissions
> > > > >     (sgx_encl_page->vm_run_prot_bits).
> > > > >   - Do not flush page table entries - they are no longer impacted by
> > > > >     EPCM permission changes.
> > > > >   - Modify changelog to reflect new architecture.
> > > > > - Ensure at least PROT_READ is requested - enclave requires read
> > > > >   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
> > > > > 
> > > > > Changes since V1:
> > > > > - Change terminology to use "relax" instead of "extend" to refer to
> > > > >   the case when enclave page permissions are added (Dave).
> > > > > - Use ioctl() in commit message (Dave).
> > > > > - Add examples on what permissions would be allowed (Dave).
> > > > > - Split enclave page permission changes into two ioctl()s, one for
> > > > >   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
> > > > >   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
> > > > >   (Jarkko).
> > > > > - In support of the ioctl() name change the following names have been
> > > > >   changed:
> > > > >   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
> > > > >   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
> > > > >   sgx_page_modp() -> sgx_enclave_restrict_perm()
> > > > > - ioctl() takes entire secinfo as input instead of
> > > > >   page permissions only (Jarkko).
> > > > > - Fix kernel-doc to include () in function name.
> > > > > - Create and use utility for the ETRACK flow.
> > > > > - Fixups in comments
> > > > > - Move kernel-doc to function that provides documentation for
> > > > >   Documentation/x86/sgx.rst.
> > > > > - Remove redundant comment.
> > > > > - Make explicit which members of struct sgx_enclave_restrict_perm
> > > > >   are for output (Dave).
> > > > > 
> > > > >  arch/x86/include/uapi/asm/sgx.h |  21 +++
> > > > >  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
> > > > >  2 files changed, 263 insertions(+)
> > > > > 
> > > > > diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> > > > > index f4b81587e90b..a0a24e94fb27 100644
> > > > > --- a/arch/x86/include/uapi/asm/sgx.h
> > > > > +++ b/arch/x86/include/uapi/asm/sgx.h
> > > > > @@ -29,6 +29,8 @@ enum sgx_page_flags {
> > > > >         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
> > > > >  #define SGX_IOC_VEPC_REMOVE_ALL \
> > > > >         _IO(SGX_MAGIC, 0x04)
> > > > > +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> > > > > +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> > > > >  
> > > > >  /**
> > > > >   * struct sgx_enclave_create - parameter structure for the
> > > > > @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
> > > > >         __u64 fd;
> > > > >  };
> > > > >  
> > > > > +/**
> > > > > + * struct sgx_enclave_restrict_permissions - parameters for ioctl
> > > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > > + * @offset:    starting page offset (page aligned relative to enclave base
> > > > > + *             address defined in SECS)
> > > > > + * @length:    length of memory (multiple of the page size)
> > > > > + * @secinfo:   address for the SECINFO data containing the new permission bits
> > > > > + *             for pages in range described by @offset and @length
> > > > > + * @result:    (output) SGX result code of ENCLS[EMODPR] function
> > > > > + * @count:     (output) bytes successfully changed (multiple of page size)
> > > > > + */
> > > > > +struct sgx_enclave_restrict_permissions {
> > > > > +       __u64 offset;
> > > > > +       __u64 length;
> > > > > +       __u64 secinfo;
> > > > > +       __u64 result;
> > > > > +       __u64 count;
> > > > > +};
> > > > > +
> > > > >  struct sgx_enclave_run;
> > > > >  
> > > > >  /**
> > > > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > index 0460fd224a05..4d88bfd163e7 100644
> > > > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
> > > > >         return sgx_set_attribute(&encl->attributes_mask, params.fd);
> > > > >  }
> > > > >  
> > > > > +/*
> > > > > + * Ensure enclave is ready for SGX2 functions. Readiness is checked
> > > > > + * by ensuring the hardware supports SGX2 and the enclave is initialized
> > > > > + * and thus able to handle requests to modify pages within it.
> > > > > + */
> > > > > +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
> > > > > +{
> > > > > +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> > > > > +               return -ENODEV;
> > > > > +
> > > > > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Return valid permission fields from a secinfo structure provided by
> > > > > + * user space. The secinfo structure is required to only have bits in
> > > > > + * the permission fields set.
> > > > > + */
> > > > > +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > > > > +{
> > > > > +       struct sgx_secinfo secinfo;
> > > > > +       u64 perm;
> > > > > +
> > > > > +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
> > > > > +                          sizeof(secinfo)))
> > > > > +               return -EFAULT;
> > > > > +
> > > > > +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > > > > +
> > > > > +       /*
> > > > > +        * Read access is required for the enclave to be able to use the page.
> > > > > +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
> > > > > +        * read access.
> > > > > +        */
> > > > > +       if (!(perm & SGX_SECINFO_R))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       *secinfo_perm = perm;
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * Some SGX functions require that no cached linear-to-physical address
> > > > > + * mappings are present before they can succeed. Collaborate with
> > > > > + * hardware via ENCLS[ETRACK] to ensure that all cached
> > > > > + * linear-to-physical address mappings belonging to all threads of
> > > > > + * the enclave are cleared. See sgx_encl_cpumask() for details.
> > > > > + */
> > > > > +static int sgx_enclave_etrack(struct sgx_encl *encl)
> > > > > +{
> > > > > +       void *epc_virt;
> > > > > +       int ret;
> > > > > +
> > > > > +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> > > > > +       ret = __etrack(epc_virt);
> > > > > +       if (ret) {
> > > > > +               /*
> > > > > +                * ETRACK only fails when there is an OS issue. For
> > > > > +                * example, two consecutive ETRACKs were sent without
> > > > > +                * a completed IPI in between.
> > > > > +                */
> > > > > +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> > > > > +               /*
> > > > > +                * Send IPIs to kick CPUs out of the enclave and
> > > > > +                * try ETRACK again.
> > > > > +                */
> > > > > +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > > +               ret = __etrack(epc_virt);
> > > > > +               if (ret) {
> > > > > +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
> > > > > +                                   ret, ret);
> > > > > +                       return -EFAULT;
> > > > > +               }
> > > > > +       }
> > > > > +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > > +
> > > > > +       return 0;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
> > > > > + * @encl:      Enclave to which the pages belong.
> > > > > + * @modp:      Checked parameters from user on which pages need modifying.
> > > > > + * @secinfo_perm: New (validated) permission bits.
> > > > > + *
> > > > > + * Return:
> > > > > + * - 0:                Success.
> > > > > + * - -errno:   Otherwise.
> > > > > + */
> > > > > +static long
> > > > > +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > > +                                struct sgx_enclave_restrict_permissions *modp,
> > > > > +                                u64 secinfo_perm)
> > > > > +{
> > > > > +       struct sgx_encl_page *entry;
> > > > > +       struct sgx_secinfo secinfo;
> > > > > +       unsigned long addr;
> > > > > +       unsigned long c;
> > > > > +       void *epc_virt;
> > > > > +       int ret;
> > > > > +
> > > > > +       memset(&secinfo, 0, sizeof(secinfo));
> > > > > +       secinfo.flags = secinfo_perm;
> > > > > +
> > > > > +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> > > > > +               addr = encl->base + modp->offset + c;
> > > > > +
> > > > > +               mutex_lock(&encl->lock);
> > > > > +
> > > > > +               entry = sgx_encl_load_page(encl, addr);
> > > > > +               if (IS_ERR(entry)) {
> > > > > +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               /*
> > > > > +                * Changing EPCM permissions is only supported on regular
> > > > > +                * SGX pages. Attempting this change on other pages will
> > > > > +                * result in #PF.
> > > > > +                */
> > > > > +               if (entry->type != SGX_PAGE_TYPE_REG) {
> > > > > +                       ret = -EINVAL;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               /*
> > > > > +                * Do not verify the permission bits requested. Kernel
> > > > > +                * has no control over how EPCM permissions can be relaxed
> > > > > +                * from within the enclave. ENCLS[EMODPR] can only
> > > > > +                * remove existing EPCM permissions, attempting to set
> > > > > +                * new permissions will be ignored by the hardware.
> > > > > +                */
> > > > > +
> > > > > +               /* Change EPCM permissions. */
> > > > > +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> > > > > +               ret = __emodpr(&secinfo, epc_virt);
> > > > > +               if (encls_faulted(ret)) {
> > > > > +                       /*
> > > > > +                        * All possible faults should be avoidable:
> > > > > +                        * parameters have been checked, will only change
> > > > > +                        * permissions of a regular page, and no concurrent
> > > > > +                        * SGX1/SGX2 ENCLS instructions since these
> > > > > +                        * are protected with mutex.
> > > > > +                        */
> > > > > +                       pr_err_once("EMODPR encountered exception %d\n",
> > > > > +                                   ENCLS_TRAPNR(ret));
> > > > > +                       ret = -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +               if (encls_failed(ret)) {
> > > > > +                       modp->result = ret;
> > > > > +                       ret = -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               ret = sgx_enclave_etrack(encl);
> > > > > +               if (ret) {
> > > > > +                       ret = -EFAULT;
> > > > > +                       goto out_unlock;
> > > > > +               }
> > > > > +
> > > > > +               mutex_unlock(&encl->lock);
> > > > > +       }
> > > > > +
> > > > > +       ret = 0;
> > > > > +       goto out;
> > > > > +
> > > > > +out_unlock:
> > > > > +       mutex_unlock(&encl->lock);
> > > > > +out:
> > > > > +       modp->count = c;
> > > > > +
> > > > > +       return ret;
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * sgx_ioc_enclave_restrict_permissions() - handler for
> > > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > > + * @encl:      an enclave pointer
> > > > > + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
> > > > > + *             instance
> > > > > + *
> > > > > + * SGX2 distinguishes between relaxing and restricting the enclave page
> > > > > + * permissions maintained by the hardware (EPCM permissions) of pages
> > > > > + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
> > > > > + *
> > > > > + * EPCM permissions cannot be restricted from within the enclave, the enclave
> > > > > + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
> > > > > + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
> > > > > + * will be ignored by the hardware.
> > > > > + *
> > > > > + * Return:
> > > > > + * - 0:                Success
> > > > > + * - -errno:   Otherwise
> > > > > + */
> > > > > +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > > +                                                void __user *arg)
> > > > > +{
> > > > > +       struct sgx_enclave_restrict_permissions params;
> > > > > +       u64 secinfo_perm;
> > > > > +       long ret;
> > > > > +
> > > > > +       ret = sgx_ioc_sgx2_ready(encl);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       if (copy_from_user(&params, arg, sizeof(params)))
> > > > > +               return -EFAULT;
> > > > > +
> > > > > +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
> > > > > +                                        &secinfo_perm);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       if (params.result || params.count)
> > > > > +               return -EINVAL;
> > > > > +
> > > > > +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
> > > > > +
> > > > > +       if (copy_to_user(arg, &params, sizeof(params)))
> > > > > +               return -EFAULT;
> > > > > +
> > > > > +       return ret;
> > > > > +}
> > > > > +
> > > > >  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > > >  {
> > > > >         struct sgx_encl *encl = filep->private_data;
> > > > > @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > > >         case SGX_IOC_ENCLAVE_PROVISION:
> > > > >                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
> > > > >                 break;
> > > > > +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
> > > > > +               ret = sgx_ioc_enclave_restrict_permissions(encl,
> > > > > +                                                          (void __user *)arg);
> > > > > +               break;
> > > > >         default:
> > > > >                 ret = -ENOIOCTLCMD;
> > > > >                 break;
> > > > 
> > > > I think this a big improvement all things considered. I just put 
> > > > a kernel building and see if I get this wired to our code:
> > > > 
> > > > https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943
> > > > 
> > > > I'll report my findings later on.
> > > 
> > > I pulled the patches from sgx2_submitted_v3_plus_rwx branch. Just
> > > sanity checking that it is v3, correct?
> > 
> > I'm getting EINVAL with SECINFO that I think is legit:
> > 
> > let mut secinfo_buf: [u8; 64] = [0; 64]; // Initialize with zeros
> > secinfo_buf[0] = 1; // READ
> > secinfo_buf[1] = 2; // Regular
> > 
> > I made a small bpftrace script, and here's what happens:
> > 
> > $ cat sgx.bt
> > kretprobe:sgx_ioctl /retval != 0/
> > {
> >         printf("sgx_ioctl: %d\n", retval)
> > }
> > 
> > kretprobe:sgx_perm_from_user_secinfo.constprop.0 /retval/
> > {
> >         printf("sgx_perm_from_user_secinfo.constprop.0 %d\n", retval)
> > }
> > 
> > kretprobe:sgx_enclave_restrict_permissions /retval/
> > {
> >         printf("sgx_enclave_restrict_permissions: %d\n", retval)
> > }
> > 
> > $ sudo bpftrace sgx.bt
> > [sudo] password for jarkko: 
> > Attaching 3 probes...
> > sgx_perm_from_user_secinfo.constprop.0 -22
> > sgx_ioctl: -22
> > 
> > Could be that I'm doing something wrong but instantly do not see
> > anything obvious...
> 
> It was my bad, i.e.
> 
> let mut secinfo_buf: [u8; 64] = [0; 64];
> secinfo_buf[0] = 1;
> secinfo_buf[1] = 0;
>  
> BR, Jarkko

According to the SDM, having the page type as regular is fine for EMODPR,
which is why I did not care about having it in SECINFO.

Given that the opcode itself contains validation, I wonder
why this needs to be done:

if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
		return -EINVAL;

if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
		return -EINVAL;

perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;

I.e. why duplicate the validation, and why does it have a different
invariant than the opcode?
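
To make the earlier EINVAL concrete, here is a minimal userspace sketch.
It assumes the SECINFO flags layout is R/W/X in bits 0-2 and the page type
in bits 8-15 with REG = 2, which is how I read arch/x86/include/asm/sgx.h;
the macro and helper names are made up for illustration and are not the
kernel's. Writing 2 into byte 1 of the buffer puts the page type into
flags, and the mask check quoted above rejects any bit outside R/W/X.

```c
#include <stdint.h>

/* Assumed values, modelled on the kernel's SGX_SECINFO_* definitions. */
#define EX_SECINFO_R			(1ULL << 0)
#define EX_SECINFO_W			(1ULL << 1)
#define EX_SECINFO_X			(1ULL << 2)
#define EX_SECINFO_PERMISSION_MASK	(EX_SECINFO_R | EX_SECINFO_W | EX_SECINFO_X)
#define EX_SECINFO_PAGE_TYPE_REG	(2ULL << 8)

/*
 * Assemble the 64-bit flags word from the first eight bytes of a SECINFO
 * buffer. x86 is little endian, so byte 0 holds bits 0-7 and byte 1 holds
 * bits 8-15.
 */
static inline uint64_t secinfo_flags(const uint8_t *buf)
{
	uint64_t flags = 0;
	int i;

	for (i = 7; i >= 0; i--)
		flags = (flags << 8) | buf[i];

	return flags;
}

/* Same test as in the patch: reject any bit outside the permission mask. */
static inline int flags_rejected(uint64_t flags)
{
	return (flags & ~EX_SECINFO_PERMISSION_MASK) != 0;
}
```

With byte 0 = 1 and byte 1 = 2 the flags word is 0x201 (R plus page type
REG) and the check rejects it; with byte 1 = 0 it is 0x1 and passes.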

While looking into this I also noticed:

static int sgx_validate_offset_length(struct sgx_encl *encl,
				      unsigned long offset,
				      unsigned long length)
{
	if (!IS_ALIGNED(offset, PAGE_SIZE))
		return -EINVAL;

	if (!length || length & (PAGE_SIZE - 1))
		return -EINVAL;

I guess it would be a good idea to use IS_ALIGNED() for length as well
(this inconsistency is inherited from the pre-existing code).
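
A small userspace sketch of the two forms of the length check, to show
they accept and reject the same values. PAGE_SIZE is assumed to be 4 KiB
and EX_IS_ALIGNED() is a simplified stand-in for the kernel macro (power
of two alignment only); the function names are hypothetical.

```c
#include <stdbool.h>

#define EX_PAGE_SIZE		4096UL	/* assumption: 4 KiB pages */
/* Simplified stand-in for the kernel's IS_ALIGNED(). */
#define EX_IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

/* Open-coded form, as in the current sgx_validate_offset_length(). */
static inline bool length_ok_open_coded(unsigned long length)
{
	if (!length || length & (EX_PAGE_SIZE - 1))
		return false;

	return true;
}

/* Same check written with the alignment helper, matching the offset check. */
static inline bool length_ok_is_aligned(unsigned long length)
{
	if (!length || !EX_IS_ALIGNED(length, EX_PAGE_SIZE))
		return false;

	return true;
}
```

Both variants still need the explicit zero test, since zero is trivially
aligned.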

BR, Jarkko




* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 14:27           ` Jarkko Sakkinen
@ 2022-04-05 14:52             ` Jarkko Sakkinen
  2022-04-05 16:49               ` Reinette Chatre
  2022-04-05 16:40             ` Reinette Chatre
  1 sibling, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 14:52 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

On Tue, 2022-04-05 at 17:27 +0300, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 17:19 +0300, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 16:40 +0300, Jarkko Sakkinen wrote:
> > > On Tue, 2022-04-05 at 08:07 +0300, Jarkko Sakkinen wrote:
> > > > On Tue, 2022-04-05 at 08:03 +0300, Jarkko Sakkinen wrote:
> > > > > On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > > > > > In the initial (SGX1) version of SGX, pages in an enclave need to be
> > > > > > created with permissions that support all usages of the pages, from the
> > > > > > time the enclave is initialized until it is unloaded. For example,
> > > > > > pages used by a JIT compiler or when code needs to otherwise be
> > > > > > relocated need to always have RWX permissions.
> > > > > > 
> > > > > > SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
> > > > > > and can be used to restrict the EPCM permissions of regular enclave
> > > > > > pages within an initialized enclave.
> > > > > > 
> > > > > > Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
> > > > > > restricting EPCM permissions. With this ioctl() the user specifies
> > > > > > a page range and the EPCM permissions to be applied to all pages in
> > > > > > the provided range. ENCLS[EMODPR] is run to restrict the EPCM
> > > > > > permissions followed by the ENCLS[ETRACK] flow that will ensure
> > > > > > no cached linear-to-physical address mappings to the changed
> > > > > > pages remain.
> > > > > > 
> > > > > > It is possible for the permission change request to fail on any
> > > > > > page within the provided range, either with an error encountered
> > > > > > by the kernel or by the SGX hardware while running
> > > > > > ENCLS[EMODPR]. To support partial success the ioctl() returns an
> > > > > > error code based on failures encountered by the kernel as well
> > > > > > as two result output parameters: one for the number of pages
> > > > > > that were successfully changed and one for the SGX return code.
> > > > > > 
> > > > > > The page table entry permissions are not impacted by the EPCM
> > > > > > permission changes. VMAs and PTEs will continue to allow the
> > > > > > maximum vetted permissions determined at the time the pages
> > > > > > are added to the enclave. The SGX error code in a page fault
> > > > > > will indicate if it was an EPCM permission check that prevented
> > > > > > an access attempt.
> > > > > > 
> > > > > > No checking is done to ensure that the permissions are actually
> > > > > > being restricted. This is because the enclave may have relaxed
> > > > > > the EPCM permissions from within the enclave without letting the
> > > > > > kernel know. An attempt to relax permissions using this call will
> > > > > > be ignored by the hardware.
> > > > > > 
> > > > > > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > > > > > ---
> > > > > > Changes since V2:
> > > > > > - Include the sgx_ioc_sgx2_ready() utility
> > > > > >   that previously was in "x86/sgx: Support relaxing of enclave page
> > > > > >   permissions" that is removed from the next version.
> > > > > > - Few renames requested by Jarkko:
> > > > > >   struct sgx_enclave_restrict_perm ->
> > > > > >          struct sgx_enclave_restrict_permissions
> > > > > >   sgx_enclave_restrict_perm()     ->
> > > > > >          sgx_enclave_restrict_permissions()
> > > > > >   sgx_ioc_enclave_restrict_perm() ->
> > > > > >          sgx_ioc_enclave_restrict_permissions()
> > > > > > - Make EPCM permissions independent from kernel view of
> > > > > >   permissions.  (Jarkko)
> > > > > >   - Remove attempt at runtime tracking of EPCM permissions
> > > > > >     (sgx_encl_page->vm_run_prot_bits).
> > > > > >   - Do not flush page table entries - they are no longer impacted by
> > > > > >     EPCM permission changes.
> > > > > >   - Modify changelog to reflect new architecture.
> > > > > > - Ensure at least PROT_READ is requested - enclave requires read
> > > > > >   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
> > > > > > 
> > > > > > Changes since V1:
> > > > > > - Change terminology to use "relax" instead of "extend" to refer to
> > > > > >   the case when enclave page permissions are added (Dave).
> > > > > > - Use ioctl() in commit message (Dave).
> > > > > > - Add examples on what permissions would be allowed (Dave).
> > > > > > - Split enclave page permission changes into two ioctl()s, one for
> > > > > >   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
> > > > > >   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
> > > > > >   (Jarkko).
> > > > > > - In support of the ioctl() name change the following names have been
> > > > > >   changed:
> > > > > >   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
> > > > > >   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
> > > > > >   sgx_page_modp() -> sgx_enclave_restrict_perm()
> > > > > > - ioctl() takes entire secinfo as input instead of
> > > > > >   page permissions only (Jarkko).
> > > > > > - Fix kernel-doc to include () in function name.
> > > > > > - Create and use utility for the ETRACK flow.
> > > > > > - Fixups in comments
> > > > > > - Move kernel-doc to function that provides documentation for
> > > > > >   Documentation/x86/sgx.rst.
> > > > > > - Remove redundant comment.
> > > > > > - Make explicit which members of struct sgx_enclave_restrict_perm
> > > > > >   are for output (Dave).
> > > > > > 
> > > > > >  arch/x86/include/uapi/asm/sgx.h |  21 +++
> > > > > >  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
> > > > > >  2 files changed, 263 insertions(+)
> > > > > > 
> > > > > > diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> > > > > > index f4b81587e90b..a0a24e94fb27 100644
> > > > > > --- a/arch/x86/include/uapi/asm/sgx.h
> > > > > > +++ b/arch/x86/include/uapi/asm/sgx.h
> > > > > > @@ -29,6 +29,8 @@ enum sgx_page_flags {
> > > > > >         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
> > > > > >  #define SGX_IOC_VEPC_REMOVE_ALL \
> > > > > >         _IO(SGX_MAGIC, 0x04)
> > > > > > +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> > > > > > +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> > > > > >  
> > > > > >  /**
> > > > > >   * struct sgx_enclave_create - parameter structure for the
> > > > > > @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
> > > > > >         __u64 fd;
> > > > > >  };
> > > > > >  
> > > > > > +/**
> > > > > > + * struct sgx_enclave_restrict_permissions - parameters for ioctl
> > > > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > > > + * @offset:    starting page offset (page aligned relative to enclave base
> > > > > > + *             address defined in SECS)
> > > > > > + * @length:    length of memory (multiple of the page size)
> > > > > > + * @secinfo:   address for the SECINFO data containing the new permission bits
> > > > > > + *             for pages in range described by @offset and @length
> > > > > > + * @result:    (output) SGX result code of ENCLS[EMODPR] function
> > > > > > + * @count:     (output) bytes successfully changed (multiple of page size)
> > > > > > + */
> > > > > > +struct sgx_enclave_restrict_permissions {
> > > > > > +       __u64 offset;
> > > > > > +       __u64 length;
> > > > > > +       __u64 secinfo;
> > > > > > +       __u64 result;
> > > > > > +       __u64 count;
> > > > > > +};
> > > > > > +
> > > > > >  struct sgx_enclave_run;
> > > > > >  
> > > > > >  /**
> > > > > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > > index 0460fd224a05..4d88bfd163e7 100644
> > > > > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > > > > @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
> > > > > >         return sgx_set_attribute(&encl->attributes_mask, params.fd);
> > > > > >  }
> > > > > >  
> > > > > > +/*
> > > > > > + * Ensure enclave is ready for SGX2 functions. Readiness is checked
> > > > > > + * by ensuring the hardware supports SGX2 and the enclave is initialized
> > > > > > + * and thus able to handle requests to modify pages within it.
> > > > > > + */
> > > > > > +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
> > > > > > +{
> > > > > > +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
> > > > > > +               return -ENODEV;
> > > > > > +
> > > > > > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * Return valid permission fields from a secinfo structure provided by
> > > > > > + * user space. The secinfo structure is required to only have bits in
> > > > > > + * the permission fields set.
> > > > > > + */
> > > > > > +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > > > > > +{
> > > > > > +       struct sgx_secinfo secinfo;
> > > > > > +       u64 perm;
> > > > > > +
> > > > > > +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
> > > > > > +                          sizeof(secinfo)))
> > > > > > +               return -EFAULT;
> > > > > > +
> > > > > > +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > > > > > +
> > > > > > +       /*
> > > > > > +        * Read access is required for the enclave to be able to use the page.
> > > > > > +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
> > > > > > +        * read access.
> > > > > > +        */
> > > > > > +       if (!(perm & SGX_SECINFO_R))
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       *secinfo_perm = perm;
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * Some SGX functions require that no cached linear-to-physical address
> > > > > > + * mappings are present before they can succeed. Collaborate with
> > > > > > + * hardware via ENCLS[ETRACK] to ensure that all cached
> > > > > > + * linear-to-physical address mappings belonging to all threads of
> > > > > > + * the enclave are cleared. See sgx_encl_cpumask() for details.
> > > > > > + */
> > > > > > +static int sgx_enclave_etrack(struct sgx_encl *encl)
> > > > > > +{
> > > > > > +       void *epc_virt;
> > > > > > +       int ret;
> > > > > > +
> > > > > > +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
> > > > > > +       ret = __etrack(epc_virt);
> > > > > > +       if (ret) {
> > > > > > +               /*
> > > > > > +                * ETRACK only fails when there is an OS issue. For
> > > > > > +                * example, two consecutive ETRACK was sent without
> > > > > > +                * completed IPI between.
> > > > > > +                */
> > > > > > +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
> > > > > > +               /*
> > > > > > +                * Send IPIs to kick CPUs out of the enclave and
> > > > > > +                * try ETRACK again.
> > > > > > +                */
> > > > > > +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > > > +               ret = __etrack(epc_virt);
> > > > > > +               if (ret) {
> > > > > > +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
> > > > > > +                                   ret, ret);
> > > > > > +                       return -EFAULT;
> > > > > > +               }
> > > > > > +       }
> > > > > > +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
> > > > > > +
> > > > > > +       return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
> > > > > > + * @encl:      Enclave to which the pages belong.
> > > > > > + * @modp:      Checked parameters from user on which pages need modifying.
> > > > > > + * @secinfo_perm: New (validated) permission bits.
> > > > > > + *
> > > > > > + * Return:
> > > > > > + * - 0:                Success.
> > > > > > + * - -errno:   Otherwise.
> > > > > > + */
> > > > > > +static long
> > > > > > +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > > > +                                struct sgx_enclave_restrict_permissions *modp,
> > > > > > +                                u64 secinfo_perm)
> > > > > > +{
> > > > > > +       struct sgx_encl_page *entry;
> > > > > > +       struct sgx_secinfo secinfo;
> > > > > > +       unsigned long addr;
> > > > > > +       unsigned long c;
> > > > > > +       void *epc_virt;
> > > > > > +       int ret;
> > > > > > +
> > > > > > +       memset(&secinfo, 0, sizeof(secinfo));
> > > > > > +       secinfo.flags = secinfo_perm;
> > > > > > +
> > > > > > +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> > > > > > +               addr = encl->base + modp->offset + c;
> > > > > > +
> > > > > > +               mutex_lock(&encl->lock);
> > > > > > +
> > > > > > +               entry = sgx_encl_load_page(encl, addr);
> > > > > > +               if (IS_ERR(entry)) {
> > > > > > +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> > > > > > +                       goto out_unlock;
> > > > > > +               }
> > > > > > +
> > > > > > +               /*
> > > > > > +                * Changing EPCM permissions is only supported on regular
> > > > > > +                * SGX pages. Attempting this change on other pages will
> > > > > > +                * result in #PF.
> > > > > > +                */
> > > > > > +               if (entry->type != SGX_PAGE_TYPE_REG) {
> > > > > > +                       ret = -EINVAL;
> > > > > > +                       goto out_unlock;
> > > > > > +               }
> > > > > > +
> > > > > > +               /*
> > > > > > +                * Do not verify the permission bits requested. Kernel
> > > > > > +                * has no control over how EPCM permissions can be relaxed
> > > > > > +                * from within the enclave. ENCLS[EMODPR] can only
> > > > > > +                * remove existing EPCM permissions, attempting to set
> > > > > > +                * new permissions will be ignored by the hardware.
> > > > > > +                */
> > > > > > +
> > > > > > +               /* Change EPCM permissions. */
> > > > > > +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> > > > > > +               ret = __emodpr(&secinfo, epc_virt);
> > > > > > +               if (encls_faulted(ret)) {
> > > > > > +                       /*
> > > > > > +                        * All possible faults should be avoidable:
> > > > > > +                        * parameters have been checked, will only change
> > > > > > +                        * permissions of a regular page, and no concurrent
> > > > > > +                        * SGX1/SGX2 ENCLS instructions since these
> > > > > > +                        * are protected with mutex.
> > > > > > +                        */
> > > > > > +                       pr_err_once("EMODPR encountered exception %d\n",
> > > > > > +                                   ENCLS_TRAPNR(ret));
> > > > > > +                       ret = -EFAULT;
> > > > > > +                       goto out_unlock;
> > > > > > +               }
> > > > > > +               if (encls_failed(ret)) {
> > > > > > +                       modp->result = ret;
> > > > > > +                       ret = -EFAULT;
> > > > > > +                       goto out_unlock;
> > > > > > +               }
> > > > > > +
> > > > > > +               ret = sgx_enclave_etrack(encl);
> > > > > > +               if (ret) {
> > > > > > +                       ret = -EFAULT;
> > > > > > +                       goto out_unlock;
> > > > > > +               }
> > > > > > +
> > > > > > +               mutex_unlock(&encl->lock);
> > > > > > +       }
> > > > > > +
> > > > > > +       ret = 0;
> > > > > > +       goto out;
> > > > > > +
> > > > > > +out_unlock:
> > > > > > +       mutex_unlock(&encl->lock);
> > > > > > +out:
> > > > > > +       modp->count = c;
> > > > > > +
> > > > > > +       return ret;
> > > > > > +}
> > > > > > +
> > > > > > +/**
> > > > > > + * sgx_ioc_enclave_restrict_permissions() - handler for
> > > > > > + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
> > > > > > + * @encl:      an enclave pointer
> > > > > > + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
> > > > > > + *             instance
> > > > > > + *
> > > > > > + * SGX2 distinguishes between relaxing and restricting the enclave page
> > > > > > + * permissions maintained by the hardware (EPCM permissions) of pages
> > > > > > + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
> > > > > > + *
> > > > > > + * EPCM permissions cannot be restricted from within the enclave, the enclave
> > > > > > + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
> > > > > > + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
> > > > > > + * will be ignored by the hardware.
> > > > > > + *
> > > > > > + * Return:
> > > > > > + * - 0:                Success
> > > > > > + * - -errno:   Otherwise
> > > > > > + */
> > > > > > +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> > > > > > +                                                void __user *arg)
> > > > > > +{
> > > > > > +       struct sgx_enclave_restrict_permissions params;
> > > > > > +       u64 secinfo_perm;
> > > > > > +       long ret;
> > > > > > +
> > > > > > +       ret = sgx_ioc_sgx2_ready(encl);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > +
> > > > > > +       if (copy_from_user(&params, arg, sizeof(params)))
> > > > > > +               return -EFAULT;
> > > > > > +
> > > > > > +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
> > > > > > +                                        &secinfo_perm);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > +
> > > > > > +       if (params.result || params.count)
> > > > > > +               return -EINVAL;
> > > > > > +
> > > > > > +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
> > > > > > +
> > > > > > +       if (copy_to_user(arg, &params, sizeof(params)))
> > > > > > +               return -EFAULT;
> > > > > > +
> > > > > > +       return ret;
> > > > > > +}
> > > > > > +
> > > > > >  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > > > >  {
> > > > > >         struct sgx_encl *encl = filep->private_data;
> > > > > > @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> > > > > >         case SGX_IOC_ENCLAVE_PROVISION:
> > > > > >                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
> > > > > >                 break;
> > > > > > +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
> > > > > > +               ret = sgx_ioc_enclave_restrict_permissions(encl,
> > > > > > +                                                          (void __user *)arg);
> > > > > > +               break;
> > > > > >         default:
> > > > > >                 ret = -ENOIOCTLCMD;
> > > > > >                 break;
> > > > > 
> > > > > I think this a big improvement all things considered. I just put 
> > > > > a kernel building and see if I get this wired to our code:
> > > > > 
> > > > > https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943
> > > > > 
> > > > > I'll report my findings later on.
> > > > 
> > > > I pulled the patches from sgx2_submitted_v3_plus_rwx branch. Just
> > > > sanity checking that it is v3, correct?
> > > 
> > > I'm getting EINVAL with SECINFO that I think is legit:
> > > 
> > > let mut secinfo_buf: [u8; 64] = [0; 64]; // Initialize with zeros
> > > secinfo_buf[0] = 1; // READ
> > > secinfo_buf[1] = 2; // Regular
> > > 
> > > I made a small bpftrace script, and here's what happens:
> > > 
> > > $ cat sgx.bt
> > > kretprobe:sgx_ioctl /retval != 0/
> > > {
> > >         printf("sgx_ioctl: %d\n", retval)
> > > }
> > > 
> > > kretprobe:sgx_perm_from_user_secinfo.constprop.0 /retval/
> > > {
> > >         printf("sgx_perm_from_user_secinfo.constprop.0 %d\n", retval)
> > > }
> > > 
> > > kretprobe:sgx_enclave_restrict_permissions /retval/
> > > {
> > >         printf("sgx_enclave_restrict_permissions: %d\n", retval)
> > > }
> > > 
> > > $ sudo bpftrace sgx.bt
> > > [sudo] password for jarkko: 
> > > Attaching 3 probes...
> > > sgx_perm_from_user_secinfo.constprop.0 -22
> > > sgx_ioctl: -22
> > > 
> > > Could be that I'm doing something wrong but instantly do not see
> > > anything obvious...
> > 
> > It was my bad, i.e.
> > 
> > let mut secinfo_buf: [u8; 64] = [0; 64];
> > secinfo_buf[0] = 1;
> > secinfo_buf[1] = 0;
> >  
> > BR, Jarkko
> 
> According to SDM having page type as regular is fine for EMODPR,
> i.e. that's why I did not care about having it in SECINFO.
> 
> Given that the opcode itself contains validation, I wonder
> why this needs to be done:
> 
> if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
>                 return -EINVAL;
> 
> if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
>                 return -EINVAL;
> 
> perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> 
> I.e. why duplicate validation and why does it have different
> invariant than the opcode?

Right, it is done to prevent exceptions, and the pseudo-code also
has this validation:

IF (EPCM(DS:RCX).PT is not PT_REG) THEN #PF(DS:RCX); FI; 

This is clearly wrong:

/*
 * Return valid permission fields from a secinfo structure provided by
 * user space. The secinfo structure is required to only have bits in
 * the permission fields set.
 */
static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)

It means that the API requires malformed data as input.

Maybe it would be a better idea then to replace secinfo with just the
permission field?
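
A rough sketch of what that could look like, written as plain userspace C
so it can be compiled on its own. The struct layout, the "permissions"
member name and the helper are hypothetical and only illustrate the idea;
they are not taken from the patch. The permission bit values are assumed
to match SGX_SECINFO_R/W/X.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed to match the SGX_SECINFO_R/W/X bit positions. */
#define EX_PERM_R	(1ULL << 0)
#define EX_PERM_W	(1ULL << 1)
#define EX_PERM_X	(1ULL << 2)
#define EX_PERM_MASK	(EX_PERM_R | EX_PERM_W | EX_PERM_X)

/*
 * Hypothetical parameter layout: the pointer to a full SECINFO is replaced
 * by the permission bits themselves, so the caller never has to build a
 * SECINFO without a page type.
 */
struct ex_restrict_permissions {
	uint64_t offset;
	uint64_t length;
	uint64_t permissions;	/* new EPCM permission bits (R/W/X only) */
	uint64_t result;	/* (output) SGX result code */
	uint64_t count;		/* (output) bytes successfully changed */
};

/* Accept only R/W/X bits and require at least read access. */
static inline int ex_validate_permissions(uint64_t permissions)
{
	if (permissions & ~EX_PERM_MASK)
		return -EINVAL;

	if (!(permissions & EX_PERM_R))
		return -EINVAL;

	return 0;
}
```

The kernel would then fill in the page type itself when it builds the
SECINFO handed to ENCLS[EMODPR], so the input stops being a SECINFO that
the instruction itself would not accept verbatim.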

BR, Jarkko


* Re: [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-05  7:06   ` Jarkko Sakkinen
@ 2022-04-05 15:34     ` Jarkko Sakkinen
  2022-04-05 17:05       ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 15:34 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Tue, 2022-04-05 at 10:06 +0300, Jarkko Sakkinen wrote:
> On Mon, Apr 04, 2022 at 09:49:25AM -0700, Reinette Chatre wrote:
> > Every enclave contains one or more Thread Control Structures (TCS). The
> > TCS contains meta-data used by the hardware to save and restore thread
> > specific information when entering/exiting the enclave. With SGX1 an
> > enclave needs to be created with enough TCSs to support the largest
> > number of threads expecting to use the enclave and enough enclave pages
> > to meet all its anticipated memory demands. In SGX1 all pages remain in
> > the enclave until the enclave is unloaded.
> > 
> > SGX2 introduces a new function, ENCLS[EMODT], that is used to change
> > the type of an enclave page from a regular (SGX_PAGE_TYPE_REG) enclave
> > page to a TCS (SGX_PAGE_TYPE_TCS) page or change the type from a
> > regular (SGX_PAGE_TYPE_REG) or TCS (SGX_PAGE_TYPE_TCS)
> > page to a trimmed (SGX_PAGE_TYPE_TRIM) page (setting it up for later
> > removal).
> > 
> > With the existing support of dynamically adding regular enclave pages
> > to an initialized enclave and changing the page type to TCS it is
> > possible to dynamically increase the number of threads supported by an
> > enclave.
> > 
> > Changing the enclave page type to SGX_PAGE_TYPE_TRIM is the first step
> > of dynamically removing pages from an initialized enclave. The complete
> > page removal flow is:
> > 1) Change the type of the pages to be removed to SGX_PAGE_TYPE_TRIM
> >    using the SGX_IOC_ENCLAVE_MODIFY_TYPE ioctl() introduced here.
> > 2) Approve the page removal by running ENCLU[EACCEPT] from within
> >    the enclave.
> > 3) Initiate actual page removal using the ioctl() introduced in the
> >    following patch.
> > 
> > Add ioctl() SGX_IOC_ENCLAVE_MODIFY_TYPE to support changing SGX
> > enclave page types within an initialized enclave. With
> > SGX_IOC_ENCLAVE_MODIFY_TYPE the user specifies a page range and the
> > enclave page type to be applied to all pages in the provided range.
> > The ioctl() itself can return an error code based on failures
> > encountered by the kernel. It is also possible for SGX specific
> > failures to be encountered.  Add a result output parameter to
> > communicate the SGX return code. It is possible for the enclave page
> > type change request to fail on any page within the provided range.
> > Support partial success by returning the number of pages that were
> > successfully changed.
> > 
> > After the page type is changed the page continues to be accessible
> > from the kernel perspective with page table entries and internal
> > state. The page may be moved to swap. Any access until ENCLU[EACCEPT]
> > will encounter a page fault with SGX flag set in error code.
> > 
> > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > ---
> > Changes since V2:
> > - Adjust ioctl number after removal of SGX_IOC_ENCLAVE_RELAX_PERMISSIONS.
> > - Remove attempt at runtime tracking of EPCM permissions
> >   (sgx_encl_page->vm_run_prot_bits). (Jarkko)
> > - Change names to follow guidance of using detailed names (Jarkko):
> >   struct sgx_enclave_modt -> struct sgx_enclave_modify_type
> >   sgx_enclave_modt() -> sgx_enclave_modify_type()
> >   sgx_ioc_enclave_modt() -> sgx_ioc_enclave_modify_type()
> > 
> > Changes since V1:
> > - Remove the "Earlier changes ..." paragraph (Jarkko).
> > - Change "new ioctl" text to "Add SGX_IOC_ENCLAVE_MOD_TYPE" (Jarkko).
> > - Discussion about EPCM interaction and the EPCM MODIFIED bit is moved
> >   to new patch that introduces the ENCLS[EMODT] wrapper while keeping
> >   the higher level discussion on page accessibility in
> >   this commit log (Jarkko).
> > - Rename SGX_IOC_PAGE_MODT ioctl() to SGX_IOC_ENCLAVE_MODIFY_TYPE
> >   (Jarkko).
> > - Rename struct sgx_page_modt to struct sgx_enclave_modt in support
> >   of ioctl() rename.
> > - Rename sgx_page_modt() to sgx_enclave_modt() and sgx_ioc_page_modt()
> >   to sgx_ioc_enclave_modt() in support of ioctl() rename.
> > - Provide secinfo as parameter to ioctl() instead of just
> >   page type (Jarkko).
> > - Update comments to refer to new ioctl() names.
> > - Use new SGX2 checking helper().
> > - Use ETRACK flow utility.
> > - Move kernel-doc to function that provides documentation for
> >   Documentation/x86/sgx.rst.
> > - Remove redundant comment.
> > - Use offset/length validation utility.
> > - Make explicit which members of struct sgx_enclave_modt are for
> >   output (Dave).
> > 
> >  arch/x86/include/uapi/asm/sgx.h |  20 +++
> >  arch/x86/kernel/cpu/sgx/ioctl.c | 209 ++++++++++++++++++++++++++++++++
> >  2 files changed, 229 insertions(+)
> > 
> > diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
> > index a0a24e94fb27..529f4ab28410 100644
> > --- a/arch/x86/include/uapi/asm/sgx.h
> > +++ b/arch/x86/include/uapi/asm/sgx.h
> > @@ -31,6 +31,8 @@ enum sgx_page_flags {
> >         _IO(SGX_MAGIC, 0x04)
> >  #define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
> >         _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
> > +#define SGX_IOC_ENCLAVE_MODIFY_TYPE \
> > +       _IOWR(SGX_MAGIC, 0x06, struct sgx_enclave_modify_type)
> >  
> >  /**
> >   * struct sgx_enclave_create - parameter structure for the
> > @@ -97,6 +99,24 @@ struct sgx_enclave_restrict_permissions {
> >         __u64 count;
> >  };
> >  
> > +/**
> > + * struct sgx_enclave_modify_type - parameters for %SGX_IOC_ENCLAVE_MODIFY_TYPE
> > + * @offset:    starting page offset (page aligned relative to enclave base
> > + *             address defined in SECS)
> > + * @length:    length of memory (multiple of the page size)
> > + * @secinfo:   address for the SECINFO data containing the new type
> > + *             for pages in range described by @offset and @length
> > + * @result:    (output) SGX result code of ENCLS[EMODT] function
> > + * @count:     (output) bytes successfully changed (multiple of page size)
> > + */
> > +struct sgx_enclave_modify_type {
> > +       __u64 offset;
> > +       __u64 length;
> > +       __u64 secinfo;
> > +       __u64 result;
> > +       __u64 count;
> > +};
> > +
> >  struct sgx_enclave_run;
> >  
> >  /**
> > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > index 4d88bfd163e7..6f769e67ec2d 100644
> > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > @@ -898,6 +898,212 @@ static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
> >         return ret;
> >  }
> >  
> > +/**
> > + * sgx_enclave_modify_type() - Modify type of SGX enclave pages
> > + * @encl:      Enclave to which the pages belong.
> > + * @modt:      Checked parameters from user about which pages need modifying.
> > + * @page_type: New page type.
> > + *
> > + * Return:
> > + * - 0:                Success
> > + * - -errno:   Otherwise
> > + */
> > +static long sgx_enclave_modify_type(struct sgx_encl *encl,
> > +                                   struct sgx_enclave_modify_type *modt,
> > +                                   enum sgx_page_type page_type)
> > +{
> > +       unsigned long max_prot_restore;
> > +       struct sgx_encl_page *entry;
> > +       struct sgx_secinfo secinfo;
> > +       unsigned long prot;
> > +       unsigned long addr;
> > +       unsigned long c;
> > +       void *epc_virt;
> > +       int ret;
> > +
> > +       /*
> > +        * The only new page types allowed by hardware are PT_TCS and PT_TRIM.
> > +        */
> > +       if (page_type != SGX_PAGE_TYPE_TCS && page_type != SGX_PAGE_TYPE_TRIM)
> > +               return -EINVAL;
> > +
> > +       memset(&secinfo, 0, sizeof(secinfo));
> > +
> > +       secinfo.flags = page_type << 8;
> > +
> > +       for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
> > +               addr = encl->base + modt->offset + c;
> > +
> > +               mutex_lock(&encl->lock);
> > +
> > +               entry = sgx_encl_load_page(encl, addr);
> > +               if (IS_ERR(entry)) {
> > +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
> > +                       goto out_unlock;
> > +               }
> > +
> > +               /*
> > +                * Borrow the logic from the Intel SDM. Regular pages
> > +                * (SGX_PAGE_TYPE_REG) can change type to SGX_PAGE_TYPE_TCS
> > +                * or SGX_PAGE_TYPE_TRIM but TCS pages can only be trimmed.
> > +                * CET pages not supported yet.
> > +                */
> > +               if (!(entry->type == SGX_PAGE_TYPE_REG ||
> > +                     (entry->type == SGX_PAGE_TYPE_TCS &&
> > +                      page_type == SGX_PAGE_TYPE_TRIM))) {
> > +                       ret = -EINVAL;
> > +                       goto out_unlock;
> > +               }
> > +
> > +               max_prot_restore = entry->vm_max_prot_bits;
> > +
> > +               /*
> > +                * Once a regular page becomes a TCS page it cannot be
> > +                * changed back. So the maximum allowed protection reflects
> > +                * the TCS page that is always RW from kernel perspective but
> > +                * will be inaccessible from within enclave. Before doing
> > +                * so, do make sure that the new page type continues to
> > +                * respect the originally vetted page permissions.
> > +                */
> > +               if (entry->type == SGX_PAGE_TYPE_REG &&
> > +                   page_type == SGX_PAGE_TYPE_TCS) {
> > +                       if (~entry->vm_max_prot_bits & (VM_READ | VM_WRITE)) {
> > +                               ret = -EPERM;
> > +                               goto out_unlock;
> > +                       }
> > +                       prot = PROT_READ | PROT_WRITE;
> > +                       entry->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
> > +
> > +                       /*
> > +                        * Prevent page from being reclaimed while mutex
> > +                        * is released.
> > +                        */
> > +                       if (sgx_unmark_page_reclaimable(entry->epc_page)) {
> > +                               ret = -EAGAIN;
> > +                               goto out_entry_changed;
> > +                       }
> > +
> > +                       /*
> > +                        * Do not keep encl->lock because of dependency on
> > +                        * mmap_lock acquired in sgx_zap_enclave_ptes().
> > +                        */
> > +                       mutex_unlock(&encl->lock);
> > +
> > +                       sgx_zap_enclave_ptes(encl, addr);
> > +
> > +                       mutex_lock(&encl->lock);
> > +
> > +                       sgx_mark_page_reclaimable(entry->epc_page);
> > +               }
> > +
> > +               /* Change EPC type */
> > +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
> > +               ret = __emodt(&secinfo, epc_virt);
> > +               if (encls_faulted(ret)) {
> > +                       /*
> > +                        * All possible faults should be avoidable:
> > +                        * parameters have been checked, will only change
> > +                        * valid page types, and no concurrent
> > +                        * SGX1/SGX2 ENCLS instructions since these are
> > +                        * protected with mutex.
> > +                        */
> > +                       pr_err_once("EMODT encountered exception %d\n",
> > +                                   ENCLS_TRAPNR(ret));
> > +                       ret = -EFAULT;
> > +                       goto out_entry_changed;
> > +               }
> > +               if (encls_failed(ret)) {
> > +                       modt->result = ret;
> > +                       ret = -EFAULT;
> > +                       goto out_entry_changed;
> > +               }
> > +
> > +               ret = sgx_enclave_etrack(encl);
> > +               if (ret) {
> > +                       ret = -EFAULT;
> > +                       goto out_unlock;
> > +               }
> > +
> > +               entry->type = page_type;
> > +
> > +               mutex_unlock(&encl->lock);
> > +       }
> > +
> > +       ret = 0;
> > +       goto out;
> > +
> > +out_entry_changed:
> > +       entry->vm_max_prot_bits = max_prot_restore;
> > +out_unlock:
> > +       mutex_unlock(&encl->lock);
> > +out:
> > +       modt->count = c;
> > +
> > +       return ret;
> > +}
> > +
> > +/**
> > + * sgx_ioc_enclave_modify_type() - handler for %SGX_IOC_ENCLAVE_MODIFY_TYPE
> > + * @encl:      an enclave pointer
> > + * @arg:       userspace pointer to a &struct sgx_enclave_modify_type instance
> > + *
> > + * Ability to change the enclave page type supports the following use cases:
> > + *
> > + * * It is possible to add TCS pages to an enclave by changing the type of
> > + *   regular pages (%SGX_PAGE_TYPE_REG) to TCS (%SGX_PAGE_TYPE_TCS) pages.
> > + *   With this support the number of threads supported by an initialized
> > + *   enclave can be increased dynamically.
> > + *
> > + * * Regular or TCS pages can dynamically be removed from an initialized
> > + *   enclave by changing the page type to %SGX_PAGE_TYPE_TRIM. Changing the
> > + *   page type to %SGX_PAGE_TYPE_TRIM marks the page for removal with actual
> > + *   removal done by handler of %SGX_IOC_ENCLAVE_REMOVE_PAGES ioctl() called
> > + *   after ENCLU[EACCEPT] is run on %SGX_PAGE_TYPE_TRIM page from within the
> > + *   enclave.
> > + *
> > + * Return:
> > + * - 0:                Success
> > + * - -errno:   Otherwise
> > + */
> > +static long sgx_ioc_enclave_modify_type(struct sgx_encl *encl, void __user *arg)
> > +{
> > +       struct sgx_enclave_modify_type params;
> > +       enum sgx_page_type page_type;
> > +       struct sgx_secinfo secinfo;
> > +       long ret;
> > +
> > +       ret = sgx_ioc_sgx2_ready(encl);
> > +       if (ret)
> > +               return ret;
> > +
> > +       if (copy_from_user(&params, arg, sizeof(params)))
> > +               return -EFAULT;
> > +
> > +       if (sgx_validate_offset_length(encl, params.offset, params.length))
> > +               return -EINVAL;
> > +
> > +       if (copy_from_user(&secinfo, (void __user *)params.secinfo,
> > +                          sizeof(secinfo)))
> > +               return -EFAULT;
> > +
> > +       if (secinfo.flags & ~SGX_SECINFO_PAGE_TYPE_MASK)
> > +               return -EINVAL;
> > +
> > +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > +               return -EINVAL;
> > +
> > +       if (params.result || params.count)
> > +               return -EINVAL;
> > +
> > +       page_type = (secinfo.flags & SGX_SECINFO_PAGE_TYPE_MASK) >> 8;
> > +       ret = sgx_enclave_modify_type(encl, &params, page_type);
> > +
> > +       if (copy_to_user(arg, &params, sizeof(params)))
> > +               return -EFAULT;
> > +
> > +       return ret;
> > +}
> > +
> >  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> >  {
> >         struct sgx_encl *encl = filep->private_data;
> > @@ -923,6 +1129,9 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
> >                 ret = sgx_ioc_enclave_restrict_permissions(encl,
> >                                                            (void __user *)arg);
> >                 break;
> > +       case SGX_IOC_ENCLAVE_MODIFY_TYPE:
> > +               ret = sgx_ioc_enclave_modify_type(encl, (void __user *)arg);
> > +               break;
> >         default:
> >                 ret = -ENOIOCTLCMD;
> >                 break;
> > -- 
> > 2.25.1
> > 
> 
> To be coherent with other names, this should be
> SGX_IOC_ENCLAVE_MODIFY_TYPES.

This should take only the page type, given that the permission flags are zeroed:

EPCM(DS:RCX).R := 0;
EPCM(DS:RCX).W := 0;
EPCM(DS:RCX).X := 0; 
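
A minimal sketch of what a type-only interface could look like, assuming
the page type is the only SECINFO field the caller needs to control once
the R/W/X bits are zeroed as in the pseudocode above. The helper names
secinfo_flags_from_type() and modify_type_is_allowed() are hypothetical;
the "<< 8" encoding and the allowed transitions follow the patch, and the
enum values are those of the kernel's enum sgx_page_type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Page type values, matching the kernel's enum sgx_page_type. */
enum page_type {
	PAGE_TYPE_SECS = 0,
	PAGE_TYPE_TCS  = 1,
	PAGE_TYPE_REG  = 2,
	PAGE_TYPE_VA   = 3,
	PAGE_TYPE_TRIM = 4,
};

/* Hypothetical helper: SECINFO flags that carry only a page type (bits 15:8). */
uint64_t secinfo_flags_from_type(enum page_type type)
{
	return (uint64_t)type << 8;
}

/*
 * Hypothetical helper mirroring the checks in the patch: the new type must
 * be TCS or TRIM, a regular page may become either, and a TCS page may only
 * be trimmed.
 */
bool modify_type_is_allowed(enum page_type cur, enum page_type new_type)
{
	if (new_type != PAGE_TYPE_TCS && new_type != PAGE_TYPE_TRIM)
		return false;

	return cur == PAGE_TYPE_REG ||
	       (cur == PAGE_TYPE_TCS && new_type == PAGE_TYPE_TRIM);
}
```

With such an interface the permission bits (bits 2:0) never appear in the
request, so there is nothing for user space to get wrong there.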

BR, Jarkko



* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 14:27           ` Jarkko Sakkinen
  2022-04-05 14:52             ` Jarkko Sakkinen
@ 2022-04-05 16:40             ` Reinette Chatre
  1 sibling, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 16:40 UTC (permalink / raw)
  To: Jarkko Sakkinen, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

Hi Jarkko,

On 4/5/2022 7:27 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 17:19 +0300, Jarkko Sakkinen wrote:
>> On Tue, 2022-04-05 at 16:40 +0300, Jarkko Sakkinen wrote:
>>> On Tue, 2022-04-05 at 08:07 +0300, Jarkko Sakkinen wrote:
>>>> On Tue, 2022-04-05 at 08:03 +0300, Jarkko Sakkinen wrote:
>>>>> On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
>>>>>> In the initial (SGX1) version of SGX, pages in an enclave need to be
>>>>>> created with permissions that support all usages of the pages, from the
>>>>>> time the enclave is initialized until it is unloaded. For example,
>>>>>> pages used by a JIT compiler or when code needs to otherwise be
>>>>>> relocated need to always have RWX permissions.
>>>>>>
>>>>>> SGX2 includes a new function ENCLS[EMODPR] that is run from the kernel
>>>>>> and can be used to restrict the EPCM permissions of regular enclave
>>>>>> pages within an initialized enclave.
>>>>>>
>>>>>> Introduce ioctl() SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS to support
>>>>>> restricting EPCM permissions. With this ioctl() the user specifies
>>>>>> a page range and the EPCM permissions to be applied to all pages in
>>>>>> the provided range. ENCLS[EMODPR] is run to restrict the EPCM
>>>>>> permissions followed by the ENCLS[ETRACK] flow that will ensure
>>>>>> no cached linear-to-physical address mappings to the changed
>>>>>> pages remain.
>>>>>>
>>>>>> It is possible for the permission change request to fail on any
>>>>>> page within the provided range, either with an error encountered
>>>>>> by the kernel or by the SGX hardware while running
>>>>>> ENCLS[EMODPR]. To support partial success the ioctl() returns an
>>>>>> error code based on failures encountered by the kernel as well
>>>>>> as two result output parameters: one for the number of pages
>>>>>> that were successfully changed and one for the SGX return code.
>>>>>>
>>>>>> The page table entry permissions are not impacted by the EPCM
>>>>>> permission changes. VMAs and PTEs will continue to allow the
>>>>>> maximum vetted permissions determined at the time the pages
>>>>>> are added to the enclave. The SGX error code in a page fault
>>>>>> will indicate if it was an EPCM permission check that prevented
>>>>>> an access attempt.
>>>>>>
>>>>>> No checking is done to ensure that the permissions are actually
>>>>>> being restricted. This is because the enclave may have relaxed
>>>>>> the EPCM permissions from within the enclave without letting the
>>>>>> kernel know. An attempt to relax permissions using this call will
>>>>>> be ignored by the hardware.
>>>>>>
>>>>>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>>>>>> ---
>>>>>> Changes since V2:
>>>>>> - Include the sgx_ioc_sgx2_ready() utility
>>>>>>   that previously was in "x86/sgx: Support relaxing of enclave page
>>>>>>   permissions" that is removed from the next version.
>>>>>> - Few renames requested by Jarkko:
>>>>>>   struct sgx_enclave_restrict_perm ->
>>>>>>          struct sgx_enclave_restrict_permissions
>>>>>>   sgx_enclave_restrict_perm()     ->
>>>>>>          sgx_enclave_restrict_permissions()
>>>>>>   sgx_ioc_enclave_restrict_perm() ->
>>>>>>          sgx_ioc_enclave_restrict_permissions()
>>>>>> - Make EPCM permissions independent from kernel view of
>>>>>>   permissions.  (Jarkko)
>>>>>>   - Remove attempt at runtime tracking of EPCM permissions
>>>>>>     (sgx_encl_page->vm_run_prot_bits).
>>>>>>   - Do not flush page table entries - they are no longer impacted by
>>>>>>     EPCM permission changes.
>>>>>>   - Modify changelog to reflect new architecture.
>>>>>> - Ensure at least PROT_READ is requested - enclave requires read
>>>>>>   access to the page for commands like EMODPE and EACCEPT. (Jarkko)
>>>>>>
>>>>>> Changes since V1:
>>>>>> - Change terminology to use "relax" instead of "extend" to refer to
>>>>>>   the case when enclave page permissions are added (Dave).
>>>>>> - Use ioctl() in commit message (Dave).
>>>>>> - Add examples on what permissions would be allowed (Dave).
>>>>>> - Split enclave page permission changes into two ioctl()s, one for
>>>>>>   permission restricting (SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS)
>>>>>>   and one for permission relaxing (SGX_IOC_ENCLAVE_RELAX_PERMISSIONS)
>>>>>>   (Jarkko).
>>>>>> - In support of the ioctl() name change the following names have been
>>>>>>   changed:
>>>>>>   struct sgx_page_modp -> struct sgx_enclave_restrict_perm
>>>>>>   sgx_ioc_page_modp() -> sgx_ioc_enclave_restrict_perm()
>>>>>>   sgx_page_modp() -> sgx_enclave_restrict_perm()
>>>>>> - ioctl() takes entire secinfo as input instead of
>>>>>>   page permissions only (Jarkko).
>>>>>> - Fix kernel-doc to include () in function name.
>>>>>> - Create and use utility for the ETRACK flow.
>>>>>> - Fixups in comments
>>>>>> - Move kernel-doc to function that provides documentation for
>>>>>>   Documentation/x86/sgx.rst.
>>>>>> - Remove redundant comment.
>>>>>> - Make explicit which members of struct sgx_enclave_restrict_perm
>>>>>>   are for output (Dave).
>>>>>>
>>>>>>  arch/x86/include/uapi/asm/sgx.h |  21 +++
>>>>>>  arch/x86/kernel/cpu/sgx/ioctl.c | 242 ++++++++++++++++++++++++++++++++
>>>>>>  2 files changed, 263 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
>>>>>> index f4b81587e90b..a0a24e94fb27 100644
>>>>>> --- a/arch/x86/include/uapi/asm/sgx.h
>>>>>> +++ b/arch/x86/include/uapi/asm/sgx.h
>>>>>> @@ -29,6 +29,8 @@ enum sgx_page_flags {
>>>>>>         _IOW(SGX_MAGIC, 0x03, struct sgx_enclave_provision)
>>>>>>  #define SGX_IOC_VEPC_REMOVE_ALL \
>>>>>>         _IO(SGX_MAGIC, 0x04)
>>>>>> +#define SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS \
>>>>>> +       _IOWR(SGX_MAGIC, 0x05, struct sgx_enclave_restrict_permissions)
>>>>>>  
>>>>>>  /**
>>>>>>   * struct sgx_enclave_create - parameter structure for the
>>>>>> @@ -76,6 +78,25 @@ struct sgx_enclave_provision {
>>>>>>         __u64 fd;
>>>>>>  };
>>>>>>  
>>>>>> +/**
>>>>>> + * struct sgx_enclave_restrict_permissions - parameters for ioctl
>>>>>> + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
>>>>>> + * @offset:    starting page offset (page aligned relative to enclave base
>>>>>> + *             address defined in SECS)
>>>>>> + * @length:    length of memory (multiple of the page size)
>>>>>> + * @secinfo:   address for the SECINFO data containing the new permission bits
>>>>>> + *             for pages in range described by @offset and @length
>>>>>> + * @result:    (output) SGX result code of ENCLS[EMODPR] function
>>>>>> + * @count:     (output) bytes successfully changed (multiple of page size)
>>>>>> + */
>>>>>> +struct sgx_enclave_restrict_permissions {
>>>>>> +       __u64 offset;
>>>>>> +       __u64 length;
>>>>>> +       __u64 secinfo;
>>>>>> +       __u64 result;
>>>>>> +       __u64 count;
>>>>>> +};
>>>>>> +
>>>>>>  struct sgx_enclave_run;
>>>>>>  
>>>>>>  /**
>>>>>> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
>>>>>> index 0460fd224a05..4d88bfd163e7 100644
>>>>>> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
>>>>>> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
>>>>>> @@ -660,6 +660,244 @@ static long sgx_ioc_enclave_provision(struct sgx_encl *encl, void __user *arg)
>>>>>>         return sgx_set_attribute(&encl->attributes_mask, params.fd);
>>>>>>  }
>>>>>>  
>>>>>> +/*
>>>>>> + * Ensure enclave is ready for SGX2 functions. Readiness is checked
>>>>>> + * by ensuring the hardware supports SGX2 and the enclave is initialized
>>>>>> + * and thus able to handle requests to modify pages within it.
>>>>>> + */
>>>>>> +static int sgx_ioc_sgx2_ready(struct sgx_encl *encl)
>>>>>> +{
>>>>>> +       if (!(cpu_feature_enabled(X86_FEATURE_SGX2)))
>>>>>> +               return -ENODEV;
>>>>>> +
>>>>>> +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
>>>>>> +               return -EINVAL;
>>>>>> +
>>>>>> +       return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/*
>>>>>> + * Return valid permission fields from a secinfo structure provided by
>>>>>> + * user space. The secinfo structure is required to only have bits in
>>>>>> + * the permission fields set.
>>>>>> + */
>>>>>> +static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
>>>>>> +{
>>>>>> +       struct sgx_secinfo secinfo;
>>>>>> +       u64 perm;
>>>>>> +
>>>>>> +       if (copy_from_user(&secinfo, (void __user *)_secinfo,
>>>>>> +                          sizeof(secinfo)))
>>>>>> +               return -EFAULT;
>>>>>> +
>>>>>> +       if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
>>>>>> +               return -EINVAL;
>>>>>> +
>>>>>> +       if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
>>>>>> +               return -EINVAL;
>>>>>> +
>>>>>> +       perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
>>>>>> +
>>>>>> +       /*
>>>>>> +        * Read access is required for the enclave to be able to use the page.
>>>>>> +        * SGX instructions like ENCLU[EMODPE] and ENCLU[EACCEPT] require
>>>>>> +        * read access.
>>>>>> +        */
>>>>>> +       if (!(perm & SGX_SECINFO_R))
>>>>>> +               return -EINVAL;
>>>>>> +
>>>>>> +       *secinfo_perm = perm;
>>>>>> +
>>>>>> +       return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/*
>>>>>> + * Some SGX functions require that no cached linear-to-physical address
>>>>>> + * mappings are present before they can succeed. Collaborate with
>>>>>> + * hardware via ENCLS[ETRACK] to ensure that all cached
>>>>>> + * linear-to-physical address mappings belonging to all threads of
>>>>>> + * the enclave are cleared. See sgx_encl_cpumask() for details.
>>>>>> + */
>>>>>> +static int sgx_enclave_etrack(struct sgx_encl *encl)
>>>>>> +{
>>>>>> +       void *epc_virt;
>>>>>> +       int ret;
>>>>>> +
>>>>>> +       epc_virt = sgx_get_epc_virt_addr(encl->secs.epc_page);
>>>>>> +       ret = __etrack(epc_virt);
>>>>>> +       if (ret) {
>>>>>> +               /*
>>>>>> +                * ETRACK only fails when there is an OS issue. For
>>>>>> +                * example, two consecutive ETRACK was sent without
>>>>>> +                * completed IPI between.
>>>>>> +                */
>>>>>> +               pr_err_once("ETRACK returned %d (0x%x)", ret, ret);
>>>>>> +               /*
>>>>>> +                * Send IPIs to kick CPUs out of the enclave and
>>>>>> +                * try ETRACK again.
>>>>>> +                */
>>>>>> +               on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
>>>>>> +               ret = __etrack(epc_virt);
>>>>>> +               if (ret) {
>>>>>> +                       pr_err_once("ETRACK repeat returned %d (0x%x)",
>>>>>> +                                   ret, ret);
>>>>>> +                       return -EFAULT;
>>>>>> +               }
>>>>>> +       }
>>>>>> +       on_each_cpu_mask(sgx_encl_cpumask(encl), sgx_ipi_cb, NULL, 1);
>>>>>> +
>>>>>> +       return 0;
>>>>>> +}
>>>>>> +
>>>>>> +/**
>>>>>> + * sgx_enclave_restrict_permissions() - Restrict EPCM permissions
>>>>>> + * @encl:      Enclave to which the pages belong.
>>>>>> + * @modp:      Checked parameters from user on which pages need modifying.
>>>>>> + * @secinfo_perm: New (validated) permission bits.
>>>>>> + *
>>>>>> + * Return:
>>>>>> + * - 0:                Success.
>>>>>> + * - -errno:   Otherwise.
>>>>>> + */
>>>>>> +static long
>>>>>> +sgx_enclave_restrict_permissions(struct sgx_encl *encl,
>>>>>> +                                struct sgx_enclave_restrict_permissions *modp,
>>>>>> +                                u64 secinfo_perm)
>>>>>> +{
>>>>>> +       struct sgx_encl_page *entry;
>>>>>> +       struct sgx_secinfo secinfo;
>>>>>> +       unsigned long addr;
>>>>>> +       unsigned long c;
>>>>>> +       void *epc_virt;
>>>>>> +       int ret;
>>>>>> +
>>>>>> +       memset(&secinfo, 0, sizeof(secinfo));
>>>>>> +       secinfo.flags = secinfo_perm;
>>>>>> +
>>>>>> +       for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
>>>>>> +               addr = encl->base + modp->offset + c;
>>>>>> +
>>>>>> +               mutex_lock(&encl->lock);
>>>>>> +
>>>>>> +               entry = sgx_encl_load_page(encl, addr);
>>>>>> +               if (IS_ERR(entry)) {
>>>>>> +                       ret = PTR_ERR(entry) == -EBUSY ? -EAGAIN : -EFAULT;
>>>>>> +                       goto out_unlock;
>>>>>> +               }
>>>>>> +
>>>>>> +               /*
>>>>>> +                * Changing EPCM permissions is only supported on regular
>>>>>> +                * SGX pages. Attempting this change on other pages will
>>>>>> +                * result in #PF.
>>>>>> +                */
>>>>>> +               if (entry->type != SGX_PAGE_TYPE_REG) {
>>>>>> +                       ret = -EINVAL;
>>>>>> +                       goto out_unlock;
>>>>>> +               }
>>>>>> +
>>>>>> +               /*
>>>>>> +                * Do not verify the permission bits requested. Kernel
>>>>>> +                * has no control over how EPCM permissions can be relaxed
>>>>>> +                * from within the enclave. ENCLS[EMODPR] can only
>>>>>> +                * remove existing EPCM permissions, attempting to set
>>>>>> +                * new permissions will be ignored by the hardware.
>>>>>> +                */
>>>>>> +
>>>>>> +               /* Change EPCM permissions. */
>>>>>> +               epc_virt = sgx_get_epc_virt_addr(entry->epc_page);
>>>>>> +               ret = __emodpr(&secinfo, epc_virt);
>>>>>> +               if (encls_faulted(ret)) {
>>>>>> +                       /*
>>>>>> +                        * All possible faults should be avoidable:
>>>>>> +                        * parameters have been checked, will only change
>>>>>> +                        * permissions of a regular page, and no concurrent
>>>>>> +                        * SGX1/SGX2 ENCLS instructions since these
>>>>>> +                        * are protected with mutex.
>>>>>> +                        */
>>>>>> +                       pr_err_once("EMODPR encountered exception %d\n",
>>>>>> +                                   ENCLS_TRAPNR(ret));
>>>>>> +                       ret = -EFAULT;
>>>>>> +                       goto out_unlock;
>>>>>> +               }
>>>>>> +               if (encls_failed(ret)) {
>>>>>> +                       modp->result = ret;
>>>>>> +                       ret = -EFAULT;
>>>>>> +                       goto out_unlock;
>>>>>> +               }
>>>>>> +
>>>>>> +               ret = sgx_enclave_etrack(encl);
>>>>>> +               if (ret) {
>>>>>> +                       ret = -EFAULT;
>>>>>> +                       goto out_unlock;
>>>>>> +               }
>>>>>> +
>>>>>> +               mutex_unlock(&encl->lock);
>>>>>> +       }
>>>>>> +
>>>>>> +       ret = 0;
>>>>>> +       goto out;
>>>>>> +
>>>>>> +out_unlock:
>>>>>> +       mutex_unlock(&encl->lock);
>>>>>> +out:
>>>>>> +       modp->count = c;
>>>>>> +
>>>>>> +       return ret;
>>>>>> +}
>>>>>> +
>>>>>> +/**
>>>>>> + * sgx_ioc_enclave_restrict_permissions() - handler for
>>>>>> + *                                        %SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS
>>>>>> + * @encl:      an enclave pointer
>>>>>> + * @arg:       userspace pointer to a &struct sgx_enclave_restrict_permissions
>>>>>> + *             instance
>>>>>> + *
>>>>>> + * SGX2 distinguishes between relaxing and restricting the enclave page
>>>>>> + * permissions maintained by the hardware (EPCM permissions) of pages
>>>>>> + * belonging to an initialized enclave (after SGX_IOC_ENCLAVE_INIT).
>>>>>> + *
>>>>>> + * EPCM permissions cannot be restricted from within the enclave, the enclave
>>>>>> + * requires the kernel to run the privileged level 0 instructions ENCLS[EMODPR]
>>>>>> + * and ENCLS[ETRACK]. An attempt to relax EPCM permissions with this call
>>>>>> + * will be ignored by the hardware.
>>>>>> + *
>>>>>> + * Return:
>>>>>> + * - 0:                Success
>>>>>> + * - -errno:   Otherwise
>>>>>> + */
>>>>>> +static long sgx_ioc_enclave_restrict_permissions(struct sgx_encl *encl,
>>>>>> +                                                void __user *arg)
>>>>>> +{
>>>>>> +       struct sgx_enclave_restrict_permissions params;
>>>>>> +       u64 secinfo_perm;
>>>>>> +       long ret;
>>>>>> +
>>>>>> +       ret = sgx_ioc_sgx2_ready(encl);
>>>>>> +       if (ret)
>>>>>> +               return ret;
>>>>>> +
>>>>>> +       if (copy_from_user(&params, arg, sizeof(params)))
>>>>>> +               return -EFAULT;
>>>>>> +
>>>>>> +       if (sgx_validate_offset_length(encl, params.offset, params.length))
>>>>>> +               return -EINVAL;
>>>>>> +
>>>>>> +       ret = sgx_perm_from_user_secinfo((void __user *)params.secinfo,
>>>>>> +                                        &secinfo_perm);
>>>>>> +       if (ret)
>>>>>> +               return ret;
>>>>>> +
>>>>>> +       if (params.result || params.count)
>>>>>> +               return -EINVAL;
>>>>>> +
>>>>>> +       ret = sgx_enclave_restrict_permissions(encl, &params, secinfo_perm);
>>>>>> +
>>>>>> +       if (copy_to_user(arg, &params, sizeof(params)))
>>>>>> +               return -EFAULT;
>>>>>> +
>>>>>> +       return ret;
>>>>>> +}
>>>>>> +
>>>>>>  long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>>>>>>  {
>>>>>>         struct sgx_encl *encl = filep->private_data;
>>>>>> @@ -681,6 +919,10 @@ long sgx_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
>>>>>>         case SGX_IOC_ENCLAVE_PROVISION:
>>>>>>                 ret = sgx_ioc_enclave_provision(encl, (void __user *)arg);
>>>>>>                 break;
>>>>>> +       case SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS:
>>>>>> +               ret = sgx_ioc_enclave_restrict_permissions(encl,
>>>>>> +                                                          (void __user *)arg);
>>>>>> +               break;
>>>>>>         default:
>>>>>>                 ret = -ENOIOCTLCMD;
>>>>>>                 break;
>>>>>
>>>>> I think this is a big improvement all things considered. I just started
>>>>> a kernel build to see if I can get this wired to our code:
>>>>>
>>>>> https://github.com/jarkkojs/aur-linux-sgx/actions/runs/2094084943
>>>>>
>>>>> I'll report my findings later on.
>>>>
>>>> I pulled the patches from sgx2_submitted_v3_plus_rwx branch. Just
>>>> sanity checking that it is v3, correct?
>>>
>>> I'm getting EINVAL with SECINFO that I think is legit:
>>>
>>> let mut secinfo_buf: [u8; 64] = [0; 64]; // Initialize with zeros
>>> secinfo_buf[0] = 1; // READ
>>> secinfo_buf[1] = 2; // Regular
>>>
>>> I made a small bpftrace script, and here's what happens:
>>>
>>> $ cat sgx.bt
>>> kretprobe:sgx_ioctl /retval != 0/
>>> {
>>>         printf("sgx_ioctl: %d\n", retval)
>>> }
>>>
>>> kretprobe:sgx_perm_from_user_secinfo.constprop.0 /retval/
>>> {
>>>         printf("sgx_perm_from_user_secinfo.constprop.0 %d\n", retval)
>>> }
>>>
>>> kretprobe:sgx_enclave_restrict_permissions /retval/
>>> {
>>>         printf("sgx_enclave_restrict_permissions: %d\n", retval)
>>> }
>>>
>>> $ sudo bpftrace sgx.bt
>>> [sudo] password for jarkko: 
>>> Attaching 3 probes...
>>> sgx_perm_from_user_secinfo.constprop.0 -22
>>> sgx_ioctl: -22
>>>
>>> It could be that I'm doing something wrong, but I do not immediately see
>>> anything obvious...
>>
>> It was my bad, i.e.
>>
>> let mut secinfo_buf: [u8; 64] = [0; 64];
>> secinfo_buf[0] = 1;
>> secinfo_buf[1] = 0;
>>  
>> BR, Jarkko
> 
> According to SDM having page type as regular is fine for EMODPR,
> i.e. that's why I did not care about having it in SECINFO.

EMODPR can only be run on regular page type, but having PT_REG set
in EMODPR's secinfo is not required. In this implementation,
when EMODPR is executed _only_ the permission bits in secinfo are
set.

What the hardware does is ensure that the existing EPCM entry
has PT_REG set. It does not check the PT_REG bit in the
provided secinfo.

> 
> Given that the opcode itself contains validation, I wonder
> why this needs to be done:
> 
> if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> 		return -EINVAL;
> 
> if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> 		return -EINVAL;
> 
> perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> 
> I.e. why duplicate validation and why does it have different
> invariant than the opcode?

This is not different - it ends up being the exact secinfo
provided to the hardware. The provided secinfo only has 
permission bits set. The hardware only checks the permission
bits in secinfo (apart from also ensuring that the reserved bits
are zero).

The implementation ensures that only fields checked by
the hardware are provided.
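
For illustration, the check being discussed can be modelled in user-space C
roughly as below. This is a minimal sketch, not the patch code: the struct
layout, the macro names and the bit values are assumptions modelled on the
flags/reserved split of SECINFO, and a plain loop stands in for memchr_inv(),
which is only available in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout: permissions in bits 0-2, page type in bits 8-15. */
#define SECINFO_R         (1ULL << 0)
#define SECINFO_W         (1ULL << 1)
#define SECINFO_X         (1ULL << 2)
#define SECINFO_PERM_MASK (SECINFO_R | SECINFO_W | SECINFO_X)
#define SECINFO_PT_REG    (2ULL << 8)

/* Hypothetical stand-in for the 64-byte SECINFO structure. */
struct secinfo {
	uint64_t flags;
	uint8_t reserved[56];
};

static_assert(sizeof(struct secinfo) == 64, "SECINFO model must be 64 bytes");

/*
 * Accept a secinfo only if nothing but permission bits is set,
 * then hand back those permission bits.
 */
int perm_from_secinfo(const struct secinfo *si, uint64_t *perm)
{
	size_t i;

	/* Any bit outside the permission field (e.g. a page type) is rejected. */
	if (si->flags & ~SECINFO_PERM_MASK)
		return -EINVAL;

	/* User-space replacement for memchr_inv(reserved, 0, size). */
	for (i = 0; i < sizeof(si->reserved); i++)
		if (si->reserved[i])
			return -EINVAL;

	*perm = si->flags & SECINFO_PERM_MASK;
	return 0;
}
```

In this model a secinfo carrying only the R bit is accepted, while one that
also sets a page type, as in the EINVAL report earlier in the thread, is
rejected.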

> 
> While looking into this I also noticed:
> 
> static int sgx_validate_offset_length(struct sgx_encl *encl,
> 				      unsigned long offset,
> 				      unsigned long length)
> {
> 	if (!IS_ALIGNED(offset, PAGE_SIZE))
> 		return -EINVAL;
> 
> 	if (!length || length & (PAGE_SIZE - 1))
> 		return -EINVAL;
> 
> I guess it would be a good idea to use IS_ALIGNED() for length as well
> (this inconsistency is inherited from the pre-existing code).
> 

Yes, I was following existing code.
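
For reference, a user-space sketch of the two quoted checks with IS_ALIGNED()
used for both offset and length might look like the following. The PAGE_SIZE
value and the local IS_ALIGNED() definition are assumptions made so the
snippet builds outside the kernel, and only the two checks quoted above are
modelled; whatever else the real function does is left out.

```c
#include <errno.h>

/* Local stand-ins so the sketch builds outside the kernel. */
#define PAGE_SIZE        4096UL
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Hypothetical helper: both values must be page aligned, length non-zero. */
int validate_offset_length(unsigned long offset, unsigned long length)
{
	if (!IS_ALIGNED(offset, PAGE_SIZE))
		return -EINVAL;

	/* Same test as "length & (PAGE_SIZE - 1)", spelled consistently. */
	if (!length || !IS_ALIGNED(length, PAGE_SIZE))
		return -EINVAL;

	return 0;
}
```

The behaviour is unchanged compared to the open-coded mask; only the spelling
of the alignment test becomes consistent.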

Reinette


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 14:52             ` Jarkko Sakkinen
@ 2022-04-05 16:49               ` Reinette Chatre
  2022-04-05 18:39                 ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 16:49 UTC (permalink / raw)
  To: Jarkko Sakkinen, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

Hi Jarkko,

On 4/5/2022 7:52 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 17:27 +0300, Jarkko Sakkinen wrote:
>> According to SDM having page type as regular is fine for EMODPR,
>> i.e. that's why I did not care about having it in SECINFO.
>>
>> Given that the opcode itself contains validation, I wonder
>> why this needs to be done:
>>
>> if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
>>                 return -EINVAL;
>>
>> if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
>>                 return -EINVAL;
>>
>> perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
>>
>> I.e. why duplicate validation and why does it have different
>> invariant than the opcode?
> 
> Right it is done to prevent exceptions and also pseudo-code
> has this validation:
> 
> IF (EPCM(DS:RCX).PT is not PT_REG) THEN #PF(DS:RCX); FI; 

The current type of the page is validated - not the page type
provided in the parameters of the command.

> 
> This is clearly wrong:

Could you please elaborate what is wrong? The hardware only checks
the permission bits and that is what is provided.

> 
> /*
>  * Return valid permission fields from a secinfo structure provided by
>  * user space. The secinfo structure is required to only have bits in
>  * the permission fields set.
>  */
> static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> 
> It means that the API requires a malformed data as input.

It is not clear to me how this is malformed. The API requires that only
the permission bits are set in the secinfo, only the permission bits in secinfo
are provided to the hardware, and the hardware only checks the permission bits.

> 
> Maybe it would be a better idea then to replace secinfo with just the
> permission field?

That is what I implemented in V1 [1], but was asked to change to secinfo. I could
go back to that if you prefer.

Reinette

[1] https://lore.kernel.org/linux-sgx/44fe170cfd855760857660b9f56cae8c4747cc15.1638381245.git.reinette.chatre@intel.com/

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-05 15:34     ` Jarkko Sakkinen
@ 2022-04-05 17:05       ` Reinette Chatre
  2022-04-05 18:41         ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 17:05 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 4/5/2022 8:34 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 10:06 +0300, Jarkko Sakkinen wrote:

>>>
>>
>> To be coherent with other names, this should be
>> SGX_IOC_ENCLAVE_MODIFY_TYPES.

This is not such a clear change request to me:

SGX_IOC_ENCLAVE_ADD_PAGES - add multiple pages
SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS - restrict multiple permissions
SGX_IOC_ENCLAVE_REMOVE_PAGES - remove multiple pages
SGX_IOC_ENCLAVE_MODIFY_TYPE - set a single type

Perhaps it should rather be SGX_IOC_ENCLAVE_SET_TYPE to indicate that
there is a single target type as opposed to the possibility
of multiple source types (TCS and regular pages can be trimmed).

> 
> This should take only page type given that flags are zeroed:
> 
> EPCM(DS:RCX).R := 0;
> EPCM(DS:RCX).W := 0;
> EPCM(DS:RCX).X := 0; 
> 

ok, this was how it was done in V1 [1] and I can go back to that.


Reinette

[1] https://lore.kernel.org/linux-sgx/c0f04a8f7e1afd9e9319bb9f283db9a3187f7abc.1638381245.git.reinette.chatre@intel.com/


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-05  7:11   ` Jarkko Sakkinen
@ 2022-04-05 17:13     ` Reinette Chatre
  2022-04-05 17:25       ` Dave Hansen
  2022-04-05 18:42       ` Jarkko Sakkinen
  0 siblings, 2 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 17:13 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 4/5/2022 12:11 AM, Jarkko Sakkinen wrote:
> On Mon, Apr 04, 2022 at 09:49:27AM -0700, Reinette Chatre wrote:
>> The page reclaimer ensures availability of EPC pages across all
>> enclaves. In support of this it runs independently from the
>> individual enclaves in order to take locks from the different
>> enclaves as it writes pages to swap.
>>
>> When needing to load a page from swap an EPC page needs to be
>> available for its contents to be loaded into. Loading an existing
>> enclave page from swap does not reclaim EPC pages directly if
>> none are available, instead the reclaimer is woken when the
>> available EPC pages are found to be below a watermark.
>>
>> When iterating over a large number of pages in an oversubscribed
>> environment there is a race between the reclaimer woken up and
>> EPC pages reclaimed fast enough for the page operations to proceed.
>>
>> Ensure there are EPC pages available before attempting to load
>> a page that may potentially be pulled from swap into an available
>> EPC page.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>> No changes since V2
>>
>> Changes since v1:
>> - Reword commit message.
>>
>>  arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
>>  arch/x86/kernel/cpu/sgx/main.c  | 6 ++++++
>>  arch/x86/kernel/cpu/sgx/sgx.h   | 1 +
>>  3 files changed, 13 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
>> index 515e1961cc02..f88bc1236276 100644
>> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
>> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
>> @@ -777,6 +777,8 @@ sgx_enclave_restrict_permissions(struct sgx_encl *encl,
>>  	for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
>>  		addr = encl->base + modp->offset + c;
>>  
>> +		sgx_direct_reclaim();
>> +
>>  		mutex_lock(&encl->lock);
>>  
>>  		entry = sgx_encl_load_page(encl, addr);
>> @@ -934,6 +936,8 @@ static long sgx_enclave_modify_type(struct sgx_encl *encl,
>>  	for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
>>  		addr = encl->base + modt->offset + c;
>>  
>> +		sgx_direct_reclaim();
>> +
>>  		mutex_lock(&encl->lock);
>>  
>>  		entry = sgx_encl_load_page(encl, addr);
>> @@ -1129,6 +1133,8 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
>>  	for (c = 0 ; c < params->length; c += PAGE_SIZE) {
>>  		addr = encl->base + params->offset + c;
>>  
>> +		sgx_direct_reclaim();
>> +
>>  		mutex_lock(&encl->lock);
>>  
>>  		entry = sgx_encl_load_page(encl, addr);
>> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
>> index 6e2cb7564080..545da16bb3ea 100644
>> --- a/arch/x86/kernel/cpu/sgx/main.c
>> +++ b/arch/x86/kernel/cpu/sgx/main.c
>> @@ -370,6 +370,12 @@ static bool sgx_should_reclaim(unsigned long watermark)
>>  	       !list_empty(&sgx_active_page_list);
>>  }
>>  
>> +void sgx_direct_reclaim(void)
>> +{
>> +	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>> +		sgx_reclaim_pages();
>> +}
> 
> Please, instead open code this to both locations - not enough redundancy
> to be worth a new function. Causes only unnecessary cross-referencing
> when maintaining. Otherwise, I agree with the idea.
> 

hmmm, that means the heart of the reclaimer (sgx_reclaim_pages()) would be
made available for direct use from everywhere in the driver. I will look into this.

Reinette


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-05 17:13     ` Reinette Chatre
@ 2022-04-05 17:25       ` Dave Hansen
  2022-04-06  6:35         ` Jarkko Sakkinen
  2022-04-05 18:42       ` Jarkko Sakkinen
  1 sibling, 1 reply; 79+ messages in thread
From: Dave Hansen @ 2022-04-05 17:25 UTC (permalink / raw)
  To: Reinette Chatre, Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On 4/5/22 10:13, Reinette Chatre wrote:
>>> +void sgx_direct_reclaim(void)
>>> +{
>>> +	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>>> +		sgx_reclaim_pages();
>>> +}
>> Please, instead open code this to both locations - not enough redundancy
>> to be worth a new function. Causes only unnecessary cross-referencing
>> when maintaining. Otherwise, I agree with the idea.
>>
> hmmm, that means the heart of the reclaimer (sgx_reclaim_pages()) would be
> made available for direct use from everywhere in the driver. I will look into this.

I like the change.  It's not about reducing code redundancy, it's about
*describing* what the code does.  Each location could have:

	/* Enter direct SGX reclaim: */
	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
		sgx_reclaim_pages();

Or, it could just be:

	sgx_direct_reclaim();

Which also provides a logical choke point to add comments, like:

/*
 * sgx_direct_reclaim() should be called in locations where SGX
 * memory resources might be low and might be needed in order
 * to make forward progress.
 */
void sgx_direct_reclaim(void)
{
	...
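
As a stand-alone illustration of the pattern being debated, here is a toy
user-space model of a named direct-reclaim helper. It is only a sketch: the
counters, the watermark and batch values, and every function name in it are
hypothetical stand-ins, not the driver's actual code.

```c
#include <stdbool.h>

#define NR_LOW_PAGES 32	/* stand-in for the low watermark */
#define NR_TO_SCAN   16	/* stand-in for the reclaim batch size */

static long nr_free_pages;	/* toy model of the free EPC page count */
static long nr_active_pages;	/* toy model of pages that can be reclaimed */

/* Reclaim is worthwhile only if memory is low and something is reclaimable. */
static bool should_reclaim(long watermark)
{
	return nr_free_pages < watermark && nr_active_pages > 0;
}

/* Move up to one batch of pages from the active pool to the free pool. */
static void reclaim_pages(void)
{
	long n = nr_active_pages < NR_TO_SCAN ? nr_active_pages : NR_TO_SCAN;

	nr_active_pages -= n;
	nr_free_pages += n;
}

/*
 * Call where free pages might be low and are needed in order to make
 * forward progress; callers never touch reclaim_pages() themselves.
 */
void direct_reclaim(void)
{
	if (should_reclaim(NR_LOW_PAGES))
		reclaim_pages();
}
```

In this model the per-page loops would call direct_reclaim() once per
iteration before taking the enclave lock, which mirrors where the patch
places its calls.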

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes
  2022-04-05  7:02   ` Jarkko Sakkinen
  2022-04-05  7:03     ` Jarkko Sakkinen
@ 2022-04-05 17:28     ` Reinette Chatre
  2022-04-05 18:43       ` Jarkko Sakkinen
  1 sibling, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 17:28 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 4/5/2022 12:02 AM, Jarkko Sakkinen wrote:
> Lacking:
> 
> KERNEL SELFTEST FRAMEWORK
> M:	Shuah Khan <shuah@kernel.org>
> M:	Shuah Khan <skhan@linuxfoundation.org>
> L:	linux-kselftest@vger.kernel.org
> S:	Maintained
> Q:	https://patchwork.kernel.org/project/linux-kselftest/list/
> T:	git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
> F:	Documentation/dev-tools/kselftest*
> F:	tools/testing/selftests/
> 

My apologies. I'll add the kselftest folks to the next version.

Reinette

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 16:49               ` Reinette Chatre
@ 2022-04-05 18:39                 ` Jarkko Sakkinen
  2022-04-05 18:59                   ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 18:39 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

On Tue, 2022-04-05 at 09:49 -0700, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 4/5/2022 7:52 AM, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 17:27 +0300, Jarkko Sakkinen wrote:
> > > According to SDM having page type as regular is fine for EMODPR,
> > > i.e. that's why I did not care about having it in SECINFO.
> > > 
> > > Given that the opcode itself contains validation, I wonder
> > > why this needs to be done:
> > > 
> > > if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > >                 return -EINVAL;
> > > 
> > > if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > >                 return -EINVAL;
> > > 
> > > perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > > 
> > > I.e. why duplicate validation and why does it have different
> > > invariant than the opcode?
> > 
> > Right it is done to prevent exceptions and also pseudo-code
> > has this validation:
> > 
> > IF (EPCM(DS:RCX).PT is not PT_REG) THEN #PF(DS:RCX); FI; 
> 
> The current type of the page is validated - not the page type
> provided in the parameters of the command.
> 
> > 
> > This is clearly wrong:
> 
> Could you please elaborate what is wrong? The hardware only checks
> the permission bits and that is what is provided.

I think most people will find it a bit confusing that it takes a special
Linux-defined SECINFO instead of what you read in the spec.

> 
> > 
> > /*
> >  * Return valid permission fields from a secinfo structure provided by
> >  * user space. The secinfo structure is required to only have bits in
> >  * the permission fields set.
> >  */
> > static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > 
> > It means that the API requires a malformed data as input.
> 
> It is not clear to me how this is malformed. The API requires that only
> the permission bits are set in the secinfo, only the permission bits in secinfo
> are provided to the hardware, and the hardware only checks the permission bits.
> 
> > 
> > Maybe it would be a better idea then to replace secinfo with just the
> > permission field?
> 
> That is what I implemented in V1 [1], but was asked to change to secinfo. I could
> go back to that if you prefer.

Yeah, if I was the one saying that, I was clearly wrong. But my
perspective is also very different now after using a lot of these
features.

Alternatively you could have a single "mod" ioctl given the disjoint
nature of how the parameters go into SECINFO.


> Reinette
> 
> [1] https://lore.kernel.org/linux-sgx/44fe170cfd855760857660b9f56cae8c4747cc15.1638381245.git.reinette.chatre@intel.com/

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-05 17:05       ` Reinette Chatre
@ 2022-04-05 18:41         ` Jarkko Sakkinen
  2022-04-05 18:59           ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 18:41 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Tue, 2022-04-05 at 10:05 -0700, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 4/5/2022 8:34 AM, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 10:06 +0300, Jarkko Sakkinen wrote:
> 
> > > > 
> > > 
> > > To be coherent with other names, this should be
> > > SGX_IOC_ENCLAVE_MODIFY_TYPES.
> 
> This is not such a clear change request to me:
> 
> SGX_IOC_ENCLAVE_ADD_PAGES - add multiple pages
> SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS - restrict multiple permissions
> SGX_IOC_ENCLAVE_REMOVE_PAGES - remove multiple pages
> SGX_IOC_ENCLAVE_MODIFY_TYPE - set a single type
> 
> Perhaps it should rather be SGX_IOC_ENCLAVE_SET_TYPE to indicate that
> there is a single target type as opposed to the possibility
> of multiple source types (TCS and regular pages can be trimmed).
> 
> > 
> > This should take only page type given that flags are zeroed:
> > 
> > EPCM(DS:RCX).R := 0;
> > EPCM(DS:RCX).W := 0;
> > EPCM(DS:RCX).X := 0; 
> > 
> 
> ok, this was how it was done in V1 [1] and I can go back to that.

I would name the fields as "flags" and "page_type" just to align
names with SGX instead of trying to mimic "posix names". Otherwise,
I support that.

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-05 17:13     ` Reinette Chatre
  2022-04-05 17:25       ` Dave Hansen
@ 2022-04-05 18:42       ` Jarkko Sakkinen
  2022-04-05 19:56         ` Reinette Chatre
  1 sibling, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 18:42 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Tue, 2022-04-05 at 10:13 -0700, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 4/5/2022 12:11 AM, Jarkko Sakkinen wrote:
> > On Mon, Apr 04, 2022 at 09:49:27AM -0700, Reinette Chatre wrote:
> > > The page reclaimer ensures availability of EPC pages across all
> > > enclaves. In support of this it runs independently from the
> > > individual enclaves in order to take locks from the different
> > > enclaves as it writes pages to swap.
> > > 
> > > When needing to load a page from swap an EPC page needs to be
> > > available for its contents to be loaded into. Loading an existing
> > > enclave page from swap does not reclaim EPC pages directly if
> > > none are available, instead the reclaimer is woken when the
> > > available EPC pages are found to be below a watermark.
> > > 
> > > When iterating over a large number of pages in an oversubscribed
> > > environment there is a race between the reclaimer woken up and
> > > EPC pages reclaimed fast enough for the page operations to proceed.
> > > 
> > > Ensure there are EPC pages available before attempting to load
> > > a page that may potentially be pulled from swap into an available
> > > EPC page.
> > > 
> > > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > > ---
> > > No changes since V2
> > > 
> > > Changes since v1:
> > > - Reword commit message.
> > > 
> > >  arch/x86/kernel/cpu/sgx/ioctl.c | 6 ++++++
> > >  arch/x86/kernel/cpu/sgx/main.c  | 6 ++++++
> > >  arch/x86/kernel/cpu/sgx/sgx.h   | 1 +
> > >  3 files changed, 13 insertions(+)
> > > 
> > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > index 515e1961cc02..f88bc1236276 100644
> > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> > > @@ -777,6 +777,8 @@ sgx_enclave_restrict_permissions(struct sgx_encl *encl,
> > >         for (c = 0 ; c < modp->length; c += PAGE_SIZE) {
> > >                 addr = encl->base + modp->offset + c;
> > >  
> > > +               sgx_direct_reclaim();
> > > +
> > >                 mutex_lock(&encl->lock);
> > >  
> > >                 entry = sgx_encl_load_page(encl, addr);
> > > @@ -934,6 +936,8 @@ static long sgx_enclave_modify_type(struct sgx_encl *encl,
> > >         for (c = 0 ; c < modt->length; c += PAGE_SIZE) {
> > >                 addr = encl->base + modt->offset + c;
> > >  
> > > +               sgx_direct_reclaim();
> > > +
> > >                 mutex_lock(&encl->lock);
> > >  
> > >                 entry = sgx_encl_load_page(encl, addr);
> > > @@ -1129,6 +1133,8 @@ static long sgx_encl_remove_pages(struct sgx_encl *encl,
> > >         for (c = 0 ; c < params->length; c += PAGE_SIZE) {
> > >                 addr = encl->base + params->offset + c;
> > >  
> > > +               sgx_direct_reclaim();
> > > +
> > >                 mutex_lock(&encl->lock);
> > >  
> > >                 entry = sgx_encl_load_page(encl, addr);
> > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> > > index 6e2cb7564080..545da16bb3ea 100644
> > > --- a/arch/x86/kernel/cpu/sgx/main.c
> > > +++ b/arch/x86/kernel/cpu/sgx/main.c
> > > @@ -370,6 +370,12 @@ static bool sgx_should_reclaim(unsigned long watermark)
> > >                !list_empty(&sgx_active_page_list);
> > >  }
> > >  
> > > +void sgx_direct_reclaim(void)
> > > +{
> > > +       if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> > > +               sgx_reclaim_pages();
> > > +}
> > 
> > Please, instead open code this to both locations - not enough redundancy
> > > to be worth a new function. Causes only unnecessary cross-referencing
> > when maintaining. Otherwise, I agree with the idea.
> > 
> 
> hmmm, that means the heart of the reclaimer (sgx_reclaim_pages()) would be
> made available for direct use from everywhere in the driver. I will look into this.
> 
> Reinette
> 

It's a valid enough point. Let's keep it as it is :-)

Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes
  2022-04-05 17:28     ` Reinette Chatre
@ 2022-04-05 18:43       ` Jarkko Sakkinen
  0 siblings, 0 replies; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-05 18:43 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Tue, 2022-04-05 at 10:28 -0700, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 4/5/2022 12:02 AM, Jarkko Sakkinen wrote:
> > Lacking:
> > 
> > KERNEL SELFTEST FRAMEWORK
> > M:      Shuah Khan <shuah@kernel.org>
> > M:      Shuah Khan <skhan@linuxfoundation.org>
> > L:      linux-kselftest@vger.kernel.org
> > S:      Maintained
> > Q:      https://patchwork.kernel.org/project/linux-kselftest/list/
> > T:      git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
> > F:      Documentation/dev-tools/kselftest*
> > F:      tools/testing/selftests/
> > 
> 
> My apologies. I'll add the kselftest folks to the next version.
> 
> Reinette

No worries, just reminding that now is a good time, since the different
parts of the patch set are settling. 

BR, Jarkko


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 18:39                 ` Jarkko Sakkinen
@ 2022-04-05 18:59                   ` Reinette Chatre
  2022-04-06  7:30                     ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 18:59 UTC (permalink / raw)
  To: Jarkko Sakkinen, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

Hi Jarkko,

On 4/5/2022 11:39 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 09:49 -0700, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 4/5/2022 7:52 AM, Jarkko Sakkinen wrote:
>>> On Tue, 2022-04-05 at 17:27 +0300, Jarkko Sakkinen wrote:
>>>> According to SDM having page type as regular is fine for EMODPR,
>>>> i.e. that's why I did not care about having it in SECINFO.
>>>>
>>>> Given that the opcode itself contains validation, I wonder
>>>> why this needs to be done:
>>>>
>>>> if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
>>>>                 return -EINVAL;
>>>>
>>>> if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
>>>>                 return -EINVAL;
>>>>
>>>> perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
>>>>
>>>> I.e. why duplicate validation and why does it have different
>>>> invariant than the opcode?
>>>
>>> Right it is done to prevent exceptions and also pseudo-code
>>> has this validation:
>>>
>>> IF (EPCM(DS:RCX).PT is not PT_REG) THEN #PF(DS:RCX); FI; 
>>
>> The current type of the page is validated - not the page type
>> provided in the parameters of the command.
>>
>>>
>>> This is clearly wrong:
>>
>> Could you please elaborate what is wrong? The hardware only checks
>> the permission bits and that is what is provided.
> 
> I think most people will find it a bit confusing that it takes a special
> Linux-defined SECINFO instead of what you read in the spec.
> 
>>
>>>
>>> /*
>>>  * Return valid permission fields from a secinfo structure provided by
>>>  * user space. The secinfo structure is required to only have bits in
>>>  * the permission fields set.
>>>  */
>>> static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
>>>
>>> It means that the API requires a malformed data as input.
>>
>> It is not clear to me how this is malformed. The API requires that only
>> the permission bits are set in the secinfo, only the permission bits in secinfo
>> are provided to the hardware, and the hardware only checks the permission bits.
>>
>>>
>>> Maybe it would be a better idea then to replace secinfo with just the
>>> permission field?
>>
>> That is what I implemented in V1 [1], but was asked to change to secinfo. I could
>> go back to that if you prefer.
> 
> Yeah, if I was the one saying that, I was clearly wrong. But my
> perspective is also very different now after using a lot of these
> features.

No problem, I understand.

I plan to replace the current "secinfo" field in struct sgx_enclave_restrict_permissions
with a new "permissions" field that contains only the permissions. Please let
me know if you have concerns with this (I also discuss this more in reply to
your other message related to the page type change ioctl()).
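
To make the proposal concrete, a rough sketch of what user space would then
fill in is shown below. The struct name and layout are hypothetical: only the
offset and length members (seen in the ioctl loops of this series) and the
proposed permissions member are modelled, any other members are left out, and
the permission bit values are assumptions.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed permission bit values, matching the R/W/X order of the EPCM bits. */
#define PERM_R    (1ULL << 0)
#define PERM_W    (1ULL << 1)
#define PERM_X    (1ULL << 2)
#define PERM_MASK (PERM_R | PERM_W | PERM_X)

/* Hypothetical request layout; only these three members are sketched. */
struct restrict_perm_req {
	uint64_t offset;	/* page-aligned offset into the enclave */
	uint64_t length;	/* page-aligned length of the range */
	uint64_t permissions;	/* new permissions, nothing else */
};

/* With a plain permissions field only the mask check remains. */
int check_restrict_perm_req(const struct restrict_perm_req *req)
{
	if (req->permissions & ~PERM_MASK)
		return -EINVAL;

	return 0;
}
```

Compared to a 64-byte secinfo there is no reserved area to validate and no
page type field that a caller could set by mistake.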

> 
> Alternatively you could have a single "mod" ioctl given the disjoint
> nature of how the parameters go into SECINFO.

During V1 review [2] there was clear guidance to not multiplex within an ioctl() so 
I plan to keep them separate for now.


Reinette

>> [1] https://lore.kernel.org/linux-sgx/44fe170cfd855760857660b9f56cae8c4747cc15.1638381245.git.reinette.chatre@intel.com/

[2] https://lore.kernel.org/lkml/0fb14185-5cc3-a963-253d-2e119b4a52bb@intel.com/


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-05 18:41         ` Jarkko Sakkinen
@ 2022-04-05 18:59           ` Reinette Chatre
  2022-04-06  7:32             ` Jarkko Sakkinen
  0 siblings, 1 reply; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 18:59 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 4/5/2022 11:41 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 10:05 -0700, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 4/5/2022 8:34 AM, Jarkko Sakkinen wrote:
>>> On Tue, 2022-04-05 at 10:06 +0300, Jarkko Sakkinen wrote:
>>
>>>>>
>>>>
>>>> To be coherent with other names, this should be
>>>> SGX_IOC_ENCLAVE_MODIFY_TYPES.
>>
>> This is not such a clear change request to me:
>>
>> SGX_IOC_ENCLAVE_ADD_PAGES - add multiple pages
>> SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS - restrict multiple permissions
>> SGX_IOC_ENCLAVE_REMOVE_PAGES - remove multiple pages
>> SGX_IOC_ENCLAVE_MODIFY_TYPE - set a single type
>>
>> Perhaps it should rather be SGX_IOC_ENCLAVE_SET_TYPE to indicate that
>> there is a single target type as opposed to the possibility
>> of multiple source types (TCS and regular pages can be trimmed).
>>
>>>

What is your opinion about what the ioctl() name should be? I prefer
to obtain a confirmation from you since you originally [1] requested
SGX_IOC_ENCLAVE_MODIFY_TYPE.

>>> This should take only page type given that flags are zeroed:
>>>
>>> EPCM(DS:RCX).R := 0;
>>> EPCM(DS:RCX).W := 0;
>>> EPCM(DS:RCX).X := 0; 
>>>
>>
>> ok, this was how it was done in V1 [1] and I can go back to that.
> 
> I would name the fields as "flags" and "page_type" just to align
> names with SGX instead of trying to mimic "posix names". Otherwise,
> I support that.

I will move this ioctl() to use "page_type" instead of "secinfo"
within struct sgx_enclave_modify_type.

Your guidance of "flags" is not clear to me. I assume that you
refer to the field for struct sgx_enclave_restrict_permissions
where I think a "permissions" field that contains only the new
permissions would be more appropriate. None of the other values in
secinfo.flags are relevant.

Reinette

[1] https://lore.kernel.org/linux-sgx/Yav9g4+L8zg48DRf@iki.fi/


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-05 18:42       ` Jarkko Sakkinen
@ 2022-04-05 19:56         ` Reinette Chatre
  0 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-05 19:56 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 4/5/2022 11:42 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 10:13 -0700, Reinette Chatre wrote:
>> On 4/5/2022 12:11 AM, Jarkko Sakkinen wrote:
>>> On Mon, Apr 04, 2022 at 09:49:27AM -0700, Reinette Chatre wrote:

...

>>>> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
>>>> index 6e2cb7564080..545da16bb3ea 100644
>>>> --- a/arch/x86/kernel/cpu/sgx/main.c
>>>> +++ b/arch/x86/kernel/cpu/sgx/main.c
>>>> @@ -370,6 +370,12 @@ static bool sgx_should_reclaim(unsigned long watermark)
>>>>                !list_empty(&sgx_active_page_list);
>>>>  }
>>>>  
>>>> +void sgx_direct_reclaim(void)
>>>> +{
>>>> +       if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>>>> +               sgx_reclaim_pages();
>>>> +}
>>>
>>> Please, instead open code this to both locations - not enough redundancy
>>> to be worth a new function. Causes only unnecessary cross-referencing
>>> when maintaining. Otherwise, I agree with the idea.
>>>
>>
>> hmmm, that means the heart of the reclaimer (sgx_reclaim_pages()) would be
>> made available for direct use from everywhere in the driver. I will look into this.
>>
>> Reinette
>>
> 
> It's a valid enough point. Let's keep it as it is :-)

Will do. I plan to add Dave's suggested comments to sgx_direct_reclaim() that is
introduced in this patch.

> 
> Acked-by: Jarkko Sakkinen <jarkko@kernel.org>

Thank you very much.

Reinette


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-05 17:25       ` Dave Hansen
@ 2022-04-06  6:35         ` Jarkko Sakkinen
  2022-04-06 17:50           ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-06  6:35 UTC (permalink / raw)
  To: Dave Hansen, Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Tue, 2022-04-05 at 10:25 -0700, Dave Hansen wrote:
> On 4/5/22 10:13, Reinette Chatre wrote:
> > > > +void sgx_direct_reclaim(void)
> > > > +{
> > > > +       if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> > > > +               sgx_reclaim_pages();
> > > > +}
> > > > Please, instead open code this in both locations - not enough redundancy
> > > > to be worth a new function. Causes only unnecessary cross-referencing
> > > when maintaining. Otherwise, I agree with the idea.
> > > 
> > hmmm, that means the heart of the reclaimer (sgx_reclaim_pages()) would be
> > made available for direct use from everywhere in the driver. I will look into this.
> 
> I like the change.  It's not about reducing code redundancy, it's about
> *describing* what the code does.  Each location could have:
> 
>         /* Enter direct SGX reclaim: */
>         if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>                 sgx_reclaim_pages();
> 
> Or, it could just be:
> 
>         sgx_direct_reclaim();
> 
> Which also provides a logical choke point to add comments, like:
> 
> /*
>  * sgx_direct_reclaim() should be called in locations where SGX
>  * memory resources might be low and might be needed in order
>  * to make forward progress.
>  */
> void sgx_direct_reclaim(void)
> {
>         ...

Maybe splitting hairs, but could it be "sgx_reclaim_direct"? The rationale
is easier grepping of reclaimer functions, e.g. when tracing.

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-05 18:59                   ` Reinette Chatre
@ 2022-04-06  7:30                     ` Jarkko Sakkinen
  2022-04-06 17:51                       ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-06  7:30 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

On Tue, 2022-04-05 at 11:59 -0700, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 4/5/2022 11:39 AM, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 09:49 -0700, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 4/5/2022 7:52 AM, Jarkko Sakkinen wrote:
> > > > > On Tue, 2022-04-05 at 17:27 +0300, Jarkko Sakkinen wrote:
> > > > > According to SDM having page type as regular is fine for EMODPR,
> > > > > i.e. that's why I did not care about having it in SECINFO.
> > > > > 
> > > > > Given that the opcode itself contains validation, I wonder
> > > > > why this needs to be done:
> > > > > 
> > > > > if (secinfo.flags & ~SGX_SECINFO_PERMISSION_MASK)
> > > > >                 return -EINVAL;
> > > > > 
> > > > > if (memchr_inv(secinfo.reserved, 0, sizeof(secinfo.reserved)))
> > > > >                 return -EINVAL;
> > > > > 
> > > > > perm = secinfo.flags & SGX_SECINFO_PERMISSION_MASK;
> > > > > 
> > > > > I.e. why duplicate validation and why does it have different
> > > > > invariant than the opcode?
> > > > 
> > > > Right it is done to prevent exceptions and also pseudo-code
> > > > has this validation:
> > > > 
> > > > IF (EPCM(DS:RCX).PT is not PT_REG) THEN #PF(DS:RCX); FI; 
> > > 
> > > The current type of the page is validated - not the page type
> > > provided in the parameters of the command.
> > > 
> > > > 
> > > > This is clearly wrong:
> > > 
> > > Could you please elaborate what is wrong? The hardware only checks
> > > the permission bits and that is what is provided.
> > 
> > > I think most people find it a bit confusing that it takes a special
> > > Linux-defined SECINFO instead of what you read in the spec.
> > 
> > > 
> > > > 
> > > > /*
> > > >  * Return valid permission fields from a secinfo structure provided by
> > > >  * user space. The secinfo structure is required to only have bits in
> > > >  * the permission fields set.
> > > >  */
> > > > static int sgx_perm_from_user_secinfo(void __user *_secinfo, u64 *secinfo_perm)
> > > > 
> > > > It means that the API requires a malformed data as input.
> > > 
> > > It is not clear to me how this is malformed. The API requires that only
> > > the permission bits are set in the secinfo, only the permission bits in secinfo
> > > is provided to the hardware, and the hardware only checks the permission bits.
> > > 
> > > > 
> > > > Maybe it would be better idea then to replace secinfo with just the
> > > > permission field?
> > > 
> > > That is what I implemented in V1 [1], but was asked to change to secinfo. I could
> > > go back to that if you prefer.
> > 
> > Yeah, if I was the one saying that, I was clearly wrong. But also
> > perspective is now very different after using a lot of these
> > features.
> 
> No problem, I understand.
> 
> I plan to replace the current "secinfo" field in struct sgx_enclave_restrict_permissions
> with a new "permissions" field that contains only the permissions. Please let
> me know if you have concerns with this (I also discuss this more in reply to
> your other message related to the page type change ioctl()).

I'm cool with it, but if it is named "permissions", then
it is already a software-defined entity, i.e. meaning that you should
just have this check in place in the ioctl:

if (addp->permissions & ~(PROT_READ | PROT_WRITE | PROT_EXEC))
	return -EINVAL;

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-05 18:59           ` Reinette Chatre
@ 2022-04-06  7:32             ` Jarkko Sakkinen
  2022-04-06 17:50               ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-06  7:32 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

On Tue, 2022-04-05 at 11:59 -0700, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 4/5/2022 11:41 AM, Jarkko Sakkinen wrote:
> > On Tue, 2022-04-05 at 10:05 -0700, Reinette Chatre wrote:
> > > Hi Jarkko,
> > > 
> > > On 4/5/2022 8:34 AM, Jarkko Sakkinen wrote:
> > > > On Tue, 2022-04-05 at 10:06 +0300, Jarkko Sakkinen wrote:
> > > 
> > > > > > 
> > > > > 
> > > > > To be coherent with other names, this should be
> > > > > SGX_IOC_ENCLAVE_MODIFY_TYPES.
> > > 
> > > This is not such a clear change request to me:
> > > 
> > > SGX_IOC_ENCLAVE_ADD_PAGES - add multiple pages
> > > SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS - restrict multiple permissions
> > > SGX_IOC_ENCLAVE_REMOVE_PAGES - remove multiple pages
> > > SGX_IOC_ENCLAVE_MODIFY_TYPE - set a single type
> > > 
> > > Perhaps it should rather be SGX_IOC_ENCLAVE_SET_TYPE to indicate that
> > > there is a single target type as opposed to the possibility
> > > of multiple source types (TCS and regular pages can be trimmed).
> > > 
> > > > 
> 
> What is your opinion about what the ioctl() name should be? I prefer
> to obtain a confirmation from you since you originally [1] requested
> SGX_IOC_ENCLAVE_MODIFY_TYPE.

s/TYPE/TYPES/g i.e. SGX_IOC_ENCLAVE_MODIFY_TYPES is fine.
> 
> > > > This should take only page type given that flags are zeroed:
> > > > 
> > > > EPCM(DS:RCX).R := 0;
> > > > EPCM(DS:RCX).W := 0;
> > > > EPCM(DS:RCX).X := 0; 
> > > > 
> > > 
> > > ok, this was how it was done in V1 [1] and I can go back to that.
> > 
> > I would name the fields as "flags" and "page_type" just to align
> > > names with SGX instead of trying to mimic "posix names". Otherwise,
> > I support that.
> 
> I will move this ioctl() to use "page_type" instead of "secinfo"
> within struct sgx_enclave_modify_type.
> 
> Your guidance of "flags" is not clear to me. I assume that you
> refer to the field for struct sgx_enclave_restrict_permissions
> where I think "permissions" to only contain the new permissions
> would be more appropriate. None of the other values in
> secinfo.flags are relevant.

I'm fine with your permissions field for the restrict ioctl.

> 
> Reinette
> 
> [1] https://lore.kernel.org/linux-sgx/Yav9g4+L8zg48DRf@iki.fi/
> 

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave
  2022-04-05 10:03     ` Jarkko Sakkinen
@ 2022-04-06  7:37       ` Jarkko Sakkinen
  2022-04-06 22:42         ` Reinette Chatre
  0 siblings, 1 reply; 79+ messages in thread
From: Jarkko Sakkinen @ 2022-04-06  7:37 UTC (permalink / raw)
  To: Reinette Chatre, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel, harald

On Tue, 2022-04-05 at 13:03 +0300, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 08:05 +0300, Jarkko Sakkinen wrote:
> > On Mon, 2022-04-04 at 09:49 -0700, Reinette Chatre wrote:
> > > With SGX1 an enclave needs to be created with its maximum memory demands
> > > allocated. Pages cannot be added to an enclave after it is initialized.
> > > SGX2 introduces a new function, ENCLS[EAUG], that can be used to add
> > > pages to an initialized enclave. With SGX2 the enclave still needs to
> > > set aside address space for its maximum memory demands during enclave
> > > creation, but all pages need not be added before enclave initialization.
> > > Pages can be added during enclave runtime.
> > > 
> > > Add support for dynamically adding pages to an initialized enclave,
> > > architecturally limited to RW permission at creation but allowed to
> > > obtain RWX permissions after enclave runs EMODPE. Add pages via the
> > > page fault handler at the time an enclave address without a backing
> > > enclave page is accessed, potentially directly reclaiming pages if
> > > no free pages are available.
> > > 
> > > The enclave is still required to run ENCLU[EACCEPT] on the page before
> > > it can be used. A useful flow is for the enclave to run ENCLU[EACCEPT]
> > > on an uninitialized address. This will trigger the page fault handler
> > > that will add the enclave page and return execution to the enclave to
> > > repeat the ENCLU[EACCEPT] instruction, this time successful.
> > > 
> > > If the enclave accesses an uninitialized address in another way, for
> > > example by expanding the enclave stack to a page that has not yet been
> > > added, then the page fault handler would add the page on the first
> > > write but upon returning to the enclave the instruction that triggered
> > > the page fault would be repeated and since ENCLU[EACCEPT] was not run
> > > yet it would trigger a second page fault, this time with the SGX flag
> > > set in the page fault error code. This can only be recovered by entering
> > > the enclave again and directly running the ENCLU[EACCEPT] instruction on
> > > the now initialized address.
> > > 
> > > Accessing an uninitialized address from outside the enclave also
> > > triggers this flow but the page will remain inaccessible (access will
> > > result in #PF) until accepted from within the enclave via
> > > ENCLU[EACCEPT].
> > > 
> > > Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> > > ---
> > > Changes since V2:
> > > - Remove runtime tracking of EPCM permissions
> > >   (sgx_encl_page->vm_run_prot_bits) (Jarkko).
> > > - Move export of sgx_encl_{grow,shrink}() to separate patch. (Jarkko)
> > > - Use sgx_encl_page_alloc(). (Jarkko)
> > > - Set max allowed permissions to be RWX (Jarkko). Update changelog
> > >   to indicate the change and use comment in code as
> > >   created by Jarkko in:
> > > https://lore.kernel.org/linux-sgx/20220306053211.135762-4-jarkko@kernel.org
> > > - Do not set protection bits but let it be inherited by VMA (Jarkko)
> > > 
> > > Changes since V1:
> > > - Fix subject line "to initialized" -> "to an initialized" (Jarkko).
> > > - Move text about hardware's PENDING state to the patch that introduces
> > >   the ENCLS[EAUG] wrapper (Jarkko).
> > > - Ensure kernel-doc uses brackets when referring to function.
> > > 
> > >  arch/x86/kernel/cpu/sgx/encl.c | 124 +++++++++++++++++++++++++++++++++
> > >  1 file changed, 124 insertions(+)
> > > 
> > > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > > index 546423753e4c..fa4f947f8496 100644
> > > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > > @@ -194,6 +194,119 @@ struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
> > >         return __sgx_encl_load_page(encl, entry);
> > >  }
> > >  
> > > +/**
> > > + * sgx_encl_eaug_page() - Dynamically add page to initialized enclave
> > > + * @vma:       VMA obtained from fault info from where page is accessed
> > > + * @encl:      enclave accessing the page
> > > + * @addr:      address that triggered the page fault
> > > + *
> > > + * When an initialized enclave accesses a page with no backing EPC page
> > > + * on a SGX2 system then the EPC can be added dynamically via the SGX2
> > > + * ENCLS[EAUG] instruction.
> > > + *
> > > + * Returns: Appropriate vm_fault_t: VM_FAULT_NOPAGE when PTE was installed
> > > + * successfully, VM_FAULT_SIGBUS or VM_FAULT_OOM as error otherwise.
> > > + */
> > > +static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
> > > +                                    struct sgx_encl *encl, unsigned long addr)
> > > +{
> > > +       struct sgx_pageinfo pginfo = {0};
> > > +       struct sgx_encl_page *encl_page;
> > > +       struct sgx_epc_page *epc_page;
> > > +       struct sgx_va_page *va_page;
> > > +       unsigned long phys_addr;
> > > +       u64 secinfo_flags;
> > > +       vm_fault_t vmret;
> > > +       int ret;
> > > +
> > > +       if (!test_bit(SGX_ENCL_INITIALIZED, &encl->flags))
> > > +               return VM_FAULT_SIGBUS;
> > > +
> > > +       /*
> > > +        * Ignore internal permission checking for dynamically added pages.
> > > +        * They matter only for data added during the pre-initialization
> > > +        * phase. The enclave decides the permissions by the means of
> > > +        * EACCEPT, EACCEPTCOPY and EMODPE.
> > > +        */
> > > +       secinfo_flags = SGX_SECINFO_R | SGX_SECINFO_W | SGX_SECINFO_X;
> > > +       encl_page = sgx_encl_page_alloc(encl, addr - encl->base, secinfo_flags);
> > > +       if (IS_ERR(encl_page))
> > > +               return VM_FAULT_OOM;
> > > +
> > > +       epc_page = sgx_alloc_epc_page(encl_page, true);
> > > +       if (IS_ERR(epc_page)) {
> > > +               kfree(encl_page);
> > > +               return VM_FAULT_SIGBUS;
> > > +       }
> > > +
> > > +       va_page = sgx_encl_grow(encl);
> > > +       if (IS_ERR(va_page)) {
> > > +               ret = PTR_ERR(va_page);
> > > +               goto err_out_free;
> > > +       }
> > > +
> > > +       mutex_lock(&encl->lock);
> > > +
> > > +       /*
> > > +        * Copy comment from sgx_encl_add_page() to maintain guidance in
> > > +        * this similar flow:
> > > +        * Adding to encl->va_pages must be done under encl->lock.  Ditto for
> > > +        * deleting (via sgx_encl_shrink()) in the error path.
> > > +        */
> > > +       if (va_page)
> > > +               list_add(&va_page->list, &encl->va_pages);
> > > +
> > > +       ret = xa_insert(&encl->page_array, PFN_DOWN(encl_page->desc),
> > > +                       encl_page, GFP_KERNEL);
> > > +       /*
> > > +        * If ret == -EBUSY then page was created in another flow while
> > > +        * running without encl->lock
> > > +        */
> > > +       if (ret)
> > > +               goto err_out_unlock;
> > > +
> > > +       pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
> > > +       pginfo.addr = encl_page->desc & PAGE_MASK;
> > > +       pginfo.metadata = 0;
> > > +
> > > +       ret = __eaug(&pginfo, sgx_get_epc_virt_addr(epc_page));
> > > +       if (ret)
> > > +               goto err_out;
> > > +
> > > +       encl_page->encl = encl;
> > > +       encl_page->epc_page = epc_page;
> > > +       encl_page->type = SGX_PAGE_TYPE_REG;
> > > +       encl->secs_child_cnt++;
> > > +
> > > +       sgx_mark_page_reclaimable(encl_page->epc_page);
> > > +
> > > +       phys_addr = sgx_get_epc_phys_addr(epc_page);
> > > +       /*
> > > +        * Do not undo everything when creating PTE entry fails - next #PF
> > > +        * would find page ready for a PTE.
> > > +        */
> > > +       vmret = vmf_insert_pfn(vma, addr, PFN_DOWN(phys_addr));
> > > +       if (vmret != VM_FAULT_NOPAGE) {
> > > +               mutex_unlock(&encl->lock);
> > > +               return VM_FAULT_SIGBUS;
> > > +       }
> > > +       mutex_unlock(&encl->lock);
> > > +       return VM_FAULT_NOPAGE;
> > > +
> > > +err_out:
> > > +       xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
> > > +
> > > +err_out_unlock:
> > > +       sgx_encl_shrink(encl, va_page);
> > > +       mutex_unlock(&encl->lock);
> > > +
> > > +err_out_free:
> > > +       sgx_encl_free_epc_page(epc_page);
> > > +       kfree(encl_page);
> > > +
> > > +       return VM_FAULT_SIGBUS;
> > > +}
> > > +
> > >  static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
> > >  {
> > >         unsigned long addr = (unsigned long)vmf->address;
> > > @@ -213,6 +326,17 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
> > >         if (unlikely(!encl))
> > >                 return VM_FAULT_SIGBUS;
> > >  
> > > +       /*
> > > +        * The page_array keeps track of all enclave pages, whether they
> > > +        * are swapped out or not. If there is no entry for this page and
> > > +        * the system supports SGX2 then it is possible to dynamically add
> > > +        * a new enclave page. This is only possible for an initialized
> > > +        * enclave that will be checked for right away.
> > > +        */
> > > +       if (cpu_feature_enabled(X86_FEATURE_SGX2) &&
> > > +           (!xa_load(&encl->page_array, PFN_DOWN(addr))))
> > > +               return sgx_encl_eaug_page(vma, encl, addr);
> > > +
> > >         mutex_lock(&encl->lock);
> > >  
> > >         entry = sgx_encl_load_page_in_vma(encl, addr, vma->vm_flags);
> > 
> > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
> 
> Tested-by: Jarkko Sakkinen <jarkko@kernel.org>

For what it's worth, I also get a full pass with our test suite,
where the runtime is using EAUG together with EACCEPTCOPY:

    Finished test [unoptimized + debuginfo] target(s) in 0.26s
     Running unittests src/main.rs (target/debug/deps/enarx-ee7f422740eab404)

running 7 tests
test backend::sgx::attestation::tests::request_target_info ... ignored
test backend::sev::snp::tests::test_const_id_macro ... ok
test backend::sev::snp::firmware::test::test_vcek_url ... ok
test backend::sgx::ioctls::tests::restrict_permissions ... ok
test cli::snp::tests::test_empty_cache_path ... ok
test workldr::wasmldr::test::is_builtin ... ok
test cli::snp::tests::test_get_or_write ... ok

test result: ok. 6 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 0.20s

     Running tests/c_integration_tests.rs (target/debug/deps/c_integration_tests-f7a69c2274f59f90)

running 21 tests
test get_att ... ignored
test bind ... ok
test clock_gettime ... ok
test close ... ok
test exit_one ... ok
test getegid ... ok
test geteuid ... ok
test sgx_get_att_quote ... ignored
test sgx_get_att_quote_size ... ignored
test exit_zero ... ok
test getgid ... ok
test write_emsgsize ... ignored
test write_stderr ... ignored
test getuid ... ok
test listen ... ok
test read ... ok
test read_udp ... ok
test readv ... ok
test socket ... ok
test uname ... ok
test write_stdout ... ok

test result: ok. 16 passed; 0 failed; 5 ignored; 0 measured; 0 filtered out; finished in 18.46s

     Running tests/rust_integration_tests.rs (target/debug/deps/rust_integration_tests-0122fb231e20ea63)

running 6 tests
test rust_sev_attestation ... ignored
test echo ... ok
test cpuid ... ok
test memory_stress_test ... ok
test memspike ... ok
test unix_echo ... ok

test result: ok. 5 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 48.22s

     Running tests/wasmldr_tests.rs (target/debug/deps/wasmldr_tests-98b6ff656b9d815e)

running 9 tests
test check_tcp ... ok
test hello_wasi_snapshot1 ... ok
test memspike ... ok
test echo has been running for over 60 seconds
test memory_stress_test has been running for over 60 seconds
test no_export has been running for over 60 seconds
test return_1 has been running for over 60 seconds
test wasi_snapshot1 has been running for over 60 seconds
test memory_stress_test ... ok
Error: default export in '' is not a function
test no_export ... ok
test return_1 ... ok
test wasi_snapshot1 ... ok
test zerooneone ... ok
test echo ... ok

test result: ok. 9 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 102.75s

BR, Jarkko

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges
  2022-04-06  6:35         ` Jarkko Sakkinen
@ 2022-04-06 17:50           ` Reinette Chatre
  0 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-06 17:50 UTC (permalink / raw)
  To: Jarkko Sakkinen, Dave Hansen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 4/5/2022 11:35 PM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 10:25 -0700, Dave Hansen wrote:
>> On 4/5/22 10:13, Reinette Chatre wrote:
>>>>> +void sgx_direct_reclaim(void)
>>>>> +{
>>>>> +       if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>>>>> +               sgx_reclaim_pages();
>>>>> +}
>>>> Please, instead open code this in both locations - not enough redundancy
>>>> to be worth a new function. Causes only unnecessary cross-referencing
>>>> when maintaining. Otherwise, I agree with the idea.
>>>>
>>> hmmm, that means the heart of the reclaimer (sgx_reclaim_pages()) would be
>>> made available for direct use from everywhere in the driver. I will look into this.
>>
>> I like the change.  It's not about reducing code redundancy, it's about
>> *describing* what the code does.  Each location could have:
>>
>>         /* Enter direct SGX reclaim: */
>>         if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
>>                 sgx_reclaim_pages();
>>
>> Or, it could just be:
>>
>>         sgx_direct_reclaim();
>>
>> Which also provides a logical choke point to add comments, like:
>>
>> /*
>>  * sgx_direct_reclaim() should be called in locations where SGX
>>  * memory resources might be low and might be needed in order
>>  * to make forward progress.
>>  */
>> void sgx_direct_reclaim(void)
>> {
>>         ...
> 
> Maybe splitting hairs, but could it be "sgx_reclaim_direct"? The rationale
> is easier grepping of reclaimer functions, e.g. when tracing.

Sure, will do.

This may not help grepping all reclaimer functions though since
the code is not consistent in this regard.
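
For reference, the wrapper under discussion can be sketched in user space
roughly as follows. This is only an illustration of the "choke point" idea
with the agreed name, not the driver code: the two counters stand in for the
kernel's free-page count and sgx_active_page_list, and the values of
SGX_NR_LOW_PAGES and SGX_NR_TO_SCAN are assumptions made for the example.

```c
#include <stdbool.h>

/* Assumed values for illustration; the driver defines its own. */
#define SGX_NR_LOW_PAGES	32
#define SGX_NR_TO_SCAN		16

/* Hypothetical stand-ins for the driver's free count and active list. */
static unsigned long sgx_nr_free_pages;
static unsigned long sgx_nr_active_pages;

/* Reclaim is worthwhile only when free pages are low and some are reclaimable. */
static bool sgx_should_reclaim(unsigned long watermark)
{
	return sgx_nr_free_pages < watermark && sgx_nr_active_pages > 0;
}

/* Mock reclaimer: moves up to SGX_NR_TO_SCAN pages from active to free. */
static void sgx_reclaim_pages(void)
{
	unsigned long n = sgx_nr_active_pages < SGX_NR_TO_SCAN ?
			  sgx_nr_active_pages : SGX_NR_TO_SCAN;

	sgx_nr_active_pages -= n;
	sgx_nr_free_pages += n;
}

/*
 * sgx_reclaim_direct() should be called in locations where SGX
 * memory resources might be low and might be needed in order
 * to make forward progress.
 */
void sgx_reclaim_direct(void)
{
	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
		sgx_reclaim_pages();
}
```

Callers then only see sgx_reclaim_direct(), and the watermark check plus
sgx_reclaim_pages() remain private to one file.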

Reinette

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 17/30] x86/sgx: Support modifying SGX page type
  2022-04-06  7:32             ` Jarkko Sakkinen
@ 2022-04-06 17:50               ` Reinette Chatre
  0 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-06 17:50 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86, seanjc,
	kai.huang, cathy.zhang, cedric.xing, haitao.huang, mark.shanahan,
	hpa, linux-kernel

Hi Jarkko,

On 4/6/2022 12:32 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 11:59 -0700, Reinette Chatre wrote:
>> Hi Jarkko,
>>
>> On 4/5/2022 11:41 AM, Jarkko Sakkinen wrote:
>>> On Tue, 2022-04-05 at 10:05 -0700, Reinette Chatre wrote:
>>>> Hi Jarkko,
>>>>
>>>> On 4/5/2022 8:34 AM, Jarkko Sakkinen wrote:
>>>>> On Tue, 2022-04-05 at 10:06 +0300, Jarkko Sakkinen wrote:
>>>>
>>>>>>>
>>>>>>
>>>>>> To be coherent with other names, this should be
>>>>>> SGX_IOC_ENCLAVE_MODIFY_TYPES.
>>>>
>>>> This is not such a clear change request to me:
>>>>
>>>> SGX_IOC_ENCLAVE_ADD_PAGES - add multiple pages
>>>> SGX_IOC_ENCLAVE_RESTRICT_PERMISSIONS - restrict multiple permissions
>>>> SGX_IOC_ENCLAVE_REMOVE_PAGES - remove multiple pages
>>>> SGX_IOC_ENCLAVE_MODIFY_TYPE - set a single type
>>>>
>>>> Perhaps it should rather be SGX_IOC_ENCLAVE_SET_TYPE to indicate that
>>>> there is a single target type as opposed to the possibility
>>>> of multiple source types (TCS and regular pages can be trimmed).
>>>>
>>>>>
>>
>> What is your opinion about what the ioctl() name should be? I prefer
>> to obtain a confirmation from you since you originally [1] requested
>> SGX_IOC_ENCLAVE_MODIFY_TYPE.
> 
> s/TYPE/TYPES/g i.e. SGX_IOC_ENCLAVE_MODIFY_TYPES is fine.

ok, thank you for confirming, will do.

>>
>>>>> This should take only page type given that flags are zeroed:
>>>>>
>>>>> EPCM(DS:RCX).R := 0;
>>>>> EPCM(DS:RCX).W := 0;
>>>>> EPCM(DS:RCX).X := 0; 
>>>>>
>>>>
>>>> ok, this was how it was done in V1 [1] and I can go back to that.
>>>
>>> I would name the fields as "flags" and "page_type" just to align
>>> names with SGX instead of trying to mimic "posix names". Otherwise,
>>> I support that.
>>
>> I will move this ioctl() to use "page_type" instead of "secinfo"
>> within struct sgx_enclave_modify_type.
>>
>> Your guidance of "flags" is not clear to me. I assume that you
>> refer to the field for struct sgx_enclave_restrict_permissions
>> where I think "permissions" to only contain the new permissions
>> would be more appropriate. None of the other values in
>> secinfo.flags are relevant.
> 
> I'm fine with your permissions field for the restrict ioctl.

Will do, thank you.

Reinette


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions
  2022-04-06  7:30                     ` Jarkko Sakkinen
@ 2022-04-06 17:51                       ` Reinette Chatre
  0 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-06 17:51 UTC (permalink / raw)
  To: Jarkko Sakkinen, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel

Hi Jarkko,

On 4/6/2022 12:30 AM, Jarkko Sakkinen wrote:
> On Tue, 2022-04-05 at 11:59 -0700, Reinette Chatre wrote:


>> I plan to replace the current "secinfo" field in struct sgx_enclave_restrict_permissions
>> with a new "permissions" field that contains only the permissions. Please let
>> me know if you have concerns with this (I also discuss this more in reply to
>> your other message related to the page type change ioctl()).
> 
> I'm cool with it, but if it is named "permissions", then
> it is already a software-defined entity, i.e. meaning that you should
> just have this check in place in the ioctl:
> 
> if (addp->permissions & ~(PROT_READ | PROT_WRITE | PROT_EXEC))
> 	return -EINVAL;
> 

I assume that we do still want to ensure that
PROT_READ is always set.

I was planning to keep it in the "SGX language" since
this is about changing EPCM permissions with values from
a runtime understanding SGX permissions in secinfo that will
be provided to hardware understanding SGX permissions in
secinfo.

Thus:

if (params.permissions & ~SGX_SECINFO_PERMISSION_MASK)
	return -EINVAL;

if (!(params.permissions & SGX_SECINFO_R))
	return -EINVAL;
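
As a stand-alone sketch of the two checks above, the following can be
compiled in user space. The helper name sgx_check_restrict_permissions() is
hypothetical, and the SGX_SECINFO_* values are written out here on the
assumption that they match the kernel's definitions (R, W and X in bits
0 to 2); the real ioctl() would operate on the params structure instead.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed to match the kernel's SECINFO permission bits. */
#define SGX_SECINFO_R			0x1ULL
#define SGX_SECINFO_W			0x2ULL
#define SGX_SECINFO_X			0x4ULL
#define SGX_SECINFO_PERMISSION_MASK	(SGX_SECINFO_R | SGX_SECINFO_W | \
					 SGX_SECINFO_X)

/*
 * Hypothetical helper: accept only SGX permission bits and require
 * that read permission is always part of the requested set.
 */
static int sgx_check_restrict_permissions(uint64_t permissions)
{
	/* Reject any bit outside of R, W and X. */
	if (permissions & ~SGX_SECINFO_PERMISSION_MASK)
		return -EINVAL;

	/* Read permission must always be requested. */
	if (!(permissions & SGX_SECINFO_R))
		return -EINVAL;

	return 0;
}
```

With this, R and R|W are accepted, while W alone, an empty value, or any
value with a non-permission bit set is rejected with -EINVAL.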


Reinette

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave
  2022-04-06  7:37       ` Jarkko Sakkinen
@ 2022-04-06 22:42         ` Reinette Chatre
  0 siblings, 0 replies; 79+ messages in thread
From: Reinette Chatre @ 2022-04-06 22:42 UTC (permalink / raw)
  To: Jarkko Sakkinen, dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: seanjc, kai.huang, cathy.zhang, cedric.xing, haitao.huang,
	mark.shanahan, hpa, linux-kernel, nathaniel, harald

Hi Jarkko,

On 4/6/2022 12:37 AM, Jarkko Sakkinen wrote:

...

> For what it's worth, I also get a full pass with our test suite,
> where the runtime is using EAUG together with EACCEPTCOPY:

Thank you very much for all the testing.

Haitao is also busy with significant testing and uncovered a bug via encountering
the WARN at arch/x86/kernel/cpu/sgx/ioctl.c:40. The issue is clear
and I prepared the fix below that can be applied on top of this series.
I plan to split this patch in two (1st the main change, 2nd the changes
to sgx_encl_eaug_page() that will be rolled into "x86/sgx: Support adding
of pages to an initialized enclave") and roll it into the next version
of this series:

-----8<-----

Subject: [PATCH] x86/sgx: Support VA page allocation without reclaiming

VA page allocation is done during enclave creation and adding of
pages to an enclave. In both usages VA pages are allocated
to always attempt a direct reclaim if no EPC pages are available and
because of this the VA pages are allocated without the enclave's mutex
held. Both usages are protected from concurrent attempts with an
atomic operation on SGX_ENCL_IOCTL making it possible to allocate
the VA pages without the enclave's mutex held.

Dynamically adding pages via the page fault handler does not
have the protection of SGX_ENCL_IOCTL but does not require
VA pages to be allocated with default direct reclaim.

Make VA page allocation with direct reclaim optional to make
it possible to perform allocation with enclave's mutex held
and thus protect against concurrent updates to encl->page_cnt.

Reported-by: Haitao Huang <haitao.huang@intel.com>
Tested-by: Haitao Huang <haitao.huang@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c  | 27 +++++++++++----------------
 arch/x86/kernel/cpu/sgx/encl.h  |  4 ++--
 arch/x86/kernel/cpu/sgx/ioctl.c |  8 ++++----
 3 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 7909570736a0..11f97fdcac1e 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -239,20 +239,14 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 		return VM_FAULT_SIGBUS;
 	}
 
-	va_page = sgx_encl_grow(encl);
+	mutex_lock(&encl->lock);
+
+	va_page = sgx_encl_grow(encl, false);
 	if (IS_ERR(va_page)) {
 		ret = PTR_ERR(va_page);
-		goto err_out_free;
+		goto err_out_unlock;
 	}
 
-	mutex_lock(&encl->lock);
-
-	/*
-	 * Copy comment from sgx_encl_add_page() to maintain guidance in
-	 * this similar flow:
-	 * Adding to encl->va_pages must be done under encl->lock.  Ditto for
-	 * deleting (via sgx_encl_shrink()) in the error path.
-	 */
 	if (va_page)
 		list_add(&va_page->list, &encl->va_pages);
 
@@ -263,7 +257,7 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 	 * running without encl->lock
 	 */
 	if (ret)
-		goto err_out_unlock;
+		goto err_out_shrink;
 
 	pginfo.secs = (unsigned long)sgx_get_epc_virt_addr(encl->secs.epc_page);
 	pginfo.addr = encl_page->desc & PAGE_MASK;
@@ -296,11 +290,10 @@ static vm_fault_t sgx_encl_eaug_page(struct vm_area_struct *vma,
 err_out:
 	xa_erase(&encl->page_array, PFN_DOWN(encl_page->desc));
 
-err_out_unlock:
+err_out_shrink:
 	sgx_encl_shrink(encl, va_page);
+err_out_unlock:
 	mutex_unlock(&encl->lock);
-
-err_out_free:
 	sgx_encl_free_epc_page(epc_page);
 	kfree(encl_page);
 
@@ -998,6 +991,8 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
 
 /**
  * sgx_alloc_va_page() - Allocate a Version Array (VA) page
+ * @reclaim: Reclaim EPC pages directly if none available. Enclave
+ *           mutex should not be held if this is set.
  *
  * Allocate a free EPC page and convert it to a Version Array (VA) page.
  *
@@ -1005,12 +1000,12 @@ void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr)
  *   a VA page,
  *   -errno otherwise
  */
-struct sgx_epc_page *sgx_alloc_va_page(void)
+struct sgx_epc_page *sgx_alloc_va_page(bool reclaim)
 {
 	struct sgx_epc_page *epc_page;
 	int ret;
 
-	epc_page = sgx_alloc_epc_page(NULL, true);
+	epc_page = sgx_alloc_epc_page(NULL, reclaim);
 	if (IS_ERR(epc_page))
 		return ERR_CAST(epc_page);
 
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index 253ebdd1c5be..66adb8faec45 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -116,14 +116,14 @@ struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
 					  unsigned long offset,
 					  u64 secinfo_flags);
 void sgx_zap_enclave_ptes(struct sgx_encl *encl, unsigned long addr);
-struct sgx_epc_page *sgx_alloc_va_page(void);
+struct sgx_epc_page *sgx_alloc_va_page(bool reclaim);
 unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page);
 void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset);
 bool sgx_va_page_full(struct sgx_va_page *va_page);
 void sgx_encl_free_epc_page(struct sgx_epc_page *page);
 struct sgx_encl_page *sgx_encl_load_page(struct sgx_encl *encl,
 					 unsigned long addr);
-struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl);
+struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim);
 void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page);
 
 #endif /* _X86_ENCL_H */
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index f88bc1236276..b163afff239f 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -17,7 +17,7 @@
 #include "encl.h"
 #include "encls.h"
 
-struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
+struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl, bool reclaim)
 {
 	struct sgx_va_page *va_page = NULL;
 	void *err;
@@ -30,7 +30,7 @@ struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl)
 		if (!va_page)
 			return ERR_PTR(-ENOMEM);
 
-		va_page->epc_page = sgx_alloc_va_page();
+		va_page->epc_page = sgx_alloc_va_page(reclaim);
 		if (IS_ERR(va_page->epc_page)) {
 			err = ERR_CAST(va_page->epc_page);
 			kfree(va_page);
@@ -64,7 +64,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
 	struct file *backing;
 	long ret;
 
-	va_page = sgx_encl_grow(encl);
+	va_page = sgx_encl_grow(encl, true);
 	if (IS_ERR(va_page))
 		return PTR_ERR(va_page);
 	else if (va_page)
@@ -275,7 +275,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long src,
 		return PTR_ERR(epc_page);
 	}
 
-	va_page = sgx_encl_grow(encl);
+	va_page = sgx_encl_grow(encl, true);
 	if (IS_ERR(va_page)) {
 		ret = PTR_ERR(va_page);
 		goto err_out_free;



end of thread, other threads:[~2022-04-06 22:43 UTC | newest]

Thread overview: 79+ messages
2022-04-04 16:49 [PATCH V3 00/30] x86/sgx and selftests/sgx: Support SGX2 Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 01/30] x86/sgx: Add short descriptions to ENCLS wrappers Reinette Chatre
2022-04-05  6:52   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 02/30] x86/sgx: Add wrapper for SGX2 EMODPR function Reinette Chatre
2022-04-05  6:53   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 03/30] x86/sgx: Add wrapper for SGX2 EMODT function Reinette Chatre
2022-04-05  6:53   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 04/30] x86/sgx: Add wrapper for SGX2 EAUG function Reinette Chatre
2022-04-05  6:54   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 05/30] x86/sgx: Support loading enclave page without VMA permissions check Reinette Chatre
2022-04-05  6:56   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 06/30] x86/sgx: Export sgx_encl_ewb_cpumask() Reinette Chatre
2022-04-05  6:56   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 07/30] x86/sgx: Rename sgx_encl_ewb_cpumask() as sgx_encl_cpumask() Reinette Chatre
2022-04-05  6:57   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 08/30] x86/sgx: Move PTE zap code to new sgx_zap_enclave_ptes() Reinette Chatre
2022-04-05  6:59   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 09/30] x86/sgx: Make sgx_ipi_cb() available internally Reinette Chatre
2022-04-05  6:59   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 10/30] x86/sgx: Create utility to validate user provided offset and length Reinette Chatre
2022-04-05  7:00   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 11/30] x86/sgx: Keep record of SGX page type Reinette Chatre
2022-04-05  7:00   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 12/30] x86/sgx: Export sgx_encl_{grow,shrink}() Reinette Chatre
2022-04-05  7:04   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 13/30] x86/sgx: Export sgx_encl_page_alloc() Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 14/30] x86/sgx: Support restricting of enclave page permissions Reinette Chatre
2022-04-05  5:03   ` Jarkko Sakkinen
2022-04-05  5:07     ` Jarkko Sakkinen
2022-04-05 13:40       ` Jarkko Sakkinen
2022-04-05 14:19         ` Jarkko Sakkinen
2022-04-05 14:27           ` Jarkko Sakkinen
2022-04-05 14:52             ` Jarkko Sakkinen
2022-04-05 16:49               ` Reinette Chatre
2022-04-05 18:39                 ` Jarkko Sakkinen
2022-04-05 18:59                   ` Reinette Chatre
2022-04-06  7:30                     ` Jarkko Sakkinen
2022-04-06 17:51                       ` Reinette Chatre
2022-04-05 16:40             ` Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 15/30] x86/sgx: Support adding of pages to an initialized enclave Reinette Chatre
2022-04-05  5:05   ` Jarkko Sakkinen
2022-04-05 10:03     ` Jarkko Sakkinen
2022-04-06  7:37       ` Jarkko Sakkinen
2022-04-06 22:42         ` Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 16/30] x86/sgx: Tighten accessible memory range after enclave initialization Reinette Chatre
2022-04-05  7:05   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 17/30] x86/sgx: Support modifying SGX page type Reinette Chatre
2022-04-05  7:06   ` Jarkko Sakkinen
2022-04-05 15:34     ` Jarkko Sakkinen
2022-04-05 17:05       ` Reinette Chatre
2022-04-05 18:41         ` Jarkko Sakkinen
2022-04-05 18:59           ` Reinette Chatre
2022-04-06  7:32             ` Jarkko Sakkinen
2022-04-06 17:50               ` Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 18/30] x86/sgx: Support complete page removal Reinette Chatre
2022-04-05  7:08   ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 19/30] x86/sgx: Free up EPC pages directly to support large page ranges Reinette Chatre
2022-04-05  7:11   ` Jarkko Sakkinen
2022-04-05 17:13     ` Reinette Chatre
2022-04-05 17:25       ` Dave Hansen
2022-04-06  6:35         ` Jarkko Sakkinen
2022-04-06 17:50           ` Reinette Chatre
2022-04-05 18:42       ` Jarkko Sakkinen
2022-04-05 19:56         ` Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 20/30] Documentation/x86: Introduce enclave runtime management section Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 21/30] selftests/sgx: Add test for EPCM permission changes Reinette Chatre
2022-04-05  7:02   ` Jarkko Sakkinen
2022-04-05  7:03     ` Jarkko Sakkinen
2022-04-05 17:28     ` Reinette Chatre
2022-04-05 18:43       ` Jarkko Sakkinen
2022-04-04 16:49 ` [PATCH V3 22/30] selftests/sgx: Add test for TCS page " Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 23/30] selftests/sgx: Test two different SGX2 EAUG flows Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 24/30] selftests/sgx: Introduce dynamic entry point Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 25/30] selftests/sgx: Introduce TCS initialization enclave operation Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 26/30] selftests/sgx: Test complete changing of page type flow Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 27/30] selftests/sgx: Test faulty enclave behavior Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 28/30] selftests/sgx: Test invalid access to removed enclave page Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 29/30] selftests/sgx: Test reclaiming of untouched page Reinette Chatre
2022-04-04 16:49 ` [PATCH V3 30/30] selftests/sgx: Page removal stress test Reinette Chatre
