linux-sgx.vger.kernel.org archive mirror
* [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM
@ 2019-06-06  2:11 Sean Christopherson
  2019-06-06  2:11 ` [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect() Sean Christopherson
                   ` (5 more replies)
  0 siblings, 6 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-06  2:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

This series is the result of a rather absurd amount of discussion over
how to get SGX to play nice with LSM policies, without having to resort
to evil shenanigans or put undue burden on userspace.  Discussions are
still ongoing, e.g. folks are exploring alternatives to changing the
proposed SGX UAPI, but I wanted to get this updated version of the code
posted to show a fairly minimal implementation (from a kernel perspective),
e.g. the diff stats aren't too scary, especially considering 50% of the
added lines are comments.

This series is a delta to Jarkko's ongoing SGX series and applies on
Jarkko's current master at https://github.com/jsakkine-intel/linux-sgx.git:

  dfc89a83b5bc ("docs: x86/sgx: Document the enclave API")

The basic gist of the approach is to track an enclave's page protections
separately from any vmas that map the page, and separate from the hardware
enforced protections.  The SGX UAPI is modified to require userspace to
explicitly define the protections for each enclave page, i.e. the ioctl
to add pages to an enclave is extended to take PROT_{READ,WRITE,EXEC}
flags.
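
As a rough userspace-side illustration of the extended ioctl argument, the
sketch below mirrors the struct layout from patch 2/5 using <stdint.h> types
in place of the kernel's __u64/__u32/__u16.  The helper fill_add_page() is
hypothetical and not part of the series; only the field names and the packed
layout are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Userspace mirror of struct sgx_enclave_add_page as extended in 2/5. */
struct sgx_enclave_add_page {
	uint64_t addr;		/* address within the ELRANGE */
	uint64_t src;		/* address of the page data */
	uint64_t secinfo;	/* address of the SECINFO data */
	uint32_t flags;		/* PROT_{READ,WRITE,EXEC} */
	uint16_t mrmask;	/* bitmask for the measured 256 byte chunks */
} __attribute__((__packed__));

/* Hypothetical helper: fill the ioctl argument for a single page. */
static void fill_add_page(struct sgx_enclave_add_page *p, uint64_t addr,
			  uint64_t src, uint64_t secinfo, uint32_t prot,
			  uint16_t mrmask)
{
	/* The driver masks flags down to the three PROT_* bits. */
	assert(!(prot & ~(uint32_t)(PROT_READ | PROT_WRITE | PROT_EXEC)));

	memset(p, 0, sizeof(*p));
	p->addr = addr;
	p->src = src;
	p->secinfo = secinfo;
	p->flags = prot;
	p->mrmask = mrmask;
}
```

Because the struct is packed, the new 32-bit field sits directly after
secinfo and is followed immediately by mrmask, with no padding.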

An enclave page's protections are the maximal protections that userspace
can use to map the page, e.g. mprotect() and mmap() are rejected if the
protections for the vma would be more permissive than those of the
associated enclave page.

Tracking protections for an enclave page (in addition to vmas) allows
SGX to invoke LSM upcalls while the enclave is being built.  This is
critical to enabling LSMs to implement policies for enclave pages that
are functionally equivalent to existing policies for normal pages.

v1: https://lkml.kernel.org/r/20190531233159.30992-1-sean.j.christopherson@intel.com

v2:
  - Dropped the patch(es) to extend the SGX UAPI to allow adding multiple
    enclave pages in a single syscall [Jarkko].

  - Reject ioctl() immediately on LSM denial [Stephen].

  - Rework SELinux code to avoid checking EXECMEM multiple times [Stephen].

  - Add missing equivalents to existing selinux_file_mprotect() checks
    [Stephen].

  - Hold mmap_sem across copy_from_user() to prevent a TOCTOU race when
    checking the source vma [Stephen].

  - Stubify security_enclave_load() if !CONFIG_SECURITY [Stephen].

  - Make flags a 32-bit field [Andy].

  - Don't validate the SECINFO protection flags against the enclave
    page's protection flags [Andy].

  - Rename mprotect() hook to may_mprotect() [Andy].

  - Test 'vma->vm_flags & VM_MAYEXEC' instead of manually checking for
    a noexec path [Jarkko].

  - Drop the SGX defined flags (use PROT_*) [Jarkko].

  - Improve comments and changelogs [Jarkko].

Sean Christopherson (5):
  mm: Introduce vm_ops->may_mprotect()
  x86/sgx: Require userspace to define enclave pages' protection bits
  x86/sgx: Enforce noexec filesystem restriction for enclaves
  LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX
  security/selinux: Add enclave_load() implementation

 arch/x86/include/uapi/asm/sgx.h        |  2 +
 arch/x86/kernel/cpu/sgx/driver/ioctl.c | 57 ++++++++++++++++++---
 arch/x86/kernel/cpu/sgx/driver/main.c  |  5 ++
 arch/x86/kernel/cpu/sgx/encl.c         | 53 ++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h         |  4 ++
 include/linux/lsm_hooks.h              | 13 +++++
 include/linux/mm.h                     |  2 +
 include/linux/security.h               | 12 +++++
 mm/mprotect.c                          | 15 ++++--
 security/security.c                    |  7 +++
 security/selinux/hooks.c               | 69 ++++++++++++++++++++++++++
 11 files changed, 228 insertions(+), 11 deletions(-)

-- 
2.21.0



* [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect()
  2019-06-06  2:11 [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM Sean Christopherson
@ 2019-06-06  2:11 ` Sean Christopherson
  2019-06-10 15:06   ` Jarkko Sakkinen
  2019-06-06  2:11 ` [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits Sean Christopherson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-06  2:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

SGX will use the may_mprotect() hook to prevent userspace from
circumventing various security checks, e.g. Linux Security Modules.
Naming it may_mprotect() instead of simply mprotect() is intended to
reflect the hook's purpose as a way to gate mprotect() as opposed to
a wholesale replacement.

Enclaves are built by copying data from normal memory into the Enclave
Page Cache (EPC).  Due to the nature of SGX, the EPC is represented by a
single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
MAP_SHARED vm_file that references a single file path.  Furthermore, all
enclaves will need read, write and execute pages in the EPC.

As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
can deny access to the EPC as a whole, but can't deny PROT_EXEC on a page
that originated in a non-EXECUTE file (which is long gone by the time
mprotect() is called).

By hooking mprotect(), SGX can make explicit LSM upcalls while an
enclave is being built, i.e. when the kernel has a handle to the origin of
each enclave page, and enforce the result of the LSM policy whenever
userspace maps the enclave page in the future.
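
The gating order can be modelled outside the kernel roughly as follows.
This is only a sketch: struct vm_ops, struct vma, mprotect_one_vma() and
fake_security_file_mprotect() are reduced, hypothetical stand-ins for the
real mm structures and for security_file_mprotect(), kept just detailed
enough to show that the vma's own hook runs first on the clamped range and
that the generic LSM hook is skipped on denial.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vma;

/* Reduced stand-in for vm_operations_struct: only the new hook. */
struct vm_ops {
	int (*may_mprotect)(struct vma *vma, unsigned long start,
			    unsigned long end, unsigned long prot);
};

struct vma {
	unsigned long vm_start, vm_end;
	const struct vm_ops *vm_ops;	/* may be NULL */
};

/* Stand-in for security_file_mprotect(); counts how often it runs. */
static int lsm_calls;

static int fake_security_file_mprotect(struct vma *vma, unsigned long prot)
{
	(void)vma;
	(void)prot;
	lsm_calls++;
	return 0;
}

/* Model of the per-vma step in do_mprotect_pkey() with the hook added. */
static int mprotect_one_vma(struct vma *vma, unsigned long start,
			    unsigned long end, unsigned long prot)
{
	/* Clamp the range to this vma before asking anyone. */
	unsigned long tmp = vma->vm_end < end ? vma->vm_end : end;
	int error;

	assert(start < end);

	/* The owner of the vma gets first say over the clamped range. */
	if (vma->vm_ops && vma->vm_ops->may_mprotect) {
		error = vma->vm_ops->may_mprotect(vma, start, tmp, prot);
		if (error)
			return error;
	}

	/* Only then is the generic LSM hook consulted. */
	return fake_security_file_mprotect(vma, prot);
}

/* Example hook that refuses every request. */
static int deny_all(struct vma *vma, unsigned long start,
		    unsigned long end, unsigned long prot)
{
	(void)vma;
	(void)start;
	(void)end;
	(void)prot;
	return -EACCES;
}
```

A vma without vm_ops, or without the hook, behaves exactly as before.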

Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
that approach is quite ugly, e.g. would require userspace to call an
SGX ioctl() prior to using mprotect() to extend a page's protections.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 include/linux/mm.h |  2 ++
 mm/mprotect.c      | 15 +++++++++++----
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0e8834ac32b7..a697996040ac 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -458,6 +458,8 @@ struct vm_operations_struct {
 	void (*close)(struct vm_area_struct * area);
 	int (*split)(struct vm_area_struct * area, unsigned long addr);
 	int (*mremap)(struct vm_area_struct * area);
+	int (*may_mprotect)(struct vm_area_struct * area, unsigned long start,
+			    unsigned long end, unsigned long prot);
 	vm_fault_t (*fault)(struct vm_fault *vmf);
 	vm_fault_t (*huge_fault)(struct vm_fault *vmf,
 			enum page_entry_size pe_size);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index bf38dfbbb4b4..18732543b295 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -547,13 +547,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 			goto out;
 		}
 
-		error = security_file_mprotect(vma, reqprot, prot);
-		if (error)
-			goto out;
-
 		tmp = vma->vm_end;
 		if (tmp > end)
 			tmp = end;
+
+		if (vma->vm_ops && vma->vm_ops->may_mprotect) {
+			error = vma->vm_ops->may_mprotect(vma, nstart, tmp, prot);
+			if (error)
+				goto out;
+		}
+
+		error = security_file_mprotect(vma, reqprot, prot);
+		if (error)
+			goto out;
+
 		error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
 		if (error)
 			goto out;
-- 
2.21.0



* [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-06  2:11 [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM Sean Christopherson
  2019-06-06  2:11 ` [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect() Sean Christopherson
@ 2019-06-06  2:11 ` Sean Christopherson
  2019-06-10 15:27   ` Jarkko Sakkinen
  2019-06-10 18:29   ` Xing, Cedric
  2019-06-06  2:11 ` [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves Sean Christopherson
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-06  2:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

Existing Linux Security Module policies restrict userspace's ability to
map memory, e.g. may require privileged permissions to map a page that
is simultaneously writable and executable.  Said permissions are often
tied to the file which backs the mapped memory, i.e. vm_file.

For reasons explained below, SGX does not allow LSMs to enforce policies
using existing LSM hooks such as file_mprotect().  Explicitly track the
protection bits for an enclave page (separate from the vma/pte bits) and
require userspace to explicitly define each page's protection bits when the
page is added to the enclave.  Enclave page protection bits pave the way
for adding security_enclave_load() as an SGX equivalent to file_mprotect(),
e.g. SGX can pass the page's protection bits and source vma to LSMs.
The source vma will allow LSMs to tie permissions to files, e.g. the
file containing the enclave's code and initial data, and the protection
bits will allow LSMs to make decisions based on the capabilities of the
enclave, e.g. if a page can be converted from RW to RX.

Due to the nature of the Enclave Page Cache, and because the EPC is
manually managed by SGX, all enclave vmas are backed by the same file,
i.e. /dev/sgx/enclave.  Specifically, a single file allows SGX to use
file op hooks to move pages in/out of the EPC.

Furthermore, EPC pages for any given enclave are fundamentally shared
between processes, i.e. CoW semantics are not possible with EPC pages
due to hardware restrictions such as 1:1 mappings between virtual and
physical addresses (within the enclave).

Lastly, all real world enclaves will need read, write and execute
permissions to EPC pages.

As a result, SGX does not play nice with existing LSM behavior as it is
impossible to apply policies to enclaves with reasonable granularity,
e.g. an LSM can deny access to EPC altogether, but can't deny
potentially unwanted behavior such as mapping pages RW->RX or RWX.

For example, because all (practical) enclaves need RW pages for data and
RX pages for code, SELinux's existing policies will require all enclaves
to have FILE__READ, FILE__WRITE and FILE__EXECUTE permissions on
/dev/sgx/enclave.  Withholding FILE__WRITE or FILE__EXECUTE in an attempt
to deny RW->RX or RWX would prevent running *any* enclave, even those
that cleanly separate RW and RX pages.  And because /dev/sgx/enclave
requires MAP_SHARED, the anonymous/CoW checks that would trigger
FILE__EXECMOD or PROCESS__EXECMEM permissions will never fire.

Taking protection bits has a second use in that it can be used to
prevent loading an enclave from a noexec file system.  On SGX2 hardware,
regardless of kernel support for SGX2, userspace could EADD a page from
a noexec path using read-only permissions and later mprotect() and
ENCLU[EMODPE] the page to gain execute permissions.  By requiring
the enclave's page protections up front, SGX will be able to enforce
noexec paths when building enclaves.

To prevent userspace from circumventing the allowed protections, do not
allow PROT_{READ,WRITE,EXEC} mappings to an enclave without an
associated enclave page, i.e. prevent creating a mapping with unchecked
protection bits.
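
The rule can be sketched in plain C as below.  This is a toy model, not the
driver code: the fixed-size pages[] array stands in for the enclave's page
tree, the "locked" flag stands in for encl->lock, and the names struct encl,
struct encl_page, lookup() and map_allowed() are hypothetical.  The sketch
funnels both outcomes through a single exit so that the stand-in lock is
dropped on denial as well as on success.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)

/* Same bit values as the kernel's VM_READ/VM_WRITE/VM_EXEC. */
#define VM_READ		0x1UL
#define VM_WRITE	0x2UL
#define VM_EXEC		0x4UL

#define NR_PAGES	4	/* toy enclave size, hypothetical */

struct encl_page {
	int present;		/* page has been added to the enclave */
	unsigned long prot;	/* maximal protections declared at add time */
};

struct encl {
	unsigned long base;	/* page-aligned start of the toy ELRANGE */
	int locked;		/* stand-in for encl->lock */
	struct encl_page pages[NR_PAGES];
};

/* Return the enclave page backing addr, or NULL if none was added. */
static const struct encl_page *lookup(const struct encl *encl,
				      unsigned long addr)
{
	unsigned long idx;

	if (addr < encl->base)
		return NULL;
	idx = (addr - encl->base) >> PAGE_SHIFT;
	if (idx >= NR_PAGES || !encl->pages[idx].present)
		return NULL;
	return &encl->pages[idx];
}

static int map_allowed(struct encl *encl, unsigned long start,
		       unsigned long end, unsigned long prot)
{
	const struct encl_page *page;
	unsigned long addr;
	int ret = 0;

	prot &= VM_READ | VM_WRITE | VM_EXEC;
	if (!prot)
		return 0;	/* mappings without R/W/X are always fine */

	encl->locked = 1;
	for (addr = start; addr < end; addr += PAGE_SIZE) {
		page = lookup(encl, addr);

		/* No R|W|X on a missing page, nor beyond the page's max. */
		if (!page || (prot & ~page->prot)) {
			ret = -EACCES;
			break;
		}
	}
	encl->locked = 0;

	return ret;
}
```

With an RW page followed by an RX page, a read-only mapping across both is
accepted, while an RW mapping across both is refused because the second
page never declared PROT_WRITE.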

Alternatively, SGX could pre-check what transitions are/aren't allowed
using some form of proxy for the enclave, e.g. its sigstruct, and
dynamically track protections in the SGX driver.  Dynamically tracking
protections and pre-checking permissions has several drawbacks:

  - Complicates the SGX implementation due to the need to coordinate
    tracking across multiple mm structs and vmas.

  - LSM auditing would log denials that never manifest in failure.

  - Requires additional SGX specific flags/definitions be passed to/from
    LSMs.

A second alternative would be to again use sigstruct as a proxy for the
enclave when performing access control checks, but hold a reference to
the sigstruct file and perform LSM checks during mmap()/mprotect() as
opposed to pre-checking permissions at enclave build time.  The big
downside to this approach is that it effectively requires userspace to
place sigstruct in a file, and the SGX driver must "pin" said file by
holding a reference to the file for the lifetime of the enclave.

A third alternative would be to pull the protection bits from the page's
SECINFO, i.e. make decisions based on the protections enforced by
hardware.  However, with SGX2, userspace can extend the hardware-
enforced protections via ENCLU[EMODPE], e.g. can add a page as RW and
later convert it to RX.  With SGX2, making a decision based on the
initial protections would either create a security hole or force SGX to
dynamically track "dirty" pages (see first alternative above).

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/include/uapi/asm/sgx.h        |  2 +
 arch/x86/kernel/cpu/sgx/driver/ioctl.c | 14 +++++--
 arch/x86/kernel/cpu/sgx/driver/main.c  |  5 +++
 arch/x86/kernel/cpu/sgx/encl.c         | 53 ++++++++++++++++++++++++++
 arch/x86/kernel/cpu/sgx/encl.h         |  4 ++
 5 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/uapi/asm/sgx.h b/arch/x86/include/uapi/asm/sgx.h
index 9ed690a38c70..2c6198ffeaf8 100644
--- a/arch/x86/include/uapi/asm/sgx.h
+++ b/arch/x86/include/uapi/asm/sgx.h
@@ -37,12 +37,14 @@ struct sgx_enclave_create  {
  * @addr:	address within the ELRANGE
  * @src:	address for the page data
  * @secinfo:	address for the SECINFO data
+ * @flags:	flags, e.g. PROT_{READ,WRITE,EXEC}
  * @mrmask:	bitmask for the measured 256 byte chunks
  */
 struct sgx_enclave_add_page {
 	__u64	addr;
 	__u64	src;
 	__u64	secinfo;
+	__u32	flags;
 	__u16	mrmask;
 } __attribute__((__packed__));
 
diff --git a/arch/x86/kernel/cpu/sgx/driver/ioctl.c b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
index a27ec26a9350..ef5c2ce0f37b 100644
--- a/arch/x86/kernel/cpu/sgx/driver/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
@@ -235,7 +235,8 @@ static int sgx_validate_secs(const struct sgx_secs *secs,
 }
 
 static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
-						 unsigned long addr)
+						 unsigned long addr,
+						 unsigned long prot)
 {
 	struct sgx_encl_page *encl_page;
 	int ret;
@@ -247,6 +248,7 @@ static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
 		return ERR_PTR(-ENOMEM);
 	encl_page->desc = addr;
 	encl_page->encl = encl;
+	encl_page->prot = prot;
 	ret = radix_tree_insert(&encl->page_tree, PFN_DOWN(encl_page->desc),
 				encl_page);
 	if (ret) {
@@ -531,7 +533,7 @@ static int __sgx_encl_add_page(struct sgx_encl *encl,
 
 static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long addr,
 			     void *data, struct sgx_secinfo *secinfo,
-			     unsigned int mrmask)
+			     unsigned int mrmask, unsigned long prot)
 {
 	u64 page_type = secinfo->flags & SGX_SECINFO_PAGE_TYPE_MASK;
 	struct sgx_encl_page *encl_page;
@@ -557,7 +559,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long addr,
 		goto out;
 	}
 
-	encl_page = sgx_encl_page_alloc(encl, addr);
+	encl_page = sgx_encl_page_alloc(encl, addr, prot);
 	if (IS_ERR(encl_page)) {
 		ret = PTR_ERR(encl_page);
 		goto out;
@@ -599,6 +601,7 @@ static long sgx_ioc_enclave_add_page(struct file *filep, unsigned int cmd,
 	struct sgx_enclave_add_page *addp = (void *)arg;
 	struct sgx_encl *encl = filep->private_data;
 	struct sgx_secinfo secinfo;
+	unsigned long prot;
 	struct page *data_page;
 	void *data;
 	int ret;
@@ -618,7 +621,10 @@ static long sgx_ioc_enclave_add_page(struct file *filep, unsigned int cmd,
 		goto out;
 	}
 
-	ret = sgx_encl_add_page(encl, addp->addr, data, &secinfo, addp->mrmask);
+	prot = addp->flags & (PROT_READ | PROT_WRITE | PROT_EXEC);
+
+	ret = sgx_encl_add_page(encl, addp->addr, data, &secinfo, addp->mrmask,
+				prot);
 	if (ret)
 		goto out;
 
diff --git a/arch/x86/kernel/cpu/sgx/driver/main.c b/arch/x86/kernel/cpu/sgx/driver/main.c
index 129d356aff30..65a87c2fdf02 100644
--- a/arch/x86/kernel/cpu/sgx/driver/main.c
+++ b/arch/x86/kernel/cpu/sgx/driver/main.c
@@ -63,6 +63,11 @@ static long sgx_compat_ioctl(struct file *filep, unsigned int cmd,
 static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct sgx_encl *encl = file->private_data;
+	int ret;
+
+	ret = sgx_map_allowed(encl, vma->vm_start, vma->vm_end, vma->vm_flags);
+	if (ret)
+		return ret;
 
 	vma->vm_ops = &sgx_vm_ops;
 	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 7216bdf07bd0..a5a412220058 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -235,6 +235,58 @@ static void sgx_vma_close(struct vm_area_struct *vma)
 	kref_put(&encl->refcount, sgx_encl_release);
 }
 
+
+/**
+ * sgx_map_allowed - check vma protections against the associated enclave page
+ * @encl:	an enclave
+ * @start:	start address of the mapping (inclusive)
+ * @end:	end address of the mapping (exclusive)
+ * @prot:	protection bits of the mapping
+ *
+ * Verify a userspace mapping to an enclave page would not violate the security
+ * requirements of the *kernel*.  Note, this is in no way related to the
+ * page protections enforced by hardware via the EPCM.  The EPCM protections
+ * can be directly extended by the enclave, i.e. cannot be relied upon by the
+ * kernel for security guarantees of any kind.
+ *
+ * Return:
+ *   0 on success,
+ *   -EACCES if the mapping is disallowed
+ */
+int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
+		    unsigned long end, unsigned long prot)
+{
+	struct sgx_encl_page *page;
+	unsigned long addr;
+
+	prot &= (VM_READ | VM_WRITE | VM_EXEC);
+	if (!prot || !encl)
+		return 0;
+
+	mutex_lock(&encl->lock);
+
+	for (addr = start; addr < end; addr += PAGE_SIZE) {
+		page = radix_tree_lookup(&encl->page_tree, addr >> PAGE_SHIFT);
+
+		/*
+		 * Do not allow R|W|X to a non-existent page, or protections
+		 * beyond those of the existing enclave page.
+		 */
+		if (!page || (prot & ~page->prot))
+			return -EACCES;
+	}
+
+	mutex_unlock(&encl->lock);
+
+	return 0;
+}
+
+static int sgx_vma_mprotect(struct vm_area_struct *vma, unsigned long start,
+			    unsigned long end, unsigned long prot)
+{
+	return sgx_map_allowed(vma->vm_private_data, start, end, prot);
+}
+
 static unsigned int sgx_vma_fault(struct vm_fault *vmf)
 {
 	unsigned long addr = (unsigned long)vmf->address;
@@ -372,6 +424,7 @@ static int sgx_vma_access(struct vm_area_struct *vma, unsigned long addr,
 const struct vm_operations_struct sgx_vm_ops = {
 	.close = sgx_vma_close,
 	.open = sgx_vma_open,
+	.may_mprotect = sgx_vma_mprotect,
 	.fault = sgx_vma_fault,
 	.access = sgx_vma_access,
 };
diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
index c557f0374d74..176467c0eb22 100644
--- a/arch/x86/kernel/cpu/sgx/encl.h
+++ b/arch/x86/kernel/cpu/sgx/encl.h
@@ -41,6 +41,7 @@ enum sgx_encl_page_desc {
 
 struct sgx_encl_page {
 	unsigned long desc;
+	unsigned long prot;
 	struct sgx_epc_page *epc_page;
 	struct sgx_va_page *va_page;
 	struct sgx_encl *encl;
@@ -106,6 +107,9 @@ static inline unsigned long sgx_pcmd_offset(pgoff_t page_index)
 	       sizeof(struct sgx_pcmd);
 }
 
+int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
+		    unsigned long end, unsigned long prot);
+
 enum sgx_encl_mm_iter {
 	SGX_ENCL_MM_ITER_DONE		= 0,
 	SGX_ENCL_MM_ITER_NEXT		= 1,
-- 
2.21.0



* [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves
  2019-06-06  2:11 [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM Sean Christopherson
  2019-06-06  2:11 ` [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect() Sean Christopherson
  2019-06-06  2:11 ` [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits Sean Christopherson
@ 2019-06-06  2:11 ` Sean Christopherson
  2019-06-10 16:00   ` Jarkko Sakkinen
  2019-06-06  2:11 ` [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX Sean Christopherson
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-06  2:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

Do not allow an enclave page to be mapped with PROT_EXEC if the source
vma does not have VM_MAYEXEC.  This effectively enforces noexec as
do_mmap() clears VM_MAYEXEC if the vma is being loaded from a noexec
path, i.e. prevents executing a file by loading it into an enclave.
Checking noexec indirectly by way of VM_MAYEXEC naturally handles any
other cases that clear VM_MAYEXEC to deny execute permissions.
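
Stripped of locking and the actual copy, the decision reduces to a
two-flag test.  The sketch below is illustrative only; the function name
enclave_exec_check() is hypothetical, and the VM_* constants are redefined
locally with the same bit values the kernel uses.

```c
#include <assert.h>
#include <errno.h>

/* Same bit values as the kernel's vm_flags definitions. */
#define VM_EXEC		0x00000004UL
#define VM_MAYEXEC	0x00000040UL

/*
 * src_vm_flags: vm_flags of the vma holding the source data.
 * prot:         protections requested for the enclave page.
 */
static int enclave_exec_check(unsigned long src_vm_flags, unsigned long prot)
{
	/* Non-executable enclave pages may come from anywhere. */
	if (!(prot & VM_EXEC))
		return 0;

	/* Executable pages need a source that could itself be executable. */
	if (!(src_vm_flags & VM_MAYEXEC))
		return -EACCES;

	return 0;
}
```

A source vma created from a noexec mount never has VM_MAYEXEC set, so any
request for an executable enclave page from that source is refused.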

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/driver/ioctl.c | 47 +++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/driver/ioctl.c b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
index ef5c2ce0f37b..44b2d73de7c3 100644
--- a/arch/x86/kernel/cpu/sgx/driver/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
@@ -577,6 +577,44 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long addr,
 	return ret;
 }
 
+static int sgx_encl_page_copy(void *dst, unsigned long src, unsigned long prot)
+{
+	struct vm_area_struct *vma;
+	int ret;
+
+	if (!(prot & VM_EXEC))
+		return 0;
+
+	/* Hold mmap_sem across copy_from_user() to avoid a TOCTOU race. */
+	down_read(&current->mm->mmap_sem);
+
+	vma = find_vma(current->mm, src);
+	if (!vma) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	/*
+	 * Query VM_MAYEXEC as an indirect path_noexec() check (see do_mmap()),
+	 * but with some future proofing against other cases that may deny
+	 * execute permissions.
+	 */
+	if (!(vma->vm_flags & VM_MAYEXEC)) {
+		ret = -EACCES;
+		goto out;
+	}
+
+	if (copy_from_user(dst, (void __user *)src, PAGE_SIZE))
+		ret = -EFAULT;
+	else
+		ret = 0;
+
+out:
+	up_read(&current->mm->mmap_sem);
+
+	return ret;
+}
+
 /**
  * sgx_ioc_enclave_add_page - handler for %SGX_IOC_ENCLAVE_ADD_PAGE
  *
@@ -616,13 +654,12 @@ static long sgx_ioc_enclave_add_page(struct file *filep, unsigned int cmd,
 
 	data = kmap(data_page);
 
-	if (copy_from_user((void *)data, (void __user *)addp->src, PAGE_SIZE)) {
-		ret = -EFAULT;
-		goto out;
-	}
-
 	prot = addp->flags & (PROT_READ | PROT_WRITE | PROT_EXEC);
 
+	ret = sgx_encl_page_copy(data, addp->src, prot);
+	if (ret)
+		goto out;
+
 	ret = sgx_encl_add_page(encl, addp->addr, data, &secinfo, addp->mrmask,
 				prot);
 	if (ret)
-- 
2.21.0



* [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX
  2019-06-06  2:11 [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM Sean Christopherson
                   ` (2 preceding siblings ...)
  2019-06-06  2:11 ` [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves Sean Christopherson
@ 2019-06-06  2:11 ` Sean Christopherson
  2019-06-07 19:58   ` Stephen Smalley
  2019-06-10 16:05   ` Jarkko Sakkinen
  2019-06-06  2:11 ` [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation Sean Christopherson
  2019-06-10  7:03 ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Cedric Xing
  5 siblings, 2 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-06  2:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

enclave_load() is roughly analogous to the existing file_mprotect().

Due to the nature of SGX and its Enclave Page Cache (EPC), all enclave
VMAs are backed by a single file, i.e. /dev/sgx/enclave, that must be
MAP_SHARED.  Furthermore, all enclaves need read, write and execute
VMAs.  As a result, the existing/standard call to file_mprotect() does
not provide any meaningful security for enclaves since an LSM can only
deny/grant access to the EPC as a whole.

security_enclave_load() is called when SGX is first loading an enclave
page, i.e. copying a page from normal memory into the EPC.  Although
the prototype for enclave_load() is similar to file_mprotect(), e.g.
SGX could theoretically use file_mprotect() and set reqprot=prot, a
separate hook is desirable as the semantics of an enclave's protection
bits are different than those of vmas, e.g. an enclave page tracks the
maximal set of protections, whereas file_mprotect() operates on the
actual protections being provided.  In other words, LSMs will likely
want to implement different policies for enclave page protections.
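
For readers less familiar with how an int-returning LSM hook is dispatched,
here is a small userspace model.  It assumes the usual behavior of such
hooks, i.e. start from a default of 0 and stop at the first hook that
returns non-zero.  The array-based registry, register_enclave_load() and
security_enclave_load_model() are hypothetical simplifications and not the
kernel's hlist-based implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>	/* PROT_READ, PROT_EXEC */

/* Hook signature modelled on enclave_load(vma, prot); vma is opaque here. */
typedef int (*enclave_load_fn)(const void *src_vma, unsigned long prot);

#define MAX_HOOKS 4
static enclave_load_fn hooks[MAX_HOOKS];
static size_t nr_hooks;

/* Hypothetical registration helper for the toy registry. */
static int register_enclave_load(enclave_load_fn fn)
{
	if (nr_hooks == MAX_HOOKS)
		return -ENOMEM;
	hooks[nr_hooks++] = fn;
	return 0;
}

/* Default to 0 (allow); the first non-zero result wins. */
static int security_enclave_load_model(const void *src_vma,
				       unsigned long prot)
{
	int rc = 0;
	size_t i;

	for (i = 0; i < nr_hooks; i++) {
		rc = hooks[i](src_vma, prot);
		if (rc)
			break;
	}
	return rc;
}

/* Example policy: refuse any enclave page that may become executable. */
static int deny_exec(const void *src_vma, unsigned long prot)
{
	(void)src_vma;
	return (prot & PROT_EXEC) ? -EACCES : 0;
}
```

With no hooks registered the call succeeds, which matches the intent of
the !CONFIG_SECURITY stub returning 0.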

Note, extensive discussion yielded no sane alternative to some form of
SGX specific LSM hook[1].

[1] https://lkml.kernel.org/r/CALCETrXf8mSK45h7sTK5Wf+pXLVn=Bjsc_RLpgO-h-qdzBRo5Q@mail.gmail.com

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/driver/ioctl.c | 12 ++++++------
 include/linux/lsm_hooks.h              | 13 +++++++++++++
 include/linux/security.h               | 12 ++++++++++++
 security/security.c                    |  7 +++++++
 4 files changed, 38 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/driver/ioctl.c b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
index 44b2d73de7c3..29c0df672250 100644
--- a/arch/x86/kernel/cpu/sgx/driver/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
@@ -8,6 +8,7 @@
 #include <linux/highmem.h>
 #include <linux/ratelimit.h>
 #include <linux/sched/signal.h>
+#include <linux/security.h>
 #include <linux/shmem_fs.h>
 #include <linux/slab.h>
 #include <linux/suspend.h>
@@ -582,9 +583,6 @@ static int sgx_encl_page_copy(void *dst, unsigned long src, unsigned long prot)
 	struct vm_area_struct *vma;
 	int ret;
 
-	if (!(prot & VM_EXEC))
-		return 0;
-
 	/* Hold mmap_sem across copy_from_user() to avoid a TOCTOU race. */
 	down_read(&current->mm->mmap_sem);
 
@@ -599,15 +597,17 @@ static int sgx_encl_page_copy(void *dst, unsigned long src, unsigned long prot)
 	 * but with some future proofing against other cases that may deny
 	 * execute permissions.
 	 */
-	if (!(vma->vm_flags & VM_MAYEXEC)) {
+	if ((prot & VM_EXEC) && !(vma->vm_flags & VM_MAYEXEC)) {
 		ret = -EACCES;
 		goto out;
 	}
 
+	ret = security_enclave_load(vma, prot);
+	if (ret)
+		goto out;
+
 	if (copy_from_user(dst, (void __user *)src, PAGE_SIZE))
 		ret = -EFAULT;
-	else
-		ret = 0;
 
 out:
 	up_read(&current->mm->mmap_sem);
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 47f58cfb6a19..c6f47a7eef70 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1446,6 +1446,12 @@
  * @bpf_prog_free_security:
  *	Clean up the security information stored inside bpf prog.
  *
+ * Security hooks for Intel SGX enclaves.
+ *
+ * @enclave_load:
+ *	@vma: the source memory region of the enclave page being loaded.
+ *	@prot: the (maximal) protections of the enclave page.
+ *	Return 0 if permission is granted.
  */
 union security_list_options {
 	int (*binder_set_context_mgr)(struct task_struct *mgr);
@@ -1807,6 +1813,10 @@ union security_list_options {
 	int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
 	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
 #endif /* CONFIG_BPF_SYSCALL */
+
+#ifdef CONFIG_INTEL_SGX
+	int (*enclave_load)(struct vm_area_struct *vma, unsigned long prot);
+#endif /* CONFIG_INTEL_SGX */
 };
 
 struct security_hook_heads {
@@ -2046,6 +2056,9 @@ struct security_hook_heads {
 	struct hlist_head bpf_prog_alloc_security;
 	struct hlist_head bpf_prog_free_security;
 #endif /* CONFIG_BPF_SYSCALL */
+#ifdef CONFIG_INTEL_SGX
+	struct hlist_head enclave_load;
+#endif /* CONFIG_INTEL_SGX */
 } __randomize_layout;
 
 /*
diff --git a/include/linux/security.h b/include/linux/security.h
index 659071c2e57c..0b6d1eb7368b 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1829,5 +1829,17 @@ static inline void security_bpf_prog_free(struct bpf_prog_aux *aux)
 #endif /* CONFIG_SECURITY */
 #endif /* CONFIG_BPF_SYSCALL */
 
+#ifdef CONFIG_INTEL_SGX
+#ifdef CONFIG_SECURITY
+int security_enclave_load(struct vm_area_struct *vma, unsigned long prot);
+#else
+static inline int security_enclave_load(struct vm_area_struct *vma,
+					unsigned long prot)
+{
+	return 0;
+}
+#endif /* CONFIG_SECURITY */
+#endif /* CONFIG_INTEL_SGX */
+
 #endif /* ! __LINUX_SECURITY_H */
 
diff --git a/security/security.c b/security/security.c
index 613a5c00e602..c6f7f26969b2 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2359,3 +2359,10 @@ void security_bpf_prog_free(struct bpf_prog_aux *aux)
 	call_void_hook(bpf_prog_free_security, aux);
 }
 #endif /* CONFIG_BPF_SYSCALL */
+
+#ifdef CONFIG_INTEL_SGX
+int security_enclave_load(struct vm_area_struct *vma, unsigned long prot)
+{
+	return call_int_hook(enclave_load, 0, vma, prot);
+}
+#endif /* CONFIG_INTEL_SGX */
-- 
2.21.0



* [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation
  2019-06-06  2:11 [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM Sean Christopherson
                   ` (3 preceding siblings ...)
  2019-06-06  2:11 ` [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX Sean Christopherson
@ 2019-06-06  2:11 ` Sean Christopherson
  2019-06-07 21:16   ` Stephen Smalley
  2019-06-17 16:38   ` Jarkko Sakkinen
  2019-06-10  7:03 ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Cedric Xing
  5 siblings, 2 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-06  2:11 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

The goal of selinux_enclave_load() is to provide a facsimile of the
existing selinux_file_mprotect() and file_map_prot_check() policies,
but tailored to the unique properties of SGX.

For example, an enclave page is technically backed by a MAP_SHARED file,
but the "file" is essentially shared memory that is never persisted
anywhere and also requires execute permissions (for some pages).

The basic concept is to require appropriate execute permissions on the
source of the enclave for pages that are requesting PROT_EXEC, e.g. if
an enclave page is being loaded from a regular file, require
FILE__EXECUTE and/or FILE__EXECMOD, and if it's coming from an
anonymous/private mapping, require PROCESS__EXECMEM since the process
is essentially executing from the mapping, albeit in a roundabout way.

Note, FILE__READ and FILE__WRITE are intentionally not required even if
the source page is backed by a regular file.  Writes to the enclave page
are contained to the EPC, i.e. never hit the original file, and read
permissions have already been vetted (or the VMA doesn't have PROT_READ,
in which case loading the page into the enclave will fail).
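
For readers following along, the decision described above can be modelled
in plain userspace C.  This is only a sketch: the sk_* names and the
struct are hypothetical stand-ins for the vma state the hook inspects, and
it assumes IS_PRIVATE() is meant to apply to the inode alone, which is
what the comment in the patch appears to describe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for PROT_WRITE/PROT_EXEC (same values as on Linux). */
#define SK_PROT_READ	0x1UL
#define SK_PROT_WRITE	0x2UL
#define SK_PROT_EXEC	0x4UL

/* Permission the caller would need; SK_NONE means no check is made. */
enum sk_perm { SK_NONE, SK_FILE_EXECUTE, SK_FILE_EXECMOD, SK_PROCESS_EXECMEM };

/* Flattened view of the source vma fields the hook looks at. */
struct sk_source {
	bool has_file;		/* vma->vm_file != NULL */
	bool private_inode;	/* IS_PRIVATE(file_inode(vma->vm_file)) */
	bool vm_shared;		/* vma->vm_flags & VM_SHARED */
	bool vm_exec;		/* vma->vm_flags & VM_EXEC */
	bool has_anon_vma;	/* vma->anon_vma != NULL */
};

enum sk_perm sk_required_perm(const struct sk_source *src, unsigned long prot)
{
	assert(src != NULL);

	/* Only executable enclave pages are restricted. */
	if (!(prot & SK_PROT_EXEC))
		return SK_NONE;

	/* Source is already executable and no RW->RX intent. */
	if (src->vm_exec && !(prot & SK_PROT_WRITE))
		return SK_NONE;

	if (src->has_file && !src->private_inode &&
	    (src->vm_shared || !(prot & SK_PROT_WRITE))) {
		if (!src->has_anon_vma && !(prot & SK_PROT_WRITE))
			return SK_FILE_EXECUTE;
		return SK_FILE_EXECMOD;
	}
	return SK_PROCESS_EXECMEM;
}
```

In this model a read+exec page loaded from a non-executable mapping of a
regular file needs FILE__EXECUTE, while any executable page loaded from
an anonymous mapping falls through to PROCESS__EXECMEM.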

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 security/selinux/hooks.c | 69 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 3ec702cf46ca..3c5418edf51c 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -6726,6 +6726,71 @@ static void selinux_bpf_prog_free(struct bpf_prog_aux *aux)
 }
 #endif
 
+#ifdef CONFIG_INTEL_SGX
+int selinux_enclave_load(struct vm_area_struct *vma, unsigned long prot)
+{
+	const struct cred *cred = current_cred();
+	u32 sid = cred_sid(cred);
+	int ret;
+
+	/* SGX is supported only in 64-bit kernels. */
+	WARN_ON_ONCE(!default_noexec);
+
+	/* Only executable enclave pages are restricted in any way. */
+	if (!(prot & PROT_EXEC))
+		return 0;
+
+	/*
+	 * The source page is executable, i.e. has already passed SELinux's
+	 * checks, and userspace is not requesting RW->RX capabilities.
+	 */
+	if ((vma->vm_flags & VM_EXEC) && !(prot & PROT_WRITE))
+		return 0;
+
+	/*
+	 * The source page is not executable, or userspace is requesting the
+	 * ability to do a RW->RX conversion.  Permissions are required as
+	 * follows, in order of increasing privilege:
+	 *
+	 * EXECUTE - Load an executable enclave page without RW->RX intent from
+	 *           a non-executable vma that is backed by a shared mapping to
+	 *           a regular file that has not undergone COW.
+	 *
+	 * EXECMOD - Load an executable enclave page without RW->RX intent from
+	 *           a non-executable vma that is backed by a shared mapping to
+	 *           a regular file that *has* undergone COW.
+	 *
+	 *         - Load an enclave page *with* RW->RX intent from a shared
+	 *           mapping to a regular file.
+	 *
+	 * EXECMEM - Load an executable enclave page from an anonymous mapping.
+	 *
+	 *         - Load an executable enclave page from a private file, e.g.
+	 *           from a shared mapping to a hugetlbfs file.
+	 *
+	 *         - Load an enclave page *with* RW->RX intent from a private
+	 *           mapping to a regular file.
+	 *
+	 * Note, this hybrid EXECMOD and EXECMEM behavior is intentional and
+	 * reflects the nature of enclaves and the EPC, e.g. EPC is effectively
+	 * a non-persistent shared file, but each enclave is a private domain
+	 * within that shared file, so delegate to the source of the enclave.
+	 */
+	if (vma->vm_file && !IS_PRIVATE(file_inode(vma->vm_file)) &&
+	    ((vma->vm_flags & VM_SHARED) || !(prot & PROT_WRITE))) {
+		if (!vma->anon_vma && !(prot & PROT_WRITE))
+			ret = file_has_perm(cred, vma->vm_file, FILE__EXECUTE);
+		else
+			ret = file_has_perm(cred, vma->vm_file, FILE__EXECMOD);
+	} else {
+		ret = avc_has_perm(&selinux_state,
+				   sid, sid, SECCLASS_PROCESS,
+				   PROCESS__EXECMEM, NULL);
+	}
+	return ret;
+}
+#endif
+
 struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
 	.lbs_cred = sizeof(struct task_security_struct),
 	.lbs_file = sizeof(struct file_security_struct),
@@ -6968,6 +7033,10 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free),
 	LSM_HOOK_INIT(bpf_prog_free_security, selinux_bpf_prog_free),
 #endif
+
+#ifdef CONFIG_INTEL_SGX
+	LSM_HOOK_INIT(enclave_load, selinux_enclave_load),
+#endif
 };
 
 static __init int selinux_init(void)
-- 
2.21.0



* Re: [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX
  2019-06-06  2:11 ` [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX Sean Christopherson
@ 2019-06-07 19:58   ` Stephen Smalley
  2019-06-10 16:21     ` Sean Christopherson
  2019-06-10 16:05   ` Jarkko Sakkinen
  1 sibling, 1 reply; 67+ messages in thread
From: Stephen Smalley @ 2019-06-07 19:58 UTC (permalink / raw)
  To: Sean Christopherson, Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, James Morris, Serge E . Hallyn,
	LSM List, Paul Moore, Eric Paris, selinux, Jethro Beekman,
	Dave Hansen, Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx, Andrew Morton, nhorman, npmccallum, Serge Ayoun,
	Shay Katz-zamir, Haitao Huang, Andy Shevchenko, Kai Svahn,
	Borislav Petkov, Josh Triplett, Kai Huang, David Rientjes,
	William Roberts, Philip Tricca

On 6/5/19 10:11 PM, Sean Christopherson wrote:
> enclave_load() is roughly analogous to the existing file_mprotect().
> 
> Due to the nature of SGX and its Enclave Page Cache (EPC), all enclave
> VMAs are backed by a single file, i.e. /dev/sgx/enclave, that must be
> MAP_SHARED.  Furthermore, all enclaves need read, write and execute
> VMAs.  As a result, the existing/standard call to file_mprotect() does
> not provide any meaningful security for enclaves since an LSM can only
> deny/grant access to the EPC as a whole.
> 
> security_enclave_load() is called when SGX is first loading an enclave
> page, i.e. copying a page from normal memory into the EPC.  Although
> the prototype for enclave_load() is similar to file_mprotect(), e.g.
> SGX could theoretically use file_mprotect() and set reqprot=prot, a
> separate hook is desirable as the semantics of an enclave's protection
> bits are different than those of vmas, e.g. an enclave page tracks the
> maximal set of protections, whereas file_mprotect() operates on the
> actual protections being provided.  In other words, LSMs will likely
> want to implement different policies for enclave page protections.
> 
> Note, extensive discussion yielded no sane alternative to some form of
> SGX specific LSM hook[1].
> 
> [1] https://lkml.kernel.org/r/CALCETrXf8mSK45h7sTK5Wf+pXLVn=Bjsc_RLpgO-h-qdzBRo5Q@mail.gmail.com
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   arch/x86/kernel/cpu/sgx/driver/ioctl.c | 12 ++++++------
>   include/linux/lsm_hooks.h              | 13 +++++++++++++
>   include/linux/security.h               | 12 ++++++++++++
>   security/security.c                    |  7 +++++++
>   4 files changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/driver/ioctl.c b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
> index 44b2d73de7c3..29c0df672250 100644
> --- a/arch/x86/kernel/cpu/sgx/driver/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
> @@ -8,6 +8,7 @@
>   #include <linux/highmem.h>
>   #include <linux/ratelimit.h>
>   #include <linux/sched/signal.h>
> +#include <linux/security.h>
>   #include <linux/shmem_fs.h>
>   #include <linux/slab.h>
>   #include <linux/suspend.h>
> @@ -582,9 +583,6 @@ static int sgx_encl_page_copy(void *dst, unsigned long src, unsigned long prot)
>   	struct vm_area_struct *vma;
>   	int ret;
>   
> -	if (!(prot & VM_EXEC))
> -		return 0;
> -

Is there a real use case where LSM will want to be called if !(prot & 
VM_EXEC)? Also, you seem to be mixing prot and PROT_EXEC with vm_flags 
and VM_EXEC; other code does not appear to assume they are identical and 
explicitly converts, e.g. calc_vm_prot_bits().
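
To illustrate the point about keeping the two flag namespaces apart, here
is a minimal userspace sketch of an explicit translation.  The SK_VM_*
constants and sk_prot_to_vm_flags() are hypothetical stand-ins, not the
kernel's definitions; the read/write/exec values happen to coincide
numerically with PROT_*, which is presumably why mixing them goes
unnoticed.

```c
#include <assert.h>
#include <sys/mman.h>

/* Hypothetical stand-ins for the kernel-internal VM_* flags. */
#define SK_VM_READ	0x1UL
#define SK_VM_WRITE	0x2UL
#define SK_VM_EXEC	0x4UL

/* Translate userspace PROT_* bits into the VM_* namespace bit by bit. */
unsigned long sk_prot_to_vm_flags(unsigned long prot)
{
	unsigned long flags = 0;

	/* This sketch only understands the three basic protection bits. */
	assert((prot & ~(unsigned long)(PROT_READ | PROT_WRITE | PROT_EXEC)) == 0);

	if (prot & PROT_READ)
		flags |= SK_VM_READ;
	if (prot & PROT_WRITE)
		flags |= SK_VM_WRITE;
	if (prot & PROT_EXEC)
		flags |= SK_VM_EXEC;
	return flags;
}
```

With a helper like this, a check against vm_flags would test
sk_prot_to_vm_flags(prot) & SK_VM_EXEC, and a check against the request
itself would test prot & PROT_EXEC, so each comparison stays within one
namespace.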

>   	/* Hold mmap_sem across copy_from_user() to avoid a TOCTOU race. */
>   	down_read(&current->mm->mmap_sem);
>   
> @@ -599,15 +597,17 @@ static int sgx_encl_page_copy(void *dst, unsigned long src, unsigned long prot)
>   	 * but with some future proofing against other cases that may deny
>   	 * execute permissions.
>   	 */
> -	if (!(vma->vm_flags & VM_MAYEXEC)) {
> +	if ((prot & VM_EXEC) && !(vma->vm_flags & VM_MAYEXEC)) {
>   		ret = -EACCES;
>   		goto out;
>   	}
>   
> +	ret = security_enclave_load(vma, prot);
> +	if (ret)
> +		goto out;
> +
>   	if (copy_from_user(dst, (void __user *)src, PAGE_SIZE))
>   		ret = -EFAULT;
> -	else
> -		ret = 0;
>   
>   out:
>   	up_read(&current->mm->mmap_sem);
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index 47f58cfb6a19..c6f47a7eef70 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -1446,6 +1446,12 @@
>    * @bpf_prog_free_security:
>    *	Clean up the security information stored inside bpf prog.
>    *
> + * Security hooks for Intel SGX enclaves.
> + *
> + * @enclave_load:
> + *	@vma: the source memory region of the enclave page being loaded.
> + *	@prot: the (maximal) protections of the enclave page.
> + *	Return 0 if permission is granted.
>    */
>   union security_list_options {
>   	int (*binder_set_context_mgr)(struct task_struct *mgr);
> @@ -1807,6 +1813,10 @@ union security_list_options {
>   	int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
>   	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
>   #endif /* CONFIG_BPF_SYSCALL */
> +
> +#ifdef CONFIG_INTEL_SGX
> +	int (*enclave_load)(struct vm_area_struct *vma, unsigned long prot);
> +#endif /* CONFIG_INTEL_SGX */
>   };
>   
>   struct security_hook_heads {
> @@ -2046,6 +2056,9 @@ struct security_hook_heads {
>   	struct hlist_head bpf_prog_alloc_security;
>   	struct hlist_head bpf_prog_free_security;
>   #endif /* CONFIG_BPF_SYSCALL */
> +#ifdef CONFIG_INTEL_SGX
> +	struct hlist_head enclave_load;
> +#endif /* CONFIG_INTEL_SGX */
>   } __randomize_layout;
>   
>   /*
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 659071c2e57c..0b6d1eb7368b 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -1829,5 +1829,17 @@ static inline void security_bpf_prog_free(struct bpf_prog_aux *aux)
>   #endif /* CONFIG_SECURITY */
>   #endif /* CONFIG_BPF_SYSCALL */
>   
> +#ifdef CONFIG_INTEL_SGX
> +#ifdef CONFIG_SECURITY
> +int security_enclave_load(struct vm_area_struct *vma, unsigned long prot);
> +#else
> +static inline int security_enclave_load(struct vm_area_struct *vma,
> +					unsigned long prot)
> +{
> +	return 0;
> +}
> +#endif /* CONFIG_SECURITY */
> +#endif /* CONFIG_INTEL_SGX */
> +
>   #endif /* ! __LINUX_SECURITY_H */
>   
> diff --git a/security/security.c b/security/security.c
> index 613a5c00e602..c6f7f26969b2 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -2359,3 +2359,10 @@ void security_bpf_prog_free(struct bpf_prog_aux *aux)
>   	call_void_hook(bpf_prog_free_security, aux);
>   }
>   #endif /* CONFIG_BPF_SYSCALL */
> +
> +#ifdef CONFIG_INTEL_SGX
> +int security_enclave_load(struct vm_area_struct *vma, unsigned long prot)
> +{
> +	return call_int_hook(enclave_load, 0, vma, prot);
> +}
> +#endif /* CONFIG_INTEL_SGX */
> 



* Re: [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation
  2019-06-06  2:11 ` [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation Sean Christopherson
@ 2019-06-07 21:16   ` Stephen Smalley
  2019-06-10 16:46     ` Sean Christopherson
  2019-06-17 16:38   ` Jarkko Sakkinen
  1 sibling, 1 reply; 67+ messages in thread
From: Stephen Smalley @ 2019-06-07 21:16 UTC (permalink / raw)
  To: Sean Christopherson, Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, James Morris, Serge E . Hallyn,
	LSM List, Paul Moore, Eric Paris, selinux, Jethro Beekman,
	Dave Hansen, Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx, Andrew Morton, nhorman, npmccallum, Serge Ayoun,
	Shay Katz-zamir, Haitao Huang, Andy Shevchenko, Kai Svahn,
	Borislav Petkov, Josh Triplett, Kai Huang, David Rientjes,
	William Roberts, Philip Tricca

On 6/5/19 10:11 PM, Sean Christopherson wrote:
> The goal of selinux_enclave_load() is to provide a facsimile of the
> existing selinux_file_mprotect() and file_map_prot_check() policies,
> but tailored to the unique properties of SGX.
> 
> For example, an enclave page is technically backed by a MAP_SHARED file,
> but the "file" is essentially shared memory that is never persisted
> anywhere and also requires execute permissions (for some pages).
> 
> The basic concept is to require appropriate execute permissions on the
> source of the enclave for pages that are requesting PROT_EXEC, e.g. if
> an enclave page is being loaded from a regular file, require
> FILE__EXECUTE and/or FILE__EXECMOND, and if it's coming from an
> anonymous/private mapping, require PROCESS__EXECMEM since the process
> is essentially executing from the mapping, albeit in a roundabout way.
> 
> Note, FILE__READ and FILE__WRITE are intentionally not required even if
> the source page is backed by a regular file.  Writes to the enclave page
> are contained to the EPC, i.e. never hit the original file, and read
> permissions have already been vetted (or the VMA doesn't have PROT_READ,
> in which case loading the page into the enclave will fail).
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>   security/selinux/hooks.c | 69 ++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 69 insertions(+)
> 
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 3ec702cf46ca..3c5418edf51c 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -6726,6 +6726,71 @@ static void selinux_bpf_prog_free(struct bpf_prog_aux *aux)
>   }
>   #endif
>   
> +#ifdef CONFIG_INTEL_SGX
> +int selinux_enclave_load(struct vm_area_struct *vma, unsigned long prot)
> +{
> +	const struct cred *cred = current_cred();
> +	u32 sid = cred_sid(cred);
> +	int ret;
> +
> +	/* SGX is supported only in 64-bit kernels. */
> +	WARN_ON_ONCE(!default_noexec);
> +
> +	/* Only executable enclave pages are restricted in any way. */
> +	if (!(prot & PROT_EXEC))
> +		return 0;

prot/PROT_EXEC or vmflags/VM_EXEC

> +
> +	/*
> +	 * The source page is exectuable, i.e. has already passed SELinux's

executable

> +	 * checks, and userspace is not requesting RW->RX capabilities.

Is it requesting W->X or WX?

> +	 */
> +	if ((vma->vm_flags & VM_EXEC) && !(prot & PROT_WRITE))
> +		return 0;
> +
> +	/*
> +	 * The source page is not executable, or userspace is requesting the
> +	 * ability to do a RW->RX conversion.  Permissions are required as
> +	 * follows, in order of increasing privelege:
> +	 *
> +	 * EXECUTE - Load an executable enclave page without RW->RX intent from
> +	 *           a non-executable vma that is backed by a shared mapping to
> +	 *           a regular file that has not undergone COW.

Shared mapping or unmodified private file mapping

> +	 *
> +	 * EXECMOD - Load an executable enclave page without RW->RX intent from
> +	 *           a non-executable vma that is backed by a shared mapping to
> +	 *           a regular file that *has* undergone COW.

modified private file mapping (write to shared mapping won't trigger 
COW; it would have been checked by FILE__WRITE earlier)

> +	 *
> +	 *         - Load an enclave page *with* RW->RX intent from a shared
> +	 *           mapping to a regular file.
> +	 *
> +	 * EXECMEM - Load an exectuable enclave page from an anonymous mapping.

executable

> +	 *
> +	 *         - Load an exectuable enclave page from a private file, e.g.

executable

> +	 *           from a shared mapping to a hugetlbfs file.
> +	 *
> +	 *         - Load an enclave page *with* RW->RX intent from a private

W->X or WX?

> +	 *           mapping to a regular file.
> +	 *
> +	 * Note, this hybrid EXECMOD and EXECMEM behavior is intentional and
> +	 * reflects the nature of enclaves and the EPC, e.g. EPC is effectively
> +	 * a non-persistent shared file, but each enclave is a private domain
> +	 * within that shared file, so delegate to the source of the enclave.
> +	 */
> +	if (vma->vm_file && !IS_PRIVATE(file_inode(vma->vm_file) &&
> +	    ((vma->vm_flags & VM_SHARED) || !(prot & PROT_WRITE)))) {
> +		if (!vma->anon_vma && !(prot & PROT_WRITE))
> +			ret = file_has_perm(cred, vma->vm_file, FILE__EXECUTE);
> +		else
> +			ret = file_has_perm(cred, vma->vm_file, FILE__EXECMOD);
> +	} else {
> +		ret = avc_has_perm(&selinux_state,
> +				   sid, sid, SECCLASS_PROCESS,
> +				   PROCESS__EXECMEM, NULL);
> +	}
> +	return ret;
> +}
> +#endif
> +
>   struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
>   	.lbs_cred = sizeof(struct task_security_struct),
>   	.lbs_file = sizeof(struct file_security_struct),
> @@ -6968,6 +7033,10 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
>   	LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free),
>   	LSM_HOOK_INIT(bpf_prog_free_security, selinux_bpf_prog_free),
>   #endif
> +
> +#ifdef CONFIG_INTEL_SGX
> +	LSM_HOOK_INIT(enclave_load, selinux_enclave_load),
> +#endif
>   };
>   
>   static __init int selinux_init(void)
> 



* [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks
  2019-06-06  2:11 [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM Sean Christopherson
                   ` (4 preceding siblings ...)
  2019-06-06  2:11 ` [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation Sean Christopherson
@ 2019-06-10  7:03 ` Cedric Xing
  2019-06-10  7:03   ` [RFC PATCH v1 1/3] LSM/x86/sgx: Add " Cedric Xing
                     ` (3 more replies)
  5 siblings, 4 replies; 67+ messages in thread
From: Cedric Xing @ 2019-06-10  7:03 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: Cedric Xing, jarkko.sakkinen, luto, sds, jmorris, serge, paul,
	eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

This series intends to make the new SGX subsystem and the existing LSM
architecture work together smoothly so that, say, SGX cannot be abused to work
around restrictions set forth by LSM. This series applies on top of Jarkko
Sakkinen's SGX series v20 (https://lkml.org/lkml/2019/4/17/344), where abundant
details of this SGX/LSM problem can be found.

This series is an alternative to Sean Christopherson's recent RFC series
(https://lkml.org/lkml/2019/6/5/1070) that was trying to solve the same
problem. The key problem is for LSM to determine the "maximal (most permissive)
protection" allowed for individual enclave pages. Sean's approach is to take
that from user mode code as a parameter of the EADD ioctl, validate it with LSM
ahead of time, and then enforce it inside the SGX subsystem. The major
disadvantage IMHO is that a priori knowledge of "maximal protection" is needed,
but it isn't always available in certain use cases. In fact, it is an unusual
approach to take "maximal protection" from user code, as what SELinux is doing
today is to determine "maximal protection" of a vma using attributes associated
with vma->vm_file instead. When it comes to enclaves, vma->vm_file always
points to /dev/sgx/enclave, so what's missing is a new way for LSM modules to
remember origins of enclave pages so that they don't solely depend on
vma->vm_file to determine "maximal protection".

This series takes advantage of the fact that enclave pages cannot be remapped
(to a different linear address), therefore the pair { vma->vm_file,
linear_address } can be used to uniquely identify an enclave page. Then, by
notifying LSM on creation of every enclave page (via a new LSM hook,
security_enclave_load), LSM modules are able to track the origin and
protection changes of every page, and hence to judge mmap/mprotect requests
correctly.

Cedric Xing (3):
  LSM/x86/sgx: Add SGX specific LSM hooks
  LSM/x86/sgx: Implement SGX specific hooks in SELinux
  LSM/x86/sgx: Call new LSM hooks from SGX subsystem

 arch/x86/kernel/cpu/sgx/driver/ioctl.c |  72 +++++-
 arch/x86/kernel/cpu/sgx/driver/main.c  |  12 +-
 include/linux/lsm_hooks.h              |  33 +++
 include/linux/security.h               |  26 +++
 security/security.c                    |  21 ++
 security/selinux/Makefile              |   2 +
 security/selinux/hooks.c               |  77 ++++++-
 security/selinux/include/intel_sgx.h   |  18 ++
 security/selinux/include/objsec.h      |   3 +
 security/selinux/intel_sgx.c           | 292 +++++++++++++++++++++++++
 10 files changed, 545 insertions(+), 11 deletions(-)
 create mode 100644 security/selinux/include/intel_sgx.h
 create mode 100644 security/selinux/intel_sgx.c

-- 
2.17.1



* [RFC PATCH v1 1/3] LSM/x86/sgx: Add SGX specific LSM hooks
  2019-06-10  7:03 ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Cedric Xing
@ 2019-06-10  7:03   ` Cedric Xing
  2019-06-10  7:03   ` [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux Cedric Xing
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 67+ messages in thread
From: Cedric Xing @ 2019-06-10  7:03 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: Cedric Xing, jarkko.sakkinen, luto, sds, jmorris, serge, paul,
	eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

This patch makes two changes to LSM hooks.

The first change is the addition of two new SGX specific LSM hooks.

security_enclave_load() - is called whenever new EPC pages are added to an
enclave, so that an LSM module can initialize internal state for those
pages. An LSM module may track the protections ever granted to enclave pages
in order to reach reasonable decisions in the security_file_mprotect() hook
later on.

security_enclave_init() - is called when an enclave is about to be initialized
(by EINIT). An LSM module may approve/decline the request by looking into the
SIGSTRUCT, or the file from which the SIGSTRUCT was loaded.

The second change is to export symbol security_file_mprotect() to make it
available to kernel modules. The SGX module will invoke
security_file_mprotect() to validate protection for the virtual memory range
being mmap()'ed.

Please see include/linux/lsm_hooks.h for more information.
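
As a rough illustration of how a hook such as security_enclave_load()
reaches the registered modules, the following userspace sketch models the
usual integer-hook dispatch: each registered callback is invoked in order
and the first non-zero return value wins.  All sk_* names are
hypothetical, and the single `prot` argument is a simplification of the
real hook signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_PROT_EXEC	0x4UL	/* stand-in for PROT_EXEC */

/* Simplified hook type: the real hook takes more arguments. */
typedef int (*sk_enclave_load_fn)(unsigned long prot);

/* Example module that never objects. */
int sk_allow_all(unsigned long prot)
{
	(void)prot;
	return 0;
}

/* Example module that refuses executable enclave pages. */
int sk_deny_exec(unsigned long prot)
{
	return (prot & SK_PROT_EXEC) ? -EACCES : 0;
}

/* Call every hook in order; stop at the first one that returns non-zero. */
int sk_call_int_hook(const sk_enclave_load_fn *hooks, size_t n,
		     unsigned long prot)
{
	int rc = 0;	/* default result when no module registers the hook */

	assert(hooks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		rc = hooks[i](prot);
		if (rc != 0)
			break;
	}
	return rc;
}
```

With both example modules registered, a request carrying the exec bit is
refused with -EACCES, while a request without it, or any request when no
module is registered, returns 0.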

Signed-off-by: Cedric Xing <cedric.xing@intel.com>
---
 include/linux/lsm_hooks.h | 33 +++++++++++++++++++++++++++++++++
 include/linux/security.h  | 26 ++++++++++++++++++++++++++
 security/security.c       | 21 +++++++++++++++++++++
 3 files changed, 80 insertions(+)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 47f58cfb6a19..ceb18c5c25f3 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1446,6 +1446,27 @@
  * @bpf_prog_free_security:
  *	Clean up the security information stored inside bpf prog.
  *
+ * Security hooks for SGX enclaves
+ *
+ * @enclave_load:
+ *	Check permissions before loading pages into enclaves. Must be called
+ *	with current->mm->mmap_sem locked.
+ *	@encl: file pointer identifying the enclave
+ *	@addr: linear address to which new pages are being added. Must be page
+ *	aligned
+ *	@size: total size of pages being added. Must be integral multiple of
+ *	page size
+ *	@prot: requested protection. Shall be the same protection as the VMA
+ *	covering the target linear range, or 0 if target range not mapped
+ *	@source: the VMA containing the source pages. Shall be NULL if there's
+ *	no source pages (e.g. EAUG)
+ *
+ * @enclave_init:
+ *	Check SIGSTRUCT before initializing (EINIT) enclaves. Must be called
+ *	with current->mm->mmap_sem locked.
+ *	@encl: file pointer identifying the enclave being initialized
+ *	@sigstruct: pointer to sigstruct in kernel memory
+ *	@sigstruct_vma: vma containing the original sigstruct in user space
  */
 union security_list_options {
 	int (*binder_set_context_mgr)(struct task_struct *mgr);
@@ -1807,6 +1828,14 @@ union security_list_options {
 	int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
 	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
 #endif /* CONFIG_BPF_SYSCALL */
+#ifdef CONFIG_INTEL_SGX
+	int (*enclave_load)(struct file *encl, unsigned long addr,
+			    unsigned long size, unsigned long prot,
+			    struct vm_area_struct *source);
+	int (*enclave_init)(struct file *encl,
+			    const struct sgx_sigstruct *sigstruct,
+			    struct vm_area_struct *sigstruct_vma);
+#endif /* CONFIG_INTEL_SGX */
 };
 
 struct security_hook_heads {
@@ -2046,6 +2075,10 @@ struct security_hook_heads {
 	struct hlist_head bpf_prog_alloc_security;
 	struct hlist_head bpf_prog_free_security;
 #endif /* CONFIG_BPF_SYSCALL */
+#ifdef CONFIG_INTEL_SGX
+	struct hlist_head enclave_load;
+	struct hlist_head enclave_init;
+#endif /* CONFIG_INTEL_SGX */
 } __randomize_layout;
 
 /*
diff --git a/include/linux/security.h b/include/linux/security.h
index 659071c2e57c..d44655dd06dd 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1829,5 +1829,31 @@ static inline void security_bpf_prog_free(struct bpf_prog_aux *aux)
 #endif /* CONFIG_SECURITY */
 #endif /* CONFIG_BPF_SYSCALL */
 
+#ifdef CONFIG_INTEL_SGX
+struct sgx_sigstruct;
+#ifdef CONFIG_SECURITY
+extern int security_enclave_load(struct file *encl, unsigned long addr,
+				 unsigned long size, unsigned long prot,
+				 struct vm_area_struct *source);
+extern int security_enclave_init(struct file *encl,
+				 const struct sgx_sigstruct *sigstruct,
+				 struct vm_area_struct *sigstruct_vma);
+#else
+static inline int security_enclave_load(struct file *encl, unsigned long addr,
+					unsigned long size, unsigned long prot,
+					struct vm_area_struct *source)
+{
+	return 0;
+}
+
+static inline int security_enclave_init(struct file *encl,
+					const struct sgx_sigstruct *sigstruct,
+					struct vm_area_struct *sigstruct_vma)
+{
+	return 0;
+}
+#endif /* CONFIG_SECURITY */
+#endif /* CONFIG_INTEL_SGX */
+
 #endif /* ! __LINUX_SECURITY_H */
 
diff --git a/security/security.c b/security/security.c
index f493db0bf62a..3a5c9847f2c8 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1420,6 +1420,7 @@ int security_file_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
 {
 	return call_int_hook(file_mprotect, 0, vma, reqprot, prot);
 }
+EXPORT_SYMBOL(security_file_mprotect);
 
 int security_file_lock(struct file *file, unsigned int cmd)
 {
@@ -2355,3 +2356,23 @@ void security_bpf_prog_free(struct bpf_prog_aux *aux)
 	call_void_hook(bpf_prog_free_security, aux);
 }
 #endif /* CONFIG_BPF_SYSCALL */
+
+#ifdef CONFIG_INTEL_SGX
+
+int security_enclave_load(struct file *encl, unsigned long addr,
+			  unsigned long size, unsigned long prot,
+			  struct vm_area_struct *source)
+{
+	return call_int_hook(enclave_load, 0, encl, addr, size, prot, source);
+}
+EXPORT_SYMBOL(security_enclave_load);
+
+int security_enclave_init(struct file *encl,
+			  const struct sgx_sigstruct *sigstruct,
+			  struct vm_area_struct *sigstruct_vma)
+{
+	return call_int_hook(enclave_init, 0, encl, sigstruct, sigstruct_vma);
+}
+EXPORT_SYMBOL(security_enclave_init);
+
+#endif /* CONFIG_INTEL_SGX */
-- 
2.17.1



* [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-10  7:03 ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Cedric Xing
  2019-06-10  7:03   ` [RFC PATCH v1 1/3] LSM/x86/sgx: Add " Cedric Xing
@ 2019-06-10  7:03   ` Cedric Xing
  2019-06-11 13:40     ` Stephen Smalley
  2019-06-10  7:03   ` [RFC PATCH v1 3/3] LSM/x86/sgx: Call new LSM hooks from SGX subsystem Cedric Xing
  2019-06-10 17:36   ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Jarkko Sakkinen
  3 siblings, 1 reply; 67+ messages in thread
From: Cedric Xing @ 2019-06-10  7:03 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: Cedric Xing, jarkko.sakkinen, luto, sds, jmorris, serge, paul,
	eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

In this patch, SELinux maintains two bits per enclave page, namely SGX__EXECUTE
and SGX__EXECMOD.

SGX__EXECUTE is set initially (by selinux_enclave_load) for every enclave page
that was loaded from a potentially executable source page. SGX__EXECMOD is set
for every page that was loaded from a file that has FILE__EXECMOD.

At runtime, on every protection change (resulting in a call to
selinux_file_mprotect), SGX__EXECUTE is cleared for a page if VM_WRITE is
requested, unless SGX__EXECMOD is set.
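
The two-bit scheme can be sketched in userspace C as follows.  This is
not the code in this patch: the fixed-size array, the sk_* names, and the
rule that an exec request is refused once SGX__EXECUTE has been cleared
are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-page tracking bits, named after the ones in the text. */
#define SK_SGX_EXECUTE	0x1u
#define SK_SGX_EXECMOD	0x2u

/* Stand-ins for VM_WRITE and VM_EXEC. */
#define SK_VM_WRITE	0x2UL
#define SK_VM_EXEC	0x4UL

#define SK_MAX_PAGES	16

/* One byte of tracking state per enclave page. */
struct sk_enclave {
	unsigned char page_bits[SK_MAX_PAGES];
};

/* Apply a protection change to one page; returns 0 or -EACCES. */
int sk_mprotect_page(struct sk_enclave *encl, size_t page, unsigned long prot)
{
	unsigned char bits;

	assert(encl != NULL && page < SK_MAX_PAGES);
	bits = encl->page_bits[page];

	/* Granting write revokes future execute unless EXECMOD was granted. */
	if ((prot & SK_VM_WRITE) && !(bits & SK_SGX_EXECMOD))
		bits &= (unsigned char)~SK_SGX_EXECUTE;

	/* Assumption: exec is refused once the EXECUTE bit is gone. */
	if ((prot & SK_VM_EXEC) && !(bits & SK_SGX_EXECUTE))
		return -EACCES;

	/* Commit the new state only when the request is allowed. */
	encl->page_bits[page] = bits;
	return 0;
}
```

In this model a page that starts with only SGX__EXECUTE can be mapped
executable, but once it has been made writable a later exec request
fails, whereas a page that also carries SGX__EXECMOD keeps both abilities.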

To track enclave page protection changes, SELinux has been changed in four
different places.

Firstly, storage is required for storing per page SGX__EXECUTE and SGX__EXECMOD
bits. Given every enclave instance is uniquely tied to an open file (i.e.
struct file), the storage is allocated by extending `file_security_struct`.
More precisely, a new field `esec` has been added, initially zero, to point to
the data structure for tracking per page protection. `esec` will be
allocated/initialized at the first invocation of selinux_enclave_load().

Then, selinux_enclave_load() initializes those 2 bits for every new enclave
page as described above. One more detail worth noting is that
selinux_enclave_load() sets SGX__EXECUTE/SGX__EXECMOD for EAUG'ed pages (for
upcoming SGX2) only if the calling process has FILE__EXECMOD on the sigstruct
file.

Afterwards, every change on protection will go through selinux_file_mprotect()
so will be noted. Please note that user space could munmap() then mmap() to
work around mprotect(), but that "leak" could be "plugged" by SGX subsystem
calling security_file_mprotect() explicitly whenever new mappings are created.

Finally, the storage for page protection tracking must be freed when the
associated file is closed. Hence a new selinux_file_free_security() has been
added.

Signed-off-by: Cedric Xing <cedric.xing@intel.com>
---
 security/selinux/Makefile            |   2 +
 security/selinux/hooks.c             |  77 ++++++-
 security/selinux/include/intel_sgx.h |  18 ++
 security/selinux/include/objsec.h    |   3 +
 security/selinux/intel_sgx.c         | 292 +++++++++++++++++++++++++++
 5 files changed, 391 insertions(+), 1 deletion(-)
 create mode 100644 security/selinux/include/intel_sgx.h
 create mode 100644 security/selinux/intel_sgx.c

diff --git a/security/selinux/Makefile b/security/selinux/Makefile
index ccf950409384..58a05a9639e0 100644
--- a/security/selinux/Makefile
+++ b/security/selinux/Makefile
@@ -14,6 +14,8 @@ selinux-$(CONFIG_SECURITY_NETWORK_XFRM) += xfrm.o
 
 selinux-$(CONFIG_NETLABEL) += netlabel.o
 
+selinux-$(CONFIG_INTEL_SGX) += intel_sgx.o
+
 ccflags-y := -I$(srctree)/security/selinux -I$(srctree)/security/selinux/include
 
 $(addprefix $(obj)/,$(selinux-y)): $(obj)/flask.h
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 3ec702cf46ca..17f855871a41 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -103,6 +103,7 @@
 #include "netlabel.h"
 #include "audit.h"
 #include "avc_ss.h"
+#include "intel_sgx.h"
 
 struct selinux_state selinux_state;
 
@@ -3485,6 +3486,11 @@ static int selinux_file_alloc_security(struct file *file)
 	return file_alloc_security(file);
 }
 
+static void selinux_file_free_security(struct file *file)
+{
+	sgxsec_enclave_free(file);
+}
+
 /*
  * Check whether a task has the ioctl permission and cmd
  * operation to an inode.
@@ -3656,6 +3662,7 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 				 unsigned long reqprot,
 				 unsigned long prot)
 {
+	int rc;
 	const struct cred *cred = current_cred();
 	u32 sid = cred_sid(cred);
 
@@ -3664,7 +3671,7 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 
 	if (default_noexec &&
 	    (prot & PROT_EXEC) && !(vma->vm_flags & VM_EXEC)) {
-		int rc = 0;
+		rc = 0;
 		if (vma->vm_start >= vma->vm_mm->start_brk &&
 		    vma->vm_end <= vma->vm_mm->brk) {
 			rc = avc_has_perm(&selinux_state,
@@ -3691,6 +3698,12 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
 			return rc;
 	}
 
+#ifdef CONFIG_INTEL_SGX
+	rc = sgxsec_mprotect(vma, prot);
+	if (rc <= 0)
+		return rc;
+#endif
+
 	return file_map_prot_check(vma->vm_file, prot, vma->vm_flags&VM_SHARED);
 }
 
@@ -6726,6 +6739,62 @@ static void selinux_bpf_prog_free(struct bpf_prog_aux *aux)
 }
 #endif
 
+#ifdef CONFIG_INTEL_SGX
+
+static int selinux_enclave_load(struct file *encl, unsigned long addr,
+				unsigned long size, unsigned long prot,
+				struct vm_area_struct *source)
+{
+	if (source) {
+		/**
+		 * Adding page from source => EADD request
+		 */
+		int rc = selinux_file_mprotect(source, prot, prot);
+		if (rc)
+			return rc;
+
+		if (!(prot & VM_EXEC) &&
+		    selinux_file_mprotect(source, VM_EXEC, VM_EXEC))
+			prot = 0;
+		else {
+			prot = SGX__EXECUTE;
+			if (source->vm_file &&
+			    !file_has_perm(current_cred(), source->vm_file,
+					   FILE__EXECMOD))
+				prot |= SGX__EXECMOD;
+		}
+		return sgxsec_eadd(encl, addr, size, prot);
+	} else {
+		/**
+		  * Adding page from NULL => EAUG request
+		  */
+		return sgxsec_eaug(encl, addr, size, prot);
+	}
+}
+
+static int selinux_enclave_init(struct file *encl,
+				const struct sgx_sigstruct *sigstruct,
+				struct vm_area_struct *vma)
+{
+	int rc = 0;
+
+	if (!vma)
+		rc = -EINVAL;
+
+	if (!rc && !(vma->vm_flags & VM_EXEC))
+		rc = selinux_file_mprotect(vma, VM_EXEC, VM_EXEC);
+
+	if (!rc) {
+		if (vma->vm_file)
+			rc = file_has_perm(current_cred(), vma->vm_file,
+					   FILE__EXECMOD);
+		rc = sgxsec_einit(encl, sigstruct, !rc);
+	}
+	return rc;
+}
+
+#endif
+
 struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
 	.lbs_cred = sizeof(struct task_security_struct),
 	.lbs_file = sizeof(struct file_security_struct),
@@ -6808,6 +6877,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 
 	LSM_HOOK_INIT(file_permission, selinux_file_permission),
 	LSM_HOOK_INIT(file_alloc_security, selinux_file_alloc_security),
+	LSM_HOOK_INIT(file_free_security, selinux_file_free_security),
 	LSM_HOOK_INIT(file_ioctl, selinux_file_ioctl),
 	LSM_HOOK_INIT(mmap_file, selinux_mmap_file),
 	LSM_HOOK_INIT(mmap_addr, selinux_mmap_addr),
@@ -6968,6 +7038,11 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
 	LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free),
 	LSM_HOOK_INIT(bpf_prog_free_security, selinux_bpf_prog_free),
 #endif
+
+#ifdef CONFIG_INTEL_SGX
+	LSM_HOOK_INIT(enclave_load, selinux_enclave_load),
+	LSM_HOOK_INIT(enclave_init, selinux_enclave_init),
+#endif
 };
 
 static __init int selinux_init(void)
diff --git a/security/selinux/include/intel_sgx.h b/security/selinux/include/intel_sgx.h
new file mode 100644
index 000000000000..8f9c6c734921
--- /dev/null
+++ b/security/selinux/include/intel_sgx.h
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+// Copyright(c) 2016-18 Intel Corporation.
+
+#ifndef _SELINUX_SGXSEC_H_
+#define _SELINUX_SGXSEC_H_
+
+#include <linux/lsm_hooks.h>
+
+#define SGX__EXECUTE	1
+#define SGX__EXECMOD	2
+
+void sgxsec_enclave_free(struct file *);
+int sgxsec_mprotect(struct vm_area_struct *, size_t);
+int sgxsec_eadd(struct file *, size_t, size_t, size_t);
+int sgxsec_eaug(struct file *, size_t, size_t, size_t);
+int sgxsec_einit(struct file *, const struct sgx_sigstruct *, int);
+
+#endif
diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
index 231262d8eac9..0fb4da7e3a8a 100644
--- a/security/selinux/include/objsec.h
+++ b/security/selinux/include/objsec.h
@@ -71,6 +71,9 @@ struct file_security_struct {
 	u32 fown_sid;		/* SID of file owner (for SIGIO) */
 	u32 isid;		/* SID of inode at the time of file open */
 	u32 pseqno;		/* Policy seqno at the time of file open */
+#ifdef CONFIG_INTEL_SGX
+	atomic_long_t esec;
+#endif
 };
 
 struct superblock_security_struct {
diff --git a/security/selinux/intel_sgx.c b/security/selinux/intel_sgx.c
new file mode 100644
index 000000000000..37dacf5c295f
--- /dev/null
+++ b/security/selinux/intel_sgx.c
@@ -0,0 +1,292 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+// Copyright(c) 2016-18 Intel Corporation.
+
+#include "objsec.h"
+#include "intel_sgx.h"
+
+struct region {
+	struct list_head	link;
+	size_t			start;
+	size_t			end;
+	size_t			data;
+};
+
+static inline struct region *region_new(void)
+{
+	struct region *n = kzalloc(sizeof(struct region), GFP_KERNEL);
+	if (n)
+		INIT_LIST_HEAD(&n->link);
+	return n;
+}
+
+static inline void region_free(struct region *r)
+{
+	list_del(&r->link);
+	kfree(r);
+}
+
+static struct list_head *
+region_apply_to_range(struct list_head *rgs,
+		      size_t start, size_t end,
+		      struct list_head *(*cb)(struct region *,
+					      size_t, size_t, size_t),
+		      size_t arg)
+{
+	struct region *r, *n;
+
+	list_for_each_entry(r, rgs, link)
+		if (start < r->end)
+			break;
+
+	if (&r->link == rgs || end <= r->start)
+		return rgs;
+
+	do {
+		struct list_head *ret;
+		n = list_next_entry(r, link);
+		ret = (*cb)(r, start, end, arg);
+		if (ret)
+			return ret;
+		r = n;
+	} while (&r->link != rgs && r->start < end);
+	return &r->link;
+}
+
+static struct list_head *
+region_clear_cb(struct region *r, size_t start, size_t end, size_t arg)
+{
+	if (end < r->end) {
+		if (start > r->start) {
+			struct region *n = region_new();
+			if (unlikely(!n))
+				return ERR_PTR(-ENOMEM);
+
+			n->start = r->start;
+			n->end = start;
+			n->data = r->data;
+			list_add_tail(&n->link, &r->link);
+		}
+		r->start = end;
+		return &r->link;
+	}
+
+	if (start > r->start)
+		r->end = start;
+	else
+		region_free(r);
+	return NULL;
+}
+
+static inline struct list_head *
+region_clear_range(struct list_head *rgs, size_t start, size_t end)
+{
+	return region_apply_to_range(rgs, start, end, region_clear_cb, 0);
+}
+
+static struct list_head *
+region_add_range(struct list_head *rgs, size_t start, size_t end, size_t data)
+{
+	struct region *r, *n;
+
+	n = list_entry(region_clear_range(rgs, start, end), typeof(*n), link);
+	if (unlikely(IS_ERR_VALUE(&n->link)))
+		return &n->link;
+
+	if (&n->link != rgs && end == n->start && data == n->data) {
+		n->start = start;
+		r = n;
+	} else {
+		r = region_new();
+		if (unlikely(!r))
+			return ERR_PTR(-ENOMEM);
+
+		r->start = start;
+		r->end = end;
+		r->data = data;
+		list_add_tail(&r->link, &n->link);
+	}
+
+	n = list_prev_entry(r, link);
+	if (&n->link != rgs && start == n->end && data == n->data) {
+		r->start = n->start;
+		region_free(n);
+	}
+
+	return &r->link;
+}
+
+static inline int
+enclave_add_pages(struct list_head *rgs, size_t start, size_t end, size_t flags)
+{
+	void *p = region_add_range(rgs, start, end, flags);
+	return PTR_ERR_OR_ZERO(p);
+}
+
+static inline int enclave_prot_allowed(size_t prot, size_t flags)
+{
+	return !(prot & VM_EXEC) || (flags & SGX__EXECUTE);
+}
+
+static struct list_head *
+enclave_prot_check_cb(struct region *r, size_t start, size_t end, size_t prot)
+{
+	if (!enclave_prot_allowed(prot, r->data))
+		return ERR_PTR(-EACCES);
+	return NULL;
+}
+
+static struct list_head *
+enclave_prot_set_cb(struct region *r, size_t start, size_t end, size_t prot)
+{
+	BUG_ON(!enclave_prot_allowed(prot, r->data));
+
+	if (!(prot & VM_WRITE) ||
+	    (r->data & SGX__EXECMOD) ||
+	    !(r->data & SGX__EXECUTE))
+		return NULL;
+
+	if (end < r->end) {
+		struct region *n = region_new();
+		if (unlikely(!n))
+			return ERR_PTR(-ENOMEM);
+
+		n->start = end;
+		n->end = r->end;
+		n->data = r->data;
+		r->end = end;
+		list_add(&n->link, &r->link);
+	}
+
+	if (start > r->start) {
+		struct region *n = region_new();
+		if (unlikely(!n))
+			return ERR_PTR(-ENOMEM);
+
+		n->start = r->start;
+		n->end = start;
+		n->data = r->data;
+		r->start = start;
+		list_add_tail(&n->link, &r->link);
+	}
+
+	r->data &= ~SGX__EXECUTE;
+	return NULL;
+}
+
+static inline int
+enclave_mprotect(struct list_head *rgs, size_t start, size_t end, size_t prot)
+{
+	void *ret;
+
+	ret = region_apply_to_range(rgs, start, end,
+				    enclave_prot_check_cb, prot);
+	if (!IS_ERR_VALUE(ret) && (prot & VM_WRITE))
+		ret = region_apply_to_range(rgs, start, end,
+					    enclave_prot_set_cb, prot);
+	return PTR_ERR_OR_ZERO(ret);
+}
+
+struct enclave_sec {
+	struct rw_semaphore	sem;
+	struct list_head	regions;
+	size_t			eaug_perm;
+};
+
+static inline struct enclave_sec *__esec(struct file_security_struct *fsec)
+{
+	return (struct enclave_sec *)atomic_long_read(&fsec->esec);
+}
+
+static struct enclave_sec *encl_esec(struct file *encl)
+{
+	struct file_security_struct *fsec = selinux_file(encl);
+	struct enclave_sec *esec = __esec(fsec);
+
+	if (unlikely(!esec)) {
+		long n;
+
+		esec = kzalloc(sizeof(*esec), GFP_KERNEL);
+		if (!esec)
+			return NULL;
+
+		init_rwsem(&esec->sem);
+		INIT_LIST_HEAD(&esec->regions);
+
+		n = atomic_long_cmpxchg(&fsec->esec, 0, (long)esec);
+		if (n) {
+			kfree(esec);
+			esec = (typeof(esec))n;
+		}
+	}
+
+	return esec;
+}
+
+void sgxsec_enclave_free(struct file *encl)
+{
+	struct enclave_sec *esec = __esec(selinux_file(encl));
+
+	if (esec) {
+		struct region *r, *n;
+
+		BUG_ON(rwsem_is_locked(&esec->sem));
+
+		list_for_each_entry_safe(r, n, &esec->regions, link)
+			region_free(r);
+
+		kfree(esec);
+	}
+}
+
+int sgxsec_mprotect(struct vm_area_struct *vma, size_t prot)
+{
+	struct enclave_sec *esec;
+	int rc;
+
+	if (!vma->vm_file || !(esec = __esec(selinux_file(vma->vm_file)))) {
+		/* Positive return value indicates non-enclave VMA */
+		return 1;
+	}
+
+	down_read(&esec->sem);
+	rc = enclave_mprotect(&esec->regions, vma->vm_start, vma->vm_end, prot);
+	up_read(&esec->sem);
+	return rc;
+}
+
+int sgxsec_eadd(struct file *encl, size_t start, size_t size, size_t perm)
+{
+	struct enclave_sec *esec = encl_esec(encl);
+	int rc;
+
+	if (down_write_killable(&esec->sem))
+		return -EINTR;
+	rc = enclave_add_pages(&esec->regions, start, start + size, perm);
+	up_write(&esec->sem);
+	return rc;
+}
+
+int sgxsec_eaug(struct file *encl, size_t start, size_t size, size_t prot)
+{
+	struct enclave_sec *esec = encl_esec(encl);
+	int rc = -EPERM;
+
+	if (down_write_killable(&esec->sem))
+		return -EINTR;
+	if (enclave_prot_allowed(prot, esec->eaug_perm))
+		rc = enclave_add_pages(&esec->regions, start, start + size,
+				       esec->eaug_perm);
+	up_write(&esec->sem);
+	return rc;
+}
+
+int sgxsec_einit(struct file *encl, const struct sgx_sigstruct *sigstruct, int execmod)
+{
+	struct enclave_sec *esec = encl_esec(encl);
+
+	if (down_write_killable(&esec->sem))
+		return -EINTR;
+	esec->eaug_perm = execmod ? SGX__EXECUTE | SGX__EXECMOD : 0;
+	up_write(&esec->sem);
+	return 0;
+}
-- 
2.17.1


* [RFC PATCH v1 3/3] LSM/x86/sgx: Call new LSM hooks from SGX subsystem
  2019-06-10  7:03 ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Cedric Xing
  2019-06-10  7:03   ` [RFC PATCH v1 1/3] LSM/x86/sgx: Add " Cedric Xing
  2019-06-10  7:03   ` [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux Cedric Xing
@ 2019-06-10  7:03   ` Cedric Xing
  2019-06-10 17:36   ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Jarkko Sakkinen
  3 siblings, 0 replies; 67+ messages in thread
From: Cedric Xing @ 2019-06-10  7:03 UTC (permalink / raw)
  To: linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: Cedric Xing, jarkko.sakkinen, luto, sds, jmorris, serge, paul,
	eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

LSM hooks are called from three places within the SGX subsystem.

First, sgx_mmap() invokes security_file_mprotect() to validate the requested
protection. Given the architecture of the SGX subsystem, all enclaves look like
file mappings of the /dev/sgx/enclave device file, meaning the existing
security_mmap_file() invoked inside vm_mmap_pgoff() cannot provide any
meaningful information to the LSM. Based on the idea that mmap(prot) is
equivalent to mmap(PROT_NONE) followed by mprotect(prot),
security_file_mprotect() is queried instead, with more specific enclave/page
information.

Secondly, security_enclave_load() is invoked upon loading of every enclave
page.

Lastly, security_enclave_init() is invoked before initializing (EINIT) every
enclave.
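
The equivalence that the first call site relies on, mmap(prot) being the same
as mmap(PROT_NONE) followed by mprotect(prot), can be seen from userspace. The
following is a minimal sketch on an anonymous private mapping, not on
/dev/sgx/enclave; map_in_two_steps is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Create a mapping with no access rights, then raise it to the requested
 * protections with mprotect(). Returns the mapping, or NULL on failure.
 */
static void *map_in_two_steps(size_t len, int prot)
{
	void *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* This is the step that an LSM sees through its file_mprotect hook. */
	if (mprotect(p, len, prot) != 0) {
		munmap(p, len);
		return NULL;
	}

	return p;
}
```

From the caller's point of view the result is indistinguishable from a single
mmap(len, prot) call, which is why routing the check through
security_file_mprotect() in sgx_mmap() covers protections requested at mmap()
time.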

Signed-off-by: Cedric Xing <cedric.xing@intel.com>
---
 arch/x86/kernel/cpu/sgx/driver/ioctl.c | 72 +++++++++++++++++++++++---
 arch/x86/kernel/cpu/sgx/driver/main.c  | 12 ++++-
 2 files changed, 74 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/driver/ioctl.c b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
index b186fb7b48d5..a3f22a6f6d2b 100644
--- a/arch/x86/kernel/cpu/sgx/driver/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
@@ -11,6 +11,7 @@
 #include <linux/shmem_fs.h>
 #include <linux/slab.h>
 #include <linux/suspend.h>
+#include <linux/security.h>
 #include "driver.h"
 
 struct sgx_add_page_req {
@@ -575,6 +576,42 @@ static int sgx_encl_add_page(struct sgx_encl *encl, unsigned long addr,
 	return ret;
 }
 
+static int sgx_encl_prepare_page(struct file *filp, unsigned long dst,
+				 unsigned long src, void *buf)
+{
+	struct vm_area_struct *vma;
+	unsigned long prot;
+	int rc = 0;
+
+	if (dst & ~PAGE_MASK)
+		return -EINVAL;
+
+	down_read(&current->mm->mmap_sem);
+
+	vma = find_vma(current->mm, dst);
+	if (vma && dst >= vma->vm_start)
+		prot = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+	else
+		prot = 0;
+
+	vma = find_vma(current->mm, src);
+	if (!vma || src < vma->vm_start || src + PAGE_SIZE > vma->vm_end)
+		rc = -EFAULT;
+
+	if (!rc && !(vma->vm_flags & VM_MAYEXEC))
+		rc = -EACCES;
+
+	if (!rc)
+		rc = security_enclave_load(filp, dst, PAGE_SIZE, prot, vma);
+
+	if (!rc && copy_from_user(buf, (void __user *)src, PAGE_SIZE))
+		rc = -EFAULT;
+
+	up_read(&current->mm->mmap_sem);
+
+	return rc;
+}
+
 /**
  * sgx_ioc_enclave_add_page - handler for %SGX_IOC_ENCLAVE_ADD_PAGE
  *
@@ -613,10 +650,9 @@ static long sgx_ioc_enclave_add_page(struct file *filep, unsigned int cmd,
 
 	data = kmap(data_page);
 
-	if (copy_from_user((void *)data, (void __user *)addp->src, PAGE_SIZE)) {
-		ret = -EFAULT;
+	ret = sgx_encl_prepare_page(filep, addp->addr, addp->src, data);
+	if (ret)
 		goto out;
-	}
 
 	ret = sgx_encl_add_page(encl, addp->addr, data, &secinfo, addp->mrmask);
 	if (ret)
@@ -718,6 +754,29 @@ static int sgx_encl_init(struct sgx_encl *encl, struct sgx_sigstruct *sigstruct,
 	return ret;
 }
 
+static int sgx_encl_prepare_sigstruct(struct file *filp, unsigned long src,
+				      struct sgx_sigstruct *ss)
+{
+	struct vm_area_struct *vma;
+	int rc = 0;
+
+	down_read(&current->mm->mmap_sem);
+
+	vma = find_vma(current->mm, src);
+	if (!vma || src < vma->vm_start || src + sizeof(*ss) > vma->vm_end)
+		rc = -EFAULT;
+
+	if (!rc && copy_from_user(ss, (void __user *)src, sizeof(*ss)))
+		rc = -EFAULT;
+
+	if (!rc)
+		rc = security_enclave_init(filp, ss, vma);
+
+	up_read(&current->mm->mmap_sem);
+
+	return rc;
+}
+
 /**
  * sgx_ioc_enclave_init - handler for %SGX_IOC_ENCLAVE_INIT
  *
@@ -753,12 +812,9 @@ static long sgx_ioc_enclave_init(struct file *filep, unsigned int cmd,
 		((unsigned long)sigstruct + PAGE_SIZE / 2);
 	memset(einittoken, 0, sizeof(*einittoken));
 
-	if (copy_from_user(sigstruct, (void __user *)initp->sigstruct,
-			   sizeof(*sigstruct))) {
-		ret = -EFAULT;
+	ret = sgx_encl_prepare_sigstruct(filep, initp->sigstruct, sigstruct);
+	if (ret)
 		goto out;
-	}
-
 
 	ret = sgx_encl_init(encl, sigstruct, einittoken);
 
diff --git a/arch/x86/kernel/cpu/sgx/driver/main.c b/arch/x86/kernel/cpu/sgx/driver/main.c
index 58ba6153070b..c634df440c16 100644
--- a/arch/x86/kernel/cpu/sgx/driver/main.c
+++ b/arch/x86/kernel/cpu/sgx/driver/main.c
@@ -63,14 +63,22 @@ static long sgx_compat_ioctl(struct file *filep, unsigned int cmd,
 static int sgx_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct sgx_encl *encl = file->private_data;
+	unsigned long prot;
+	int rc;
 
 	vma->vm_ops = &sgx_vm_ops;
 	vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO;
 	vma->vm_private_data = encl;
 
-	kref_get(&encl->refcount);
+	prot = vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC);
+	vma->vm_flags &= ~prot;
+	rc = security_file_mprotect(vma, prot, prot);
+	if (!rc) {
+		vma->vm_flags |= prot;
+		kref_get(&encl->refcount);
+	}
 
-	return 0;
+	return rc;
 }
 
 static unsigned long sgx_get_unmapped_area(struct file *file,
-- 
2.17.1


* Re: [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect()
  2019-06-06  2:11 ` [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect() Sean Christopherson
@ 2019-06-10 15:06   ` Jarkko Sakkinen
  2019-06-10 15:55     ` Sean Christopherson
  0 siblings, 1 reply; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-10 15:06 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Wed, Jun 05, 2019 at 07:11:41PM -0700, Sean Christopherson wrote:
> SGX will use the may_mprotect() hook to prevent userspace from
> circumventing various security checks, e.g. Linux Security Modules.
> Naming it may_mprotect() instead of simply mprotect() is intended to
> reflect the hook's purpose as a way to gate mprotect() as opposed to
> a wholesale replacement.

"This commit adds may_mprotect() to struct vm_operations_struct, which
can be used to ask the owner of a VMA whether mprotect() is allowed."

This would be a more appropriate statement because that is precisely what
the code change aims for. I did not even understand what you meant by
gating in this context. I would leave SGX and LSMs (and especially
"various security checks", which means absolutely nothing) out of the
first paragraph completely.

> Enclaves are built by copying data from normal memory into the Enclave
> Page Cache (EPC).  Due to the nature of SGX, the EPC is represented by a
> single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
> MAP_SHARED vm_file that references single file path.  Furthermore, all
> enclaves will need read, write and execute pages in the EPC.

I would just say that "Due to the fact that EPC is delivered as IO
memory from the preboot firmware, it can be only mapped as MAP_SHARED".
It is what it is.

> As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
> can deny access to the EPC as a whole, but can't deny PROT_EXEC on page
> that originated in a non-EXECUTE file (which is long gone by the time
> mprotect() is called).

I have a hard time following what this paragraph is trying to say.

> By hooking mprotect(), SGX can make explicit LSM upcalls while an
> enclave is being built, i.e. when the kernel has a handle to origin of
> each enclave page, and enforce the result of the LSM policy whenever
> userspace maps the enclave page in the future.

"LSM policy whenever calls mprotect()"? I'm not sure what you mean by
mapping here and whether there is any need to talk about the future. Isn't
this needed now?

> Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
> that approach is quite ugly, e.g. would require userspace to call an
> SGX ioctl() prior to using mprotect() to extend a page's protections.

Instead of talking about "playing games" I would state what could be done
with VM_MAY{READ,WRITE,EXEC} and why it is bad. It leaves questions otherwise.

> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
>  include/linux/mm.h |  2 ++
>  mm/mprotect.c      | 15 +++++++++++----
>  2 files changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e8834ac32b7..a697996040ac 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -458,6 +458,8 @@ struct vm_operations_struct {
>  	void (*close)(struct vm_area_struct * area);
>  	int (*split)(struct vm_area_struct * area, unsigned long addr);
>  	int (*mremap)(struct vm_area_struct * area);
> +	int (*may_mprotect)(struct vm_area_struct * area, unsigned long start,
> +			    unsigned long end, unsigned long prot);

Could be just boolean.

/Jarkko

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-06  2:11 ` [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits Sean Christopherson
@ 2019-06-10 15:27   ` Jarkko Sakkinen
  2019-06-10 16:15     ` Sean Christopherson
  2019-06-10 18:29   ` Xing, Cedric
  1 sibling, 1 reply; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-10 15:27 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Wed, Jun 05, 2019 at 07:11:42PM -0700, Sean Christopherson wrote:
> [SNAP]

Same general criticism as for the previous patch: try to say things as
they are without anything extra.

> A third alternative would be to pull the protection bits from the page's
> SECINFO, i.e. make decisions based on the protections enforced by
> hardware.  However, with SGX2, userspace can extend the hardware-
> enforced protections via ENCLU[EMODPE], e.g. can add a page as RW and
> later convert it to RX.  With SGX2, making a decision based on the
> initial protections would either create a security hole or force SGX to
> dynamically track "dirty" pages (see first alternative above).
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

'flags' should be renamed to 'secinfo_flags_mask', even if the name is
longish. It would use the same values as the SECINFO flags. The field in
struct sgx_encl_page should have the same name. That would express the
exact relation between SECINFO and the new field. With better naming, I
would never have asked in the last iteration why SECINFO is not enough.

The same field can be also used to cage page type to a subset of values.

/Jarkko

* Re: [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect()
  2019-06-10 15:06   ` Jarkko Sakkinen
@ 2019-06-10 15:55     ` Sean Christopherson
  2019-06-10 17:47       ` Xing, Cedric
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-10 15:55 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Mon, Jun 10, 2019 at 06:06:00PM +0300, Jarkko Sakkinen wrote:
> On Wed, Jun 05, 2019 at 07:11:41PM -0700, Sean Christopherson wrote:
> > SGX will use the may_mprotect() hook to prevent userspace from
> > circumventing various security checks, e.g. Linux Security Modules.
> > Naming it may_mprotect() instead of simply mprotect() is intended to
> > reflect the hook's purpose as a way to gate mprotect() as opposed to
> > a wholesale replacement.
> 
> "This commit adds may_mprotect() to struct vm_operations_struct, which
> can be used to ask the owner of a VMA whether mprotect() is allowed."
> 
> This would be a more appropriate statement because that is precisely what
> the code change aims for. I did not even understand what you meant by
> gating in this context. I would leave SGX and LSMs (and especially
> "various security checks", which means absolutely nothing) out of the
> first paragraph completely.
> 
> > Enclaves are built by copying data from normal memory into the Enclave
> > Page Cache (EPC).  Due to the nature of SGX, the EPC is represented by a
> > single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
> > MAP_SHARED vm_file that references single file path.  Furthermore, all
> > enclaves will need read, write and execute pages in the EPC.
> 
> I would just say that "Due to the fact that EPC is delivered as IO
> memory from the preboot firmware, it can be only mapped as MAP_SHARED".
> It is what it is.

I was trying to convey that the nature of SGX itself requires that an
enclave's pages are shared between processes.  E.g. {MAP,VM}_SHARED would be
required even if we modified the mmu to handle EPC memory in such a way
that it didn't have to be tagged with VM_PFNMAP.

> > As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
> > can deny access to the EPC as a whole, but can't deny PROT_EXEC on page
> > that originated in a non-EXECUTE file (which is long gone by the time
> > mprotect() is called).
> 
> I have a hard time following what this paragraph is trying to say.
> 
> > By hooking mprotect(), SGX can make explicit LSM upcalls while an
> > enclave is being built, i.e. when the kernel has a handle to origin of
> > each enclave page, and enforce the result of the LSM policy whenever
> > userspace maps the enclave page in the future.
> 
> "LSM policy whenever calls mprotect()"? I'm not sure what you mean by
> mapping here and whether there is any need to talk about the future. Isn't
> this needed now?

Future is referring to the timeline of a running kernel, not the future
of the kernel code.

Rather than trying to explain all of the above with words, I'll provide
code examples to show how ->may_mprotect() will be used by SGX and why it
is the preferred solution.
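
For readers of the archive, the gating idea can be pictured with a userspace
model. The sketch below is an illustration only and is not code from this
series: the hook signature follows the may_mprotect() prototype quoted below,
while vma_model, vm_ops_model, max_prot, enclave_may_mprotect, mprotect_model
and the MODEL_PROT_* values are hypothetical names and assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical protection bits for the model. */
#define MODEL_PROT_READ  1ul
#define MODEL_PROT_WRITE 2ul
#define MODEL_PROT_EXEC  4ul

struct vma_model;

/* Same shape as the hook added to struct vm_operations_struct. */
struct vm_ops_model {
	int (*may_mprotect)(struct vma_model *vma, unsigned long start,
			    unsigned long end, unsigned long prot);
};

struct vma_model {
	unsigned long start;
	unsigned long end;
	unsigned long prot;      /* protections currently in effect */
	unsigned long max_prot;  /* maximal protections tracked by the owner */
	const struct vm_ops_model *ops;
};

/* Owner-side gate: deny any bit beyond the recorded maximal protections. */
static int enclave_may_mprotect(struct vma_model *vma, unsigned long start,
				unsigned long end, unsigned long prot)
{
	(void)start;
	(void)end;
	return (prot & ~vma->max_prot) ? -EACCES : 0;
}

/* Core-side model: consult the hook first, then apply the change. */
static int mprotect_model(struct vma_model *vma, unsigned long start,
			  unsigned long end, unsigned long prot)
{
	assert(vma != NULL);

	if (start >= end || start < vma->start || end > vma->end)
		return -ENOMEM;

	if (vma->ops && vma->ops->may_mprotect) {
		int rc = vma->ops->may_mprotect(vma, start, end, prot);

		if (rc)
			return rc;
	}

	vma->prot = prot;
	return 0;
}
```

The hook only answers yes or no (with an error code); the protection change
itself stays in the generic path, which is the distinction between gating
mprotect() and replacing it.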

> > Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
> > that approach is quite ugly, e.g. would require userspace to call an
> > SGX ioctl() prior to using mprotect() to extend a page's protections.
> 
> Instead of talking about "playing games" I would state what could be done
> with VM_MAY{READ,WRITE,EXEC} and why it is bad. It leaves questions otherwise.
> 
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
> >  include/linux/mm.h |  2 ++
> >  mm/mprotect.c      | 15 +++++++++++----
> >  2 files changed, 13 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 0e8834ac32b7..a697996040ac 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -458,6 +458,8 @@ struct vm_operations_struct {
> >  	void (*close)(struct vm_area_struct * area);
> >  	int (*split)(struct vm_area_struct * area, unsigned long addr);
> >  	int (*mremap)(struct vm_area_struct * area);
> > +	int (*may_mprotect)(struct vm_area_struct * area, unsigned long start,
> > +			    unsigned long end, unsigned long prot);
> 
> Could be just boolean.
> 
> /Jarkko

* Re: [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves
  2019-06-06  2:11 ` [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves Sean Christopherson
@ 2019-06-10 16:00   ` Jarkko Sakkinen
  2019-06-10 16:44     ` Andy Lutomirski
  0 siblings, 1 reply; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-10 16:00 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Wed, Jun 05, 2019 at 07:11:43PM -0700, Sean Christopherson wrote:
> +		goto out;
> +	}
> +
> +	/*
> +	 * Query VM_MAYEXEC as an indirect path_noexec() check (see do_mmap()),
> +	 * but with some future proofing against other cases that may deny
> +	 * execute permissions.
> +	 */
> +	if (!(vma->vm_flags & VM_MAYEXEC)) {
> +		ret = -EACCES;
> +		goto out;
> +	}
> +
> +	if (copy_from_user(dst, (void __user *)src, PAGE_SIZE))
> +		ret = -EFAULT;
> +	else
> +		ret = 0;
> +
> +out:
> +	up_read(&current->mm->mmap_sem);
> +
> +	return ret;
> +}

I would suggest expressing the above like this instead, for clarity
and consistency:

		goto err_mmap_sem;
	}

	/* Query VM_MAYEXEC as an indirect path_noexec() check
	 * (see do_mmap()).
	 */
	if (!(vma->vm_flags & VM_MAYEXEC)) {
		ret = -EACCES;
		goto err_mmap_sem;
	}

	if (copy_from_user(dst, (void __user *)src, PAGE_SIZE)) {
		ret = -EFAULT;
		goto err_mmap_sem;
	}

	up_read(&current->mm->mmap_sem);
	return 0;

err_mmap_sem:
	up_read(&current->mm->mmap_sem);
	return ret;
}

The comment about future proofing is unnecessary.

/Jarkko

* Re: [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX
  2019-06-06  2:11 ` [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX Sean Christopherson
  2019-06-07 19:58   ` Stephen Smalley
@ 2019-06-10 16:05   ` Jarkko Sakkinen
  1 sibling, 0 replies; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-10 16:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Wed, Jun 05, 2019 at 07:11:44PM -0700, Sean Christopherson wrote:
> enclave_load() is roughly analogous to the existing file_mprotect().
> 
> Due to the nature of SGX and its Enclave Page Cache (EPC), all enclave
> VMAs are backed by a single file, i.e. /dev/sgx/enclave, that must be
> MAP_SHARED.  Furthermore, all enclaves need read, write and execute
> VMAs.  As a result, the existing/standard call to file_mprotect() does
> not provide any meaningful security for enclaves since an LSM can only
> deny/grant access to the EPC as a whole.
> 
> security_enclave_load() is called when SGX is first loading an enclave
> page, i.e. copying a page from normal memory into the EPC.  Although
> the prototype for enclave_load() is similar to file_mprotect(), e.g.
> SGX could theoretically use file_mprotect() and set reqprot=prot, a
> separate hook is desirable as the semantics of an enclave's protection
> bits are different than those of vmas, e.g. an enclave page tracks the
> maximal set of protections, whereas file_mprotect() operates on the
> actual protections being provided.  In other words, LSMs will likely
> want to implement different policies for enclave page protections.
> 
> Note, extensive discussion yielded no sane alternative to some form of
> SGX specific LSM hook[1].
> 
> [1] https://lkml.kernel.org/r/CALCETrXf8mSK45h7sTK5Wf+pXLVn=Bjsc_RLpgO-h-qdzBRo5Q@mail.gmail.com
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>

4/5 and 5/5 should only be added after upstreaming SGX.

/Jarkko

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-10 15:27   ` Jarkko Sakkinen
@ 2019-06-10 16:15     ` Sean Christopherson
  2019-06-10 17:45       ` Jarkko Sakkinen
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-10 16:15 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Mon, Jun 10, 2019 at 06:27:17PM +0300, Jarkko Sakkinen wrote:
> On Wed, Jun 05, 2019 at 07:11:42PM -0700, Sean Christopherson wrote:
> > [SNAP]
> 
> Same general criticism as for the previous patch: try to say things as
> they are without anything extra.
> 
> > A third alternative would be to pull the protection bits from the page's
> > SECINFO, i.e. make decisions based on the protections enforced by
> > hardware.  However, with SGX2, userspace can extend the hardware-
> > enforced protections via ENCLU[EMODPE], e.g. can add a page as RW and
> > later convert it to RX.  With SGX2, making a decision based on the
> > initial protections would either create a security hole or force SGX to
> > dynamically track "dirty" pages (see first alternative above).
> > 
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> 
> 'flags' should would renamed as 'secinfo_flags_mask' even if the name is
> longish. It would use the same values as the SECINFO flags. The field in
> struct sgx_encl_page should have the same name. That would express
> exactly relation between SECINFO and the new field. I would have never
> asked on last iteration why SECINFO is not enough with a better naming.

No, these flags do not impact the EPCM protections in any way.  Userspace
can extend the EPCM protections without going through the kernel.  The
protection flags for an enclave page impact VMA/PTE protection bits.

IMO, it is best to treat the EPCM as being completely separate from the
kernel's EPC management.

> The same field can be also used to cage page type to a subset of values.
> 
> /Jarkko

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX
  2019-06-07 19:58   ` Stephen Smalley
@ 2019-06-10 16:21     ` Sean Christopherson
  0 siblings, 0 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-10 16:21 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Jarkko Sakkinen, Andy Lutomirski, Cedric Xing, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Fri, Jun 07, 2019 at 03:58:34PM -0400, Stephen Smalley wrote:
> On 6/5/19 10:11 PM, Sean Christopherson wrote:
> >enclave_load() is roughly analogous to the existing file_mprotect().
> >
> >Due to the nature of SGX and its Enclave Page Cache (EPC), all enclave
> >VMAs are backed by a single file, i.e. /dev/sgx/enclave, that must be
> >MAP_SHARED.  Furthermore, all enclaves need read, write and execute
> >VMAs.  As a result, the existing/standard call to file_mprotect() does
> >not provide any meaningful security for enclaves since an LSM can only
> >deny/grant access to the EPC as a whole.
> >
> >security_enclave_load() is called when SGX is first loading an enclave
> >page, i.e. copying a page from normal memory into the EPC.  Although
> >the prototype for enclave_load() is similar to file_mprotect(), e.g.
> >SGX could theoretically use file_mprotect() and set reqprot=prot, a
> >separate hook is desirable as the semantics of an enclave's protection
> >bits are different than those of vmas, e.g. an enclave page tracks the
> >maximal set of protections, whereas file_mprotect() operates on the
> >actual protections being provided.  In other words, LSMs will likely
> >want to implement different policies for enclave page protections.
> >
> >Note, extensive discussion yielded no sane alternative to some form of
> >SGX specific LSM hook[1].
> >
> >[1] https://lkml.kernel.org/r/CALCETrXf8mSK45h7sTK5Wf+pXLVn=Bjsc_RLpgO-h-qdzBRo5Q@mail.gmail.com
> >
> >Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >---
> >  arch/x86/kernel/cpu/sgx/driver/ioctl.c | 12 ++++++------
> >  include/linux/lsm_hooks.h              | 13 +++++++++++++
> >  include/linux/security.h               | 12 ++++++++++++
> >  security/security.c                    |  7 +++++++
> >  4 files changed, 38 insertions(+), 6 deletions(-)
> >
> >diff --git a/arch/x86/kernel/cpu/sgx/driver/ioctl.c b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
> >index 44b2d73de7c3..29c0df672250 100644
> >--- a/arch/x86/kernel/cpu/sgx/driver/ioctl.c
> >+++ b/arch/x86/kernel/cpu/sgx/driver/ioctl.c
> >@@ -8,6 +8,7 @@
> >  #include <linux/highmem.h>
> >  #include <linux/ratelimit.h>
> >  #include <linux/sched/signal.h>
> >+#include <linux/security.h>
> >  #include <linux/shmem_fs.h>
> >  #include <linux/slab.h>
> >  #include <linux/suspend.h>
> >@@ -582,9 +583,6 @@ static int sgx_encl_page_copy(void *dst, unsigned long src, unsigned long prot)
> >  	struct vm_area_struct *vma;
> >  	int ret;
> >-	if (!(prot & VM_EXEC))
> >-		return 0;
> >-
> 
> Is there a real use case where LSM will want to be called if !(prot &
> VM_EXEC)?

I don't think so?  I have no objection to conditioning the LSM calls on
the page being executable.  I actually had the code written that way in
the first RFC, but it felt weird for SGX to be making assumptions about
LSM use cases.

> Also, you seem to be mixing prot and PROT_EXEC with vm_flags and
> VM_EXEC; other code does not appear to assume they are identical and
> explicitly converts, e.g. calc_vm_prot_bits().

Argh, I'll clean that up.
 
> >  	/* Hold mmap_sem across copy_from_user() to avoid a TOCTOU race. */
> >  	down_read(&current->mm->mmap_sem);
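
As a rough illustration of the cleanup (a sketch, not the eventual
patch): the PROT_* bits from the UAPI and the VM_* bits in vm_flags are
separate namespaces, so the conversion is spelled out rather than
assumed.  The MODEL_VM_* values below are illustrative copies, and
prot_to_vm_flags() is a hypothetical helper written in the spirit of
calc_vm_prot_bits(), not the kernel function itself.

```c
#include <sys/mman.h>	/* PROT_NONE, PROT_READ, PROT_WRITE, PROT_EXEC */

/* Illustrative copies of the kernel's VM_* bits (not a userspace API). */
#define MODEL_VM_READ   0x1UL
#define MODEL_VM_WRITE  0x2UL
#define MODEL_VM_EXEC   0x4UL

/*
 * Hypothetical helper: translate mmap()-style PROT_* bits into VM_*
 * bits one flag at a time, so the code never relies on the two sets of
 * constants sharing numeric values.
 */
static unsigned long prot_to_vm_flags(unsigned long prot)
{
	unsigned long vm_flags = 0;

	if (prot & PROT_READ)
		vm_flags |= MODEL_VM_READ;
	if (prot & PROT_WRITE)
		vm_flags |= MODEL_VM_WRITE;
	if (prot & PROT_EXEC)
		vm_flags |= MODEL_VM_EXEC;

	return vm_flags;
}
```

With a helper like this, checks against vm_flags use VM_* only and
checks against the ioctl argument use PROT_* only.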

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves
  2019-06-10 16:00   ` Jarkko Sakkinen
@ 2019-06-10 16:44     ` Andy Lutomirski
  2019-06-11 17:21       ` Stephen Smalley
  0 siblings, 1 reply; 67+ messages in thread
From: Andy Lutomirski @ 2019-06-10 16:44 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Sean Christopherson, Andy Lutomirski, Cedric Xing,
	Stephen Smalley, James Morris, Serge E . Hallyn, LSM List,
	Paul Moore, Eric Paris, selinux, Jethro Beekman, Dave Hansen,
	Thomas Gleixner, Linus Torvalds, LKML, X86 ML, linux-sgx,
	Andrew Morton, nhorman, npmccallum, Serge Ayoun, Shay Katz-zamir,
	Haitao Huang, Andy Shevchenko, Kai Svahn, Borislav Petkov,
	Josh Triplett, Kai Huang, David Rientjes, William Roberts,
	Philip Tricca

On Mon, Jun 10, 2019 at 9:00 AM Jarkko Sakkinen
<jarkko.sakkinen@linux.intel.com> wrote:
>
> On Wed, Jun 05, 2019 at 07:11:43PM -0700, Sean Christopherson wrote:
> > +             goto out;
> > +     }
> > +
> > +     /*
> > +      * Query VM_MAYEXEC as an indirect path_noexec() check (see do_mmap()),
> > +      * but with some future proofing against other cases that may deny
> > +      * execute permissions.
> > +      */
> > +     if (!(vma->vm_flags & VM_MAYEXEC)) {
> > +             ret = -EACCES;
> > +             goto out;
> > +     }
> > +
> > +     if (copy_from_user(dst, (void __user *)src, PAGE_SIZE))
> > +             ret = -EFAULT;
> > +     else
> > +             ret = 0;
> > +
> > +out:
> > +     up_read(&current->mm->mmap_sem);
> > +
> > +     return ret;
> > +}
>
> I would suggest to express the above instead like this for clarity
> and consistency:
>
>                 goto err_mmap_sem;
>         }
>
>         /* Query VM_MAYEXEC as an indirect path_noexec() check
>          * (see do_mmap()).
>          */
>         if (!(vma->vm_flags & VM_MAYEXEC)) {
>                 ret = -EACCES;
>                 goto err_mmap_sem;
>         }
>
>         if (copy_from_user(dst, (void __user *)src, PAGE_SIZE)) {
>                 ret = -EFAULT;
>                 goto err_mmap_sem;
>         }
>
>         return 0;
>
> err_mmap_sem:
>         up_read(&current->mm->mmap_sem);
>         return ret;
> }
>
> The comment about future proofing is unnecessary.
>

I'm also torn as to whether this patch is needed at all.  If we ever
get O_MAYEXEC, then enclave loaders should use it to enforce noexec in
userspace.  Otherwise I'm unconvinced it's that special.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation
  2019-06-07 21:16   ` Stephen Smalley
@ 2019-06-10 16:46     ` Sean Christopherson
  0 siblings, 0 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-10 16:46 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Jarkko Sakkinen, Andy Lutomirski, Cedric Xing, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Fri, Jun 07, 2019 at 05:16:01PM -0400, Stephen Smalley wrote:
> On 6/5/19 10:11 PM, Sean Christopherson wrote:
> >The goal of selinux_enclave_load() is to provide a facsimile of the
> >existing selinux_file_mprotect() and file_map_prot_check() policies,
> >but tailored to the unique properties of SGX.
> >
> >For example, an enclave page is technically backed by a MAP_SHARED file,
> >but the "file" is essentially shared memory that is never persisted
> >anywhere and also requires execute permissions (for some pages).
> >
> >The basic concept is to require appropriate execute permissions on the
> >source of the enclave for pages that are requesting PROT_EXEC, e.g. if
> >an enclave page is being loaded from a regular file, require
> >FILE__EXECUTE and/or FILE__EXECMOD, and if it's coming from an
> >anonymous/private mapping, require PROCESS__EXECMEM since the process
> >is essentially executing from the mapping, albeit in a roundabout way.
> >
> >Note, FILE__READ and FILE__WRITE are intentionally not required even if
> >the source page is backed by a regular file.  Writes to the enclave page
> >are contained to the EPC, i.e. never hit the original file, and read
> >permissions have already been vetted (or the VMA doesn't have PROT_READ,
> >in which case loading the page into the enclave will fail).
> >
> >Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> >---
> >  security/selinux/hooks.c | 69 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 69 insertions(+)
> >
> >diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> >index 3ec702cf46ca..3c5418edf51c 100644
> >--- a/security/selinux/hooks.c
> >+++ b/security/selinux/hooks.c
> >@@ -6726,6 +6726,71 @@ static void selinux_bpf_prog_free(struct bpf_prog_aux *aux)
> >  }
> >  #endif
> >+#ifdef CONFIG_INTEL_SGX
> >+int selinux_enclave_load(struct vm_area_struct *vma, unsigned long prot)
> >+{
> >+	const struct cred *cred = current_cred();
> >+	u32 sid = cred_sid(cred);
> >+	int ret;
> >+
> >+	/* SGX is supported only in 64-bit kernels. */
> >+	WARN_ON_ONCE(!default_noexec);
> >+
> >+	/* Only executable enclave pages are restricted in any way. */
> >+	if (!(prot & PROT_EXEC))
> >+		return 0;
> 
> prot/PROT_EXEC or vmflags/VM_EXEC
> 
> >+
> >+	/*
> >+	 * The source page is exectuable, i.e. has already passed SELinux's
> 
> executable
> 
> >+	 * checks, and userspace is not requesting RW->RX capabilities.
> 
> Is it requesting W->X or WX?

Hmm, good point.  I'll reword the "requesting RW->RX" and "RW->RX intent"
phrases to make it clear that we don't actually know whether userspace
intends to do W->X or WX, and I'll also expand the "Note, this hybrid
EXECMOD and EXECMEM behavior" comment to explain that existing checks
won't prevent WX.

> >+	 */
> >+	if ((vma->vm_flags & VM_EXEC) && !(prot & PROT_WRITE))
> >+		return 0;
> >+
> >+	/*
> >+	 * The source page is not executable, or userspace is requesting the
> >+	 * ability to do a RW->RX conversion.  Permissions are required as
> >+	 * follows, in order of increasing privelege:
> >+	 *
> >+	 * EXECUTE - Load an executable enclave page without RW->RX intent from
> >+	 *           a non-executable vma that is backed by a shared mapping to
> >+	 *           a regular file that has not undergone COW.
> 
> Shared mapping or unmodified private file mapping

Doh, messed that up.  Thanks!

> >+	 *
> >+	 * EXECMOD - Load an executable enclave page without RW->RX intent from
> >+	 *           a non-executable vma that is backed by a shared mapping to
> >+	 *           a regular file that *has* undergone COW.
> 
> modified private file mapping (write to shared mapping won't trigger COW; it
> would have been checked by FILE__WRITE earlier)

Same mental error.  Will fix.

> >+	 *
> >+	 *         - Load an enclave page *with* RW->RX intent from a shared
> >+	 *           mapping to a regular file.
> >+	 *
> >+	 * EXECMEM - Load an exectuable enclave page from an anonymous mapping.
> 
> executable
> 
> >+	 *
> >+	 *         - Load an exectuable enclave page from a private file, e.g.
> 
> executable

At least I'm consistent.

> >+	 *           from a shared mapping to a hugetlbfs file.
> >+	 *
> >+	 *         - Load an enclave page *with* RW->RX intent from a private
> 
> W->X or WX?
>
> >+	 *           mapping to a regular file.
> >+	 *
> >+	 * Note, this hybrid EXECMOD and EXECMEM behavior is intentional and
> >+	 * reflects the nature of enclaves and the EPC, e.g. EPC is effectively
> >+	 * a non-persistent shared file, but each enclave is a private domain
> >+	 * within that shared file, so delegate to the source of the enclave.
> >+	 */
> >+	if (vma->vm_file && !IS_PRIVATE(file_inode(vma->vm_file)) &&
> >+	    ((vma->vm_flags & VM_SHARED) || !(prot & PROT_WRITE))) {
> >+		if (!vma->anon_vma && !(prot & PROT_WRITE))
> >+			ret = file_has_perm(cred, vma->vm_file, FILE__EXECUTE);
> >+		else
> >+			ret = file_has_perm(cred, vma->vm_file, FILE__EXECMOD);
> >+	} else {
> >+		ret = avc_has_perm(&selinux_state,
> >+				   sid, sid, SECCLASS_PROCESS,
> >+				   PROCESS__EXECMEM, NULL);
> >+	}
> >+	return ret;
> >+}
> >+#endif
> >+
> >  struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
> >  	.lbs_cred = sizeof(struct task_security_struct),
> >  	.lbs_file = sizeof(struct file_security_struct),
> >@@ -6968,6 +7033,10 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
> >  	LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free),
> >  	LSM_HOOK_INIT(bpf_prog_free_security, selinux_bpf_prog_free),
> >  #endif
> >+
> >+#ifdef CONFIG_INTEL_SGX
> >+	LSM_HOOK_INIT(enclave_load, selinux_enclave_load),
> >+#endif
> >  };
> >  static __init int selinux_init(void)
> >
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks
  2019-06-10  7:03 ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Cedric Xing
                     ` (2 preceding siblings ...)
  2019-06-10  7:03   ` [RFC PATCH v1 3/3] LSM/x86/sgx: Call new LSM hooks from SGX subsystem Cedric Xing
@ 2019-06-10 17:36   ` Jarkko Sakkinen
  3 siblings, 0 replies; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-10 17:36 UTC (permalink / raw)
  To: Cedric Xing
  Cc: linux-security-module, selinux, linux-kernel, linux-sgx, luto,
	sds, jmorris, serge, paul, eparis, jethro, dave.hansen, tglx,
	torvalds, akpm, nhorman, pmccallum, serge.ayoun, shay.katz-zamir,
	haitao.huang, andriy.shevchenko, kai.svahn, bp, josh, kai.huang,
	rientjes, william.c.roberts, philip.b.tricca

On Mon, Jun 10, 2019 at 12:03:03AM -0700, Cedric Xing wrote:
> This series intends to make the new SGX subsystem and the existing LSM
> architecture work together smoothly so that, say, SGX cannot be abused to work
> around restrictions set forth by LSM. This series applies on top of Jarkko
> Sakkinen's SGX series v20 (https://lkml.org/lkml/2019/4/17/344), where abundant
> details of this SGX/LSM problem could be found.
> 
> This series is an alternative to Sean Christopherson's recent RFC series
> (https://lkml.org/lkml/2019/6/5/1070) that was trying to solve the same
> problem. The key problem is for LSM to determine the "maximal (most permissive)
> protection" allowed for individual enclave pages. Sean's approach is to take
> that from user mode code as a parameter of the EADD ioctl, validate it with LSM
> ahead of time, and then enforce it inside the SGX subsystem. The major
> disadvantage IMHO is that a priori knowledge of "maximal protection" is needed,
> but it isn't always available in certain use cases. In fact, it is an unusual
> approach to take "maximal protection" from user code, as what SELinux is doing
> today is to determine "maximal protection" of a vma using attributes associated
> with vma->vm_file instead. When it comes to enclaves, vma->vm_file always
> points /dev/sgx/enclave, so what's missing is a new way for LSM modules to
> remember origins of enclave pages so that they don't solely depend on
> vma->vm_file to determine "maximal protection".
> 
> This series takes advantage of the fact that enclave pages cannot be remapped
> (to different linear address), therefore the pair of { vma->vm_file,
> linear_address } can be used to uniquely identify an enclave page. Then by
> notifying LSM on creation of every enclave page (via a new LSM hook -
> security_enclave_load), LSM modules would be able to track origin and
> protection changes of every page, hence be able to judge correctly upon
> mmap/mprotect requests.
> 
> Cedric Xing (3):
>   LSM/x86/sgx: Add SGX specific LSM hooks
>   LSM/x86/sgx: Implement SGX specific hooks in SELinux
>   LSM/x86/sgx: Call new LSM hooks from SGX subsystem

A patch set containing direct LSM changes should consider all LSMs.
This will allow all the LSM maintainers to consider the changes. Now we
have a limited audience and we are favoring one LSM.

There is no good reason why direct LSM changes cannot be done
post-upstreaming like we do for virtualization.

Looking at Sean's patches, overall 1/5-3/5 make perfect sense.

/Jarkko

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-10 16:15     ` Sean Christopherson
@ 2019-06-10 17:45       ` Jarkko Sakkinen
  2019-06-10 18:17         ` Sean Christopherson
  0 siblings, 1 reply; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-10 17:45 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Mon, Jun 10, 2019 at 09:15:33AM -0700, Sean Christopherson wrote:
> > 'flags' should would renamed as 'secinfo_flags_mask' even if the name is
> > longish. It would use the same values as the SECINFO flags. The field in
> > struct sgx_encl_page should have the same name. That would express
> > exactly relation between SECINFO and the new field. I would have never
> > asked on last iteration why SECINFO is not enough with a better naming.
> 
> No, these flags do not impact the EPCM protections in any way.  Userspace
> can extend the EPCM protections without going through the kernel.  The
> protection flags for an enclave page impact VMA/PTE protection bits.
> 
> IMO, it is best to treat the EPCM as being completely separate from the
> kernel's EPC management.

It is a clumsy API if permissions are not taken in the same format for
everything. There is no reason not to do it. The way mprotect() callback
just interprets the field is as VMA permissions.

It would also be more future-proof just to have a mask covering all bits
of the SECINFO flags field.

/Jarkko

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect()
  2019-06-10 15:55     ` Sean Christopherson
@ 2019-06-10 17:47       ` Xing, Cedric
  2019-06-10 19:49         ` Sean Christopherson
  0 siblings, 1 reply; 67+ messages in thread
From: Xing, Cedric @ 2019-06-10 17:47 UTC (permalink / raw)
  To: Christopherson, Sean J, Jarkko Sakkinen
  Cc: Andy Lutomirski, Stephen Smalley, James Morris, Serge E . Hallyn,
	LSM List, Paul Moore, Eric Paris, selinux, Jethro Beekman,
	Hansen, Dave, Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx, Andrew Morton, nhorman, npmccallum, Ayoun, Serge,
	Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Tricca, Philip B

> From: Christopherson, Sean J
> Sent: Monday, June 10, 2019 8:56 AM
> 
> > > As a result, LSM policies cannot be meaningfully applied, e.g. an
> > > LSM can deny access to the EPC as a whole, but can't deny PROT_EXEC
> > > on page that originated in a non-EXECUTE file (which is long gone by
> > > the time
> > > mprotect() is called).
> >
> > I have hard time following what is paragraph is trying to say.
> >
> > > By hooking mprotect(), SGX can make explicit LSM upcalls while an
> > > enclave is being built, i.e. when the kernel has a handle to origin
> > > of each enclave page, and enforce the result of the LSM policy
> > > whenever userspace maps the enclave page in the future.
> >
> > "LSM policy whenever calls mprotect()"? I'm no sure why you mean by
> > mapping here and if there is any need to talk about future. Isn't this
> > needed now?
> 
> Future is referring to the timeline of a running kernel, not the future
> of the kernel code.
> 
> Rather than trying to explain all of the above with words, I'll provide
> code examples to show how ->may_protect() will be used by SGX and why it
> is the preferred solution.

The LSM concept is to separate security policy enforcement from the rest of the kernel. For modules, the "official" way is to use VM_MAY* flags to limit allowable permissions, while LSM uses security_file_mprotect(). I guess that's why we didn't have .may_mprotect() in the first place. What you are doing is enforcing some security policy outside of LSM, which is dirty from an architectural perspective.
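
To illustrate the VM_MAY* mechanism mentioned above, here is a minimal
userspace sketch, assuming illustrative copies of the flag bits; the
function model_mprotect_check() is hypothetical and only models the
idea that a requested permission is refused unless the matching VM_MAY*
bit is set on the vma.

```c
#include <errno.h>

/* Illustrative copies of the kernel's vm_flags bits. */
#define MODEL_VM_READ      0x01UL
#define MODEL_VM_WRITE     0x02UL
#define MODEL_VM_EXEC      0x04UL
#define MODEL_VM_MAYREAD   0x10UL
#define MODEL_VM_MAYWRITE  0x20UL
#define MODEL_VM_MAYEXEC   0x40UL

/*
 * Hypothetical model: 'vm_flags' carries the VM_MAY* bits of an
 * existing mapping, 'req' is the set of VM_{READ,WRITE,EXEC} bits an
 * mprotect() call asks for.  Any requested bit without its VM_MAY*
 * counterpart is refused.
 */
static int model_mprotect_check(unsigned long vm_flags, unsigned long req)
{
	unsigned long allowed = 0;

	if (vm_flags & MODEL_VM_MAYREAD)
		allowed |= MODEL_VM_READ;
	if (vm_flags & MODEL_VM_MAYWRITE)
		allowed |= MODEL_VM_WRITE;
	if (vm_flags & MODEL_VM_MAYEXEC)
		allowed |= MODEL_VM_EXEC;

	return (req & ~allowed) ? -EACCES : 0;
}
```

The limitation for enclaves is visible in the model: the VM_MAY* bits
live on the vma, so they cannot express a different ceiling for each
enclave page within one mapping.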

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-10 17:45       ` Jarkko Sakkinen
@ 2019-06-10 18:17         ` Sean Christopherson
  2019-06-12 19:26           ` Jarkko Sakkinen
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-10 18:17 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Mon, Jun 10, 2019 at 08:45:06PM +0300, Jarkko Sakkinen wrote:
> On Mon, Jun 10, 2019 at 09:15:33AM -0700, Sean Christopherson wrote:
> > > 'flags' should would renamed as 'secinfo_flags_mask' even if the name is
> > > longish. It would use the same values as the SECINFO flags. The field in
> > > struct sgx_encl_page should have the same name. That would express
> > > exactly relation between SECINFO and the new field. I would have never
> > > asked on last iteration why SECINFO is not enough with a better naming.
> > 
> > No, these flags do not impact the EPCM protections in any way.  Userspace
> > can extend the EPCM protections without going through the kernel.  The
> > protection flags for an enclave page impact VMA/PTE protection bits.
> > 
> > IMO, it is best to treat the EPCM as being completely separate from the
> > kernel's EPC management.
> 
> It is a clumsy API if permissions are not taken in the same format for
> everything. There is no reason not to do it. The way mprotect() callback
> just interprets the field is as VMA permissions.

They are two entirely different things.  The explicit protection bits are
consumed by the kernel, while SECINFO.flags is consumed by the CPU.  The
intent is to have the protection flags be analogous to mprotect(), the
fact that they have a similar/identical format to SECINFO is irrelevant.

Calling the field secinfo_flags_mask is straight up wrong on SGX2, as 
userspace can use EMODPE to set SECINFO after the page is added.  It's
also wrong on SGX1 when adding TCS pages since SECINFO.RWX bits for TCS
pages are forced to zero by hardware.

> It would also be more future-proof just to have a mask covering all bits
> of the SECINFO flags field.

This simply doesn't work, e.g. the PENDING, MODIFIED and PR flags in the
SECINFO are read-only from a software perspective.
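
To make the distinction concrete, a minimal userspace sketch of how
kernel-tracked maximal protections could gate a mapping request, with
no reference to SECINFO or the EPCM.  The struct and function names
(model_encl, model_encl_page, model_map_allowed) and the flat page
array are assumptions made for illustration; the driver's real data
structures differ.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT  12
#define MODEL_PAGE_SIZE   (1UL << MODEL_PAGE_SHIFT)
#define MODEL_VM_READ     0x1UL
#define MODEL_VM_WRITE    0x2UL
#define MODEL_VM_EXEC     0x4UL

/* Hypothetical per-page record: only the kernel-side ceiling is stored. */
struct model_encl_page {
	int present;		/* page has been added to the enclave */
	unsigned long prot;	/* maximal VM_* bits declared at add time */
};

struct model_encl {
	unsigned long base;	/* start address of the enclave range */
	size_t nr_pages;
	struct model_encl_page *pages;
};

/* Return the page covering 'addr', or NULL if none has been added. */
static struct model_encl_page *model_lookup(struct model_encl *encl,
					    unsigned long addr)
{
	size_t idx;

	if (addr < encl->base)
		return NULL;
	idx = (addr - encl->base) >> MODEL_PAGE_SHIFT;
	if (idx >= encl->nr_pages || !encl->pages[idx].present)
		return NULL;
	return &encl->pages[idx];
}

/* Allow a mapping only if it asks for a subset of every page's ceiling. */
static int model_map_allowed(struct model_encl *encl, unsigned long start,
			     unsigned long end, unsigned long prot)
{
	unsigned long addr;

	prot &= MODEL_VM_READ | MODEL_VM_WRITE | MODEL_VM_EXEC;
	if (!prot || !encl)
		return 0;

	for (addr = start; addr < end; addr += MODEL_PAGE_SIZE) {
		struct model_encl_page *page = model_lookup(encl, addr);

		if (!page || (prot & ~page->prot))
			return -EACCES;
	}
	return 0;
}
```

Nothing in the check consults hardware state, which is the point: the
ceiling is a kernel bookkeeping value, in the same spirit as the
protections passed to mprotect().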

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-06  2:11 ` [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits Sean Christopherson
  2019-06-10 15:27   ` Jarkko Sakkinen
@ 2019-06-10 18:29   ` Xing, Cedric
  2019-06-10 19:15     ` Andy Lutomirski
  1 sibling, 1 reply; 67+ messages in thread
From: Xing, Cedric @ 2019-06-10 18:29 UTC (permalink / raw)
  To: Christopherson, Sean J, Jarkko Sakkinen
  Cc: Andy Lutomirski, Stephen Smalley, James Morris, Serge E . Hallyn,
	LSM List, Paul Moore, Eric Paris, selinux, Jethro Beekman,
	Hansen, Dave, Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx, Andrew Morton, nhorman, npmccallum, Ayoun, Serge,
	Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Tricca, Philip B

> From: Christopherson, Sean J
> Sent: Wednesday, June 05, 2019 7:12 PM
> 
> +/**
> + * sgx_map_allowed - check vma protections against the associated enclave page
> + * @encl:	an enclave
> + * @start:	start address of the mapping (inclusive)
> + * @end:	end address of the mapping (exclusive)
> + * @prot:	protection bits of the mapping
> + *
> + * Verify a userspace mapping to an enclave page would not violate the security
> + * requirements of the *kernel*.  Note, this is in no way related to the
> + * page protections enforced by hardware via the EPCM.  The EPCM protections
> + * can be directly extended by the enclave, i.e. cannot be relied upon by the
> + * kernel for security guarantees of any kind.
> + *
> + * Return:
> + *   0 on success,
> + *   -EACCES if the mapping is disallowed
> + */
> +int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
> +		    unsigned long end, unsigned long prot) {
> +	struct sgx_encl_page *page;
> +	unsigned long addr;
> +
> +	prot &= (VM_READ | VM_WRITE | VM_EXEC);
> +	if (!prot || !encl)
> +		return 0;
> +
> +	mutex_lock(&encl->lock);
> +
> +	for (addr = start; addr < end; addr += PAGE_SIZE) {
> +		page = radix_tree_lookup(&encl->page_tree, addr >> PAGE_SHIFT);
> +
> +		/*
> +		 * Do not allow R|W|X to a non-existent page, or protections
> +		 * beyond those of the existing enclave page.
> +		 */
> +		if (!page || (prot & ~page->prot))
> +			return -EACCES;

In SGX2, pages will be "mapped" before being populated.

Here's a brief summary for those who don't have enough background on how new EPC pages could be added to a running enclave in SGX2:
  - There are 2 new instructions - EACCEPT and EAUG.
  - EAUG is used by SGX module to add (augment) a new page to an existing enclave. The newly added page is *inaccessible* until the enclave *accepts* it.
  - EACCEPT is the instruction for an enclave to accept a new page.

And the s/w flow for an enclave to request new EPC pages is expected to be something like the following:
  - The enclave issues EACCEPT at the linear address that it would like a new page.
  - EACCEPT results in #PF, as there's no page at the linear address above.
  - The SGX module is notified about the #PF, in the form of its vma->vm_ops->fault() being called by the kernel.
  - SGX module EAUGs a new EPC page at the fault address, and resumes the enclave.
  - EACCEPT is reattempted, and succeeds this time.

But with the above check in sgx_map_allowed(), I'm not sure how this will work out with SGX2.
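
The flow above can be sketched as a tiny state machine.  This is a
userspace model only, under the assumption that a page is either
absent, added but pending acceptance, or accepted; the enum and the
model_* function names are hypothetical, and no SGX instruction is
actually executed.

```c
/* Hypothetical page states for the model. */
enum model_page_state {
	MODEL_PAGE_ABSENT,	/* nothing at the linear address yet */
	MODEL_PAGE_PENDING,	/* EAUG'd by the driver, not yet accepted */
	MODEL_PAGE_ACCEPTED,	/* usable by the enclave */
};

/* Models EACCEPT: returns -1 (standing in for #PF) if no page exists. */
static int model_eaccept(enum model_page_state *page)
{
	if (*page == MODEL_PAGE_ABSENT)
		return -1;
	*page = MODEL_PAGE_ACCEPTED;
	return 0;
}

/* Models the driver's fault handler adding a page via EAUG. */
static void model_fault_handler(enum model_page_state *page)
{
	if (*page == MODEL_PAGE_ABSENT)
		*page = MODEL_PAGE_PENDING;
}

/*
 * Models the enclave requesting a page: retry EACCEPT after each fault
 * has been serviced.  Returns the number of EACCEPT attempts made.
 */
static int model_request_page(enum model_page_state *page)
{
	int attempts = 1;

	while (model_eaccept(page) != 0) {
		model_fault_handler(page);
		attempts++;
	}
	return attempts;
}
```

In this model the fault arrives while the page is still absent, which
is the window in which a "no R|W|X to a non-existent page" rule at
mmap()/mprotect() time would already have had to be satisfied.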

> +	}
> +
> +	mutex_unlock(&encl->lock);
> +
> +	return 0;
> +}
> +

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-10 18:29   ` Xing, Cedric
@ 2019-06-10 19:15     ` Andy Lutomirski
  2019-06-10 22:28       ` Xing, Cedric
  0 siblings, 1 reply; 67+ messages in thread
From: Andy Lutomirski @ 2019-06-10 19:15 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Christopherson, Sean J, Jarkko Sakkinen, Andy Lutomirski,
	Stephen Smalley, James Morris, Serge E . Hallyn, LSM List,
	Paul Moore, Eric Paris, selinux, Jethro Beekman, Hansen, Dave,
	Thomas Gleixner, Linus Torvalds, LKML, X86 ML, linux-sgx,
	Andrew Morton, nhorman, npmccallum, Ayoun, Serge, Katz-zamir,
	Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Tricca, Philip B

On Mon, Jun 10, 2019 at 11:29 AM Xing, Cedric <cedric.xing@intel.com> wrote:
>
> > From: Christopherson, Sean J
> > Sent: Wednesday, June 05, 2019 7:12 PM
> >
> > +/**
> > + * sgx_map_allowed - check vma protections against the associated enclave page
> > + * @encl:    an enclave
> > + * @start:   start address of the mapping (inclusive)
> > + * @end:     end address of the mapping (exclusive)
> > + * @prot:    protection bits of the mapping
> > + *
> > + * Verify a userspace mapping to an enclave page would not violate the security
> > + * requirements of the *kernel*.  Note, this is in no way related to the
> > + * page protections enforced by hardware via the EPCM.  The EPCM protections
> > + * can be directly extended by the enclave, i.e. cannot be relied upon by the
> > + * kernel for security guarantees of any kind.
> > + *
> > + * Return:
> > + *   0 on success,
> > + *   -EACCES if the mapping is disallowed
> > + */
> > +int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
> > +                 unsigned long end, unsigned long prot) {
> > +     struct sgx_encl_page *page;
> > +     unsigned long addr;
> > +
> > +     prot &= (VM_READ | VM_WRITE | VM_EXEC);
> > +     if (!prot || !encl)
> > +             return 0;
> > +
> > +     mutex_lock(&encl->lock);
> > +
> > +     for (addr = start; addr < end; addr += PAGE_SIZE) {
> > +             page = radix_tree_lookup(&encl->page_tree, addr >>
> > PAGE_SHIFT);
> > +
> > +             /*
> > +              * Do not allow R|W|X to a non-existent page, or protections
> > +              * beyond those of the existing enclave page.
> > +              */
> > +             if (!page || (prot & ~page->prot))
> > +                     return -EACCES;
>
> In SGX2, pages will be "mapped" before being populated.
>
> Here's a brief summary for those who don't have enough background on how new EPC pages could be added to a running enclave in SGX2:
>   - There are 2 new instructions - EACCEPT and EAUG.
>   - EAUG is used by SGX module to add (augment) a new page to an existing enclave. The newly added page is *inaccessible* until the enclave *accepts* it.
>   - EACCEPT is the instruction for an enclave to accept a new page.
>
> And the s/w flow for an enclave to request new EPC pages is expected to be something like the following:
>   - The enclave issues EACCEPT at the linear address that it would like a new page.
>   - EACCEPT results in #PF, as there's no page at the linear address above.
>   - SGX module is notified about the #PF, in form of its vma->vm_ops->fault() being called by kernel.
>   - SGX module EAUGs a new EPC page at the fault address, and resumes the enclave.
>   - EACCEPT is reattempted, and succeeds at this time.

This seems like an odd workflow.  Shouldn't the #PF return back to
untrusted userspace so that the untrusted user code can make its own
decision as to whether it wants to EAUG a page there as opposed to,
say, killing the enclave or waiting to keep resource usage under
control?

* Re: [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect()
  2019-06-10 17:47       ` Xing, Cedric
@ 2019-06-10 19:49         ` Sean Christopherson
  2019-06-10 22:06           ` Xing, Cedric
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-10 19:49 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Jarkko Sakkinen, Andy Lutomirski, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Hansen, Dave, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko,
	Svahn, Kai, Borislav Petkov, Josh Triplett, Huang, Kai,
	David Rientjes, Roberts, William C, Tricca, Philip B

On Mon, Jun 10, 2019 at 10:47:52AM -0700, Xing, Cedric wrote:
> > From: Christopherson, Sean J
> > Sent: Monday, June 10, 2019 8:56 AM
> > 
> > > > As a result, LSM policies cannot be meaningfully applied, e.g. an
> > > > LSM can deny access to the EPC as a whole, but can't deny PROT_EXEC
> > > > on page that originated in a non-EXECUTE file (which is long gone by
> > > > the time
> > > > mprotect() is called).
> > >
> > > I have hard time following what is paragraph is trying to say.
> > >
> > > > By hooking mprotect(), SGX can make explicit LSM upcalls while an
> > > > enclave is being built, i.e. when the kernel has a handle to origin
> > > > of each enclave page, and enforce the result of the LSM policy
> > > > whenever userspace maps the enclave page in the future.
> > >
> > > "LSM policy whenever calls mprotect()"? I'm no sure why you mean by
> > > mapping here and if there is any need to talk about future. Isn't this
> > > needed now?
> > 
> > Future is referring to the timeline of a running kernel, not the future
> > of the kernel code.
> > 
> > Rather than trying to explain all of the above with words, I'll provide
> > code examples to show how ->may_protect() will be used by SGX and why it
> > is the preferred solution.
> 
> The LSM concept is to separate security policy enforcement from the rest of
> the kernel. For modules, the "official" way is to use VM_MAY* flags to limit
> allowable permissions, while LSM uses security_file_mprotect().
> I guess that's why we didn't have .may_mprotect() in the first place.

Heh, so I've typed up about five different responses to this comment.  In
doing so, I think I've convinced myself that ->may_mprotect() is
unnecessary.  Rather than hook mprotect(), simply update the VM_MAY* flags
during mmap(), with all bits cleared if there isn't an associated enclave
page.  IIRC, the need to add ->may_mprotect() came about when we were
exploring more dynamic interplay between SGX and LSMs.
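
For illustration only (this is not code from the series): a minimal userspace
sketch of the "set VM_MAY* at mmap() time" idea.  The struct and function
names are hypothetical stand-ins for the driver's per-page state, and the
VM_* values are copied from include/linux/mm.h so the sketch compiles
outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Values match the kernel's VM_* flags in include/linux/mm.h. */
#define VM_READ      0x01UL
#define VM_WRITE     0x02UL
#define VM_EXEC      0x04UL
#define VM_MAYREAD   0x10UL
#define VM_MAYWRITE  0x20UL
#define VM_MAYEXEC   0x40UL

/* Hypothetical stand-in for the per-page state tracked for an enclave. */
struct demo_encl_page {
	int present;		/* 0: no enclave page at this index */
	unsigned long prot;	/* allowed VM_READ | VM_WRITE | VM_EXEC bits */
};

/*
 * Compute the VM_MAY* flags for a mapping that covers pages [first, last).
 * Returns 0 (all bits cleared) if the range is empty or any page is missing;
 * otherwise only the protections common to every page are allowed.
 */
unsigned long demo_vm_may_flags(const struct demo_encl_page *pages,
				size_t first, size_t last)
{
	unsigned long allowed = VM_READ | VM_WRITE | VM_EXEC;
	unsigned long may = 0;
	size_t i;

	assert(first <= last);
	if (first == last)
		return 0;

	for (i = first; i < last; i++) {
		if (!pages[i].present)
			return 0;
		allowed &= pages[i].prot;
	}

	if (allowed & VM_READ)
		may |= VM_MAYREAD;
	if (allowed & VM_WRITE)
		may |= VM_MAYWRITE;
	if (allowed & VM_EXEC)
		may |= VM_MAYEXEC;
	return may;
}
```

With the flags narrowed this way, the existing VM_MAY* checks in mprotect()
would reject anything beyond the enclave pages' protections without a new
vm_ops hook.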

> What you are doing is enforcing some security policy outside of LSM, which
> is dirty from architecture perspective.

No, the enclave page protections are enforced regardless of LSM policy,
and in v2 those protections are immutable.  Yes, the explicit enclave
page protection bits are being added primarily for LSMs, but they don't
impact functionality other than at the security_enclave_load() touchpoint.

* RE: [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect()
  2019-06-10 19:49         ` Sean Christopherson
@ 2019-06-10 22:06           ` Xing, Cedric
  0 siblings, 0 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-10 22:06 UTC (permalink / raw)
  To: Christopherson, Sean J
  Cc: Jarkko Sakkinen, Andy Lutomirski, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Hansen, Dave, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko,
	Svahn, Kai, Borislav Petkov, Josh Triplett, Huang, Kai,
	David Rientjes, Roberts, William C, Tricca, Philip B

> From: Christopherson, Sean J
> Sent: Monday, June 10, 2019 12:50 PM
> 
> On Mon, Jun 10, 2019 at 10:47:52AM -0700, Xing, Cedric wrote:
> > > From: Christopherson, Sean J
> > > Sent: Monday, June 10, 2019 8:56 AM
> > >
> > > > > As a result, LSM policies cannot be meaningfully applied, e.g.
> > > > > an LSM can deny access to the EPC as a whole, but can't deny
> > > > > PROT_EXEC on page that originated in a non-EXECUTE file (which
> > > > > is long gone by the time
> > > > > mprotect() is called).
> > > >
> > > > I have hard time following what is paragraph is trying to say.
> > > >
> > > > > By hooking mprotect(), SGX can make explicit LSM upcalls while
> > > > > an enclave is being built, i.e. when the kernel has a handle to
> > > > > origin of each enclave page, and enforce the result of the LSM
> > > > > policy whenever userspace maps the enclave page in the future.
> > > >
> > > > "LSM policy whenever calls mprotect()"? I'm no sure why you mean
> > > > by mapping here and if there is any need to talk about future.
> > > > Isn't this needed now?
> > >
> > > Future is referring to the timeline of a running kernel, not the
> > > future of the kernel code.
> > >
> > > Rather than trying to explain all of the above with words, I'll
> > > provide code examples to show how ->may_protect() will be used by
> > > SGX and why it is the preferred solution.
> >
> > The LSM concept is to separate security policy enforcement from the
> > rest of the kernel. For modules, the "official" way is to use VM_MAY*
> > > flags to limit allowable permissions, while LSM uses
> > > security_file_mprotect().
> > I guess that's why we didn't have .may_mprotect() in the first place.
> 
> Heh, so I've typed up about five different responses to this comment.
> In doing so, I think I've convinced myself that ->may_mprotect() is
> unnecessary.  Rther than hook mprotect(), simply update the VM_MAY*
> flags during mmap(), with all bits cleared if there isn't an associated
> enclave page.  IIRC, the need to add ->may_protect() came about when we
> were exploring more dynamic interplay between SGX and LSMs.
> 
> > What you are doing is enforcing some security policy outside of LSM,
> > which is dirty from architecture perspective.
> 
> No, the enclave page protections are enforced regardless of LSM policy,
> and in v2 those protections are immutable.  Yes, the explicit enclave
> page protection bits are being added primarily for LSMs, but they don't
> impact functionality other than at the security_enclave_load()
> touchpoint.

Disagreed.

You can say you want to enforce "something" without LSM. But what's the purpose of that "something" without LSM? Why doesn't the original mprotect() enforce that "something"?

It *does* affect functionality, because user mode code has to figure out an "explicit protection" that lets the enclave work both with *and* without LSM. That is, the "explicit protection" can be neither too restrictive (or the enclave wouldn't work) nor too permissive (or LSM policies are violated). But what if the user mode code cannot know the appropriate "explicit protection" ahead of time, because it is just going to mprotect() as the enclave requests at runtime?

And your restriction on mmap()'ing non-existent pages also has a significant impact on SGX2 support.

I think some reasonable answers are needed to the above questions before we can call this proposal viable. 

* RE: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-10 19:15     ` Andy Lutomirski
@ 2019-06-10 22:28       ` Xing, Cedric
  2019-06-12  0:09         ` Andy Lutomirski
  0 siblings, 1 reply; 67+ messages in thread
From: Xing, Cedric @ 2019-06-10 22:28 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Christopherson, Sean J, Jarkko Sakkinen, Stephen Smalley,
	James Morris, Serge E . Hallyn, LSM List, Paul Moore, Eric Paris,
	selinux, Jethro Beekman, Hansen, Dave, Thomas Gleixner,
	Linus Torvalds, LKML, X86 ML, linux-sgx, Andrew Morton, nhorman,
	npmccallum, Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao,
	Andy Shevchenko, Svahn, Kai, Borislav Petkov, Josh Triplett,
	Huang, Kai, David Rientjes, Roberts, William C, Tricca, Philip B

> From: Andy Lutomirski [mailto:luto@kernel.org]
> Sent: Monday, June 10, 2019 12:15 PM
> 
> On Mon, Jun 10, 2019 at 11:29 AM Xing, Cedric <cedric.xing@intel.com>
> wrote:
> >
> > > From: Christopherson, Sean J
> > > Sent: Wednesday, June 05, 2019 7:12 PM
> > >
> > > +/**
> > > > + * sgx_map_allowed - check vma protections against the associated enclave page
> > > > + * @encl:    an enclave
> > > > + * @start:   start address of the mapping (inclusive)
> > > > + * @end:     end address of the mapping (exclusive)
> > > > + * @prot:    protection bits of the mapping
> > > > + *
> > > > + * Verify a userspace mapping to an enclave page would not violate the security
> > > > + * requirements of the *kernel*.  Note, this is in no way related to the
> > > > + * page protections enforced by hardware via the EPCM.  The EPCM protections
> > > > + * can be directly extended by the enclave, i.e. cannot be relied upon by the
> > > > + * kernel for security guarantees of any kind.
> > > > + *
> > > > + * Return:
> > > > + *   0 on success,
> > > > + *   -EACCES if the mapping is disallowed
> > > > + */
> > > > +int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
> > > > +                 unsigned long end, unsigned long prot)
> > > > +{
> > > > +     struct sgx_encl_page *page;
> > > > +     unsigned long addr;
> > > > +
> > > > +     prot &= (VM_READ | VM_WRITE | VM_EXEC);
> > > > +     if (!prot || !encl)
> > > > +             return 0;
> > > > +
> > > > +     mutex_lock(&encl->lock);
> > > > +
> > > > +     for (addr = start; addr < end; addr += PAGE_SIZE) {
> > > > +             page = radix_tree_lookup(&encl->page_tree, addr >> PAGE_SHIFT);
> > > > +
> > > > +             /*
> > > > +              * Do not allow R|W|X to a non-existent page, or protections
> > > > +              * beyond those of the existing enclave page.
> > > > +              */
> > > +             if (!page || (prot & ~page->prot))
> > > +                     return -EACCES;
> >
> > In SGX2, pages will be "mapped" before being populated.
> >
> > > Here's a brief summary for those who don't have enough background on how new EPC pages could be added to a running enclave in SGX2:
> > >   - There are 2 new instructions - EACCEPT and EAUG.
> > >   - EAUG is used by SGX module to add (augment) a new page to an existing enclave. The newly added page is *inaccessible* until the enclave *accepts* it.
> > >   - EACCEPT is the instruction for an enclave to accept a new page.
> > >
> > > And the s/w flow for an enclave to request new EPC pages is expected to be something like the following:
> > >   - The enclave issues EACCEPT at the linear address that it would like a new page.
> > >   - EACCEPT results in #PF, as there's no page at the linear address above.
> > >   - SGX module is notified about the #PF, in form of its vma->vm_ops->fault() being called by kernel.
> > >   - SGX module EAUGs a new EPC page at the fault address, and resumes the enclave.
> > >   - EACCEPT is reattempted, and succeeds at this time.
> 
> This seems like an odd workflow.  Shouldn't the #PF return back to
> untrusted userspace so that the untrusted user code can make its own
> decision as to whether it wants to EAUG a page there as opposed to, say,
> killing the enclave or waiting to keep resource usage under control?

This may seem odd at first glance. But think of how a static heap (pre-allocated by EADD before EINIT) works: the loader parses the "metadata" that comes with the enclave to decide the address/size of the heap, EADDs it, and calls it done. In the case of a "dynamic" heap (allocated dynamically by EAUG after EINIT), the same thing applies - the loader determines the range of the heap, tells the SGX module about it, and calls it done. Everything else is between the enclave and the SGX module.

In practice, untrusted code usually doesn't know much about enclaves, just like it doesn't know much about the shared objects loaded into its address space either. Without the necessary knowledge, untrusted code usually just does what it is told (via o-calls, or return values from e-calls), without judging whether that's right or wrong.

When it comes to #PF like what I described, of course a signal could be sent to the untrusted code but what would it do then? Usually it'd just come back asking for a page at the fault address. So we figured it'd be more efficient to just have the kernel EAUG at #PF. 

Please don't get me wrong though, as I'm not dictating what the s/w flow shall be. It's just going to be a choice offered to user mode. And that choice was planned to be offered via mprotect() - i.e. a writable vma causes the kernel to EAUG, while a non-writable vma results in a signal (then user mode can decide whether to EAUG). The key point is flexibility - we want to allow all reasonable s/w flows instead of dictating one over others. We had similar discussions on the vDSO API before. And I think you accepted my approach because of its flexibility. Am I right?
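
To make the proposed choice concrete, here is a small userspace sketch of the fault-time decision. It is illustrative only and not driver code: the enum and function names are hypothetical, the "page already present" case is an assumption added for completeness, and VM_WRITE carries the kernel's value.

```c
#include <assert.h>

#define VM_WRITE 0x2UL	/* matches the kernel's VM_WRITE */

/* Hypothetical outcomes of a #PF inside an enclave's address range. */
enum demo_fault_action {
	DEMO_FAULT_MAP,		/* assumed: page already exists, just map it */
	DEMO_FAULT_EAUG,	/* kernel EAUGs a page, enclave retries EACCEPT */
	DEMO_FAULT_SIGNAL,	/* deliver a signal, user mode decides */
};

/*
 * Classify a fault: for a missing page, a writable vma opts in to
 * kernel-driven EAUG, while a non-writable vma hands the decision
 * back to user mode via a signal.
 */
enum demo_fault_action demo_classify_fault(int page_present,
					   unsigned long vm_flags)
{
	if (page_present)
		return DEMO_FAULT_MAP;
	return (vm_flags & VM_WRITE) ? DEMO_FAULT_EAUG : DEMO_FAULT_SIGNAL;
}
```

Under this sketch, a loader that wants to keep control over resource usage would simply leave the dynamic heap range non-writable and handle the signal itself.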

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-10  7:03   ` [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux Cedric Xing
@ 2019-06-11 13:40     ` Stephen Smalley
  2019-06-11 22:02       ` Sean Christopherson
  2019-06-11 22:55       ` Xing, Cedric
  0 siblings, 2 replies; 67+ messages in thread
From: Stephen Smalley @ 2019-06-11 13:40 UTC (permalink / raw)
  To: Cedric Xing, linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: jarkko.sakkinen, luto, jmorris, serge, paul, eparis, jethro,
	dave.hansen, tglx, torvalds, akpm, nhorman, pmccallum,
	serge.ayoun, shay.katz-zamir, haitao.huang, andriy.shevchenko,
	kai.svahn, bp, josh, kai.huang, rientjes, william.c.roberts,
	philip.b.tricca

On 6/10/19 3:03 AM, Cedric Xing wrote:
> In this patch, SELinux maintains two bits per enclave page, namely SGX__EXECUTE
> and SGX__EXECMOD.
> 
> SGX__EXECUTE is set initially (by selinux_enclave_load) for every enclave page
> that was loaded from a potentially executable source page. SGX__EXECMOD is set
> for every page that was loaded from a file that has FILE__EXECMOD.
> 
> At runtime, on every protection change (resulted in a call to
> selinux_file_mprotect), SGX__EXECUTE is cleared for a page if VM_WRITE is
> requested, unless SGX__EXECMOD is set.
> 
> To track enclave page protection changes, SELinux has been changed in four
> different places.
> 
> Firstly, storage is required for storing per page SGX__EXECUTE and SGX__EXECMOD
> bits. Given every enclave instance is uniquely tied to an open file (i.e.
> struct file), the storage is allocated by extending `file_security_struct`.
> More precisely, a new field `esec` has been added, initially zero, to point to
> the data structure for tracking per page protection. `esec` will be
> allocated/initialized at the first invocation of selinux_enclave_load().
> 
> Then, selinux_enclave_load() initializes those 2 bits for every new enclave as
> described above. One more detail worth noting, is that selinux_enclave_load()
> sets SGX__EXECUTE/SGX__EXECMOD for EAUG'ed pages (for upcoming SGX2) only if
> the calling process has FILE__EXECMOD on the sigstruct file.
> 
> Afterwards, every change on protection will go through selinux_file_mprotect()
> so will be noted. Please note that user space could munmap() then mmap() to
> work around mprotect(), but that "leak" could be "plugged" by SGX subsystem
> calling security_file_mprotect() explicitly whenever new mappings are created.
> 
> Finally, the storage for page protection tracking must be freed when the
> associated file is closed. Hence a new selinux_file_free_security() has been
> added.
> 
> Signed-off-by: Cedric Xing <cedric.xing@intel.com>
> ---
>   security/selinux/Makefile            |   2 +
>   security/selinux/hooks.c             |  77 ++++++-
>   security/selinux/include/intel_sgx.h |  18 ++
>   security/selinux/include/objsec.h    |   3 +
>   security/selinux/intel_sgx.c         | 292 +++++++++++++++++++++++++++
>   5 files changed, 391 insertions(+), 1 deletion(-)
>   create mode 100644 security/selinux/include/intel_sgx.h
>   create mode 100644 security/selinux/intel_sgx.c
> 
> diff --git a/security/selinux/Makefile b/security/selinux/Makefile
> index ccf950409384..58a05a9639e0 100644
> --- a/security/selinux/Makefile
> +++ b/security/selinux/Makefile
> @@ -14,6 +14,8 @@ selinux-$(CONFIG_SECURITY_NETWORK_XFRM) += xfrm.o
>   
>   selinux-$(CONFIG_NETLABEL) += netlabel.o
>   
> +selinux-$(CONFIG_INTEL_SGX) += intel_sgx.o
> +
>   ccflags-y := -I$(srctree)/security/selinux -I$(srctree)/security/selinux/include
>   
>   $(addprefix $(obj)/,$(selinux-y)): $(obj)/flask.h
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 3ec702cf46ca..17f855871a41 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -103,6 +103,7 @@
>   #include "netlabel.h"
>   #include "audit.h"
>   #include "avc_ss.h"
> +#include "intel_sgx.h"
>   
>   struct selinux_state selinux_state;
>   
> @@ -3485,6 +3486,11 @@ static int selinux_file_alloc_security(struct file *file)
>   	return file_alloc_security(file);
>   }
>   
> +static void selinux_file_free_security(struct file *file)
> +{
> +	sgxsec_enclave_free(file);
> +}
> +
>   /*
>    * Check whether a task has the ioctl permission and cmd
>    * operation to an inode.
> @@ -3656,6 +3662,7 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
>   				 unsigned long reqprot,
>   				 unsigned long prot)
>   {
> +	int rc;
>   	const struct cred *cred = current_cred();
>   	u32 sid = cred_sid(cred);
>   
> @@ -3664,7 +3671,7 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
>   
>   	if (default_noexec &&
>   	    (prot & PROT_EXEC) && !(vma->vm_flags & VM_EXEC)) {
> -		int rc = 0;
> +		rc = 0;
>   		if (vma->vm_start >= vma->vm_mm->start_brk &&
>   		    vma->vm_end <= vma->vm_mm->brk) {
>   			rc = avc_has_perm(&selinux_state,
> @@ -3691,6 +3698,12 @@ static int selinux_file_mprotect(struct vm_area_struct *vma,
>   			return rc;
>   	}
>   
> +#ifdef CONFIG_INTEL_SGX
> +	rc = sgxsec_mprotect(vma, prot);
> +	if (rc <= 0)
> +		return rc;

Why are you skipping the file_map_prot_check() call when rc == 0?
What would SELinux check if you didn't do so - 
FILE__READ|FILE__WRITE|FILE__EXECUTE to /dev/sgx/enclave?  Is it a 
problem to let SELinux proceed with that check?

> +#endif
> +
>   	return file_map_prot_check(vma->vm_file, prot, vma->vm_flags&VM_SHARED);
>   }
>   
> @@ -6726,6 +6739,62 @@ static void selinux_bpf_prog_free(struct bpf_prog_aux *aux)
>   }
>   #endif
>   
> +#ifdef CONFIG_INTEL_SGX
> +
> +static int selinux_enclave_load(struct file *encl, unsigned long addr,
> +				unsigned long size, unsigned long prot,
> +				struct vm_area_struct *source)
> +{
> +	if (source) {
> +		/**
> +		 * Adding page from source => EADD request
> +		 */
> +		int rc = selinux_file_mprotect(source, prot, prot);
> +		if (rc)
> +			return rc;
> +
> +		if (!(prot & VM_EXEC) &&
> +		    selinux_file_mprotect(source, VM_EXEC, VM_EXEC))

I wouldn't conflate VM_EXEC with PROT_EXEC even if they happen to be 
defined with the same values currently.  Elsewhere the kernel appears to 
explicitly translate them ala calc_vm_prot_bits().

Also, this will mean that we will always perform an execute check on all 
sources, thereby triggering audit denial messages for any EADD sources 
that are only intended to be data.  Depending on the source, this could 
trigger PROCESS__EXECMEM or FILE__EXECMOD or FILE__EXECUTE.  In a world 
where users often just run any denials they see through audit2allow, 
they'll end up always allowing them all.  How can they tell whether it 
was needed? It would be preferable if we could only trigger execute 
checks when there is some probability that execute will be requested in 
the future.  Alternatives would be to silence the audit of these 
permission checks always via use of _noaudit() interfaces or to silence 
audit of these permissions via dontaudit rules in policy, but the latter 
would hide all denials of the permission by the process, not just those 
triggered from security_enclave_load().  And if we silence them, then we 
won't see them even if they were needed.
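
To illustrate the first point, a minimal userspace sketch of translating 
PROT_* into VM_* bit by bit, in the spirit of calc_vm_prot_bits(), rather 
than relying on the values matching.  The function name is made up for this 
example, and the VM_* values are copied from include/linux/mm.h so that it 
builds outside the kernel.

```c
#include <assert.h>
#include <sys/mman.h>

/* Values match the kernel's VM_* flags in include/linux/mm.h. */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL

/*
 * Translate mmap()/mprotect() PROT_* bits into VM_* flags one bit at a
 * time, so the result stays correct even if the two sets of constants
 * ever stop sharing values.
 */
unsigned long demo_prot_to_vm_flags(unsigned long prot)
{
	unsigned long vm_flags = 0;

	/* Only the three access bits are handled by this sketch. */
	assert(!(prot & ~(unsigned long)(PROT_READ | PROT_WRITE | PROT_EXEC)));

	if (prot & PROT_READ)
		vm_flags |= VM_READ;
	if (prot & PROT_WRITE)
		vm_flags |= VM_WRITE;
	if (prot & PROT_EXEC)
		vm_flags |= VM_EXEC;
	return vm_flags;
}
```

The hook could then test PROT_EXEC on its prot argument and use the 
translated value only where vm_flags semantics are actually needed.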

> +			prot = 0;
> +		else {
> +			prot = SGX__EXECUTE;
> +			if (source->vm_file &&
> +			    !file_has_perm(current_cred(), source->vm_file,
> +					   FILE__EXECMOD))
> +				prot |= SGX__EXECMOD;

Similarly, this means that we will always perform a FILE__EXECMOD check 
on all executable sources, triggering audit denial messages for any EADD 
source that is executable but to which EXECMOD is not allowed, and again 
the most common pattern will be that users will add EXECMOD to all 
executable sources to avoid this.

> +		}
> +		return sgxsec_eadd(encl, addr, size, prot);
> +	} else {
> +		/**
> +		  * Adding page from NULL => EAUG request
> +		  */
> +		return sgxsec_eaug(encl, addr, size, prot);
> +	}
> +}
> +
> +static int selinux_enclave_init(struct file *encl,
> +				const struct sgx_sigstruct *sigstruct,
> +				struct vm_area_struct *vma)
> +{
> +	int rc = 0;
> +
> +	if (!vma)
> +		rc = -EINVAL;

Is it ever valid to call this hook with a NULL vma?  If not, this should 
be handled/prevented by the caller.  If so, I'd just return -EINVAL 
immediately here.

> +
> +	if (!rc && !(vma->vm_flags & VM_EXEC))
> +		rc = selinux_file_mprotect(vma, VM_EXEC, VM_EXEC);

I had thought we were trying to avoid overloading FILE__EXECUTE (or 
whatever gets checked here, e.g. could be PROCESS__EXECMEM or 
FILE__EXECMOD) on the sigstruct file, since the caller isn't truly 
executing code from it.

I'd define new ENCLAVE__* permissions, including an up-front 
ENCLAVE__INIT permission that governs whether the sigstruct file can be 
used at all irrespective of memory protections.

Then you can also have ENCLAVE__EXECUTE, ENCLAVE__EXECMEM, 
ENCLAVE__EXECMOD for the execute-related checks.  Or you can use the 
/dev/sgx/enclave inode as the target for the execute checks and just 
reuse the file permissions there.

> +
> +	if (!rc) {
> +		if (vma->vm_file)
> +			rc = file_has_perm(current_cred(), vma->vm_file,
> +					   FILE__EXECMOD);

Similar issue here with always triggering EXECMOD audit denials even if 
never required.

> +		rc = sgxsec_einit(encl, sigstruct, !rc);
> +	}
> +	return rc;
> +}
> +
> +#endif
> +
>   struct lsm_blob_sizes selinux_blob_sizes __lsm_ro_after_init = {
>   	.lbs_cred = sizeof(struct task_security_struct),
>   	.lbs_file = sizeof(struct file_security_struct),
> @@ -6808,6 +6877,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
>   
>   	LSM_HOOK_INIT(file_permission, selinux_file_permission),
>   	LSM_HOOK_INIT(file_alloc_security, selinux_file_alloc_security),
> +	LSM_HOOK_INIT(file_free_security, selinux_file_free_security),
>   	LSM_HOOK_INIT(file_ioctl, selinux_file_ioctl),
>   	LSM_HOOK_INIT(mmap_file, selinux_mmap_file),
>   	LSM_HOOK_INIT(mmap_addr, selinux_mmap_addr),
> @@ -6968,6 +7038,11 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
>   	LSM_HOOK_INIT(bpf_map_free_security, selinux_bpf_map_free),
>   	LSM_HOOK_INIT(bpf_prog_free_security, selinux_bpf_prog_free),
>   #endif
> +
> +#ifdef CONFIG_INTEL_SGX
> +	LSM_HOOK_INIT(enclave_load, selinux_enclave_load),
> +	LSM_HOOK_INIT(enclave_init, selinux_enclave_init),
> +#endif
>   };
>   
>   static __init int selinux_init(void)
> diff --git a/security/selinux/include/intel_sgx.h b/security/selinux/include/intel_sgx.h
> new file mode 100644
> index 000000000000..8f9c6c734921
> --- /dev/null
> +++ b/security/selinux/include/intel_sgx.h
> @@ -0,0 +1,18 @@
> +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
> +// Copyright(c) 2016-18 Intel Corporation.
> +
> +#ifndef _SELINUX_SGXSEC_H_
> +#define _SELINUX_SGXSEC_H_
> +
> +#include <linux/lsm_hooks.h>
> +
> +#define SGX__EXECUTE	1
> +#define SGX__EXECMOD	2
> +
> +void sgxsec_enclave_free(struct file *);
> +int sgxsec_mprotect(struct vm_area_struct *, size_t);
> +int sgxsec_eadd(struct file *, size_t, size_t, size_t);
> +int sgxsec_eaug(struct file *, size_t, size_t, size_t);
> +int sgxsec_einit(struct file *, const struct sgx_sigstruct *, int);
> +
> +#endif
> diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h
> index 231262d8eac9..0fb4da7e3a8a 100644
> --- a/security/selinux/include/objsec.h
> +++ b/security/selinux/include/objsec.h
> @@ -71,6 +71,9 @@ struct file_security_struct {
>   	u32 fown_sid;		/* SID of file owner (for SIGIO) */
>   	u32 isid;		/* SID of inode at the time of file open */
>   	u32 pseqno;		/* Policy seqno at the time of file open */
> +#ifdef CONFIG_INTEL_SGX
> +	atomic_long_t esec;
> +#endif
>   };
>   
>   struct superblock_security_struct {
> diff --git a/security/selinux/intel_sgx.c b/security/selinux/intel_sgx.c
> new file mode 100644
> index 000000000000..37dacf5c295f
> --- /dev/null
> +++ b/security/selinux/intel_sgx.c
> @@ -0,0 +1,292 @@
> +// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
> +// Copyright(c) 2016-18 Intel Corporation.
> +
> +#include "objsec.h"
> +#include "intel_sgx.h"
> +
> +struct region {
> +	struct list_head	link;
> +	size_t			start;
> +	size_t			end;
> +	size_t			data;
> +};
> +
> +static inline struct region *region_new(void)
> +{
> +	struct region *n = kzalloc(sizeof(struct region), GFP_KERNEL);
> +	if (n)
> +		INIT_LIST_HEAD(&n->link);
> +	return n;
> +}
> +
> +static inline void region_free(struct region *r)
> +{
> +	list_del(&r->link);
> +	kfree(r);
> +}
> +
> +static struct list_head *
> +region_apply_to_range(struct list_head *rgs,
> +		      size_t start, size_t end,
> +		      struct list_head *(*cb)(struct region *,
> +					      size_t, size_t, size_t),
> +		      size_t arg)
> +{
> +	struct region *r, *n;
> +
> +	list_for_each_entry(r, rgs, link)
> +		if (start < r-> end)
> +			break;
> +
> +	if (&r->link == rgs || end <= r->start)
> +		return rgs;
> +
> +	do {
> +		struct list_head *ret;
> +		n = list_next_entry(r, link);
> +		ret = (*cb)(r, start, end, arg);
> +		if (ret)
> +			return ret;
> +		r = n;
> +	} while (&r->link != rgs && r->start < end);
> +	return &r->link;
> +}
> +
> +static struct list_head *
> +region_clear_cb(struct region *r, size_t start, size_t end, size_t arg)
> +{
> +	if (end < r->end) {
> +		if (start > r->start) {
> +			struct region *n = region_new();
> +			if (unlikely(!n))
> +				return ERR_PTR(-ENOMEM);
> +
> +			n->start = r->start;
> +			n->end = start;
> +			n->data = r->data;
> +			list_add_tail(&n->link, &r->link);
> +		}
> +		r->start = end;
> +		return &r->link;
> +	}
> +
> +	if (start > r->start)
> +		r->end = start;
> +	else
> +		region_free(r);
> +	return NULL;
> +}
> +
> +static inline struct list_head *
> +region_clear_range(struct list_head *rgs, size_t start, size_t end)
> +{
> +	return region_apply_to_range(rgs, start, end, region_clear_cb, 0);
> +}
> +
> +static struct list_head *
> +region_add_range(struct list_head *rgs, size_t start, size_t end, size_t data)
> +{
> +	struct region *r, *n;
> +
> +	n = list_entry(region_clear_range(rgs, start, end), typeof(*n), link);
> +	if (unlikely(IS_ERR_VALUE(&n->link)))
> +		return &n->link;
> +
> +	if (&n->link != rgs && end == n->start && data == n->data) {
> +		n->start = start;
> +		r = n;
> +	} else {
> +		r = region_new();
> +		if (unlikely(!r))
> +			return ERR_PTR(-ENOMEM);
> +
> +		r->start = start;
> +		r->end = end;
> +		r->data = data;
> +		list_add_tail(&r->link, &n->link);
> +	}
> +
> +	n = list_prev_entry(r, link);
> +	if (&n->link != rgs && start == n->end && data == n->data) {
> +		r->start = n->start;
> +		region_free(n);
> +	}
> +
> +	return &r->link;
> +}
> +
> +static inline int
> +enclave_add_pages(struct list_head *rgs, size_t start, size_t end, size_t flags)
> +{
> +	void *p = region_add_range(rgs, start, end, flags);
> +	return PTR_ERR_OR_ZERO(p);
> +}
> +
> +static inline int enclave_prot_allowed(size_t prot, size_t flags)
> +{
> +	return !(prot & VM_EXEC) || (flags & SGX__EXECUTE);
> +}
> +
> +static struct list_head *
> +enclave_prot_check_cb(struct region *r, size_t start, size_t end, size_t prot)
> +{
> +	if (!enclave_prot_allowed(prot, r->data))
> +		return ERR_PTR(-EACCES);
> +	return NULL;
> +}
> +
> +static struct list_head *
> +enclave_prot_set_cb(struct region *r, size_t start, size_t end, size_t prot)
> +{
> +	BUG_ON(!enclave_prot_allowed(prot, r->data));
> +
> +	if (!(prot & VM_WRITE) ||
> +	    (r->data & SGX__EXECMOD) ||
> +	    !(r->data & SGX__EXECUTE))
> +		return NULL;
> +
> +	if (end < r->end) {
> +		struct region *n = region_new();
> +		if (unlikely(!n))
> +			return ERR_PTR(-ENOMEM);
> +
> +		n->start = end;
> +		n->end = r->end;
> +		n->data = r->data;
> +		r->end = end;
> +		list_add(&n->link, &r->link);
> +	}
> +
> +	if (start > r->start) {
> +		struct region *n = region_new();
> +		if (unlikely(!n))
> +			return ERR_PTR(-ENOMEM);
> +
> +		n->start = r->start;
> +		n->end = start;
> +		n->data = r->data;
> +		r->start = start;
> +		list_add_tail(&n->link, &r->link);
> +	}
> +
> +	r->data &= ~SGX__EXECUTE;
> +	return NULL;
> +}
> +
> +static inline int
> +enclave_mprotect(struct list_head *rgs, size_t start, size_t end, size_t prot)
> +{
> +	void *ret;
> +
> +	ret = region_apply_to_range(rgs, start, end,
> +				    enclave_prot_check_cb, prot);
> +	if (!IS_ERR_VALUE(ret) && (prot & VM_WRITE))
> +		ret = region_apply_to_range(rgs, start, end,
> +					    enclave_prot_set_cb, prot);
> +	return PTR_ERR_OR_ZERO(ret);
> +}
> +
> +struct enclave_sec {
> +	struct rw_semaphore	sem;
> +	struct list_head	regions;
> +	size_t			eaug_perm;
> +};
> +
> +static inline struct enclave_sec *__esec(struct file_security_struct *fsec)
> +{
> +	return (struct enclave_sec *)atomic_long_read(&fsec->esec);
> +}
> +
> +static struct enclave_sec *encl_esec(struct file *encl)
> +{
> +	struct file_security_struct *fsec = selinux_file(encl);
> +	struct enclave_sec *esec = __esec(fsec);
> +
> +	if (unlikely(!esec)) {
> +		long n;
> +
> +		esec = kzalloc(sizeof(*esec), GFP_KERNEL);
> +		if (!esec)
> +			return NULL;
> +
> +		init_rwsem(&esec->sem);
> +		INIT_LIST_HEAD(&esec->regions);
> +
> +		n = atomic_long_cmpxchg(&fsec->esec, 0, (long)esec);
> +		if (n) {
> +			kfree(esec);
> +			esec = (typeof(esec))n;
> +		}
> +	}
> +
> +	return esec;
> +}
> +
> +void sgxsec_enclave_free(struct file *encl)
> +{
> +	struct enclave_sec *esec = __esec(selinux_file(encl));
> +
> +	if (esec) {
> +		struct region *r, *n;
> +
> +		BUG_ON(rwsem_is_locked(&esec->sem));
> +
> +		list_for_each_entry_safe(r, n, &esec->regions, link)
> +			region_free(r);
> +
> +		kfree(esec);
> +	}
> +}
> +
> +int sgxsec_mprotect(struct vm_area_struct *vma, size_t prot)
> +{
> +	struct enclave_sec *esec;
> +	int rc;
> +
> +	if (!vma->vm_file || !(esec = __esec(selinux_file(vma->vm_file)))) {
> +		/* Positive return value indicates non-enclave VMA */
> +		return 1;
> +	}
> +
> +	down_read(&esec->sem);
> +	rc = enclave_mprotect(&esec->regions, vma->vm_start, vma->vm_end, prot);

Why is it safe for this to only use down_read()? enclave_mprotect() can 
call enclave_prot_set_cb() which modifies the list?

> +	up_read(&esec->sem);
> +	return rc;
> +}
> +
> +int sgxsec_eadd(struct file *encl, size_t start, size_t size, size_t perm)
> +{
> +	struct enclave_sec *esec = encl_esec(encl);
> +	int rc;
> +
> +	if (down_write_killable(&esec->sem))
> +		return -EINTR;
> +	rc = enclave_add_pages(&esec->regions, start, start + size, perm);
> +	up_write(&esec->sem);
> +	return rc;
> +}
> +
> +int sgxsec_eaug(struct file *encl, size_t start, size_t size, size_t prot)
> +{
> +	struct enclave_sec *esec = encl_esec(encl);
> +	int rc = -EPERM;
> +
> +	if (down_write_killable(&esec->sem))
> +		return -EINTR;
> +	if (enclave_prot_allowed(prot, esec->eaug_perm))
> +		rc = enclave_add_pages(&esec->regions, start, start + size,
> +				       esec->eaug_perm);
> +	up_write(&esec->sem);
> +	return rc;
> +}
> +
> +int sgxsec_einit(struct file *encl, const struct sgx_sigstruct *sigstruct, int execmod)
> +{
> +	struct enclave_sec *esec = encl_esec(encl);
> +
> +	if (down_write_killable(&esec->sem))
> +		return -EINTR;
> +	esec->eaug_perm = execmod ? SGX__EXECUTE | SGX__EXECMOD : 0;
> +	up_write(&esec->sem);
> +	return 0;
> +}

I haven't looked at this code closely, but it feels like a lot of 
SGX-specific logic embedded into SELinux that will have to be repeated 
or reused for every security module.  Does SGX not track this state itself?


* Re: [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves
  2019-06-10 16:44     ` Andy Lutomirski
@ 2019-06-11 17:21       ` Stephen Smalley
  0 siblings, 0 replies; 67+ messages in thread
From: Stephen Smalley @ 2019-06-11 17:21 UTC (permalink / raw)
  To: Andy Lutomirski, Jarkko Sakkinen
  Cc: Sean Christopherson, Cedric Xing, James Morris, Serge E . Hallyn,
	LSM List, Paul Moore, Eric Paris, selinux, Jethro Beekman,
	Dave Hansen, Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx, Andrew Morton, nhorman, npmccallum, Serge Ayoun,
	Shay Katz-zamir, Haitao Huang, Andy Shevchenko, Kai Svahn,
	Borislav Petkov, Josh Triplett, Kai Huang, David Rientjes,
	William Roberts, Philip Tricca

On 6/10/19 12:44 PM, Andy Lutomirski wrote:
> On Mon, Jun 10, 2019 at 9:00 AM Jarkko Sakkinen
> <jarkko.sakkinen@linux.intel.com> wrote:
>>
>> On Wed, Jun 05, 2019 at 07:11:43PM -0700, Sean Christopherson wrote:
>>> +             goto out;
>>> +     }
>>> +
>>> +     /*
>>> +      * Query VM_MAYEXEC as an indirect path_noexec() check (see do_mmap()),
>>> +      * but with some future proofing against other cases that may deny
>>> +      * execute permissions.
>>> +      */
>>> +     if (!(vma->vm_flags & VM_MAYEXEC)) {
>>> +             ret = -EACCES;
>>> +             goto out;
>>> +     }
>>> +
>>> +     if (copy_from_user(dst, (void __user *)src, PAGE_SIZE))
>>> +             ret = -EFAULT;
>>> +     else
>>> +             ret = 0;
>>> +
>>> +out:
>>> +     up_read(&current->mm->mmap_sem);
>>> +
>>> +     return ret;
>>> +}
>>
>> I would suggest expressing the above like this instead, for clarity
>> and consistency:
>>
>>                  goto err_mmap_sem;
>>          }
>>
>>          /* Query VM_MAYEXEC as an indirect path_noexec() check
>>           * (see do_mmap()).
>>           */
>>          if (!(vma->vm_flags & VM_MAYEXEC)) {
>>                  ret = -EACCES;
>>                  goto err_mmap_sem;
>>          }
>>
>>          if (copy_from_user(dst, (void __user *)src, PAGE_SIZE)) {
>>                  ret = -EFAULT;
>>                  goto err_mmap_sem;
>>          }
>>
>>          return 0;
>>
>> err_mmap_sem:
>>          up_read(&current->mm->mmap_sem);
>>          return ret;
>> }
>>
>> The comment about future proofing is unnecessary.
>>
> 
> I'm also torn as to whether this patch is needed at all.  If we ever
> get O_MAYEXEC, then enclave loaders should use it to enforce noexec in
> userspace.  Otherwise I'm unconvinced it's that special.

What's a situation where we would want to allow this?  Why is it 
different than do_mmap()?





* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-11 13:40     ` Stephen Smalley
@ 2019-06-11 22:02       ` Sean Christopherson
  2019-06-12  9:32         ` Dr. Greg
                           ` (2 more replies)
  2019-06-11 22:55       ` Xing, Cedric
  1 sibling, 3 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-11 22:02 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Cedric Xing, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, dave.hansen, tglx, torvalds, akpm, nhorman, pmccallum,
	serge.ayoun, shay.katz-zamir, haitao.huang, andriy.shevchenko,
	kai.svahn, bp, josh, kai.huang, rientjes, william.c.roberts,
	philip.b.tricca

On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> I haven't looked at this code closely, but it feels like a lot of
> SGX-specific logic embedded into SELinux that will have to be repeated or
> reused for every security module.  Does SGX not track this state itself?

SGX does track equivalent state.

There are three proposals on the table (I think):

  1. Require userspace to explicitly specify (maximal) enclave page
     permissions at build time.  The enclave page permissions are provided
     to, and checked by, LSMs at enclave build time.

     Pros: Low-complexity kernel implementation, straightforward auditing
     Cons: Sullies the SGX UAPI to some extent, may increase complexity of
           SGX2 enclave loaders.

  2. Pre-check LSM permissions and dynamically track mappings to enclave
     pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
     based on the pre-checked permissions.

     Pros: Does not impact SGX UAPI, medium kernel complexity
     Cons: Auditing is complex/weird, requires taking enclave-specific
           lock during mprotect() to query/update tracking.

  3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
     from cradle to grave, but otherwise defer everything to LSMs.

     Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
     Cons: Most complex and "heaviest" kernel implementation of the three,
           pushes more SGX details into LSMs.

My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
prefers #2.  Cedric's RFC series implements #3.

Perhaps the easiest way to make forward progress is to rule out the
options we absolutely *don't* want by focusing on the potentially blocking
issue with each option:

  #1 - SGX UAPI funkiness

  #2 - Auditing complexity, potential enclave lock contention

  #3 - Pushing SGX details into LSMs and complexity of kernel implementation


[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com
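
For readers skimming the thread, the core check behind option #1 can be
sketched in a few lines of user-space C.  This is only an illustration
under stated assumptions, not the code from the series: the DEMO_* names
and demo_map_allowed() are hypothetical, and the subset test simply
mirrors the "prot & ~page->prot" idea of declaring maximal protections
up front and rejecting anything beyond them.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical protection bits, for illustration only. */
#define DEMO_PROT_READ  0x1UL
#define DEMO_PROT_WRITE 0x2UL
#define DEMO_PROT_EXEC  0x4UL
#define DEMO_PROT_MASK  (DEMO_PROT_READ | DEMO_PROT_WRITE | DEMO_PROT_EXEC)

/*
 * max_prot: maximal protections declared for the page at build time.
 * prot:     protections requested for a mapping of that page.
 *
 * Returns 0 if the request stays within the declared maximum,
 * -EACCES if it asks for any bit that was not declared.
 */
static int demo_map_allowed(unsigned long max_prot, unsigned long prot)
{
	prot &= DEMO_PROT_MASK;
	if (prot & ~max_prot)
		return -EACCES;
	return 0;
}
```

With a page declared as read+execute, a read-only or read+execute
mapping passes, while any request that includes write is refused.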


* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-11 13:40     ` Stephen Smalley
  2019-06-11 22:02       ` Sean Christopherson
@ 2019-06-11 22:55       ` Xing, Cedric
  2019-06-13 18:00         ` Stephen Smalley
  1 sibling, 1 reply; 67+ messages in thread
From: Xing, Cedric @ 2019-06-11 22:55 UTC (permalink / raw)
  To: Stephen Smalley, linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: jarkko.sakkinen, luto, jmorris, serge, paul, eparis, jethro,
	Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum, Ayoun,
	Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko, Svahn,
	Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C, Tricca,
	Philip B

> From: linux-sgx-owner@vger.kernel.org [mailto:linux-sgx-
> owner@vger.kernel.org] On Behalf Of Stephen Smalley
> Sent: Tuesday, June 11, 2019 6:40 AM
> 
> >
> > +#ifdef CONFIG_INTEL_SGX
> > +	rc = sgxsec_mprotect(vma, prot);
> > +	if (rc <= 0)
> > +		return rc;
> 
> Why are you skipping the file_map_prot_check() call when rc == 0?
> What would SELinux check if you didn't do so -
> FILE__READ|FILE__WRITE|FILE__EXECUTE to /dev/sgx/enclave?  Is it a
> problem to let SELinux proceed with that check?

We can continue the check. But in practice, all of FILE__{READ|WRITE|EXECUTE} are needed for every enclave, so what's the point of checking them? FILE__EXECMOD may be the only flag that has a meaning, but it's somewhat redundant because the sigstruct file was already checked against it.

> > +static int selinux_enclave_load(struct file *encl, unsigned long addr,
> > +				unsigned long size, unsigned long prot,
> > +				struct vm_area_struct *source)
> > +{
> > +	if (source) {
> > +		/**
> > +		 * Adding page from source => EADD request
> > +		 */
> > +		int rc = selinux_file_mprotect(source, prot, prot);
> > +		if (rc)
> > +			return rc;
> > +
> > +		if (!(prot & VM_EXEC) &&
> > +		    selinux_file_mprotect(source, VM_EXEC, VM_EXEC))
> 
> I wouldn't conflate VM_EXEC with PROT_EXEC even if they happen to be
> defined with the same values currently.  Elsewhere the kernel appears to
> explicitly translate them ala calc_vm_prot_bits().

Thanks! I'll change them to PROT_EXEC in the next version.
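
To make the point about not conflating the two flag families concrete,
here is a small user-space sketch of an explicit translation step, in
the spirit of calc_vm_prot_bits().  Everything in it is an assumption
for illustration: the DEMO_* constants are hypothetical and were given
different numeric values on purpose, so that reusing a PROT_* bit as a
VM_* bit would visibly go wrong.

```c
#include <assert.h>

/* Hypothetical userspace-facing protection bits. */
#define DEMO_PROT_READ  0x1UL
#define DEMO_PROT_WRITE 0x2UL
#define DEMO_PROT_EXEC  0x4UL

/* Hypothetical kernel-internal vma bits, deliberately different. */
#define DEMO_VM_READ    0x10UL
#define DEMO_VM_WRITE   0x20UL
#define DEMO_VM_EXEC    0x40UL

/* If 'from' is set in x, yield 'to'; otherwise yield 0. */
static unsigned long demo_trans(unsigned long x, unsigned long from,
				unsigned long to)
{
	return (x & from) ? to : 0;
}

/* Translate PROT_* style bits into VM_* style bits, one bit at a time. */
static unsigned long demo_calc_vm_prot_bits(unsigned long prot)
{
	return demo_trans(prot, DEMO_PROT_READ, DEMO_VM_READ) |
	       demo_trans(prot, DEMO_PROT_WRITE, DEMO_VM_WRITE) |
	       demo_trans(prot, DEMO_PROT_EXEC, DEMO_VM_EXEC);
}
```

Code that tests the untranslated value against a VM_* bit only works
for as long as the two encodings happen to coincide.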

> 
> Also, this will mean that we will always perform an execute check on all
> sources, thereby triggering audit denial messages for any EADD sources
> that are only intended to be data.  Depending on the source, this could
> trigger PROCESS__EXECMEM or FILE__EXECMOD or FILE__EXECUTE.  In a world
> where users often just run any denials they see through audit2allow,
> they'll end up always allowing them all.  How can they tell whether it
> was needed? It would be preferable if we could only trigger execute
> checks when there is some probability that execute will be requested in
> the future.  Alternatives would be to silence the audit of these
> permission checks always via use of _noaudit() interfaces or to silence
> audit of these permissions via dontaudit rules in policy, but the latter
> would hide all denials of the permission by the process, not just those
> triggered from security_enclave_load().  And if we silence them, then we
> won't see them even if they were needed.

*_noaudit() is exactly what I wanted. But I couldn't find selinux_file_mprotect_noaudit()/file_has_perm_noaudit(), and I'm reluctant to duplicate code. Any suggestions?
 
> 
> > +			prot = 0;
> > +		else {
> > +			prot = SGX__EXECUTE;
> > +			if (source->vm_file &&
> > +			    !file_has_perm(current_cred(), source->vm_file,
> > +					   FILE__EXECMOD))
> > +				prot |= SGX__EXECMOD;
> 
> Similarly, this means that we will always perform a FILE__EXECMOD check
> on all executable sources, triggering audit denial messages for any EADD
> source that is executable but to which EXECMOD is not allowed, and again
> the most common pattern will be that users will add EXECMOD to all
> executable sources to avoid this.
> 
> > +		}
> > +		return sgxsec_eadd(encl, addr, size, prot);
> > +	} else {
> > +		/**
> > +		  * Adding page from NULL => EAUG request
> > +		  */
> > +		return sgxsec_eaug(encl, addr, size, prot);
> > +	}
> > +}
> > +
> > +static int selinux_enclave_init(struct file *encl,
> > +				const struct sgx_sigstruct *sigstruct,
> > +				struct vm_area_struct *vma)
> > +{
> > +	int rc = 0;
> > +
> > +	if (!vma)
> > +		rc = -EINVAL;
> 
> Is it ever valid to call this hook with a NULL vma?  If not, this should
> be handled/prevented by the caller.  If so, I'd just return -EINVAL
> immediately here.

vma shall never be NULL. I'll update it in the next version.

> 
> > +
> > +	if (!rc && !(vma->vm_flags & VM_EXEC))
> > +		rc = selinux_file_mprotect(vma, VM_EXEC, VM_EXEC);
> 
> I had thought we were trying to avoid overloading FILE__EXECUTE (or
> whatever gets checked here, e.g. could be PROCESS__EXECMEM or
> FILE__EXECMOD) on the sigstruct file, since the caller isn't truly
> executing code from it.

Agreed. Another problem with FILE__EXECMOD on the sigstruct file is that user code would then be allowed to modify SIGSTRUCT at will, which effectively wipes out the protection provided by FILE__EXECUTE.

> 
> I'd define new ENCLAVE__* permissions, including an up-front
> ENCLAVE__INIT permission that governs whether the sigstruct file can be
> used at all irrespective of memory protections.

Agreed.

> 
> Then you can also have ENCLAVE__EXECUTE, ENCLAVE__EXECMEM,
> ENCLAVE__EXECMOD for the execute-related checks.  Or you can use the
> /dev/sgx/enclave inode as the target for the execute checks and just
> reuse the file permissions there.

Now we've got 2 options - 1) New ENCLAVE__* flags on sigstruct file or 2) FILE__* on /dev/sgx/enclave. Which one do you think makes more sense?

ENCLAVE__EXECMEM seems to offer finer granularity (than PROCESS__EXECMEM) but I wonder if it'd have any real use in practice.

> > +int sgxsec_mprotect(struct vm_area_struct *vma, size_t prot)
> > +{
> > +	struct enclave_sec *esec;
> > +	int rc;
> > +
> > +	if (!vma->vm_file || !(esec = __esec(selinux_file(vma->vm_file)))) {
> > +		/* Positive return value indicates non-enclave VMA */
> > +		return 1;
> > +	}
> > +
> > +	down_read(&esec->sem);
> > +	rc = enclave_mprotect(&esec->regions, vma->vm_start, vma->vm_end, prot);
> 
> Why is it safe for this to only use down_read()? enclave_mprotect() can
> call enclave_prot_set_cb() which modifies the list?

Probably because it was too late at night when I wrote this line:-( Good catch!

> 
> I haven't looked at this code closely, but it feels like a lot of SGX-
> specific logic embedded into SELinux that will have to be repeated or
> reused for every security module.  Does SGX not track this state itself?

I can tell you have looked quite closely, and I truly thank you for your time!

You are right that there is SGX-specific stuff. More precisely, SGX enclaves don't have access to anything except memory, so there are only 3 questions that need to be answered for each enclave page: 1) whether X is allowed; 2) whether W->X is allowed; and 3) whether WX is allowed. This proposal tries to cache the answers to those questions upon creation of each enclave page, meaning it involves a) figuring out the answers and b) remembering them for every page. #b is generic, mostly captured in intel_sgx.c, and could be shared among all LSM modules, while #a is SELinux specific. I could move intel_sgx.c up one level in the directory hierarchy if that's what you'd suggest.
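
As a rough illustration of "cache the three answers per page, then
consult them at mprotect() time", consider the user-space sketch below.
It is not the code from the patch: struct demo_page, the DEMO_* flags
and demo_page_mprotect() are hypothetical names, and tracking "has this
page ever been writable" in a single field is a simplifying assumption.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_VM_WRITE 0x2UL	/* hypothetical stand-in for VM_WRITE */
#define DEMO_VM_EXEC  0x4UL	/* hypothetical stand-in for VM_EXEC */

/* Cached answers to the three per-page questions. */
#define DEMO_ALLOW_X		0x1U	/* 1) may the page be executable?    */
#define DEMO_ALLOW_W_THEN_X	0x2U	/* 2) may it go from W to X?         */
#define DEMO_ALLOW_WX		0x4U	/* 3) may it be W and X at once?     */

struct demo_page {
	unsigned int allowed;	/* answers cached when the page is created */
	int was_writable;	/* set once the page has been mapped writable */
};

/* Returns 0 if the new protections are acceptable, -EACCES otherwise. */
static int demo_page_mprotect(struct demo_page *p, unsigned long prot)
{
	int w = !!(prot & DEMO_VM_WRITE);
	int x = !!(prot & DEMO_VM_EXEC);

	if (x && !(p->allowed & DEMO_ALLOW_X))
		return -EACCES;
	if (x && w && !(p->allowed & DEMO_ALLOW_WX))
		return -EACCES;
	if (x && p->was_writable && !(p->allowed & DEMO_ALLOW_W_THEN_X))
		return -EACCES;

	if (w)
		p->was_writable = 1;	/* remembered for later W->X checks */
	return 0;
}
```

In this model, the "figuring out the answers" part would fill in
`allowed` once per page, and the generic part is just the lookup above.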

By "SGX", did you mean the SGX subsystem being upstreamed? It doesn’t track that state. In practice, there's no way for SGX to track it because there's no vm_ops->may_mprotect() callback. It doesn't follow the philosophy of Linux either, as mprotect() doesn't track it for regular memory. And it doesn't have a use without LSM, so I believe it makes more sense to track it inside LSM.


* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-10 22:28       ` Xing, Cedric
@ 2019-06-12  0:09         ` Andy Lutomirski
  2019-06-12 14:34           ` Sean Christopherson
  0 siblings, 1 reply; 67+ messages in thread
From: Andy Lutomirski @ 2019-06-12  0:09 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Andy Lutomirski, Christopherson, Sean J, Jarkko Sakkinen,
	Stephen Smalley, James Morris, Serge E . Hallyn, LSM List,
	Paul Moore, Eric Paris, selinux, Jethro Beekman, Hansen, Dave,
	Thomas Gleixner, Linus Torvalds, LKML, X86 ML, linux-sgx,
	Andrew Morton, nhorman, npmccallum, Ayoun, Serge, Katz-zamir,
	Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Tricca, Philip B



On Jun 10, 2019, at 3:28 PM, Xing, Cedric <cedric.xing@intel.com> wrote:

>> From: Andy Lutomirski [mailto:luto@kernel.org]
>> Sent: Monday, June 10, 2019 12:15 PM
>> 
>> On Mon, Jun 10, 2019 at 11:29 AM Xing, Cedric <cedric.xing@intel.com>
>> wrote:
>>> 
>>>> From: Christopherson, Sean J
>>>> Sent: Wednesday, June 05, 2019 7:12 PM
>>>> 
>>>> +/**
>>>> + * sgx_map_allowed - check vma protections against the associated enclave page
>>>> + * @encl:    an enclave
>>>> + * @start:   start address of the mapping (inclusive)
>>>> + * @end:     end address of the mapping (exclusive)
>>>> + * @prot:    protection bits of the mapping
>>>> + *
>>>> + * Verify a userspace mapping to an enclave page would not violate the security
>>>> + * requirements of the *kernel*.  Note, this is in no way related to the
>>>> + * page protections enforced by hardware via the EPCM.  The EPCM protections
>>>> + * can be directly extended by the enclave, i.e. cannot be relied upon by the
>>>> + * kernel for security guarantees of any kind.
>>>> + *
>>>> + * Return:
>>>> + *   0 on success,
>>>> + *   -EACCES if the mapping is disallowed
>>>> + */
>>>> +int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
>>>> +                 unsigned long end, unsigned long prot)
>>>> +{
>>>> +     struct sgx_encl_page *page;
>>>> +     unsigned long addr;
>>>> +
>>>> +     prot &= (VM_READ | VM_WRITE | VM_EXEC);
>>>> +     if (!prot || !encl)
>>>> +             return 0;
>>>> +
>>>> +     mutex_lock(&encl->lock);
>>>> +
>>>> +     for (addr = start; addr < end; addr += PAGE_SIZE) {
>>>> +             page = radix_tree_lookup(&encl->page_tree, addr >> PAGE_SHIFT);
>>>> +
>>>> +             /*
>>>> +              * Do not allow R|W|X to a non-existent page, or protections
>>>> +              * beyond those of the existing enclave page.
>>>> +              */
>>>> +             if (!page || (prot & ~page->prot))
>>>> +                     return -EACCES;
>>> 
>>> In SGX2, pages will be "mapped" before being populated.
>>> 
>>> Here's a brief summary for those who don't have enough background on
>>> how new EPC pages could be added to a running enclave in SGX2:
>>>  - There are 2 new instructions - EACCEPT and EAUG.
>>>  - EAUG is used by SGX module to add (augment) a new page to an
>>>    existing enclave. The newly added page is *inaccessible* until the
>>>    enclave *accepts* it.
>>>  - EACCEPT is the instruction for an enclave to accept a new page.
>>> 
>>> And the s/w flow for an enclave to request new EPC pages is expected
>>> to be something like the following:
>>>  - The enclave issues EACCEPT at the linear address that it would
>>>    like a new page.
>>>  - EACCEPT results in #PF, as there's no page at the linear address
>>>    above.
>>>  - SGX module is notified about the #PF, in form of its
>>>    vma->vm_ops->fault() being called by kernel.
>>>  - SGX module EAUGs a new EPC page at the fault address, and resumes
>>>    the enclave.
>>>  - EACCEPT is reattempted, and succeeds at this time.
>> 
>> This seems like an odd workflow.  Shouldn't the #PF return back to
>> untrusted userspace so that the untrusted user code can make its own
>> decision as to whether it wants to EAUG a page there as opposed to, say,
>> killing the enclave or waiting to keep resource usage under control?
> 
> This may seem odd to some at first glance. But if you think about how a static heap (pre-allocated by EADD before EINIT) works, the loader parses the "metadata" coming with the enclave to decide the address/size of the heap, EADDs it, and calls it done. In the case of a "dynamic" heap (allocated dynamically by EAUG after EINIT), the same thing applies - the loader determines the range of the heap, tells the SGX module about it, and calls it done. Everything else is between the enclave and the SGX module.
> 
> In practice, untrusted code usually doesn't know much about enclaves, just like it doesn't know much about the shared objects loaded into its address space either. Without the necessary knowledge, untrusted code usually just does what it is told (via o-calls, or return value from e-calls), without judging whether that's right or wrong. 
> 
> When it comes to #PF like what I described, of course a signal could be sent to the untrusted code but what would it do then? Usually it'd just come back asking for a page at the fault address. So we figured it'd be more efficient to just have the kernel EAUG at #PF. 
> 
> Please don't get me wrong though, as I'm not dictating what the s/w flow shall be. It's just going to be a choice offered to user mode. And that choice was planned to be offered via mprotect() - i.e. a writable vma causes kernel to EAUG while a non-writable vma will result in a signal (then the user mode could decide whether to EAUG). The key point is flexibility - as we want to allow all reasonable s/w flows instead of dictating one over others. We had similar discussions on vDSO API before. And I think you accepted my approach because of its flexibility. Am I right?

As long as user code can turn this off, I have no real objection. But it might make sense to have it be more explicit — have an ioctl set up a range as “EAUG-on-demand”.

But this is all currently irrelevant. We can argue about it when the patches show up. :)
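
For what it's worth, the "explicit EAUG-on-demand range" idea could look
roughly like the following user-space model.  It is purely a sketch
under assumptions: no such ioctl exists in the posted patches, and
struct demo_eaug_range, the enum and demo_handle_fault() are
hypothetical names.  A fault inside a range that userspace opted in is
satisfied by the kernel; any other fault is reported back to userspace.

```c
#include <assert.h>
#include <stddef.h>

/* What the (hypothetical) fault handler decides to do. */
enum demo_fault_action {
	DEMO_SIGNAL_USER,	/* hand the #PF back to untrusted userspace */
	DEMO_EAUG_PAGE		/* kernel EAUGs a page and resumes the enclave */
};

/* Range that userspace explicitly marked as "EAUG-on-demand". */
struct demo_eaug_range {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
};

/* r may be NULL, meaning userspace never opted in. */
static enum demo_fault_action
demo_handle_fault(const struct demo_eaug_range *r, unsigned long addr)
{
	if (r && addr >= r->start && addr < r->end)
		return DEMO_EAUG_PAGE;
	return DEMO_SIGNAL_USER;
}
```

Leaving the range unset keeps the default behaviour of bouncing every
fault to userspace, which is the "user code can turn this off" property.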


* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-11 22:02       ` Sean Christopherson
@ 2019-06-12  9:32         ` Dr. Greg
  2019-06-12 14:25           ` Sean Christopherson
  2019-06-12 19:30         ` Andy Lutomirski
  2019-06-13 17:02         ` Stephen Smalley
  2 siblings, 1 reply; 67+ messages in thread
From: Dr. Greg @ 2019-06-12  9:32 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Stephen Smalley, Cedric Xing, linux-security-module, selinux,
	linux-kernel, linux-sgx, jarkko.sakkinen, luto, jmorris, serge,
	paul, eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

On Tue, Jun 11, 2019 at 03:02:43PM -0700, Sean Christopherson wrote:

Good morning, I hope the week is proceeding well for everyone.

> On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> > I haven't looked at this code closely, but it feels like a lot of
> > SGX-specific logic embedded into SELinux that will have to be repeated or
> > reused for every security module.  Does SGX not track this state itself?

> SGX does track equivalent state.
> 
> There are three proposals on the table (I think):
> 
>   1. Require userspace to explicitly specify (maximal) enclave page
>      permissions at build time.  The enclave page permissions are provided
>      to, and checked by, LSMs at enclave build time.
> 
>      Pros: Low-complexity kernel implementation, straightforward auditing
>      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
>            SGX2 enclave loaders.
> 
>   2. Pre-check LSM permissions and dynamically track mappings to enclave
>      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
>      based on the pre-checked permissions.
> 
>      Pros: Does not impact SGX UAPI, medium kernel complexity
>      Cons: Auditing is complex/weird, requires taking enclave-specific
>            lock during mprotect() to query/update tracking.
> 
>   3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
>      from cradle to grave, but otherwise defer everything to LSMs.
> 
>      Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
>      Cons: Most complex and "heaviest" kernel implementation of the three,
>            pushes more SGX details into LSMs.
> 
> My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> prefers #2.  Cedric's RFC series implements #3.
> 
> Perhaps the easiest way to make forward progress is to rule out the
> options we absolutely *don't* want by focusing on the potentially blocking
> issue with each option:
>
>   #1 - SGX UAPI funkiness
> 
>   #2 - Auditing complexity, potential enclave lock contention
> 
>   #3 - Pushing SGX details into LSMs and complexity of kernel implementation

At the risk of repeating myself, I believe the issue that has not
received full clarity is that, for a security relevant solution, there
have to be two separate aspects of LSM coverage for SGX.  I believe
that a high level review of the requirements may assist in selection
of a course of action for the driver.

The first aspect of LSM control has been covered extensively and that
is the notion of implementing control over the ability of a user
identity to request some cohort of page privileges.  The cohort of
obvious concern is the ability of a page to possess both WRITE and
EXECUTE privileges at some time during its lifetime.

Given that SGX2 support is the ultimate and necessary goal for this
driver, the selected proposal should be the one that gives the
simplest application of this policy.  As I have noted previously,
once SGX2 becomes available, the only relevant security control that
can be realized with this type of LSM support is whether or not the
platform owner wishes to limit access by a user identity to the
ability to dynamically load code in enclave context.

With SGX2 we will, by necessity, have to admit the notion that a
platform owner will not have any effective visibility into code that
is loaded and executed, since it can come in over a secured network
connection in an enclave security context.  This advocates for the
simplest approach possible to providing some type of regulation to any
form of WX page access.

Current state of the art, and there doesn't appear to be a reason to
change this, is to package an enclave in the form of an ELF shared
library.  It seems straightforward to inherit and act on page
privileges from the privileges specified on the ELF sections that are
loaded.  Loaders will have a file descriptor available so an mmap of
the incoming page with the specified privileges should trigger the
required LSM interventions and tie them to a specific enclave.

The current enclave 'standard' also uses layout metadata, stored in a
special .notes section of the shared image, to direct a loader with
respect to construction of the enclave stack, heap, TCS and other
miscellaneous regions not directly coded by the ELF TEXT sections.  It
seems straightforward to extend this paradigm to declare region(s) of
an enclave that are eligible to be generated at runtime (EAUG'ed) with
the RWX protections needed to support dynamically loaded code.

If an enclave wishes to support this functionality, it would seem
straightforward to require an enclave to provide a single zero page
which the loader will mmap with those protections in order to trigger
the desired LSM checks against that specific enclave.

The simplest driver approach that achieves the desired introspection
of permissions in the described framework will implement as much LSM
security as is possible with SGX technology and with minimal
disruption to the existing SGX software eco-system.

This leaves the second aspect of LSM security and that is the ability
to inspect and act on the initialized characteristics of the enclave.
This is the aspect of SGX LSM functionality that has not been clearly
called out.

All that is needed here is an LSM hook that gets handed a pointer to
the signature structure (SIGSTRUCT) that is passed to the EINIT ioctl.
If the SIGSTRUCT does not match the proposed enclave image that the
processor has computed secondary to the enclave image creation process,
the enclave will not initialize, so all that is needed is for an LSM
to be allowed to interpret and act on the characteristics defined in
that structure before the enclave is actually initialized.

As we have now collectively demonstrated, it is easy to get lost in
minutiae with respect to all of this.  I believe if we can focus on a
solution that implements what I have discussed above we will achieve
as much as can be achieved with respect to platform security for SGX
systems.

Best wishes for a productive remainder of the week.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686            EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"Nullum magnum ingenium sine mixtura dementiae fuit."
        (There is no great genius without some touch of madness.)
                                -- Seneca


* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-12  9:32         ` Dr. Greg
@ 2019-06-12 14:25           ` Sean Christopherson
  2019-06-13  7:25             ` Dr. Greg
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-12 14:25 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Stephen Smalley, Cedric Xing, linux-security-module, selinux,
	linux-kernel, linux-sgx, jarkko.sakkinen, luto, jmorris, serge,
	paul, eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

On Wed, Jun 12, 2019 at 04:32:21AM -0500, Dr. Greg wrote:
> With SGX2 we will, by necessity, have to admit the notion that a
> platform owner will not have any effective visibility into code that
> is loaded and executed, since it can come in over a secured network
> connection in an enclave security context.  This advocates for the
> simplest approach possible to providing some type of regulation to any
> form of WX page access.

I believe we're all on the same page in the sense that we all want the
"simplest approach possible", but there's a sliding scale of complexity
between the kernel and userspace.  We can make life simple for userspace
at the cost of additional complexity in the kernel, and vice versa.  The
disagreement is over where to shove the extra complexity.

> Current state of the art, and there doesn't appear to be a reason to
> change this, is to package an enclave in the form of an ELF shared
> library.  It seems straightforward to inherit and act on page
> privileges from the privileges specified on the ELF sections that are
> loaded.  Loaders will have a file descriptor available so an mmap of
> the incoming page with the specified privileges should trigger the
> required LSM interventions and tie them to a specific enclave.
> 
> The current enclave 'standard' also uses layout metadata, stored in a
> special .notes section of the shared image, to direct a loader with
> respect to construction of the enclave stack, heap, TCS and other
> miscellaneous regions not directly coded by the ELF TEXT sections.  It
> seems straight forward to extend this paradigm to declare region(s) of
> an enclave that are eligible to be generated at runtime (EAUG'ed) with
> the RWX protections needed to support dynamically loaded code.
> 
> If an enclave wishes to support this functionality, it would seem
> straight forward to require an enclave to provide a single zero page
> which the loader will mmap with those protections in order to trigger
> the desired LSM checks against that specific enclave.

This is effectively #1, i.e. it would require userspace to pre-declare its
intent to make regions W->X.

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-12  0:09         ` Andy Lutomirski
@ 2019-06-12 14:34           ` Sean Christopherson
  2019-06-12 18:20             ` Xing, Cedric
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-12 14:34 UTC (permalink / raw)
  To: Andy Lutomirski, q
  Cc: Xing, Cedric, Andy Lutomirski, Jarkko Sakkinen, Stephen Smalley,
	James Morris, Serge E . Hallyn, LSM List, Paul Moore, Eric Paris,
	selinux, Jethro Beekman, Hansen, Dave, Thomas Gleixner,
	Linus Torvalds, LKML, X86 ML, linux-sgx, Andrew Morton, nhorman,
	npmccallum, Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao,
	Andy Shevchenko, Svahn, Kai, Borislav Petkov, Josh Triplett,
	Huang, Kai, David Rientjes, Roberts, William C, Tricca, Philip B

On Tue, Jun 11, 2019 at 05:09:28PM -0700, Andy Lutomirski wrote:
> 
> On Jun 10, 2019, at 3:28 PM, Xing, Cedric <cedric.xing@intel.com> wrote:
> 
> >> From: Andy Lutomirski [mailto:luto@kernel.org]
> >> Sent: Monday, June 10, 2019 12:15 PM
> >> This seems like an odd workflow.  Shouldn't the #PF return back to
> >> untrusted userspace so that the untrusted user code can make its own
> >> decision as to whether it wants to EAUG a page there as opposed to, say,
> >> killing the enclave or waiting to keep resource usage under control?
> > 
> > This may seem odd to some at the first glance. But if you can think of how
> > static heap (pre-allocated by EADD before EINIT) works, the loader parses the
> > "metadata" coming with the enclave to decide the address/size of the heap,
> > EADDs it, and calls it done. In the case of "dynamic" heap (allocated
> > dynamically by EAUG after EINIT), the same thing applies - the loader
> > determines the range of the heap, tells the SGX module about it, and calls
> > it done. Everything else is between the enclave and the SGX module.
> > 
> > In practice, untrusted code usually doesn't know much about enclaves, just
> > like it doesn't know much about the shared objects loaded into its address
> > space either. Without the necessary knowledge, untrusted code usually just
> > does what it is told (via o-calls, or return value from e-calls), without
> > judging that's right or wrong. 
> > 
> > When it comes to #PF like what I described, of course a signal could be
> > sent to the untrusted code but what would it do then? Usually it'd just
> > come back asking for a page at the fault address. So we figured it'd be
> > more efficient to just have the kernel EAUG at #PF. 
> > 
> > Please don't get me wrong though, as I'm not dictating what the s/w flow
> > shall be. It's just going to be a choice offered to user mode. And that
> > choice was planned to be offered via mprotect() - i.e. a writable vma
> > causes kernel to EAUG while a non-writable vma will result in a signal
> > (then the user mode could decide whether to EAUG). The key point is
> > flexibility - as we want to allow all reasonable s/w flows instead of
> > dictating one over others. We had similar discussions on vDSO API before.
> > And I think you accepted my approach because of its flexibility. Am I
> > right?
> 
> As long as user code can turn this off, I have no real objection. But it
> might make sense to have it be more explicit — have an ioctl set up a range
> as “EAUG-on-demand”.

This was part of the motivation behind changing SGX_IOC_ENCLAVE_ADD_PAGE
to SGX_IOC_ENCLAVE_ADD_REGION and adding a @flags parameter.  E.g. adding
support for "EAUG-on-demand" regions would just be a new flag.

> But this is all currently irrelevant. We can argue about it when the patches
> show up. :)

* RE: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-12 14:34           ` Sean Christopherson
@ 2019-06-12 18:20             ` Xing, Cedric
  0 siblings, 0 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-12 18:20 UTC (permalink / raw)
  To: Christopherson, Sean J, Andy Lutomirski, q
  Cc: Andy Lutomirski, Jarkko Sakkinen, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Hansen, Dave, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko,
	Svahn, Kai, Borislav Petkov, Josh Triplett, Huang, Kai,
	David Rientjes, Roberts, William C, Tricca, Philip B

> From: Christopherson, Sean J
> Sent: Wednesday, June 12, 2019 7:34 AM
> 
> On Tue, Jun 11, 2019 at 05:09:28PM -0700, Andy Lutomirski wrote:
> >
> > On Jun 10, 2019, at 3:28 PM, Xing, Cedric <cedric.xing@intel.com> wrote:
> >
> > >> From: Andy Lutomirski [mailto:luto@kernel.org]
> > >> Sent: Monday, June 10, 2019 12:15 PM
> > >> This seems like an odd workflow.  Shouldn't the #PF return back to
> > >> untrusted userspace so that the untrusted user code can make its own
> > >> decision as to whether it wants to EAUG a page there as opposed to, say,
> > >> killing the enclave or waiting to keep resource usage under control?
> > >
> > > This may seem odd to some at the first glance. But if you can think
> > > of how static heap (pre-allocated by EADD before EINIT) works, the
> > > loader parses the "metadata" coming with the enclave to decide the
> > > address/size of the heap, EADDs it, and calls it done. In the case
> > > of "dynamic" heap (allocated dynamically by EAUG after EINIT), the
> > > same thing applies - the loader determines the range of the heap,
> > > tells the SGX module about it, and calls it done. Everything else is
> > > between the enclave and the SGX module.
> > >
> > > In practice, untrusted code usually doesn't know much about
> > > enclaves, just like it doesn't know much about the shared objects
> > > loaded into its address space either. Without the necessary
> > > knowledge, untrusted code usually just does what it is told (via
> > > o-calls, or return value from e-calls), without judging that's right
> > > or wrong.
> > >
> > > When it comes to #PF like what I described, of course a signal could
> > > be sent to the untrusted code but what would it do then? Usually
> > > it'd just come back asking for a page at the fault address. So we
> > > figured it'd be more efficient to just have the kernel EAUG at #PF.
> > >
> > > Please don't get me wrong though, as I'm not dictating what the s/w
> > > flow shall be. It's just going to be a choice offered to user mode.
> > > And that choice was planned to be offered via mprotect() - i.e. a
> > > writable vma causes kernel to EAUG while a non-writable vma will
> > > result in a signal (then the user mode could decide whether to
> > > EAUG). The key point is flexibility - as we want to allow all
> > > reasonable s/w flows instead of dictating one over others. We had
> > > similar discussions on vDSO API before.
> > > And I think you accepted my approach because of its flexibility. Am
> > > I right?
> >
> > As long as user code can turn this off, I have no real objection. But
> > it might make sense to have it be more explicit — have an ioctl set up
> > a range as “EAUG-on-demand”.
> 
> This was part of the motivation behind changing SGX_IOC_ENCLAVE_ADD_PAGE
> to SGX_IOC_ENCLAVE_ADD_REGION and adding a @flags parameter.  E.g.
> adding support for "EAUG-on-demand" regions would just be a new flag.

We'll end up with some sort of interface eventually, but it's too early to discuss that.

What we need right now is the plumbing, i.e. the range has to be mmap()'ed and it cannot be PROT_NONE, otherwise vm_ops->fault() will not be reached.

> 
> > But this is all currently irrelevant. We can argue about it when the
> > patches show up. :)

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
  2019-06-10 18:17         ` Sean Christopherson
@ 2019-06-12 19:26           ` Jarkko Sakkinen
  0 siblings, 0 replies; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-12 19:26 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Mon, Jun 10, 2019 at 11:17:44AM -0700, Sean Christopherson wrote:
> On Mon, Jun 10, 2019 at 08:45:06PM +0300, Jarkko Sakkinen wrote:
> > On Mon, Jun 10, 2019 at 09:15:33AM -0700, Sean Christopherson wrote:
> > > > > 'flags' should be renamed as 'secinfo_flags_mask' even if the name is
> > > > > longish. It would use the same values as the SECINFO flags. The field in
> > > > > struct sgx_encl_page should have the same name. That would express
> > > > > exactly the relation between SECINFO and the new field. I would have never
> > > > > asked on last iteration why SECINFO is not enough with a better naming.
> > > 
> > > No, these flags do not impact the EPCM protections in any way.  Userspace
> > > can extend the EPCM protections without going through the kernel.  The
> > > protection flags for an enclave page impact VMA/PTE protection bits.
> > > 
> > > IMO, it is best to treat the EPCM as being completely separate from the
> > > kernel's EPC management.
> > 
> > It is a clumsy API if permissions are not taken in the same format for
> > everything. There is no reason not to do it. The way mprotect() callback
> > just interprets the field is as VMA permissions.
> 
> They are two entirely different things.  The explicit protection bits are
> consumed by the kernel, while SECINFO.flags is consumed by the CPU.  The
> intent is to have the protection flags be analogous to mprotect(), the
> fact that they have a similar/identical format to SECINFO is irrelevant.
> 
> Calling the field secinfo_flags_mask is straight up wrong on SGX2, as 
> userspace can use EMODPE to set SECINFO after the page is added.  It's
> also wrong on SGX1 when adding TCS pages since SECINFO.RWX bits for TCS
> pages are forced to zero by hardware.

The new variable expresses the limits within which the kernel will
co-operate with the enclave. It is way more descriptive than 'flags'.

> > It would also be more future-proof just to have a mask covering all bits
> > of the SECINFO flags field.
> 
> This simply doesn't work, e.g. the PENDING, MODIFIED and PR flags in the
> SECINFO are read-only from a software perspective.

It is easy to validate reserved bits from a SECINFO struct.
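
As a rough illustration of what such a check could look like, here is a
minimal sketch.  The bit positions follow the SECINFO.FLAGS layout in the
Intel SDM (R/W/X in bits 0-2, PENDING/MODIFIED/PR in bits 3-5, PAGE_TYPE
in bits 15:8); the macro and function names are hypothetical and are not
taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* SECINFO.FLAGS bits per the Intel SDM; the names are illustrative only. */
#define SECINFO_R		(1ULL << 0)
#define SECINFO_W		(1ULL << 1)
#define SECINFO_X		(1ULL << 2)
#define SECINFO_PENDING		(1ULL << 3)	/* read-only to software */
#define SECINFO_MODIFIED	(1ULL << 4)	/* read-only to software */
#define SECINFO_PR		(1ULL << 5)	/* read-only to software */
#define SECINFO_PAGE_TYPE	(0xffULL << 8)

/* Bits that userspace may legitimately supply in this sketch. */
#define SECINFO_USER_MASK \
	(SECINFO_R | SECINFO_W | SECINFO_X | SECINFO_PAGE_TYPE)

/* Reject reserved bits and the software read-only status bits. */
static bool secinfo_flags_valid(uint64_t flags)
{
	return (flags & ~SECINFO_USER_MASK) == 0;
}
```

A mask covering all SECINFO bits would still have to reject PENDING,
MODIFIED and PR on input, which is what the check above does.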

/Jarkko

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-11 22:02       ` Sean Christopherson
  2019-06-12  9:32         ` Dr. Greg
@ 2019-06-12 19:30         ` Andy Lutomirski
  2019-06-12 22:02           ` Sean Christopherson
  2019-06-13 17:02         ` Stephen Smalley
  2 siblings, 1 reply; 67+ messages in thread
From: Andy Lutomirski @ 2019-06-12 19:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Stephen Smalley, Cedric Xing, LSM List, selinux, LKML, linux-sgx,
	Jarkko Sakkinen, Andrew Lutomirski, James Morris,
	Serge E. Hallyn, Paul Moore, Eric Paris, Jethro Beekman,
	Dave Hansen, Thomas Gleixner, Linus Torvalds, Andrew Morton,
	nhorman, pmccallum, Ayoun, Serge, Katz-zamir, Shay, Huang,
	Haitao, Andy Shevchenko, Svahn, Kai, Borislav Petkov,
	Josh Triplett, Huang, Kai, David Rientjes, Roberts, William C,
	Philip Tricca

On Tue, Jun 11, 2019 at 3:02 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> > I haven't looked at this code closely, but it feels like a lot of
> > SGX-specific logic embedded into SELinux that will have to be repeated or
> > reused for every security module.  Does SGX not track this state itself?
>
> SGX does track equivalent state.
>
> There are three proposals on the table (I think):

Sounds about right.  I've been playing with #1 and #2 (as text, not
code), and I'll post my latest thoughts on it below.  But first, I
should mention that I think we've gotten a bit too caught up on
SELinux-y terminology like "EXECMOD" and "EXECMEM", which is of limited
relevance since the kernel has very little visibility into what the
enclave is doing.  Instead, I think we should think about the relevant
permissions more like this:

a) "execute code from a particular source, e.g. a file"
b) "execute code supplied from arbitrary memory outside the enclave"
c) "execute code generated within the enclave"
d) "possess WX enclave memory"

I think that any sensible policy that allows (b) should allow (a).
Similarly, any policy that allows (d) should allow (c).   I don't see
any particular need for the kernel to go out of its way to ensure
these relationships, though.

We could plausibly also distinguish "execute measured code", although
I think that the details of defining and implementing this, especially
with SGX2, could be nastier than we want to deal with.  A minimal
approach that mostly ignores SGX2 would be to have another permission
"execute code supplied from outside the enclave that was not
measured".  This permission would be required on top of (a) or (b),
depending on where that code comes from.

If we want to map these to existing SELinux terms, we could use
EXECUTE for (a), EXECMOD for (c), and EXECMEM for (d). (b) seems to
also map to EXECMOD or EXECMEM depending on exactly how it happens,
and I'm not sure this makes all that much sense.

>
>   1. Require userspace to explicitly specify (maximal) enclave page
>      permissions at build time.  The enclave page permissions are provided
>      to, and checked by, LSMs at enclave build time.
>
>      Pros: Low-complexity kernel implementation, straightforward auditing
>      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
>            SGX2 enclave loaders.

In my notes, this works like this.  This is similar, but not
identical, to what Sean has been sending out.

EADD takes flags: ALLOW_READ, ALLOW_WRITE, ALLOW_EXEC.  It calls a new hook:

  int security_enclave_load(struct vm_area_struct *source, unsigned int flags);

(Sean passed in the secinfo protection too, but I think we agreed
that this could be omitted.)  This hook will fail if ALLOW_EXEC is
requested and the LSM doesn't consider the source VMA to be
executable.  Privileges (a) and (b) are implemented here.

Optionally, we can enforce noexec here.

The future EAUG ioctl takes the same flags, but it doesn't call
security_enclave_load().  (As Cedric noted, the actual user API for EAUG
is not settled, but I don't think it makes much difference here.)

EINIT takes a sigstruct pointer.  SGX calls a new hook:

  unsigned int security_enclave_init(struct sigstruct *sigstruct,
                                     struct vm_area_struct *source,
                                     unsigned int flags);

This hook can return -EPERM.  Otherwise it returns 0 or a combination of
flags DENY_WX and DENY_X_IF_ALLOW_WRITE.  The driver saves this value.
These represent permissions (c) and (d).

If we want to have a permission for "execute code supplied from
outside the enclave that was not measured", we could have a flag like
HAS_UNMEASURED_ALLOW_EXEC_PAGE that the LSM could consider.

mmap() and mprotect() enforce the following rules:

 - Deny if a PROT_ flag is requested but the corresponding ALLOW_ flag
   is not set for all pages in question.

 - Deny if PROT_WRITE, PROT_EXEC, and DENY_WX are all set.

 - Deny if PROT_EXEC, ALLOW_WRITE, and DENY_X_IF_ALLOW_WRITE are all set.

mprotect() and mmap() do *not* call SGX-specific LSM hooks to ask for
permission, although they can optionally call an LSM hook if they hit one of
the -EPERM cases for auditing purposes.
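
The three rules above can be sketched as a single check.  This is only an
illustration under stated assumptions: the ALLOW_* and DENY_* names come
from this mail, while the REQ_* bits, the numeric values and the function
name encl_may_map_page() are made up for the example.

```c
#include <errno.h>

/* Maximal per-page permissions declared at EADD (values are assumptions). */
#define ALLOW_READ		0x1u
#define ALLOW_WRITE		0x2u
#define ALLOW_EXEC		0x4u

/* Per-enclave restrictions saved from security_enclave_init(). */
#define DENY_WX			0x1u
#define DENY_X_IF_ALLOW_WRITE	0x2u

/* Requested protections; laid out to line up with ALLOW_* in this sketch. */
#define REQ_READ		0x1u
#define REQ_WRITE		0x2u
#define REQ_EXEC		0x4u

/* Hypothetical helper: returns 0 if the mapping is allowed, else -EPERM. */
static int encl_may_map_page(unsigned int req, unsigned int allow,
			     unsigned int deny)
{
	/* Rule 1: every requested PROT_ bit needs its ALLOW_ bit. */
	if (req & ~allow)
		return -EPERM;

	/* Rule 2: simultaneous W+X is refused when DENY_WX is set. */
	if ((req & REQ_WRITE) && (req & REQ_EXEC) && (deny & DENY_WX))
		return -EPERM;

	/* Rule 3: X on a page that may ever be written. */
	if ((req & REQ_EXEC) && (allow & ALLOW_WRITE) &&
	    (deny & DENY_X_IF_ALLOW_WRITE))
		return -EPERM;

	return 0;
}
```

For a multi-page range the check would have to pass for every page in the
range, since rule 1 is stated in terms of "all pages in question".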


I think this model works quite well in an SGX1 world.  The main thing
that makes me uneasy about this model is that, in SGX2, it requires
that an SGX2-compatible enclave loader must pre-declare to the kernel
whether it intends for its dynamically allocated memory to be
ALLOW_EXEC.  If ALLOW_EXEC is set but not actually needed, it will
still fail if DENY_X_IF_ALLOW_WRITE ends up being set.  The other
version below does not have this limitation.

>
>   2. Pre-check LSM permissions and dynamically track mappings to enclave
>      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
>      based on the pre-checked permissions.
>
>      Pros: Does not impact SGX UAPI, medium kernel complexity
>      Cons: Auditing is complex/weird, requires taking enclave-specific
>            lock during mprotect() to query/update tracking.

Here's how this looks in my mind.  It's quite similar, except that
ALLOW_READ, ALLOW_WRITE, and ALLOW_EXEC are replaced with a little
state machine.

EADD does not take any special flags.  It calls this LSM hook:

  int security_enclave_load(struct vm_area_struct *source);

This hook can return -EPERM.  Otherwise it returns 0 or
ALLOW_EXEC_IF_UNMODIFIED (i.e. 1).  This hook enforces permissions (a) and (b).

The driver tracks a state for each page, and the possible states are:

 - CLEAN_MAYEXEC /* no W or X VMAs have existed, but X is okay */
 - CLEAN_NOEXEC /* no W or X VMAs have existed, and X is not okay */
 - CLEAN_EXEC /* no W VMA has existed, but an X VMA has existed */
 - DIRTY /* a W VMA has existed */

The initial state for a page is CLEAN_MAYEXEC if the hook said
ALLOW_EXEC_IF_UNMODIFIED and CLEAN_NOEXEC otherwise.

The future EAUG does not call a hook at all and puts pages into the state
CLEAN_NOEXEC.  If SGX3 or later ever adds EAUG-but-don't-clear, it can
call security_enclave_load() and add CLEAN_MAYEXEC pages if appropriate.

EINIT takes a sigstruct pointer.  SGX calls a new hook:

  unsigned int security_enclave_init(struct sigstruct *sigstruct,
                                     struct vm_area_struct *source,
                                     unsigned int flags);

This hook can return -EPERM.  Otherwise it returns 0 or a combination of
flags DENY_WX and DENY_X_DIRTY.  The driver saves this value.
These represent permissions (c) and (d).

If we want to have a permission for "execute code supplied from outside the
enclave that was not measured", we could have a flag like
HAS_UNMEASURED_CLEAN_EXEC_PAGE that the LSM could consider.

mmap() and mprotect() enforce the following rules:

 - If VM_EXEC is requested and (either the page is DIRTY or VM_WRITE is
   requested) and DENY_X_DIRTY, then deny.

 - If VM_WRITE and VM_EXEC are both requested and DENY_WX, then deny.

 - If VM_WRITE is requested, we need to update the state.  If it was
   CLEAN_EXEC, then we reject if DENY_X_DIRTY.  Otherwise we change the
   state to DIRTY.

 - If VM_EXEC is requested and the page is CLEAN_NOEXEC, then deny.

mprotect() and mmap() do *not* call SGX-specific LSM hooks to ask for
permission, although they can optionally call an LSM hook if they hit one of
the -EPERM cases for auditing purposes.

Before the SIGSTRUCT is provided to the driver, the driver acts as though
DENY_X_DIRTY and DENY_WX are both set.
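
A minimal sketch of the per-page state machine, under a few assumptions:
the state and DENY_* names are the ones used in this mail, but the REQ_*
bits, the numeric values and encl_page_map() are hypothetical.  The rules
are applied in the order listed, so a W+X request on a CLEAN_NOEXEC page
is marked DIRTY before the CLEAN_NOEXEC check runs; that is one possible
reading of the text.  The CLEAN_MAYEXEC -> CLEAN_EXEC transition is not
spelled out above and is inferred from the state comments.

```c
#include <errno.h>

enum encl_page_state {
	CLEAN_MAYEXEC,	/* no W or X VMAs have existed, but X is okay */
	CLEAN_NOEXEC,	/* no W or X VMAs have existed, and X is not okay */
	CLEAN_EXEC,	/* no W VMA has existed, but an X VMA has existed */
	DIRTY,		/* a W VMA has existed */
};

/* Per-enclave restrictions from security_enclave_init() (values assumed). */
#define DENY_WX		0x1u
#define DENY_X_DIRTY	0x2u

/* Requested protections (assumed encoding for this sketch). */
#define REQ_WRITE	0x2u
#define REQ_EXEC	0x4u

/*
 * Hypothetical helper: check one page against a mmap()/mprotect() request
 * and update its state.  Before EINIT the caller would pass
 * deny = DENY_WX | DENY_X_DIRTY.  On failure the state is left untouched.
 */
static int encl_page_map(enum encl_page_state *state, unsigned int req,
			 unsigned int deny)
{
	enum encl_page_state s = *state;

	if ((req & REQ_EXEC) && (s == DIRTY || (req & REQ_WRITE)) &&
	    (deny & DENY_X_DIRTY))
		return -EPERM;

	if ((req & REQ_WRITE) && (req & REQ_EXEC) && (deny & DENY_WX))
		return -EPERM;

	if (req & REQ_WRITE) {
		if (s == CLEAN_EXEC && (deny & DENY_X_DIRTY))
			return -EPERM;
		s = DIRTY;
	}

	if ((req & REQ_EXEC) && s == CLEAN_NOEXEC)
		return -EPERM;

	/* Inferred: remember that an X VMA has existed for a clean page. */
	if ((req & REQ_EXEC) && s == CLEAN_MAYEXEC)
		s = CLEAN_EXEC;

	*state = s;
	return 0;
}
```

With this reading, an EAUG'd page (CLEAN_NOEXEC) can be mapped W, written,
and later mapped X only if the LSM did not return DENY_X_DIRTY.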

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-12 19:30         ` Andy Lutomirski
@ 2019-06-12 22:02           ` Sean Christopherson
  2019-06-13  0:10             ` Xing, Cedric
  2019-06-13  1:02             ` Xing, Cedric
  0 siblings, 2 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-12 22:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stephen Smalley, Cedric Xing, LSM List, selinux, LKML, linux-sgx,
	Jarkko Sakkinen, James Morris, Serge E. Hallyn, Paul Moore,
	Eric Paris, Jethro Beekman, Dave Hansen, Thomas Gleixner,
	Linus Torvalds, Andrew Morton, nhorman, pmccallum, Ayoun, Serge,
	Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Philip Tricca

On Wed, Jun 12, 2019 at 12:30:20PM -0700, Andy Lutomirski wrote:
> On Tue, Jun 11, 2019 at 3:02 PM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> >   1. Require userspace to explicitly specify (maximal) enclave page
> >      permissions at build time.  The enclave page permissions are provided
> >      to, and checked by, LSMs at enclave build time.
> >
> >      Pros: Low-complexity kernel implementation, straightforward auditing
> >      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
> >            SGX2 enclave loaders.
> 
> In my notes, this works like this.  This is similar, but not
> identical, to what Sean has been sending out.

...

> mmap() and mprotect() enforce the following rules:
> 
>  - Deny if a PROT_ flag is requested but the corresponding ALLOW_ flag
>    is not set for all pages in question.
> 
>  - Deny if PROT_WRITE, PROT_EXEC, and DENY_WX are all set.
> 
>  - Deny if PROT_EXEC, ALLOW_WRITE, and DENY_X_IF_ALLOW_WRITE are all set.
> 
> mprotect() and mmap() do *not* call SGX-specific LSM hooks to ask for
> permission, although they can optionally call an LSM hook if they hit one of
> the -EPERM cases for auditing purposes.

IMO, #1 only makes sense if it's stripped down to avoid auditing and
locking complications, i.e. gets a pass/fail at security_enclave_load()
and clears VM_MAY* flags during mmap().  If we want WX and W->X to be
differentiated by security_enclave_init() as opposed to
security_enclave_load(), then we should just scrap #1.
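
To make the "clears VM_MAY* flags during mmap()" part concrete, here is a
rough userspace-testable sketch.  The VM_* values mirror the kernel's
include/linux/mm.h, and the second helper mimics the generic mprotect()
check that refuses any protection bit whose VM_MAY* counterpart is clear.
The function names and the "allowed" bitmap are assumptions for the
example, not code from the series.

```c
#include <errno.h>
#include <stdbool.h>

/* Same values as the kernel's vm_flags bits. */
#define VM_READ		0x01ul
#define VM_WRITE	0x02ul
#define VM_EXEC		0x04ul
#define VM_MAYREAD	0x10ul
#define VM_MAYWRITE	0x20ul
#define VM_MAYEXEC	0x40ul

#define VM_PROT_BITS	(VM_READ | VM_WRITE | VM_EXEC)

/*
 * Hypothetical mmap()-time helper: "allowed" holds the VM_READ/WRITE/EXEC
 * bits that every enclave page in the range permits.  Fails if the mapping
 * already asks for more; otherwise drops the VM_MAY* bits that may never
 * be granted.
 */
static int encl_restrict_vma(unsigned long *vm_flags, unsigned long allowed)
{
	unsigned long denied = ~allowed & VM_PROT_BITS;

	if (*vm_flags & denied)
		return -EACCES;

	*vm_flags &= ~(denied << 4);	/* VM_MAYx == VM_x << 4 */
	return 0;
}

/* Mimics the generic check applied to a later mprotect() request. */
static bool mprotect_would_pass(unsigned long vm_flags, unsigned long newprot)
{
	return ((newprot & ~(vm_flags >> 4)) & VM_PROT_BITS) == 0;
}
```

Once the VM_MAY* bits are cleared, a later mprotect() beyond the declared
permissions fails in common code, so no enclave lock or LSM call is needed
on that path.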

> I think this model works quite well in an SGX1 world.  The main thing
> that makes me uneasy about this model is that, in SGX2, it requires
> that an SGX2-compatible enclave loader must pre-declare to the kernel
> whether it intends for its dynamically allocated memory to be
> ALLOW_EXEC.  If ALLOW_EXEC is set but not actually needed, it will
> still fail if DENY_X_IF_ALLOW_WRITE ends up being set.  The other
> version below does not have this limitation.

I'm not convinced this will be a meaningful limitation in practice, though
that's probably obvious from my RFCs :-).  That being said, the UAPI quirk
is essentially a dealbreaker for multiple people, so let's drop #1.

I discussed the options with Cedric offline, and he is ok with option #2
*if* the idea actually translates to acceptable code and doesn't present
problems for userspace and/or future SGX features.

So, I'll work on an RFC series to implement #2 as described below.  If it
works out, yay!  If not, i.e. option #2 is fundamentally broken, I'll
shift my focus to Cedric's code (option #3).

> >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> >      based on the pre-checked permissions.
> >
> >      Pros: Does not impact SGX UAPI, medium kernel complexity
> >      Cons: Auditing is complex/weird, requires taking enclave-specific
> >            lock during mprotect() to query/update tracking.
> 
> Here's how this looks in my mind.  It's quite similar, except that
> ALLOW_READ, ALLOW_WRITE, and ALLOW_EXEC are replaced with a little
> state machine.
> 
> EADD does not take any special flags.  It calls this LSM hook:
> 
>   int security_enclave_load(struct vm_area_struct *source);
> 
> This hook can return -EPERM.  Otherwise it returns 0 or
> ALLOW_EXEC_IF_UNMODIFIED (i.e. 1).  This hook enforces permissions (a) and (b).
> 
> The driver tracks a state for each page, and the possible states are:
> 
>  - CLEAN_MAYEXEC /* no W or X VMAs have existed, but X is okay */
>  - CLEAN_NOEXEC /* no W or X VMAs have existed, and X is not okay */
>  - CLEAN_EXEC /* no W VMA has existed, but an X VMA has existed */
>  - DIRTY /* a W VMA has existed */
> 
> The initial state for a page is CLEAN_MAYEXEC if the hook said
> ALLOW_EXEC_IF_UNMODIFIED and CLEAN_NOEXEC otherwise.
> 
> The future EAUG does not call a hook at all and puts pages into the state
> CLEAN_NOEXEC.  If SGX3 or later ever adds EAUG-but-don't-clear, it can
> call security_enclave_load() and add CLEAN_MAYEXEC pages if appropriate.
> 
> EINIT takes a sigstruct pointer.  SGX calls a new hook:
> 
>   unsigned int security_enclave_init(struct sigstruct *sigstruct,
> struct vm_area_struct *source, unsigned int flags);
> 
> This hook can return -EPERM.  Otherwise it returns 0 or a combination of
> flags DENY_WX and DENY_X_DIRTY.  The driver saves this value.
> These represent permissions (c) and (d).
> 
> If we want to have a permission for "execute code supplied from outside the
> enclave that was not measured", we could have a flag like
> HAS_UNMEASURED_CLEAN_EXEC_PAGE that the LSM could consider.
>
> mmap() and mprotect() enforce the following rules:
> 
>  - If VM_EXEC is requested and (either the page is DIRTY or VM_WRITE is
>    requested) and DENY_X_DIRTY, then deny.
> 
>  - If VM_WRITE and VM_EXEC are both requested and DENY_WX, then deny.
> 
>  - If VM_WRITE is requested, we need to update the state.  If it was
>    CLEAN_EXEC, then we reject if DENY_X_DIRTY.  Otherwise we change the
>    state to DIRTY.
> 
>  - If VM_EXEC is requested and the page is CLEAN_NOEXEC, then deny.
> 
> mprotect() and mmap() do *not* call SGX-specific LSM hooks to ask for
> permission, although they can optionally call an LSM hook if they hit one of
> the -EPERM cases for auditing purposes.
> 
> Before the SIGSTRUCT is provided to the driver, the driver acts as though
> DENY_X_DIRTY and DENY_WX are both set.


* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-12 22:02           ` Sean Christopherson
@ 2019-06-13  0:10             ` Xing, Cedric
  2019-06-13  1:02             ` Xing, Cedric
  1 sibling, 0 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-13  0:10 UTC (permalink / raw)
  To: Christopherson, Sean J, Andy Lutomirski
  Cc: Stephen Smalley, LSM List, selinux, LKML, linux-sgx,
	Jarkko Sakkinen, James Morris, Serge E. Hallyn, Paul Moore,
	Eric Paris, Jethro Beekman, Hansen, Dave, Thomas Gleixner,
	Linus Torvalds, Andrew Morton, nhorman, pmccallum, Ayoun, Serge,
	Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Tricca, Philip B

> From: Christopherson, Sean J
> Sent: Wednesday, June 12, 2019 3:03 PM
> 
> > I think this model works quite well in an SGX1 world.  The main thing
> > that makes me uneasy about this model is that, in SGX2, it requires
> > that an SGX2-compatible enclave loader must pre-declare to the kernel
> > whether it intends for its dynamically allocated memory to be
> > ALLOW_EXEC.  If ALLOW_EXEC is set but not actually needed, it will
> > still fail if DENY_X_IF_ALLOW_WRITE ends up being set.  The other
> > version below does not have this limitation.
> 
> I'm not convinced this will be a meaningful limitation in practice,
> though that's probably obvious from my RFCs :-).  That being said, the
> UAPI quirk is essentially a dealbreaker for multiple people, so let's
> drop #1.
> 
> I discussed the options with Cedric offline, and he is ok with option #2
> *if* the idea actually translates to acceptable code and doesn't present
> problems for userspace and/or future SGX features.
> 
> So, I'll work on an RFC series to implement #2 as described below.  If
> it works out, yay!  If not, i.e. option #2 is fundamentally broken, I'll
> shift my focus to Cedric's code (option #3).
> 
> > >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> > >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> > >      based on the pre-checked permissions.
> > >
> > >      Pros: Does not impact SGX UAPI, medium kernel complexity
> > >      Cons: Auditing is complex/weird, requires taking enclave-specific
> > >            lock during mprotect() to query/update tracking.
> >
> > Here's how this looks in my mind.  It's quite similar, except that
> > ALLOW_READ, ALLOW_WRITE, and ALLOW_EXEC are replaced with a little
> > state machine.
> >
> > EADD does not take any special flags.  It calls this LSM hook:
> >
> >   int security_enclave_load(struct vm_area_struct *source);
> >
> > > This hook can return -EPERM.  Otherwise it returns 0 or
> > > ALLOW_EXEC_IF_UNMODIFIED (i.e. 1).  This hook enforces permissions (a)
> > > and (b).
> >
> > The driver tracks a state for each page, and the possible states are:
> >
> >  - CLEAN_MAYEXEC /* no W or X VMAs have existed, but X is okay */
> >  - CLEAN_NOEXEC /* no W or X VMAs have existed, and X is not okay */
> >  - CLEAN_EXEC /* no W VMA has existed, but an X VMA has existed */
> >  - DIRTY /* a W VMA has existed */
> >
> > The initial state for a page is CLEAN_MAYEXEC if the hook said
> > ALLOW_EXEC_IF_UNMODIFIED and CLEAN_NOEXEC otherwise.
> >
> > > The future EAUG does not call a hook at all and puts pages into the
> > > state CLEAN_NOEXEC.  If SGX3 or later ever adds EAUG-but-don't-clear,
> > > it can call security_enclave_load() and add CLEAN_MAYEXEC pages if
> > > appropriate.
> >
> > EINIT takes a sigstruct pointer.  SGX calls a new hook:
> >
> >   unsigned int security_enclave_init(struct sigstruct *sigstruct,
> > struct vm_area_struct *source, unsigned int flags);
> >
> > This hook can return -EPERM.  Otherwise it returns 0 or a combination
> > of flags DENY_WX and DENY_X_DIRTY.  The driver saves this value.
> > These represent permissions (c) and (d).
> >
> > If we want to have a permission for "execute code supplied from
> > outside the enclave that was not measured", we could have a flag like
> > HAS_UNMEASURED_CLEAN_EXEC_PAGE that the LSM could consider.
> >
> > mmap() and mprotect() enforce the following rules:
> >
> > >  - If VM_EXEC is requested and (either the page is DIRTY or VM_WRITE is
> > >    requested) and DENY_X_DIRTY, then deny.
> > >
> > >  - If VM_WRITE and VM_EXEC are both requested and DENY_WX, then deny.
> > >
> > >  - If VM_WRITE is requested, we need to update the state.  If it was
> > >    CLEAN_EXEC, then we reject if DENY_X_DIRTY.  Otherwise we change the
> > >    state to DIRTY.
> >
> >  - If VM_EXEC is requested and the page is CLEAN_NOEXEC, then deny.
> >
> > mprotect() and mmap() do *not* call SGX-specific LSM hooks to ask for
> > permission, although they can optionally call an LSM hook if they hit
> > one of the -EPERM cases for auditing purposes.
> >
> > Before the SIGSTRUCT is provided to the driver, the driver acts as
> > though DENY_X_DIRTY and DENY_WX are both set.

I think we've been discussing two topics simultaneously: one is the state machine that accepts or rejects mmap/mprotect requests, and the other is where best to put it. I think we agree on the former, and IMO options #2 and #3 differ only in the latter.

Option #2 keeps the state machine inside the SGX subsystem, so it could reuse existing data structures for page tracking/locking to some extent. Sean may have smarter ideas, but it looks to me like the existing 'struct sgx_encl_page' tracks individual enclave pages while the FSM states apply to ranges. So in order *not* to test page by page in mmap/mprotect, I guess some new range-oriented structures are still necessary. But I don't think that's very important anyway.

My major concern is more from the architecture/modularity perspective. Specifically, the state machine is defined by the LSM but SGX does the state transitions. That's a brittle relationship that would break easily if the state machine changes in the future, or if different LSM modules want to define different FSMs (comprising different sets of states and/or triggers). After all, what the SGX subsystem needs is just the decision, not the FSM definition. I think we should take a closer look at this area once Sean's patch comes out.
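
To make the modularity point concrete, here is a rough userspace sketch (not kernel code) in which the SGX side only asks for a decision through a callback and never sees the FSM. Every name in it (enclave_decide_fn, sgx_model, sgx_model_mprotect, wx_policy, deny_wx_policy, the P_* bits) is hypothetical and exists only for illustration; none of it comes from the posted patches.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical protection bits for this sketch only. */
#define P_READ  0x1u
#define P_WRITE 0x2u
#define P_EXEC  0x4u

/* Hypothetical decision callback owned by the policy module.
 * Returns 0 to allow the request or -EPERM to deny it. */
typedef int (*enclave_decide_fn)(void *policy_state, unsigned long start,
				 unsigned long end, unsigned int prot);

/* What the "SGX side" stores: a callback and an opaque pointer. */
struct sgx_model {
	enclave_decide_fn decide;
	void *policy_state;
};

/* SGX side: forwards the request and returns the decision unchanged. */
static int sgx_model_mprotect(struct sgx_model *m, unsigned long start,
			      unsigned long end, unsigned int prot)
{
	assert(m != NULL);
	if (!m->decide)
		return 0;	/* no policy module registered */
	return m->decide(m->policy_state, start, end, prot);
}

/* Policy side: one possible FSM, private to the policy module. */
struct wx_policy {
	int was_written;	/* a writable mapping has been granted */
};

static int deny_wx_policy(void *state, unsigned long start,
			  unsigned long end, unsigned int prot)
{
	struct wx_policy *p = state;

	(void)start;
	(void)end;
	/* Deny WX outright, and deny X once the range has been writable. */
	if ((prot & P_EXEC) && ((prot & P_WRITE) || p->was_written))
		return -EPERM;
	if (prot & P_WRITE)
		p->was_written = 1;
	return 0;
}
```

A different policy module could plug in a different FSM behind the same callback without any change on the SGX side, which is the property being argued for.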


^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-12 22:02           ` Sean Christopherson
  2019-06-13  0:10             ` Xing, Cedric
@ 2019-06-13  1:02             ` Xing, Cedric
  1 sibling, 0 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-13  1:02 UTC (permalink / raw)
  To: Christopherson, Sean J, Andy Lutomirski
  Cc: Stephen Smalley, LSM List, selinux, LKML, linux-sgx,
	Jarkko Sakkinen, James Morris, Serge E. Hallyn, Paul Moore,
	Eric Paris, Jethro Beekman, Hansen, Dave, Thomas Gleixner,
	Linus Torvalds, Andrew Morton, nhorman, Ayoun, Serge, Katz-zamir,
	Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Tricca, Philip B

> From: Christopherson, Sean J
> Sent: Wednesday, June 12, 2019 3:03 PM
> 
> > I think this model works quite well in an SGX1 world.  The main thing
> > that makes me uneasy about this model is that, in SGX2, it requires
> > that an SGX2-compatible enclave loader must pre-declare to the kernel
> > whether it intends for its dynamically allocated memory to be
> > ALLOW_EXEC.  If ALLOW_EXEC is set but not actually needed, it will
> > still fail if DENY_X_IF_ALLOW_WRITE ends up being set.  The other
> > version below does not have this limitation.
> 
> I'm not convinced this will be a meaningful limitation in practice,
> though that's probably obvious from my RFCs :-).  That being said, the
> UAPI quirk is essentially a dealbreaker for multiple people, so let's
> drop #1.
> 
> I discussed the options with Cedric offline, and he is ok with option #2
> *if* the idea actually translates to acceptable code and doesn't present
> problems for userspace and/or future SGX features.
> 
> So, I'll work on an RFC series to implement #2 as described below.  If
> it works out, yay!  If not, i.e. option #2 is fundamentally broken, I'll
> shift my focus to Cedric's code (option #3).
> 
> > >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> > >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> > >      based on the pre-checked permissions.
> > >
> > >      Pros: Does not impact SGX UAPI, medium kernel complexity
> > >      Cons: Auditing is complex/weird, requires taking enclave-specific
> > >            lock during mprotect() to query/update tracking.
> >
> > Here's how this looks in my mind.  It's quite similar, except that
> > ALLOW_READ, ALLOW_WRITE, and ALLOW_EXEC are replaced with a little
> > state machine.
> >
> > EADD does not take any special flags.  It calls this LSM hook:
> >
> >   int security_enclave_load(struct vm_area_struct *source);
> >
> > This hook can return -EPERM.  Otherwise it returns 0 or
> > ALLOW_EXEC_IF_UNMODIFIED (i.e. 1).  This hook enforces permissions (a)
> > and (b).
> >
> > The driver tracks a state for each page, and the possible states are:
> >
> >  - CLEAN_MAYEXEC /* no W or X VMAs have existed, but X is okay */
> >  - CLEAN_NOEXEC /* no W or X VMAs have existed, and X is not okay */
> >  - CLEAN_EXEC /* no W VMA has existed, but an X VMA has existed */
> >  - DIRTY /* a W VMA has existed */
> >
> > The initial state for a page is CLEAN_MAYEXEC if the hook said
> > ALLOW_EXEC_IF_UNMODIFIED and CLEAN_NOEXEC otherwise.
> >
> > The future EAUG does not call a hook at all and puts pages into the
> > state CLEAN_NOEXEC.  If SGX3 or later ever adds EAUG-but-don't-clear,
> > it can call security_enclave_load() and add CLEAN_MAYEXEC pages if
> > appropriate.
> >
> > EINIT takes a sigstruct pointer.  SGX calls a new hook:
> >
> >   unsigned int security_enclave_init(struct sigstruct *sigstruct,
> > struct vm_area_struct *source, unsigned int flags);
> >
> > This hook can return -EPERM.  Otherwise it returns 0 or a combination
> > of flags DENY_WX and DENY_X_DIRTY.  The driver saves this value.
> > These represent permissions (c) and (d).
> >
> > If we want to have a permission for "execute code supplied from
> > outside the enclave that was not measured", we could have a flag like
> > HAS_UNMEASURED_CLEAN_EXEC_PAGE that the LSM could consider.
> >
> > mmap() and mprotect() enforce the following rules:
> >
> >  - If VM_EXEC is requested and (either the page is DIRTY or VM_WRITE is
> >    requested) and DENY_X_DIRTY, then deny.
> >
> >  - If VM_WRITE and VM_EXEC are both requested and DENY_WX, then deny.
> >
> >  - If VM_WRITE is requested, we need to update the state.  If it was
> >    CLEAN_EXEC, then we reject if DENY_X_DIRTY.  Otherwise we change the
> >    state to DIRTY.
> >
> >  - If VM_EXEC is requested and the page is CLEAN_NOEXEC, then deny.
> >
> > mprotect() and mmap() do *not* call SGX-specific LSM hooks to ask for
> > permission, although they can optionally call an LSM hook if they hit
> > one of the -EPERM cases for auditing purposes.
> >
> > Before the SIGSTRUCT is provided to the driver, the driver acts as
> > though DENY_X_DIRTY and DENY_WX are both set.

I think we've been discussing two topics simultaneously: one is the state machine that accepts or rejects mmap/mprotect requests, and the other is where best to put it. I think we agree on the former, and IMO options #2 and #3 differ only in the latter.

Option #2 keeps the state machine inside the SGX subsystem, so it could reuse existing data structures for page tracking/locking to some extent. Sean may have smarter ideas, but it looks to me like the existing 'struct sgx_encl_page' tracks individual enclave pages while the FSM states apply to ranges. So in order *not* to test page by page in mmap/mprotect, I guess some new range-oriented structures are still necessary. But I don't think that's very important anyway.

My major concern is more from the architecture/modularity perspective. Specifically, the state machine is defined by the LSM but SGX does the state transitions. That's a brittle relationship that would break easily if the state machine changes in the future, or if different LSM modules want to define different FSMs (comprising different sets of states and/or triggers). After all, what the SGX subsystem needs is just the decision, not the FSM definition. I think we should take a closer look at this area once Sean's patch comes out.
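
For reference, the mmap()/mprotect() rules quoted above can be sketched as a small per-page state machine. This is a userspace model under stated assumptions, not driver code: the REQ_* bits stand in for VM_WRITE/VM_EXEC, the numeric values of DENY_WX and DENY_X_DIRTY are made up, and two transitions are my reading rather than something the quoted text spells out (an allowed X request moves CLEAN_MAYEXEC to CLEAN_EXEC, and an allowed W request on CLEAN_EXEC moves it to DIRTY).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum page_state { CLEAN_MAYEXEC, CLEAN_NOEXEC, CLEAN_EXEC, DIRTY };

/* Stand-ins for VM_WRITE and VM_EXEC in this model. */
#define REQ_WRITE 0x2u
#define REQ_EXEC  0x4u

/* Flags saved from the EINIT hook; the values are assumptions. */
#define DENY_WX      0x1u
#define DENY_X_DIRTY 0x2u

/* Apply one mmap()/mprotect() request to a page.
 * Returns 0 and updates *state if allowed, -EPERM (state unchanged) if not. */
static int page_request(enum page_state *state, unsigned int req,
			unsigned int deny)
{
	int w = !!(req & REQ_WRITE);
	int x = !!(req & REQ_EXEC);

	assert(state != NULL);

	/* X on a dirty (or about to be dirty) page. */
	if (x && (*state == DIRTY || w) && (deny & DENY_X_DIRTY))
		return -EPERM;
	/* Simultaneous W and X. */
	if (w && x && (deny & DENY_WX))
		return -EPERM;
	/* W on a page that has already been executable. */
	if (w && *state == CLEAN_EXEC && (deny & DENY_X_DIRTY))
		return -EPERM;
	/* X on a page that was never eligible for execution. */
	if (x && *state == CLEAN_NOEXEC)
		return -EPERM;

	/* All checks passed: record the new state. */
	if (w)
		*state = DIRTY;
	else if (x && *state == CLEAN_MAYEXEC)
		*state = CLEAN_EXEC;	/* assumed transition, see above */
	return 0;
}
```

Note that the function needs both the per-page state and the deny flags chosen by the LSM, which is exactly the coupling between SGX and the LSM-defined FSM described above.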


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-12 14:25           ` Sean Christopherson
@ 2019-06-13  7:25             ` Dr. Greg
  0 siblings, 0 replies; 67+ messages in thread
From: Dr. Greg @ 2019-06-13  7:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Stephen Smalley, Cedric Xing, linux-security-module, selinux,
	linux-kernel, linux-sgx, jarkko.sakkinen, luto, jmorris, serge,
	paul, eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

On Wed, Jun 12, 2019 at 07:25:57AM -0700, Sean Christopherson wrote:

Good morning, we hope the week continues to go well for everyone.

> On Wed, Jun 12, 2019 at 04:32:21AM -0500, Dr. Greg wrote:
> > With SGX2 we will, by necessity, have to admit the notion that a
> > platform owner will not have any effective visibility into code that
> > is loaded and executed, since it can come in over a secured network
> > connection in an enclave security context.  This advocates for the
> > simplest approach possible to providing some type of regulation to any
> > form of WX page access.

> I believe we're all on the same page in the sense that we all want
> the "simplest approach possible", but there's a sliding scale of
> complexity between the kernel and userspace.  We can make life
> simple for userspace at the cost of additional complexity in the
> kernel, and vice versa.  The disagreement is over where to shove the
> extra complexity.

Yes, we are certainly cognizant of and sympathetic to the engineering
tensions involved.

The purpose of our e-mail was to leaven the discussion with the notion
that the most important question is how much complexity should be
shoved in either direction.  With respect to SGX as a technology, the
most important engineering metric is how much effective security is
actually being achieved.

Given an admission that enclave dynamic memory management (EDMM/SGX2)
is the goal in all of this, there are only two effective security
questions to be answered:

1.) Should a corpus of known memory with executable permissions be
copied into an enclave.

2.) Should a corpus of executable memory with unknown content be
available to an enclave.

Given the functionality that SGX implements, both questions ultimately
devolve to whether or not a platform owner trusts an enclave author.
Security relevant trust is conveyed through cryptographically mediated
mechanisms.

The decision has been made to take full hardware mediated
cryptographic trust off the table for the mainstream Linux
implementation.  Given that, the most pragmatic engineering solution
would seem to be to implement the least complex implementation that
allows a platform owner to answer the two questions above.

See below.

> > Current state of the art, and there doesn't appear to be a reason to
> > change this, is to package an enclave in the form of an ELF shared
> > library.  It seems straight forward to inherit and act on page
> > privileges from the privileges specified on the ELF sections that are
> > loaded.  Loaders will have a file descriptor available so an mmap of
> > the incoming page with the specified privileges should trigger the
> > required LSM interventions and tie them to a specific enclave.
> > 
> > The current enclave 'standard' also uses layout metadata, stored in a
> > special .notes section of the shared image, to direct a loader with
> > respect to construction of the enclave stack, heap, TCS and other
> > miscellaneous regions not directly coded by the ELF TEXT sections.  It
> > seems straight forward to extend this paradigm to declare region(s) of
> > an enclave that are eligible to be generated at runtime (EAUG'ed) with
> > the RWX protections needed to support dynamically loaded code.
> > 
> > If an enclave wishes to support this functionality, it would seem
> > straight forward to require an enclave to provide a single zero page
> > which the loader will mmap with those protections in order to trigger
> > the desired LSM checks against that specific enclave.

> This is effectively #1, e.g. would require userspace to pre-declare its
> intent to make regions W->X.

Yes, we understood that when we wrote our original e-mail.

This model effectively allows the two relevant security questions to
be easily answered and is most consistent with current enclave
formats, software practices and runtimes.  It is also largely
consistent with existing LSM practices.
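
As a rough illustration of inheriting page privileges from the ELF image, a loader could translate program header flags into the protections it passes to mmap(). This is only a sketch: it assumes a Linux userspace with <elf.h>, it works on program headers (segments) rather than sections, and the helper names are hypothetical.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <sys/mman.h>

/* Translate ELF segment flags (PF_R/PF_W/PF_X) into mmap() protections. */
static int elf_flags_to_prot(unsigned int p_flags)
{
	int prot = PROT_NONE;

	if (p_flags & PF_R)
		prot |= PROT_READ;
	if (p_flags & PF_W)
		prot |= PROT_WRITE;
	if (p_flags & PF_X)
		prot |= PROT_EXEC;
	return prot;
}

/* Protections a loader would request for one program header. */
static int elf_segment_prot(const Elf64_Phdr *ph)
{
	assert(ph != NULL);
	return elf_flags_to_prot(ph->p_flags);
}

/* Nonzero if mapping the segment as declared would require WX. */
static int elf_segment_wants_wx(unsigned int p_flags)
{
	int prot = elf_flags_to_prot(p_flags);

	return (prot & PROT_WRITE) && (prot & PROT_EXEC);
}
```

A segment for which elf_segment_wants_wx() is nonzero is the case where the declared protections would have to pass the WX-related LSM checks at load time.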

There hasn't been any discussion with respect to backports of this
driver but we believe it is safe to conclude that the industry is
going to be at least two years away from any type of realistic
deployments of this driver.  By that time there will be over half a
decade of software deployment of existing APIs and enclave formats.

Expecting a 'flag day' to be successful would seem to be contrary to
all known history of software practice and would thus disadvantage
Linux as an effective platform for this technology.

Best wishes for a productive remainder of the week to everyone.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686            EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"Sometimes I sing and dance around the house in my underwear,
 doesn't make me Madonna, never will.
                                -- Cyn
                                   Working Girl

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-11 22:02       ` Sean Christopherson
  2019-06-12  9:32         ` Dr. Greg
  2019-06-12 19:30         ` Andy Lutomirski
@ 2019-06-13 17:02         ` Stephen Smalley
  2019-06-13 23:03           ` Xing, Cedric
  2019-06-14  0:46           ` Sean Christopherson
  2 siblings, 2 replies; 67+ messages in thread
From: Stephen Smalley @ 2019-06-13 17:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Cedric Xing, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, dave.hansen, tglx, torvalds, akpm, nhorman, pmccallum,
	serge.ayoun, shay.katz-zamir, haitao.huang, andriy.shevchenko,
	kai.svahn, bp, josh, kai.huang, rientjes, william.c.roberts,
	philip.b.tricca

On 6/11/19 6:02 PM, Sean Christopherson wrote:
> On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
>> I haven't looked at this code closely, but it feels like a lot of
>> SGX-specific logic embedded into SELinux that will have to be repeated or
>> reused for every security module.  Does SGX not track this state itself?
> 
> SGX does track equivalent state.
> 
> There are three proposals on the table (I think):
> 
>    1. Require userspace to explicitly specify (maximal) enclave page
>       permissions at build time.  The enclave page permissions are provided
>       to, and checked by, LSMs at enclave build time.
> 
>       Pros: Low-complexity kernel implementation, straightforward auditing
>       Cons: Sullies the SGX UAPI to some extent, may increase complexity of
>             SGX2 enclave loaders.
> 
>    2. Pre-check LSM permissions and dynamically track mappings to enclave
>       pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
>       based on the pre-checked permissions.
> 
>       Pros: Does not impact SGX UAPI, medium kernel complexity
>       Cons: Auditing is complex/weird, requires taking enclave-specific
>             lock during mprotect() to query/update tracking.
> 
>    3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
>       from cradle to grave, but otherwise defer everything to LSMs.
> 
>       Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
>       Cons: Most complex and "heaviest" kernel implementation of the three,
>             pushes more SGX details into LSMs.
> 
> My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> prefers #2.  Cedric's RFC series implements #3.
> 
> Perhaps the easiest way to make forward progress is to rule out the
> options we absolutely *don't* want by focusing on the potentially blocking
> issue with each option:
> 
>    #1 - SGX UAPI funkiness
> 
>    #2 - Auditing complexity, potential enclave lock contention
> 
>    #3 - Pushing SGX details into LSMs and complexity of kernel implementation
> 
> 
> [1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com

Given the complexity tradeoff, what is the clear motivating example for 
why #1 isn't the obvious choice? That the enclave loader has no way of 
knowing a priori whether the enclave will require W->X or WX?  But 
aren't we better off requiring enclaves to be explicitly marked as 
needing such so that we can make a more informed decision about whether 
to load them in the first place?

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-11 22:55       ` Xing, Cedric
@ 2019-06-13 18:00         ` Stephen Smalley
  2019-06-13 19:48           ` Sean Christopherson
                             ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: Stephen Smalley @ 2019-06-13 18:00 UTC (permalink / raw)
  To: Xing, Cedric, linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: jarkko.sakkinen, luto, jmorris, serge, paul, eparis, jethro,
	Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum, Ayoun,
	Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko, Svahn,
	Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C, Tricca,
	Philip B

On 6/11/19 6:55 PM, Xing, Cedric wrote:
>> From: linux-sgx-owner@vger.kernel.org [mailto:linux-sgx-
>> owner@vger.kernel.org] On Behalf Of Stephen Smalley
>> Sent: Tuesday, June 11, 2019 6:40 AM
>>
>>>
>>> +#ifdef CONFIG_INTEL_SGX
>>> +	rc = sgxsec_mprotect(vma, prot);
>>> +	if (rc <= 0)
>>> +		return rc;
>>
>> Why are you skipping the file_map_prot_check() call when rc == 0?
>> What would SELinux check if you didn't do so -
>> FILE__READ|FILE__WRITE|FILE__EXECUTE to /dev/sgx/enclave?  Is it a
>> problem to let SELinux proceed with that check?
> 
> We can continue the check. But in practice, all FILE__{READ|WRITE|EXECUTE} are needed for every enclave, then what's the point of checking them? FILE__EXECMOD may be the only flag that has a meaning, but it's kind of redundant because sigstruct file was checked against that already.

I don't believe FILE__EXECMOD will be checked since it is a shared file 
mapping.  We'll check at least FILE__READ and FILE__WRITE anyway upon 
open(), and possibly FILE__EXECUTE upon mmap() unless that is never 
PROT_EXEC.  We want the policy to accurately reflect the operations of 
the system, even when an operation "must" be allowed, and even here this 
only needs to be allowed to processes authorized as enclave loaders, not 
to all processes.

I don't think there are other examples where we skip a SELinux check 
like this.  If we were to do so here, we would at least need a comment 
explaining that it was intentional and why.  The risk would be that 
future checking added into file_map_prot_check() would be unwittingly 
bypassed for these mappings.  A warning there would also be advisable if 
we skip it for these mappings.
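
To spell out the control flow being suggested, here is a small userspace model in which only a negative result from the SGX-specific check short-circuits, so the generic check still runs for approved enclave mappings. All model_* names are hypothetical stand-ins; the real sgxsec_mprotect() and file_map_prot_check() take different arguments.

```c
#include <assert.h>
#include <errno.h>

/* Counts how often the generic check runs, for demonstration only. */
static int generic_checks_run;

/* Stand-in for sgxsec_mprotect(): positive for a non-enclave VMA,
 * 0 if the enclave-specific check passes, negative errno if it fails. */
static int model_sgxsec_mprotect(int is_enclave, int sgx_allowed)
{
	if (!is_enclave)
		return 1;
	return sgx_allowed ? 0 : -EACCES;
}

/* Stand-in for file_map_prot_check(). */
static int model_file_map_prot_check(int generic_allowed)
{
	generic_checks_run++;
	return generic_allowed ? 0 : -EACCES;
}

/* Bail out only on failure; rc == 0 and rc > 0 both fall through. */
static int model_file_mprotect(int is_enclave, int sgx_allowed,
			       int generic_allowed)
{
	int rc = model_sgxsec_mprotect(is_enclave, sgx_allowed);

	if (rc < 0)
		return rc;
	assert(rc == 0 || rc == 1);
	return model_file_map_prot_check(generic_allowed);
}
```

With this shape, any checking added to the generic path later applies to enclave mappings as well, which addresses the bypass risk described above.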

> 
>>> +static int selinux_enclave_load(struct file *encl, unsigned long addr,
>>> +				unsigned long size, unsigned long prot,
>>> +				struct vm_area_struct *source)
>>> +{
>>> +	if (source) {
>>> +		/**
>>> +		 * Adding page from source => EADD request
>>> +		 */
>>> +		int rc = selinux_file_mprotect(source, prot, prot);
>>> +		if (rc)
>>> +			return rc;
>>> +
>>> +		if (!(prot & VM_EXEC) &&
>>> +		    selinux_file_mprotect(source, VM_EXEC, VM_EXEC))
>>
>> I wouldn't conflate VM_EXEC with PROT_EXEC even if they happen to be
>> defined with the same values currently.  Elsewhere the kernel appears to
>> explicitly translate them ala calc_vm_prot_bits().
> 
> Thanks! I'd change them to PROT_EXEC in the next version.
> 
>>
>> Also, this will mean that we will always perform an execute check on all
>> sources, thereby triggering audit denial messages for any EADD sources
>> that are only intended to be data.  Depending on the source, this could
>> trigger PROCESS__EXECMEM or FILE__EXECMOD or FILE__EXECUTE.  In a world
>> where users often just run any denials they see through audit2allow,
>> they'll end up always allowing them all.  How can they tell whether it
>> was needed? It would be preferable if we could only trigger execute
>> checks when there is some probability that execute will be requested in
>> the future.  Alternatives would be to silence the audit of these
>> permission checks always via use of _noaudit() interfaces or to silence
>> audit of these permissions via dontaudit rules in policy, but the latter
>> would hide all denials of the permission by the process, not just those
>> triggered from security_enclave_load().  And if we silence them, then we
>> won't see them even if they were needed.
> 
> *_noaudit() is exactly what I wanted. But I couldn't find selinux_file_mprotect_noaudit()/file_has_perm_noaudit(), and I'm reluctant to duplicate code. Any suggestions?

I would have no objection to adding _noaudit() variants of these, either 
duplicating code (if sufficiently small/simple) or creating a common 
helper with a bool audit flag that gets used for both. But the larger 
issue would be to resolve how to ultimately ensure that a denial is 
audited later if the denied permission is actually requested and blocked 
via sgxsec_mprotect().
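
The common-helper option could look roughly like the following userspace sketch, where a bool decides whether a denial produces an audit record. The names and the counter are hypothetical and only model the idea; they are not the SELinux avc interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Number of denial records "emitted", standing in for the audit log. */
static int audit_records;

/* Common helper: check requested against allowed, optionally audit. */
static int model_has_perm_common(unsigned int allowed, unsigned int requested,
				 bool audit)
{
	unsigned int denied = requested & ~allowed;

	assert(requested != 0);	/* a request must name a permission */
	if (!denied)
		return 0;
	if (audit)
		audit_records++;
	return -EACCES;
}

/* Audited variant, for checks whose denial the admin should see. */
static int model_has_perm(unsigned int allowed, unsigned int requested)
{
	return model_has_perm_common(allowed, requested, true);
}

/* Silent variant, for speculative checks such as pre-computing X. */
static int model_has_perm_noaudit(unsigned int allowed, unsigned int requested)
{
	return model_has_perm_common(allowed, requested, false);
}
```

This only covers the first half of the problem; the silent variant still leaves open how the denial gets audited later if the permission is actually requested.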

>   
>>
>>> +			prot = 0;
>>> +		else {
>>> +			prot = SGX__EXECUTE;
>>> +			if (source->vm_file &&
>>> +			    !file_has_perm(current_cred(), source->vm_file,
>>> +					   FILE__EXECMOD))
>>> +				prot |= SGX__EXECMOD;
>>
>> Similarly, this means that we will always perform a FILE__EXECMOD check
>> on all executable sources, triggering audit denial messages for any EADD
>> source that is executable but to which EXECMOD is not allowed, and again
>> the most common pattern will be that users will add EXECMOD to all
>> executable sources to avoid this.
>>
>>> +		}
>>> +		return sgxsec_eadd(encl, addr, size, prot);
>>> +	} else {
>>> +		/**
>>> +		  * Adding page from NULL => EAUG request
>>> +		  */
>>> +		return sgxsec_eaug(encl, addr, size, prot);
>>> +	}
>>> +}
>>> +
>>> +static int selinux_enclave_init(struct file *encl,
>>> +				const struct sgx_sigstruct *sigstruct,
>>> +				struct vm_area_struct *vma)
>>> +{
>>> +	int rc = 0;
>>> +
>>> +	if (!vma)
>>> +		rc = -EINVAL;
>>
>> Is it ever valid to call this hook with a NULL vma?  If not, this should
>> be handled/prevented by the caller.  If so, I'd just return -EINVAL
>> immediately here.
> 
> vma shall never be NULL. I'll update it in the next version.
> 
>>
>>> +
>>> +	if (!rc && !(vma->vm_flags & VM_EXEC))
>>> +		rc = selinux_file_mprotect(vma, VM_EXEC, VM_EXEC);
>>
>> I had thought we were trying to avoid overloading FILE__EXECUTE (or
>> whatever gets checked here, e.g. could be PROCESS__EXECMEM or
>> FILE__EXECMOD) on the sigstruct file, since the caller isn't truly
>> executing code from it.
> 
> Agreed. Another problem with FILE__EXECMOD on the sigstruct file is that user code would then be allowed to modify SIGSTRUCT at will, which effectively wipes out the protection provided by FILE__EXECUTE.
> 
>>
>> I'd define new ENCLAVE__* permissions, including an up-front
>> ENCLAVE__INIT permission that governs whether the sigstruct file can be
>> used at all irrespective of memory protections.
> 
> Agreed.
> 
>>
>> Then you can also have ENCLAVE__EXECUTE, ENCLAVE__EXECMEM,
>> ENCLAVE__EXECMOD for the execute-related checks.  Or you can use the
>> /dev/sgx/enclave inode as the target for the execute checks and just
>> reuse the file permissions there.
> 
> Now we've got 2 options - 1) New ENCLAVE__* flags on sigstruct file or 2) FILE__* on /dev/sgx/enclave. Which one do you think makes more sense?
> 
> ENCLAVE__EXECMEM seems to offer finer granularity (than PROCESS__EXECMEM) but I wonder if it'd have any real use in practice.

Defining a separate ENCLAVE__EXECUTE and using it here for the sigstruct 
file would avoid any ambiguity with the FILE__EXECUTE check to the 
/dev/sgx/enclave inode that might occur upon mmap() or mprotect().  A 
separate ENCLAVE__EXECMEM would enable allowing WX within the enclave 
while denying it in the host application or vice versa, which could be a 
good thing for security, particularly if SGX2 largely ends up always 
wanting WX.

> 
>>> +int sgxsec_mprotect(struct vm_area_struct *vma, size_t prot) {
>>> +	struct enclave_sec *esec;
>>> +	int rc;
>>> +
>>> +	if (!vma->vm_file || !(esec = __esec(selinux_file(vma->vm_file)))) {
>>> +		/* Positive return value indicates non-enclave VMA */
>>> +		return 1;
>>> +	}
>>> +
>>> +	down_read(&esec->sem);
>>> +	rc = enclave_mprotect(&esec->regions, vma->vm_start, vma->vm_end, prot);
>>
>> Why is it safe for this to only use down_read()? enclave_mprotect() can
>> call enclave_prot_set_cb() which modifies the list?
> 
> Probably because it was too late at night when I wrote this line:-( Good catch!
> 
>>
>> I haven't looked at this code closely, but it feels like a lot of SGX-
>> specific logic embedded into SELinux that will have to be repeated or
>> reused for every security module.  Does SGX not track this state itself?
> 
> I can tell you have looked quite closely, and I truly thank you for your time!
> 
> You are right that there is SGX-specific stuff. More precisely, SGX enclaves don't have access to anything except memory, so there are only 3 questions that need to be answered for each enclave page: 1) whether X is allowed; 2) whether W->X is allowed and 3) whether WX is allowed. This proposal tries to cache the answers to those questions upon creation of each enclave page, meaning it involves a) figuring out the answers and b) "remembering" them for every page. #b is generic, mostly captured in intel_sgx.c, and could be shared among all LSM modules; while #a is SELinux specific. I could move intel_sgx.c up one level in the directory hierarchy if that's what you'd suggest.
> 
> By "SGX", did you mean the SGX subsystem being upstreamed? It doesn’t track that state. In practice, there's no way for SGX to track it because there's no vm_ops->may_mprotect() callback. It doesn't follow the philosophy of Linux either, as mprotect() doesn't track it for regular memory. And it doesn't have a use without LSM, so I believe it makes more sense to track it inside LSM.

Yes, the SGX driver/subsystem.  I had the impression from Sean that it 
does track this kind of per-page state already in some manner, but 
possibly he means it does under a given proposal and not in the current 
driver.

Even the #b remembering might end up being SELinux-specific if we also 
have to remember the original inputs used to compute the answer so that 
we can audit that information when access is denied later upon 
mprotect().  At the least we'd need it to save some opaque data and pass 
it to a callback into SELinux to perform that auditing.
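
One way to picture the "opaque data plus callback" idea is the userspace sketch below: the generic tracking code stores a pointer it never interprets and hands it back to the LSM when a later mprotect() is denied. The structure and function names are hypothetical, and the path string in the usage is a placeholder.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Callback supplied by the LSM; cookie is whatever the LSM saved. */
typedef void (*audit_cb_fn)(void *cookie);

/* Generic per-page (or per-range) record kept by the tracking code. */
struct page_record {
	unsigned int allowed_prot;	/* cached answer from page creation */
	void *audit_cookie;		/* opaque to the tracking code */
	audit_cb_fn audit_cb;		/* invoked only on denial */
};

/* Generic side: decide from the cached answer, let the LSM audit. */
static int page_record_mprotect(const struct page_record *rec,
				unsigned int prot)
{
	assert(rec != NULL);
	if (!(prot & ~rec->allowed_prot))
		return 0;
	if (rec->audit_cb)
		rec->audit_cb(rec->audit_cookie);
	return -EACCES;
}

/* LSM side: the original inputs it wants to report on denial. */
struct lsm_audit_info {
	const char *source_path;	/* e.g. file the page was loaded from */
	int reported;			/* how many denials were reported */
};

static void lsm_audit_denial(void *cookie)
{
	struct lsm_audit_info *info = cookie;

	info->reported++;	/* a real LSM would emit an audit record */
}
```

The tracking code stays generic because it only stores and returns the cookie; what the cookie contains remains specific to each LSM.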


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 18:00         ` Stephen Smalley
@ 2019-06-13 19:48           ` Sean Christopherson
  2019-06-13 21:09             ` Xing, Cedric
  2019-06-13 21:02           ` Xing, Cedric
  2019-06-14  0:37           ` Sean Christopherson
  2 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-13 19:48 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Xing, Cedric, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko,
	Svahn, Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C,
	Tricca, Philip B

On Thu, Jun 13, 2019 at 02:00:29PM -0400, Stephen Smalley wrote:
> On 6/11/19 6:55 PM, Xing, Cedric wrote:
> > >You are right that there is SGX-specific stuff. More precisely, SGX
> > >enclaves don't have access to anything except memory, so there are only 3
> > >questions that need to be answered for each enclave page: 1) whether X is
> > >allowed; 2) whether W->X is allowed and 3) whether WX is allowed. This
> >proposal tries to cache the answers to those questions upon creation of each
> >enclave page, meaning it involves a) figuring out the answers and b)
> >"remember" them for every page. #b is generic, mostly captured in
> >intel_sgx.c, and could be shared among all LSM modules; while #a is SELinux
> >specific. I could move intel_sgx.c up one level in the directory hierarchy
> >if that's what you'd suggest.
> >
> >By "SGX", did you mean the SGX subsystem being upstreamed? It doesn’t track
> >that state. In practice, there's no way for SGX to track it because there's
> >no vm_ops->may_mprotect() callback. It doesn't follow the philosophy of
> >Linux either, as mprotect() doesn't track it for regular memory. And it
> >doesn't have a use without LSM, so I believe it makes more sense to track it
> >inside LSM.
> 
> Yes, the SGX driver/subsystem.  I had the impression from Sean that it does
> track this kind of per-page state already in some manner, but possibly he
> means it does under a given proposal and not in the current driver.

Yeah, under a given proposal.  SGX has per-page tracking, adding new flags
is fairly easy.  Philosophical objections aside, adding .may_mprotect() is
trivial.

> Even the #b remembering might end up being SELinux-specific if we also have
> to remember the original inputs used to compute the answer so that we can
> audit that information when access is denied later upon mprotect().  At the
> least we'd need it to save some opaque data and pass it to a callback into
> SELinux to perform that auditing.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 18:00         ` Stephen Smalley
  2019-06-13 19:48           ` Sean Christopherson
@ 2019-06-13 21:02           ` Xing, Cedric
  2019-06-14  0:37           ` Sean Christopherson
  2 siblings, 0 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-13 21:02 UTC (permalink / raw)
  To: Stephen Smalley, linux-security-module, selinux, linux-kernel, linux-sgx
  Cc: jarkko.sakkinen, luto, jmorris, serge, paul, eparis, jethro,
	Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum, Ayoun,
	Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko, Svahn,
	Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C, Tricca,
	Philip B

> From: linux-sgx-owner@vger.kernel.org [mailto:linux-sgx-
> owner@vger.kernel.org] On Behalf Of Stephen Smalley
> 
> On 6/11/19 6:55 PM, Xing, Cedric wrote:
> >> From: linux-sgx-owner@vger.kernel.org [mailto:linux-sgx-
> >> owner@vger.kernel.org] On Behalf Of Stephen Smalley
> >> Sent: Tuesday, June 11, 2019 6:40 AM
> >>
> >>>
> >>> +#ifdef CONFIG_INTEL_SGX
> >>> +	rc = sgxsec_mprotect(vma, prot);
> >>> +	if (rc <= 0)
> >>> +		return rc;
> >>
> >> Why are you skipping the file_map_prot_check() call when rc == 0?
> >> What would SELinux check if you didn't do so -
> >> FILE__READ|FILE__WRITE|FILE__EXECUTE to /dev/sgx/enclave?  Is it a
> >> problem to let SELinux proceed with that check?
> >
> > > We can continue the check. But in practice, all
> > > FILE__{READ|WRITE|EXECUTE} are needed for every enclave, then what's the
> > > point of checking them? FILE__EXECMOD may be the only flag that has a
> > > meaning, but it's kind of redundant because sigstruct file was checked
> > > against that already.
> 
> I don't believe FILE__EXECMOD will be checked since it is a shared file
> mapping.  We'll check at least FILE__READ and FILE__WRITE anyway upon
> open(), and possibly FILE__EXECUTE upon mmap() unless that is never
> PROT_EXEC.  We want the policy to accurately reflect the operations of
> the system, even when an operation "must" be allowed, and even here this
> only needs to be allowed to processes authorized as enclave loaders, not
> to all processes.
> 
> I don't think there are other examples where we skip a SELinux check
> like this.  If we were to do so here, we would at least need a comment
> explaining that it was intentional and why.  The risk would be that
> future checking added into file_map_prot_check() would be unwittingly
> bypassed for these mappings.  A warning there would also be advisable if
> we skip it for these mappings.

You are right! The code was written assuming file_map_prot_check() wouldn't object if sgxsec_mprotect() approves it, but that may not always be the case if new checks are added in the future. I'll add the check back.
 
> 
> >
> >>> +static int selinux_enclave_load(struct file *encl, unsigned long
> addr,
> >>> +				unsigned long size, unsigned long prot,
> >>> +				struct vm_area_struct *source)
> >>> +{
> >>> +	if (source) {
> >>> +		/**
> >>> +		 * Adding page from source => EADD request
> >>> +		 */
> >>> +		int rc = selinux_file_mprotect(source, prot, prot);
> >>> +		if (rc)
> >>> +			return rc;
> >>> +
> >>> +		if (!(prot & VM_EXEC) &&
> >>> +		    selinux_file_mprotect(source, VM_EXEC, VM_EXEC))
> >>
> >> I wouldn't conflate VM_EXEC with PROT_EXEC even if they happen to be
> >> defined with the same values currently.  Elsewhere the kernel appears
> >> to explicitly translate them ala calc_vm_prot_bits().
> >
> > Thanks! I'd change them to PROT_EXEC in the next version.
> >
> >>
> >> Also, this will mean that we will always perform an execute check on
> >> all sources, thereby triggering audit denial messages for any EADD
> >> sources that are only intended to be data.  Depending on the source,
> >> this could trigger PROCESS__EXECMEM or FILE__EXECMOD or
> >> FILE__EXECUTE.  In a world where users often just run any denials
> >> they see through audit2allow, they'll end up always allowing them
> >> all.  How can they tell whether it was needed? It would be preferable
> >> if we could only trigger execute checks when there is some
> >> probability that execute will be requested in the future.
> >> Alternatives would be to silence the audit of these permission checks
> >> always via use of _noaudit() interfaces or to silence audit of these
> >> permissions via dontaudit rules in policy, but the latter would hide
> >> all denials of the permission by the process, not just those
> >> triggered from security_enclave_load().  And if we silence them, then
> we won't see them even if they were needed.
> >
> > *_noaudit() is exactly what I wanted. But I couldn't find
> selinux_file_mprotect_noaudit()/file_has_perm_noaudit(), and I'm
> reluctant to duplicate code. Any suggestions?
> 
> I would have no objection to adding _noaudit() variants of these, either
> duplicating code (if sufficiently small/simple) or creating a common
> helper with a bool audit flag that gets used for both. But the larger
> issue would be to resolve how to ultimately ensure that a denial is
> audited later if the denied permission is actually requested and blocked
> via sgxsec_mprotect().

The idea here is to precompute the answers as if a certain request were received, so that we don't have to store all inputs to the precomputation. sgxsec_mprotect(), if coded correctly, would make the same decision regardless of whether it was precomputed or computed at the time of the real request. Auditing requires more information than making the decision itself, such as the file path and when the request was made. I'm reluctant to keep the source files open just for audit logs. I'll need a closer look at the auditing code to figure out an appropriate way.
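
To illustrate the precomputation idea, here is a small userspace sketch. It assumes the cached answers are three bits per range (X allowed, W->X allowed, WX allowed) plus a "has been writable" flag. The ENCL_ALLOW_* names, struct encl_range_state and encl_mprotect_ok() are hypothetical and are not the logic of the actual sgxsec_mprotect(); the exact interplay of the three bits is also an assumption. Note that the decision needs none of the original inputs (file, path, time), which is exactly why auditing is the harder part.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Hypothetical answer bits, computed once when the pages are added. */
enum {
	ENCL_ALLOW_X	= 1 << 0,	/* range may be mapped executable */
	ENCL_ALLOW_W2X	= 1 << 1,	/* executable after having been writable */
	ENCL_ALLOW_WX	= 1 << 2,	/* writable and executable at once */
};

struct encl_range_state {
	unsigned int allowed;	/* precomputed ENCL_ALLOW_* bits */
	bool was_writable;	/* set once the range has been mapped writable */
};

/* Decide an mprotect() request purely from the cached answers. */
static bool encl_mprotect_ok(struct encl_range_state *st, int prot)
{
	assert(st);

	if (prot & PROT_EXEC) {
		if (!(st->allowed & ENCL_ALLOW_X))
			return false;
		if ((prot & PROT_WRITE) && !(st->allowed & ENCL_ALLOW_WX))
			return false;
		if (st->was_writable && !(st->allowed & ENCL_ALLOW_W2X))
			return false;
	}

	/* Remember that the range was writable for later W->X requests. */
	if (prot & PROT_WRITE)
		st->was_writable = true;
	return true;
}
```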

> 
> >
> >>
> >>> +			prot = 0;
> >>> +		else {
> >>> +			prot = SGX__EXECUTE;
> >>> +			if (source->vm_file &&
> >>> +			    !file_has_perm(current_cred(), source->vm_file,
> >>> +					   FILE__EXECMOD))
> >>> +				prot |= SGX__EXECMOD;
> >>
> >> Similarly, this means that we will always perform a FILE__EXECMOD
> check
> >> on all executable sources, triggering audit denial messages for any
> EADD
> >> source that is executable but to which EXECMOD is not allowed, and
> again
> >> the most common pattern will be that users will add EXECMOD to all
> >> executable sources to avoid this.
> >>
> >>> +		}
> >>> +		return sgxsec_eadd(encl, addr, size, prot);
> >>> +	} else {
> >>> +		/**
> >>> +		  * Adding page from NULL => EAUG request
> >>> +		  */
> >>> +		return sgxsec_eaug(encl, addr, size, prot);
> >>> +	}
> >>> +}
> >>> +
> >>> +static int selinux_enclave_init(struct file *encl,
> >>> +				const struct sgx_sigstruct *sigstruct,
> >>> +				struct vm_area_struct *vma)
> >>> +{
> >>> +	int rc = 0;
> >>> +
> >>> +	if (!vma)
> >>> +		rc = -EINVAL;
> >>
> >> Is it ever valid to call this hook with a NULL vma?  If not, this
> should
> >> be handled/prevented by the caller.  If so, I'd just return -EINVAL
> >> immediately here.
> >
> > vma shall never be NULL. I'll update it in the next version.
> >
> >>
> >>> +
> >>> +	if (!rc && !(vma->vm_flags & VM_EXEC))
> >>> +		rc = selinux_file_mprotect(vma, VM_EXEC, VM_EXEC);
> >>
> >> I had thought we were trying to avoid overloading FILE__EXECUTE (or
> >> whatever gets checked here, e.g. could be PROCESS__EXECMEM or
> >> FILE__EXECMOD) on the sigstruct file, since the caller isn't truly
> >> executing code from it.
> >
> > Agreed. Another problem with FILE__EXECMOD on the sigstruct file is
> that user code would then be allowed to modify SIGSTRUCT at will, which
> effectively wipes out the protection provided by FILE__EXECUTE.
> >
> >>
> >> I'd define new ENCLAVE__* permissions, including an up-front
> >> ENCLAVE__INIT permission that governs whether the sigstruct file can
> be
> >> used at all irrespective of memory protections.
> >
> > Agreed.
> >
> >>
> >> Then you can also have ENCLAVE__EXECUTE, ENCLAVE__EXECMEM,
> >> ENCLAVE__EXECMOD for the execute-related checks.  Or you can use the
> >> /dev/sgx/enclave inode as the target for the execute checks and just
> >> reuse the file permissions there.
> >
> > Now we've got 2 options - 1) New ENCLAVE__* flags on sigstruct file or
> 2) FILE__* on /dev/sgx/enclave. Which one do you think makes more sense?
> >
> > ENCLAVE__EXECMEM seems to offer finer granularity (than
> PROCESS__EXECMEM) but I wonder if it'd have any real use in practice.
> 
> Defining a separate ENCLAVE__EXECUTE and using it here for the sigstruct
> file would avoid any ambiguity with the FILE__EXECUTE check to the
> /dev/sgx/enclave inode that might occur upon mmap() or mprotect().  A
> separate ENCLAVE__EXECMEM would enable allowing WX within the enclave
> while denying it in the host application or vice versa, which could be a
> good thing for security, particularly if SGX2 largely ends up always
> wanting WX.

Agreed. I'll include those new flags in my next version.

> 
> >
> >>> +int sgxsec_mprotect(struct vm_area_struct *vma, size_t prot) {
> >>> +	struct enclave_sec *esec;
> >>> +	int rc;
> >>> +
> >>> +	if (!vma->vm_file || !(esec = __esec(selinux_file(vma->vm_file))))
> >> {
> >>> +		/* Positive return value indicates non-enclave VMA */
> >>> +		return 1;
> >>> +	}
> >>> +
> >>> +	down_read(&esec->sem);
> >>> +	rc = enclave_mprotect(&esec->regions, vma->vm_start, vma->vm_end,
> >>> +prot);
> >>
> >> Why is it safe for this to only use down_read()? enclave_mprotect()
> can
> >> call enclave_prot_set_cb() which modifies the list?
> >
> > Probably because it was too late at night when I wrote this line :-(
> Good catch!
> >
> >>
> >> I haven't looked at this code closely, but it feels like a lot of
> SGX-
> >> specific logic embedded into SELinux that will have to be repeated or
> >> reused for every security module.  Does SGX not track this state
> itself?
> >
> > I can tell you have looked quite closely, and I truly thank you for
> your time!
> >
> > You are right that there is SGX-specific stuff. More precisely, SGX
> enclaves don't have access to anything except memory, so there are only
> 3 questions that need to be answered for each enclave page: 1) whether X
> is allowed; 2) whether W->X is allowed; and 3) whether WX is allowed. This
> proposal tries to cache the answers to those questions upon creation of
> each enclave page, meaning it involves a) figuring out the answers and b)
> "remembering" them for every page. #b is generic, mostly captured in
> intel_sgx.c, and could be shared among all LSM modules; while #a is
> SELinux specific. I could move intel_sgx.c up one level in the directory
> hierarchy if that's what you'd suggest.
> >
> > By "SGX", did you mean the SGX subsystem being upstreamed? It doesn’t
> track that state. In practice, there's no way for SGX to track it
> because there's no vm_ops->may_mprotect() callback. It doesn't follow
> the philosophy of Linux either, as mprotect() doesn't track it for
> regular memory. And it doesn't have a use without LSM, so I believe it
> makes more sense to track it inside LSM.
> 
> Yes, the SGX driver/subsystem.  I had the impression from Sean that it
> does track this kind of per-page state already in some manner, but
> possibly he means it does under a given proposal and not in the current
> driver.

Yes, the SGX subsystem does track per-page states. But these page protection flags apply to *ranges*.

In practice, those per-page states are *not* checked at mmap/mprotect. They are used mainly by vm_ops->fault() and the page swapper thread.

That said, merging protection flags into per-page states will require page-by-page checks, which will definitely hurt performance, unless the driver also maintains range-oriented structures like the ones you see here.

> 
> Even the #b remembering might end up being SELinux-specific if we also
> have to remember the original inputs used to compute the answer so that
> we can audit that information when access is denied later upon
> mprotect().  At the least we'd need it to save some opaque data and pass
> it to a callback into SELinux to perform that auditing.

Agreed. What's commonly needed here is a data structure that supports setting and querying values on ranges. It's close to what xarray supports, but xarray doesn't support range querying.
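
As a rough illustration of "setting and querying values on ranges", here is a deliberately simple userspace sketch. It assumes non-overlapping half-open ranges [start, end) stored in a small fixed array; struct rmap, rmap_add() and rmap_query() are hypothetical names, and a real implementation would want a tree and locking. A query returns the bitwise AND of the values over the requested range, or 0 if any part of the range is not covered.

```c
#include <assert.h>
#include <stddef.h>

#define RMAP_MAX 16

struct rmap_ent {
	unsigned long start, end;	/* half-open range [start, end) */
	unsigned int val;		/* e.g. allowed-protection bits */
};

struct rmap {
	struct rmap_ent ent[RMAP_MAX];
	size_t n;
};

/* Record val for [start, end); fails if full or overlapping. */
static int rmap_add(struct rmap *m, unsigned long start, unsigned long end,
		    unsigned int val)
{
	size_t i;

	assert(start < end);
	if (m->n == RMAP_MAX)
		return -1;
	for (i = 0; i < m->n; i++)
		if (start < m->ent[i].end && m->ent[i].start < end)
			return -1;
	m->ent[m->n++] = (struct rmap_ent){ start, end, val };
	return 0;
}

/* AND of all values covering [start, end); 0 if there is any gap. */
static unsigned int rmap_query(const struct rmap *m, unsigned long start,
			       unsigned long end)
{
	unsigned long covered = 0;
	unsigned int val = ~0u;
	size_t i;

	assert(start < end);
	for (i = 0; i < m->n; i++) {
		unsigned long lo = m->ent[i].start > start ? m->ent[i].start : start;
		unsigned long hi = m->ent[i].end < end ? m->ent[i].end : end;

		if (lo >= hi)
			continue;	/* no intersection with this entry */
		covered += hi - lo;
		val &= m->ent[i].val;
	}
	/* Entries never overlap, so full coverage means the lengths add up. */
	return covered == end - start ? val : 0;
}
```

One lookup answers a whole mprotect() range, instead of testing every page in it individually.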

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 19:48           ` Sean Christopherson
@ 2019-06-13 21:09             ` Xing, Cedric
  0 siblings, 0 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-13 21:09 UTC (permalink / raw)
  To: Christopherson, Sean J, Stephen Smalley
  Cc: linux-security-module, selinux, linux-kernel, linux-sgx,
	jarkko.sakkinen, luto, jmorris, serge, paul, eparis, jethro,
	Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum, Ayoun,
	Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko, Svahn,
	Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C, Tricca,
	Philip B

> From: Christopherson, Sean J
> Sent: Thursday, June 13, 2019 12:49 PM
> 
> > >By "SGX", did you mean the SGX subsystem being upstreamed? It doesn’t
> > >track that state. In practice, there's no way for SGX to track it
> > >because there's no vm_ops->may_mprotect() callback. It doesn't follow
> > >the philosophy of Linux either, as mprotect() doesn't track it for
> > >regular memory. And it doesn't have a use without LSM, so I believe
> > >it makes more sense to track it inside LSM.
> >
> > Yes, the SGX driver/subsystem.  I had the impression from Sean that it
> > does track this kind of per-page state already in some manner, but
> > possibly he means it does under a given proposal and not in the
> current driver.
> 
> Yeah, under a given proposal.  SGX has per-page tracking, adding new
> flags is fairly easy.  Philosophical objections aside,
> adding .may_mprotect() is trivial.

As I pointed out in an earlier email, protection flags are associated with ranges. They could of course be duplicated to every page but that will hurt performance because every page within the range would have to be tested individually.

Furthermore, though .may_mprotect() is able to make the decision, I don't think it can do the audit log as well, unless it is coded in an SELinux-specific way. Then I wonder how it could work with LSM modules other than SELinux.

> 
> > Even the #b remembering might end up being SELinux-specific if we also
> > have to remember the original inputs used to compute the answer so
> > that we can audit that information when access is denied later upon
> > mprotect().  At the least we'd need it to save some opaque data and
> > pass it to a callback into SELinux to perform that auditing.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 17:02         ` Stephen Smalley
@ 2019-06-13 23:03           ` Xing, Cedric
  2019-06-13 23:17             ` Sean Christopherson
  2019-06-14  0:46           ` Sean Christopherson
  1 sibling, 1 reply; 67+ messages in thread
From: Xing, Cedric @ 2019-06-13 23:03 UTC (permalink / raw)
  To: Stephen Smalley, Christopherson, Sean J
  Cc: linux-security-module, selinux, linux-kernel, linux-sgx,
	jarkko.sakkinen, luto, jmorris, serge, paul, eparis, jethro,
	Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum, Ayoun,
	Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko, Svahn,
	Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C, Tricca,
	Philip B

> From: Stephen Smalley [mailto:sds@tycho.nsa.gov]
> Sent: Thursday, June 13, 2019 10:02 AM
> 
> On 6/11/19 6:02 PM, Sean Christopherson wrote:
> > On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> >> I haven't looked at this code closely, but it feels like a lot of
> >> SGX-specific logic embedded into SELinux that will have to be
> >> repeated or reused for every security module.  Does SGX not track
> this state itself?
> >
> > SGX does track equivalent state.
> >
> > There are three proposals on the table (I think):
> >
> >    1. Require userspace to explicitly specify (maximal) enclave page
> >       permissions at build time.  The enclave page permissions are
> provided
> >       to, and checked by, LSMs at enclave build time.
> >
> >       Pros: Low-complexity kernel implementation, straightforward
> auditing
> >       Cons: Sullies the SGX UAPI to some extent, may increase
> complexity of
> >             SGX2 enclave loaders.
> >
> >    2. Pre-check LSM permissions and dynamically track mappings to
> enclave
> >       pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> >       based on the pre-checked permissions.
> >
> >       Pros: Does not impact SGX UAPI, medium kernel complexity
> >       Cons: Auditing is complex/weird, requires taking enclave-
> specific
> >             lock during mprotect() to query/update tracking.
> >
> >    3. Implement LSM hooks in SGX to allow LSMs to track enclave
> regions
> >       from cradle to grave, but otherwise defer everything to LSMs.
> >
> >       Pros: Does not impact SGX UAPI, maximum flexibility, precise
> auditing
> >       Cons: Most complex and "heaviest" kernel implementation of the
> three,
> >             pushes more SGX details into LSMs.
> >
> > My RFC series[1] implements #1.  My understanding is that Andy
> > (Lutomirski) prefers #2.  Cedric's RFC series implements #3.
> >
> > Perhaps the easiest way to make forward progress is to rule out the
> > options we absolutely *don't* want by focusing on the potentially
> > blocking issue with each option:
> >
> >    #1 - SGX UAPI funkiness
> >
> >    #2 - Auditing complexity, potential enclave lock contention
> >
> >    #3 - Pushing SGX details into LSMs and complexity of kernel
> > implementation
> >
> >
> > [1]
> > https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson
> > @intel.com
> 
> Given the complexity tradeoff, what is the clear motivating example for
> why #1 isn't the obvious choice? That the enclave loader has no way of
> knowing a priori whether the enclave will require W->X or WX?  But
> aren't we better off requiring enclaves to be explicitly marked as
> needing such so that we can make a more informed decision about whether
> to load them in the first place?

Are you asking this question at a) page granularity, b) file granularity or c) enclave (potentially comprised of multiple executable files) granularity?

#b is what we have on regular executable files and shared objects (i.e. FILE__EXECMOD). We all know how to do that.

#c is kind of new but could be done via some proxy file (e.g. sigstruct file) hence reduced to #b.

#a is problematic. It'd require compilers/linkers to generate such information, and a proper executable image file format to carry that information, to be eventually picked up by the loader. I guess SELinux doesn't have PAGE__EXECMOD because it is generally considered impractical.

Option #1 however requires #a because the driver doesn't track which page was loaded from which file, otherwise it could no longer qualify as "simple". Or we could just implement #c, which will make all options simpler. But I guess #b is still preferred, to be aligned with what SELinux is enforcing today on regular memory pages.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 23:03           ` Xing, Cedric
@ 2019-06-13 23:17             ` Sean Christopherson
  2019-06-14  0:31               ` Xing, Cedric
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-13 23:17 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Stephen Smalley, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko,
	Svahn, Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C,
	Tricca, Philip B

On Thu, Jun 13, 2019 at 04:03:24PM -0700, Xing, Cedric wrote:
> > From: Stephen Smalley [mailto:sds@tycho.nsa.gov]
> > Sent: Thursday, June 13, 2019 10:02 AM
> > 
> > > My RFC series[1] implements #1.  My understanding is that Andy
> > > (Lutomirski) prefers #2.  Cedric's RFC series implements #3.
> > >
> > > Perhaps the easiest way to make forward progress is to rule out the
> > > options we absolutely *don't* want by focusing on the potentially
> > > blocking issue with each option:
> > >
> > >    #1 - SGX UAPI funkiness
> > >
> > >    #2 - Auditing complexity, potential enclave lock contention
> > >
> > >    #3 - Pushing SGX details into LSMs and complexity of kernel
> > > implementation
> > >
> > >
> > > [1]
> > > https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson
> > > @intel.com
> > 
> > Given the complexity tradeoff, what is the clear motivating example for
> > why #1 isn't the obvious choice? That the enclave loader has no way of
> > knowing a priori whether the enclave will require W->X or WX?  But
> > aren't we better off requiring enclaves to be explicitly marked as
> > needing such so that we can make a more informed decision about whether
> > to load them in the first place?
> 
> Are you asking this question at a) page granularity, b) file granularity or
> c) enclave (potentially comprised of multiple executable files) granularity?
> 
> #b is what we have on regular executable files and shared objects (i.e.
> FILE__EXECMOD). We all know how to do that.
> 
> #c is kind of new but could be done via some proxy file (e.g. sigstruct file)
> hence reduced to #b.
> 
> #a is problematic. It'd require compilers/linkers to generate such
> information, and a proper executable image file format to carry that
> information, to be eventually picked up by the loader. I guess SELinux
> doesn't have PAGE__EXECMOD because it is generally considered impractical.
> 
> Option #1 however requires #a because the driver doesn't track which page was
> loaded from which file, otherwise it could no longer qualify as "simple". Or
> we could just implement #c, which will make all options simpler. But I guess
> #b is still preferred, to be aligned with what SELinux is enforcing today on
> regular memory pages.

Option #1 doesn't require (a).  The checks will happen for every page,
but in the RFCs I sent, the policies are still attached to files and
processes, i.e. (b).

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 23:17             ` Sean Christopherson
@ 2019-06-14  0:31               ` Xing, Cedric
  0 siblings, 0 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-14  0:31 UTC (permalink / raw)
  To: Christopherson, Sean J
  Cc: Stephen Smalley, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko,
	Svahn, Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C,
	Tricca, Philip B

> From: Christopherson, Sean J
> Sent: Thursday, June 13, 2019 4:18 PM
> 
> On Thu, Jun 13, 2019 at 04:03:24PM -0700, Xing, Cedric wrote:
> > > From: Stephen Smalley [mailto:sds@tycho.nsa.gov]
> > > Sent: Thursday, June 13, 2019 10:02 AM
> > >
> > > > My RFC series[1] implements #1.  My understanding is that Andy
> > > > (Lutomirski) prefers #2.  Cedric's RFC series implements #3.
> > > >
> > > > Perhaps the easiest way to make forward progress is to rule out
> the
> > > > options we absolutely *don't* want by focusing on the potentially
> > > > blocking issue with each option:
> > > >
> > > >    #1 - SGX UAPI funkiness
> > > >
> > > >    #2 - Auditing complexity, potential enclave lock contention
> > > >
> > > >    #3 - Pushing SGX details into LSMs and complexity of kernel
> > > > implementation
> > > >
> > > >
> > > > [1]
> > > > https://lkml.kernel.org/r/20190606021145.12604-1-
> sean.j.christopherson
> > > > @intel.com
> > >
> > > Given the complexity tradeoff, what is the clear motivating example
> for
> > > why #1 isn't the obvious choice? That the enclave loader has no way
> of
> > > knowing a priori whether the enclave will require W->X or WX?  But
> > > aren't we better off requiring enclaves to be explicitly marked as
> > > needing such so that we can make a more informed decision about
> whether
> > > to load them in the first place?
> >
> > Are you asking this question at a) page granularity, b) file
> granularity or
> > c) enclave (potentially comprised of multiple executable files)
> granularity?
> >
> > #b is what we have on regular executable files and shared objects (i.e.
> > FILE__EXECMOD). We all know how to do that.
> >
> > #c is kind of new but could be done via some proxy file (e.g.
> sigstruct file)
> > hence reduced to #b.
> >
> > #a is problematic. It'd require compilers/linkers to generate such
> > information, and a proper executable image file format to carry that
> > information, to be eventually picked up by the loader. I guess SELinux
> > doesn't have PAGE__EXECMOD because it is generally considered impractical.
> >
> > Option #1 however requires #a because the driver doesn't track which page was
> > loaded from which file, otherwise it could no longer qualify as "simple". Or
> > we could just implement #c, which will make all options simpler. But I guess
> > #b is still preferred, to be aligned with what SELinux is enforcing today on
> > regular memory pages.
> 
> Option #1 doesn't require (a).  The checks will happen for every page,
> but in the RFCs I sent, the policies are still attached to files and
> processes, i.e. (b).

I was talking at the UAPI level - i.e. your ioctl requires ALLOW_* at page granularity, hence #a.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 18:00         ` Stephen Smalley
  2019-06-13 19:48           ` Sean Christopherson
  2019-06-13 21:02           ` Xing, Cedric
@ 2019-06-14  0:37           ` Sean Christopherson
  2 siblings, 0 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-14  0:37 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Xing, Cedric, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko,
	Svahn, Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C,
	Tricca, Philip B

On Thu, Jun 13, 2019 at 02:00:29PM -0400, Stephen Smalley wrote:
> On 6/11/19 6:55 PM, Xing, Cedric wrote:
> >*_noaudit() is exactly what I wanted. But I couldn't find
> >selinux_file_mprotect_noaudit()/file_has_perm_noaudit(), and I'm reluctant
> >to duplicate code. Any suggestions?
> 
> I would have no objection to adding _noaudit() variants of these, either
> duplicating code (if sufficiently small/simple) or creating a common helper
> with a bool audit flag that gets used for both. But the larger issue would
> be to resolve how to ultimately ensure that a denial is audited later if the
> denied permission is actually requested and blocked via sgxsec_mprotect().

I too would like to see a solution to the auditing issue.  I wrongly
assumed Cedric's approach (option #3) didn't suffer the same auditing
problem as Andy's dynamic tracking proposal (option #2).  After reading
through the code more carefully (trying to steal ideas to finish off an
implementation of #2), I've come to realize options #2 (Andy) and #3
(Cedric) are basically identical concepts, the only difference being who
tracks state.

We can use the f_security blob sizes to identify which LSM denied
something, but I haven't the faintest idea how to track the auditing
information in a sane fashion.  We'd basically have to do a deep copy on
struct common_audit_data, or pre-generate and store the audit message.
For SELinux, a deep copy is somewhat feasible because selinux_audit_data
distills everything down to basic types.  AppArmor on the other hand has
'struct aa_label *label', which at a glance all but requires pre-generating
the audit message, and since AppArmor logs denials from every profile, it's 
possible the "sgx audit blob" will consume a non-trivial amount of data.

Even if we figure out a way to store the audit messages without exploding
memory consumption or making things horrendously complex, we still have a
problem of reporting state info.  Any number of things could be removed or
modified by the time the audit is actually triggered, e.g. files removed,
AppArmor profiles modified, etc...  Any such change means we're logging
garbage.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-13 17:02         ` Stephen Smalley
  2019-06-13 23:03           ` Xing, Cedric
@ 2019-06-14  0:46           ` Sean Christopherson
  2019-06-14 15:38             ` Sean Christopherson
                               ` (2 more replies)
  1 sibling, 3 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-14  0:46 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Cedric Xing, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, dave.hansen, tglx, torvalds, akpm, nhorman, pmccallum,
	serge.ayoun, shay.katz-zamir, haitao.huang, andriy.shevchenko,
	kai.svahn, bp, josh, kai.huang, rientjes, william.c.roberts,
	philip.b.tricca

On Thu, Jun 13, 2019 at 01:02:17PM -0400, Stephen Smalley wrote:
> On 6/11/19 6:02 PM, Sean Christopherson wrote:
> >On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> >>I haven't looked at this code closely, but it feels like a lot of
> >>SGX-specific logic embedded into SELinux that will have to be repeated or
> >>reused for every security module.  Does SGX not track this state itself?
> >
> >SGX does track equivalent state.
> >
> >There are three proposals on the table (I think):
> >
> >   1. Require userspace to explicitly specify (maximal) enclave page
> >      permissions at build time.  The enclave page permissions are provided
> >      to, and checked by, LSMs at enclave build time.
> >
> >      Pros: Low-complexity kernel implementation, straightforward auditing
> >      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
> >            SGX2 enclave loaders.
> >
> >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> >      based on the pre-checked permissions.
> >
> >      Pros: Does not impact SGX UAPI, medium kernel complexity
> >      Cons: Auditing is complex/weird, requires taking enclave-specific
> >            lock during mprotect() to query/update tracking.
> >
> >   3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
> >      from cradle to grave, but otherwise defer everything to LSMs.
> >
> >      Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
> >      Cons: Most complex and "heaviest" kernel implementation of the three,
> >            pushes more SGX details into LSMs.
> >
> >My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> >prefers #2.  Cedric's RFC series implements #3.
> >
> >Perhaps the easiest way to make forward progress is to rule out the
> >options we absolutely *don't* want by focusing on the potentially blocking
> >issue with each option:
> >
> >   #1 - SGX UAPI funkiness
> >
> >   #2 - Auditing complexity, potential enclave lock contention
> >
> >   #3 - Pushing SGX details into LSMs and complexity of kernel implementation
> >
> >
> >[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com
> 
> Given the complexity tradeoff, what is the clear motivating example for why
> #1 isn't the obvious choice? That the enclave loader has no way of knowing a
> priori whether the enclave will require W->X or WX?  But aren't we better
> off requiring enclaves to be explicitly marked as needing such so that we
> can make a more informed decision about whether to load them in the first
> place?

Andy and/or Cedric, can you please weigh in with a concrete (and practical)
use case that will break if we go with #1?  The auditing issues for #2/#3
are complex to say the least...

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14  0:46           ` Sean Christopherson
@ 2019-06-14 15:38             ` Sean Christopherson
  2019-06-16 22:14               ` Andy Lutomirski
  2019-06-14 17:16             ` Xing, Cedric
  2019-06-14 23:19             ` Dr. Greg
  2 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-14 15:38 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Cedric Xing, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, dave.hansen, tglx, torvalds, akpm, nhorman, pmccallum,
	serge.ayoun, shay.katz-zamir, haitao.huang, andriy.shevchenko,
	kai.svahn, bp, josh, kai.huang, rientjes, william.c.roberts,
	philip.b.tricca

On Thu, Jun 13, 2019 at 05:46:00PM -0700, Sean Christopherson wrote:
> On Thu, Jun 13, 2019 at 01:02:17PM -0400, Stephen Smalley wrote:
> > On 6/11/19 6:02 PM, Sean Christopherson wrote:
> > >On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> > >>I haven't looked at this code closely, but it feels like a lot of
> > >>SGX-specific logic embedded into SELinux that will have to be repeated or
> > >>reused for every security module.  Does SGX not track this state itself?
> > >
> > >SGX does track equivalent state.
> > >
> > >There are three proposals on the table (I think):
> > >
> > >   1. Require userspace to explicitly specify (maximal) enclave page
> > >      permissions at build time.  The enclave page permissions are provided
> > >      to, and checked by, LSMs at enclave build time.
> > >
> > >      Pros: Low-complexity kernel implementation, straightforward auditing
> > >      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
> > >            SGX2 enclave loaders.
> > >
> > >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> > >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> > >      based on the pre-checked permissions.
> > >
> > >      Pros: Does not impact SGX UAPI, medium kernel complexity
> > >      Cons: Auditing is complex/weird, requires taking enclave-specific
> > >            lock during mprotect() to query/update tracking.
> > >
> > >   3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
> > >      from cradle to grave, but otherwise defer everything to LSMs.
> > >
> > >      Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
> > >      Cons: Most complex and "heaviest" kernel implementation of the three,
> > >            pushes more SGX details into LSMs.
> > >
> > >My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> > >prefers #2.  Cedric's RFC series implements #3.
> > >
> > >Perhaps the easiest way to make forward progress is to rule out the
> > >options we absolutely *don't* want by focusing on the potentially blocking
> > >issue with each option:
> > >
> > >   #1 - SGX UAPI funkiness
> > >
> > >   #2 - Auditing complexity, potential enclave lock contention
> > >
> > >   #3 - Pushing SGX details into LSMs and complexity of kernel implementation
> > >
> > >
> > >[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com
> > 
> > Given the complexity tradeoff, what is the clear motivating example for why
> > #1 isn't the obvious choice? That the enclave loader has no way of knowing a
> > priori whether the enclave will require W->X or WX?  But aren't we better
> > off requiring enclaves to be explicitly marked as needing such so that we
> > can make a more informed decision about whether to load them in the first
> > place?
> 
> Andy and/or Cedric, can you please weigh in with a concrete (and practical)
> use case that will break if we go with #1?  The auditing issues for #2/#3
> are complex to say the least...

Follow-up question, is #1 any more palatable if SELinux adds SGX specific
permissions and ties them to the process (instead of the vma or sigstruct)?

Something like this for SELinux, where the absolute worst case scenario is
that SGX2 enclave loaders need SGXEXECMEM.  Graphene would need SGXEXECUNMR
and probably SGXEXECANON.

static inline int sgx_has_perm(u32 sid, u32 requested)
{
        return avc_has_perm(&selinux_state, sid, sid,
			    SECCLASS_PROCESS2, requested, NULL);
}

static int selinux_enclave_load(struct vm_area_struct *vma, unsigned long prot,
				bool measured)
{
	const struct cred *cred = current_cred();
	u32 sid = cred_sid(cred);
	int ret;

	/* SGX is supported only in 64-bit kernels. */
	WARN_ON_ONCE(!default_noexec);

	/* Only executable enclave pages are restricted in any way. */
	if (!(prot & PROT_EXEC))
		return 0;

	/*
	 * Private mappings to enclave pages are impossible, ergo we don't
	 * differentiate between W->X and WX, either case requires EXECMEM.
	 */
	if (prot & PROT_WRITE) {
		ret = sgx_has_perm(sid, PROCESS2__SGXEXECMEM);
		if (ret)
			goto out;
	}
	if (!measured) {
		ret = sgx_has_perm(sid, PROCESS2__SGXEXECUNMR);
		if (ret)
			goto out;
	}

	if (!vma->vm_file || IS_PRIVATE(file_inode(vma->vm_file)) ||
	    vma->anon_vma) {
		/*
		 * Loading enclave code from an anonymous mapping or from a
		 * modified private file mapping.
		 */
		ret = sgx_has_perm(sid, PROCESS2__SGXEXECANON);
		if (ret)
			goto out;
	} else {
		/* Loading from a shared or unmodified private file mapping. */
		ret = sgx_has_perm(sid, PROCESS2__SGXEXECFILE);
		if (ret)
			goto out;

		/* The source file must be executable in this case. */
		ret = file_has_perm(cred, vma->vm_file, FILE__EXECUTE);
		if (ret)
			goto out;
	}

out:
	return ret;
}


Given that AppArmor generally only cares about accessing files, its
enclave_load() implementation would be something like:

static int apparmor_enclave_load(struct vm_area_struct *vma, unsigned long prot,
				bool measured)
{
	if (!(prot & PROT_EXEC))
		return 0;

	return common_file_perm(OP_ENCL_LOAD, vma->vm_file, AA_EXEC_MMAP);
}
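
To make the worst-case claims above easier to check, here is a minimal
userspace sketch (not kernel code) of which permissions the SELinux logic
would demand for a given page.  The NEED_* bits and
enclave_load_required() are hypothetical names used only for this
illustration, and the "from_unmodified_file" flag stands in for the
vm_file/anon_vma test in the real hook.

```c
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Hypothetical bits mirroring the permission names proposed above. */
enum {
	NEED_SGXEXECMEM   = 1 << 0,
	NEED_SGXEXECUNMR  = 1 << 1,
	NEED_SGXEXECANON  = 1 << 2,
	NEED_SGXEXECFILE  = 1 << 3,
	NEED_FILE_EXECUTE = 1 << 4,
};

/*
 * Return the set of permissions that would be checked for one page.
 * from_unmodified_file: the source is a shared or unmodified private
 * file mapping (assumption: the caller has already classified it).
 */
static unsigned int enclave_load_required(unsigned long prot, bool measured,
					  bool from_unmodified_file)
{
	unsigned int need = 0;

	/* Only executable enclave pages are restricted in any way. */
	if (!(prot & PROT_EXEC))
		return 0;

	/* W->X and WX are not differentiated. */
	if (prot & PROT_WRITE)
		need |= NEED_SGXEXECMEM;
	if (!measured)
		need |= NEED_SGXEXECUNMR;

	if (from_unmodified_file)
		need |= NEED_SGXEXECFILE | NEED_FILE_EXECUTE;
	else
		need |= NEED_SGXEXECANON;

	return need;
}
```

With this model, an unmeasured RWX page loaded from anonymous memory
yields SGXEXECMEM, SGXEXECUNMR and SGXEXECANON together, which is the
combination a fully dynamic loader would have to be granted.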

^ permalink raw reply	[flat|nested] 67+ messages in thread

* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14  0:46           ` Sean Christopherson
  2019-06-14 15:38             ` Sean Christopherson
@ 2019-06-14 17:16             ` Xing, Cedric
  2019-06-14 17:45               ` Sean Christopherson
  2019-06-16 22:16               ` Andy Lutomirski
  2019-06-14 23:19             ` Dr. Greg
  2 siblings, 2 replies; 67+ messages in thread
From: Xing, Cedric @ 2019-06-14 17:16 UTC (permalink / raw)
  To: Christopherson, Sean J, Stephen Smalley
  Cc: linux-security-module, selinux, linux-kernel, linux-sgx,
	jarkko.sakkinen, luto, jmorris, serge, paul, eparis, jethro,
	Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum, Ayoun,
	Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko, Svahn,
	Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C, Tricca,
	Philip B

> From: Christopherson, Sean J
> Sent: Thursday, June 13, 2019 5:46 PM
> 
> On Thu, Jun 13, 2019 at 01:02:17PM -0400, Stephen Smalley wrote:
> > On 6/11/19 6:02 PM, Sean Christopherson wrote:
> > >On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> > >>I haven't looked at this code closely, but it feels like a lot of
> > >>SGX-specific logic embedded into SELinux that will have to be repeated or
> > >>reused for every security module.  Does SGX not track this state itself?
> > >
> > >SGX does track equivalent state.
> > >
> > >There are three proposals on the table (I think):
> > >
> > >   1. Require userspace to explicitly specify (maximal) enclave page
> > >      permissions at build time.  The enclave page permissions are provided
> > >      to, and checked by, LSMs at enclave build time.
> > >
> > >      Pros: Low-complexity kernel implementation, straightforward auditing
> > >      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
> > >            SGX2 enclave loaders.
> > >
> > >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> > >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> > >      based on the pre-checked permissions.
> > >
> > >      Pros: Does not impact SGX UAPI, medium kernel complexity
> > >      Cons: Auditing is complex/weird, requires taking enclave-specific
> > >            lock during mprotect() to query/update tracking.
> > >
> > >   3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
> > >      from cradle to grave, but otherwise defer everything to LSMs.
> > >
> > >      Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
> > >      Cons: Most complex and "heaviest" kernel implementation of the three,
> > >            pushes more SGX details into LSMs.
> > >
> > >My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> > >prefers #2.  Cedric's RFC series implements #3.
> > >
> > >Perhaps the easiest way to make forward progress is to rule out the
> > >options we absolutely *don't* want by focusing on the potentially blocking
> > >issue with each option:
> > >
> > >   #1 - SGX UAPI funkiness
> > >
> > >   #2 - Auditing complexity, potential enclave lock contention
> > >
> > >   #3 - Pushing SGX details into LSMs and complexity of kernel implementation
> > >
> > >
> > >[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com
> >
> > Given the complexity tradeoff, what is the clear motivating example for why
> > #1 isn't the obvious choice? That the enclave loader has no way of knowing a
> > priori whether the enclave will require W->X or WX?  But aren't we better
> > off requiring enclaves to be explicitly marked as needing such so that we
> > can make a more informed decision about whether to load them in the first
> > place?
> 
> Andy and/or Cedric, can you please weigh in with a concrete (and
> practical) use case that will break if we go with #1?  The auditing
> issues for #2/#3 are complex to say the least...

How does enclave loader provide per-page ALLOW_* flags? And a related
question is why they are necessary for enclaves but unnecessary for
regular executables or shared objects.

What's the story for SGX2 if mmap()'ing non-existing pages is not allowed?

What's the story for auditing?

After everything above has been taken care of properly, will #1 still be simpler than #2/#3?


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14 17:16             ` Xing, Cedric
@ 2019-06-14 17:45               ` Sean Christopherson
  2019-06-14 17:53                 ` Sean Christopherson
  2019-06-16 22:16               ` Andy Lutomirski
  1 sibling, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-14 17:45 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Stephen Smalley, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko,
	Svahn, Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C,
	Tricca, Philip B

On Fri, Jun 14, 2019 at 10:16:55AM -0700, Xing, Cedric wrote:
> > From: Christopherson, Sean J
> > Sent: Thursday, June 13, 2019 5:46 PM
> > 
> > On Thu, Jun 13, 2019 at 01:02:17PM -0400, Stephen Smalley wrote:
> > > On 6/11/19 6:02 PM, Sean Christopherson wrote:
> > > >My RFC series[1] implements #1.  My understanding is that Andy
> > > >(Lutomirski) prefers #2.  Cedric's RFC series implements #3.
> > > >
> > > >Perhaps the easiest way to make forward progress is to rule out the
> > > >options we absolutely *don't* want by focusing on the potentially
> > > >blocking issue with each option:
> > > >
> > > >   #1 - SGX UAPI funkiness
> > > >
> > > >   #2 - Auditing complexity, potential enclave lock contention
> > > >
> > > > >   #3 - Pushing SGX details into LSMs and complexity of kernel implementation
> > > > >
> > > > >
> > > > >[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com
> > > >
> > > > Given the complexity tradeoff, what is the clear motivating example for why
> > > > #1 isn't the obvious choice? That the enclave loader has no way of knowing a
> > > > priori whether the enclave will require W->X or WX?  But aren't we better
> > > > off requiring enclaves to be explicitly marked as needing such so that we
> > > > can make a more informed decision about whether to load them in the first
> > > > place?
> > 
> > Andy and/or Cedric, can you please weigh in with a concrete (and
> > practical) use case that will break if we go with #1?  The auditing
> > issues for #2/#3 are complex to say the least...
> 
> How does enclave loader provide per-page ALLOW_* flags?

Unchanged from my RFC, i.e. specified at SGX_IOC_ENCLAVE_ADD_PAGE(S).

> And a related question is why they are necessary for enclaves but
> unnecessary for regular executables or shared objects.

Because at mmap()/mprotect() time we don't have the source file of the
enclave page to check SELinux's FILE__EXECUTE or AppArmor's AA_EXEC_MMAP.

> What's the story for SGX2 if mmap()'ing non-existing pages is not allowed?

Userspace will need to invoke an ioctl() to tell SGX "this range can be
EAUG'd".
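
As a rough illustration of that flow, here is a userspace model in which
mmap() of not-yet-existing pages is refused unless the range was declared
first.  Everything here is an assumption for the sake of the sketch: no
such ioctl() exists in the series, and enclave_declare_eaug() and
enclave_may_mmap() are hypothetical stand-ins for it and for the driver's
mmap() check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EAUG_RANGES 8

struct eaug_range {
	uint64_t start;
	uint64_t end;		/* exclusive */
};

struct enclave_model {
	struct eaug_range ranges[MAX_EAUG_RANGES];
	size_t nr_ranges;
};

/* Stand-in for the ioctl(): 0 on success, -1 if invalid or table full. */
static int enclave_declare_eaug(struct enclave_model *e, uint64_t start,
				uint64_t len)
{
	if (!len || start + len < start || e->nr_ranges == MAX_EAUG_RANGES)
		return -1;

	e->ranges[e->nr_ranges].start = start;
	e->ranges[e->nr_ranges].end = start + len;
	e->nr_ranges++;
	return 0;
}

/*
 * Stand-in for the mmap() check.  For simplicity the request must fit
 * inside a single declared range; adjacent ranges are not merged.
 */
static bool enclave_may_mmap(const struct enclave_model *e, uint64_t start,
			     uint64_t len)
{
	uint64_t end = start + len;
	size_t i;

	if (!len || end < start)
		return false;

	for (i = 0; i < e->nr_ranges; i++) {
		if (start >= e->ranges[i].start && end <= e->ranges[i].end)
			return true;
	}
	return false;
}
```

A loader would call the declare step once per dynamic region before
mapping it; anything outside a declared region keeps failing as it does
today.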

> 
> What's the story for auditing?

It happens naturally when security_enclave_load() is called.  Am I
missing something?

> After everything above has been taken care of properly, will #1 still be
> simpler than #2/#3?

The state tracking of #2/#3 doesn't scare me, it's purely the auditing.
Holding an audit message for an indeterminate amount of time is a
nightmare.

Here's a thought.  What if we simply require FILE__EXECUTE or AA_EXEC_MMAP
to load any enclave page from a file?  Alternatively, we could add an SGX
specific file policy, e.g. FILE__ENCLAVELOAD and AA_MAY_LOAD_ENCLAVE.
As in my other email, SELinux's W^X restrictions can be tied to the process,
i.e. they can be checked at mmap()/mprotect() without throwing a wrench in
auditing.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14 17:45               ` Sean Christopherson
@ 2019-06-14 17:53                 ` Sean Christopherson
  2019-06-14 20:01                   ` Sean Christopherson
  0 siblings, 1 reply; 67+ messages in thread
From: Sean Christopherson @ 2019-06-14 17:53 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Stephen Smalley, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko,
	Svahn, Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C,
	Tricca, Philip B

On Fri, Jun 14, 2019 at 10:45:56AM -0700, Sean Christopherson wrote:
> The state tracking of #2/#3 doesn't scare me, it's purely the auditing.
> Holding an audit message for an indeterminate amount of time is a
> nightmare.
> 
> Here's a thought.  What if we simply require FILE__EXECUTE or AA_EXEC_MMAP
> to load any enclave page from a file?  Alternatively, we could add an SGX
> specific file policy, e.g. FILE__ENCLAVELOAD and AA_MAY_LOAD_ENCLAVE.
> As in my other email, SELinux's W^X restrictions can be tied to the process,
> i.e. they can be checked at mmap()/mprotect() without throwing a wrench in
> auditing.

We would also need to require VM_MAYEXEC on all enclave pages, or forego
enforcing path_noexec() for enclaves.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14 17:53                 ` Sean Christopherson
@ 2019-06-14 20:01                   ` Sean Christopherson
  0 siblings, 0 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-14 20:01 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Stephen Smalley, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, Hansen, Dave, tglx, torvalds, akpm, nhorman, pmccallum,
	Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao, andriy.shevchenko,
	Svahn, Kai, bp, josh, Huang, Kai, rientjes, Roberts, William C,
	Tricca, Philip B

On Fri, Jun 14, 2019 at 10:53:39AM -0700, Sean Christopherson wrote:
> On Fri, Jun 14, 2019 at 10:45:56AM -0700, Sean Christopherson wrote:
> > The state tracking of #2/#3 doesn't scare me, it's purely the auditing.
> > Holding an audit message for an indeterminate amount of time is a
> > nightmare.
> > 
> > Here's a thought.  What if we simply require FILE__EXECUTE or AA_EXEC_MMAP
> > to load any enclave page from a file?  Alternatively, we could add an SGX
> > specific file policy, e.g. FILE__ENCLAVELOAD and AA_MAY_LOAD_ENCLAVE.
> > As in my other email, SELinux's W^X restrictions can be tied to the process,
> > i.e. they can be checked at mmap()/mprotect() without throwing a wrench in
> > auditing.
> 
> We would also need to require VM_MAYEXEC on all enclave pages, or forego
> enforcing path_noexec() for enclaves.

Scratch that thought.  Tying W^X restrictions to the process only works
if it's done at load time.  E.g. if process A maps a page W and process B
maps the same page X, then which process needs W^X depends on the order of
mmap()/mprotect() between the two processes.
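
A tiny userspace model of that ordering problem, assuming a check that
fires on whichever request first makes a shared page both writable and
executable.  struct shared_page and map_is_charged() are hypothetical and
exist only to show the order dependence.

```c
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Hypothetical model of one enclave page shared by several processes. */
struct shared_page {
	unsigned long prot_seen;	/* union of all prot bits mapped so far */
};

static int is_wx(unsigned long prot)
{
	return (prot & PROT_WRITE) && (prot & PROT_EXEC);
}

/*
 * Record a new mapping.  Returns 1 if this request is the one that first
 * makes the page both writable and executable, i.e. the request that a
 * mmap()/mprotect()-time check would charge with the W^X permission.
 */
static int map_is_charged(struct shared_page *pg, unsigned long prot)
{
	unsigned long before = pg->prot_seen;

	pg->prot_seen |= prot;
	return !is_wx(before) && is_wx(pg->prot_seen);
}
```

If A maps W and then B maps X, B is the one charged; swap the order and A
is charged instead, even though neither process changed what it asked
for.  Checking at load time avoids this because the page's maximal
protections are fixed before anyone maps it.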

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14  0:46           ` Sean Christopherson
  2019-06-14 15:38             ` Sean Christopherson
  2019-06-14 17:16             ` Xing, Cedric
@ 2019-06-14 23:19             ` Dr. Greg
  2 siblings, 0 replies; 67+ messages in thread
From: Dr. Greg @ 2019-06-14 23:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Stephen Smalley, Cedric Xing, linux-security-module, selinux,
	linux-kernel, linux-sgx, jarkko.sakkinen, luto, jmorris, serge,
	paul, eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca

On Thu, Jun 13, 2019 at 05:46:00PM -0700, Sean Christopherson wrote:

Good afternoon, I hope the week is ending well for everyone.

> On Thu, Jun 13, 2019 at 01:02:17PM -0400, Stephen Smalley wrote:
> > Given the complexity tradeoff, what is the clear motivating
> > example for why #1 isn't the obvious choice? That the enclave
> > loader has no way of knowing a priori whether the enclave will
> > require W->X or WX?  But aren't we better off requiring enclaves
> > to be explicitly marked as needing such so that we can make a more
> > informed decision about whether to load them in the first place?

> Andy and/or Cedric, can you please weigh in with a concrete (and
> practical) use case that will break if we go with #1?  The auditing
> issues for #2/#3 are complex to say the least...

So we are back to choosing door 1, door 2 or door 3.

That brings us back to our previous e-mail, where we suggested that
the most fundamental question to answer with the LSM issue is how much
effective security is being purchased at what complexity cost.

We are practical guys at our company, we direct the development and
deployment of practical SGX systems, including an independent
implementation of SGX runtime/attestation/provisioning et al.  Our
comments, for whatever they are worth, are meant to reflect the real
world deployment of this technology.

Let's start big picture.

One of the clients we are consulting with on this technology is
running well north of 1400 Linux systems.  Every one of which has
selinux=0 in /proc/cmdline and will do so until approximately the heat
death of the Universe.

Our AI LSM will use any SGX LSM driver hooks that eventuate from these
discussions, so we support the notion of the LSM getting a look at
permissions of executable code.  However, our client isn't unique in
their configuration choice, so we believe this fact calls the question
as to how much SGX specific complexity should be injected into the
LSM.

So, as we noted in our previous e-mail, there are only two relevant
security questions the LSM needs to answer:

1.) Should a page of memory with executable content be allowed into an
enclave?

2.) Should an enclave be allowed to possess one or more pages of
executable memory which will have WX permissions sometime during its
lifetime?

Sean is suggesting the strategy of an ioctl to call out pages that
conform to question 2 (EAUG'ed pages).  That doesn't seem like an
onerous requirement, since all of the current enclave loaders already
have all of the metadata infrastructure to map/load page ranges.  The
EAUG WX range would simply be another layout type that gets walked
over when the enclave image is built.
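
As a sketch of what we mean, assuming a loader that keeps a simple table
of layout entries: the types and names below (layout_entry,
LAYOUT_EAUG_WX, walk_layout) are hypothetical and are not taken from any
existing runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout types; LAYOUT_EAUG_WX is the new one. */
enum layout_type {
	LAYOUT_EADD,		/* pages added and measured at build time */
	LAYOUT_EAUG_WX,		/* range that may later be EAUG'd as WX */
};

struct layout_entry {
	enum layout_type type;
	uint64_t offset;	/* page-aligned offset within the enclave */
	uint64_t nr_pages;
};

struct build_plan {
	uint64_t eadd_pages;	/* pages to add via the existing ioctl */
	uint64_t eaug_wx_pages;	/* pages to declare via the new ioctl */
};

/* Walk the layout once and tally the work for each kind of entry. */
static struct build_plan walk_layout(const struct layout_entry *l, size_t n)
{
	struct build_plan plan = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		switch (l[i].type) {
		case LAYOUT_EADD:
			plan.eadd_pages += l[i].nr_pages;
			break;
		case LAYOUT_EAUG_WX:
			plan.eaug_wx_pages += l[i].nr_pages;
			break;
		}
	}
	return plan;
}
```

The point is only that the extra range type costs one more case in a loop
the loader already has.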

Given that, we were somewhat surprised to hear Sean say that he had
been advised that door 1 was a non-starter.  Presumably this was
because of the need to delineate a specific cohort of pages that will
be permitted WX.  If that is the case, the question that needs to be
called, as Stephen alludes to above, is whether or not WX privileges
should be considered a characterizing feature of the VMA that defines
an enclave rather than a per page attribute.

Do we realistically believe that an LSM will be implemented that
reacts differently when the 357th page of WX memory is added as
opposed to the first?  The operative security question is whether or
not the platform owner is willing to allow arbitrary executable code,
that they may have no visibility into, to be executed on their
platform.

We tell people that, as a technology, SGX is about building
'security archipelagos', islands of trusted execution on potentially
multiple platforms that interact to deliver a service, all of which
consider their surrounding platforms and the network in between them
as adversarial.  This model is, by definition, adversarial to the
notion and function of the LSM.

With respect to SGX dynamic code loading, the future for security
conscious architectures will be to pull the code from remotely
attested repository servers over the network.  The only relevant
security question that can be answered is whether or not a platform
owner feels comfortable with that model.

Best wishes for a pleasant weekend to everyone.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"My spin on the meeting?  I lie somewhere between the individual who
 feels that we are all going to join hands and march forward carrying
 the organization into the information age and Dr. Wettstein.  Who
 feels that they are holding secret meetings at 6 o'clock in the
 morning plotting strategy on how to replace our system."
                                -- Paul S. Etzell, M.D.
                                   Medical Director, Roger Maris Cancer Center

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14 15:38             ` Sean Christopherson
@ 2019-06-16 22:14               ` Andy Lutomirski
  2019-06-17 16:49                 ` Sean Christopherson
  0 siblings, 1 reply; 67+ messages in thread
From: Andy Lutomirski @ 2019-06-16 22:14 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Stephen Smalley, Cedric Xing, LSM List, selinux, LKML, linux-sgx,
	Jarkko Sakkinen, Andrew Lutomirski, James Morris,
	Serge E. Hallyn, Paul Moore, Eric Paris, Jethro Beekman,
	Dave Hansen, Thomas Gleixner, Linus Torvalds, Andrew Morton,
	nhorman, pmccallum, Ayoun, Serge, Katz-zamir, Shay, Huang,
	Haitao, Andy Shevchenko, Svahn, Kai, Borislav Petkov,
	Josh Triplett, Huang, Kai, David Rientjes, Roberts, William C,
	Philip Tricca

On Fri, Jun 14, 2019 at 8:38 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Thu, Jun 13, 2019 at 05:46:00PM -0700, Sean Christopherson wrote:
> > On Thu, Jun 13, 2019 at 01:02:17PM -0400, Stephen Smalley wrote:
> > > On 6/11/19 6:02 PM, Sean Christopherson wrote:
> > > >On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> > > >>I haven't looked at this code closely, but it feels like a lot of
> > > >>SGX-specific logic embedded into SELinux that will have to be repeated or
> > > >>reused for every security module.  Does SGX not track this state itself?
> > > >
> > > >SGX does track equivalent state.
> > > >
> > > >There are three proposals on the table (I think):
> > > >
> > > >   1. Require userspace to explicitly specify (maximal) enclave page
> > > >      permissions at build time.  The enclave page permissions are provided
> > > >      to, and checked by, LSMs at enclave build time.
> > > >
> > > >      Pros: Low-complexity kernel implementation, straightforward auditing
> > > >      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
> > > >            SGX2 enclave loaders.
> > > >
> > > >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> > > >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> > > >      based on the pre-checked permissions.
> > > >
> > > >      Pros: Does not impact SGX UAPI, medium kernel complexity
> > > >      Cons: Auditing is complex/weird, requires taking enclave-specific
> > > >            lock during mprotect() to query/update tracking.
> > > >
> > > >   3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
> > > >      from cradle to grave, but otherwise defer everything to LSMs.
> > > >
> > > >      Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
> > > >      Cons: Most complex and "heaviest" kernel implementation of the three,
> > > >            pushes more SGX details into LSMs.
> > > >
> > > >My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> > > >prefers #2.  Cedric's RFC series implements #3.
> > > >
> > > >Perhaps the easiest way to make forward progress is to rule out the
> > > >options we absolutely *don't* want by focusing on the potentially blocking
> > > >issue with each option:
> > > >
> > > >   #1 - SGX UAPI funkiness
> > > >
> > > >   #2 - Auditing complexity, potential enclave lock contention
> > > >
> > > >   #3 - Pushing SGX details into LSMs and complexity of kernel implementation
> > > >
> > > >
> > > >[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com
> > >
> > > Given the complexity tradeoff, what is the clear motivating example for why
> > > #1 isn't the obvious choice? That the enclave loader has no way of knowing a
> > > priori whether the enclave will require W->X or WX?  But aren't we better
> > > off requiring enclaves to be explicitly marked as needing such so that we
> > > can make a more informed decision about whether to load them in the first
> > > place?
> >
> > Andy and/or Cedric, can you please weigh in with a concrete (and practical)
> > use case that will break if we go with #1?  The auditing issues for #2/#3
> > are complex to say the least...

The most significant issue I see is the following.  Consider two
cases. First, an SGX2 enclave that dynamically allocates memory but
doesn't execute code from dynamic memory.  Second, an SGX2 enclave
that *does* execute code from dynamic memory.  In #1, the untrusted
stack needs to decide whether to ALLOW_EXEC when the memory is
allocated, which means that it either needs to assume the worst or it
needs to know at allocation time whether the enclave ever intends to
change the permission to X.

I suppose there's a middle ground.  The driver could use model #1 for
driver-filled pages and model #2 for dynamic pages.  I haven't tried
to fully work it out, but I think there would be the ALLOW_READ /
ALLOW_WRITE / ALLOW_EXEC flag for EADD-ed pages but, for EAUG-ed
pages, there would be a different policy.  This might be as simple as
internally having four flags instead of three:

ALLOW_READ, ALLOW_WRITE, ALLOW_EXEC: as before

ALLOW_EXEC_COND: set implicitly by the driver for EAUG.

As in #1, if you try to mmap or mprotect a page with neither ALLOW_EXEC
variant, it fails (-EACCES, perhaps).  But, if you try to mmap or
mprotect an ALLOW_EXEC_COND page with PROT_EXEC, you ask LSM for
permission.  There is no fancy DIRTY tracking here, since it's
reasonable to just act as though *every* ALLOW_EXEC_COND page is
dirty.  There is no real auditing issue here, since LSM can just log
what permission is missing.

Does this seem sensible?  It might give us the best of #1 and #2.
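
Roughly, as a userspace sketch and not driver code: the flag values and
enclave_page_may_map() are made up for illustration, and the
lsm_allows_exec argument stands in for whatever the LSM hook would
return.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Hypothetical per-page flags; the values are arbitrary. */
#define ALLOW_READ	(1u << 0)
#define ALLOW_WRITE	(1u << 1)
#define ALLOW_EXEC	(1u << 2)
#define ALLOW_EXEC_COND	(1u << 3)	/* set implicitly by the driver for EAUG */

/*
 * Decide whether mmap()/mprotect() with 'prot' is allowed on a page with
 * the given ALLOW_* flags.  Returns 0 or -EACCES.
 */
static int enclave_page_may_map(unsigned int allow, unsigned long prot,
				bool lsm_allows_exec)
{
	if ((prot & PROT_READ) && !(allow & ALLOW_READ))
		return -EACCES;
	if ((prot & PROT_WRITE) && !(allow & ALLOW_WRITE))
		return -EACCES;

	if (prot & PROT_EXEC) {
		/* Model #1: already vetted at EADD time. */
		if (allow & ALLOW_EXEC)
			return 0;
		/* Neither ALLOW_EXEC variant: fail outright. */
		if (!(allow & ALLOW_EXEC_COND))
			return -EACCES;
		/* Model #2: treat the page as dirty and ask the LSM now. */
		if (!lsm_allows_exec)
			return -EACCES;
	}
	return 0;
}
```

An EAUG-ed page would carry ALLOW_READ | ALLOW_WRITE | ALLOW_EXEC_COND,
so data-only enclaves never hit the LSM, and enclaves that do flip
dynamic memory to X get a single, immediate check.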

>
> Follow-up question, is #1 any more palatable if SELinux adds SGX specific
> permissions and ties them to the process (instead of the vma or sigstruct)?

I'm not sure this makes a difference.  It simplifies SIGSTRUCT
handling, which is handy.

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-14 17:16             ` Xing, Cedric
  2019-06-14 17:45               ` Sean Christopherson
@ 2019-06-16 22:16               ` Andy Lutomirski
  1 sibling, 0 replies; 67+ messages in thread
From: Andy Lutomirski @ 2019-06-16 22:16 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Christopherson, Sean J, Stephen Smalley, linux-security-module,
	selinux, linux-kernel, linux-sgx, jarkko.sakkinen, luto, jmorris,
	serge, paul, eparis, jethro, Hansen, Dave, tglx, torvalds, akpm,
	nhorman, pmccallum, Ayoun, Serge, Katz-zamir, Shay, Huang,
	Haitao, andriy.shevchenko, Svahn, Kai, bp, josh, Huang, Kai,
	rientjes, Roberts, William C, Tricca, Philip B

On Fri, Jun 14, 2019 at 10:16 AM Xing, Cedric <cedric.xing@intel.com> wrote:
>
> > From: Christopherson, Sean J
> > Sent: Thursday, June 13, 2019 5:46 PM
> >
> > On Thu, Jun 13, 2019 at 01:02:17PM -0400, Stephen Smalley wrote:
> > > On 6/11/19 6:02 PM, Sean Christopherson wrote:
> > > >On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> > > > >>I haven't looked at this code closely, but it feels like a lot of
> > > > >>SGX-specific logic embedded into SELinux that will have to be repeated or
> > > > >>reused for every security module.  Does SGX not track this state itself?
> > > > >
> > > > >SGX does track equivalent state.
> > > > >
> > > > >There are three proposals on the table (I think):
> > > > >
> > > > >   1. Require userspace to explicitly specify (maximal) enclave page
> > > > >      permissions at build time.  The enclave page permissions are provided
> > > > >      to, and checked by, LSMs at enclave build time.
> > > > >
> > > > >      Pros: Low-complexity kernel implementation, straightforward auditing
> > > > >      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
> > > > >            SGX2 enclave loaders.
> > > > >
> > > > >   2. Pre-check LSM permissions and dynamically track mappings to enclave
> > > > >      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
> > > > >      based on the pre-checked permissions.
> > > > >
> > > > >      Pros: Does not impact SGX UAPI, medium kernel complexity
> > > > >      Cons: Auditing is complex/weird, requires taking enclave-specific
> > > > >            lock during mprotect() to query/update tracking.
> > > > >
> > > > >   3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
> > > > >      from cradle to grave, but otherwise defer everything to LSMs.
> > > > >
> > > > >      Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
> > > > >      Cons: Most complex and "heaviest" kernel implementation of the three,
> > > > >            pushes more SGX details into LSMs.
> > > > >
> > > > >My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> > > > >prefers #2.  Cedric's RFC series implements #3.
> > > > >
> > > > >Perhaps the easiest way to make forward progress is to rule out the
> > > > >options we absolutely *don't* want by focusing on the potentially blocking
> > > > >issue with each option:
> > > > >
> > > > >   #1 - SGX UAPI funkiness
> > > > >
> > > > >   #2 - Auditing complexity, potential enclave lock contention
> > > > >
> > > > >   #3 - Pushing SGX details into LSMs and complexity of kernel implementation
> > > > >
> > > > >
> > > > >[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com
> > > >
> > > > Given the complexity tradeoff, what is the clear motivating example for why
> > > > #1 isn't the obvious choice? That the enclave loader has no way of knowing a
> > > > priori whether the enclave will require W->X or WX?  But aren't we better
> > > > off requiring enclaves to be explicitly marked as needing such so that we
> > > > can make a more informed decision about whether to load them in the first
> > > > place?
> >
> > Andy and/or Cedric, can you please weigh in with a concrete (and
> > practical) use case that will break if we go with #1?  The auditing
> > issues for #2/#3 are complex to say the least...
>
> How does enclave loader provide per-page ALLOW_* flags? And a related question is why they are necessary for enclaves but unnecessary for regular executables or shared objects.
>
> What's the story for SGX2 if mmap()'ing non-existing pages is not allowed?
>

I think it just works.  Either you can't mmap() the page until you
have explicitly EAUG-ed it, or you add a new ioctl() that is
effectively "EAUG lazily".  The latter would declare that address and
request that it get allocated and EAUGed when faulted, but it wouldn't
actually do the EAUG.
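
A toy model of the "EAUG lazily" variant, assuming a fixed 16-page
enclave: declare_lazy() plays the role of the hypothetical new ioctl()
and handle_fault() the role of the driver's fault handler.  None of
these names exist in the current series.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE	4096u
#define MODEL_PAGES	16u

struct lazy_enclave {
	bool declared[MODEL_PAGES];	/* "EAUG lazily" was requested */
	bool present[MODEL_PAGES];	/* page has actually been EAUG-ed */
};

/* Stand-in for the ioctl(): only records intent, performs no EAUG. */
static int declare_lazy(struct lazy_enclave *e, uint64_t addr, uint64_t len)
{
	uint64_t first, count, i;

	if (!len || addr % MODEL_PAGE_SIZE || len % MODEL_PAGE_SIZE)
		return -1;

	first = addr / MODEL_PAGE_SIZE;
	count = len / MODEL_PAGE_SIZE;
	if (first >= MODEL_PAGES || count > MODEL_PAGES - first)
		return -1;

	for (i = 0; i < count; i++)
		e->declared[first + i] = true;
	return 0;
}

/* Stand-in for the fault path: 0 if the access can proceed, -1 if not. */
static int handle_fault(struct lazy_enclave *e, uint64_t addr)
{
	uint64_t idx = addr / MODEL_PAGE_SIZE;

	if (idx >= MODEL_PAGES)
		return -1;
	if (e->present[idx])
		return 0;
	if (!e->declared[idx])
		return -1;	/* never declared: a real fault */

	e->present[idx] = true;	/* this is where EAUG would happen */
	return 0;
}
```

Declaring a range touches no pages; each page is allocated only on first
access, and an access outside any declared range still faults.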

--Andy

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation
  2019-06-06  2:11 ` [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation Sean Christopherson
  2019-06-07 21:16   ` Stephen Smalley
@ 2019-06-17 16:38   ` Jarkko Sakkinen
  1 sibling, 0 replies; 67+ messages in thread
From: Jarkko Sakkinen @ 2019-06-17 16:38 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Cedric Xing, Stephen Smalley, James Morris,
	Serge E . Hallyn, LSM List, Paul Moore, Eric Paris, selinux,
	Jethro Beekman, Dave Hansen, Thomas Gleixner, Linus Torvalds,
	LKML, X86 ML, linux-sgx, Andrew Morton, nhorman, npmccallum,
	Serge Ayoun, Shay Katz-zamir, Haitao Huang, Andy Shevchenko,
	Kai Svahn, Borislav Petkov, Josh Triplett, Kai Huang,
	David Rientjes, William Roberts, Philip Tricca

On Wed, Jun 05, 2019 at 07:11:45PM -0700, Sean Christopherson wrote:
> The goal of selinux_enclave_load() is to provide a facsimile of the
> existing selinux_file_mprotect() and file_map_prot_check() policies,
> but tailored to the unique properties of SGX.
> 
> For example, an enclave page is technically backed by a MAP_SHARED file,
> but the "file" is essentially shared memory that is never persisted
> anywhere and also requires execute permissions (for some pages).
> 
> The basic concept is to require appropriate execute permissions on the
> source of the enclave for pages that are requesting PROT_EXEC, e.g. if
> an enclave page is being loaded from a regular file, require
> FILE__EXECUTE and/or FILE__EXECMOD, and if it's coming from an
> anonymous/private mapping, require PROCESS__EXECMEM since the process
> is essentially executing from the mapping, albeit in a roundabout way.
> 
> Note, FILE__READ and FILE__WRITE are intentionally not required even if
> the source page is backed by a regular file.  Writes to the enclave page
> are contained to the EPC, i.e. never hit the original file, and read
> permissions have already been vetted (or the VMA doesn't have PROT_READ,
> in which case loading the page into the enclave will fail).
> 
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
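
For readers skimming the thread, the policy in the quoted commit message
can be modelled roughly as below.  This is a userspace sketch, not the
code from the patch: the NEED_* bits, the enum and the function name are
hypothetical, and the rule for when FILE__EXECMOD applies (a file-backed
source page that has been modified) is an assumption borrowed from how
selinux_file_mprotect() treats modified private file mappings.

```c
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical stand-ins for the SELinux permissions named in the commit message. */
#define NEED_FILE_EXECUTE	(1u << 0)
#define NEED_FILE_EXECMOD	(1u << 1)
#define NEED_PROCESS_EXECMEM	(1u << 2)

/* Where the source page handed to the add-page ioctl comes from. */
enum src_kind {
	SRC_REGULAR_FILE,
	SRC_ANON_PRIVATE,
};

/*
 * Return the set of permissions the sketch would demand for one enclave
 * page.  Only PROT_EXEC pages are restricted; FILE__READ/FILE__WRITE are
 * intentionally never required.
 */
unsigned int enclave_load_required_perms(enum src_kind src, int prot,
					  bool src_modified)
{
	unsigned int need;

	if (!(prot & PROT_EXEC))
		return 0;

	/* Executing out of an anonymous/private mapping, albeit indirectly. */
	if (src == SRC_ANON_PRIVATE)
		return NEED_PROCESS_EXECMEM;

	need = NEED_FILE_EXECUTE;
	if (src_modified)	/* assumption, see the note above */
		need |= NEED_FILE_EXECMOD;
	return need;
}
```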

At the end of the day, the main problem with this patch is that the
existing LSM hooks are generic. I don't think we can have specific hooks
for proprietary hardware.

/Jarkko

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-16 22:14               ` Andy Lutomirski
@ 2019-06-17 16:49                 ` Sean Christopherson
  2019-06-17 17:08                   ` Andy Lutomirski
  2019-06-18 15:40                   ` Dr. Greg
  0 siblings, 2 replies; 67+ messages in thread
From: Sean Christopherson @ 2019-06-17 16:49 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Stephen Smalley, Cedric Xing, LSM List, selinux, LKML, linux-sgx,
	Jarkko Sakkinen, James Morris, Serge E. Hallyn, Paul Moore,
	Eric Paris, Jethro Beekman, Dave Hansen, Thomas Gleixner,
	Linus Torvalds, Andrew Morton, nhorman, pmccallum, Ayoun, Serge,
	Katz-zamir, Shay, Huang, Haitao, Andy Shevchenko, Svahn, Kai,
	Borislav Petkov, Josh Triplett, Huang, Kai, David Rientjes,
	Roberts, William C, Philip Tricca

On Sun, Jun 16, 2019 at 03:14:51PM -0700, Andy Lutomirski wrote:
> On Fri, Jun 14, 2019 at 8:38 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> > > Andy and/or Cedric, can you please weigh in with a concrete (and practical)
> > > use case that will break if we go with #1?  The auditing issues for #2/#3
> > > are complex to say the least...
> 
> The most significant issue I see is the following.  Consider two
> cases. First, an SGX2 enclave that dynamically allocates memory but
> doesn't execute code from dynamic memory.  Second, an SGX2 enclave
> that *does* execute code from dynamic memory.  In #1, the untrusted
> stack needs to decide whether to ALLOW_EXEC when the memory is
> allocated, which means that it either needs to assume the worst or it
> needs to know at allocation time whether the enclave ever intends to
> change the permission to X.

I'm just not convinced that folks running enclaves that can't communicate
their basic functionality will care one whit about SELinux restrictions,
i.e. will happily give EXECMOD even if it's not strictly necessary.
 
> I suppose there's a middle ground.  The driver could use model #1 for
> driver-filled pages and model #2 for dynamic pages.  I haven't tried
> to fully work it out, but I think there would be the ALLOW_READ /
> ALLOW_WRITE / ALLOW_EXEC flag for EADD-ed pages but, for EAUG-ed
> pages, there would be a different policy.  This might be as simple as
> internally having four flags instead of three:
> 
> ALLOW_READ, ALLOW_WRITE, ALLOW_EXEC: as before
> 
> ALLOW_EXEC_COND: set implicitly by the driver for EAUG.
> 
> As in #1, if you try to mmap or mprotect a page with neither ALLOW_EXEC
> variant, it fails (-EACCES, perhaps).  But, if you try to mmap or
> mprotect an ALLOW_EXEC_COND page with PROT_EXEC, you ask LSM for
> permission.  There is no fancy DIRTY tracking here, since it's
> reasonable to just act as though *every* ALLOW_EXEC_COND page is
> dirty.  There is no real auditing issue here, since LSM can just log
> what permission is missing.
> 
> Does this seem sensible?  It might give us the best of #1 and #2.

It would work and is easy to implement *if* SELinux ties permissions to
the process, as the SIGSTRUCT vma/file won't be available at
EAUG+mprotect().  I already have a set of patches to that effect, I'll
send 'em out in a bit.
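
As a rough illustration of the four-flag scheme quoted above (not the
forthcoming patches), the mmap()/mprotect() check could look something
like this userspace model.  The bit values, the callback type and the
function names are hypothetical; the LSM decision is reduced to a
callback that answers yes or no for the calling process.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical bit assignments for the per-page flags. */
#define ALLOW_READ	(1u << 0)
#define ALLOW_WRITE	(1u << 1)
#define ALLOW_EXEC	(1u << 2)
#define ALLOW_EXEC_COND	(1u << 3)	/* set implicitly by the driver for EAUG */

/* Stand-in for "ask LSM for permission", tied to the process. */
typedef bool (*lsm_exec_hook_t)(void);

bool lsm_always_allow(void) { return true; }
bool lsm_always_deny(void) { return false; }

/* Check a requested protection against a page's ALLOW_* flags. */
int encl_may_map(unsigned int allow, int prot, lsm_exec_hook_t lsm_allows_exec)
{
	if ((prot & PROT_READ) && !(allow & ALLOW_READ))
		return -EACCES;
	if ((prot & PROT_WRITE) && !(allow & ALLOW_WRITE))
		return -EACCES;
	if (!(prot & PROT_EXEC))
		return 0;

	/* Model #1: execute permission was declared up front (EADD). */
	if (allow & ALLOW_EXEC)
		return 0;

	/*
	 * Model #2: EAUG-ed page.  No DIRTY tracking, every such page is
	 * treated as dirty and the LSM decides.
	 */
	if (allow & ALLOW_EXEC_COND)
		return lsm_allows_exec() ? 0 : -EACCES;

	/* Neither ALLOW_EXEC variant is set. */
	return -EACCES;
}
```

Note that the LSM is consulted only on the ALLOW_EXEC_COND path, so a
denial there can be logged with the one permission that was missing.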

FWIW, we still need to differentiate W->X from WX on SGX1, i.e. declaring
ALLOW_WRITE + ALLOW_EXEC shouldn't imply WX.  This is also addressed in
the forthcoming updated RFC.

> > Follow-up question, is #1 any more palatable if SELinux adds SGX specific
> > permissions and ties them to the process (instead of the vma or sigstruct)?
> 
> I'm not sure this makes a difference.  It simplifies SIGSTRUCT
> handling, which is handy.

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-17 16:49                 ` Sean Christopherson
@ 2019-06-17 17:08                   ` Andy Lutomirski
  2019-06-18 15:40                   ` Dr. Greg
  1 sibling, 0 replies; 67+ messages in thread
From: Andy Lutomirski @ 2019-06-17 17:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Stephen Smalley, Cedric Xing, LSM List, selinux,
	LKML, linux-sgx, Jarkko Sakkinen, James Morris, Serge E. Hallyn,
	Paul Moore, Eric Paris, Jethro Beekman, Dave Hansen,
	Thomas Gleixner, Linus Torvalds, Andrew Morton, nhorman,
	pmccallum, Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao,
	Andy Shevchenko, Svahn, Kai, Borislav Petkov, Josh Triplett,
	Huang, Kai, David Rientjes, Roberts, William C, Philip Tricca

On Mon, Jun 17, 2019 at 9:49 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Sun, Jun 16, 2019 at 03:14:51PM -0700, Andy Lutomirski wrote:
> > On Fri, Jun 14, 2019 at 8:38 AM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > > > Andy and/or Cedric, can you please weigh in with a concrete (and practical)
> > > > use case that will break if we go with #1?  The auditing issues for #2/#3
> > > > are complex to say the least...
> >
> > The most significant issue I see is the following.  Consider two
> > cases. First, an SGX2 enclave that dynamically allocates memory but
> > doesn't execute code from dynamic memory.  Second, an SGX2 enclave
> > that *does* execute code from dynamic memory.  In #1, the untrusted
> > stack needs to decide whether to ALLOW_EXEC when the memory is
> > allocated, which means that it either needs to assume the worst or it
> > needs to know at allocation time whether the enclave ever intends to
> > change the permission to X.
>
> I'm just not convinced that folks running enclaves that can't communicate
> their basic functionality will care one whit about SELinux restrictions,
> i.e. will happily give EXECMOD even if it's not strictly necessary.

At least when permissions are learned, if there's no ALLOW_EXEC for
EAUG, then EXECMOD won't get learned if there's no eventual attempt to
execute the memory.

>
> > I suppose there's a middle ground.  The driver could use model #1 for
> > driver-filled pages and model #2 for dynamic pages.  I haven't tried
> > to fully work it out, but I think there would be the ALLOW_READ /
> > ALLOW_WRITE / ALLOW_EXEC flag for EADD-ed pages but, for EAUG-ed
> > pages, there would be a different policy.  This might be as simple as
> > internally having four flags instead of three:
> >
> > ALLOW_READ, ALLOW_WRITE, ALLOW_EXEC: as before
> >
> > ALLOW_EXEC_COND: set implicitly by the driver for EAUG.
> >
> > As in #1, if you try to mmap or mprotect a page with neither ALLOW_EXEC
> > variant, it fails (-EACCES, perhaps).  But, if you try to mmap or
> > mprotect an ALLOW_EXEC_COND page with PROT_EXEC, you ask LSM for
> > permission.  There is no fancy DIRTY tracking here, since it's
> > reasonable to just act as though *every* ALLOW_EXEC_COND page is
> > dirty.  There is no real auditing issue here, since LSM can just log
> > what permission is missing.
> >
> > Does this seem sensible?  It might give us the best of #1 and #2.
>
> It would work and is easy to implement *if* SELinux ties permissions to
> the process, as the SIGSTRUCT vma/file won't be available at
> EAUG+mprotect().  I already have a set of patches to that effect, I'll
> send 'em out in a bit.

I'm okay with that.

>
> FWIW, we still need to differentiate W->X from WX on SGX1, i.e. declaring
> ALLOW_WRITE + ALLOW_EXEC shouldn't imply WX.  This is also addressed in
> the forthcoming updated RFC.

Sounds good.

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
  2019-06-17 16:49                 ` Sean Christopherson
  2019-06-17 17:08                   ` Andy Lutomirski
@ 2019-06-18 15:40                   ` Dr. Greg
  1 sibling, 0 replies; 67+ messages in thread
From: Dr. Greg @ 2019-06-18 15:40 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Stephen Smalley, Cedric Xing, LSM List, selinux,
	LKML, linux-sgx, Jarkko Sakkinen, James Morris, Serge E. Hallyn,
	Paul Moore, Eric Paris, Jethro Beekman, Dave Hansen,
	Thomas Gleixner, Linus Torvalds, Andrew Morton, nhorman,
	pmccallum, Ayoun, Serge, Katz-zamir, Shay, Huang, Haitao,
	Andy Shevchenko, Svahn, Kai, Borislav Petkov, Josh Triplett,
	Huang, Kai, David Rientjes, Roberts, William C, Philip Tricca

On Mon, Jun 17, 2019 at 09:49:15AM -0700, Sean Christopherson wrote:

Good morning to everyone.

> On Sun, Jun 16, 2019 at 03:14:51PM -0700, Andy Lutomirski wrote:
> > The most significant issue I see is the following.  Consider two
> > cases. First, an SGX2 enclave that dynamically allocates memory but
> > doesn't execute code from dynamic memory.  Second, an SGX2 enclave
> > that *does* execute code from dynamic memory.  In #1, the untrusted
> > stack needs to decide whether to ALLOW_EXEC when the memory is
> > allocated, which means that it either needs to assume the worst or it
> > needs to know at allocation time whether the enclave ever intends to
> > change the permission to X.

> I'm just not convinced that folks running enclaves that can't
> communicate their basic functionality will care one whit about
> SELinux restrictions, i.e. will happily give EXECMOD even if it's
> not strictly necessary.

Hence the comments in my mail from last Friday.

It seems to us that the path forward is to require the enclave
author/signer to express their intent to implement executable dynamic
memory, see below.

> > I suppose there's a middle ground.  The driver could use model #1 for
> > driver-filled pages and model #2 for dynamic pages.  I haven't tried
> > to fully work it out, but I think there would be the ALLOW_READ /
> > ALLOW_WRITE / ALLOW_EXEC flag for EADD-ed pages but, for EAUG-ed
> > pages, there would be a different policy.  This might be as simple as
> > internally having four flags instead of three:
> > 
> > ALLOW_READ, ALLOW_WRITE, ALLOW_EXEC: as before
> > 
> > ALLOW_EXEC_COND: set implicitly by the driver for EAUG.
> > 
> > As in #1, if you try to mmap or mprotect a page with neither ALLOW_EXEC
> > variant, it fails (-EACCES, perhaps).  But, if you try to mmap or
> > mprotect an ALLOW_EXEC_COND page with PROT_EXEC, you ask LSM for
> > permission.  There is no fancy DIRTY tracking here, since it's
> > reasonable to just act as though *every* ALLOW_EXEC_COND page is
> > dirty.  There is no real auditing issue here, since LSM can just log
> > what permission is missing.
> > 
> > Does this seem sensible?  It might give us the best of #1 and #2.

> It would work and is easy to implement *if* SELinux ties permissions
> to the process, as the SIGSTRUCT vma/file won't be available at
> EAUG+mprotect().  I already have a set of patches to that effect,
> I'll send 'em out in a bit.

The VMA that is crafted from the enclave file is going to exist for
the life of the enclave.  If the intent to use executable dynamic
memory is specified when the enclave image is being built, or as a
component of enclave initialization, the driver is in a position to
log/deny a request to EAUG+mprotect whenever it occurs.  The sensitive
criteria would seem to be any request for dynamically allocated memory
with executable status.

The potential security impact of dynamically executable content is
something that is dependent on the enclave author rather than the
context of execution that is requesting pages to be allocated for such
purposes.  There is going to be an LSM hook to evaluate the SIGSTRUCT
at the time of EINIT, so all of the necessary information is there to
make a decision on whether or not to flag the VMA as being allowed to
support dynamically executable content.
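
A minimal sketch of that flow, as a plain C model, might look as
follows.  Everything named here is hypothetical: SGX defines no such
metadata attribute today, the structs are not kernel types, and the
counter merely stands in for an audit log entry.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical: intent declared by the enclave author/signer. */
struct encl_metadata {
	bool wants_dynamic_exec;
};

/* Hypothetical per-enclave state that lives as long as the enclave VMA. */
struct encl_vma {
	bool dyn_exec_allowed;
	unsigned int denied_requests;	/* stands in for an audit record */
};

/*
 * EINIT time: the LSM has evaluated the SIGSTRUCT, so the decision to
 * permit dynamically executable content is made once and recorded.
 */
void encl_einit(struct encl_vma *vma, const struct encl_metadata *md,
		bool lsm_permits_dyn_exec)
{
	vma->dyn_exec_allowed = md->wants_dynamic_exec && lsm_permits_dyn_exec;
	vma->denied_requests = 0;
}

/* Later EAUG+mprotect requests only consult the recorded decision. */
int encl_eaug_mprotect(struct encl_vma *vma, int prot)
{
	if ((prot & PROT_EXEC) && !vma->dyn_exec_allowed) {
		vma->denied_requests++;
		return -EACCES;
	}
	return 0;
}
```

The point of the model is that the only sensitive request is dynamic
memory with executable status, and the check for it needs nothing beyond
what was known at EINIT.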

It doesn't seem like an onerous requirement for this information to be
specified in the enclave metadata.  For optimum security, one could
perhaps argue that the ability to implement dynamic memory should have
been a specifiable attribute of the enclave, similar to the debug,
launch and provisioning attributes.

As we have indicated in the past, once the enclave is initialized with
permissions for dynamically executable content, the platform is
completely dependent on the security intentions of the author of the
enclave.  Given that, the notion of enduring significant LSM
complexity does not seem to be justified.

That opens up another set of security implications to discuss, but we
will let those lie for the moment.

Have a good day.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686            EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"More people are killed every year by pigs than by sharks, which shows
 you how good we are at evaluating risk."
                                -- Bruce Schneier
                                   Beyond Fear

end of thread, other threads:[~2019-06-18 15:42 UTC | newest]

Thread overview: 67+ messages
2019-06-06  2:11 [RFC PATCH v2 0/5] security: x86/sgx: SGX vs. LSM Sean Christopherson
2019-06-06  2:11 ` [RFC PATCH v2 1/5] mm: Introduce vm_ops->may_mprotect() Sean Christopherson
2019-06-10 15:06   ` Jarkko Sakkinen
2019-06-10 15:55     ` Sean Christopherson
2019-06-10 17:47       ` Xing, Cedric
2019-06-10 19:49         ` Sean Christopherson
2019-06-10 22:06           ` Xing, Cedric
2019-06-06  2:11 ` [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits Sean Christopherson
2019-06-10 15:27   ` Jarkko Sakkinen
2019-06-10 16:15     ` Sean Christopherson
2019-06-10 17:45       ` Jarkko Sakkinen
2019-06-10 18:17         ` Sean Christopherson
2019-06-12 19:26           ` Jarkko Sakkinen
2019-06-10 18:29   ` Xing, Cedric
2019-06-10 19:15     ` Andy Lutomirski
2019-06-10 22:28       ` Xing, Cedric
2019-06-12  0:09         ` Andy Lutomirski
2019-06-12 14:34           ` Sean Christopherson
2019-06-12 18:20             ` Xing, Cedric
2019-06-06  2:11 ` [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves Sean Christopherson
2019-06-10 16:00   ` Jarkko Sakkinen
2019-06-10 16:44     ` Andy Lutomirski
2019-06-11 17:21       ` Stephen Smalley
2019-06-06  2:11 ` [RFC PATCH v2 4/5] LSM: x86/sgx: Introduce ->enclave_load() hook for Intel SGX Sean Christopherson
2019-06-07 19:58   ` Stephen Smalley
2019-06-10 16:21     ` Sean Christopherson
2019-06-10 16:05   ` Jarkko Sakkinen
2019-06-06  2:11 ` [RFC PATCH v2 5/5] security/selinux: Add enclave_load() implementation Sean Christopherson
2019-06-07 21:16   ` Stephen Smalley
2019-06-10 16:46     ` Sean Christopherson
2019-06-17 16:38   ` Jarkko Sakkinen
2019-06-10  7:03 ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Cedric Xing
2019-06-10  7:03   ` [RFC PATCH v1 1/3] LSM/x86/sgx: Add " Cedric Xing
2019-06-10  7:03   ` [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux Cedric Xing
2019-06-11 13:40     ` Stephen Smalley
2019-06-11 22:02       ` Sean Christopherson
2019-06-12  9:32         ` Dr. Greg
2019-06-12 14:25           ` Sean Christopherson
2019-06-13  7:25             ` Dr. Greg
2019-06-12 19:30         ` Andy Lutomirski
2019-06-12 22:02           ` Sean Christopherson
2019-06-13  0:10             ` Xing, Cedric
2019-06-13  1:02             ` Xing, Cedric
2019-06-13 17:02         ` Stephen Smalley
2019-06-13 23:03           ` Xing, Cedric
2019-06-13 23:17             ` Sean Christopherson
2019-06-14  0:31               ` Xing, Cedric
2019-06-14  0:46           ` Sean Christopherson
2019-06-14 15:38             ` Sean Christopherson
2019-06-16 22:14               ` Andy Lutomirski
2019-06-17 16:49                 ` Sean Christopherson
2019-06-17 17:08                   ` Andy Lutomirski
2019-06-18 15:40                   ` Dr. Greg
2019-06-14 17:16             ` Xing, Cedric
2019-06-14 17:45               ` Sean Christopherson
2019-06-14 17:53                 ` Sean Christopherson
2019-06-14 20:01                   ` Sean Christopherson
2019-06-16 22:16               ` Andy Lutomirski
2019-06-14 23:19             ` Dr. Greg
2019-06-11 22:55       ` Xing, Cedric
2019-06-13 18:00         ` Stephen Smalley
2019-06-13 19:48           ` Sean Christopherson
2019-06-13 21:09             ` Xing, Cedric
2019-06-13 21:02           ` Xing, Cedric
2019-06-14  0:37           ` Sean Christopherson
2019-06-10  7:03   ` [RFC PATCH v1 3/3] LSM/x86/sgx: Call new LSM hooks from SGX subsystem Cedric Xing
2019-06-10 17:36   ` [RFC PATCH v1 0/3] security/x86/sgx: SGX specific LSM hooks Jarkko Sakkinen
