From: Jarkko Sakkinen <jarkko@kernel.org>
To: linux-sgx@vger.kernel.org
Cc: Nathaniel McCallum <nathaniel@profian.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Jarkko Sakkinen <jarkko@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
	"H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org (open list:X86 ARCHITECTURE (32-BIT
	AND 64-BIT))
Subject: [RFC PATCH v2.1 07/30] x86/sgx: Add pfn_mkwrite() handler for present PTEs
Date: Fri,  4 Mar 2022 11:35:01 +0200
Message-ID: <20220304093524.397485-7-jarkko@kernel.org>
In-Reply-To: <20220304093524.397485-1-jarkko@kernel.org>

From: Reinette Chatre <reinette.chatre@intel.com>

By default a write page fault on a present PTE inherits the
permissions of the VMA.

When using SGX2, enclave page permissions maintained in the
hardware's Enclave Page Cache Map (EPCM) may change after a VMA
accessing the page is created. A VMA's permissions may thus be
more relaxed than the EPCM permissions even though the VMA was
originally created not to have more relaxed permissions. Following
the default behavior during a page fault on a present PTE while
the VMA permissions are more relaxed than the EPCM permissions would
result in the PTE for an enclave page being writable even
though the page is not writable according to the EPCM permissions.

The kernel should not allow writing to a page if that page is not
writable: the PTE should accurately reflect the EPCM permissions
while not being more relaxed than the VMA permissions.

Do not blindly accept VMA permissions on a page fault due to a
write attempt to a present PTE. Install a pfn_mkwrite() handler
that ensures that the VMA permissions agree with the EPCM
permissions in this regard.
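
The difference between the default behavior and the new handler can be
sketched as a small user-space model. This is only an illustration, not
kernel code: the MODEL_VM_* flags and both helper functions are
hypothetical names, and the enclave permission argument stands in for
the per-page permission bits that the handler in the diff below checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_READ and VM_WRITE flags. */
#define MODEL_VM_READ	0x1UL
#define MODEL_VM_WRITE	0x2UL

static_assert((MODEL_VM_READ & MODEL_VM_WRITE) == 0,
	      "permission flags must be distinct bits");

/*
 * Default behavior: a write fault on a present PTE is granted whenever
 * the VMA is writable, regardless of the enclave page permissions.
 */
static bool default_allows_write(unsigned long vma_flags)
{
	return (vma_flags & MODEL_VM_WRITE) != 0;
}

/*
 * Behavior with a pfn_mkwrite() style check: the VMA and the enclave
 * page must both allow writing before the PTE is made writable.
 */
static bool mkwrite_allows_write(unsigned long vma_flags,
				 unsigned long encl_prot)
{
	return default_allows_write(vma_flags) &&
	       (encl_prot & MODEL_VM_WRITE) != 0;
}
```

For a RW VMA over a page whose enclave permissions were reduced to
read-only, default_allows_write() still grants the write while
mkwrite_allows_write() refuses it.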

Before and after page fault flow scenarios
==========================================

Consider the following scenario that will be possible when using SGX2:
* An enclave page exists with RW EPCM permissions.
* A RW VMA maps the range spanning the enclave page.
* The enclave page's EPCM permissions are changed to read-only.
* There is no PTE for the enclave page.

Considering that the PTE is not present in the scenario,
user space will observe the following when attempting to write to the
enclave page from within the enclave:
 1) Instruction writing to enclave page is run from within the enclave.
 2) A page fault with second and third bits set (0x6) is encountered
    and handled by the SGX handler sgx_vma_fault(), which installs a
    read-only page table entry. This follows the previous patch, which
    installs a PTE with permissions that the VMA and enclave agree on
    (read-only in this case).
 3) Instruction writing to enclave page is re-attempted.
 4) A page fault with first three bits set (0x7) is encountered and
    transparently (from SGX driver and user space perspective) handled
    by the kernel with the PTE made writable because the VMA is
    writable.
 5) Instruction writing to enclave page is re-attempted.
 6) Since the EPCM permissions prevent writing to the page, a new page
    fault is encountered, this time with the SGX flag set in the error
    code (0x8007). No action is taken by the kernel for this page fault
    and execution returns to user space.
 7) Typically such a fault will be passed on to an application with a
    signal but if the enclave is entered with the vDSO function provided
    by the kernel then user space does not receive a signal but instead
    the vDSO function returns successfully with exception information
    (vector=14, error code=0x8007, and address) within the exception
    fields within the vDSO function's struct sgx_enclave_run.

As can be observed, it is not possible for user space to write to an
enclave page if that page's EPCM permissions do not allow it,
no matter what the VMA or PTE allows.
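
The error codes quoted in the steps above (0x6, 0x7 and 0x8007) can be
decoded with a short sketch. The PF_* macros below are local to this
example and are assumed to mirror the bit positions of the kernel's
X86_PF_PROT, X86_PF_WRITE, X86_PF_USER and X86_PF_SGX flags; struct
pf_info and pf_decode() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the x86 page fault error code bits used above. */
#define PF_PROT		(1u << 0)	/* fault on a present PTE */
#define PF_WRITE	(1u << 1)	/* access was a write */
#define PF_USER		(1u << 2)	/* access came from user mode */
#define PF_SGX		(1u << 15)	/* SGX (EPCM) access control violation */

static_assert((PF_SGX | PF_USER | PF_WRITE | PF_PROT) == 0x8007,
	      "0x8007 is the SGX flag plus the three low bits");

struct pf_info {
	bool present;
	bool write;
	bool user;
	bool sgx;
};

/* Split a page fault error code into the flags discussed in the flow. */
static struct pf_info pf_decode(uint32_t error_code)
{
	struct pf_info info = {
		.present = (error_code & PF_PROT) != 0,
		.write   = (error_code & PF_WRITE) != 0,
		.user    = (error_code & PF_USER) != 0,
		.sgx     = (error_code & PF_SGX) != 0,
	};

	return info;
}
```

Decoded this way, 0x6 is a user-mode write to a non-present PTE, 0x7 is
a user-mode write to a present PTE, and 0x8007 is the same write
rejected by the EPCM.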

Even so, the kernel should not allow writing to a page if that page is
not writable. The PTE should accurately reflect the EPCM permissions.

With a pfn_mkwrite() handler that ensures that the VMA permissions
agree with the EPCM permissions, user space observes the following
when attempting to write to the enclave page from within the enclave:
 1) Instruction writing to enclave page is run from within the enclave.
 2) A page fault with second and third bits set (0x6) is encountered
    and handled by the SGX handler sgx_vma_fault(), which installs a
    read-only page table entry. This follows the previous patch, which
    installs a PTE with permissions that the VMA and enclave agree on
    (read-only in this case).
 3) Instruction writing to enclave page is re-attempted.
 4) A page fault with first three bits set (0x7) is encountered and
    passed to the pfn_mkwrite() handler for consideration. The handler
    determines that the page should not be writable and returns
    VM_FAULT_SIGBUS.
 5) Typically such a fault will be passed on to an application with a
    signal but if the enclave is entered with the vDSO function provided
    by the kernel then user space does not receive a signal but instead
    the vDSO function returns successfully with exception information
    (vector=14, error code=0x7, and address) within the exception fields
    within the vDSO function's struct sgx_enclave_run.

The accurate exception information supports the SGX runtime, which is
virtually always implemented inside a shared library, in its
management of the SGX enclave.
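
As an illustration of how a runtime might use this information, the
sketch below classifies the exception vector and error code reported
after the vDSO function returns. It is an assumption-laden example and
not part of this patch: the enum, the macros and
classify_write_fault() are hypothetical, and the two inputs are
assumed to be copied from the exception fields of struct
sgx_enclave_run.

```c
#include <assert.h>
#include <stdint.h>

#define VECTOR_PF	14u		/* page fault vector (#PF) */
#define ERR_PWU		0x7u		/* present | write | user */
#define ERR_SGX		0x8000u		/* SGX flag in the error code */

static_assert((ERR_SGX & ERR_PWU) == 0,
	      "SGX flag must not overlap the low three bits");

enum write_fault_origin {
	FAULT_NOT_PAGE_FAULT,	/* some other exception vector */
	FAULT_PTE_DENIED,	/* write refused at the page table level */
	FAULT_EPCM_DENIED,	/* write refused by the EPCM */
	FAULT_OTHER,		/* page fault that is not a denied user write */
};

/* Classify the exception reported for a write to an enclave page. */
static enum write_fault_origin classify_write_fault(uint16_t vector,
						    uint16_t error_code)
{
	if (vector != VECTOR_PF)
		return FAULT_NOT_PAGE_FAULT;
	if ((error_code & ERR_PWU) != ERR_PWU)
		return FAULT_OTHER;
	if (error_code & ERR_SGX)
		return FAULT_EPCM_DENIED;
	return FAULT_PTE_DENIED;
}
```

With the handler from this patch, the scenario above is expected to
report vector 14 with error code 0x7, which this sketch classifies as
FAULT_PTE_DENIED; without it, the reported 0x8007 would be classified
as FAULT_EPCM_DENIED.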

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 42 ++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 20e97d3abdce..6d25f7ed1294 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -184,6 +184,47 @@ static vm_fault_t sgx_vma_fault(struct vm_fault *vmf)
 	return VM_FAULT_NOPAGE;
 }
 
+/*
+ * A fault occurred while writing to a present enclave PTE. Since PTE is
+ * present this will not be handled by sgx_vma_fault(). VMA may allow
+ * writing to the page while enclave (as based on EPCM permissions) does
+ * not. Do not follow the default of inheriting VMA permissions in this
+ * regard, ensure enclave also allows writing to the page.
+ */
+static vm_fault_t sgx_vma_pfn_mkwrite(struct vm_fault *vmf)
+{
+	unsigned long addr = (unsigned long)vmf->address;
+	struct vm_area_struct *vma = vmf->vma;
+	struct sgx_encl_page *entry;
+	struct sgx_encl *encl;
+	vm_fault_t ret = 0;
+
+	encl = vma->vm_private_data;
+
+	/*
+	 * It's very unlikely but possible that allocating memory for the
+	 * mm_list entry of a forked process failed in sgx_vma_open(). When
+	 * this happens, vm_private_data is set to NULL.
+	 */
+	if (unlikely(!encl))
+		return VM_FAULT_SIGBUS;
+
+	mutex_lock(&encl->lock);
+
+	entry = xa_load(&encl->page_array, PFN_DOWN(addr));
+	if (!entry) {
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
+
+	if (!(entry->vm_max_prot_bits & VM_WRITE))
+		ret = VM_FAULT_SIGBUS;
+
+out:
+	mutex_unlock(&encl->lock);
+	return ret;
+}
+
 static void sgx_vma_open(struct vm_area_struct *vma)
 {
 	struct sgx_encl *encl = vma->vm_private_data;
@@ -381,6 +422,7 @@ const struct vm_operations_struct sgx_vm_ops = {
 	.mprotect = sgx_vma_mprotect,
 	.open = sgx_vma_open,
 	.access = sgx_vma_access,
+	.pfn_mkwrite = sgx_vma_pfn_mkwrite,
 };
 
 /**
-- 
2.35.1


