linux-sgx.vger.kernel.org archive mirror
* [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes
@ 2019-10-10 23:20 Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 1/9] x86/sgx: WARN once if an enclave is released with unfreed EPC pages Sean Christopherson
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:20 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Two critical bug fixes for a memory leak in sgx_encl_destroy() and a
livelock due to the EPC page free count getting corrupted.  The rest of
the patches are minor bug fixes and enhancements I collected in the
process of hunting down the livelock.

v2: Fully fix the memory leak, which was hilariously worse than originally
    thought.  Fun fact: the leak has existed since v18, i.e. nearly a year.

Sean Christopherson (9):
  x86/sgx: WARN once if an enclave is released with unfreed EPC pages
  x86/sgx: Do not EWB SECS if the enclave is dead
  x86/sgx: Fix a memory leak in sgx_encl_destroy()
  x86/sgx: WARN on any non-zero return from __eremove()
  x86/sgx: WARN only once if EREMOVE fails
  x86/sgx: Split second half of sgx_free_page() to a separate helper
  x86/sgx: Use the post-reclaim variant of __sgx_free_page()
  x86/sgx: Don't update free page count if EPC section allocation fails
  x86/sgx: Reinstate per EPC section free page counts

 arch/x86/kernel/cpu/sgx/encl.c    | 34 ++++++++++++-------
 arch/x86/kernel/cpu/sgx/ioctl.c   |  6 ++--
 arch/x86/kernel/cpu/sgx/main.c    | 54 ++++++++++++++++++++++---------
 arch/x86/kernel/cpu/sgx/reclaim.c | 25 +++++++-------
 arch/x86/kernel/cpu/sgx/sgx.h     | 19 ++++++++++-
 5 files changed, 96 insertions(+), 42 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH for_v23 v2 1/9] x86/sgx: WARN once if an enclave is released with unfreed EPC pages
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 2/9] x86/sgx: Do not EWB SECS if the enclave is dead Sean Christopherson
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Add a WARN to detect EPC page leaks when releasing an enclave.  The
release flow uses the common sgx_encl_destroy() helper, which can
legitimately be called while the reclaimer holds references to the
enclave's EPC pages, and so cannot itself WARN in the scenario where
the SECS is leaked because it still has active child pages.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c13c3ba3430a..b4d7b2f9609f 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -511,6 +511,7 @@ void sgx_encl_release(struct kref *ref)
 		fput(encl->backing);
 
 	WARN_ONCE(!list_empty(&encl->mm_list), "mm_list non-empty");
+	WARN_ON_ONCE(encl->secs_child_cnt || encl->secs.epc_page);
 
 	kfree(encl);
 }
-- 
2.22.0



* [PATCH for_v23 v2 2/9] x86/sgx: Do not EWB SECS if the enclave is dead
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 1/9] x86/sgx: WARN once if an enclave is released with unfreed EPC pages Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 3/9] x86/sgx: Fix a memory leak in sgx_encl_destroy() Sean Christopherson
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Don't EWB the SECS when opportunistically zapping it during reclaim if
the enclave is dead, as the VA pages are freed by sgx_encl_destroy(),
i.e. sgx_encl_ewb() would consume a stale encl->va_pages list if the
enclave has been destroyed.

Add a comment in sgx_encl_destroy() to explicitly call out that it's ok
to free VA pages.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c    |  6 +++++-
 arch/x86/kernel/cpu/sgx/reclaim.c | 19 +++++++++++--------
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index b4d7b2f9609f..ea21d3737a32 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -484,7 +484,11 @@ void sgx_encl_destroy(struct sgx_encl *encl)
 		encl->secs.epc_page = NULL;
 	}
 
-
+	/*
+	 * The reclaimer is responsible for checking SGX_ENCL_DEAD before doing
+	 * EWB, thus it's safe to free VA pages even if the reclaimer holds a
+	 * reference to the enclave.
+	 */
 	while (!list_empty(&encl->va_pages)) {
 		va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
 					   list);
diff --git a/arch/x86/kernel/cpu/sgx/reclaim.c b/arch/x86/kernel/cpu/sgx/reclaim.c
index 391fbc3e7e98..8143c9a20894 100644
--- a/arch/x86/kernel/cpu/sgx/reclaim.c
+++ b/arch/x86/kernel/cpu/sgx/reclaim.c
@@ -321,16 +321,19 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
 	encl_page->epc_page = NULL;
 	encl->secs_child_cnt--;
 
-	if (!encl->secs_child_cnt &&
-	    (atomic_read(&encl->flags) &
-	     (SGX_ENCL_DEAD | SGX_ENCL_INITIALIZED))) {
-		ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size),
-					   &secs_backing);
-		if (!ret) {
-			sgx_encl_ewb(encl->secs.epc_page, &secs_backing);
+	if (!encl->secs_child_cnt) {
+		if (atomic_read(&encl->flags) & SGX_ENCL_DEAD) {
 			sgx_free_page(encl->secs.epc_page);
-
 			encl->secs.epc_page = NULL;
+		} else if (atomic_read(&encl->flags) & SGX_ENCL_INITIALIZED) {
+			ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size),
+						   &secs_backing);
+			if (!ret) {
+				sgx_encl_ewb(encl->secs.epc_page, &secs_backing);
+
+				sgx_free_page(encl->secs.epc_page);
+				encl->secs.epc_page = NULL;
+			}
 
 			sgx_encl_put_backing(&secs_backing, true);
 		}
-- 
2.22.0



* [PATCH for_v23 v2 3/9] x86/sgx: Fix a memory leak in sgx_encl_destroy()
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 1/9] x86/sgx: WARN once if an enclave is released with unfreed EPC pages Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 2/9] x86/sgx: Do not EWB SECS if the enclave is dead Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 4/9] x86/sgx: WARN on any non-zero return from __eremove() Sean Christopherson
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Delete an enclave page's entry in the radix tree regardless of whether
it has an associated EPC page, and free the enclave page itself once
its entry has been deleted from the radix tree.

Don't free/delete anything if the page is held by the reclaimer, as the
reclaimer needs the page itself and the driver needs the radix entry to
re-process the entry during sgx_encl_release().

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index ea21d3737a32..c1d2df186b2d 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -469,14 +469,19 @@ void sgx_encl_destroy(struct sgx_encl *encl)
 	radix_tree_for_each_slot(slot, &encl->page_tree, &iter, 0) {
 		entry = *slot;
 		if (entry->epc_page) {
-			if (!sgx_free_page(entry->epc_page)) {
-				encl->secs_child_cnt--;
-				entry->epc_page = NULL;
-			}
-
-			radix_tree_delete(&entry->encl->page_tree,
-					  PFN_DOWN(entry->desc));
+			/*
+			 * The page and its radix tree entry cannot be freed
+			 * if the page is being held by the reclaimer.
+			 */
+			if (sgx_free_page(entry->epc_page))
+				continue;
+			encl->secs_child_cnt--;
+			entry->epc_page = NULL;
 		}
+
+		radix_tree_delete(&entry->encl->page_tree,
+				  PFN_DOWN(entry->desc));
+		kfree(entry);
 	}
 
 	if (!encl->secs_child_cnt && encl->secs.epc_page) {
-- 
2.22.0



* [PATCH for_v23 v2 4/9] x86/sgx: WARN on any non-zero return from __eremove()
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
                   ` (2 preceding siblings ...)
  2019-10-10 23:21 ` [PATCH for_v23 v2 3/9] x86/sgx: Fix a memory leak in sgx_encl_destroy() Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 5/9] x86/sgx: WARN only once if EREMOVE fails Sean Christopherson
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

WARN on any non-zero return from __eremove() to make it clear that any
kind of failure is unexpected.  Technically, warning on negative values
is unnecessary as the ENCLS helpers return SGX error codes, which are
currently all positive.  But the more precise check might be
misinterpreted as implying that negative values are expected/ok.

Note: prior to a recent change, warning only on positive values was
necessary to avoid a redundant double-WARN, as the WARN resided outside
of what is now sgx_free_page() and so could consume -EBUSY.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 2b540abbb61f..5170d4ba1096 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -137,7 +137,7 @@ int sgx_free_page(struct sgx_epc_page *page)
 	spin_unlock(&sgx_active_page_list_lock);
 
 	ret = __eremove(sgx_epc_addr(page));
-	WARN(ret > 0, "EREMOVE returned %d (0x%x)", ret, ret);
+	WARN(ret, "EREMOVE returned %d (0x%x)", ret, ret);
 
 	spin_lock(&section->lock);
 	list_add_tail(&page->list, &section->page_list);
-- 
2.22.0



* [PATCH for_v23 v2 5/9] x86/sgx: WARN only once if EREMOVE fails
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
                   ` (3 preceding siblings ...)
  2019-10-10 23:21 ` [PATCH for_v23 v2 4/9] x86/sgx: WARN on any non-zero return from __eremove() Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 6/9] x86/sgx: Split second half of sgx_free_page() to a separate helper Sean Christopherson
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

WARN only once if EREMOVE fails to avoid spamming the kernel log if a
catastrophic failure occurs.  Warning on every failure is helpful for
development, but is a bad idea for production code as EREMOVE rarely
fails just once...

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 5170d4ba1096..84a44251387b 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -137,7 +137,7 @@ int sgx_free_page(struct sgx_epc_page *page)
 	spin_unlock(&sgx_active_page_list_lock);
 
 	ret = __eremove(sgx_epc_addr(page));
-	WARN(ret, "EREMOVE returned %d (0x%x)", ret, ret);
+	WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret);
 
 	spin_lock(&section->lock);
 	list_add_tail(&page->list, &section->page_list);
-- 
2.22.0



* [PATCH for_v23 v2 6/9] x86/sgx: Split second half of sgx_free_page() to a separate helper
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
                   ` (4 preceding siblings ...)
  2019-10-10 23:21 ` [PATCH for_v23 v2 5/9] x86/sgx: WARN only once if EREMOVE fails Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 7/9] x86/sgx: Use the post-reclaim variant of __sgx_free_page() Sean Christopherson
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Move the post-reclaim half of sgx_free_page() to a standalone helper so
that it can be used in flows where the page is known to be
non-reclaimable.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 44 ++++++++++++++++++++++++++--------
 arch/x86/kernel/cpu/sgx/sgx.h  |  1 +
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 84a44251387b..e22cdbb431a3 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -103,6 +103,39 @@ struct sgx_epc_page *sgx_alloc_page(void *owner, bool reclaim)
 	return entry;
 }
 
+
+/**
+ * __sgx_free_page() - Free an EPC page
+ * @page:	pointer to a previously allocated EPC page
+ *
+ * EREMOVE an EPC page and insert it back to the list of free pages. The page
+ * must not be reclaimable.
+ */
+void __sgx_free_page(struct sgx_epc_page *page)
+{
+	struct sgx_epc_section *section;
+	int ret;
+
+	/*
+	 * Don't take sgx_active_page_list_lock when asserting the page isn't
+	 * reclaimable, missing a WARN in the very rare case is preferable to
+	 * unnecessarily taking a global lock in the common case.
+	 */
+	WARN_ON_ONCE(page->desc & SGX_EPC_PAGE_RECLAIMABLE);
+
+	ret = __eremove(sgx_epc_addr(page));
+	if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret))
+		return;
+
+	section = sgx_epc_section(page);
+
+	spin_lock(&section->lock);
+	list_add_tail(&page->list, &section->page_list);
+	sgx_nr_free_pages++;
+	spin_unlock(&section->lock);
+
+}
+
 /**
  * sgx_free_page() - Free an EPC page
  * @page:	pointer a previously allocated EPC page
@@ -116,9 +149,6 @@ struct sgx_epc_page *sgx_alloc_page(void *owner, bool reclaim)
  */
 int sgx_free_page(struct sgx_epc_page *page)
 {
-	struct sgx_epc_section *section = sgx_epc_section(page);
-	int ret;
-
 	/*
 	 * Remove the page from the active list if necessary.  If the page
 	 * is actively being reclaimed, i.e. RECLAIMABLE is set but the
@@ -136,13 +166,7 @@ int sgx_free_page(struct sgx_epc_page *page)
 	}
 	spin_unlock(&sgx_active_page_list_lock);
 
-	ret = __eremove(sgx_epc_addr(page));
-	WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret);
-
-	spin_lock(&section->lock);
-	list_add_tail(&page->list, &section->page_list);
-	sgx_nr_free_pages++;
-	spin_unlock(&section->lock);
+	__sgx_free_page(page);
 
 	return 0;
 }
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 160a3c996ef6..87e375e8c25e 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -85,6 +85,7 @@ void sgx_reclaim_pages(void);
 
 struct sgx_epc_page *sgx_try_alloc_page(void);
 struct sgx_epc_page *sgx_alloc_page(void *owner, bool reclaim);
+void __sgx_free_page(struct sgx_epc_page *page);
 int sgx_free_page(struct sgx_epc_page *page);
 
 #endif /* _X86_SGX_H */
-- 
2.22.0



* [PATCH for_v23 v2 7/9] x86/sgx: Use the post-reclaim variant of __sgx_free_page()
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
                   ` (5 preceding siblings ...)
  2019-10-10 23:21 ` [PATCH for_v23 v2 6/9] x86/sgx: Split second half of sgx_free_page() to a separate helper Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 8/9] x86/sgx: Don't update free page count if EPC section allocation fails Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 9/9] x86/sgx: Reinstate per EPC section free page counts Sean Christopherson
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Use __sgx_free_page() in all locations where the EPC page is supposed to
be unreclaimable so that a WARN fires if the page is unexpectedly marked
for reclaim.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c    | 8 ++++----
 arch/x86/kernel/cpu/sgx/ioctl.c   | 6 +++---
 arch/x86/kernel/cpu/sgx/reclaim.c | 4 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index c1d2df186b2d..1fa990d48ee2 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -73,7 +73,7 @@ static struct sgx_epc_page *sgx_encl_eldu(struct sgx_encl_page *encl_page,
 
 	ret = __sgx_encl_eldu(encl_page, epc_page, secs_page);
 	if (ret) {
-		sgx_free_page(epc_page);
+		__sgx_free_page(epc_page);
 		return ERR_PTR(ret);
 	}
 
@@ -485,7 +485,7 @@ void sgx_encl_destroy(struct sgx_encl *encl)
 	}
 
 	if (!encl->secs_child_cnt && encl->secs.epc_page) {
-		sgx_free_page(encl->secs.epc_page);
+		__sgx_free_page(encl->secs.epc_page);
 		encl->secs.epc_page = NULL;
 	}
 
@@ -498,7 +498,7 @@ void sgx_encl_destroy(struct sgx_encl *encl)
 		va_page = list_first_entry(&encl->va_pages, struct sgx_va_page,
 					   list);
 		list_del(&va_page->list);
-		sgx_free_page(va_page->epc_page);
+		__sgx_free_page(va_page->epc_page);
 		kfree(va_page);
 	}
 }
@@ -693,7 +693,7 @@ struct sgx_epc_page *sgx_alloc_va_page(void)
 	ret = __epa(sgx_epc_addr(epc_page));
 	if (ret) {
 		WARN_ONCE(1, "EPA returned %d (0x%x)", ret, ret);
-		sgx_free_page(epc_page);
+		__sgx_free_page(epc_page);
 		return ERR_PTR(-EFAULT);
 	}
 
diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
index eeed919ed3e2..7a6bc3bf8f8c 100644
--- a/arch/x86/kernel/cpu/sgx/ioctl.c
+++ b/arch/x86/kernel/cpu/sgx/ioctl.c
@@ -47,7 +47,7 @@ static void sgx_encl_shrink(struct sgx_encl *encl, struct sgx_va_page *va_page)
 	encl->page_cnt--;
 
 	if (va_page) {
-		sgx_free_page(va_page->epc_page);
+		__sgx_free_page(va_page->epc_page);
 		list_del(&va_page->list);
 		kfree(va_page);
 	}
@@ -220,7 +220,7 @@ static int sgx_encl_create(struct sgx_encl *encl, struct sgx_secs *secs)
 	return 0;
 
 err_out:
-	sgx_free_page(encl->secs.epc_page);
+	__sgx_free_page(encl->secs.epc_page);
 	encl->secs.epc_page = NULL;
 
 err_out_backing:
@@ -441,7 +441,7 @@ static int sgx_encl_add_page(struct sgx_encl *encl,
 	up_read(&current->mm->mmap_sem);
 
 err_out_free:
-	sgx_free_page(epc_page);
+	__sgx_free_page(epc_page);
 	kfree(encl_page);
 
 	return ret;
diff --git a/arch/x86/kernel/cpu/sgx/reclaim.c b/arch/x86/kernel/cpu/sgx/reclaim.c
index 8143c9a20894..3f183dd0e653 100644
--- a/arch/x86/kernel/cpu/sgx/reclaim.c
+++ b/arch/x86/kernel/cpu/sgx/reclaim.c
@@ -323,7 +323,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
 
 	if (!encl->secs_child_cnt) {
 		if (atomic_read(&encl->flags) & SGX_ENCL_DEAD) {
-			sgx_free_page(encl->secs.epc_page);
+			__sgx_free_page(encl->secs.epc_page);
 			encl->secs.epc_page = NULL;
 		} else if (atomic_read(&encl->flags) & SGX_ENCL_INITIALIZED) {
 			ret = sgx_encl_get_backing(encl, PFN_DOWN(encl->size),
@@ -331,7 +331,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
 			if (!ret) {
 				sgx_encl_ewb(encl->secs.epc_page, &secs_backing);
 
-				sgx_free_page(encl->secs.epc_page);
+				__sgx_free_page(encl->secs.epc_page);
 				encl->secs.epc_page = NULL;
 			}
 
-- 
2.22.0



* [PATCH for_v23 v2 8/9] x86/sgx: Don't update free page count if EPC section allocation fails
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
                   ` (6 preceding siblings ...)
  2019-10-10 23:21 ` [PATCH for_v23 v2 7/9] x86/sgx: Use the post-reclaim variant of __sgx_free_page() Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  2019-10-10 23:21 ` [PATCH for_v23 v2 9/9] x86/sgx: Reinstate per EPC section free page counts Sean Christopherson
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Update the number of free pages only after an EPC section is fully
initialized, else the free page count will be left in a bogus state if
allocation fails.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index e22cdbb431a3..48a4f37b5b3c 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -216,9 +216,10 @@ static bool __init sgx_alloc_epc_section(u64 addr, u64 size,
 
 		page->desc = (addr + (i << PAGE_SHIFT)) | index;
 		list_add_tail(&page->list, &section->unsanitized_page_list);
-		sgx_nr_free_pages++;
 	}
 
+	sgx_nr_free_pages += nr_pages;
+
 	return true;
 
 err_out:
-- 
2.22.0



* [PATCH for_v23 v2 9/9] x86/sgx: Reinstate per EPC section free page counts
  2019-10-10 23:20 [PATCH for_v23 v2 0/9] x86/sgx: Misc page related fixes Sean Christopherson
                   ` (7 preceding siblings ...)
  2019-10-10 23:21 ` [PATCH for_v23 v2 8/9] x86/sgx: Don't update free page count if EPC section allocation fails Sean Christopherson
@ 2019-10-10 23:21 ` Sean Christopherson
  8 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2019-10-10 23:21 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: linux-sgx

Track the free page count on a per EPC section basis so that the value
is properly protected by the section's spinlock.

As was pointed out when the change was proposed[*], using a global
non-atomic counter to track the number of free EPC pages is not safe.
The ordering of non-atomic reads and writes is not guaranteed, i.e.
concurrent RMW operations can write stale data.  This causes a variety
of bad behavior, e.g. livelocks because the free page count wraps and
causes the swap thread to stop reclaiming.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c    | 11 +++++------
 arch/x86/kernel/cpu/sgx/reclaim.c |  4 ++--
 arch/x86/kernel/cpu/sgx/sgx.h     | 18 +++++++++++++++++-
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 48a4f37b5b3c..cc87690fa1ec 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -13,18 +13,17 @@
 
 struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
 int sgx_nr_epc_sections;
-unsigned long sgx_nr_free_pages;
 
 static struct sgx_epc_page *__sgx_try_alloc_page(struct sgx_epc_section *section)
 {
 	struct sgx_epc_page *page;
 
-	if (list_empty(&section->page_list))
+	if (!section->free_cnt)
 		return NULL;
 
 	page = list_first_entry(&section->page_list, struct sgx_epc_page, list);
 	list_del_init(&page->list);
-	sgx_nr_free_pages--;
+	section->free_cnt--;
 	return page;
 }
 
@@ -97,7 +96,7 @@ struct sgx_epc_page *sgx_alloc_page(void *owner, bool reclaim)
 		schedule();
 	}
 
-	if (sgx_nr_free_pages < SGX_NR_LOW_PAGES)
+	if (!sgx_at_least_N_free_pages(SGX_NR_LOW_PAGES))
 		wake_up(&ksgxswapd_waitq);
 
 	return entry;
@@ -131,7 +130,7 @@ void __sgx_free_page(struct sgx_epc_page *page)
 
 	spin_lock(&section->lock);
 	list_add_tail(&page->list, &section->page_list);
-	sgx_nr_free_pages++;
+	section->free_cnt++;
 	spin_unlock(&section->lock);
 
 }
@@ -218,7 +217,7 @@ static bool __init sgx_alloc_epc_section(u64 addr, u64 size,
 		list_add_tail(&page->list, &section->unsanitized_page_list);
 	}
 
-	sgx_nr_free_pages += nr_pages;
+	section->free_cnt = nr_pages;
 
 	return true;
 
diff --git a/arch/x86/kernel/cpu/sgx/reclaim.c b/arch/x86/kernel/cpu/sgx/reclaim.c
index 3f183dd0e653..8619141f4bed 100644
--- a/arch/x86/kernel/cpu/sgx/reclaim.c
+++ b/arch/x86/kernel/cpu/sgx/reclaim.c
@@ -68,7 +68,7 @@ static void sgx_sanitize_section(struct sgx_epc_section *section)
 
 static inline bool sgx_should_reclaim(void)
 {
-	return sgx_nr_free_pages < SGX_NR_HIGH_PAGES &&
+	return !sgx_at_least_N_free_pages(SGX_NR_HIGH_PAGES) &&
 	       !list_empty(&sgx_active_page_list);
 }
 
@@ -430,7 +430,7 @@ void sgx_reclaim_pages(void)
 		section = sgx_epc_section(epc_page);
 		spin_lock(&section->lock);
 		list_add_tail(&epc_page->list, &section->page_list);
-		sgx_nr_free_pages++;
+		section->free_cnt++;
 		spin_unlock(&section->lock);
 	}
 }
diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
index 87e375e8c25e..c7f0277299f6 100644
--- a/arch/x86/kernel/cpu/sgx/sgx.h
+++ b/arch/x86/kernel/cpu/sgx/sgx.h
@@ -30,6 +30,7 @@ struct sgx_epc_page {
 struct sgx_epc_section {
 	unsigned long pa;
 	void *va;
+	unsigned long free_cnt;
 	struct list_head page_list;
 	struct list_head unsanitized_page_list;
 	spinlock_t lock;
@@ -73,12 +74,27 @@ static inline void *sgx_epc_addr(struct sgx_epc_page *page)
 #define SGX_NR_HIGH_PAGES	64
 
 extern int sgx_nr_epc_sections;
-extern unsigned long sgx_nr_free_pages;
 extern struct task_struct *ksgxswapd_tsk;
 extern struct wait_queue_head(ksgxswapd_waitq);
 extern struct list_head sgx_active_page_list;
 extern spinlock_t sgx_active_page_list_lock;
 
+static inline bool sgx_at_least_N_free_pages(unsigned long threshold)
+{
+	struct sgx_epc_section *section;
+	unsigned long free_cnt = 0;
+	int i;
+
+	for (i = 0; i < sgx_nr_epc_sections; i++) {
+		section = &sgx_epc_sections[i];
+		free_cnt += section->free_cnt;
+		if (free_cnt >= threshold)
+			return true;
+	}
+
+	return false;
+}
+
 bool __init sgx_page_reclaimer_init(void);
 void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
 void sgx_reclaim_pages(void);
-- 
2.22.0


