linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2] x86/sgx: Silence softlockup detection when releasing large enclaves
@ 2022-01-25 18:22 Reinette Chatre
  2022-01-26 14:32 ` Jarkko Sakkinen
  0 siblings, 1 reply; 2+ messages in thread
From: Reinette Chatre @ 2022-01-25 18:22 UTC (permalink / raw)
  To: dave.hansen, jarkko, tglx, bp, luto, mingo, linux-sgx, x86
  Cc: vijay.dhanraj, linux-kernel, stable

Vijay reported that the "unclobbered_vdso_oversubscribed" selftest
triggers the softlockup detector.

Actual SGX systems have 128GB of enclave memory or more.  The
"unclobbered_vdso_oversubscribed" selftest creates one enclave which
consumes all of the enclave memory on the system. Tearing down such a
large enclave takes around a minute, most of it in the loop where
the EREMOVE instruction is applied to each individual 4k enclave page.

Spending one minute in a loop triggers the softlockup detector.

Add a cond_resched() to give other tasks a chance to run and placate
the softlockup detector.

Cc: stable@vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Softlockup message:

watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
Kernel panic - not syncing: softlockup: hung tasks
<snip>
sgx_encl_release+0x86/0x1c0
sgx_release+0x11c/0x130
__fput+0xb0/0x280
____fput+0xe/0x10
task_work_run+0x6c/0xc0
exit_to_user_mode_prepare+0x1eb/0x1f0
syscall_exit_to_user_mode+0x1d/0x50
do_syscall_64+0x46/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae

Changes since V1:
- V1: https://lore.kernel.org/lkml/1aa037705e5aa209d8b7a075873c6b4190327436.1642530802.git.reinette.chatre@intel.com/
- Add comment provided by Jarkko.

 arch/x86/kernel/cpu/sgx/encl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index 001808e3901c..48afe96ae0f0 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -410,6 +410,8 @@ void sgx_encl_release(struct kref *ref)
 		}
 
 		kfree(entry);
+		/* Invoke scheduler to prevent soft lockups. */
+		cond_resched();
 	}
 
 	xa_destroy(&encl->page_array);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH V2] x86/sgx: Silence softlockup detection when releasing large enclaves
  2022-01-25 18:22 [PATCH V2] x86/sgx: Silence softlockup detection when releasing large enclaves Reinette Chatre
@ 2022-01-26 14:32 ` Jarkko Sakkinen
  0 siblings, 0 replies; 2+ messages in thread
From: Jarkko Sakkinen @ 2022-01-26 14:32 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: dave.hansen, tglx, bp, luto, mingo, linux-sgx, x86,
	vijay.dhanraj, linux-kernel, stable

On Tue, Jan 25, 2022 at 10:22:43AM -0800, Reinette Chatre wrote:
> Vijay reported that the "unclobbered_vdso_oversubscribed" selftest
> triggers the softlockup detector.
> 
> Actual SGX systems have 128GB of enclave memory or more.  The
> "unclobbered_vdso_oversubscribed" selftest creates one enclave which
> consumes all of the enclave memory on the system. Tearing down such a
> large enclave takes around a minute, most of it in the loop where
> the EREMOVE instruction is applied to each individual 4k enclave page.
> 
> Spending one minute in a loop triggers the softlockup detector.
> 
> Add a cond_resched() to give other tasks a chance to run and placate
> the softlockup detector.
> 
> Cc: stable@vger.kernel.org
> Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> Reported-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
> Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Softlockup message:
> 
> watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [test_sgx:11502]
> Kernel panic - not syncing: softlockup: hung tasks
> <snip>
> sgx_encl_release+0x86/0x1c0
> sgx_release+0x11c/0x130
> __fput+0xb0/0x280
> ____fput+0xe/0x10
> task_work_run+0x6c/0xc0
> exit_to_user_mode_prepare+0x1eb/0x1f0
> syscall_exit_to_user_mode+0x1d/0x50
> do_syscall_64+0x46/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> Changes since V1:
> - V1: https://lore.kernel.org/lkml/1aa037705e5aa209d8b7a075873c6b4190327436.1642530802.git.reinette.chatre@intel.com/
> - Add comment provided by Jarkko.
> 
>  arch/x86/kernel/cpu/sgx/encl.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 001808e3901c..48afe96ae0f0 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -410,6 +410,8 @@ void sgx_encl_release(struct kref *ref)
>  		}
>  
>  		kfree(entry);
> +		/* Invoke scheduler to prevent soft lockups. */
> +		cond_resched();
>  	}
>  
>  	xa_destroy(&encl->page_array);
> -- 
> 2.25.1
> 

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>  (kselftest as sanity check)

BR, Jarkko

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-01-26 14:32 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-25 18:22 [PATCH V2] x86/sgx: Silence softlockup detection when releasing large enclaves Reinette Chatre
2022-01-26 14:32 ` Jarkko Sakkinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).