* [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
@ 2020-12-11 11:32 Jarkko Sakkinen
  2020-12-14 19:01 ` Sean Christopherson
  0 siblings, 1 reply; 9+ messages in thread
From: Jarkko Sakkinen @ 2020-12-11 11:32 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, linux-sgx, Jarkko Sakkinen, Borislav Petkov,
	Dave Hansen, Sean Christopherson

Each sgx_mmun_notifier_release() starts a grace period, which means that
one extra synchronize_rcu() in sgx_encl_release(). Add it there.

sgx_release() has the loop that drains the list but with bad luck the
entry is already gone from the list before that loop processes it.

Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
---
 arch/x86/kernel/cpu/sgx/encl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index ee50a5010277..48539a6ee315 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -438,6 +438,13 @@ void sgx_encl_release(struct kref *ref)
 	if (encl->backing)
 		fput(encl->backing);
 
+	/*
+	 * Each sgx_mmun_notifier_release() starts a grace period. Thus one
+	 * "extra" synchronize_rcu() is required here. This can go undetected by
+	 * sgx_release() when it drains the mm list.
+	 */
+	synchronize_srcu(&encl->srcu);
+
 	cleanup_srcu_struct(&encl->srcu);
 
 	WARN_ON_ONCE(!list_empty(&encl->mm_list));
-- 
2.27.0



* Re: [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
  2020-12-11 11:32 [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release() Jarkko Sakkinen
@ 2020-12-14 19:01 ` Sean Christopherson
  2020-12-15  5:55   ` Jarkko Sakkinen
  0 siblings, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2020-12-14 19:01 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: x86, linux-kernel, linux-sgx, Borislav Petkov, Dave Hansen

On Fri, Dec 11, 2020, Jarkko Sakkinen wrote:
> Each sgx_mmun_notifier_release() starts a grace period, which means that

Should be sgx_mmu_notifier_release(), here and in the comment.

> one extra synchronize_rcu() in sgx_encl_release(). Add it there.
> 
> sgx_release() has the loop that drains the list but with bad luck the
> entry is already gone from the list before that loop processes it.

Why not include the actual analysis that "proves" the bug?  The splat that
Haitao reported would also be useful info.

> Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Reported-by: Sean Christopherson <seanjc@google.com>

Haitao reported the bug, and for all intents and purposes provided the fix.  I
just did the analysis to verify that there was a legitimate bug and that the
synchronization in sgx_encl_release() was indeed necessary.

> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
> ---
>  arch/x86/kernel/cpu/sgx/encl.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index ee50a5010277..48539a6ee315 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -438,6 +438,13 @@ void sgx_encl_release(struct kref *ref)
>  	if (encl->backing)
>  		fput(encl->backing);
>  
> +	/*
> +	 * Each sgx_mmun_notifier_release() starts a grace period. Thus one
> +	 * "extra" synchronize_rcu() is required here. This can go undetected by
> +	 * sgx_release() when it drains the mm list.
> +	 */
> +	synchronize_srcu(&encl->srcu);
> +
>  	cleanup_srcu_struct(&encl->srcu);
>  
>  	WARN_ON_ONCE(!list_empty(&encl->mm_list));
> -- 
> 2.27.0
> 


* Re: [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
  2020-12-14 19:01 ` Sean Christopherson
@ 2020-12-15  5:55   ` Jarkko Sakkinen
  2020-12-15  5:59     ` Jarkko Sakkinen
  2020-12-15 22:04     ` Sean Christopherson
  0 siblings, 2 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2020-12-15  5:55 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, linux-kernel, linux-sgx, Borislav Petkov, Dave Hansen

On Mon, Dec 14, 2020 at 11:01:32AM -0800, Sean Christopherson wrote:
> On Fri, Dec 11, 2020, Jarkko Sakkinen wrote:
> > Each sgx_mmun_notifier_release() starts a grace period, which means that
> 
> Should be sgx_mmu_notifier_release(), here and in the comment.

Thanks.

> > one extra synchronize_rcu() in sgx_encl_release(). Add it there.
> > 
> > sgx_release() has the loop that drains the list but with bad luck the
> > entry is already gone from the list before that loop processes it.
> 
> Why not include the actual analysis that "proves" the bug?  The splat that
> Haitao reported would also be useful info.

True. I can include a snippet of dmesg to the commit message.

> > Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Reported-by: Sean Christopherson <seanjc@google.com>
> 
> Haitao reported the bug, and for all intents and purposes provided the fix.  I
> just did the analysis to verify that there was a legitimate bug and that the
> synchronization in sgx_encl_release() was indeed necessary.

Good and valid point. The way I see it, the tags should be:

Reported-by: Haitao Huang <haitao.huang@linux.intel.com>
Suggested-by: Sean Christopherson <seanjc@google.com>

Haitao pointed out the bug but from your analysis I could resolve that
this is the fix to implement, and was able to write the long
description for the commit.

Does this make sense to you?

/Jarkko


* Re: [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
  2020-12-15  5:55   ` Jarkko Sakkinen
@ 2020-12-15  5:59     ` Jarkko Sakkinen
  2020-12-15 17:34       ` Haitao Huang
  2020-12-15 22:04     ` Sean Christopherson
  1 sibling, 1 reply; 9+ messages in thread
From: Jarkko Sakkinen @ 2020-12-15  5:59 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, linux-kernel, linux-sgx, Borislav Petkov, Dave Hansen

On Tue, Dec 15, 2020 at 07:56:01AM +0200, Jarkko Sakkinen wrote:
> On Mon, Dec 14, 2020 at 11:01:32AM -0800, Sean Christopherson wrote:
> > On Fri, Dec 11, 2020, Jarkko Sakkinen wrote:
> > > Each sgx_mmun_notifier_release() starts a grace period, which means that
> > 
> > Should be sgx_mmu_notifier_release(), here and in the comment.
> 
> Thanks.
> 
> > > one extra synchronize_rcu() in sgx_encl_release(). Add it there.
> > > 
> > > sgx_release() has the loop that drains the list but with bad luck the
> > > entry is already gone from the list before that loop processes it.
> > 
> > Why not include the actual analysis that "proves" the bug?  The splat that
> > Haitao reported would also be useful info.
> 
> True. I can include a snippet of dmesg to the commit message.
> 
> > > Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> > > Cc: Borislav Petkov <bp@alien8.de>
> > > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > > Reported-by: Sean Christopherson <seanjc@google.com>
> > 
> > Haitao reported the bug, and for all intents and purposes provided the fix.  I
> > just did the analysis to verify that there was a legitimate bug and that the
> > synchronization in sgx_encl_release() was indeed necessary.
> 
> Good and valid point. The way I see it, the tags should be:
> 
> Reported-by: Haitao Huang <haitao.huang@linux.intel.com>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> 
> Haitao pointed out the bug but from your analysis I could resolve that
> this is the fix to implement, and was able to write the long
> description for the commit.
> 
> Does this make sense to you?

I'm sending v2 next week (this week on vacation).

/Jarkko


* Re: [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
  2020-12-15  5:59     ` Jarkko Sakkinen
@ 2020-12-15 17:34       ` Haitao Huang
  2020-12-15 21:35         ` Jarkko Sakkinen
  0 siblings, 1 reply; 9+ messages in thread
From: Haitao Huang @ 2020-12-15 17:34 UTC (permalink / raw)
  To: Sean Christopherson, Jarkko Sakkinen
  Cc: x86, linux-kernel, linux-sgx, Borislav Petkov, Dave Hansen

On Mon, 14 Dec 2020 23:59:55 -0600, Jarkko Sakkinen <jarkko@kernel.org>
wrote:

> On Tue, Dec 15, 2020 at 07:56:01AM +0200, Jarkko Sakkinen wrote:
>> On Mon, Dec 14, 2020 at 11:01:32AM -0800, Sean Christopherson wrote:
>> > On Fri, Dec 11, 2020, Jarkko Sakkinen wrote:
>> > > Each sgx_mmun_notifier_release() starts a grace period, which means that
>> >
>> > Should be sgx_mmu_notifier_release(), here and in the comment.
>>
>> Thanks.
>>
>> > > one extra synchronize_rcu() in sgx_encl_release(). Add it there.
>> > >
>> > > sgx_release() has the loop that drains the list but with bad luck the
>> > > entry is already gone from the list before that loop processes it.
>> >
>> > Why not include the actual analysis that "proves" the bug?  The splat that
>> > Haitao reported would also be useful info.
>>
>> True. I can include a snippet of dmesg to the commit message.
>>
>> > > Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
>> > > Cc: Borislav Petkov <bp@alien8.de>
>> > > Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> > > Reported-by: Sean Christopherson <seanjc@google.com>
>> >
>> > Haitao reported the bug, and for all intents and purposes provided the fix.  I
>> > just did the analysis to verify that there was a legitimate bug and that the
>> > synchronization in sgx_encl_release() was indeed necessary.
>>
>> Good and valid point. The way I see it, the tags should be:
>>
>> Reported-by: Haitao Huang <haitao.huang@linux.intel.com>
>> Suggested-by: Sean Christopherson <seanjc@google.com>
>>
>> Haitao pointed out the bug but from your analysis I could resolve that
>> this is the fix to implement, and was able to write the long
>> description for the commit.
>>
>> Does this make sense to you?
>
> I'm sending v2 next week (this week on vacation).
>
> /Jarkko

I don't mind either how tags are assigned. But our testing reveals
significant latency introduced in scenarios of heavy loading/unloading
enclaves. synchronize_srcu_expedited fixed the issue. Please analyze and
confirm if that's more appropriate than synchronize_srcu here.


* Re: [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
  2020-12-15 17:34       ` Haitao Huang
@ 2020-12-15 21:35         ` Jarkko Sakkinen
  0 siblings, 0 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2020-12-15 21:35 UTC (permalink / raw)
  To: Haitao Huang
  Cc: Sean Christopherson, x86, linux-kernel, linux-sgx,
	Borislav Petkov, Dave Hansen

On Tue, Dec 15, 2020 at 11:34:37AM -0600, Haitao Huang wrote:
> On Mon, 14 Dec 2020 23:59:55 -0600, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
> 
> > On Tue, Dec 15, 2020 at 07:56:01AM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Dec 14, 2020 at 11:01:32AM -0800, Sean Christopherson wrote:
> > > > On Fri, Dec 11, 2020, Jarkko Sakkinen wrote:
> > > > > > Each sgx_mmun_notifier_release() starts a grace period, which means that
> > > > >
> > > > > Should be sgx_mmu_notifier_release(), here and in the comment.
> > > > 
> > > > Thanks.
> > > > 
> > > > > > one extra synchronize_rcu() in sgx_encl_release(). Add it there.
> > > > > >
> > > > > > sgx_release() has the loop that drains the list but with bad luck the
> > > > > > entry is already gone from the list before that loop processes it.
> > > > >
> > > > > Why not include the actual analysis that "proves" the bug?  The splat that
> > > > > Haitao reported would also be useful info.
> > > > 
> > > > True. I can include a snippet of dmesg to the commit message.
> > > > 
> > > > > > Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
> > > > > > Cc: Borislav Petkov <bp@alien8.de>
> > > > > > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > > > > > Reported-by: Sean Christopherson <seanjc@google.com>
> > > > >
> > > > > Haitao reported the bug, and for all intents and purposes provided the fix.  I
> > > > > just did the analysis to verify that there was a legitimate bug and that the
> > > > > synchronization in sgx_encl_release() was indeed necessary.
> > > 
> > > Good and valid point. The way I see it, the tags should be:
> > > 
> > > Reported-by: Haitao Huang <haitao.huang@linux.intel.com>
> > > Suggested-by: Sean Christopherson <seanjc@google.com>
> > > 
> > > Haitao pointed out the bug but from your analysis I could resolve that
> > > this is the fix to implement, and was able to write the long
> > > description for the commit.
> > > 
> > > Does this make sense to you?
> > 
> > I'm sending v2 next week (this week on vacation).
> > 
> > /Jarkko
> 
> I don't mind either how tags are assigned. But our testing reveals
> significant latency introduced in scenarios of heavy loading/unloading
> enclaves. synchronize_srcu_expedited fixed the issue. Please analyze and
> confirm if that's more appropriate than synchronize_srcu here.

I don't see any obvious reason why *_expedited could not be used here,
as most of the time the syncs are taken care of by the sgx_release() loop,
and the final sync is with sgx_mmu_notifier_release(). More aggressive
spinning should not do any harm here.

About the tags. I just try to get them right, and it is sometimes not
straight-forward. So I guess, with all things considered, I'll put
suggested-by from you. Once I get a refined patch out, try it out with
your workloads and provide me tested-by, if it is working for you.

/Jarkko


* Re: [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
  2020-12-15  5:55   ` Jarkko Sakkinen
  2020-12-15  5:59     ` Jarkko Sakkinen
@ 2020-12-15 22:04     ` Sean Christopherson
  2020-12-16 12:25       ` Jarkko Sakkinen
  1 sibling, 1 reply; 9+ messages in thread
From: Sean Christopherson @ 2020-12-15 22:04 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: x86, linux-kernel, linux-sgx, Borislav Petkov, Dave Hansen

On Tue, Dec 15, 2020, Jarkko Sakkinen wrote:
> On Mon, Dec 14, 2020 at 11:01:32AM -0800, Sean Christopherson wrote:
> > Haitao reported the bug, and for all intents and purposes provided the fix.  I
> > just did the analysis to verify that there was a legitimate bug and that the
> > synchronization in sgx_encl_release() was indeed necessary.
> 
> Good and valid point. The way I see it, the tags should be:
> 
> Reported-by: Haitao Huang <haitao.huang@linux.intel.com>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> 
> Haitao pointed out the bug but from your analysis I could resolve that
> this is the fix to implement, and was able to write the long
> description for the commit.
> 
> Does this make sense to you?

Yep, works for me.


* Re: [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
  2020-12-15 22:04     ` Sean Christopherson
@ 2020-12-16 12:25       ` Jarkko Sakkinen
  0 siblings, 0 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2020-12-16 12:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, linux-kernel, linux-sgx, Borislav Petkov, Dave Hansen

On Tue, Dec 15, 2020 at 02:04:10PM -0800, Sean Christopherson wrote:
> On Tue, Dec 15, 2020, Jarkko Sakkinen wrote:
> > On Mon, Dec 14, 2020 at 11:01:32AM -0800, Sean Christopherson wrote:
> > > Haitao reported the bug, and for all intents and purposes provided the fix.  I
> > > just did the analysis to verify that there was a legitimate bug and that the
> > > synchronization in sgx_encl_release() was indeed necessary.
> > 
> > Good and valid point. The way I see it, the tags should be:
> > 
> > Reported-by: Haitao Huang <haitao.huang@linux.intel.com>
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > 
> > Haitao pointed out the bug but from your analysis I could resolve that
> > this is the fix to implement, and was able to write the long
> > description for the commit.
> > 
> > Does this make sense to you?
> 
> Yep, works for me.

I'll just add two suggested-by's. Process guide does not forbid that
and it best describes matters.

/Jarkko


* [PATCH] x86/sgx: Synchronize encl->srcu in sgx_encl_release().
@ 2020-12-15 21:40 Jarkko Sakkinen
  0 siblings, 0 replies; 9+ messages in thread
From: Jarkko Sakkinen @ 2020-12-15 21:40 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, linux-sgx, Jarkko Sakkinen, Haitao Huang,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Sean Christopherson, Jethro Beekman

Add synchronize_srcu_expedited() to sgx_encl_release() to catch a grace
period initiated by sgx_mmu_notifier_release().

A trivial example of a failing sequence with tasks A and B:

1. A: -> sgx_release()
2. B: -> sgx_mmu_notifier_release()
3. B: -> list_del_rcu()
4. A: -> sgx_encl_release()
5. A: -> cleanup_srcu_struct()

The loop in sgx_release() observes an empty list because B has already
removed its entry, so A goes on to call cleanup_srcu_struct() before B has
had a chance to call synchronize_srcu().

Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Suggested-by: Haitao Huang <haitao.huang@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
---
 arch/x86/kernel/cpu/sgx/encl.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index ee50a5010277..fe7256db6e73 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -438,6 +438,12 @@ void sgx_encl_release(struct kref *ref)
 	if (encl->backing)
 		fput(encl->backing);
 
+	/*
+	 * Each sgx_mmu_notifier_release() starts a grace period. Therefore, an
+	 * additional sync is required here.
+	 */
+	synchronize_srcu_expedited(&encl->srcu);
+
 	cleanup_srcu_struct(&encl->srcu);
 
 	WARN_ON_ONCE(!list_empty(&encl->mm_list));
-- 
2.27.0
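
The failing sequence in the commit message above can be replayed in a small
userspace model. This is only a sketch under simplifying assumptions: the
toy_* types and functions are hypothetical stand-ins and not kernel APIs, a
grace period is reduced to a counter, and cleaning up while that counter is
non-zero is treated as the failure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct srcu_struct; not the kernel type. */
struct toy_srcu {
	int pending_gp;      /* grace periods started but not yet finished */
	bool cleaned_up;     /* set once toy_cleanup_srcu() has run */
	bool cleanup_raced;  /* cleanup ran while a grace period was pending */
};

/* Hypothetical stand-in for struct sgx_encl. */
struct toy_encl {
	struct toy_srcu srcu;
	int mm_list_len;     /* entries left on the mm list */
};

/* Models synchronize_srcu(): wait out every pending grace period. */
static void toy_synchronize_srcu(struct toy_srcu *s)
{
	assert(!s->cleaned_up);  /* must not be used after cleanup */
	s->pending_gp = 0;
}

/* Models cleanup_srcu_struct(): records whether it raced. */
static void toy_cleanup_srcu(struct toy_srcu *s)
{
	if (s->pending_gp > 0)
		s->cleanup_raced = true;
	s->cleaned_up = true;
}

/* Task B: unlink the entry and start a grace period that is still open. */
static void toy_mmu_notifier_release_begin(struct toy_encl *encl)
{
	encl->mm_list_len--;      /* models list_del_rcu() */
	encl->srcu.pending_gp++;  /* grace period started, not yet complete */
}

/* Task A: drain loop of the release path, then the final teardown. */
static bool toy_release(struct toy_encl *encl, bool with_fix)
{
	/* The drain loop only syncs for entries it still finds on the list. */
	while (encl->mm_list_len > 0) {
		encl->mm_list_len--;
		toy_synchronize_srcu(&encl->srcu);
	}

	if (with_fix)
		toy_synchronize_srcu(&encl->srcu);  /* the extra sync from the patch */

	toy_cleanup_srcu(&encl->srcu);
	return encl->srcu.cleanup_raced;
}

/* Replays the A/B interleaving; returns true if cleanup raced. */
bool toy_replay(bool with_fix)
{
	struct toy_encl encl = { .mm_list_len = 1 };

	toy_mmu_notifier_release_begin(&encl);  /* B removes its entry first */
	return toy_release(&encl, with_fix);    /* A then sees an empty list */
}
```

In this model toy_replay(false) reports the race and toy_replay(true) does
not, because the extra sync runs regardless of what the drain loop saw.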


