linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe()
@ 2018-12-16 21:28 Greg Kurz
  2018-12-17  0:00 ` Andrew Donnellan
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Greg Kurz @ 2018-12-16 21:28 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Christophe Lombard, Vaibhav Jain, stable, Frederic Barrat,
	Andrew Donnellan, Alastair D'Silva

All fields in the PE are big-endian. Use cpu_to_be32() like everywhere
else something is written to the PE. Otherwise a wrong TID will be used
by the NPU. If this TID happens to point to an existing thread sharing
the same mm, it could be woken up by error. This is highly improbable
though. The likely outcome of this is the NPU not finding the target
thread and forcing the AFU into sending an interrupt, which userspace
is supposed to handle anyway.

Fixes: e948e06fc63a ("ocxl: Expose the thread_id needed for wait on POWER9")
Cc: stable@vger.kernel.org      # v4.18
Signed-off-by: Greg Kurz <groug@kaod.org>
---

This bug remained unnoticed so far because the current OCXL test suite
happens to call OCXL_IOCTL_ENABLE_P9_WAIT before attaching a context.
This causes ocxl_link_update_pe() to be called before ocxl_link_add_pe()
which re-writes the TID in the PE with the appropriate endianness.

I have some patches that change the behavior of the OCXL test suite so that
it can catch the issue:

https://github.com/gkurz/libocxl/commits/wake-host-thread-rework
---
 drivers/misc/ocxl/link.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index 31695a078485..646d16450066 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -566,7 +566,7 @@ int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid)
 
 	mutex_lock(&spa->spa_lock);
 
-	pe->tid = tid;
+	pe->tid = cpu_to_be32(tid);
 
 	/*
 	 * The barrier makes sure the PE is updated


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe()
  2018-12-16 21:28 [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe() Greg Kurz
@ 2018-12-17  0:00 ` Andrew Donnellan
  2018-12-17  0:38 ` Alastair D'Silva
  2018-12-22  9:54 ` Michael Ellerman
  2 siblings, 0 replies; 5+ messages in thread
From: Andrew Donnellan @ 2018-12-17  0:00 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev
  Cc: Christophe Lombard, Vaibhav Jain, stable, Frederic Barrat,
	Alastair D'Silva

On 17/12/18 8:28 am, Greg Kurz wrote:
> All fields in the PE are big-endian. Use cpu_to_be32() like everywhere
> else something is written to the PE. Otherwise a wrong TID will be used
> by the NPU. If this TID happens to point to an existing thread sharing
> the same mm, it could be woken up by error. This is highly improbable
> though. The likely outcome of this is the NPU not finding the target
> thread and forcing the AFU into sending an interrupt, which userspace
> is supposed to handle anyway.
> 
> Fixes: e948e06fc63a ("ocxl: Expose the thread_id needed for wait on POWER9")
> Cc: stable@vger.kernel.org      # v4.18
> Signed-off-by: Greg Kurz <groug@kaod.org>

If only we read our sparse warnings which would have told us this. One 
warning down :)

Thanks for finding this!

Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
> 
> This bug remained unnoticed so far because the current OCXL test suite
> happens to call OCXL_IOCTL_ENABLE_P9_WAIT before attaching a context.
> This causes ocxl_link_update_pe() to be called before ocxl_link_add_pe()
> which re-writes the TID in the PE with the appropriate endianness.
> 
> I have some patches that change the behavior of the OCXL test suite so that
> it can catch the issue:
> 
> https://github.com/gkurz/libocxl/commits/wake-host-thread-rework
> ---
>   drivers/misc/ocxl/link.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index 31695a078485..646d16450066 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -566,7 +566,7 @@ int ocxl_link_update_pe(void *link_handle, int pasid, __u16 tid)
>   
>   	mutex_lock(&spa->spa_lock);
>   
> -	pe->tid = tid;
> +	pe->tid = cpu_to_be32(tid);
>   
>   	/*
>   	 * The barrier makes sure the PE is updated
> 
> 

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe()
  2018-12-16 21:28 [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe() Greg Kurz
  2018-12-17  0:00 ` Andrew Donnellan
@ 2018-12-17  0:38 ` Alastair D'Silva
  2018-12-20 14:53   ` Greg Kurz
  2018-12-22  9:54 ` Michael Ellerman
  2 siblings, 1 reply; 5+ messages in thread
From: Alastair D'Silva @ 2018-12-17  0:38 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev
  Cc: Christophe Lombard, Vaibhav Jain, stable, Frederic Barrat,
	Andrew Donnellan

On Sun, 2018-12-16 at 22:28 +0100, Greg Kurz wrote:
> All fields in the PE are big-endian. Use cpu_to_be32() like
> everywhere
> else something is written to the PE. Otherwise a wrong TID will be
> used
> by the NPU. If this TID happens to point to an existing thread
> sharing
> the same mm, it could be woken up by error. This is highly improbable
> though. The likely outcome of this is the NPU not finding the target
> thread and forcing the AFU into sending an interrupt, which userspace
> is supposed to handle anyway.
> 
> Fixes: e948e06fc63a ("ocxl: Expose the thread_id needed for wait on
> POWER9")
> Cc: stable@vger.kernel.org      # v4.18
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> 
> This bug remained unnoticed so far because the current OCXL test
> suite
> happens to call OCXL_IOCTL_ENABLE_P9_WAIT before attaching a context.
> This causes ocxl_link_update_pe() to be called before
> ocxl_link_add_pe()
> which re-writes the TID in the PE with the appropriate endianness.
> 
> I have some patches that change the behavior of the OCXL test suite
> so that
> it can catch the issue:
> 
> https://github.com/gkurz/libocxl/commits/wake-host-thread-rework
> ---
>  drivers/misc/ocxl/link.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index 31695a078485..646d16450066 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -566,7 +566,7 @@ int ocxl_link_update_pe(void *link_handle, int
> pasid, __u16 tid)
>  
>  	mutex_lock(&spa->spa_lock);
>  
> -	pe->tid = tid;
> +	pe->tid = cpu_to_be32(tid);
>  
>  	/*
>  	 * The barrier makes sure the PE is updated
> 

Good catch, thanks.

Reviewed-by: Alastair D'Silva <alastair@d-silva.org>

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe()
  2018-12-17  0:38 ` Alastair D'Silva
@ 2018-12-20 14:53   ` Greg Kurz
  0 siblings, 0 replies; 5+ messages in thread
From: Greg Kurz @ 2018-12-20 14:53 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Christophe Lombard, Vaibhav Jain, Frederic Barrat, stable,
	Andrew Donnellan, linuxppc-dev

On Mon, 17 Dec 2018 11:38:51 +1100
"Alastair D'Silva" <alastair@au1.ibm.com> wrote:

> On Sun, 2018-12-16 at 22:28 +0100, Greg Kurz wrote:
> > All fields in the PE are big-endian. Use cpu_to_be32() like
> > everywhere
> > else something is written to the PE. Otherwise a wrong TID will be
> > used
> > by the NPU. If this TID happens to point to an existing thread
> > sharing
> > the same mm, it could be woken up by error. This is highly improbable
> > though. The likely outcome of this is the NPU not finding the target
> > thread and forcing the AFU into sending an interrupt, which userspace
> > is supposed to handle anyway.
> > 
> > Fixes: e948e06fc63a ("ocxl: Expose the thread_id needed for wait on
> > POWER9")
> > Cc: stable@vger.kernel.org      # v4.18
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> > 
> > This bug remained unnoticed so far because the current OCXL test
> > suite
> > happens to call OCXL_IOCTL_ENABLE_P9_WAIT before attaching a context.
> > This causes ocxl_link_update_pe() to be called before
> > ocxl_link_add_pe()
> > which re-writes the TID in the PE with the appropriate endianness.
> > 
> > I have some patches that change the behavior of the OCXL test suite
> > so that
> > it can catch the issue:
> > 
> > https://github.com/gkurz/libocxl/commits/wake-host-thread-rework
> > ---
> >  drivers/misc/ocxl/link.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> > index 31695a078485..646d16450066 100644
> > --- a/drivers/misc/ocxl/link.c
> > +++ b/drivers/misc/ocxl/link.c
> > @@ -566,7 +566,7 @@ int ocxl_link_update_pe(void *link_handle, int
> > pasid, __u16 tid)
> >  
> >  	mutex_lock(&spa->spa_lock);
> >  
> > -	pe->tid = tid;
> > +	pe->tid = cpu_to_be32(tid);
> >  
> >  	/*
> >  	 * The barrier makes sure the PE is updated
> >   
> 
> Good catch, thanks.
> 
> Reviewed-by: Alastair D'Silva <alastair@d-silva.org>
> 

Friendly ping before Xmas break :)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ocxl: Fix endiannes bug in ocxl_link_update_pe()
  2018-12-16 21:28 [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe() Greg Kurz
  2018-12-17  0:00 ` Andrew Donnellan
  2018-12-17  0:38 ` Alastair D'Silva
@ 2018-12-22  9:54 ` Michael Ellerman
  2 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2018-12-22  9:54 UTC (permalink / raw)
  To: Greg Kurz, linuxppc-dev
  Cc: Christophe Lombard, Vaibhav Jain, Frederic Barrat, stable,
	Andrew Donnellan, Alastair D'Silva

On Sun, 2018-12-16 at 21:28:50 UTC, Greg Kurz wrote:
> All fields in the PE are big-endian. Use cpu_to_be32() like everywhere
> else something is written to the PE. Otherwise a wrong TID will be used
> by the NPU. If this TID happens to point to an existing thread sharing
> the same mm, it could be woken up by error. This is highly improbable
> though. The likely outcome of this is the NPU not finding the target
> thread and forcing the AFU into sending an interrupt, which userspace
> is supposed to handle anyway.
> 
> Fixes: e948e06fc63a ("ocxl: Expose the thread_id needed for wait on POWER9")
> Cc: stable@vger.kernel.org      # v4.18
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Reviewed-by: Alastair D'Silva <alastair@d-silva.org>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e1e71e201703500f708bdeaf64660a

cheers

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-12-22 10:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-16 21:28 [PATCH] ocxl: Fix endiannes bug in ocxl_link_update_pe() Greg Kurz
2018-12-17  0:00 ` Andrew Donnellan
2018-12-17  0:38 ` Alastair D'Silva
2018-12-20 14:53   ` Greg Kurz
2018-12-22  9:54 ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).