* [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
@ 2020-05-06  5:32 Leon Romanovsky
  2020-05-06 14:43 ` Jason Gunthorpe
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-06  5:32 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Jack Morgenstein, linux-rdma

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

The IB core pkey cache is populated by ib_cache_update().
Initially, the pkey cache pointer is NULL. ib_cache_update() allocates
a buffer and populates it with the device's pkeys, via repeated calls
to ib_query_pkey().

If there is a failure in populating the pkey buffer via ib_query_pkey(),
ib_cache_update() does not replace the old pkey buffer cache with the
updated one -- it leaves the old cache as is.

Since initially the pkey buffer cache is NULL, when calling
ib_cache_update() the first time, a failure in ib_query_pkey() will cause
the pkey buffer cache pointer to remain NULL.

In this situation, any subsequent calls to ib_get_cached_pkey(),
ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to
dereference the NULL pkey cache pointer, causing a kernel panic.

Fix this by checking the ib_cache_update() return value.

Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
Changelog:
v1: I rewrote the patch to take care of ib_cache_update() return value.
v0: https://lore.kernel.org/linux-rdma/20200426075811.129814-1-leon@kernel.org
---
 drivers/infiniband/core/cache.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 717b798cddad..1cbebfa374a5 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
 	if (err)
 		return err;

-	rdma_for_each_port (device, p)
-		ib_cache_update(device, p, true);
+	rdma_for_each_port (device, p) {
+		err = ib_cache_update(device, p, true);
+		if (err)
+			goto out;
+	}

 	return 0;
+
+out:
+	ib_cache_release_one(device);
+	return err;
 }

 void ib_cache_release_one(struct ib_device *device)
--
2.26.2
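
The failure mode in the commit message can be illustrated with a small
user-space sketch. This is not the kernel code: struct port,
struct pkey_cache, query_pkey(), cache_update() and cache_setup() are
hypothetical stand-ins, and the query_fails field is only a knob for
forcing the simulated query to fail. The sketch shows that a failed first
update leaves the cache pointer NULL, and that propagating the return
value lets the caller see the failure instead of continuing with a NULL
cache.

```c
#include <stddef.h>
#include <stdlib.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct pkey_cache {
	int table_len;
	unsigned short table[4];
};

struct port {
	struct pkey_cache *pkey;	/* NULL until the first successful update */
	int query_fails;		/* test knob: make the simulated query fail */
};

/* Simulated ib_query_pkey(): fails when the knob is set. */
static int query_pkey(const struct port *p, int index, unsigned short *pkey)
{
	if (p->query_fails)
		return -EIO;
	*pkey = (unsigned short)(0x8000 | index);
	return 0;
}

/* Simulated ib_cache_update(): on failure the old cache is left as is. */
static int cache_update(struct port *p)
{
	struct pkey_cache *new_cache = calloc(1, sizeof(*new_cache));
	int i, err;

	if (!new_cache)
		return -ENOMEM;
	new_cache->table_len = 4;
	for (i = 0; i < new_cache->table_len; i++) {
		err = query_pkey(p, i, &new_cache->table[i]);
		if (err) {
			free(new_cache);
			return err;	/* on the first call p->pkey stays NULL */
		}
	}
	free(p->pkey);
	p->pkey = new_cache;
	return 0;
}

/* Simulated setup: propagate the error instead of ignoring it. */
static int cache_setup(struct port *ports, int nports)
{
	int i, err;

	for (i = 0; i < nports; i++) {
		err = cache_update(&ports[i]);
		if (err)
			return err;
	}
	return 0;
}
```

With query_fails set, cache_setup() returns -EIO and the pkey pointer is
still NULL, so a caller that ignored the return value would go on to
dereference it.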



* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
  2020-05-06  5:32 [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache Leon Romanovsky
@ 2020-05-06 14:43 ` Jason Gunthorpe
  2020-05-06 16:56   ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2020-05-06 14:43 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma

On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
>  	if (err)
>  		return err;
> 
> -	rdma_for_each_port (device, p)
> -		ib_cache_update(device, p, true);
> +	rdma_for_each_port (device, p) {
> +		err = ib_cache_update(device, p, true);
> +		if (err)
> +			goto out;
> +	}
> 
>  	return 0;
> +
> +out:
> +	ib_cache_release_one(device);
> +	return err;

ib_cache_release_one() can be called only once, and it is always called
by ib_device_release(), so it should not be called here.

Jason


* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
  2020-05-06 14:43 ` Jason Gunthorpe
@ 2020-05-06 16:56   ` Leon Romanovsky
  2020-05-06 18:09     ` Jason Gunthorpe
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-06 16:56 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma

On Wed, May 06, 2020 at 11:43:44AM -0300, Jason Gunthorpe wrote:
> On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> > @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
> >  	if (err)
> >  		return err;
> >
> > -	rdma_for_each_port (device, p)
> > -		ib_cache_update(device, p, true);
> > +	rdma_for_each_port (device, p) {
> > +		err = ib_cache_update(device, p, true);
> > +		if (err)
> > +			goto out;
> > +	}
> >
> >  	return 0;
> > +
> > +out:
> > +	ib_cache_release_one(device);
> > +	return err;
>
> ib_cache_release_one can be called only once, and it is always called
> by ib_device_release(), it should not be called here

It doesn't sound right to rely on ib_device_release() to unwind an error
in ib_cache_setup_one(). I don't think that we need to return from
ib_cache_setup_one() without cleaning up.

Thanks

>
> Jason


* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
  2020-05-06 16:56   ` Leon Romanovsky
@ 2020-05-06 18:09     ` Jason Gunthorpe
  2020-05-06 18:31       ` Leon Romanovsky
  2020-05-06 18:41       ` jackm
  0 siblings, 2 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2020-05-06 18:09 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma

On Wed, May 06, 2020 at 07:56:08PM +0300, Leon Romanovsky wrote:
> On Wed, May 06, 2020 at 11:43:44AM -0300, Jason Gunthorpe wrote:
> > On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> > > @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
> > >  	if (err)
> > >  		return err;
> > >
> > > -	rdma_for_each_port (device, p)
> > > -		ib_cache_update(device, p, true);
> > > +	rdma_for_each_port (device, p) {
> > > +		err = ib_cache_update(device, p, true);
> > > +		if (err)
> > > +			goto out;
> > > +	}
> > >
> > >  	return 0;
> > > +
> > > +out:
> > > +	ib_cache_release_one(device);
> > > +	return err;
> >
> > ib_cache_release_one can be called only once, and it is always called
> > by ib_device_release(), it should not be called here
> 
> It doesn't sound right if we rely on ib_device_release() to unwind error
> in ib_cache_setup_one(). I don't think that we need to return from
> ib_cache_setup_one() without cleaning it.

We do, as ib_cache_release_one() cannot be called multiple times.

The general design of all this pre-registration stuff is that the
release function does the cleanup, and the individual functions should
not unwind on error anything that the unconditional release cleans up.

Other schemes were too complicated.

Jason
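
The design described above can be sketched in user-space C. This is an
illustration under stated assumptions, not the RDMA core code:
struct sketch_device, device_setup(), device_release() and
device_lifecycle() are hypothetical names. The setup step returns its
error without freeing anything, and a single release function runs
exactly once whether setup succeeded or failed.

```c
#include <stdlib.h>
#include <errno.h>

/* Hypothetical device with two resources that are set up in sequence. */
struct sketch_device {
	int *gid_table;
	int *pkey_cache;
	int release_calls;	/* counts how often release ran */
};

/* Setup step: on failure, return without freeing what was allocated. */
static int device_setup(struct sketch_device *dev, int fail_second_step)
{
	dev->gid_table = calloc(16, sizeof(*dev->gid_table));
	if (!dev->gid_table)
		return -ENOMEM;

	if (fail_second_step)
		return -EIO;	/* gid_table is intentionally not freed here */

	dev->pkey_cache = calloc(16, sizeof(*dev->pkey_cache));
	if (!dev->pkey_cache)
		return -ENOMEM;
	return 0;
}

/* Unconditional release: the only place where the resources are freed. */
static void device_release(struct sketch_device *dev)
{
	free(dev->pkey_cache);	/* free(NULL) is a no-op */
	free(dev->gid_table);
	dev->release_calls++;
}

/* Simulates the whole lifetime: setup, then the single release. */
static int device_lifecycle(struct sketch_device *dev, int fail_second_step)
{
	int err = device_setup(dev, fail_second_step);

	device_release(dev);	/* runs once on success and on failure */
	return err;
}
```

Because device_setup() never frees anything itself, device_release() does
not have to tolerate being called twice, which is the property the
discussion above relies on.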


* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
  2020-05-06 18:09     ` Jason Gunthorpe
@ 2020-05-06 18:31       ` Leon Romanovsky
  2020-05-06 18:41       ` jackm
  1 sibling, 0 replies; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-06 18:31 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma

On Wed, May 06, 2020 at 03:09:36PM -0300, Jason Gunthorpe wrote:
> On Wed, May 06, 2020 at 07:56:08PM +0300, Leon Romanovsky wrote:
> > On Wed, May 06, 2020 at 11:43:44AM -0300, Jason Gunthorpe wrote:
> > > On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> > > > @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
> > > >  	if (err)
> > > >  		return err;
> > > >
> > > > -	rdma_for_each_port (device, p)
> > > > -		ib_cache_update(device, p, true);
> > > > +	rdma_for_each_port (device, p) {
> > > > +		err = ib_cache_update(device, p, true);
> > > > +		if (err)
> > > > +			goto out;
> > > > +	}
> > > >
> > > >  	return 0;
> > > > +
> > > > +out:
> > > > +	ib_cache_release_one(device);
> > > > +	return err;
> > >
> > > ib_cache_release_one can be called only once, and it is always called
> > > by ib_device_release(), it should not be called here
> >
> > It doesn't sound right if we rely on ib_device_release() to unwind error
> > in ib_cache_setup_one(). I don't think that we need to return from
> > ib_cache_setup_one() without cleaning it.
>
> We do as ib_cache_release_one() cannot be called multiple times

Do you want me to respin?

>
> The general design of all this pre-registration stuff is that the
> release function does the clean up and the individual functions should
> not error unwind cleanup done in the unconditional release.
>
> Other schemes were too complicated

It doesn't mean that it is right :)

Thanks

>
> Jason


* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
  2020-05-06 18:09     ` Jason Gunthorpe
  2020-05-06 18:31       ` Leon Romanovsky
@ 2020-05-06 18:41       ` jackm
  2020-05-06 18:57         ` Jason Gunthorpe
  1 sibling, 1 reply; 8+ messages in thread
From: jackm @ 2020-05-06 18:41 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Doug Ledford, linux-rdma, ferasda, mohammadkab, moshet

On Wed, 6 May 2020 15:09:36 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> > > > +out:
> > > > +	ib_cache_release_one(device);
> > > > +	return err;  
> > >
> > > > ib_cache_release_one can be called only once, and it is always
> > > > called by ib_device_release(), it should not be called here
> > 
> > It doesn't sound right if we rely on ib_device_release() to unwind
> > error in ib_cache_setup_one(). I don't think that we need to return
> > from ib_cache_setup_one() without cleaning it.  
> 
> We do as ib_cache_release_one() cannot be called multiple times
> 
> The general design of all this pre-registration stuff is that the
> release function does the clean up and the individual functions should
> not error unwind cleanup done in the unconditional release.
> 
> Other schemes were too complicated
> 
> Jason

What about calling gid_table_release_one(device) instead of
ib_cache_release_one(device) in the error flow?

gid_table_release_one() calls gid_table_release(), which frees the
gid table and sets its pointer to NULL.

Then, if ib_cache_release_one() is called later, gid_table_release_one()
will simply do nothing (it calls gid_table_release(), which returns
without doing anything if the table pointer argument is NULL -- which
it will be).

Thus, unlike ib_cache_release_one(), gid_table_release_one() is
callable multiple times.

This also has the advantage of unwinding the gid_table_setup_one() in
the ib_cache_setup_one() error flow -- which is a symmetric unwind.

-Jack
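
The idempotent-release pattern described in this message can be sketched
as follows. This mirrors the description above rather than the actual
kernel functions: struct gid_owner, table_release() and
table_release_one() are hypothetical names, and where exactly the pointer
is cleared is an assumption of the sketch.

```c
#include <stdlib.h>

/* Hypothetical table and owner; not the kernel's structures. */
struct gid_table_sketch {
	int entries[8];
};

struct gid_owner {
	struct gid_table_sketch *table;
};

/* Returns without doing anything if the table pointer is NULL. */
static void table_release(struct gid_table_sketch *table)
{
	if (!table)
		return;
	free(table);
}

/* Frees the table and clears the pointer, so a repeat call is harmless. */
static void table_release_one(struct gid_owner *owner)
{
	table_release(owner->table);
	owner->table = NULL;
}
```

Calling table_release_one() a second time passes NULL to table_release(),
which returns immediately, so no double free occurs.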


* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
  2020-05-06 18:41       ` jackm
@ 2020-05-06 18:57         ` Jason Gunthorpe
  2020-05-07  5:58           ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2020-05-06 18:57 UTC (permalink / raw)
  To: jackm
  Cc: Leon Romanovsky, Doug Ledford, linux-rdma, ferasda, mohammadkab, moshet

On Wed, May 06, 2020 at 09:41:23PM +0300, jackm wrote:
> On Wed, 6 May 2020 15:09:36 -0300
> Jason Gunthorpe <jgg@ziepe.ca> wrote:
> 
> > > > > +out:
> > > > > +	ib_cache_release_one(device);
> > > > > +	return err;  
> > > >
> > > > > ib_cache_release_one can be called only once, and it is always
> > > > > called by ib_device_release(), it should not be called here
> > > 
> > > It doesn't sound right if we rely on ib_device_release() to unwind
> > > error in ib_cache_setup_one(). I don't think that we need to return
> > > from ib_cache_setup_one() without cleaning it.  
> > 
> > We do as ib_cache_release_one() cannot be called multiple times
> > 
> > The general design of all this pre-registration stuff is that the
> > release function does the clean up and the individual functions should
> > not error unwind cleanup done in the unconditional release.
> > 
> > Other schemes were too complicated
> > 
> > Jason
> 
> What about calling gid_table_release_one(device) instead of
> ib_cache_release_one(device) in the error flow ?

Why?

That is not the design: everything that is freed by release is deferred
to release, even on error paths.

Jason


* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
  2020-05-06 18:57         ` Jason Gunthorpe
@ 2020-05-07  5:58           ` Leon Romanovsky
  0 siblings, 0 replies; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-07  5:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: jackm, Doug Ledford, linux-rdma, ferasda, mohammadkab, moshet

On Wed, May 06, 2020 at 03:57:37PM -0300, Jason Gunthorpe wrote:
> On Wed, May 06, 2020 at 09:41:23PM +0300, jackm wrote:
> > On Wed, 6 May 2020 15:09:36 -0300
> > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > > > > > +out:
> > > > > > +	ib_cache_release_one(device);
> > > > > > +	return err;
> > > > >
> > > > > > ib_cache_release_one can be called only once, and it is always
> > > > > > called by ib_device_release(), it should not be called here
> > > >
> > > > It doesn't sound right if we rely on ib_device_release() to unwind
> > > > error in ib_cache_setup_one(). I don't think that we need to return
> > > > from ib_cache_setup_one() without cleaning it.
> > >
> > > We do as ib_cache_release_one() cannot be called multiple times
> > >
> > > The general design of all this pre-registration stuff is that the
> > > release function does the clean up and the individual functions should
> > > not error unwind cleanup done in the unconditional release.
> > >
> > > Other schemes were too complicated
> > >
> > > Jason
> >
> > What about calling gid_table_release_one(device) instead of
> > ib_cache_release_one(device) in the error flow ?
>
> Why?

Because it doesn't look clean.

>
> That is not the design, everything that is freed by release is deferred
> to release, even on error paths.

I'll resend now.

Thanks

>
> Jason

