* [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
@ 2020-05-06 5:32 Leon Romanovsky
2020-05-06 14:43 ` Jason Gunthorpe
0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-06 5:32 UTC (permalink / raw)
To: Doug Ledford, Jason Gunthorpe; +Cc: Jack Morgenstein, linux-rdma
From: Jack Morgenstein <jackm@dev.mellanox.co.il>
The IB core pkey cache is populated by procedure ib_cache_update().
Initially, the pkey cache pointer is NULL. ib_cache_update allocates
a buffer and populates it with the device's pkeys, via repeated calls
to procedure ib_query_pkey().
If there is a failure in populating the pkey buffer via ib_query_pkey(),
ib_cache_update does not replace the old pkey buffer cache with the
updated one -- it leaves the old cache as is.
Since initially the pkey buffer cache is NULL, when calling
ib_cache_update the first time, a failure in ib_query_pkey() will cause
the pkey buffer cache pointer to remain NULL.
In this situation, any subsequent calls to ib_get_cached_pkey(),
ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to
dereference the NULL pkey cache pointer, causing a kernel panic.
Fix this by checking the ib_cache_update() return value.
Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
Changelog:
v1: I rewrote the patch to take care of ib_cache_update() return value.
v0: https://lore.kernel.org/linux-rdma/20200426075811.129814-1-leon@kernel.org
---
drivers/infiniband/core/cache.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 717b798cddad..1cbebfa374a5 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
 	if (err)
 		return err;
 
-	rdma_for_each_port (device, p)
-		ib_cache_update(device, p, true);
+	rdma_for_each_port (device, p) {
+		err = ib_cache_update(device, p, true);
+		if (err)
+			goto out;
+	}
 
 	return 0;
+
+out:
+	ib_cache_release_one(device);
+	return err;
 }
 
 void ib_cache_release_one(struct ib_device *device)
--
2.26.2
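The failure mode described in the commit message can be reproduced in a small userspace model. The sketch below is not the kernel code: struct pkey_cache, struct port, query_pkey(), cache_update(), cache_setup() and get_cached_pkey() are hypothetical stand-ins for the real structures and for ib_query_pkey(), ib_cache_update(), ib_cache_setup_one() and ib_get_cached_pkey(). It assumes only what the message states: a failed first update leaves the cache pointer NULL, so setup has to report the error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct pkey_cache {
	int table_len;
	unsigned short table[4];
};

struct port {
	struct pkey_cache *pkey_cache;	/* NULL until the first successful update */
};

/* Stand-in for ib_query_pkey(); 'fail' simulates a device error. */
static int query_pkey(int index, int fail, unsigned short *pkey)
{
	if (fail)
		return -EIO;
	*pkey = (unsigned short)(0x8000 | index);
	return 0;
}

/* Build a new cache and install it only if every query succeeds. */
static int cache_update(struct port *port, int fail)
{
	struct pkey_cache *new_cache = calloc(1, sizeof(*new_cache));
	int i, ret;

	if (!new_cache)
		return -ENOMEM;
	new_cache->table_len = 4;
	for (i = 0; i < new_cache->table_len; i++) {
		ret = query_pkey(i, fail, &new_cache->table[i]);
		if (ret) {
			free(new_cache);	/* the old cache (possibly NULL) is kept */
			return ret;
		}
	}
	free(port->pkey_cache);
	port->pkey_cache = new_cache;
	return 0;
}

/* Setup that propagates the update error instead of ignoring it. */
static int cache_setup(struct port *port, int fail)
{
	return cache_update(port, fail);
}

/* Reader that would dereference NULL if a setup failure had been ignored. */
static int get_cached_pkey(const struct port *port, int index,
			   unsigned short *pkey)
{
	assert(port->pkey_cache != NULL);	/* holds only after a successful setup */
	if (index < 0 || index >= port->pkey_cache->table_len)
		return -EINVAL;
	*pkey = port->pkey_cache->table[index];
	return 0;
}
```

A caller that checks cache_setup() never reaches get_cached_pkey() with a NULL cache. A caller that ignores the return value, as the code did before the patch, can.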
* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
2020-05-06 5:32 [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache Leon Romanovsky
@ 2020-05-06 14:43 ` Jason Gunthorpe
2020-05-06 16:56 ` Leon Romanovsky
0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2020-05-06 14:43 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma
On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
>
> The IB core pkey cache is populated by procedure ib_cache_update().
> Initially, the pkey cache pointer is NULL. ib_cache_update allocates
> a buffer and populates it with the device's pkeys, via repeated calls
> to procedure ib_query_pkey().
>
> If there is a failure in populating the pkey buffer via ib_query_pkey(),
> ib_cache_update does not replace the old pkey buffer cache with the
> updated one -- it leaves the old cache as is.
>
> Since initially the pkey buffer cache is NULL, when calling
> ib_cache_update the first time, a failure in ib_query_pkey() will cause
> the pkey buffer cache pointer to remain NULL.
>
> In this situation, any calls subsequent to ib_get_cached_pkey(),
> ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to
> dereference the NULL pkey cache pointer, causing a kernel panic.
>
> Fix this by checking the ib_cache_update() return value.
>
> Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data")
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
> Changelog:
> v1: I rewrote the patch to take care of ib_cache_update() return value.
> v0: https://lore.kernel.org/linux-rdma/20200426075811.129814-1-leon@kernel.org
> ---
> drivers/infiniband/core/cache.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> --
> 2.26.2
>
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index 717b798cddad..1cbebfa374a5 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
> if (err)
> return err;
>
> - rdma_for_each_port (device, p)
> - ib_cache_update(device, p, true);
> + rdma_for_each_port (device, p) {
> + err = ib_cache_update(device, p, true);
> + if (err)
> + goto out;
> + }
>
> return 0;
> +
> +out:
> + ib_cache_release_one(device);
> + return err;
ib_cache_release_one() can be called only once, and it is always called
by ib_device_release(), so it should not be called here
Jason
* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
2020-05-06 14:43 ` Jason Gunthorpe
@ 2020-05-06 16:56 ` Leon Romanovsky
2020-05-06 18:09 ` Jason Gunthorpe
0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-06 16:56 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma
On Wed, May 06, 2020 at 11:43:44AM -0300, Jason Gunthorpe wrote:
> On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> > From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> >
> > The IB core pkey cache is populated by procedure ib_cache_update().
> > Initially, the pkey cache pointer is NULL. ib_cache_update allocates
> > a buffer and populates it with the device's pkeys, via repeated calls
> > to procedure ib_query_pkey().
> >
> > If there is a failure in populating the pkey buffer via ib_query_pkey(),
> > ib_cache_update does not replace the old pkey buffer cache with the
> > updated one -- it leaves the old cache as is.
> >
> > Since initially the pkey buffer cache is NULL, when calling
> > ib_cache_update the first time, a failure in ib_query_pkey() will cause
> > the pkey buffer cache pointer to remain NULL.
> >
> > In this situation, any calls subsequent to ib_get_cached_pkey(),
> > ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to
> > dereference the NULL pkey cache pointer, causing a kernel panic.
> >
> > Fix this by checking the ib_cache_update() return value.
> >
> > Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data")
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > ---
> > Changelog:
> > v1: I rewrote the patch to take care of ib_cache_update() return value.
> > v0: https://lore.kernel.org/linux-rdma/20200426075811.129814-1-leon@kernel.org
> > ---
> > drivers/infiniband/core/cache.c | 11 +++++++++--
> > 1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > --
> > 2.26.2
> >
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index 717b798cddad..1cbebfa374a5 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
> > if (err)
> > return err;
> >
> > - rdma_for_each_port (device, p)
> > - ib_cache_update(device, p, true);
> > + rdma_for_each_port (device, p) {
> > + err = ib_cache_update(device, p, true);
> > + if (err)
> > + goto out;
> > + }
> >
> > return 0;
> > +
> > +out:
> > + ib_cache_release_one(device);
> > + return err;
>
> ib_cache_release_one() can be called only once, and it is always called
> by ib_device_release(), so it should not be called here
It doesn't sound right if we rely on ib_device_release() to unwind an error
in ib_cache_setup_one(). I don't think that we need to return from
ib_cache_setup_one() without cleaning it up.
Thanks
>
> Jason
* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
2020-05-06 16:56 ` Leon Romanovsky
@ 2020-05-06 18:09 ` Jason Gunthorpe
2020-05-06 18:31 ` Leon Romanovsky
2020-05-06 18:41 ` jackm
0 siblings, 2 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2020-05-06 18:09 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma
On Wed, May 06, 2020 at 07:56:08PM +0300, Leon Romanovsky wrote:
> On Wed, May 06, 2020 at 11:43:44AM -0300, Jason Gunthorpe wrote:
> > On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> > > From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > >
> > > The IB core pkey cache is populated by procedure ib_cache_update().
> > > Initially, the pkey cache pointer is NULL. ib_cache_update allocates
> > > a buffer and populates it with the device's pkeys, via repeated calls
> > > to procedure ib_query_pkey().
> > >
> > > If there is a failure in populating the pkey buffer via ib_query_pkey(),
> > > ib_cache_update does not replace the old pkey buffer cache with the
> > > updated one -- it leaves the old cache as is.
> > >
> > > Since initially the pkey buffer cache is NULL, when calling
> > > ib_cache_update the first time, a failure in ib_query_pkey() will cause
> > > the pkey buffer cache pointer to remain NULL.
> > >
> > > In this situation, any calls subsequent to ib_get_cached_pkey(),
> > > ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to
> > > dereference the NULL pkey cache pointer, causing a kernel panic.
> > >
> > > Fix this by checking the ib_cache_update() return value.
> > >
> > > Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data")
> > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > > Changelog:
> > > v1: I rewrote the patch to take care of ib_cache_update() return value.
> > > v0: https://lore.kernel.org/linux-rdma/20200426075811.129814-1-leon@kernel.org
> > > drivers/infiniband/core/cache.c | 11 +++++++++--
> > > 1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > >
> > > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > > index 717b798cddad..1cbebfa374a5 100644
> > > +++ b/drivers/infiniband/core/cache.c
> > > @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
> > > if (err)
> > > return err;
> > >
> > > - rdma_for_each_port (device, p)
> > > - ib_cache_update(device, p, true);
> > > + rdma_for_each_port (device, p) {
> > > + err = ib_cache_update(device, p, true);
> > > + if (err)
> > > + goto out;
> > > + }
> > >
> > > return 0;
> > > +
> > > +out:
> > > + ib_cache_release_one(device);
> > > + return err;
> >
> > > ib_cache_release_one() can be called only once, and it is always called
> > > by ib_device_release(), so it should not be called here
>
> It doesn't sound right if we rely on ib_device_release() to unwind error
> in ib_cache_setup_one(). I don't think that we need to return from
> ib_cache_setup_one() without cleaning it.
We do, as ib_cache_release_one() cannot be called multiple times.
The general design of all this pre-registration stuff is that the
release function does the cleanup, and the individual functions should
not, on error, unwind anything that the unconditional release cleans up.
Other schemes were too complicated.
Jason
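The design described above can be sketched in userspace. This is only a model under stated assumptions: struct device, its fields, and run_device_lifecycle() are hypothetical, and cache_setup_one(), cache_release_one() and device_release() merely mirror the roles of ib_cache_setup_one(), ib_cache_release_one() and ib_device_release(). The point it illustrates is that setup returns an error without freeing, and the single unconditional release frees on every path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a device; field names are illustrative only. */
struct device {
	int *gid_table;		/* allocated during setup */
	int release_calls;	/* counts how often release ran */
};

/* Setup: on failure, return the error and leave cleanup to release. */
static int cache_setup_one(struct device *dev, int fail_update)
{
	dev->gid_table = malloc(16 * sizeof(*dev->gid_table));
	if (!dev->gid_table)
		return -ENOMEM;
	if (fail_update)
		return -EIO;	/* no unwind here: release owns the free */
	return 0;
}

/* Release: not idempotent in this model, so it must run exactly once. */
static void cache_release_one(struct device *dev)
{
	assert(dev->release_calls == 0);	/* a second call would double free */
	free(dev->gid_table);
	dev->release_calls++;
}

/* Unconditional teardown, reached on success and on setup failure alike. */
static void device_release(struct device *dev)
{
	cache_release_one(dev);
}

/* Whole lifecycle: any setup error falls through to the single release. */
static int run_device_lifecycle(struct device *dev, int fail_update)
{
	int err = cache_setup_one(dev, fail_update);

	/* On success the device would be registered and used here. */

	device_release(dev);
	return err;
}
```

If cache_setup_one() also called cache_release_one() on its error path, the later call from device_release() would trip the assertion, which stands in for the double free in this model.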
* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
2020-05-06 18:09 ` Jason Gunthorpe
@ 2020-05-06 18:31 ` Leon Romanovsky
2020-05-06 18:41 ` jackm
1 sibling, 0 replies; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-06 18:31 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Doug Ledford, Jack Morgenstein, linux-rdma
On Wed, May 06, 2020 at 03:09:36PM -0300, Jason Gunthorpe wrote:
> On Wed, May 06, 2020 at 07:56:08PM +0300, Leon Romanovsky wrote:
> > On Wed, May 06, 2020 at 11:43:44AM -0300, Jason Gunthorpe wrote:
> > > On Wed, May 06, 2020 at 08:32:13AM +0300, Leon Romanovsky wrote:
> > > > From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > > >
> > > > The IB core pkey cache is populated by procedure ib_cache_update().
> > > > Initially, the pkey cache pointer is NULL. ib_cache_update allocates
> > > > a buffer and populates it with the device's pkeys, via repeated calls
> > > > to procedure ib_query_pkey().
> > > >
> > > > If there is a failure in populating the pkey buffer via ib_query_pkey(),
> > > > ib_cache_update does not replace the old pkey buffer cache with the
> > > > updated one -- it leaves the old cache as is.
> > > >
> > > > Since initially the pkey buffer cache is NULL, when calling
> > > > ib_cache_update the first time, a failure in ib_query_pkey() will cause
> > > > the pkey buffer cache pointer to remain NULL.
> > > >
> > > > In this situation, any calls subsequent to ib_get_cached_pkey(),
> > > > ib_find_cached_pkey(), or ib_find_cached_pkey_exact() will try to
> > > > dereference the NULL pkey cache pointer, causing a kernel panic.
> > > >
> > > > Fix this by checking the ib_cache_update() return value.
> > > >
> > > > Fixes: 8faea9fd4a39 ("RDMA/cache: Move the cache per-port data into the main ib_port_data")
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > > > Changelog:
> > > > v1: I rewrote the patch to take care of ib_cache_update() return value.
> > > > v0: https://lore.kernel.org/linux-rdma/20200426075811.129814-1-leon@kernel.org
> > > > drivers/infiniband/core/cache.c | 11 +++++++++--
> > > > 1 file changed, 9 insertions(+), 2 deletions(-)
> > > >
> > > >
> > > > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > > > index 717b798cddad..1cbebfa374a5 100644
> > > > +++ b/drivers/infiniband/core/cache.c
> > > > @@ -1553,10 +1553,17 @@ int ib_cache_setup_one(struct ib_device *device)
> > > > if (err)
> > > > return err;
> > > >
> > > > - rdma_for_each_port (device, p)
> > > > - ib_cache_update(device, p, true);
> > > > + rdma_for_each_port (device, p) {
> > > > + err = ib_cache_update(device, p, true);
> > > > + if (err)
> > > > + goto out;
> > > > + }
> > > >
> > > > return 0;
> > > > +
> > > > +out:
> > > > + ib_cache_release_one(device);
> > > > + return err;
> > >
> > > > ib_cache_release_one() can be called only once, and it is always called
> > > > by ib_device_release(), so it should not be called here
> >
> > It doesn't sound right if we rely on ib_device_release() to unwind error
> > in ib_cache_setup_one(). I don't think that we need to return from
> > ib_cache_setup_one() without cleaning it.
>
> We do as ib_cache_release_one() cannot be called multiple times
Do you want me to respin?
>
> The general design of all this pre-registration stuff is that the
> release function does the clean up and the individual functions should
> not error unwind cleanup done in the unconditional release.
>
> Other schemes were too complicated
It doesn't mean that it is right :)
Thanks
>
> Jason
* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
2020-05-06 18:09 ` Jason Gunthorpe
2020-05-06 18:31 ` Leon Romanovsky
@ 2020-05-06 18:41 ` jackm
2020-05-06 18:57 ` Jason Gunthorpe
1 sibling, 1 reply; 8+ messages in thread
From: jackm @ 2020-05-06 18:41 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leon Romanovsky, Doug Ledford, linux-rdma, ferasda, mohammadkab, moshet
On Wed, 6 May 2020 15:09:36 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > > +out:
> > > > + ib_cache_release_one(device);
> > > > + return err;
> > >
> > > > ib_cache_release_one() can be called only once, and it is always
> > > > called by ib_device_release(), so it should not be called here
> >
> > It doesn't sound right if we rely on ib_device_release() to unwind
> > error in ib_cache_setup_one(). I don't think that we need to return
> > from ib_cache_setup_one() without cleaning it.
>
> We do as ib_cache_release_one() cannot be called multiple times
>
> The general design of all this pre-registration stuff is that the
> release function does the clean up and the individual functions should
> not error unwind cleanup done in the unconditional release.
>
> Other schemes were too complicated
>
> Jason
What about calling gid_table_release_one(device) instead of
ib_cache_release_one(device) in the error flow ?
gid_table_release_one() calls gid_table_release -- which frees the
gid table and sets its pointer to NULL.
Then, if ib_cache_release_one is called later, gid_table_release_one()
will simply do nothing (it calls gid_table_release, which returns
without doing anything if the table pointer argument is NULL -- which
it will be).
Thus, unlike ib_cache_release_one() -- gid_table_release_one() is
callable multiple times.
This also has the advantage of unwinding the gid_table_setup_one() in
the ib_cache_setup_one() error flow -- which is a symmetric unwind.
-Jack
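The property relied on here, a release that frees the table and then sets its pointer to NULL so that a second call does nothing, can be shown with a short userspace sketch. The names struct gid_table, struct port, table_setup() and table_release() are hypothetical stand-ins, not the kernel's gid_table_release_one() or gid_table_release().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's GID table structures. */
struct gid_table {
	int entries[16];
};

struct port {
	struct gid_table *table;
	int frees;	/* counts real frees, for demonstration only */
};

/* Allocate the table; returns 0 on success, -1 on allocation failure. */
static int table_setup(struct port *port)
{
	assert(port->table == NULL);	/* setup expects a fresh port */
	port->table = calloc(1, sizeof(*port->table));
	return port->table ? 0 : -1;
}

/* Idempotent release: safe to call any number of times. */
static void table_release(struct port *port)
{
	if (!port->table)
		return;		/* already released: nothing to do */
	free(port->table);
	port->table = NULL;	/* makes a second call a no-op */
	port->frees++;
}
```

Calling table_release() from an error path and again from a final teardown frees the memory once. Whether that fits the design of the surrounding code is a separate question.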
* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
2020-05-06 18:41 ` jackm
@ 2020-05-06 18:57 ` Jason Gunthorpe
2020-05-07 5:58 ` Leon Romanovsky
0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2020-05-06 18:57 UTC (permalink / raw)
To: jackm
Cc: Leon Romanovsky, Doug Ledford, linux-rdma, ferasda, mohammadkab, moshet
On Wed, May 06, 2020 at 09:41:23PM +0300, jackm wrote:
> On Wed, 6 May 2020 15:09:36 -0300
> Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> > > > > +out:
> > > > > + ib_cache_release_one(device);
> > > > > + return err;
> > > >
> > > > > ib_cache_release_one() can be called only once, and it is always
> > > > > called by ib_device_release(), so it should not be called here
> > >
> > > It doesn't sound right if we rely on ib_device_release() to unwind
> > > error in ib_cache_setup_one(). I don't think that we need to return
> > > from ib_cache_setup_one() without cleaning it.
> >
> > We do as ib_cache_release_one() cannot be called multiple times
> >
> > The general design of all this pre-registration stuff is that the
> > release function does the clean up and the individual functions should
> > not error unwind cleanup done in the unconditional release.
> >
> > Other schemes were too complicated
> >
> > Jason
>
> What about calling gid_table_release_one(device) instead of
> ib_cache_release_one(device) in the error flow ?
Why?
That is not the design; everything that is freed by release is deferred
to release, even on error paths.
Jason
* Re: [PATCH rdma-rc v1] IB/core: Fix potential NULL pointer dereference in pkey cache
2020-05-06 18:57 ` Jason Gunthorpe
@ 2020-05-07 5:58 ` Leon Romanovsky
0 siblings, 0 replies; 8+ messages in thread
From: Leon Romanovsky @ 2020-05-07 5:58 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: jackm, Doug Ledford, linux-rdma, ferasda, mohammadkab, moshet
On Wed, May 06, 2020 at 03:57:37PM -0300, Jason Gunthorpe wrote:
> On Wed, May 06, 2020 at 09:41:23PM +0300, jackm wrote:
> > On Wed, 6 May 2020 15:09:36 -0300
> > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > > > > > +out:
> > > > > > + ib_cache_release_one(device);
> > > > > > + return err;
> > > > >
> > > > > > ib_cache_release_one() can be called only once, and it is always
> > > > > > called by ib_device_release(), so it should not be called here
> > > >
> > > > It doesn't sound right if we rely on ib_device_release() to unwind
> > > > error in ib_cache_setup_one(). I don't think that we need to return
> > > > from ib_cache_setup_one() without cleaning it.
> > >
> > > We do as ib_cache_release_one() cannot be called multiple times
> > >
> > > The general design of all this pre-registration stuff is that the
> > > release function does the clean up and the individual functions should
> > > not error unwind cleanup done in the unconditional release.
> > >
> > > Other schemes were too complicated
> > >
> > > Jason
> >
> > What about calling gid_table_release_one(device) instead of
> > ib_cache_release_one(device) in the error flow ?
>
> Why?
Because it doesn't look clean.
>
> That is not the design; everything that is freed by release is deferred
> to release, even on error paths.
I'll resend now.
Thanks
>
> Jason