linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
@ 2021-07-21 11:21 Dakshaja Uppalapati
  2021-07-22  7:43 ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Dakshaja Uppalapati @ 2021-07-21 11:21 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, bharat, dakshaja

Previous atomic increment decrement logic expects the atomic count
to be '0' after the final decrement. Replacing atomic count with
refcount does not allow that as refcount_dec() considers count of 1 as
underflow. Therefore fix the current refcount logic by decrementing
the refcount if one on the final deref in c4iw_destroy_cq().

Fixes: 7183451f846d (RDMA/cxgb4: Use refcount_t instead of atomic_t for reference counting")
Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
---
 drivers/infiniband/hw/cxgb4/cq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index 6c8c910..54a5b60 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -976,8 +976,7 @@ int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 	chp = to_c4iw_cq(ib_cq);
 
 	xa_erase_irq(&chp->rhp->cqs, chp->cq.cqid);
-	refcount_dec(&chp->refcnt);
-	wait_event(chp->wait, !refcount_read(&chp->refcnt));
+	wait_event(chp->wait, refcount_dec_if_one(&chp->refcnt));
 
 	ucontext = rdma_udata_to_drv_context(udata, struct c4iw_ucontext,
 					     ibucontext);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
  2021-07-21 11:21 [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs Dakshaja Uppalapati
@ 2021-07-22  7:43 ` Leon Romanovsky
  2021-07-22 12:06   ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-22  7:43 UTC (permalink / raw)
  To: Dakshaja Uppalapati; +Cc: jgg, dledford, linux-rdma, bharat

On Wed, Jul 21, 2021 at 04:51:55PM +0530, Dakshaja Uppalapati wrote:
> Previous atomic increment decrement logic expects the atomic count
> to be '0' after the final decrement. Replacing atomic count with
> refcount does not allow that as refcount_dec() considers count of 1 as
> underflow. Therefore fix the current refcount logic by decrementing
> the refcount if one on the final deref in c4iw_destroy_cq().
> 
> Fixes: 7183451f846d (RDMA/cxgb4: Use refcount_t instead of atomic_t for reference counting")
> Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
> ---
>  drivers/infiniband/hw/cxgb4/cq.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)


Thanks, 
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

We have plenty of such errors, worth to check them:
➜  kernel git:(rdma-next) git grep refcount_read drivers/infiniband/ | grep -v WARN_ON
drivers/infiniband/core/device.c:	if (!refcount_read(&ib_dev->refcount))
drivers/infiniband/core/device.c:	if (refcount_read(&device->refcount) == 0 ||
drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
drivers/infiniband/core/ucma.c:	if (refcount_read(&ctx->ref))
drivers/infiniband/hw/cxgb4/cq.c:	wait_event(chp->wait, !refcount_read(&chp->refcnt));
drivers/infiniband/hw/irdma/utils.c:			   refcount_read(&cqp_request->refcnt) == 1, 1000);
drivers/infiniband/hw/mlx5/mlx5_ib.h:	wait_event(mmkey->wait, refcount_read(&mmkey->usecount) == 0);
drivers/infiniband/hw/mlx5/mr.c:	    refcount_read(&mr->mmkey.usecount) != 0 &&

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
  2021-07-22  7:43 ` Leon Romanovsky
@ 2021-07-22 12:06   ` Jason Gunthorpe
  2021-07-22 12:57     ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2021-07-22 12:06 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Dakshaja Uppalapati, dledford, linux-rdma, bharat

On Thu, Jul 22, 2021 at 10:43:00AM +0300, Leon Romanovsky wrote:
> On Wed, Jul 21, 2021 at 04:51:55PM +0530, Dakshaja Uppalapati wrote:
> > Previous atomic increment decrement logic expects the atomic count
> > to be '0' after the final decrement. Replacing atomic count with
> > refcount does not allow that as refcount_dec() considers count of 1 as
> > underflow. Therefore fix the current refcount logic by decrementing
> > the refcount if one on the final deref in c4iw_destroy_cq().
> > 
> > Fixes: 7183451f846d (RDMA/cxgb4: Use refcount_t instead of atomic_t for reference counting")
> > Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
> > Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
> >  drivers/infiniband/hw/cxgb4/cq.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> 
> Thanks, 
> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> 
> We have plenty of such errors, worth to check them:
> ➜  kernel git:(rdma-next) git grep refcount_read drivers/infiniband/ | grep -v WARN_ON
> drivers/infiniband/core/device.c:	if (!refcount_read(&ib_dev->refcount))
> drivers/infiniband/core/device.c:	if (refcount_read(&device->refcount) == 0 ||
> drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
> drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
> drivers/infiniband/core/ucma.c:	if (refcount_read(&ctx->ref))
> drivers/infiniband/hw/cxgb4/cq.c:	wait_event(chp->wait, !refcount_read(&chp->refcnt));
> drivers/infiniband/hw/irdma/utils.c:			   refcount_read(&cqp_request->refcnt) == 1, 1000);
> drivers/infiniband/hw/mlx5/mlx5_ib.h:	wait_event(mmkey->wait, refcount_read(&mmkey->usecount) == 0);
> drivers/infiniband/hw/mlx5/mr.c:	    refcount_read(&mr->mmkey.usecount) != 0 &&

It isn't the read that is the problem, it is the naked dec.

This common pattern is just being done in an obtuse and arguably wrong
way

It is supposed to look like this:

void ib_device_put(struct ib_device *device)
{
        if (refcount_dec_and_test(&device->refcount))
                complete(&device->unreg_completion);
}

[..]

        ib_device_put(device);
        wait_for_completion(&device->unreg_completion);


Not use refcount_read() and a naked work queue

So I'd say these ones should be looked at:

drivers/infiniband/hw/cxgb4/cq.c:       refcount_dec(&chp->refcnt);
drivers/infiniband/hw/hns/hns_roce_db.c:        refcount_dec(&db->u.user_page->refcount);
drivers/infiniband/hw/irdma/cm.c:       refcount_dec(&cm_node->refcnt);
drivers/infiniband/hw/irdma/cm.c:               refcount_dec(&listener->refcnt);
drivers/infiniband/hw/irdma/cm.c:                       refcount_dec(&listener->refcnt);

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
  2021-07-22 12:06   ` Jason Gunthorpe
@ 2021-07-22 12:57     ` Leon Romanovsky
  2021-07-22 13:02       ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-22 12:57 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Dakshaja Uppalapati, dledford, linux-rdma, bharat

On Thu, Jul 22, 2021 at 09:06:07AM -0300, Jason Gunthorpe wrote:
> On Thu, Jul 22, 2021 at 10:43:00AM +0300, Leon Romanovsky wrote:
> > On Wed, Jul 21, 2021 at 04:51:55PM +0530, Dakshaja Uppalapati wrote:
> > > Previous atomic increment decrement logic expects the atomic count
> > > to be '0' after the final decrement. Replacing atomic count with
> > > refcount does not allow that as refcount_dec() considers count of 1 as
> > > underflow. Therefore fix the current refcount logic by decrementing
> > > the refcount if one on the final deref in c4iw_destroy_cq().
> > > 
> > > Fixes: 7183451f846d (RDMA/cxgb4: Use refcount_t instead of atomic_t for reference counting")
> > > Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
> > > Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
> > >  drivers/infiniband/hw/cxgb4/cq.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > 
> > Thanks, 
> > Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> > 
> > We have plenty of such errors, worth to check them:
> > ➜  kernel git:(rdma-next) git grep refcount_read drivers/infiniband/ | grep -v WARN_ON
> > drivers/infiniband/core/device.c:	if (!refcount_read(&ib_dev->refcount))
> > drivers/infiniband/core/device.c:	if (refcount_read(&device->refcount) == 0 ||
> > drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
> > drivers/infiniband/core/iwpm_util.c:	if (!refcount_read(&iwpm_admin.refcount)) {
> > drivers/infiniband/core/ucma.c:	if (refcount_read(&ctx->ref))
> > drivers/infiniband/hw/cxgb4/cq.c:	wait_event(chp->wait, !refcount_read(&chp->refcnt));
> > drivers/infiniband/hw/irdma/utils.c:			   refcount_read(&cqp_request->refcnt) == 1, 1000);
> > drivers/infiniband/hw/mlx5/mlx5_ib.h:	wait_event(mmkey->wait, refcount_read(&mmkey->usecount) == 0);
> > drivers/infiniband/hw/mlx5/mr.c:	    refcount_read(&mr->mmkey.usecount) != 0 &&
> 
> It isn't the read that is the problem, it is the naked dec.
> 
> This common pattern is just being done in an obtuse and arguably wrong
> way
> 
> It is supposed to look like this:
> 
> void ib_device_put(struct ib_device *device)
> {
>         if (refcount_dec_and_test(&device->refcount))
>                 complete(&device->unreg_completion);
> }
> 
> [..]
> 
>         ib_device_put(device);
>         wait_for_completion(&device->unreg_completion);
> 
> 
> Not use refcount_read() and a naked work queue
> 
> So I'd say these ones should be looked at:
> 
> drivers/infiniband/hw/cxgb4/cq.c:       refcount_dec(&chp->refcnt);
> drivers/infiniband/hw/hns/hns_roce_db.c:        refcount_dec(&db->u.user_page->refcount);
> drivers/infiniband/hw/irdma/cm.c:       refcount_dec(&cm_node->refcnt);
> drivers/infiniband/hw/irdma/cm.c:               refcount_dec(&listener->refcnt);
> drivers/infiniband/hw/irdma/cm.c:                       refcount_dec(&listener->refcnt);

Jason,

We are talking about two different issues that this refcount_read patch pointed.
You are focused on wrong usage of completion, I saw useless compare of
refcount_t with 0 that can't be.

I prepared series that cleans iwpm from this type of error:
 drivers/infiniband/core/iwpm_util.c:        if (!refcount_read(&iwpm_admin.refcount)) {
 drivers/infiniband/core/iwpm_util.c:        if (!refcount_read(&iwpm_admin.refcount)) {

Thanks

> 
> Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
  2021-07-22 12:57     ` Leon Romanovsky
@ 2021-07-22 13:02       ` Jason Gunthorpe
  2021-07-22 13:23         ` Dakshaja Uppalapati
  2021-07-22 14:01         ` Leon Romanovsky
  0 siblings, 2 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2021-07-22 13:02 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Dakshaja Uppalapati, dledford, linux-rdma, bharat

On Thu, Jul 22, 2021 at 03:57:39PM +0300, Leon Romanovsky wrote:

> We are talking about two different issues that this refcount_read patch pointed.
> You are focused on wrong usage of completion, I saw useless compare of
> refcount_t with 0 that can't be.

It can be zero. Anything that does refcount_dec_and_test() can make
the refcount be zero.

The issue here is that refcount_dec() cannot make the refcount zero as
it is improper use of the API.

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
  2021-07-22 13:02       ` Jason Gunthorpe
@ 2021-07-22 13:23         ` Dakshaja Uppalapati
  2021-07-22 14:01         ` Leon Romanovsky
  1 sibling, 0 replies; 9+ messages in thread
From: Dakshaja Uppalapati @ 2021-07-22 13:23 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, linux-rdma, bharat, dakshaja

On Thursday, July 07/22/21, 2021 at 10:02:31 -0300, Jason Gunthorpe wrote:
> On Thu, Jul 22, 2021 at 03:57:39PM +0300, Leon Romanovsky wrote:
> 
> > We are talking about two different issues that this refcount_read patch pointed.
> > You are focused on wrong usage of completion, I saw useless compare of
> > refcount_t with 0 that can't be.
> 
> It can be zero. Anything that does refcount_dec_and_test() can make
> the refcount be zero.
> 
> The issue here is that refcount_dec() cannot make the refcount zero as
> it is improper use of the API.

Hi Jason

I didn't understood properly. Can you please tell that the current 
patch for cxgb4 is still wrong?

Thanks
Dakshaja



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
  2021-07-22 13:02       ` Jason Gunthorpe
  2021-07-22 13:23         ` Dakshaja Uppalapati
@ 2021-07-22 14:01         ` Leon Romanovsky
  2021-07-22 14:02           ` Jason Gunthorpe
  1 sibling, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2021-07-22 14:01 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Dakshaja Uppalapati, dledford, linux-rdma, bharat

On Thu, Jul 22, 2021 at 10:02:31AM -0300, Jason Gunthorpe wrote:
> On Thu, Jul 22, 2021 at 03:57:39PM +0300, Leon Romanovsky wrote:
> 
> > We are talking about two different issues that this refcount_read patch pointed.
> > You are focused on wrong usage of completion, I saw useless compare of
> > refcount_t with 0 that can't be.
> 
> It can be zero. Anything that does refcount_dec_and_test() can make
> the refcount be zero.

It can be, but it is not the case at least for iwpm.

Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
  2021-07-22 14:01         ` Leon Romanovsky
@ 2021-07-22 14:02           ` Jason Gunthorpe
       [not found]             ` <20210803155919.GB11439@chelsio.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2021-07-22 14:02 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Dakshaja Uppalapati, dledford, linux-rdma, bharat

On Thu, Jul 22, 2021 at 05:01:30PM +0300, Leon Romanovsky wrote:
> On Thu, Jul 22, 2021 at 10:02:31AM -0300, Jason Gunthorpe wrote:
> > On Thu, Jul 22, 2021 at 03:57:39PM +0300, Leon Romanovsky wrote:
> > 
> > > We are talking about two different issues that this refcount_read patch pointed.
> > > You are focused on wrong usage of completion, I saw useless compare of
> > > refcount_t with 0 that can't be.
> > 
> > It can be zero. Anything that does refcount_dec_and_test() can make
> > the refcount be zero.
> 
> It can be, but it is not the case at least for iwpm.

That is even more broken then. Structs with non-zero refcounts in them
should not be freed.

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs.
       [not found]             ` <20210803155919.GB11439@chelsio.com>
@ 2021-08-03 16:09               ` Jason Gunthorpe
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2021-08-03 16:09 UTC (permalink / raw)
  To: Dakshaja Uppalapati
  Cc: Leon Romanovsky, dledford, linux-rdma, Potnuri Bharat Teja

On Tue, Aug 03, 2021 at 09:29:20PM +0530, Dakshaja Uppalapati wrote:
> Hi Jason,
> 
> Gentle ping.
> 
> I think the current patch for cxgb4 is legit. I am expecting 
> this to be pulled for next rc. Please let me know if I am missing
> something here.

As I said, you need to rework this to use a proper completion pattern
not hacky like this.

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-08-03 16:09 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-21 11:21 [PATCH for-rc] iw_cxgb4: Fix refcount underflow while destroying cqs Dakshaja Uppalapati
2021-07-22  7:43 ` Leon Romanovsky
2021-07-22 12:06   ` Jason Gunthorpe
2021-07-22 12:57     ` Leon Romanovsky
2021-07-22 13:02       ` Jason Gunthorpe
2021-07-22 13:23         ` Dakshaja Uppalapati
2021-07-22 14:01         ` Leon Romanovsky
2021-07-22 14:02           ` Jason Gunthorpe
     [not found]             ` <20210803155919.GB11439@chelsio.com>
2021-08-03 16:09               ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).