* [PATCH rdma-rc] RDMA/mlx5: Clear old rate limit when closing QP
@ 2019-10-02 12:02 Leon Romanovsky
  2019-10-17 20:12 ` Doug Ledford
  0 siblings, 1 reply; 4+ messages in thread
From: Leon Romanovsky @ 2019-10-02 12:02 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Rafi Wiener, RDMA mailing list, Bodong Wang, Oleg Kuporosov,
	Leon Romanovsky

From: Rafi Wiener <rafiw@mellanox.com>

Before QP is closed it changes to ERROR state, when this happens
the QP was left with old rate limit that was already removed from
the table.

Fixes: 7d29f349a4b9 ("IB/mlx5: Properly adjust rate limit on QP state transitions")
Signed-off-by: Rafi Wiener <rafiw@mellanox.com>
Signed-off-by: Oleg Kuporosov <olegk@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/qp.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 8937d72ddcf6..5fd071c05944 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -3249,10 +3249,12 @@ static int modify_raw_packet_qp_sq(
 	}
 
 	/* Only remove the old rate after new rate was set */
-	if ((old_rl.rate &&
-	     !mlx5_rl_are_equal(&old_rl, &new_rl)) ||
-	    (new_state != MLX5_SQC_STATE_RDY))
+	if ((old_rl.rate && !mlx5_rl_are_equal(&old_rl, &new_rl)) ||
+	    (new_state != MLX5_SQC_STATE_RDY)) {
 		mlx5_rl_remove_rate(dev, &old_rl);
+		if (new_state != MLX5_SQC_STATE_RDY)
+			memset(&new_rl, 0, sizeof(new_rl));
+	}
 
 	ibqp->rl = new_rl;
 	sq->state = new_state;
-- 
2.20.1



* Re: [PATCH rdma-rc] RDMA/mlx5: Clear old rate limit when closing QP
  2019-10-02 12:02 [PATCH rdma-rc] RDMA/mlx5: Clear old rate limit when closing QP Leon Romanovsky
@ 2019-10-17 20:12 ` Doug Ledford
  2019-10-20  5:27   ` Leon Romanovsky
  2019-10-20  8:59   ` Rafi Wiener
  0 siblings, 2 replies; 4+ messages in thread
From: Doug Ledford @ 2019-10-17 20:12 UTC (permalink / raw)
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Rafi Wiener, RDMA mailing list, Bodong Wang, Oleg Kuporosov,
	Leon Romanovsky


On Wed, 2019-10-02 at 15:02 +0300, Leon Romanovsky wrote:
> From: Rafi Wiener <rafiw@mellanox.com>
> 
> Before QP is closed it changes to ERROR state, when this happens
> the QP was left with old rate limit that was already removed from
> the table.
> 
> Fixes: 7d29f349a4b9 ("IB/mlx5: Properly adjust rate limit on QP state
> transitions")
> Signed-off-by: Rafi Wiener <rafiw@mellanox.com>
> Signed-off-by: Oleg Kuporosov <olegk@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

If you are in the process of closing the queue pair, does this solve
some sort of multi-close race, or is it just being pedantic before
freeing the qp struct?

I took it regardless, just curious.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



* Re: [PATCH rdma-rc] RDMA/mlx5: Clear old rate limit when closing QP
  2019-10-17 20:12 ` Doug Ledford
@ 2019-10-20  5:27   ` Leon Romanovsky
  2019-10-20  8:59   ` Rafi Wiener
  1 sibling, 0 replies; 4+ messages in thread
From: Leon Romanovsky @ 2019-10-20  5:27 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Jason Gunthorpe, Rafi Wiener, RDMA mailing list, Bodong Wang,
	Oleg Kuporosov

On Thu, Oct 17, 2019 at 04:12:04PM -0400, Doug Ledford wrote:
> On Wed, 2019-10-02 at 15:02 +0300, Leon Romanovsky wrote:
> > From: Rafi Wiener <rafiw@mellanox.com>
> >
> > Before QP is closed it changes to ERROR state, when this happens
> > the QP was left with old rate limit that was already removed from
> > the table.
> >
> > Fixes: 7d29f349a4b9 ("IB/mlx5: Properly adjust rate limit on QP state
> > transitions")
> > Signed-off-by: Rafi Wiener <rafiw@mellanox.com>
> > Signed-off-by: Oleg Kuporosov <olegk@mellanox.com>
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>
> If you are in the process of closing the queue pair, does this solve
> some sort of multi-close race, or is it just being pedantic before
> freeing the qp struct?

It fixes a real bug that ends in a panic. I didn't include the splat
because it contained the extra debug info we needed to find this problem.

In a nutshell, the bug comes from how we store rate limits: once in a
table in the global mlx5_core_dev, and once as a struct (not a pointer)
of mlx5_rate_limit inside mlx5_ib_qp. With this combination the ibqp can
still access its (old) rate limit after it was removed from the table,
for example for the comparison in mlx5_rl_are_equal.

The best solution is to rewrite the rl logic to use pointers, but that
was too much to ask of Oleg and Rafi, who hit this bug with their
user space application.

Thanks

>
> I took it regardless, just curious.
>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG KeyID: B826A3330E572FDD
>     Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD




* RE: [PATCH rdma-rc] RDMA/mlx5: Clear old rate limit when closing QP
  2019-10-17 20:12 ` Doug Ledford
  2019-10-20  5:27   ` Leon Romanovsky
@ 2019-10-20  8:59   ` Rafi Wiener
  1 sibling, 0 replies; 4+ messages in thread
From: Rafi Wiener @ 2019-10-20  8:59 UTC (permalink / raw)
  To: Doug Ledford, Leon Romanovsky, Jason Gunthorpe
  Cc: RDMA mailing list, Bodong Wang, Oleg Kuporosov, Leon Romanovsky

Hi Doug,
The issue is that when you have a few QPs with the same rate limit, the rate is removed from the table twice instead of once. That alone causes the problem; no race or stress is involved.
Rafi 
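
The double removal described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the refcounted table is a simplification, and treating rate 0 as "nothing to remove" is an assumption of the model, not a statement about the real mlx5 code. Two QPs share rate 100; QP A leaves the ready state twice (once on the move to ERROR, once on teardown), with and without clearing its stored copy of the rate.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins only; these are NOT the real mlx5 structures or functions. */
struct toy_rate_limit {
	unsigned int rate;		/* 0 means "no rate limit" in this model */
};

struct toy_rl_entry {
	struct toy_rate_limit rl;
	int refcount;			/* number of QPs using this rate */
};

#define TOY_TABLE_SIZE 4

struct toy_rl_table {
	struct toy_rl_entry entry[TOY_TABLE_SIZE];
};

struct toy_qp {
	struct toy_rate_limit rl;	/* a copy of the rate, not a pointer */
};

/* Return the live table entry for this rate, or NULL if there is none. */
static struct toy_rl_entry *toy_rl_find(struct toy_rl_table *t, unsigned int rate)
{
	for (int i = 0; i < TOY_TABLE_SIZE; i++)
		if (t->entry[i].refcount > 0 && t->entry[i].rl.rate == rate)
			return &t->entry[i];
	return NULL;
}

/* Take a reference on the rate (creating the entry if needed) for this QP. */
static int toy_rl_add(struct toy_rl_table *t, struct toy_qp *qp, unsigned int rate)
{
	struct toy_rl_entry *e = toy_rl_find(t, rate);

	if (!e) {
		for (int i = 0; i < TOY_TABLE_SIZE && !e; i++)
			if (t->entry[i].refcount == 0)
				e = &t->entry[i];
		if (!e)
			return -1;	/* table full */
		e->rl.rate = rate;
	}
	e->refcount++;
	qp->rl.rate = rate;
	return 0;
}

/* Drop one reference; rate 0 is treated as "nothing to remove". */
static void toy_rl_remove(struct toy_rl_table *t, const struct toy_rate_limit *rl)
{
	struct toy_rl_entry *e;

	if (rl->rate == 0)
		return;
	e = toy_rl_find(t, rl->rate);
	if (e)
		e->refcount--;
}

/* Move a QP out of the ready state, loosely following the patched hunk. */
static void toy_qp_leave_ready(struct toy_rl_table *t, struct toy_qp *qp,
			       int clear_on_leave)
{
	struct toy_rate_limit old_rl = qp->rl;
	struct toy_rate_limit new_rl = old_rl;

	toy_rl_remove(t, &old_rl);
	if (clear_on_leave)
		memset(&new_rl, 0, sizeof(new_rl));
	qp->rl = new_rl;
}

/* Two QPs share rate 100; return the references left after QP A is closed. */
static int toy_scenario(int clear_on_leave)
{
	struct toy_rl_table table;
	struct toy_qp a, b;
	struct toy_rl_entry *e;
	int ret;

	memset(&table, 0, sizeof(table));
	memset(&a, 0, sizeof(a));
	memset(&b, 0, sizeof(b));

	ret = toy_rl_add(&table, &a, 100);
	assert(ret == 0);
	ret = toy_rl_add(&table, &b, 100);
	assert(ret == 0);
	(void)ret;

	toy_qp_leave_ready(&table, &a, clear_on_leave);	/* A moves to ERROR */
	toy_qp_leave_ready(&table, &a, clear_on_leave);	/* A is torn down */

	e = toy_rl_find(&table, 100);
	return e ? e->refcount : 0;
}
```

In this model, toy_scenario(0) leaves no reference for rate 100 even though QP B still uses it, while toy_scenario(1) leaves exactly the one reference that belongs to QP B, because the second removal sees a zeroed rate and does nothing.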

