* [PATCH for-rc 0/3] irdma fixes
From: Shiraz Saleem @ 2022-04-25 18:17 UTC (permalink / raw)
To: jgg, leon; +Cc: linux-rdma, Shiraz Saleem
This series contains a few irdma bug fixes for the 5.18 cycle.
Mustafa Ismail (1):
RDMA/irdma: Fix possible crash due to NULL netdev in notifier
Shiraz Saleem (1):
RDMA/irdma: Reduce iWARP QP destroy time
Tatyana Nikolova (1):
RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
drivers/infiniband/hw/irdma/cm.c | 26 +++++++++-----------------
drivers/infiniband/hw/irdma/utils.c | 21 +++++++++------------
drivers/infiniband/hw/irdma/verbs.c | 4 ++--
3 files changed, 20 insertions(+), 31 deletions(-)
--
1.8.3.1
* [PATCH for-rc 1/3] RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
From: Shiraz Saleem @ 2022-04-25 18:17 UTC (permalink / raw)
To: jgg, leon; +Cc: linux-rdma, Tatyana Nikolova, Shiraz Saleem
From: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
When connection establishment fails in iWARP mode, an application that
drains its QPs can hang because no flush is issued when the QP is modified
from the RTR state to error. Issue a flush in this case by calling
irdma_cm_disconn().
Update irdma_cm_disconn() to flush when cm_id is NULL, which is the case
when the QP is in the RTR state and connection establishment has failed.
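As a reading aid, the close/flush decision after this change can be modeled in plain userspace C. This is only a sketch: the toy_* enums, the struct, and the function name are hypothetical stand-ins, not the driver's definitions, and the real check also covers IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE and runs under the QP lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's TCP states and async events. */
enum toy_tcp_state { TOY_TCP_ESTABLISHED, TOY_TCP_CLOSED, TOY_TCP_TIME_WAIT };
enum toy_ae { TOY_AE_NONE, TOY_AE_BAD_CLOSE, TOY_AE_CONNECTION_RESET };

/* Reduced snapshot of the QP fields the decision looks at. */
struct toy_qp {
	const void *cm_id;            /* NULL when connection setup failed in RTR */
	enum toy_tcp_state tcp_state;
	enum toy_ae last_ae;
	bool device_reset;
};

/* Returns true when the QP should be closed and its work queues flushed. */
static bool toy_issue_close(const struct toy_qp *qp)
{
	assert(qp != NULL);
	return qp->tcp_state == TOY_TCP_CLOSED ||
	       qp->tcp_state == TOY_TCP_TIME_WAIT ||
	       qp->last_ae == TOY_AE_BAD_CLOSE ||
	       qp->last_ae == TOY_AE_CONNECTION_RESET ||
	       qp->device_reset ||
	       qp->cm_id == NULL;     /* the condition this patch adds */
}
```

In this model a QP with a NULL cm_id now takes the close path even when its TCP state and last event look healthy, which is the behavior the application needs in order to see its drain complete.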
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
drivers/infiniband/hw/irdma/cm.c | 16 +++++-----------
drivers/infiniband/hw/irdma/verbs.c | 4 ++--
2 files changed, 7 insertions(+), 13 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
index dedb3b7..d185f3a0 100644
--- a/drivers/infiniband/hw/irdma/cm.c
+++ b/drivers/infiniband/hw/irdma/cm.c
@@ -3467,12 +3467,6 @@ static void irdma_cm_disconn_true(struct irdma_qp *iwqp)
}
cm_id = iwqp->cm_id;
- /* make sure we havent already closed this connection */
- if (!cm_id) {
- spin_unlock_irqrestore(&iwqp->lock, flags);
- return;
- }
-
original_hw_tcp_state = iwqp->hw_tcp_state;
original_ibqp_state = iwqp->ibqp_state;
last_ae = iwqp->last_aeq;
@@ -3494,11 +3488,11 @@ static void irdma_cm_disconn_true(struct irdma_qp *iwqp)
disconn_status = -ECONNRESET;
}
- if ((original_hw_tcp_state == IRDMA_TCP_STATE_CLOSED ||
- original_hw_tcp_state == IRDMA_TCP_STATE_TIME_WAIT ||
- last_ae == IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE ||
- last_ae == IRDMA_AE_BAD_CLOSE ||
- last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->rf->reset)) {
+ if (original_hw_tcp_state == IRDMA_TCP_STATE_CLOSED ||
+ original_hw_tcp_state == IRDMA_TCP_STATE_TIME_WAIT ||
+ last_ae == IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE ||
+ last_ae == IRDMA_AE_BAD_CLOSE ||
+ last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->rf->reset || !cm_id) {
issue_close = 1;
iwqp->cm_id = NULL;
qp->term_flags = 0;
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 46f4753..52f3e88 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -1618,13 +1618,13 @@ int irdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask,
if (issue_modify_qp && iwqp->ibqp_state > IB_QPS_RTS) {
if (dont_wait) {
- if (iwqp->cm_id && iwqp->hw_tcp_state) {
+ if (iwqp->hw_tcp_state) {
spin_lock_irqsave(&iwqp->lock, flags);
iwqp->hw_tcp_state = IRDMA_TCP_STATE_CLOSED;
iwqp->last_aeq = IRDMA_AE_RESET_SENT;
spin_unlock_irqrestore(&iwqp->lock, flags);
- irdma_cm_disconn(iwqp);
}
+ irdma_cm_disconn(iwqp);
} else {
int close_timer_started;
--
1.8.3.1
* [PATCH for-rc 2/3] RDMA/irdma: Reduce iWARP QP destroy time
From: Shiraz Saleem @ 2022-04-25 18:17 UTC (permalink / raw)
To: jgg, leon; +Cc: linux-rdma, Shiraz Saleem
QP destroy is synchronous and waits for its refcnt to be decremented in
irdma_cm_node_free_cb (for iWARP), which fires only after the RCU grace
period elapses.
Applications running a large number of connections are exposed to high
wait times on destroy QP for events like SIGABRT.
The main contributor to this wait time is the firing of the call_rcu
callback during a CM node destroy, which can be slow. Until it runs, the
QP reference is held and destroy QP cannot complete.
The RCU deferral only needs to ensure that list walkers that may still
reference the cm_node object have finished before it is freed, so only
the free itself has to wait for the grace period to elapse. The rest of
the connection teardown in irdma_cm_node_free_cb is moved out of the grace
period wait and into irdma_destroy_connection. Also, replace call_rcu with
a simple kfree_rcu, as the deferred work is now just a kfree of the cm_node.
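The reordering can be illustrated with a toy userspace model. It is not RCU: the "grace period" here is just an explicit function call, and every toy_* name is made up for illustration. What it shows is the split the patch makes, with teardown (including dropping the QP reference) running immediately and only the memory free being deferred.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_qp { int refcnt; };

struct toy_cm_node {
	struct toy_qp *qp;
	struct toy_cm_node *next_deferred;   /* link in the deferred-free list */
};

/* Nodes whose memory may only be released after the grace period. */
static struct toy_cm_node *toy_deferred_head;

static size_t toy_pending_frees(void)
{
	size_t n = 0;
	for (struct toy_cm_node *p = toy_deferred_head; p; p = p->next_deferred)
		n++;
	return n;
}

/* Teardown that does not depend on the grace period: drop the QP ref now. */
static void toy_destroy_connection(struct toy_cm_node *node)
{
	if (node->qp) {
		node->qp->refcnt--;
		node->qp = NULL;
	}
}

/* Called when the last node reference goes away. */
static void toy_rem_last_ref(struct toy_cm_node *node)
{
	assert(node != NULL);
	toy_destroy_connection(node);            /* immediate */
	node->next_deferred = toy_deferred_head; /* free is deferred */
	toy_deferred_head = node;
}

/* Stand-in for the end of a grace period: release all queued nodes. */
static void toy_grace_period_elapsed(void)
{
	while (toy_deferred_head) {
		struct toy_cm_node *node = toy_deferred_head;
		toy_deferred_head = node->next_deferred;
		free(node);
	}
}
```

In this model the QP refcnt drops as soon as toy_rem_last_ref() returns, so a waiter on that count would not have to sit through the grace period; only the node's memory stays queued until toy_grace_period_elapsed() runs.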
Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
drivers/infiniband/hw/irdma/cm.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c
index d185f3a0..656baa3 100644
--- a/drivers/infiniband/hw/irdma/cm.c
+++ b/drivers/infiniband/hw/irdma/cm.c
@@ -2308,10 +2308,8 @@ static void irdma_cm_free_ah(struct irdma_cm_node *cm_node)
return NULL;
}
-static void irdma_cm_node_free_cb(struct rcu_head *rcu_head)
+static void irdma_destroy_connection(struct irdma_cm_node *cm_node)
{
- struct irdma_cm_node *cm_node =
- container_of(rcu_head, struct irdma_cm_node, rcu_head);
struct irdma_cm_core *cm_core = cm_node->cm_core;
struct irdma_qp *iwqp;
struct irdma_cm_info nfo;
@@ -2359,7 +2357,6 @@ static void irdma_cm_node_free_cb(struct rcu_head *rcu_head)
}
cm_core->cm_free_ah(cm_node);
- kfree(cm_node);
}
/**
@@ -2387,8 +2384,9 @@ void irdma_rem_ref_cm_node(struct irdma_cm_node *cm_node)
spin_unlock_irqrestore(&cm_core->ht_lock, flags);
- /* wait for all list walkers to exit their grace period */
- call_rcu(&cm_node->rcu_head, irdma_cm_node_free_cb);
+ irdma_destroy_connection(cm_node);
+
+ kfree_rcu(cm_node, rcu_head);
}
/**
--
1.8.3.1
* [PATCH for-rc 3/3] RDMA/irdma: Fix possible crash due to NULL netdev in notifier
From: Shiraz Saleem @ 2022-04-25 18:17 UTC (permalink / raw)
To: jgg, leon; +Cc: linux-rdma, Mustafa Ismail, Shiraz Saleem
From: Mustafa Ismail <mustafa.ismail@intel.com>
For some net events delivered to the irdma_net_event notifier, the netdev
can be NULL, which causes a crash in rdma_vlan_dev_real_dev().
Fix this by moving all processing into the NETEVENT_NEIGH_UPDATE case,
where the netdev is guaranteed to be non-NULL.
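The shape of the fix can be sketched in userspace C. The toy_* types and helpers below are hypothetical and only mimic the pattern: the device lookup and its matching put both live inside the one case where the pointer is known to be valid, and every other event returns without touching it.

```c
#include <assert.h>
#include <stddef.h>

enum toy_event { TOY_NEIGH_UPDATE, TOY_OTHER_EVENT };

struct toy_netdev { int ifindex; };
struct toy_neigh { struct toy_netdev *dev; };

/* Outstanding device references; should be zero after each event. */
static int toy_dev_refs;

/* Takes a reference; the assert models the crash on a NULL netdev. */
static struct toy_netdev *toy_get_dev(struct toy_netdev *netdev)
{
	assert(netdev != NULL);
	toy_dev_refs++;
	return netdev;
}

static void toy_put_dev(struct toy_netdev *dev)
{
	(void)dev;
	toy_dev_refs--;
}

/* Returns 1 if the event was processed, 0 if it was ignored. */
static int toy_net_event(enum toy_event event, const struct toy_neigh *neigh)
{
	struct toy_netdev *dev;

	switch (event) {
	case TOY_NEIGH_UPDATE:
		/* Only here is neigh->dev assumed to be valid. */
		dev = toy_get_dev(neigh->dev);
		/* ... the ARP cache update for dev would go here ... */
		toy_put_dev(dev);
		return 1;
	default:
		/* No lookup and no put: nothing was acquired. */
		return 0;
	}
}
```

Keeping the get and the put in the same case also keeps the reference count balanced on every path, which is why the ib_device_put() call moves along with the lookup in the diff below.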
Fixes: 6702bc147448 ("RDMA/irdma: Fix netdev notifications for vlan's")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
drivers/infiniband/hw/irdma/utils.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c
index 346c2c5..8176041 100644
--- a/drivers/infiniband/hw/irdma/utils.c
+++ b/drivers/infiniband/hw/irdma/utils.c
@@ -258,18 +258,16 @@ int irdma_net_event(struct notifier_block *notifier, unsigned long event,
u32 local_ipaddr[4] = {};
bool ipv4 = true;
- real_dev = rdma_vlan_dev_real_dev(netdev);
- if (!real_dev)
- real_dev = netdev;
-
- ibdev = ib_device_get_by_netdev(real_dev, RDMA_DRIVER_IRDMA);
- if (!ibdev)
- return NOTIFY_DONE;
-
- iwdev = to_iwdev(ibdev);
-
switch (event) {
case NETEVENT_NEIGH_UPDATE:
+ real_dev = rdma_vlan_dev_real_dev(netdev);
+ if (!real_dev)
+ real_dev = netdev;
+ ibdev = ib_device_get_by_netdev(real_dev, RDMA_DRIVER_IRDMA);
+ if (!ibdev)
+ return NOTIFY_DONE;
+
+ iwdev = to_iwdev(ibdev);
p = (__be32 *)neigh->primary_key;
if (neigh->tbl->family == AF_INET6) {
ipv4 = false;
@@ -290,13 +288,12 @@ int irdma_net_event(struct notifier_block *notifier, unsigned long event,
irdma_manage_arp_cache(iwdev->rf, neigh->ha,
local_ipaddr, ipv4,
IRDMA_ARP_DELETE);
+ ib_device_put(ibdev);
break;
default:
break;
}
- ib_device_put(ibdev);
-
return NOTIFY_DONE;
}
--
1.8.3.1
* Re: [PATCH for-rc 0/3] irdma fixes
From: Jason Gunthorpe @ 2022-05-02 14:38 UTC (permalink / raw)
To: Shiraz Saleem; +Cc: leon, linux-rdma
On Mon, Apr 25, 2022 at 01:17:00PM -0500, Shiraz Saleem wrote:
> This series contains a few irdma bug fixes for the 5.18 cycle.
>
> Mustafa Ismail (1):
> RDMA/irdma: Fix possible crash due to NULL netdev in notifier
>
> Shiraz Saleem (1):
> RDMA/irdma: Reduce iWARP QP destroy time
>
> Tatyana Nikolova (1):
> RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
>
> drivers/infiniband/hw/irdma/cm.c | 26 +++++++++-----------------
> drivers/infiniband/hw/irdma/utils.c | 21 +++++++++------------
> drivers/infiniband/hw/irdma/verbs.c | 4 ++--
> 3 files changed, 20 insertions(+), 31 deletions(-)
Applied to for-rc, thanks
Jason