All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon@kernel.org>
To: Doug Ledford <dledford@redhat.com>, Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	Daniel Jurgens <danielj@mellanox.com>,
	Erez Shitrit <erezsh@mellanox.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Maor Gottlieb <maorg@mellanox.com>,
	Michael Guralnik <michaelgur@mellanox.com>,
	Moni Shoua <monis@mellanox.com>,
	Parav Pandit <parav@mellanox.com>,
	Sean Hefty <sean.hefty@intel.com>,
	Valentine Fatiev <valentinef@mellanox.com>,
	Yishai Hadas <yishaih@mellanox.com>,
	Yonatan Cohen <yonatanc@mellanox.com>,
	Zhu Yanjun <yanjunz@mellanox.com>
Subject: [PATCH rdma-rc 7/9] RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq
Date: Wed, 12 Feb 2020 09:26:33 +0200	[thread overview]
Message-ID: <20200212072635.682689-8-leon@kernel.org> (raw)
In-Reply-To: <20200212072635.682689-1-leon@kernel.org>

From: Zhu Yanjun <yanjunz@mellanox.com>

When run stress tests with RXE, the following Call Traces often occur
"
watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [swapper/2:0]
...
Call Trace:
<IRQ>
create_object+0x3f/0x3b0
kmem_cache_alloc_node_trace+0x129/0x2d0
__kmalloc_reserve.isra.52+0x2e/0x80
__alloc_skb+0x83/0x270
rxe_init_packet+0x99/0x150 [rdma_rxe]
rxe_requester+0x34e/0x11a0 [rdma_rxe]
rxe_do_task+0x85/0xf0 [rdma_rxe]
tasklet_action_common.isra.21+0xeb/0x100
__do_softirq+0xd0/0x298
irq_exit+0xc5/0xd0
smp_apic_timer_interrupt+0x68/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
...
"
The root cause is that tasklet is actually a softirq. In a tasklet handler,
another softirq handler is triggered. Usually these softirq handlers run
on the same cpu core. So this will cause "soft lockup Bug".

Fixes: 8700e3e7c485 ("Soft RoCE driver")
CC: Yossef Efraim <yossefe@mellanox.com>
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/sw/rxe/rxe_comp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 116cafc9afcf..4bc88708b355 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -329,7 +329,7 @@ static inline enum comp_state check_ack(struct rxe_qp *qp,
 					qp->comp.psn = pkt->psn;
 					if (qp->req.wait_psn) {
 						qp->req.wait_psn = 0;
-						rxe_run_task(&qp->req.task, 1);
+						rxe_run_task(&qp->req.task, 0);
 					}
 				}
 				return COMPST_ERROR_RETRY;
@@ -463,7 +463,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	 */
 	if (qp->req.wait_fence) {
 		qp->req.wait_fence = 0;
-		rxe_run_task(&qp->req.task, 1);
+		rxe_run_task(&qp->req.task, 0);
 	}
 }
 
@@ -479,7 +479,7 @@ static inline enum comp_state complete_ack(struct rxe_qp *qp,
 		if (qp->req.need_rd_atomic) {
 			qp->comp.timeout_retry = 0;
 			qp->req.need_rd_atomic = 0;
-			rxe_run_task(&qp->req.task, 1);
+			rxe_run_task(&qp->req.task, 0);
 		}
 	}
 
@@ -725,7 +725,7 @@ int rxe_completer(void *arg)
 							RXE_CNT_COMP_RETRY);
 					qp->req.need_retry = 1;
 					qp->comp.started_retry = 1;
-					rxe_run_task(&qp->req.task, 1);
+					rxe_run_task(&qp->req.task, 0);
 				}
 
 				if (pkt) {
-- 
2.24.1


  parent reply	other threads:[~2020-02-12  7:27 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-12  7:26 [PATCH rdma-rc 0/9] Fixes for v5.6 Leon Romanovsky
2020-02-12  7:26 ` [PATCH rdma-rc 1/9] RDMA/ucma: Mask QPN to be 24 bits according to IBTA Leon Romanovsky
2020-02-12  7:26 ` [PATCH rdma-rc 2/9] RDMA/core: Fix protection fault in get_pkey_idx_qp_list Leon Romanovsky
2020-02-12  8:01   ` Leon Romanovsky
2020-02-12  8:06   ` Leon Romanovsky
2020-02-12  7:26 ` [PATCH rdma-rc 3/9] Revert "RDMA/cma: Simplify rdma_resolve_addr() error flow" Leon Romanovsky
2020-02-13 13:30   ` Jason Gunthorpe
2020-02-14  3:11     ` Parav Pandit
2020-02-14 14:08       ` Jason Gunthorpe
2020-02-14 14:48         ` Parav Pandit
2020-02-19 20:40   ` Jason Gunthorpe
2020-02-12  7:26 ` [PATCH rdma-rc 4/9] IB/ipoib: Fix double free of skb in case of multicast traffic in CM mode Leon Romanovsky
2020-02-13 15:37   ` Jason Gunthorpe
2020-02-13 18:10     ` Leon Romanovsky
2020-02-13 18:26       ` Jason Gunthorpe
2020-02-13 18:36         ` Leon Romanovsky
2020-02-13 19:09           ` Jason Gunthorpe
2020-02-12  7:26 ` [PATCH rdma-rc 5/9] RDMA/core: Add missing list deletion on freeing event queue Leon Romanovsky
2020-02-12  7:26 ` [PATCH rdma-rc 6/9] IB/mlx5: Fix async events cleanup flows Leon Romanovsky
2020-02-12  7:26 ` Leon Romanovsky [this message]
2020-02-12  7:26 ` [PATCH rdma-rc 8/9] IB/umad: Fix kernel crash while unloading ib_umad Leon Romanovsky
2020-02-13 14:28   ` Jason Gunthorpe
2020-02-13 18:03     ` Leon Romanovsky
2020-02-12  7:26 ` [PATCH rdma-rc 9/9] RDMA/mlx5: Prevent overflow in mmap offset calculations Leon Romanovsky
2020-02-13 18:03 ` [PATCH rdma-rc 0/9] Fixes for v5.6 Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200212072635.682689-8-leon@kernel.org \
    --to=leon@kernel.org \
    --cc=danielj@mellanox.com \
    --cc=dledford@redhat.com \
    --cc=erezsh@mellanox.com \
    --cc=jgg@mellanox.com \
    --cc=jgg@ziepe.ca \
    --cc=leonro@mellanox.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=maorg@mellanox.com \
    --cc=michaelgur@mellanox.com \
    --cc=monis@mellanox.com \
    --cc=parav@mellanox.com \
    --cc=sean.hefty@intel.com \
    --cc=valentinef@mellanox.com \
    --cc=yanjunz@mellanox.com \
    --cc=yishaih@mellanox.com \
    --cc=yonatanc@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.