* [PATCH V2 0/3] Misc changes for siw
@ 2023-08-21 8:47 Guoqing Jiang
2023-08-21 8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21 8:47 UTC (permalink / raw)
To: bmt, jgg, leon; +Cc: linux-rdma
V2 changes:
1. add Fixes lines for the first two patches per Leon
Hi,
The first patch fixes the call trace below, which can happen if siw_connect
jumps to the error path after cep is allocated (I manually set rv to -1 after
siw_send_mpareqrep to trigger it).
[ 97.341035] ------------[ cut here ]------------
[ 97.341037] WARNING: CPU: 0 PID: 143 at drivers/infiniband/sw/siw/siw_cm.c:444 siw_cep_put+0x1c5/0x1e0 [siw]
...
[ 97.341126] CPU: 0 PID: 143 Comm: kworker/u4:4 Tainted: G OE 6.5.0-rc3+ #16
[ 97.341128] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
[ 97.341130] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[ 97.341137] RIP: 0010:siw_cep_put+0x1c5/0x1e0 [siw]
...
[ 97.341159] Call Trace:
[ 97.341160] <TASK>
[ 97.341162] ? show_regs+0x72/0x90
[ 97.341166] ? siw_cep_put+0x1c5/0x1e0 [siw]
[ 97.341170] ? __warn+0x8d/0x1a0
[ 97.341175] ? siw_cep_put+0x1c5/0x1e0 [siw]
[ 97.341180] ? report_bug+0x1f9/0x250
[ 97.341185] ? handle_bug+0x46/0x90
[ 97.341188] ? exc_invalid_op+0x19/0x80
[ 97.341190] ? asm_exc_invalid_op+0x1b/0x20
[ 97.341196] ? siw_cep_put+0x1c5/0x1e0 [siw]
[ 97.341204] siw_connect+0x474/0x780 [siw]
[ 97.341211] iw_cm_connect+0x1ca/0x250 [iw_cm]
[ 97.341216] rdma_connect_locked+0x1bf/0x940 [rdma_cm]
[ 97.341227] nvme_rdma_cm_handler+0x5d7/0x9c0 [nvme_rdma]
[ 97.341235] cma_cm_event_handler+0x4f/0x170 [rdma_cm]
[ 97.341241] cma_work_handler+0x6a/0xe0 [rdma_cm]
[ 97.341247] process_one_work+0x2bd/0x590
...
The second patch makes the debug message consistent with the condition,
and the last one cleans up the code a bit. Please help review them.
Thanks,
Guoqing
Guoqing Jiang (3):
RDMA/siw: Balance the reference of cep->kref in the error path
RDMA/siw: Correct wrong debug message
RDMA/siw: Call llist_reverse_order in siw_run_sq
drivers/infiniband/sw/siw/siw_cm.c | 1 -
drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
drivers/infiniband/sw/siw/siw_verbs.c | 2 +-
3 files changed, 2 insertions(+), 13 deletions(-)
--
2.35.3
* [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path
2023-08-21 8:47 [PATCH V2 0/3] Misc changes for siw Guoqing Jiang
@ 2023-08-21 8:47 ` Guoqing Jiang
2023-08-21 12:06 ` Bernard Metzler
2023-08-21 8:47 ` [PATCH V2 2/3] RDMA/siw: Correct wrong debug message Guoqing Jiang
2023-08-21 8:47 ` [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq Guoqing Jiang
2 siblings, 1 reply; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21 8:47 UTC (permalink / raw)
To: bmt, jgg, leon; +Cc: linux-rdma
siw_connect can jump to the error path in the following cases after cep has
been allocated successfully:
1. siw_cm_alloc_work fails. In this case the socket is not yet associated
with cep, so siw_cep_put can't be called by siw_socket_disassoc. We need to
call siw_cep_put twice, since cep->kref was increased once after it was
initialized.
2. siw_cm_queue_work can't find a work element, which means siw_cep_get is
not called in siw_cm_queue_work. cep->kref has then been increased twice
after it was initialized: once by siw_cep_get and once when the socket was
associated with cep. So we need to call siw_cep_put three times (one of them
in siw_socket_disassoc).
3. siw_send_mpareqrep returns an error. This scenario is similar to 2.
So we need to remove one siw_cep_put in the error path.
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/infiniband/sw/siw/siw_cm.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index da530c0404da..a2605178f4ed 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1501,7 +1501,6 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
cep->cm_id = NULL;
id->rem_ref(id);
- siw_cep_put(cep);
qp->cep = NULL;
siw_cep_put(cep);
--
2.35.3
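As an illustration of the reference accounting described in the commit
message above, here is a minimal user-space sketch. It is not the siw code:
struct model_cep, model_get, model_put and model_connect_error are
hypothetical names, and the sequence of gets and puts is an assumption based
only on the three cases listed in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for cep->kref; not the siw code. */
struct model_cep {
	int refs;      /* models cep->kref */
	int underflow; /* puts attempted with no reference left */
};

static void model_get(struct model_cep *cep)
{
	cep->refs++;
}

static void model_put(struct model_cep *cep)
{
	if (cep->refs == 0)
		cep->underflow++; /* unbalanced put */
	else
		cep->refs--;
}

/*
 * Replays the error path for the cases in the commit message.
 * sock_assoc: the socket was already associated with cep (cases 2 and 3).
 * extra_put:  keep the siw_cep_put that the patch removes.
 * Returns the number of unbalanced puts.
 */
static int model_connect_error(bool sock_assoc, bool extra_put)
{
	struct model_cep cep = { .refs = 1, .underflow = 0 }; /* initial ref */

	model_get(&cep);         /* reference taken after initialization */
	if (sock_assoc)
		model_get(&cep); /* reference held on behalf of the socket */

	/* error path */
	if (sock_assoc)
		model_put(&cep); /* dropped on socket disassociation */
	if (extra_put)
		model_put(&cep); /* the put removed by this patch */
	model_put(&cep);
	model_put(&cep);

	assert(cep.refs == 0);
	return cep.underflow;
}
```

Under these assumptions, model_connect_error(false, false) and
model_connect_error(true, false) both return 0, while passing
extra_put = true yields one unbalanced put in either case.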
* [PATCH V2 2/3] RDMA/siw: Correct wrong debug message
2023-08-21 8:47 [PATCH V2 0/3] Misc changes for siw Guoqing Jiang
2023-08-21 8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
@ 2023-08-21 8:47 ` Guoqing Jiang
2023-08-21 11:57 ` Bernard Metzler
2023-08-21 8:47 ` [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq Guoqing Jiang
2 siblings, 1 reply; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21 8:47 UTC (permalink / raw)
To: bmt, jgg, leon; +Cc: linux-rdma
Print num_sle first and then pbl->max_buf, so that the message matches the
condition. Also replace mem->pbl with pbl while at it.
Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/infiniband/sw/siw/siw_verbs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index 398ec13db624..4832723dc244 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -1494,7 +1494,7 @@ int siw_map_mr_sg(struct ib_mr *base_mr, struct scatterlist *sl, int num_sle,
if (pbl->max_buf < num_sle) {
siw_dbg_mem(mem, "too many SGE's: %d > %d\n",
- mem->pbl->max_buf, num_sle);
+ num_sle, pbl->max_buf);
return -ENOMEM;
}
for_each_sg(sl, slp, num_sle, i) {
--
2.35.3
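The argument order issue can be reproduced outside the kernel. The sketch
below is only an illustration: format_sge_error is a hypothetical helper,
plain int is assumed for both values, and snprintf stands in for siw_dbg_mem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of siw: formats the message only when
 * the condition from the patch holds. Returns 1 if a message was written.
 */
static int format_sge_error(char *buf, size_t len, int max_buf, int num_sle)
{
	if (max_buf < num_sle) {
		/* Larger value first, so "a > b" reads as a true statement. */
		snprintf(buf, len, "too many SGE's: %d > %d", num_sle, max_buf);
		return 1;
	}
	return 0;
}
```

With max_buf = 4 and num_sle = 9 this writes "too many SGE's: 9 > 4". With
the arguments swapped, as before the patch, the same input would be reported
as "4 > 9", which contradicts the condition that triggered the message.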
* [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq
2023-08-21 8:47 [PATCH V2 0/3] Misc changes for siw Guoqing Jiang
2023-08-21 8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
2023-08-21 8:47 ` [PATCH V2 2/3] RDMA/siw: Correct wrong debug message Guoqing Jiang
@ 2023-08-21 8:47 ` Guoqing Jiang
2023-08-21 12:00 ` Bernard Metzler
2 siblings, 1 reply; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21 8:47 UTC (permalink / raw)
To: bmt, jgg, leon; +Cc: linux-rdma
Call llist_reverse_order to get the FIFO list instead of open-coding the
list reversal.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
1 file changed, 1 insertion(+), 11 deletions(-)
diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 4b292e0504f1..eb3d438828e2 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -1229,17 +1229,7 @@ int siw_run_sq(void *data)
break;
active = llist_del_all(&tx_task->active);
- /*
- * llist_del_all returns a list with newest entry first.
- * Re-order list for fairness among QP's.
- */
- while (active) {
- struct llist_node *tmp = active;
-
- active = llist_next(active);
- tmp->next = fifo_list;
- fifo_list = tmp;
- }
+ fifo_list = llist_reverse_order(active);
while (fifo_list) {
qp = container_of(fifo_list, struct siw_qp, tx_list);
fifo_list = llist_next(fifo_list);
--
2.35.3
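For readers unfamiliar with why the list is reversed, here is a minimal
user-space sketch of the same idea. struct node, push_front and
reverse_order are hypothetical stand-ins for illustration and not the kernel
llist API. Entries pushed at the head come back newest first, and one
in-place reversal restores submission order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space node; the kernel type is struct llist_node. */
struct node {
	struct node *next;
	int id;
};

/* Push at the head, so the newest entry ends up first. */
static struct node *push_front(struct node *head, struct node *n)
{
	n->next = head;
	return n;
}

/* Reverse in place; returns the new head (oldest entry first). */
static struct node *reverse_order(struct node *head)
{
	struct node *new_head = NULL;

	while (head) {
		struct node *tmp = head;

		head = head->next;
		tmp->next = new_head;
		new_head = tmp;
	}
	return new_head;
}
```

Pushing nodes with ids 1, 2 and 3 in that order gives a list that starts with
3. After reverse_order the list starts with 1, so the entry queued first is
also served first.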
* RE: [PATCH V2 2/3] RDMA/siw: Correct wrong debug message
2023-08-21 8:47 ` [PATCH V2 2/3] RDMA/siw: Correct wrong debug message Guoqing Jiang
@ 2023-08-21 11:57 ` Bernard Metzler
0 siblings, 0 replies; 8+ messages in thread
From: Bernard Metzler @ 2023-08-21 11:57 UTC (permalink / raw)
To: Guoqing Jiang, jgg, leon; +Cc: linux-rdma
> -----Original Message-----
> From: Guoqing Jiang <guoqing.jiang@linux.dev>
> Sent: Monday, 21 August 2023 10:48
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] [PATCH V2 2/3] RDMA/siw: Correct wrong debug message
>
> We need to print num_sle first then pbl->max_buf per the condition.
> Also replace mem->pbl with pbl while at it.
>
> Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> drivers/infiniband/sw/siw/siw_verbs.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_verbs.c
> b/drivers/infiniband/sw/siw/siw_verbs.c
> index 398ec13db624..4832723dc244 100644
> --- a/drivers/infiniband/sw/siw/siw_verbs.c
> +++ b/drivers/infiniband/sw/siw/siw_verbs.c
> @@ -1494,7 +1494,7 @@ int siw_map_mr_sg(struct ib_mr *base_mr, struct
> scatterlist *sl, int num_sle,
>
> if (pbl->max_buf < num_sle) {
> siw_dbg_mem(mem, "too many SGE's: %d > %d\n",
> - mem->pbl->max_buf, num_sle);
> + num_sle, pbl->max_buf);
> return -ENOMEM;
> }
> for_each_sg(sl, slp, num_sle, i) {
> --
> 2.35.3
Makes sense, thank you!
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
* RE: [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq
2023-08-21 8:47 ` [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq Guoqing Jiang
@ 2023-08-21 12:00 ` Bernard Metzler
2023-08-21 13:23 ` Guoqing Jiang
0 siblings, 1 reply; 8+ messages in thread
From: Bernard Metzler @ 2023-08-21 12:00 UTC (permalink / raw)
To: Guoqing Jiang, jgg, leon; +Cc: linux-rdma
> -----Original Message-----
> From: Guoqing Jiang <guoqing.jiang@linux.dev>
> Sent: Monday, 21 August 2023 10:48
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in
> siw_run_sq
>
> We can call the function to get fifo list.
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
> 1 file changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 4b292e0504f1..eb3d438828e2 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -1229,17 +1229,7 @@ int siw_run_sq(void *data)
> break;
>
> active = llist_del_all(&tx_task->active);
> - /*
> - * llist_del_all returns a list with newest entry first.
> - * Re-order list for fairness among QP's.
> - */
> - while (active) {
> - struct llist_node *tmp = active;
> -
> - active = llist_next(active);
> - tmp->next = fifo_list;
> - fifo_list = tmp;
> - }
> + fifo_list = llist_reverse_order(active);
> while (fifo_list) {
> qp = container_of(fifo_list, struct siw_qp, tx_list);
> fifo_list = llist_next(fifo_list);
> --
> 2.35.3
Oh yes, that function already exists. Many thanks!
I'd keep the comment, since it might not be obvious why we
reverse the list.
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
* RE: [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path
2023-08-21 8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
@ 2023-08-21 12:06 ` Bernard Metzler
0 siblings, 0 replies; 8+ messages in thread
From: Bernard Metzler @ 2023-08-21 12:06 UTC (permalink / raw)
To: Guoqing Jiang, jgg, leon; +Cc: linux-rdma
> -----Original Message-----
> From: Guoqing Jiang <guoqing.jiang@linux.dev>
> Sent: Monday, 21 August 2023 10:48
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] [PATCH V2 1/3] RDMA/siw: Balance the reference of cep-
> >kref in the error path
>
> The siw_connect can go to err in below after cep is allocated successfully:
>
> 1. If siw_cm_alloc_work returns failure. In this case socket is not
> associated with cep so siw_cep_put can't be called by siw_socket_disassoc.
> We need to call siw_cep_put twice since cep->kref is increased once after
> it was initialized.
>
> 2. If siw_cm_queue_work can't find a work, which means siw_cep_get is not
> called in siw_cm_queue_work, so cep->kref is increased twice by siw_cep_get
> and when associate socket with cep after it was initialized. So we need to
> call siw_cep_put three times (one in siw_socket_disassoc).
>
> 3. siw_send_mpareqrep returns error, this scenario is similar as 2.
>
> So we need to remove one siw_cep_put in the error path.
>
> Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> drivers/infiniband/sw/siw/siw_cm.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_cm.c
> b/drivers/infiniband/sw/siw/siw_cm.c
> index da530c0404da..a2605178f4ed 100644
> --- a/drivers/infiniband/sw/siw/siw_cm.c
> +++ b/drivers/infiniband/sw/siw/siw_cm.c
> @@ -1501,7 +1501,6 @@ int siw_connect(struct iw_cm_id *id, struct
> iw_cm_conn_param *params)
>
> cep->cm_id = NULL;
> id->rem_ref(id);
> - siw_cep_put(cep);
>
> qp->cep = NULL;
> siw_cep_put(cep);
> --
> 2.35.3
That's correct, thank you!
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
* Re: [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq
2023-08-21 12:00 ` Bernard Metzler
@ 2023-08-21 13:23 ` Guoqing Jiang
0 siblings, 0 replies; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21 13:23 UTC (permalink / raw)
To: Bernard Metzler, jgg, leon; +Cc: linux-rdma
On 8/21/23 20:00, Bernard Metzler wrote:
>
>> -----Original Message-----
>> From: Guoqing Jiang <guoqing.jiang@linux.dev>
>> Sent: Monday, 21 August 2023 10:48
>> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
>> Cc: linux-rdma@vger.kernel.org
>> Subject: [EXTERNAL] [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in
>> siw_run_sq
>>
>> We can call the function to get fifo list.
>>
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
>> ---
>> drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
>> 1 file changed, 1 insertion(+), 11 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
>> b/drivers/infiniband/sw/siw/siw_qp_tx.c
>> index 4b292e0504f1..eb3d438828e2 100644
>> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
>> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
>> @@ -1229,17 +1229,7 @@ int siw_run_sq(void *data)
>> break;
>>
>> active = llist_del_all(&tx_task->active);
>> - /*
>> - * llist_del_all returns a list with newest entry first.
>> - * Re-order list for fairness among QP's.
>> - */
>> - while (active) {
>> - struct llist_node *tmp = active;
>> -
>> - active = llist_next(active);
>> - tmp->next = fifo_list;
>> - fifo_list = tmp;
>> - }
>> + fifo_list = llist_reverse_order(active);
>> while (fifo_list) {
>> qp = container_of(fifo_list, struct siw_qp, tx_list);
>> fifo_list = llist_next(fifo_list);
>> --
>> 2.35.3
> Oh yes, that function already exists. Many thanks!
> I'd keep the comment, since it might not be obvious why we
> reverse the list.
Ok, will add them back.
> Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
I appreciate your review!
Thanks,
Guoqing