* [PATCH V2 0/3] Misc changes for siw
@ 2023-08-21  8:47 Guoqing Jiang
  2023-08-21  8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21  8:47 UTC (permalink / raw)
  To: bmt, jgg, leon; +Cc: linux-rdma

V2 changes:
1. Add Fixes tags to the first two patches, per Leon.

Hi,

The first patch fixes the call trace below, which can occur if siw_connect
jumps to its error path after cep has been allocated (to trigger it, I
manually set rv to -1 after siw_send_mpareqrep).

[   97.341035] ------------[ cut here ]------------
[   97.341037] WARNING: CPU: 0 PID: 143 at drivers/infiniband/sw/siw/siw_cm.c:444 siw_cep_put+0x1c5/0x1e0 [siw]
...
[   97.341126] CPU: 0 PID: 143 Comm: kworker/u4:4 Tainted: G           OE      6.5.0-rc3+ #16
[   97.341128] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
[   97.341130] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[   97.341137] RIP: 0010:siw_cep_put+0x1c5/0x1e0 [siw]
...
[   97.341159] Call Trace:
[   97.341160]  <TASK>
[   97.341162]  ? show_regs+0x72/0x90
[   97.341166]  ? siw_cep_put+0x1c5/0x1e0 [siw]
[   97.341170]  ? __warn+0x8d/0x1a0
[   97.341175]  ? siw_cep_put+0x1c5/0x1e0 [siw]
[   97.341180]  ? report_bug+0x1f9/0x250
[   97.341185]  ? handle_bug+0x46/0x90
[   97.341188]  ? exc_invalid_op+0x19/0x80
[   97.341190]  ? asm_exc_invalid_op+0x1b/0x20
[   97.341196]  ? siw_cep_put+0x1c5/0x1e0 [siw]
[   97.341204]  siw_connect+0x474/0x780 [siw]
[   97.341211]  iw_cm_connect+0x1ca/0x250 [iw_cm]
[   97.341216]  rdma_connect_locked+0x1bf/0x940 [rdma_cm]
[   97.341227]  nvme_rdma_cm_handler+0x5d7/0x9c0 [nvme_rdma]
[   97.341235]  cma_cm_event_handler+0x4f/0x170 [rdma_cm]
[   97.341241]  cma_work_handler+0x6a/0xe0 [rdma_cm]
[   97.341247]  process_one_work+0x2bd/0x590
...

The second patch makes the debug message consistent with the condition it
reports, and the last one cleans up the code a bit. Please help review them.

Thanks,
Guoqing

Guoqing Jiang (3):
  RDMA/siw: Balance the reference of cep->kref in the error path
  RDMA/siw: Correct wrong debug message
  RDMA/siw: Call llist_reverse_order in siw_run_sq

 drivers/infiniband/sw/siw/siw_cm.c    |  1 -
 drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
 drivers/infiniband/sw/siw/siw_verbs.c |  2 +-
 3 files changed, 2 insertions(+), 13 deletions(-)

-- 
2.35.3



* [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path
  2023-08-21  8:47 [PATCH V2 0/3] Misc changes for siw Guoqing Jiang
@ 2023-08-21  8:47 ` Guoqing Jiang
  2023-08-21 12:06   ` Bernard Metzler
  2023-08-21  8:47 ` [PATCH V2 2/3] RDMA/siw: Correct wrong debug message Guoqing Jiang
  2023-08-21  8:47 ` [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq Guoqing Jiang
  2 siblings, 1 reply; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21  8:47 UTC (permalink / raw)
  To: bmt, jgg, leon; +Cc: linux-rdma

siw_connect can jump to the error path in the following cases after cep
has been allocated successfully:

1. siw_cm_alloc_work fails. In this case the socket is not yet associated
with cep, so siw_socket_disassoc does not call siw_cep_put. We need to call
siw_cep_put twice, since cep->kref was increased once after it was
initialized.

2. siw_cm_queue_work can't find a work element, which means siw_cep_get is
not called in siw_cm_queue_work. cep->kref has then been increased twice
after it was initialized: once by siw_cep_get and once when the socket was
associated with cep. So we need to call siw_cep_put three times (one of
them in siw_socket_disassoc).

3. siw_send_mpareqrep returns an error. This scenario is similar to 2.

In each case the error path calls siw_cep_put once too often, so remove one
siw_cep_put from it.
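
The counting argument above can be checked with a small userspace model.
This is only a sketch under stated assumptions: struct toy_cep and the
toy_* helpers are hypothetical stand-ins written for illustration, not the
real siw or kref code. The model starts the count at 1 on init, takes a
number of extra references, and then drops references the way an error
path would.

```c
#include <assert.h>

/* Toy stand-in for a reference-counted endpoint; not the real struct siw_cep. */
struct toy_cep {
	int refcount;	/* models cep->kref */
	int released;	/* set once the last reference has been dropped */
	int underflow;	/* puts issued after release (the unbalanced case) */
};

static void toy_cep_init(struct toy_cep *cep)
{
	cep->refcount = 1;	/* the creator holds the initial reference */
	cep->released = 0;
	cep->underflow = 0;
}

static void toy_cep_get(struct toy_cep *cep)
{
	cep->refcount++;
}

static void toy_cep_put(struct toy_cep *cep)
{
	if (cep->released) {
		/* One put too many: record it instead of touching freed state. */
		cep->underflow++;
		return;
	}
	if (--cep->refcount == 0)
		cep->released = 1;
}

/*
 * Take `gets` extra references after init, then drop `puts` references.
 * Returns 0 if balanced, the number of surplus puts if there were too
 * many, or -1 if the object was never released (a leak).
 */
static int toy_run_error_path(int gets, int puts)
{
	struct toy_cep cep;
	int i;

	toy_cep_init(&cep);
	for (i = 0; i < gets; i++)
		toy_cep_get(&cep);
	for (i = 0; i < puts; i++)
		toy_cep_put(&cep);

	if (!cep.released)
		return -1;
	return cep.underflow;
}
```

In this model, scenario 1 corresponds to toy_run_error_path(1, 2) and
scenarios 2 and 3 to toy_run_error_path(2, 3), both of which are balanced.
Adding one more put to either call reports a surplus put.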

Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/infiniband/sw/siw/siw_cm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index da530c0404da..a2605178f4ed 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1501,7 +1501,6 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params)
 
 		cep->cm_id = NULL;
 		id->rem_ref(id);
-		siw_cep_put(cep);
 
 		qp->cep = NULL;
 		siw_cep_put(cep);
-- 
2.35.3



* [PATCH V2 2/3] RDMA/siw: Correct wrong debug message
  2023-08-21  8:47 [PATCH V2 0/3] Misc changes for siw Guoqing Jiang
  2023-08-21  8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
@ 2023-08-21  8:47 ` Guoqing Jiang
  2023-08-21 11:57   ` Bernard Metzler
  2023-08-21  8:47 ` [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq Guoqing Jiang
  2 siblings, 1 reply; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21  8:47 UTC (permalink / raw)
  To: bmt, jgg, leon; +Cc: linux-rdma

The condition is pbl->max_buf < num_sle, so the message should print
num_sle first and pbl->max_buf second. Also replace mem->pbl with pbl
while at it.

Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/infiniband/sw/siw/siw_verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index 398ec13db624..4832723dc244 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -1494,7 +1494,7 @@ int siw_map_mr_sg(struct ib_mr *base_mr, struct scatterlist *sl, int num_sle,
 
 	if (pbl->max_buf < num_sle) {
 		siw_dbg_mem(mem, "too many SGE's: %d > %d\n",
-			    mem->pbl->max_buf, num_sle);
+			    num_sle, pbl->max_buf);
 		return -ENOMEM;
 	}
 	for_each_sg(sl, slp, num_sle, i) {
-- 
2.35.3



* [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq
  2023-08-21  8:47 [PATCH V2 0/3] Misc changes for siw Guoqing Jiang
  2023-08-21  8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
  2023-08-21  8:47 ` [PATCH V2 2/3] RDMA/siw: Correct wrong debug message Guoqing Jiang
@ 2023-08-21  8:47 ` Guoqing Jiang
  2023-08-21 12:00   ` Bernard Metzler
  2 siblings, 1 reply; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21  8:47 UTC (permalink / raw)
  To: bmt, jgg, leon; +Cc: linux-rdma

Call llist_reverse_order to get the FIFO list instead of open-coding the
reversal.
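
For readers unfamiliar with the helper, here is a minimal userspace sketch
of the in-place reversal that the removed loop performs and that
llist_reverse_order provides in the kernel. It is an illustration only:
struct toy_node, toy_push and toy_reverse_order are hypothetical names, not
kernel APIs. Entries are pushed at the head, so the newest entry comes
first, and reversing the list restores submission (FIFO) order.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal singly linked node, standing in for struct llist_node. */
struct toy_node {
	struct toy_node *next;
	int id;
};

/* Push at the head, so the most recently added node ends up first. */
static struct toy_node *toy_push(struct toy_node *head, struct toy_node *n)
{
	n->next = head;
	return n;
}

/* Reverse the list in place and return the new head. */
static struct toy_node *toy_reverse_order(struct toy_node *head)
{
	struct toy_node *new_head = NULL;

	while (head) {
		struct toy_node *tmp = head;

		head = head->next;	/* advance before relinking tmp */
		tmp->next = new_head;	/* move tmp to the front of the result */
		new_head = tmp;
	}
	return new_head;
}
```

Pushing nodes 1, 2, 3 yields the order 3, 2, 1. After the reversal the
order is 1, 2, 3, so the oldest entry is processed first.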

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 4b292e0504f1..eb3d438828e2 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -1229,17 +1229,7 @@ int siw_run_sq(void *data)
 			break;
 
 		active = llist_del_all(&tx_task->active);
-		/*
-		 * llist_del_all returns a list with newest entry first.
-		 * Re-order list for fairness among QP's.
-		 */
-		while (active) {
-			struct llist_node *tmp = active;
-
-			active = llist_next(active);
-			tmp->next = fifo_list;
-			fifo_list = tmp;
-		}
+		fifo_list = llist_reverse_order(active);
 		while (fifo_list) {
 			qp = container_of(fifo_list, struct siw_qp, tx_list);
 			fifo_list = llist_next(fifo_list);
-- 
2.35.3



* RE: [PATCH V2 2/3] RDMA/siw: Correct wrong debug message
  2023-08-21  8:47 ` [PATCH V2 2/3] RDMA/siw: Correct wrong debug message Guoqing Jiang
@ 2023-08-21 11:57   ` Bernard Metzler
  0 siblings, 0 replies; 8+ messages in thread
From: Bernard Metzler @ 2023-08-21 11:57 UTC (permalink / raw)
  To: Guoqing Jiang, jgg, leon; +Cc: linux-rdma



> -----Original Message-----
> From: Guoqing Jiang <guoqing.jiang@linux.dev>
> Sent: Monday, 21 August 2023 10:48
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] [PATCH V2 2/3] RDMA/siw: Correct wrong debug message
> 
> We need to print num_sle first then pbl->max_buf per the condition.
> Also replace mem->pbl with pbl while at it.
> 
> Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
>  drivers/infiniband/sw/siw/siw_verbs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw_verbs.c
> b/drivers/infiniband/sw/siw/siw_verbs.c
> index 398ec13db624..4832723dc244 100644
> --- a/drivers/infiniband/sw/siw/siw_verbs.c
> +++ b/drivers/infiniband/sw/siw/siw_verbs.c
> @@ -1494,7 +1494,7 @@ int siw_map_mr_sg(struct ib_mr *base_mr, struct
> scatterlist *sl, int num_sle,
> 
>  	if (pbl->max_buf < num_sle) {
>  		siw_dbg_mem(mem, "too many SGE's: %d > %d\n",
> -			    mem->pbl->max_buf, num_sle);
> +			    num_sle, pbl->max_buf);
>  		return -ENOMEM;
>  	}
>  	for_each_sg(sl, slp, num_sle, i) {
> --
> 2.35.3

Makes sense, thank you!

Acked-by: Bernard Metzler <bmt@zurich.ibm.com>


* RE: [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq
  2023-08-21  8:47 ` [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq Guoqing Jiang
@ 2023-08-21 12:00   ` Bernard Metzler
  2023-08-21 13:23     ` Guoqing Jiang
  0 siblings, 1 reply; 8+ messages in thread
From: Bernard Metzler @ 2023-08-21 12:00 UTC (permalink / raw)
  To: Guoqing Jiang, jgg, leon; +Cc: linux-rdma



> -----Original Message-----
> From: Guoqing Jiang <guoqing.jiang@linux.dev>
> Sent: Monday, 21 August 2023 10:48
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in
> siw_run_sq
> 
> We can call the function to get fifo list.
> 
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
>  drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
>  1 file changed, 1 insertion(+), 11 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 4b292e0504f1..eb3d438828e2 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -1229,17 +1229,7 @@ int siw_run_sq(void *data)
>  			break;
> 
>  		active = llist_del_all(&tx_task->active);
> -		/*
> -		 * llist_del_all returns a list with newest entry first.
> -		 * Re-order list for fairness among QP's.
> -		 */
> -		while (active) {
> -			struct llist_node *tmp = active;
> -
> -			active = llist_next(active);
> -			tmp->next = fifo_list;
> -			fifo_list = tmp;
> -		}
> +		fifo_list = llist_reverse_order(active);
>  		while (fifo_list) {
>  			qp = container_of(fifo_list, struct siw_qp, tx_list);
>  			fifo_list = llist_next(fifo_list);
> --
> 2.35.3

Oh yes, that function already exists. Many thanks!
I'd keep the comment, since it might not be obvious why we
reverse the list.

Acked-by: Bernard Metzler <bmt@zurich.ibm.com>


* RE: [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path
  2023-08-21  8:47 ` [PATCH V2 1/3] RDMA/siw: Balance the reference of cep->kref in the error path Guoqing Jiang
@ 2023-08-21 12:06   ` Bernard Metzler
  0 siblings, 0 replies; 8+ messages in thread
From: Bernard Metzler @ 2023-08-21 12:06 UTC (permalink / raw)
  To: Guoqing Jiang, jgg, leon; +Cc: linux-rdma



> -----Original Message-----
> From: Guoqing Jiang <guoqing.jiang@linux.dev>
> Sent: Monday, 21 August 2023 10:48
> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] [PATCH V2 1/3] RDMA/siw: Balance the reference of cep-
> >kref in the error path
> 
> The siw_connect can go to err in below after cep is allocated successfully:
> 
> 1. If siw_cm_alloc_work returns failure. In this case socket is not
> associated with cep so siw_cep_put can't be called by siw_socket_disassoc.
> We need to call siw_cep_put twice since cep->kref is increased once after
> it was initialized.
> 
> 2. If siw_cm_queue_work can't find a work, which means siw_cep_get is not
> called in siw_cm_queue_work, so cep->kref is increased twice by siw_cep_get
> and when associate socket with cep after it was initialized. So we need to
> call siw_cep_put three times (one in siw_socket_disassoc).
> 
> 3. siw_send_mpareqrep returns error, this scenario is similar as 2.
> 
> So we need to remove one siw_cep_put in the error path.
> 
> Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
>  drivers/infiniband/sw/siw/siw_cm.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw_cm.c
> b/drivers/infiniband/sw/siw/siw_cm.c
> index da530c0404da..a2605178f4ed 100644
> --- a/drivers/infiniband/sw/siw/siw_cm.c
> +++ b/drivers/infiniband/sw/siw/siw_cm.c
> @@ -1501,7 +1501,6 @@ int siw_connect(struct iw_cm_id *id, struct
> iw_cm_conn_param *params)
> 
>  		cep->cm_id = NULL;
>  		id->rem_ref(id);
> -		siw_cep_put(cep);
> 
>  		qp->cep = NULL;
>  		siw_cep_put(cep);
> --
> 2.35.3

That's correct, thank you!


Acked-by: Bernard Metzler <bmt@zurich.ibm.com>



* Re: [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in siw_run_sq
  2023-08-21 12:00   ` Bernard Metzler
@ 2023-08-21 13:23     ` Guoqing Jiang
  0 siblings, 0 replies; 8+ messages in thread
From: Guoqing Jiang @ 2023-08-21 13:23 UTC (permalink / raw)
  To: Bernard Metzler, jgg, leon; +Cc: linux-rdma



On 8/21/23 20:00, Bernard Metzler wrote:
>
>> -----Original Message-----
>> From: Guoqing Jiang <guoqing.jiang@linux.dev>
>> Sent: Monday, 21 August 2023 10:48
>> To: Bernard Metzler <BMT@zurich.ibm.com>; jgg@ziepe.ca; leon@kernel.org
>> Cc: linux-rdma@vger.kernel.org
>> Subject: [EXTERNAL] [PATCH V2 3/3] RDMA/siw: Call llist_reverse_order in
>> siw_run_sq
>>
>> We can call the function to get fifo list.
>>
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
>> ---
>>   drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
>>   1 file changed, 1 insertion(+), 11 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
>> b/drivers/infiniband/sw/siw/siw_qp_tx.c
>> index 4b292e0504f1..eb3d438828e2 100644
>> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
>> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
>> @@ -1229,17 +1229,7 @@ int siw_run_sq(void *data)
>>   			break;
>>
>>   		active = llist_del_all(&tx_task->active);
>> -		/*
>> -		 * llist_del_all returns a list with newest entry first.
>> -		 * Re-order list for fairness among QP's.
>> -		 */
>> -		while (active) {
>> -			struct llist_node *tmp = active;
>> -
>> -			active = llist_next(active);
>> -			tmp->next = fifo_list;
>> -			fifo_list = tmp;
>> -		}
>> +		fifo_list = llist_reverse_order(active);
>>   		while (fifo_list) {
>>   			qp = container_of(fifo_list, struct siw_qp, tx_list);
>>   			fifo_list = llist_next(fifo_list);
>> --
>> 2.35.3
> Oh yes, that function already exists. Many thanks!
> I'd keep the comment, since it might be not obvious why we
> reverse the list.

OK, I will add the comment back.

> Acked-by: Bernard Metzler <bmt@zurich.ibm.com>

I appreciate your review!

Thanks,
Guoqing


end of thread, other threads:[~2023-08-21 13:24 UTC | newest]
