* [PATCH v5 1/4] RDMA/rxe: Update wqe_index for each wqe error completion
2022-07-04 6:00 [PATCH v5 0/4] RDMA/rxe: Fix no completion event issue lizhijian
@ 2022-07-04 6:00 ` lizhijian
2022-07-04 6:00 ` [PATCH v5 2/4] RDMA/rxe: Generate error completion for error requester QP state lizhijian
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: lizhijian @ 2022-07-04 6:00 UTC (permalink / raw)
To: Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma, Bob Pearson
Cc: Cheng Xu, lizhijian
Previously, if user space keeps sending abnormal wqe, queue.index will
keep increasing while qp->req.wqe_index doesn't. Once
qp->req.wqe_index==queue.index in next round, req_next_wqe() will treat queue
as empty. In that case, no new completion is generated.
Update wqe_index for each wqe completion so that req_next_wqe() can get
the next wqe properly.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_req.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 9d98237389cf..4ffc4ebd6e28 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -759,6 +759,8 @@ int rxe_requester(void *arg)
 	if (ah)
 		rxe_put(ah);
 err:
+	/* update wqe_index for each wqe completion */
+	qp->req.wqe_index = queue_next_index(qp->sq.queue, qp->req.wqe_index);
 	wqe->state = wqe_state_error;
 	__rxe_do_task(&qp->comp.task);
--
2.31.1
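
The stall described in the commit message of patch 1/4 can be reproduced with a toy model. The sketch below is not rxe code: `toy_sq`, `toy_next_index()`, `toy_fail_one()` and `TOY_DEPTH` are hypothetical names, the producer index is assumed to play the role of queue.index, and the consumer side of the real queue is ignored. It only shows why a requester index that never advances on the error path eventually compares equal to the producer index and makes the queue look empty.

```c
#include <stdbool.h>

/* Hypothetical toy queue depth; must be a power of two for the mask below. */
#define TOY_DEPTH 4u

/* Toy stand-in for the two send queue indices discussed in the patch. */
struct toy_sq {
	unsigned int producer;  /* next slot user space fills (assumed queue.index) */
	unsigned int wqe_index; /* next slot the requester examines */
};

/* Advance an index by one slot, wrapping at TOY_DEPTH. */
static unsigned int toy_next_index(unsigned int index)
{
	return (index + 1u) & (TOY_DEPTH - 1u);
}

/* The requester treats the queue as empty when both indices are equal. */
static bool toy_queue_empty(const struct toy_sq *sq)
{
	return sq->wqe_index == sq->producer;
}

/*
 * Handle one bad WQE on the error path. Returns true if an error
 * completion would be generated, false if the queue looked empty.
 */
static bool toy_fail_one(struct toy_sq *sq, bool advance_on_error)
{
	if (toy_queue_empty(sq))
		return false;
	if (advance_on_error)
		sq->wqe_index = toy_next_index(sq->wqe_index);
	return true;
}

/* Post n bad WQEs one at a time and count the error completions. */
static unsigned int toy_count_completions(unsigned int n, bool advance_on_error)
{
	struct toy_sq sq = { 0u, 0u };
	unsigned int completions = 0u;
	unsigned int i;

	for (i = 0u; i < n; i++) {
		sq.producer = toy_next_index(sq.producer);
		if (toy_fail_one(&sq, advance_on_error))
			completions++;
	}
	return completions;
}
```

Calling `toy_count_completions(8, false)` loses a completion every time the producer index wraps back onto the stuck `wqe_index`, while `toy_count_completions(8, true)` yields one completion per posted WQE.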
* [PATCH v5 2/4] RDMA/rxe: Generate error completion for error requester QP state
From: lizhijian @ 2022-07-04 6:00 UTC (permalink / raw)
To: Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma, Bob Pearson
Cc: Cheng Xu, lizhijian
As per the IBTA specification, all subsequent WQEs while the QP is in the
error state should be completed with a flush error.
Here we check QP_STATE_ERROR after req_next_wqe() so that rxe_completer()
has a chance to be called, where it will set the CQ state to FLUSH ERROR and
the completion can be associated with its WQE.
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
V5: parentheses issue # Cheng Xu
V4: check QP ERROR before QP RESET # Bob
V3: unlikely() optimization # Cheng Xu <chengyou@linux.alibaba.com>
update commit log # Haakon Bugge <haakon.bugge@oracle.com>
---
drivers/infiniband/sw/rxe/rxe_req.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 4ffc4ebd6e28..6d2742997e1b 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -610,9 +610,20 @@ int rxe_requester(void *arg)
 		return -EAGAIN;
 next_wqe:
-	if (unlikely(!qp->valid || qp->req.state == QP_STATE_ERROR))
+	if (unlikely(!qp->valid))
 		goto exit;
+	if (unlikely(qp->req.state == QP_STATE_ERROR)) {
+		wqe = req_next_wqe(qp);
+		if (wqe)
+			/*
+			 * Generate an error completion for error qp state
+			 */
+			goto err;
+		else
+			goto exit;
+	}
+
 	if (unlikely(qp->req.state == QP_STATE_RESET)) {
 		qp->req.wqe_index = queue_get_consumer(q,
 						QUEUE_TYPE_FROM_CLIENT);
--
2.31.1
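
The decision order that patch 2/4 introduces at the next_wqe label can be sketched as a pure function. This is an illustration only: `flow_next_step()` and the `flow_*` enums are hypothetical names, not rxe symbols, and everything the driver does after these checks is collapsed into a single "continue" result.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the requester states named in the patch. */
enum flow_qp_state { FLOW_QP_RESET, FLOW_QP_READY, FLOW_QP_ERROR };

/* What the requester does next in this toy model. */
enum flow_step {
	FLOW_EXIT,             /* nothing to do, leave the requester */
	FLOW_ERROR_COMPLETION, /* take the err path for the pending WQE */
	FLOW_RESET,            /* run the reset-state bookkeeping */
	FLOW_CONTINUE          /* go on with normal WQE processing */
};

/*
 * Mirror the order of the checks in the patched code: validity first,
 * then the error state, then the reset state.
 */
static enum flow_step flow_next_step(bool qp_valid, enum flow_qp_state state,
				     bool wqe_pending)
{
	if (!qp_valid)
		return FLOW_EXIT;

	if (state == FLOW_QP_ERROR) {
		/* A pending WQE still gets an error completion. */
		return wqe_pending ? FLOW_ERROR_COMPLETION : FLOW_EXIT;
	}

	if (state == FLOW_QP_RESET)
		return FLOW_RESET;

	return FLOW_CONTINUE;
}
```

Before the patch, the error state shared the first check with `!qp->valid` and always exited, so a WQE posted to a QP in the error state never reached the completer.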
* [PATCH v5 3/4] RDMA/rxe: Split qp state for requester and completer
From: lizhijian @ 2022-07-04 6:00 UTC (permalink / raw)
To: Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma, Bob Pearson
Cc: Cheng Xu
From: Bob Pearson <rpearsonhpe@gmail.com>
Currently the requester can continue to process send wqes after
a local qp operation error is detected because the setting of
the qp state to the error state is deferred until later. This
patch splits the qp state for the completer and requester into
two separate states and sets qp->req.state = QP_STATE_ERROR as
soon as the error is detected, before another wqe can be executed.
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
V4: new patch
---
drivers/infiniband/sw/rxe/rxe_comp.c | 6 +++---
drivers/infiniband/sw/rxe/rxe_qp.c | 5 +++++
drivers/infiniband/sw/rxe/rxe_req.c | 1 +
drivers/infiniband/sw/rxe/rxe_verbs.h | 1 +
4 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index da3a398053b8..0b68630a3e49 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -565,10 +565,10 @@ int rxe_completer(void *arg)
 	if (!rxe_get(qp))
 		return -EAGAIN;
-	if (!qp->valid || qp->req.state == QP_STATE_ERROR ||
-	    qp->req.state == QP_STATE_RESET) {
+	if (!qp->valid || qp->comp.state == QP_STATE_ERROR ||
+	    qp->comp.state == QP_STATE_RESET) {
 		rxe_drain_resp_pkts(qp, qp->valid &&
-				    qp->req.state == QP_STATE_ERROR);
+				    qp->comp.state == QP_STATE_ERROR);
 		ret = -EAGAIN;
 		goto done;
 	}
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 22e9b85344c3..a95d3b49ae20 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -230,6 +230,7 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 					QUEUE_TYPE_FROM_CLIENT);
 	qp->req.state = QP_STATE_RESET;
+	qp->comp.state = QP_STATE_RESET;
 	qp->req.opcode = -1;
 	qp->comp.opcode = -1;
@@ -490,6 +491,7 @@ static void rxe_qp_reset(struct rxe_qp *qp)
 	/* move qp to the reset state */
 	qp->req.state = QP_STATE_RESET;
+	qp->comp.state = QP_STATE_RESET;
 	qp->resp.state = QP_STATE_RESET;
 	/* let state machines reset themselves drain work and packet queues
@@ -552,6 +554,7 @@ void rxe_qp_error(struct rxe_qp *qp)
 {
 	qp->req.state = QP_STATE_ERROR;
 	qp->resp.state = QP_STATE_ERROR;
+	qp->comp.state = QP_STATE_ERROR;
 	qp->attr.qp_state = IB_QPS_ERR;
 	/* drain work and packet queues */
@@ -689,6 +692,7 @@ int rxe_qp_from_attr(struct rxe_qp *qp, struct ib_qp_attr *attr, int mask,
 			pr_debug("qp#%d state -> INIT\n", qp_num(qp));
 			qp->req.state = QP_STATE_INIT;
 			qp->resp.state = QP_STATE_INIT;
+			qp->comp.state = QP_STATE_INIT;
 			break;
 		case IB_QPS_RTR:
@@ -699,6 +703,7 @@ int rxe_qp_from_attr(struct rxe_qp *qp, struct ib_qp_attr *attr, int mask,
 		case IB_QPS_RTS:
 			pr_debug("qp#%d state -> RTS\n", qp_num(qp));
 			qp->req.state = QP_STATE_READY;
+			qp->comp.state = QP_STATE_READY;
 			break;
 		case IB_QPS_SQD:
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 6d2742997e1b..ad25290e393d 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -773,6 +773,7 @@ int rxe_requester(void *arg)
 	/* update wqe_index for each wqe completion */
 	qp->req.wqe_index = queue_next_index(qp->sq.queue, qp->req.wqe_index);
 	wqe->state = wqe_state_error;
+	qp->req.state = QP_STATE_ERROR;
 	__rxe_do_task(&qp->comp.task);
 exit:
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index ac464e68c923..bbfffe243fd6 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -129,6 +129,7 @@ struct rxe_req_info {
 };
 struct rxe_comp_info {
+	enum rxe_qp_state state;
 	u32 psn;
 	int opcode;
 	int timeout;
--
2.31.1
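
The effect of splitting the state in patch 3/4 can be sketched with a toy QP that carries one state for the requester and one for the completer. The names below (`split_qp`, `split_local_error()`, `split_qp_error()` and the two predicates) are hypothetical and only model the idea in the commit message: a local error stops the requester immediately, while the completer keeps its own state until the whole QP is moved to the error state.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task QP states. */
enum split_state { SPLIT_RESET, SPLIT_READY, SPLIT_ERROR };

/* Toy QP with separate requester and completer states. */
struct split_qp {
	enum split_state req_state;
	enum split_state comp_state;
};

/* Local error seen by the requester: only its own state changes. */
static void split_local_error(struct split_qp *qp)
{
	qp->req_state = SPLIT_ERROR;
}

/* Whole QP moved to the error state: both states change. */
static void split_qp_error(struct split_qp *qp)
{
	qp->req_state = SPLIT_ERROR;
	qp->comp_state = SPLIT_ERROR;
}

/* The requester only executes new WQEs while its state is ready. */
static bool split_requester_may_send(const struct split_qp *qp)
{
	return qp->req_state == SPLIT_READY;
}

/* The completer drains instead of completing once its state is error or reset. */
static bool split_completer_drains(const struct split_qp *qp)
{
	return qp->comp_state == SPLIT_ERROR || qp->comp_state == SPLIT_RESET;
}
```

With a single shared state, setting the error state early would also switch the completer to draining; with two states the requester can stop at once while the completer still handles the WQE that failed.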
* [PATCH v5 4/4] RDMA/rxe: Fix typo in comment
From: lizhijian @ 2022-07-04 6:00 UTC (permalink / raw)
To: Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma, Bob Pearson
Cc: Cheng Xu, lizhijian
Fix a spelling mistake
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
drivers/infiniband/sw/rxe/rxe_task.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 0c4db5bb17d7..c9b80410cd5b 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -67,7 +67,7 @@ void rxe_do_task(struct tasklet_struct *t)
 				cont = 1;
 			break;
-		/* soneone tried to run the task since the last time we called
+		/* someone tried to run the task since the last time we called
 		 * func, so we will call one more time regardless of the
 		 * return value
 		 */
--
2.31.1
* Re: [PATCH v5 4/4] RDMA/rxe: Fix typo in comment
From: Bob Pearson @ 2022-07-14 17:10 UTC (permalink / raw)
To: lizhijian, Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma; +Cc: Cheng Xu
On 7/4/22 01:00, lizhijian@fujitsu.com wrote:
> Fix a spelling mistake
>
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
> drivers/infiniband/sw/rxe/rxe_task.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> index 0c4db5bb17d7..c9b80410cd5b 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.c
> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> @@ -67,7 +67,7 @@ void rxe_do_task(struct tasklet_struct *t)
> cont = 1;
> break;
>
> - /* soneone tried to run the task since the last time we called
> + /* someone tried to run the task since the last time we called
> * func, so we will call one more time regardless of the
> * return value
> */
I think I snuck this in recently in something else but it is correct.
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
* Re: [PATCH v5 0/4] RDMA/rxe: Fix no completion event issue
From: Leon Romanovsky @ 2022-07-20 5:38 UTC (permalink / raw)
To: lizhijian
Cc: Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma,
Bob Pearson, Cheng Xu
On Mon, Jul 04, 2022 at 06:00:54AM +0000, lizhijian@fujitsu.com wrote:
Please fix your gitconfig to have the same From/author fields as in Signed-off-by.
Thanks
* Re: [PATCH v5 0/4] RDMA/rxe: Fix no completion event issue
From: lizhijian @ 2022-07-20 6:21 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma,
Bob Pearson, Cheng Xu
Hi Leon
On 20/07/2022 13:38, Leon Romanovsky wrote:
> On Mon, Jul 04, 2022 at 06:00:54AM +0000, lizhijian@fujitsu.com wrote:
>
> Please fix your gitconfig to have same From/author fields as in Signed-off-by.
I'm sorry about that. May I know which patch has something wrong? I have not updated these fields recently.
Do you mean "[PATCH v5 3/4] RDMA/rxe: Split qp state for requester and completer", which is from Bob?
I kept his author and SOB for that one.
Thanks
Zhijian
>
> Thanks
* Re: [PATCH v5 0/4] RDMA/rxe: Fix no completion event issue
From: Leon Romanovsky @ 2022-07-20 6:33 UTC (permalink / raw)
To: lizhijian
Cc: Yanjun Zhu, Jason Gunthorpe, Haakon Bugge, linux-rdma,
Bob Pearson, Cheng Xu
On Wed, Jul 20, 2022 at 06:21:29AM +0000, lizhijian@fujitsu.com wrote:
> Hi Leon
>
>
> On 20/07/2022 13:38, Leon Romanovsky wrote:
> > On Mon, Jul 04, 2022 at 06:00:54AM +0000, lizhijian@fujitsu.com wrote:
> >
> > Please fix your gitconfig to have same From/author fields as in Signed-off-by.
>
> I'm sorry about that. May I know which patch has something wrong? I have not updated these fields recently.
> Do you mean "[PATCH v5 3/4] RDMA/rxe: Split qp state for requester and completer", which is from Bob?
> I kept his author and SOB for that one.
No, I'm talking about something else. Almost all your patches are sent
with wrong "From:" field.
Let's take your first patch as an example:
https://lore.kernel.org/linux-rdma/20220704060806.1622849-2-lizhijian@fujitsu.com/
From: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>
...
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
and if I try to apply it, checkpatch emits the following warnings:
➜ kernel git:(rdma-next) ✗ git am --continue
Applying: RDMA/rxe: Update wqe_index for each wqe error completion
➜ kernel git:(rdma-next) git checkpatch
WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#9:
qp->req.wqe_index==queue.index in next round, req_next_wqe() will treat queue
WARNING: From:/Signed-off-by: email name mismatch: 'From: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>' != 'Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>'
0001-RDMA-rxe-Update-wqe_index-for-each-wqe-error-complet.patch total: 0 errors, 2 warnings, 8 lines checked
>
> Thanks
> Zhijian
>
>
> >
> > Thanks