linux-rdma.vger.kernel.org archive mirror
* [PATCH v5 for-next 0/2] RDMA/hns: Add the workqueue framework for flush cqe handler
@ 2019-12-28  3:28 Yixian Liu
  2019-12-28  3:28 ` [PATCH v5 for-next 1/2] " Yixian Liu
  2019-12-28  3:28 ` [PATCH v5 for-next 2/2] RDMA/hns: Delayed flush cqe process with workqueue Yixian Liu
  0 siblings, 2 replies; 9+ messages in thread
From: Yixian Liu @ 2019-12-28  3:28 UTC (permalink / raw)
  To: dledford, jgg, leon; +Cc: linux-rdma, linuxarm

Background:
HiP08 RoCE hardware lacks the ability (a known hardware problem) to flush
outstanding WQEs if the QP state gets into error mode for some reason.
To overcome this hardware problem, as a workaround, when a QP is
detected to be in the error state during various legs like post send and
post receive [1], the flush needs to be performed by the driver.

These data-path legs might get called concurrently from various contexts,
such as process and interrupt context (e.g. by the NVMe driver). Hence,
they need to be protected with spinlocks against concurrent access. This
code already exists in the driver.

Problem:
The earlier patch [1] sent to solve the hardware limitation explained
in the background section had a bug in the software flushing leg: it
acquired a mutex while modifying the QP state to the error state and
while conveying it to the hardware using the mailbox. This caused the
leg to sleep while holding a spinlock and led to a crash.

Suggested solution:
In this series, we propose to defer the flushing of a QP in the error
state using a workqueue.

We do understand that this might have an impact on recovery times,
as the scheduling of the workqueue handler depends on the occupancy of
the system. Therefore, to roughly mitigate this effect, we have tried
to use a Concurrency Managed Workqueue to give the worker thread (and
hence the handler) a chance to run on more than one core.


[1] https://patchwork.kernel.org/patch/10534271/


This patch-set consists of:
[Patch 001] Introduce workqueue based WQE Flush Handler
[Patch 002] Call WQE flush handler in post {send|receive|poll}

v5 changes:
1. Remove the WQ_MEM_RECLAIM flag according to Leon's suggestion.
2. Change to an ordered workqueue to meet the ordering requirement of the
   flush work.

v4 changes:
1. Add a flag indicating the PI is being pushed, according to Jason's
   suggestion, to reduce unnecessary work submitted to the workqueue.

v3 changes:
1. Fall back to dynamically allocating flush_work.

v2 changes:
1. Remove the newly created workqueue according to Jason's comment.
2. Remove dynamic allocation of flush_work according to Jason's comment.
3. Change the current single-threaded irq workqueue to a
   concurrency-managed workqueue to ensure the work is not blocked.

Yixian Liu (2):
  RDMA/hns: Add the workqueue framework for flush cqe handler
  RDMA/hns: Delayed flush cqe process with workqueue

 drivers/infiniband/hw/hns/hns_roce_device.h |  4 ++
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 97 +++++++++++++++--------------
 drivers/infiniband/hw/hns/hns_roce_qp.c     | 45 +++++++++++++
 3 files changed, 98 insertions(+), 48 deletions(-)

-- 
2.7.4



* [PATCH v5 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
  2019-12-28  3:28 [PATCH v5 for-next 0/2] RDMA/hns: Add the workqueue framework for flush cqe handler Yixian Liu
@ 2019-12-28  3:28 ` Yixian Liu
  2020-01-10 15:26   ` Jason Gunthorpe
  2019-12-28  3:28 ` [PATCH v5 for-next 2/2] RDMA/hns: Delayed flush cqe process with workqueue Yixian Liu
  1 sibling, 1 reply; 9+ messages in thread
From: Yixian Liu @ 2019-12-28  3:28 UTC (permalink / raw)
  To: dledford, jgg, leon; +Cc: linux-rdma, linuxarm

HiP08 RoCE hardware lacks the ability (a known hardware problem) to flush
outstanding WQEs if the QP state gets into error mode for some reason.
To overcome this hardware problem, as a workaround, when a QP is
detected to be in the error state during various legs like post send and
post receive [1], the flush needs to be performed by the driver.

The earlier patch [1] sent to solve the hardware limitation explained
in the cover letter had a bug in the software flushing leg: it
acquired a mutex while modifying the QP state to the error state and
while conveying it to the hardware using the mailbox. This caused the
leg to sleep while holding a spinlock and led to a crash.

Suggested solution:
We propose to defer the flushing of a QP in the error state using a
workqueue to get around the limitation of our hardware.

This patch adds the framework of the workqueue and the flush handler
function.

[1] https://patchwork.kernel.org/patch/10534271/

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  2 ++
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  3 +-
 drivers/infiniband/hw/hns/hns_roce_qp.c     | 43 +++++++++++++++++++++++++++++
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 5617434..a87a838 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -900,6 +900,7 @@ struct hns_roce_caps {
 struct hns_roce_work {
 	struct hns_roce_dev *hr_dev;
 	struct work_struct work;
+	struct hns_roce_qp *hr_qp;
 	u32 qpn;
 	u32 cqn;
 	int event_type;
@@ -1220,6 +1221,7 @@ struct ib_qp *hns_roce_create_qp(struct ib_pd *ib_pd,
 				 struct ib_udata *udata);
 int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		       int attr_mask, struct ib_udata *udata);
+void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp);
 void *get_recv_wqe(struct hns_roce_qp *hr_qp, int n);
 void *get_send_wqe(struct hns_roce_qp *hr_qp, int n);
 void *get_send_extend_sge(struct hns_roce_qp *hr_qp, int n);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 1026ac6..2afcedd 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -5966,8 +5966,7 @@ static int hns_roce_v2_init_eq_table(struct hns_roce_dev *hr_dev)
 		goto err_request_irq_fail;
 	}
 
-	hr_dev->irq_workq =
-		create_singlethread_workqueue("hns_roce_irq_workqueue");
+	hr_dev->irq_workq = alloc_ordered_workqueue("hns_roce_irq_workq", 0);
 	if (!hr_dev->irq_workq) {
 		dev_err(dev, "Create irq workqueue failed!\n");
 		ret = -ENOMEM;
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index a6565b6..0c1e74a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -43,6 +43,49 @@
 
 #define SQP_NUM				(2 * HNS_ROCE_MAX_PORTS)
 
+static void flush_work_handle(struct work_struct *work)
+{
+	struct hns_roce_work *flush_work = container_of(work,
+					struct hns_roce_work, work);
+	struct hns_roce_qp *hr_qp = flush_work->hr_qp;
+	struct device *dev = flush_work->hr_dev->dev;
+	struct ib_qp_attr attr;
+	int attr_mask;
+	int ret;
+
+	attr_mask = IB_QP_STATE;
+	attr.qp_state = IB_QPS_ERR;
+
+	ret = hns_roce_modify_qp(&hr_qp->ibqp, &attr, attr_mask, NULL);
+	if (ret)
+		dev_err(dev, "Modify QP to error state failed(%d) during CQE flush\n",
+			ret);
+
+	kfree(flush_work);
+
+	/*
+	 * make sure we signal QP destroy leg that flush QP was completed
+	 * so that it can safely proceed ahead now and destroy QP
+	 */
+	if (atomic_dec_and_test(&hr_qp->refcount))
+		complete(&hr_qp->free);
+}
+
+void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
+{
+	struct hns_roce_work *flush_work;
+
+	flush_work = kzalloc(sizeof(struct hns_roce_work), GFP_ATOMIC);
+	if (!flush_work)
+		return;
+
+	flush_work->hr_dev = hr_dev;
+	flush_work->hr_qp = hr_qp;
+	INIT_WORK(&flush_work->work, flush_work_handle);
+	atomic_inc(&hr_qp->refcount);
+	queue_work(hr_dev->irq_workq, &flush_work->work);
+}
+
 void hns_roce_qp_event(struct hns_roce_dev *hr_dev, u32 qpn, int event_type)
 {
 	struct device *dev = hr_dev->dev;
-- 
2.7.4



* [PATCH v5 for-next 2/2] RDMA/hns: Delayed flush cqe process with workqueue
  2019-12-28  3:28 [PATCH v5 for-next 0/2] RDMA/hns: Add the workqueue framework for flush cqe handler Yixian Liu
  2019-12-28  3:28 ` [PATCH v5 for-next 1/2] " Yixian Liu
@ 2019-12-28  3:28 ` Yixian Liu
  1 sibling, 0 replies; 9+ messages in thread
From: Yixian Liu @ 2019-12-28  3:28 UTC (permalink / raw)
  To: dledford, jgg, leon; +Cc: linux-rdma, linuxarm

HiP08 RoCE hardware lacks the ability (a known hardware problem) to flush
outstanding WQEs if the QP state gets into error mode for some reason.
To overcome this hardware problem, as a workaround, when a QP is
detected to be in the error state during various legs like post send and
post receive [1], the flush needs to be performed by the driver.

The earlier patch [1] sent to solve the hardware limitation explained
in the cover letter had a bug in the software flushing leg: it
acquired a mutex while modifying the QP state to the error state and
while conveying it to the hardware using the mailbox. This caused the
leg to sleep while holding a spinlock and led to a crash.

Suggested solution:
We propose to defer the flushing of a QP in the error state using a
workqueue to get around the limitation of our hardware.

This patch adds the calls to the flush handler from code paths such as
post_send and post_recv when the QP state gets into error mode.

[1] https://patchwork.kernel.org/patch/10534271/

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h |  2 +
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 94 +++++++++++++++--------------
 drivers/infiniband/hw/hns/hns_roce_qp.c     |  2 +
 3 files changed, 52 insertions(+), 46 deletions(-)

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index a87a838..0ba2387 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -667,6 +667,8 @@ struct hns_roce_qp {
 	u8			sl;
 	u8			resp_depth;
 	u8			state;
+	/* 1: PI is being pushed, 0: PI is not being pushed */
+	u8			being_pushed;
 	u32			access_flags;
 	u32                     atomic_rd_en;
 	u32			pkey_index;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 2afcedd..2e8ce21 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -221,11 +221,6 @@ static int set_rwqe_data_seg(struct ib_qp *ibqp, const struct ib_send_wr *wr,
 	return 0;
 }
 
-static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
-				 const struct ib_qp_attr *attr,
-				 int attr_mask, enum ib_qp_state cur_state,
-				 enum ib_qp_state new_state);
-
 static int hns_roce_v2_post_send(struct ib_qp *ibqp,
 				 const struct ib_send_wr *wr,
 				 const struct ib_send_wr **bad_wr)
@@ -238,14 +233,12 @@ static int hns_roce_v2_post_send(struct ib_qp *ibqp,
 	struct hns_roce_wqe_frmr_seg *fseg;
 	struct device *dev = hr_dev->dev;
 	struct hns_roce_v2_db sq_db;
-	struct ib_qp_attr attr;
 	unsigned int sge_ind;
 	unsigned int owner_bit;
 	unsigned long flags;
 	unsigned int ind;
 	void *wqe = NULL;
 	bool loopback;
-	int attr_mask;
 	u32 tmp_len;
 	int ret = 0;
 	u32 hr_op;
@@ -591,18 +584,17 @@ static int hns_roce_v2_post_send(struct ib_qp *ibqp,
 		qp->sq_next_wqe = ind;
 		qp->next_sge = sge_ind;
 
-		if (qp->state == IB_QPS_ERR) {
-			attr_mask = IB_QP_STATE;
-			attr.qp_state = IB_QPS_ERR;
-
-			ret = hns_roce_v2_modify_qp(&qp->ibqp, &attr, attr_mask,
-						    qp->state, IB_QPS_ERR);
-			if (ret) {
-				spin_unlock_irqrestore(&qp->sq.lock, flags);
-				*bad_wr = wr;
-				return ret;
-			}
-		}
+		/*
+		 * Hip08 hardware cannot flush the WQEs in SQ if the QP state
+		 * gets into errored mode. Hence, as a workaround to this
+		 * hardware limitation, driver needs to assist in flushing. But
+		 * the flushing operation uses mailbox to convey the QP state to
+		 * the hardware and which can sleep due to the mutex protection
+		 * around the mailbox calls. Hence, use the deferred flush for
+		 * now.
+		 */
+		if (qp->state == IB_QPS_ERR && !qp->being_pushed)
+			init_flush_work(hr_dev, qp);
 	}
 
 	spin_unlock_irqrestore(&qp->sq.lock, flags);
@@ -619,10 +611,8 @@ static int hns_roce_v2_post_recv(struct ib_qp *ibqp,
 	struct hns_roce_v2_wqe_data_seg *dseg;
 	struct hns_roce_rinl_sge *sge_list;
 	struct device *dev = hr_dev->dev;
-	struct ib_qp_attr attr;
 	unsigned long flags;
 	void *wqe = NULL;
-	int attr_mask;
 	int ret = 0;
 	int nreq;
 	int ind;
@@ -692,19 +682,17 @@ static int hns_roce_v2_post_recv(struct ib_qp *ibqp,
 
 		*hr_qp->rdb.db_record = hr_qp->rq.head & 0xffff;
 
-		if (hr_qp->state == IB_QPS_ERR) {
-			attr_mask = IB_QP_STATE;
-			attr.qp_state = IB_QPS_ERR;
-
-			ret = hns_roce_v2_modify_qp(&hr_qp->ibqp, &attr,
-						    attr_mask, hr_qp->state,
-						    IB_QPS_ERR);
-			if (ret) {
-				spin_unlock_irqrestore(&hr_qp->rq.lock, flags);
-				*bad_wr = wr;
-				return ret;
-			}
-		}
+		/*
+		 * Hip08 hardware cannot flush the WQEs in RQ if the QP state
+		 * gets into errored mode. Hence, as a workaround to this
+		 * hardware limitation, driver needs to assist in flushing. But
+		 * the flushing operation uses mailbox to convey the QP state to
+		 * the hardware and which can sleep due to the mutex protection
+		 * around the mailbox calls. Hence, use the deferred flush for
+		 * now.
+		 */
+		if (hr_qp->state == IB_QPS_ERR && !hr_qp->being_pushed)
+			init_flush_work(hr_dev, hr_qp);
 	}
 	spin_unlock_irqrestore(&hr_qp->rq.lock, flags);
 
@@ -2690,13 +2678,11 @@ static int hns_roce_handle_recv_inl_wqe(struct hns_roce_v2_cqe *cqe,
 static int hns_roce_v2_poll_one(struct hns_roce_cq *hr_cq,
 				struct hns_roce_qp **cur_qp, struct ib_wc *wc)
 {
+	struct hns_roce_dev *hr_dev = to_hr_dev(hr_cq->ib_cq.device);
 	struct hns_roce_srq *srq = NULL;
-	struct hns_roce_dev *hr_dev;
 	struct hns_roce_v2_cqe *cqe;
 	struct hns_roce_qp *hr_qp;
 	struct hns_roce_wq *wq;
-	struct ib_qp_attr attr;
-	int attr_mask;
 	int is_send;
 	u16 wqe_ctr;
 	u32 opcode;
@@ -2720,7 +2706,6 @@ static int hns_roce_v2_poll_one(struct hns_roce_cq *hr_cq,
 				V2_CQE_BYTE_16_LCL_QPN_S);
 
 	if (!*cur_qp || (qpn & HNS_ROCE_V2_CQE_QPN_MASK) != (*cur_qp)->qpn) {
-		hr_dev = to_hr_dev(hr_cq->ib_cq.device);
 		hr_qp = __hns_roce_qp_lookup(hr_dev, qpn);
 		if (unlikely(!hr_qp)) {
 			dev_err(hr_dev->dev, "CQ %06lx with entry for unknown QPN %06x\n",
@@ -2814,14 +2799,22 @@ static int hns_roce_v2_poll_one(struct hns_roce_cq *hr_cq,
 		break;
 	}
 
-	/* flush cqe if wc status is error, excluding flush error */
-	if ((wc->status != IB_WC_SUCCESS) &&
-	    (wc->status != IB_WC_WR_FLUSH_ERR)) {
-		attr_mask = IB_QP_STATE;
-		attr.qp_state = IB_QPS_ERR;
-		return hns_roce_v2_modify_qp(&(*cur_qp)->ibqp,
-					     &attr, attr_mask,
-					     (*cur_qp)->state, IB_QPS_ERR);
+	/*
+	 * Hip08 hardware cannot flush the WQEs in SQ/RQ if the QP state gets
+	 * into errored mode. Hence, as a workaround to this hardware
+	 * limitation, driver needs to assist in flushing. But the flushing
+	 * operation uses mailbox to convey the QP state to the hardware and
+	 * which can sleep due to the mutex protection around the mailbox calls.
+	 * Hence, use the deferred flush for now. Once wc error detected, the
+	 * flushing operation is needed.
+	 */
+	if (wc->status != IB_WC_SUCCESS &&
+	    wc->status != IB_WC_WR_FLUSH_ERR &&
+	    !(*cur_qp)->being_pushed) {
+		dev_err(hr_dev->dev, "error cqe status is: 0x%x\n",
+			status & HNS_ROCE_V2_CQE_STATUS_MASK);
+		init_flush_work(hr_dev, *cur_qp);
+		return 0;
 	}
 
 	if (wc->status == IB_WC_WR_FLUSH_ERR)
@@ -4389,6 +4382,8 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 	struct hns_roce_v2_qp_context *context = ctx;
 	struct hns_roce_v2_qp_context *qpc_mask = ctx + 1;
 	struct device *dev = hr_dev->dev;
+	unsigned long sq_flags = 0;
+	unsigned long rq_flags = 0;
 	int ret;
 
 	/*
@@ -4406,6 +4401,7 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 
 	/* When QP state is err, SQ and RQ WQE should be flushed */
 	if (new_state == IB_QPS_ERR) {
+		spin_lock_irqsave(&hr_qp->sq.lock, sq_flags);
 		roce_set_field(context->byte_160_sq_ci_pi,
 			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_M,
 			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_S,
@@ -4413,8 +4409,12 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 		roce_set_field(qpc_mask->byte_160_sq_ci_pi,
 			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_M,
 			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_S, 0);
+		hr_qp->state = IB_QPS_ERR;
+		hr_qp->being_pushed = 0;
+		spin_unlock_irqrestore(&hr_qp->sq.lock, sq_flags);
 
 		if (!ibqp->srq) {
+			spin_lock_irqsave(&hr_qp->rq.lock, rq_flags);
 			roce_set_field(context->byte_84_rq_ci_pi,
 			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_M,
 			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_S,
@@ -4422,6 +4422,7 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 			roce_set_field(qpc_mask->byte_84_rq_ci_pi,
 			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_M,
 			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_S, 0);
+			spin_unlock_irqrestore(&hr_qp->rq.lock, rq_flags);
 		}
 	}
 
@@ -4466,6 +4467,7 @@ static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 		hr_qp->sq.tail = 0;
 		hr_qp->sq_next_wqe = 0;
 		hr_qp->next_sge = 0;
+		hr_qp->being_pushed = 0;
 		if (hr_qp->rq.wqe_cnt)
 			*hr_qp->rdb.db_record = 0;
 	}
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 0c1e74a..a30f86c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -79,6 +79,7 @@ void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
 	if (!flush_work)
 		return;
 
+	hr_qp->being_pushed = 1;
 	flush_work->hr_dev = hr_dev;
 	flush_work->hr_qp = hr_qp;
 	INIT_WORK(&flush_work->work, flush_work_handle);
@@ -748,6 +749,7 @@ static int hns_roce_create_qp_common(struct hns_roce_dev *hr_dev,
 	spin_lock_init(&hr_qp->rq.lock);
 
 	hr_qp->state = IB_QPS_RESET;
+	hr_qp->being_pushed = 0;
 
 	hr_qp->ibqp.qp_type = init_attr->qp_type;
 
-- 
2.7.4



* Re: [PATCH v5 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
  2019-12-28  3:28 ` [PATCH v5 for-next 1/2] " Yixian Liu
@ 2020-01-10 15:26   ` Jason Gunthorpe
  2020-01-11  9:49     ` Liuyixian (Eason)
  2020-01-13 11:26     ` Liuyixian (Eason)
  0 siblings, 2 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2020-01-10 15:26 UTC (permalink / raw)
  To: Yixian Liu; +Cc: dledford, leon, linux-rdma, linuxarm

On Sat, Dec 28, 2019 at 11:28:54AM +0800, Yixian Liu wrote:
> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
> +{
> +	struct hns_roce_work *flush_work;
> +
> +	flush_work = kzalloc(sizeof(struct hns_roce_work), GFP_ATOMIC);
> +	if (!flush_work)
> +		return;

You changed it to only queue once, so why do we need the allocation
now? That was the whole point..

And the other patch shouldn't be manipulating being_pushed without
some kind of locking

Jason


* Re: [PATCH v5 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
  2020-01-10 15:26   ` Jason Gunthorpe
@ 2020-01-11  9:49     ` Liuyixian (Eason)
  2020-01-13 11:26     ` Liuyixian (Eason)
  1 sibling, 0 replies; 9+ messages in thread
From: Liuyixian (Eason) @ 2020-01-11  9:49 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, leon, linux-rdma, linuxarm



On 2020/1/10 23:26, Jason Gunthorpe wrote:
> On Sat, Dec 28, 2019 at 11:28:54AM +0800, Yixian Liu wrote:
>> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
>> +{
>> +	struct hns_roce_work *flush_work;
>> +
>> +	flush_work = kzalloc(sizeof(struct hns_roce_work), GFP_ATOMIC);
>> +	if (!flush_work)
>> +		return;
> 
> You changed it to only queue once, so why do we need the allocation
> now? That was the whole point..
> 
> And the other patch shouldn't be manipulating being_pushed without
> some kind of locking

Hi Jason, thanks for your suggestions. I will consider them in the next version.

> 
> Jason
> 
> 



* Re: [PATCH v5 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
  2020-01-10 15:26   ` Jason Gunthorpe
  2020-01-11  9:49     ` Liuyixian (Eason)
@ 2020-01-13 11:26     ` Liuyixian (Eason)
  2020-01-13 14:04       ` Jason Gunthorpe
  1 sibling, 1 reply; 9+ messages in thread
From: Liuyixian (Eason) @ 2020-01-13 11:26 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, leon, linux-rdma, linuxarm



On 2020/1/10 23:26, Jason Gunthorpe wrote:
> On Sat, Dec 28, 2019 at 11:28:54AM +0800, Yixian Liu wrote:
>> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
>> +{
>> +	struct hns_roce_work *flush_work;
>> +
>> +	flush_work = kzalloc(sizeof(struct hns_roce_work), GFP_ATOMIC);
>> +	if (!flush_work)
>> +		return;
> 
> You changed it to only queue once, so why do we need the allocation
> now? That was the whole point..

Hi Jason,

The flush work is queued **not only once**. As the flag being_pushed is set to 0 during
the process of modifying qp like this:
	hns_roce_v2_modify_qp {
		...
		if (new_state == IB_QPS_ERR) {
			spin_lock_irqsave(&hr_qp->sq.lock, sq_flag);
			...
			hr_qp->state = IB_QPS_ERR;
			hr_qp->being_push = 0;
			...
		}
		...
	}
which means the new updated PI value needs to be updated with initializing a new flush work.
Thus, maybe there are two flush work in the workqueue. Thus, we still need the allocation here.

> 
> And the other patch shouldn't be manipulating being_pushed without
> some kind of locking

Agreed. It needs to hold the SQ and RQ spinlocks when updating the flag
in modify_qp; will fix in the next version.

> 
> Jason
> 
> 



* Re: [PATCH v5 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
  2020-01-13 11:26     ` Liuyixian (Eason)
@ 2020-01-13 14:04       ` Jason Gunthorpe
  2020-01-15  9:36         ` Liuyixian (Eason)
  2020-01-15  9:39         ` Liuyixian (Eason)
  0 siblings, 2 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2020-01-13 14:04 UTC (permalink / raw)
  To: Liuyixian (Eason); +Cc: dledford, leon, linux-rdma, linuxarm

On Mon, Jan 13, 2020 at 07:26:45PM +0800, Liuyixian (Eason) wrote:
> 
> 
> On 2020/1/10 23:26, Jason Gunthorpe wrote:
> > On Sat, Dec 28, 2019 at 11:28:54AM +0800, Yixian Liu wrote:
> >> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
> >> +{
> >> +	struct hns_roce_work *flush_work;
> >> +
> >> +	flush_work = kzalloc(sizeof(struct hns_roce_work), GFP_ATOMIC);
> >> +	if (!flush_work)
> >> +		return;
> > 
> > You changed it to only queue once, so why do we need the allocation
> > now? That was the whole point..
> 
> Hi Jason,
> 
> The flush work is queued **not only once**. As the flag being_pushed is set to 0 during
> the process of modifying qp like this:
> 	hns_roce_v2_modify_qp {
> 		...
> 		if (new_state == IB_QPS_ERR) {
> 			spin_lock_irqsave(&hr_qp->sq.lock, sq_flag);
> 			...
> 			hr_qp->state = IB_QPS_ERR;
> 			hr_qp->being_push = 0;
> 			...
> 		}
> 		...
> 	}
> which means the new updated PI value needs to be updated with initializing a new flush work.
> Thus, maybe there are two flush work in the workqueue. Thus, we still need the allocation here.

I don't see how you should get two? One should be pending until the
modify is done with the new PI, then once the PI is updated the same
one should be re-queued the next time the PI needs changing.

Jason
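The scheme described above (at most one pending flush work per QP,
re-queued only after the producer index it carried has been conveyed) can
be sketched as a single-threaded userspace analogue. All names are
hypothetical; this is an illustration of the pattern, not the driver's code:

```c
/*
 * "Queue at most once" sketch: flush_queued stays set while a work item
 * is pending, so repeated error-path hits do not queue duplicates. It is
 * cleared only after the worker has pushed the latest producer index
 * (PI), so a later PI change queues the (single) work again.
 */
struct qp {
	int pi;            /* current producer index */
	int pushed_pi;     /* PI last conveyed to the "hardware" */
	int flush_queued;  /* nonzero while a flush work is pending */
};

static int pending_works;  /* number of works sitting in the queue */

/* Error path: request a flush; queues only if none is already pending. */
static void request_flush(struct qp *qp)
{
	if (qp->flush_queued)
		return;
	qp->flush_queued = 1;
	pending_works++;
}

/* Worker: convey the current PI, then allow the work to be queued again. */
static void flush_worker(struct qp *qp)
{
	pending_works--;
	qp->pushed_pi = qp->pi;
	qp->flush_queued = 0;  /* cleared only after the PI was pushed */
}
```

With this ordering there is never more than one work outstanding, which is
why the work structure could be embedded in the QP instead of allocated
per request.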


* Re: [PATCH v5 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
  2020-01-13 14:04       ` Jason Gunthorpe
@ 2020-01-15  9:36         ` Liuyixian (Eason)
  2020-01-15  9:39         ` Liuyixian (Eason)
  1 sibling, 0 replies; 9+ messages in thread
From: Liuyixian (Eason) @ 2020-01-15  9:36 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, leon, linux-rdma, linuxarm



On 2020/1/13 22:04, Jason Gunthorpe wrote:
> On Mon, Jan 13, 2020 at 07:26:45PM +0800, Liuyixian (Eason) wrote:
>>
>>
>> On 2020/1/10 23:26, Jason Gunthorpe wrote:
>>> On Sat, Dec 28, 2019 at 11:28:54AM +0800, Yixian Liu wrote:
>>>> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
>>>> +{
>>>> +	struct hns_roce_work *flush_work;
>>>> +
>>>> +	flush_work = kzalloc(sizeof(struct hns_roce_work), GFP_ATOMIC);
>>>> +	if (!flush_work)
>>>> +		return;
>>>
>>> You changed it to only queue once, so why do we need the allocation
>>> now? That was the whole point..
>>
>> Hi Jason,
>>
>> The flush work is queued **not only once**. As the flag being_pushed is set to 0 during
>> the process of modifying qp like this:
>> 	hns_roce_v2_modify_qp {
>> 		...
>> 		if (new_state == IB_QPS_ERR) {
>> 			spin_lock_irqsave(&hr_qp->sq.lock, sq_flag);
>> 			...
>> 			hr_qp->state = IB_QPS_ERR;
>> 			hr_qp->being_push = 0;
>> 			...
>> 		}
>> 		...
>> 	}
>> which means the new updated PI value needs to be updated with initializing a new flush work.
>> Thus, maybe there are two flush work in the workqueue. Thus, we still need the allocation here.
> 
> I don't see how you should get two? One should be pending until the
> modify is done with the new PI, then once the PI is updated the same
> one should be re-queued the next time the PI needs changing.









* Re: [PATCH v5 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
  2020-01-13 14:04       ` Jason Gunthorpe
  2020-01-15  9:36         ` Liuyixian (Eason)
@ 2020-01-15  9:39         ` Liuyixian (Eason)
  1 sibling, 0 replies; 9+ messages in thread
From: Liuyixian (Eason) @ 2020-01-15  9:39 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, leon, linux-rdma, linuxarm



On 2020/1/13 22:04, Jason Gunthorpe wrote:
> On Mon, Jan 13, 2020 at 07:26:45PM +0800, Liuyixian (Eason) wrote:
>>
>>
>> On 2020/1/10 23:26, Jason Gunthorpe wrote:
>>> On Sat, Dec 28, 2019 at 11:28:54AM +0800, Yixian Liu wrote:
>>>> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
>>>> +{
>>>> +	struct hns_roce_work *flush_work;
>>>> +
>>>> +	flush_work = kzalloc(sizeof(struct hns_roce_work), GFP_ATOMIC);
>>>> +	if (!flush_work)
>>>> +		return;
>>>
>>> You changed it to only queue once, so why do we need the allocation
>>> now? That was the whole point..
>>
>> Hi Jason,
>>
>> The flush work is queued **not only once**. As the flag being_pushed is set to 0 during
>> the process of modifying qp like this:
>> 	hns_roce_v2_modify_qp {
>> 		...
>> 		if (new_state == IB_QPS_ERR) {
>> 			spin_lock_irqsave(&hr_qp->sq.lock, sq_flag);
>> 			...
>> 			hr_qp->state = IB_QPS_ERR;
>> 			hr_qp->being_push = 0;
>> 			...
>> 		}
>> 		...
>> 	}
>> which means the new updated PI value needs to be updated with initializing a new flush work.
>> Thus, maybe there are two flush work in the workqueue. Thus, we still need the allocation here.
> 
> I don't see how you should get two? One should be pending until the
> modify is done with the new PI, then once the PI is updated the same
> one should be re-queued the next time the PI needs changing.
> 
Hi Jason,

Thanks! I will fix it according to your suggestion in V7.




