From: "Liuyixian (Eason)" <liuyixian@huawei.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: <dledford@redhat.com>, <leon@kernel.org>,
	<linux-rdma@vger.kernel.org>, <linuxarm@huawei.com>
Subject: Re: [PATCH v2 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler
Date: Mon, 18 Nov 2019 21:50:24 +0800
Message-ID: <523cf93d-a849-ab24-36f0-903fb1afe7ff@huawei.com>
In-Reply-To: <20191115210621.GE4055@ziepe.ca>



On 2019/11/16 5:06, Jason Gunthorpe wrote:
> On Tue, Nov 12, 2019 at 08:52:03PM +0800, Yixian Liu wrote:
>> HiP08 RoCE hardware lacks the ability (a known hardware problem) to
>> flush outstanding WQEs if the QP state gets into error mode for some
>> reason. To overcome this hardware problem, as a workaround, when the
>> QP is detected to be in the error state during various legs like post
>> send, post receive, etc. [1], the flush needs to be performed from
>> the driver.
>>
>> The earlier patch [1] sent to solve the hardware limitation explained
>> in the cover letter had a bug in the software flushing leg. It
>> acquired a mutex while modifying the QP state to error and while
>> conveying it to the hardware using the mailbox. This caused that leg
>> to sleep while holding a spin-lock, which caused a crash.
>>
>> Suggested solution:
>> We propose to defer the flushing of the QP in the error state using
>> the workqueue, to work around this limitation of our hardware.
>>
>> This patch adds the framework of the workqueue and the flush handler
>> function.
>>
>> [1] https://patchwork.kernel.org/patch/10534271/
>>
>> Signed-off-by: Yixian Liu <liuyixian@huawei.com>
>> Reviewed-by: Salil Mehta <salil.mehta@huawei.com>
>> ---
>>  drivers/infiniband/hw/hns/hns_roce_device.h |  3 +++
>>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  |  4 ++--
>>  drivers/infiniband/hw/hns/hns_roce_qp.c     | 33 +++++++++++++++++++++++++++++
>>  3 files changed, 38 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
>> index a1b712e..42d8a5a 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_device.h
>> +++ b/drivers/infiniband/hw/hns/hns_roce_device.h
>> @@ -906,6 +906,7 @@ struct hns_roce_caps {
>>  struct hns_roce_work {
>>  	struct hns_roce_dev *hr_dev;
>>  	struct work_struct work;
>> +	struct hns_roce_qp *hr_qp;
>>  	u32 qpn;
>>  	u32 cqn;
>>  	int event_type;
>> @@ -1034,6 +1035,7 @@ struct hns_roce_dev {
>>  	const struct hns_roce_hw *hw;
>>  	void			*priv;
>>  	struct workqueue_struct *irq_workq;
>> +	struct hns_roce_work flush_work;
>>  	const struct hns_roce_dfx_hw *dfx;
>>  };
>>  
>> @@ -1226,6 +1228,7 @@ struct ib_qp *hns_roce_create_qp(struct ib_pd *ib_pd,
>>  				 struct ib_udata *udata);
>>  int hns_roce_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
>>  		       int attr_mask, struct ib_udata *udata);
>> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp);
>>  void *get_recv_wqe(struct hns_roce_qp *hr_qp, int n);
>>  void *get_send_wqe(struct hns_roce_qp *hr_qp, int n);
>>  void *get_send_extend_sge(struct hns_roce_qp *hr_qp, int n);
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> index 907c951..ec48e7e 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
>> @@ -5967,8 +5967,8 @@ static int hns_roce_v2_init_eq_table(struct hns_roce_dev *hr_dev)
>>  		goto err_request_irq_fail;
>>  	}
>>  
>> -	hr_dev->irq_workq =
>> -		create_singlethread_workqueue("hns_roce_irq_workqueue");
>> +	hr_dev->irq_workq = alloc_workqueue("hns_roce_irq_workqueue",
>> +					    WQ_MEM_RECLAIM, 0);
>>  	if (!hr_dev->irq_workq) {
>>  		dev_err(dev, "Create irq workqueue failed!\n");
>>  		ret = -ENOMEM;
>> diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
>> index 9442f01..0111f2e 100644
>> --- a/drivers/infiniband/hw/hns/hns_roce_qp.c
>> +++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
>> @@ -43,6 +43,39 @@
>>  
>>  #define SQP_NUM				(2 * HNS_ROCE_MAX_PORTS)
>>  
>> +static void flush_work_handle(struct work_struct *work)
>> +{
>> +	struct hns_roce_work *flush_work = container_of(work,
>> +					struct hns_roce_work, work);
>> +	struct hns_roce_qp *hr_qp = flush_work->hr_qp;
>> +	struct device *dev = flush_work->hr_dev->dev;
>> +	struct ib_qp_attr attr;
>> +	int attr_mask;
>> +	int ret;
>> +
>> +	attr_mask = IB_QP_STATE;
>> +	attr.qp_state = IB_QPS_ERR;
>> +
>> +	ret = hns_roce_modify_qp(&hr_qp->ibqp, &attr, attr_mask, NULL);
>> +	if (ret)
>> +		dev_err(dev, "Modify QP to error state failed(%d) during CQE flush\n",
>> +			ret);
>> +
>> +	if (atomic_dec_and_test(&hr_qp->refcount))
>> +		complete(&hr_qp->free);
>> +}
>> +
>> +void init_flush_work(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp)
>> +{
>> +	struct hns_roce_work *flush_work = &hr_dev->flush_work;
>> +
>> +	flush_work->hr_dev = hr_dev;
>> +	flush_work->hr_qp = hr_qp;
>> +	INIT_WORK(&flush_work->work, flush_work_handle);
>> +	atomic_inc(&hr_qp->refcount);
>> +	queue_work(hr_dev->irq_workq, &flush_work->work);
> 
> It kind of looks like this can be called multiple times? It won't work
> right unless it is called exactly once
> 
> Jason

Yes, you are right. As there is only one flush_work embedded in hr_dev,
a second call would re-initialize and re-queue the work while it may
still be pending, overwriting hr_qp and leaking the extra QP reference.

So I think the reasonable solution is to allocate it dynamically. The
chance that the allocation fails is very small, and if that happens,
the application needs to be terminated anyway.

So I will fall back to v1 for this part in the next version:

	flush_work = kzalloc(sizeof(struct hns_roce_flush_work), GFP_ATOMIC);
	if (!flush_work)
		return;
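
To be concrete, the dynamically allocated version would look roughly
like below. This is only a sketch: it reuses struct hns_roce_work from
this v2 patch rather than v1's hns_roce_flush_work, and it assumes
flush_work_handle() ends with kfree(flush_work) after dropping the QP
reference:

	void init_flush_work(struct hns_roce_dev *hr_dev,
			     struct hns_roce_qp *hr_qp)
	{
		struct hns_roce_work *flush_work;

		/* May be called from post send/post recv with a
		 * spin-lock held, so the allocation must not sleep.
		 */
		flush_work = kzalloc(sizeof(*flush_work), GFP_ATOMIC);
		if (!flush_work)
			return;

		flush_work->hr_dev = hr_dev;
		flush_work->hr_qp = hr_qp;
		INIT_WORK(&flush_work->work, flush_work_handle);

		/* Hold a QP reference until the flush has run; it is
		 * dropped in flush_work_handle().
		 */
		atomic_inc(&hr_qp->refcount);
		queue_work(hr_dev->irq_workq, &flush_work->work);
	}

Since each call queues its own work item, calling it more than once is
no longer a problem.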

Or, could you give me some advice for it?

Thanks.


Thread overview: 11+ messages
2019-11-12 12:52 [PATCH v2 for-next 0/2] Fix crash due to sleepy mutex while holding lock in post_{send|recv|poll} Yixian Liu
2019-11-12 12:52 ` [PATCH v2 for-next 1/2] RDMA/hns: Add the workqueue framework for flush cqe handler Yixian Liu
2019-11-15 21:06   ` Jason Gunthorpe
2019-11-18 13:50     ` Liuyixian (Eason) [this message]
2019-11-18 17:02       ` Jason Gunthorpe
2019-11-19  8:00         ` Liuyixian (Eason)
2019-11-19  9:43           ` Zengtao (B)
2019-11-19 13:09             ` Liuyixian (Eason)
2019-11-19 18:46           ` Jason Gunthorpe
2019-11-20 11:00             ` Liuyixian (Eason)
2019-11-12 12:52 ` [PATCH v2 for-next 2/2] RDMA/hns: Delayed flush cqe process with workqueue Yixian Liu
