Subject: Re: [PATCH for-next] RDMA/hns: Bugfix for flush cqe in case softirq and multi-process
From: "Liuyixian (Eason)"
To: Leon Romanovsky
Date: Wed, 6 Nov 2019 10:16:57 +0800
Message-ID: <629de5be-644b-d68a-84e4-52eaad4d8514@huawei.com>
In-Reply-To: <20191105143724.GD6763@unreal>
X-Mailing-List: linux-rdma@vger.kernel.org

On 2019/11/5 22:37, Leon Romanovsky wrote:
> On Tue, Nov 05, 2019 at 10:06:20AM +0800, Liuyixian (Eason) wrote:
>>
>>
>> On 2019/10/28 17:34, Liuyixian (Eason) wrote:
>>>
>>>
>>> On 2019/10/15 16:00, Leon Romanovsky wrote:
>>>> On Sat, Oct 12, 2019 at 11:53:36AM +0800, Liuyixian (Eason) wrote:
>>>>>
>>>>>
>>>>> On 2019/9/24 11:54, Liuyixian (Eason) wrote:
>>>>>>
>>>>>>
>>>>>> On 2019/9/23 13:01, Leon Romanovsky wrote:
>>>>>>> On Fri, Sep 20, 2019 at 11:55:56AM +0800, Liuyixian (Eason) wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/9/11 21:17, Liuyixian (Eason) wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2019/9/10 15:52, Leon Romanovsky wrote:
>>>>>>>>>> On Tue, Sep 10, 2019 at 02:40:20PM +0800, Liuyixian (Eason) wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2019/9/8 16:03, Leon Romanovsky wrote:
>>>>>>>>>>>> On Thu, Sep 05, 2019 at 08:31:11PM +0800, Weihang Li wrote:
>>>>>>>>>>>>> From: Yixian Liu
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hip08 has the feature flush cqe, which helps to flush wqes in the workqueue
>>>>>>>>>>>>> (sq and rq) when an error happens by transmitting the producer index with
>>>>>>>>>>>>> mailbox to hardware. Flush cqe is implemented in post send and recv
>>>>>>>>>>>>> verbs. However, under NVMe cases, these verbs will be called under
>>>>>>>>>>>>> softirq context, and it will lead to the following calltrace with the
>>>>>>>>>>>>> current driver, as the mailbox used by flush cqe can go to sleep.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch solves this problem by using workqueue to do flush cqe,
>>>>>>>>>>>>
>>>>>>>>>>>> Unbelievable, almost every bug in this driver is solved by introducing
>>>>>>>>>>>> workqueue. You should fix "sleep in flush path" issue and not by adding
>>>>>>>>>>>> new workqueue.
>>>>>>>>>>>>
>>>>>>>>>>> Hi Leon,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the comment.
>>>>>>>>>>> Up to now, for hip08, only one place uses a workqueue in hns_roce_hw_v2.c,
>>>>>>>>>>> which is for irq prints.
>>>>>>>>>>
>>>>>>>>>> Thanks to our lack of desire to add more workqueues and previous patches
>>>>>>>>>> which removed extra workqueues from the driver.
>>>>>>>>>>
>>>>>>>>> Thanks, I see.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The solution for flush cqe in this patch is as follows:
>>>>>>>>>>> When flush cqe is needed, the driver should modify the qp to error state
>>>>>>>>>>> through the mailbox with the newest producer index of sq and rq; the hardware
>>>>>>>>>>> can then flush all outstanding wqes in sq and rq.
>>>>>>>>>>>
>>>>>>>>>>> That's the whole mechanism of flush cqe, which is also the flush path. We can
>>>>>>>>>>> change neither the mailbox's sleep attribute nor the fact that flush cqe occurs
>>>>>>>>>>> in post send/recv. To avoid the calltrace of flush cqe in post verbs under NVMe
>>>>>>>>>>> softirq, using a workqueue for flush cqe seems reasonable.
>>>>>>>>>>>
>>>>>>>>>>> As far as I know, there is no other alternative solution for this situation.
>>>>>>>>>>> I would be very grateful if you could give me more information.
>>>>>>>>>>
>>>>>>>>>> ib_drain_rq/ib_drain_sq/ib_drain_qp????
>>>>>>>>>>
>>>>>>>>> Hi Leon,
>>>>>>>>>
>>>>>>>>> I think these interfaces are designed for applications to check that all wqes
>>>>>>>>> have been processed by hardware, so-called drain or flush. However, it is not
>>>>>>>>> the same as the flush in this patch. The solution in this patch is used
>>>>>>>>> to help the hardware generate flush cqes for outstanding wqes when the qp is
>>>>>>>>> in error.
>>>>>>>>>
>>>>>>>> Hi Leon,
>>>>>>>>
>>>>>>>> What's your opinion about the above? Do you have any further comments?
>>>>>>>
>>>>>>> My opinion didn't change, you need to read discussions about ib_drain_*()
>>>>>>> functions, how and why they were introduced. It is a way to go.
>>>>>>>
>>>>>>> Thanks
>>>>>>
>>>>>> Hi Leon,
>>>>>>
>>>>>> Thanks a lot! I will dig into those functions for my problem.
>>>>>>
>>>>>
>>>>> Hi Leon,
>>>>>
>>>>> I have analyzed the mechanism of ib_drain_(qp, sq, rq); it is okay to use
>>>>> it instead of our flush cqe, as both of them call modify qp to error
>>>>> state in the flush path.
>>>>>
>>>>> However, both ib_drain_* and flush cqe will face the same problem as described
>>>>> in previous emails; that is, in the NVMe case, post verbs will be called under
>>>>> **softirq**, which will result in a calltrace, as the mailbox used in modify qp
>>>>> (flush path) can sleep, which is not allowed under softirq.
>>>>>
>>>>> Thus, to resolve the above calltrace (sleep in softirq), using a workqueue as in
>>>>> this patch seems a reasonable solution regardless of whether ib_drain_qp or
>>>>> flush cqe is called in the workqueue.
>>>>>
>>>>> I think it is not a good idea to fix the sleep in the flush path (which actually
>>>>> refers to the mailbox used in modify qp), as the mailbox is such a mature mechanism.
>>>>
>>>> No, it is not a reasonable solution.
>>>>
>>>
>>> Hi Leon,
>>>
>>> I have explained this issue better in another patch set and pruned other logic.
>>> Thanks a lot for your review!
>>>
>>> Best regards.
>>> Eason
>>>
>>
>> Hi Doug and Leon,
>>
>> I just want to make sure that you know the above-mentioned patch set is on:
>> https://patchwork.kernel.org/project/linux-rdma/list/?series=194423
>>
>> Sorry for replying to your last comment so late; I analyzed all possible solutions
>> in light of your comment and found that I hadn't described our problem clearly and
>> accurately enough. Thus, I made this new patch set with simple logic and a detailed
>> commit message. I hope I have clearly explained this problem.
>
> Hi,
>
> I'm confident that Doug and/or Jason will review it very soon.
>
> Thanks

Thanks a lot!

>
>>
>> Thanks.
>>
>>
>>
>>
>
> .
>