From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE712C433DF for ; Thu, 20 Aug 2020 20:59:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 73D8220885 for ; Thu, 20 Aug 2020 20:59:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=candelatech.com header.i=@candelatech.com header.b="iuFiHK2K" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728603AbgHTU7P (ORCPT ); Thu, 20 Aug 2020 16:59:15 -0400 Received: from mail2.candelatech.com ([208.74.158.173]:49104 "EHLO mail3.candelatech.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728255AbgHTU7O (ORCPT ); Thu, 20 Aug 2020 16:59:14 -0400 Received: from [192.168.254.6] (unknown [50.34.202.127]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail3.candelatech.com (Postfix) with ESMTPSA id B0B1E13C2B3; Thu, 20 Aug 2020 13:59:13 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.11.0 mail3.candelatech.com B0B1E13C2B3 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=candelatech.com; s=default; t=1597957154; bh=KN5sWp7z6Bx6MEzVFq3mOAy+K018hbY4NXh3mQr5hSw=; h=Subject:To:Cc:References:From:Date:In-Reply-To:From; b=iuFiHK2KU5WCMsMZlfFUv/SlvJa24Efj8xMIIb4wusPPrcqoQJS8jEXbU6g3j0Co1 rFf5fqKQauqESLZGmFXL/y1MCqA4P616wvy3Ryu0qwxV4QSTZfpoGuS8b1VdclBdEq NPcewI3/7RbIyWTape+si48yk3rjCd+RuEs8pYMI= Subject: Re: [RFC] ath10k: change to do napi_enable and napi_disable when insmod and rmmod for sdio To: Krishna Chaitanya Cc: Wen Gong , Kalle Valo , linux-wireless , ath10k References: <20200214035555.24762-1-wgong@codeaurora.org> <878se9iup3.fsf@codeaurora.org> <47c1b1e0-afc4-b9b4-5fc0-4636d8b3b981@candelatech.com> From: Ben Greear Organization: Candela Technologies Message-ID: Date: Thu, 20 Aug 2020 13:59:13 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-MW Content-Transfer-Encoding: 7bit Sender: linux-wireless-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-wireless@vger.kernel.org On 8/20/20 1:15 PM, Krishna Chaitanya wrote: > On Thu, Aug 20, 2020 at 11:23 PM Ben Greear wrote: >> >> On 8/20/20 10:42 AM, Krishna Chaitanya wrote: >>> On Thu, Aug 20, 2020 at 11:11 PM Krishna Chaitanya >>> wrote: >>>> >>>> On Thu, Aug 20, 2020 at 10:38 PM Ben Greear wrote: >>>>> >>>>> On 8/20/20 10:00 AM, Krishna Chaitanya wrote: >>>>>> On Thu, Aug 20, 2020 at 10:02 PM Ben Greear wrote: >>>>>>> >>>>>>> On 8/20/20 9:08 AM, Krishna Chaitanya wrote: >>>>>>>> On Thu, Aug 20, 2020 at 8:07 PM Wen Gong wrote: >>>>>>>>> >>>>>>>>> On 2020-08-20 18:52, Krishna Chaitanya wrote: >>>>>>>>>> On Thu, Aug 20, 2020 at 3:45 PM Wen Gong wrote: >>>>>>>>>>> >>>>>>>>>>> On 2020-08-20 17:19, Krishna Chaitanya wrote: >>>>>>>>> ... >>>>>>>>>>>>> I'm not really convinced that this is the right fix, but I'm no NAPI >>>>>>>>>>>>> expert. Can anyone else help? >>>>>>>>>>>> Calling napi_disable() twice can lead to hangs, but moving NAPI from >>>>>>>>>>>> start/stop to >>>>>>>>>>>> the probe isn't the right approach as the datapath is tied to >>>>>>>>>>>> start/stop. >>>>>>>>>>>> >>>>>>>>>>>> Maybe check the state of NAPI before disable? >>>>>>>>>>>> >>>>>>>>>>>> if (test_bit(NAPI_STATE_SCHED, &ar->napi.napi.state)) >>>>>>>>>>>> napi_disable(&ar->napi) >>>>>>>>>>>> or maintain napi_state like this >>>>>>>>>>>> https://patchwork.kernel.org/patch/10249365/ >>>>>>>>>>> it is better to use above link's patch. >>>>>>>>>>> napi.state is controlled by napi API, it is better ath10k not know it. >>>>>>>>>> Sure, but IMHO just canceling the async rx work should solve the issue. >>>>>>>>> Oh no, canceling the async rx work will not solve this issue, rx worker >>>>>>>>> ath10k_rx_indication_async_work call napi_schedule, after napi_complete, >>>>>>>>> the NAPI_STATE_SCHED will clear. >>>>>>>>> The issue of this patch is because 2 thread called to hif_stop and >>>>>>>>> NAPI_STATE_SCHED not clear. >>>>>>>> That fix is still valid and good to have. >>>>>>>> >>>>>>>> ndev_stop being called twice is typical scenarios (stop vs rmmod), so >>>>>>>> just checking the netdev_flags for IFF_UP and returning from hif_Stop >>>>>>>> should suffice, no? >>>>>>> >>>>>>> My approach to fix this problem was to add a boolean in ath10k as to whether >>>>>>> it had napi enabled or not, and then check that before trying to enable/disable >>>>>>> it again. Seems to work fine, and cleaner in my mind than checking internal >>>>>>> napi flags. >>>>>> A much simpler approach is just to check for IFF_UP and skip NAPI (and others) >>>>>> in the hif_stop no? (provided proper RTNL locking is done if hif_stop >>>>>> is being called >>>>>> internally as well). >>>>>> >>>>> >>>>> I'm not sure, but I think the driver should be internally consistent and not >>>>> spend a lot of time trying to guess about interactions with objects higher >>>>> in the stack. >>>> Fair enough, the network interface state is a basic thing controlled >>>> by the driver, >>>> so, should be okay to use. Anyways, the in-driver approach has more control. >>>>> >>>>> Here is my original patch to fix this, it is not complex. >>>>> >>>>> https://patchwork.kernel.org/patch/10249363/ >>>> Sure, I have shared your patch above :). >>> Sent a bit early, any idea why this wasn't upstreamed earlier? >> >> No, one comment from Michal indicated maybe there were more problems lurking >> in this area, but he seemed to be OK with the patch over all. After that, >> it was just ignored. >> > Now might be a good time to push for it :) > It is generally a waste of time in my experience. Kalle is the maintainer and should be seeing any of this he cares to see. If he likes the patch, he can apply it or something similar. If you have a reproducible test case, see if the patch fixes things, that might help it be accepted. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From mboxrd@z Thu Jan 1 00:00:00 1970 Return-path: Received: from mail2.candelatech.com ([208.74.158.173] helo=mail3.candelatech.com) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1k8reV-00032D-4u for ath10k@lists.infradead.org; Thu, 20 Aug 2020 20:59:16 +0000 Subject: Re: [RFC] ath10k: change to do napi_enable and napi_disable when insmod and rmmod for sdio References: <20200214035555.24762-1-wgong@codeaurora.org> <878se9iup3.fsf@codeaurora.org> <47c1b1e0-afc4-b9b4-5fc0-4636d8b3b981@candelatech.com> From: Ben Greear Message-ID: Date: Thu, 20 Aug 2020 13:59:13 -0700 MIME-Version: 1.0 In-Reply-To: Content-Language: en-MW List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Sender: "ath10k" Errors-To: ath10k-bounces+kvalo=adurom.com@lists.infradead.org To: Krishna Chaitanya Cc: linux-wireless , ath10k , Kalle Valo , Wen Gong On 8/20/20 1:15 PM, Krishna Chaitanya wrote: > On Thu, Aug 20, 2020 at 11:23 PM Ben Greear wrote: >> >> On 8/20/20 10:42 AM, Krishna Chaitanya wrote: >>> On Thu, Aug 20, 2020 at 11:11 PM Krishna Chaitanya >>> wrote: >>>> >>>> On Thu, Aug 20, 2020 at 10:38 PM Ben Greear wrote: >>>>> >>>>> On 8/20/20 10:00 AM, Krishna Chaitanya wrote: >>>>>> On Thu, Aug 20, 2020 at 10:02 PM Ben Greear wrote: >>>>>>> >>>>>>> On 8/20/20 9:08 AM, Krishna Chaitanya wrote: >>>>>>>> On Thu, Aug 20, 2020 at 8:07 PM Wen Gong wrote: >>>>>>>>> >>>>>>>>> On 2020-08-20 18:52, Krishna Chaitanya wrote: >>>>>>>>>> On Thu, Aug 20, 2020 at 3:45 PM Wen Gong wrote: >>>>>>>>>>> >>>>>>>>>>> On 2020-08-20 17:19, Krishna Chaitanya wrote: >>>>>>>>> ... >>>>>>>>>>>>> I'm not really convinced that this is the right fix, but I'm no NAPI >>>>>>>>>>>>> expert. Can anyone else help? >>>>>>>>>>>> Calling napi_disable() twice can lead to hangs, but moving NAPI from >>>>>>>>>>>> start/stop to >>>>>>>>>>>> the probe isn't the right approach as the datapath is tied to >>>>>>>>>>>> start/stop. >>>>>>>>>>>> >>>>>>>>>>>> Maybe check the state of NAPI before disable? >>>>>>>>>>>> >>>>>>>>>>>> if (test_bit(NAPI_STATE_SCHED, &ar->napi.napi.state)) >>>>>>>>>>>> napi_disable(&ar->napi) >>>>>>>>>>>> or maintain napi_state like this >>>>>>>>>>>> https://patchwork.kernel.org/patch/10249365/ >>>>>>>>>>> it is better to use above link's patch. >>>>>>>>>>> napi.state is controlled by napi API, it is better ath10k not know it. >>>>>>>>>> Sure, but IMHO just canceling the async rx work should solve the issue. >>>>>>>>> Oh no, canceling the async rx work will not solve this issue, rx worker >>>>>>>>> ath10k_rx_indication_async_work call napi_schedule, after napi_complete, >>>>>>>>> the NAPI_STATE_SCHED will clear. >>>>>>>>> The issue of this patch is because 2 thread called to hif_stop and >>>>>>>>> NAPI_STATE_SCHED not clear. >>>>>>>> That fix is still valid and good to have. >>>>>>>> >>>>>>>> ndev_stop being called twice is typical scenarios (stop vs rmmod), so >>>>>>>> just checking the netdev_flags for IFF_UP and returning from hif_Stop >>>>>>>> should suffice, no? >>>>>>> >>>>>>> My approach to fix this problem was to add a boolean in ath10k as to whether >>>>>>> it had napi enabled or not, and then check that before trying to enable/disable >>>>>>> it again. Seems to work fine, and cleaner in my mind than checking internal >>>>>>> napi flags. >>>>>> A much simpler approach is just to check for IFF_UP and skip NAPI (and others) >>>>>> in the hif_stop no? (provided proper RTNL locking is done if hif_stop >>>>>> is being called >>>>>> internally as well). >>>>>> >>>>> >>>>> I'm not sure, but I think the driver should be internally consistent and not >>>>> spend a lot of time trying to guess about interactions with objects higher >>>>> in the stack. >>>> Fair enough, the network interface state is a basic thing controlled >>>> by the driver, >>>> so, should be okay to use. Anyways, the in-driver approach has more control. >>>>> >>>>> Here is my original patch to fix this, it is not complex. >>>>> >>>>> https://patchwork.kernel.org/patch/10249363/ >>>> Sure, I have shared your patch above :). >>> Sent a bit early, any idea why this wasn't upstreamed earlier? >> >> No, one comment from Michal indicated maybe there were more problems lurking >> in this area, but he seemed to be OK with the patch over all. After that, >> it was just ignored. >> > Now might be a good time to push for it :) > It is generally a waste of time in my experience. Kalle is the maintainer and should be seeing any of this he cares to see. If he likes the patch, he can apply it or something similar. If you have a reproducible test case, see if the patch fixes things, that might help it be accepted. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com _______________________________________________ ath10k mailing list ath10k@lists.infradead.org http://lists.infradead.org/mailman/listinfo/ath10k