Subject: Re: [PATCH] usb: dwc3: Do not process request if HWO is set for its TRB
To: Felipe Balbi, gregkh@linuxfoundation.org, linux-usb@vger.kernel.org
Cc: jackp@codeaurora.org, mgautam@codeaurora.org
References: <1574946055-3788-1-git-send-email-sallenki@codeaurora.org> <1575270714-29994-1-git-send-email-sallenki@codeaurora.org> <87tv6jch61.fsf@gmail.com> <0101016ec6294c21-99711286-dbda-4d62-b8c7-e9f28e99b261-000000@us-west-2.amazonses.com> <871rtla8xd.fsf@gmail.com>
From: Sriharsha Allenki
Message-ID: <0101016eee927967-22f4fa8c-a10a-41d8-9f74-0e6914ed3ee4-000000@us-west-2.amazonses.com>
Date: Tue, 10 Dec 2019 06:50:07 +0000
In-Reply-To: <871rtla8xd.fsf@gmail.com>
X-Mailing-List: linux-usb@vger.kernel.org

Hi Felipe,

On 12/3/2019 6:00 PM, Felipe Balbi wrote:
> Hi,
>
> Sriharsha Allenki writes:
>
>> This case occurs only after the first TRB of the chain is processed,
>> which we are checking in the patch.
>> Although, this piece of code has
>> been no-op after introducing the function
>> "dwc3_gadget_ep_reclaim_trb_sg". This function checks for the HWO and
>> does not call the "dwc3_gadget_ep_reclaim_completed_trb" if it is
>> set. Hence this condition most likely will never hit.
>
> You're missing one important detail: If we have e.g. 200 TRBs in a
> single SG-list and we receive a short packet on TRB 10, we will have 190
> TRBs with HWO bit left set and your patch prevents the driver from
> clearing that bit. Yes, you are regressing a very special case.

I am checking only the first TRB of the chain and not the TRB pointed to
by the current dequeue pointer.

>>> what problem you actually found? Preferably with tracepoint data
>>> showing the fault.
>> Test case here involves f_fs driver in AIO mode and we see ~8 TRBs in
>> the queue with HWO set and UPDATE_XFER done. In the failure case I see
>> that as part of processing the interrupt generated by the core for the
>> completion of the first TRB, the driver is going ahead and giving
>
> we shouldn't get completion interrupt for the first TRB, only the
> last. Care to share tracepoint data?

We have seen the issue only once and we do not have any tracepoint data
for it. But with the internal logging we have in our downstream code, I
see a race between the dequeue from the function driver and the giveback
done as part of the completion (XferInProgress).

A request (say Request-1) is dequeued before we could notify its
completion to the gadget driver. Because of this, while handling the
completion event for Request-1 we gave back the next request (Request-2)
in the queue, which is yet to be processed by the core, leading to the
mentioned SMMU fault.

Normally, the core should not process the TRBs once a request has been
dequeued, because of the stop_active_transfer done as part of dequeue.
But I see a timeout when issuing the end transfer command during dequeue,
because of which the core is still processing the TRBs in the queue.
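
A minimal user-space sketch of the check being discussed, for illustration
only. It is not the dwc3 driver code: the toy_* names, the flag value, and
the "mapped" field (standing in for the SMMU mapping) are all hypothetical.
It shows the idea of giving back queued requests in order and stopping at
the first request whose leading TRB still has HWO set, so that a buffer the
core may still access is never unmapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; these are NOT the dwc3 driver's structures. */
#define TOY_TRB_HWO (1u << 0)   /* set while the controller owns the TRB */

struct toy_trb {
    unsigned int ctrl;          /* control word carrying TOY_TRB_HWO */
};

struct toy_request {
    struct toy_trb *first_trb;  /* first TRB of the chain for this request */
    bool mapped;                /* stands in for the buffer's SMMU mapping */
};

/*
 * Give back queued requests in order, stopping at the first request whose
 * leading TRB is still owned by the hardware. Returns how many requests
 * were given back (and therefore unmapped).
 */
static size_t toy_giveback_completed(struct toy_request *queue, size_t count)
{
    size_t done = 0;

    assert(queue != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (queue[i].first_trb->ctrl & TOY_TRB_HWO)
            break;                  /* hardware not finished: keep mapping */
        queue[i].mapped = false;    /* safe to unmap only after completion */
        done++;
    }
    return done;
}
```

With one completed request followed by two whose first TRB still has HWO
set, only the first request is given back and the other two stay mapped.
The sketch looks only at the first TRB of each request, so it does not
model the short-packet case raised above, where later TRBs of a chain are
left with HWO set.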
Regards,
Sriharsha

>
>> back the requests of all the other queued TRBs, which involves removing
>> the SMMU mapping of the buffers associated with the requests. But
>> these are still active and when core processes these TRBs and their
>> corresponding un-mapped buffers, I see a translation fault raised by the
>> SMMU.
>>
>> I hope I have answered your queries, please let me know if I am still
>> missing something here.
>
> yes, tracepoint data showing the problem. Thank you
>