From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752479AbdLSP5R (ORCPT ); Tue, 19 Dec 2017 10:57:17 -0500 Received: from mail-vk0-f53.google.com ([209.85.213.53]:46253 "EHLO mail-vk0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752078AbdLSP5D (ORCPT ); Tue, 19 Dec 2017 10:57:03 -0500 X-Google-Smtp-Source: ACJfBov1z0/5l+kuVLiW5CpKXuL90CoFUChMcStKP45XmKAzdvHPTS8OS88yLG0BLRWOKehrFuZOc07l75AbVI1uRgY= MIME-Version: 1.0 In-Reply-To: <20171212183031.46772-1-dianders@chromium.org> References: <20171212183031.46772-1-dianders@chromium.org> From: Doug Anderson Date: Tue, 19 Dec 2017 07:57:00 -0800 X-Google-Sender-Auth: UjkUrT81-aNN5QST7Rfi7HPJvPk Message-ID: Subject: Re: [PATCH v4] usb: dwc2: host: Don't retry NAKed transactions right away To: Felipe Balbi , John Youn Cc: Stefan Wahren , Alexandru M Stan , "open list:ARM/Rockchip SoC..." , Greg Kroah-Hartman , Johan Hovold , Eric Anholt , Matthias Kaehlcke , John Stultz , linux-rpi-kernel@lists.infradead.org, Julius Werner , Alyssa Rosenzweig , Douglas Anderson , linux-usb@vger.kernel.org, LKML Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Felipe, On Tue, Dec 12, 2017 at 10:30 AM, Douglas Anderson wrote: > On rk3288-veyron devices on Chrome OS it was found that plugging in an > Arduino-based USB device could cause the system to lockup, especially > if the CPU Frequency was at one of the slower operating points (like > 100 MHz / 200 MHz). > > Upon tracing, I found that the following was happening: > * The USB device (full speed) was connected to a high speed hub and > then to the rk3288. Thus, we were dealing with split transactions, > which is all handled in software on dwc2. > * Userspace was initiating a BULK IN transfer > * When we sent the SSPLIT (to start the split transaction), we got an > ACK. Good. Then we issued the CSPLIT. > * When we sent the CSPLIT, we got back a NAK. We immediately (from > the interrupt handler) started to retry and sent another SSPLIT. > * The device kept NAKing our CSPLIT, so we kept ping-ponging between > sending a SSPLIT and a CSPLIT, each time sending from the interrupt > handler. > * The handling of the interrupts was (because of the low CPU speed and > the inefficiency of the dwc2 interrupt handler) was actually taking > _longer_ than it took the other side to send the ACK/NAK. Thus we > were _always_ in the USB interrupt routine. > * The fact that USB interrupts were always going off was preventing > other things from happening in the system. This included preventing > the system from being able to transition to a higher CPU frequency. > > As I understand it, there is no requirement to retry super quickly > after a NAK, we just have to retry sometime in the future. Thus one > solution to the above is to just add a delay between getting a NAK and > retrying the transmission. If this delay is sufficiently long to get > out of the interrupt routine then the rest of the system will be able > to make forward progress. Even a 25 us delay would probably be > enough, but we'll be extra conservative and try to delay 1 ms (the > exact amount depends on HZ and the accuracy of the jiffy and how close > the current jiffy is to ticking, but could be as much as 20 ms or as > little as 1 ms). > > Presumably adding a delay like this could impact the USB throughput, > so we only add the delay with repeated NAKs. > > NOTE: Upon further testing of a pl2303 serial adapter, I found that > this fix may help with problems there. Specifically I found that the > pl2303 serial adapters tend to respond with a NAK when they have > nothing to say and thus we end with this same sequence. > > Signed-off-by: Douglas Anderson > Reviewed-by: Julius Werner > Tested-by: Stefan Wahren > Acked-by: John Youn > --- > > Changes in v4: > - Removed Cc for stable as per Felipe's request in v3 > - Rebased and squashed the two patches since Kees' timer stuff landed > - Add John Youn's Ack. > > Changes in v3: > - Add tested-by for Stefan Wahren > - Sent to Felipe Balbi as candiate to land this. > - Add Cc for stable (it's always been broken so go as far is as easy) > > Changes in v2: > - Address http://crosreview.com/737520 feedback > > drivers/usb/dwc2/core.h | 1 + > drivers/usb/dwc2/hcd.c | 7 ++++ > drivers/usb/dwc2/hcd.h | 9 +++++ > drivers/usb/dwc2/hcd_intr.c | 20 +++++++++++ > drivers/usb/dwc2/hcd_queue.c | 81 +++++++++++++++++++++++++++++++++++++++++--- > 5 files changed, 114 insertions(+), 4 deletions(-) I don't mean to be a pest, but I'm hoping that we can land this somewhere (if nothing else in your /next tree) just so it doesn't miss another release cycle. If you're not so keen on collecting dwc2 host patches these days, I can also see if Greg KH is willing to take this directly into his tree. Please let me know. Thanks for your time! :) -Doug From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Subject: [v4] usb: dwc2: host: Don't retry NAKed transactions right away From: Doug Anderson Message-Id: Date: Tue, 19 Dec 2017 07:57:00 -0800 To: Felipe Balbi , John Youn Cc: Stefan Wahren , Alexandru M Stan , "open list:ARM/Rockchip SoC..." , Greg Kroah-Hartman , Johan Hovold , Eric Anholt , Matthias Kaehlcke , John Stultz , linux-rpi-kernel@lists.infradead.org, Julius Werner , Alyssa Rosenzweig , Douglas Anderson , linux-usb@vger.kernel.org, LKML List-ID: RmVsaXBlLAoKT24gVHVlLCBEZWMgMTIsIDIwMTcgYXQgMTA6MzAgQU0sIERvdWdsYXMgQW5kZXJz b24KPGRpYW5kZXJzQGNocm9taXVtLm9yZz4gd3JvdGU6Cj4gT24gcmszMjg4LXZleXJvbiBkZXZp Y2VzIG9uIENocm9tZSBPUyBpdCB3YXMgZm91bmQgdGhhdCBwbHVnZ2luZyBpbiBhbgo+IEFyZHVp bm8tYmFzZWQgVVNCIGRldmljZSBjb3VsZCBjYXVzZSB0aGUgc3lzdGVtIHRvIGxvY2t1cCwgZXNw ZWNpYWxseQo+IGlmIHRoZSBDUFUgRnJlcXVlbmN5IHdhcyBhdCBvbmUgb2YgdGhlIHNsb3dlciBv cGVyYXRpbmcgcG9pbnRzIChsaWtlCj4gMTAwIE1IeiAvIDIwMCBNSHopLgo+Cj4gVXBvbiB0cmFj aW5nLCBJIGZvdW5kIHRoYXQgdGhlIGZvbGxvd2luZyB3YXMgaGFwcGVuaW5nOgo+ICogVGhlIFVT QiBkZXZpY2UgKGZ1bGwgc3BlZWQpIHdhcyBjb25uZWN0ZWQgdG8gYSBoaWdoIHNwZWVkIGh1YiBh bmQKPiAgIHRoZW4gdG8gdGhlIHJrMzI4OC4gIFRodXMsIHdlIHdlcmUgZGVhbGluZyB3aXRoIHNw bGl0IHRyYW5zYWN0aW9ucywKPiAgIHdoaWNoIGlzIGFsbCBoYW5kbGVkIGluIHNvZnR3YXJlIG9u IGR3YzIuCj4gKiBVc2Vyc3BhY2Ugd2FzIGluaXRpYXRpbmcgYSBCVUxLIElOIHRyYW5zZmVyCj4g KiBXaGVuIHdlIHNlbnQgdGhlIFNTUExJVCAodG8gc3RhcnQgdGhlIHNwbGl0IHRyYW5zYWN0aW9u KSwgd2UgZ290IGFuCj4gICBBQ0suICBHb29kLiAgVGhlbiB3ZSBpc3N1ZWQgdGhlIENTUExJVC4K PiAqIFdoZW4gd2Ugc2VudCB0aGUgQ1NQTElULCB3ZSBnb3QgYmFjayBhIE5BSy4gIFdlIGltbWVk aWF0ZWx5IChmcm9tCj4gICB0aGUgaW50ZXJydXB0IGhhbmRsZXIpIHN0YXJ0ZWQgdG8gcmV0cnkg YW5kIHNlbnQgYW5vdGhlciBTU1BMSVQuCj4gKiBUaGUgZGV2aWNlIGtlcHQgTkFLaW5nIG91ciBD U1BMSVQsIHNvIHdlIGtlcHQgcGluZy1wb25naW5nIGJldHdlZW4KPiAgIHNlbmRpbmcgYSBTU1BM SVQgYW5kIGEgQ1NQTElULCBlYWNoIHRpbWUgc2VuZGluZyBmcm9tIHRoZSBpbnRlcnJ1cHQKPiAg IGhhbmRsZXIuCj4gKiBUaGUgaGFuZGxpbmcgb2YgdGhlIGludGVycnVwdHMgd2FzIChiZWNhdXNl IG9mIHRoZSBsb3cgQ1BVIHNwZWVkIGFuZAo+ICAgdGhlIGluZWZmaWNpZW5jeSBvZiB0aGUgZHdj MiBpbnRlcnJ1cHQgaGFuZGxlcikgd2FzIGFjdHVhbGx5IHRha2luZwo+ICAgX2xvbmdlcl8gdGhh biBpdCB0b29rIHRoZSBvdGhlciBzaWRlIHRvIHNlbmQgdGhlIEFDSy9OQUsuICBUaHVzIHdlCj4g ICB3ZXJlIF9hbHdheXNfIGluIHRoZSBVU0IgaW50ZXJydXB0IHJvdXRpbmUuCj4gKiBUaGUgZmFj dCB0aGF0IFVTQiBpbnRlcnJ1cHRzIHdlcmUgYWx3YXlzIGdvaW5nIG9mZiB3YXMgcHJldmVudGlu Zwo+ICAgb3RoZXIgdGhpbmdzIGZyb20gaGFwcGVuaW5nIGluIHRoZSBzeXN0ZW0uICBUaGlzIGlu Y2x1ZGVkIHByZXZlbnRpbmcKPiAgIHRoZSBzeXN0ZW0gZnJvbSBiZWluZyBhYmxlIHRvIHRyYW5z aXRpb24gdG8gYSBoaWdoZXIgQ1BVIGZyZXF1ZW5jeS4KPgo+IEFzIEkgdW5kZXJzdGFuZCBpdCwg dGhlcmUgaXMgbm8gcmVxdWlyZW1lbnQgdG8gcmV0cnkgc3VwZXIgcXVpY2tseQo+IGFmdGVyIGEg TkFLLCB3ZSBqdXN0IGhhdmUgdG8gcmV0cnkgc29tZXRpbWUgaW4gdGhlIGZ1dHVyZS4gIFRodXMg b25lCj4gc29sdXRpb24gdG8gdGhlIGFib3ZlIGlzIHRvIGp1c3QgYWRkIGEgZGVsYXkgYmV0d2Vl biBnZXR0aW5nIGEgTkFLIGFuZAo+IHJldHJ5aW5nIHRoZSB0cmFuc21pc3Npb24uICBJZiB0aGlz IGRlbGF5IGlzIHN1ZmZpY2llbnRseSBsb25nIHRvIGdldAo+IG91dCBvZiB0aGUgaW50ZXJydXB0 IHJvdXRpbmUgdGhlbiB0aGUgcmVzdCBvZiB0aGUgc3lzdGVtIHdpbGwgYmUgYWJsZQo+IHRvIG1h a2UgZm9yd2FyZCBwcm9ncmVzcy4gIEV2ZW4gYSAyNSB1cyBkZWxheSB3b3VsZCBwcm9iYWJseSBi ZQo+IGVub3VnaCwgYnV0IHdlJ2xsIGJlIGV4dHJhIGNvbnNlcnZhdGl2ZSBhbmQgdHJ5IHRvIGRl bGF5IDEgbXMgKHRoZQo+IGV4YWN0IGFtb3VudCBkZXBlbmRzIG9uIEhaIGFuZCB0aGUgYWNjdXJh Y3kgb2YgdGhlIGppZmZ5IGFuZCBob3cgY2xvc2UKPiB0aGUgY3VycmVudCBqaWZmeSBpcyB0byB0 aWNraW5nLCBidXQgY291bGQgYmUgYXMgbXVjaCBhcyAyMCBtcyBvciBhcwo+IGxpdHRsZSBhcyAx IG1zKS4KPgo+IFByZXN1bWFibHkgYWRkaW5nIGEgZGVsYXkgbGlrZSB0aGlzIGNvdWxkIGltcGFj dCB0aGUgVVNCIHRocm91Z2hwdXQsCj4gc28gd2Ugb25seSBhZGQgdGhlIGRlbGF5IHdpdGggcmVw ZWF0ZWQgTkFLcy4KPgo+IE5PVEU6IFVwb24gZnVydGhlciB0ZXN0aW5nIG9mIGEgcGwyMzAzIHNl cmlhbCBhZGFwdGVyLCBJIGZvdW5kIHRoYXQKPiB0aGlzIGZpeCBtYXkgaGVscCB3aXRoIHByb2Js ZW1zIHRoZXJlLiAgU3BlY2lmaWNhbGx5IEkgZm91bmQgdGhhdCB0aGUKPiBwbDIzMDMgc2VyaWFs IGFkYXB0ZXJzIHRlbmQgdG8gcmVzcG9uZCB3aXRoIGEgTkFLIHdoZW4gdGhleSBoYXZlCj4gbm90 aGluZyB0byBzYXkgYW5kIHRodXMgd2UgZW5kIHdpdGggdGhpcyBzYW1lIHNlcXVlbmNlLgo+Cj4g U2lnbmVkLW9mZi1ieTogRG91Z2xhcyBBbmRlcnNvbiA8ZGlhbmRlcnNAY2hyb21pdW0ub3JnPgo+ IFJldmlld2VkLWJ5OiBKdWxpdXMgV2VybmVyIDxqd2VybmVyQGNocm9taXVtLm9yZz4KPiBUZXN0 ZWQtYnk6IFN0ZWZhbiBXYWhyZW4gPHN0ZWZhbi53YWhyZW5AaTJzZS5jb20+Cj4gQWNrZWQtYnk6 IEpvaG4gWW91biA8am9obnlvdW5Ac3lub3BzeXMuY29tPgo+IC0tLQo+Cj4gQ2hhbmdlcyBpbiB2 NDoKPiAtIFJlbW92ZWQgQ2MgZm9yIHN0YWJsZSBhcyBwZXIgRmVsaXBlJ3MgcmVxdWVzdCBpbiB2 Mwo+IC0gUmViYXNlZCBhbmQgc3F1YXNoZWQgdGhlIHR3byBwYXRjaGVzIHNpbmNlIEtlZXMnIHRp bWVyIHN0dWZmIGxhbmRlZAo+IC0gQWRkIEpvaG4gWW91bidzIEFjay4KPgo+IENoYW5nZXMgaW4g djM6Cj4gLSBBZGQgdGVzdGVkLWJ5IGZvciBTdGVmYW4gV2FocmVuCj4gLSBTZW50IHRvIEZlbGlw ZSBCYWxiaSBhcyBjYW5kaWF0ZSB0byBsYW5kIHRoaXMuCj4gLSBBZGQgQ2MgZm9yIHN0YWJsZSAo aXQncyBhbHdheXMgYmVlbiBicm9rZW4gc28gZ28gYXMgZmFyIGlzIGFzIGVhc3kpCj4KPiBDaGFu Z2VzIGluIHYyOgo+IC0gQWRkcmVzcyBodHRwOi8vY3Jvc3Jldmlldy5jb20vNzM3NTIwIGZlZWRi YWNrCj4KPiAgZHJpdmVycy91c2IvZHdjMi9jb3JlLmggICAgICB8ICAxICsKPiAgZHJpdmVycy91 c2IvZHdjMi9oY2QuYyAgICAgICB8ICA3ICsrKysKPiAgZHJpdmVycy91c2IvZHdjMi9oY2QuaCAg ICAgICB8ICA5ICsrKysrCj4gIGRyaXZlcnMvdXNiL2R3YzIvaGNkX2ludHIuYyAgfCAyMCArKysr KysrKysrKwo+ICBkcml2ZXJzL3VzYi9kd2MyL2hjZF9xdWV1ZS5jIHwgODEgKysrKysrKysrKysr KysrKysrKysrKysrKysrKysrKysrKysrKysrKystLS0KPiAgNSBmaWxlcyBjaGFuZ2VkLCAxMTQg aW5zZXJ0aW9ucygrKSwgNCBkZWxldGlvbnMoLSkKCkkgZG9uJ3QgbWVhbiB0byBiZSBhIHBlc3Qs IGJ1dCBJJ20gaG9waW5nIHRoYXQgd2UgY2FuIGxhbmQgdGhpcwpzb21ld2hlcmUgKGlmIG5vdGhp bmcgZWxzZSBpbiB5b3VyIC9uZXh0IHRyZWUpIGp1c3Qgc28gaXQgZG9lc24ndCBtaXNzCmFub3Ro ZXIgcmVsZWFzZSBjeWNsZS4gIElmIHlvdSdyZSBub3Qgc28ga2VlbiBvbiBjb2xsZWN0aW5nIGR3 YzIgaG9zdApwYXRjaGVzIHRoZXNlIGRheXMsIEkgY2FuIGFsc28gc2VlIGlmIEdyZWcgS0ggaXMg d2lsbGluZyB0byB0YWtlIHRoaXMKZGlyZWN0bHkgaW50byBoaXMgdHJlZS4gIFBsZWFzZSBsZXQg bWUga25vdy4gIFRoYW5rcyBmb3IgeW91ciB0aW1lIQo6KQoKLURvdWcKLS0tClRvIHVuc3Vic2Ny aWJlIGZyb20gdGhpcyBsaXN0OiBzZW5kIHRoZSBsaW5lICJ1bnN1YnNjcmliZSBsaW51eC11c2Ii IGluCnRoZSBib2R5IG9mIGEgbWVzc2FnZSB0byBtYWpvcmRvbW9Admdlci5rZXJuZWwub3JnCk1v cmUgbWFqb3Jkb21vIGluZm8gYXQgIGh0dHA6Ly92Z2VyLmtlcm5lbC5vcmcvbWFqb3Jkb21vLWlu Zm8uaHRtbAo= From mboxrd@z Thu Jan 1 00:00:00 1970 From: Doug Anderson Subject: Re: [PATCH v4] usb: dwc2: host: Don't retry NAKed transactions right away Date: Tue, 19 Dec 2017 07:57:00 -0800 Message-ID: References: <20171212183031.46772-1-dianders@chromium.org> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Return-path: In-Reply-To: <20171212183031.46772-1-dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> Sender: linux-usb-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Felipe Balbi , John Youn Cc: Stefan Wahren , Alexandru M Stan , "open list:ARM/Rockchip SoC..." , Greg Kroah-Hartman , Johan Hovold , Eric Anholt , Matthias Kaehlcke , John Stultz , linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org, Julius Werner , Alyssa Rosenzweig , Douglas Anderson , linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, LKML List-Id: linux-rockchip.vger.kernel.org Felipe, On Tue, Dec 12, 2017 at 10:30 AM, Douglas Anderson wrote: > On rk3288-veyron devices on Chrome OS it was found that plugging in an > Arduino-based USB device could cause the system to lockup, especially > if the CPU Frequency was at one of the slower operating points (like > 100 MHz / 200 MHz). > > Upon tracing, I found that the following was happening: > * The USB device (full speed) was connected to a high speed hub and > then to the rk3288. Thus, we were dealing with split transactions, > which is all handled in software on dwc2. > * Userspace was initiating a BULK IN transfer > * When we sent the SSPLIT (to start the split transaction), we got an > ACK. Good. Then we issued the CSPLIT. > * When we sent the CSPLIT, we got back a NAK. We immediately (from > the interrupt handler) started to retry and sent another SSPLIT. > * The device kept NAKing our CSPLIT, so we kept ping-ponging between > sending a SSPLIT and a CSPLIT, each time sending from the interrupt > handler. > * The handling of the interrupts was (because of the low CPU speed and > the inefficiency of the dwc2 interrupt handler) was actually taking > _longer_ than it took the other side to send the ACK/NAK. Thus we > were _always_ in the USB interrupt routine. > * The fact that USB interrupts were always going off was preventing > other things from happening in the system. This included preventing > the system from being able to transition to a higher CPU frequency. > > As I understand it, there is no requirement to retry super quickly > after a NAK, we just have to retry sometime in the future. Thus one > solution to the above is to just add a delay between getting a NAK and > retrying the transmission. If this delay is sufficiently long to get > out of the interrupt routine then the rest of the system will be able > to make forward progress. Even a 25 us delay would probably be > enough, but we'll be extra conservative and try to delay 1 ms (the > exact amount depends on HZ and the accuracy of the jiffy and how close > the current jiffy is to ticking, but could be as much as 20 ms or as > little as 1 ms). > > Presumably adding a delay like this could impact the USB throughput, > so we only add the delay with repeated NAKs. > > NOTE: Upon further testing of a pl2303 serial adapter, I found that > this fix may help with problems there. Specifically I found that the > pl2303 serial adapters tend to respond with a NAK when they have > nothing to say and thus we end with this same sequence. > > Signed-off-by: Douglas Anderson > Reviewed-by: Julius Werner > Tested-by: Stefan Wahren > Acked-by: John Youn > --- > > Changes in v4: > - Removed Cc for stable as per Felipe's request in v3 > - Rebased and squashed the two patches since Kees' timer stuff landed > - Add John Youn's Ack. > > Changes in v3: > - Add tested-by for Stefan Wahren > - Sent to Felipe Balbi as candiate to land this. > - Add Cc for stable (it's always been broken so go as far is as easy) > > Changes in v2: > - Address http://crosreview.com/737520 feedback > > drivers/usb/dwc2/core.h | 1 + > drivers/usb/dwc2/hcd.c | 7 ++++ > drivers/usb/dwc2/hcd.h | 9 +++++ > drivers/usb/dwc2/hcd_intr.c | 20 +++++++++++ > drivers/usb/dwc2/hcd_queue.c | 81 +++++++++++++++++++++++++++++++++++++++++--- > 5 files changed, 114 insertions(+), 4 deletions(-) I don't mean to be a pest, but I'm hoping that we can land this somewhere (if nothing else in your /next tree) just so it doesn't miss another release cycle. If you're not so keen on collecting dwc2 host patches these days, I can also see if Greg KH is willing to take this directly into his tree. Please let me know. Thanks for your time! :) -Doug -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html