All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
@ 2016-10-03 14:18 ville.syrjala
  2016-10-03 16:08 ` Felipe Balbi
  0 siblings, 1 reply; 16+ messages in thread
From: ville.syrjala @ 2016-10-03 14:18 UTC (permalink / raw)
  To: linux-usb; +Cc: Felipe Balbi, stable, Ville Syrjälä

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.

commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
is very sluggish and can even stall entirely. Revert cures it.

Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/usb/dwc3/gadget.c | 20 ++++++--------------
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 122e64df2f4d..7362ff009a3a 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -894,8 +894,7 @@ static u32 dwc3_calc_trbs_left(struct dwc3_ep *dep)
 }
 
 static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep,
-		struct dwc3_request *req, unsigned int trbs_left,
-		unsigned int more_coming)
+		struct dwc3_request *req, unsigned int trbs_left)
 {
 	struct usb_request *request = &req->request;
 	struct scatterlist *sg = request->sg;
@@ -912,8 +911,7 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep,
 		dma = sg_dma_address(s);
 
 		if (sg_is_last(s)) {
-			if (usb_endpoint_xfer_int(dep->endpoint.desc) ||
-				!more_coming)
+			if (list_is_last(&req->list, &dep->pending_list))
 				last = true;
 
 			chain = false;
@@ -934,8 +932,7 @@ static void dwc3_prepare_one_trb_sg(struct dwc3_ep *dep,
 }
 
 static void dwc3_prepare_one_trb_linear(struct dwc3_ep *dep,
-		struct dwc3_request *req, unsigned int trbs_left,
-		unsigned int more_coming)
+		struct dwc3_request *req, unsigned int trbs_left)
 {
 	unsigned int	last = false;
 	unsigned int	length;
@@ -948,7 +945,7 @@ static void dwc3_prepare_one_trb_linear(struct dwc3_ep *dep,
 		last = true;
 
 	/* Is this the last request? */
-	if (usb_endpoint_xfer_int(dep->endpoint.desc) || !more_coming)
+	if (list_is_last(&req->list, &dep->pending_list))
 		last = true;
 
 	dwc3_prepare_one_trb(dep, req, dma, length,
@@ -966,7 +963,6 @@ static void dwc3_prepare_one_trb_linear(struct dwc3_ep *dep,
 static void dwc3_prepare_trbs(struct dwc3_ep *dep)
 {
 	struct dwc3_request	*req, *n;
-	unsigned int		more_coming;
 	u32			trbs_left;
 
 	BUILD_BUG_ON_NOT_POWER_OF_2(DWC3_TRB_NUM);
@@ -975,15 +971,11 @@ static void dwc3_prepare_trbs(struct dwc3_ep *dep)
 	if (!trbs_left)
 		return;
 
-	more_coming = dep->allocated_requests - dep->queued_requests;
-
 	list_for_each_entry_safe(req, n, &dep->pending_list, list) {
 		if (req->request.num_mapped_sgs > 0)
-			dwc3_prepare_one_trb_sg(dep, req, trbs_left--,
-					more_coming);
+			dwc3_prepare_one_trb_sg(dep, req, trbs_left--);
 		else
-			dwc3_prepare_one_trb_linear(dep, req, trbs_left--,
-					more_coming);
+			dwc3_prepare_one_trb_linear(dep, req, trbs_left--);
 
 		if (!trbs_left)
 			return;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-03 14:18 [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit" ville.syrjala
@ 2016-10-03 16:08 ` Felipe Balbi
  2016-10-04  7:43   ` Felipe Balbi
  2016-10-04  8:38   ` Ville Syrjälä
  0 siblings, 2 replies; 16+ messages in thread
From: Felipe Balbi @ 2016-10-03 16:08 UTC (permalink / raw)
  To: ville.syrjala, linux-usb; +Cc: stable, Ville Syrjälä


Hi,

ville.syrjala@linux.intel.com writes:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
>
> commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
> LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
> is very sluggish and can even stall entirely. Revert cures it.
>
> Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Reverting that commit is not the best idea. Let's start with the usual:

- Kernel version
- dmesg on both sides (host and device)
- dwc3 tracepoints:

# mkdir -p /t
# mount -t tracefs none /t
# echo 8192 > /t/buffer_size_kb
# echo 1 > /t/events/dwc3/enable
# echo 0 > /t/events/dwc3/dwc3_readl/enable
# echo 0 > /t/events/dwc3/dwc3_writel/enable

This should be enough to tell me what's really going on.

-- 
balbi

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-03 16:08 ` Felipe Balbi
@ 2016-10-04  7:43   ` Felipe Balbi
  2016-10-04  8:38   ` Ville Syrjälä
  1 sibling, 0 replies; 16+ messages in thread
From: Felipe Balbi @ 2016-10-04  7:43 UTC (permalink / raw)
  To: ville.syrjala, linux-usb; +Cc: stable, Ville Syrjälä

[-- Attachment #1: Type: text/plain, Size: 1792 bytes --]


Hi again,

Felipe Balbi <felipe.balbi@linux.intel.com> writes:
> ville.syrjala@linux.intel.com writes:
>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>
>> This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
>>
>> commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
>> LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
>> is very sluggish and can even stall entirely. Revert cures it.
>>
>> Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Reverting that commit is not the best idea. Let's start with the usual:
>
> - Kernel version
> - dmesg on both sides (host and device)
> - dwc3 tracepoints:
>
> # mkdir -p /t
> # mount -t tracefs none /t
> # echo 8192 > /t/buffer_size_kb
> # echo 1 > /t/events/dwc3/enable
> # echo 0 > /t/events/dwc3/dwc3_readl/enable
> # echo 0 > /t/events/dwc3/dwc3_writel/enable
>
> This should be enough to tell me what's really going on.

I have been trying this with SKL running v4.8 vanilla and a laptop
running v4.7.1 from Debian and I can't reproduce it.

The only time when I sort of reproduced was when Network Manager on the
laptop side decided to "reconfigure" the interface after I had ifconfig
foobar 192.168.0.2, so the interface ended up loosing the IP.

I have since configured that interface for static IP and it has been
running for a while without any problems. FYI I sshed from laptop to SKL
and have 'while true; do dmesg; done' running and in another terminal
I'm pinging SKL from laptop.

I'll keep it running for a while longer. Drop by once you're in the
office and I can check what's going on with your device.

cheers

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-03 16:08 ` Felipe Balbi
  2016-10-04  7:43   ` Felipe Balbi
@ 2016-10-04  8:38   ` Ville Syrjälä
  2016-10-04 12:23     ` Felipe Balbi
  1 sibling, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2016-10-04  8:38 UTC (permalink / raw)
  To: Felipe Balbi; +Cc: linux-usb, stable

[-- Attachment #1: Type: text/plain, Size: 1679 bytes --]

On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> ville.syrjala@linux.intel.com writes:
> > From: Ville Syrj�l� <ville.syrjala@linux.intel.com>
> >
> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
> >
> > commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
> > is very sluggish and can even stall entirely. Revert cures it.
> >
> > Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Ville Syrj�l� <ville.syrjala@linux.intel.com>
> 
> Reverting that commit is not the best idea. Let's start with the usual:
> 
> - Kernel version
> - dmesg on both sides (host and device)
> - dwc3 tracepoints:
> 
> # mkdir -p /t
> # mount -t tracefs none /t
> # echo 8192 > /t/buffer_size_kb
> # echo 1 > /t/events/dwc3/enable
> # echo 0 > /t/events/dwc3/dwc3_readl/enable
> # echo 0 > /t/events/dwc3/dwc3_writel/enable
> 
> This should be enough to tell me what's really going on.

I've attached the (compressed due to size) dmesgs/traces from the device.
For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
session, and I did get the lag in the bad run, so presumably the trace
should have caught it. Both were on 4.8 (+ the revert for the good run).

Host side dmesg just showes me:
 usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
 cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.4, CDC Ethernet Device, ...
when I boot the device. Nothing extra appears when things lag around.
I'm running 4.5.something on the host.

-- 
Ville Syrj�l�
Intel OTC

[-- Attachment #2: bad.dmesg.gz --]
[-- Type: application/octet-stream, Size: 12616 bytes --]

[-- Attachment #3: bad.trace.gz --]
[-- Type: application/octet-stream, Size: 79471 bytes --]

[-- Attachment #4: revert.dmesg.gz --]
[-- Type: application/octet-stream, Size: 12493 bytes --]

[-- Attachment #5: revert.trace.gz --]
[-- Type: application/octet-stream, Size: 60893 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-04  8:38   ` Ville Syrjälä
@ 2016-10-04 12:23     ` Felipe Balbi
  2016-10-04 13:20       ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Felipe Balbi @ 2016-10-04 12:23 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: linux-usb, stable

[-- Attachment #1: Type: text/plain, Size: 7895 bytes --]


Hi,

Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote:
>> 
>> Hi,
>> 
>> ville.syrjala@linux.intel.com writes:
>> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> >
>> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
>> >
>> > commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
>> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
>> > is very sluggish and can even stall entirely. Revert cures it.
>> >
>> > Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
>> > Cc: stable@vger.kernel.org
>> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> 
>> Reverting that commit is not the best idea. Let's start with the usual:
>> 
>> - Kernel version
>> - dmesg on both sides (host and device)
>> - dwc3 tracepoints:
>> 
>> # mkdir -p /t
>> # mount -t tracefs none /t
>> # echo 8192 > /t/buffer_size_kb
>> # echo 1 > /t/events/dwc3/enable
>> # echo 0 > /t/events/dwc3/dwc3_readl/enable
>> # echo 0 > /t/events/dwc3/dwc3_writel/enable
>> 
>> This should be enough to tell me what's really going on.
>
> I've attached the (compressed due to size) dmesgs/traces from the device.
> For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
> session, and I did get the lag in the bad run, so presumably the trace
> should have caught it. Both were on 4.8 (+ the revert for the good run).
>
> Host side dmesg just showes me:
>  usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
>  cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.4, CDC Ethernet Device, ...
> when I boot the device. Nothing extra appears when things lag around.
> I'm running 4.5.something on the host.

Looking at your logs what I can see is that "revert.trace" doesn't
respect gadget driver's IRQ throttling, which means you get an interrupt
for each and every request, while "bad.trace" shows dwc3 respecting
"no_interrupt" flag passed in by g_ether.

The only explanation I have for this behavior when we start respecting
gadget driver's flags, is that there's a possible race in u_ether.c
where it can go for a long time without an interrupt due to that
throttling.

This only happens with IN endpoints (the ones which are throttled) and
the behavior can be seen in log snippet below:

  27782:     irq/17-dwc3-2524  [000] d...   666.891969: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5
  27802:     ksoftirqd/0-3     [000] .Ns1   666.892047: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5
  27808:     ksoftirqd/0-3     [000] .Ns2   666.892062: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5
  27814:     ksoftirqd/0-3     [000] .Ns2   666.892069: eth_start_xmit: req ffff88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5
  27820:     ksoftirqd/0-3     [000] .Ns2   666.892075: eth_start_xmit: req ffff88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
  27826:     ksoftirqd/0-3     [000] .Ns2   666.892082: eth_start_xmit: req ffff88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5
  27832:     ksoftirqd/0-3     [000] .Ns2   666.892088: eth_start_xmit: req ffff88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5
  27884:     irq/17-dwc3-2524  [000] d...   666.892213: tx_complete: req ffff88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5
  27889:     irq/17-dwc3-2524  [000] d...   666.892215: tx_complete: req ffff88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
  27897:            sshd-2684  [002] ..s1   666.892223: eth_start_xmit: req ffff88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5
  27898:     irq/17-dwc3-2524  [000] d...   666.892223: tx_complete: req ffff88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
  27903:     irq/17-dwc3-2524  [000] d...   666.892225: tx_complete: req ffff88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
  27910:            sshd-2684  [002] ..s1   666.892230: eth_start_xmit: req ffff88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5
  27913:     irq/17-dwc3-2524  [000] d...   666.892231: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
  27920:            sshd-2684  [002] ..s1   666.892236: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
  27923:     irq/17-dwc3-2524  [000] d...   666.892238: tx_complete: req ffff88016da1a300 length 1514/1514 flags ZsI tx_qlen 7 qmult 5
  27928:            sshd-2684  [002] ..s1   666.892242: eth_start_xmit: req ffff88016da1a240 length 0/1514 flags Zsi tx_qlen 8 qmult 5
  27944:            sshd-2684  [002] ..s1   666.892253: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags Zsi tx_qlen 9 qmult 5
  27953:            sshd-2684  [002] ..s1   666.892262: eth_start_xmit: req ffff88016da1a600 length 0/1514 flags Zsi tx_qlen 10 qmult 5

Note here that we reached tx_qlen = 10 without any interrupts being requested.

I patched u_ether with below to capture log above:

diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
index 5f562c1ec795..e34c26ffcdf4 100644
--- a/drivers/usb/gadget/function/u_ether.c
+++ b/drivers/usb/gadget/function/u_ether.c
@@ -465,6 +465,15 @@ static void tx_complete(struct usb_ep *ep, struct usb_request *req)
        dev_kfree_skb_any(skb);
 
        atomic_dec(&dev->tx_qlen);
+
+       trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
+                       req, req->actual, req->length,
+                       req->zero ? 'Z' : 'z',
+                       req->short_not_ok ? 'S' : 's',
+                       req->no_interrupt ? 'i' : 'I',
+                       atomic_read(&dev->tx_qlen),
+                       dev->qmult);
+
        if (netif_carrier_ok(dev->net))
                netif_wake_queue(dev->net);
 }
@@ -600,6 +609,12 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
        case 0:
                netif_trans_update(net);
                atomic_inc(&dev->tx_qlen);
+               trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
+                               req, req->actual, req->length, req->zero ? 'Z' : 'z',
+                               req->short_not_ok ? 'S' : 's',
+                               req->no_interrupt ? 'i' : 'I',
+                               atomic_read(&dev->tx_qlen),
+                               dev->qmult);
        }
 
        if (retval) {

The only way I can think of to avoid this problem is to make sure we
don't starve our request queue with something like:

diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
index 5f562c1ec795..ddfa866c2cba 100644
--- a/drivers/usb/gadget/function/u_ether.c
+++ b/drivers/usb/gadget/function/u_ether.c
@@ -587,8 +596,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
 
        /* throttle high/super speed IRQ rate back slightly */
        if (gadget_is_dualspeed(dev->gadget))
-               req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
-                                    dev->gadget->speed == USB_SPEED_SUPER)
+               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
+                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
+                                       !list_empty(&dev->tx_reqs))
                        ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
                        : 0;
 
I have been testing this for about half an hour and I can't see delays
or link getting stuck. Care to test ? If it works for your setup, I'll
send this with Cc: stable and your Reported-by and Tested-by tags.

cheers

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-04 12:23     ` Felipe Balbi
@ 2016-10-04 13:20       ` Ville Syrjälä
  2016-10-04 14:03         ` Felipe Balbi
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2016-10-04 13:20 UTC (permalink / raw)
  To: Felipe Balbi; +Cc: linux-usb, stable

[-- Attachment #1: Type: text/plain, Size: 8239 bytes --]

On Tue, Oct 04, 2016 at 03:23:12PM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> Ville Syrj�l� <ville.syrjala@linux.intel.com> writes:
> > On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote:
> >> 
> >> Hi,
> >> 
> >> ville.syrjala@linux.intel.com writes:
> >> > From: Ville Syrj�l� <ville.syrjala@linux.intel.com>
> >> >
> >> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
> >> >
> >> > commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
> >> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
> >> > is very sluggish and can even stall entirely. Revert cures it.
> >> >
> >> > Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
> >> > Cc: stable@vger.kernel.org
> >> > Signed-off-by: Ville Syrj�l� <ville.syrjala@linux.intel.com>
> >> 
> >> Reverting that commit is not the best idea. Let's start with the usual:
> >> 
> >> - Kernel version
> >> - dmesg on both sides (host and device)
> >> - dwc3 tracepoints:
> >> 
> >> # mkdir -p /t
> >> # mount -t tracefs none /t
> >> # echo 8192 > /t/buffer_size_kb
> >> # echo 1 > /t/events/dwc3/enable
> >> # echo 0 > /t/events/dwc3/dwc3_readl/enable
> >> # echo 0 > /t/events/dwc3/dwc3_writel/enable
> >> 
> >> This should be enough to tell me what's really going on.
> >
> > I've attached the (compressed due to size) dmesgs/traces from the device.
> > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
> > session, and I did get the lag in the bad run, so presumably the trace
> > should have caught it. Both were on 4.8 (+ the revert for the good run).
> >
> > Host side dmesg just showes me:
> >  usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
> >  cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.4, CDC Ethernet Device, ...
> > when I boot the device. Nothing extra appears when things lag around.
> > I'm running 4.5.something on the host.
> 
> Looking at your logs what I can see is that "revert.trace" doesn't
> respect gadget driver's IRQ throttling, which means you get an interrupt
> for each and every request, while "bad.trace" shows dwc3 respecting
> "no_interrupt" flag passed in by g_ether.
> 
> The only explanation I have for this behavior when we start respecting
> gadget driver's flags, is that there's a possible race in u_ether.c
> where it can go for a long time without an interrupt due to that
> throttling.
> 
> This only happens with IN endpoints (the ones which are throttled) and
> the behavior can be seen in log snippet below:
> 
>   27782:     irq/17-dwc3-2524  [000] d...   666.891969: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5
>   27802:     ksoftirqd/0-3     [000] .Ns1   666.892047: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5
>   27808:     ksoftirqd/0-3     [000] .Ns2   666.892062: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5
>   27814:     ksoftirqd/0-3     [000] .Ns2   666.892069: eth_start_xmit: req ffff88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5
>   27820:     ksoftirqd/0-3     [000] .Ns2   666.892075: eth_start_xmit: req ffff88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>   27826:     ksoftirqd/0-3     [000] .Ns2   666.892082: eth_start_xmit: req ffff88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>   27832:     ksoftirqd/0-3     [000] .Ns2   666.892088: eth_start_xmit: req ffff88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>   27884:     irq/17-dwc3-2524  [000] d...   666.892213: tx_complete: req ffff88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5
>   27889:     irq/17-dwc3-2524  [000] d...   666.892215: tx_complete: req ffff88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>   27897:            sshd-2684  [002] ..s1   666.892223: eth_start_xmit: req ffff88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>   27898:     irq/17-dwc3-2524  [000] d...   666.892223: tx_complete: req ffff88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>   27903:     irq/17-dwc3-2524  [000] d...   666.892225: tx_complete: req ffff88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>   27910:            sshd-2684  [002] ..s1   666.892230: eth_start_xmit: req ffff88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>   27913:     irq/17-dwc3-2524  [000] d...   666.892231: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>   27920:            sshd-2684  [002] ..s1   666.892236: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>   27923:     irq/17-dwc3-2524  [000] d...   666.892238: tx_complete: req ffff88016da1a300 length 1514/1514 flags ZsI tx_qlen 7 qmult 5
>   27928:            sshd-2684  [002] ..s1   666.892242: eth_start_xmit: req ffff88016da1a240 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>   27944:            sshd-2684  [002] ..s1   666.892253: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>   27953:            sshd-2684  [002] ..s1   666.892262: eth_start_xmit: req ffff88016da1a600 length 0/1514 flags Zsi tx_qlen 10 qmult 5
> 
> Note here that we reached tx_qlen = 10 without any interrupts being requested.
> 
> I patched u_ether with below to capture log above:
> 
> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
> index 5f562c1ec795..e34c26ffcdf4 100644
> --- a/drivers/usb/gadget/function/u_ether.c
> +++ b/drivers/usb/gadget/function/u_ether.c
> @@ -465,6 +465,15 @@ static void tx_complete(struct usb_ep *ep, struct usb_request *req)
>         dev_kfree_skb_any(skb);
>  
>         atomic_dec(&dev->tx_qlen);
> +
> +       trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
> +                       req, req->actual, req->length,
> +                       req->zero ? 'Z' : 'z',
> +                       req->short_not_ok ? 'S' : 's',
> +                       req->no_interrupt ? 'i' : 'I',
> +                       atomic_read(&dev->tx_qlen),
> +                       dev->qmult);
> +
>         if (netif_carrier_ok(dev->net))
>                 netif_wake_queue(dev->net);
>  }
> @@ -600,6 +609,12 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>         case 0:
>                 netif_trans_update(net);
>                 atomic_inc(&dev->tx_qlen);
> +               trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
> +                               req, req->actual, req->length, req->zero ? 'Z' : 'z',
> +                               req->short_not_ok ? 'S' : 's',
> +                               req->no_interrupt ? 'i' : 'I',
> +                               atomic_read(&dev->tx_qlen),
> +                               dev->qmult);
>         }
>  
>         if (retval) {
> 
> The only way I can think of to avoid this problem is to make sure we
> don't starve our request queue with something like:
> 
> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
> index 5f562c1ec795..ddfa866c2cba 100644
> --- a/drivers/usb/gadget/function/u_ether.c
> +++ b/drivers/usb/gadget/function/u_ether.c
> @@ -587,8 +596,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>  
>         /* throttle high/super speed IRQ rate back slightly */
>         if (gadget_is_dualspeed(dev->gadget))
> -               req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
> -                                    dev->gadget->speed == USB_SPEED_SUPER)
> +               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
> +                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
> +                                       !list_empty(&dev->tx_reqs))
>                         ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>                         : 0;
>  
> I have been testing this for about half an hour and I can't see delays
> or link getting stuck. Care to test ? If it works for your setup, I'll
> send this with Cc: stable and your Reported-by and Tested-by tags.

That doesn't appear to cure the problem unfortunately. I've attached another
trace, with your u_ether trace_printks included.

-- 
Ville Syrj�l�
Intel OTC

[-- Attachment #2: u_ether.trace.gz --]
[-- Type: application/octet-stream, Size: 58771 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-04 13:20       ` Ville Syrjälä
@ 2016-10-04 14:03         ` Felipe Balbi
  2016-10-06  7:36           ` Felipe Balbi
  0 siblings, 1 reply; 16+ messages in thread
From: Felipe Balbi @ 2016-10-04 14:03 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: linux-usb, stable

[-- Attachment #1: Type: text/plain, Size: 10808 bytes --]


Hi,

Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> On Tue, Oct 04, 2016 at 03:23:12PM +0300, Felipe Balbi wrote:
>> 
>> Hi,
>> 
>> Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
>> > On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote:
>> >> 
>> >> Hi,
>> >> 
>> >> ville.syrjala@linux.intel.com writes:
>> >> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> >> >
>> >> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
>> >> >
>> >> > commit  55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
>> >> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
>> >> > is very sluggish and can even stall entirely. Revert cures it.
>> >> >
>> >> > Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
>> >> > Cc: stable@vger.kernel.org
>> >> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> >> 
>> >> Reverting that commit is not the best idea. Let's start with the usual:
>> >> 
>> >> - Kernel version
>> >> - dmesg on both sides (host and device)
>> >> - dwc3 tracepoints:
>> >> 
>> >> # mkdir -p /t
>> >> # mount -t tracefs none /t
>> >> # echo 8192 > /t/buffer_size_kb
>> >> # echo 1 > /t/events/dwc3/enable
>> >> # echo 0 > /t/events/dwc3/dwc3_readl/enable
>> >> # echo 0 > /t/events/dwc3/dwc3_writel/enable
>> >> 
>> >> This should be enough to tell me what's really going on.
>> >
>> > I've attached the (compressed due to size) dmesgs/traces from the device.
>> > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
>> > session, and I did get the lag in the bad run, so presumably the trace
>> > should have caught it. Both were on 4.8 (+ the revert for the good run).
>> >
>> > Host side dmesg just showes me:
>> >  usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
>> >  cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.4, CDC Ethernet Device, ...
>> > when I boot the device. Nothing extra appears when things lag around.
>> > I'm running 4.5.something on the host.
>> 
>> Looking at your logs what I can see is that "revert.trace" doesn't
>> respect gadget driver's IRQ throttling, which means you get an interrupt
>> for each and every request, while "bad.trace" shows dwc3 respecting
>> "no_interrupt" flag passed in by g_ether.
>> 
>> The only explanation I have for this behavior when we start respecting
>> gadget driver's flags, is that there's a possible race in u_ether.c
>> where it can go for a long time without an interrupt due to that
>> throttling.
>> 
>> This only happens with IN endpoints (the ones which are throttled) and
>> the behavior can be seen in log snippet below:
>> 
>>   27782:     irq/17-dwc3-2524  [000] d...   666.891969: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5
>>   27802:     ksoftirqd/0-3     [000] .Ns1   666.892047: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5
>>   27808:     ksoftirqd/0-3     [000] .Ns2   666.892062: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5
>>   27814:     ksoftirqd/0-3     [000] .Ns2   666.892069: eth_start_xmit: req ffff88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5
>>   27820:     ksoftirqd/0-3     [000] .Ns2   666.892075: eth_start_xmit: req ffff88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27826:     ksoftirqd/0-3     [000] .Ns2   666.892082: eth_start_xmit: req ffff88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>   27832:     ksoftirqd/0-3     [000] .Ns2   666.892088: eth_start_xmit: req ffff88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>>   27884:     irq/17-dwc3-2524  [000] d...   666.892213: tx_complete: req ffff88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5
>>   27889:     irq/17-dwc3-2524  [000] d...   666.892215: tx_complete: req ffff88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>>   27897:            sshd-2684  [002] ..s1   666.892223: eth_start_xmit: req ffff88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>   27898:     irq/17-dwc3-2524  [000] d...   666.892223: tx_complete: req ffff88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>>   27903:     irq/17-dwc3-2524  [000] d...   666.892225: tx_complete: req ffff88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>>   27910:            sshd-2684  [002] ..s1   666.892230: eth_start_xmit: req ffff88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27913:     irq/17-dwc3-2524  [000] d...   666.892231: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>>   27920:            sshd-2684  [002] ..s1   666.892236: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27923:     irq/17-dwc3-2524  [000] d...   666.892238: tx_complete: req ffff88016da1a300 length 1514/1514 flags ZsI tx_qlen 7 qmult 5
>>   27928:            sshd-2684  [002] ..s1   666.892242: eth_start_xmit: req ffff88016da1a240 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>   27944:            sshd-2684  [002] ..s1   666.892253: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>   27953:            sshd-2684  [002] ..s1   666.892262: eth_start_xmit: req ffff88016da1a600 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>> 
>> Note here that we reached tx_qlen = 10 without any interrupts being requested.
>> 
>> I patched u_ether with below to capture log above:
>> 
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index 5f562c1ec795..e34c26ffcdf4 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -465,6 +465,15 @@ static void tx_complete(struct usb_ep *ep, struct usb_request *req)
>>         dev_kfree_skb_any(skb);
>>  
>>         atomic_dec(&dev->tx_qlen);
>> +
>> +       trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>> +                       req, req->actual, req->length,
>> +                       req->zero ? 'Z' : 'z',
>> +                       req->short_not_ok ? 'S' : 's',
>> +                       req->no_interrupt ? 'i' : 'I',
>> +                       atomic_read(&dev->tx_qlen),
>> +                       dev->qmult);
>> +
>>         if (netif_carrier_ok(dev->net))
>>                 netif_wake_queue(dev->net);
>>  }
>> @@ -600,6 +609,12 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>         case 0:
>>                 netif_trans_update(net);
>>                 atomic_inc(&dev->tx_qlen);
>> +               trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>> +                               req, req->actual, req->length, req->zero ? 'Z' : 'z',
>> +                               req->short_not_ok ? 'S' : 's',
>> +                               req->no_interrupt ? 'i' : 'I',
>> +                               atomic_read(&dev->tx_qlen),
>> +                               dev->qmult);
>>         }
>>  
>>         if (retval) {
>> 
>> The only way I can think of to avoid this problem is to make sure we
>> don't starve our request queue with something like:
>> 
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index 5f562c1ec795..ddfa866c2cba 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -587,8 +596,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>  
>>         /* throttle high/super speed IRQ rate back slightly */
>>         if (gadget_is_dualspeed(dev->gadget))
>> -               req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
>> -                                    dev->gadget->speed == USB_SPEED_SUPER)
>> +               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
>> +                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
>> +                                       !list_empty(&dev->tx_reqs))
>>                         ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>>                         : 0;
>>  
>> I have been testing this for about half an hour and I can't see delays
>> or link getting stuck. Care to test ? If it works for your setup, I'll
>> send this with Cc: stable and your Reported-by and Tested-by tags.
>
> That doesn't appear to cure the problem unfortunately. I've attached another
> trace, with your u_ether trace_printks included.

Now I only seem to find delays on ep1out. Here are a few instances:

     irq/22-dwc3-803   [001] d..1    19.446979: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    20.452232: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    20.761702: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    21.044345: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    21.999383: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
[...]
     irq/22-dwc3-803   [001] d..1    22.239576: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    22.979703: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
[...]
     irq/22-dwc3-803   [001] d..1    23.125096: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    23.820697: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    24.461117: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    24.882567: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    25.693330: dwc3_event: event (00006084): ep1out: Transfer In-Progress

     irq/22-dwc3-803   [001] d..1    25.702107: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
     irq/22-dwc3-803   [001] d..1    26.045495: dwc3_event: event (00006084): ep1out: Transfer In-Progress

Based on this, it would seem that host side isn't sending the data; but
that's quite odd. Let's put a sniffer here tomorrow and look at this
more carefully, I couldn't reproduce this with SKL as I was testing this
today.

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-04 14:03         ` Felipe Balbi
@ 2016-10-06  7:36           ` Felipe Balbi
  2016-10-06  9:08             ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Felipe Balbi @ 2016-10-06  7:36 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: linux-usb, stable

[-- Attachment #1: Type: text/plain, Size: 12611 bytes --]


Hi,

Felipe Balbi <felipe.balbi@linux.intel.com> writes:
>>> >> Reverting that commit is not the best idea. Let's start with the usual:
>>> >> 
>>> >> - Kernel version
>>> >> - dmesg on both sides (host and device)
>>> >> - dwc3 tracepoints:
>>> >> 
>>> >> # mkdir -p /t
>>> >> # mount -t tracefs none /t
>>> >> # echo 8192 > /t/buffer_size_kb
>>> >> # echo 1 > /t/events/dwc3/enable
>>> >> # echo 0 > /t/events/dwc3/dwc3_readl/enable
>>> >> # echo 0 > /t/events/dwc3/dwc3_writel/enable
>>> >> 
>>> >> This should be enough to tell me what's really going on.
>>> >
>>> > I've attached the (compressed due to size) dmesgs/traces from the device.
>>> > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
>>> > session, and I did get the lag in the bad run, so presumably the trace
>>> > should have caught it. Both were on 4.8 (+ the revert for the good run).
>>> >
>>> > Host side dmesg just showes me:
>>> >  usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
>>> >  cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.4, CDC Ethernet Device, ...
>>> > when I boot the device. Nothing extra appears when things lag around.
>>> > I'm running 4.5.something on the host.
>>> 
>>> Looking at your logs what I can see is that "revert.trace" doesn't
>>> respect gadget driver's IRQ throttling, which means you get an interrupt
>>> for each and every request, while "bad.trace" shows dwc3 respecting
>>> "no_interrupt" flag passed in by g_ether.
>>> 
>>> The only explanation I have for this behavior when we start respecting
>>> gadget driver's flags, is that there's a possible race in u_ether.c
>>> where it can go for a long time without an interrupt due to that
>>> throttling.
>>> 
>>> This only happens with IN endpoints (the ones which are throttled) and
>>> the behavior can be seen in log snippet below:
>>> 
>>>   27782:     irq/17-dwc3-2524  [000] d...   666.891969: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5
>>>   27802:     ksoftirqd/0-3     [000] .Ns1   666.892047: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5
>>>   27808:     ksoftirqd/0-3     [000] .Ns2   666.892062: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5
>>>   27814:     ksoftirqd/0-3     [000] .Ns2   666.892069: eth_start_xmit: req ffff88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5
>>>   27820:     ksoftirqd/0-3     [000] .Ns2   666.892075: eth_start_xmit: req ffff88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>>   27826:     ksoftirqd/0-3     [000] .Ns2   666.892082: eth_start_xmit: req ffff88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>>   27832:     ksoftirqd/0-3     [000] .Ns2   666.892088: eth_start_xmit: req ffff88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>>>   27884:     irq/17-dwc3-2524  [000] d...   666.892213: tx_complete: req ffff88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5
>>>   27889:     irq/17-dwc3-2524  [000] d...   666.892215: tx_complete: req ffff88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>>>   27897:            sshd-2684  [002] ..s1   666.892223: eth_start_xmit: req ffff88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>>   27898:     irq/17-dwc3-2524  [000] d...   666.892223: tx_complete: req ffff88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>>>   27903:     irq/17-dwc3-2524  [000] d...   666.892225: tx_complete: req ffff88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>>>   27910:            sshd-2684  [002] ..s1   666.892230: eth_start_xmit: req ffff88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>>   27913:     irq/17-dwc3-2524  [000] d...   666.892231: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>>>   27920:            sshd-2684  [002] ..s1   666.892236: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>>   27923:     irq/17-dwc3-2524  [000] d...   666.892238: tx_complete: req ffff88016da1a300 length 1514/1514 flags ZsI tx_qlen 7 qmult 5
>>>   27928:            sshd-2684  [002] ..s1   666.892242: eth_start_xmit: req ffff88016da1a240 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>>>   27944:            sshd-2684  [002] ..s1   666.892253: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>>>   27953:            sshd-2684  [002] ..s1   666.892262: eth_start_xmit: req ffff88016da1a600 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>>> 
>>> Note here that we reached tx_qlen = 10 without any interrupts being requested.
>>> 
>>> I patched u_ether with below to capture log above:
>>> 
>>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>>> index 5f562c1ec795..e34c26ffcdf4 100644
>>> --- a/drivers/usb/gadget/function/u_ether.c
>>> +++ b/drivers/usb/gadget/function/u_ether.c
>>> @@ -465,6 +465,15 @@ static void tx_complete(struct usb_ep *ep, struct usb_request *req)
>>>         dev_kfree_skb_any(skb);
>>>  
>>>         atomic_dec(&dev->tx_qlen);
>>> +
>>> +       trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>>> +                       req, req->actual, req->length,
>>> +                       req->zero ? 'Z' : 'z',
>>> +                       req->short_not_ok ? 'S' : 's',
>>> +                       req->no_interrupt ? 'i' : 'I',
>>> +                       atomic_read(&dev->tx_qlen),
>>> +                       dev->qmult);
>>> +
>>>         if (netif_carrier_ok(dev->net))
>>>                 netif_wake_queue(dev->net);
>>>  }
>>> @@ -600,6 +609,12 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>>         case 0:
>>>                 netif_trans_update(net);
>>>                 atomic_inc(&dev->tx_qlen);
>>> +               trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>>> +                               req, req->actual, req->length, req->zero ? 'Z' : 'z',
>>> +                               req->short_not_ok ? 'S' : 's',
>>> +                               req->no_interrupt ? 'i' : 'I',
>>> +                               atomic_read(&dev->tx_qlen),
>>> +                               dev->qmult);
>>>         }
>>>  
>>>         if (retval) {
>>> 
>>> The only way I can think of to avoid this problem is to make sure we
>>> don't starve our request queue with something like:
>>> 
>>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>>> index 5f562c1ec795..ddfa866c2cba 100644
>>> --- a/drivers/usb/gadget/function/u_ether.c
>>> +++ b/drivers/usb/gadget/function/u_ether.c
>>> @@ -587,8 +596,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>>  
>>>         /* throttle high/super speed IRQ rate back slightly */
>>>         if (gadget_is_dualspeed(dev->gadget))
>>> -               req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
>>> -                                    dev->gadget->speed == USB_SPEED_SUPER)
>>> +               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
>>> +                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
>>> +                                       !list_empty(&dev->tx_reqs))
>>>                         ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>>>                         : 0;
>>>  
>>> I have been testing this for about half an hour and I can't see delays
>>> or link getting stuck. Care to test ? If it works for your setup, I'll
>>> send this with Cc: stable and your Reported-by and Tested-by tags.
>>
>> That doesn't appear to cure the problem unfortunately. I've attached another
>> trace, with your u_ether trace_printks included.
>
> Now I only seem to find delays on ep1out. Here are a few instances:
>
>      irq/22-dwc3-803   [001] d..1    19.446979: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
>      irq/22-dwc3-803   [001] d..1    20.452232: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>
>      irq/22-dwc3-803   [001] d..1    20.761702: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
>      irq/22-dwc3-803   [001] d..1    21.044345: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>
>      irq/22-dwc3-803   [001] d..1    21.999383: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
> [...]
>      irq/22-dwc3-803   [001] d..1    22.239576: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>
>      irq/22-dwc3-803   [001] d..1    22.979703: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
> [...]
>      irq/22-dwc3-803   [001] d..1    23.125096: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>
>      irq/22-dwc3-803   [001] d..1    23.820697: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
>      irq/22-dwc3-803   [001] d..1    24.461117: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>
>      irq/22-dwc3-803   [001] d..1    24.882567: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
>      irq/22-dwc3-803   [001] d..1    25.693330: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>
>      irq/22-dwc3-803   [001] d..1    25.702107: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
>      irq/22-dwc3-803   [001] d..1    26.045495: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>
> Based on this, it would seem that host side isn't sending the data; but
> that's quite odd. Let's put a sniffer here tomorrow and look at this
> more carefully, I couldn't reproduce this with SKL as I was testing this
> today.

Okay, I have found a regression on dwc3 and another patch follows:

commit 5e1a2af3e46248c55098cdae643c4141851b703e
Author: Felipe Balbi <felipe.balbi@linux.intel.com>
Date:   Wed Oct 5 14:24:37 2016 +0300

    usb: dwc3: gadget: properly account queued requests
    
    Some requests could be accounted for multiple
    times. Let's fix that so each and every requests is
    accounted for only once.
    
    Cc: <stable@vger.kernel.org> # v4.8
    Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit")
    Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 07cc8929f271..3c3ced128c77 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
 		req->trb = trb;
 		req->trb_dma = dwc3_trb_dma_offset(dep, trb);
 		req->first_trb_index = dep->trb_enqueue;
+		dep->queued_requests++;
 	}
 
 	dwc3_ep_inc_enq(dep);
@@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
 
 	trb->ctrl |= DWC3_TRB_CTRL_HWO;
 
-	dep->queued_requests++;
-
 	trace_dwc3_prepare_trb(dep, trb);
 }
 
@@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
 	unsigned int		s_pkt = 0;
 	unsigned int		trb_status;
 
-	dep->queued_requests--;
 	dwc3_ep_inc_deq(dep);
+
+	if (req->trb == trb)
+		dep->queued_requests--;
+
 	trace_dwc3_complete_trb(dep, trb);
 
 	/*

I have also built a branch which you can use for testing. Here's a pull
request, once you tell me it works for you, then I can send proper
patches out:

The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:

  Linux 4.8 (2016-10-02 16:24:33 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8

for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:

  usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 +0300)

----------------------------------------------------------------
Felipe Balbi (2):
      usb: gadget: function: u_ether: don't starve tx request queue
      usb: dwc3: gadget: properly account queued requests

 drivers/usb/dwc3/gadget.c             | 7 ++++---
 drivers/usb/gadget/function/u_ether.c | 5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)


-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-06  7:36           ` Felipe Balbi
@ 2016-10-06  9:08             ` Ville Syrjälä
  2016-10-27 15:40               ` Ville Syrjälä
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2016-10-06  9:08 UTC (permalink / raw)
  To: Felipe Balbi; +Cc: linux-usb, stable

[-- Attachment #1: Type: text/plain, Size: 2806 bytes --]

On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> Felipe Balbi <felipe.balbi@linux.intel.com> writes:
<snip>
> Okay, I have found a regression on dwc3 and another patch follows:
> 
> commit 5e1a2af3e46248c55098cdae643c4141851b703e
> Author: Felipe Balbi <felipe.balbi@linux.intel.com>
> Date:   Wed Oct 5 14:24:37 2016 +0300
> 
>     usb: dwc3: gadget: properly account queued requests
>     
>     Some requests could be accounted for multiple
>     times. Let's fix that so each and every requests is
>     accounted for only once.
>     
>     Cc: <stable@vger.kernel.org> # v4.8
>     Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit")
>     Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 07cc8929f271..3c3ced128c77 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
>  		req->trb = trb;
>  		req->trb_dma = dwc3_trb_dma_offset(dep, trb);
>  		req->first_trb_index = dep->trb_enqueue;
> +		dep->queued_requests++;
>  	}
>  
>  	dwc3_ep_inc_enq(dep);
> @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
>  
>  	trb->ctrl |= DWC3_TRB_CTRL_HWO;
>  
> -	dep->queued_requests++;
> -
>  	trace_dwc3_prepare_trb(dep, trb);
>  }
>  
> @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
>  	unsigned int		s_pkt = 0;
>  	unsigned int		trb_status;
>  
> -	dep->queued_requests--;
>  	dwc3_ep_inc_deq(dep);
> +
> +	if (req->trb == trb)
> +		dep->queued_requests--;
> +
>  	trace_dwc3_complete_trb(dep, trb);
>  
>  	/*
> 
> I have also built a branch which you can use for testing. Here's a pull
> request, once you tell me it works for you, then I can send proper
> patches out:
> 
> The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:
> 
>   Linux 4.8 (2016-10-02 16:24:33 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8
> 
> for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:
> 
>   usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 +0300)
> 
> ----------------------------------------------------------------
> Felipe Balbi (2):
>       usb: gadget: function: u_ether: don't starve tx request queue
>       usb: dwc3: gadget: properly account queued requests
> 
>  drivers/usb/dwc3/gadget.c             | 7 ++++---
>  drivers/usb/gadget/function/u_ether.c | 5 +++--
>  2 files changed, 7 insertions(+), 5 deletions(-)

Tried your branch, but unfortunately I'm still seeing the lags. New trace
attached.

-- 
Ville Syrj�l�
Intel OTC

[-- Attachment #2: felipe_branch.trace.gz --]
[-- Type: application/octet-stream, Size: 113871 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-06  9:08             ` Ville Syrjälä
@ 2016-10-27 15:40               ` Ville Syrjälä
  2016-10-28 10:16                 ` Felipe Balbi
  0 siblings, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2016-10-27 15:40 UTC (permalink / raw)
  To: Felipe Balbi; +Cc: linux-usb, stable

On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrj�l� wrote:
> On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote:
> > 
> > Hi,
> > 
> > Felipe Balbi <felipe.balbi@linux.intel.com> writes:
> <snip>
> > Okay, I have found a regression on dwc3 and another patch follows:
> > 
> > commit 5e1a2af3e46248c55098cdae643c4141851b703e
> > Author: Felipe Balbi <felipe.balbi@linux.intel.com>
> > Date:   Wed Oct 5 14:24:37 2016 +0300
> > 
> >     usb: dwc3: gadget: properly account queued requests
> >     
> >     Some requests could be accounted for multiple
> >     times. Let's fix that so each and every requests is
> >     accounted for only once.
> >     
> >     Cc: <stable@vger.kernel.org> # v4.8
> >     Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit")
> >     Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 07cc8929f271..3c3ced128c77 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >  		req->trb = trb;
> >  		req->trb_dma = dwc3_trb_dma_offset(dep, trb);
> >  		req->first_trb_index = dep->trb_enqueue;
> > +		dep->queued_requests++;
> >  	}
> >  
> >  	dwc3_ep_inc_enq(dep);
> > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >  
> >  	trb->ctrl |= DWC3_TRB_CTRL_HWO;
> >  
> > -	dep->queued_requests++;
> > -
> >  	trace_dwc3_prepare_trb(dep, trb);
> >  }
> >  
> > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
> >  	unsigned int		s_pkt = 0;
> >  	unsigned int		trb_status;
> >  
> > -	dep->queued_requests--;
> >  	dwc3_ep_inc_deq(dep);
> > +
> > +	if (req->trb == trb)
> > +		dep->queued_requests--;
> > +
> >  	trace_dwc3_complete_trb(dep, trb);
> >  
> >  	/*
> > 
> > I have also built a branch which you can use for testing. Here's a pull
> > request, once you tell me it works for you, then I can send proper
> > patches out:
> > 
> > The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:
> > 
> >   Linux 4.8 (2016-10-02 16:24:33 -0700)
> > 
> > are available in the git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8
> > 
> > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:
> > 
> >   usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 +0300)
> > 
> > ----------------------------------------------------------------
> > Felipe Balbi (2):
> >       usb: gadget: function: u_ether: don't starve tx request queue
> >       usb: dwc3: gadget: properly account queued requests
> > 
> >  drivers/usb/dwc3/gadget.c             | 7 ++++---
> >  drivers/usb/gadget/function/u_ether.c | 5 +++--
> >  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> Tried your branch, but unfortunately I'm still seeing the lags. New trace
> attached.

Just a reminder that the regressions is still there in 4.9-rc2.
Unfortunateyly with all the stuff already piled on top, getting USB
working on my device is no longer a simple matter of reverting the
one commit. I had to revert 10 patches to get even a clean revert, but
unfortunately that approach just lead to the transfer hanging immediately.

So what I ended up doing is reverting all of this:
git log --no-merges --format=oneline 55a0237f8f47957163125e20ee9260538c5c341c^..HEAD -- drivers/usb/dwc3/ include/linux/ulpi/interface.h drivers/usb/Kconfig drivers/usb/core/Kconfig

which comes out at whopping 70 commits. Having to carry that around
is going to be quite a pain especially as more stuff might be piled on
top.

/me thinks a stable backport of any fix (assuming one is found
eventually) is going to be quite the challenge...

-- 
Ville Syrj�l�
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-27 15:40               ` Ville Syrjälä
@ 2016-10-28 10:16                 ` Felipe Balbi
       [not found]                   ` <87pomksrrm.fsf-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Felipe Balbi @ 2016-10-28 10:16 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: linux-usb, stable, davem, Oliver Neukum, netdev

[-- Attachment #1: Type: text/plain, Size: 5856 bytes --]


Hi,

Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrjälä wrote:
>> On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote:
>> > 
>> > Hi,
>> > 
>> > Felipe Balbi <felipe.balbi@linux.intel.com> writes:
>> <snip>
>> > Okay, I have found a regression on dwc3 and another patch follows:
>> > 
>> > commit 5e1a2af3e46248c55098cdae643c4141851b703e
>> > Author: Felipe Balbi <felipe.balbi@linux.intel.com>
>> > Date:   Wed Oct 5 14:24:37 2016 +0300
>> > 
>> >     usb: dwc3: gadget: properly account queued requests
>> >     
>> >     Some requests could be accounted for multiple
>> >     times. Let's fix that so each and every requests is
>> >     accounted for only once.
>> >     
>> >     Cc: <stable@vger.kernel.org> # v4.8
>> >     Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit")
>> >     Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
>> > 
>> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> > index 07cc8929f271..3c3ced128c77 100644
>> > --- a/drivers/usb/dwc3/gadget.c
>> > +++ b/drivers/usb/dwc3/gadget.c
>> > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
>> >  		req->trb = trb;
>> >  		req->trb_dma = dwc3_trb_dma_offset(dep, trb);
>> >  		req->first_trb_index = dep->trb_enqueue;
>> > +		dep->queued_requests++;
>> >  	}
>> >  
>> >  	dwc3_ep_inc_enq(dep);
>> > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
>> >  
>> >  	trb->ctrl |= DWC3_TRB_CTRL_HWO;
>> >  
>> > -	dep->queued_requests++;
>> > -
>> >  	trace_dwc3_prepare_trb(dep, trb);
>> >  }
>> >  
>> > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
>> >  	unsigned int		s_pkt = 0;
>> >  	unsigned int		trb_status;
>> >  
>> > -	dep->queued_requests--;
>> >  	dwc3_ep_inc_deq(dep);
>> > +
>> > +	if (req->trb == trb)
>> > +		dep->queued_requests--;
>> > +
>> >  	trace_dwc3_complete_trb(dep, trb);
>> >  
>> >  	/*
>> > 
>> > I have also built a branch which you can use for testing. Here's a pull
>> > request, once you tell me it works for you, then I can send proper
>> > patches out:
>> > 
>> > The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:
>> > 
>> >   Linux 4.8 (2016-10-02 16:24:33 -0700)
>> > 
>> > are available in the git repository at:
>> > 
>> >   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8
>> > 
>> > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:
>> > 
>> >   usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 +0300)
>> > 
>> > ----------------------------------------------------------------
>> > Felipe Balbi (2):
>> >       usb: gadget: function: u_ether: don't starve tx request queue
>> >       usb: dwc3: gadget: properly account queued requests
>> > 
>> >  drivers/usb/dwc3/gadget.c             | 7 ++++---
>> >  drivers/usb/gadget/function/u_ether.c | 5 +++--
>> >  2 files changed, 7 insertions(+), 5 deletions(-)
>> 
>> Tried your branch, but unfortunately I'm still seeing the lags. New trace
>> attached.
>
> Just a reminder that the regressions is still there in 4.9-rc2.
> Unfortunateyly with all the stuff already piled on top, getting USB
> working on my device is no longer a simple matter of reverting the
> one commit. I had to revert 10 patches to get even a clean revert, but
> unfortunately that approach just lead to the transfer hanging immediately.
>
> So what I ended up doing is reverting all of this:
> git log --no-merges --format=oneline 55a0237f8f47957163125e20ee9260538c5c341c^..HEAD -- drivers/usb/dwc3/ include/linux/ulpi/interface.h drivers/usb/Kconfig drivers/usb/core/Kconfig
>
> which comes out at whopping 70 commits. Having to carry that around
> is going to be quite a pain especially as more stuff might be piled on
> top.
>
> /me thinks a stable backport of any fix (assuming one is found
> eventually) is going to be quite the challenge...

Yeah, I'm guessing we're gonna need some help from networking folks. The
only thing we did since v4.7 was actually respect req->no_interrupt flag
coming from u_ether itself. No idea why that causes so much trouble for
u_ether.

BTW, Instead of reverting so many patches, you can just remove
throttling:

diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
index f4a640216913..119a2e5848e8 100644
--- a/drivers/usb/gadget/function/u_ether.c
+++ b/drivers/usb/gadget/function/u_ether.c
@@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
 
        req->length = length;
 
-       /* throttle high/super speed IRQ rate back slightly */
-       if (gadget_is_dualspeed(dev->gadget))
-               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
-                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
-                                       !list_empty(&dev->tx_reqs))
-                       ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
-                       : 0;
-
        retval = usb_ep_queue(in, req, GFP_ATOMIC);
        switch (retval) {
        default:

I'm adding netdev and couple other folks to the loop.

Just to summarize, USB peripheral controller now actually throttles
interrupt when requested to do so and that causes lags for USB
networking gadgets.

Without throttle we, potentially, call netif_wake_queue() more
frequently than with throttling. I'm wondering if something changed in
NET layer within the past few years but the USB networking gadgets ended
up being forgotten.

Anyway, if anybody has any hints, I'd be glad to hear about them.

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-28 10:16                 ` Felipe Balbi
@ 2016-10-28 16:33                       ` Ville Syrjälä
  0 siblings, 0 replies; 16+ messages in thread
From: Ville Syrjälä @ 2016-10-28 16:33 UTC (permalink / raw)
  To: Felipe Balbi
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, stable-u79uwXL29TY76Z2rM5mHXA,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, Oliver Neukum,
	netdev-u79uwXL29TY76Z2rM5mHXA

On Fri, Oct 28, 2016 at 01:16:13PM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> Ville Syrjälä <ville.syrjala-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> writes:
> > On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrjälä wrote:
> >> On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote:
> >> > 
> >> > Hi,
> >> > 
> >> > Felipe Balbi <felipe.balbi-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> writes:
> >> <snip>
> >> > Okay, I have found a regression on dwc3 and another patch follows:
> >> > 
> >> > commit 5e1a2af3e46248c55098cdae643c4141851b703e
> >> > Author: Felipe Balbi <felipe.balbi-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
> >> > Date:   Wed Oct 5 14:24:37 2016 +0300
> >> > 
> >> >     usb: dwc3: gadget: properly account queued requests
> >> >     
> >> >     Some requests could be accounted for multiple
> >> >     times. Let's fix that so each and every requests is
> >> >     accounted for only once.
> >> >     
> >> >     Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org> # v4.8
> >> >     Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit")
> >> >     Signed-off-by: Felipe Balbi <felipe.balbi-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
> >> > 
> >> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> >> > index 07cc8929f271..3c3ced128c77 100644
> >> > --- a/drivers/usb/dwc3/gadget.c
> >> > +++ b/drivers/usb/dwc3/gadget.c
> >> > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >> >  		req->trb = trb;
> >> >  		req->trb_dma = dwc3_trb_dma_offset(dep, trb);
> >> >  		req->first_trb_index = dep->trb_enqueue;
> >> > +		dep->queued_requests++;
> >> >  	}
> >> >  
> >> >  	dwc3_ep_inc_enq(dep);
> >> > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >> >  
> >> >  	trb->ctrl |= DWC3_TRB_CTRL_HWO;
> >> >  
> >> > -	dep->queued_requests++;
> >> > -
> >> >  	trace_dwc3_prepare_trb(dep, trb);
> >> >  }
> >> >  
> >> > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
> >> >  	unsigned int		s_pkt = 0;
> >> >  	unsigned int		trb_status;
> >> >  
> >> > -	dep->queued_requests--;
> >> >  	dwc3_ep_inc_deq(dep);
> >> > +
> >> > +	if (req->trb == trb)
> >> > +		dep->queued_requests--;
> >> > +
> >> >  	trace_dwc3_complete_trb(dep, trb);
> >> >  
> >> >  	/*
> >> > 
> >> > I have also built a branch which you can use for testing. Here's a pull
> >> > request, once you tell me it works for you, then I can send proper
> >> > patches out:
> >> > 
> >> > The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:
> >> > 
> >> >   Linux 4.8 (2016-10-02 16:24:33 -0700)
> >> > 
> >> > are available in the git repository at:
> >> > 
> >> >   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8
> >> > 
> >> > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:
> >> > 
> >> >   usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 +0300)
> >> > 
> >> > ----------------------------------------------------------------
> >> > Felipe Balbi (2):
> >> >       usb: gadget: function: u_ether: don't starve tx request queue
> >> >       usb: dwc3: gadget: properly account queued requests
> >> > 
> >> >  drivers/usb/dwc3/gadget.c             | 7 ++++---
> >> >  drivers/usb/gadget/function/u_ether.c | 5 +++--
> >> >  2 files changed, 7 insertions(+), 5 deletions(-)
> >> 
> >> Tried your branch, but unfortunately I'm still seeing the lags. New trace
> >> attached.
> >
> > Just a reminder that the regressions is still there in 4.9-rc2.
> > Unfortunateyly with all the stuff already piled on top, getting USB
> > working on my device is no longer a simple matter of reverting the
> > one commit. I had to revert 10 patches to get even a clean revert, but
> > unfortunately that approach just lead to the transfer hanging immediately.
> >
> > So what I ended up doing is reverting all of this:
> > git log --no-merges --format=oneline 55a0237f8f47957163125e20ee9260538c5c341c^..HEAD -- drivers/usb/dwc3/ include/linux/ulpi/interface.h drivers/usb/Kconfig drivers/usb/core/Kconfig
> >
> > which comes out at whopping 70 commits. Having to carry that around
> > is going to be quite a pain especially as more stuff might be piled on
> > top.
> >
> > /me thinks a stable backport of any fix (assuming one is found
> > eventually) is going to be quite the challenge...
> 
> Yeah, I'm guessing we're gonna need some help from networking folks. The
> only thing we did since v4.7 was actually respect req->no_interrupt flag
> coming from u_ether itself. No idea why that causes so much trouble for
> u_ether.
> 
> BTW, Instead of reverting so many patches, you can just remove
> throttling:
> 
> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
> index f4a640216913..119a2e5848e8 100644
> --- a/drivers/usb/gadget/function/u_ether.c
> +++ b/drivers/usb/gadget/function/u_ether.c
> @@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>  
>         req->length = length;
>  
> -       /* throttle high/super speed IRQ rate back slightly */
> -       if (gadget_is_dualspeed(dev->gadget))
> -               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
> -                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
> -                                       !list_empty(&dev->tx_reqs))
> -                       ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
> -                       : 0;
> -
>         retval = usb_ep_queue(in, req, GFP_ATOMIC);
>         switch (retval) {
>         default:

Ah cool. That indeed fixes the problem for me.

> 
> I'm adding netdev and couple other folks to the loop.
> 
> Just to summarize, USB peripheral controller now actually throttles
> interrupt when requested to do so and that causes lags for USB
> networking gadgets.
> 
> Without throttle we, potentially, call netif_wake_queue() more
> frequently than with throttling. I'm wondering if something changed in
> NET layer within the past few years but the USB networking gadgets ended
> up being forgotten.
> 
> Anyway, if anybody has any hints, I'd be glad to hear about them.
> 
> -- 
> balbi



-- 
Ville Syrjälä
Intel OTC
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
@ 2016-10-28 16:33                       ` Ville Syrjälä
  0 siblings, 0 replies; 16+ messages in thread
From: Ville Syrjälä @ 2016-10-28 16:33 UTC (permalink / raw)
  To: Felipe Balbi; +Cc: linux-usb, stable, davem, Oliver Neukum, netdev

On Fri, Oct 28, 2016 at 01:16:13PM +0300, Felipe Balbi wrote:
> 
> Hi,
> 
> Ville Syrj�l� <ville.syrjala@linux.intel.com> writes:
> > On Thu, Oct 06, 2016 at 12:08:26PM +0300, Ville Syrj�l� wrote:
> >> On Thu, Oct 06, 2016 at 10:36:09AM +0300, Felipe Balbi wrote:
> >> > 
> >> > Hi,
> >> > 
> >> > Felipe Balbi <felipe.balbi@linux.intel.com> writes:
> >> <snip>
> >> > Okay, I have found a regression on dwc3 and another patch follows:
> >> > 
> >> > commit 5e1a2af3e46248c55098cdae643c4141851b703e
> >> > Author: Felipe Balbi <felipe.balbi@linux.intel.com>
> >> > Date:   Wed Oct 5 14:24:37 2016 +0300
> >> > 
> >> >     usb: dwc3: gadget: properly account queued requests
> >> >     
> >> >     Some requests could be accounted for multiple
> >> >     times. Let's fix that so each and every requests is
> >> >     accounted for only once.
> >> >     
> >> >     Cc: <stable@vger.kernel.org> # v4.8
> >> >     Fixes: 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for LST bit")
> >> >     Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
> >> > 
> >> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> >> > index 07cc8929f271..3c3ced128c77 100644
> >> > --- a/drivers/usb/dwc3/gadget.c
> >> > +++ b/drivers/usb/dwc3/gadget.c
> >> > @@ -783,6 +783,7 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >> >  		req->trb = trb;
> >> >  		req->trb_dma = dwc3_trb_dma_offset(dep, trb);
> >> >  		req->first_trb_index = dep->trb_enqueue;
> >> > +		dep->queued_requests++;
> >> >  	}
> >> >  
> >> >  	dwc3_ep_inc_enq(dep);
> >> > @@ -833,8 +834,6 @@ static void dwc3_prepare_one_trb(struct dwc3_ep *dep,
> >> >  
> >> >  	trb->ctrl |= DWC3_TRB_CTRL_HWO;
> >> >  
> >> > -	dep->queued_requests++;
> >> > -
> >> >  	trace_dwc3_prepare_trb(dep, trb);
> >> >  }
> >> >  
> >> > @@ -1861,8 +1860,11 @@ static int __dwc3_cleanup_done_trbs(struct dwc3 *dwc, struct dwc3_ep *dep,
> >> >  	unsigned int		s_pkt = 0;
> >> >  	unsigned int		trb_status;
> >> >  
> >> > -	dep->queued_requests--;
> >> >  	dwc3_ep_inc_deq(dep);
> >> > +
> >> > +	if (req->trb == trb)
> >> > +		dep->queued_requests--;
> >> > +
> >> >  	trace_dwc3_complete_trb(dep, trb);
> >> >  
> >> >  	/*
> >> > 
> >> > I have also built a branch which you can use for testing. Here's a pull
> >> > request, once you tell me it works for you, then I can send proper
> >> > patches out:
> >> > 
> >> > The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:
> >> > 
> >> >   Linux 4.8 (2016-10-02 16:24:33 -0700)
> >> > 
> >> > are available in the git repository at:
> >> > 
> >> >   git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git tmp-test-v4.8
> >> > 
> >> > for you to fetch changes up to c968b8d1effe64a7980802d1eef29f4e1922faca:
> >> > 
> >> >   usb: dwc3: gadget: properly account queued requests (2016-10-06 10:16:37 +0300)
> >> > 
> >> > ----------------------------------------------------------------
> >> > Felipe Balbi (2):
> >> >       usb: gadget: function: u_ether: don't starve tx request queue
> >> >       usb: dwc3: gadget: properly account queued requests
> >> > 
> >> >  drivers/usb/dwc3/gadget.c             | 7 ++++---
> >> >  drivers/usb/gadget/function/u_ether.c | 5 +++--
> >> >  2 files changed, 7 insertions(+), 5 deletions(-)
> >> 
> >> Tried your branch, but unfortunately I'm still seeing the lags. New trace
> >> attached.
> >
> > Just a reminder that the regressions is still there in 4.9-rc2.
> > Unfortunateyly with all the stuff already piled on top, getting USB
> > working on my device is no longer a simple matter of reverting the
> > one commit. I had to revert 10 patches to get even a clean revert, but
> > unfortunately that approach just lead to the transfer hanging immediately.
> >
> > So what I ended up doing is reverting all of this:
> > git log --no-merges --format=oneline 55a0237f8f47957163125e20ee9260538c5c341c^..HEAD -- drivers/usb/dwc3/ include/linux/ulpi/interface.h drivers/usb/Kconfig drivers/usb/core/Kconfig
> >
> > which comes out at whopping 70 commits. Having to carry that around
> > is going to be quite a pain especially as more stuff might be piled on
> > top.
> >
> > /me thinks a stable backport of any fix (assuming one is found
> > eventually) is going to be quite the challenge...
> 
> Yeah, I'm guessing we're gonna need some help from networking folks. The
> only thing we did since v4.7 was actually respect req->no_interrupt flag
> coming from u_ether itself. No idea why that causes so much trouble for
> u_ether.
> 
> BTW, Instead of reverting so many patches, you can just remove
> throttling:
> 
> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
> index f4a640216913..119a2e5848e8 100644
> --- a/drivers/usb/gadget/function/u_ether.c
> +++ b/drivers/usb/gadget/function/u_ether.c
> @@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>  
>         req->length = length;
>  
> -       /* throttle high/super speed IRQ rate back slightly */
> -       if (gadget_is_dualspeed(dev->gadget))
> -               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
> -                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
> -                                       !list_empty(&dev->tx_reqs))
> -                       ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
> -                       : 0;
> -
>         retval = usb_ep_queue(in, req, GFP_ATOMIC);
>         switch (retval) {
>         default:

Ah cool. That indeed fixes the problem for me.

> 
> I'm adding netdev and couple other folks to the loop.
> 
> Just to summarize, USB peripheral controller now actually throttles
> interrupt when requested to do so and that causes lags for USB
> networking gadgets.
> 
> Without throttle we, potentially, call netif_wake_queue() more
> frequently than with throttling. I'm wondering if something changed in
> NET layer within the past few years but the USB networking gadgets ended
> up being forgotten.
> 
> Anyway, if anybody has any hints, I'd be glad to hear about them.
> 
> -- 
> balbi



-- 
Ville Syrj�l�
Intel OTC

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-28 16:33                       ` Ville Syrjälä
@ 2016-10-31 18:51                         ` David Miller
  -1 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2016-10-31 18:51 UTC (permalink / raw)
  To: ville.syrjala; +Cc: felipe.balbi, linux-usb, stable, oneukum, netdev

From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri, 28 Oct 2016 19:33:32 +0300

> On Fri, Oct 28, 2016 at 01:16:13PM +0300, Felipe Balbi wrote:
>> Yeah, I'm guessing we're gonna need some help from networking folks. The
>> only thing we did since v4.7 was actually respect req->no_interrupt flag
>> coming from u_ether itself. No idea why that causes so much trouble for
>> u_ether.
>> 
>> BTW, Instead of reverting so many patches, you can just remove
>> throttling:
>> 
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index f4a640216913..119a2e5848e8 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>  
>>         req->length = length;
>>  
>> -       /* throttle high/super speed IRQ rate back slightly */
>> -       if (gadget_is_dualspeed(dev->gadget))
>> -               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
>> -                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
>> -                                       !list_empty(&dev->tx_reqs))
>> -                       ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>> -                       : 0;
>> -
>>         retval = usb_ep_queue(in, req, GFP_ATOMIC);
>>         switch (retval) {
>>         default:
> 
> Ah cool. That indeed fixes the problem for me.
> 
>> 
>> I'm adding netdev and couple other folks to the loop.
>> 
>> Just to summarize, USB peripheral controller now actually throttles
>> interrupt when requested to do so and that causes lags for USB
>> networking gadgets.
>> 
>> Without throttle we, potentially, call netif_wake_queue() more
>> frequently than with throttling. I'm wondering if something changed in
>> NET layer within the past few years but the USB networking gadgets ended
>> up being forgotten.
>> 
>> Anyway, if anybody has any hints, I'd be glad to hear about them.

This throttling mechanism seems to have the same problem we've seen in
the past with some ethernet drivers trying to do TX mitigation in
software.

If I understand correctly, the interrupt bit for TX completions is set
only periodically.

However, the networking stack has a hard requirement that all SKBs
which are transmitted must have their completion signalled in a finite
amount of time.  This is because, until the SKB is freed by the
driver, it holds onto socket, netfilter, and other subsystem
resources.

So, for example, if your scheme is that only every 8th TX packet will
generate an interrupt you run into problems if you suddenly have 7
pending TX packets and no more traffic is generated for a long time.

Those 7 packets will sit in the TX queue indefinitely, and this is the
situation which drivers must avoid.

Therefore, for devices with per-TX-queue-entry interrupt bit schemes,
it's not easy to take advantage of this facility.  The safest thing to
do is to interrupt for every queue entry.

For the time being, this revert is the way to go and it should be
submitted formally, with proper commit message and signoffs, via
whatever tree this gadget driver's changes should be submitted via.

It might be possible to elide TX queue entry interrupts using the
skb->xmit_more state.  Basically, if the TX queue is not full and
skb->xmit_more is set, you can skip the interrupt indication bit
in the descriptor.

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
@ 2016-10-31 18:51                         ` David Miller
  0 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2016-10-31 18:51 UTC (permalink / raw)
  To: ville.syrjala; +Cc: felipe.balbi, linux-usb, stable, oneukum, netdev

From: Ville Syrj�l� <ville.syrjala@linux.intel.com>
Date: Fri, 28 Oct 2016 19:33:32 +0300

> On Fri, Oct 28, 2016 at 01:16:13PM +0300, Felipe Balbi wrote:
>> Yeah, I'm guessing we're gonna need some help from networking folks. The
>> only thing we did since v4.7 was actually respect req->no_interrupt flag
>> coming from u_ether itself. No idea why that causes so much trouble for
>> u_ether.
>> 
>> BTW, Instead of reverting so many patches, you can just remove
>> throttling:
>> 
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index f4a640216913..119a2e5848e8 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>  
>>         req->length = length;
>>  
>> -       /* throttle high/super speed IRQ rate back slightly */
>> -       if (gadget_is_dualspeed(dev->gadget))
>> -               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
>> -                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
>> -                                       !list_empty(&dev->tx_reqs))
>> -                       ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>> -                       : 0;
>> -
>>         retval = usb_ep_queue(in, req, GFP_ATOMIC);
>>         switch (retval) {
>>         default:
> 
> Ah cool. That indeed fixes the problem for me.
> 
>> 
>> I'm adding netdev and couple other folks to the loop.
>> 
>> Just to summarize, USB peripheral controller now actually throttles
>> interrupt when requested to do so and that causes lags for USB
>> networking gadgets.
>> 
>> Without throttle we, potentially, call netif_wake_queue() more
>> frequently than with throttling. I'm wondering if something changed in
>> NET layer within the past few years but the USB networking gadgets ended
>> up being forgotten.
>> 
>> Anyway, if anybody has any hints, I'd be glad to hear about them.

This throttling mechanism seems to have the same problem we've seen in
the past with some ethernet drivers trying to do TX mitigation in
software.

If I understand correctly, the interrupt bit for TX completions is set
only periodically.

However, the networking stack has a hard requirement that all SKBs
which are transmitted must have their completion signalled in a finite
amount of time.  This is because, until the SKB is freed by the
driver, it holds onto socket, netfilter, and other subsystem
resources.

So, for example, if your scheme is that only every 8th TX packet will
generate an interrupt you run into problems if you suddenly have 7
pending TX packets and no more traffic is generated for a long time.

Those 7 packets will sit in the TX queue indefinitely, and this is the
situation which drivers must avoid.

Therefore, for devices with per-TX-queue-entry interrupt bit schemes,
it's not easy to take advantage of this facility.  The safest thing to
do is to interrupt for every queue entry.

For the time being, this revert is the way to go and it should be
submitted formally, with proper commit message and signoffs, via
whatever tree this gadget driver's changes should be submitted via.

It might be possible to elide TX queue entry interrupts using the
skb->xmit_more state.  Basically, if the TX queue is not full and
skb->xmit_more is set, you can skip the interrupt indication bit
in the descriptor.

Thanks.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
  2016-10-31 18:51                         ` David Miller
  (?)
@ 2016-11-01 11:30                         ` Felipe Balbi
  -1 siblings, 0 replies; 16+ messages in thread
From: Felipe Balbi @ 2016-11-01 11:30 UTC (permalink / raw)
  To: David Miller, ville.syrjala; +Cc: linux-usb, stable, oneukum, netdev

[-- Attachment #1: Type: text/plain, Size: 3762 bytes --]


Hi,

David Miller <davem@davemloft.net> writes:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Date: Fri, 28 Oct 2016 19:33:32 +0300
>
>> On Fri, Oct 28, 2016 at 01:16:13PM +0300, Felipe Balbi wrote:
>>> Yeah, I'm guessing we're gonna need some help from networking folks. The
>>> only thing we did since v4.7 was actually respect req->no_interrupt flag
>>> coming from u_ether itself. No idea why that causes so much trouble for
>>> u_ether.
>>> 
>>> BTW, Instead of reverting so many patches, you can just remove
>>> throttling:
>>> 
>>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>>> index f4a640216913..119a2e5848e8 100644
>>> --- a/drivers/usb/gadget/function/u_ether.c
>>> +++ b/drivers/usb/gadget/function/u_ether.c
>>> @@ -589,14 +589,6 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>>  
>>>         req->length = length;
>>>  
>>> -       /* throttle high/super speed IRQ rate back slightly */
>>> -       if (gadget_is_dualspeed(dev->gadget))
>>> -               req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
>>> -                                      dev->gadget->speed == USB_SPEED_SUPER)) &&
>>> -                                       !list_empty(&dev->tx_reqs))
>>> -                       ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>>> -                       : 0;
>>> -
>>>         retval = usb_ep_queue(in, req, GFP_ATOMIC);
>>>         switch (retval) {
>>>         default:
>> 
>> Ah cool. That indeed fixes the problem for me.
>> 
>>> 
>>> I'm adding netdev and couple other folks to the loop.
>>> 
>>> Just to summarize, USB peripheral controller now actually throttles
>>> interrupt when requested to do so and that causes lags for USB
>>> networking gadgets.
>>> 
>>> Without throttle we, potentially, call netif_wake_queue() more
>>> frequently than with throttling. I'm wondering if something changed in
>>> NET layer within the past few years but the USB networking gadgets ended
>>> up being forgotten.
>>> 
>>> Anyway, if anybody has any hints, I'd be glad to hear about them.
>
> This throttling mechanism seems to have the same problem we've seen in
> The past with some ethernet drivers trying to do TX mitigation in
> software.
>
> If I understand correctly, the interrupt bit for TX completions is set
> only periodically.
>
> However, the networking stack has a hard requirement that all SKBs
> which are transmitted must have their completion signalled in a finite
> amount of time.  This is because, until the SKB is freed by the
> driver, it holds onto socket, netfilter, and other subsystem
> resources.
>
> So, for example, if your scheme is that only every 8th TX packet will
> generate an interrupt you run into problems if you suddenly have 7
> pending TX packets and no more traffic is generated for a long time.
>
> Those 7 packets will sit in the TX queue indefinitely, and this is the
> situation which drivers must avoid.
>
> Therefore, for devices with per-TX-queue-entry interrupt bit schemes,
> it's not easy to take advantage of this facility.  The safest thing to
> do is to interrupt for every queue entry.
>
> For the time being, this revert is the way to go and it should be
> submitted formally, with proper commit message and signoffs, via
> whatever tree this gadget driver's changes should be submitted via.
>
> It might be possible to elide TX queue entry interrupts using the
> skb->xmit_more state.  Basically, if the TX queue is not full and
> skb->xmit_more is set, you can skip the interrupt indication bit
> in the descriptor.

thanks for confirming, patch removing throttling sent. You're in Cc.

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-11-01 11:31 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-03 14:18 [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit" ville.syrjala
2016-10-03 16:08 ` Felipe Balbi
2016-10-04  7:43   ` Felipe Balbi
2016-10-04  8:38   ` Ville Syrjälä
2016-10-04 12:23     ` Felipe Balbi
2016-10-04 13:20       ` Ville Syrjälä
2016-10-04 14:03         ` Felipe Balbi
2016-10-06  7:36           ` Felipe Balbi
2016-10-06  9:08             ` Ville Syrjälä
2016-10-27 15:40               ` Ville Syrjälä
2016-10-28 10:16                 ` Felipe Balbi
     [not found]                   ` <87pomksrrm.fsf-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2016-10-28 16:33                     ` Ville Syrjälä
2016-10-28 16:33                       ` Ville Syrjälä
2016-10-31 18:51                       ` David Miller
2016-10-31 18:51                         ` David Miller
2016-11-01 11:30                         ` Felipe Balbi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.