* [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits
@ 2016-01-29  2:19 Douglas Anderson
  2016-01-29  2:19   ` Douglas Anderson
                   ` (22 more replies)
  0 siblings, 23 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

This is a bit of a catchall series for all the bug fix and performance
patches I've been working on over the last few months.  Note that for
dwc2 we need to do LOTS in software and need super low interrupt
latency, so most performance improvements actually fix real bugs.

Patches are structured to start with no-brainer stuff that could be
applied ASAP, especially things I've already gotten Acks for.  Things
get slightly more RFC / RFT like as we get farther down the series.
Anything that can be landed sooner rather than later (especially those
Acked long ago) would help in re-posts (I'm not biased, of course).

It's been a few months since my last post of this series.  In the
meantime I've added a bunch of small bugfixes to the start of it and
also TOTALLY REWROTE the microframe scheduler.  I'll say up front: I
know nothing about USB.  I haven't read the whole spec.  I'm not
terribly familiar with the OHCI, EHCI, and XHCI drivers in the kernel.
...and I'm pretty clueless overall.  Nevertheless, I've attempted to
write up a fancy scheduler based on the portion of the spec talking
about microframe scheduling requirements.  This rewritten scheduler does
seem to help when I start jamming lots of USB things into a hub, so
presumably the code is a reasonable starting point.  Given my current
understanding of USB the old code was fairly insane, so presumably even
if my new patch isn't perfect it's better than what we had.

Anyway, on to the patches:

1. usb: dwc2: rockchip: Make the max_transfer_size automatic

   No brainer.  Can land any time.

2. usb: dwc2: host: Get aligned DMA in a more supported way

   Although this touches a lot of code, it's mostly just deleting
   stuff.  The way this is working is nearly the same as tegra.  Biggest
   objection I expect is that it has too much duplication with tegra and
   musb.  I'd personally prefer to land it now and remove duplication
   later, but up to others.  Speeding up the interrupt handler helps
   with SOF scheduling, so this is not just a dumb optimization.

3. usb: dwc2: host: Set host_rx_fifo_size to 525 for rk3066

   Seems like a good idea and small impact, but if someone hates it or
   it breaks on some Rockchip SoC, just drop it.  I've only tested on
   rk3288 so it would be nice if someone with access to more Rockchip
   SoCs can give a Tested-by.

4. usb: dwc2: host: Avoid use of chan->qh after qh freed

   Simple bugfix.  Unrelated to the series but thrown in here.

5. usb: dwc2: host: Always add to the tail of queues

   Big functionality improvement.  Small patch.  Suggest applying ASAP.

6. usb: dwc2: host: fix split transfer schedule sequence

   Unless I'm misunderstanding, this should be a no-brainer to fix.
   Could be some bikeshedding on how to fix this.  Let me know if/how
   you want me to spin.  Otherwise I'd say land it and it will fix a
   bunch of stuff.

7. usb: dwc2: host: Add scheduler tracing

   Shouldn't hurt anything.  If you have bikesheds, let me know.  Many
   future patches require this one just because they add additional
   traces.

8. usb: dwc2: host: Add a delay before releasing periodic bandwidth
9. usb: dwc2: host: Giveback URB in tasklet context

   I think we should take these.  They improve things a bunch and I have
   found no regressions due to them.  Additional testing appreciated, of
   course.

10. usb: dwc2: host: Properly set the HFIR

   I sent this out on its own, but since I'm resending the series I
   figured I'd jam it in here.  Can really go anywhere in the series or
   be applied totally on its own.

11. usb: dwc2: host: There's not really a TT for the root hub

   Seems right to me, but if someone knows better then please drop.
   Wasn't part of the previous series so doesn't have any Tested-by
   tags, though Stefan did indicate that he tried it and it didn't
   appear to break anything for him.

   Can be applied totally on its own.

12. usb: dwc2: host: Use periodic interrupt even with DMA

   Just came up with this one recently so it's had slightly less
   testing.  ...but it certainly fixed a bunch of stuff.  Could probably
   be moved around in the series to be pretty much anywhere.  I don't
   think this has a huge impact until we fix the scheduler (below) but
   at the same time I'm pretty sure it's something that's been wrong for
   a long time.

13. usb: dwc2: host: Rename some fields in struct dwc2_qh
14. usb: dwc2: host: Reorder things in hcd_queue.c
15. usb: dwc2: host: Split code out to make dwc2_do_reserve()

   Cleanups to make future patches easier to understand.  Bikeshed away.
   All no-op changes.

16. usb: dwc2: host: Add scheduler logging for missed SOFs

   I found this to be quite helpful.  If you hate it, drop it from the
   series.

17. usb: dwc2: host: Manage frame nums better in scheduler

   Doesn't totally make sense on its own, but a good halfway point to
   the microframe scheduler.  ...and shouldn't regress anything.  Allows
   us to do the "Properly set even/odd frame" patch below which
   definitely improves things.

18. usb: dwc2: host: Schedule periodic right away if it's time

   Yet another small change to make scheduling tighter.

19. usb: dwc2: host: Add dwc2_hcd_get_future_frame_number() call

   Prep for ("usb: dwc2: host: Properly set even/odd frame")

20. usb: dwc2: host: Properly set even/odd frame

   Helps quite a bit.  Helps even more after the redone microframe
   scheduler.  Feel free to tidy up if you see easy ways to do this.
   Maybe someone has a better way to estimate time on the wire?

21. usb: dwc2: host: Totally redo the microframe scheduler

   Eyeballs please!  I think I've stared at this too much and now my
   eyes are glazing over.  This definitely helps but also probably needs
   a few more spins?  Of course, if nobody wants to review it, IMHO
   checking it in as-is is WAAAAY better than what we had before.

22. usb: dwc2: host: If using uframe scheduler, end splits better

   Low confidence in this one.  Worry that it will end something too
   soon, but haven't seen it yet.

===

Below is discussion of some of the speedup stuff (mostly relevant to the
first few patches).

===

The dwc2 interrupt handler is quite slow.  On rk3288 with a few things
plugged into the ports and with cpufreq locked at 696 MHz (to simulate a
real-world idle system), I can easily observe dwc2_handle_hcd_intr()
taking > 120 us, sometimes > 150 us.  Note that SOF interrupts come
every 125 us with high speed USB, so taking > 120 us in the interrupt
handler is a big deal.

The patches here will speed up the interrupt handler significantly.
After this series, I have a hard time seeing the interrupt handler
taking > 20 us and I never see it taking > 30 us in my tests unless I
bring the cpufreq back down.  With the cpufreq at 126 MHz I can still
see the interrupt handler take > 50 us, so I'm sure we could improve
this further.  ...but hey, it's a start.
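
To put those numbers in perspective, here is a rough sketch (not part of
the series; sof_budget_percent() is a made-up helper) of how much of the
125 us high-speed microframe interval the handler eats:

```c
#include <assert.h>

/* High-speed USB delivers a SOF every microframe, i.e. every 125 us. */
#define HS_UFRAME_US 125u

/*
 * Hypothetical helper: percentage of one microframe spent inside the
 * interrupt handler, rounded down.
 */
static inline unsigned int sof_budget_percent(unsigned int handler_us)
{
	return handler_us * 100u / HS_UFRAME_US;
}

/* The figures quoted above, before and after the series at 696 MHz. */
static inline void sof_budget_examples(void)
{
	assert(sof_budget_percent(120) == 96);	/* before: nearly the whole uframe */
	assert(sof_budget_percent(30) == 24);	/* after: worst case seen in tests */
}
```

Anything at or above 100 means the next SOF arrives before the handler
has even returned.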

This series also shows big speed improvements when testing with a USB
Gigabit Ethernet adapter.  Previously the tested adapter would top out
at about 15MB/s.  After these changes it gets about 23MB/s.

In addition to the speedup, this series also has the advantage of
simplifying dwc2 and making it more like everyone else (introducing the
possibility of future simplifications).  Picking this series up will
help your diffstat and likely win you friends.  ;)

===

Steps for gathering data with ftrace (for some reason I have to run
twice):

cd /sys/devices/system/cpu/cpu0/cpufreq/
echo userspace > scaling_governor
echo 696000 > scaling_setspeed

cd /sys/kernel/debug/tracing
echo 0 > tracing_on
echo "" > trace
echo nop > current_tracer
echo function_graph > current_tracer
echo dwc2_handle_hcd_intr > set_graph_function
echo dwc2_handle_common_intr >> set_graph_function
echo dwc2_handle_hcd_intr > set_ftrace_filter
echo dwc2_handle_common_intr >> set_ftrace_filter
echo funcgraph-abstime > trace_options
echo 70 > tracing_thresh
echo 1 > /sys/kernel/debug/tracing/tracing_on

sleep 2
cat trace

Changes in v6:
- Add Kever's Reviewed-bys.
- Add Kever's Tested-bys.
- Add Heiko's Tested-bys.
- Add Stefan's Tested-bys.
- Back to 525 dwords, not 528.
- Add one more instance of check; kept Reviewed-by / Tested-by (OK?).
- Fix patch tags (hcd -> host)
- Incorporated Properly set the HFIR patch to big series in v6
- There's not really a TT for the root hub new for v6
- Fix bug where periodic things get scheduled too quick (Alan Stern)
- Removed incorrect limit on number of channels (Heiko Stuebner).
- Fixed order of operations bug in debug print.

Changes in v5:
- Move list maintenance to hcd.c to avoid gadget-only compile error
- Moved defines outside of ifdef to avoid gadget-only compile error.

Changes in v4:
- Add John's Acks from <https://patchwork.kernel.org/patch/7631551>
- Set host_rx_fifo_size to 528 for rk3066 new for v4.
- Avoid use of chan->qh after qh freed new for v4.
- Always add to the tail of queues new for v4.
- fix split transfer schedule sequence new for v4.
- Retooled scheduler tracing a bit, so left off John's Ack from v3.
- Moved periodic bandwidth release delay patch earlier again.
- A bit earlier in the list of patches than in v3.
- Use periodic interrupt even with DMA new for v4.
- Rename some fields in struct dwc2_qh new for v4.
- Reorder things in hcd_queue.c new for v4.
- Split code out to make dwc2_do_reserve() new for v4.
- Add scheduler logging for missed SOFs new for v4.
- Manage frame nums better in scheduler new for v4.
- Schedule periodic right away if it's time new for v4.
- Add dwc2_hcd_get_future_frame_number() call new for v4.
- Properly set even/odd frame new for v4.
- Figured out what the microframe scheduler was supposed to do.
- Microframe rewrite is totally different from v3, hopefully more right.
- Microframe rewrite is later in the series now.
- If using uframe scheduler, end splits better new for v4.

Changes in v3:
- Moved periodic bandwidth release delay patch later in the series.
- The uframe scheduler patch is folded into optimization series.
- Optimize uframe scheduler "single uframe" case a little.
- uframe scheduler now atop logging patches.
- uframe scheduler now before delayed bandwidth release patches.
- Add defines like EARLY_FRAME_USEC
- Reorder dwc2_deschedule_periodic() in prep for future patches.
- uframe scheduler now shows real usefulness w/ future patches!
- Assuming single_tt is new for v3; not terribly well tested (yet).
- Keep track and use our uframe new for v3.

Changes in v2:
- Add a warn if setup_dma is not aligned (Julius Werner).
- Periodic bandwidth release delay new for V2
- Commit message now says that URB giveback change needs delay change.
- Totally rewrote uframe scheduler again after writing test code.
- uframe scheduler atop delayed bandwidth release patches.

Douglas Anderson (22):
  usb: dwc2: rockchip: Make the max_transfer_size automatic
  usb: dwc2: host: Get aligned DMA in a more supported way
  usb: dwc2: host: Set host_rx_fifo_size to 525 for rk3066
  usb: dwc2: host: Avoid use of chan->qh after qh freed
  usb: dwc2: host: Always add to the tail of queues
  usb: dwc2: host: fix split transfer schedule sequence
  usb: dwc2: host: Add scheduler tracing
  usb: dwc2: host: Add a delay before releasing periodic bandwidth
  usb: dwc2: host: Giveback URB in tasklet context
  usb: dwc2: host: Properly set the HFIR
  usb: dwc2: host: There's not really a TT for the root hub
  usb: dwc2: host: Use periodic interrupt even with DMA
  usb: dwc2: host: Rename some fields in struct dwc2_qh
  usb: dwc2: host: Reorder things in hcd_queue.c
  usb: dwc2: host: Split code out to make dwc2_do_reserve()
  usb: dwc2: host: Add scheduler logging for missed SOFs
  usb: dwc2: host: Manage frame nums better in scheduler
  usb: dwc2: host: Schedule periodic right away if it's time
  usb: dwc2: host: Add dwc2_hcd_get_future_frame_number() call
  usb: dwc2: host: Properly set even/odd frame
  usb: dwc2: host: Totally redo the microframe scheduler
  usb: dwc2: host: If using uframe scheduler, end splits better

 drivers/usb/dwc2/core.c      |  119 ++-
 drivers/usb/dwc2/core.h      |  114 ++-
 drivers/usb/dwc2/hcd.c       |  392 ++++++---
 drivers/usb/dwc2/hcd.h       |  126 ++-
 drivers/usb/dwc2/hcd_ddma.c  |   41 +-
 drivers/usb/dwc2/hcd_intr.c  |  174 ++--
 drivers/usb/dwc2/hcd_queue.c | 1965 ++++++++++++++++++++++++++++++++++--------
 drivers/usb/dwc2/platform.c  |    4 +-
 8 files changed, 2276 insertions(+), 659 deletions(-)

-- 
2.7.0.rc3.207.g0ac5344

* [PATCH v6 01/22] usb: dwc2: rockchip: Make the max_transfer_size automatic
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

Previously we needed to explicitly set max_transfer_size to 65535
because the old driver would detect that our hardware could support much
bigger transfers and then would try to do them.  This wouldn't work
since the DMA alignment code couldn't support it.

Later in commit e8f8c14d9da7 ("usb: dwc2: clip max_transfer_size to
65535") upstream added support for clipping this automatically.  Since
that commit it has been OK to just use "-1" (default), but nobody
bothered to change it.

Let's change it to default now for two reasons:
- It's nice to use autodetected params.
- If we can remove the 65535 limit, we can transfer more!
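
As a rough illustration of what "autodetected" means here, the sketch
below reuses the expression dwc2_get_hwparams() applies to the
transfer-size counter width, plus the 65535 clip described above.  The
resolve_max_transfer_size() helper and its "clip" argument are
assumptions made for the example, not driver code:

```c
#include <assert.h>

/* Cap added by commit e8f8c14d9da7, as described above. */
#define OLD_XFER_CLIP 65535

/* Same expression the driver uses: (1 << (width + 11)) - 1. */
static inline int hw_max_transfer_size(unsigned int width)
{
	assert(width + 11 < 31);	/* keep the shift inside a signed int */
	return (1 << (width + 11)) - 1;
}

/*
 * Hypothetical resolver: a negative param means "take whatever the
 * hardware reports", optionally clipped to the old 65535 limit.
 */
static inline int resolve_max_transfer_size(int param, unsigned int width,
					    int clip)
{
	int hw = hw_max_transfer_size(width);

	if (clip && hw > OLD_XFER_CLIP)
		hw = OLD_XFER_CLIP;

	return param < 0 ? hw : param;
}
```

With the clip in place, -1 and 65535 resolve to the same value for any
width of 5 or more; once the clip goes away, only -1 lets the larger
hardware limit through.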

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
---
Changes in v6: None
Changes in v5: None
Changes in v4:
- Add John's Acks from <https://patchwork.kernel.org/patch/7631551>

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index 510f787434b3..5008a467ce06 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -129,7 +129,7 @@ static const struct dwc2_core_params params_rk3066 = {
 	.host_rx_fifo_size		= 520,	/* 520 DWORDs */
 	.host_nperio_tx_fifo_size	= 128,	/* 128 DWORDs */
 	.host_perio_tx_fifo_size	= 256,	/* 256 DWORDs */
-	.max_transfer_size		= 65535,
+	.max_transfer_size		= -1,
 	.max_packet_count		= -1,
 	.host_channels			= -1,
 	.phy_type			= -1,
-- 
2.7.0.rc3.207.g0ac5344

* [PATCH v6 02/22] usb: dwc2: host: Get aligned DMA in a more supported way
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

All other host controllers that want aligned buffers for DMA do it a
certain way.  Let's do that too instead of working behind the USB core's
back.  This makes our interrupt handler not take forever and also rips
out a lot of code, simplifying things a bunch.

This also has the side effect of removing the 65535 max transfer size
limit.

NOTE: The actual code to allocate the aligned buffers is ripped almost
completely from the tegra EHCI driver.  At some point in the future we
may want to add this functionality to the USB core to share more code
everywhere.
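
For readers who haven't seen the trick before, here is a userspace
sketch of the bounce-buffer idea: a small header sits directly in front
of an aligned payload, so the original pointer can be recovered from the
payload pointer alone.  It uses malloc() instead of kmalloc(), and the
names (struct bounce, bounce_get(), bounce_put()) are made up for the
example; the real code is in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DMA_ALIGN 4	/* the controller wants DWORD-aligned buffers */

/* Header stored directly in front of the aligned payload. */
struct bounce {
	void *raw;		/* pointer returned by malloc(), for free() */
	void *orig;		/* caller's original, possibly unaligned, buffer */
	unsigned char data[];	/* aligned payload handed to the hardware */
};

/* Allocate an aligned stand-in for orig; copy data in for OUT transfers. */
static inline unsigned char *bounce_get(void *orig, size_t len, int dir_out)
{
	size_t hdr = offsetof(struct bounce, data);
	unsigned char *raw = malloc(len + hdr + DMA_ALIGN - 1);
	uintptr_t payload;
	struct bounce *b;

	if (!raw)
		return NULL;

	/* Round the first byte after the header up to the alignment... */
	payload = ((uintptr_t)raw + hdr + DMA_ALIGN - 1) &
		  ~(uintptr_t)(DMA_ALIGN - 1);
	/* ...then step back so the header ends exactly where data begins. */
	b = (struct bounce *)(payload - hdr);
	b->raw = raw;
	b->orig = orig;
	assert(((uintptr_t)b->data & (DMA_ALIGN - 1)) == 0);

	if (dir_out)
		memcpy(b->data, orig, len);
	return b->data;
}

/* Copy back for IN transfers, free the stand-in, return the original. */
static inline void *bounce_put(unsigned char *data, size_t len, int dir_in)
{
	struct bounce *b =
		(struct bounce *)(data - offsetof(struct bounce, data));
	void *orig = b->orig;

	if (dir_in)
		memcpy(orig, b->data, len);
	free(b->raw);
	return orig;
}
```

With a typical malloc() the rounding step is a no-op because the
allocation is already aligned well beyond 4 bytes; the padding is there
so the sketch does not depend on that.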

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: John Youn <johnyoun@synopsys.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Add John's Acks from <https://patchwork.kernel.org/patch/7631551>

Changes in v3: None
Changes in v2:
- Add a warn if setup_dma is not aligned (Julius Werner).

 drivers/usb/dwc2/core.c      |  21 +-----
 drivers/usb/dwc2/hcd.c       | 170 +++++++++++++++++++++----------------------
 drivers/usb/dwc2/hcd.h       |  10 ---
 drivers/usb/dwc2/hcd_intr.c  |  65 -----------------
 drivers/usb/dwc2/hcd_queue.c |   7 +-
 5 files changed, 87 insertions(+), 186 deletions(-)

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index 39a0fa8a4c0a..73f2771b7740 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -1958,19 +1958,11 @@ void dwc2_hc_start_transfer(struct dwc2_hsotg *hsotg,
 	}
 
 	if (hsotg->core_params->dma_enable > 0) {
-		dma_addr_t dma_addr;
-
-		if (chan->align_buf) {
-			if (dbg_hc(chan))
-				dev_vdbg(hsotg->dev, "align_buf\n");
-			dma_addr = chan->align_buf;
-		} else {
-			dma_addr = chan->xfer_dma;
-		}
-		dwc2_writel((u32)dma_addr, hsotg->regs + HCDMA(chan->hc_num));
+		dwc2_writel((u32)chan->xfer_dma,
+			    hsotg->regs + HCDMA(chan->hc_num));
 		if (dbg_hc(chan))
 			dev_vdbg(hsotg->dev, "Wrote %08lx to HCDMA(%d)\n",
-				 (unsigned long)dma_addr, chan->hc_num);
+				 (unsigned long)chan->xfer_dma, chan->hc_num);
 	}
 
 	/* Start the split */
@@ -3363,13 +3355,6 @@ int dwc2_get_hwparams(struct dwc2_hsotg *hsotg)
 	width = (hwcfg3 & GHWCFG3_XFER_SIZE_CNTR_WIDTH_MASK) >>
 		GHWCFG3_XFER_SIZE_CNTR_WIDTH_SHIFT;
 	hw->max_transfer_size = (1 << (width + 11)) - 1;
-	/*
-	 * Clip max_transfer_size to 65535. dwc2_hc_setup_align_buf() allocates
-	 * coherent buffers with this size, and if it's too large we can
-	 * exhaust the coherent DMA pool.
-	 */
-	if (hw->max_transfer_size > 65535)
-		hw->max_transfer_size = 65535;
 	width = (hwcfg3 & GHWCFG3_PACKET_SIZE_CNTR_WIDTH_MASK) >>
 		GHWCFG3_PACKET_SIZE_CNTR_WIDTH_SHIFT;
 	hw->max_packet_count = (1 << (width + 4)) - 1;
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 8847c72e55f6..bc4bdbc1534e 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -635,9 +635,9 @@ static void dwc2_hc_init_split(struct dwc2_hsotg *hsotg,
 	chan->hub_port = (u8)hub_port;
 }
 
-static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
-			       struct dwc2_host_chan *chan,
-			       struct dwc2_qtd *qtd, void *bufptr)
+static void dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
+			      struct dwc2_host_chan *chan,
+			      struct dwc2_qtd *qtd)
 {
 	struct dwc2_hcd_urb *urb = qtd->urb;
 	struct dwc2_hcd_iso_packet_desc *frame_desc;
@@ -657,7 +657,6 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 			else
 				chan->xfer_buf = urb->setup_packet;
 			chan->xfer_len = 8;
-			bufptr = NULL;
 			break;
 
 		case DWC2_CONTROL_DATA:
@@ -684,7 +683,6 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 				chan->xfer_dma = hsotg->status_buf_dma;
 			else
 				chan->xfer_buf = hsotg->status_buf;
-			bufptr = NULL;
 			break;
 		}
 		break;
@@ -717,14 +715,6 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 
 		chan->xfer_len = frame_desc->length - qtd->isoc_split_offset;
 
-		/* For non-dword aligned buffers */
-		if (hsotg->core_params->dma_enable > 0 &&
-		    (chan->xfer_dma & 0x3))
-			bufptr = (u8 *)urb->buf + frame_desc->offset +
-					qtd->isoc_split_offset;
-		else
-			bufptr = NULL;
-
 		if (chan->xact_pos == DWC2_HCSPLT_XACTPOS_ALL) {
 			if (chan->xfer_len <= 188)
 				chan->xact_pos = DWC2_HCSPLT_XACTPOS_ALL;
@@ -733,63 +723,93 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 		}
 		break;
 	}
+}
+
+#define DWC2_USB_DMA_ALIGN 4
+
+struct dma_aligned_buffer {
+	void *kmalloc_ptr;
+	void *old_xfer_buffer;
+	u8 data[0];
+};
+
+static void dwc2_free_dma_aligned_buffer(struct urb *urb)
+{
+	struct dma_aligned_buffer *temp;
+
+	if (!(urb->transfer_flags & URB_ALIGNED_TEMP_BUFFER))
+		return;
+
+	temp = container_of(urb->transfer_buffer,
+		struct dma_aligned_buffer, data);
 
-	return bufptr;
+	if (usb_urb_dir_in(urb))
+		memcpy(temp->old_xfer_buffer, temp->data,
+		       urb->transfer_buffer_length);
+	urb->transfer_buffer = temp->old_xfer_buffer;
+	kfree(temp->kmalloc_ptr);
+
+	urb->transfer_flags &= ~URB_ALIGNED_TEMP_BUFFER;
 }
 
-static int dwc2_hc_setup_align_buf(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
-				   struct dwc2_host_chan *chan,
-				   struct dwc2_hcd_urb *urb, void *bufptr)
+static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags)
 {
-	u32 buf_size;
-	struct urb *usb_urb;
-	struct usb_hcd *hcd;
+	struct dma_aligned_buffer *temp, *kmalloc_ptr;
+	size_t kmalloc_size;
 
-	if (!qh->dw_align_buf) {
-		if (chan->ep_type != USB_ENDPOINT_XFER_ISOC)
-			buf_size = hsotg->core_params->max_transfer_size;
-		else
-			/* 3072 = 3 max-size Isoc packets */
-			buf_size = 3072;
+	if (urb->num_sgs || urb->sg ||
+	    urb->transfer_buffer_length == 0 ||
+	    !((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1)))
+		return 0;
 
-		qh->dw_align_buf = kmalloc(buf_size, GFP_ATOMIC | GFP_DMA);
-		if (!qh->dw_align_buf)
-			return -ENOMEM;
-		qh->dw_align_buf_size = buf_size;
-	}
+	/* Allocate a buffer with enough padding for alignment */
+	kmalloc_size = urb->transfer_buffer_length +
+		sizeof(struct dma_aligned_buffer) + DWC2_USB_DMA_ALIGN - 1;
 
-	if (chan->xfer_len) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		usb_urb = urb->priv;
+	kmalloc_ptr = kmalloc(kmalloc_size, mem_flags);
+	if (!kmalloc_ptr)
+		return -ENOMEM;
 
-		if (usb_urb) {
-			if (usb_urb->transfer_flags &
-			    (URB_SETUP_MAP_SINGLE | URB_DMA_MAP_SG |
-			     URB_DMA_MAP_PAGE | URB_DMA_MAP_SINGLE)) {
-				hcd = dwc2_hsotg_to_hcd(hsotg);
-				usb_hcd_unmap_urb_for_dma(hcd, usb_urb);
-			}
-			if (!chan->ep_is_in)
-				memcpy(qh->dw_align_buf, bufptr,
-				       chan->xfer_len);
-		} else {
-			dev_warn(hsotg->dev, "no URB in dwc2_urb\n");
-		}
-	}
+	/* Position our struct dma_aligned_buffer such that data is aligned */
+	temp = PTR_ALIGN(kmalloc_ptr + 1, DWC2_USB_DMA_ALIGN) - 1;
+	temp->kmalloc_ptr = kmalloc_ptr;
+	temp->old_xfer_buffer = urb->transfer_buffer;
+	if (usb_urb_dir_out(urb))
+		memcpy(temp->data, urb->transfer_buffer,
+		       urb->transfer_buffer_length);
+	urb->transfer_buffer = temp->data;
 
-	qh->dw_align_buf_dma = dma_map_single(hsotg->dev,
-			qh->dw_align_buf, qh->dw_align_buf_size,
-			chan->ep_is_in ? DMA_FROM_DEVICE : DMA_TO_DEVICE);
-	if (dma_mapping_error(hsotg->dev, qh->dw_align_buf_dma)) {
-		dev_err(hsotg->dev, "can't map align_buf\n");
-		chan->align_buf = 0;
-		return -EINVAL;
-	}
+	urb->transfer_flags |= URB_ALIGNED_TEMP_BUFFER;
 
-	chan->align_buf = qh->dw_align_buf_dma;
 	return 0;
 }
 
+static int dwc2_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
+				      gfp_t mem_flags)
+{
+	int ret;
+
+	/* We assume setup_dma is always aligned; warn if not */
+	WARN_ON_ONCE(urb->setup_dma &&
+		     (urb->setup_dma & (DWC2_USB_DMA_ALIGN - 1)));
+
+	ret = dwc2_alloc_dma_aligned_buffer(urb, mem_flags);
+	if (ret)
+		return ret;
+
+	ret = usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
+	if (ret)
+		dwc2_free_dma_aligned_buffer(urb);
+
+	return ret;
+}
+
+static void dwc2_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
+{
+	usb_hcd_unmap_urb_for_dma(hcd, urb);
+	dwc2_free_dma_aligned_buffer(urb);
+}
+
 /**
  * dwc2_assign_and_init_hc() - Assigns transactions from a QTD to a free host
  * channel and initializes the host channel to perform the transactions. The
@@ -804,7 +824,6 @@ static int dwc2_assign_and_init_hc(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	struct dwc2_host_chan *chan;
 	struct dwc2_hcd_urb *urb;
 	struct dwc2_qtd *qtd;
-	void *bufptr = NULL;
 
 	if (dbg_qh(qh))
 		dev_vdbg(hsotg->dev, "%s(%p,%p)\n", __func__, hsotg, qh);
@@ -866,16 +885,10 @@ static int dwc2_assign_and_init_hc(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		!dwc2_hcd_is_pipe_in(&urb->pipe_info))
 		urb->actual_length = urb->length;
 
-	if (hsotg->core_params->dma_enable > 0) {
+	if (hsotg->core_params->dma_enable > 0)
 		chan->xfer_dma = urb->dma + urb->actual_length;
-
-		/* For non-dword aligned case */
-		if (hsotg->core_params->dma_desc_enable <= 0 &&
-		    (chan->xfer_dma & 0x3))
-			bufptr = (u8 *)urb->buf + urb->actual_length;
-	} else {
+	else
 		chan->xfer_buf = (u8 *)urb->buf + urb->actual_length;
-	}
 
 	chan->xfer_len = urb->length - urb->actual_length;
 	chan->xfer_count = 0;
@@ -887,27 +900,7 @@ static int dwc2_assign_and_init_hc(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		chan->do_split = 0;
 
 	/* Set the transfer attributes */
-	bufptr = dwc2_hc_init_xfer(hsotg, chan, qtd, bufptr);
-
-	/* Non DWORD-aligned buffer case */
-	if (bufptr) {
-		dev_vdbg(hsotg->dev, "Non-aligned buffer\n");
-		if (dwc2_hc_setup_align_buf(hsotg, qh, chan, urb, bufptr)) {
-			dev_err(hsotg->dev,
-				"%s: Failed to allocate memory to handle non-dword aligned buffer\n",
-				__func__);
-			/* Add channel back to free list */
-			chan->align_buf = 0;
-			chan->multi_count = 0;
-			list_add_tail(&chan->hc_list_entry,
-				      &hsotg->free_hc_list);
-			qtd->in_process = 0;
-			qh->channel = NULL;
-			return -ENOMEM;
-		}
-	} else {
-		chan->align_buf = 0;
-	}
+	dwc2_hc_init_xfer(hsotg, chan, qtd);
 
 	if (chan->ep_type == USB_ENDPOINT_XFER_INT ||
 	    chan->ep_type == USB_ENDPOINT_XFER_ISOC)
@@ -2971,6 +2964,9 @@ static struct hc_driver dwc2_hc_driver = {
 
 	.bus_suspend = _dwc2_hcd_suspend,
 	.bus_resume = _dwc2_hcd_resume,
+
+	.map_urb_for_dma	= dwc2_map_urb_for_dma,
+	.unmap_urb_for_dma	= dwc2_unmap_urb_for_dma,
 };
 
 /*
diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 8f0a29cefdf7..42f2e4e233da 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -75,8 +75,6 @@ struct dwc2_qh;
  *                      (micro)frame
  * @xfer_buf:           Pointer to current transfer buffer position
  * @xfer_dma:           DMA address of xfer_buf
- * @align_buf:          In Buffer DMA mode this will be used if xfer_buf is not
- *                      DWORD aligned
  * @xfer_len:           Total number of bytes to transfer
  * @xfer_count:         Number of bytes transferred so far
  * @start_pkt_count:    Packet count at start of transfer
@@ -133,7 +131,6 @@ struct dwc2_host_chan {
 
 	u8 *xfer_buf;
 	dma_addr_t xfer_dma;
-	dma_addr_t align_buf;
 	u32 xfer_len;
 	u32 xfer_count;
 	u16 start_pkt_count;
@@ -243,10 +240,6 @@ enum dwc2_transaction_type {
  * @frame_usecs:        Internal variable used by the microframe scheduler
  * @start_split_frame:  (Micro)frame at which last start split was initialized
  * @ntd:                Actual number of transfer descriptors in a list
- * @dw_align_buf:       Used instead of original buffer if its physical address
- *                      is not dword-aligned
- * @dw_align_buf_size:  Size of dw_align_buf
- * @dw_align_buf_dma:   DMA address for dw_align_buf
  * @qtd_list:           List of QTDs for this QH
  * @channel:            Host channel currently processing transfers for this QH
  * @qh_list_entry:      Entry for QH in either the periodic or non-periodic
@@ -279,9 +272,6 @@ struct dwc2_qh {
 	u16 frame_usecs[8];
 	u16 start_split_frame;
 	u16 ntd;
-	u8 *dw_align_buf;
-	int dw_align_buf_size;
-	dma_addr_t dw_align_buf_dma;
 	struct list_head qtd_list;
 	struct dwc2_host_chan *channel;
 	struct list_head qh_list_entry;
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index f8253803a050..352c98364317 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -472,18 +472,6 @@ static int dwc2_update_urb_state(struct dwc2_hsotg *hsotg,
 		xfer_length = urb->length - urb->actual_length;
 	}
 
-	/* Non DWORD-aligned buffer case handling */
-	if (chan->align_buf && xfer_length) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-				chan->qh->dw_align_buf_size,
-				chan->ep_is_in ?
-				DMA_FROM_DEVICE : DMA_TO_DEVICE);
-		if (chan->ep_is_in)
-			memcpy(urb->buf + urb->actual_length,
-					chan->qh->dw_align_buf, xfer_length);
-	}
-
 	dev_vdbg(hsotg->dev, "urb->actual_length=%d xfer_length=%d\n",
 		 urb->actual_length, xfer_length);
 	urb->actual_length += xfer_length;
@@ -565,21 +553,6 @@ static enum dwc2_halt_status dwc2_update_isoc_urb_state(
 		frame_desc->status = 0;
 		frame_desc->actual_length = dwc2_get_actual_xfer_length(hsotg,
 					chan, chnum, qtd, halt_status, NULL);
-
-		/* Non DWORD-aligned buffer case handling */
-		if (chan->align_buf && frame_desc->actual_length) {
-			dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n",
-				 __func__);
-			dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-					chan->qh->dw_align_buf_size,
-					chan->ep_is_in ?
-					DMA_FROM_DEVICE : DMA_TO_DEVICE);
-			if (chan->ep_is_in)
-				memcpy(urb->buf + frame_desc->offset +
-					qtd->isoc_split_offset,
-					chan->qh->dw_align_buf,
-					frame_desc->actual_length);
-		}
 		break;
 	case DWC2_HC_XFER_FRAME_OVERRUN:
 		urb->error_count++;
@@ -600,21 +573,6 @@ static enum dwc2_halt_status dwc2_update_isoc_urb_state(
 		frame_desc->actual_length = dwc2_get_actual_xfer_length(hsotg,
 					chan, chnum, qtd, halt_status, NULL);
 
-		/* Non DWORD-aligned buffer case handling */
-		if (chan->align_buf && frame_desc->actual_length) {
-			dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n",
-				 __func__);
-			dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-					chan->qh->dw_align_buf_size,
-					chan->ep_is_in ?
-					DMA_FROM_DEVICE : DMA_TO_DEVICE);
-			if (chan->ep_is_in)
-				memcpy(urb->buf + frame_desc->offset +
-					qtd->isoc_split_offset,
-					chan->qh->dw_align_buf,
-					frame_desc->actual_length);
-		}
-
 		/* Skip whole frame */
 		if (chan->qh->do_split &&
 		    chan->ep_type == USB_ENDPOINT_XFER_ISOC && chan->ep_is_in &&
@@ -680,8 +638,6 @@ static void dwc2_deactivate_qh(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	}
 
 no_qtd:
-	if (qh->channel)
-		qh->channel->align_buf = 0;
 	qh->channel = NULL;
 	dwc2_hcd_qh_deactivate(hsotg, qh, continue_split);
 }
@@ -946,14 +902,6 @@ static int dwc2_xfercomp_isoc_split_in(struct dwc2_hsotg *hsotg,
 
 	frame_desc->actual_length += len;
 
-	if (chan->align_buf) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-				chan->qh->dw_align_buf_size, DMA_FROM_DEVICE);
-		memcpy(qtd->urb->buf + frame_desc->offset +
-		       qtd->isoc_split_offset, chan->qh->dw_align_buf, len);
-	}
-
 	qtd->isoc_split_offset += len;
 
 	if (frame_desc->actual_length >= frame_desc->length) {
@@ -1176,19 +1124,6 @@ static void dwc2_update_urb_state_abn(struct dwc2_hsotg *hsotg,
 		xfer_length = urb->length - urb->actual_length;
 	}
 
-	/* Non DWORD-aligned buffer case handling */
-	if (chan->align_buf && xfer_length && chan->ep_is_in) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-				chan->qh->dw_align_buf_size,
-				chan->ep_is_in ?
-				DMA_FROM_DEVICE : DMA_TO_DEVICE);
-		if (chan->ep_is_in)
-			memcpy(urb->buf + urb->actual_length,
-					chan->qh->dw_align_buf,
-					xfer_length);
-	}
-
 	urb->actual_length += xfer_length;
 
 	hctsiz = dwc2_readl(hsotg->regs + HCTSIZ(chnum));
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 27d402f680a3..e0933a9dfad7 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -232,13 +232,8 @@ struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
  */
 void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
-	if (qh->desc_list) {
+	if (qh->desc_list)
 		dwc2_hcd_qh_free_ddma(hsotg, qh);
-	} else {
-		/* kfree(NULL) is safe */
-		kfree(qh->dw_align_buf);
-		qh->dw_align_buf_dma = (dma_addr_t)0;
-	}
 	kfree(qh);
 }
 
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 02/22] usb: dwc2: host: Get aligned DMA in a more supported way
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

Other host controller drivers that need aligned DMA buffers get them by
overriding the map_urb_for_dma() / unmap_urb_for_dma() hooks.  Let's do
that too instead of working behind the USB core's back.  This makes our
interrupt handler not take forever and also rips out a lot of code,
simplifying things a bunch.

This also has the side effect of removing the 65535 max transfer size
limit.

NOTE: The actual code to allocate the aligned buffers is ripped almost
completely from the tegra EHCI driver.  At some point in the future we
may want to add this functionality to the USB core to share more code
everywhere.
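
For readers unfamiliar with the trick, here is a minimal userspace sketch
of the over-allocate-and-align approach.  It assumes malloc()/free() in
place of kmalloc()/kfree() and a plain int flag in place of the URB
direction helpers.  All sketch_* names are hypothetical and exist only
for illustration; this is not the driver code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_DMA_ALIGN 4	/* hypothetical stand-in for the 4-byte requirement */

/* Bookkeeping header stored immediately before the aligned payload. */
struct sketch_aligned_buf {
	void *alloc_ptr;	/* pointer originally returned by malloc() */
	void *old_buf;		/* caller's original, possibly unaligned buffer */
	unsigned char data[];	/* aligned payload handed to the "hardware" */
};

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t sketch_align_up(uintptr_t addr, uintptr_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Return an aligned bounce buffer for buf, or NULL on allocation failure. */
static unsigned char *sketch_bounce_alloc(void *buf, size_t len, int dir_out)
{
	size_t hdr = offsetof(struct sketch_aligned_buf, data);
	/* Over-allocate so the payload can be shifted to an aligned address. */
	unsigned char *raw = malloc(len + hdr + SKETCH_DMA_ALIGN - 1);
	struct sketch_aligned_buf *tmp;

	if (!raw)
		return NULL;

	/* Place the header so that tmp->data lands on an aligned address. */
	tmp = (struct sketch_aligned_buf *)
		(sketch_align_up((uintptr_t)raw + hdr, SKETCH_DMA_ALIGN) - hdr);
	tmp->alloc_ptr = raw;
	tmp->old_buf = buf;
	if (dir_out)		/* OUT transfer: the device will read our data */
		memcpy(tmp->data, buf, len);
	return tmp->data;
}

/* Undo sketch_bounce_alloc(); copy results back for IN transfers. */
static void sketch_bounce_free(unsigned char *data, size_t len, int dir_in)
{
	struct sketch_aligned_buf *tmp = (struct sketch_aligned_buf *)
		(data - offsetof(struct sketch_aligned_buf, data));

	if (dir_in)		/* IN transfer: the device wrote into the copy */
		memcpy(tmp->old_buf, tmp->data, len);
	free(tmp->alloc_ptr);
}
```

A caller would swap its buffer for the pointer returned by
sketch_bounce_alloc() before starting the transfer and hand that same
pointer to sketch_bounce_free() once the transfer completes.  Keeping
the header right in front of the payload means nothing but the payload
pointer has to be remembered in between.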

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: John Youn <johnyoun@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: John Youn <johnyoun@synopsys.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Add John's Acks from <https://patchwork.kernel.org/patch/7631551>

Changes in v3: None
Changes in v2:
- Add a warn if setup_dma is not aligned (Julius Werner).

 drivers/usb/dwc2/core.c      |  21 +-----
 drivers/usb/dwc2/hcd.c       | 170 +++++++++++++++++++++----------------------
 drivers/usb/dwc2/hcd.h       |  10 ---
 drivers/usb/dwc2/hcd_intr.c  |  65 -----------------
 drivers/usb/dwc2/hcd_queue.c |   7 +-
 5 files changed, 87 insertions(+), 186 deletions(-)

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index 39a0fa8a4c0a..73f2771b7740 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -1958,19 +1958,11 @@ void dwc2_hc_start_transfer(struct dwc2_hsotg *hsotg,
 	}
 
 	if (hsotg->core_params->dma_enable > 0) {
-		dma_addr_t dma_addr;
-
-		if (chan->align_buf) {
-			if (dbg_hc(chan))
-				dev_vdbg(hsotg->dev, "align_buf\n");
-			dma_addr = chan->align_buf;
-		} else {
-			dma_addr = chan->xfer_dma;
-		}
-		dwc2_writel((u32)dma_addr, hsotg->regs + HCDMA(chan->hc_num));
+		dwc2_writel((u32)chan->xfer_dma,
+			    hsotg->regs + HCDMA(chan->hc_num));
 		if (dbg_hc(chan))
 			dev_vdbg(hsotg->dev, "Wrote %08lx to HCDMA(%d)\n",
-				 (unsigned long)dma_addr, chan->hc_num);
+				 (unsigned long)chan->xfer_dma, chan->hc_num);
 	}
 
 	/* Start the split */
@@ -3363,13 +3355,6 @@ int dwc2_get_hwparams(struct dwc2_hsotg *hsotg)
 	width = (hwcfg3 & GHWCFG3_XFER_SIZE_CNTR_WIDTH_MASK) >>
 		GHWCFG3_XFER_SIZE_CNTR_WIDTH_SHIFT;
 	hw->max_transfer_size = (1 << (width + 11)) - 1;
-	/*
-	 * Clip max_transfer_size to 65535. dwc2_hc_setup_align_buf() allocates
-	 * coherent buffers with this size, and if it's too large we can
-	 * exhaust the coherent DMA pool.
-	 */
-	if (hw->max_transfer_size > 65535)
-		hw->max_transfer_size = 65535;
 	width = (hwcfg3 & GHWCFG3_PACKET_SIZE_CNTR_WIDTH_MASK) >>
 		GHWCFG3_PACKET_SIZE_CNTR_WIDTH_SHIFT;
 	hw->max_packet_count = (1 << (width + 4)) - 1;
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 8847c72e55f6..bc4bdbc1534e 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -635,9 +635,9 @@ static void dwc2_hc_init_split(struct dwc2_hsotg *hsotg,
 	chan->hub_port = (u8)hub_port;
 }
 
-static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
-			       struct dwc2_host_chan *chan,
-			       struct dwc2_qtd *qtd, void *bufptr)
+static void dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
+			      struct dwc2_host_chan *chan,
+			      struct dwc2_qtd *qtd)
 {
 	struct dwc2_hcd_urb *urb = qtd->urb;
 	struct dwc2_hcd_iso_packet_desc *frame_desc;
@@ -657,7 +657,6 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 			else
 				chan->xfer_buf = urb->setup_packet;
 			chan->xfer_len = 8;
-			bufptr = NULL;
 			break;
 
 		case DWC2_CONTROL_DATA:
@@ -684,7 +683,6 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 				chan->xfer_dma = hsotg->status_buf_dma;
 			else
 				chan->xfer_buf = hsotg->status_buf;
-			bufptr = NULL;
 			break;
 		}
 		break;
@@ -717,14 +715,6 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 
 		chan->xfer_len = frame_desc->length - qtd->isoc_split_offset;
 
-		/* For non-dword aligned buffers */
-		if (hsotg->core_params->dma_enable > 0 &&
-		    (chan->xfer_dma & 0x3))
-			bufptr = (u8 *)urb->buf + frame_desc->offset +
-					qtd->isoc_split_offset;
-		else
-			bufptr = NULL;
-
 		if (chan->xact_pos == DWC2_HCSPLT_XACTPOS_ALL) {
 			if (chan->xfer_len <= 188)
 				chan->xact_pos = DWC2_HCSPLT_XACTPOS_ALL;
@@ -733,63 +723,93 @@ static void *dwc2_hc_init_xfer(struct dwc2_hsotg *hsotg,
 		}
 		break;
 	}
+}
+
+#define DWC2_USB_DMA_ALIGN 4
+
+struct dma_aligned_buffer {
+	void *kmalloc_ptr;
+	void *old_xfer_buffer;
+	u8 data[0];
+};
+
+static void dwc2_free_dma_aligned_buffer(struct urb *urb)
+{
+	struct dma_aligned_buffer *temp;
+
+	if (!(urb->transfer_flags & URB_ALIGNED_TEMP_BUFFER))
+		return;
+
+	temp = container_of(urb->transfer_buffer,
+		struct dma_aligned_buffer, data);
 
-	return bufptr;
+	if (usb_urb_dir_in(urb))
+		memcpy(temp->old_xfer_buffer, temp->data,
+		       urb->transfer_buffer_length);
+	urb->transfer_buffer = temp->old_xfer_buffer;
+	kfree(temp->kmalloc_ptr);
+
+	urb->transfer_flags &= ~URB_ALIGNED_TEMP_BUFFER;
 }
 
-static int dwc2_hc_setup_align_buf(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
-				   struct dwc2_host_chan *chan,
-				   struct dwc2_hcd_urb *urb, void *bufptr)
+static int dwc2_alloc_dma_aligned_buffer(struct urb *urb, gfp_t mem_flags)
 {
-	u32 buf_size;
-	struct urb *usb_urb;
-	struct usb_hcd *hcd;
+	struct dma_aligned_buffer *temp, *kmalloc_ptr;
+	size_t kmalloc_size;
 
-	if (!qh->dw_align_buf) {
-		if (chan->ep_type != USB_ENDPOINT_XFER_ISOC)
-			buf_size = hsotg->core_params->max_transfer_size;
-		else
-			/* 3072 = 3 max-size Isoc packets */
-			buf_size = 3072;
+	if (urb->num_sgs || urb->sg ||
+	    urb->transfer_buffer_length == 0 ||
+	    !((uintptr_t)urb->transfer_buffer & (DWC2_USB_DMA_ALIGN - 1)))
+		return 0;
 
-		qh->dw_align_buf = kmalloc(buf_size, GFP_ATOMIC | GFP_DMA);
-		if (!qh->dw_align_buf)
-			return -ENOMEM;
-		qh->dw_align_buf_size = buf_size;
-	}
+	/* Allocate a buffer with enough padding for alignment */
+	kmalloc_size = urb->transfer_buffer_length +
+		sizeof(struct dma_aligned_buffer) + DWC2_USB_DMA_ALIGN - 1;
 
-	if (chan->xfer_len) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		usb_urb = urb->priv;
+	kmalloc_ptr = kmalloc(kmalloc_size, mem_flags);
+	if (!kmalloc_ptr)
+		return -ENOMEM;
 
-		if (usb_urb) {
-			if (usb_urb->transfer_flags &
-			    (URB_SETUP_MAP_SINGLE | URB_DMA_MAP_SG |
-			     URB_DMA_MAP_PAGE | URB_DMA_MAP_SINGLE)) {
-				hcd = dwc2_hsotg_to_hcd(hsotg);
-				usb_hcd_unmap_urb_for_dma(hcd, usb_urb);
-			}
-			if (!chan->ep_is_in)
-				memcpy(qh->dw_align_buf, bufptr,
-				       chan->xfer_len);
-		} else {
-			dev_warn(hsotg->dev, "no URB in dwc2_urb\n");
-		}
-	}
+	/* Position our struct dma_aligned_buffer such that data is aligned */
+	temp = PTR_ALIGN(kmalloc_ptr + 1, DWC2_USB_DMA_ALIGN) - 1;
+	temp->kmalloc_ptr = kmalloc_ptr;
+	temp->old_xfer_buffer = urb->transfer_buffer;
+	if (usb_urb_dir_out(urb))
+		memcpy(temp->data, urb->transfer_buffer,
+		       urb->transfer_buffer_length);
+	urb->transfer_buffer = temp->data;
 
-	qh->dw_align_buf_dma = dma_map_single(hsotg->dev,
-			qh->dw_align_buf, qh->dw_align_buf_size,
-			chan->ep_is_in ? DMA_FROM_DEVICE : DMA_TO_DEVICE);
-	if (dma_mapping_error(hsotg->dev, qh->dw_align_buf_dma)) {
-		dev_err(hsotg->dev, "can't map align_buf\n");
-		chan->align_buf = 0;
-		return -EINVAL;
-	}
+	urb->transfer_flags |= URB_ALIGNED_TEMP_BUFFER;
 
-	chan->align_buf = qh->dw_align_buf_dma;
 	return 0;
 }
 
+static int dwc2_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
+				      gfp_t mem_flags)
+{
+	int ret;
+
+	/* We assume setup_dma is always aligned; warn if not */
+	WARN_ON_ONCE(urb->setup_dma &&
+		     (urb->setup_dma & (DWC2_USB_DMA_ALIGN - 1)));
+
+	ret = dwc2_alloc_dma_aligned_buffer(urb, mem_flags);
+	if (ret)
+		return ret;
+
+	ret = usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
+	if (ret)
+		dwc2_free_dma_aligned_buffer(urb);
+
+	return ret;
+}
+
+static void dwc2_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
+{
+	usb_hcd_unmap_urb_for_dma(hcd, urb);
+	dwc2_free_dma_aligned_buffer(urb);
+}
+
 /**
  * dwc2_assign_and_init_hc() - Assigns transactions from a QTD to a free host
  * channel and initializes the host channel to perform the transactions. The
@@ -804,7 +824,6 @@ static int dwc2_assign_and_init_hc(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	struct dwc2_host_chan *chan;
 	struct dwc2_hcd_urb *urb;
 	struct dwc2_qtd *qtd;
-	void *bufptr = NULL;
 
 	if (dbg_qh(qh))
 		dev_vdbg(hsotg->dev, "%s(%p,%p)\n", __func__, hsotg, qh);
@@ -866,16 +885,10 @@ static int dwc2_assign_and_init_hc(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		!dwc2_hcd_is_pipe_in(&urb->pipe_info))
 		urb->actual_length = urb->length;
 
-	if (hsotg->core_params->dma_enable > 0) {
+	if (hsotg->core_params->dma_enable > 0)
 		chan->xfer_dma = urb->dma + urb->actual_length;
-
-		/* For non-dword aligned case */
-		if (hsotg->core_params->dma_desc_enable <= 0 &&
-		    (chan->xfer_dma & 0x3))
-			bufptr = (u8 *)urb->buf + urb->actual_length;
-	} else {
+	else
 		chan->xfer_buf = (u8 *)urb->buf + urb->actual_length;
-	}
 
 	chan->xfer_len = urb->length - urb->actual_length;
 	chan->xfer_count = 0;
@@ -887,27 +900,7 @@ static int dwc2_assign_and_init_hc(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		chan->do_split = 0;
 
 	/* Set the transfer attributes */
-	bufptr = dwc2_hc_init_xfer(hsotg, chan, qtd, bufptr);
-
-	/* Non DWORD-aligned buffer case */
-	if (bufptr) {
-		dev_vdbg(hsotg->dev, "Non-aligned buffer\n");
-		if (dwc2_hc_setup_align_buf(hsotg, qh, chan, urb, bufptr)) {
-			dev_err(hsotg->dev,
-				"%s: Failed to allocate memory to handle non-dword aligned buffer\n",
-				__func__);
-			/* Add channel back to free list */
-			chan->align_buf = 0;
-			chan->multi_count = 0;
-			list_add_tail(&chan->hc_list_entry,
-				      &hsotg->free_hc_list);
-			qtd->in_process = 0;
-			qh->channel = NULL;
-			return -ENOMEM;
-		}
-	} else {
-		chan->align_buf = 0;
-	}
+	dwc2_hc_init_xfer(hsotg, chan, qtd);
 
 	if (chan->ep_type == USB_ENDPOINT_XFER_INT ||
 	    chan->ep_type == USB_ENDPOINT_XFER_ISOC)
@@ -2971,6 +2964,9 @@ static struct hc_driver dwc2_hc_driver = {
 
 	.bus_suspend = _dwc2_hcd_suspend,
 	.bus_resume = _dwc2_hcd_resume,
+
+	.map_urb_for_dma	= dwc2_map_urb_for_dma,
+	.unmap_urb_for_dma	= dwc2_unmap_urb_for_dma,
 };
 
 /*
diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 8f0a29cefdf7..42f2e4e233da 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -75,8 +75,6 @@ struct dwc2_qh;
  *                      (micro)frame
  * @xfer_buf:           Pointer to current transfer buffer position
  * @xfer_dma:           DMA address of xfer_buf
- * @align_buf:          In Buffer DMA mode this will be used if xfer_buf is not
- *                      DWORD aligned
  * @xfer_len:           Total number of bytes to transfer
  * @xfer_count:         Number of bytes transferred so far
  * @start_pkt_count:    Packet count at start of transfer
@@ -133,7 +131,6 @@ struct dwc2_host_chan {
 
 	u8 *xfer_buf;
 	dma_addr_t xfer_dma;
-	dma_addr_t align_buf;
 	u32 xfer_len;
 	u32 xfer_count;
 	u16 start_pkt_count;
@@ -243,10 +240,6 @@ enum dwc2_transaction_type {
  * @frame_usecs:        Internal variable used by the microframe scheduler
  * @start_split_frame:  (Micro)frame at which last start split was initialized
  * @ntd:                Actual number of transfer descriptors in a list
- * @dw_align_buf:       Used instead of original buffer if its physical address
- *                      is not dword-aligned
- * @dw_align_buf_size:  Size of dw_align_buf
- * @dw_align_buf_dma:   DMA address for dw_align_buf
  * @qtd_list:           List of QTDs for this QH
  * @channel:            Host channel currently processing transfers for this QH
  * @qh_list_entry:      Entry for QH in either the periodic or non-periodic
@@ -279,9 +272,6 @@ struct dwc2_qh {
 	u16 frame_usecs[8];
 	u16 start_split_frame;
 	u16 ntd;
-	u8 *dw_align_buf;
-	int dw_align_buf_size;
-	dma_addr_t dw_align_buf_dma;
 	struct list_head qtd_list;
 	struct dwc2_host_chan *channel;
 	struct list_head qh_list_entry;
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index f8253803a050..352c98364317 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -472,18 +472,6 @@ static int dwc2_update_urb_state(struct dwc2_hsotg *hsotg,
 		xfer_length = urb->length - urb->actual_length;
 	}
 
-	/* Non DWORD-aligned buffer case handling */
-	if (chan->align_buf && xfer_length) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-				chan->qh->dw_align_buf_size,
-				chan->ep_is_in ?
-				DMA_FROM_DEVICE : DMA_TO_DEVICE);
-		if (chan->ep_is_in)
-			memcpy(urb->buf + urb->actual_length,
-					chan->qh->dw_align_buf, xfer_length);
-	}
-
 	dev_vdbg(hsotg->dev, "urb->actual_length=%d xfer_length=%d\n",
 		 urb->actual_length, xfer_length);
 	urb->actual_length += xfer_length;
@@ -565,21 +553,6 @@ static enum dwc2_halt_status dwc2_update_isoc_urb_state(
 		frame_desc->status = 0;
 		frame_desc->actual_length = dwc2_get_actual_xfer_length(hsotg,
 					chan, chnum, qtd, halt_status, NULL);
-
-		/* Non DWORD-aligned buffer case handling */
-		if (chan->align_buf && frame_desc->actual_length) {
-			dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n",
-				 __func__);
-			dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-					chan->qh->dw_align_buf_size,
-					chan->ep_is_in ?
-					DMA_FROM_DEVICE : DMA_TO_DEVICE);
-			if (chan->ep_is_in)
-				memcpy(urb->buf + frame_desc->offset +
-					qtd->isoc_split_offset,
-					chan->qh->dw_align_buf,
-					frame_desc->actual_length);
-		}
 		break;
 	case DWC2_HC_XFER_FRAME_OVERRUN:
 		urb->error_count++;
@@ -600,21 +573,6 @@ static enum dwc2_halt_status dwc2_update_isoc_urb_state(
 		frame_desc->actual_length = dwc2_get_actual_xfer_length(hsotg,
 					chan, chnum, qtd, halt_status, NULL);
 
-		/* Non DWORD-aligned buffer case handling */
-		if (chan->align_buf && frame_desc->actual_length) {
-			dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n",
-				 __func__);
-			dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-					chan->qh->dw_align_buf_size,
-					chan->ep_is_in ?
-					DMA_FROM_DEVICE : DMA_TO_DEVICE);
-			if (chan->ep_is_in)
-				memcpy(urb->buf + frame_desc->offset +
-					qtd->isoc_split_offset,
-					chan->qh->dw_align_buf,
-					frame_desc->actual_length);
-		}
-
 		/* Skip whole frame */
 		if (chan->qh->do_split &&
 		    chan->ep_type == USB_ENDPOINT_XFER_ISOC && chan->ep_is_in &&
@@ -680,8 +638,6 @@ static void dwc2_deactivate_qh(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	}
 
 no_qtd:
-	if (qh->channel)
-		qh->channel->align_buf = 0;
 	qh->channel = NULL;
 	dwc2_hcd_qh_deactivate(hsotg, qh, continue_split);
 }
@@ -946,14 +902,6 @@ static int dwc2_xfercomp_isoc_split_in(struct dwc2_hsotg *hsotg,
 
 	frame_desc->actual_length += len;
 
-	if (chan->align_buf) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-				chan->qh->dw_align_buf_size, DMA_FROM_DEVICE);
-		memcpy(qtd->urb->buf + frame_desc->offset +
-		       qtd->isoc_split_offset, chan->qh->dw_align_buf, len);
-	}
-
 	qtd->isoc_split_offset += len;
 
 	if (frame_desc->actual_length >= frame_desc->length) {
@@ -1176,19 +1124,6 @@ static void dwc2_update_urb_state_abn(struct dwc2_hsotg *hsotg,
 		xfer_length = urb->length - urb->actual_length;
 	}
 
-	/* Non DWORD-aligned buffer case handling */
-	if (chan->align_buf && xfer_length && chan->ep_is_in) {
-		dev_vdbg(hsotg->dev, "%s(): non-aligned buffer\n", __func__);
-		dma_unmap_single(hsotg->dev, chan->qh->dw_align_buf_dma,
-				chan->qh->dw_align_buf_size,
-				chan->ep_is_in ?
-				DMA_FROM_DEVICE : DMA_TO_DEVICE);
-		if (chan->ep_is_in)
-			memcpy(urb->buf + urb->actual_length,
-					chan->qh->dw_align_buf,
-					xfer_length);
-	}
-
 	urb->actual_length += xfer_length;
 
 	hctsiz = dwc2_readl(hsotg->regs + HCTSIZ(chnum));
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 27d402f680a3..e0933a9dfad7 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -232,13 +232,8 @@ struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
  */
 void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
-	if (qh->desc_list) {
+	if (qh->desc_list)
 		dwc2_hcd_qh_free_ddma(hsotg, qh);
-	} else {
-		/* kfree(NULL) is safe */
-		kfree(qh->dw_align_buf);
-		qh->dw_align_buf_dma = (dma_addr_t)0;
-	}
 	kfree(qh);
 }
 
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 03/22] usb: dwc2: host: Set host_rx_fifo_size to 525 for rk3066
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

As documented in dwc2_calculate_dynamic_fifo(), host_rx_fifo_size should
really be:
 2 * ((Largest Packet size / 4) + 1 + 1) + n
 with n = number of host channels.

We have 9 host channels, so
 2 * ((1024/4) + 2) + 9 = 516 + 9 = 525

We've got 960 / 972 total_fifo_size on rk3288 (and presumably on
rk3066) and 525 + 128 + 256 = 909 so we're still under on both ports
even when we increment by 5.
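
As a sanity check, the arithmetic above can be written out as a small
sketch.  The helper names are hypothetical (they are not functions in
the driver); the formula and the numbers are the ones quoted in this
message, with all sizes in dwords.

```c
#include <assert.h>

/* Hypothetical helper: RX FIFO depth in dwords, per the formula above. */
static unsigned int sketch_host_rx_fifo_dwords(unsigned int max_packet_bytes,
					       unsigned int num_channels)
{
	/* The formula divides the packet size into 4-byte dwords. */
	assert(max_packet_bytes % 4 == 0);
	return 2 * ((max_packet_bytes / 4) + 1 + 1) + num_channels;
}

/* Hypothetical helper: sum of the three host FIFOs, in dwords. */
static unsigned int sketch_total_fifo_dwords(unsigned int rx,
					     unsigned int nperio_tx,
					     unsigned int perio_tx)
{
	return rx + nperio_tx + perio_tx;
}
```

With a 1024-byte largest packet and 9 channels the first helper gives
525, and adding the 128- and 256-dword TX FIFOs gives 909, which fits
in both the 960 and the 972 dword totals mentioned above.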

In the future, it would be nice if dwc2_calculate_dynamic_fifo() could
handle the "too small" FIFO case and come up with something more
dynamic.  When we do that we can figure out how to allocate the
extra 51 / 63 dwords of FIFO that we're currently wasting.

NOTE: no known bugs are fixed by this patch, but it seems like a simple
fix and ought to fix someone.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
---
Changes in v6:
- Back to 525 dwords, not 528.
- Add Kever's Reviewed-by.
- Add Heiko's Tested-by.

Changes in v5: None
Changes in v4:
- Set host_rx_fifo_size to 528 for rk3066 new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index 5008a467ce06..b277e521a311 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -126,7 +126,7 @@ static const struct dwc2_core_params params_rk3066 = {
 	.speed				= -1,
 	.enable_dynamic_fifo		= 1,
 	.en_multiple_tx_fifo		= -1,
-	.host_rx_fifo_size		= 520,	/* 520 DWORDs */
+	.host_rx_fifo_size		= 525,	/* 525 DWORDs */
 	.host_nperio_tx_fifo_size	= 128,	/* 128 DWORDs */
 	.host_perio_tx_fifo_size	= 256,	/* 256 DWORDs */
 	.max_transfer_size		= -1,
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 04/22] usb: dwc2: host: Avoid use of chan->qh after qh freed
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

When poking around with USB devices with slub_debug enabled, I found
another obvious use after free.  It turns out that in dwc2_hc_n_intr()
the memory chan->qh pointed to was filled with 0x6b (the slub poison
value), indicating that the qh had been freed while chan still held a
reference to it.

Let's make sure that whenever we free a qh we also remove the reference
to it from its channel.

The bug fixed here doesn't appear to be new--I believe I just got lucky
and happened to see it while stress testing.
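
The pattern is easier to see in isolation.  Below is a minimal userspace
sketch, assuming two stripped-down structures with the same mutual
pointers; the sketch_* names are hypothetical.  For brevity the sketch
clears the back-reference inside the free helper, whereas the patch
does it at each call site while the lock is still held.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_qh;

/* Stand-in for a host channel: remembers the qh it is servicing. */
struct sketch_chan {
	struct sketch_qh *qh;
};

/* Stand-in for a queue head: remembers the channel assigned to it. */
struct sketch_qh {
	struct sketch_chan *channel;
};

/* Free a qh, first dropping the channel's pointer to it if there is one. */
static void sketch_qh_free(struct sketch_qh *qh)
{
	assert(qh != NULL);
	/* Only clear the pointer if the channel still refers to this qh. */
	if (qh->channel && qh->channel->qh == qh)
		qh->channel->qh = NULL;
	free(qh);
}

/* Interrupt-side check: returns 0 if handled, -1 if the qh is gone. */
static int sketch_chan_intr(struct sketch_chan *chan)
{
	if (!chan->qh)
		return -1;	/* late interrupt on a disabled channel */
	/* ...it would be safe to dereference chan->qh here... */
	return 0;
}
```

Without the clearing step the channel would keep a dangling pointer and
the check in the interrupt path could not tell a live qh from a freed
one.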

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add one more instance of check; kept Reviewed-by / Tested-by (OK?).
- Add Kever's Reviewed-by.
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Avoid use of chan->qh after qh freed new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd.c      | 10 ++++++++++
 drivers/usb/dwc2/hcd_intr.c | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index bc4bdbc1534e..e2d2e9be366e 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -164,6 +164,9 @@ static void dwc2_qh_list_free(struct dwc2_hsotg *hsotg,
 					 qtd_list_entry)
 			dwc2_hcd_qtd_unlink_and_free(hsotg, qtd, qh);
 
+		if (qh->channel && qh->channel->qh == qh)
+			qh->channel->qh = NULL;
+
 		spin_unlock_irqrestore(&hsotg->lock, flags);
 		dwc2_hcd_qh_free(hsotg, qh);
 		spin_lock_irqsave(&hsotg->lock, flags);
@@ -554,7 +557,12 @@ static int dwc2_hcd_endpoint_disable(struct dwc2_hsotg *hsotg,
 		dwc2_hcd_qtd_unlink_and_free(hsotg, qtd, qh);
 
 	ep->hcpriv = NULL;
+
+	if (qh->channel && qh->channel->qh == qh)
+		qh->channel->qh = NULL;
+
 	spin_unlock_irqrestore(&hsotg->lock, flags);
+
 	dwc2_hcd_qh_free(hsotg, qh);
 
 	return 0;
@@ -2782,6 +2790,8 @@ static int _dwc2_hcd_urb_enqueue(struct usb_hcd *hcd, struct urb *urb,
 fail3:
 	dwc2_urb->priv = NULL;
 	usb_hcd_unlink_urb_from_ep(hcd, urb);
+	if (qh_allocated && qh->channel && qh->channel->qh == qh)
+		qh->channel->qh = NULL;
 fail2:
 	spin_unlock_irqrestore(&hsotg->lock, flags);
 	urb->hcpriv = NULL;
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index 352c98364317..99efc2bd1617 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -1935,6 +1935,16 @@ static void dwc2_hc_n_intr(struct dwc2_hsotg *hsotg, int chnum)
 	}
 
 	dwc2_writel(hcint, hsotg->regs + HCINT(chnum));
+
+	/*
+	 * If we got an interrupt after someone called
+	 * dwc2_hcd_endpoint_disable() we don't want to crash below
+	 */
+	if (!chan->qh) {
+		dev_warn(hsotg->dev, "Interrupt on disabled channel\n");
+		return;
+	}
+
 	chan->hcint = hcint;
 	hcint &= hcintmsk;
 
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 04/22] usb: dwc2: host: Avoid use of chan->qh after qh freed
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

When poking around with USB devices with slub_debug enabled, I found
another obvious use after free.  It turns out that in dwc2_hc_n_intr()
the memory chan->qh pointed to was filled with 0x6b (the slub poison
value), indicating that the qh had been freed while chan still held a
reference to it.

Let's make sure that whenever we free a qh we also remove the reference
to it from its channel.

The bug fixed here doesn't appear to be new--I believe I just got lucky
and happened to see it while stress testing.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add one more instance of check; kept Reviewed-by / Tested-by (OK?).
- Add Kever's Reviewed-by.
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Avoid use of chan->qh after qh freed new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd.c      | 10 ++++++++++
 drivers/usb/dwc2/hcd_intr.c | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index bc4bdbc1534e..e2d2e9be366e 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -164,6 +164,9 @@ static void dwc2_qh_list_free(struct dwc2_hsotg *hsotg,
 					 qtd_list_entry)
 			dwc2_hcd_qtd_unlink_and_free(hsotg, qtd, qh);
 
+		if (qh->channel && qh->channel->qh == qh)
+			qh->channel->qh = NULL;
+
 		spin_unlock_irqrestore(&hsotg->lock, flags);
 		dwc2_hcd_qh_free(hsotg, qh);
 		spin_lock_irqsave(&hsotg->lock, flags);
@@ -554,7 +557,12 @@ static int dwc2_hcd_endpoint_disable(struct dwc2_hsotg *hsotg,
 		dwc2_hcd_qtd_unlink_and_free(hsotg, qtd, qh);
 
 	ep->hcpriv = NULL;
+
+	if (qh->channel && qh->channel->qh == qh)
+		qh->channel->qh = NULL;
+
 	spin_unlock_irqrestore(&hsotg->lock, flags);
+
 	dwc2_hcd_qh_free(hsotg, qh);
 
 	return 0;
@@ -2782,6 +2790,8 @@ static int _dwc2_hcd_urb_enqueue(struct usb_hcd *hcd, struct urb *urb,
 fail3:
 	dwc2_urb->priv = NULL;
 	usb_hcd_unlink_urb_from_ep(hcd, urb);
+	if (qh_allocated && qh->channel && qh->channel->qh == qh)
+		qh->channel->qh = NULL;
 fail2:
 	spin_unlock_irqrestore(&hsotg->lock, flags);
 	urb->hcpriv = NULL;
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index 352c98364317..99efc2bd1617 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -1935,6 +1935,16 @@ static void dwc2_hc_n_intr(struct dwc2_hsotg *hsotg, int chnum)
 	}
 
 	dwc2_writel(hcint, hsotg->regs + HCINT(chnum));
+
+	/*
+	 * If we got an interrupt after someone called
+	 * dwc2_hcd_endpoint_disable() we don't want to crash below
+	 */
+	if (!chan->qh) {
+		dev_warn(hsotg->dev, "Interrupt on disabled channel\n");
+		return;
+	}
+
 	chan->hcint = hcint;
 	hcint &= hcintmsk;
 
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 05/22] usb: dwc2: host: Always add to the tail of queues
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

The queues that the dwc2 host controller uses are truly queues.  That
means FIFO, or first in first out.

Unfortunately, although the code was iterating through these queues
starting from the head, some places in the code were adding things to
the queue at the head instead of the tail.  That means last in first
out.  Doh.

Go through and just always add to the tail.

Doing this makes things much happier when I've got:
* 7-port USB 2.0 Single-TT hub
* - Microsoft 2.4 GHz Transceiver v7.0 dongle
* - Jabra speakerphone playing music

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Kever's Reviewed-by.
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Always add to the tail of queues new for v4.

Changes in v3: None
Changes in v2: None
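
As a rough illustration of the head-versus-tail point in the commit
message (not part of the patch): the sketch below is a small userspace
stand-in for the kernel's list helpers.  The struct node type and the
list_init() / move_to_head() / move_to_tail() / first_id() names are
made up for this example; only list_move() and list_move_tail() are the
real kernel calls touched by the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a kernel list entry (illustrative only). */
struct node {
	struct node *prev, *next;
	int id;
};

/* Make a head (or a not-yet-queued entry) point at itself. */
static void list_init(struct node *n)
{
	n->prev = n->next = n;
}

/* Remove n from whatever list it is on and leave it self-linked. */
static void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Insert n between two entries that are currently adjacent. */
static void list_insert(struct node *n, struct node *prev, struct node *next)
{
	assert(prev->next == next && next->prev == prev);
	n->prev = prev;
	n->next = next;
	prev->next = n;
	next->prev = n;
}

/* Same idea as list_move(): re-add right after the head (LIFO). */
static void move_to_head(struct node *n, struct node *head)
{
	list_unlink(n);
	list_insert(n, head, head->next);
}

/* Same idea as list_move_tail(): re-add right before the head (FIFO). */
static void move_to_tail(struct node *n, struct node *head)
{
	list_unlink(n);
	list_insert(n, head->prev, head);
}

/* Id of the entry a head-first walk visits first, or -1 if empty. */
static int first_id(const struct node *head)
{
	return head->next == head ? -1 : head->next->id;
}
```

With head-first iteration, entries queued through move_to_tail() come
out in the order they went in, while entries queued through
move_to_head() jump in front of everything that was already waiting.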

 drivers/usb/dwc2/hcd.c       | 11 ++++++-----
 drivers/usb/dwc2/hcd_ddma.c  |  4 ++--
 drivers/usb/dwc2/hcd_intr.c  |  4 ++--
 drivers/usb/dwc2/hcd_queue.c |  6 ++++--
 4 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index e2d2e9be366e..349194342c90 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -969,7 +969,8 @@ enum dwc2_transaction_type dwc2_hcd_select_transactions(
 		 * periodic assigned schedule
 		 */
 		qh_ptr = qh_ptr->next;
-		list_move(&qh->qh_list_entry, &hsotg->periodic_sched_assigned);
+		list_move_tail(&qh->qh_list_entry,
+			       &hsotg->periodic_sched_assigned);
 		ret_val = DWC2_TRANSACTION_PERIODIC;
 	}
 
@@ -1002,8 +1003,8 @@ enum dwc2_transaction_type dwc2_hcd_select_transactions(
 		 * non-periodic active schedule
 		 */
 		qh_ptr = qh_ptr->next;
-		list_move(&qh->qh_list_entry,
-			  &hsotg->non_periodic_sched_active);
+		list_move_tail(&qh->qh_list_entry,
+			       &hsotg->non_periodic_sched_active);
 
 		if (ret_val == DWC2_TRANSACTION_NONE)
 			ret_val = DWC2_TRANSACTION_NON_PERIODIC;
@@ -1176,8 +1177,8 @@ static void dwc2_process_periodic_channels(struct dwc2_hsotg *hsotg)
 			 * Move the QH from the periodic assigned schedule to
 			 * the periodic queued schedule
 			 */
-			list_move(&qh->qh_list_entry,
-				  &hsotg->periodic_sched_queued);
+			list_move_tail(&qh->qh_list_entry,
+				       &hsotg->periodic_sched_queued);
 
 			/* done queuing high bandwidth */
 			hsotg->queuing_high_bandwidth = 0;
diff --git a/drivers/usb/dwc2/hcd_ddma.c b/drivers/usb/dwc2/hcd_ddma.c
index 36606fc33c0d..16b261cfa92d 100644
--- a/drivers/usb/dwc2/hcd_ddma.c
+++ b/drivers/usb/dwc2/hcd_ddma.c
@@ -1327,8 +1327,8 @@ void dwc2_hcd_complete_xfer_ddma(struct dwc2_hsotg *hsotg,
 			dwc2_hcd_qh_unlink(hsotg, qh);
 		} else {
 			/* Keep in assigned schedule to continue transfer */
-			list_move(&qh->qh_list_entry,
-				  &hsotg->periodic_sched_assigned);
+			list_move_tail(&qh->qh_list_entry,
+				       &hsotg->periodic_sched_assigned);
 			/*
 			 * If channel has been halted during giveback of urb
 			 * then prevent any new scheduling.
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index 99efc2bd1617..2c521c00e5e0 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -143,7 +143,7 @@ static void dwc2_sof_intr(struct dwc2_hsotg *hsotg)
 			 * Move QH to the ready list to be executed next
 			 * (micro)frame
 			 */
-			list_move(&qh->qh_list_entry,
+			list_move_tail(&qh->qh_list_entry,
 				  &hsotg->periodic_sched_ready);
 	}
 	tr_type = dwc2_hcd_select_transactions(hsotg);
@@ -794,7 +794,7 @@ static void dwc2_halt_channel(struct dwc2_hsotg *hsotg,
 			 * halt to be queued when the periodic schedule is
 			 * processed.
 			 */
-			list_move(&chan->qh->qh_list_entry,
+			list_move_tail(&chan->qh->qh_list_entry,
 				  &hsotg->periodic_sched_assigned);
 
 			/*
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index e0933a9dfad7..bc632a72f611 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -732,9 +732,11 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	     dwc2_frame_num_le(qh->sched_frame, frame_number)) ||
 	    (hsotg->core_params->uframe_sched <= 0 &&
 	     qh->sched_frame == frame_number))
-		list_move(&qh->qh_list_entry, &hsotg->periodic_sched_ready);
+		list_move_tail(&qh->qh_list_entry,
+			       &hsotg->periodic_sched_ready);
 	else
-		list_move(&qh->qh_list_entry, &hsotg->periodic_sched_inactive);
+		list_move_tail(&qh->qh_list_entry,
+			       &hsotg->periodic_sched_inactive);
 }
 
 /**
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 06/22] usb: dwc2: host: fix split transfer schedule sequence
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, Yunzhi Li, johnyoun,
	gregkh, linux-usb, linux-kernel

We're supposed to keep outstanding splits in order.  Keep track of a
list of the order of splits and process channel interrupts in that
order.

Without this change, and with the following setup:
* Rockchip rk3288 Chromebook, using port ff540000
  -> Pluggable 7-port Hub with Charging (powered)
     -> Microsoft Wireless Keyboard 2000 in port 1.
     -> Das Keyboard in port 2.

...I find that I get dropped keys on the Microsoft keyboard (I'm sure
there are other combinations that fail, but this documents my test).
Specifically I've been typing "hahahahahahaha" on the keyboard and often
see keys dropped or repeated.

After this change the above setup works properly.  This patch is based
on a previous patch proposed by Yunzhi Li ("usb: dwc2: hcd: fix periodic
transfer schedule sequence").

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Yunzhi Li <lyz@rock-chips.com>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Fix patch tags (hcd -> host)
- Add Kever's Reviewed-by.
- Add Kever's Tested-by.
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5:
- Move list maintenance to hcd.c to avoid gadget-only compile error

Changes in v4:
- fix split transfer schedule sequence new for v4.

Changes in v3: None
Changes in v2: None
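
A minimal sketch of the ordering idea, outside the driver (not part of
the patch): pending channel interrupts are serviced first in the order
the splits were queued, then any leftovers in channel-number order.
service_order() and its parameters are hypothetical names for this
example; the patch itself walks hsotg->split_order and calls
dwc2_hc_n_intr() rather than filling an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the order in which pending channels would be serviced.
 *
 * haint:         bitmask of channels with a pending interrupt
 * split_order:   channel numbers in the order their splits were queued
 * n_split:       number of entries in split_order
 * host_channels: number of channels (at most 32 in this sketch)
 * out:           receives channel numbers in service order; must have
 *                room for host_channels entries
 *
 * Returns the number of entries written to out.
 */
static size_t service_order(uint32_t haint, const int *split_order,
			    size_t n_split, int host_channels, int *out)
{
	size_t n = 0;
	size_t i;
	int ch;

	assert(host_channels >= 0 && host_channels <= 32);

	/* First pass: channels doing splits, in the order they were queued. */
	for (i = 0; i < n_split; i++) {
		ch = split_order[i];
		assert(ch >= 0 && ch < host_channels);
		if (haint & (1u << ch)) {
			out[n++] = ch;
			haint &= ~(1u << ch);	/* don't service it twice */
		}
	}

	/* Second pass: everything still pending, by channel number. */
	for (ch = 0; ch < host_channels; ch++)
		if (haint & (1u << ch))
			out[n++] = ch;

	return n;
}
```

Clearing the bit in the first pass is what keeps the second pass from
handling a split channel a second time.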

 drivers/usb/dwc2/core.c     |  2 ++
 drivers/usb/dwc2/core.h     |  2 ++
 drivers/usb/dwc2/hcd.c      |  8 ++++++++
 drivers/usb/dwc2/hcd.h      |  2 ++
 drivers/usb/dwc2/hcd_intr.c | 17 +++++++++++++++++
 5 files changed, 31 insertions(+)

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index 73f2771b7740..ed73b26818c0 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -1676,6 +1676,8 @@ void dwc2_hc_cleanup(struct dwc2_hsotg *hsotg, struct dwc2_host_chan *chan)
 
 	chan->xfer_started = 0;
 
+	list_del_init(&chan->split_order_list_entry);
+
 	/*
 	 * Clear channel interrupt enables and any unhandled channel interrupt
 	 * conditions
diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 7fb6434f4639..538cf38af0e4 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -657,6 +657,7 @@ struct dwc2_hregs_backup {
  *                      periodic_sched_ready because it must be rescheduled for
  *                      the next frame. Otherwise, the item moves to
  *                      periodic_sched_inactive.
+ * @split_order:        List keeping track of channels doing splits, in order.
  * @periodic_usecs:     Total bandwidth claimed so far for periodic transfers.
  *                      This value is in microseconds per (micro)frame. The
  *                      assumption is that all periodic transfers may occur in
@@ -780,6 +781,7 @@ struct dwc2_hsotg {
 	struct list_head periodic_sched_ready;
 	struct list_head periodic_sched_assigned;
 	struct list_head periodic_sched_queued;
+	struct list_head split_order;
 	u16 periodic_usecs;
 	u16 frame_usecs[8];
 	u16 frame_number;
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 349194342c90..0b6ebc7fff3f 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -1045,6 +1045,11 @@ static int dwc2_queue_transaction(struct dwc2_hsotg *hsotg,
 {
 	int retval = 0;
 
+	if (chan->do_split)
+		/* Put ourselves on the list to keep order straight */
+		list_move_tail(&chan->split_order_list_entry,
+			       &hsotg->split_order);
+
 	if (hsotg->core_params->dma_enable > 0) {
 		if (hsotg->core_params->dma_desc_enable > 0) {
 			if (!chan->xfer_started ||
@@ -3153,6 +3158,8 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq)
 	INIT_LIST_HEAD(&hsotg->periodic_sched_assigned);
 	INIT_LIST_HEAD(&hsotg->periodic_sched_queued);
 
+	INIT_LIST_HEAD(&hsotg->split_order);
+
 	/*
 	 * Create a host channel descriptor for each host channel implemented
 	 * in the controller. Initialize the channel descriptor array.
@@ -3166,6 +3173,7 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq)
 		if (channel == NULL)
 			goto error3;
 		channel->hc_num = i;
+		INIT_LIST_HEAD(&channel->split_order_list_entry);
 		hsotg->hc_ptr_array[i] = channel;
 	}
 
diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 42f2e4e233da..1b46e2e617cc 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -106,6 +106,7 @@ struct dwc2_qh;
  * @hc_list_entry:      For linking to list of host channels
  * @desc_list_addr:     Current QH's descriptor list DMA address
  * @desc_list_sz:       Current QH's descriptor list size
+ * @split_order_list_entry: List entry for keeping track of the order of splits
  *
  * This structure represents the state of a single host channel when acting in
  * host mode. It contains the data items needed to transfer packets to an
@@ -158,6 +159,7 @@ struct dwc2_host_chan {
 	struct list_head hc_list_entry;
 	dma_addr_t desc_list_addr;
 	u32 desc_list_sz;
+	struct list_head split_order_list_entry;
 };
 
 struct dwc2_hcd_pipe_info {
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index 2c521c00e5e0..577c91096a51 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -2067,6 +2067,7 @@ static void dwc2_hc_intr(struct dwc2_hsotg *hsotg)
 {
 	u32 haint;
 	int i;
+	struct dwc2_host_chan *chan, *chan_tmp;
 
 	haint = dwc2_readl(hsotg->regs + HAINT);
 	if (dbg_perio()) {
@@ -2075,6 +2076,22 @@ static void dwc2_hc_intr(struct dwc2_hsotg *hsotg)
 		dev_vdbg(hsotg->dev, "HAINT=%08x\n", haint);
 	}
 
+	/*
+	 * According to USB 2.0 spec section 11.18.8, a host must
+	 * issue complete-split transactions in a microframe for a
+	 * set of full-/low-speed endpoints in the same relative
+	 * order as the start-splits were issued in a microframe for.
+	 */
+	list_for_each_entry_safe(chan, chan_tmp, &hsotg->split_order,
+				 split_order_list_entry) {
+		int hc_num = chan->hc_num;
+
+		if (haint & (1 << hc_num)) {
+			dwc2_hc_n_intr(hsotg, hc_num);
+			haint &= ~(1 << hc_num);
+		}
+	}
+
 	for (i = 0; i < hsotg->core_params->host_channels; i++) {
 		if (haint & (1 << i))
 			dwc2_hc_n_intr(hsotg, i);
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 07/22] usb: dwc2: host: Add scheduler tracing
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

In preparation for future changes to the scheduler, let's add some
tracing that makes it easy for us to see what's happening.  By default
this tracing will be off.

By changing the DWC2_TRACE_SCHEDULER and DWC2_TRACE_SCHEDULER_VB defines
in "core.h" you can easily trace to ftrace, the console, or nowhere.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Kever Yang <kever.yang@rock-chips.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Kever's Reviewed-by.
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Retooled scheduler tracing a bit, so left off John's Ack from v3.

Changes in v3: None
Changes in v2: None
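
A small worked sketch of the wrap-around arithmetic behind the new
dwc2_frame_num_dec() helper (not part of the patch).  MAX_FRNUM is
assumed here to be 0x3fff, mirroring what I believe HFNUM_MAX_FRNUM is
in the driver; frame_num_inc() and frame_num_dec() are standalone
copies written for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: a 14-bit (micro)frame counter, 0x0000..0x3fff. */
#define MAX_FRNUM 0x3fff

/* Advance a frame number by inc, wrapping at the counter width. */
static inline uint16_t frame_num_inc(uint16_t frame, uint16_t inc)
{
	return (frame + inc) & MAX_FRNUM;
}

/*
 * Step a frame number back by dec, wrapping at the counter width.
 * Adding MAX_FRNUM + 1 before subtracting keeps the intermediate
 * value non-negative as long as dec fits in the counter range.
 */
static inline uint16_t frame_num_dec(uint16_t frame, uint16_t dec)
{
	assert(dec <= MAX_FRNUM + 1);
	return (frame + MAX_FRNUM + 1 - dec) & MAX_FRNUM;
}
```

Used as frame_num_dec(new, old), this gives the distance from old to
new modulo the counter range, so a schedule that moved from 0x3ffe to
0x0002 reports a delta of 4 rather than a huge negative number.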

 drivers/usb/dwc2/core.h      | 20 ++++++++++++++++++++
 drivers/usb/dwc2/hcd.h       |  5 +++++
 drivers/usb/dwc2/hcd_intr.c  |  6 +++++-
 drivers/usb/dwc2/hcd_queue.c | 24 +++++++++++++++++++++++-
 4 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 538cf38af0e4..18f9e4045643 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -44,6 +44,26 @@
 #include <linux/usb/phy.h>
 #include "hw.h"
 
+/*
+ * Suggested defines for tracers:
+ * - no_printk:    Disable tracing
+ * - pr_info:      Print this info to the console
+ * - trace_printk: Print this info to trace buffer (good for verbose logging)
+ */
+
+#define DWC2_TRACE_SCHEDULER		no_printk
+#define DWC2_TRACE_SCHEDULER_VB		no_printk
+
+/* Detailed scheduler tracing, but won't overwhelm console */
+#define dwc2_sch_dbg(hsotg, fmt, ...)					\
+	DWC2_TRACE_SCHEDULER(pr_fmt("%s: SCH: " fmt),			\
+			     dev_name(hsotg->dev), ##__VA_ARGS__)
+
+/* Verbose scheduler tracing */
+#define dwc2_sch_vdbg(hsotg, fmt, ...)					\
+	DWC2_TRACE_SCHEDULER_VB(pr_fmt("%s: SCH: " fmt),		\
+				dev_name(hsotg->dev), ##__VA_ARGS__)
+
 static inline u32 dwc2_readl(const void __iomem *addr)
 {
 	u32 value = __raw_readl(addr);
diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 1b46e2e617cc..809bc4ff9116 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -563,6 +563,11 @@ static inline u16 dwc2_frame_num_inc(u16 frame, u16 inc)
 	return (frame + inc) & HFNUM_MAX_FRNUM;
 }
 
+static inline u16 dwc2_frame_num_dec(u16 frame, u16 dec)
+{
+	return (frame + HFNUM_MAX_FRNUM + 1 - dec) & HFNUM_MAX_FRNUM;
+}
+
 static inline u16 dwc2_full_frame_num(u16 frame)
 {
 	return (frame & HFNUM_MAX_FRNUM) >> 3;
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index 577c91096a51..5d25a5ec9736 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -138,13 +138,17 @@ static void dwc2_sof_intr(struct dwc2_hsotg *hsotg)
 	while (qh_entry != &hsotg->periodic_sched_inactive) {
 		qh = list_entry(qh_entry, struct dwc2_qh, qh_list_entry);
 		qh_entry = qh_entry->next;
-		if (dwc2_frame_num_le(qh->sched_frame, hsotg->frame_number))
+		if (dwc2_frame_num_le(qh->sched_frame, hsotg->frame_number)) {
+			dwc2_sch_vdbg(hsotg, "QH=%p ready fn=%04x, sch=%04x\n",
+				      qh, hsotg->frame_number, qh->sched_frame);
+
 			/*
 			 * Move QH to the ready list to be executed next
 			 * (micro)frame
 			 */
 			list_move_tail(&qh->qh_list_entry,
 				  &hsotg->periodic_sched_ready);
+		}
 	}
 	tr_type = dwc2_hcd_select_transactions(hsotg);
 	if (tr_type != DWC2_TRANSACTION_NONE)
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index bc632a72f611..0e9faa75593c 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -113,6 +113,9 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		qh->sched_frame = dwc2_frame_num_inc(hsotg->frame_number,
 						     SCHEDULE_SLOP);
 		qh->interval = urb->interval;
+		dwc2_sch_dbg(hsotg, "QH=%p init sch=%04x, fn=%04x, int=%#x\n",
+			     qh, qh->sched_frame, hsotg->frame_number,
+			     qh->interval);
 #if 0
 		/* Increase interrupt polling rate for debugging */
 		if (qh->ep_type == USB_ENDPOINT_XFER_INT)
@@ -126,6 +129,11 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 			qh->interval *= 8;
 			qh->sched_frame |= 0x7;
 			qh->start_split_frame = qh->sched_frame;
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p init*8 sch=%04x, fn=%04x, int=%#x\n",
+				     qh, qh->sched_frame, hsotg->frame_number,
+				     qh->interval);
+
 		}
 		dev_dbg(hsotg->dev, "interval=%d\n", qh->interval);
 	}
@@ -482,6 +490,8 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		if (frame >= 0) {
 			qh->sched_frame &= ~0x7;
 			qh->sched_frame |= (frame & 7);
+			dwc2_sch_dbg(hsotg, "QH=%p sched_p sch=%04x, uf=%d\n",
+				     qh, qh->sched_frame, frame);
 		}
 
 		if (status > 0)
@@ -583,10 +593,16 @@ int dwc2_hcd_qh_add(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 
 	if (!dwc2_frame_num_le(qh->sched_frame, hsotg->frame_number) &&
 			!hsotg->frame_number) {
+		u16 new_frame;
+
 		dev_dbg(hsotg->dev,
 				"reset frame number counter\n");
-		qh->sched_frame = dwc2_frame_num_inc(hsotg->frame_number,
+		new_frame = dwc2_frame_num_inc(hsotg->frame_number,
 				SCHEDULE_SLOP);
+
+		dwc2_sch_vdbg(hsotg, "QH=%p reset sch=%04x=>%04x\n",
+			      qh, qh->sched_frame, new_frame);
+		qh->sched_frame = new_frame;
 	}
 
 	/* Add the new QH to the appropriate schedule */
@@ -652,6 +668,7 @@ static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
 				      int sched_next_periodic_split)
 {
 	u16 incr;
+	u16 old_frame = qh->sched_frame;
 
 	if (sched_next_periodic_split) {
 		qh->sched_frame = frame_number;
@@ -677,6 +694,11 @@ static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
 		qh->sched_frame |= 0x7;
 		qh->start_split_frame = qh->sched_frame;
 	}
+
+	dwc2_sch_vdbg(hsotg, "QH=%p next(%d) fn=%04x, sch=%04x=>%04x (%+d)\n",
+		      qh, sched_next_periodic_split, frame_number, old_frame,
+		      qh->sched_frame,
+		      dwc2_frame_num_dec(qh->sched_frame, old_frame));
 }
 
 /*
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread
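
A note on the wraparound arithmetic in the dwc2_frame_num_dec() helper added by the patch above. What follows is a minimal userspace sketch, not driver code. The sketch_* names are hypothetical, and SKETCH_MAX_FRNUM assumes that HFNUM_MAX_FRNUM is the 14-bit mask 0x3fff, as I believe it is in the driver's hw.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors HFNUM_MAX_FRNUM (14-bit frame counter mask). */
#define SKETCH_MAX_FRNUM 0x3fff

/* Same arithmetic as dwc2_frame_num_inc(): advance, then wrap. */
static inline uint16_t sketch_frame_num_inc(uint16_t frame, uint16_t inc)
{
	return (frame + inc) & SKETCH_MAX_FRNUM;
}

/*
 * Same arithmetic as dwc2_frame_num_dec(): add one full counter period
 * before subtracting, then wrap back into the 14-bit range.
 */
static inline uint16_t sketch_frame_num_dec(uint16_t frame, uint16_t dec)
{
	return (frame + SKETCH_MAX_FRNUM + 1 - dec) & SKETCH_MAX_FRNUM;
}

/* Worked examples; each assertion was checked by hand. */
static void sketch_frame_num_examples(void)
{
	/* Incrementing past the top of the counter wraps to a small value. */
	assert(sketch_frame_num_inc(0x3ffe, 5) == 0x0003);
	/* Decrementing undoes it, wrapping in the other direction. */
	assert(sketch_frame_num_dec(0x0003, 5) == 0x3ffe);
	/* Ordinary case: no wrap involved. */
	assert(sketch_frame_num_dec(0x0010, 0x0008) == 0x0008);
	/* A "backwards" step comes out as a large value, never negative. */
	assert(sketch_frame_num_dec(0x0008, 0x0010) == 0x3ff8);
}
```

Because the result is always masked into 0..0x3fff, the "(%+d)" field in the dwc2_sch_vdbg() trace should always print a non-negative delta. A schedule that moved backwards would show up as a value close to 0x3fff instead of a negative number.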

* [PATCH v6 08/22] usb: dwc2: host: Add a delay before releasing periodic bandwidth
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

We'd like to be able to use HCD_BH in order to speed up the dwc2 host
interrupt handler quite a bit.  However, according to the kernel doc for
usb_submit_urb() (specifically the part about "Reserved Bandwidth
Transfers"), we need to keep a reservation active as long as a device
driver keeps submitting.  That was easy to do when we gave back the URB
in the interrupt context: we just looked at when our queue was empty and
released the reserved bandwidth then.  ...but now we need a little more
complexity.

We'll follow EHCI's lead in commit 9118f9eb4f1e ("USB: EHCI: improve
interrupt qh unlink") and add a 5ms delay.  Since we don't have a whole
timer infrastructure in dwc2, we'll just add a timer per QH.  The
overhead for this is very small.

Note that the dwc2 scheduler is pretty broken (see future patches to fix
it).  This patch attempts to replicate all old behavior and just add the
proper delay.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Moved periodic bandwidth release delay patch earlier again.

Changes in v3:
- Moved periodic bandwidth release delay patch later in the series.

Changes in v2:
- Periodic bandwidth release delay new for V2

 drivers/usb/dwc2/hcd.h       |   6 ++
 drivers/usb/dwc2/hcd_queue.c | 237 +++++++++++++++++++++++++++++++++----------
 2 files changed, 187 insertions(+), 56 deletions(-)
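
A note on the "+ 1" that the diff below adds to the timer expiry (jiffies + DWC2_UNRESERVE_DELAY + 1). The following is a simplified userspace model of the reasoning, not kernel code. The sketch_* helpers are hypothetical and only assume that msecs_to_jiffies() rounds up to whole ticks and that a timer never fires before its expiry tick.

```c
#include <assert.h>

/* Simplified model of msecs_to_jiffies(): round up to whole ticks. */
static unsigned long sketch_ms_to_ticks(unsigned int ms, unsigned int hz)
{
	return ((unsigned long)ms * hz + 999) / 1000;
}

/*
 * Shortest real time, in microseconds, before a timer armed at
 * "now + ticks" can fire.  The current tick may be almost over when
 * the timer is armed, so only (ticks - 1) full periods are guaranteed.
 */
static unsigned long sketch_min_delay_us(unsigned long ticks, unsigned int hz)
{
	return ticks ? (ticks - 1) * (1000000UL / hz) : 0;
}

/* Worked examples for two common HZ values. */
static void sketch_delay_examples(void)
{
	/* HZ=100: 5 ms rounds up to 1 tick, which guarantees no delay. */
	assert(sketch_ms_to_ticks(5, 100) == 1);
	assert(sketch_min_delay_us(1, 100) == 0);
	/* With the extra tick, at least one full 10 ms period elapses. */
	assert(sketch_min_delay_us(1 + 1, 100) == 10000);

	/* HZ=1000: 5 ticks guarantee only 4 ms; 6 ticks guarantee 5 ms. */
	assert(sketch_ms_to_ticks(5, 1000) == 5);
	assert(sketch_min_delay_us(5, 1000) == 4000);
	assert(sketch_min_delay_us(5 + 1, 1000) == 5000);
}
```

Under this model the extra jiffy is what turns "up to 5ms" into "at least 5ms", at the cost of waiting one tick longer in the worst case.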

diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 809bc4ff9116..79473ea35bd6 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -215,6 +215,7 @@ enum dwc2_transaction_type {
 /**
  * struct dwc2_qh - Software queue head structure
  *
+ * @hsotg:              The HCD state structure for the DWC OTG controller
  * @ep_type:            Endpoint type. One of the following values:
  *                       - USB_ENDPOINT_XFER_CONTROL
  *                       - USB_ENDPOINT_XFER_BULK
@@ -252,13 +253,16 @@ enum dwc2_transaction_type {
  * @n_bytes:            Xfer Bytes array. Each element corresponds to a transfer
  *                      descriptor and indicates original XferSize value for the
  *                      descriptor
+ * @unreserve_timer:    Timer for releasing periodic reservation.
  * @tt_buffer_dirty     True if clear_tt_buffer_complete is pending
+ * @unreserve_pending:  True if we planned to unreserve but haven't yet.
  *
  * A Queue Head (QH) holds the static characteristics of an endpoint and
  * maintains a list of transfers (QTDs) for that endpoint. A QH structure may
  * be entered in either the non-periodic or periodic schedule.
  */
 struct dwc2_qh {
+	struct dwc2_hsotg *hsotg;
 	u8 ep_type;
 	u8 ep_is_in;
 	u16 maxp;
@@ -281,7 +285,9 @@ struct dwc2_qh {
 	dma_addr_t desc_list_dma;
 	u32 desc_list_sz;
 	u32 *n_bytes;
+	struct timer_list unreserve_timer;
 	unsigned tt_buffer_dirty:1;
+	unsigned unreserve_pending:1;
 };
 
 /**
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 0e9faa75593c..b9e4867e1afd 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -53,6 +53,94 @@
 #include "core.h"
 #include "hcd.h"
 
+/* Wait this long before releasing periodic reservation */
+#define DWC2_UNRESERVE_DELAY (msecs_to_jiffies(5))
+
+/**
+ * dwc2_do_unreserve() - Actually release the periodic reservation
+ *
+ * This function actually releases the periodic bandwidth that was reserved
+ * by the given qh.
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    QH for the periodic transfer.
+ */
+static void dwc2_do_unreserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	assert_spin_locked(&hsotg->lock);
+
+	WARN_ON(!qh->unreserve_pending);
+
+	/* No more unreserve pending--we're doing it */
+	qh->unreserve_pending = false;
+
+	if (WARN_ON(!list_empty(&qh->qh_list_entry)))
+		list_del_init(&qh->qh_list_entry);
+
+	/* Update claimed usecs per (micro)frame */
+	hsotg->periodic_usecs -= qh->usecs;
+
+	if (hsotg->core_params->uframe_sched > 0) {
+		int i;
+
+		for (i = 0; i < 8; i++) {
+			hsotg->frame_usecs[i] += qh->frame_usecs[i];
+			qh->frame_usecs[i] = 0;
+		}
+	} else {
+		/* Release periodic channel reservation */
+		hsotg->periodic_channels--;
+	}
+}
+
+/**
+ * dwc2_unreserve_timer_fn() - Timer function to release periodic reservation
+ *
+ * According to the kernel doc for usb_submit_urb() (specifically the part about
+ * "Reserved Bandwidth Transfers"), we need to keep a reservation active as
+ * long as a device driver keeps submitting.  Since we're using HCD_BH to give
+ * back the URB we need to give the driver a little bit of time before we
+ * release the reservation.  This timer function is called after the
+ * appropriate delay.
+ *
+ * @data: Pointer to the QH, cast to unsigned long.
+ */
+static void dwc2_unreserve_timer_fn(unsigned long data)
+{
+	struct dwc2_qh *qh = (struct dwc2_qh *)data;
+	struct dwc2_hsotg *hsotg = qh->hsotg;
+	unsigned long flags;
+
+	/*
+	 * Wait for the lock, or for us to be scheduled again.  We
+	 * could be scheduled again if:
+	 * - We started executing but didn't get the lock yet.
+	 * - A new reservation came in, but cancel didn't take effect
+	 *   because we already started executing.
+	 * - The timer has been kicked again.
+	 * In that case cancel and wait for the next call.
+	 */
+	while (!spin_trylock_irqsave(&hsotg->lock, flags)) {
+		if (timer_pending(&qh->unreserve_timer))
+			return;
+	}
+
+	/*
+	 * Might be no more unreserve pending if:
+	 * - We started executing but didn't get the lock yet.
+	 * - A new reservation came in, but cancel didn't take effect
+	 *   because we already started executing.
+	 *
+	 * We can't put this in the loop above because unreserve_pending needs
+	 * to be accessed under lock, so we can only check it once we got the
+	 * lock.
+	 */
+	if (qh->unreserve_pending)
+		dwc2_do_unreserve(hsotg, qh);
+
+	spin_unlock_irqrestore(&hsotg->lock, flags);
+}
+
 /**
  * dwc2_qh_init() - Initializes a QH structure
  *
@@ -71,6 +159,9 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	dev_vdbg(hsotg->dev, "%s()\n", __func__);
 
 	/* Initialize QH */
+	qh->hsotg = hsotg;
+	setup_timer(&qh->unreserve_timer, dwc2_unreserve_timer_fn,
+		    (unsigned long)qh);
 	qh->ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
 	qh->ep_is_in = dwc2_hcd_is_pipe_in(&urb->pipe_info) ? 1 : 0;
 
@@ -240,6 +331,15 @@ struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
  */
 void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
+	/* Make sure any unreserve work is finished. */
+	if (del_timer_sync(&qh->unreserve_timer)) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&hsotg->lock, flags);
+		dwc2_do_unreserve(hsotg, qh);
+		spin_unlock_irqrestore(&hsotg->lock, flags);
+	}
+
 	if (qh->desc_list)
 		dwc2_hcd_qh_free_ddma(hsotg, qh);
 	kfree(qh);
@@ -477,51 +577,74 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
 	int status;
 
-	if (hsotg->core_params->uframe_sched > 0) {
-		int frame = -1;
-
-		status = dwc2_find_uframe(hsotg, qh);
-		if (status == 0)
-			frame = 7;
-		else if (status > 0)
-			frame = status - 1;
-
-		/* Set the new frame up */
-		if (frame >= 0) {
-			qh->sched_frame &= ~0x7;
-			qh->sched_frame |= (frame & 7);
-			dwc2_sch_dbg(hsotg, "QH=%p sched_p sch=%04x, uf=%d\n",
-				     qh, qh->sched_frame, frame);
+	status = dwc2_check_max_xfer_size(hsotg, qh);
+	if (status) {
+		dev_dbg(hsotg->dev,
+			"%s: Channel max transfer size too small for periodic transfer\n",
+			__func__);
+		return status;
+	}
+
+	/* Cancel pending unreserve; if canceled OK, unreserve was pending */
+	if (del_timer(&qh->unreserve_timer))
+		WARN_ON(!qh->unreserve_pending);
+
+	/*
+	 * Only need to reserve if there's not an unreserve pending, since if an
+	 * unreserve is pending then by definition our old reservation is still
+	 * valid.  Unreserve might still be pending even if we didn't cancel if
+	 * dwc2_unreserve_timer_fn() already started.  Code in the timer handles
+	 * that case.
+	 */
+	if (!qh->unreserve_pending) {
+		if (hsotg->core_params->uframe_sched > 0) {
+			int frame = -1;
+
+			status = dwc2_find_uframe(hsotg, qh);
+			if (status == 0)
+				frame = 7;
+			else if (status > 0)
+				frame = status - 1;
+
+			/* Set the new frame up */
+			if (frame >= 0) {
+				qh->sched_frame &= ~0x7;
+				qh->sched_frame |= (frame & 7);
+				dwc2_sch_dbg(hsotg,
+					     "QH=%p sched_p sch=%04x, uf=%d\n",
+					     qh, qh->sched_frame, frame);
+			}
+
+			if (status > 0)
+				status = 0;
+		} else {
+			status = dwc2_periodic_channel_available(hsotg);
+			if (status) {
+				dev_info(hsotg->dev,
+					"%s: No host channel available for periodic transfer\n",
+					__func__);
+				return status;
+			}
+
+			status = dwc2_check_periodic_bandwidth(hsotg, qh);
 		}
 
-		if (status > 0)
-			status = 0;
-	} else {
-		status = dwc2_periodic_channel_available(hsotg);
 		if (status) {
-			dev_info(hsotg->dev,
-				 "%s: No host channel available for periodic transfer\n",
-				 __func__);
+			dev_dbg(hsotg->dev,
+				"%s: Insufficient periodic bandwidth for periodic transfer\n",
+				__func__);
 			return status;
 		}
 
-		status = dwc2_check_periodic_bandwidth(hsotg, qh);
-	}
+		if (hsotg->core_params->uframe_sched <= 0)
+			/* Reserve periodic channel */
+			hsotg->periodic_channels++;
 
-	if (status) {
-		dev_dbg(hsotg->dev,
-			"%s: Insufficient periodic bandwidth for periodic transfer\n",
-			__func__);
-		return status;
+		/* Update claimed usecs per (micro)frame */
+		hsotg->periodic_usecs += qh->usecs;
 	}
 
-	status = dwc2_check_max_xfer_size(hsotg, qh);
-	if (status) {
-		dev_dbg(hsotg->dev,
-			"%s: Channel max transfer size too small for periodic transfer\n",
-			__func__);
-		return status;
-	}
+	qh->unreserve_pending = 0;
 
 	if (hsotg->core_params->dma_desc_enable > 0)
 		/* Don't rely on SOF and start in ready schedule */
@@ -531,13 +654,6 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		list_add_tail(&qh->qh_list_entry,
 			      &hsotg->periodic_sched_inactive);
 
-	if (hsotg->core_params->uframe_sched <= 0)
-		/* Reserve periodic channel */
-		hsotg->periodic_channels++;
-
-	/* Update claimed usecs per (micro)frame */
-	hsotg->periodic_usecs += qh->usecs;
-
 	return status;
 }
 
@@ -551,22 +667,31 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg,
 				     struct dwc2_qh *qh)
 {
-	int i;
+	bool did_modify;
 
-	list_del_init(&qh->qh_list_entry);
+	assert_spin_locked(&hsotg->lock);
 
-	/* Update claimed usecs per (micro)frame */
-	hsotg->periodic_usecs -= qh->usecs;
+	/*
+	 * Schedule the unreserve to happen in a little bit.  Cases here:
+	 * - Unreserve worker might be sitting there waiting to grab the lock.
+	 *   In this case it will notice it's been scheduled again and will
+	 *   quit.
+	 * - Unreserve worker might not be scheduled.
+	 *
+	 * We should never already be scheduled since dwc2_schedule_periodic()
+	 * should have canceled the scheduled unreserve timer (hence the
+	 * warning on did_modify).
+	 *
+	 * We add + 1 to the timer to guarantee that at least 1 jiffy has
+	 * passed (otherwise the jiffy counter might tick right after we
+	 * read it and we'd get no delay).
+	 */
+	did_modify = mod_timer(&qh->unreserve_timer,
+			       jiffies + DWC2_UNRESERVE_DELAY + 1);
+	WARN_ON(did_modify);
+	qh->unreserve_pending = 1;
 
-	if (hsotg->core_params->uframe_sched > 0) {
-		for (i = 0; i < 8; i++) {
-			hsotg->frame_usecs[i] += qh->frame_usecs[i];
-			qh->frame_usecs[i] = 0;
-		}
-	} else {
-		/* Release periodic channel reservation */
-		hsotg->periodic_channels--;
-	}
+	list_del_init(&qh->qh_list_entry);
 }
 
 /**
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread
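
To summarize the reservation life cycle that the patch above implements, here is a minimal userspace sketch of the state machine. It is an illustration under stated assumptions, not the driver code: all sketch_* names are hypothetical, the timer is reduced to a boolean, and locking (hsotg->lock) and the uframe/channel bookkeeping are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host-wide bandwidth accounting. */
struct sketch_host {
	int periodic_usecs;
};

/* Hypothetical stand-in for the per-QH state the patch adds. */
struct sketch_qh {
	int usecs;
	bool timer_armed;        /* models qh->unreserve_timer being pending */
	bool unreserve_pending;  /* models qh->unreserve_pending */
};

/* Models dwc2_schedule_periodic(): reuse the old reservation if possible. */
static void sketch_schedule(struct sketch_host *h, struct sketch_qh *qh)
{
	qh->timer_armed = false;               /* like del_timer() */
	if (!qh->unreserve_pending)
		h->periodic_usecs += qh->usecs; /* fresh reservation */
	qh->unreserve_pending = false;
}

/* Models dwc2_deschedule_periodic(): keep the bandwidth, start the clock. */
static void sketch_deschedule(struct sketch_qh *qh)
{
	assert(!qh->timer_armed);              /* mirrors WARN_ON(did_modify) */
	qh->timer_armed = true;                /* like mod_timer() */
	qh->unreserve_pending = true;
}

/* Models dwc2_unreserve_timer_fn(): release only if still pending. */
static void sketch_timer_expired(struct sketch_host *h, struct sketch_qh *qh)
{
	if (!qh->timer_armed)
		return;                        /* timer was cancelled */
	qh->timer_armed = false;
	if (qh->unreserve_pending) {
		qh->unreserve_pending = false;
		h->periodic_usecs -= qh->usecs;
	}
}
```

The point of the design is that a driver which resubmits within the delay window never loses its slot: sketch_schedule() finds unreserve_pending set, skips the bandwidth check, and simply clears the flag. Only a QH that stays idle until the timer expires gives its bandwidth back.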

* [PATCH v6 08/22] usb: dwc2: host: Add a delay before releasing periodic bandwidth
@ 2016-01-29  2:19   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:19 UTC (permalink / raw)
  To: John Youn, balbi-l0cyMroinI0, kever.yang-TNX95d0MmH7DzftRWevZcw
  Cc: huangtao-TNX95d0MmH7DzftRWevZcw, stefan.wahren-eS4NqCHxEME,
	heiko-4mtYJXux2i+zQB+pC5nmwQ, johnyoun-HKixBCOQz3hWk0Htik3J/w,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, Douglas Anderson,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	yousaf.kaukab-ral2JQCrhuEAvxtiuMwx3w,
	stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz,
	linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	gregory.herrero-ral2JQCrhuEAvxtiuMwx3w,
	william.wu-TNX95d0MmH7DzftRWevZcw, Julius Werner,
	dinguyen-yzvPICuk2ABMcg4IHK0kFoH6Mc4MB0Vx

We'd like to be able to use HCD_BH in order to speed up the dwc2 host
interrupt handler quite a bit.  However, according to the kernel doc for
usb_submit_urb() (specifically the part about "Reserved Bandwidth
Transfers"), we need to keep a reservation active as long as a device
driver keeps submitting.  That was easy to do when we gave back the URB
in the interrupt context: we just looked at when our queue was empty and
released the reserved bandwidth then.  ...but now we need a little more
complexity.

We'll follow EHCI's lead in commit 9118f9eb4f1e ("USB: EHCI: improve
interrupt qh unlink") and add a 5ms delay.  Since we don't have a whole
timer infrastructure in dwc2, we'll just add a timer per QH.  The
overhead for this is very small.

Note that the dwc2 scheduler is pretty broken (see future patches to fix
it).  This patch attempts to replicate all old behavior and just add the
proper delay.

Signed-off-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Tested-by: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>
Tested-by: Stefan Wahren <stefan.wahren-eS4NqCHxEME@public.gmane.org>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Moved periodic bandwidth release delay patch earlier again.

Changes in v3:
- Moved periodic bandwidth release delay patch later in the series.

Changes in v2:
- Periodic bandwidth release delay new for V2

 drivers/usb/dwc2/hcd.h       |   6 ++
 drivers/usb/dwc2/hcd_queue.c | 237 +++++++++++++++++++++++++++++++++----------
 2 files changed, 187 insertions(+), 56 deletions(-)

diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 809bc4ff9116..79473ea35bd6 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -215,6 +215,7 @@ enum dwc2_transaction_type {
 /**
  * struct dwc2_qh - Software queue head structure
  *
+ * @hsotg:              The HCD state structure for the DWC OTG controller
  * @ep_type:            Endpoint type. One of the following values:
  *                       - USB_ENDPOINT_XFER_CONTROL
  *                       - USB_ENDPOINT_XFER_BULK
@@ -252,13 +253,16 @@ enum dwc2_transaction_type {
  * @n_bytes:            Xfer Bytes array. Each element corresponds to a transfer
  *                      descriptor and indicates original XferSize value for the
  *                      descriptor
+ * @unreserve_timer:    Timer for releasing periodic reservation.
  * @tt_buffer_dirty     True if clear_tt_buffer_complete is pending
+ * @unreserve_pending:  True if we planned to unreserve but haven't yet.
  *
  * A Queue Head (QH) holds the static characteristics of an endpoint and
  * maintains a list of transfers (QTDs) for that endpoint. A QH structure may
  * be entered in either the non-periodic or periodic schedule.
  */
 struct dwc2_qh {
+	struct dwc2_hsotg *hsotg;
 	u8 ep_type;
 	u8 ep_is_in;
 	u16 maxp;
@@ -281,7 +285,9 @@ struct dwc2_qh {
 	dma_addr_t desc_list_dma;
 	u32 desc_list_sz;
 	u32 *n_bytes;
+	struct timer_list unreserve_timer;
 	unsigned tt_buffer_dirty:1;
+	unsigned unreserve_pending:1;
 };
 
 /**
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 0e9faa75593c..b9e4867e1afd 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -53,6 +53,94 @@
 #include "core.h"
 #include "hcd.h"
 
+/* Wait this long before releasing periodic reservation */
+#define DWC2_UNRESERVE_DELAY (msecs_to_jiffies(5))
+
+/**
+ * dwc2_do_unreserve() - Actually release the periodic reservation
+ *
+ * This function actually releases the periodic bandwidth that was reserved
+ * by the given qh.
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    QH for the periodic transfer.
+ */
+static void dwc2_do_unreserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	assert_spin_locked(&hsotg->lock);
+
+	WARN_ON(!qh->unreserve_pending);
+
+	/* No more unreserve pending--we're doing it */
+	qh->unreserve_pending = false;
+
+	if (WARN_ON(!list_empty(&qh->qh_list_entry)))
+		list_del_init(&qh->qh_list_entry);
+
+	/* Update claimed usecs per (micro)frame */
+	hsotg->periodic_usecs -= qh->usecs;
+
+	if (hsotg->core_params->uframe_sched > 0) {
+		int i;
+
+		for (i = 0; i < 8; i++) {
+			hsotg->frame_usecs[i] += qh->frame_usecs[i];
+			qh->frame_usecs[i] = 0;
+		}
+	} else {
+		/* Release periodic channel reservation */
+		hsotg->periodic_channels--;
+	}
+}
+
+/**
+ * dwc2_unreserve_timer_fn() - Timer function to release periodic reservation
+ *
+ * According to the kernel doc for usb_submit_urb() (specifically the part about
+ * "Reserved Bandwidth Transfers"), we need to keep a reservation active as
+ * long as a device driver keeps submitting.  Since we're using HCD_BH to give
+ * back the URB we need to give the driver a little bit of time before we
+ * release the reservation.  This timer function is called after the
+ * appropriate delay.
+ *
+ * @data: Pointer to the QH, cast to unsigned long.
+ */
+static void dwc2_unreserve_timer_fn(unsigned long data)
+{
+	struct dwc2_qh *qh = (struct dwc2_qh *)data;
+	struct dwc2_hsotg *hsotg = qh->hsotg;
+	unsigned long flags;
+
+	/*
+	 * Wait for the lock, or for us to be scheduled again.  We
+	 * could be scheduled again if:
+	 * - We started executing but didn't get the lock yet.
+	 * - A new reservation came in, but cancel didn't take effect
+	 *   because we already started executing.
+	 * - The timer has been kicked again.
+	 * In that case cancel and wait for the next call.
+	 */
+	while (!spin_trylock_irqsave(&hsotg->lock, flags)) {
+		if (timer_pending(&qh->unreserve_timer))
+			return;
+	}
+
+	/*
+	 * Might be no more unreserve pending if:
+	 * - We started executing but didn't get the lock yet.
+	 * - A new reservation came in, but cancel didn't take effect
+	 *   because we already started executing.
+	 *
+	 * We can't put this in the loop above because unreserve_pending needs
+	 * to be accessed under lock, so we can only check it once we got the
+	 * lock.
+	 */
+	if (qh->unreserve_pending)
+		dwc2_do_unreserve(hsotg, qh);
+
+	spin_unlock_irqrestore(&hsotg->lock, flags);
+}
+
 /**
  * dwc2_qh_init() - Initializes a QH structure
  *
@@ -71,6 +159,9 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	dev_vdbg(hsotg->dev, "%s()\n", __func__);
 
 	/* Initialize QH */
+	qh->hsotg = hsotg;
+	setup_timer(&qh->unreserve_timer, dwc2_unreserve_timer_fn,
+		    (unsigned long)qh);
 	qh->ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
 	qh->ep_is_in = dwc2_hcd_is_pipe_in(&urb->pipe_info) ? 1 : 0;
 
@@ -240,6 +331,15 @@ struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
  */
 void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
+	/* Make sure any unreserve work is finished. */
+	if (del_timer_sync(&qh->unreserve_timer)) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&hsotg->lock, flags);
+		dwc2_do_unreserve(hsotg, qh);
+		spin_unlock_irqrestore(&hsotg->lock, flags);
+	}
+
 	if (qh->desc_list)
 		dwc2_hcd_qh_free_ddma(hsotg, qh);
 	kfree(qh);
@@ -477,51 +577,74 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
 	int status;
 
-	if (hsotg->core_params->uframe_sched > 0) {
-		int frame = -1;
-
-		status = dwc2_find_uframe(hsotg, qh);
-		if (status == 0)
-			frame = 7;
-		else if (status > 0)
-			frame = status - 1;
-
-		/* Set the new frame up */
-		if (frame >= 0) {
-			qh->sched_frame &= ~0x7;
-			qh->sched_frame |= (frame & 7);
-			dwc2_sch_dbg(hsotg, "QH=%p sched_p sch=%04x, uf=%d\n",
-				     qh, qh->sched_frame, frame);
+	status = dwc2_check_max_xfer_size(hsotg, qh);
+	if (status) {
+		dev_dbg(hsotg->dev,
+			"%s: Channel max transfer size too small for periodic transfer\n",
+			__func__);
+		return status;
+	}
+
+	/* Cancel pending unreserve; if canceled OK, unreserve was pending */
+	if (del_timer(&qh->unreserve_timer))
+		WARN_ON(!qh->unreserve_pending);
+
+	/*
+	 * Only need to reserve if there's not an unreserve pending, since if an
+	 * unreserve is pending then by definition our old reservation is still
+	 * valid.  Unreserve might still be pending even if we didn't cancel if
+	 * dwc2_unreserve_timer_fn() already started.  Code in the timer handles
+	 * that case.
+	 */
+	if (!qh->unreserve_pending) {
+		if (hsotg->core_params->uframe_sched > 0) {
+			int frame = -1;
+
+			status = dwc2_find_uframe(hsotg, qh);
+			if (status == 0)
+				frame = 7;
+			else if (status > 0)
+				frame = status - 1;
+
+			/* Set the new frame up */
+			if (frame >= 0) {
+				qh->sched_frame &= ~0x7;
+				qh->sched_frame |= (frame & 7);
+				dwc2_sch_dbg(hsotg,
+					     "QH=%p sched_p sch=%04x, uf=%d\n",
+					     qh, qh->sched_frame, frame);
+			}
+
+			if (status > 0)
+				status = 0;
+		} else {
+			status = dwc2_periodic_channel_available(hsotg);
+			if (status) {
+				dev_info(hsotg->dev,
+					"%s: No host channel available for periodic transfer\n",
+					__func__);
+				return status;
+			}
+
+			status = dwc2_check_periodic_bandwidth(hsotg, qh);
 		}
 
-		if (status > 0)
-			status = 0;
-	} else {
-		status = dwc2_periodic_channel_available(hsotg);
 		if (status) {
-			dev_info(hsotg->dev,
-				 "%s: No host channel available for periodic transfer\n",
-				 __func__);
+			dev_dbg(hsotg->dev,
+				"%s: Insufficient periodic bandwidth for periodic transfer\n",
+				__func__);
 			return status;
 		}
 
-		status = dwc2_check_periodic_bandwidth(hsotg, qh);
-	}
+		if (hsotg->core_params->uframe_sched <= 0)
+			/* Reserve periodic channel */
+			hsotg->periodic_channels++;
 
-	if (status) {
-		dev_dbg(hsotg->dev,
-			"%s: Insufficient periodic bandwidth for periodic transfer\n",
-			__func__);
-		return status;
+		/* Update claimed usecs per (micro)frame */
+		hsotg->periodic_usecs += qh->usecs;
 	}
 
-	status = dwc2_check_max_xfer_size(hsotg, qh);
-	if (status) {
-		dev_dbg(hsotg->dev,
-			"%s: Channel max transfer size too small for periodic transfer\n",
-			__func__);
-		return status;
-	}
+	qh->unreserve_pending = 0;
 
 	if (hsotg->core_params->dma_desc_enable > 0)
 		/* Don't rely on SOF and start in ready schedule */
@@ -531,13 +654,6 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		list_add_tail(&qh->qh_list_entry,
 			      &hsotg->periodic_sched_inactive);
 
-	if (hsotg->core_params->uframe_sched <= 0)
-		/* Reserve periodic channel */
-		hsotg->periodic_channels++;
-
-	/* Update claimed usecs per (micro)frame */
-	hsotg->periodic_usecs += qh->usecs;
-
 	return status;
 }
 
@@ -551,22 +667,31 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg,
 				     struct dwc2_qh *qh)
 {
-	int i;
+	bool did_modify;
 
-	list_del_init(&qh->qh_list_entry);
+	assert_spin_locked(&hsotg->lock);
 
-	/* Update claimed usecs per (micro)frame */
-	hsotg->periodic_usecs -= qh->usecs;
+	/*
+	 * Schedule the unreserve to happen in a little bit.  Cases here:
+	 * - Unreserve worker might be sitting there waiting to grab the lock.
+	 *   In this case it will notice it's been scheduled again and will
+	 *   quit.
+	 * - Unreserve worker might not be scheduled.
+	 *
+	 * We should never already be scheduled since dwc2_schedule_periodic()
+	 * should have canceled the scheduled unreserve timer (hence the
+	 * warning on did_modify).
+	 *
+	 * We add + 1 to the timer to guarantee that at least 1 jiffy has
+	 * passed (otherwise the jiffy counter might tick right after we
+	 * read it and we'll get no delay).
+	 */
+	did_modify = mod_timer(&qh->unreserve_timer,
+			       jiffies + DWC2_UNRESERVE_DELAY + 1);
+	WARN_ON(did_modify);
+	qh->unreserve_pending = 1;
 
-	if (hsotg->core_params->uframe_sched > 0) {
-		for (i = 0; i < 8; i++) {
-			hsotg->frame_usecs[i] += qh->frame_usecs[i];
-			qh->frame_usecs[i] = 0;
-		}
-	} else {
-		/* Release periodic channel reservation */
-		hsotg->periodic_channels--;
-	}
+	list_del_init(&qh->qh_list_entry);
 }
 
 /**
-- 
2.7.0.rc3.207.g0ac5344
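
Not part of the patch: a rough userspace sketch of the reservation
lifecycle that the hunks above implement, assuming a single QH and
replacing the kernel timer with an explicit function call.  The names
struct qh_model, model_schedule(), model_deschedule() and
model_timer_fire() are made up for illustration; only unreserve_pending
mirrors a field from the patch, and the timer stand-in is just my
reading of what the comments describe.

```c
#include <stdbool.h>

/* Hypothetical model of one QH's bandwidth reservation (not driver code). */
struct qh_model {
	bool reserved;          /* bandwidth currently claimed */
	bool unreserve_pending; /* release has been deferred */
	int reserve_count;      /* how often we really (re)reserved */
};

/* Like dwc2_schedule_periodic(): only reserve if no unreserve is pending. */
static void model_schedule(struct qh_model *qh)
{
	if (!qh->unreserve_pending) {
		qh->reserved = true;
		qh->reserve_count++;
	}
	qh->unreserve_pending = false;
}

/* Like dwc2_deschedule_periodic(): keep the reservation, defer the release. */
static void model_deschedule(struct qh_model *qh)
{
	qh->unreserve_pending = true;
}

/* Stand-in for the timer expiring: release only if still pending. */
static void model_timer_fire(struct qh_model *qh)
{
	if (!qh->unreserve_pending)
		return;
	qh->reserved = false;
	qh->unreserve_pending = false;
}
```

The point of the model: a deschedule followed quickly by a schedule
reuses the old reservation, and the expensive reserve path only runs
again once the deferred release has actually happened.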

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 09/22] usb: dwc2: host: Giveback URB in tasklet context
  2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
                   ` (7 preceding siblings ...)
  2016-01-29  2:19   ` Douglas Anderson
@ 2016-01-29  2:20 ` Douglas Anderson
  2016-01-29  2:20 ` [PATCH v6 10/22] usb: dwc2: host: Properly set the HFIR Douglas Anderson
                   ` (13 subsequent siblings)
  22 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

In commit 94dfd7edfd5c ("USB: HCD: support giveback of URB in tasklet
context") support was added to give back the URB in tasklet context.
Let's take advantage of this in dwc2.

This speeds up the dwc2 interrupt handler considerably.

Note that this requires the change ("usb: dwc2: host: Add a delay before
releasing periodic bandwidth") to come first.

Note that, as per Alan Stern in
<https://patchwork.kernel.org/patch/7555771/>, we also need to make sure
that the extra delay before the device drivers submit more data doesn't
break the scheduler.  At the moment the scheduler is pretty broken (see
future patches) so it's hard to be 100% certain, but I have yet to see
any new breakage introduced by this delay.  ...and speeding up interrupt
processing for dwc2 is a huge deal because it means we've got a better
chance of not missing SOF interrupts.  That means we've got an overall
win here.
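
For anyone who hasn't looked at the bottom-half giveback idea, here is a
minimal userspace sketch of it.  It is not the USB core implementation:
struct model_urb, struct giveback_queue and the gq_*() helpers are
hypothetical names, and the real code uses a tasklet plus locking that
this model leaves out.

```c
#include <stddef.h>

/* Hypothetical stand-in for an URB with a completion callback. */
struct model_urb {
	void (*complete)(struct model_urb *urb);
	struct model_urb *next;
	int status;
};

/* Completions queued by the "interrupt handler", drained later. */
struct giveback_queue {
	struct model_urb *head;
	struct model_urb **tail;
};

static void gq_init(struct giveback_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

/* Interrupt side: O(1) append; no driver callback runs here. */
static void gq_giveback(struct giveback_queue *q, struct model_urb *urb,
			int status)
{
	urb->status = status;
	urb->next = NULL;
	*q->tail = urb;
	q->tail = &urb->next;
}

/* Bottom-half side: run the (possibly slow) callbacks in FIFO order. */
static int gq_run_bottom_half(struct giveback_queue *q)
{
	struct model_urb *urb = q->head;
	int n = 0;

	gq_init(q);
	while (urb) {
		struct model_urb *next = urb->next;

		urb->complete(urb);
		urb = next;
		n++;
	}
	return n;
}

/* Example callback: just counts completions and records the status. */
static int completed_count;
static int last_status;

static void count_complete(struct model_urb *urb)
{
	completed_count++;
	last_status = urb->status;
}
```

The interrupt path only pays for the list append; the device driver's
completion handler runs later, which is where both the speedup and the
extra resubmission delay mentioned above come from.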

Note that when playing USB audio and using a USB webcam and having
several USB keyboards plugged in, the crackling on the USB audio device
is noticeably reduced with this patch.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- A bit earlier in the list of patches than in v3.

Changes in v3: None
Changes in v2:
- Commit message now says that URB giveback change needs delay change.

 drivers/usb/dwc2/hcd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 0b6ebc7fff3f..40558478a192 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -2341,9 +2341,7 @@ void dwc2_host_complete(struct dwc2_hsotg *hsotg, struct dwc2_qtd *qtd,
 	kfree(qtd->urb);
 	qtd->urb = NULL;
 
-	spin_unlock(&hsotg->lock);
 	usb_hcd_giveback_urb(dwc2_hsotg_to_hcd(hsotg), urb, status);
-	spin_lock(&hsotg->lock);
 }
 
 /*
@@ -2964,7 +2962,7 @@ static struct hc_driver dwc2_hc_driver = {
 	.hcd_priv_size = sizeof(struct wrapper_priv_data),
 
 	.irq = _dwc2_hcd_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	.start = _dwc2_hcd_start,
 	.stop = _dwc2_hcd_stop,
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 10/22] usb: dwc2: host: Properly set the HFIR
  2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
                   ` (8 preceding siblings ...)
  2016-01-29  2:20 ` [PATCH v6 09/22] usb: dwc2: host: Giveback URB in tasklet context Douglas Anderson
@ 2016-01-29  2:20 ` Douglas Anderson
  2016-01-31  9:23     ` Kever Yang
  2016-01-29  2:20   ` Douglas Anderson
                   ` (12 subsequent siblings)
  22 siblings, 1 reply; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

In case you didn't spot it, the difference is the "- 1".

Let's add the "- 1" to match the newest user manual.  It's presumed that
the "- 1" should have always been there and that this was always a
documentation error.  If some hardware needs the "- 1" and other
hardware doesn't, we'll have to add a configuration parameter for it in
the future.
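
As a worked example of the formula (a sketch, not the driver code), the
helper below computes FRINT from the frame time in microseconds and the
PHY clock in MHz.  The name frint_for() is made up, and the 60, 30 and
48 MHz clocks in the usage note are example values I picked, not
something taken from the databook text above.

```c
#include <stdint.h>

/*
 * FRINT per the newer databook text quoted above:
 * (frame time in us) * (PHY clock in MHz) - 1.
 */
static uint32_t frint_for(uint32_t frame_us, uint32_t phy_clock_mhz)
{
	return frame_us * phy_clock_mhz - 1;
}
```

With those example clocks, frint_for(125, 60) gives 7499,
frint_for(125, 30) gives 3749 and frint_for(1000, 48) gives 47999,
where the old formula would have produced 7500, 3750 and 48000.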

I checked things before and after this patch on rk3288 using a Total
Phase Beagle 5000 analyzer.

Before this patch, a low speed mouse shows constant Frame Timing Jitter
errors.  After this patch the errors have gone away.

Before this patch SOF packets move forward about 1 us per 4 ms.  After
this patch the SOF packets move backward about 1 us per 255 ms.  Some
specific SOF timestamps from the analyzer are below.

Before:
  6.603.790
  6.603.916
  6.604.041
  6.604.166
  ...
  6.607.541
  6.607.667
  6.607.792
  6.607.917
  ...
  6.611.417
  6.611.543
  6.611.668
  6.611.793

After:
  6.215.159
  6.215.284
  6.215.408
  6.215.533
  6.215.658
  ...
  6.470.658
  6.470.783
  6.470.907
  ...
  6.726.032
  6.726.157
  6.726.281
  6.726.406
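
To double-check the drift figures against the listings, here is a small
sketch that measures how far two SOF timestamps are from a whole number
of nominal 125 us microframes.  It assumes the analyzer format is
seconds.milliseconds.microseconds, so "6.603.790" is 6603790 us;
sof_drift_us() is a made-up helper name.

```c
/* Nominal high-speed microframe period in microseconds. */
#define UFRAME_US 125

/*
 * Drift between two SOF timestamps (in us): elapsed time minus the
 * nearest whole number of nominal microframes.  Positive means the SOFs
 * move forward (arrive late); negative means they move backward.
 * Assumes t1 >= t0 and a drift well under half a microframe.
 */
static long sof_drift_us(long t0_us, long t1_us)
{
	long elapsed = t1_us - t0_us;
	long uframes = (elapsed + UFRAME_US / 2) / UFRAME_US;

	return elapsed - uframes * UFRAME_US;
}
```

Using the first timestamp of each group: before the patch,
sof_drift_us(6603790, 6607541) is +1 over 3751 us (roughly 1 us per
4 ms); after the patch, sof_drift_us(6215159, 6470658) is -1 over
255499 us (roughly 1 us per 255 ms).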

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
---
Changes in v6:
- Incorporated Properly set the HFIR patch to big series in v6
- Add Heiko's Tested-by.

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index ed73b26818c0..a5db20f12ee4 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -2245,10 +2245,10 @@ u32 dwc2_calc_frame_interval(struct dwc2_hsotg *hsotg)
 
 	if ((hprt0 & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT == HPRT0_SPD_HIGH_SPEED)
 		/* High speed case */
-		return 125 * clock;
+		return 125 * clock - 1;
 	else
 		/* FS/LS case */
-		return 1000 * clock;
+		return 1000 * clock - 1;
 }
 
 /**
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 11/22] usb: dwc2: host: There's not really a TT for the root hub
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

I find that when I plug a full speed (NOT high speed) hub into a dwc2
port and then I plug a bunch of devices into that full speed hub that
dwc2 goes bat guano crazy.  Specifically, it just spews errors like this
in the console:
  usb usb1: clear tt 1 (9043) error -22

The specific test case I used looks like this:
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc2/1p, 480M
    |__ Port 1: Dev 17, If 0, Class=Hub, Driver=hub/4p, 12M
        |__ Port 2: Dev 19, If 0, ..., Driver=usbhid, 1.5M
        |__ Port 4: Dev 20, If 0, ..., Driver=usbhid, 12M
        |__ Port 4: Dev 20, If 1, ..., Driver=usbhid, 12M
        |__ Port 4: Dev 20, If 2, ..., Driver=usbhid, 12M

Showing VID/PID:
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 017: ID 03eb:3301 Atmel Corp. at43301 4-Port Hub
 Bus 001 Device 020: ID 045e:0745 Microsoft Corp. Nano Transceiver ...
 Bus 001 Device 019: ID 046d:c404 Logitech, Inc. TrackMan Wheel

I spent a bunch of time trying to figure out why there are errors to
begin with.  I believe that the issue may be a hardware issue where the
transceiver sometimes accidentally sends a PREAMBLE packet if you send a
packet to a full speed device right after one to a low speed device.
Luckily the USB driver retries and the second time things work OK.

In any case, things kinda seem to work despite the errors, except for
the "clear tt" spew mucking up my console.  Chalk it up as a win for
retries and robust protocols.

So getting back to the "clear tt" problem, it appears that we get those
because there's not actually a TT here to clear.  It's my understanding
that when dwc2 operates in low speed or full speed mode that there's no
real TT out there.  That makes all these attempts to "clear the TT"
somewhat meaningless and also causes the spew in the log.

Let's just skip all the useless TT clears.  Eventually we should root
cause the errors, but even if we do, this is still a proper fix and is
likely to avoid the "clear tt" error in the future.
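
The decision boils down to something like the sketch below.  This is a
userspace model with trimmed-down, hypothetical structures (model_hub,
model_tt, model_dev) and a made-up helper name; the real function also
looks at the URB status before clearing anything.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the USB core structures. */
struct model_hub { int id; };
struct model_tt { struct model_hub *hub; };
struct model_dev { struct model_tt *tt; };

/*
 * Mirrors the early returns in the patch: only a TT that lives in a
 * real external hub is worth clearing.
 */
static bool needs_tt_clear(const struct model_dev *dev,
			   const struct model_hub *root_hub)
{
	if (!dev || !dev->tt)
		return false;
	if (dev->tt->hub == root_hub)
		return false;
	return true;
}
```

A device whose "TT" points back at the root hub is skipped, while a
device behind a real high speed hub still gets its TT buffer cleared.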

Note that hooking up a Full Speed USB Audio Device (Jabra 510) to this
same hub with the keyboard / trackball shows that even audio works over
this janky connection.  As a point to note, this particular change (skip
bogus TT clears) compared to just commenting out the dev_err() in
hub_tt_work() actually produces better audio.

Note: don't ask me where I got a full speed USB hub or whether the
massive amount of dust that accumulated on it while it was in my junk
box affected its functionality.  Just smile and nod.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
Changes in v6:
- There's not really a TT for the root hub new for v6

Changes in v5: None
Changes in v4: None
Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd_intr.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index 5d25a5ec9736..fe44870f84eb 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -87,6 +87,7 @@ static void dwc2_hc_handle_tt_clear(struct dwc2_hsotg *hsotg,
 				    struct dwc2_host_chan *chan,
 				    struct dwc2_qtd *qtd)
 {
+	struct usb_device *root_hub = dwc2_hsotg_to_hcd(hsotg)->self.root_hub;
 	struct urb *usb_urb;
 
 	if (!chan->qh)
@@ -102,6 +103,15 @@ static void dwc2_hc_handle_tt_clear(struct dwc2_hsotg *hsotg,
 	if (!usb_urb || !usb_urb->dev || !usb_urb->dev->tt)
 		return;
 
+	/*
+	 * The root hub doesn't really have a TT, but Linux thinks it
+	 * does because how could you have a "high speed hub" that
+	 * talks directly to low speed devices without a TT?
+	 * It's all lies.  Lies, I tell you.
+	 */
+	if (usb_urb->dev->tt->hub == root_hub)
+		return;
+
 	if (qtd->urb->status != -EPIPE && qtd->urb->status != -EREMOTEIO) {
 		chan->qh->tt_buffer_dirty = 1;
 		if (usb_hub_clear_tt_buffer(usb_urb))
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 12/22] usb: dwc2: host: Use periodic interrupt even with DMA
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

The old code in dwc2_process_periodic_channels() would only enable the
"periodic empty" interrupt if we weren't using DMA.  That wasn't right
since we can still get into cases where we have small FIFOs even on
systems that have DMA (the rk3288 is a prime example).

Let's always enable/disable the "periodic empty" when appropriate.  As
part of this:

* Always call dwc2_process_periodic_channels() even if there's nothing
  in periodic_sched_assigned (we move the queue empty check so we still
  avoid the extra work).  That will make extra certain that we will
  properly disable the "periodic empty" interrupt even if there's
  nothing queued up.

* Move the enable of "periodic empty" due to non-empty
  periodic_sched_assigned to be for slave mode (non-DMA mode) only.
  Presumably this was the original intention of the check for DMA since
  it seems to match the comments above where in slave mode we leave
  things on the assigned queue.
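
Put as plain logic, the new enable/disable decision looks roughly like
the sketch below.  It is a model, not the driver code: want_ptxfemp()
and update_gintmsk() are made-up helpers, MODEL_PTXFEMP is a placeholder
bit, and the register read/write is replaced by passing the mask around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for illustration; see the driver's hw.h for the real one. */
#define MODEL_PTXFEMP (UINT32_C(1) << 26)

/* Should the "periodic empty" interrupt be enabled right now? */
static bool want_ptxfemp(bool no_queue_space, bool no_fifo_space,
			 bool dma_enabled, bool assigned_list_empty)
{
	return no_queue_space || no_fifo_space ||
	       (!dma_enabled && !assigned_list_empty);
}

/*
 * Computes the new mask; *needs_write says whether the register would
 * actually have to be written (the patch skips redundant writes).
 */
static uint32_t update_gintmsk(uint32_t gintmsk, bool want, bool *needs_write)
{
	uint32_t updated = want ? (gintmsk | MODEL_PTXFEMP)
				: (gintmsk & ~MODEL_PTXFEMP);

	*needs_write = (updated != gintmsk);
	return updated;
}
```

Running out of queue or FIFO space enables the interrupt regardless of
DMA, while a non-empty assigned list only counts in slave mode.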

Note that even before this change slave mode didn't work for me, so I
can't say for sure that my understanding of slave mode is correct.
However, this shouldn't change anything for slave mode so if slave mode
worked for someone in the past it ought to still work.

With this change, I no longer get constant misses reported by my other
debugging code (and with future patches) when I've got:
* Rockchip rk3288 Chromebook, using port ff540000
  -> Pluggable 7-port Hub with Charging (powered)
     -> Microsoft Wireless Keyboard 2000 in port 1.
     -> Das Keyboard in port 2.
     -> Jabra Speaker in port 3
     -> Logitech, Inc. Webcam C600 in port 4
     -> Microsoft Sidewinder X6 Keyboard in port 5

...and I'm playing music on the USB speaker and capturing video from the
webcam.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Use periodic interrupt even with DMA new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd.c | 71 +++++++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 39 deletions(-)

diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 40558478a192..fd731347daf7 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -1109,10 +1109,14 @@ static void dwc2_process_periodic_channels(struct dwc2_hsotg *hsotg)
 	u32 fspcavail;
 	u32 gintmsk;
 	int status;
-	int no_queue_space = 0;
-	int no_fifo_space = 0;
+	bool no_queue_space = false;
+	bool no_fifo_space = false;
 	u32 qspcavail;
 
+	/* If empty list then just adjust interrupt enables */
+	if (list_empty(&hsotg->periodic_sched_assigned))
+		goto exit;
+
 	if (dbg_perio())
 		dev_vdbg(hsotg->dev, "Queue periodic transactions\n");
 
@@ -1190,42 +1194,32 @@ static void dwc2_process_periodic_channels(struct dwc2_hsotg *hsotg)
 		}
 	}
 
-	if (hsotg->core_params->dma_enable <= 0) {
-		tx_status = dwc2_readl(hsotg->regs + HPTXSTS);
-		qspcavail = (tx_status & TXSTS_QSPCAVAIL_MASK) >>
-			    TXSTS_QSPCAVAIL_SHIFT;
-		fspcavail = (tx_status & TXSTS_FSPCAVAIL_MASK) >>
-			    TXSTS_FSPCAVAIL_SHIFT;
-		if (dbg_perio()) {
-			dev_vdbg(hsotg->dev,
-				 "  P Tx Req Queue Space Avail (after queue): %d\n",
-				 qspcavail);
-			dev_vdbg(hsotg->dev,
-				 "  P Tx FIFO Space Avail (after queue): %d\n",
-				 fspcavail);
-		}
-
-		if (!list_empty(&hsotg->periodic_sched_assigned) ||
-		    no_queue_space || no_fifo_space) {
-			/*
-			 * May need to queue more transactions as the request
-			 * queue or Tx FIFO empties. Enable the periodic Tx
-			 * FIFO empty interrupt. (Always use the half-empty
-			 * level to ensure that new requests are loaded as
-			 * soon as possible.)
-			 */
-			gintmsk = dwc2_readl(hsotg->regs + GINTMSK);
+exit:
+	if (no_queue_space || no_fifo_space ||
+	    (hsotg->core_params->dma_enable <= 0 &&
+	     !list_empty(&hsotg->periodic_sched_assigned))) {
+		/*
+		 * May need to queue more transactions as the request
+		 * queue or Tx FIFO empties. Enable the periodic Tx
+		 * FIFO empty interrupt. (Always use the half-empty
+		 * level to ensure that new requests are loaded as
+		 * soon as possible.)
+		 */
+		gintmsk = dwc2_readl(hsotg->regs + GINTMSK);
+		if (!(gintmsk & GINTSTS_PTXFEMP)) {
 			gintmsk |= GINTSTS_PTXFEMP;
 			dwc2_writel(gintmsk, hsotg->regs + GINTMSK);
-		} else {
-			/*
-			 * Disable the Tx FIFO empty interrupt since there are
-			 * no more transactions that need to be queued right
-			 * now. This function is called from interrupt
-			 * handlers to queue more transactions as transfer
-			 * states change.
-			 */
-			gintmsk = dwc2_readl(hsotg->regs + GINTMSK);
+		}
+	} else {
+		/*
+		 * Disable the Tx FIFO empty interrupt since there are
+		 * no more transactions that need to be queued right
+		 * now. This function is called from interrupt
+		 * handlers to queue more transactions as transfer
+		 * states change.
+		 */
+		gintmsk = dwc2_readl(hsotg->regs + GINTMSK);
+		if (gintmsk & GINTSTS_PTXFEMP) {
 			gintmsk &= ~GINTSTS_PTXFEMP;
 			dwc2_writel(gintmsk, hsotg->regs + GINTMSK);
 		}
@@ -1372,9 +1366,8 @@ void dwc2_hcd_queue_transactions(struct dwc2_hsotg *hsotg,
 	dev_vdbg(hsotg->dev, "Queue Transactions\n");
 #endif
 	/* Process host channels associated with periodic transfers */
-	if ((tr_type == DWC2_TRANSACTION_PERIODIC ||
-	     tr_type == DWC2_TRANSACTION_ALL) &&
-	    !list_empty(&hsotg->periodic_sched_assigned))
+	if (tr_type == DWC2_TRANSACTION_PERIODIC ||
+	    tr_type == DWC2_TRANSACTION_ALL)
 		dwc2_process_periodic_channels(hsotg);
 
 	/* Process host channels associated with non-periodic transfers */
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 13/22] usb: dwc2: host: Rename some fields in struct dwc2_qh
  2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
                   ` (11 preceding siblings ...)
  2016-01-29  2:20   ` Douglas Anderson
@ 2016-01-29  2:20 ` Douglas Anderson
  2016-01-29  2:20   ` Douglas Anderson
                   ` (9 subsequent siblings)
  22 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

This no-op change just does some renames to simplify a future patch.

1. The "interval" field is renamed to "host_interval" to make it more
   obvious that this interval may be 8 times the interval that the
   device sees (if we're doing split transactions).  A future patch will
   also add the "device_interval" field.
2. The "usecs" field is renamed to "host_us", again to make it more
   obvious that this is the time for the transaction as seen by the
   host.  For split transactions the device may see a much longer
   transaction time.  A future patch will also add "device_us".
3. The "sched_frame" field is renamed to "next_active_frame".  The name
   "sched_frame" kept confusing me because it felt like something more
   permanent (the QH's reservation or something).  The name
   "next_active_frame" makes it more obvious that this field is
   constantly changing.
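
To make items 1 and 3 concrete, here is a minimal standalone sketch of
the scaling that dwc2_qh_init() applies to the renamed fields.  The
struct, the function name, and the two boolean parameters are
hypothetical stand-ins for illustration, not the driver's real types;
only the "*= 8" and "|= 0x7" steps are taken from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two renamed fields of struct dwc2_qh. */
struct example_qh {
	uint16_t host_interval;     /* interval as seen by the host */
	uint16_t next_active_frame; /* (micro)frame before the next bus activity */
};

/*
 * Assumed helper, not part of the driver: start from the URB interval
 * and the current frame number, then scale for the case of a low/full
 * speed device reached through a high-speed port.
 */
static void example_qh_set_interval(struct example_qh *qh,
				    uint16_t urb_interval,
				    uint16_t frame_number,
				    bool host_is_high_speed,
				    bool dev_is_low_or_full_speed)
{
	qh->host_interval = urb_interval;
	qh->next_active_frame = frame_number;

	if (host_is_high_speed && dev_is_low_or_full_speed) {
		/* The host counts microframes: 8 per full-speed frame. */
		qh->host_interval *= 8;
		/* Point at the last microframe of the current frame. */
		qh->next_active_frame |= 0x7;
	}
}
```

With a device interval of 4 frames behind a high-speed port, the host
interval in this sketch becomes 32 microframes, which is why a single
"interval" name was ambiguous.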

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Rename some fields in struct dwc2_qh new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd.h       |  20 ++++----
 drivers/usb/dwc2/hcd_ddma.c  |  37 ++++++++-------
 drivers/usb/dwc2/hcd_intr.c  |  10 ++--
 drivers/usb/dwc2/hcd_queue.c | 107 ++++++++++++++++++++++---------------------
 4 files changed, 92 insertions(+), 82 deletions(-)

diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 79473ea35bd6..10c35585a2bd 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -236,10 +236,14 @@ enum dwc2_transaction_type {
  * @do_split:           Full/low speed endpoint on high-speed hub requires split
  * @td_first:           Index of first activated isochronous transfer descriptor
  * @td_last:            Index of last activated isochronous transfer descriptor
- * @usecs:              Bandwidth in microseconds per (micro)frame
- * @interval:           Interval between transfers in (micro)frames
- * @sched_frame:        (Micro)frame to initialize a periodic transfer.
- *                      The transfer executes in the following (micro)frame.
+ * @host_us:            Bandwidth in microseconds per transfer as seen by host
+ * @host_interval:      Interval between transfers as seen by the host.  If
+ *                      the host is high speed and the device is low speed this
+ *                      will be 8 times device interval.
+ * @next_active_frame:  (Micro)frame before we next need to put something on
+ *                      the bus.  We'll move the qh to active here.  If the
+ *                      host is in high speed mode this will be a uframe.  If
+ *                      the host is in low speed mode this will be a full frame.
  * @frame_usecs:        Internal variable used by the microframe scheduler
  * @start_split_frame:  (Micro)frame at which last start split was initialized
  * @ntd:                Actual number of transfer descriptors in a list
@@ -272,9 +276,9 @@ struct dwc2_qh {
 	u8 do_split;
 	u8 td_first;
 	u8 td_last;
-	u16 usecs;
-	u16 interval;
-	u16 sched_frame;
+	u16 host_us;
+	u16 host_interval;
+	u16 next_active_frame;
 	u16 frame_usecs[8];
 	u16 start_split_frame;
 	u16 ntd;
@@ -651,7 +655,7 @@ static inline u16 dwc2_hcd_get_ep_bandwidth(struct dwc2_hsotg *hsotg,
 		return 0;
 	}
 
-	return qh->usecs;
+	return qh->host_us;
 }
 
 extern void dwc2_hcd_save_data_toggle(struct dwc2_hsotg *hsotg,
diff --git a/drivers/usb/dwc2/hcd_ddma.c b/drivers/usb/dwc2/hcd_ddma.c
index 16b261cfa92d..26b00270ca0b 100644
--- a/drivers/usb/dwc2/hcd_ddma.c
+++ b/drivers/usb/dwc2/hcd_ddma.c
@@ -81,7 +81,7 @@ static u16 dwc2_max_desc_num(struct dwc2_qh *qh)
 static u16 dwc2_frame_incr_val(struct dwc2_qh *qh)
 {
 	return qh->dev_speed == USB_SPEED_HIGH ?
-	       (qh->interval + 8 - 1) / 8 : qh->interval;
+	       (qh->host_interval + 8 - 1) / 8 : qh->host_interval;
 }
 
 static int dwc2_desc_list_alloc(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
@@ -252,7 +252,7 @@ static void dwc2_update_frame_list(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	chan = qh->channel;
 	inc = dwc2_frame_incr_val(qh);
 	if (qh->ep_type == USB_ENDPOINT_XFER_ISOC)
-		i = dwc2_frame_list_idx(qh->sched_frame);
+		i = dwc2_frame_list_idx(qh->next_active_frame);
 	else
 		i = 0;
 
@@ -278,13 +278,13 @@ static void dwc2_update_frame_list(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		return;
 
 	chan->schinfo = 0;
-	if (chan->speed == USB_SPEED_HIGH && qh->interval) {
+	if (chan->speed == USB_SPEED_HIGH && qh->host_interval) {
 		j = 1;
 		/* TODO - check this */
-		inc = (8 + qh->interval - 1) / qh->interval;
+		inc = (8 + qh->host_interval - 1) / qh->host_interval;
 		for (i = 0; i < inc; i++) {
 			chan->schinfo |= j;
-			j = j << qh->interval;
+			j = j << qh->host_interval;
 		}
 	} else {
 		chan->schinfo = 0xff;
@@ -431,7 +431,10 @@ static u16 dwc2_calc_starting_frame(struct dwc2_hsotg *hsotg,
 
 	hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
 
-	/* sched_frame is always frame number (not uFrame) both in FS and HS! */
+	/*
+	 * next_active_frame is always frame number (not uFrame) both in FS
+	 * and HS!
+	 */
 
 	/*
 	 * skip_frames is used to limit activated descriptors number
@@ -514,13 +517,13 @@ static u16 dwc2_recalc_initial_desc_idx(struct dwc2_hsotg *hsotg,
 		 */
 		fr_idx_tmp = dwc2_frame_list_idx(frame);
 		fr_idx = (FRLISTEN_64_SIZE +
-			  dwc2_frame_list_idx(qh->sched_frame) - fr_idx_tmp)
-			 % dwc2_frame_incr_val(qh);
+			  dwc2_frame_list_idx(qh->next_active_frame) -
+			  fr_idx_tmp) % dwc2_frame_incr_val(qh);
 		fr_idx = (fr_idx + fr_idx_tmp) % FRLISTEN_64_SIZE;
 	} else {
-		qh->sched_frame = dwc2_calc_starting_frame(hsotg, qh,
+		qh->next_active_frame = dwc2_calc_starting_frame(hsotg, qh,
 							   &skip_frames);
-		fr_idx = dwc2_frame_list_idx(qh->sched_frame);
+		fr_idx = dwc2_frame_list_idx(qh->next_active_frame);
 	}
 
 	qh->td_first = qh->td_last = dwc2_frame_to_desc_idx(qh, fr_idx);
@@ -583,7 +586,7 @@ static void dwc2_init_isoc_dma_desc(struct dwc2_hsotg *hsotg,
 	u16 next_idx;
 
 	idx = qh->td_last;
-	inc = qh->interval;
+	inc = qh->host_interval;
 	hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
 	cur_idx = dwc2_frame_list_idx(hsotg->frame_number);
 	next_idx = dwc2_desclist_idx_inc(qh->td_last, inc, qh->dev_speed);
@@ -605,11 +608,11 @@ static void dwc2_init_isoc_dma_desc(struct dwc2_hsotg *hsotg,
 		}
 	}
 
-	if (qh->interval) {
-		ntd_max = (dwc2_max_desc_num(qh) + qh->interval - 1) /
-				qh->interval;
+	if (qh->host_interval) {
+		ntd_max = (dwc2_max_desc_num(qh) + qh->host_interval - 1) /
+				qh->host_interval;
 		if (skip_frames && !qh->channel)
-			ntd_max -= skip_frames / qh->interval;
+			ntd_max -= skip_frames / qh->host_interval;
 	}
 
 	max_xfer_size = qh->dev_speed == USB_SPEED_HIGH ?
@@ -1029,7 +1032,7 @@ static void dwc2_complete_isoc_xfer_ddma(struct dwc2_hsotg *hsotg,
 							  idx);
 			if (rc < 0)
 				return;
-			idx = dwc2_desclist_idx_inc(idx, qh->interval,
+			idx = dwc2_desclist_idx_inc(idx, qh->host_interval,
 						    chan->speed);
 			if (!rc)
 				continue;
@@ -1039,7 +1042,7 @@ static void dwc2_complete_isoc_xfer_ddma(struct dwc2_hsotg *hsotg,
 
 			/* rc == DWC2_CMPL_STOP */
 
-			if (qh->interval >= 32)
+			if (qh->host_interval >= 32)
 				goto stop_scan;
 
 			qh->td_first = idx;
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index fe44870f84eb..d929db5e7f3f 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -148,9 +148,11 @@ static void dwc2_sof_intr(struct dwc2_hsotg *hsotg)
 	while (qh_entry != &hsotg->periodic_sched_inactive) {
 		qh = list_entry(qh_entry, struct dwc2_qh, qh_list_entry);
 		qh_entry = qh_entry->next;
-		if (dwc2_frame_num_le(qh->sched_frame, hsotg->frame_number)) {
-			dwc2_sch_vdbg(hsotg, "QH=%p ready fn=%04x, sch=%04x\n",
-				      qh, hsotg->frame_number, qh->sched_frame);
+		if (dwc2_frame_num_le(qh->next_active_frame,
+				      hsotg->frame_number)) {
+			dwc2_sch_vdbg(hsotg, "QH=%p ready fn=%04x, nxt=%04x\n",
+				      qh, hsotg->frame_number,
+				      qh->next_active_frame);
 
 			/*
 			 * Move QH to the ready list to be executed next
@@ -1360,7 +1362,7 @@ static void dwc2_hc_nyet_intr(struct dwc2_hsotg *hsotg,
 			int frnum = dwc2_hcd_get_frame_number(hsotg);
 
 			if (dwc2_full_frame_num(frnum) !=
-			    dwc2_full_frame_num(chan->qh->sched_frame)) {
+			    dwc2_full_frame_num(chan->qh->next_active_frame)) {
 				/*
 				 * No longer in the same full speed frame.
 				 * Treat this as a transaction error.
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index b9e4867e1afd..39f4de6279f8 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -78,7 +78,7 @@ static void dwc2_do_unreserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		list_del_init(&qh->qh_list_entry);
 
 	/* Update claimed usecs per (micro)frame */
-	hsotg->periodic_usecs -= qh->usecs;
+	hsotg->periodic_usecs -= qh->host_us;
 
 	if (hsotg->core_params->uframe_sched > 0) {
 		int i;
@@ -193,40 +193,40 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		int bytecount =
 			dwc2_hb_mult(qh->maxp) * dwc2_max_packet(qh->maxp);
 
-		qh->usecs = NS_TO_US(usb_calc_bus_time(qh->do_split ?
-				USB_SPEED_HIGH : dev_speed, qh->ep_is_in,
-				qh->ep_type == USB_ENDPOINT_XFER_ISOC,
-				bytecount));
+		qh->host_us = NS_TO_US(usb_calc_bus_time(qh->do_split ?
+			      USB_SPEED_HIGH : dev_speed, qh->ep_is_in,
+			      qh->ep_type == USB_ENDPOINT_XFER_ISOC,
+			      bytecount));
 
 		/* Ensure frame_number corresponds to the reality */
 		hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
 		/* Start in a slightly future (micro)frame */
-		qh->sched_frame = dwc2_frame_num_inc(hsotg->frame_number,
+		qh->next_active_frame = dwc2_frame_num_inc(hsotg->frame_number,
 						     SCHEDULE_SLOP);
-		qh->interval = urb->interval;
-		dwc2_sch_dbg(hsotg, "QH=%p init sch=%04x, fn=%04x, int=%#x\n",
-			     qh, qh->sched_frame, hsotg->frame_number,
-			     qh->interval);
+		qh->host_interval = urb->interval;
+		dwc2_sch_dbg(hsotg, "QH=%p init nxt=%04x, fn=%04x, int=%#x\n",
+			     qh, qh->next_active_frame, hsotg->frame_number,
+			     qh->host_interval);
 #if 0
 		/* Increase interrupt polling rate for debugging */
 		if (qh->ep_type == USB_ENDPOINT_XFER_INT)
-			qh->interval = 8;
+			qh->host_interval = 8;
 #endif
 		hprt = dwc2_readl(hsotg->regs + HPRT0);
 		prtspd = (hprt & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
 		if (prtspd == HPRT0_SPD_HIGH_SPEED &&
 		    (dev_speed == USB_SPEED_LOW ||
 		     dev_speed == USB_SPEED_FULL)) {
-			qh->interval *= 8;
-			qh->sched_frame |= 0x7;
-			qh->start_split_frame = qh->sched_frame;
+			qh->host_interval *= 8;
+			qh->next_active_frame |= 0x7;
+			qh->start_split_frame = qh->next_active_frame;
 			dwc2_sch_dbg(hsotg,
-				     "QH=%p init*8 sch=%04x, fn=%04x, int=%#x\n",
-				     qh, qh->sched_frame, hsotg->frame_number,
-				     qh->interval);
+				     "QH=%p init*8 nxt=%04x, fn=%04x, int=%#x\n",
+				     qh, qh->next_active_frame,
+				     hsotg->frame_number, qh->host_interval);
 
 		}
-		dev_dbg(hsotg->dev, "interval=%d\n", qh->interval);
+		dev_dbg(hsotg->dev, "interval=%d\n", qh->host_interval);
 	}
 
 	dev_vdbg(hsotg->dev, "DWC OTG HCD QH Initialized\n");
@@ -277,9 +277,9 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 
 	if (qh->ep_type == USB_ENDPOINT_XFER_INT) {
 		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - usecs = %d\n",
-			 qh->usecs);
+			 qh->host_us);
 		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - interval = %d\n",
-			 qh->interval);
+			 qh->host_interval);
 	}
 }
 
@@ -404,19 +404,19 @@ static int dwc2_check_periodic_bandwidth(struct dwc2_hsotg *hsotg,
 		 * High speed mode
 		 * Max periodic usecs is 80% x 125 usec = 100 usec
 		 */
-		max_claimed_usecs = 100 - qh->usecs;
+		max_claimed_usecs = 100 - qh->host_us;
 	} else {
 		/*
 		 * Full speed mode
 		 * Max periodic usecs is 90% x 1000 usec = 900 usec
 		 */
-		max_claimed_usecs = 900 - qh->usecs;
+		max_claimed_usecs = 900 - qh->host_us;
 	}
 
 	if (hsotg->periodic_usecs > max_claimed_usecs) {
 		dev_err(hsotg->dev,
 			"%s: already claimed usecs %d, required usecs %d\n",
-			__func__, hsotg->periodic_usecs, qh->usecs);
+			__func__, hsotg->periodic_usecs, qh->host_us);
 		status = -ENOSPC;
 	}
 
@@ -443,7 +443,7 @@ void dwc2_hcd_init_usecs(struct dwc2_hsotg *hsotg)
 
 static int dwc2_find_single_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
-	unsigned short utime = qh->usecs;
+	unsigned short utime = qh->host_us;
 	int i;
 
 	for (i = 0; i < 8; i++) {
@@ -462,7 +462,7 @@ static int dwc2_find_single_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
  */
 static int dwc2_find_multi_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
-	unsigned short utime = qh->usecs;
+	unsigned short utime = qh->host_us;
 	unsigned short xtime;
 	int t_left;
 	int i;
@@ -608,11 +608,11 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 
 			/* Set the new frame up */
 			if (frame >= 0) {
-				qh->sched_frame &= ~0x7;
-				qh->sched_frame |= (frame & 7);
+				qh->next_active_frame &= ~0x7;
+				qh->next_active_frame |= (frame & 7);
 				dwc2_sch_dbg(hsotg,
-					     "QH=%p sched_p sch=%04x, uf=%d\n",
-					     qh, qh->sched_frame, frame);
+					     "QH=%p sched_p nxt=%04x, uf=%d\n",
+					     qh, qh->next_active_frame, frame);
 			}
 
 			if (status > 0)
@@ -641,7 +641,7 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 			hsotg->periodic_channels++;
 
 		/* Update claimed usecs per (micro)frame */
-		hsotg->periodic_usecs += qh->usecs;
+		hsotg->periodic_usecs += qh->host_us;
 	}
 
 	qh->unreserve_pending = 0;
@@ -716,7 +716,7 @@ int dwc2_hcd_qh_add(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		/* QH already in a schedule */
 		return 0;
 
-	if (!dwc2_frame_num_le(qh->sched_frame, hsotg->frame_number) &&
+	if (!dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number) &&
 			!hsotg->frame_number) {
 		u16 new_frame;
 
@@ -725,9 +725,9 @@ int dwc2_hcd_qh_add(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		new_frame = dwc2_frame_num_inc(hsotg->frame_number,
 				SCHEDULE_SLOP);
 
-		dwc2_sch_vdbg(hsotg, "QH=%p reset sch=%04x=>%04x\n",
-			      qh, qh->sched_frame, new_frame);
-		qh->sched_frame = new_frame;
+		dwc2_sch_vdbg(hsotg, "QH=%p reset nxt=%04x=>%04x\n",
+			      qh, qh->next_active_frame, new_frame);
+		qh->next_active_frame = new_frame;
 	}
 
 	/* Add the new QH to the appropriate schedule */
@@ -793,10 +793,10 @@ static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
 				      int sched_next_periodic_split)
 {
 	u16 incr;
-	u16 old_frame = qh->sched_frame;
+	u16 old_frame = qh->next_active_frame;
 
 	if (sched_next_periodic_split) {
-		qh->sched_frame = frame_number;
+		qh->next_active_frame = frame_number;
 		incr = dwc2_frame_num_inc(qh->start_split_frame, 1);
 		if (dwc2_frame_num_le(frame_number, incr)) {
 			/*
@@ -807,23 +807,24 @@ static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
 			 */
 			if (qh->ep_type != USB_ENDPOINT_XFER_ISOC ||
 			    qh->ep_is_in != 0) {
-				qh->sched_frame =
-					dwc2_frame_num_inc(qh->sched_frame, 1);
+				qh->next_active_frame = dwc2_frame_num_inc(
+					qh->next_active_frame, 1);
 			}
 		}
 	} else {
-		qh->sched_frame = dwc2_frame_num_inc(qh->start_split_frame,
-						     qh->interval);
-		if (dwc2_frame_num_le(qh->sched_frame, frame_number))
-			qh->sched_frame = frame_number;
-		qh->sched_frame |= 0x7;
-		qh->start_split_frame = qh->sched_frame;
+		qh->next_active_frame =
+			dwc2_frame_num_inc(qh->start_split_frame,
+					   qh->host_interval);
+		if (dwc2_frame_num_le(qh->next_active_frame, frame_number))
+			qh->next_active_frame = frame_number;
+		qh->next_active_frame |= 0x7;
+		qh->start_split_frame = qh->next_active_frame;
 	}
 
-	dwc2_sch_vdbg(hsotg, "QH=%p next(%d) fn=%04x, sch=%04x=>%04x (%+d)\n",
+	dwc2_sch_vdbg(hsotg, "QH=%p next(%d) fn=%04x, nxt=%04x=>%04x (%+d)\n",
 		      qh, sched_next_periodic_split, frame_number, old_frame,
-		      qh->sched_frame,
-		      dwc2_frame_num_dec(qh->sched_frame, old_frame));
+		      qh->next_active_frame,
+		      dwc2_frame_num_dec(qh->next_active_frame, old_frame));
 }
 
 /*
@@ -861,10 +862,10 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		dwc2_sched_periodic_split(hsotg, qh, frame_number,
 					  sched_next_periodic_split);
 	} else {
-		qh->sched_frame = dwc2_frame_num_inc(qh->sched_frame,
-						     qh->interval);
-		if (dwc2_frame_num_le(qh->sched_frame, frame_number))
-			qh->sched_frame = frame_number;
+		qh->next_active_frame = dwc2_frame_num_inc(
+			qh->next_active_frame, qh->host_interval);
+		if (dwc2_frame_num_le(qh->next_active_frame, frame_number))
+			qh->next_active_frame = frame_number;
 	}
 
 	if (list_empty(&qh->qtd_list)) {
@@ -876,9 +877,9 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	 * appropriate queue
 	 */
 	if ((hsotg->core_params->uframe_sched > 0 &&
-	     dwc2_frame_num_le(qh->sched_frame, frame_number)) ||
+	     dwc2_frame_num_le(qh->next_active_frame, frame_number)) ||
 	    (hsotg->core_params->uframe_sched <= 0 &&
-	     qh->sched_frame == frame_number))
+	     qh->next_active_frame == frame_number))
 		list_move_tail(&qh->qh_list_entry,
 			       &hsotg->periodic_sched_ready);
 	else
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v6 14/22] usb: dwc2: host: Reorder things in hcd_queue.c
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

This no-op change just reorders a few functions in hcd_queue.c in order
to prepare for future changes.  Motivations here:

The functions dwc2_hcd_qh_free() and dwc2_hcd_qh_create() are exported
functions.  They are not called within the file.  That means that they
should be near the bottom so that they can easily call static helpers.

The function dwc2_qh_init() is only called by dwc2_hcd_qh_create() and
should move near the bottom with it.

The only reason that the dwc2_unreserve_timer_fn() timer function (and
its subroutine dwc2_do_unreserve()) were so high in the file was that
they needed to be above dwc2_qh_init().  Now that dwc2_qh_init() has
been moved down they can be moved down a bit.  A later patch will split
the reserve code out of dwc2_schedule_periodic() and the reserve
function should be near the unreserve function.  The reserve function
needs to be below dwc2_find_uframe() since it calls that.
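
As a rough illustration of the ordering argument (all names below are
made up and this is not code from the driver): in C a static helper has
to be declared before its first use, so keeping exported entry points
near the bottom of the file lets them call any helper above them
without forward declarations.

```c
/* Hypothetical static helper: defined first, so code below can call it. */
static int example_qh_init(int interval)
{
	return interval * 8;
}

/*
 * Hypothetical exported function: it sits near the bottom of the file
 * and calls the static helper above without a forward declaration.
 */
int example_qh_create(int interval)
{
	return example_qh_init(interval);
}
```

Swapping the two definitions would require adding a prototype for
example_qh_init() above example_qh_create().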

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Reorder things in hcd_queue.c new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd_queue.c | 600 +++++++++++++++++++++----------------------
 1 file changed, 300 insertions(+), 300 deletions(-)

diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 39f4de6279f8..8a2067bc1e62 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -57,295 +57,6 @@
 #define DWC2_UNRESERVE_DELAY (msecs_to_jiffies(5))
 
 /**
- * dwc2_do_unreserve() - Actually release the periodic reservation
- *
- * This function actually releases the periodic bandwidth that was reserved
- * by the given qh.
- *
- * @hsotg: The HCD state structure for the DWC OTG controller
- * @qh:    QH for the periodic transfer.
- */
-static void dwc2_do_unreserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
-{
-	assert_spin_locked(&hsotg->lock);
-
-	WARN_ON(!qh->unreserve_pending);
-
-	/* No more unreserve pending--we're doing it */
-	qh->unreserve_pending = false;
-
-	if (WARN_ON(!list_empty(&qh->qh_list_entry)))
-		list_del_init(&qh->qh_list_entry);
-
-	/* Update claimed usecs per (micro)frame */
-	hsotg->periodic_usecs -= qh->host_us;
-
-	if (hsotg->core_params->uframe_sched > 0) {
-		int i;
-
-		for (i = 0; i < 8; i++) {
-			hsotg->frame_usecs[i] += qh->frame_usecs[i];
-			qh->frame_usecs[i] = 0;
-		}
-	} else {
-		/* Release periodic channel reservation */
-		hsotg->periodic_channels--;
-	}
-}
-
-/**
- * dwc2_unreserve_timer_fn() - Timer function to release periodic reservation
- *
- * According to the kernel doc for usb_submit_urb() (specifically the part about
- * "Reserved Bandwidth Transfers"), we need to keep a reservation active as
- * long as a device driver keeps submitting.  Since we're using HCD_BH to give
- * back the URB we need to give the driver a little bit of time before we
- * release the reservation.  This worker is called after the appropriate
- * delay.
- *
- * @work: Pointer to a qh unreserve_work.
- */
-static void dwc2_unreserve_timer_fn(unsigned long data)
-{
-	struct dwc2_qh *qh = (struct dwc2_qh *)data;
-	struct dwc2_hsotg *hsotg = qh->hsotg;
-	unsigned long flags;
-
-	/*
-	 * Wait for the lock, or for us to be scheduled again.  We
-	 * could be scheduled again if:
-	 * - We started executing but didn't get the lock yet.
-	 * - A new reservation came in, but cancel didn't take effect
-	 *   because we already started executing.
-	 * - The timer has been kicked again.
-	 * In that case cancel and wait for the next call.
-	 */
-	while (!spin_trylock_irqsave(&hsotg->lock, flags)) {
-		if (timer_pending(&qh->unreserve_timer))
-			return;
-	}
-
-	/*
-	 * Might be no more unreserve pending if:
-	 * - We started executing but didn't get the lock yet.
-	 * - A new reservation came in, but cancel didn't take effect
-	 *   because we already started executing.
-	 *
-	 * We can't put this in the loop above because unreserve_pending needs
-	 * to be accessed under lock, so we can only check it once we got the
-	 * lock.
-	 */
-	if (qh->unreserve_pending)
-		dwc2_do_unreserve(hsotg, qh);
-
-	spin_unlock_irqrestore(&hsotg->lock, flags);
-}
-
-/**
- * dwc2_qh_init() - Initializes a QH structure
- *
- * @hsotg: The HCD state structure for the DWC OTG controller
- * @qh:    The QH to init
- * @urb:   Holds the information about the device/endpoint needed to initialize
- *         the QH
- */
-#define SCHEDULE_SLOP 10
-static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
-			 struct dwc2_hcd_urb *urb)
-{
-	int dev_speed, hub_addr, hub_port;
-	char *speed, *type;
-
-	dev_vdbg(hsotg->dev, "%s()\n", __func__);
-
-	/* Initialize QH */
-	qh->hsotg = hsotg;
-	setup_timer(&qh->unreserve_timer, dwc2_unreserve_timer_fn,
-		    (unsigned long)qh);
-	qh->ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
-	qh->ep_is_in = dwc2_hcd_is_pipe_in(&urb->pipe_info) ? 1 : 0;
-
-	qh->data_toggle = DWC2_HC_PID_DATA0;
-	qh->maxp = dwc2_hcd_get_mps(&urb->pipe_info);
-	INIT_LIST_HEAD(&qh->qtd_list);
-	INIT_LIST_HEAD(&qh->qh_list_entry);
-
-	/* FS/LS Endpoint on HS Hub, NOT virtual root hub */
-	dev_speed = dwc2_host_get_speed(hsotg, urb->priv);
-
-	dwc2_host_hub_info(hsotg, urb->priv, &hub_addr, &hub_port);
-
-	if ((dev_speed == USB_SPEED_LOW || dev_speed == USB_SPEED_FULL) &&
-	    hub_addr != 0 && hub_addr != 1) {
-		dev_vdbg(hsotg->dev,
-			 "QH init: EP %d: TT found at hub addr %d, for port %d\n",
-			 dwc2_hcd_get_ep_num(&urb->pipe_info), hub_addr,
-			 hub_port);
-		qh->do_split = 1;
-	}
-
-	if (qh->ep_type == USB_ENDPOINT_XFER_INT ||
-	    qh->ep_type == USB_ENDPOINT_XFER_ISOC) {
-		/* Compute scheduling parameters once and save them */
-		u32 hprt, prtspd;
-
-		/* Todo: Account for split transfers in the bus time */
-		int bytecount =
-			dwc2_hb_mult(qh->maxp) * dwc2_max_packet(qh->maxp);
-
-		qh->host_us = NS_TO_US(usb_calc_bus_time(qh->do_split ?
-			      USB_SPEED_HIGH : dev_speed, qh->ep_is_in,
-			      qh->ep_type == USB_ENDPOINT_XFER_ISOC,
-			      bytecount));
-
-		/* Ensure frame_number corresponds to the reality */
-		hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
-		/* Start in a slightly future (micro)frame */
-		qh->next_active_frame = dwc2_frame_num_inc(hsotg->frame_number,
-						     SCHEDULE_SLOP);
-		qh->host_interval = urb->interval;
-		dwc2_sch_dbg(hsotg, "QH=%p init nxt=%04x, fn=%04x, int=%#x\n",
-			     qh, qh->next_active_frame, hsotg->frame_number,
-			     qh->host_interval);
-#if 0
-		/* Increase interrupt polling rate for debugging */
-		if (qh->ep_type == USB_ENDPOINT_XFER_INT)
-			qh->host_interval = 8;
-#endif
-		hprt = dwc2_readl(hsotg->regs + HPRT0);
-		prtspd = (hprt & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
-		if (prtspd == HPRT0_SPD_HIGH_SPEED &&
-		    (dev_speed == USB_SPEED_LOW ||
-		     dev_speed == USB_SPEED_FULL)) {
-			qh->host_interval *= 8;
-			qh->next_active_frame |= 0x7;
-			qh->start_split_frame = qh->next_active_frame;
-			dwc2_sch_dbg(hsotg,
-				     "QH=%p init*8 nxt=%04x, fn=%04x, int=%#x\n",
-				     qh, qh->next_active_frame,
-				     hsotg->frame_number, qh->host_interval);
-
-		}
-		dev_dbg(hsotg->dev, "interval=%d\n", qh->host_interval);
-	}
-
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH Initialized\n");
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - qh = %p\n", qh);
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Device Address = %d\n",
-		 dwc2_hcd_get_dev_addr(&urb->pipe_info));
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Endpoint %d, %s\n",
-		 dwc2_hcd_get_ep_num(&urb->pipe_info),
-		 dwc2_hcd_is_pipe_in(&urb->pipe_info) ? "IN" : "OUT");
-
-	qh->dev_speed = dev_speed;
-
-	switch (dev_speed) {
-	case USB_SPEED_LOW:
-		speed = "low";
-		break;
-	case USB_SPEED_FULL:
-		speed = "full";
-		break;
-	case USB_SPEED_HIGH:
-		speed = "high";
-		break;
-	default:
-		speed = "?";
-		break;
-	}
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Speed = %s\n", speed);
-
-	switch (qh->ep_type) {
-	case USB_ENDPOINT_XFER_ISOC:
-		type = "isochronous";
-		break;
-	case USB_ENDPOINT_XFER_INT:
-		type = "interrupt";
-		break;
-	case USB_ENDPOINT_XFER_CONTROL:
-		type = "control";
-		break;
-	case USB_ENDPOINT_XFER_BULK:
-		type = "bulk";
-		break;
-	default:
-		type = "?";
-		break;
-	}
-
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Type = %s\n", type);
-
-	if (qh->ep_type == USB_ENDPOINT_XFER_INT) {
-		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - usecs = %d\n",
-			 qh->host_us);
-		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - interval = %d\n",
-			 qh->host_interval);
-	}
-}
-
-/**
- * dwc2_hcd_qh_create() - Allocates and initializes a QH
- *
- * @hsotg:        The HCD state structure for the DWC OTG controller
- * @urb:          Holds the information about the device/endpoint needed
- *                to initialize the QH
- * @atomic_alloc: Flag to do atomic allocation if needed
- *
- * Return: Pointer to the newly allocated QH, or NULL on error
- */
-struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
-					  struct dwc2_hcd_urb *urb,
-					  gfp_t mem_flags)
-{
-	struct dwc2_qh *qh;
-
-	if (!urb->priv)
-		return NULL;
-
-	/* Allocate memory */
-	qh = kzalloc(sizeof(*qh), mem_flags);
-	if (!qh)
-		return NULL;
-
-	dwc2_qh_init(hsotg, qh, urb);
-
-	if (hsotg->core_params->dma_desc_enable > 0 &&
-	    dwc2_hcd_qh_init_ddma(hsotg, qh, mem_flags) < 0) {
-		dwc2_hcd_qh_free(hsotg, qh);
-		return NULL;
-	}
-
-	return qh;
-}
-
-/**
- * dwc2_hcd_qh_free() - Frees the QH
- *
- * @hsotg: HCD instance
- * @qh:    The QH to free
- *
- * QH should already be removed from the list. QTD list should already be empty
- * if called from URB Dequeue.
- *
- * Must NOT be called with interrupt disabled or spinlock held
- */
-void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
-{
-	/* Make sure any unreserve work is finished. */
-	if (del_timer_sync(&qh->unreserve_timer)) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&hsotg->lock, flags);
-		dwc2_do_unreserve(hsotg, qh);
-		spin_unlock_irqrestore(&hsotg->lock, flags);
-	}
-
-	if (qh->desc_list)
-		dwc2_hcd_qh_free_ddma(hsotg, qh);
-	kfree(qh);
-}
-
-/**
  * dwc2_periodic_channel_available() - Checks that a channel is available for a
  * periodic transfer
  *
@@ -518,19 +229,104 @@ static int dwc2_find_multi_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 
 static int dwc2_find_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
-	int ret;
+	int ret;
+
+	if (qh->dev_speed == USB_SPEED_HIGH) {
+		/* if this is a hs transaction we need a full frame */
+		ret = dwc2_find_single_uframe(hsotg, qh);
+	} else {
+		/*
+		 * if this is a fs transaction we may need a sequence
+		 * of frames
+		 */
+		ret = dwc2_find_multi_uframe(hsotg, qh);
+	}
+	return ret;
+}
+
+/**
+ * dwc2_do_unreserve() - Actually release the periodic reservation
+ *
+ * This function actually releases the periodic bandwidth that was reserved
+ * by the given qh.
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    QH for the periodic transfer.
+ */
+static void dwc2_do_unreserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	assert_spin_locked(&hsotg->lock);
+
+	WARN_ON(!qh->unreserve_pending);
+
+	/* No more unreserve pending--we're doing it */
+	qh->unreserve_pending = false;
+
+	if (WARN_ON(!list_empty(&qh->qh_list_entry)))
+		list_del_init(&qh->qh_list_entry);
+
+	/* Update claimed usecs per (micro)frame */
+	hsotg->periodic_usecs -= qh->host_us;
+
+	if (hsotg->core_params->uframe_sched > 0) {
+		int i;
+
+		for (i = 0; i < 8; i++) {
+			hsotg->frame_usecs[i] += qh->frame_usecs[i];
+			qh->frame_usecs[i] = 0;
+		}
+	} else {
+		/* Release periodic channel reservation */
+		hsotg->periodic_channels--;
+	}
+}
+
+/**
+ * dwc2_unreserve_timer_fn() - Timer function to release periodic reservation
+ *
+ * According to the kernel doc for usb_submit_urb() (specifically the part about
+ * "Reserved Bandwidth Transfers"), we need to keep a reservation active as
+ * long as a device driver keeps submitting.  Since we're using HCD_BH to give
+ * back the URB we need to give the driver a little bit of time before we
+ * release the reservation.  This worker is called after the appropriate
+ * delay.
+ *
+ * @work: Pointer to a qh unreserve_work.
+ */
+static void dwc2_unreserve_timer_fn(unsigned long data)
+{
+	struct dwc2_qh *qh = (struct dwc2_qh *)data;
+	struct dwc2_hsotg *hsotg = qh->hsotg;
+	unsigned long flags;
 
-	if (qh->dev_speed == USB_SPEED_HIGH) {
-		/* if this is a hs transaction we need a full frame */
-		ret = dwc2_find_single_uframe(hsotg, qh);
-	} else {
-		/*
-		 * if this is a fs transaction we may need a sequence
-		 * of frames
-		 */
-		ret = dwc2_find_multi_uframe(hsotg, qh);
+	/*
+	 * Wait for the lock, or for us to be scheduled again.  We
+	 * could be scheduled again if:
+	 * - We started executing but didn't get the lock yet.
+	 * - A new reservation came in, but cancel didn't take effect
+	 *   because we already started executing.
+	 * - The timer has been kicked again.
+	 * In that case cancel and wait for the next call.
+	 */
+	while (!spin_trylock_irqsave(&hsotg->lock, flags)) {
+		if (timer_pending(&qh->unreserve_timer))
+			return;
 	}
-	return ret;
+
+	/*
+	 * Might be no more unreserve pending if:
+	 * - We started executing but didn't get the lock yet.
+	 * - A new reservation came in, but cancel didn't take effect
+	 *   because we already started executing.
+	 *
+	 * We can't put this in the loop above because unreserve_pending needs
+	 * to be accessed under lock, so we can only check it once we got the
+	 * lock.
+	 */
+	if (qh->unreserve_pending)
+		dwc2_do_unreserve(hsotg, qh);
+
+	spin_unlock_irqrestore(&hsotg->lock, flags);
 }
 
 /**
@@ -695,6 +491,210 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg,
 }
 
 /**
+ * dwc2_qh_init() - Initializes a QH structure
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    The QH to init
+ * @urb:   Holds the information about the device/endpoint needed to initialize
+ *         the QH
+ */
+#define SCHEDULE_SLOP 10
+static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
+			 struct dwc2_hcd_urb *urb)
+{
+	int dev_speed, hub_addr, hub_port;
+	char *speed, *type;
+
+	dev_vdbg(hsotg->dev, "%s()\n", __func__);
+
+	/* Initialize QH */
+	qh->hsotg = hsotg;
+	setup_timer(&qh->unreserve_timer, dwc2_unreserve_timer_fn,
+		    (unsigned long)qh);
+	qh->ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
+	qh->ep_is_in = dwc2_hcd_is_pipe_in(&urb->pipe_info) ? 1 : 0;
+
+	qh->data_toggle = DWC2_HC_PID_DATA0;
+	qh->maxp = dwc2_hcd_get_mps(&urb->pipe_info);
+	INIT_LIST_HEAD(&qh->qtd_list);
+	INIT_LIST_HEAD(&qh->qh_list_entry);
+
+	/* FS/LS Endpoint on HS Hub, NOT virtual root hub */
+	dev_speed = dwc2_host_get_speed(hsotg, urb->priv);
+
+	dwc2_host_hub_info(hsotg, urb->priv, &hub_addr, &hub_port);
+
+	if ((dev_speed == USB_SPEED_LOW || dev_speed == USB_SPEED_FULL) &&
+	    hub_addr != 0 && hub_addr != 1) {
+		dev_vdbg(hsotg->dev,
+			 "QH init: EP %d: TT found at hub addr %d, for port %d\n",
+			 dwc2_hcd_get_ep_num(&urb->pipe_info), hub_addr,
+			 hub_port);
+		qh->do_split = 1;
+	}
+
+	if (qh->ep_type == USB_ENDPOINT_XFER_INT ||
+	    qh->ep_type == USB_ENDPOINT_XFER_ISOC) {
+		/* Compute scheduling parameters once and save them */
+		u32 hprt, prtspd;
+
+		/* Todo: Account for split transfers in the bus time */
+		int bytecount =
+			dwc2_hb_mult(qh->maxp) * dwc2_max_packet(qh->maxp);
+
+		qh->host_us = NS_TO_US(usb_calc_bus_time(qh->do_split ?
+			      USB_SPEED_HIGH : dev_speed, qh->ep_is_in,
+			      qh->ep_type == USB_ENDPOINT_XFER_ISOC,
+			      bytecount));
+
+		/* Ensure frame_number corresponds to the reality */
+		hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
+		/* Start in a slightly future (micro)frame */
+		qh->next_active_frame = dwc2_frame_num_inc(hsotg->frame_number,
+						     SCHEDULE_SLOP);
+		qh->host_interval = urb->interval;
+		dwc2_sch_dbg(hsotg, "QH=%p init nxt=%04x, fn=%04x, int=%#x\n",
+			     qh, qh->next_active_frame, hsotg->frame_number,
+			     qh->host_interval);
+#if 0
+		/* Increase interrupt polling rate for debugging */
+		if (qh->ep_type == USB_ENDPOINT_XFER_INT)
+			qh->host_interval = 8;
+#endif
+		hprt = dwc2_readl(hsotg->regs + HPRT0);
+		prtspd = (hprt & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
+		if (prtspd == HPRT0_SPD_HIGH_SPEED &&
+		    (dev_speed == USB_SPEED_LOW ||
+		     dev_speed == USB_SPEED_FULL)) {
+			qh->host_interval *= 8;
+			qh->next_active_frame |= 0x7;
+			qh->start_split_frame = qh->next_active_frame;
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p init*8 nxt=%04x, fn=%04x, int=%#x\n",
+				     qh, qh->next_active_frame,
+				     hsotg->frame_number, qh->host_interval);
+
+		}
+		dev_dbg(hsotg->dev, "interval=%d\n", qh->host_interval);
+	}
+
+	dev_vdbg(hsotg->dev, "DWC OTG HCD QH Initialized\n");
+	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - qh = %p\n", qh);
+	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Device Address = %d\n",
+		 dwc2_hcd_get_dev_addr(&urb->pipe_info));
+	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Endpoint %d, %s\n",
+		 dwc2_hcd_get_ep_num(&urb->pipe_info),
+		 dwc2_hcd_is_pipe_in(&urb->pipe_info) ? "IN" : "OUT");
+
+	qh->dev_speed = dev_speed;
+
+	switch (dev_speed) {
+	case USB_SPEED_LOW:
+		speed = "low";
+		break;
+	case USB_SPEED_FULL:
+		speed = "full";
+		break;
+	case USB_SPEED_HIGH:
+		speed = "high";
+		break;
+	default:
+		speed = "?";
+		break;
+	}
+	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Speed = %s\n", speed);
+
+	switch (qh->ep_type) {
+	case USB_ENDPOINT_XFER_ISOC:
+		type = "isochronous";
+		break;
+	case USB_ENDPOINT_XFER_INT:
+		type = "interrupt";
+		break;
+	case USB_ENDPOINT_XFER_CONTROL:
+		type = "control";
+		break;
+	case USB_ENDPOINT_XFER_BULK:
+		type = "bulk";
+		break;
+	default:
+		type = "?";
+		break;
+	}
+
+	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Type = %s\n", type);
+
+	if (qh->ep_type == USB_ENDPOINT_XFER_INT) {
+		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - usecs = %d\n",
+			 qh->host_us);
+		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - interval = %d\n",
+			 qh->host_interval);
+	}
+}
+
+/**
+ * dwc2_hcd_qh_create() - Allocates and initializes a QH
+ *
+ * @hsotg:        The HCD state structure for the DWC OTG controller
+ * @urb:          Holds the information about the device/endpoint needed
+ *                to initialize the QH
+ * @mem_flags:    GFP flags used for the allocation
+ *
+ * Return: Pointer to the newly allocated QH, or NULL on error
+ */
+struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
+					  struct dwc2_hcd_urb *urb,
+					  gfp_t mem_flags)
+{
+	struct dwc2_qh *qh;
+
+	if (!urb->priv)
+		return NULL;
+
+	/* Allocate memory */
+	qh = kzalloc(sizeof(*qh), mem_flags);
+	if (!qh)
+		return NULL;
+
+	dwc2_qh_init(hsotg, qh, urb);
+
+	if (hsotg->core_params->dma_desc_enable > 0 &&
+	    dwc2_hcd_qh_init_ddma(hsotg, qh, mem_flags) < 0) {
+		dwc2_hcd_qh_free(hsotg, qh);
+		return NULL;
+	}
+
+	return qh;
+}
+
+/**
+ * dwc2_hcd_qh_free() - Frees the QH
+ *
+ * @hsotg: HCD instance
+ * @qh:    The QH to free
+ *
+ * QH should already be removed from the list. QTD list should already be empty
+ * if called from URB Dequeue.
+ *
+ * Must NOT be called with interrupts disabled or spinlock held
+ */
+void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	/* Make sure any unreserve work is finished. */
+	if (del_timer_sync(&qh->unreserve_timer)) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&hsotg->lock, flags);
+		dwc2_do_unreserve(hsotg, qh);
+		spin_unlock_irqrestore(&hsotg->lock, flags);
+	}
+
+	if (qh->desc_list)
+		dwc2_hcd_qh_free_ddma(hsotg, qh);
+	kfree(qh);
+}
+
+/**
  * dwc2_hcd_qh_add() - Adds a QH to either the non periodic or periodic
  * schedule if it is not already in the schedule. If the QH is already in
  * the schedule, no action is taken.
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v6 15/22] usb: dwc2: host: Split code out to make dwc2_do_reserve()
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

This no-op change splits code out of dwc2_schedule_periodic() into a
dwc2_do_reserve() function.  This makes it a little easier to follow the
logic.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Split code out to make dwc2_do_reserve() new for v4.

Changes in v3: None
Changes in v2: None
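
Illustration only: the uframe_sched branch that moves into
dwc2_do_reserve() turns the return value of dwc2_find_uframe() into a
microframe index.  Going by the code in this diff, 0 selects microframe
7, a positive value N selects microframe N - 1, and a negative value is
an error that is handed back to the caller.  The stand-alone sketch
below models just that mapping.  apply_uframe() and its parameters are
hypothetical names chosen for the example; they are not driver
functions.

```c
/*
 * Hypothetical stand-alone model of the uframe_sched branch of
 * dwc2_do_reserve().  "find_status" stands in for the value returned by
 * dwc2_find_uframe(); "next_active_frame" stands in for the qh field of
 * the same name.
 */
static int apply_uframe(int find_status, unsigned int *next_active_frame)
{
	int frame = -1;

	if (find_status == 0)
		frame = 7;
	else if (find_status > 0)
		frame = find_status - 1;

	if (frame >= 0) {
		/* Replace the low three bits, i.e. the microframe index. */
		*next_active_frame &= ~0x7u;
		*next_active_frame |= (unsigned int)(frame & 7);
	}

	/* Positive values mean success; only negatives are errors. */
	return find_status > 0 ? 0 : find_status;
}
```

Starting from next_active_frame = 0x1234, a return value of 3 gives
0x1232 and a following return value of 0 gives 0x1237.  A negative
value leaves the frame number untouched and is returned unchanged.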

 drivers/usb/dwc2/hcd_queue.c | 112 ++++++++++++++++++++++++++-----------------
 1 file changed, 67 insertions(+), 45 deletions(-)

diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 8a2067bc1e62..9ce407e5017d 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -245,6 +245,70 @@ static int dwc2_find_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 }
 
 /**
+ * dwc2_do_reserve() - Make a periodic reservation
+ *
+ * Try to allocate space in the periodic schedule.  Depending on parameters
+ * this might use the microframe scheduler or the dumb scheduler.
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    QH for the periodic transfer.
+ *
+ * Returns: 0 upon success; error upon failure.
+ */
+static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	int status;
+
+	if (hsotg->core_params->uframe_sched > 0) {
+		int frame = -1;
+
+		status = dwc2_find_uframe(hsotg, qh);
+		if (status == 0)
+			frame = 7;
+		else if (status > 0)
+			frame = status - 1;
+
+		/* Set the new frame up */
+		if (frame >= 0) {
+			qh->next_active_frame &= ~0x7;
+			qh->next_active_frame |= (frame & 7);
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p sched_p nxt=%04x, uf=%d\n",
+				     qh, qh->next_active_frame, frame);
+		}
+
+		if (status > 0)
+			status = 0;
+	} else {
+		status = dwc2_periodic_channel_available(hsotg);
+		if (status) {
+			dev_info(hsotg->dev,
+				 "%s: No host channel available for periodic transfer\n",
+				 __func__);
+			return status;
+		}
+
+		status = dwc2_check_periodic_bandwidth(hsotg, qh);
+	}
+
+	if (status) {
+		dev_dbg(hsotg->dev,
+			"%s: Insufficient periodic bandwidth for periodic transfer\n",
+			__func__);
+		return status;
+	}
+
+	if (hsotg->core_params->uframe_sched <= 0)
+		/* Reserve periodic channel */
+		hsotg->periodic_channels++;
+
+	/* Update claimed usecs per (micro)frame */
+	hsotg->periodic_usecs += qh->host_us;
+
+	return 0;
+}
+
+/**
  * dwc2_do_unreserve() - Actually release the periodic reservation
  *
  * This function actually releases the periodic bandwidth that was reserved
@@ -393,51 +457,9 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	 * that case.
 	 */
 	if (!qh->unreserve_pending) {
-		if (hsotg->core_params->uframe_sched > 0) {
-			int frame = -1;
-
-			status = dwc2_find_uframe(hsotg, qh);
-			if (status == 0)
-				frame = 7;
-			else if (status > 0)
-				frame = status - 1;
-
-			/* Set the new frame up */
-			if (frame >= 0) {
-				qh->next_active_frame &= ~0x7;
-				qh->next_active_frame |= (frame & 7);
-				dwc2_sch_dbg(hsotg,
-					     "QH=%p sched_p nxt=%04x, uf=%d\n",
-					     qh, qh->next_active_frame, frame);
-			}
-
-			if (status > 0)
-				status = 0;
-		} else {
-			status = dwc2_periodic_channel_available(hsotg);
-			if (status) {
-				dev_info(hsotg->dev,
-					"%s: No host channel available for periodic transfer\n",
-					__func__);
-				return status;
-			}
-
-			status = dwc2_check_periodic_bandwidth(hsotg, qh);
-		}
-
-		if (status) {
-			dev_dbg(hsotg->dev,
-				"%s: Insufficient periodic bandwidth for periodic transfer\n",
-				__func__);
+		status = dwc2_do_reserve(hsotg, qh);
+		if (status)
 			return status;
-		}
-
-		if (hsotg->core_params->uframe_sched <= 0)
-			/* Reserve periodic channel */
-			hsotg->periodic_channels++;
-
-		/* Update claimed usecs per (micro)frame */
-		hsotg->periodic_usecs += qh->host_us;
 	}
 
 	qh->unreserve_pending = 0;
@@ -450,7 +472,7 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		list_add_tail(&qh->qh_list_entry,
 			      &hsotg->periodic_sched_inactive);
 
-	return status;
+	return 0;
 }
 
 /**
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 15/22] usb: dwc2: host: Split code out to make dwc2_do_reserve()
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi-l0cyMroinI0, kever.yang-TNX95d0MmH7DzftRWevZcw
  Cc: huangtao-TNX95d0MmH7DzftRWevZcw, stefan.wahren-eS4NqCHxEME,
	heiko-4mtYJXux2i+zQB+pC5nmwQ, johnyoun-HKixBCOQz3hWk0Htik3J/w,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, Douglas Anderson,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	yousaf.kaukab-ral2JQCrhuEAvxtiuMwx3w,
	stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz,
	linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	gregory.herrero-ral2JQCrhuEAvxtiuMwx3w,
	william.wu-TNX95d0MmH7DzftRWevZcw, Julius Werner,
	dinguyen-yzvPICuk2ABMcg4IHK0kFoH6Mc4MB0Vx

This no-op change splits code out of dwc2_schedule_periodic() into a
dwc2_do_reserve() function.  This makes it a little easier to follow the
logic.

Signed-off-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Tested-by: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>
Tested-by: Stefan Wahren <stefan.wahren-eS4NqCHxEME@public.gmane.org>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Split code out to make dwc2_do_reserve() new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd_queue.c | 112 ++++++++++++++++++++++++++-----------------
 1 file changed, 67 insertions(+), 45 deletions(-)

diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 8a2067bc1e62..9ce407e5017d 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -245,6 +245,70 @@ static int dwc2_find_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 }
 
 /**
+ * dwc2_do_reserve() - Make a periodic reservation
+ *
+ * Try to allocate space in the periodic schedule.  Depending on parameters
+ * this might use the microframe scheduler or the dumb scheduler.
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    QH for the periodic transfer.
+ *
+ * Returns: 0 upon success; error upon failure.
+ */
+static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	int status;
+
+	if (hsotg->core_params->uframe_sched > 0) {
+		int frame = -1;
+
+		status = dwc2_find_uframe(hsotg, qh);
+		if (status == 0)
+			frame = 7;
+		else if (status > 0)
+			frame = status - 1;
+
+		/* Set the new frame up */
+		if (frame >= 0) {
+			qh->next_active_frame &= ~0x7;
+			qh->next_active_frame |= (frame & 7);
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p sched_p nxt=%04x, uf=%d\n",
+				     qh, qh->next_active_frame, frame);
+		}
+
+		if (status > 0)
+			status = 0;
+	} else {
+		status = dwc2_periodic_channel_available(hsotg);
+		if (status) {
+			dev_info(hsotg->dev,
+				 "%s: No host channel available for periodic transfer\n",
+				 __func__);
+			return status;
+		}
+
+		status = dwc2_check_periodic_bandwidth(hsotg, qh);
+	}
+
+	if (status) {
+		dev_dbg(hsotg->dev,
+			"%s: Insufficient periodic bandwidth for periodic transfer\n",
+			__func__);
+		return status;
+	}
+
+	if (hsotg->core_params->uframe_sched <= 0)
+		/* Reserve periodic channel */
+		hsotg->periodic_channels++;
+
+	/* Update claimed usecs per (micro)frame */
+	hsotg->periodic_usecs += qh->host_us;
+
+	return 0;
+}
+
+/**
  * dwc2_do_unreserve() - Actually release the periodic reservation
  *
  * This function actually releases the periodic bandwidth that was reserved
@@ -393,51 +457,9 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	 * that case.
 	 */
 	if (!qh->unreserve_pending) {
-		if (hsotg->core_params->uframe_sched > 0) {
-			int frame = -1;
-
-			status = dwc2_find_uframe(hsotg, qh);
-			if (status == 0)
-				frame = 7;
-			else if (status > 0)
-				frame = status - 1;
-
-			/* Set the new frame up */
-			if (frame >= 0) {
-				qh->next_active_frame &= ~0x7;
-				qh->next_active_frame |= (frame & 7);
-				dwc2_sch_dbg(hsotg,
-					     "QH=%p sched_p nxt=%04x, uf=%d\n",
-					     qh, qh->next_active_frame, frame);
-			}
-
-			if (status > 0)
-				status = 0;
-		} else {
-			status = dwc2_periodic_channel_available(hsotg);
-			if (status) {
-				dev_info(hsotg->dev,
-					"%s: No host channel available for periodic transfer\n",
-					__func__);
-				return status;
-			}
-
-			status = dwc2_check_periodic_bandwidth(hsotg, qh);
-		}
-
-		if (status) {
-			dev_dbg(hsotg->dev,
-				"%s: Insufficient periodic bandwidth for periodic transfer\n",
-				__func__);
+		status = dwc2_do_reserve(hsotg, qh);
+		if (status)
 			return status;
-		}
-
-		if (hsotg->core_params->uframe_sched <= 0)
-			/* Reserve periodic channel */
-			hsotg->periodic_channels++;
-
-		/* Update claimed usecs per (micro)frame */
-		hsotg->periodic_usecs += qh->host_us;
 	}
 
 	qh->unreserve_pending = 0;
@@ -450,7 +472,7 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		list_add_tail(&qh->qh_list_entry,
 			      &hsotg->periodic_sched_inactive);
 
-	return status;
+	return 0;
 }
 
 /**
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 16/22] usb: dwc2: host: Add scheduler logging for missed SOFs
  2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
                   ` (14 preceding siblings ...)
  2016-01-29  2:20   ` Douglas Anderson
@ 2016-01-29  2:20 ` Douglas Anderson
  2016-01-29  2:20   ` Douglas Anderson
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

We'll use the new "scheduler verbose debugging" macro to log missed
SOFs.  This is fast enough (assuming you configure it to use the ftrace
buffer) that we can do it without worrying about the speed hit.  The
overhead hit if the scheduler tracing is set to "no_printk" should be
near zero.
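
As a rough illustration only (not part of the patch), the check being
logged boils down to "is the current frame number exactly one past the
last one we saw, modulo the 14-bit frame counter".  The names below are
stand-ins for the driver's helpers (frame_num_inc() mimics
dwc2_frame_num_inc() and MAX_FRNUM mimics HFNUM_MAX_FRNUM), so treat
this as a sketch under those assumptions:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRNUM 0x3fff	/* assumed 14-bit counter, like HFNUM_MAX_FRNUM */

/* Stand-in for dwc2_frame_num_inc(): add with 14-bit wraparound. */
uint16_t frame_num_inc(uint16_t frame, uint16_t inc)
{
	return (frame + inc) & MAX_FRNUM;
}

/* Returns 1 if the SOF seen after 'last' was not the very next frame. */
int sof_missed(uint16_t last, uint16_t curr)
{
	/* Both values are expected to be masked frame numbers already. */
	assert(last <= MAX_FRNUM && curr <= MAX_FRNUM);

	return frame_num_inc(last, 1) != curr;
}
```

Since the patch initializes last_frame_num to HFNUM_MAX_FRNUM, the first
frame expected after init is 0, which the wraparound above handles.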

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Add scheduler logging for missed SOFs new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/core.h     |  3 ++-
 drivers/usb/dwc2/hcd.c      |  2 +-
 drivers/usb/dwc2/hcd_intr.c | 12 ++++++++----
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 18f9e4045643..64d45a2053bb 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -809,9 +809,10 @@ struct dwc2_hsotg {
 	bool bus_suspended;
 	bool new_connection;
 
+	u16 last_frame_num;
+
 #ifdef CONFIG_USB_DWC2_TRACK_MISSED_SOFS
 #define FRAME_NUM_ARRAY_SIZE 1000
-	u16 last_frame_num;
 	u16 *frame_num_array;
 	u16 *last_frame_num_array;
 	int frame_num_idx;
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index fd731347daf7..f48da015fa5e 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -3084,8 +3084,8 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq)
 			FRAME_NUM_ARRAY_SIZE, GFP_KERNEL);
 	if (!hsotg->last_frame_num_array)
 		goto error1;
-	hsotg->last_frame_num = HFNUM_MAX_FRNUM;
 #endif
+	hsotg->last_frame_num = HFNUM_MAX_FRNUM;
 
 	/* Check if the bus driver or platform code has setup a dma_mask */
 	if (hsotg->core_params->dma_enable > 0 &&
diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index d929db5e7f3f..dc285667233a 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -55,12 +55,16 @@
 /* This function is for debug only */
 static void dwc2_track_missed_sofs(struct dwc2_hsotg *hsotg)
 {
-#ifdef CONFIG_USB_DWC2_TRACK_MISSED_SOFS
 	u16 curr_frame_number = hsotg->frame_number;
+	u16 expected = dwc2_frame_num_inc(hsotg->last_frame_num, 1);
+
+	if (expected != curr_frame_number)
+		dwc2_sch_vdbg(hsotg, "MISSED SOF %04x != %04x\n",
+			expected, curr_frame_number);
 
+#ifdef CONFIG_USB_DWC2_TRACK_MISSED_SOFS
 	if (hsotg->frame_num_idx < FRAME_NUM_ARRAY_SIZE) {
-		if (((hsotg->last_frame_num + 1) & HFNUM_MAX_FRNUM) !=
-		    curr_frame_number) {
+		if (expected != curr_frame_number) {
 			hsotg->frame_num_array[hsotg->frame_num_idx] =
 					curr_frame_number;
 			hsotg->last_frame_num_array[hsotg->frame_num_idx] =
@@ -79,8 +83,8 @@ static void dwc2_track_missed_sofs(struct dwc2_hsotg *hsotg)
 		}
 		hsotg->dumped_frame_num_array = 1;
 	}
-	hsotg->last_frame_num = curr_frame_number;
 #endif
+	hsotg->last_frame_num = curr_frame_number;
 }
 
 static void dwc2_hc_handle_tt_clear(struct dwc2_hsotg *hsotg,
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 17/22] usb: dwc2: host: Manage frame nums better in scheduler
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

The dwc2 scheduler (contained in hcd_queue.c) was a bit confusing in the
way it initted / kept track of which frames a QH was going to be active
in.  Let's clean things up a little bit in preparation for a rewrite of
the microframe scheduler.

Specifically:
* Old code would pick a frame number in dwc2_qh_init() and would try to
  pick it "in a slightly future (micro)frame".  As far as I can tell the
  reason for this was that there was a delay between dwc2_qh_init() and
  when we actually wanted to dwc2_hcd_qh_add().  ...but apparently this
  attempt to be slightly in the future wasn't enough because
  dwc2_hcd_qh_add() then had code to reset things if the frame _wasn't_
  in the future.  There's no reason not to just pick the frame later.
  For non-periodic QH we now pick the frame in dwc2_hcd_qh_add().  For
  periodic QH we pick the frame at dwc2_schedule_periodic() time.
* The old dwc2_qh_init() actually assigned to "hsotg->frame_number".
  This doesn't seem like a great idea since that variable is supposed to
  be used to keep track of which SOF the interrupt handler has seen.
  Let's be clean: anyone who wants the current frame number (instead of
  the one as of the last interrupt) should ask for it.
* The old code wasn't terribly consistent about trying to use the frame
  that the microframe scheduler assigned to it.  In
  dwc2_sched_periodic_split() when it was scheduling the first frame it
  always "ORed" in 0x7 (!).  Since the frame goes on the wire 1 uFrame
  after next_active_frame it meant that the SSPLIT would always try for
  uFrame 0 and the transaction would happen on the low speed bus during
  uFrame 1.  This happened regardless of what the microframe scheduler
  said.
* The old code assumed it would get called to schedule the next in a
  periodic split very quickly.  That is if next_active_frame was
  0 (transfer on wire in uFrame 1) it assumed it was getting called to
  schedule the next uFrame during uFrame 1 too (so it could queue
  something up for uFrame 2).  It should be possible to actually queue
  something up for uFrame 2 while in uFrame 2 (AKA queue up ASAP).  To
  do this, code needs to look at the previously scheduled frame when
  deciding when to next be active, not look at the current frame number.
* If there was no microframe scheduler, the old code would check for
  whether we should be active using "qh->next_active_frame ==
  frame_number".  This seemed like a race waiting to happen.  ...plus
  there's no way that you wouldn't want to schedule if next_active_frame
  was actually less than frame number.
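
To make the last three points a bit more concrete, here is a small
standalone sketch (again, not part of the patch).  The helper names are
hypothetical stand-ins modeled on dwc2_frame_num_inc() and
dwc2_frame_num_le(), and the 14-bit wraparound is an assumption based on
the frame counter only going to 0x3fff:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRNUM 0x3fff	/* assumed 14-bit frame counter */

/* Stand-in for dwc2_frame_num_inc(): add with wraparound. */
uint16_t frame_num_inc(uint16_t frame, uint16_t inc)
{
	return (frame + inc) & MAX_FRNUM;
}

/*
 * Stand-in for dwc2_frame_num_le(): true if a is at or before b when the
 * counter is treated as circular (b at most half the range ahead of a).
 */
int frame_num_le(uint16_t a, uint16_t b)
{
	return ((b - a) & MAX_FRNUM) <= (MAX_FRNUM >> 1);
}

/* Old style check without the uframe scheduler: exact match only. */
int ready_exact(uint16_t next_active, uint16_t now)
{
	return next_active == now;
}

/* New style check: also true when we are already late. */
int ready_le(uint16_t next_active, uint16_t now)
{
	return frame_num_le(next_active, now);
}

/*
 * Midway through a split: step from the previously scheduled frame rather
 * than from the current frame number.  Step by 2 right after the start
 * split (except for ISOC OUT), otherwise by 1.
 */
uint16_t next_split_frame(uint16_t prev_sched, uint16_t start_frame,
			  int is_isoc_out)
{
	uint16_t incr;

	assert(prev_sched <= MAX_FRNUM && start_frame <= MAX_FRNUM);

	incr = (prev_sched == start_frame && !is_isoc_out) ? 2 : 1;
	return frame_num_inc(prev_sched, incr);
}
```

With next_active_frame one behind the current frame, ready_exact() never
fires while ready_le() does, which is the race described above.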

Note that this change doesn't make 100% sense on its own since it's
expecting some sanity in the frame numbers assigned by the microframe
scheduler and (as per the future patch which rewrites it) I think that
the current microframe scheduler is quite insane.  However, it seems
like splitting this up from the microframe scheduler patch makes things
into smaller chunks and hopefully adds to clarity rather than reduces
it.  The two patches could certainly be squashed.  Note that at the very
least, I don't see any obvious bad behavior introduced with just this
patch.

I've attempted to keep the config parameter to disable the microframe
scheduler intact in this change, though I'm not sure it's worth it.
Obviously the code is touched a lot so it's possible I regressed
something when the microframe scheduler is disabled, though I did some
basic testing and it seemed to work OK.  I'm still not 100% sure why you
wouldn't want the microframe scheduler (presuming it works), so maybe a
future patch (or a future version of this patch?) could remove that
parameter.
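
For anyone trying to follow how the first frame gets chosen with and
without the microframe scheduler, here is a simplified standalone sketch
of the arithmetic.  It is not the driver code: the function and
parameter names are made up for illustration, the helpers are stand-ins
for dwc2_frame_num_inc() / dwc2_frame_num_dec() / dwc2_frame_num_gt(),
and the small gcd16() stands in for the kernel's gcd():

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRNUM 0x3fff	/* assumed 14-bit frame counter */

uint16_t frame_num_inc(uint16_t frame, uint16_t inc)
{
	return (frame + inc) & MAX_FRNUM;
}

uint16_t frame_num_dec(uint16_t frame, uint16_t dec)
{
	return (frame + MAX_FRNUM + 1 - dec) & MAX_FRNUM;
}

/* True if a is strictly after b on the circular counter. */
int frame_num_gt(uint16_t a, uint16_t b)
{
	return a != b && ((a - b) & MAX_FRNUM) < (MAX_FRNUM >> 1);
}

/* Plain Euclid; stand-in for the kernel's gcd(). */
uint16_t gcd16(uint16_t a, uint16_t b)
{
	while (b) {
		uint16_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Hypothetical helper: returns the first next_active_frame for a QH. */
uint16_t pick_first_frame(uint16_t now, int uframe_sched, int do_split,
			  uint16_t host_interval, uint16_t assigned_uframe)
{
	/* Never start before the next frame; "now" may tick under us. */
	uint16_t earliest = frame_num_inc(now, 1);
	uint16_t next = earliest;
	uint16_t interval;

	assert(assigned_uframe < 8);

	/* No microframe scheduler: splits go active at uFrame 0 minus 1. */
	if (!uframe_sched)
		return do_split ? (next | 0x7) : next;

	/* The schedule repeats every 8 uFrames, so clamp the interval. */
	interval = gcd16(host_interval, 8);

	/* Start of the interval, plus our slot, minus 1 to be "ready". */
	next = (next / interval) * interval;
	next = frame_num_inc(next, assigned_uframe);
	next = frame_num_dec(next, 1);

	/* If that landed in the past, move up whole intervals. */
	while (frame_num_gt(earliest, next))
		next = frame_num_inc(next, interval);

	return next;
}
```

For example, with the current frame at 0x0105, an interval of 8 and
assigned uframe 3, this picks 0x010a, so the transfer goes on the wire
in 0x010b (uframe 3 of the next group of 8).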

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Fix bug where periodic things get scheduled too quickly (Alan Stern)
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Manage frame nums better in scheduler new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd.h       |  10 +-
 drivers/usb/dwc2/hcd_queue.c | 351 ++++++++++++++++++++++++++++++++-----------
 2 files changed, 272 insertions(+), 89 deletions(-)

diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 10c35585a2bd..fd266ac53a28 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -244,8 +244,11 @@ enum dwc2_transaction_type {
  *                      the bus.  We'll move the qh to active here.  If the
  *                      host is in high speed mode this will be a uframe.  If
  *                      the host is in low speed mode this will be a full frame.
+ * @start_active_frame: If we are partway through a split transfer, this will be
+ *			what next_active_frame was when we started.  Otherwise
+ *			it should always be the same as next_active_frame.
+ * @assigned_uframe:    The uframe (0 - 7) assigned by dwc2_find_uframe().
  * @frame_usecs:        Internal variable used by the microframe scheduler
- * @start_split_frame:  (Micro)frame at which last start split was initialized
  * @ntd:                Actual number of transfer descriptors in a list
  * @qtd_list:           List of QTDs for this QH
  * @channel:            Host channel currently processing transfers for this QH
@@ -279,8 +282,9 @@ struct dwc2_qh {
 	u16 host_us;
 	u16 host_interval;
 	u16 next_active_frame;
+	u16 start_active_frame;
+	u16 assigned_uframe;
 	u16 frame_usecs[8];
-	u16 start_split_frame;
 	u16 ntd;
 	struct list_head qtd_list;
 	struct dwc2_host_chan *channel;
@@ -746,7 +750,7 @@ do {									\
 	_qtd_ = list_entry((_qh_)->qtd_list.next, struct dwc2_qtd,	\
 			   qtd_list_entry);				\
 	if (usb_pipeint(_qtd_->urb->pipe) &&				\
-	    (_qh_)->start_split_frame != 0 && !_qtd_->complete_split) {	\
+	    (_qh_)->start_active_frame != 0 && !_qtd_->complete_split) { \
 		_hfnum_.d32 = dwc2_readl((_hcd_)->regs + HFNUM);	\
 		switch (_hfnum_.b.frnum & 0x7) {			\
 		case 7:							\
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 9ce407e5017d..9b3c435339ee 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -38,6 +38,7 @@
  * This file contains the functions to manage Queue Heads and Queue
  * Transfer Descriptors for Host mode
  */
+#include <linux/gcd.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/spinlock.h>
@@ -245,6 +246,96 @@ static int dwc2_find_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 }
 
 /**
+ * dwc2_pick_first_frame() - Choose 1st frame for qh that's already scheduled
+ *
+ * Takes a qh that has already been scheduled (which means we know we have the
+ * bandwidth reserved for us) and set the next_active_frame and the
+ * start_active_frame.
+ *
+ * This is expected to be called on qh's that weren't previously actively
+ * running.  It just picks the next frame that we can fit into without any
+ * thought about the past.
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    QH for a periodic endpoint
+ *
+ */
+static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	u16 frame_number;
+	u16 earliest_frame;
+	u16 next_active_frame;
+	u16 interval;
+
+	/*
+	 * Use the real frame number rather than the cached value as of the
+	 * last SOF to give us a little extra slop.
+	 */
+	frame_number = dwc2_hcd_get_frame_number(hsotg);
+
+	/*
+	 * We wouldn't want to start any earlier than the next frame just in
+	 * case the frame number ticks as we're doing this calculation.
+	 *
+	 * NOTE: if we could quantify how long till we actually get scheduled
+	 * we might be able to avoid the "+ 1" by looking at the upper part of
+	 * HFNUM (the FRREM field).  For now we'll just use the + 1 though.
+	 */
+	earliest_frame = dwc2_frame_num_inc(frame_number, 1);
+	next_active_frame = earliest_frame;
+
+	/* Get the "no microframe scheduler" out of the way... */
+	if (hsotg->core_params->uframe_sched <= 0) {
+		if (qh->do_split)
+			/* Splits are active at microframe 0 minus 1 */
+			next_active_frame |= 0x7;
+		goto exit;
+	}
+
+	/* Adjust interval as per high speed schedule which has 8 uFrames */
+	interval = gcd(qh->host_interval, 8);
+
+	/*
+	 * We know interval must divide (HFNUM_MAX_FRNUM + 1) now that we've
+	 * done the gcd(), so it's safe to move to the beginning of the current
+	 * interval like this.
+	 *
+	 * After this we might be before earliest_frame, but don't worry,
+	 * we'll fix it...
+	 */
+	next_active_frame = (next_active_frame / interval) * interval;
+
+	/*
+	 * Actually choose to start at the frame number we've been
+	 * scheduled for.
+	 */
+	next_active_frame = dwc2_frame_num_inc(next_active_frame,
+					       qh->assigned_uframe);
+
+	/*
+	 * We actually need 1 frame before since the next_active_frame is
+	 * the frame number we'll be put on the ready list and we won't be on
+	 * the bus until 1 frame later.
+	 */
+	next_active_frame = dwc2_frame_num_dec(next_active_frame, 1);
+
+	/*
+	 * By now we might actually be before the earliest_frame.  Let's move
+	 * up intervals until we're not.
+	 */
+	while (dwc2_frame_num_gt(earliest_frame, next_active_frame))
+		next_active_frame = dwc2_frame_num_inc(next_active_frame,
+						       interval);
+
+exit:
+	qh->next_active_frame = next_active_frame;
+	qh->start_active_frame = next_active_frame;
+
+	dwc2_sch_vdbg(hsotg, "QH=%p First fn=%04x nxt=%04x\n",
+		     qh, frame_number, qh->next_active_frame);
+}
+
+/**
  * dwc2_do_reserve() - Make a periodic reservation
  *
  * Try to allocate space in the periodic schedule.  Depending on parameters
@@ -260,25 +351,9 @@ static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	int status;
 
 	if (hsotg->core_params->uframe_sched > 0) {
-		int frame = -1;
-
 		status = dwc2_find_uframe(hsotg, qh);
-		if (status == 0)
-			frame = 7;
-		else if (status > 0)
-			frame = status - 1;
-
-		/* Set the new frame up */
-		if (frame >= 0) {
-			qh->next_active_frame &= ~0x7;
-			qh->next_active_frame |= (frame & 7);
-			dwc2_sch_dbg(hsotg,
-				     "QH=%p sched_p nxt=%04x, uf=%d\n",
-				     qh, qh->next_active_frame, frame);
-		}
-
-		if (status > 0)
-			status = 0;
+		if (status >= 0)
+			qh->assigned_uframe = status;
 	} else {
 		status = dwc2_periodic_channel_available(hsotg);
 		if (status) {
@@ -305,6 +380,8 @@ static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	/* Update claimed usecs per (micro)frame */
 	hsotg->periodic_usecs += qh->host_us;
 
+	dwc2_pick_first_frame(hsotg, qh);
+
 	return 0;
 }
 
@@ -460,6 +537,16 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		status = dwc2_do_reserve(hsotg, qh);
 		if (status)
 			return status;
+	} else {
+		/*
+		 * It might have been a while, so make sure that frame_number
+		 * is still good.  Note: we could also try to use the similar
+		 * dwc2_next_periodic_start() but that schedules much more
+		 * tightly and we might need to hurry and queue things up.
+		 */
+		if (dwc2_frame_num_le(qh->next_active_frame,
+				      hsotg->frame_number))
+			dwc2_pick_first_frame(hsotg, qh);
 	}
 
 	qh->unreserve_pending = 0;
@@ -520,7 +607,6 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg,
  * @urb:   Holds the information about the device/endpoint needed to initialize
  *         the QH
  */
-#define SCHEDULE_SLOP 10
 static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 			 struct dwc2_hcd_urb *urb)
 {
@@ -569,11 +655,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 			      qh->ep_type == USB_ENDPOINT_XFER_ISOC,
 			      bytecount));
 
-		/* Ensure frame_number corresponds to the reality */
-		hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
-		/* Start in a slightly future (micro)frame */
-		qh->next_active_frame = dwc2_frame_num_inc(hsotg->frame_number,
-						     SCHEDULE_SLOP);
 		qh->host_interval = urb->interval;
 		dwc2_sch_dbg(hsotg, "QH=%p init nxt=%04x, fn=%04x, int=%#x\n",
 			     qh, qh->next_active_frame, hsotg->frame_number,
@@ -589,8 +670,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		    (dev_speed == USB_SPEED_LOW ||
 		     dev_speed == USB_SPEED_FULL)) {
 			qh->host_interval *= 8;
-			qh->next_active_frame |= 0x7;
-			qh->start_split_frame = qh->next_active_frame;
 			dwc2_sch_dbg(hsotg,
 				     "QH=%p init*8 nxt=%04x, fn=%04x, int=%#x\n",
 				     qh, qh->next_active_frame,
@@ -738,22 +817,12 @@ int dwc2_hcd_qh_add(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		/* QH already in a schedule */
 		return 0;
 
-	if (!dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number) &&
-			!hsotg->frame_number) {
-		u16 new_frame;
-
-		dev_dbg(hsotg->dev,
-				"reset frame number counter\n");
-		new_frame = dwc2_frame_num_inc(hsotg->frame_number,
-				SCHEDULE_SLOP);
-
-		dwc2_sch_vdbg(hsotg, "QH=%p reset nxt=%04x=>%04x\n",
-			      qh, qh->next_active_frame, new_frame);
-		qh->next_active_frame = new_frame;
-	}
-
 	/* Add the new QH to the appropriate schedule */
 	if (dwc2_qh_is_non_per(qh)) {
+		/* Schedule right away */
+		qh->start_active_frame = hsotg->frame_number;
+		qh->next_active_frame = qh->start_active_frame;
+
 		/* Always start in inactive schedule */
 		list_add_tail(&qh->qh_list_entry,
 			      &hsotg->non_periodic_sched_inactive);
@@ -807,46 +876,145 @@ void dwc2_hcd_qh_unlink(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	}
 }
 
-/*
- * Schedule the next continuing periodic split transfer
+/**
+ * dwc2_next_for_periodic_split() - Set next_active_frame midway thru a split.
+ *
+ * This is called for setting next_active_frame for periodic splits for all but
+ * the first packet of the split.  Confusing?  I thought so...
+ *
+ * Periodic splits are single low/full speed transfers that we end up splitting
+ * up into several high speed transfers.  They always fit into one full (1 ms)
+ * frame but might be split over several microframes (125 us each).  We need
+ * to put each of the parts on a very specific high speed frame.
+ *
+ * This function figures out where the next active uFrame needs to be.
+ *
+ * @hsotg:        The HCD state structure
+ * @qh:           QH for the periodic transfer.
+ * @frame_number: The current frame number.
+ *
+ * Return: number missed by (or 0 if we didn't miss).
  */
-static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
-				      struct dwc2_qh *qh, u16 frame_number,
-				      int sched_next_periodic_split)
+static int dwc2_next_for_periodic_split(struct dwc2_hsotg *hsotg,
+					 struct dwc2_qh *qh, u16 frame_number)
 {
-	u16 incr;
 	u16 old_frame = qh->next_active_frame;
+	u16 prev_frame_number = dwc2_frame_num_dec(frame_number, 1);
+	int missed = 0;
+	u16 incr;
+
+	/*
+	 * Basically: increment 1 normally, but 2 right after the start split
+	 * (except for ISOC out).
+	 */
+	if (old_frame == qh->start_active_frame &&
+	    !(qh->ep_type == USB_ENDPOINT_XFER_ISOC && !qh->ep_is_in))
+		incr = 2;
+	else
+		incr = 1;
+
+	qh->next_active_frame = dwc2_frame_num_inc(old_frame, incr);
 
-	if (sched_next_periodic_split) {
+	/*
+	 * Note that it's OK for frame_number to be 1 frame past
+	 * next_active_frame.  Remember that next_active_frame is supposed to
+	 * be 1 frame _before_ when we want to be scheduled.  If we're 1 frame
+	 * past it just means schedule ASAP.
+	 *
+	 * It's _not_ OK, however, if we're more than one frame past.
+	 */
+	if (dwc2_frame_num_gt(prev_frame_number, qh->next_active_frame)) {
+		/*
+		 * OOPS, we missed.  That's actually pretty bad since
+		 * the hub will be unhappy; try ASAP I guess.
+		 */
+		missed = dwc2_frame_num_dec(prev_frame_number,
+					    qh->next_active_frame);
 		qh->next_active_frame = frame_number;
-		incr = dwc2_frame_num_inc(qh->start_split_frame, 1);
-		if (dwc2_frame_num_le(frame_number, incr)) {
-			/*
-			 * Allow one frame to elapse after start split
-			 * microframe before scheduling complete split, but
-			 * DON'T if we are doing the next start split in the
-			 * same frame for an ISOC out
-			 */
-			if (qh->ep_type != USB_ENDPOINT_XFER_ISOC ||
-			    qh->ep_is_in != 0) {
-				qh->next_active_frame = dwc2_frame_num_inc(
-					qh->next_active_frame, 1);
-			}
-		}
-	} else {
-		qh->next_active_frame =
-			dwc2_frame_num_inc(qh->start_split_frame,
-					   qh->host_interval);
-		if (dwc2_frame_num_le(qh->next_active_frame, frame_number))
-			qh->next_active_frame = frame_number;
-		qh->next_active_frame |= 0x7;
-		qh->start_split_frame = qh->next_active_frame;
 	}
 
-	dwc2_sch_vdbg(hsotg, "QH=%p next(%d) fn=%04x, nxt=%04x=>%04x (%+d)\n",
-		      qh, sched_next_periodic_split, frame_number, old_frame,
-		      qh->next_active_frame,
-		      dwc2_frame_num_dec(qh->next_active_frame, old_frame));
+	return missed;
+}
+
+/**
+ * dwc2_next_periodic_start() - Set next_active_frame for next transfer start
+ *
+ * This is called for setting next_active_frame for a periodic transfer for
+ * all cases other than midway through a periodic split.  This will also update
+ * start_active_frame.
+ *
+ * Since we _always_ keep start_active_frame as the start of the previous
+ * transfer this is normally pretty easy: we just add our interval to
+ * start_active_frame and we've got our answer.
+ *
+ * The tricks come into play if we miss.  In that case we'll look for the next
+ * slot we can fit into.
+ *
+ * @hsotg:        The HCD state structure
+ * @qh:           QH for the periodic transfer.
+ * @frame_number: The current frame number.
+ *
+ * Return: number missed by (or 0 if we didn't miss).
+ */
+static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
+				     struct dwc2_qh *qh, u16 frame_number)
+{
+	int missed = 0;
+	u16 interval = qh->host_interval;
+	u16 prev_frame_number = dwc2_frame_num_dec(frame_number, 1);
+
+	qh->start_active_frame = dwc2_frame_num_inc(qh->start_active_frame,
+						    interval);
+
+	/*
+	 * The dwc2_frame_num_gt() function used below won't work terribly well
+	 * if we just incremented by a really large interval since the
+	 * frame counter only goes to 0x3fff.  It's terribly unlikely that we
+	 * will have missed in this case anyway.  Just go to exit.  If we want
+	 * to try to do better we'll need to keep track of a bigger counter
+	 * somewhere in the driver and handle overflows.
+	 */
+	if (interval >= 0x1000)
+		goto exit;
+
+	/*
+	 * Test for misses, which is when it's too late to schedule.
+	 *
+	 * A few things to note:
+	 * - We compare against prev_frame_number since start_active_frame
+	 *   and next_active_frame are always 1 frame before we want things
+	 *   to be active and we assume we can still get scheduled in the
+	 *   current frame number.
+	 * - Some misses are expected.  Specifically, in order to work
+	 *   perfectly dwc2 really has quite spectacular interrupt latency
+	 *   requirements.  It needs to be able to handle its interrupts
+	 *   completely within 125 us of them being asserted. That not only
+	 *   means that the dwc2 interrupt handler needs to be fast but it
+	 *   means that nothing else in the system may block dwc2 for a long
+	 *   time.  We can help with the dwc2 parts of this, but it's hard to
+	 *   guarantee that a system will have interrupt latency < 125 us, so
+	 *   we have to be robust to some misses.
+	 */
+	if (dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
+		u16 ideal_start = qh->start_active_frame;
+
+		/* Adjust interval as per gcd with plan length. */
+		interval = gcd(interval, 8);
+
+		do {
+			qh->start_active_frame = dwc2_frame_num_inc(
+				qh->start_active_frame, interval);
+		} while (dwc2_frame_num_gt(prev_frame_number,
+					   qh->start_active_frame));
+
+		missed = dwc2_frame_num_dec(qh->start_active_frame,
+					    ideal_start);
+	}
+
+exit:
+	qh->next_active_frame = qh->start_active_frame;
+
+	return missed;
 }
 
 /*
@@ -865,7 +1033,9 @@ static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
 void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 			    int sched_next_periodic_split)
 {
+	u16 old_frame = qh->next_active_frame;
 	u16 frame_number;
+	int missed;
 
 	if (dbg_qh(qh))
 		dev_vdbg(hsotg->dev, "%s()\n", __func__);
@@ -878,30 +1048,39 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		return;
 	}
 
+	/*
+	 * Use the real frame number rather than the cached value as of the
+	 * last SOF just to get us a little closer to reality.  Note that
+	 * means we don't actually know if we've already handled the SOF
+	 * interrupt for this frame.
+	 */
 	frame_number = dwc2_hcd_get_frame_number(hsotg);
 
-	if (qh->do_split) {
-		dwc2_sched_periodic_split(hsotg, qh, frame_number,
-					  sched_next_periodic_split);
-	} else {
-		qh->next_active_frame = dwc2_frame_num_inc(
-			qh->next_active_frame, qh->host_interval);
-		if (dwc2_frame_num_le(qh->next_active_frame, frame_number))
-			qh->next_active_frame = frame_number;
-	}
+	if (sched_next_periodic_split)
+		missed = dwc2_next_for_periodic_split(hsotg, qh, frame_number);
+	else
+		missed = dwc2_next_periodic_start(hsotg, qh, frame_number);
+
+	dwc2_sch_vdbg(hsotg,
+		     "QH=%p next(%d) fn=%04x, sch=%04x=>%04x (%+d) miss=%d %s\n",
+		     qh, sched_next_periodic_split, frame_number, old_frame,
+		     qh->next_active_frame,
+		     dwc2_frame_num_dec(qh->next_active_frame, old_frame),
+		missed, missed ? "MISS" : "");
 
 	if (list_empty(&qh->qtd_list)) {
 		dwc2_hcd_qh_unlink(hsotg, qh);
 		return;
 	}
+
 	/*
 	 * Remove from periodic_sched_queued and move to
 	 * appropriate queue
+	 *
+	 * Note: we purposely use the frame_number from the "hsotg" structure
+	 * since we know SOF interrupt will handle future frames.
 	 */
-	if ((hsotg->core_params->uframe_sched > 0 &&
-	     dwc2_frame_num_le(qh->next_active_frame, frame_number)) ||
-	    (hsotg->core_params->uframe_sched <= 0 &&
-	     qh->next_active_frame == frame_number))
+	if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
 		list_move_tail(&qh->qh_list_entry,
 			       &hsotg->periodic_sched_ready);
 	else
-- 
2.7.0.rc3.207.g0ac5344
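
The miss handling at the end of dwc2_next_periodic_start() above can be modeled outside the kernel. The following is only a sketch under stated assumptions: frame_inc(), frame_dec(), frame_gt(), gcd16() and next_periodic_start() are hypothetical stand-ins written for this illustration (they are not the driver's helpers), and the frame counter is assumed to be 14 bits wide (0x3fff), as the patch comment says.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRNUM 0x3fff	/* assumed 14-bit frame counter */

/* Wrapping add on the frame counter (hypothetical helper). */
static inline uint16_t frame_inc(uint16_t frame, uint16_t inc)
{
	return (uint16_t)(((unsigned int)frame + inc) & MAX_FRNUM);
}

/* Wrapping distance from b forward to a (hypothetical helper). */
static inline uint16_t frame_dec(uint16_t a, uint16_t b)
{
	return (uint16_t)(((unsigned int)a - (unsigned int)b) & MAX_FRNUM);
}

/* True if a is strictly after b within half the counter range. */
static inline int frame_gt(uint16_t a, uint16_t b)
{
	return a != b && frame_dec(a, b) < (MAX_FRNUM >> 1);
}

static inline uint16_t gcd16(uint16_t a, uint16_t b)
{
	while (b) {
		uint16_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Advance *start by one interval.  If that slot is already too late,
 * step forward by gcd(interval, 8) until it is schedulable again, which
 * keeps the same position within the 8-uFrame plan.
 * Returns how far the chosen start is past the ideal one (0 = no miss).
 */
static int next_periodic_start(uint16_t *start, uint16_t interval,
			       uint16_t frame_number)
{
	uint16_t prev = frame_dec(frame_number, 1);
	uint16_t ideal;
	uint16_t step;

	assert(interval > 0);
	*start = frame_inc(*start, interval);

	if (interval >= 0x1000)	/* too large to compare reliably */
		return 0;
	if (!frame_gt(prev, *start))
		return 0;

	ideal = *start;
	step = gcd16(interval, 8);
	do {
		*start = frame_inc(*start, step);
	} while (frame_gt(prev, *start));

	return frame_dec(*start, ideal);
}
```

For example, with a start of 0x13, an interval of 16 and a current frame of 0x40, the ideal start 0x23 has passed, so the sketch steps by 8 to 0x43 and reports a miss of 32 while staying in the same uFrame slot (3).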

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 17/22] usb: dwc2: host: Manage frame nums better in scheduler
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi-l0cyMroinI0, kever.yang-TNX95d0MmH7DzftRWevZcw
  Cc: william.wu-TNX95d0MmH7DzftRWevZcw,
	huangtao-TNX95d0MmH7DzftRWevZcw, heiko-4mtYJXux2i+zQB+pC5nmwQ,
	stefan.wahren-eS4NqCHxEME,
	linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Julius Werner,
	gregory.herrero-ral2JQCrhuEAvxtiuMwx3w,
	yousaf.kaukab-ral2JQCrhuEAvxtiuMwx3w,
	dinguyen-yzvPICuk2ABMcg4IHK0kFoH6Mc4MB0Vx,
	stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz,
	ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw, Douglas Anderson,
	johnyoun-HKixBCOQz3hWk0Htik3J/w,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

The dwc2 scheduler (contained in hcd_queue.c) was a bit confusing in the
way it initialized / kept track of which frames a QH was going to be active
in.  Let's clean things up a little bit in preparation for a rewrite of
the microframe scheduler.

Specifically:
* Old code would pick a frame number in dwc2_qh_init() and would try to
  pick it "in a slightly future (micro)frame".  As far as I can tell the
  reason for this was that there was a delay between dwc2_qh_init() and
  when we actually wanted to dwc2_hcd_qh_add().  ...but apparently this
  attempt to be slightly in the future wasn't enough because
  dwc2_hcd_qh_add() then had code to reset things if the frame _wasn't_
  in the future.  There's no reason not to just pick the frame later.
  For non-periodic QH we now pick the frame in dwc2_hcd_qh_add().  For
  periodic QH we pick the frame at dwc2_schedule_periodic() time.
* The old dwc2_qh_init() actually assigned to "hsotg->frame_number".
  This doesn't seem like a great idea since that variable is supposed to
  be used to keep track of which SOF the interrupt handler has seen.
  Let's be clean: anyone who wants the current frame number (instead of
  the one as of the last interrupt) should ask for it.
* The old code wasn't terribly consistent about trying to use the frame
  that the microframe scheduler assigned to it.  In
  dwc2_sched_periodic_split() when it was scheduling the first frame it
  always "ORed" in 0x7 (!).  Since the frame goes on the wire 1 uFrame
  after next_active_frame it meant that the SSPLIT would always try for
  uFrame 0 and the transaction would happen on the low speed bus during
  uFrame 1.  This happened regardless of what the microframe scheduler
  said.
* The old code assumed it would get called to schedule the next in a
  periodic split very quickly.  That is, if next_active_frame was
  0 (transfer on wire in uFrame 1) it assumed it was getting called to
  schedule the next uFrame during uFrame 1 too (so it could queue
  something up for uFrame 2).  It should be possible to actually queue
  something up for uFrame 2 while in uFrame 2 (AKA queue up ASAP).  To
  do this, code needs to look at the previously scheduled frame when
  deciding when to next be active, not look at the current frame number.
* If there was no microframe scheduler, the old code would check for
  whether we should be active using "qh->next_active_frame ==
  frame_number".  This seemed like a race waiting to happen.  ...plus
  there's no way that you wouldn't want to schedule if next_active_frame
  was actually less than frame number.
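
The last two points can be illustrated with a small standalone sketch. Everything here is an assumption made for illustration: the helper names (frame_inc(), frame_dec(), frame_le(), frame_gt(), next_split_frame(), ready_exact(), ready_le()) are hypothetical, and the frame counter is assumed to wrap at 0x3fff. The idea is that the next step of a split is derived from the previously scheduled frame rather than from "now", and that readiness is tested with a wrap-aware "less than or equal" instead of "==".

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRNUM 0x3fff	/* assumed 14-bit frame counter */

static inline uint16_t frame_inc(uint16_t frame, uint16_t inc)
{
	return (uint16_t)(((unsigned int)frame + inc) & MAX_FRNUM);
}

static inline uint16_t frame_dec(uint16_t a, uint16_t b)
{
	return (uint16_t)(((unsigned int)a - (unsigned int)b) & MAX_FRNUM);
}

/* True if a is at or before b within half the counter range. */
static inline int frame_le(uint16_t a, uint16_t b)
{
	return frame_dec(b, a) <= (MAX_FRNUM >> 1);
}

/* True if a is strictly after b within half the counter range. */
static inline int frame_gt(uint16_t a, uint16_t b)
{
	return a != b && frame_dec(a, b) < (MAX_FRNUM >> 1);
}

/*
 * Next wake-up frame partway through a split, based on the previously
 * scheduled frame: step 2 right after the start split (1 for ISOC OUT),
 * otherwise 1.  Being one frame past the result just means "ASAP";
 * being further past counts as a miss and falls back to "now".
 */
static uint16_t next_split_frame(uint16_t prev_sched, uint16_t start_sched,
				 int isoc_out, uint16_t now, int *missed)
{
	uint16_t incr = (prev_sched == start_sched && !isoc_out) ? 2 : 1;
	uint16_t next = frame_inc(prev_sched, incr);
	uint16_t prev_now = frame_dec(now, 1);

	*missed = 0;
	if (frame_gt(prev_now, next)) {
		*missed = frame_dec(prev_now, next);
		next = now;
	}
	return next;
}

/* Exact-match check: never fires once the frame has slipped past. */
static int ready_exact(uint16_t next_active, uint16_t now)
{
	return next_active == now;
}

/* Wrap-aware check: also fires when next_active is already in the past. */
static int ready_le(uint16_t next_active, uint16_t now)
{
	return frame_le(next_active, now);
}
```

With next_active at 0x3fff and the current frame already at 0x0001, ready_exact() stays false until the counter wraps all the way around, while ready_le() reports the QH as ready immediately.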

Note that this change doesn't make 100% sense on its own since it's
expecting some sanity in the frame numbers assigned by the microframe
scheduler and (as per the future patch which rewrites it) I think that
the current microframe scheduler is quite insane.  However, it seems
like splitting this up from the microframe scheduler patch breaks things
into smaller chunks and hopefully adds to clarity rather than reduces
it.  The two patches could certainly be squashed.  Note that at the very
least, I don't see any obvious bad behavior introduced with just this
patch.

I've attempted to keep the config parameter to disable the microframe
scheduler intact in this change, though I'm not sure it's worth it.
Obviously the code is touched a lot so it's possible I regressed
something when the microframe scheduler is disabled, though I did some
basic testing and it seemed to work OK.  I'm still not 100% sure why you
wouldn't want the microframe scheduler (presuming it works), so maybe a
future patch (or a future version of this patch?) could remove that
parameter.

Signed-off-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Tested-by: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>
Tested-by: Stefan Wahren <stefan.wahren-eS4NqCHxEME@public.gmane.org>
---
Changes in v6:
- Fix bug where periodic things get scheduled too quickly (Alan Stern)
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Manage frame nums better in scheduler new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd.h       |  10 +-
 drivers/usb/dwc2/hcd_queue.c | 351 ++++++++++++++++++++++++++++++++-----------
 2 files changed, 272 insertions(+), 89 deletions(-)

diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index 10c35585a2bd..fd266ac53a28 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -244,8 +244,11 @@ enum dwc2_transaction_type {
  *                      the bus.  We'll move the qh to active here.  If the
  *                      host is in high speed mode this will be a uframe.  If
  *                      the host is in low speed mode this will be a full frame.
+ * @start_active_frame: If we are partway through a split transfer, this will be
+ *			what next_active_frame was when we started.  Otherwise
+ *			it should always be the same as next_active_frame.
+ * @assigned_uframe:    The uframe (0 - 7) assigned by dwc2_find_uframe().
  * @frame_usecs:        Internal variable used by the microframe scheduler
- * @start_split_frame:  (Micro)frame at which last start split was initialized
  * @ntd:                Actual number of transfer descriptors in a list
  * @qtd_list:           List of QTDs for this QH
  * @channel:            Host channel currently processing transfers for this QH
@@ -279,8 +282,9 @@ struct dwc2_qh {
 	u16 host_us;
 	u16 host_interval;
 	u16 next_active_frame;
+	u16 start_active_frame;
+	u16 assigned_uframe;
 	u16 frame_usecs[8];
-	u16 start_split_frame;
 	u16 ntd;
 	struct list_head qtd_list;
 	struct dwc2_host_chan *channel;
@@ -746,7 +750,7 @@ do {									\
 	_qtd_ = list_entry((_qh_)->qtd_list.next, struct dwc2_qtd,	\
 			   qtd_list_entry);				\
 	if (usb_pipeint(_qtd_->urb->pipe) &&				\
-	    (_qh_)->start_split_frame != 0 && !_qtd_->complete_split) {	\
+	    (_qh_)->start_active_frame != 0 && !_qtd_->complete_split) { \
 		_hfnum_.d32 = dwc2_readl((_hcd_)->regs + HFNUM);	\
 		switch (_hfnum_.b.frnum & 0x7) {			\
 		case 7:							\
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 9ce407e5017d..9b3c435339ee 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -38,6 +38,7 @@
  * This file contains the functions to manage Queue Heads and Queue
  * Transfer Descriptors for Host mode
  */
+#include <linux/gcd.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/spinlock.h>
@@ -245,6 +246,96 @@ static int dwc2_find_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 }
 
 /**
+ * dwc2_pick_first_frame() - Choose 1st frame for qh that's already scheduled
+ *
+ * Takes a qh that has already been scheduled (which means we know we have the
+ * bandwidth reserved for us) and sets the next_active_frame and the
+ * start_active_frame.
+ *
+ * This is expected to be called on qh's that weren't previously actively
+ * running.  It just picks the next frame that we can fit into without any
+ * thought about the past.
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller
+ * @qh:    QH for a periodic endpoint
+ *
+ */
+static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	u16 frame_number;
+	u16 earliest_frame;
+	u16 next_active_frame;
+	u16 interval;
+
+	/*
+	 * Use the real frame number rather than the cached value as of the
+	 * last SOF to give us a little extra slop.
+	 */
+	frame_number = dwc2_hcd_get_frame_number(hsotg);
+
+	/*
+	 * We wouldn't want to start any earlier than the next frame just in
+	 * case the frame number ticks as we're doing this calculation.
+	 *
+	 * NOTE: if we could quantify how long till we actually get scheduled
+	 * we might be able to avoid the "+ 1" by looking at the upper part of
+	 * HFNUM (the FRREM field).  For now we'll just use the + 1 though.
+	 */
+	earliest_frame = dwc2_frame_num_inc(frame_number, 1);
+	next_active_frame = earliest_frame;
+
+	/* Get the "no microframe scheduler" out of the way... */
+	if (hsotg->core_params->uframe_sched <= 0) {
+		if (qh->do_split)
+			/* Splits are active at microframe 0 minus 1 */
+			next_active_frame |= 0x7;
+		goto exit;
+	}
+
+	/* Adjust interval as per high speed schedule which has 8 uFrame */
+	interval = gcd(qh->host_interval, 8);
+
+	/*
+	 * We know interval must divide (HFNUM_MAX_FRNUM + 1) now that we've
+	 * done the gcd(), so it's safe to move to the beginning of the current
+	 * interval like this.
+	 *
+	 * After this we might be before earliest_frame, but don't worry,
+	 * we'll fix it...
+	 */
+	next_active_frame = (next_active_frame / interval) * interval;
+
+	/*
+	 * Actually choose to start at the frame number we've been
+	 * scheduled for.
+	 */
+	next_active_frame = dwc2_frame_num_inc(next_active_frame,
+					       qh->assigned_uframe);
+
+	/*
+	 * We actually need 1 frame before since the next_active_frame is
+	 * the frame number we'll be put on the ready list and we won't be on
+	 * the bus until 1 frame later.
+	 */
+	next_active_frame = dwc2_frame_num_dec(next_active_frame, 1);
+
+	/*
+	 * By now we might actually be before the earliest_frame.  Let's move
+	 * up intervals until we're not.
+	 */
+	while (dwc2_frame_num_gt(earliest_frame, next_active_frame))
+		next_active_frame = dwc2_frame_num_inc(next_active_frame,
+						       interval);
+
+exit:
+	qh->next_active_frame = next_active_frame;
+	qh->start_active_frame = next_active_frame;
+
+	dwc2_sch_vdbg(hsotg, "QH=%p First fn=%04x nxt=%04x\n",
+		     qh, frame_number, qh->next_active_frame);
+}
+
+/**
  * dwc2_do_reserve() - Make a periodic reservation
  *
  * Try to allocate space in the periodic schedule.  Depending on parameters
@@ -260,25 +351,9 @@ static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	int status;
 
 	if (hsotg->core_params->uframe_sched > 0) {
-		int frame = -1;
-
 		status = dwc2_find_uframe(hsotg, qh);
-		if (status == 0)
-			frame = 7;
-		else if (status > 0)
-			frame = status - 1;
-
-		/* Set the new frame up */
-		if (frame >= 0) {
-			qh->next_active_frame &= ~0x7;
-			qh->next_active_frame |= (frame & 7);
-			dwc2_sch_dbg(hsotg,
-				     "QH=%p sched_p nxt=%04x, uf=%d\n",
-				     qh, qh->next_active_frame, frame);
-		}
-
-		if (status > 0)
-			status = 0;
+		if (status >= 0)
+			qh->assigned_uframe = status;
 	} else {
 		status = dwc2_periodic_channel_available(hsotg);
 		if (status) {
@@ -305,6 +380,8 @@ static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	/* Update claimed usecs per (micro)frame */
 	hsotg->periodic_usecs += qh->host_us;
 
+	dwc2_pick_first_frame(hsotg, qh);
+
 	return 0;
 }
 
@@ -460,6 +537,16 @@ static int dwc2_schedule_periodic(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		status = dwc2_do_reserve(hsotg, qh);
 		if (status)
 			return status;
+	} else {
+		/*
+		 * It might have been a while, so make sure that frame_number
+		 * is still good.  Note: we could also try to use the similar
+		 * dwc2_next_periodic_start() but that schedules much more
+		 * tightly and we might need to hurry and queue things up.
+		 */
+		if (dwc2_frame_num_le(qh->next_active_frame,
+				      hsotg->frame_number))
+			dwc2_pick_first_frame(hsotg, qh);
 	}
 
 	qh->unreserve_pending = 0;
@@ -520,7 +607,6 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg,
  * @urb:   Holds the information about the device/endpoint needed to initialize
  *         the QH
  */
-#define SCHEDULE_SLOP 10
 static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 			 struct dwc2_hcd_urb *urb)
 {
@@ -569,11 +655,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 			      qh->ep_type == USB_ENDPOINT_XFER_ISOC,
 			      bytecount));
 
-		/* Ensure frame_number corresponds to the reality */
-		hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
-		/* Start in a slightly future (micro)frame */
-		qh->next_active_frame = dwc2_frame_num_inc(hsotg->frame_number,
-						     SCHEDULE_SLOP);
 		qh->host_interval = urb->interval;
 		dwc2_sch_dbg(hsotg, "QH=%p init nxt=%04x, fn=%04x, int=%#x\n",
 			     qh, qh->next_active_frame, hsotg->frame_number,
@@ -589,8 +670,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		    (dev_speed == USB_SPEED_LOW ||
 		     dev_speed == USB_SPEED_FULL)) {
 			qh->host_interval *= 8;
-			qh->next_active_frame |= 0x7;
-			qh->start_split_frame = qh->next_active_frame;
 			dwc2_sch_dbg(hsotg,
 				     "QH=%p init*8 nxt=%04x, fn=%04x, int=%#x\n",
 				     qh, qh->next_active_frame,
@@ -738,22 +817,12 @@ int dwc2_hcd_qh_add(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		/* QH already in a schedule */
 		return 0;
 
-	if (!dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number) &&
-			!hsotg->frame_number) {
-		u16 new_frame;
-
-		dev_dbg(hsotg->dev,
-				"reset frame number counter\n");
-		new_frame = dwc2_frame_num_inc(hsotg->frame_number,
-				SCHEDULE_SLOP);
-
-		dwc2_sch_vdbg(hsotg, "QH=%p reset nxt=%04x=>%04x\n",
-			      qh, qh->next_active_frame, new_frame);
-		qh->next_active_frame = new_frame;
-	}
-
 	/* Add the new QH to the appropriate schedule */
 	if (dwc2_qh_is_non_per(qh)) {
+		/* Schedule right away */
+		qh->start_active_frame = hsotg->frame_number;
+		qh->next_active_frame = qh->start_active_frame;
+
 		/* Always start in inactive schedule */
 		list_add_tail(&qh->qh_list_entry,
 			      &hsotg->non_periodic_sched_inactive);
@@ -807,46 +876,145 @@ void dwc2_hcd_qh_unlink(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	}
 }
 
-/*
- * Schedule the next continuing periodic split transfer
+/**
+ * dwc2_next_for_periodic_split() - Set next_active_frame midway thru a split.
+ *
+ * This is called for setting next_active_frame for periodic splits for all but
+ * the first packet of the split.  Confusing?  I thought so...
+ *
+ * Periodic splits are single low/full speed transfers that we end up splitting
+ * up into several high speed transfers.  They always fit into one full (1 ms)
+ * frame but might be split over several microframes (125 us each).  We need
+ * to put each of the parts on a very specific high speed frame.
+ *
+ * This function figures out where the next active uFrame needs to be.
+ *
+ * @hsotg:        The HCD state structure
+ * @qh:           QH for the periodic transfer.
+ * @frame_number: The current frame number.
+ *
+ * Return: number missed by (or 0 if we didn't miss).
  */
-static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
-				      struct dwc2_qh *qh, u16 frame_number,
-				      int sched_next_periodic_split)
+static int dwc2_next_for_periodic_split(struct dwc2_hsotg *hsotg,
+					 struct dwc2_qh *qh, u16 frame_number)
 {
-	u16 incr;
 	u16 old_frame = qh->next_active_frame;
+	u16 prev_frame_number = dwc2_frame_num_dec(frame_number, 1);
+	int missed = 0;
+	u16 incr;
+
+	/*
+	 * Basically: increment 1 normally, but 2 right after the start split
+	 * (except for ISOC out).
+	 */
+	if (old_frame == qh->start_active_frame &&
+	    !(qh->ep_type == USB_ENDPOINT_XFER_ISOC && !qh->ep_is_in))
+		incr = 2;
+	else
+		incr = 1;
+
+	qh->next_active_frame = dwc2_frame_num_inc(old_frame, incr);
 
-	if (sched_next_periodic_split) {
+	/*
+	 * Note that it's OK for frame_number to be 1 frame past
+	 * next_active_frame.  Remember that next_active_frame is supposed to
+	 * be 1 frame _before_ when we want to be scheduled.  If we're 1 frame
+	 * past it just means schedule ASAP.
+	 *
+	 * It's _not_ OK, however, if we're more than one frame past.
+	 */
+	if (dwc2_frame_num_gt(prev_frame_number, qh->next_active_frame)) {
+		/*
+		 * OOPS, we missed.  That's actually pretty bad since
+		 * the hub will be unhappy; try ASAP I guess.
+		 */
+		missed = dwc2_frame_num_dec(prev_frame_number,
+					    qh->next_active_frame);
 		qh->next_active_frame = frame_number;
-		incr = dwc2_frame_num_inc(qh->start_split_frame, 1);
-		if (dwc2_frame_num_le(frame_number, incr)) {
-			/*
-			 * Allow one frame to elapse after start split
-			 * microframe before scheduling complete split, but
-			 * DON'T if we are doing the next start split in the
-			 * same frame for an ISOC out
-			 */
-			if (qh->ep_type != USB_ENDPOINT_XFER_ISOC ||
-			    qh->ep_is_in != 0) {
-				qh->next_active_frame = dwc2_frame_num_inc(
-					qh->next_active_frame, 1);
-			}
-		}
-	} else {
-		qh->next_active_frame =
-			dwc2_frame_num_inc(qh->start_split_frame,
-					   qh->host_interval);
-		if (dwc2_frame_num_le(qh->next_active_frame, frame_number))
-			qh->next_active_frame = frame_number;
-		qh->next_active_frame |= 0x7;
-		qh->start_split_frame = qh->next_active_frame;
 	}
 
-	dwc2_sch_vdbg(hsotg, "QH=%p next(%d) fn=%04x, nxt=%04x=>%04x (%+d)\n",
-		      qh, sched_next_periodic_split, frame_number, old_frame,
-		      qh->next_active_frame,
-		      dwc2_frame_num_dec(qh->next_active_frame, old_frame));
+	return missed;
+}
+
+/**
+ * dwc2_next_periodic_start() - Set next_active_frame for next transfer start
+ *
+ * This is called for setting next_active_frame for a periodic transfer for
+ * all cases other than midway through a periodic split.  This will also update
+ * start_active_frame.
+ *
+ * Since we _always_ keep start_active_frame as the start of the previous
+ * transfer this is normally pretty easy: we just add our interval to
+ * start_active_frame and we've got our answer.
+ *
+ * The tricks come into play if we miss.  In that case we'll look for the next
+ * slot we can fit into.
+ *
+ * @hsotg:        The HCD state structure
+ * @qh:           QH for the periodic transfer.
+ * @frame_number: The current frame number.
+ *
+ * Return: number missed by (or 0 if we didn't miss).
+ */
+static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
+				     struct dwc2_qh *qh, u16 frame_number)
+{
+	int missed = 0;
+	u16 interval = qh->host_interval;
+	u16 prev_frame_number = dwc2_frame_num_dec(frame_number, 1);
+
+	qh->start_active_frame = dwc2_frame_num_inc(qh->start_active_frame,
+						    interval);
+
+	/*
+	 * The dwc2_frame_num_gt() function used below won't work terribly well
+	 * if we just incremented by a really large interval since the
+	 * frame counter only goes to 0x3fff.  It's terribly unlikely that we
+	 * will have missed in this case anyway.  Just go to exit.  If we want
+	 * to try to do better we'll need to keep track of a bigger counter
+	 * somewhere in the driver and handle overflows.
+	 */
+	if (interval >= 0x1000)
+		goto exit;
+
+	/*
+	 * Test for misses, which is when it's too late to schedule.
+	 *
+	 * A few things to note:
+	 * - We compare against prev_frame_number since start_active_frame
+	 *   and next_active_frame are always 1 frame before we want things
+	 *   to be active and we assume we can still get scheduled in the
+	 *   current frame number.
+	 * - Some misses are expected.  Specifically, in order to work
+	 *   perfectly dwc2 really needs quite spectacular interrupt latency
+	 *   requirements.  It needs to be able to handle its interrupts
+	 *   completely within 125 us of them being asserted. That not only
+	 *   means that the dwc2 interrupt handler needs to be fast but it
+	 *   means that nothing else in the system has to block dwc2 for a long
+	 *   time.  We can help with the dwc2 parts of this, but it's hard to
+	 *   guarantee that a system will have interrupt latency < 125 us, so
+	 *   we have to be robust to some misses.
+	 */
+	if (dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
+		u16 ideal_start = qh->start_active_frame;
+
+		/* Adjust interval as per gcd with plan length. */
+		interval = gcd(interval, 8);
+
+		do {
+			qh->start_active_frame = dwc2_frame_num_inc(
+				qh->start_active_frame, interval);
+		} while (dwc2_frame_num_gt(prev_frame_number,
+					   qh->start_active_frame));
+
+		missed = dwc2_frame_num_dec(qh->start_active_frame,
+					    ideal_start);
+	}
+
+exit:
+	qh->next_active_frame = qh->start_active_frame;
+
+	return missed;
 }
 
 /*
@@ -865,7 +1033,9 @@ static void dwc2_sched_periodic_split(struct dwc2_hsotg *hsotg,
 void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 			    int sched_next_periodic_split)
 {
+	u16 old_frame = qh->next_active_frame;
 	u16 frame_number;
+	int missed;
 
 	if (dbg_qh(qh))
 		dev_vdbg(hsotg->dev, "%s()\n", __func__);
@@ -878,30 +1048,39 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		return;
 	}
 
+	/*
+	 * Use the real frame number rather than the cached value as of the
+	 * last SOF just to get us a little closer to reality.  Note that
+	 * means we don't actually know if we've already handled the SOF
+	 * interrupt for this frame.
+	 */
 	frame_number = dwc2_hcd_get_frame_number(hsotg);
 
-	if (qh->do_split) {
-		dwc2_sched_periodic_split(hsotg, qh, frame_number,
-					  sched_next_periodic_split);
-	} else {
-		qh->next_active_frame = dwc2_frame_num_inc(
-			qh->next_active_frame, qh->host_interval);
-		if (dwc2_frame_num_le(qh->next_active_frame, frame_number))
-			qh->next_active_frame = frame_number;
-	}
+	if (sched_next_periodic_split)
+		missed = dwc2_next_for_periodic_split(hsotg, qh, frame_number);
+	else
+		missed = dwc2_next_periodic_start(hsotg, qh, frame_number);
+
+	dwc2_sch_vdbg(hsotg,
+		     "QH=%p next(%d) fn=%04x, sch=%04x=>%04x (%+d) miss=%d %s\n",
+		     qh, sched_next_periodic_split, frame_number, old_frame,
+		     qh->next_active_frame,
+		     dwc2_frame_num_dec(qh->next_active_frame, old_frame),
+		     missed, missed ? "MISS" : "");
 
 	if (list_empty(&qh->qtd_list)) {
 		dwc2_hcd_qh_unlink(hsotg, qh);
 		return;
 	}
+
 	/*
 	 * Remove from periodic_sched_queued and move to
 	 * appropriate queue
+	 *
+	 * Note: we purposely use the frame_number from the "hsotg" structure
+	 * since we know SOF interrupt will handle future frames.
 	 */
-	if ((hsotg->core_params->uframe_sched > 0 &&
-	     dwc2_frame_num_le(qh->next_active_frame, frame_number)) ||
-	    (hsotg->core_params->uframe_sched <= 0 &&
-	     qh->next_active_frame == frame_number))
+	if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
 		list_move_tail(&qh->qh_list_entry,
 			       &hsotg->periodic_sched_ready);
 	else
-- 
2.7.0.rc3.207.g0ac5344
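
The arithmetic in dwc2_pick_first_frame() above can be followed with a standalone sketch. This is an illustration under assumptions, not the driver code: pick_first_frame(), frame_inc(), frame_dec(), frame_gt() and gcd16() are hypothetical names, the counter is assumed to wrap at 0x3fff, and only the "microframe scheduler enabled" path is modeled.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRNUM 0x3fff	/* assumed 14-bit frame counter */

static inline uint16_t frame_inc(uint16_t frame, uint16_t inc)
{
	return (uint16_t)(((unsigned int)frame + inc) & MAX_FRNUM);
}

static inline uint16_t frame_dec(uint16_t a, uint16_t b)
{
	return (uint16_t)(((unsigned int)a - (unsigned int)b) & MAX_FRNUM);
}

/* True if a is strictly after b within half the counter range. */
static inline int frame_gt(uint16_t a, uint16_t b)
{
	return a != b && frame_dec(a, b) < (MAX_FRNUM >> 1);
}

static inline uint16_t gcd16(uint16_t a, uint16_t b)
{
	while (b) {
		uint16_t t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Pick the first "ready list" frame for a freshly scheduled periodic QH:
 *  1. never earlier than the frame after "now";
 *  2. snap back to the start of the current interval;
 *  3. add the uFrame slot the scheduler assigned;
 *  4. subtract 1, since the QH is on the bus one frame after it is ready;
 *  5. step forward by whole intervals until step 1 holds again.
 */
static uint16_t pick_first_frame(uint16_t frame_number,
				 uint16_t host_interval,
				 uint16_t assigned_uframe)
{
	uint16_t earliest = frame_inc(frame_number, 1);
	uint16_t interval;
	uint16_t next;

	assert(host_interval > 0);
	interval = gcd16(host_interval, 8);

	next = (uint16_t)((earliest / interval) * interval);
	next = frame_inc(next, assigned_uframe);
	next = frame_dec(next, 1);
	while (frame_gt(earliest, next))
		next = frame_inc(next, interval);

	return next;
}
```

For a current frame of 0x123, an interval of 8 and assigned uFrame 2, the sketch returns 0x129, so the transfer would go on the bus in frame 0x12a, whose low three bits are 2.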

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
  2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
                   ` (16 preceding siblings ...)
  2016-01-29  2:20   ` Douglas Anderson
@ 2016-01-29  2:20 ` Douglas Anderson
  2016-01-31  9:36     ` Kever Yang
  2016-01-29  2:20   ` Douglas Anderson
                   ` (4 subsequent siblings)
  22 siblings, 1 reply; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

In dwc2_hcd_qh_deactivate() we will put some things on the
periodic_sched_ready list.  These things won't be taken off the ready
list until the next SOF, which might be a little late.  Let's select
and queue their transactions right away instead of waiting.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Schedule periodic right away if it's time new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 9b3c435339ee..3abb34a5fc5b 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 	 * Note: we purposely use the frame_number from the "hsotg" structure
 	 * since we know SOF interrupt will handle future frames.
 	 */
-	if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
+	if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
+		enum dwc2_transaction_type tr_type;
+
+		/*
+		 * We're bypassing the SOF handler which is normally what puts
+		 * us on the ready list because we're in a hurry and need to
+		 * try to catch up.
+		 */
+		dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
+			      qh, frame_number, qh->next_active_frame);
 		list_move_tail(&qh->qh_list_entry,
 			       &hsotg->periodic_sched_ready);
-	else
+
+		tr_type = dwc2_hcd_select_transactions(hsotg);
+		if (tr_type != DWC2_TRANSACTION_NONE)
+			dwc2_hcd_queue_transactions(hsotg, tr_type);
+	} else {
 		list_move_tail(&qh->qh_list_entry,
 			       &hsotg->periodic_sched_inactive);
+	}
 }
 
 /**
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 19/22] usb: dwc2: host: Add dwc2_hcd_get_future_frame_number() call
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

As we start getting more exact about our scheduling it's becoming more
and more important to know exactly how far through the current frame we
are.  This lets us make decisions about whether there's still time left
to start a new transaction in the current frame.

We'll add dwc2_hcd_get_future_frame_number() which will tell you what
the frame number will be a certain number of microseconds (us) from
now.  We can use this information to help decide if there's enough time
left in the frame for a transaction that will take a certain duration.

This is expected to be used by a future change ("usb: dwc2: host:
Properly set even/odd frame").
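
As a rough worked example of the calculation described above, here is a standalone sketch. The function and parameter names (future_frame_number(), div_round_up(), frame_inc()) are hypothetical, the inputs stand in for the frame number, the frame interval and the "time remaining" count that the driver would read from hardware registers, and the value 7500 PHY clocks per 125 us microframe is only an illustrative assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FRNUM 0x3fff	/* assumed 14-bit frame counter */

static inline uint16_t frame_inc(uint16_t frame, uint16_t inc)
{
	return (uint16_t)(((unsigned int)frame + inc) & MAX_FRNUM);
}

static inline unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/*
 * Frame number expected "us" microseconds from now.
 *
 * interval:     PHY clocks per (micro)frame
 * remaining:    PHY clocks left in the current (micro)frame
 * us_per_frame: 125 for high speed, 1000 otherwise
 */
static uint16_t future_frame_number(uint16_t frame_number,
				    unsigned int interval,
				    unsigned int remaining,
				    unsigned int us,
				    unsigned int us_per_frame)
{
	unsigned int phy_clks;

	assert(interval > 0 && us_per_frame > 0 && remaining <= interval);

	/* Clocks since the last frame tick, as of "us" from now. */
	phy_clks = (interval - remaining) +
		   div_round_up(interval * us, us_per_frame);

	return frame_inc(frame_number, (uint16_t)(phy_clks / interval));
}
```

With 7500 clocks per microframe and 1500 clocks remaining, 6000 clocks have already elapsed. Asking about 10 us ahead adds 600 clocks and stays in the same microframe, while asking about 30 us ahead adds 1800 clocks and lands in the next one.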

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Add dwc2_hcd_get_future_frame_number() call new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/core.h |  4 ++++
 drivers/usb/dwc2/hcd.c  | 29 +++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 64d45a2053bb..52cbea28d0e9 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -1235,12 +1235,16 @@ static inline int dwc2_hsotg_set_test_mode(struct dwc2_hsotg *hsotg,
 
 #if IS_ENABLED(CONFIG_USB_DWC2_HOST) || IS_ENABLED(CONFIG_USB_DWC2_DUAL_ROLE)
 extern int dwc2_hcd_get_frame_number(struct dwc2_hsotg *hsotg);
+extern int dwc2_hcd_get_future_frame_number(struct dwc2_hsotg *hsotg, int us);
 extern void dwc2_hcd_connect(struct dwc2_hsotg *hsotg);
 extern void dwc2_hcd_disconnect(struct dwc2_hsotg *hsotg, bool force);
 extern void dwc2_hcd_start(struct dwc2_hsotg *hsotg);
 #else
 static inline int dwc2_hcd_get_frame_number(struct dwc2_hsotg *hsotg)
 { return 0; }
+static inline int dwc2_hcd_get_future_frame_number(struct dwc2_hsotg *hsotg,
+						   int us)
+{ return 0; }
 static inline void dwc2_hcd_connect(struct dwc2_hsotg *hsotg) {}
 static inline void dwc2_hcd_disconnect(struct dwc2_hsotg *hsotg, bool force) {}
 static inline void dwc2_hcd_start(struct dwc2_hsotg *hsotg) {}
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index f48da015fa5e..8edd0b45f41c 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -1947,6 +1947,35 @@ int dwc2_hcd_get_frame_number(struct dwc2_hsotg *hsotg)
 	return (hfnum & HFNUM_FRNUM_MASK) >> HFNUM_FRNUM_SHIFT;
 }
 
+int dwc2_hcd_get_future_frame_number(struct dwc2_hsotg *hsotg, int us)
+{
+	u32 hprt = dwc2_readl(hsotg->regs + HPRT0);
+	u32 hfir = dwc2_readl(hsotg->regs + HFIR);
+	u32 hfnum = dwc2_readl(hsotg->regs + HFNUM);
+	unsigned int us_per_frame;
+	unsigned int frame_number;
+	unsigned int remaining;
+	unsigned int interval;
+	unsigned int phy_clks;
+
+	/* High speed has 125 us per (micro) frame; others are 1 ms per */
+	us_per_frame = (hprt & HPRT0_SPD_MASK) ? 1000 : 125;
+
+	/* Extract fields */
+	frame_number = (hfnum & HFNUM_FRNUM_MASK) >> HFNUM_FRNUM_SHIFT;
+	remaining = (hfnum & HFNUM_FRREM_MASK) >> HFNUM_FRREM_SHIFT;
+	interval = (hfir & HFIR_FRINT_MASK) >> HFIR_FRINT_SHIFT;
+
+	/*
+	 * Number of phy clocks since the last tick of the frame number after
+	 * "us" has passed.
+	 */
+	phy_clks = (interval - remaining) +
+		   DIV_ROUND_UP(interval * us, us_per_frame);
+
+	return dwc2_frame_num_inc(frame_number, phy_clks / interval);
+}
+
 int dwc2_hcd_is_b_host(struct dwc2_hsotg *hsotg)
 {
 	return hsotg->op_state == OTG_STATE_B_HOST;
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame
  2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
                   ` (18 preceding siblings ...)
  2016-01-29  2:20   ` Douglas Anderson
@ 2016-01-29  2:20 ` Douglas Anderson
  2016-02-02  7:46   ` Kever Yang
  2016-01-29  2:20   ` Douglas Anderson
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

When setting up ISO and INT transfers dwc2 needs to specify whether the
transfer is for an even or an odd frame (or microframe if the controller
is running in high speed mode).

The controller appears to use this as a simple way to figure out if a
transfer should happen right away (in the current microframe) or should
happen at the start of the next microframe.  Said another way:

- If you set "odd" and the current frame number is odd it appears that
  the controller will try to transfer right away.  Same thing if you set
  "even" and the current frame number is even.
- If the oddness you set and the oddness of the frame number are
  _different_, the transfer will be delayed until the frame number
  changes.

As I understand it, the above technique allows you to plan ahead of time
where possible by always working on the next frame.  ...but it still
allows you to properly respond immediately to things that happened in
the previous frame.
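
The rule in the list above can be written as a small predicate.  This is a
minimal sketch in plain C, not driver code: xfer_starts_now() and
odd_bit_for_frame() are hypothetical helper names, and the behaviour they model
is the one inferred above from observation, not taken from controller
documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the observed behaviour: a periodic transfer goes out
 * in the current (micro)frame only when the programmed odd/even bit matches
 * the parity of the current frame number; otherwise it waits for the tick.
 */
static bool xfer_starts_now(uint16_t cur_frame, bool odd_bit_set)
{
	bool cur_is_odd = (cur_frame & 1) != 0;

	return cur_is_odd == odd_bit_set;
}

/*
 * Parity to program so that the transfer lands in target_frame.  If
 * target_frame is the current frame the transfer starts right away; if it is
 * the next frame the transfer waits for one tick.
 */
static bool odd_bit_for_frame(uint16_t target_frame)
{
	return (target_frame & 1) != 0;
}
```

For example, with the frame counter at 4, programming "even" sends in frame 4,
while programming "odd" defers the transfer to frame 5.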

The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
It always looked at the frame number and set up the transfer to happen in
the next frame.  In some cases that meant that certain transactions
would be transferred in the wrong frame.

We'll try our best to set the even / odd to do the transfer in the
scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
We'll also modify the scheduler code to handle this and not try to
schedule a second transfer for the same frame.
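
As a rough illustration of that fallback, the sketch below models the decision
with plain integers instead of register reads.  All names in it
(future_frame(), pick_wire_frame(), frame_gt(), FRNUM_MASK) are hypothetical
stand-ins, the frame counter is assumed to wrap at 0x4000, and the arithmetic
mirrors dwc2_hcd_get_future_frame_number() from the previous patch in this
series.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRNUM_MASK 0x3fffu	/* assumed: frame counter wraps at 0x4000 */

/* Round-up integer division, like the kernel's DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/* True if frame a is strictly after frame b, allowing for wraparound. */
static bool frame_gt(uint16_t a, uint16_t b)
{
	return a != b && ((a - b) & FRNUM_MASK) < (FRNUM_MASK >> 1);
}

/*
 * Frame number we will be in "us" microseconds from now.
 * interval:  PHY clocks per (micro)frame
 * remaining: PHY clocks left in the current (micro)frame
 */
static uint16_t future_frame(uint16_t cur_frame, unsigned int interval,
			     unsigned int remaining, unsigned int us,
			     unsigned int us_per_frame)
{
	unsigned int phy_clks = (interval - remaining) +
				div_round_up(interval * us, us_per_frame);

	return (cur_frame + phy_clks / interval) & FRNUM_MASK;
}

/*
 * Keep the scheduled wire frame if the transfer can still finish in it;
 * otherwise fall back to the frame the transfer would finish in ("ASAP").
 * The odd/even bit is then taken from the parity of the returned frame.
 */
static uint16_t pick_wire_frame(uint16_t scheduled_wire_frame,
				uint16_t finish_frame)
{
	if (frame_gt(finish_frame, scheduled_wire_frame))
		return finish_frame;

	return scheduled_wire_frame;
}
```

With hypothetical values of 7500 PHY clocks per microframe and 1500 of them
remaining, a transfer estimated at 20 us still finishes in the current
microframe, while one estimated at 30 us spills into the next.  The second case
is the one the patch logs as "EO MISS".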

Note that this change relies on the work to redo the microframe
scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
in scheduler") but it works even better after ("usb: dwc2: host: Totally
redo the microframe scheduler").

With this change my stressful USB test (USB webcam + USB audio +
keyboards) has less audio crackling than before.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- Properly set even/odd frame new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/core.c      | 92 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/usb/dwc2/hcd_queue.c | 11 +++++-
 2 files changed, 100 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
index a5db20f12ee4..c143f26bd9d9 100644
--- a/drivers/usb/dwc2/core.c
+++ b/drivers/usb/dwc2/core.c
@@ -1703,9 +1703,97 @@ static void dwc2_hc_set_even_odd_frame(struct dwc2_hsotg *hsotg,
 {
 	if (chan->ep_type == USB_ENDPOINT_XFER_INT ||
 	    chan->ep_type == USB_ENDPOINT_XFER_ISOC) {
-		/* 1 if _next_ frame is odd, 0 if it's even */
-		if (!(dwc2_hcd_get_frame_number(hsotg) & 0x1))
+		int host_speed;
+		int xfer_ns;
+		int xfer_us;
+		int bytes_in_fifo;
+		u16 fifo_space;
+		u16 frame_number;
+		u16 wire_frame;
+
+		/*
+		 * Try to figure out if we're an even or odd frame. If we set
+		 * even and the current frame number is even the transfer
+		 * will happen immediately.  Similar if both are odd. If one is
+		 * even and the other is odd then the transfer will happen when
+		 * the frame number ticks.
+		 *
+		 * There's a bit of a balancing act to get this right.
+		 * Sometimes we may want to send data in the current frame (AKA
+		 * right away).  We might want to do this if the frame number
+		 * _just_ ticked, but we might also want to do this in order
+		 * to continue a split transaction that happened late in a
+		 * microframe (so we didn't know to queue the next transfer
+		 * until the frame number had ticked).  The problem is that we
+		 * need a lot of knowledge to know if there's actually still
+		 * time to send things or if it would be better to wait until
+		 * the next frame.
+		 *
+		 * We can look at how much time is left in the current frame
+		 * and make a guess about whether we'll have time to transfer.
+		 * We'll do that.
+		 */
+
+		/* Get speed host is running at */
+		host_speed = (chan->speed != USB_SPEED_HIGH &&
+			      !chan->do_split) ? chan->speed : USB_SPEED_HIGH;
+
+		/* See how many bytes are in the periodic FIFO right now */
+		fifo_space = (dwc2_readl(hsotg->regs + HPTXSTS) &
+			      TXSTS_FSPCAVAIL_MASK) >> TXSTS_FSPCAVAIL_SHIFT;
+		bytes_in_fifo = sizeof(u32) *
+				(hsotg->core_params->host_perio_tx_fifo_size -
+				 fifo_space);
+
+		/*
+		 * Roughly estimate bus time for everything in the periodic
+		 * queue + our new transfer.  This is "rough" because we're
+		 * using a function that takes into account IN/OUT
+		 * and INT/ISO and we're just slamming in one value for all
+		 * transfers.  This should be an over-estimate and that should
+		 * be OK, but we can probably tighten it.
+		 */
+		xfer_ns = usb_calc_bus_time(host_speed, false, false,
+					    chan->xfer_len + bytes_in_fifo);
+		xfer_us = NS_TO_US(xfer_ns);
+
+		/* See what frame number we'll be at by the time we finish */
+		frame_number = dwc2_hcd_get_future_frame_number(hsotg, xfer_us);
+
+		/* This is when we were scheduled to be on the wire */
+		wire_frame = dwc2_frame_num_inc(chan->qh->next_active_frame, 1);
+
+		/*
+		 * If we'd finish _after_ the frame we're scheduled in then
+		 * it's hopeless.  Just schedule right away and hope for the
+		 * best.  Note that it _might_ be wise to call back into the
+		 * scheduler to pick a better frame, but this is better than
+		 * nothing.
+		 */
+		if (dwc2_frame_num_gt(frame_number, wire_frame)) {
+			dwc2_sch_vdbg(hsotg,
+				      "QH=%p EO MISS fr=%04x=>%04x (%+d)\n",
+				      chan->qh, wire_frame, frame_number,
+				      dwc2_frame_num_dec(frame_number,
+							 wire_frame));
+			wire_frame = frame_number;
+
+			/*
+			 * We picked a different frame number; communicate this
+			 * back to the scheduler so it doesn't try to schedule
+			 * another in the same frame.
+			 *
+			 * Remember that next_active_frame is 1 before the wire
+			 * frame.
+			 */
+			chan->qh->next_active_frame =
+				dwc2_frame_num_dec(frame_number, 1);
+		}
+
+		if (wire_frame & 1)
 			*hcchar |= HCCHAR_ODDFRM;
+		else
+			*hcchar &= ~HCCHAR_ODDFRM;
 	}
 }
 
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 3abb34a5fc5b..5f909747b5a4 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -985,6 +985,14 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
 	 *   and next_active_frame are always 1 frame before we want things
 	 *   to be active and we assume we can still get scheduled in the
 	 *   current frame number.
+	 * - It's possible for start_active_frame (now incremented) to be
+	 *   next_active_frame if we got an EO MISS (even_odd miss) which
+	 *   basically means that we detected there wasn't enough time for
+	 *   the last packet and dwc2_hc_set_even_odd_frame() rescheduled us
+	 *   at the last second.  We want to make sure we don't schedule
+	 *   another transfer for the same frame.  My test webcam doesn't seem
+	 *   terribly upset by missing a transfer but really doesn't like when
+	 *   we do two transfers in the same frame.
 	 * - Some misses are expected.  Specifically, in order to work
 	 *   perfectly dwc2 really needs quite spectacular interrupt latency
 	 *   requirements.  It needs to be able to handle its interrupts
@@ -995,7 +1003,8 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
 	 *   guarantee that a system will have interrupt latency < 125 us, so
 	 *   we have to be robust to some misses.
 	 */
-	if (dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
+	if (qh->start_active_frame == qh->next_active_frame ||
+	    dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
 		u16 ideal_start = qh->start_active_frame;
 
 		/* Adjust interval as per gcd with plan length. */
-- 
2.7.0.rc3.207.g0ac5344

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 21/22] usb: dwc2: host: Totally redo the microframe scheduler
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

This totally reimplements the microframe scheduler in dwc2 to attempt to
handle periodic splits properly.  The old code didn't even try, so this
was a significant effort since periodic splits are one of the most
complicated things in USB.

I've attempted to keep the old "don't use the microframe" scheduler
around for now, but I'm not sure it's needed.  It has also only been lightly
tested.

I think it's pretty certain that this scheduler isn't perfect and might
have some bugs, but it seems much better than what was there before.
With this change my stressful USB test (USB webcam + USB audio + some
keyboards) crackles less.
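
The pmap_schedule() comment in this patch describes folding each requested
interval down to gcd(interval, periods_in_map) and packing reservations toward
the front of the schedule.  The sketch below is a simplified first-fit model of
that idea, assuming 8 periods of 5 bits each as in the comment's example.  It
is not the patch's implementation, and sketch_schedule(), mark(), range_free()
and gcd_uint() are hypothetical names.

```c
#include <stdbool.h>

#define PERIODS		8	/* periods in the schedule (example value) */
#define BITS_PER_PERIOD	5	/* schedulable bits per period (example value) */
#define MAP_BITS	(PERIODS * BITS_PER_PERIOD)

static unsigned int gcd_uint(unsigned int a, unsigned int b)
{
	while (b) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* True if num_bits starting at "start" are free in every repetition. */
static bool range_free(const bool *map, int start, int num_bits, int step)
{
	for (int base = start; base < MAP_BITS; base += step)
		for (int i = 0; i < num_bits; i++)
			if (map[base + i])
				return false;
	return true;
}

/* Set or clear a reservation made at "start" with the given interval. */
static void mark(bool *map, int start, int num_bits, int interval, bool used)
{
	int step = gcd_uint(interval, PERIODS) * BITS_PER_PERIOD;

	for (int base = start; base < MAP_BITS; base += step)
		for (int i = 0; i < num_bits; i++)
			map[base + i] = used;
}

/* First-fit reservation: returns the starting bit, or -1 if nothing fits. */
static int sketch_schedule(bool *map, int num_bits, int interval)
{
	/* The schedule repeats, so only gcd(interval, PERIODS) matters. */
	int step = gcd_uint(interval, PERIODS) * BITS_PER_PERIOD;

	for (int start = 0; start < step; start++) {
		/* A reservation may not straddle a period boundary. */
		if (start % BITS_PER_PERIOD + num_bits > BITS_PER_PERIOD)
			continue;
		if (range_free(map, start, num_bits, step)) {
			mark(map, start, num_bits, interval, true);
			return start;
		}
	}
	return -1;
}
```

Replaying the example from the comment (2 bits every 2 periods, 3 bits every 3,
1 bit every 4, remove the second reservation, then 1 bit every 6) yields start
bits 0, 2, 5 and 2, which matches the diagram.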

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Removed incorrect limit on number of channels (Heiko Stuebner).
- Fixed order of operations bug in debug print.
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5:
- Moved defines outside of ifdef to avoid gadget-only compile error.

Changes in v4:
- Figured out what the microframe scheduler was supposed to do.
- Microframe rewrite is totally different from v3, hopefully more right.
- Microframe rewrite is later in the series now.

Changes in v3:
- The uframe scheduler patch is folded into optimization series.
- Optimize uframe scheduler "single uframe" case a little.
- uframe scheduler now atop logging patches.
- uframe scheduler now before delayed bandwidth release patches.
- Add defines like EARLY_FRAME_USEC
- Reorder dwc2_deschedule_periodic() in prep for future patches.
- uframe scheduler now shows real usefulness w/ future patches!
- Assuming single_tt is new for v3; not terribly well tested (yet).
- Keep track and use our uframe new for v3.

Changes in v2:
- Totally rewrote uframe scheduler again after writing test code.
- uframe scheduler atop delayed bandwidth release patches.

 drivers/usb/dwc2/core.h      |   85 ++-
 drivers/usb/dwc2/hcd.c       |   87 +++-
 drivers/usb/dwc2/hcd.h       |   79 ++-
 drivers/usb/dwc2/hcd_queue.c | 1170 ++++++++++++++++++++++++++++++++++++------
 4 files changed, 1250 insertions(+), 171 deletions(-)

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 52cbea28d0e9..115925909390 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -592,6 +592,84 @@ struct dwc2_hregs_backup {
 	bool valid;
 };
 
+/*
+ * Constants related to high speed periodic scheduling
+ *
+ * We have a periodic schedule that is DWC2_HS_SCHEDULE_UFRAMES long.  From a
+ * reservation point of view it's assumed that the schedule goes right back to
+ * the beginning after the end of the schedule.
+ *
+ * What does that mean for scheduling things with a long interval?  It means
+ * we'll reserve time for them in every possible microframe that they could
+ * ever be scheduled in.  ...but we'll still only actually schedule them as
+ * often as they were requested.
+ *
+ * We keep our schedule in a "bitmap" structure.  This simplifies having
+ * to keep track of and merge intervals: we just let the bitmap code do most
+ * of the heavy lifting.  In a way scheduling is much like memory allocation.
+ *
+ * We schedule 100us per uframe or 80% of 125us (the maximum amount you're
+ * supposed to schedule for periodic transfers).  That's according to spec.
+ *
+ * Note that though we only schedule 80% of each microframe, the bitmap that we
+ * keep the schedule in is tightly packed (AKA it doesn't have 100us worth of
+ * space for each uFrame).
+ *
+ * Requirements:
+ * - DWC2_HS_SCHEDULE_UFRAMES must evenly divide 0x4000 (HFNUM_MAX_FRNUM + 1)
+ * - DWC2_HS_SCHEDULE_UFRAMES must be 8 times DWC2_LS_SCHEDULE_FRAMES (probably
+ *   could be any multiple of 8 times DWC2_LS_SCHEDULE_FRAMES, but there might
+ *   be bugs).  The 8 comes from the USB spec: number of microframes per frame.
+ */
+#define DWC2_US_PER_UFRAME		125
+#define DWC2_HS_PERIODIC_US_PER_UFRAME	100
+
+#define DWC2_HS_SCHEDULE_UFRAMES	8
+#define DWC2_HS_SCHEDULE_US		(DWC2_HS_SCHEDULE_UFRAMES * \
+					 DWC2_HS_PERIODIC_US_PER_UFRAME)
+
+/*
+ * Constants related to low speed scheduling
+ *
+ * For high speed we schedule every 1us.  For low speed that's a bit overkill,
+ * so we make up a unit called a "slice" that's worth 25us.  There are 40
+ * slices in a full frame and we can schedule 36 of those (90%) for periodic
+ * transfers.
+ *
+ * Our low speed schedule can be as short as 1 frame or could be longer.  When
+ * we only schedule 1 frame it means that we'll need to reserve a time every
+ * frame even for things that only transfer very rarely, so something that runs
+ * every 2048 frames will get time reserved in every frame.  Our low speed
+ * schedule can be longer and we'll be able to handle more overlap, but that
+ * will come at increased memory cost and increased time to schedule.
+ *
+ * Note: one other advantage of a short low speed schedule is that if we mess
+ * up and miss scheduling we can jump in and use any of the slots that we
+ * happened to reserve.
+ *
+ * With 25 us per slice and 1 frame in the schedule, we only need 36 bits for
+ * the schedule (8 bytes as unsigned longs).  There is one schedule per TT.
+ *
+ * Requirements:
+ * - DWC2_US_PER_SLICE must evenly divide DWC2_LS_PERIODIC_US_PER_FRAME.
+ */
+#define DWC2_US_PER_SLICE	25
+#define DWC2_SLICES_PER_UFRAME	(DWC2_US_PER_UFRAME / DWC2_US_PER_SLICE)
+
+#define DWC2_ROUND_US_TO_SLICE(us) \
+				(DIV_ROUND_UP((us), DWC2_US_PER_SLICE) * \
+				 DWC2_US_PER_SLICE)
+
+#define DWC2_LS_PERIODIC_US_PER_FRAME \
+				900
+#define DWC2_LS_PERIODIC_SLICES_PER_FRAME \
+				(DWC2_LS_PERIODIC_US_PER_FRAME / \
+				 DWC2_US_PER_SLICE)
+
+#define DWC2_LS_SCHEDULE_FRAMES	1
+#define DWC2_LS_SCHEDULE_SLICES	(DWC2_LS_SCHEDULE_FRAMES * \
+				 DWC2_LS_PERIODIC_SLICES_PER_FRAME)
+
 /**
  * struct dwc2_hsotg - Holds the state of the driver, including the non-periodic
  * and periodic schedules
@@ -682,7 +760,9 @@ struct dwc2_hregs_backup {
  *                      This value is in microseconds per (micro)frame. The
  *                      assumption is that all periodic transfers may occur in
  *                      the same (micro)frame.
- * @frame_usecs:        Internal variable used by the microframe scheduler
+ * @hs_periodic_bitmap: Bitmap used by the microframe scheduler any time the
+ *                      host is in high speed mode; low speed schedules are
+ *                      stored elsewhere since we need one per TT.
  * @frame_number:       Frame number read from the core at SOF. The value ranges
  *                      from 0 to HFNUM_MAX_FRNUM.
  * @periodic_qh_count:  Count of periodic QHs, if using several eps. Used for
@@ -803,7 +883,8 @@ struct dwc2_hsotg {
 	struct list_head periodic_sched_queued;
 	struct list_head split_order;
 	u16 periodic_usecs;
-	u16 frame_usecs[8];
+	unsigned long hs_periodic_bitmap[
+		DIV_ROUND_UP(DWC2_HS_SCHEDULE_US, BITS_PER_LONG)];
 	u16 frame_number;
 	u16 periodic_qh_count;
 	bool bus_suspended;
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 8edd0b45f41c..2b5a706e7c32 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -2252,6 +2252,90 @@ void dwc2_host_hub_info(struct dwc2_hsotg *hsotg, void *context, int *hub_addr,
 	*hub_port = urb->dev->ttport;
 }
 
+/**
+ * dwc2_host_get_tt_info() - Get the dwc2_tt associated with context
+ *
+ * This will get the dwc2_tt structure (and ttport) associated with the given
+ * context (which is really just a struct urb pointer).
+ *
+ * The first time this is called for a given TT we allocate memory for our
+ * structure.  When everyone is done and has called dwc2_host_put_tt_info()
+ * then the refcount for the structure will go to 0 and we'll free it.
+ *
+ * @hsotg:     The HCD state structure for the DWC OTG controller.
+ * @qh:        The QH structure.
+ * @context:   The priv pointer from a struct dwc2_hcd_urb.
+ * @mem_flags: Flags for allocating memory.
+ * @ttport:    We'll return this device's port number here.  That's used to
+ *             reference into the bitmap if we're on a multi_tt hub.
+ *
+ * Return: a pointer to a struct dwc2_tt.  Don't forget to call
+ *         dwc2_host_put_tt_info()!  Returns NULL upon memory alloc failure.
+ */
+
+struct dwc2_tt *dwc2_host_get_tt_info(struct dwc2_hsotg *hsotg, void *context,
+				      gfp_t mem_flags, int *ttport)
+{
+	struct urb *urb = context;
+	struct dwc2_tt *dwc_tt = NULL;
+
+	if (urb->dev->tt) {
+		*ttport = urb->dev->ttport;
+
+		dwc_tt = urb->dev->tt->hcpriv;
+		if (dwc_tt == NULL) {
+			size_t bitmap_size;
+
+			/*
+			 * For single_tt we need one schedule.  For multi_tt
+			 * we need one per port.
+			 */
+			bitmap_size = DWC2_ELEMENTS_PER_LS_BITMAP *
+				      sizeof(dwc_tt->periodic_bitmaps[0]);
+			if (urb->dev->tt->multi)
+				bitmap_size *= urb->dev->tt->hub->maxchild;
+
+			dwc_tt = kzalloc(sizeof(*dwc_tt) + bitmap_size,
+					 mem_flags);
+			if (dwc_tt == NULL)
+				return NULL;
+
+			dwc_tt->usb_tt = urb->dev->tt;
+			dwc_tt->usb_tt->hcpriv = dwc_tt;
+		}
+
+		dwc_tt->refcount++;
+	}
+
+	return dwc_tt;
+}
+
+/**
+ * dwc2_host_put_tt_info() - Put the dwc2_tt from dwc2_host_get_tt_info()
+ *
+ * Frees resources allocated by dwc2_host_get_tt_info() if all current holders
+ * of the structure are done.
+ *
+ * It's OK to call this with NULL.
+ *
+ * @hsotg:     The HCD state structure for the DWC OTG controller.
+ * @dwc_tt:    The pointer returned by dwc2_host_get_tt_info.
+ */
+void dwc2_host_put_tt_info(struct dwc2_hsotg *hsotg, struct dwc2_tt *dwc_tt)
+{
+	/* Model kfree and make put of NULL a no-op */
+	if (dwc_tt == NULL)
+		return;
+
+	WARN_ON(dwc_tt->refcount < 1);
+
+	dwc_tt->refcount--;
+	if (!dwc_tt->refcount) {
+		dwc_tt->usb_tt->hcpriv = NULL;
+		kfree(dwc_tt);
+	}
+}
+
 int dwc2_host_get_speed(struct dwc2_hsotg *hsotg, void *context)
 {
 	struct urb *urb = context;
@@ -3197,9 +3281,6 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq)
 		hsotg->hc_ptr_array[i] = channel;
 	}
 
-	if (hsotg->core_params->uframe_sched > 0)
-		dwc2_hcd_init_usecs(hsotg);
-
 	/* Initialize hsotg start work */
 	INIT_DELAYED_WORK(&hsotg->start_work, dwc2_hcd_start_func);
 
diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index fd266ac53a28..140b1511a131 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -212,6 +212,43 @@ enum dwc2_transaction_type {
 	DWC2_TRANSACTION_ALL,
 };
 
+/* The number of elements per LS bitmap (per port on multi_tt) */
+#define DWC2_ELEMENTS_PER_LS_BITMAP	DIV_ROUND_UP(DWC2_LS_SCHEDULE_SLICES, \
+						     BITS_PER_LONG)
+
+/**
+ * struct dwc2_tt - dwc2 data associated with a usb_tt
+ *
+ * @refcount:           Number of Queue Heads (QHs) holding a reference.
+ * @usb_tt:             Pointer back to the official usb_tt.
+ * @periodic_bitmaps:   Bitmap for which parts of the 1ms frame are accounted
+ *                      for already.  Each is DWC2_ELEMENTS_PER_LS_BITMAP
+ *			elements (so sizeof(long) times that in bytes).
+ *
+ * This structure is stored in the hcpriv of the official usb_tt.
+ */
+struct dwc2_tt {
+	int refcount;
+	struct usb_tt *usb_tt;
+	unsigned long periodic_bitmaps[];
+};
+
+/**
+ * struct dwc2_hs_transfer_time - Info about a transfer on the high speed bus.
+ *
+ * @start_schedule_us:     The start time on the main bus schedule.  Note that
+ *                         the main bus schedule is tightly packed and this
+ *			   time should be interpreted as tightly packed (so
+ *			   uFrame 0 starts at 0 us, uFrame 1 starts at 100 us
+ *			   instead of 125 us).
+ * @duration_us:           How long this transfer goes.
+ */
+
+struct dwc2_hs_transfer_time {
+	u32 start_schedule_us;
+	u16 duration_us;
+};
+
 /**
  * struct dwc2_qh - Software queue head structure
  *
@@ -237,18 +274,33 @@ enum dwc2_transaction_type {
  * @td_first:           Index of first activated isochronous transfer descriptor
  * @td_last:            Index of last activated isochronous transfer descriptor
  * @host_us:            Bandwidth in microseconds per transfer as seen by host
+ * @device_us:          Bandwidth in microseconds per transfer as seen by device
  * @host_interval:      Interval between transfers as seen by the host.  If
  *                      the host is high speed and the device is low speed this
  *                      will be 8 times device interval.
- * @next_active_frame:  (Micro)frame before we next need to put something on
+ * @device_interval:    Interval between transfers as seen by the
+ *                      device.
+ * @next_active_frame:  (Micro)frame _before_ we next need to put something on
  *                      the bus.  We'll move the qh to active here.  If the
  *                      host is in high speed mode this will be a uframe.  If
  *                      the host is in low speed mode this will be a full frame.
  * @start_active_frame: If we are partway through a split transfer, this will be
  *			what next_active_frame was when we started.  Otherwise
  *			it should always be the same as next_active_frame.
- * @assigned_uframe:    The uframe (0 -7) assigned by dwc2_find_uframe().
- * @frame_usecs:        Internal variable used by the microframe scheduler
+ * @num_hs_transfers:   Number of transfers in hs_transfers.
+ *                      Normally this is 1 but can be more than one for splits.
+ *                      Always >= 1 unless the host is in low/full speed mode.
+ * @hs_transfers:       Transfers that are scheduled as seen by the high speed
+ *                      bus.  Not used if host is in low or full speed mode (but
+ *                      note that it IS USED if the device is low or full speed
+ *                      as long as the HOST is in high speed mode).
+ * @ls_start_schedule_slice: Start time (in slices) on the low speed bus
+ *                           schedule that's being used by this device.  This
+ *			     will be on the periodic_bitmap in a
+ *                           "struct dwc2_tt".  Not used if this device is high
+ *                           speed.  Note that this is in "schedule slice" which
+ *                           is tightly packed.
+ * @ls_duration_us:     Duration on the low speed bus schedule.
  * @ntd:                Actual number of transfer descriptors in a list
  * @qtd_list:           List of QTDs for this QH
  * @channel:            Host channel currently processing transfers for this QH
@@ -261,8 +313,12 @@ enum dwc2_transaction_type {
  *                      descriptor and indicates original XferSize value for the
  *                      descriptor
  * @unreserve_timer:    Timer for releasing periodic reservation.
+ * @dwc_tt:             Pointer to our tt info (or NULL if no tt).
+ * @ttport:             Port number within our tt.
  * @tt_buffer_dirty     True if clear_tt_buffer_complete is pending
  * @unreserve_pending:  True if we planned to unreserve but haven't yet.
+ * @schedule_low_speed: True if we have a low/full speed component (either the
+ *			host is in low/full speed mode or do_split).
  *
  * A Queue Head (QH) holds the static characteristics of an endpoint and
  * maintains a list of transfers (QTDs) for that endpoint. A QH structure may
@@ -280,11 +336,14 @@ struct dwc2_qh {
 	u8 td_first;
 	u8 td_last;
 	u16 host_us;
+	u16 device_us;
 	u16 host_interval;
+	u16 device_interval;
 	u16 next_active_frame;
 	u16 start_active_frame;
-	u16 assigned_uframe;
-	u16 frame_usecs[8];
+	s16 num_hs_transfers;
+	struct dwc2_hs_transfer_time hs_transfers[DWC2_HS_SCHEDULE_UFRAMES];
+	u32 ls_start_schedule_slice;
 	u16 ntd;
 	struct list_head qtd_list;
 	struct dwc2_host_chan *channel;
@@ -294,8 +353,11 @@ struct dwc2_qh {
 	u32 desc_list_sz;
 	u32 *n_bytes;
 	struct timer_list unreserve_timer;
+	struct dwc2_tt *dwc_tt;
+	int ttport;
 	unsigned tt_buffer_dirty:1;
 	unsigned unreserve_pending:1;
+	unsigned schedule_low_speed:1;
 };
 
 /**
@@ -462,7 +524,6 @@ extern void dwc2_hcd_queue_transactions(struct dwc2_hsotg *hsotg,
 
 /* Schedule Queue Functions */
 /* Implemented in hcd_queue.c */
-extern void dwc2_hcd_init_usecs(struct dwc2_hsotg *hsotg);
 extern struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
 					  struct dwc2_hcd_urb *urb,
 					  gfp_t mem_flags);
@@ -728,6 +789,12 @@ extern void dwc2_host_start(struct dwc2_hsotg *hsotg);
 extern void dwc2_host_disconnect(struct dwc2_hsotg *hsotg);
 extern void dwc2_host_hub_info(struct dwc2_hsotg *hsotg, void *context,
 			       int *hub_addr, int *hub_port);
+extern struct dwc2_tt *dwc2_host_get_tt_info(struct dwc2_hsotg *hsotg,
+					     void *context, gfp_t mem_flags,
+					     int *ttport);
+
+extern void dwc2_host_put_tt_info(struct dwc2_hsotg *hsotg,
+				  struct dwc2_tt *dwc_tt);
 extern int dwc2_host_get_speed(struct dwc2_hsotg *hsotg, void *context);
 extern void dwc2_host_complete(struct dwc2_hsotg *hsotg, struct dwc2_qtd *qtd,
 			       int status);
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 5f909747b5a4..5ea460d886ba 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -136,116 +136,933 @@ static int dwc2_check_periodic_bandwidth(struct dwc2_hsotg *hsotg,
 }
 
 /**
- * Microframe scheduler
- * track the total use in hsotg->frame_usecs
- * keep each qh use in qh->frame_usecs
- * when surrendering the qh then donate the time back
+ * pmap_schedule() - Schedule time in a periodic bitmap (pmap).
+ *
+ * @map:             The bitmap representing the schedule; will be updated
+ *                   upon success.
+ * @bits_per_period: The schedule represents several periods.  This is how many
+ *                   bits are in each period.  It's assumed that the beginning
+ *                   of the schedule will repeat after its end.
+ * @periods_in_map:  The number of periods in the schedule.
+ * @num_bits:        The number of bits we need per period we want to reserve
+ *                   in this function call.
+ * @interval:        How often we need to be scheduled for the reservation this
+ *                   time.  1 means every period.  2 means every other period.
+ *                   ...you get the picture?
+ * @start:           The bit number to start at.  Normally 0.  Must be within
+ *                   the interval or we return failure right away.
+ * @only_one_period: Normally we'll allow picking a start anywhere within the
+ *                   first interval, since we can still make all repetition
+ *                   requirements by doing that.  However, if you pass true
+ *                   here then we'll return failure if we can't fit within
+ *                   the period that "start" is in.
+ *
+ * The idea here is that we want to schedule time for repeating events that all
+ * want the same resource.  The resource is divided into fixed-sized periods
+ * and the events want to repeat every "interval" periods.  The schedule
+ * granularity is one bit.
+ *
+ * To keep things "simple", we'll represent our schedule with a bitmap that
+ * contains a fixed number of periods.  This gets rid of a lot of complexity
+ * but does mean that we need to handle things specially (and non-ideally) if
+ * the number of the periods in the schedule doesn't match well with the
+ * intervals that we're trying to schedule.
+ *
+ * Here's an explanation of the scheme we'll implement, assuming 8 periods.
+ * - If interval is 1, we need to take up space in each of the 8
+ *   periods we're scheduling.  Easy.
+ * - If interval is 2, we need to take up space in half of the
+ *   periods.  Again, easy.
+ * - If interval is 3, we actually need to fall back to interval 1.
+ *   Why?  Because we might need time in any period.  AKA for the
+ *   first 8 periods, we'll be in slot 0, 3, 6.  Then we'll be
+ *   in slot 1, 4, 7.  Then we'll be in 2, 5.  Then we'll be back to
+ *   0, 3, and 6.  Since we could be in any frame we need to reserve
+ *   for all of them.  Sucks, but that's what you gotta do.  Note that
+ *   if the map instead held 8 * 3 = 24 periods we'd do much better,
+ *   but then we'd need more memory and time to do scheduling.
+ * - If interval is 4, easy.
+ * - If interval is 5, we again need interval 1.  The schedule will be
+ *   0, 5, 2, 7, 4, 1, 6, 3, 0
+ * - If interval is 6, we need interval 2.  0, 6, 4, 2.
+ * - If interval is 7, we need interval 1.
+ * - If interval is 8, we need interval 8.
+ *
+ * If you do the math, you'll see that we need to pretend that interval is
+ * equal to the greatest_common_divisor(interval, periods_in_map).
+ *
+ * Note that at the moment this function tends to front-pack the schedule.
+ * In some cases that's really non-ideal (it's hard to schedule things that
+ * need to repeat every period).  In other cases it's perfect (you can easily
+ * schedule bigger, less often repeating things).
+ *
+ * Here's the algorithm in action (8 periods, 5 bits per period):
+ *  |**   |     |**   |     |**   |     |**   |     |   OK 2 bits, intv 2 at 0
+ *  |*****|  ***|*****|  ***|*****|  ***|*****|  ***|   OK 3 bits, intv 3 at 2
+ *  |*****|* ***|*****|  ***|*****|* ***|*****|  ***|   OK 1 bits, intv 4 at 5
+ *  |**   |*    |**   |     |**   |*    |**   |     | Remv 3 bits, intv 3 at 2
+ *  |***  |*    |***  |     |***  |*    |***  |     |   OK 1 bits, intv 6 at 2
+ *  |**** |*  * |**** |   * |**** |*  * |**** |   * |   OK 1 bits, intv 1 at 3
+ *  |**** |**** |**** | *** |**** |**** |**** | *** |   OK 2 bits, intv 2 at 6
+ *  |*****|*****|*****| ****|*****|*****|*****| ****|   OK 1 bits, intv 1 at 4
+ *  |*****|*****|*****| ****|*****|*****|*****| ****| FAIL 1 bits, intv 1
+ *  |  ***|*****|  ***| ****|  ***|*****|  ***| ****| Remv 2 bits, intv 2 at 0
+ *  |  ***| ****|  ***| ****|  ***| ****|  ***| ****| Remv 1 bits, intv 4 at 5
+ *  |   **| ****|   **| ****|   **| ****|   **| ****| Remv 1 bits, intv 6 at 2
+ *  |    *| ** *|    *| ** *|    *| ** *|    *| ** *| Remv 1 bits, intv 1 at 3
+ *  |    *|    *|    *|    *|    *|    *|    *|    *| Remv 2 bits, intv 2 at 6
+ *  |     |     |     |     |     |     |     |     | Remv 1 bits, intv 1 at 4
+ *  |**   |     |**   |     |**   |     |**   |     |   OK 2 bits, intv 2 at 0
+ *  |***  |     |**   |     |***  |     |**   |     |   OK 1 bits, intv 4 at 2
+ *  |*****|     |** **|     |*****|     |** **|     |   OK 2 bits, intv 2 at 3
+ *  |*****|*    |** **|     |*****|*    |** **|     |   OK 1 bits, intv 4 at 5
+ *  |*****|***  |** **| **  |*****|***  |** **| **  |   OK 2 bits, intv 2 at 6
+ *  |*****|*****|** **| ****|*****|*****|** **| ****|   OK 2 bits, intv 2 at 8
+ *  |*****|*****|*****| ****|*****|*****|*****| ****|   OK 1 bits, intv 4 at 12
+ *
+ * This function is pretty generic and could be easily abstracted if anything
+ * needed similar scheduling.
+ *
+ * Returns either -ENOSPC or a >= 0 start bit which should be passed to the
+ * unschedule routine.  The map bitmap will be updated on a non-error result.
  */
-static const unsigned short max_uframe_usecs[] = {
-	100, 100, 100, 100, 100, 100, 30, 0
-};
+static int pmap_schedule(unsigned long *map, int bits_per_period,
+			 int periods_in_map, int num_bits,
+			 int interval, int start, bool only_one_period)
+{
+	int interval_bits;
+	int to_reserve;
+	int first_end;
+	int i;
+
+	if (num_bits > bits_per_period)
+		return -ENOSPC;
+
+	/* Adjust interval as per description */
+	interval = gcd(interval, periods_in_map);
+
+	interval_bits = bits_per_period * interval;
+	to_reserve = periods_in_map / interval;
+
+	/* If start has gotten us past interval then we can't schedule */
+	if (start >= interval_bits)
+		return -ENOSPC;
+
+	if (only_one_period)
+		/* Must fit within same period as start; end at begin of next */
+		first_end = (start / bits_per_period + 1) * bits_per_period;
+	else
+		/* Can fit anywhere in the first interval */
+		first_end = interval_bits;
+
+	/*
+	 * We'll try to pick the first repetition, then see if that time
+	 * is free for each of the subsequent repetitions.  If it's not
+	 * we'll adjust the start time for the next search of the first
+	 * repetition.
+	 */
+	while (start + num_bits <= first_end) {
+		int end;
+
+		/* Need to stay within this period */
+		end = (start / bits_per_period + 1) * bits_per_period;
+
+		/* Look for num_bits free bits in this period from start */
+		start = bitmap_find_next_zero_area(map, end, start, num_bits,
+						   0);
+
+		/*
+		 * We should get start >= end if we fail.  We might be
+		 * able to check the next period depending on the
+		 * interval, so continue on (start already updated).
+		 */
+		if (start >= end) {
+			start = end;
+			continue;
+		}
+
+		/* At this point we have a valid point for first one */
+		for (i = 1; i < to_reserve; i++) {
+			int ith_start = start + interval_bits * i;
+			int ith_end = end + interval_bits * i;
+			int ret;
+
+			/* Use this as a dumb "check if bits are 0" */
+			ret = bitmap_find_next_zero_area(
+				map, ith_start + num_bits, ith_start, num_bits,
+				0);
+
+			/* We got the right place, continue checking */
+			if (ret == ith_start)
+				continue;
+
+			/* Move start up for next time and exit for loop */
+			ith_start = bitmap_find_next_zero_area(
+				map, ith_end, ith_start, num_bits, 0);
+			if (ith_start >= ith_end)
+				/* Need a whole new period next time */
+				start = end;
+			else
+				start = ith_start - interval_bits * i;
+			break;
+		}
+
+		/* If didn't exit the for loop with a break, we have success */
+		if (i == to_reserve)
+			break;
+	}
+
+	if (start + num_bits > first_end)
+		return -ENOSPC;
 
-void dwc2_hcd_init_usecs(struct dwc2_hsotg *hsotg)
+	for (i = 0; i < to_reserve; i++) {
+		int ith_start = start + interval_bits * i;
+
+		bitmap_set(map, ith_start, num_bits);
+	}
+
+	return start;
+}
+
+/**
+ * pmap_unschedule() - Undo work done by pmap_schedule()
+ *
+ * @map:             See pmap_schedule().
+ * @bits_per_period: See pmap_schedule().
+ * @periods_in_map:  See pmap_schedule().
+ * @num_bits:        The number of bits that was passed to schedule.
+ * @interval:        The interval that was passed to schedule.
+ * @start:           The return value from pmap_schedule().
+ */
+static void pmap_unschedule(unsigned long *map, int bits_per_period,
+			    int periods_in_map, int num_bits,
+			    int interval, int start)
 {
+	int interval_bits;
+	int to_release;
 	int i;
 
-	for (i = 0; i < 8; i++)
-		hsotg->frame_usecs[i] = max_uframe_usecs[i];
+	/* Adjust interval as per description in pmap_schedule() */
+	interval = gcd(interval, periods_in_map);
+
+	interval_bits = bits_per_period * interval;
+	to_release = periods_in_map / interval;
+
+	for (i = 0; i < to_release; i++) {
+		int ith_start = start + interval_bits * i;
+
+		bitmap_clear(map, ith_start, num_bits);
+	}
 }
 
-static int dwc2_find_single_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+/*
+ * cat_printf() - A printf() + strcat() helper
+ *
+ * This is useful for concatenating a bunch of strings where each string is
+ * constructed using printf.
+ *
+ * @buf:   The destination buffer; will be updated to point after the printed
+ *         data.
+ * @size:  The number of bytes in the buffer (includes space for '\0').
+ * @fmt:   The format for printf.
+ * @...:   The args for printf.
+ */
+static void cat_printf(char **buf, size_t *size, const char *fmt, ...)
 {
-	unsigned short utime = qh->host_us;
+	va_list args;
 	int i;
 
-	for (i = 0; i < 8; i++) {
-		/* At the start hsotg->frame_usecs[i] = max_uframe_usecs[i] */
-		if (utime <= hsotg->frame_usecs[i]) {
-			hsotg->frame_usecs[i] -= utime;
-			qh->frame_usecs[i] += utime;
-			return i;
-		}
+	if (*size == 0)
+		return;
+
+	va_start(args, fmt);
+	i = vsnprintf(*buf, *size, fmt, args);
+	va_end(args);
+
+	if (i >= *size) {
+		(*buf)[*size - 1] = '\0';
+		*buf += *size;
+		*size = 0;
+	} else {
+		*buf += i;
+		*size -= i;
 	}
-	return -ENOSPC;
 }
 
 /*
- * use this for FS apps that can span multiple uframes
+ * pmap_print() - Print the given periodic map
+ *
+ * Will attempt to print out the periodic schedule.
+ *
+ * @map:             See pmap_schedule().
+ * @bits_per_period: See pmap_schedule().
+ * @periods_in_map:  See pmap_schedule().
+ * @period_name:     The name of 1 period, like "uFrame".
+ * @units:           The name of the units, like "us".
+ * @print_fn:        The function to call for printing.
+ * @print_data:      Opaque data to pass to the print function.
  */
-static int dwc2_find_multi_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+static void pmap_print(unsigned long *map, int bits_per_period,
+		       int periods_in_map, const char *period_name,
+		       const char *units,
+		       void (*print_fn)(const char *str, void *data),
+		       void *print_data)
 {
-	unsigned short utime = qh->host_us;
-	unsigned short xtime;
-	int t_left;
+	int period;
+
+	for (period = 0; period < periods_in_map; period++) {
+		char tmp[64];
+		char *buf = tmp;
+		size_t buf_size = sizeof(tmp);
+		int period_start = period * bits_per_period;
+		int period_end = period_start + bits_per_period;
+		int start = 0;
+		int count = 0;
+		bool printed = false;
+		int i;
+
+		for (i = period_start; i < period_end + 1; i++) {
+			/* Handle case when ith bit is set */
+			if (i < period_end &&
+			    bitmap_find_next_zero_area(map, i + 1,
+						       i, 1, 0) != i) {
+				if (count == 0)
+					start = i - period_start;
+				count++;
+				continue;
+			}
+
+			/* ith bit isn't set; don't care if count == 0 */
+			if (count == 0)
+				continue;
+
+			if (!printed)
+				cat_printf(&buf, &buf_size, "%s %d: ",
+					   period_name, period);
+			else
+				cat_printf(&buf, &buf_size, ", ");
+			printed = true;
+
+			cat_printf(&buf, &buf_size, "%d %s -%3d %s", start,
+				   units, start + count - 1, units);
+			count = 0;
+		}
+
+		if (printed)
+			print_fn(tmp, print_data);
+	}
+}
+
+/**
+ * dwc2_get_ls_map() - Get the map used for the given qh
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller.
+ * @qh:    QH for the periodic transfer.
+ *
+ * We'll always get the periodic map out of our TT.  Note that even if we're
+ * running the host straight in low speed / full speed mode it appears as if
+ * a TT is allocated for us, so we'll use it.  If that ever changes we can
+ * add logic here to get a map out of "hsotg" if !qh->do_split.
+ *
+ * Returns: the map or NULL if a map couldn't be found.
+ */
+static unsigned long *dwc2_get_ls_map(struct dwc2_hsotg *hsotg,
+				      struct dwc2_qh *qh)
+{
+	unsigned long *map;
+
+	/* Don't expect to be missing a TT and be doing low speed scheduling */
+	if (WARN_ON(!qh->dwc_tt))
+		return NULL;
+
+	/* Get the map and adjust if this is a multi_tt hub */
+	map = qh->dwc_tt->periodic_bitmaps;
+	if (qh->dwc_tt->usb_tt->multi)
+		map += DWC2_ELEMENTS_PER_LS_BITMAP * qh->ttport;
+
+	return map;
+}
+
+struct dwc2_qh_print_data {
+	struct dwc2_hsotg *hsotg;
+	struct dwc2_qh *qh;
+};
+
+/**
+ * dwc2_qh_print() - Helper function for dwc2_qh_schedule_print()
+ *
+ * @str:  The string to print
+ * @data: A pointer to a struct dwc2_qh_print_data
+ */
+static void dwc2_qh_print(const char *str, void *data)
+{
+	struct dwc2_qh_print_data *print_data = data;
+
+	dwc2_sch_dbg(print_data->hsotg, "QH=%p ...%s\n", print_data->qh, str);
+}
+
+/**
+ * dwc2_qh_schedule_print() - Print the periodic schedule
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller.
+ * @qh:    QH to print.
+ */
+static void dwc2_qh_schedule_print(struct dwc2_hsotg *hsotg,
+				   struct dwc2_qh *qh)
+{
+	struct dwc2_qh_print_data print_data = { hsotg, qh };
 	int i;
-	int j;
-	int k;
 
-	for (i = 0; i < 8; i++) {
-		if (hsotg->frame_usecs[i] <= 0)
+	/*
+	 * The printing functions are quite slow and inefficient.
+	 * If we don't have tracing turned on, don't run unless the special
+	 * define is turned on.
+	 */
+#ifndef DWC2_PRINT_SCHEDULE
+	return;
+#endif
+
+	if (qh->schedule_low_speed) {
+		unsigned long *map = dwc2_get_ls_map(hsotg, qh);
+
+		dwc2_sch_dbg(hsotg, "QH=%p LS/FS trans: %d=>%d us @ %d us\n",
+			     qh, qh->device_us,
+			     DWC2_ROUND_US_TO_SLICE(qh->device_us),
+			     DWC2_US_PER_SLICE * qh->ls_start_schedule_slice);
+
+		if (map) {
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p Whole low/full speed map %p now:\n",
+				     qh, map);
+			pmap_print(map, DWC2_LS_PERIODIC_SLICES_PER_FRAME,
+				   DWC2_LS_SCHEDULE_FRAMES, "Frame ", "slices",
+				   dwc2_qh_print, &print_data);
+		}
+	}
+
+	for (i = 0; i < qh->num_hs_transfers; i++) {
+		struct dwc2_hs_transfer_time *trans_time = qh->hs_transfers + i;
+		int uframe = trans_time->start_schedule_us /
+			     DWC2_HS_PERIODIC_US_PER_UFRAME;
+		int rel_us = trans_time->start_schedule_us %
+			     DWC2_HS_PERIODIC_US_PER_UFRAME;
+
+		dwc2_sch_dbg(hsotg,
+			     "QH=%p HS trans #%d: %d us @ uFrame %d + %d us\n",
+			     qh, i, trans_time->duration_us, uframe, rel_us);
+	}
+	if (qh->num_hs_transfers) {
+		dwc2_sch_dbg(hsotg, "QH=%p Whole high speed map now:\n", qh);
+		pmap_print(hsotg->hs_periodic_bitmap,
+			   DWC2_HS_PERIODIC_US_PER_UFRAME,
+			   DWC2_HS_SCHEDULE_UFRAMES, "uFrame", "us",
+			   dwc2_qh_print, &print_data);
+	}
+
+}
+
+/**
+ * dwc2_ls_pmap_schedule() - Schedule a low speed QH
+ *
+ * @hsotg:        The HCD state structure for the DWC OTG controller.
+ * @qh:           QH for the periodic transfer.
+ * @search_slice: We'll start trying to schedule at the passed slice.
+ *                Remember that slices are the units of the low speed
+ *                schedule (think 25us or so).
+ *
+ * Wraps pmap_schedule() with the right parameters for low speed scheduling.
+ *
+ * Normally we schedule low speed devices on the map associated with the TT.
+ *
+ * Returns: 0 for success or an error code.
+ */
+static int dwc2_ls_pmap_schedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
+				 int search_slice)
+{
+	int slices = DIV_ROUND_UP(qh->device_us, DWC2_US_PER_SLICE);
+	unsigned long *map = dwc2_get_ls_map(hsotg, qh);
+	int slice;
+
+	if (map == NULL)
+		return -EINVAL;
+
+	/*
+	 * Schedule on the proper low speed map with our low speed scheduling
+	 * parameters.  Note that we use the "device_interval" here since
+	 * we want the low speed interval and the only way we'd be in this
+	 * function is if the device is low speed.
+	 *
+	 * If we happen to be doing low speed and high speed scheduling for the
+	 * same transaction (AKA we have a split) we always do low speed first.
+	 * That means we can always pass "false" for only_one_period (that
+	 * parameter is only useful when we're trying to get one schedule to
+	 * match what we already planned in the other schedule).
+	 */
+	slice = pmap_schedule(map, DWC2_LS_PERIODIC_SLICES_PER_FRAME,
+			      DWC2_LS_SCHEDULE_FRAMES, slices,
+			      qh->device_interval, search_slice, false);
+
+	if (slice < 0)
+		return slice;
+
+	qh->ls_start_schedule_slice = slice;
+	return 0;
+}
+
+/**
+ * dwc2_ls_pmap_unschedule() - Undo work done by dwc2_ls_pmap_schedule()
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static void dwc2_ls_pmap_unschedule(struct dwc2_hsotg *hsotg,
+				    struct dwc2_qh *qh)
+{
+	int slices = DIV_ROUND_UP(qh->device_us, DWC2_US_PER_SLICE);
+	unsigned long *map = dwc2_get_ls_map(hsotg, qh);
+
+	/* Schedule should have failed, so no worries about no error code */
+	if (map == NULL)
+		return;
+
+	pmap_unschedule(map, DWC2_LS_PERIODIC_SLICES_PER_FRAME,
+			DWC2_LS_SCHEDULE_FRAMES, slices, qh->device_interval,
+			qh->ls_start_schedule_slice);
+}
+
+/**
+ * dwc2_hs_pmap_schedule - Schedule in the main high speed schedule
+ *
+ * This will schedule something on the main dwc2 schedule.
+ *
+ * We'll start looking in qh->hs_transfers[index].start_schedule_us.  We'll
+ * update this with the result upon success.  We also use the duration from
+ * the same structure.
+ *
+ * @hsotg:           The HCD state structure for the DWC OTG controller.
+ * @qh:              QH for the periodic transfer.
+ * @only_one_period: If true we will limit ourselves to just looking at
+ *                   one period (aka one 100us chunk).  This is used if we have
+ *                   already scheduled something on the low speed schedule and
+ *                   need to find something that matches on the high speed one.
+ * @index:           The index into qh->hs_transfers that we're working with.
+ *
+ * Returns: 0 for success or an error code.  Upon success the
+ *          dwc2_hs_transfer_time specified by "index" will be updated.
+ */
+static int dwc2_hs_pmap_schedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
+				 bool only_one_period, int index)
+{
+	struct dwc2_hs_transfer_time *trans_time = qh->hs_transfers + index;
+	int us;
+
+	us = pmap_schedule(hsotg->hs_periodic_bitmap,
+			   DWC2_HS_PERIODIC_US_PER_UFRAME,
+			   DWC2_HS_SCHEDULE_UFRAMES, trans_time->duration_us,
+			   qh->host_interval, trans_time->start_schedule_us,
+			   only_one_period);
+
+	if (us < 0)
+		return us;
+
+	trans_time->start_schedule_us = us;
+	return 0;
+}
+
+/**
+ * dwc2_hs_pmap_unschedule() - Undo work done by dwc2_hs_pmap_schedule()
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static void dwc2_hs_pmap_unschedule(struct dwc2_hsotg *hsotg,
+				    struct dwc2_qh *qh, int index)
+{
+	struct dwc2_hs_transfer_time *trans_time = qh->hs_transfers + index;
+
+	pmap_unschedule(hsotg->hs_periodic_bitmap,
+			DWC2_HS_PERIODIC_US_PER_UFRAME,
+			DWC2_HS_SCHEDULE_UFRAMES, trans_time->duration_us,
+			qh->host_interval, trans_time->start_schedule_us);
+}
+
+/**
+ * dwc2_uframe_schedule_split - Schedule a QH for a periodic split xfer.
+ *
+ * This is the most complicated thing in USB.  We have to find matching time
+ * in both the global high speed schedule for the port and the low speed
+ * schedule for the TT associated with the given device.
+ *
+ * Being here means that the host must be running in high speed mode and the
+ * device is in low or full speed mode (and behind a hub).
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule_split(struct dwc2_hsotg *hsotg,
+				      struct dwc2_qh *qh)
+{
+	int bytecount = dwc2_hb_mult(qh->maxp) * dwc2_max_packet(qh->maxp);
+	int ls_search_slice;
+	int err = 0;
+	int host_interval_in_sched;
+
+	/*
+	 * The interval (how often to repeat) in the actual host schedule.
+	 * See pmap_schedule() for gcd() explanation.
+	 */
+	host_interval_in_sched = gcd(qh->host_interval,
+				     DWC2_HS_SCHEDULE_UFRAMES);
+
+	/*
+	 * We always try to find space in the low speed schedule first, then
+	 * try to find high speed time that matches.  If we don't, we'll bump
+	 * up the place we start searching in the low speed schedule and try
+	 * again.  To start we'll look right at the beginning of the low speed
+	 * schedule.
+	 *
+	 * Note that this will tend to front-load the high speed schedule.
+	 * We may eventually want to try to avoid this by either considering
+	 * both schedules together or doing some sort of round robin.
+	 */
+	ls_search_slice = 0;
+
+	while (ls_search_slice < DWC2_LS_SCHEDULE_SLICES) {
+		int start_s_uframe;
+		int ssplit_s_uframe;
+		int second_s_uframe;
+		int rel_uframe;
+		int first_count;
+		int middle_count;
+		int end_count;
+		int first_data_bytes;
+		int other_data_bytes;
+		int i;
+
+		if (qh->schedule_low_speed) {
+			err = dwc2_ls_pmap_schedule(hsotg, qh, ls_search_slice);
+
+			/*
+			 * If we got an error here there's no other magic we
+			 * can do, so bail.  All the looping above is only
+			 * helpful to redo things if we got a low speed slot
+			 * and then couldn't find a matching high speed slot.
+			 */
+			if (err)
+				return err;
+		} else {
+			/* Must be missing the tt structure?  Why? */
+			WARN_ON_ONCE(1);
+		}
+
+		/*
+		 * This will give us a number 0 - 7 if
+		 * DWC2_LS_SCHEDULE_FRAMES == 1, or 0 - 15 if == 2, or ...
+		 */
+		start_s_uframe = qh->ls_start_schedule_slice /
+				 DWC2_SLICES_PER_UFRAME;
+
+		/* Get a number that's always 0 - 7 */
+		rel_uframe = (start_s_uframe % 8);
+
+		/*
+		 * If we were going to start in uframe 7 then we would need to
+		 * issue a start split in uframe 6, which spec says is not OK.
+		 * Move on to the next full frame (assuming there is one).
+		 *
+		 * See 11.18.4 Host Split Transaction Scheduling Requirements
+		 * bullet 1.
+		 */
+		if (rel_uframe == 7) {
+			if (qh->schedule_low_speed)
+				dwc2_ls_pmap_unschedule(hsotg, qh);
+			ls_search_slice =
+				(qh->ls_start_schedule_slice /
+				 DWC2_LS_PERIODIC_SLICES_PER_FRAME + 1) *
+				DWC2_LS_PERIODIC_SLICES_PER_FRAME;
 			continue;
+		}
 
 		/*
-		 * we need n consecutive slots so use j as a start slot
-		 * j plus j+1 must be enough time (for now)
+		 * For ISOC in:
+		 * - start split            (frame -1)
+		 * - complete split w/ data (frame +1)
+		 * - complete split w/ data (frame +2)
+		 * - ...
+		 * - complete split w/ data (frame +num_data_packets)
+		 * - complete split w/ data (frame +num_data_packets+1)
+		 * - complete split w/ data (frame +num_data_packets+2, max 8)
+		 *   ...though if frame was "0" then max is 7...
+		 *
+		 * For ISOC out we might need to do:
+		 * - start split w/ data    (frame -1)
+		 * - start split w/ data    (frame +0)
+		 * - ...
+		 * - start split w/ data    (frame +num_data_packets-2)
+		 *
+		 * For INTERRUPT in we might need to do:
+		 * - start split            (frame -1)
+		 * - complete split w/ data (frame +1)
+		 * - complete split w/ data (frame +2)
+		 * - complete split w/ data (frame +3, max 8)
+		 *
+		 * For INTERRUPT out we might need to do:
+		 * - start split w/ data    (frame -1)
+		 * - complete split         (frame +1)
+		 * - complete split         (frame +2)
+		 * - complete split         (frame +3, max 8)
+		 *
+		 * Start adjusting!
 		 */
-		xtime = hsotg->frame_usecs[i];
-		for (j = i + 1; j < 8; j++) {
-			/*
-			 * if we add this frame remaining time to xtime we may
-			 * be OK, if not we need to test j for a complete frame
-			 */
-			if (xtime + hsotg->frame_usecs[j] < utime) {
-				if (hsotg->frame_usecs[j] <
-							max_uframe_usecs[j])
-					continue;
+		ssplit_s_uframe = (start_s_uframe +
+				   host_interval_in_sched - 1) %
+				  host_interval_in_sched;
+		if (qh->ep_type == USB_ENDPOINT_XFER_ISOC && !qh->ep_is_in)
+			second_s_uframe = start_s_uframe;
+		else
+			second_s_uframe = start_s_uframe + 1;
+
+		/* First data transfer might not be all 188 bytes. */
+		first_data_bytes = 188 -
+			DIV_ROUND_UP(188 * (qh->ls_start_schedule_slice %
+					    DWC2_SLICES_PER_UFRAME),
+				     DWC2_SLICES_PER_UFRAME);
+		if (first_data_bytes > bytecount)
+			first_data_bytes = bytecount;
+		other_data_bytes = bytecount - first_data_bytes;
+
+		/*
+		 * For now, skip OUT xfers where first xfer is partial
+		 *
+		 * Main dwc2 code assumes:
+		 * - INT transfers never get split in two.
+		 * - ISOC transfers can always transfer 188 bytes the first
+		 *   time.
+		 *
+		 * Until that code is fixed, try again if the first transfer
+		 * couldn't transfer everything.
+		 *
+		 * This code can be removed if/when the rest of dwc2 handles
+		 * the above cases.  Until it's fixed we just won't be able
+		 * to schedule quite as tightly.
+		 */
+		if (!qh->ep_is_in &&
+		    (first_data_bytes != min_t(int, 188, bytecount))) {
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p avoiding broken 1st xfer (%d, %d)\n",
+				     qh, first_data_bytes, bytecount);
+			if (qh->schedule_low_speed)
+				dwc2_ls_pmap_unschedule(hsotg, qh);
+			ls_search_slice = (start_s_uframe + 1) *
+				DWC2_SLICES_PER_UFRAME;
+			continue;
+		}
+
+		/* Start by assuming transfers for the bytes */
+		qh->num_hs_transfers = 1 + DIV_ROUND_UP(other_data_bytes, 188);
+
+		/*
+		 * Everything except ISOC OUT has extra transfers.  Rules are
+		 * complicated.  See 11.18.4 Host Split Transaction Scheduling
+		 * Requirements bullet 3.
+		 */
+		if (qh->ep_type == USB_ENDPOINT_XFER_INT) {
+			if (rel_uframe == 6)
+				qh->num_hs_transfers += 2;
+			else
+				qh->num_hs_transfers += 3;
+
+			if (qh->ep_is_in) {
+				/*
+				 * First is start split, middle/end is data.
+				 * Allocate full data bytes for all data.
+				 */
+				first_count = 4;
+				middle_count = bytecount;
+				end_count = bytecount;
+			} else {
+				/*
+				 * First is data, middle/end is complete.
+				 * First transfer and second can have data.
+				 * Rest should just have complete split.
+				 */
+				first_count = first_data_bytes;
+				middle_count = max_t(int, 4, other_data_bytes);
+				end_count = 4;
 			}
-			if (xtime >= utime) {
-				t_left = utime;
-				for (k = i; k < 8; k++) {
-					t_left -= hsotg->frame_usecs[k];
-					if (t_left <= 0) {
-						qh->frame_usecs[k] +=
-							hsotg->frame_usecs[k]
-								+ t_left;
-						hsotg->frame_usecs[k] = -t_left;
-						return i;
-					} else {
-						qh->frame_usecs[k] +=
-							hsotg->frame_usecs[k];
-						hsotg->frame_usecs[k] = 0;
-					}
-				}
+		} else {
+			if (qh->ep_is_in) {
+				int last;
+
+				/* Account for the start split */
+				qh->num_hs_transfers++;
+
+				/* Calculate "L" value from spec */
+				last = rel_uframe + qh->num_hs_transfers + 1;
+
+				/* Start with basic case */
+				if (last <= 6)
+					qh->num_hs_transfers += 2;
+				else
+					qh->num_hs_transfers += 1;
+
+				/* Adjust downwards */
+				if (last >= 6 && rel_uframe == 0)
+					qh->num_hs_transfers--;
+
+				/* 1st = start; rest can contain data */
+				first_count = 4;
+				middle_count = min_t(int, 188, bytecount);
+				end_count = middle_count;
+			} else {
+				/* All contain data, last might be smaller */
+				first_count = first_data_bytes;
+				middle_count = min_t(int, 188,
+						     other_data_bytes);
+				end_count = other_data_bytes % 188;
 			}
-			/* add the frame time to x time */
-			xtime += hsotg->frame_usecs[j];
-			/* we must have a fully available next frame or break */
-			if (xtime < utime &&
-			   hsotg->frame_usecs[j] == max_uframe_usecs[j])
-				continue;
 		}
+
+		/* Assign durations per uFrame */
+		qh->hs_transfers[0].duration_us = HS_USECS_ISO(first_count);
+		for (i = 1; i < qh->num_hs_transfers - 1; i++)
+			qh->hs_transfers[i].duration_us =
+				HS_USECS_ISO(middle_count);
+		if (qh->num_hs_transfers > 1)
+			qh->hs_transfers[qh->num_hs_transfers - 1].duration_us =
+				HS_USECS_ISO(end_count);
+
+		/*
+		 * Assign start us.  The call below to dwc2_hs_pmap_schedule()
+		 * will start with these numbers but may adjust within the same
+		 * microframe.
+		 */
+		qh->hs_transfers[0].start_schedule_us =
+			ssplit_s_uframe * DWC2_HS_PERIODIC_US_PER_UFRAME;
+		for (i = 1; i < qh->num_hs_transfers; i++)
+			qh->hs_transfers[i].start_schedule_us =
+				((second_s_uframe + i - 1) %
+				 DWC2_HS_SCHEDULE_UFRAMES) *
+				DWC2_HS_PERIODIC_US_PER_UFRAME;
+
+		/* Try to schedule with filled in hs_transfers above */
+		for (i = 0; i < qh->num_hs_transfers; i++) {
+			err = dwc2_hs_pmap_schedule(hsotg, qh, true, i);
+			if (err)
+				break;
+		}
+
+		/* If we scheduled all w/out breaking out then we're all good */
+		if (i == qh->num_hs_transfers)
+			break;
+
+		for (i--; i >= 0; i--)
+			dwc2_hs_pmap_unschedule(hsotg, qh, i);
+
+		if (qh->schedule_low_speed)
+			dwc2_ls_pmap_unschedule(hsotg, qh);
+
+		/* Try again starting in the next microframe */
+		ls_search_slice = (start_s_uframe + 1) * DWC2_SLICES_PER_UFRAME;
 	}
-	return -ENOSPC;
+
+	if (ls_search_slice >= DWC2_LS_SCHEDULE_SLICES)
+		return -ENOSPC;
+
+	return 0;
 }
 
-static int dwc2_find_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+/**
+ * dwc2_uframe_schedule_hs - Schedule a QH for a periodic high speed xfer.
+ *
+ * Basically this just wraps dwc2_hs_pmap_schedule() to provide a clean
+ * interface.
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule_hs(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	/* In non-split host and device time are the same */
+	WARN_ON(qh->host_us != qh->device_us);
+	WARN_ON(qh->host_interval != qh->device_interval);
+	WARN_ON(qh->num_hs_transfers != 1);
+
+	/* We'll have one transfer; init start to 0 before calling scheduler */
+	qh->hs_transfers[0].start_schedule_us = 0;
+	qh->hs_transfers[0].duration_us = qh->host_us;
+
+	return dwc2_hs_pmap_schedule(hsotg, qh, false, 0);
+}
+
+/**
+ * dwc2_uframe_schedule_ls - Schedule a QH for a periodic low/full speed xfer.
+ *
+ * Basically this just wraps dwc2_ls_pmap_schedule() to provide a clean
+ * interface.
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule_ls(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	/* In non-split host and device time are the same */
+	WARN_ON(qh->host_us != qh->device_us);
+	WARN_ON(qh->host_interval != qh->device_interval);
+	WARN_ON(!qh->schedule_low_speed);
+
+	/* Run on the main low speed schedule (no split = no hub = no TT) */
+	return dwc2_ls_pmap_schedule(hsotg, qh, 0);
+}
+
+/**
+ * dwc2_uframe_schedule - Schedule a QH for a periodic xfer.
+ *
+ * Calls one of the 3 sub-functions depending on what type of transfer this QH
+ * is for.  Also adds some printing.
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
 	int ret;
 
-	if (qh->dev_speed == USB_SPEED_HIGH) {
-		/* if this is a hs transaction we need a full frame */
-		ret = dwc2_find_single_uframe(hsotg, qh);
-	} else {
-		/*
-		 * if this is a fs transaction we may need a sequence
-		 * of frames
-		 */
-		ret = dwc2_find_multi_uframe(hsotg, qh);
-	}
+	if (qh->dev_speed == USB_SPEED_HIGH)
+		ret = dwc2_uframe_schedule_hs(hsotg, qh);
+	else if (!qh->do_split)
+		ret = dwc2_uframe_schedule_ls(hsotg, qh);
+	else
+		ret = dwc2_uframe_schedule_split(hsotg, qh);
+
+	if (ret)
+		dwc2_sch_dbg(hsotg, "QH=%p Failed to schedule %d\n", qh, ret);
+	else
+		dwc2_qh_schedule_print(hsotg, qh);
+
 	return ret;
 }
 
 /**
+ * dwc2_uframe_unschedule - Undoes dwc2_uframe_schedule().
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static void dwc2_uframe_unschedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	int i;
+
+	for (i = 0; i < qh->num_hs_transfers; i++)
+		dwc2_hs_pmap_unschedule(hsotg, qh, i);
+
+	if (qh->schedule_low_speed)
+		dwc2_ls_pmap_unschedule(hsotg, qh);
+
+	dwc2_sch_dbg(hsotg, "QH=%p Unscheduled\n", qh);
+}
+
+/**
  * dwc2_pick_first_frame() - Choose 1st frame for qh that's already scheduled
  *
  * Takes a qh that has already been scheduled (which means we know we have the
@@ -265,6 +1082,7 @@ static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	u16 frame_number;
 	u16 earliest_frame;
 	u16 next_active_frame;
+	u16 relative_frame;
 	u16 interval;
 
 	/*
@@ -292,8 +1110,36 @@ static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		goto exit;
 	}
 
-	/* Adjust interval as per high speed schedule which has 8 uFrame */
-	interval = gcd(qh->host_interval, 8);
+	if (qh->dev_speed == USB_SPEED_HIGH || qh->do_split) {
+		/*
+		 * We're either at high speed or we're doing a split (which
+		 * means we're talking high speed to a hub).  In any case
+		 * the first frame should be based on when the first scheduled
+		 * event is.
+		 */
+		WARN_ON(qh->num_hs_transfers < 1);
+
+		relative_frame = qh->hs_transfers[0].start_schedule_us /
+				 DWC2_HS_PERIODIC_US_PER_UFRAME;
+
+		/* Adjust interval as per high speed schedule */
+		interval = gcd(qh->host_interval, DWC2_HS_SCHEDULE_UFRAMES);
+
+	} else {
+		/*
+		 * Low or full speed directly on dwc2.  Just about the same
+		 * as high speed but on a different schedule and with slightly
+		 * different adjustments.  Note that this works because when
+		 * the host and device are both low speed then frames in the
+		 * controller tick at low speed.
+		 */
+		relative_frame = qh->ls_start_schedule_slice /
+				 DWC2_LS_PERIODIC_SLICES_PER_FRAME;
+		interval = gcd(qh->host_interval, DWC2_LS_SCHEDULE_FRAMES);
+	}
+
+	/* Scheduler messed up if frame is past interval */
+	WARN_ON(relative_frame >= interval);
 
 	/*
 	 * We know interval must divide (HFNUM_MAX_FRNUM + 1) now that we've
@@ -310,7 +1156,7 @@ static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	 * scheduled for.
 	 */
 	next_active_frame = dwc2_frame_num_inc(next_active_frame,
-					       qh->assigned_uframe);
+					       relative_frame);
 
 	/*
 	 * We actually need 1 frame before since the next_active_frame is
@@ -351,9 +1197,7 @@ static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	int status;
 
 	if (hsotg->core_params->uframe_sched > 0) {
-		status = dwc2_find_uframe(hsotg, qh);
-		if (status >= 0)
-			qh->assigned_uframe = status;
+		status = dwc2_uframe_schedule(hsotg, qh);
 	} else {
 		status = dwc2_periodic_channel_available(hsotg);
 		if (status) {
@@ -410,12 +1254,7 @@ static void dwc2_do_unreserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	hsotg->periodic_usecs -= qh->host_us;
 
 	if (hsotg->core_params->uframe_sched > 0) {
-		int i;
-
-		for (i = 0; i < 8; i++) {
-			hsotg->frame_usecs[i] += qh->frame_usecs[i];
-			qh->frame_usecs[i] = 0;
-		}
+		dwc2_uframe_unschedule(hsotg, qh);
 	} else {
 		/* Release periodic channel reservation */
 		hsotg->periodic_channels--;
@@ -606,88 +1445,81 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg,
  * @qh:    The QH to init
  * @urb:   Holds the information about the device/endpoint needed to initialize
  *         the QH
+ * @mem_flags: Flags for allocating memory.
  */
 static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
-			 struct dwc2_hcd_urb *urb)
+			 struct dwc2_hcd_urb *urb, gfp_t mem_flags)
 {
-	int dev_speed, hub_addr, hub_port;
+	int dev_speed = dwc2_host_get_speed(hsotg, urb->priv);
+	u8 ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
+	bool ep_is_in = !!dwc2_hcd_is_pipe_in(&urb->pipe_info);
+	bool ep_is_isoc = (ep_type == USB_ENDPOINT_XFER_ISOC);
+	bool ep_is_int = (ep_type == USB_ENDPOINT_XFER_INT);
+	u32 hprt = dwc2_readl(hsotg->regs + HPRT0);
+	u32 prtspd = (hprt & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
+	bool do_split = (prtspd == HPRT0_SPD_HIGH_SPEED &&
+			 dev_speed != USB_SPEED_HIGH);
+	int maxp = dwc2_hcd_get_mps(&urb->pipe_info);
+	int bytecount = dwc2_hb_mult(maxp) * dwc2_max_packet(maxp);
 	char *speed, *type;
 
-	dev_vdbg(hsotg->dev, "%s()\n", __func__);
-
 	/* Initialize QH */
 	qh->hsotg = hsotg;
 	setup_timer(&qh->unreserve_timer, dwc2_unreserve_timer_fn,
 		    (unsigned long)qh);
-	qh->ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
-	qh->ep_is_in = dwc2_hcd_is_pipe_in(&urb->pipe_info) ? 1 : 0;
+	qh->ep_type = ep_type;
+	qh->ep_is_in = ep_is_in;
 
 	qh->data_toggle = DWC2_HC_PID_DATA0;
-	qh->maxp = dwc2_hcd_get_mps(&urb->pipe_info);
+	qh->maxp = maxp;
 	INIT_LIST_HEAD(&qh->qtd_list);
 	INIT_LIST_HEAD(&qh->qh_list_entry);
 
-	/* FS/LS Endpoint on HS Hub, NOT virtual root hub */
-	dev_speed = dwc2_host_get_speed(hsotg, urb->priv);
+	qh->do_split = do_split;
+	qh->dev_speed = dev_speed;
+
+	if (ep_is_int || ep_is_isoc) {
+		/* Compute scheduling parameters once and save them */
+		int host_speed = do_split ? USB_SPEED_HIGH : dev_speed;
+		struct dwc2_tt *dwc_tt = dwc2_host_get_tt_info(hsotg, urb->priv,
+							       mem_flags,
+							       &qh->ttport);
+		int device_ns;
 
-	dwc2_host_hub_info(hsotg, urb->priv, &hub_addr, &hub_port);
+		qh->dwc_tt = dwc_tt;
 
-	if ((dev_speed == USB_SPEED_LOW || dev_speed == USB_SPEED_FULL) &&
-	    hub_addr != 0 && hub_addr != 1) {
-		dev_vdbg(hsotg->dev,
-			 "QH init: EP %d: TT found at hub addr %d, for port %d\n",
-			 dwc2_hcd_get_ep_num(&urb->pipe_info), hub_addr,
-			 hub_port);
-		qh->do_split = 1;
-	}
+		qh->host_us = NS_TO_US(usb_calc_bus_time(host_speed, ep_is_in,
+				       ep_is_isoc, bytecount));
+		device_ns = usb_calc_bus_time(dev_speed, ep_is_in,
+					      ep_is_isoc, bytecount);
 
-	if (qh->ep_type == USB_ENDPOINT_XFER_INT ||
-	    qh->ep_type == USB_ENDPOINT_XFER_ISOC) {
-		/* Compute scheduling parameters once and save them */
-		u32 hprt, prtspd;
-
-		/* Todo: Account for split transfers in the bus time */
-		int bytecount =
-			dwc2_hb_mult(qh->maxp) * dwc2_max_packet(qh->maxp);
-
-		qh->host_us = NS_TO_US(usb_calc_bus_time(qh->do_split ?
-			      USB_SPEED_HIGH : dev_speed, qh->ep_is_in,
-			      qh->ep_type == USB_ENDPOINT_XFER_ISOC,
-			      bytecount));
-
-		qh->host_interval = urb->interval;
-		dwc2_sch_dbg(hsotg, "QH=%p init nxt=%04x, fn=%04x, int=%#x\n",
-			     qh, qh->next_active_frame, hsotg->frame_number,
-			     qh->host_interval);
-#if 0
-		/* Increase interrupt polling rate for debugging */
-		if (qh->ep_type == USB_ENDPOINT_XFER_INT)
-			qh->host_interval = 8;
-#endif
-		hprt = dwc2_readl(hsotg->regs + HPRT0);
-		prtspd = (hprt & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
-		if (prtspd == HPRT0_SPD_HIGH_SPEED &&
-		    (dev_speed == USB_SPEED_LOW ||
-		     dev_speed == USB_SPEED_FULL)) {
-			qh->host_interval *= 8;
-			dwc2_sch_dbg(hsotg,
-				     "QH=%p init*8 nxt=%04x, fn=%04x, int=%#x\n",
-				     qh, qh->next_active_frame,
-				     hsotg->frame_number, qh->host_interval);
+		if (do_split && dwc_tt)
+			device_ns += dwc_tt->usb_tt->think_time;
+		qh->device_us = NS_TO_US(device_ns);
 
-		}
-		dev_dbg(hsotg->dev, "interval=%d\n", qh->host_interval);
-	}
 
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH Initialized\n");
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - qh = %p\n", qh);
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Device Address = %d\n",
-		 dwc2_hcd_get_dev_addr(&urb->pipe_info));
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Endpoint %d, %s\n",
-		 dwc2_hcd_get_ep_num(&urb->pipe_info),
-		 dwc2_hcd_is_pipe_in(&urb->pipe_info) ? "IN" : "OUT");
+		qh->device_interval = urb->interval;
+		qh->host_interval = urb->interval * (do_split ? 8 : 1);
 
-	qh->dev_speed = dev_speed;
+		/*
+		 * Schedule low speed if we're running the host in low or
+		 * full speed OR if we've got a "TT" to deal with to access this
+		 * device.
+		 */
+		qh->schedule_low_speed = prtspd != HPRT0_SPD_HIGH_SPEED ||
+					 dwc_tt;
+
+		if (do_split) {
+			/* We won't know num transfers until we schedule */
+			qh->num_hs_transfers = -1;
+		} else if (dev_speed == USB_SPEED_HIGH) {
+			qh->num_hs_transfers = 1;
+		} else {
+			qh->num_hs_transfers = 0;
+		}
+
+		/* We'll schedule later when we have something to do */
+	}
 
 	switch (dev_speed) {
 	case USB_SPEED_LOW:
@@ -703,7 +1535,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		speed = "?";
 		break;
 	}
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Speed = %s\n", speed);
 
 	switch (qh->ep_type) {
 	case USB_ENDPOINT_XFER_ISOC:
@@ -723,13 +1554,21 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		break;
 	}
 
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Type = %s\n", type);
-
-	if (qh->ep_type == USB_ENDPOINT_XFER_INT) {
-		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - usecs = %d\n",
-			 qh->host_us);
-		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - interval = %d\n",
-			 qh->host_interval);
+	dwc2_sch_dbg(hsotg, "QH=%p Init %s, %s speed, %d bytes:\n", qh, type,
+		     speed, bytecount);
+	dwc2_sch_dbg(hsotg, "QH=%p ...addr=%d, ep=%d, %s\n", qh,
+		     dwc2_hcd_get_dev_addr(&urb->pipe_info),
+		     dwc2_hcd_get_ep_num(&urb->pipe_info),
+		     ep_is_in ? "IN" : "OUT");
+	if (ep_is_int || ep_is_isoc) {
+		dwc2_sch_dbg(hsotg,
+			     "QH=%p ...duration: host=%d us, device=%d us\n",
+			     qh, qh->host_us, qh->device_us);
+		dwc2_sch_dbg(hsotg, "QH=%p ...interval: host=%d, device=%d\n",
+			     qh, qh->host_interval, qh->device_interval);
+		if (qh->schedule_low_speed)
+			dwc2_sch_dbg(hsotg, "QH=%p ...low speed schedule=%p\n",
+				     qh, dwc2_get_ls_map(hsotg, qh));
 	}
 }
 
@@ -757,7 +1596,7 @@ struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
 	if (!qh)
 		return NULL;
 
-	dwc2_qh_init(hsotg, qh, urb);
+	dwc2_qh_init(hsotg, qh, urb, mem_flags);
 
 	if (hsotg->core_params->dma_desc_enable > 0 &&
 	    dwc2_hcd_qh_init_ddma(hsotg, qh, mem_flags) < 0) {
@@ -789,6 +1628,7 @@ void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		dwc2_do_unreserve(hsotg, qh);
 		spin_unlock_irqrestore(&hsotg->lock, flags);
 	}
+	dwc2_host_put_tt_info(hsotg, qh->dwc_tt);
 
 	if (qh->desc_list)
 		dwc2_hcd_qh_free_ddma(hsotg, qh);
@@ -904,6 +1744,8 @@ static int dwc2_next_for_periodic_split(struct dwc2_hsotg *hsotg,
 	u16 incr;
 
 	/*
+	 * See dwc2_uframe_schedule_split() for split scheduling.
+	 *
 	 * Basically: increment 1 normally, but 2 right after the start split
 	 * (except for ISOC out).
 	 */
@@ -1006,9 +1848,17 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
 	if (qh->start_active_frame == qh->next_active_frame ||
 	    dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
 		u16 ideal_start = qh->start_active_frame;
+		int periods_in_map;
 
-		/* Adjust interval as per gcd with plan length. */
-		interval = gcd(interval, 8);
+		/*
+		 * Adjust interval as per gcd with map size.
+		 * See pmap_schedule() for more details here.
+		 */
+		if (qh->do_split || qh->dev_speed == USB_SPEED_HIGH)
+			periods_in_map = DWC2_HS_SCHEDULE_UFRAMES;
+		else
+			periods_in_map = DWC2_LS_SCHEDULE_FRAMES;
+		interval = gcd(interval, periods_in_map);
 
 		do {
 			qh->start_active_frame = dwc2_frame_num_inc(
-- 
2.7.0.rc3.207.g0ac5344


* [PATCH v6 21/22] usb: dwc2: host: Totally redo the microframe scheduler
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi-l0cyMroinI0, kever.yang-TNX95d0MmH7DzftRWevZcw
  Cc: huangtao-TNX95d0MmH7DzftRWevZcw, stefan.wahren-eS4NqCHxEME,
	heiko-4mtYJXux2i+zQB+pC5nmwQ, johnyoun-HKixBCOQz3hWk0Htik3J/w,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r,
	ming.lei-Z7WLFzj8eWMS+FvcfC7Uqw,
	linux-usb-u79uwXL29TY76Z2rM5mHXA, Douglas Anderson,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	yousaf.kaukab-ral2JQCrhuEAvxtiuMwx3w,
	stern-nwvwT67g6+6dFdvTe/nMLpVzexx5G7lz,
	linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	gregory.herrero-ral2JQCrhuEAvxtiuMwx3w,
	william.wu-TNX95d0MmH7DzftRWevZcw, Julius Werner,
	dinguyen-yzvPICuk2ABMcg4IHK0kFoH6Mc4MB0Vx

This totally reimplements the microframe scheduler in dwc2 to attempt to
handle periodic splits properly.  The old code didn't even try, so this
was a significant effort since periodic splits are one of the most
complicated things in USB.

I've attempted to keep the old "don't use the microframe" scheduler
around for now, but I'm not sure it's needed.  It has also only been
lightly tested.

I think it's pretty certain that this scheduler isn't perfect and might
have some bugs, but it seems much better than what was there before.
With this change my stressful USB test (USB webcam + USB audio + some
keyboards) crackles less.

Signed-off-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Tested-by: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>
Tested-by: Stefan Wahren <stefan.wahren-eS4NqCHxEME@public.gmane.org>
---
Changes in v6:
- Removed incorrect limit on number of channels (Heiko Stuebner).
- Fixed order of operations bug in debug print.
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5:
- Moved defines outside of ifdef to avoid gadget-only compile error.

Changes in v4:
- Figured out what the microframe scheduler was supposed to do.
- Microframe rewrite is totally different from v3, hopefully more right.
- Microframe rewrite is later in the series now.

Changes in v3:
- The uframe scheduler patch is folded into optimization series.
- Optimize uframe scheduler "single uframe" case a little.
- uframe scheduler now atop logging patches.
- uframe scheduler now before delayed bandwidth release patches.
- Add defines like EARLY_FRAME_USEC
- Reorder dwc2_deschedule_periodic() in prep for future patches.
- uframe scheduler now shows real usefulness w/ future patches!
- Assuming single_tt is new for v3; not terribly well tested (yet).
- Keep track and use our uframe new for v3.

Changes in v2:
- Totally rewrote uframe scheduler again after writing test code.
- uframe scheduler atop delayed bandwidth release patches.
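
Sketch of the pmap_schedule() idea, for anyone reading along: below is a
simplified, standalone C model of the first-fit reservation described in
the pmap_schedule() comment in this patch.  It is not the code from
hcd_queue.c.  It assumes one byte per schedule bit instead of the kernel
bitmap helpers, leaves out only_one_period, and every sketch_* name is
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: Euclid's algorithm (the kernel has its own gcd()). */
static int sketch_gcd(int a, int b)
{
	while (b) {
		int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* True if num_bits slots starting at start are all free (0). */
static bool sketch_range_free(const unsigned char *map, int start,
			      int num_bits)
{
	int i;

	for (i = 0; i < num_bits; i++)
		if (map[start + i])
			return false;
	return true;
}

/*
 * Simplified first-fit reservation; map holds one byte per schedule bit
 * and must be bits_per_period * periods_in_map bytes long.
 * Returns the chosen start slot, or -1 if nothing fits.
 */
static int sketch_pmap_schedule(unsigned char *map, int bits_per_period,
				int periods_in_map, int num_bits,
				int interval, int start)
{
	int interval_bits, to_reserve, rep, i;

	assert(map != NULL);
	if (periods_in_map < 1 || interval < 1 || start < 0)
		return -1;
	if (num_bits < 1 || num_bits > bits_per_period)
		return -1;

	/* Pretend the interval is gcd(interval, periods_in_map) */
	interval = sketch_gcd(interval, periods_in_map);
	interval_bits = bits_per_period * interval;
	to_reserve = periods_in_map / interval;

	for (; start + num_bits <= interval_bits; start++) {
		/* A reservation may not straddle two periods */
		if (start / bits_per_period !=
		    (start + num_bits - 1) / bits_per_period)
			continue;

		/* The same offset must be free in every repetition */
		for (rep = 0; rep < to_reserve; rep++)
			if (!sketch_range_free(map,
					       start + rep * interval_bits,
					       num_bits))
				break;
		if (rep < to_reserve)
			continue;

		/* Everything is free: mark each repetition as taken */
		for (rep = 0; rep < to_reserve; rep++)
			for (i = 0; i < num_bits; i++)
				map[start + rep * interval_bits + i] = 1;
		return start;
	}

	return -1;
}

/* Undo a reservation made with the same arguments at the given start. */
static void sketch_pmap_unschedule(unsigned char *map, int bits_per_period,
				   int periods_in_map, int num_bits,
				   int interval, int start)
{
	int interval_bits, to_reserve, rep, i;

	interval = sketch_gcd(interval, periods_in_map);
	interval_bits = bits_per_period * interval;
	to_reserve = periods_in_map / interval;

	for (rep = 0; rep < to_reserve; rep++)
		for (i = 0; i < num_bits; i++)
			map[start + rep * interval_bits + i] = 0;
}
```

With 8 periods of 5 bits and a zeroed 40-byte map, asking for (2 bits,
interval 2), (3 bits, interval 3) and (1 bit, interval 4) in that order
should hand back starts 0, 2 and 5, the same as the first three rows of
the diagram in the pmap_schedule() comment.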

 drivers/usb/dwc2/core.h      |   85 ++-
 drivers/usb/dwc2/hcd.c       |   87 +++-
 drivers/usb/dwc2/hcd.h       |   79 ++-
 drivers/usb/dwc2/hcd_queue.c | 1170 ++++++++++++++++++++++++++++++++++++------
 4 files changed, 1250 insertions(+), 171 deletions(-)
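
Sketch of the two time bases used in core.h: the snippet below is a
standalone illustration, not code from the patch.  The #define values are
copied from the core.h hunk; the sketch_* helpers are hypothetical.  In
particular, unpacking a tightly packed time to real bus time ASSUMES the
100 us periodic window sits at the start of each 125 us uFrame, which the
patch does not state.

```c
#include <assert.h>

/* Values copied from the core.h hunk in this patch */
#define DWC2_US_PER_UFRAME		125
#define DWC2_HS_PERIODIC_US_PER_UFRAME	100
#define DWC2_US_PER_SLICE		25
#define DWC2_LS_PERIODIC_US_PER_FRAME	900

/* Hypothetical helper: which uFrame a tightly packed HS time falls in. */
static unsigned int sketch_packed_us_to_uframe(unsigned int packed_us)
{
	return packed_us / DWC2_HS_PERIODIC_US_PER_UFRAME;
}

/*
 * Hypothetical helper: unpack to microseconds since the start of uFrame 0.
 * ASSUMPTION: the periodic window starts at the beginning of each uFrame.
 */
static unsigned int sketch_packed_us_to_bus_us(unsigned int packed_us)
{
	return sketch_packed_us_to_uframe(packed_us) * DWC2_US_PER_UFRAME +
	       packed_us % DWC2_HS_PERIODIC_US_PER_UFRAME;
}

/* Hypothetical helper: slices needed to cover a duration, rounding up. */
static unsigned int sketch_us_to_slices(unsigned int us)
{
	return (us + DWC2_US_PER_SLICE - 1) / DWC2_US_PER_SLICE;
}

/* Slices available for periodic transfers in one low speed frame. */
static unsigned int sketch_ls_slices_per_frame(void)
{
	/* Requirement from core.h: the slice size must divide evenly */
	assert(DWC2_LS_PERIODIC_US_PER_FRAME % DWC2_US_PER_SLICE == 0);
	return DWC2_LS_PERIODIC_US_PER_FRAME / DWC2_US_PER_SLICE;
}
```

Under those assumptions a packed start time of 250 us lands in uFrame 2,
and a 26 us low speed transfer needs 2 slices out of the 36 available per
frame.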

diff --git a/drivers/usb/dwc2/core.h b/drivers/usb/dwc2/core.h
index 52cbea28d0e9..115925909390 100644
--- a/drivers/usb/dwc2/core.h
+++ b/drivers/usb/dwc2/core.h
@@ -592,6 +592,84 @@ struct dwc2_hregs_backup {
 	bool valid;
 };
 
+/*
+ * Constants related to high speed periodic scheduling
+ *
+ * We have a periodic schedule that is DWC2_HS_SCHEDULE_UFRAMES long.  From a
+ * reservation point of view it's assumed that the schedule goes right back to
+ * the beginning after the end of the schedule.
+ *
+ * What does that mean for scheduling things with a long interval?  It means
+ * we'll reserve time for them in every possible microframe that they could
+ * ever be scheduled in.  ...but we'll still only actually schedule them as
+ * often as they were requested.
+ *
+ * We keep our schedule in a "bitmap" structure.  This simplifies having
+ * to keep track of and merge intervals: we just let the bitmap code do most
+ * of the heavy lifting.  In a way scheduling is much like memory allocation.
+ *
+ * We schedule 100us per uframe or 80% of 125us (the maximum amount you're
+ * supposed to schedule for periodic transfers).  That's according to spec.
+ *
+ * Note that though we only schedule 80% of each microframe, the bitmap that we
+ * keep the schedule in is tightly packed (AKA it doesn't have 100us worth of
+ * space for each uFrame).
+ *
+ * Requirements:
+ * - DWC2_HS_SCHEDULE_UFRAMES must evenly divide 0x4000 (HFNUM_MAX_FRNUM + 1)
+ * - DWC2_HS_SCHEDULE_UFRAMES must be 8 times DWC2_LS_SCHEDULE_FRAMES (probably
+ *   could be any multiple of 8 times DWC2_LS_SCHEDULE_FRAMES, but there might
+ *   be bugs).  The 8 comes from the USB spec: number of microframes per frame.
+ */
+#define DWC2_US_PER_UFRAME		125
+#define DWC2_HS_PERIODIC_US_PER_UFRAME	100
+
+#define DWC2_HS_SCHEDULE_UFRAMES	8
+#define DWC2_HS_SCHEDULE_US		(DWC2_HS_SCHEDULE_UFRAMES * \
+					 DWC2_HS_PERIODIC_US_PER_UFRAME)
+
+/*
+ * Constants related to low speed scheduling
+ *
+ * For high speed we schedule every 1us.  For low speed that's a bit overkill,
+ * so we make up a unit called a "slice" that's worth 25us.  There are 40
+ * slices in a full frame and we can schedule 36 of those (90%) for periodic
+ * transfers.
+ *
+ * Our low speed schedule can be as short as 1 frame or could be longer.  When
+ * we only schedule 1 frame it means that we'll need to reserve a time every
+ * frame even for things that only transfer very rarely, so something that runs
+ * every 2048 frames will get time reserved in every frame.  Our low speed
+ * schedule can be longer and we'll be able to handle more overlap, but that
+ * will come at increased memory cost and increased time to schedule.
+ *
+ * Note: one other advantage of a short low speed schedule is that if we mess
+ * up and miss scheduling we can jump in and use any of the slots that we
+ * happened to reserve.
+ *
+ * With 25 us per slice and 1 frame in the schedule, we only need 36 bits
+ * (stored in whole longs) per schedule.  There will be one schedule per TT.
+ *
+ * Requirements:
+ * - DWC2_US_PER_SLICE must evenly divide DWC2_LS_PERIODIC_US_PER_FRAME.
+ */
+#define DWC2_US_PER_SLICE	25
+#define DWC2_SLICES_PER_UFRAME	(DWC2_US_PER_UFRAME / DWC2_US_PER_SLICE)
+
+#define DWC2_ROUND_US_TO_SLICE(us) \
+				(DIV_ROUND_UP((us), DWC2_US_PER_SLICE) * \
+				 DWC2_US_PER_SLICE)
+
+#define DWC2_LS_PERIODIC_US_PER_FRAME \
+				900
+#define DWC2_LS_PERIODIC_SLICES_PER_FRAME \
+				(DWC2_LS_PERIODIC_US_PER_FRAME / \
+				 DWC2_US_PER_SLICE)
+
+#define DWC2_LS_SCHEDULE_FRAMES	1
+#define DWC2_LS_SCHEDULE_SLICES	(DWC2_LS_SCHEDULE_FRAMES * \
+				 DWC2_LS_PERIODIC_SLICES_PER_FRAME)
+
 /**
  * struct dwc2_hsotg - Holds the state of the driver, including the non-periodic
  * and periodic schedules
@@ -682,7 +760,9 @@ struct dwc2_hregs_backup {
  *                      This value is in microseconds per (micro)frame. The
  *                      assumption is that all periodic transfers may occur in
  *                      the same (micro)frame.
- * @frame_usecs:        Internal variable used by the microframe scheduler
+ * @hs_periodic_bitmap: Bitmap used by the microframe scheduler any time the
+ *                      host is in high speed mode; low speed schedules are
+ *                      stored elsewhere since we need one per TT.
  * @frame_number:       Frame number read from the core at SOF. The value ranges
  *                      from 0 to HFNUM_MAX_FRNUM.
  * @periodic_qh_count:  Count of periodic QHs, if using several eps. Used for
@@ -803,7 +883,8 @@ struct dwc2_hsotg {
 	struct list_head periodic_sched_queued;
 	struct list_head split_order;
 	u16 periodic_usecs;
-	u16 frame_usecs[8];
+	unsigned long hs_periodic_bitmap[
+		DIV_ROUND_UP(DWC2_HS_SCHEDULE_US, BITS_PER_LONG)];
 	u16 frame_number;
 	u16 periodic_qh_count;
 	bool bus_suspended;
diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
index 8edd0b45f41c..2b5a706e7c32 100644
--- a/drivers/usb/dwc2/hcd.c
+++ b/drivers/usb/dwc2/hcd.c
@@ -2252,6 +2252,90 @@ void dwc2_host_hub_info(struct dwc2_hsotg *hsotg, void *context, int *hub_addr,
 	*hub_port = urb->dev->ttport;
 }
 
+/**
+ * dwc2_host_get_tt_info() - Get the dwc2_tt associated with context
+ *
+ * This will get the dwc2_tt structure (and ttport) associated with the given
+ * context (which is really just a struct urb pointer).
+ *
+ * The first time this is called for a given TT we allocate memory for our
+ * structure.  When everyone is done and has called dwc2_host_put_tt_info()
+ * then the refcount for the structure will go to 0 and we'll free it.
+ *
+ * @hsotg:     The HCD state structure for the DWC OTG controller.
+ * @qh:        The QH structure.
+ * @context:   The priv pointer from a struct dwc2_hcd_urb.
+ * @mem_flags: Flags for allocating memory.
+ * @ttport:    We'll return this device's port number here.  That's used to
+ *             reference into the bitmap if we're on a multi_tt hub.
+ *
+ * Return: a pointer to a struct dwc2_tt.  Don't forget to call
+ *         dwc2_host_put_tt_info()!  Returns NULL upon memory alloc failure.
+ */
+
+struct dwc2_tt *dwc2_host_get_tt_info(struct dwc2_hsotg *hsotg, void *context,
+				      gfp_t mem_flags, int *ttport)
+{
+	struct urb *urb = context;
+	struct dwc2_tt *dwc_tt = NULL;
+
+	if (urb->dev->tt) {
+		*ttport = urb->dev->ttport;
+
+		dwc_tt = urb->dev->tt->hcpriv;
+		if (dwc_tt == NULL) {
+			size_t bitmap_size;
+
+			/*
+			 * For single_tt we need one schedule.  For multi_tt
+			 * we need one per port.
+			 */
+			bitmap_size = DWC2_ELEMENTS_PER_LS_BITMAP *
+				      sizeof(dwc_tt->periodic_bitmaps[0]);
+			if (urb->dev->tt->multi)
+				bitmap_size *= urb->dev->tt->hub->maxchild;
+
+			dwc_tt = kzalloc(sizeof(*dwc_tt) + bitmap_size,
+					 mem_flags);
+			if (dwc_tt == NULL)
+				return NULL;
+
+			dwc_tt->usb_tt = urb->dev->tt;
+			dwc_tt->usb_tt->hcpriv = dwc_tt;
+		}
+
+		dwc_tt->refcount++;
+	}
+
+	return dwc_tt;
+}
+
+/**
+ * dwc2_host_put_tt_info() - Put the dwc2_tt from dwc2_host_get_tt_info()
+ *
+ * Frees resources allocated by dwc2_host_get_tt_info() if all current holders
+ * of the structure are done.
+ *
+ * It's OK to call this with NULL.
+ *
+ * @hsotg:     The HCD state structure for the DWC OTG controller.
+ * @dwc_tt:    The pointer returned by dwc2_host_get_tt_info.
+ */
+void dwc2_host_put_tt_info(struct dwc2_hsotg *hsotg, struct dwc2_tt *dwc_tt)
+{
+	/* Model kfree and make put of NULL a no-op */
+	if (dwc_tt == NULL)
+		return;
+
+	WARN_ON(dwc_tt->refcount < 1);
+
+	dwc_tt->refcount--;
+	if (!dwc_tt->refcount) {
+		dwc_tt->usb_tt->hcpriv = NULL;
+		kfree(dwc_tt);
+	}
+}
+
 int dwc2_host_get_speed(struct dwc2_hsotg *hsotg, void *context)
 {
 	struct urb *urb = context;
@@ -3197,9 +3281,6 @@ int dwc2_hcd_init(struct dwc2_hsotg *hsotg, int irq)
 		hsotg->hc_ptr_array[i] = channel;
 	}
 
-	if (hsotg->core_params->uframe_sched > 0)
-		dwc2_hcd_init_usecs(hsotg);
-
 	/* Initialize hsotg start work */
 	INIT_DELAYED_WORK(&hsotg->start_work, dwc2_hcd_start_func);
 
diff --git a/drivers/usb/dwc2/hcd.h b/drivers/usb/dwc2/hcd.h
index fd266ac53a28..140b1511a131 100644
--- a/drivers/usb/dwc2/hcd.h
+++ b/drivers/usb/dwc2/hcd.h
@@ -212,6 +212,43 @@ enum dwc2_transaction_type {
 	DWC2_TRANSACTION_ALL,
 };
 
+/* The number of elements per LS bitmap (per port on multi_tt) */
+#define DWC2_ELEMENTS_PER_LS_BITMAP	DIV_ROUND_UP(DWC2_LS_SCHEDULE_SLICES, \
+						     BITS_PER_LONG)
+
+/**
+ * struct dwc2_tt - dwc2 data associated with a usb_tt
+ *
+ * @refcount:           Number of Queue Heads (QHs) holding a reference.
+ * @usb_tt:             Pointer back to the official usb_tt.
+ * @periodic_bitmaps:   Bitmap for which parts of the 1ms frame are accounted
+ *                      for already.  Each is DWC2_ELEMENTS_PER_LS_BITMAP
+ *			elements (so sizeof(long) times that in bytes).
+ *
+ * This structure is stored in the hcpriv of the official usb_tt.
+ */
+struct dwc2_tt {
+	int refcount;
+	struct usb_tt *usb_tt;
+	unsigned long periodic_bitmaps[];
+};
+
+/**
+ * struct dwc2_hs_transfer_time - Info about a transfer on the high speed bus.
+ *
+ * @start_schedule_us:     The start time on the main bus schedule.  Note that
+ *                         the main bus schedule is tightly packed and this
+ *			   time should be interpreted as tightly packed (so
+ *			   uFrame 0 starts at 0 us, uFrame 1 starts at 100 us
+ *			   instead of 125 us).
+ * @duration_us:           How long this transfer goes.
+ */
+
+struct dwc2_hs_transfer_time {
+	u32 start_schedule_us;
+	u16 duration_us;
+};
+
 /**
  * struct dwc2_qh - Software queue head structure
  *
@@ -237,18 +274,33 @@ enum dwc2_transaction_type {
  * @td_first:           Index of first activated isochronous transfer descriptor
  * @td_last:            Index of last activated isochronous transfer descriptor
  * @host_us:            Bandwidth in microseconds per transfer as seen by host
+ * @device_us:          Bandwidth in microseconds per transfer as seen by device
  * @host_interval:      Interval between transfers as seen by the host.  If
  *                      the host is high speed and the device is low speed this
  *                      will be 8 times device interval.
- * @next_active_frame:  (Micro)frame before we next need to put something on
+ * @device_interval:    Interval between transfers as seen by the device
+ *                      (urb->interval before any scaling for splits).
+ * @next_active_frame:  (Micro)frame _before_ we next need to put something on
  *                      the bus.  We'll move the qh to active here.  If the
  *                      host is in high speed mode this will be a uframe.  If
  *                      the host is in low speed mode this will be a full frame.
  * @start_active_frame: If we are partway through a split transfer, this will be
  *			what next_active_frame was when we started.  Otherwise
  *			it should always be the same as next_active_frame.
- * @assigned_uframe:    The uframe (0 -7) assigned by dwc2_find_uframe().
- * @frame_usecs:        Internal variable used by the microframe scheduler
+ * @num_hs_transfers:   Number of transfers in hs_transfers.
+ *                      Normally this is 1 but can be more than one for splits.
+ *                      Always >= 1 unless the host is in low/full speed mode.
+ * @hs_transfers:       Transfers that are scheduled as seen by the high speed
+ *                      bus.  Not used if host is in low or full speed mode (but
+ *                      note that it IS USED if the device is low or full speed
+ *                      as long as the HOST is in high speed mode).
+ * @ls_start_schedule_slice: Start time (in slices) on the low speed bus
+ *                           schedule that's being used by this device.  This
+ *			     will be on the periodic_bitmap in a
+ *                           "struct dwc2_tt".  Not used if this device is high
+ *                           speed.  Note that this is in "schedule slice" which
+ *                           is tightly packed.
+ * @ls_duration_us:     Duration on the low speed bus schedule.
  * @ntd:                Actual number of transfer descriptors in a list
  * @qtd_list:           List of QTDs for this QH
  * @channel:            Host channel currently processing transfers for this QH
@@ -261,8 +313,12 @@ enum dwc2_transaction_type {
  *                      descriptor and indicates original XferSize value for the
  *                      descriptor
  * @unreserve_timer:    Timer for releasing periodic reservation.
+ * @dwc_tt:             Pointer to our tt info (or NULL if no tt).
+ * @ttport:             Port number within our tt.
  * @tt_buffer_dirty     True if clear_tt_buffer_complete is pending
  * @unreserve_pending:  True if we planned to unreserve but haven't yet.
+ * @schedule_low_speed: True if we have a low/full speed component (either the
+ *			host is in low/full speed mode or do_split).
  *
  * A Queue Head (QH) holds the static characteristics of an endpoint and
  * maintains a list of transfers (QTDs) for that endpoint. A QH structure may
@@ -280,11 +336,14 @@ struct dwc2_qh {
 	u8 td_first;
 	u8 td_last;
 	u16 host_us;
+	u16 device_us;
 	u16 host_interval;
+	u16 device_interval;
 	u16 next_active_frame;
 	u16 start_active_frame;
-	u16 assigned_uframe;
-	u16 frame_usecs[8];
+	s16 num_hs_transfers;
+	struct dwc2_hs_transfer_time hs_transfers[DWC2_HS_SCHEDULE_UFRAMES];
+	u32 ls_start_schedule_slice;
 	u16 ntd;
 	struct list_head qtd_list;
 	struct dwc2_host_chan *channel;
@@ -294,8 +353,11 @@ struct dwc2_qh {
 	u32 desc_list_sz;
 	u32 *n_bytes;
 	struct timer_list unreserve_timer;
+	struct dwc2_tt *dwc_tt;
+	int ttport;
 	unsigned tt_buffer_dirty:1;
 	unsigned unreserve_pending:1;
+	unsigned schedule_low_speed:1;
 };
 
 /**
@@ -462,7 +524,6 @@ extern void dwc2_hcd_queue_transactions(struct dwc2_hsotg *hsotg,
 
 /* Schedule Queue Functions */
 /* Implemented in hcd_queue.c */
-extern void dwc2_hcd_init_usecs(struct dwc2_hsotg *hsotg);
 extern struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
 					  struct dwc2_hcd_urb *urb,
 					  gfp_t mem_flags);
@@ -728,6 +789,12 @@ extern void dwc2_host_start(struct dwc2_hsotg *hsotg);
 extern void dwc2_host_disconnect(struct dwc2_hsotg *hsotg);
 extern void dwc2_host_hub_info(struct dwc2_hsotg *hsotg, void *context,
 			       int *hub_addr, int *hub_port);
+extern struct dwc2_tt *dwc2_host_get_tt_info(struct dwc2_hsotg *hsotg,
+					     void *context, gfp_t mem_flags,
+					     int *ttport);
+
+extern void dwc2_host_put_tt_info(struct dwc2_hsotg *hsotg,
+				  struct dwc2_tt *dwc_tt);
 extern int dwc2_host_get_speed(struct dwc2_hsotg *hsotg, void *context);
 extern void dwc2_host_complete(struct dwc2_hsotg *hsotg, struct dwc2_qtd *qtd,
 			       int status);
diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
index 5f909747b5a4..5ea460d886ba 100644
--- a/drivers/usb/dwc2/hcd_queue.c
+++ b/drivers/usb/dwc2/hcd_queue.c
@@ -136,116 +136,933 @@ static int dwc2_check_periodic_bandwidth(struct dwc2_hsotg *hsotg,
 }
 
 /**
- * Microframe scheduler
- * track the total use in hsotg->frame_usecs
- * keep each qh use in qh->frame_usecs
- * when surrendering the qh then donate the time back
+ * pmap_schedule() - Schedule time in a periodic bitmap (pmap).
+ *
+ * @map:             The bitmap representing the schedule; will be updated
+ *                   upon success.
+ * @bits_per_period: The schedule represents several periods.  This is how many
+ *                   bits are in each period.  It's assumed that the beginning
+ *                   of the schedule will repeat after its end.
+ * @periods_in_map:  The number of periods in the schedule.
+ * @num_bits:        The number of bits we need per period we want to reserve
+ *                   in this function call.
+ * @interval:        How often we need to be scheduled for the reservation this
+ *                   time.  1 means every period.  2 means every other period.
+ *                   ...you get the picture?
+ * @start:           The bit number to start at.  Normally 0.  Must be within
+ *                   the interval or we return failure right away.
+ * @only_one_period: Normally we'll allow picking a start anywhere within the
+ *                   first interval, since we can still make all repetition
+ *                   requirements by doing that.  However, if you pass true
+ *                   here then we'll return failure if we can't fit within
+ *                   the period that "start" is in.
+ *
+ * The idea here is that we want to schedule time for repeating events that all
+ * want the same resource.  The resource is divided into fixed-sized periods
+ * and the events want to repeat every "interval" periods.  The schedule
+ * granularity is one bit.
+ *
+ * To keep things "simple", we'll represent our schedule with a bitmap that
+ * contains a fixed number of periods.  This gets rid of a lot of complexity
+ * but does mean that we need to handle things specially (and non-ideally) if
+ * the number of periods in the schedule doesn't match well with the
+ * intervals that we're trying to schedule.
+ *
+ * Here's an explanation of the scheme we'll implement, assuming 8 periods.
+ * - If interval is 1, we need to take up space in each of the 8
+ *   periods we're scheduling.  Easy.
+ * - If interval is 2, we need to take up space in half of the
+ *   periods.  Again, easy.
+ * - If interval is 3, we actually need to fall back to interval 1.
+ *   Why?  Because we might need time in any period.  AKA for the
+ *   first 8 periods, we'll be in slot 0, 3, 6.  Then we'll be
+ *   in slot 1, 4, 7.  Then we'll be in 2, 5.  Then we'll be back to
+ *   0, 3, and 6.  Since we could be in any frame we need to reserve
+ *   for all of them.  Sucks, but that's what you gotta do.  Note that
+ *   if we were instead scheduling 8 * 3 = 24 we'd do much better, but
+ *   then we need more memory and time to do scheduling.
+ * - If interval is 4, easy.
+ * - If interval is 5, we again need interval 1.  The schedule will be
+ *   0, 5, 2, 7, 4, 1, 6, 3, 0
+ * - If interval is 6, we need interval 2.  0, 6, 4, 2.
+ * - If interval is 7, we need interval 1.
+ * - If interval is 8, we need interval 8.
+ *
+ * If you do the math, you'll see that we need to pretend that interval is
+ * equal to the greatest_common_divisor(interval, periods_in_map).
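+ *
+ * As a quick check of that rule, assuming periods_in_map == 8 as above:
+ *   gcd(3, 8) == 1, gcd(6, 8) == 2, gcd(12, 8) == 4, gcd(16, 8) == 8
+ * so an interval of 12 is scheduled as if it were 4, and any multiple of 8
+ * is scheduled as if it were 8.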
+ *
+ * Note that at the moment this function tends to front-pack the schedule.
+ * In some cases that's really non-ideal (it's hard to schedule things that
+ * need to repeat every period).  In other cases it's perfect (you can easily
+ * schedule bigger, less often repeating things).
+ *
+ * Here's the algorithm in action (8 periods, 5 bits per period):
+ *  |**   |     |**   |     |**   |     |**   |     |   OK 2 bits, intv 2 at 0
+ *  |*****|  ***|*****|  ***|*****|  ***|*****|  ***|   OK 3 bits, intv 3 at 2
+ *  |*****|* ***|*****|  ***|*****|* ***|*****|  ***|   OK 1 bits, intv 4 at 5
+ *  |**   |*    |**   |     |**   |*    |**   |     | Remv 3 bits, intv 3 at 2
+ *  |***  |*    |***  |     |***  |*    |***  |     |   OK 1 bits, intv 6 at 2
+ *  |**** |*  * |**** |   * |**** |*  * |**** |   * |   OK 1 bits, intv 1 at 3
+ *  |**** |**** |**** | *** |**** |**** |**** | *** |   OK 2 bits, intv 2 at 6
+ *  |*****|*****|*****| ****|*****|*****|*****| ****|   OK 1 bits, intv 1 at 4
+ *  |*****|*****|*****| ****|*****|*****|*****| ****| FAIL 1 bits, intv 1
+ *  |  ***|*****|  ***| ****|  ***|*****|  ***| ****| Remv 2 bits, intv 2 at 0
+ *  |  ***| ****|  ***| ****|  ***| ****|  ***| ****| Remv 1 bits, intv 4 at 5
+ *  |   **| ****|   **| ****|   **| ****|   **| ****| Remv 1 bits, intv 6 at 2
+ *  |    *| ** *|    *| ** *|    *| ** *|    *| ** *| Remv 1 bits, intv 1 at 3
+ *  |    *|    *|    *|    *|    *|    *|    *|    *| Remv 2 bits, intv 2 at 6
+ *  |     |     |     |     |     |     |     |     | Remv 1 bits, intv 1 at 4
+ *  |**   |     |**   |     |**   |     |**   |     |   OK 2 bits, intv 2 at 0
+ *  |***  |     |**   |     |***  |     |**   |     |   OK 1 bits, intv 4 at 2
+ *  |*****|     |** **|     |*****|     |** **|     |   OK 2 bits, intv 2 at 3
+ *  |*****|*    |** **|     |*****|*    |** **|     |   OK 1 bits, intv 4 at 5
+ *  |*****|***  |** **| **  |*****|***  |** **| **  |   OK 2 bits, intv 2 at 6
+ *  |*****|*****|** **| ****|*****|*****|** **| ****|   OK 2 bits, intv 2 at 8
+ *  |*****|*****|*****| ****|*****|*****|*****| ****|   OK 1 bits, intv 4 at 12
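+ *
+ * As a sketch, the first row above would correspond to a call like this on
+ * an empty map (5, 8, 2 and 2 are just the values used in the diagram):
+ *	start = pmap_schedule(map, 5, 8, 2, 2, 0, false);
+ * which returns 0 and reserves bits 0-1 of periods 0, 2, 4 and 6.  The
+ * matching undo would be:
+ *	pmap_unschedule(map, 5, 8, 2, 2, start);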
+ *
+ * This function is pretty generic and could be easily abstracted if anything
+ * needed similar scheduling.
+ *
+ * Returns either -ENOSPC or a >= 0 start bit which should be passed to the
+ * unschedule routine.  The map bitmap will be updated on a non-error result.
  */
-static const unsigned short max_uframe_usecs[] = {
-	100, 100, 100, 100, 100, 100, 30, 0
-};
+static int pmap_schedule(unsigned long *map, int bits_per_period,
+			 int periods_in_map, int num_bits,
+			 int interval, int start, bool only_one_period)
+{
+	int interval_bits;
+	int to_reserve;
+	int first_end;
+	int i;
+
+	if (num_bits > bits_per_period)
+		return -ENOSPC;
+
+	/* Adjust interval as per description */
+	interval = gcd(interval, periods_in_map);
+
+	interval_bits = bits_per_period * interval;
+	to_reserve = periods_in_map / interval;
+
+	/* If start has gotten us past interval then we can't schedule */
+	if (start >= interval_bits)
+		return -ENOSPC;
+
+	if (only_one_period)
+		/* Must fit within same period as start; end at begin of next */
+		first_end = (start / bits_per_period + 1) * bits_per_period;
+	else
+		/* Can fit anywhere in the first interval */
+		first_end = interval_bits;
+
+	/*
+	 * We'll try to pick the first repetition, then see if that time
+	 * is free for each of the subsequent repetitions.  If it's not
+	 * we'll adjust the start time for the next search of the first
+	 * repetition.
+	 */
+	while (start + num_bits <= first_end) {
+		int end;
+
+		/* Need to stay within this period */
+		end = (start / bits_per_period + 1) * bits_per_period;
+
+		/* Look for num_bits free bits in this period, from start */
+		start = bitmap_find_next_zero_area(map, end, start, num_bits,
+						   0);
+
+		/*
+		 * We should get start >= end if we fail.  We might be
+		 * able to check the next period depending on the
+		 * interval, so continue on (start already updated).
+		 */
+		if (start >= end) {
+			start = end;
+			continue;
+		}
+
+		/* At this point we have a valid point for first one */
+		for (i = 1; i < to_reserve; i++) {
+			int ith_start = start + interval_bits * i;
+			int ith_end = end + interval_bits * i;
+			int ret;
+
+			/* Use this as a dumb "check if bits are 0" */
+			ret = bitmap_find_next_zero_area(
+				map, ith_start + num_bits, ith_start, num_bits,
+				0);
+
+			/* We got the right place, continue checking */
+			if (ret == ith_start)
+				continue;
+
+			/* Move start up for next time and exit for loop */
+			ith_start = bitmap_find_next_zero_area(
+				map, ith_end, ith_start, num_bits, 0);
+			if (ith_start >= ith_end)
+				/* Need a whole new period next time */
+				start = end;
+			else
+				start = ith_start - interval_bits * i;
+			break;
+		}
+
+		/* If didn't exit the for loop with a break, we have success */
+		if (i == to_reserve)
+			break;
+	}
+
+	if (start + num_bits > first_end)
+		return -ENOSPC;
 
-void dwc2_hcd_init_usecs(struct dwc2_hsotg *hsotg)
+	for (i = 0; i < to_reserve; i++) {
+		int ith_start = start + interval_bits * i;
+
+		bitmap_set(map, ith_start, num_bits);
+	}
+
+	return start;
+}
+
+/**
+ * pmap_unschedule() - Undo work done by pmap_schedule()
+ *
+ * @map:             See pmap_schedule().
+ * @bits_per_period: See pmap_schedule().
+ * @periods_in_map:  See pmap_schedule().
+ * @num_bits:        The number of bits that was passed to schedule.
+ * @interval:        The interval that was passed to schedule.
+ * @start:           The return value from pmap_schedule().
+ */
+static void pmap_unschedule(unsigned long *map, int bits_per_period,
+			    int periods_in_map, int num_bits,
+			    int interval, int start)
 {
+	int interval_bits;
+	int to_release;
 	int i;
 
-	for (i = 0; i < 8; i++)
-		hsotg->frame_usecs[i] = max_uframe_usecs[i];
+	/* Adjust interval as per description in pmap_schedule() */
+	interval = gcd(interval, periods_in_map);
+
+	interval_bits = bits_per_period * interval;
+	to_release = periods_in_map / interval;
+
+	for (i = 0; i < to_release; i++) {
+		int ith_start = start + interval_bits * i;
+
+		bitmap_clear(map, ith_start, num_bits);
+	}
 }
 
-static int dwc2_find_single_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+/*
+ * cat_printf() - A printf() + strcat() helper
+ *
+ * This is useful for concatenating a bunch of strings where each string is
+ * constructed using printf.
+ *
+ * @buf:   The destination buffer; will be updated to point after the printed
+ *         data.
+ * @size:  The number of bytes in the buffer (includes space for '\0').
+ * @fmt:   The format for printf.
+ * @...:   The args for printf.
+ */
+static void cat_printf(char **buf, size_t *size, const char *fmt, ...)
 {
-	unsigned short utime = qh->host_us;
+	va_list args;
 	int i;
 
-	for (i = 0; i < 8; i++) {
-		/* At the start hsotg->frame_usecs[i] = max_uframe_usecs[i] */
-		if (utime <= hsotg->frame_usecs[i]) {
-			hsotg->frame_usecs[i] -= utime;
-			qh->frame_usecs[i] += utime;
-			return i;
-		}
+	if (*size == 0)
+		return;
+
+	va_start(args, fmt);
+	i = vsnprintf(*buf, *size, fmt, args);
+	va_end(args);
+
+	if (i >= *size) {
+		(*buf)[*size - 1] = '\0';
+		*buf += *size;
+		*size = 0;
+	} else {
+		*buf += i;
+		*size -= i;
 	}
-	return -ENOSPC;
 }
 
 /*
- * use this for FS apps that can span multiple uframes
+ * pmap_print() - Print the given periodic map
+ *
+ * Will attempt to print out the periodic schedule.
+ *
+ * @map:             See pmap_schedule().
+ * @bits_per_period: See pmap_schedule().
+ * @periods_in_map:  See pmap_schedule().
+ * @period_name:     The name of 1 period, like "uFrame"
+ * @units:           The name of the units, like "us".
+ * @print_fn:        The function to call for printing.
+ * @print_data:      Opaque data to pass to the print function.
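+ *
+ * As an illustration (not taken from a real log): with period_name "uFrame"
+ * and units "us", a uFrame 0 whose first two bits are set would be handed
+ * to print_fn as "uFrame 0: 0 us -  1 us".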
  */
-static int dwc2_find_multi_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+static void pmap_print(unsigned long *map, int bits_per_period,
+		       int periods_in_map, const char *period_name,
+		       const char *units,
+		       void (*print_fn)(const char *str, void *data),
+		       void *print_data)
 {
-	unsigned short utime = qh->host_us;
-	unsigned short xtime;
-	int t_left;
+	int period;
+
+	for (period = 0; period < periods_in_map; period++) {
+		char tmp[64];
+		char *buf = tmp;
+		size_t buf_size = sizeof(tmp);
+		int period_start = period * bits_per_period;
+		int period_end = period_start + bits_per_period;
+		int start = 0;
+		int count = 0;
+		bool printed = false;
+		int i;
+
+		for (i = period_start; i < period_end + 1; i++) {
+			/* Handle case when ith bit is set */
+			if (i < period_end &&
+			    bitmap_find_next_zero_area(map, i + 1,
+						       i, 1, 0) != i) {
+				if (count == 0)
+					start = i - period_start;
+				count++;
+				continue;
+			}
+
+			/* ith bit isn't set; don't care if count == 0 */
+			if (count == 0)
+				continue;
+
+			if (!printed)
+				cat_printf(&buf, &buf_size, "%s %d: ",
+					   period_name, period);
+			else
+				cat_printf(&buf, &buf_size, ", ");
+			printed = true;
+
+			cat_printf(&buf, &buf_size, "%d %s -%3d %s", start,
+				   units, start + count - 1, units);
+			count = 0;
+		}
+
+		if (printed)
+			print_fn(tmp, print_data);
+	}
+}
+
+/**
+ * dwc2_get_ls_map() - Get the map used for the given qh
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller.
+ * @qh:    QH for the periodic transfer.
+ *
+ * We'll always get the periodic map out of our TT.  Note that even if we're
+ * running the host straight in low speed / full speed mode it appears as if
+ * a TT is allocated for us, so we'll use it.  If that ever changes we can
+ * add logic here to get a map out of "hsotg" if !qh->do_split.
+ *
+ * Returns: the map or NULL if a map couldn't be found.
+ */
+static unsigned long *dwc2_get_ls_map(struct dwc2_hsotg *hsotg,
+				      struct dwc2_qh *qh)
+{
+	unsigned long *map;
+
+	/* Don't expect to be missing a TT and be doing low speed scheduling */
+	if (WARN_ON(!qh->dwc_tt))
+		return NULL;
+
+	/* Get the map and adjust if this is a multi_tt hub */
+	map = qh->dwc_tt->periodic_bitmaps;
+	if (qh->dwc_tt->usb_tt->multi)
+		map += DWC2_ELEMENTS_PER_LS_BITMAP * qh->ttport;
+
+	return map;
+}
+
+struct dwc2_qh_print_data {
+	struct dwc2_hsotg *hsotg;
+	struct dwc2_qh *qh;
+};
+
+/**
+ * dwc2_qh_print() - Helper function for dwc2_qh_schedule_print()
+ *
+ * @str:  The string to print
+ * @data: A pointer to a struct dwc2_qh_print_data
+ */
+static void dwc2_qh_print(const char *str, void *data)
+{
+	struct dwc2_qh_print_data *print_data = data;
+
+	dwc2_sch_dbg(print_data->hsotg, "QH=%p ...%s\n", print_data->qh, str);
+}
+
+/**
+ * dwc2_qh_schedule_print() - Print the periodic schedule
+ *
+ * @hsotg: The HCD state structure for the DWC OTG controller.
+ * @qh:    QH to print.
+ */
+static void dwc2_qh_schedule_print(struct dwc2_hsotg *hsotg,
+				   struct dwc2_qh *qh)
+{
+	struct dwc2_qh_print_data print_data = { hsotg, qh };
 	int i;
-	int j;
-	int k;
 
-	for (i = 0; i < 8; i++) {
-		if (hsotg->frame_usecs[i] <= 0)
+	/*
+	 * The printing functions are quite slow and inefficient.
+	 * If we don't have tracing turned on, don't run unless the special
+	 * define is turned on.
+	 */
+#ifndef DWC2_PRINT_SCHEDULE
+	return;
+#endif
+
+	if (qh->schedule_low_speed) {
+		unsigned long *map = dwc2_get_ls_map(hsotg, qh);
+
+		dwc2_sch_dbg(hsotg, "QH=%p LS/FS trans: %d=>%d us @ %d us\n",
+			     qh, qh->device_us,
+			     DWC2_ROUND_US_TO_SLICE(qh->device_us),
+			     DWC2_US_PER_SLICE * qh->ls_start_schedule_slice);
+
+		if (map) {
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p Whole low/full speed map %p now:\n",
+				     qh, map);
+			pmap_print(map, DWC2_LS_PERIODIC_SLICES_PER_FRAME,
+				   DWC2_LS_SCHEDULE_FRAMES, "Frame ", "slices",
+				   dwc2_qh_print, &print_data);
+		}
+	}
+
+	for (i = 0; i < qh->num_hs_transfers; i++) {
+		struct dwc2_hs_transfer_time *trans_time = qh->hs_transfers + i;
+		int uframe = trans_time->start_schedule_us /
+			     DWC2_HS_PERIODIC_US_PER_UFRAME;
+		int rel_us = trans_time->start_schedule_us %
+			     DWC2_HS_PERIODIC_US_PER_UFRAME;
+
+		dwc2_sch_dbg(hsotg,
+			     "QH=%p HS trans #%d: %d us @ uFrame %d + %d us\n",
+			     qh, i, trans_time->duration_us, uframe, rel_us);
+	}
+	if (qh->num_hs_transfers) {
+		dwc2_sch_dbg(hsotg, "QH=%p Whole high speed map now:\n", qh);
+		pmap_print(hsotg->hs_periodic_bitmap,
+			   DWC2_HS_PERIODIC_US_PER_UFRAME,
+			   DWC2_HS_SCHEDULE_UFRAMES, "uFrame", "us",
+			   dwc2_qh_print, &print_data);
+	}
+
+}
+
+/**
+ * dwc2_ls_pmap_schedule() - Schedule a low speed QH
+ *
+ * @hsotg:        The HCD state structure for the DWC OTG controller.
+ * @qh:           QH for the periodic transfer.
+ * @search_slice: We'll start trying to schedule at the passed slice.
+ *                Remember that slices are the units of the low speed
+ *                schedule (think 25us or so).
+ *
+ * Wraps pmap_schedule() with the right parameters for low speed scheduling.
+ *
+ * Normally we schedule low speed devices on the map associated with the TT.
+ *
+ * Returns: 0 for success or an error code.
+ */
+static int dwc2_ls_pmap_schedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
+				 int search_slice)
+{
+	int slices = DIV_ROUND_UP(qh->device_us, DWC2_US_PER_SLICE);
+	unsigned long *map = dwc2_get_ls_map(hsotg, qh);
+	int slice;
+
+	if (map == NULL)
+		return -EINVAL;
+
+	/*
+	 * Schedule on the proper low speed map with our low speed scheduling
+	 * parameters.  Note that we use the "device_interval" here since
+	 * we want the low speed interval and the only way we'd be in this
+	 * function is if the device is low speed.
+	 *
+	 * If we happen to be doing low speed and high speed scheduling for the
+	 * same transaction (AKA we have a split) we always do low speed first.
+	 * That means we can always pass "false" for only_one_period (that
+	 * parameter is only useful when we're trying to get one schedule to
+	 * match what we already planned in the other schedule).
+	 */
+	slice = pmap_schedule(map, DWC2_LS_PERIODIC_SLICES_PER_FRAME,
+			      DWC2_LS_SCHEDULE_FRAMES, slices,
+			      qh->device_interval, search_slice, false);
+
+	if (slice < 0)
+		return slice;
+
+	qh->ls_start_schedule_slice = slice;
+	return 0;
+}
+
+/**
+ * dwc2_ls_pmap_unschedule() - Undo work done by dwc2_ls_pmap_schedule()
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static void dwc2_ls_pmap_unschedule(struct dwc2_hsotg *hsotg,
+				    struct dwc2_qh *qh)
+{
+	int slices = DIV_ROUND_UP(qh->device_us, DWC2_US_PER_SLICE);
+	unsigned long *map = dwc2_get_ls_map(hsotg, qh);
+
+	/* With no map, scheduling must have failed; nothing to undo */
+	if (map == NULL)
+		return;
+
+	pmap_unschedule(map, DWC2_LS_PERIODIC_SLICES_PER_FRAME,
+			DWC2_LS_SCHEDULE_FRAMES, slices, qh->device_interval,
+			qh->ls_start_schedule_slice);
+}
+
+/**
+ * dwc2_hs_pmap_schedule - Schedule in the main high speed schedule
+ *
+ * This will schedule something on the main dwc2 schedule.
+ *
+ * We'll start looking in qh->hs_transfers[index].start_schedule_us.  We'll
+ * update this with the result upon success.  We also use the duration from
+ * the same structure.
+ *
+ * @hsotg:           The HCD state structure for the DWC OTG controller.
+ * @qh:              QH for the periodic transfer.
+ * @only_one_period: If true we will limit ourselves to just looking at
+ *                   one period (aka one 100us chunk).  This is used if we have
+ *                   already scheduled something on the low speed schedule and
+ *                   need to find something that matches on the high speed one.
+ * @index:           The index into qh->hs_transfers that we're working with.
+ *
+ * Returns: 0 for success or an error code.  Upon success the
+ *          dwc2_hs_transfer_time specified by "index" will be updated.
+ */
+static int dwc2_hs_pmap_schedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
+				 bool only_one_period, int index)
+{
+	struct dwc2_hs_transfer_time *trans_time = qh->hs_transfers + index;
+	int us;
+
+	us = pmap_schedule(hsotg->hs_periodic_bitmap,
+			   DWC2_HS_PERIODIC_US_PER_UFRAME,
+			   DWC2_HS_SCHEDULE_UFRAMES, trans_time->duration_us,
+			   qh->host_interval, trans_time->start_schedule_us,
+			   only_one_period);
+
+	if (us < 0)
+		return us;
+
+	trans_time->start_schedule_us = us;
+	return 0;
+}
+
+/**
+ * dwc2_hs_pmap_unschedule() - Undo work done by dwc2_hs_pmap_schedule()
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ * @index:       The index into qh->hs_transfers that we're working with.
+ */
+static void dwc2_hs_pmap_unschedule(struct dwc2_hsotg *hsotg,
+				    struct dwc2_qh *qh, int index)
+{
+	struct dwc2_hs_transfer_time *trans_time = qh->hs_transfers + index;
+
+	pmap_unschedule(hsotg->hs_periodic_bitmap,
+			DWC2_HS_PERIODIC_US_PER_UFRAME,
+			DWC2_HS_SCHEDULE_UFRAMES, trans_time->duration_us,
+			qh->host_interval, trans_time->start_schedule_us);
+}
+
+/**
+ * dwc2_uframe_schedule_split - Schedule a QH for a periodic split xfer.
+ *
+ * This is the most complicated thing in USB.  We have to find matching time
+ * in both the global high speed schedule for the port and the low speed
+ * schedule for the TT associated with the given device.
+ *
+ * Being here means that the host must be running in high speed mode and the
+ * device is in low or full speed mode (and behind a hub).
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule_split(struct dwc2_hsotg *hsotg,
+				      struct dwc2_qh *qh)
+{
+	int bytecount = dwc2_hb_mult(qh->maxp) * dwc2_max_packet(qh->maxp);
+	int ls_search_slice;
+	int err = 0;
+	int host_interval_in_sched;
+
+	/*
+	 * The interval (how often to repeat) in the actual host schedule.
+	 * See pmap_schedule() for gcd() explanation.
+	 */
+	host_interval_in_sched = gcd(qh->host_interval,
+				     DWC2_HS_SCHEDULE_UFRAMES);
+
+	/*
+	 * We always try to find space in the low speed schedule first, then
+	 * try to find high speed time that matches.  If we don't, we'll bump
+	 * up the place we start searching in the low speed schedule and try
+	 * again.  To start we'll look right at the beginning of the low speed
+	 * schedule.
+	 *
+	 * Note that this will tend to front-load the high speed schedule.
+	 * We may eventually want to try to avoid this by either considering
+	 * both schedules together or doing some sort of round robin.
+	 */
+	ls_search_slice = 0;
+
+	while (ls_search_slice < DWC2_LS_SCHEDULE_SLICES) {
+		int start_s_uframe;
+		int ssplit_s_uframe;
+		int second_s_uframe;
+		int rel_uframe;
+		int first_count;
+		int middle_count;
+		int end_count;
+		int first_data_bytes;
+		int other_data_bytes;
+		int i;
+
+		if (qh->schedule_low_speed) {
+			err = dwc2_ls_pmap_schedule(hsotg, qh, ls_search_slice);
+
+			/*
+			 * If we got an error here there's no other magic we
+			 * can do, so bail.  All the looping above is only
+			 * helpful to redo things if we got a low speed slot
+			 * and then couldn't find a matching high speed slot.
+			 */
+			if (err)
+				return err;
+		} else {
+			/* Must be missing the tt structure?  Why? */
+			WARN_ON_ONCE(1);
+		}
+
+		/*
+		 * This will give us a number 0 - 7 if
+		 * DWC2_LS_SCHEDULE_FRAMES == 1, or 0 - 15 if == 2, or ...
+		 */
+		start_s_uframe = qh->ls_start_schedule_slice /
+				 DWC2_SLICES_PER_UFRAME;
+
+		/* Get a number that's always 0 - 7 */
+		rel_uframe = (start_s_uframe % 8);
+
+		/*
+		 * If we were going to start in uframe 7 then we would need to
+		 * issue a start split in uframe 6, which spec says is not OK.
+		 * Move on to the next full frame (assuming there is one).
+		 *
+		 * See 11.18.4 Host Split Transaction Scheduling Requirements
+		 * bullet 1.
+		 */
+		if (rel_uframe == 7) {
+			if (qh->schedule_low_speed)
+				dwc2_ls_pmap_unschedule(hsotg, qh);
+			ls_search_slice =
+				(qh->ls_start_schedule_slice /
+				 DWC2_LS_PERIODIC_SLICES_PER_FRAME + 1) *
+				DWC2_LS_PERIODIC_SLICES_PER_FRAME;
 			continue;
+		}
 
 		/*
-		 * we need n consecutive slots so use j as a start slot
-		 * j plus j+1 must be enough time (for now)
+		 * For ISOC in:
+		 * - start split            (frame -1)
+		 * - complete split w/ data (frame +1)
+		 * - complete split w/ data (frame +2)
+		 * - ...
+		 * - complete split w/ data (frame +num_data_packets)
+		 * - complete split w/ data (frame +num_data_packets+1)
+		 * - complete split w/ data (frame +num_data_packets+2, max 8)
+		 *   ...though if frame was "0" then max is 7...
+		 *
+		 * For ISOC out we might need to do:
+		 * - start split w/ data    (frame -1)
+		 * - start split w/ data    (frame +0)
+		 * - ...
+		 * - start split w/ data    (frame +num_data_packets-2)
+		 *
+		 * For INTERRUPT in we might need to do:
+		 * - start split            (frame -1)
+		 * - complete split w/ data (frame +1)
+		 * - complete split w/ data (frame +2)
+		 * - complete split w/ data (frame +3, max 8)
+		 *
+		 * For INTERRUPT out we might need to do:
+		 * - start split w/ data    (frame -1)
+		 * - complete split         (frame +1)
+		 * - complete split         (frame +2)
+		 * - complete split         (frame +3, max 8)
+		 *
+		 * Start adjusting!
 		 */
-		xtime = hsotg->frame_usecs[i];
-		for (j = i + 1; j < 8; j++) {
-			/*
-			 * if we add this frame remaining time to xtime we may
-			 * be OK, if not we need to test j for a complete frame
-			 */
-			if (xtime + hsotg->frame_usecs[j] < utime) {
-				if (hsotg->frame_usecs[j] <
-							max_uframe_usecs[j])
-					continue;
+		ssplit_s_uframe = (start_s_uframe +
+				   host_interval_in_sched - 1) %
+				  host_interval_in_sched;
+		if (qh->ep_type == USB_ENDPOINT_XFER_ISOC && !qh->ep_is_in)
+			second_s_uframe = start_s_uframe;
+		else
+			second_s_uframe = start_s_uframe + 1;
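+
+		/*
+		 * Example, assuming host_interval_in_sched is 8: a transfer
+		 * starting in uframe 3 gets its start split in uframe
+		 * (3 + 8 - 1) % 8 = 2, and one starting in uframe 0 wraps
+		 * around to uframe 7.
+		 */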
+
+		/* First data transfer might not be all 188 bytes. */
+		first_data_bytes = 188 -
+			DIV_ROUND_UP(188 * (qh->ls_start_schedule_slice %
+					    DWC2_SLICES_PER_UFRAME),
+				     DWC2_SLICES_PER_UFRAME);
+		if (first_data_bytes > bytecount)
+			first_data_bytes = bytecount;
+		other_data_bytes = bytecount - first_data_bytes;
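+
+		/*
+		 * Worked example, assuming (for illustration only) that
+		 * DWC2_SLICES_PER_UFRAME is 5 and bytecount is 200: starting
+		 * 2 slices into the uframe gives
+		 *   first_data_bytes = 188 - DIV_ROUND_UP(188 * 2, 5)
+		 *                    = 188 - 76 = 112
+		 *   other_data_bytes = 200 - 112 = 88
+		 */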
+
+		/*
+		 * For now, skip OUT xfers where first xfer is partial
+		 *
+		 * Main dwc2 code assumes:
+		 * - INT transfers never get split in two.
+		 * - ISOC transfers can always transfer 188 bytes the first
+		 *   time.
+		 *
+		 * Until that code is fixed, try again if the first transfer
+		 * couldn't transfer everything.
+		 *
+		 * This code can be removed if/when the rest of dwc2 handles
+		 * the above cases.  Until it's fixed we just won't be able
+		 * to schedule quite as tightly.
+		 */
+		if (!qh->ep_is_in &&
+		    (first_data_bytes != min_t(int, 188, bytecount))) {
+			dwc2_sch_dbg(hsotg,
+				     "QH=%p avoiding broken 1st xfer (%d, %d)\n",
+				     qh, first_data_bytes, bytecount);
+			if (qh->schedule_low_speed)
+				dwc2_ls_pmap_unschedule(hsotg, qh);
+			ls_search_slice = (start_s_uframe + 1) *
+				DWC2_SLICES_PER_UFRAME;
+			continue;
+		}
+
+		/* Start by counting just the transfers that carry data */
+		qh->num_hs_transfers = 1 + DIV_ROUND_UP(other_data_bytes, 188);
+
+		/*
+		 * Everything except ISOC OUT has extra transfers.  Rules are
+		 * complicated.  See 11.18.4 Host Split Transaction Scheduling
+		 * Requirements bullet 3.
+		 */
+		if (qh->ep_type == USB_ENDPOINT_XFER_INT) {
+			if (rel_uframe == 6)
+				qh->num_hs_transfers += 2;
+			else
+				qh->num_hs_transfers += 3;
+
+			if (qh->ep_is_in) {
+				/*
+				 * First is start split, middle/end is data.
+				 * Allocate full data bytes for all data.
+				 */
+				first_count = 4;
+				middle_count = bytecount;
+				end_count = bytecount;
+			} else {
+				/*
+				 * First is data, middle/end is complete.
+				 * First transfer and second can have data.
+				 * Rest should just have complete split.
+				 */
+				first_count = first_data_bytes;
+				middle_count = max_t(int, 4, other_data_bytes);
+				end_count = 4;
 			}
-			if (xtime >= utime) {
-				t_left = utime;
-				for (k = i; k < 8; k++) {
-					t_left -= hsotg->frame_usecs[k];
-					if (t_left <= 0) {
-						qh->frame_usecs[k] +=
-							hsotg->frame_usecs[k]
-								+ t_left;
-						hsotg->frame_usecs[k] = -t_left;
-						return i;
-					} else {
-						qh->frame_usecs[k] +=
-							hsotg->frame_usecs[k];
-						hsotg->frame_usecs[k] = 0;
-					}
-				}
+		} else {
+			if (qh->ep_is_in) {
+				int last;
+
+				/* Account for the start split */
+				qh->num_hs_transfers++;
+
+				/* Calculate "L" value from spec */
+				last = rel_uframe + qh->num_hs_transfers + 1;
+
+				/* Start with basic case */
+				if (last <= 6)
+					qh->num_hs_transfers += 2;
+				else
+					qh->num_hs_transfers += 1;
+
+				/* Adjust downwards */
+				if (last >= 6 && rel_uframe == 0)
+					qh->num_hs_transfers--;
+
+				/* 1st = start; rest can contain data */
+				first_count = 4;
+				middle_count = min_t(int, 188, bytecount);
+				end_count = middle_count;
+			} else {
+				/* All contain data, last might be smaller */
+				first_count = first_data_bytes;
+				middle_count = min_t(int, 188,
+						     other_data_bytes);
+				end_count = other_data_bytes % 188;
 			}
-			/* add the frame time to x time */
-			xtime += hsotg->frame_usecs[j];
-			/* we must have a fully available next frame or break */
-			if (xtime < utime &&
-			   hsotg->frame_usecs[j] == max_uframe_usecs[j])
-				continue;
 		}
+
+		/* Assign durations per uFrame */
+		qh->hs_transfers[0].duration_us = HS_USECS_ISO(first_count);
+		for (i = 1; i < qh->num_hs_transfers - 1; i++)
+			qh->hs_transfers[i].duration_us =
+				HS_USECS_ISO(middle_count);
+		if (qh->num_hs_transfers > 1)
+			qh->hs_transfers[qh->num_hs_transfers - 1].duration_us =
+				HS_USECS_ISO(end_count);
+
+		/*
+		 * Assign start us.  The call below to dwc2_hs_pmap_schedule()
+		 * will start with these numbers but may adjust within the same
+		 * microframe.
+		 */
+		qh->hs_transfers[0].start_schedule_us =
+			ssplit_s_uframe * DWC2_HS_PERIODIC_US_PER_UFRAME;
+		for (i = 1; i < qh->num_hs_transfers; i++)
+			qh->hs_transfers[i].start_schedule_us =
+				((second_s_uframe + i - 1) %
+				 DWC2_HS_SCHEDULE_UFRAMES) *
+				DWC2_HS_PERIODIC_US_PER_UFRAME;
+
+		/* Try to schedule with filled in hs_transfers above */
+		for (i = 0; i < qh->num_hs_transfers; i++) {
+			err = dwc2_hs_pmap_schedule(hsotg, qh, true, i);
+			if (err)
+				break;
+		}
+
+		/* If we scheduled all w/out breaking out then we're all good */
+		if (i == qh->num_hs_transfers)
+			break;
+
+		/* Entry i failed and reserved nothing; undo only 0..i-1 */
+		for (i--; i >= 0; i--)
+			dwc2_hs_pmap_unschedule(hsotg, qh, i);
+
+		if (qh->schedule_low_speed)
+			dwc2_ls_pmap_unschedule(hsotg, qh);
+
+		/* Try again starting in the next microframe */
+		ls_search_slice = (start_s_uframe + 1) * DWC2_SLICES_PER_UFRAME;
 	}
-	return -ENOSPC;
+
+	if (ls_search_slice >= DWC2_LS_SCHEDULE_SLICES)
+		return -ENOSPC;
+
+	return 0;
 }
 
-static int dwc2_find_uframe(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+/**
+ * dwc2_uframe_schedule_hs - Schedule a QH for a periodic high speed xfer.
+ *
+ * Basically this just wraps dwc2_hs_pmap_schedule() to provide a clean
+ * interface.
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule_hs(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	/* In non-split host and device time are the same */
+	WARN_ON(qh->host_us != qh->device_us);
+	WARN_ON(qh->host_interval != qh->device_interval);
+	WARN_ON(qh->num_hs_transfers != 1);
+
+	/* We'll have one transfer; init start to 0 before calling scheduler */
+	qh->hs_transfers[0].start_schedule_us = 0;
+	qh->hs_transfers[0].duration_us = qh->host_us;
+
+	return dwc2_hs_pmap_schedule(hsotg, qh, false, 0);
+}
+
+/**
+ * dwc2_uframe_schedule_ls - Schedule a QH for a periodic low/full speed xfer.
+ *
+ * Basically this just wraps dwc2_ls_pmap_schedule() to provide a clean
+ * interface.
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule_ls(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	/* In non-split host and device time are the same */
+	WARN_ON(qh->host_us != qh->device_us);
+	WARN_ON(qh->host_interval != qh->device_interval);
+	WARN_ON(!qh->schedule_low_speed);
+
+	/* Run on the main low speed schedule (no split = no hub = no TT) */
+	return dwc2_ls_pmap_schedule(hsotg, qh, 0);
+}
+
+/**
+ * dwc2_uframe_schedule - Schedule a QH for a periodic xfer.
+ *
+ * Calls one of the 3 sub-functions depending on what type of transfer this QH
+ * is for.  Also adds some printing.
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static int dwc2_uframe_schedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 {
 	int ret;
 
-	if (qh->dev_speed == USB_SPEED_HIGH) {
-		/* if this is a hs transaction we need a full frame */
-		ret = dwc2_find_single_uframe(hsotg, qh);
-	} else {
-		/*
-		 * if this is a fs transaction we may need a sequence
-		 * of frames
-		 */
-		ret = dwc2_find_multi_uframe(hsotg, qh);
-	}
+	if (qh->dev_speed == USB_SPEED_HIGH)
+		ret = dwc2_uframe_schedule_hs(hsotg, qh);
+	else if (!qh->do_split)
+		ret = dwc2_uframe_schedule_ls(hsotg, qh);
+	else
+		ret = dwc2_uframe_schedule_split(hsotg, qh);
+
+	if (ret)
+		dwc2_sch_dbg(hsotg, "QH=%p Failed to schedule %d\n", qh, ret);
+	else
+		dwc2_qh_schedule_print(hsotg, qh);
+
 	return ret;
 }
 
 /**
+ * dwc2_uframe_unschedule - Undoes dwc2_uframe_schedule().
+ *
+ * @hsotg:       The HCD state structure for the DWC OTG controller.
+ * @qh:          QH for the periodic transfer.
+ */
+static void dwc2_uframe_unschedule(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
+{
+	int i;
+
+	for (i = 0; i < qh->num_hs_transfers; i++)
+		dwc2_hs_pmap_unschedule(hsotg, qh, i);
+
+	if (qh->schedule_low_speed)
+		dwc2_ls_pmap_unschedule(hsotg, qh);
+
+	dwc2_sch_dbg(hsotg, "QH=%p Unscheduled\n", qh);
+}
+
+/**
  * dwc2_pick_first_frame() - Choose 1st frame for qh that's already scheduled
  *
  * Takes a qh that has already been scheduled (which means we know we have the
@@ -265,6 +1082,7 @@ static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	u16 frame_number;
 	u16 earliest_frame;
 	u16 next_active_frame;
+	u16 relative_frame;
 	u16 interval;
 
 	/*
@@ -292,8 +1110,36 @@ static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		goto exit;
 	}
 
-	/* Adjust interval as per high speed schedule which has 8 uFrame */
-	interval = gcd(qh->host_interval, 8);
+	if (qh->dev_speed == USB_SPEED_HIGH || qh->do_split) {
+		/*
+		 * We're either at high speed or we're doing a split (which
+		 * means we're talking high speed to a hub).  In any case
+		 * the first frame should be based on when the first scheduled
+		 * event is.
+		 */
+		WARN_ON(qh->num_hs_transfers < 1);
+
+		relative_frame = qh->hs_transfers[0].start_schedule_us /
+				 DWC2_HS_PERIODIC_US_PER_UFRAME;
+
+		/* Adjust interval as per high speed schedule */
+		interval = gcd(qh->host_interval, DWC2_HS_SCHEDULE_UFRAMES);
+
+	} else {
+		/*
+		 * Low or full speed directly on dwc2.  Just about the same
+		 * as high speed but on a different schedule and with slightly
+		 * different adjustments.  Note that this works because when
+		 * the host and device are both low speed then frames in the
+		 * controller tick at low speed.
+		 */
+		relative_frame = qh->ls_start_schedule_slice /
+				 DWC2_LS_PERIODIC_SLICES_PER_FRAME;
+		interval = gcd(qh->host_interval, DWC2_LS_SCHEDULE_FRAMES);
+	}
+
+	/* Scheduler messed up if frame is past interval */
+	WARN_ON(relative_frame >= interval);
 
 	/*
 	 * We know interval must divide (HFNUM_MAX_FRNUM + 1) now that we've
@@ -310,7 +1156,7 @@ static void dwc2_pick_first_frame(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	 * scheduled for.
 	 */
 	next_active_frame = dwc2_frame_num_inc(next_active_frame,
-					       qh->assigned_uframe);
+					       relative_frame);
 
 	/*
 	 * We actually need 1 frame before since the next_active_frame is
@@ -351,9 +1197,7 @@ static int dwc2_do_reserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	int status;
 
 	if (hsotg->core_params->uframe_sched > 0) {
-		status = dwc2_find_uframe(hsotg, qh);
-		if (status >= 0)
-			qh->assigned_uframe = status;
+		status = dwc2_uframe_schedule(hsotg, qh);
 	} else {
 		status = dwc2_periodic_channel_available(hsotg);
 		if (status) {
@@ -410,12 +1254,7 @@ static void dwc2_do_unreserve(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 	hsotg->periodic_usecs -= qh->host_us;
 
 	if (hsotg->core_params->uframe_sched > 0) {
-		int i;
-
-		for (i = 0; i < 8; i++) {
-			hsotg->frame_usecs[i] += qh->frame_usecs[i];
-			qh->frame_usecs[i] = 0;
-		}
+		dwc2_uframe_unschedule(hsotg, qh);
 	} else {
 		/* Release periodic channel reservation */
 		hsotg->periodic_channels--;
@@ -606,88 +1445,81 @@ static void dwc2_deschedule_periodic(struct dwc2_hsotg *hsotg,
  * @qh:    The QH to init
  * @urb:   Holds the information about the device/endpoint needed to initialize
  *         the QH
+ * @mem_flags: Flags for allocating memory.
  */
 static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
-			 struct dwc2_hcd_urb *urb)
+			 struct dwc2_hcd_urb *urb, gfp_t mem_flags)
 {
-	int dev_speed, hub_addr, hub_port;
+	int dev_speed = dwc2_host_get_speed(hsotg, urb->priv);
+	u8 ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
+	bool ep_is_in = !!dwc2_hcd_is_pipe_in(&urb->pipe_info);
+	bool ep_is_isoc = (ep_type == USB_ENDPOINT_XFER_ISOC);
+	bool ep_is_int = (ep_type == USB_ENDPOINT_XFER_INT);
+	u32 hprt = dwc2_readl(hsotg->regs + HPRT0);
+	u32 prtspd = (hprt & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
+	bool do_split = (prtspd == HPRT0_SPD_HIGH_SPEED &&
+			 dev_speed != USB_SPEED_HIGH);
+	int maxp = dwc2_hcd_get_mps(&urb->pipe_info);
+	int bytecount = dwc2_hb_mult(maxp) * dwc2_max_packet(maxp);
 	char *speed, *type;
 
-	dev_vdbg(hsotg->dev, "%s()\n", __func__);
-
 	/* Initialize QH */
 	qh->hsotg = hsotg;
 	setup_timer(&qh->unreserve_timer, dwc2_unreserve_timer_fn,
 		    (unsigned long)qh);
-	qh->ep_type = dwc2_hcd_get_pipe_type(&urb->pipe_info);
-	qh->ep_is_in = dwc2_hcd_is_pipe_in(&urb->pipe_info) ? 1 : 0;
+	qh->ep_type = ep_type;
+	qh->ep_is_in = ep_is_in;
 
 	qh->data_toggle = DWC2_HC_PID_DATA0;
-	qh->maxp = dwc2_hcd_get_mps(&urb->pipe_info);
+	qh->maxp = maxp;
 	INIT_LIST_HEAD(&qh->qtd_list);
 	INIT_LIST_HEAD(&qh->qh_list_entry);
 
-	/* FS/LS Endpoint on HS Hub, NOT virtual root hub */
-	dev_speed = dwc2_host_get_speed(hsotg, urb->priv);
+	qh->do_split = do_split;
+	qh->dev_speed = dev_speed;
+
+	if (ep_is_int || ep_is_isoc) {
+		/* Compute scheduling parameters once and save them */
+		int host_speed = do_split ? USB_SPEED_HIGH : dev_speed;
+		struct dwc2_tt *dwc_tt = dwc2_host_get_tt_info(hsotg, urb->priv,
+							       mem_flags,
+							       &qh->ttport);
+		int device_ns;
 
-	dwc2_host_hub_info(hsotg, urb->priv, &hub_addr, &hub_port);
+		qh->dwc_tt = dwc_tt;
 
-	if ((dev_speed == USB_SPEED_LOW || dev_speed == USB_SPEED_FULL) &&
-	    hub_addr != 0 && hub_addr != 1) {
-		dev_vdbg(hsotg->dev,
-			 "QH init: EP %d: TT found at hub addr %d, for port %d\n",
-			 dwc2_hcd_get_ep_num(&urb->pipe_info), hub_addr,
-			 hub_port);
-		qh->do_split = 1;
-	}
+		qh->host_us = NS_TO_US(usb_calc_bus_time(host_speed, ep_is_in,
+				       ep_is_isoc, bytecount));
+		device_ns = usb_calc_bus_time(dev_speed, ep_is_in,
+					      ep_is_isoc, bytecount);
 
-	if (qh->ep_type == USB_ENDPOINT_XFER_INT ||
-	    qh->ep_type == USB_ENDPOINT_XFER_ISOC) {
-		/* Compute scheduling parameters once and save them */
-		u32 hprt, prtspd;
-
-		/* Todo: Account for split transfers in the bus time */
-		int bytecount =
-			dwc2_hb_mult(qh->maxp) * dwc2_max_packet(qh->maxp);
-
-		qh->host_us = NS_TO_US(usb_calc_bus_time(qh->do_split ?
-			      USB_SPEED_HIGH : dev_speed, qh->ep_is_in,
-			      qh->ep_type == USB_ENDPOINT_XFER_ISOC,
-			      bytecount));
-
-		qh->host_interval = urb->interval;
-		dwc2_sch_dbg(hsotg, "QH=%p init nxt=%04x, fn=%04x, int=%#x\n",
-			     qh, qh->next_active_frame, hsotg->frame_number,
-			     qh->host_interval);
-#if 0
-		/* Increase interrupt polling rate for debugging */
-		if (qh->ep_type == USB_ENDPOINT_XFER_INT)
-			qh->host_interval = 8;
-#endif
-		hprt = dwc2_readl(hsotg->regs + HPRT0);
-		prtspd = (hprt & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT;
-		if (prtspd == HPRT0_SPD_HIGH_SPEED &&
-		    (dev_speed == USB_SPEED_LOW ||
-		     dev_speed == USB_SPEED_FULL)) {
-			qh->host_interval *= 8;
-			dwc2_sch_dbg(hsotg,
-				     "QH=%p init*8 nxt=%04x, fn=%04x, int=%#x\n",
-				     qh, qh->next_active_frame,
-				     hsotg->frame_number, qh->host_interval);
+		if (do_split && dwc_tt)
+			device_ns += dwc_tt->usb_tt->think_time;
+		qh->device_us = NS_TO_US(device_ns);
 
-		}
-		dev_dbg(hsotg->dev, "interval=%d\n", qh->host_interval);
-	}
 
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH Initialized\n");
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - qh = %p\n", qh);
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Device Address = %d\n",
-		 dwc2_hcd_get_dev_addr(&urb->pipe_info));
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Endpoint %d, %s\n",
-		 dwc2_hcd_get_ep_num(&urb->pipe_info),
-		 dwc2_hcd_is_pipe_in(&urb->pipe_info) ? "IN" : "OUT");
+		qh->device_interval = urb->interval;
+		qh->host_interval = urb->interval * (do_split ? 8 : 1);
 
-	qh->dev_speed = dev_speed;
+		/*
+		 * Schedule low speed if we're running the host in low or
+		 * full speed OR if we've got a "TT" to deal with to access this
+		 * device.
+		 */
+		qh->schedule_low_speed = prtspd != HPRT0_SPD_HIGH_SPEED ||
+					 dwc_tt;
+
+		if (do_split) {
+			/* We won't know num transfers until we schedule */
+			qh->num_hs_transfers = -1;
+		} else if (dev_speed == USB_SPEED_HIGH) {
+			qh->num_hs_transfers = 1;
+		} else {
+			qh->num_hs_transfers = 0;
+		}
+
+		/* We'll schedule later when we have something to do */
+	}
 
 	switch (dev_speed) {
 	case USB_SPEED_LOW:
@@ -703,7 +1535,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		speed = "?";
 		break;
 	}
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Speed = %s\n", speed);
 
 	switch (qh->ep_type) {
 	case USB_ENDPOINT_XFER_ISOC:
@@ -723,13 +1554,21 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
 		break;
 	}
 
-	dev_vdbg(hsotg->dev, "DWC OTG HCD QH - Type = %s\n", type);
-
-	if (qh->ep_type == USB_ENDPOINT_XFER_INT) {
-		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - usecs = %d\n",
-			 qh->host_us);
-		dev_vdbg(hsotg->dev, "DWC OTG HCD QH - interval = %d\n",
-			 qh->host_interval);
+	dwc2_sch_dbg(hsotg, "QH=%p Init %s, %s speed, %d bytes:\n", qh, type,
+		     speed, bytecount);
+	dwc2_sch_dbg(hsotg, "QH=%p ...addr=%d, ep=%d, %s\n", qh,
+		     dwc2_hcd_get_dev_addr(&urb->pipe_info),
+		     dwc2_hcd_get_ep_num(&urb->pipe_info),
+		     ep_is_in ? "IN" : "OUT");
+	if (ep_is_int || ep_is_isoc) {
+		dwc2_sch_dbg(hsotg,
+			     "QH=%p ...duration: host=%d us, device=%d us\n",
+			     qh, qh->host_us, qh->device_us);
+		dwc2_sch_dbg(hsotg, "QH=%p ...interval: host=%d, device=%d\n",
+			     qh, qh->host_interval, qh->device_interval);
+		if (qh->schedule_low_speed)
+			dwc2_sch_dbg(hsotg, "QH=%p ...low speed schedule=%p\n",
+				     qh, dwc2_get_ls_map(hsotg, qh));
 	}
 }
 
@@ -757,7 +1596,7 @@ struct dwc2_qh *dwc2_hcd_qh_create(struct dwc2_hsotg *hsotg,
 	if (!qh)
 		return NULL;
 
-	dwc2_qh_init(hsotg, qh, urb);
+	dwc2_qh_init(hsotg, qh, urb, mem_flags);
 
 	if (hsotg->core_params->dma_desc_enable > 0 &&
 	    dwc2_hcd_qh_init_ddma(hsotg, qh, mem_flags) < 0) {
@@ -789,6 +1628,7 @@ void dwc2_hcd_qh_free(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh)
 		dwc2_do_unreserve(hsotg, qh);
 		spin_unlock_irqrestore(&hsotg->lock, flags);
 	}
+	dwc2_host_put_tt_info(hsotg, qh->dwc_tt);
 
 	if (qh->desc_list)
 		dwc2_hcd_qh_free_ddma(hsotg, qh);
@@ -904,6 +1744,8 @@ static int dwc2_next_for_periodic_split(struct dwc2_hsotg *hsotg,
 	u16 incr;
 
 	/*
+	 * See dwc2_uframe_schedule_split() for split scheduling.
+	 *
 	 * Basically: increment 1 normally, but 2 right after the start split
 	 * (except for ISOC out).
 	 */
@@ -1006,9 +1848,17 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
 	if (qh->start_active_frame == qh->next_active_frame ||
 	    dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
 		u16 ideal_start = qh->start_active_frame;
+		int periods_in_map;
 
-		/* Adjust interval as per gcd with plan length. */
-		interval = gcd(interval, 8);
+		/*
+		 * Adjust interval as per gcd with map size.
+		 * See pmap_schedule() for more details here.
+		 */
+		if (qh->do_split || qh->dev_speed == USB_SPEED_HIGH)
+			periods_in_map = DWC2_HS_SCHEDULE_UFRAMES;
+		else
+			periods_in_map = DWC2_LS_SCHEDULE_FRAMES;
+		interval = gcd(interval, periods_in_map);
 
 		do {
 			qh->start_active_frame = dwc2_frame_num_inc(
-- 
2.7.0.rc3.207.g0ac5344
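
For readers following the interval handling in dwc2_qh_init() and
dwc2_pick_first_frame() above, here is a minimal standalone sketch of the
arithmetic.  It is not driver code.  It assumes the high speed schedule
map spans 8 microframes (taken from the removed "gcd(qh->host_interval, 8)"
line), and every sketch_* name and SKETCH_* constant is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: the high speed periodic map covers 8 microframes. */
#define SKETCH_HS_SCHEDULE_UFRAMES 8u

/* Euclid's algorithm; stands in for the kernel's gcd() helper. */
static unsigned int sketch_gcd(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * For a split, the endpoint's interval is in full speed frames while the
 * host talks high speed to the hub, so the host-side interval is counted
 * in microframes (8 per frame).
 */
static unsigned int sketch_host_interval(unsigned int urb_interval,
					 bool do_split)
{
	return urb_interval * (do_split ? 8u : 1u);
}

/*
 * The map repeats every SKETCH_HS_SCHEDULE_UFRAMES microframes, so the
 * interval is reduced to its gcd with the map size before it is used to
 * pick a starting frame.
 */
static unsigned int sketch_effective_interval(unsigned int host_interval)
{
	assert(host_interval > 0);
	return sketch_gcd(host_interval, SKETCH_HS_SCHEDULE_UFRAMES);
}
```

With these definitions a split endpoint polled every 4 frames has a host
interval of 32 microframes and an effective interval of 8, while a high
speed endpoint with interval 3 ends up with an effective interval of 1.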

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH v6 22/22] usb: dwc2: host: If using uframe scheduler, end splits better
@ 2016-01-29  2:20   ` Douglas Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Douglas Anderson @ 2016-01-29  2:20 UTC (permalink / raw)
  To: John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, Douglas Anderson, johnyoun, gregkh,
	linux-usb, linux-kernel

The microframe scheduler figured out exactly how many transfers we need
for a split transaction.  Let's use this knowledge to know when to end
things.

Without this I found that certain devices would just keep responding
with tons of NYET responses on their INT_IN endpoint.  These would just
keep going and going and eventually we'd decide to terminate the
transfer (because the whole frame changed), but by that time the
scheduler would decide that we "missed" the start of the next transfer.
I can also imagine that if we blow past the end of our scheduled time we
may mess up other things that were scheduled to happen.

No known test cases are improved by this patch except that the scheduler
code doesn't yell about MISSES constantly anymore.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
---
Changes in v6:
- Add Heiko's Tested-by.
- Add Stefan's Tested-by.

Changes in v5: None
Changes in v4:
- If using uframe scheduler, end splits better new for v4.

Changes in v3: None
Changes in v2: None

 drivers/usb/dwc2/hcd_intr.c | 48 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 42 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
index dc285667233a..f7a325a8c741 100644
--- a/drivers/usb/dwc2/hcd_intr.c
+++ b/drivers/usb/dwc2/hcd_intr.c
@@ -1363,14 +1363,50 @@ static void dwc2_hc_nyet_intr(struct dwc2_hsotg *hsotg,
 
 		if (chan->ep_type == USB_ENDPOINT_XFER_INT ||
 		    chan->ep_type == USB_ENDPOINT_XFER_ISOC) {
-			int frnum = dwc2_hcd_get_frame_number(hsotg);
+			struct dwc2_qh *qh = chan->qh;
+			bool past_end;
+
+			if (hsotg->core_params->uframe_sched <= 0) {
+				int frnum = dwc2_hcd_get_frame_number(hsotg);
+
+				/* Don't have num_hs_transfers; simple logic */
+				past_end = dwc2_full_frame_num(frnum) !=
+				     dwc2_full_frame_num(qh->next_active_frame);
+			} else {
+				int end_frnum;
 
-			if (dwc2_full_frame_num(frnum) !=
-			    dwc2_full_frame_num(chan->qh->next_active_frame)) {
 				/*
-				 * No longer in the same full speed frame.
-				 * Treat this as a transaction error.
-				 */
+				* Figure out the end frame based on schedule.
+				*
+				* We don't want to go on trying again and again
+				* forever.  Let's stop when we've done all the
+				* transfers that were scheduled.
+				*
+				* We're going to be comparing start_active_frame
+				* and next_active_frame, both of which are 1
+				* before the time the packet goes on the wire,
+				* so that cancels out.  Basically if we had 1
+				* transfer and we saw 1 NYET then we're done.
+				* We're getting a NYET here so if next >=
+				* (start + num_transfers) we're done. The
+				* complexity is that for all but ISOC_OUT we
+				* skip one slot.
+				*/
+				end_frnum = dwc2_frame_num_inc(
+					qh->start_active_frame,
+					qh->num_hs_transfers);
+
+				if (qh->ep_type != USB_ENDPOINT_XFER_ISOC ||
+				    qh->ep_is_in)
+					end_frnum =
+					       dwc2_frame_num_inc(end_frnum, 1);
+
+				past_end = dwc2_frame_num_le(
+					end_frnum, qh->next_active_frame);
+			}
+
+			if (past_end) {
+				/* Treat this as a transaction error. */
 #if 0
 				/*
 				 * Todo: Fix system performance so this can
-- 
2.7.0.rc3.207.g0ac5344
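
The end-of-split check in the hunk above can be sketched in isolation as
follows.  This is a standalone illustration, not the driver's code: the
sketch_* helpers are hypothetical stand-ins for dwc2_frame_num_inc() and
dwc2_frame_num_le(), and the 14-bit frame counter width is an assumption
stated in the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: frame numbers are 14 bits wide and wrap at 0x3fff. */
#define SKETCH_MAX_FRNUM 0x3fffu

/* Advance a frame number, wrapping at the counter width. */
static uint16_t sketch_frame_inc(uint16_t frame, uint16_t inc)
{
	return (uint16_t)((frame + inc) & SKETCH_MAX_FRNUM);
}

/* True if a is at or before b, treating the counter as circular. */
static bool sketch_frame_le(uint16_t a, uint16_t b)
{
	return ((uint16_t)(b - a) & SKETCH_MAX_FRNUM) <=
	       (SKETCH_MAX_FRNUM >> 1);
}

/* Decide whether a NYET arrived after all scheduled transfers were used. */
static bool sketch_split_past_end(uint16_t start_active_frame,
				  uint16_t next_active_frame,
				  int num_hs_transfers,
				  bool is_isoc, bool is_in)
{
	uint16_t end;

	assert(num_hs_transfers >= 1);
	end = sketch_frame_inc(start_active_frame,
			       (uint16_t)num_hs_transfers);

	/* Everything except ISOC OUT skips one slot after the start split. */
	if (!is_isoc || is_in)
		end = sketch_frame_inc(end, 1);

	return sketch_frame_le(end, next_active_frame);
}
```

For an INT IN endpoint starting at frame 100 with 3 scheduled transfers,
the end frame is 104, so a NYET seen with next_active_frame at 103 keeps
going while one seen at 104 ends the split.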

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 10/22] usb: dwc2: host: Properly set the HFIR
@ 2016-01-31  9:23     ` Kever Yang
  0 siblings, 0 replies; 71+ messages in thread
From: Kever Yang @ 2016-01-31  9:23 UTC (permalink / raw)
  To: Douglas Anderson, John Youn, balbi
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, johnyoun, gregkh, linux-usb,
	linux-kernel

Doug,

On 01/29/2016 10:20 AM, Douglas Anderson wrote:
> According to the most up to date version of the dwc2 databook, the FRINT
> field of the HFIR register should be programmed to:
> * 125 us * (PHY clock freq for HS) - 1
> * 1000 us * (PHY clock freq for FS/LS) - 1
I have three versions of the dwc_otg databook (2.74a, 2.94a and 3.10a),
and all of them describe FrInt as:

* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

Maybe John can help check the design.

There are some feature differences between the new and old versions, but
I'm not sure whether this is one of them.

The doc says that if no value is programmed, the core calculates the value
based on the PHY clock specified in the FS/LS PHY Clock Select field of
the Host Configuration register (HCFG.FSLSPclkSel).  Does this work?

Thanks,
- Kever
>
> This is opposed to older versions of the doc that claimed it should be:
> * 125 us * (PHY clock freq for HS)
> * 1000 us * (PHY clock freq for FS/LS)
>
> In case you didn't spot it, the difference is the "- 1".
>
> Let's add the "- 1" to match the newest user manual.  It's presumed that
> the "- 1" should have always been there and that this was always a
> documentation error.  If some hardware needs the "- 1" and other
> hardware doesn't, we'll have to add a configuration parameter for it in
> the future.
>
> I checked things before and after this patch on rk3288 using a Total
> Phase Beagle 5000 analyzer.
>
> Before this patch, a low speed mouse shows constant Frame Timing Jitter
> errors.  After this patch errors have gone away.
>
> Before this patch SOF packets move forward about 1 us per 4 ms.  After
> this patch the SOF packets move backward about 1 us per 255 ms.  Some
> specific SOF timestamps from the analyzer are below.
>
> Before:
>    6.603.790
>    6.603.916
>    6.604.041
>    6.604.166
>    ...
>    6.607.541
>    6.607.667
>    6.607.792
>    6.607.917
>    ...
>    6.611.417
>    6.611.543
>    6.611.668
>    6.611.793
>
> After:
>    6.215.159
>    6.215.284
>    6.215.408
>    6.215.533
>    6.215.658
>    ...
>    6.470.658
>    6.470.783
>    6.470.907
>    ...
>    6.726.032
>    6.726.157
>    6.725.281
>    6.725.406
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> Tested-by: Heiko Stuebner <heiko@sntech.de>
> ---
> Changes in v6:
> - Incorporated Properly set the HFIR patch to big series in v6
> - Add Heiko's Tested-by.
>
> Changes in v5: None
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
>
>   drivers/usb/dwc2/core.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
> index ed73b26818c0..a5db20f12ee4 100644
> --- a/drivers/usb/dwc2/core.c
> +++ b/drivers/usb/dwc2/core.c
> @@ -2245,10 +2245,10 @@ u32 dwc2_calc_frame_interval(struct dwc2_hsotg *hsotg)
>   
>   	if ((hprt0 & HPRT0_SPD_MASK) >> HPRT0_SPD_SHIFT == HPRT0_SPD_HIGH_SPEED)
>   		/* High speed case */
> -		return 125 * clock;
> +		return 125 * clock - 1;
>   	else
>   		/* FS/LS case */
> -		return 1000 * clock;
> +		return 1000 * clock - 1;
>   }
>   
>   /**
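
To make the numbers in the quoted commit message easier to check, here is
a small standalone sketch of the FRINT arithmetic.  It does not settle
which databook revision is right.  It assumes the core counts one more PHY
clock than the programmed value (which is what the "- 1" implies), and the
30 MHz figure used in the note below is only an assumed PHY clock, not
something stated in this thread.  The sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* FRINT as described by the newer databook text quoted above. */
static uint32_t sketch_frint(uint32_t phy_clock_mhz, int high_speed)
{
	assert(phy_clock_mhz > 0);
	return (high_speed ? 125u : 1000u) * phy_clock_mhz - 1u;
}

/*
 * If each (micro)frame runs one PHY clock too long, the SOF slips by
 * 1 / phy_clock_mhz microseconds per frame.  Return how many
 * microseconds of bus time pass before the slip adds up to 1 us.
 */
static uint32_t sketch_drift_period_us(uint32_t phy_clock_mhz,
				       int high_speed)
{
	uint32_t frame_us = high_speed ? 125u : 1000u;

	assert(phy_clock_mhz > 0);
	/* One extra tick per frame: phy_clock_mhz frames make up 1 us. */
	return frame_us * phy_clock_mhz;
}
```

With an assumed 30 MHz PHY clock at high speed this gives FRINT = 3749
and a slip of 1 us every 3750 us, which would be consistent with the
"about 1 us per 4 ms" seen before the patch.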

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 11/22] usb: dwc2: host: There's not really a TT for the root hub
  2016-01-29  2:20   ` Douglas Anderson
  (?)
@ 2016-01-31  9:25   ` Kever Yang
  -1 siblings, 0 replies; 71+ messages in thread
From: Kever Yang @ 2016-01-31  9:25 UTC (permalink / raw)
  To: Douglas Anderson, John Youn, balbi
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, johnyoun, gregkh, linux-usb,
	linux-kernel

Doug,

Reviewed-by: Kever Yang <kever.yang@rock-chips.com>

Thanks,
- Kever
On 01/29/2016 10:20 AM, Douglas Anderson wrote:
> I find that when I plug a full speed (NOT high speed) hub into a dwc2
> port and then I plug a bunch of devices into that full speed hub that
> dwc2 goes bat guano crazy.  Specifically, it just spews errors like this
> in the console:
>    usb usb1: clear tt 1 (9043) error -22
>
> The specific test case I used looks like this:
> /:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=dwc2/1p, 480M
>      |__ Port 1: Dev 17, If 0, Class=Hub, Driver=hub/4p, 12M
>          |__ Port 2: Dev 19, If 0, ..., Driver=usbhid, 1.5M
>          |__ Port 4: Dev 20, If 0, ..., Driver=usbhid, 12M
>          |__ Port 4: Dev 20, If 1, ..., Driver=usbhid, 12M
>          |__ Port 4: Dev 20, If 2, ..., Driver=usbhid, 12M
>
> Showing VID/PID:
>   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>   Bus 001 Device 017: ID 03eb:3301 Atmel Corp. at43301 4-Port Hub
>   Bus 001 Device 020: ID 045e:0745 Microsoft Corp. Nano Transceiver ...
>   Bus 001 Device 019: ID 046d:c404 Logitech, Inc. TrackMan Wheel
>
> I spent a bunch of time trying to figure out why there are errors to
> begin with.  I believe that the issue may be a hardware issue where the
> transceiver sometimes accidentally sends a PREAMBLE packet if you send a
> packet to a full speed device right after one to a low speed device.
> Luckily the USB driver retries and the second time things work OK.
>
> In any case, things kinda seem to work despite the errors, except for the
> "clear tt" spew mucking up my console.  Chalk it up for a win for
> retries and robust protocols.
>
> So getting back to the "clear tt" problem, it appears that we get those
> because there's not actually a TT here to clear.  It's my understanding
> that when dwc2 operates in low speed or full speed mode that there's no
> real TT out there.  That makes all these attempts to "clear the TT"
> somewhat meaningless and also causes the spew in the log.
>
> Let's just skip all the useless TT clears.  Eventually we should root
> cause the errors, but even if we do this is still a proper fix and is
> likely to avoid the "clear tt" error in the future.
>
> Note that hooking up a Full Speed USB Audio Device (Jabra 510) to this
> same hub with the keyboard / trackball shows that even audio works over
> this janky connection.  As a point to note, this particular change (skip
> bogus TT clears) compared to just commenting out the dev_err() in
> hub_tt_work() actually produces better audio.
>
> Note: don't ask me where I got a full speed USB hub or whether the
> massive amount of dust that accumulated on it while it was in my junk
> box affected its functionality.  Just smile and nod.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> Changes in v6:
> - There's not really a TT for the root hub new for v6
>
> Changes in v5: None
> Changes in v4: None
> Changes in v3: None
> Changes in v2: None
>
>   drivers/usb/dwc2/hcd_intr.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
>
> diff --git a/drivers/usb/dwc2/hcd_intr.c b/drivers/usb/dwc2/hcd_intr.c
> index 5d25a5ec9736..fe44870f84eb 100644
> --- a/drivers/usb/dwc2/hcd_intr.c
> +++ b/drivers/usb/dwc2/hcd_intr.c
> @@ -87,6 +87,7 @@ static void dwc2_hc_handle_tt_clear(struct dwc2_hsotg *hsotg,
>   				    struct dwc2_host_chan *chan,
>   				    struct dwc2_qtd *qtd)
>   {
> +	struct usb_device *root_hub = dwc2_hsotg_to_hcd(hsotg)->self.root_hub;
>   	struct urb *usb_urb;
>   
>   	if (!chan->qh)
> @@ -102,6 +103,15 @@ static void dwc2_hc_handle_tt_clear(struct dwc2_hsotg *hsotg,
>   	if (!usb_urb || !usb_urb->dev || !usb_urb->dev->tt)
>   		return;
>   
> +	/*
> +	 * The root hub doesn't really have a TT, but Linux thinks it
> +	 * does because how could you have a "high speed hub" that
> +	 * talks directly to low speed devices without a TT?
> +	 * It's all lies.  Lies, I tell you.
> +	 */
> +	if (usb_urb->dev->tt->hub == root_hub)
> +		return;
> +
>   	if (qtd->urb->status != -EPIPE && qtd->urb->status != -EREMOTEIO) {
>   		chan->qh->tt_buffer_dirty = 1;
>   		if (usb_hub_clear_tt_buffer(usb_urb))
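
The check added by the quoted hunk boils down to a pointer comparison
between the hub that owns the device's TT and the root hub.  A minimal
standalone sketch, using hypothetical sketch_* types in place of the real
struct usb_device and struct usb_tt, might look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_usb_device;

/* Stand-in for a TT record: only the owning hub matters here. */
struct sketch_tt {
	struct sketch_usb_device *hub;
};

/* Stand-in for a USB device: only its TT pointer matters here. */
struct sketch_usb_device {
	struct sketch_tt *tt;
};

/* Return true only when there is a real (non root hub) TT to clear. */
static bool sketch_should_clear_tt(const struct sketch_usb_device *dev,
				   const struct sketch_usb_device *root_hub)
{
	if (!dev || !dev->tt)
		return false;

	/* A TT that claims to live in the root hub is not a real TT. */
	if (dev->tt->hub == root_hub)
		return false;

	return true;
}
```

A device whose TT points at an external hub still gets its TT buffer
cleared; one whose TT points at the root hub is skipped.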

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-01-31  9:36     ` Kever Yang
  0 siblings, 0 replies; 71+ messages in thread
From: Kever Yang @ 2016-01-31  9:36 UTC (permalink / raw)
  To: Douglas Anderson, John Youn, balbi
  Cc: huangtao, stefan.wahren, heiko, johnyoun, gregkh, ming.lei,
	linux-usb, linux-kernel, linux-rockchip, yousaf.kaukab, stern,
	linux-rpi-kernel, gregory.herrero, william.wu, Julius Werner,
	dinguyen

Doug,

On 01/29/2016 10:20 AM, Douglas Anderson wrote:
> In dwc2_hcd_qh_deactivate() we will put some things on the
> periodic_sched_ready list.  These things won't be taken off the ready
> list until the next SOF, which might be a little late.  Let's put them
> on right away.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> Tested-by: Heiko Stuebner <heiko@sntech.de>
> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
> ---
> Changes in v6:
> - Add Heiko's Tested-by.
> - Add Stefan's Tested-by.
>
> Changes in v5: None
> Changes in v4:
> - Schedule periodic right away if it's time new for v4.
>
> Changes in v3: None
> Changes in v2: None
>
>   drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>   1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
> index 9b3c435339ee..3abb34a5fc5b 100644
> --- a/drivers/usb/dwc2/hcd_queue.c
> +++ b/drivers/usb/dwc2/hcd_queue.c
> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>   	 * Note: we purposely use the frame_number from the "hsotg" structure
>   	 * since we know SOF interrupt will handle future frames.
>   	 */
> -	if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
> +	if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
> +		enum dwc2_transaction_type tr_type;
> +
> +		/*
> +		 * We're bypassing the SOF handler which is normally what puts
> +		 * us on the ready list because we're in a hurry and need to
> +		 * try to catch up.
> +		 */
> +		dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
> +			      qh, frame_number, qh->next_active_frame);
>   		list_move_tail(&qh->qh_list_entry,
>   			       &hsotg->periodic_sched_ready);
> -	else
> +
> +		tr_type = dwc2_hcd_select_transactions(hsotg);
Do we need to add the select_transactions call here? If we get into this
function from interrupt context, then once we put the qh on the ready
queue it can still be handled in this frame by the later call to
dwc2_hcd_select_transactions(). So all we need to do here is put it on
the ready list instead of the inactive queue and wait for the schedule.

Thanks,
- Kever

> +		if (tr_type != DWC2_TRANSACTION_NONE)
> +			dwc2_hcd_queue_transactions(hsotg, tr_type);
> +	} else {
>   		list_move_tail(&qh->qh_list_entry,
>   			       &hsotg->periodic_sched_inactive);
> +	}
>   }
>   
>   /**

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-01-31 22:09       ` Doug Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-01-31 22:09 UTC (permalink / raw)
  To: Kever Yang
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb, linux-kernel, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Herrero, Gregory,
	吴良峰,
	Julius Werner, Dinh Nguyen

Kever,

On Sun, Jan 31, 2016 at 1:36 AM, Kever Yang <kever.yang@rock-chips.com> wrote:
> Doug,
>
>
> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>
>> In dwc2_hcd_qh_deactivate() we will put some things on the
>> periodic_sched_ready list.  These things won't be taken off the ready
>> list until the next SOF, which might be a little late.  Let's put them
>> on right away.
>>
>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>> Tested-by: Heiko Stuebner <heiko@sntech.de>
>> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
>> ---
>> Changes in v6:
>> - Add Heiko's Tested-by.
>> - Add Stefan's Tested-by.
>>
>> Changes in v5: None
>> Changes in v4:
>> - Schedule periodic right away if it's time new for v4.
>>
>> Changes in v3: None
>> Changes in v2: None
>>
>>   drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>>   1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
>> index 9b3c435339ee..3abb34a5fc5b 100644
>> --- a/drivers/usb/dwc2/hcd_queue.c
>> +++ b/drivers/usb/dwc2/hcd_queue.c
>> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>>          * Note: we purposely use the frame_number from the "hsotg" structure
>>          * since we know SOF interrupt will handle future frames.
>>          */
>> -       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
>> +       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
>> +               enum dwc2_transaction_type tr_type;
>> +
>> +               /*
>> +                * We're bypassing the SOF handler which is normally what puts
>> +                * us on the ready list because we're in a hurry and need to
>> +                * try to catch up.
>> +                */
>> +               dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
>> +                             qh, frame_number, qh->next_active_frame);
>>                 list_move_tail(&qh->qh_list_entry,
>>                                &hsotg->periodic_sched_ready);
>> -       else
>> +
>> +               tr_type = dwc2_hcd_select_transactions(hsotg);
>
> Do we need to add the select_transactions call here? If we get into this
> function from interrupt context, then once we put the qh on the ready
> queue it can still be handled in this frame by the later call to
> dwc2_hcd_select_transactions(). So all we need to do here is put it on
> the ready list instead of the inactive queue and wait for the schedule.

I'm not sure I understand.  Can you restate?


I'll try to explain more in the meantime...

Both before and after my change, this function would place something
on the ready queue if the next_active_frame <= the frame number as of
last SOF interrupt (aka hsotg->frame_number).  Otherwise it goes on
the inactive queue.  Assuming that the previous change ("usb: dwc2:
host: Manage frame nums better in scheduler") worked properly then
next_active_frame shouldn't be less than (hsotg->frame_number - 1).
Remember that next_active_frame is always 1 before the wire frame, so
if "next_active_frame == hsotg->frame_number - 1" it means that we
need to get the transfer on the wire _right away_.  If
"next_active_frame == hsotg->frame_number" the transfer doesn't need
to go on the wire right away, but since dwc2 can be prepped one frame
in advance it doesn't hurt to give it to the hardware right away if
there's space.
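
As a rough illustration of the three cases above, here is a small
userspace sketch (not driver code). The 14-bit wrap constant and the
names frame_num_le(), classify() and the urgency enum are assumptions
made up for the example; only the idea of comparing next_active_frame
against the frame number from the last SOF comes from the patch.

```c
#include <stdint.h>

/* Assumption: the frame counter is 14 bits wide and wraps at 0x3fff. */
#define MAX_FRNUM 0x3fff

/* Wrap-aware "a <= b": true if b is at most half the range ahead of a. */
static int frame_num_le(uint16_t a, uint16_t b)
{
	return ((b - a) & MAX_FRNUM) <= (MAX_FRNUM >> 1);
}

/* Hypothetical labels for the cases described in the text. */
enum urgency {
	URGENCY_INACTIVE,	/* next_active_frame is still in the future */
	URGENCY_PREP,		/* equal to the SOF frame: can be prepped early */
	URGENCY_NOW,		/* SOF frame - 1 (or earlier): wire frame is now */
};

static enum urgency classify(uint16_t next_active_frame, uint16_t sof_frame)
{
	if (!frame_num_le(next_active_frame, sof_frame))
		return URGENCY_INACTIVE;	/* goes on the inactive list */
	if (next_active_frame == sof_frame)
		return URGENCY_PREP;		/* ready list, not yet urgent */
	return URGENCY_NOW;			/* ready list, already due */
}
```

For example, classify(4, 5) lands in the "right away" case, and so does
classify(0x3fff, 0), since the comparison has to survive the counter
wrapping around.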

As I understand it, if we stick something on the ready queue it won't
generally get looked at until the next SOF interrupt.  That means
we'll be too late if "next_active_frame == hsotg->frame_number - 1"
and we'll possibly be too late (depending on interrupt latency) if
"next_active_frame == hsotg->frame_number".


Note that before my series, there were more places than just the SOF
interrupt that would update hsotg->frame_number (see "usb: dwc2: host:
Manage frame nums better in scheduler" for fix).  Also before my
series (specifically "usb: dwc2: host: Manage frame nums better in
scheduler") we used the actual current frame number when doing
comparisons.  Also before my series (specifically "usb: dwc2: host:
Properly set even/odd frame") we didn't really place things in the
frame that they were scheduled in anyway.


Also note that I believe that when dwc2_hcd_qh_deactivate() is called
our spinlock is held which means that the SOF interrupt either ran
before our function or won't run till after it.

>
>> +               if (tr_type != DWC2_TRANSACTION_NONE)
>> +                       dwc2_hcd_queue_transactions(hsotg, tr_type);
>> +       } else {
>>                 list_move_tail(&qh->qh_list_entry,
>>                                &hsotg->periodic_sched_inactive);
>> +       }
>>   }
>>     /**
>
>
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 10/22] usb: dwc2: host: Properly set the HFIR
  2016-01-31  9:23     ` Kever Yang
  (?)
@ 2016-01-31 22:19     ` Doug Anderson
  2016-02-10  2:08       ` John Youn
  -1 siblings, 1 reply; 71+ messages in thread
From: Doug Anderson @ 2016-01-31 22:19 UTC (permalink / raw)
  To: Kever Yang
  Cc: John Youn, Felipe Balbi, 吴良峰,
	Tao Huang, Heiko Stübner, Stefan Wahren,
	open list:ARM/Rockchip SoC...,
	linux-rpi-kernel, Julius Werner, Herrero, Gregory, Kaukab,
	Yousaf, Dinh Nguyen, Alan Stern, Ming Lei, John Youn,
	Greg Kroah-Hartman, linux-usb, linux-kernel

Kever,

On Sun, Jan 31, 2016 at 1:23 AM, Kever Yang <kever.yang@rock-chips.com> wrote:
> Doug,
>
> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>
>> According to the most up to date version of the dwc2 databook, the FRINT
>> field of the HFIR register should be programmed to:
>> * 125 us * (PHY clock freq for HS) - 1
>> * 1000 us * (PHY clock freq for FS/LS) - 1
>
> I got 3 version of dwc_otg databook, 2.74a, 2.94a and 3.10a,
> all the doc describe the FrInt as:

Can you check to see if you can get 3.30a (October 2015)?


> * 125 us * (PHY clock freq for HS)
> * 1000 us * (PHY clock freq for FS/LS)
>
> Maybe John can help to check the design.

Yes, this really needs John or someone at Synopsys.


> There are some feature differences between the new and old versions, but
> I'm not sure if this is one of them.
>
> The doc says that if no value is programmed, the core calculates the value
> based on the PHY clock specified in the FS/LS PHY Clock Select field of
> the Host Configuration register (HCFG.FSLSPclkSel). Does this work?

It seems to.  It looks like that's what makes our firmware work.  I'm
not 100% sure if there are any downsides to that approach...
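
The two candidate formulas quoted above differ by exactly one PHY clock
cycle. A small sketch makes the numbers concrete. The function name
frint_value() and its parameters are hypothetical, and the clock rates
in the usage note are example inputs only, not values read from any
databook or board.

```c
#include <stdint.h>

/*
 * frint_value() - candidate FRINT value for a given PHY clock.
 * @phy_clk_mhz: PHY clock in MHz (example input, not read from hardware)
 * @high_speed:  nonzero for HS (125 us microframe), zero for FS/LS
 *               (1000 us frame)
 * @minus_one:   nonzero to apply the "- 1" form under discussion
 */
static uint32_t frint_value(uint32_t phy_clk_mhz, int high_speed,
			    int minus_one)
{
	uint32_t interval_us = high_speed ? 125 : 1000;
	uint32_t clocks = interval_us * phy_clk_mhz;	/* us * MHz = cycles */

	return minus_one ? clocks - 1 : clocks;
}
```

With an assumed 60 MHz HS PHY clock this gives 7500 without the "- 1"
and 7499 with it. With an assumed 48 MHz FS/LS clock it gives 48000
versus 47999. Which form the hardware expects is the open question for
Synopsys.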

-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
  2016-01-31 22:09       ` Doug Anderson
  (?)
@ 2016-02-01  3:32       ` Kever Yang
  2016-02-01  4:36           ` Doug Anderson
  -1 siblings, 1 reply; 71+ messages in thread
From: Kever Yang @ 2016-02-01  3:32 UTC (permalink / raw)
  To: Doug Anderson
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb, linux-kernel, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Herrero, Gregory,
	吴良峰,
	Julius Werner, Dinh Nguyen

Doug,

On 02/01/2016 06:09 AM, Doug Anderson wrote:
> Kever,
>
> On Sun, Jan 31, 2016 at 1:36 AM, Kever Yang <kever.yang@rock-chips.com> wrote:
>> Doug,
>>
>>
>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>> In dwc2_hcd_qh_deactivate() we will put some things on the
>>> periodic_sched_ready list.  These things won't be taken off the ready
>>> list until the next SOF, which might be a little late.  Let's put them
>>> on right away.
>>>
>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>>> Tested-by: Heiko Stuebner <heiko@sntech.de>
>>> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
>>> ---
>>> Changes in v6:
>>> - Add Heiko's Tested-by.
>>> - Add Stefan's Tested-by.
>>>
>>> Changes in v5: None
>>> Changes in v4:
>>> - Schedule periodic right away if it's time new for v4.
>>>
>>> Changes in v3: None
>>> Changes in v2: None
>>>
>>>    drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>>>    1 file changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
>>> index 9b3c435339ee..3abb34a5fc5b 100644
>>> --- a/drivers/usb/dwc2/hcd_queue.c
>>> +++ b/drivers/usb/dwc2/hcd_queue.c
>>> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>>>           * Note: we purposely use the frame_number from the "hsotg" structure
>>>           * since we know SOF interrupt will handle future frames.
>>>           */
>>> -       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
>>> +       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
>>> +               enum dwc2_transaction_type tr_type;
>>> +
>>> +               /*
>>> +                * We're bypassing the SOF handler which is normally what puts
>>> +                * us on the ready list because we're in a hurry and need to
>>> +                * try to catch up.
>>> +                */
>>> +               dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
>>> +                             qh, frame_number, qh->next_active_frame);
>>>                  list_move_tail(&qh->qh_list_entry,
>>>                                 &hsotg->periodic_sched_ready);
>>> -       else
>>> +
>>> +               tr_type = dwc2_hcd_select_transactions(hsotg);
>> Do we need to add the select_transactions call here? If we get into this
>> function from interrupt context, then once we put the qh on the ready
>> queue it can still be handled in this frame by the later call to
>> dwc2_hcd_select_transactions(). So all we need to do here is put it on
>> the ready list instead of the inactive queue and wait for the schedule.
> I'm not sure I understand.  Can you restate?
>
>
> I'll try to explain more in the meantime...
>
> Both before and after my change, this function would place something
> on the ready queue if the next_active_frame <= the frame number as of
> last SOF interrupt (aka hsotg->frame_number).  Otherwise it goes on
> the inactive queue.  Assuming that the previous change ("usb: dwc2:
> host: Manage frame nums better in scheduler") worked properly then
> next_active_frame shouldn't be less than (hsotg->frame_number - 1).
> Remember that next_active_frame is always 1 before the wire frame, so
> if "next_active_frame == hsotg->frame_number - 1" it means that we
> need to get the transfer on the wire _right away_.  If
> "next_active_frame == hsotg->frame_number" the transfer doesn't need
> to go on the wire right away, but since dwc2 can be prepped one frame
> in advance it doesn't hurt to give it to the hardware right away if
> there's space.
>
> As I understand it, if we stick something on the ready queue it won't
> generally get looked at until the next SOF interrupt.  That means
> we'll be too late if "next_active_frame == hsotg->frame_number - 1"
> and we'll possibly be too late (depending on interrupt latency) if
> "next_active_frame == hsotg->frame_number"
>
I understand this patch and agree with your point about scheduling the
periodic transfer right away instead of waiting until at least the next
frame.
My point is that there are only two call paths into
dwc2_hcd_qh_deactivate(), from dwc2_hcd_urb_dequeue() and
dwc2_release_channel(). We don't need to do the schedule for dequeue,
and there is already a dwc2_hcd_select_transactions() call at the end of
dwc2_release_channel(), so maybe we don't need another
dwc2_hcd_select_transactions() here.

I think the time between this point and the dwc2_hcd_select_transactions()
call in dwc2_release_channel() will be the main factor in deciding whether
we need to add a dwc2_hcd_select_transactions() call here.

Thanks,
- Kever
> Note that before my series, there were more places than just the SOF
> interrupt that would update hsotg->frame_number (see "usb: dwc2: host:
> Manage frame nums better in scheduler" for fix).  Also before my
> series (specially "usb: dwc2: host: Manage frame nums better in
> scheduler") we used the actual current frame number when doing
> comparisons.  Also before my series (specifically "usb: dwc2: host:
> Properly set even/odd frame") we didn't really place things in the
> frame that they were scheduled in anyway.
>
>
> Also note that I believe that when dwc2_hcd_qh_deactivate() is called
> our spinlock is held which means that the SOF interrupt either ran
> before our function or won't run till after it.
>
>>> +               if (tr_type != DWC2_TRANSACTION_NONE)
>>> +                       dwc2_hcd_queue_transactions(hsotg, tr_type);
>>> +       } else {
>>>                  list_move_tail(&qh->qh_list_entry,
>>>                                 &hsotg->periodic_sched_inactive);
>>> +       }
>>>    }
>>>      /**
>>
>>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-02-01  4:36           ` Doug Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-02-01  4:36 UTC (permalink / raw)
  To: Kever Yang
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb, linux-kernel, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Herrero, Gregory,
	吴良峰,
	Julius Werner, Dinh Nguyen

Kever,

On Sun, Jan 31, 2016 at 7:32 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
> Doug,
>
>
> On 02/01/2016 06:09 AM, Doug Anderson wrote:
>>
>> Kever,
>>
>> On Sun, Jan 31, 2016 at 1:36 AM, Kever Yang <kever.yang@rock-chips.com>
>> wrote:
>>>
>>> Doug,
>>>
>>>
>>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>>>
>>>> In dwc2_hcd_qh_deactivate() we will put some things on the
>>>> periodic_sched_ready list.  These things won't be taken off the ready
>>>> list until the next SOF, which might be a little late.  Let's put them
>>>> on right away.
>>>>
>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>>>> Tested-by: Heiko Stuebner <heiko@sntech.de>
>>>> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
>>>> ---
>>>> Changes in v6:
>>>> - Add Heiko's Tested-by.
>>>> - Add Stefan's Tested-by.
>>>>
>>>> Changes in v5: None
>>>> Changes in v4:
>>>> - Schedule periodic right away if it's time new for v4.
>>>>
>>>> Changes in v3: None
>>>> Changes in v2: None
>>>>
>>>>    drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>>>>    1 file changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
>>>> index 9b3c435339ee..3abb34a5fc5b 100644
>>>> --- a/drivers/usb/dwc2/hcd_queue.c
>>>> +++ b/drivers/usb/dwc2/hcd_queue.c
>>>> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>>>>           * Note: we purposely use the frame_number from the "hsotg" structure
>>>>           * since we know SOF interrupt will handle future frames.
>>>>           */
>>>> -       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
>>>> +       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
>>>> +               enum dwc2_transaction_type tr_type;
>>>> +
>>>> +               /*
>>>> +                * We're bypassing the SOF handler which is normally what puts
>>>> +                * us on the ready list because we're in a hurry and need to
>>>> +                * try to catch up.
>>>> +                */
>>>> +               dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
>>>> +                             qh, frame_number, qh->next_active_frame);
>>>>                  list_move_tail(&qh->qh_list_entry,
>>>>                                 &hsotg->periodic_sched_ready);
>>>> -       else
>>>> +
>>>> +               tr_type = dwc2_hcd_select_transactions(hsotg);
>>>
>>> Do we need to add the select_transactions call here? If we get into this
>>> function from interrupt context, then once we put the qh on the ready
>>> queue it can still be handled in this frame by the later call to
>>> dwc2_hcd_select_transactions(). So all we need to do here is put it on
>>> the ready list instead of the inactive queue and wait for the schedule.
>>
>> I'm not sure I understand.  Can you restate?
>>
>>
>> I'll try to explain more in the meantime...
>>
>> Both before and after my change, this function would place something
>> on the ready queue if the next_active_frame <= the frame number as of
>> last SOF interrupt (aka hsotg->frame_number).  Otherwise it goes on
>> the inactive queue.  Assuming that the previous change ("usb: dwc2:
>> host: Manage frame nums better in scheduler") worked properly then
>> next_active_frame shouldn't be less than (hsotg->frame_number - 1).
>> Remember that next_active_frame is always 1 before the wire frame, so
>> if "next_active_frame == hsotg->frame_number - 1" it means that we
>> need to get the transfer on the wire _right away_.  If
>> "next_active_frame == hsotg->frame_number" the transfer doesn't need
>> to go on the wire right away, but since dwc2 can be prepped one frame
>> in advance it doesn't hurt to give it to the hardware right away if
>> there's space.
>>
>> As I understand it, if we stick something on the ready queue it won't
>> generally get looked at until the next SOF interrupt.  That means
>> we'll be too late if "next_active_frame == hsotg->frame_number - 1"
>> and we'll possibly be too late (depending on interrupt latency) if
>> "next_active_frame == hsotg->frame_number"
>>
> I understand this patch and agree with your point of schedule the
> periodic right away instead of at least next frame.
> My point is, there are only two call to dwc2_hcd_qh_deactivate(), from
> dwc2_hcd_urb_dequeue() and dwc2_release_channel(), we don't need
> to do the schedule for dequeue, and there is one
> dwc2_hcd_select_transactions() call at the end of dwc2_release_channel(),
> maybe we don't need another dwc2_hcd_select_transactions() here.
>
> I think the duration from this point to the function call of
> dwc2_hcd_select_transactions()
> in dwc2_release_channel() will be the main factor for us to decide if
> we need to add a function call of  dwc2_hcd_select_transactions() here.

Oh, now I get what you're saying!

A) You've got dwc2_release_channel() -> dwc2_deactivate_qh() ->
dwc2_hcd_qh_deactivate()
...and always in that case we'll do a select / queue, so we don't need it there.

B) You've got dwc2_hcd_urb_dequeue() -> dwc2_hcd_qh_deactivate()

...but why don't we need it for dwc2_hcd_urb_dequeue()?  Yes, you're
not continuing a split so timing isn't quite as urgent, but you still
might have an INT or ISOC packet that's scheduled with an interval of
1.  We still might want to schedule right away if there are remaining
QTDs, right?
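
The two paths can be put side by side in a toy model. This is only a
sketch with made-up names (sched_model, model_*), not the driver's code.
It assumes, as described above, that path A always ends with a select /
queue, and it assumes for the sake of argument that nothing runs a
select after path B returns, which is exactly the point in question.

```c
#include <stdbool.h>

/* Toy state: just counts how often select/queue would run. */
struct sched_model {
	int select_calls;
};

static void model_select_and_queue(struct sched_model *m)
{
	m->select_calls++;
}

/* Stand-in for the deactivate step; select_inside models the new patch. */
static void model_qh_deactivate(struct sched_model *m, bool due_now,
				bool select_inside)
{
	if (due_now && select_inside)
		model_select_and_queue(m);
}

/* Path A: release channel, deactivate, then always select / queue. */
static void model_release_channel(struct sched_model *m, bool due_now,
				  bool select_inside)
{
	model_qh_deactivate(m, due_now, select_inside);
	model_select_and_queue(m);
}

/* Path B: dequeue, deactivate, and (assumed) no select afterwards. */
static void model_urb_dequeue(struct sched_model *m, bool due_now,
			      bool select_inside)
{
	model_qh_deactivate(m, due_now, select_inside);
}
```

Under these assumptions, a QH that is due now gets two select calls on
path A when the select lives inside the deactivate step (one of them
redundant), but on path B that inner call is the only one it gets.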

-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-02-01  4:36           ` Doug Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-02-01  4:36 UTC (permalink / raw)
  To: Kever Yang
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern,
	linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Herrero,
	Gregory, 吴良峰,
	Julius Werner, Dinh Nguyen

Kever,

On Sun, Jan 31, 2016 at 7:32 PM, Kever Yang <kever.yang-TNX95d0MmH7DzftRWevZcw@public.gmane.org> wrote:
> Doug,
>
>
> On 02/01/2016 06:09 AM, Doug Anderson wrote:
>>
>> Kever,
>>
>> On Sun, Jan 31, 2016 at 1:36 AM, Kever Yang <kever.yang-TNX95d0MmH7DzftRWevZcw@public.gmane.org>
>> wrote:
>>>
>>> Doug,
>>>
>>>
>>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>>>
>>>> In dwc2_hcd_qh_deactivate() we will put some things on the
>>>> periodic_sched_ready list.  These things won't be taken off the ready
>>>> list until the next SOF, which might be a little late.  Let's put them
>>>> on right away.
>>>>
>>>> Signed-off-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
>>>> Tested-by: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>
>>>> Tested-by: Stefan Wahren <stefan.wahren-eS4NqCHxEME@public.gmane.org>
>>>> ---
>>>> Changes in v6:
>>>> - Add Heiko's Tested-by.
>>>> - Add Stefan's Tested-by.
>>>>
>>>> Changes in v5: None
>>>> Changes in v4:
>>>> - Schedule periodic right away if it's time new for v4.
>>>>
>>>> Changes in v3: None
>>>> Changes in v2: None
>>>>
>>>>    drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>>>>    1 file changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
>>>> index 9b3c435339ee..3abb34a5fc5b 100644
>>>> --- a/drivers/usb/dwc2/hcd_queue.c
>>>> +++ b/drivers/usb/dwc2/hcd_queue.c
>>>> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>>>>           * Note: we purposely use the frame_number from the "hsotg" structure
>>>>           * since we know SOF interrupt will handle future frames.
>>>>           */
>>>> -       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
>>>> +       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
>>>> +               enum dwc2_transaction_type tr_type;
>>>> +
>>>> +               /*
>>>> +                * We're bypassing the SOF handler which is normally what puts
>>>> +                * us on the ready list because we're in a hurry and need to
>>>> +                * try to catch up.
>>>> +                */
>>>> +               dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
>>>> +                             qh, frame_number, qh->next_active_frame);
>>>>                  list_move_tail(&qh->qh_list_entry,
>>>>                                 &hsotg->periodic_sched_ready);
>>>> -       else
>>>> +
>>>> +               tr_type = dwc2_hcd_select_transactions(hsotg);
>>>
>>> Do we need to add the select_transactions call here? If we get into this
>>> function in interrupt context, then once we put the qh on the ready queue
>>> it can be handled in this frame again by the later call to
>>> dwc2_hcd_select_transactions(), so all we need to do here is put it on
>>> the ready list instead of the inactive queue, and wait for the schedule.
>>
>> I'm not sure I understand.  Can you restate?
>>
>>
>> I'll try to explain more in the meantime...
>>
>> Both before and after my change, this function would place something
>> on the ready queue if the next_active_frame <= the frame number as of
>> last SOF interrupt (aka hsotg->frame_number).  Otherwise it goes on
>> the inactive queue.  Assuming that the previous change ("usb: dwc2:
>> host: Manage frame nums better in scheduler") worked properly then
>> next_active_frame shouldn't be less than (hsotg->frame_number - 1).
>> Remember that next_active_frame is always 1 before the wire frame, so
>> if "next_active_frame == hsotg->frame_number - 1" it means that we
>> need to get the transfer on the wire _right away_.  If
>> "next_active_frame == hsotg->frame_number" the transfer doesn't need
>> to go on the wire right away, but since dwc2 can be prepped one frame
>> in advance it doesn't hurt to give it to the hardware right away if
>> there's space.
>>
>> As I understand it, if we stick something on the ready queue it won't
>> generally get looked at until the next SOF interrupt.  That means
>> we'll be too late if "next_active_frame == hsotg->frame_number - 1"
>> and we'll possibly be too late (depending on interrupt latency) if
>> "next_active_frame == hsotg->frame_number"
>>
> I understand this patch and agree with your point about scheduling the
> periodic transfer right away instead of waiting at least until the next frame.
> My point is, there are only two calls to dwc2_hcd_qh_deactivate(), from
> dwc2_hcd_urb_dequeue() and dwc2_release_channel().  We don't need
> to do the schedule for dequeue, and there is one
> dwc2_hcd_select_transactions() call at the end of dwc2_release_channel(),
> so maybe we don't need another dwc2_hcd_select_transactions() here.
>
> I think the duration from this point to the function call of
> dwc2_hcd_select_transactions()
> in dwc2_release_channel() will be the main factor for us to decide if
> we need to add a function call of  dwc2_hcd_select_transactions() here.

Oh, now I get what you're saying!

A) You've got dwc2_release_channel() -> dwc2_deactivate_qh() ->
dwc2_hcd_qh_deactivate()
...and always in that case we'll do a select / queue, so we don't need it there.

B) You've got dwc2_hcd_urb_dequeue() -> dwc2_hcd_qh_deactivate()

...but why don't we need it for dwc2_hcd_urb_dequeue()?  Yes, you're
not continuing a split so timing isn't quite as urgent, but you still
might have an INT or ISOC packet that's scheduled with an interval of
1.  We still might want to schedule right away if there are remaining
QTDs, right?
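
For reference, the ready-vs-inactive decision discussed above can be
sketched as a small standalone C model.  This is only an illustration,
not the driver code: frame_num_le() is modeled on dwc2_frame_num_le(),
the 14-bit wrap value (0x3fff) is an assumption, and pick_list() and
the enum are hypothetical names made up for this sketch.

```c
#include <stdint.h>

/* Assumption: frame numbers are 14 bits wide and wrap at 0x3fff. */
#define FRAME_NUM_MAX 0x3fff

/* Wraparound-safe "a <= b", modeled on the driver's dwc2_frame_num_le(). */
static int frame_num_le(uint16_t a, uint16_t b)
{
	return ((uint16_t)(b - a) & FRAME_NUM_MAX) <= (FRAME_NUM_MAX >> 1);
}

/* Hypothetical names: the two periodic lists a QH can land on. */
enum sched_list { LIST_INACTIVE, LIST_READY };

/*
 * Hypothetical helper: a QH whose next_active_frame is at or before the
 * frame number recorded at the last SOF goes on the ready list;
 * anything later waits on the inactive list for a future SOF.
 */
static enum sched_list pick_list(uint16_t next_active_frame,
				 uint16_t sof_frame)
{
	return frame_num_le(next_active_frame, sof_frame) ?
		LIST_READY : LIST_INACTIVE;
}
```

With sof_frame == 0x0100, a next_active_frame of 0x00ff or 0x0100 lands
on the ready list and 0x0101 lands on the inactive list.  The masked
subtraction keeps the comparison correct across the wrap from 0x3fff
back to 0.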

-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-02-02  0:36             ` Doug Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-02-02  0:36 UTC (permalink / raw)
  To: Kever Yang
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb, linux-kernel, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Herrero, Gregory,
	吴良峰,
	Julius Werner, Dinh Nguyen

Kever,

On Sun, Jan 31, 2016 at 8:36 PM, Doug Anderson <dianders@chromium.org> wrote:
> Kever,
>
> On Sun, Jan 31, 2016 at 7:32 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
>> Doug,
>>
>>
>> On 02/01/2016 06:09 AM, Doug Anderson wrote:
>>>
>>> Kever,
>>>
>>> On Sun, Jan 31, 2016 at 1:36 AM, Kever Yang <kever.yang@rock-chips.com>
>>> wrote:
>>>>
>>>> Doug,
>>>>
>>>>
>>>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>>>>
>>>>> In dwc2_hcd_qh_deactivate() we will put some things on the
>>>>> periodic_sched_ready list.  These things won't be taken off the ready
>>>>> list until the next SOF, which might be a little late.  Let's put them
>>>>> on right away.
>>>>>
>>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>>>>> Tested-by: Heiko Stuebner <heiko@sntech.de>
>>>>> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
>>>>> ---
>>>>> Changes in v6:
>>>>> - Add Heiko's Tested-by.
>>>>> - Add Stefan's Tested-by.
>>>>>
>>>>> Changes in v5: None
>>>>> Changes in v4:
>>>>> - Schedule periodic right away if it's time new for v4.
>>>>>
>>>>> Changes in v3: None
>>>>> Changes in v2: None
>>>>>
>>>>>    drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>>>>>    1 file changed, 16 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
>>>>> index 9b3c435339ee..3abb34a5fc5b 100644
>>>>> --- a/drivers/usb/dwc2/hcd_queue.c
>>>>> +++ b/drivers/usb/dwc2/hcd_queue.c
>>>>> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>>>>>           * Note: we purposely use the frame_number from the "hsotg" structure
>>>>>           * since we know SOF interrupt will handle future frames.
>>>>>           */
>>>>> -       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
>>>>> +       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
>>>>> +               enum dwc2_transaction_type tr_type;
>>>>> +
>>>>> +               /*
>>>>> +                * We're bypassing the SOF handler which is normally what puts
>>>>> +                * us on the ready list because we're in a hurry and need to
>>>>> +                * try to catch up.
>>>>> +                */
>>>>> +               dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
>>>>> +                             qh, frame_number, qh->next_active_frame);
>>>>>                  list_move_tail(&qh->qh_list_entry,
>>>>>                                 &hsotg->periodic_sched_ready);
>>>>> -       else
>>>>> +
>>>>> +               tr_type = dwc2_hcd_select_transactions(hsotg);
>>>>
>>>> Do we need to add the select_transactions call here? If we get into this
>>>> function in interrupt context, then once we put the qh on the ready queue
>>>> it can be handled in this frame again by the later call to
>>>> dwc2_hcd_select_transactions(), so all we need to do here is put it on
>>>> the ready list instead of the inactive queue, and wait for the schedule.
>>>
>>> I'm not sure I understand.  Can you restate?
>>>
>>>
>>> I'll try to explain more in the meantime...
>>>
>>> Both before and after my change, this function would place something
>>> on the ready queue if the next_active_frame <= the frame number as of
>>> last SOF interrupt (aka hsotg->frame_number).  Otherwise it goes on
>>> the inactive queue.  Assuming that the previous change ("usb: dwc2:
>>> host: Manage frame nums better in scheduler") worked properly then
>>> next_active_frame shouldn't be less than (hsotg->frame_number - 1).
>>> Remember that next_active_frame is always 1 before the wire frame, so
>>> if "next_active_frame == hsotg->frame_number - 1" it means that we
>>> need to get the transfer on the wire _right away_.  If
>>> "next_active_frame == hsotg->frame_number" the transfer doesn't need
>>> to go on the wire right away, but since dwc2 can be prepped one frame
>>> in advance it doesn't hurt to give it to the hardware right away if
>>> there's space.
>>>
>>> As I understand it, if we stick something on the ready queue it won't
>>> generally get looked at until the next SOF interrupt.  That means
>>> we'll be too late if "next_active_frame == hsotg->frame_number - 1"
>>> and we'll possibly be too late (depending on interrupt latency) if
>>> "next_active_frame == hsotg->frame_number"
>>>
>> I understand this patch and agree with your point about scheduling the
>> periodic transfer right away instead of waiting at least until the next frame.
>> My point is, there are only two calls to dwc2_hcd_qh_deactivate(), from
>> dwc2_hcd_urb_dequeue() and dwc2_release_channel().  We don't need
>> to do the schedule for dequeue, and there is one
>> dwc2_hcd_select_transactions() call at the end of dwc2_release_channel(),
>> so maybe we don't need another dwc2_hcd_select_transactions() here.
>>
>> I think the duration from this point to the function call of
>> dwc2_hcd_select_transactions()
>> in dwc2_release_channel() will be the main factor for us to decide if
>> we need to add a function call of  dwc2_hcd_select_transactions() here.
>
> Oh, now I get what you're saying!
>
> A) You've got dwc2_release_channel() -> dwc2_deactivate_qh() ->
> dwc2_hcd_qh_deactivate()
> ...and always in that case we'll do a select / queue, so we don't need it there.
>
> B) You've got dwc2_hcd_urb_dequeue() -> dwc2_hcd_qh_deactivate()
>
> ...but why don't we need it for dwc2_hcd_urb_dequeue()?  Yes, you're
> not continuing a split so timing isn't quite as urgent, but you still
> might have an INT or ISOC packet that's scheduled with an interval of
> 1.  We still might want to schedule right away if there are remaining
> QTDs, right?

I ran out of time to fully test today, but I couldn't actually get a
case where we needed to schedule right away for B).  ...so given your
point about the select / queue already present in case A, we could
probably just drop this patch ("usb: dwc2: host: Schedule periodic
right away if it's time") and if we can find a case where it's needed
in case B we can add the select / queue there.
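
To spell out the two call paths, here is a toy C model of cases A and
B.  It is a sketch under stated assumptions, not the driver: every
toy_* name and the struct are hypothetical, and "select / queue" is
reduced to moving one QH from a ready flag to a counter.

```c
#include <stdbool.h>

/* Toy state: none of these fields exist in the real driver. */
struct toy_hcd {
	bool qh_ready;     /* a QH is sitting on the ready list */
	int select_calls;  /* how many times select / queue ran */
	int queued;        /* QHs handed to the "hardware" */
};

/* Stand-in for the select / queue step. */
static void toy_select_and_queue(struct toy_hcd *hcd)
{
	hcd->select_calls++;
	if (hcd->qh_ready) {
		hcd->qh_ready = false;
		hcd->queued++;
	}
}

/* Stand-in for deactivate: only moves the QH to the ready list. */
static void toy_qh_deactivate(struct toy_hcd *hcd, bool due_now)
{
	hcd->qh_ready = due_now;
}

/* Case A: release channel -> deactivate, then select / queue at the end. */
static void toy_release_channel(struct toy_hcd *hcd, bool due_now)
{
	toy_qh_deactivate(hcd, due_now);
	toy_select_and_queue(hcd);
}

/* Case B: dequeue -> deactivate, with no select / queue afterwards. */
static void toy_urb_dequeue(struct toy_hcd *hcd, bool due_now)
{
	toy_qh_deactivate(hcd, due_now);
	/* The QH stays on the ready list until something else selects it. */
}
```

In this model a due QH in case A is queued by the caller's own select /
queue, so an extra one inside deactivate would find nothing left to do.
In case B the QH stays on the ready list, which is where a select /
queue could be added if a real need turns up.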

Sound OK?  I'll try to do more testing tomorrow...

-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-02-02  7:04               ` Kever Yang
  0 siblings, 0 replies; 71+ messages in thread
From: Kever Yang @ 2016-02-02  7:04 UTC (permalink / raw)
  To: Doug Anderson
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb, linux-kernel, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Herrero, Gregory,
	吴良峰,
	Julius Werner, Dinh Nguyen

Doug,

On 02/02/2016 08:36 AM, Doug Anderson wrote:
> Kever,
>
> On Sun, Jan 31, 2016 at 8:36 PM, Doug Anderson <dianders@chromium.org> wrote:
>> Kever,
>>
>> On Sun, Jan 31, 2016 at 7:32 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
>>> Doug,
>>>
>>>
>>> On 02/01/2016 06:09 AM, Doug Anderson wrote:
>>>> Kever,
>>>>
>>>> On Sun, Jan 31, 2016 at 1:36 AM, Kever Yang <kever.yang@rock-chips.com>
>>>> wrote:
>>>>> Doug,
>>>>>
>>>>>
>>>>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>>>>> In dwc2_hcd_qh_deactivate() we will put some things on the
>>>>>> periodic_sched_ready list.  These things won't be taken off the ready
>>>>>> list until the next SOF, which might be a little late.  Let's put them
>>>>>> on right away.
>>>>>>
>>>>>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>>>>>> Tested-by: Heiko Stuebner <heiko@sntech.de>
>>>>>> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
>>>>>> ---
>>>>>> Changes in v6:
>>>>>> - Add Heiko's Tested-by.
>>>>>> - Add Stefan's Tested-by.
>>>>>>
>>>>>> Changes in v5: None
>>>>>> Changes in v4:
>>>>>> - Schedule periodic right away if it's time new for v4.
>>>>>>
>>>>>> Changes in v3: None
>>>>>> Changes in v2: None
>>>>>>
>>>>>>     drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>>>>>>     1 file changed, 16 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
>>>>>> index 9b3c435339ee..3abb34a5fc5b 100644
>>>>>> --- a/drivers/usb/dwc2/hcd_queue.c
>>>>>> +++ b/drivers/usb/dwc2/hcd_queue.c
>>>>>> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>>>>>>            * Note: we purposely use the frame_number from the "hsotg" structure
>>>>>>            * since we know SOF interrupt will handle future frames.
>>>>>>            */
>>>>>> -       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number))
>>>>>> +       if (dwc2_frame_num_le(qh->next_active_frame, hsotg->frame_number)) {
>>>>>> +               enum dwc2_transaction_type tr_type;
>>>>>> +
>>>>>> +               /*
>>>>>> +                * We're bypassing the SOF handler which is normally what puts
>>>>>> +                * us on the ready list because we're in a hurry and need to
>>>>>> +                * try to catch up.
>>>>>> +                */
>>>>>> +               dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x, nxt=%04x\n",
>>>>>> +                             qh, frame_number, qh->next_active_frame);
>>>>>>                   list_move_tail(&qh->qh_list_entry,
>>>>>>                                  &hsotg->periodic_sched_ready);
>>>>>> -       else
>>>>>> +
>>>>>> +               tr_type = dwc2_hcd_select_transactions(hsotg);
>>>>> Do we need to add the select_transactions call here? If we get into this
>>>>> function in interrupt context, then once we put the qh on the ready queue
>>>>> it can be handled in this frame again by the later call to
>>>>> dwc2_hcd_select_transactions(), so all we need to do here is put it on
>>>>> the ready list instead of the inactive queue, and wait for the schedule.
>>>> I'm not sure I understand.  Can you restate?
>>>>
>>>>
>>>> I'll try to explain more in the meantime...
>>>>
>>>> Both before and after my change, this function would place something
>>>> on the ready queue if the next_active_frame <= the frame number as of
>>>> last SOF interrupt (aka hsotg->frame_number).  Otherwise it goes on
>>>> the inactive queue.  Assuming that the previous change ("usb: dwc2:
>>>> host: Manage frame nums better in scheduler") worked properly then
>>>> next_active_frame shouldn't be less than (hsotg->frame_number - 1).
>>>> Remember that next_active_frame is always 1 before the wire frame, so
>>>> if "next_active_frame == hsotg->frame_number - 1" it means that we
>>>> need to get the transfer on the wire _right away_.  If
>>>> "next_active_frame == hsotg->frame_number" the transfer doesn't need
>>>> to go on the wire right away, but since dwc2 can be prepped one frame
>>>> in advance it doesn't hurt to give it to the hardware right away if
>>>> there's space.
>>>>
>>>> As I understand it, if we stick something on the ready queue it won't
>>>> generally get looked at until the next SOF interrupt.  That means
>>>> we'll be too late if "next_active_frame == hsotg->frame_number - 1"
>>>> and we'll possibly be too late (depending on interrupt latency) if
>>>> "next_active_frame == hsotg->frame_number"
>>>>
>>> I understand this patch and agree with your point about scheduling the
>>> periodic transfer right away instead of waiting at least until the next frame.
>>> My point is, there are only two calls to dwc2_hcd_qh_deactivate(), from
>>> dwc2_hcd_urb_dequeue() and dwc2_release_channel().  We don't need
>>> to do the schedule for dequeue, and there is one
>>> dwc2_hcd_select_transactions() call at the end of dwc2_release_channel(),
>>> so maybe we don't need another dwc2_hcd_select_transactions() here.
>>>
>>> I think the duration from this point to the function call of
>>> dwc2_hcd_select_transactions()
>>> in dwc2_release_channel() will be the main factor for us to decide if
>>> we need to add a function call of  dwc2_hcd_select_transactions() here.
>> Oh, now I get what you're saying!
>>
>> A) You've got dwc2_release_channel() -> dwc2_deactivate_qh() ->
>> dwc2_hcd_qh_deactivate()
>> ...and always in that case we'll do a select / queue, so we don't need it there.
>>
>> B) You've got dwc2_hcd_urb_dequeue() -> dwc2_hcd_qh_deactivate()
>>
>> ...but why don't we need it for dwc2_hcd_urb_dequeue()?  Yes, you're
>> not continuing a split so timing isn't quite as urgent, but you still
>> might have an INT or ISOC packet that's scheduled with an interval of
>> 1.  We still might want to schedule right away if there are remaining
>> QTDs, right?
> I ran out of time to fully test today, but I couldn't actually get a
> case where we needed to schedule right away for B).  ...so given your
> point about the select / queue already present in case A, we could
> probably just drop this patch ("usb: dwc2: host: Schedule periodic
> right away if it's time") and if we can find a case where it's needed
> in case B we can add the select / queue there.
>
> Sound OK?  I'll try to do more testing tomorrow...
Yes, we haven't found a case where we need to schedule right away for case B).

For INT or ISOC packets, I recall seeing somewhere (though I can't find
it now) that the synchronous transfer happens in the next uframe rather
than in the uframe in which the host channel is initialized, so it makes
no difference whether the host channel registers are set sooner or later
within the same frame.
That means the existing code should be OK for case A).

We can drop this patch until we have an exact use case.

Thanks,
- Kever

>
> -Doug
>

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-02-02  7:04               ` Kever Yang
  0 siblings, 0 replies; 71+ messages in thread
From: Kever Yang @ 2016-02-02  7:04 UTC (permalink / raw)
  To: Doug Anderson
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern,
	linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Herrero,
	Gregory, 吴良峰,
	Julius Werner, Dinh Nguyen

Doug,

On 02/02/2016 08:36 AM, Doug Anderson wrote:
> Kever,
>
> On Sun, Jan 31, 2016 at 8:36 PM, Doug Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote:
>> Kever,
>>
>> On Sun, Jan 31, 2016 at 7:32 PM, Kever Yang <kever.yang-TNX95d0MmH7DzftRWevZcw@public.gmane.org> wrote:
>>> Doug,
>>>
>>>
>>> On 02/01/2016 06:09 AM, Doug Anderson wrote:
>>>> Kever,
>>>>
>>>> On Sun, Jan 31, 2016 at 1:36 AM, Kever Yang <kever.yang-TNX95d0MmH7DzftRWevZcw@public.gmane.org>
>>>> wrote:
>>>>> Doug,
>>>>>
>>>>>
>>>>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>>>>> In dwc2_hcd_qh_deactivate() we will put some things on the
>>>>>> periodic_sched_ready list.  These things won't be taken off the ready
>>>>>> list until the next SOF, which might be a little late.  Let's put them
>>>>>> on right away.
>>>>>>
>>>>>> Signed-off-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
>>>>>> Tested-by: Heiko Stuebner <heiko-4mtYJXux2i+zQB+pC5nmwQ@public.gmane.org>
>>>>>> Tested-by: Stefan Wahren <stefan.wahren-eS4NqCHxEME@public.gmane.org>
>>>>>> ---
>>>>>> Changes in v6:
>>>>>> - Add Heiko's Tested-by.
>>>>>> - Add Stefan's Tested-by.
>>>>>>
>>>>>> Changes in v5: None
>>>>>> Changes in v4:
>>>>>> - Schedule periodic right away if it's time new for v4.
>>>>>>
>>>>>> Changes in v3: None
>>>>>> Changes in v2: None
>>>>>>
>>>>>>     drivers/usb/dwc2/hcd_queue.c | 18 ++++++++++++++++--
>>>>>>     1 file changed, 16 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
>>>>>> index 9b3c435339ee..3abb34a5fc5b 100644
>>>>>> --- a/drivers/usb/dwc2/hcd_queue.c
>>>>>> +++ b/drivers/usb/dwc2/hcd_queue.c
>>>>>> @@ -1080,12 +1080,26 @@ void dwc2_hcd_qh_deactivate(struct dwc2_hsotg
>>>>>> *hsotg, struct dwc2_qh *qh,
>>>>>>            * Note: we purposely use the frame_number from the "hsotg"
>>>>>> structure
>>>>>>            * since we know SOF interrupt will handle future frames.
>>>>>>            */
>>>>>> -       if (dwc2_frame_num_le(qh->next_active_frame,
>>>>>> hsotg->frame_number))
>>>>>> +       if (dwc2_frame_num_le(qh->next_active_frame,
>>>>>> hsotg->frame_number))
>>>>>> {
>>>>>> +               enum dwc2_transaction_type tr_type;
>>>>>> +
>>>>>> +               /*
>>>>>> +                * We're bypassing the SOF handler which is normally
>>>>>> what
>>>>>> puts
>>>>>> +                * us on the ready list because we're in a hurry and
>>>>>> need
>>>>>> to
>>>>>> +                * try to catch up.
>>>>>> +                */
>>>>>> +               dwc2_sch_vdbg(hsotg, "QH=%p IMM ready fn=%04x,
>>>>>> nxt=%04x\n",
>>>>>> +                             qh, frame_number, qh->next_active_frame);
>>>>>>                   list_move_tail(&qh->qh_list_entry,
>>>>>>                                  &hsotg->periodic_sched_ready);
>>>>>> -       else
>>>>>> +
>>>>>> +               tr_type = dwc2_hcd_select_transactions(hsotg);
>>>>> Do we need to add select_transactions call here? If we get into this
>>>>> function in interrupt
>>>>> and once we put the qh in ready queue, the qh can be handled in this
>>>>> frame
>>>>> again by the
>>>>> later function call of dwc_hcd_select_transactions, so what we need to to
>>>>> here is put
>>>>> it in ready list instead of inactive queue, and wait for the schedule.
>>>> I'm not sure I understand.  Can you restate?
>>>>
>>>>
>>>> I'll try to explain more in the meantime...
>>>>
>>>> Both before and after my change, this function would place something
>>>> on the ready queue if the next_active_frame <= the frame number as of
>>>> last SOF interrupt (aka hsotg->frame_number).  Otherwise it goes on
>>>> the inactive queue.  Assuming that the previous change ("usb: dwc2:
>>>> host: Manage frame nums better in scheduler") worked properly then
>>>> next_active_frame shouldn't be less than (hsotg->frame_number - 1).
>>>> Remember that next_active_frame is always 1 before the wire frame, so
>>>> if "next_active_frame == hsotg->frame_number - 1" it means that we
>>>> need to get the transfer on the wire _right away_.  If
>>>> "next_active_frame == hsotg->frame_number" the transfer doesn't need
>>>> to go on the wire right away, but since dwc2 can be prepped one frame
>>>> in advance it doesn't hurt to give it to the hardware right away if
>>>> there's space.
>>>>
>>>> As I understand it, if we stick something on the ready queue it won't
>>>> generally get looked at until the next SOF interrupt.  That means
>>>> we'll be too late if "next_active_frame == hsotg->frame_number - 1"
>>>> and we'll possibly be too late (depending on interrupt latency) if
>>>> "next_active_frame == hsotg->frame_number"
>>>>
>>> I understand this patch and agree with your point of schedule the
>>> periodic right away instead of at least next frame.
>>> My point is, there are only two call to dwc2_hcd_qh_deactivate(), from
>>> dwc2_hcd_urb_dequeue() and dwc2_release_channel(), we don't need
>>> to do the schedule for dequeue, and there is one
>>> dwc2_hcd_select_transactions() call at the end of dwc2_release_channel(),
>>> maybe we don't need another dwc2_hcd_select_transactions() here.
>>>
>>> I think the duration from this point to the function call of
>>> dwc2_hcd_select_transactions()
>>> in dwc2_release_channel() will be the main factor for us to decide if
>>> we need to add a function call of  dwc2_hcd_select_transactions() here.
>> Oh, now I get what you're saying!
>>
>> A) You've got dwc2_release_channel() -> dwc2_deactivate_qh() ->
>> dwc2_hcd_qh_deactivate()
>> ...and always in that case we'll do a select / queue, so we don't need it there.
>>
>> B) You've got dwc2_hcd_urb_dequeue() -> dwc2_hcd_qh_deactivate()
>>
>> ...but why don't we need it for dwc2_hcd_urb_dequeue()?  Yes, you're
>> not continuing a split so timing isn't quite as urgent, but you still
>> might have an INT or ISOC packet that's scheduled with an interval of
>> 1.  We still might want to schedule right away if there are remaining
>> QTDs, right?
> I ran out of time to fully test today, but I couldn't actually get a
> case where we needed to schedule right away for B).  ...so given your
> point about the select / queue already present in case A, we could
> probably just drop this patch ("usb: dwc2: host: Schedule periodic
> right away if it's time") and if we can find a case where it's needed
> in case B we can add the select / queue there.
>
> Sound OK?  I'll try to do more testing tomorrow...
Yes, we haven't found a case where we need to schedule right away for case B).

For INT or ISOC packets, I recall seeing somewhere (though I can't find it
now) that the synchronous transfer happens in the next uframe rather than the
uframe in which the host channel is initialized, so it makes no difference
whether the host channel register is set sooner or later within the same frame.
Which means the existing code should be OK for case A).

We can drop this patch until we have an exact use case.

Thanks,
- Kever

>
> -Doug
>


--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame
  2016-01-29  2:20 ` [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame Douglas Anderson
@ 2016-02-02  7:46   ` Kever Yang
  2016-02-02 22:47       ` Doug Anderson
  0 siblings, 1 reply; 71+ messages in thread
From: Kever Yang @ 2016-02-02  7:46 UTC (permalink / raw)
  To: Douglas Anderson, John Youn, balbi
  Cc: huangtao, stefan.wahren, heiko, johnyoun, gregkh, ming.lei,
	linux-usb, linux-kernel, linux-rockchip, yousaf.kaukab, stern,
	linux-rpi-kernel, gregory.herrero, william.wu, Julius Werner,
	dinguyen

Doug,

On 01/29/2016 10:20 AM, Douglas Anderson wrote:
> When setting up ISO and INT transfers dwc2 needs to specify whether the
> transfer is for an even or an odd frame (or microframe if the controller
> is running in high speed mode).
>
> The controller appears to use this as a simple way to figure out if a
> transfer should happen right away (in the current microframe) or should
> happen at the start of the next microframe.  Said another way:
>
> - If you set "odd" and the current frame number is odd it appears that
>    the controller will try to transfer right away.  Same thing if you set
>    "even" and the current frame number is even.
> - If the oddness you set and the oddness of the frame number are
>    _different_, the transfer will be delayed until the frame number
>    changes.
>
> As I understand it, the above technique allows you to plan ahead of time
> where possible by always working on the next frame.  ...but it still
> allows you to properly respond immediately to things that happened in
> the previous frame.
>
> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
> It always looked at the frame number and setup the transfer to happen in
> the next frame.  In some cases that meant that certain transactions
> would be transferred in the wrong frame.
>
> We'll try our best to set the even / odd to do the transfer in the
> scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
> We'll also modify the scheduler code to handle this and not try to
> schedule a second transfer for the same frame.
>
> Note that this change relies on the work to redo the microframe
> scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
> in scheduler") but it works even better after ("usb: dwc2: host: Totally
> redo the microframe scheduler").
>
> With this change my stressful USB test (USB webcam + USB audio +
> keyboards) has less audio crackling than before.
Does this really help in your case?

Did you check whether the transfer can actually happen in the current
frame?  I know it's quite difficult to check, but this differs from what I
know about how the dwc core schedules the transaction.

In the dwc_otg book, the section on Interrupt OUT Transactions in DMA Mode
(Int IN and Iso IN/OUT are similar) says of normal Interrupt OUT operation:
The DWC_otg host attempts to send out the OUT token in the beginning of next
odd frame/microframe.

So I'm confused about whether the dwc core can do the transaction in the
same frame in which the host channel is initialized.

Thanks,
- Kever

> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> Tested-by: Heiko Stuebner <heiko@sntech.de>
> Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
> ---
> Changes in v6:
> - Add Heiko's Tested-by.
> - Add Stefan's Tested-by.
>
> Changes in v5: None
> Changes in v4:
> - Properly set even/odd frame new for v4.
>
> Changes in v3: None
> Changes in v2: None
>
>   drivers/usb/dwc2/core.c      | 92 +++++++++++++++++++++++++++++++++++++++++++-
>   drivers/usb/dwc2/hcd_queue.c | 11 +++++-
>   2 files changed, 100 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/dwc2/core.c b/drivers/usb/dwc2/core.c
> index a5db20f12ee4..c143f26bd9d9 100644
> --- a/drivers/usb/dwc2/core.c
> +++ b/drivers/usb/dwc2/core.c
> @@ -1703,9 +1703,97 @@ static void dwc2_hc_set_even_odd_frame(struct dwc2_hsotg *hsotg,
>   {
>   	if (chan->ep_type == USB_ENDPOINT_XFER_INT ||
>   	    chan->ep_type == USB_ENDPOINT_XFER_ISOC) {
> -		/* 1 if _next_ frame is odd, 0 if it's even */
> -		if (!(dwc2_hcd_get_frame_number(hsotg) & 0x1))
> +		int host_speed;
> +		int xfer_ns;
> +		int xfer_us;
> +		int bytes_in_fifo;
> +		u16 fifo_space;
> +		u16 frame_number;
> +		u16 wire_frame;
> +
> +		/*
> +		 * Try to figure out if we're an even or odd frame. If we set
> +		 * even and the current frame number is even the the transfer
> +		 * will happen immediately.  Similar if both are odd. If one is
> +		 * even and the other is odd then the transfer will happen when
> +		 * the frame number ticks.
> +		 *
> +		 * There's a bit of a balancing act to get this right.
> +		 * Sometimes we may want to send data in the current frame (AK
> +		 * right away).  We might want to do this if the frame number
> +		 * _just_ ticked, but we might also want to do this in order
> +		 * to continue a split transaction that happened late in a
> +		 * microframe (so we didn't know to queue the next transfer
> +		 * until the frame number had ticked).  The problem is that we
> +		 * need a lot of knowledge to know if there's actually still
> +		 * time to send things or if it would be better to wait until
> +		 * the next frame.
> +		 *
> +		 * We can look at how much time is left in the current frame
> +		 * and make a guess about whether we'll have time to transfer.
> +		 * We'll do that.
> +		 */
> +
> +		/* Get speed host is running at */
> +		host_speed = (chan->speed != USB_SPEED_HIGH &&
> +			      !chan->do_split) ? chan->speed : USB_SPEED_HIGH;
> +
> +		/* See how many bytes are in the periodic FIFO right now */
> +		fifo_space = (dwc2_readl(hsotg->regs + HPTXSTS) &
> +			      TXSTS_FSPCAVAIL_MASK) >> TXSTS_FSPCAVAIL_SHIFT;
> +		bytes_in_fifo = sizeof(u32) *
> +				(hsotg->core_params->host_perio_tx_fifo_size -
> +				 fifo_space);
> +
> +		/*
> +		 * Roughly estimate bus time for everything in the periodic
> +		 * queue + our new transfer.  This is "rough" because we're
> +		 * using a function that makes takes into account IN/OUT
> +		 * and INT/ISO and we're just slamming in one value for all
> +		 * transfers.  This should be an over-estimate and that should
> +		 * be OK, but we can probably tighten it.
> +		 */
> +		xfer_ns = usb_calc_bus_time(host_speed, false, false,
> +					    chan->xfer_len + bytes_in_fifo);
> +		xfer_us = NS_TO_US(xfer_ns);
> +
> +		/* See what frame number we'll be at by the time we finish */
> +		frame_number = dwc2_hcd_get_future_frame_number(hsotg, xfer_us);
> +
> +		/* This is when we were scheduled to be on the wire */
> +		wire_frame = dwc2_frame_num_inc(chan->qh->next_active_frame, 1);
> +
> +		/*
> +		 * If we'd finish _after_ the frame we're scheduled in then
> +		 * it's hopeless.  Just schedule right away and hope for the
> +		 * best.  Note that it _might_ be wise to call back into the
> +		 * scheduler to pick a better frame, but this is better than
> +		 * nothing.
> +		 */
> +		if (dwc2_frame_num_gt(frame_number, wire_frame)) {
> +			dwc2_sch_vdbg(hsotg,
> +				      "QH=%p EO MISS fr=%04x=>%04x (%+d)\n",
> +				      chan->qh, wire_frame, frame_number,
> +				      dwc2_frame_num_dec(frame_number,
> +							 wire_frame));
> +			wire_frame = frame_number;
> +
> +			/*
> +			 * We picked a different frame number; communicate this
> +			 * back to the scheduler so it doesn't try to schedule
> +			 * another in the same frame.
> +			 *
> +			 * Remember that next_active_frame is 1 before the wire
> +			 * frame.
> +			 */
> +			chan->qh->next_active_frame =
> +				dwc2_frame_num_dec(frame_number, 1);
> +		}
> +
> +		if (wire_frame & 1)
>   			*hcchar |= HCCHAR_ODDFRM;
> +		else
> +			*hcchar &= ~HCCHAR_ODDFRM;
>   	}
>   }
>   
> diff --git a/drivers/usb/dwc2/hcd_queue.c b/drivers/usb/dwc2/hcd_queue.c
> index 3abb34a5fc5b..5f909747b5a4 100644
> --- a/drivers/usb/dwc2/hcd_queue.c
> +++ b/drivers/usb/dwc2/hcd_queue.c
> @@ -985,6 +985,14 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
>   	 *   and next_active_frame are always 1 frame before we want things
>   	 *   to be active and we assume we can still get scheduled in the
>   	 *   current frame number.
> +	 * - It's possible for start_active_frame (now incremented) to be
> +	 *   next_active_frame if we got an EO MISS (even_odd miss) which
> +	 *   basically means that we detected there wasn't enough time for
> +	 *   the last packet and dwc2_hc_set_even_odd_frame() rescheduled us
> +	 *   at the last second.  We want to make sure we don't schedule
> +	 *   another transfer for the same frame.  My test webcam doesn't seem
> +	 *   terribly upset by missing a transfer but really doesn't like when
> +	 *   we do two transfers in the same frame.
>   	 * - Some misses are expected.  Specifically, in order to work
>   	 *   perfectly dwc2 really needs quite spectacular interrupt latency
>   	 *   requirements.  It needs to be able to handle its interrupts
> @@ -995,7 +1003,8 @@ static int dwc2_next_periodic_start(struct dwc2_hsotg *hsotg,
>   	 *   guarantee that a system will have interrupt latency < 125 us, so
>   	 *   we have to be robust to some misses.
>   	 */
> -	if (dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
> +	if (qh->start_active_frame == qh->next_active_frame ||
> +	    dwc2_frame_num_gt(prev_frame_number, qh->start_active_frame)) {
>   		u16 ideal_start = qh->start_active_frame;
>   
>   		/* Adjust interval as per gcd with plan length. */

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame
@ 2016-02-02 22:47       ` Doug Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-02-02 22:47 UTC (permalink / raw)
  To: Kever Yang
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb, linux-kernel, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Herrero, Gregory,
	吴良峰,
	Julius Werner, Dinh Nguyen

Kever,

On Mon, Feb 1, 2016 at 11:46 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
> Doug,
>
>
> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>
>> When setting up ISO and INT transfers dwc2 needs to specify whether the
>> transfer is for an even or an odd frame (or microframe if the controller
>> is running in high speed mode).
>>
>> The controller appears to use this as a simple way to figure out if a
>> transfer should happen right away (in the current microframe) or should
>> happen at the start of the next microframe.  Said another way:
>>
>> - If you set "odd" and the current frame number is odd it appears that
>>    the controller will try to transfer right away.  Same thing if you set
>>    "even" and the current frame number is even.
>> - If the oddness you set and the oddness of the frame number are
>>    _different_, the transfer will be delayed until the frame number
>>    changes.
>>
>> As I understand it, the above technique allows you to plan ahead of time
>> where possible by always working on the next frame.  ...but it still
>> allows you to properly respond immediately to things that happened in
>> the previous frame.
>>
>> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
>> It always looked at the frame number and setup the transfer to happen in
>> the next frame.  In some cases that meant that certain transactions
>> would be transferred in the wrong frame.
>>
>> We'll try our best to set the even / odd to do the transfer in the
>> scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
>> We'll also modify the scheduler code to handle this and not try to
>> schedule a second transfer for the same frame.
>>
>> Note that this change relies on the work to redo the microframe
>> scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
>> in scheduler") but it works even better after ("usb: dwc2: host: Totally
>> redo the microframe scheduler").
>>
>> With this change my stressful USB test (USB webcam + USB audio +
>> keyboards) has less audio crackling than before.
>
> Seems this really help for your case?

Yes, I believe it does.  Of course my test case is pretty "black box"
for the most part in that I play music on youtube while having a
webcam open and several USB input devices connected.  I then try to
decide whether I hear more static or less static.  ...clearly a less
subjective test would be better...

* I tried with http://crosreview.com/325451 (see below) and I hear
more static with "use_old = true" than with "use_old = false".

* I tried with this entire patch reverted and I hear about the same
static as with "use_old = true".

Note that counting reported MISS lines from my logging also shows that
the new code is better...


> Do you check if the transfer can happen right in the current frame? I know
> it's
> quite difficult to check it, but this changes what I know for the dwc core
> schedule the transaction.

Yes.  I just tried again, too.  I coded up
<https://chromium-review.googlesource.com/325451> and included it.  I
then opened up a USB webcam.

With things set to the old way:

  115.355370  QH=dc6ba8c0 next(0) fn=10cb, sch=10ca=>10cb (+1) miss=0
  115.355373  QH=dc6ba8c0 IMM ready fn=10cb, nxt=10cb
  115.355518  QH=dc6ba8c0 next(0) fn=10cc, sch=10cb=>10cc (+1) miss=0
  115.355522  QH=dc6ba8c0 IMM ready fn=10cc, nxt=10cc
  115.355637  QH=dc6ba8c0 next(0) fn=10cd, sch=10cc=>10cd (+1) miss=0
  115.355641  QH=dc6ba8c0 IMM ready fn=10cd, nxt=10cd
  115.355857  QH=dc6ba8c0 next(0) fn=10ce, sch=10cd=>10ce (+1) miss=0
  115.355859  QH=dc6ba8c0 IMM ready fn=10ce, nxt=10ce
  115.355867  QH=dc6ba8c0, wire=10cf, old_wire=10d0, EO diff (use OLD)
  115.355870  QH=dc6ba8c0 EO MISS w/ old (10ce != 10cf)
  115.356037  QH=dc6ba8c0 next(0) fn=10d0, sch=10cf=>10d0 (+1) miss=1 MISS
  115.356039  QH=dc6ba8c0 IMM ready fn=10d0, nxt=10d0
  115.356169  QH=dc6ba8c0 next(0) fn=10d1, sch=10d0=>10d1 (+1) miss=0
  115.356170  QH=dc6ba8c0 IMM ready fn=10d1, nxt=10d1
  115.356269  QH=dc6ba8c0 next(0) fn=10d2, sch=10d1=>10d2 (+1) miss=0
  115.356273  QH=dc6ba8c0 IMM ready fn=10d2, nxt=10d2
  115.356404  QH=dc6ba8c0 next(0) fn=10d3, sch=10d2=>10d3 (+1) miss=0
  115.356407  QH=dc6ba8c0 IMM ready fn=10d3, nxt=10d3

With the new way:

   87.814741  QH=e2fd7880 next(0) fn=32e4, sch=32e3=>32e4 (+1) miss=0
   87.814744  QH=e2fd7880 IMM ready fn=32e4, nxt=32e4
   87.814858  QH=e2fd7880 next(0) fn=32e5, sch=32e4=>32e5 (+1) miss=0
   87.814862  QH=e2fd7880 IMM ready fn=32e5, nxt=32e5
   87.815010  QH=e2fd7880 next(0) fn=32e6, sch=32e5=>32e6 (+1) miss=0
   87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
   87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
   87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
   87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
   87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
   87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
   87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
   87.815391  QH=e2fd7880 IMM ready fn=32e9, nxt=32e9
   87.815491  QH=e2fd7880 next(0) fn=32ea, sch=32e9=>32ea (+1) miss=0
   87.815493  QH=e2fd7880 IMM ready fn=32ea, nxt=32ea
   87.815635  QH=e2fd7880 next(0) fn=32eb, sch=32ea=>32eb (+1) miss=0
   87.815638  QH=e2fd7880 IMM ready fn=32eb, nxt=32eb


Note that with my TEST-ONLY patch the old way is still _slightly_
different in that I still communicate back to the scheduler with:

  chan->qh->next_active_frame = now_frame;

The old code didn't do that.  If I don't do that then you'll
just stay in an inconsistent state for a while where things are
going on the wire 1 frame later than we think they are.


Also note that above you can see that the new way is indeed able to
schedule things in the current microframe.  Looking one line at a
time:


   87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6

QH e2fd7880 is going straight to the ready queue.  Actual frame number
in hardware is 32e6.  next_active_frame = 32e6 which means we ideally
want to give it to hardware in 32e6 and wire frame is 32e7.


   87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
   87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7

Frame number in hardware is now 32e8.  We'd like to give the next
transfer to hardware in 32e7 to transfer on the wire at 32e8, but
that's obviously impossible.  We will try to give it right away.
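
The frame bookkeeping in the lines above can be sketched in standalone C.
This is only an illustration, not the driver code: the helper names are made
up for this example, and the 14-bit wrap (0x3fff) is an assumption about the
width of the frame counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: frame numbers are 14-bit counters that wrap at 0x3fff. */
#define FRAME_MASK	0x3fffu

/* Advance a frame number by n, wrapping around the counter. */
static uint16_t frame_inc(uint16_t frame, uint16_t n)
{
	return (frame + n) & FRAME_MASK;
}

/* Step a frame number back by n, wrapping around the counter. */
static uint16_t frame_dec(uint16_t frame, uint16_t n)
{
	return (frame - n) & FRAME_MASK;
}

/* True if a is strictly later than b, treating the counter as circular. */
static bool frame_gt(uint16_t a, uint16_t b)
{
	return a != b && ((a - b) & FRAME_MASK) < (FRAME_MASK >> 1);
}

/* next_active_frame is one frame before the frame we want on the wire. */
static uint16_t wire_frame_for(uint16_t next_active_frame)
{
	return frame_inc(next_active_frame, 1);
}

/* Has the hardware frame already moved past the ideal hand-off frame? */
static bool handoff_is_late(uint16_t hw_frame, uint16_t next_active_frame)
{
	return frame_gt(hw_frame, next_active_frame);
}
```

With fn=32e6 and nxt=32e6 the hand-off is on time and the wire frame is 32e7;
with fn=32e8 and nxt=32e7 the hand-off is late, which is the situation in the
two log lines just above.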


   87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)

Showing a difference from the old way.  We'll choose "even" to have the
packet go on the wire (expecting 32e8).
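
A rough standalone sketch of the two parity choices follows.  It is a model
of the behaviour observed in these logs, not a statement of what the hardware
documentation guarantees.  ODD_FRAME_BIT is an illustrative stand-in for the
HCCHAR_ODDFRM flag, the function names are hypothetical, and the 14-bit frame
mask is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRAME_MASK	0x3fffu		/* assumed 14-bit frame counter */
#define ODD_FRAME_BIT	(1u << 29)	/* illustrative stand-in for HCCHAR_ODDFRM */

/* New approach: pick the parity of the frame we want on the wire. */
static uint32_t parity_for_wire_frame(uint32_t hcchar, uint16_t wire_frame)
{
	if (wire_frame & 1)
		return hcchar | ODD_FRAME_BIT;
	return hcchar & ~ODD_FRAME_BIT;
}

/* Old approach: always pick the parity of the frame after the current one. */
static uint32_t parity_old_way(uint32_t hcchar, uint16_t hw_frame)
{
	if (!(hw_frame & 1))
		return hcchar | ODD_FRAME_BIT;
	return hcchar;
}

/*
 * Model of the observed behaviour: if the programmed parity matches the
 * current frame the transfer goes out in the current frame, otherwise it
 * waits for the next one.
 */
static uint16_t modelled_wire_frame(uint16_t hw_frame, uint32_t hcchar)
{
	bool odd = (hcchar & ODD_FRAME_BIT) != 0;

	if (odd == ((hw_frame & 1) != 0))
		return hw_frame;
	return (hw_frame + 1) & FRAME_MASK;
}
```

Under this model, with the hardware at 32e8 the new choice lands on 32e8
while the old choice lands on 32e9, matching "wire=32e8, old_wire=32e9".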


   87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
   87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8

We got a response back and are ready to schedule the next transfer and
it's still 32e8!  That means that transfer must have happened (as
expected) in 32e8.  Whew!  Give the next transfer to hardware hoping
for 32e9 wire.


   87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0

Now at hardware 32e9 and ready to schedule the next...



> In dwc_otgbook, Interrupt OUT Transactions(also similar for Int IN, Iso
> IN/OUT)
> in DMA Mode, the normal Interrupt OUT operation says:
> The DWC_otg host attempts to send out the OUT token in the beginning of next
> odd frame/microframe.
>
> So I'm confuse about if the dwc core can do the transaction at the same
> frame
> of host channel initialized or not.

The docbook is obviously way too terse here, but the above experiment
shows that the hardware is designed in the only sane way that it could
be designed.

Why do I say that this is the only sane way for the hardware to work?
I think all the following is true (please correct any errors):

A) HW only lets you specify even/odd which means you choose between
two frames to send the packet.  Two possible ways HW could be
implemented: "sane" way means you can send a packet in frame "x" and
"x + 1".  "insane" way means you can send a packet in frame "x + 1"
and "x + 2" but not frame "x".

B) In some cases (especially with regards to SPLIT transfers), we need
to use the result of a transfer in uFrame "x" to decide what to do
about uFrame "x + 1".  Specifically for IN transfers I think we can't
know for sure whether we'll get back all of our data in uFrame "x" or
whether we'll only get part of the data and need uFrame "x + 1".

C) It's possible to schedule 100us worth of periodic transfers in one
125us uFrame.

D) We can't know the result of a transfer until that transfer is done.


So the above basically means that we might have a periodic transfer where
we get the result of the transfer 100us into a uFrame.  We've now got
to quickly queue up the transfer for the next uFrame.  If hardware was
designed in the "insane" way then we'd need an interrupt latency of <
25 us since once the frame ticked we'd no longer be able to schedule.
If hardware was designed in the "sane" way then we'd "only" need an
interrupt latency of 125 us since we could continue to schedule even
partway through the current frame.
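
The arithmetic can be written out as a small sketch.  The function names are
hypothetical, and the 25 us follow-up duration used below is a made-up value
for illustration; the "sane" budget depends on how long the follow-up
transfer needs on the wire, so treat the result as a rough figure rather than
a hardware guarantee.

```c
#define UFRAME_US	125	/* length of one high-speed microframe */

/*
 * "Insane" design: the follow-up must be programmed before the frame
 * ticks, so only the remainder of the current uFrame is available.
 */
static int budget_must_precede_frame_us(int done_at_us)
{
	return UFRAME_US - done_at_us;
}

/*
 * "Sane" design: the follow-up may still be programmed partway through
 * the next uFrame, as long as it can finish before that uFrame ends.
 */
static int budget_same_frame_ok_us(int done_at_us, int followup_us)
{
	return (UFRAME_US - done_at_us) + (UFRAME_US - followup_us);
}
```

For a transfer finishing 100 us into the uFrame, the first function gives
25 us, and the second gives 125 us when the follow-up is assumed to need
25 us of bus time.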

Also note that if there's any chance that a periodic transfer ends
later than 100 us into a frame (like if a non-periodic transfer snuck
in there because we were out of periodic channels) then the above
problem becomes even more extreme.



-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time
@ 2016-02-02 23:28                 ` Doug Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-02-02 23:28 UTC (permalink / raw)
  To: Kever Yang
  Cc: John Youn, Felipe Balbi, Tao Huang, Stefan Wahren,
	Heiko Stübner, John Youn, Greg Kroah-Hartman, Ming Lei,
	linux-usb, linux-kernel, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Herrero, Gregory,
	吴良峰,
	Julius Werner, Dinh Nguyen

Kever,

On Mon, Feb 1, 2016 at 11:04 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
>>> Oh, now I get what you're saying!
>>>
>>> A) You've got dwc2_release_channel() -> dwc2_deactivate_qh() ->
>>> dwc2_hcd_qh_deactivate()
>>> ...and always in that case we'll do a select / queue, so we don't need it
>>> there.
>>>
>>> B) You've got dwc2_hcd_urb_dequeue() -> dwc2_hcd_qh_deactivate()
>>>
>>> ...but why don't we need it for dwc2_hcd_urb_dequeue()?  Yes, you're
>>> not continuing a split so timing isn't quite as urgent, but you still
>>> might have an INT or ISOC packet that's scheduled with an interval of
>>> 1.  We still might want to schedule right away if there are remaining
>>> QTDs, right?
>>
>> I ran out of time to fully test today, but I couldn't actually get a
>> case where we needed to schedule right away for B).  ...so given your
>> point about the select / queue already present in case A, we could
>> probably just drop this patch ("usb: dwc2: host: Schedule periodic
>> right away if it's time") and if we can find a case where it's needed
>> in case B we can add the select / queue there.
>>
>> Sound OK?  I'll try to do more testing tomorrow...
>
> Yes, we haven't found a case where we need to schedule right away for case B).
>
> For INT or ISOC packets, I recall seeing somewhere (though I can't find it
> now) that the synchronous transfer happens in the next uframe rather than
> the uframe in which the host channel is initialized, so it makes no
> difference whether the host channel register is set sooner or later within
> the same frame.
> Which means the existing code should be OK for case A).
>
> We can drop this patch until we have an exact use case.

I put in some printouts and I finally did manage to find a place where
we needed to queue things up in dwc2_hcd_urb_dequeue().  I saw:

314.587916: QH=d9535340 next(0) fn=2a52, sch=2a51=>2a52 (+1) miss=0
314.588040: QH=d9535340 next(0) fn=2a53, sch=2a52=>2a53 (+1) miss=0
314.588162: QH=d9535340 next(0) fn=2a54, sch=2a53=>2a54 (+1) miss=0
314.588299: QH=d9535340 next(0) fn=2a55, sch=2a54=>2a55 (+1) miss=0
314.588304: QH=d9535340 queue in dwc2_hcd_urb_dequeue
314.588363: QH=d9535340 next(0) fn=2a55, sch=2a55=>2a56 (+1) miss=0
314.588413: dwc2_handle_hcd_intr: ff540000.usb: SCH: QH=e5cea380 ready fn=2a56, nxt=2a56
314.588414: dwc2_handle_hcd_intr: ff540000.usb: SCH: QH=e73ccc40 ready fn=2a56, nxt=2a56
314.588415: dwc2_handle_hcd_intr: ff540000.usb: SCH: QH=e5cea8c0 ready fn=2a56, nxt=2a56

It's not something that's terribly common.  It's fine to just drop
this patch, or I can replace it with
<https://chromium-review.googlesource.com/325540>.
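
For reference, a minimal sketch of the "is it time?" check that a queue
call in the dequeue path would depend on.  This is an illustration only:
frame_num_le() and qh_is_due() are hypothetical names, and it assumes the
frame numbers in the trace (2a55, 2a56, ...) come from a 14-bit counter
that wraps at 0x3fff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: frame numbers are 14-bit counters that wrap at 0x3fff. */
#define FRNUM_MASK 0x3fffu

/* True if frame a is at or before frame b, allowing for wraparound. */
static bool frame_num_le(uint16_t a, uint16_t b)
{
	return ((uint16_t)(b - a) & FRNUM_MASK) <= (FRNUM_MASK >> 1);
}

/*
 * Hypothetical helper: once a dequeue leaves a QH with remaining QTDs,
 * it is "time" to queue it if its next active frame is not in the future.
 */
static bool qh_is_due(uint16_t next_active_frame, uint16_t now_frame)
{
	assert(next_active_frame <= FRNUM_MASK && now_frame <= FRNUM_MASK);
	return frame_num_le(next_active_frame, now_frame);
}
```

A QH whose next active frame equals the current frame counts as due, so
qh_is_due(0x2a55, 0x2a55) is true while qh_is_due(0x2a56, 0x2a55) is not.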

-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits
  2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
@ 2016-02-02 23:57   ` John Youn
  2016-01-29  2:19   ` Douglas Anderson
                     ` (21 subsequent siblings)
  22 siblings, 0 replies; 71+ messages in thread
From: John Youn @ 2016-02-02 23:57 UTC (permalink / raw)
  To: Douglas Anderson, John Youn, balbi, kever.yang
  Cc: william.wu, huangtao, heiko, stefan.wahren, linux-rockchip,
	linux-rpi-kernel, Julius Werner, gregory.herrero, yousaf.kaukab,
	dinguyen, stern, ming.lei, gregkh, linux-usb, linux-kernel

On 1/28/2016 6:20 PM, Douglas Anderson wrote:
> This is a bit of catchall series for all the bug fix and performance
> patches I've been working on over the last few months.  Note that for
> dwc2 we need to do LOTS in software and need super low interrupt
> latency, so most performance improvements actually fix real bugs.
> 
> Patches are structured to start with no-brainer stuff that could be
> applied ASAP, especially things I've already gotten Acks for.  Things
> get slightly more RFC / RFT like as we get farther down the series.
> Anything that can be landed sooner rather than later (especially those
> Acked long ago) would help in re-posts (I'm not biased, of course).
> 

Hi Doug,

I've yet to review this, but just wanted to let you know that we've
started on it and are also testing it. We'll get back to you with some
feedback and results soon.

We had also been looking at some of these same and related issues so
we want to make sure everything we've done is compatible with your
changes and is still working ok too.

Regards,
John

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame
@ 2016-02-03  7:47         ` Kever Yang
  0 siblings, 0 replies; 71+ messages in thread
From: Kever Yang @ 2016-02-03  7:47 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Tao Huang, Stefan Wahren, Heiko Stübner, John Youn,
	Greg Kroah-Hartman, Ming Lei, linux-usb, linux-kernel,
	Felipe Balbi, John Youn, open list:ARM/Rockchip SoC...,
	Kaukab, Yousaf, Alan Stern, linux-rpi-kernel, Julius Werner,
	吴良峰,
	Herrero, Gregory, Dinh Nguyen

Doug,

Thanks for your detailed debug information; please add my Reviewed-by for
this patch.

Thanks,
- Kever
On 02/03/2016 06:47 AM, Doug Anderson wrote:
> Kever,
>
> On Mon, Feb 1, 2016 at 11:46 PM, Kever Yang <kever.yang@rock-chips.com> wrote:
>> Doug,
>>
>>
>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>> When setting up ISO and INT transfers dwc2 needs to specify whether the
>>> transfer is for an even or an odd frame (or microframe if the controller
>>> is running in high speed mode).
>>>
>>> The controller appears to use this as a simple way to figure out if a
>>> transfer should happen right away (in the current microframe) or should
>>> happen at the start of the next microframe.  Said another way:
>>>
>>> - If you set "odd" and the current frame number is odd it appears that
>>>     the controller will try to transfer right away.  Same thing if you set
>>>     "even" and the current frame number is even.
>>> - If the oddness you set and the oddness of the frame number are
>>>     _different_, the transfer will be delayed until the frame number
>>>     changes.
>>>
>>> As I understand it, the above technique allows you to plan ahead of time
>>> where possible by always working on the next frame.  ...but it still
>>> allows you to properly respond immediately to things that happened in
>>> the previous frame.
>>>
>>> The old dwc2_hc_set_even_odd_frame() didn't really handle this concept.
>>> It always looked at the frame number and set up the transfer to happen in
>>> the next frame.  In some cases that meant that certain transactions
>>> would be transferred in the wrong frame.
>>>
>>> We'll try our best to set the even / odd to do the transfer in the
>>> scheduled frame.  If that fails then we'll do an ugly "schedule ASAP".
>>> We'll also modify the scheduler code to handle this and not try to
>>> schedule a second transfer for the same frame.
>>>
>>> Note that this change relies on the work to redo the microframe
>>> scheduler.  It can work atop ("usb: dwc2: host: Manage frame nums better
>>> in scheduler") but it works even better after ("usb: dwc2: host: Totally
>>> redo the microframe scheduler").
>>>
>>> With this change my stressful USB test (USB webcam + USB audio +
>>> keyboards) has less audio crackling than before.
>> Seems this really helps for your case?
> Yes, I believe it does.  Of course my test case is pretty "black box"
> for the most part in that I play music on youtube while having a
> webcam open and several USB input devices connected.  I then try to
> decide whether I hear more static or less static.  ...clearly a less
> subjective test would be better...
>
> * I tried with http://crosreview.com/325451 (see below) and I hear
> more static with "use_old = true" than with "use_old = false".
>
> * I tried with this entire patch reverted and I hear about the same
> static as with "use_old = true".
>
> Note that counting reported MISS lines from my logging also shows that
> the new code is better...
>
>
>> Did you check whether the transfer can happen right in the current frame?
>> I know it's quite difficult to check, but this changes what I know about
>> how the dwc core schedules the transaction.
> Yes.  I just tried again, too.  I coded up
> <https://chromium-review.googlesource.com/325451> and included it.  I
> then opened up a USB webcam.
>
> With things set to the old way:
>
>    115.355370  QH=dc6ba8c0 next(0) fn=10cb, sch=10ca=>10cb (+1) miss=0
>    115.355373  QH=dc6ba8c0 IMM ready fn=10cb, nxt=10cb
>    115.355518  QH=dc6ba8c0 next(0) fn=10cc, sch=10cb=>10cc (+1) miss=0
>    115.355522  QH=dc6ba8c0 IMM ready fn=10cc, nxt=10cc
>    115.355637  QH=dc6ba8c0 next(0) fn=10cd, sch=10cc=>10cd (+1) miss=0
>    115.355641  QH=dc6ba8c0 IMM ready fn=10cd, nxt=10cd
>    115.355857  QH=dc6ba8c0 next(0) fn=10ce, sch=10cd=>10ce (+1) miss=0
>    115.355859  QH=dc6ba8c0 IMM ready fn=10ce, nxt=10ce
>    115.355867  QH=dc6ba8c0, wire=10cf, old_wire=10d0, EO diff (use OLD)
>    115.355870  QH=dc6ba8c0 EO MISS w/ old (10ce != 10cf)
>    115.356037  QH=dc6ba8c0 next(0) fn=10d0, sch=10cf=>10d0 (+1) miss=1 MISS
>    115.356039  QH=dc6ba8c0 IMM ready fn=10d0, nxt=10d0
>    115.356169  QH=dc6ba8c0 next(0) fn=10d1, sch=10d0=>10d1 (+1) miss=0
>    115.356170  QH=dc6ba8c0 IMM ready fn=10d1, nxt=10d1
>    115.356269  QH=dc6ba8c0 next(0) fn=10d2, sch=10d1=>10d2 (+1) miss=0
>    115.356273  QH=dc6ba8c0 IMM ready fn=10d2, nxt=10d2
>    115.356404  QH=dc6ba8c0 next(0) fn=10d3, sch=10d2=>10d3 (+1) miss=0
>    115.356407  QH=dc6ba8c0 IMM ready fn=10d3, nxt=10d3
>
> With the new way:
>
>     87.814741  QH=e2fd7880 next(0) fn=32e4, sch=32e3=>32e4 (+1) miss=0
>     87.814744  QH=e2fd7880 IMM ready fn=32e4, nxt=32e4
>     87.814858  QH=e2fd7880 next(0) fn=32e5, sch=32e4=>32e5 (+1) miss=0
>     87.814862  QH=e2fd7880 IMM ready fn=32e5, nxt=32e5
>     87.815010  QH=e2fd7880 next(0) fn=32e6, sch=32e5=>32e6 (+1) miss=0
>     87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
>     87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
>     87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
>     87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
>     87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
>     87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
>     87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
>     87.815391  QH=e2fd7880 IMM ready fn=32e9, nxt=32e9
>     87.815491  QH=e2fd7880 next(0) fn=32ea, sch=32e9=>32ea (+1) miss=0
>     87.815493  QH=e2fd7880 IMM ready fn=32ea, nxt=32ea
>     87.815635  QH=e2fd7880 next(0) fn=32eb, sch=32ea=>32eb (+1) miss=0
>     87.815638  QH=e2fd7880 IMM ready fn=32eb, nxt=32eb
>
>
> Note that with my TEST-ONLY patch the old way is still _slightly_
> different in that I still communicate back to the scheduler with:
>
>    chan->qh->next_active_frame = now_frame;
>
> The old code didn't used to do that.  If I don't do that then you'll
> just stay in an inconsistent state for a while where things are
> going on the wire 1 frame later than we think they are.
>
>
> Also note that above you can see that the new way is indeed able to
> schedule things in the current microframe.  Looking one line at a
> time:
>
>
>     87.815012  QH=e2fd7880 IMM ready fn=32e6, nxt=32e6
>
> QH e2fd7880 is going straight to the ready queue.  Actual frame number
> in hardware is 32e6.  next_active_frame = 32e6 which means we ideally
> want to give it to hardware in 32e6 and wire frame is 32e7.
>
>
>     87.815220  QH=e2fd7880 next(0) fn=32e8, sch=32e6=>32e7 (+1) miss=0
>     87.815222  QH=e2fd7880 IMM ready fn=32e8, nxt=32e7
>
> Frame number in hardware is now 32e8.  We'd like to give the next
> transfer to hardware in 32e7 to transfer on the wire at 32e8, but
> that's obviously impossible.  We will try to give it right away.
>
>
>     87.815230  QH=e2fd7880, wire=32e8, old_wire=32e9, EO diff (use NEW)
>
> Showing a difference in the old way.  We'll choose "even" to have the
> packet go on the wire (expecting 32e8).
>
>
>     87.815278  QH=e2fd7880 next(0) fn=32e8, sch=32e7=>32e8 (+1) miss=0
>     87.815280  QH=e2fd7880 IMM ready fn=32e8, nxt=32e8
>
> We got a response back and are ready to schedule the next transfer and
> it's still 32e8!  That means that transfer must have happened (as
> expected) in 32e8.  Whew!  Give the next transfer to hardware hoping
> for 32e9 wire.
>
>
>     87.815390  QH=e2fd7880 next(0) fn=32e9, sch=32e8=>32e9 (+1) miss=0
>
> Now at hardware 32e9 and ready to schedule the next...
>
>
>
>> In dwc_otgbook, under Interrupt OUT Transactions in DMA Mode (Int IN and
>> Iso IN/OUT are similar), the normal Interrupt OUT operation says:
>> The DWC_otg host attempts to send out the OUT token in the beginning of next
>> odd frame/microframe.
>>
>> So I'm confused about whether the dwc core can do the transaction in the
>> same frame in which the host channel is initialized.
> The docbook is obviously way too terse here, but the above experiment
> shows that the hardware is designed in the only sane way that it could
> be designed.
>
> Why do I say that this is the only sane way for the hardware to work?
> I think all the following is true (please correct any errors):
>
> A) HW only lets you specify even/odd which means you choose between
> two frames to send the packet.  Two possible ways HW could be
> implemented: "sane" way means you can send a packet in frame "x" and
> "x + 1".  "insane" way means you can send a packet in frame "x + 1"
> and "x + 2" but not frame "x".
>
> B) In some cases (especially with regards to SPLIT transfers), we need
> to use the result of a transfer in uFrame "x" to decide what to do
> about uFrame "x + 1".  Specifically for IN transfers I think we can't
> know for sure whether we'll get back all of our data in uFrame "x" or
> whether we'll only get part of the data and need uFrame "x + 1".
>
> C) It's possible to schedule 100us worth of periodic transfers in one
> 125us uFrame.
>
> D) We can't know the result of a transfer until that transfer is done.
>
>
> So the above basically means that we might have a periodic transfer where
> we get the result of the transfer 100us into a uFrame.  We've now got
> to quickly queue up the transfer for the next uFrame.  If hardware was
> designed in the "insane" way then we'd need an interrupt latency of <
> 25 us since once the frame ticked we'd no longer be able to schedule.
> If hardware was designed in the "sane" way then we'd "only" need an
> interrupt latency of 125 us since we could continue to schedule even
> partway through the current frame.
>
> Also note that if there's any chance that a periodic transfer ends
> later than 100 us into a frame (like if a non-periodic transfer snuck
> in there because we were out of periodic channels) then the above
> problem becomes even more extreme.
>
>
>
> -Doug
>
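The even/odd rule described in the quoted patch text can be sketched as
follows.  This is an illustration under stated assumptions, not the
driver's dwc2_hc_set_even_odd_frame(): struct eo_choice and
choose_even_odd() are hypothetical names, frame numbers are assumed to be
14-bit, the wanted wire frame is taken to be next_active_frame + 1 (as in
the trace walk-through), and "schedule ASAP" is modeled as "use the
current frame".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: frame numbers are 14-bit counters that wrap at 0x3fff. */
#define FRNUM_MASK 0x3fffu

struct eo_choice {
	bool odd;		/* parity to program for the transfer */
	uint16_t wire_frame;	/* frame the transfer is expected in */
	bool missed;		/* true if the scheduled frame was missed */
};

/*
 * Pick the parity so the transfer lands in the scheduled wire frame.
 * Hardware can only target the current frame (matching parity) or the
 * next one (opposite parity), so anything else is a miss.
 */
static struct eo_choice choose_even_odd(uint16_t next_active_frame,
					uint16_t now_frame)
{
	uint16_t wanted = (next_active_frame + 1) & FRNUM_MASK;
	/* 0: wanted frame is the current one, 1: it is the next one. */
	uint16_t ahead = (wanted - now_frame) & FRNUM_MASK;
	struct eo_choice c;

	assert(next_active_frame <= FRNUM_MASK && now_frame <= FRNUM_MASK);

	c.missed = ahead > 1;
	/* On a miss, fall back to "ASAP": transfer in the current frame. */
	c.wire_frame = c.missed ? now_frame : wanted;
	c.odd = c.wire_frame & 1;
	return c;
}
```

For the trace line with fn=32e8 and nxt=32e7 this picks "even" with a
wire frame of 32e8, which matches the "wire=32e8" entry in the log,
whereas always targeting the next frame would give 32e9.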

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits
  2016-02-02 23:57   ` John Youn
@ 2016-02-03 18:23     ` Doug Anderson
  -1 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-02-03 18:23 UTC (permalink / raw)
  To: John Youn
  Cc: balbi, kever.yang, william.wu, huangtao, heiko, stefan.wahren,
	linux-rockchip, linux-rpi-kernel, Julius Werner, gregory.herrero,
	yousaf.kaukab, dinguyen, stern, ming.lei, gregkh, linux-usb,
	linux-kernel

John,

On Tue, Feb 2, 2016 at 3:57 PM, John Youn <John.Youn@synopsys.com> wrote:
> On 1/28/2016 6:20 PM, Douglas Anderson wrote:
>> This is a bit of catchall series for all the bug fix and performance
>> patches I've been working on over the last few months.  Note that for
>> dwc2 we need to do LOTS in software and need super low interrupt
>> latency, so most performance improvements actually fix real bugs.
>>
>> Patches are structured to start with no-brainer stuff that could be
>> applied ASAP, especially things I've already gotten Acks for.  Things
>> get slightly more RFC / RFT like as we get farther down the series.
>> Anything that can be landed sooner rather than later (especially those
>> Acked long ago) would help in re-posts (I'm not biased, of course).
>>
>
> Hi Doug,
>
> I've yet to review this, but just wanted to let you know that we've
> started on it and also testing. We'll get back to you with some
> feedback and results soon.
>
> We had also been looking at some of these same and related issues so
> we want to make sure everything we've done is compatible with your
> changes and is still working ok too.

Great, thanks for the reply.  It's very helpful to know that you're
looking at it even if you haven't had time to do a full review yet.

Note: if any of my patches are wrong or redundant to patches that
you've developed and tested, I'm happy to take your patches instead.
;)

Note that patches 1 - 11 have already landed in our tree and thus are
getting additional exposure and testing.  IMHO those are all ready for
prime time.  Assuming there are no huge issues, it would be handy if
those 11 patches landed as-is (or almost as-is), but that's just me
being selfish so I don't need to revert / reland new versions.  :-P


Also note that the period of time where I can devote this much time to
dwc2 is coming soon to an end since I have to go on to work on other
things.  If there are major issues or easy fixes I will certainly be
able to help with those things (I'll still be around), but I won't be
able to devote days to tracking down weird problems or testing
rewrites.  ;-)


-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 17/22] usb: dwc2: host: Manage frame nums better in scheduler
@ 2016-02-03 20:29     ` Doug Anderson
  0 siblings, 0 replies; 71+ messages in thread
From: Doug Anderson @ 2016-02-03 20:29 UTC (permalink / raw)
  To: John Youn, Felipe Balbi, Kever Yang, Herrero, Gregory
  Cc: 吴良峰,
	Tao Huang, Heiko Stübner, Stefan Wahren,
	open list:ARM/Rockchip SoC...,
	linux-rpi-kernel, Julius Werner, Kaukab, Yousaf, Dinh Nguyen,
	Alan Stern, Ming Lei, Douglas Anderson, John Youn,
	Greg Kroah-Hartman, linux-usb, linux-kernel

Hi,

On Thu, Jan 28, 2016 at 6:20 PM, Douglas Anderson <dianders@chromium.org> wrote:
>  static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>                          struct dwc2_hcd_urb *urb)
>  {
> @@ -569,11 +655,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
>                               qh->ep_type == USB_ENDPOINT_XFER_ISOC,
>                               bytecount));
>
> -               /* Ensure frame_number corresponds to the reality */
> -               hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);

In reviewing patches I realized that this is actually a revert of
commit dd81dd7c8178 ("usb: dwc2: host: use correct frame number during
qh init").  IMHO that patch was wrong: hsotg->frame_number is supposed
to be the frame number as of the last start of frame.  If we need to
know a more recent frame number then we should query it ourselves.
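
To illustrate the distinction, here is a minimal, self-contained sketch
(not the driver code).  The names struct host_state, read_live_frame(),
on_sof() and frames_since_sof() are hypothetical, the hardware counter is
simulated by a plain field, and the 14-bit wrap mask is an assumption:

```c
#include <assert.h>
#include <stdint.h>

#define FRNUM_MASK 0x3fffu /* assumed 14-bit frame counter */

/* Hypothetical stand-in for the host state; not the driver's struct. */
struct host_state {
	uint16_t frame_number; /* snapshot taken at the last start of frame */
	uint16_t hw_frame;     /* simulated live hardware counter */
};

/* Simulated register read: always returns the current hardware value. */
static uint16_t read_live_frame(const struct host_state *hs)
{
	return hs->hw_frame & FRNUM_MASK;
}

/* Only the start-of-frame handler refreshes the cached snapshot. */
static void on_sof(struct host_state *hs)
{
	hs->frame_number = read_live_frame(hs);
}

/* Frames elapsed since the snapshot, tolerating counter wraparound. */
static uint16_t frames_since_sof(const struct host_state *hs)
{
	return (read_live_frame(hs) - hs->frame_number) & FRNUM_MASK;
}
```

Code that needs a fresher value reads the counter itself and leaves the
cached snapshot alone.  In the driver the live read would presumably be
dwc2_hcd_get_frame_number(), as in the removed lines quoted above.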

Presumably the reason for the original patch was to try to fix some of
the same problems I've addressed in my series, so I'd presume that
this doesn't add any new regressions.  I haven't heard much from
Gregory Herrero about my series, but it would be nice to confirm that
this virtual revert wasn't causing problems.

-Doug

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 17/22] usb: dwc2: host: Manage frame nums better in scheduler
  2016-02-03 20:29     ` Doug Anderson
  (?)
@ 2016-02-09  9:53     ` Herrero, Gregory
  -1 siblings, 0 replies; 71+ messages in thread
From: Herrero, Gregory @ 2016-02-09  9:53 UTC (permalink / raw)
  To: Doug Anderson
  Cc: John Youn, Felipe Balbi, Kever Yang, 吴良峰,
	Tao Huang, Heiko Stübner, Stefan Wahren,
	open list:ARM/Rockchip SoC...,
	linux-rpi-kernel, Julius Werner, Kaukab, Yousaf, Dinh Nguyen,
	Alan Stern, Ming Lei, John Youn, Greg Kroah-Hartman, linux-usb,
	linux-kernel

Hi Doug,

> Hi,
> 
> On Thu, Jan 28, 2016 at 6:20 PM, Douglas Anderson <dianders@chromium.org> wrote:
> >  static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
> >                          struct dwc2_hcd_urb *urb)
> >  {
> > @@ -569,11 +655,6 @@ static void dwc2_qh_init(struct dwc2_hsotg *hsotg, struct dwc2_qh *qh,
> >                               qh->ep_type == USB_ENDPOINT_XFER_ISOC,
> >                               bytecount));
> >
> > -               /* Ensure frame_number corresponds to the reality */
> > -               hsotg->frame_number = dwc2_hcd_get_frame_number(hsotg);
> 
> In reviewing patches I realized that this is actually a revert of
> commit dd81dd7c8178 ("usb: dwc2: host: use correct frame number during
> qh init").  IMHO that patch was wrong: hsotg->frame_number is supposed
> to be the frame number as of the last start of frame.  If we need to
> know a more recent frame number then we should query it ourselves.
> 
> Presumably the reason for the original patch was to try to fix some of
> the same problems I've addressed in my series, so I'd presume that
> this doesn't add any new regressions.  I haven't heard much from
> Gregory Herrero about my series, but it would be nice to confirm that
> this virtual revert wasn't causing problems.
> 

This patch ("usb: dwc2: host: use correct frame number during qh init")
is no longer needed with your patchset.
Note that your patchset also reverts commit 08c4ffc
("usb: dwc2: host: reset frame number after suspend"),
but that commit is no longer needed with your patchset either.

I tried suspend/resume with different devices and didn't hit the issue
my previous commit was fixing.

Regards,
Gregory

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 10/22] usb: dwc2: host: Properly set the HFIR
  2016-01-31 22:19     ` Doug Anderson
@ 2016-02-10  2:08       ` John Youn
  0 siblings, 0 replies; 71+ messages in thread
From: John Youn @ 2016-02-10  2:08 UTC (permalink / raw)
  To: Doug Anderson, Kever Yang
  Cc: John Youn, Felipe Balbi, 吴良峰,
	Tao Huang, Heiko Stübner, Stefan Wahren,
	open list:ARM/Rockchip SoC...,
	linux-rpi-kernel, Julius Werner, Herrero, Gregory, Kaukab,
	Yousaf, Dinh Nguyen, Alan Stern, Ming Lei, Greg Kroah-Hartman,
	linux-usb, linux-kernel

On 1/31/2016 2:19 PM, Doug Anderson wrote:
> Kever,
> 
> On Sun, Jan 31, 2016 at 1:23 AM, Kever Yang <kever.yang@rock-chips.com> wrote:
>> Doug,
>>
>> On 01/29/2016 10:20 AM, Douglas Anderson wrote:
>>>
>>> According to the most up to date version of the dwc2 databook, the FRINT
>>> field of the HFIR register should be programmed to:
>>> * 125 us * (PHY clock freq for HS) - 1
>>> * 1000 us * (PHY clock freq for FS/LS) - 1
>>
>> I have 3 versions of the dwc_otg databook (2.74a, 2.94a and 3.10a),
>> and all of them describe FrInt as:
> 
> Can you check to see if you can get 3.30a (October 2015)?
> 
> 
>> * 125 us * (PHY clock freq for HS)
>> * 1000 us * (PHY clock freq for FS/LS)
>>
>> Maybe John can help to check the design.
> 
> Yes, this really needs John or someone at Synopsys.
> 
> 

The "- 1" is the correct value. The databook was corrected in 3.30a
and this applies to all previous versions of the core.
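
As a rough illustration of the arithmetic, here is a minimal sketch that
assumes the PHY clock is given in MHz.  frint_value() is a hypothetical
helper name, not a function from the driver:

```c
#include <assert.h>

/*
 * Hypothetical helper (not a dwc2 function): compute the HFIR FRINT value
 * as (interval in us) * (PHY clock in MHz) - 1.
 */
static unsigned int frint_value(unsigned int phy_clk_mhz, int high_speed)
{
	/* 125 us per microframe at HS, 1000 us per frame at FS/LS */
	unsigned int interval_us = high_speed ? 125 : 1000;

	assert(phy_clk_mhz > 0); /* a zero clock would underflow the "- 1" */
	return interval_us * phy_clk_mhz - 1;
}
```

For example, a 60 MHz clock at HS would give 125 * 60 - 1 = 7499, and a
48 MHz clock at FS/LS would give 1000 * 48 - 1 = 47999.  Those clock
rates are only illustrative.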

John

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits
  2016-02-03 18:23     ` Doug Anderson
@ 2016-02-10  2:25       ` John Youn
  -1 siblings, 0 replies; 71+ messages in thread
From: John Youn @ 2016-02-10  2:25 UTC (permalink / raw)
  To: Doug Anderson, John Youn
  Cc: balbi, kever.yang, william.wu, huangtao, heiko, stefan.wahren,
	linux-rockchip, linux-rpi-kernel, Julius Werner, gregory.herrero,
	yousaf.kaukab, dinguyen, stern, ming.lei, gregkh, linux-usb,
	linux-kernel

On 2/3/2016 10:24 AM, Doug Anderson wrote:
> John,
> 
> On Tue, Feb 2, 2016 at 3:57 PM, John Youn <John.Youn@synopsys.com> wrote:
>> On 1/28/2016 6:20 PM, Douglas Anderson wrote:
>>> This is a bit of catchall series for all the bug fix and performance
>>> patches I've been working on over the last few months.  Note that for
>>> dwc2 we need to do LOTS in software and need super low interrupt
>>> latency, so most performance improvements actually fix real bugs.
>>>
>>> Patches are structured to start with no-brainer stuff that could be
>>> applied ASAP, especially things I've already gotten Acks for.  Things
>>> get slightly more RFC / RFT like as we get farther down the series.
>>> Anything that can be landed sooner rather than later (especially those
>>> Acked long ago) would help in re-posts (I'm not biased, of course).
>>>
>>
>> Hi Doug,
>>
>> I've yet to review this, but just wanted to let you know that we've
>> started on it and also testing. We'll get back to you with some
>> feedback and results soon.
>>
>> We had also been looking at some of these same and related issues so
>> we want to make sure everything we've done is compatible with your
>> changes and is still working ok too.
> 
> Great, thanks for the reply.  It's very helpful to know that you're
> looking at it even if you haven't had time to do a full review yet.
> 
> Note: if any of my patches are wrong or redundant to patches that
> you've developed and tested, I'm happy to take your patches instead.
> ;)
> 
> Note that patches 1 - 11 have already landed in our tree and thus are
> getting additional exposure and testing.  IMHO those are all ready for
> prime time.  Assuming there are no huge issues, it would be handy if
> those 11 patches landed as-is (or almost as-is), but that's just me
> being selfish so I don't need to revert / reland new versions.  :-P
> 

Ok those seem good to me.

Patches 1-11:

Acked-by: John Youn <johnyoun@synopsys.com>


For the remaining patches in the series, we have some engineers still
working on validating it but so far it looks good. They have not
uncovered any new issues.

Regards,
John

^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2016-02-10  2:26 UTC | newest]

Thread overview: 71+ messages
2016-01-29  2:19 [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 01/22] usb: dwc2: rockchip: Make the max_transfer_size automatic Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 02/22] usb: dwc2: host: Get aligned DMA in a more supported way Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 03/22] usb: dwc2: host: Set host_rx_fifo_size to 525 for rk3066 Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 04/22] usb: dwc2: host: Avoid use of chan->qh after qh freed Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 05/22] usb: dwc2: host: Always add to the tail of queues Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 06/22] usb: dwc2: host: fix split transfer schedule sequence Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 07/22] usb: dwc2: host: Add scheduler tracing Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:19 ` [PATCH v6 08/22] usb: dwc2: host: Add a delay before releasing periodic bandwidth Douglas Anderson
2016-01-29  2:19   ` Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 09/22] usb: dwc2: host: Giveback URB in tasklet context Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 10/22] usb: dwc2: host: Properly set the HFIR Douglas Anderson
2016-01-31  9:23   ` Kever Yang
2016-01-31  9:23     ` Kever Yang
2016-01-31 22:19     ` Doug Anderson
2016-02-10  2:08       ` John Youn
2016-01-29  2:20 ` [PATCH v6 11/22] usb: dwc2: host: There's not really a TT for the root hub Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-01-31  9:25   ` Kever Yang
2016-01-29  2:20 ` [PATCH v6 12/22] usb: dwc2: host: Use periodic interrupt even with DMA Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 13/22] usb: dwc2: host: Rename some fields in struct dwc2_qh Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 14/22] usb: dwc2: host: Reorder things in hcd_queue.c Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 15/22] usb: dwc2: host: Split code out to make dwc2_do_reserve() Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 16/22] usb: dwc2: host: Add scheduler logging for missed SOFs Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 17/22] usb: dwc2: host: Manage frame nums better in scheduler Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-02-03 20:29   ` Doug Anderson
2016-02-03 20:29     ` Doug Anderson
2016-02-09  9:53     ` Herrero, Gregory
2016-01-29  2:20 ` [PATCH v6 18/22] usb: dwc2: host: Schedule periodic right away if it's time Douglas Anderson
2016-01-31  9:36   ` Kever Yang
2016-01-31  9:36     ` Kever Yang
2016-01-31 22:09     ` Doug Anderson
2016-01-31 22:09       ` Doug Anderson
2016-02-01  3:32       ` Kever Yang
2016-02-01  4:36         ` Doug Anderson
2016-02-01  4:36           ` Doug Anderson
2016-02-02  0:36           ` Doug Anderson
2016-02-02  0:36             ` Doug Anderson
2016-02-02  7:04             ` Kever Yang
2016-02-02  7:04               ` Kever Yang
2016-02-02 23:28               ` Doug Anderson
2016-02-02 23:28                 ` Doug Anderson
2016-01-29  2:20 ` [PATCH v6 19/22] usb: dwc2: host: Add dwc2_hcd_get_future_frame_number() call Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 20/22] usb: dwc2: host: Properly set even/odd frame Douglas Anderson
2016-02-02  7:46   ` Kever Yang
2016-02-02 22:47     ` Doug Anderson
2016-02-02 22:47       ` Doug Anderson
2016-02-03  7:47       ` Kever Yang
2016-02-03  7:47         ` Kever Yang
2016-01-29  2:20 ` [PATCH v6 21/22] usb: dwc2: host: Totally redo the microframe scheduler Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-01-29  2:20 ` [PATCH v6 22/22] usb: dwc2: host: If using uframe scheduler, end splits better Douglas Anderson
2016-01-29  2:20   ` Douglas Anderson
2016-02-02 23:57 ` [PATCH v6 0/22] usb: dwc2: host: Fix and speed up all the stuff, especially with splits John Youn
2016-02-02 23:57   ` John Youn
2016-02-03 18:23   ` Doug Anderson
2016-02-03 18:23     ` Doug Anderson
2016-02-10  2:25     ` John Youn
2016-02-10  2:25       ` John Youn
