linux-usb.vger.kernel.org archive mirror
* [6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule()
@ 2019-04-03 18:53 ` Matwey V. Kornilov
  2019-04-30 15:31   ` Bin Liu
  0 siblings, 1 reply; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-04-03 18:53 UTC (permalink / raw)
  To: b-liu, gregkh
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-kernel, linux-usb

Previously, the algorithm was the following:

 1. giveback current URB
 2. if current qh is not empty
    then start next URB
 3. if current qh is empty
    then dispose the qh, find next qh if any, and start URB.

Running urb->callback inside URB giveback may take a while, and musb
runs giveback synchronously. To reduce the latency, we rearrange the
function for the case when the qh is not empty: the next URB is started
before the current URB is given back. When the qh is empty, the existing
behaviour is intentionally kept so as not to break inter-qh scheduling:
URB giveback could potentially enqueue another URB to the empty qh,
preventing it from being disposed.

Before this patch, time spent in urb->callback led to the following
glitches between the host and a hub during isoc transfer (line 4):

    11.624492 d=  0.000124 [130.6 +  1.050] [  4] SPLIT
    11.624492 d=  0.000000 [130.6 +  1.467] [  3] IN   : 3.5
    11.624493 d=  0.000000 [130.6 +  1.967] [ 37] DATA0: aa 08 [skipped...]
    11.625617 d=  0.001124 [131.7 +  1.050] [  4] SPLIT
    11.625617 d=  0.000000 [131.7 +  1.467] [  3] IN   : 3.5
    11.625867 d=  0.000250 [132.1 +  1.050] [  4] SPLIT
    11.625867 d=  0.000000 [132.1 +  1.467] [  3] IN   : 3.5
    11.625868 d=  0.000001 [132.1 +  1.983] [  3] DATA0: 00 00
    11.626617 d=  0.000749 [132.7 +  1.050] [  4] SPLIT
    11.626617 d=  0.000000 [132.7 +  1.467] [  3] IN   : 3.5
    11.626867 d=  0.000250 [133.1 +  1.050] [  4] SPLIT
    11.626867 d=  0.000000 [133.1 +  1.467] [  3] IN   : 3.5
    11.626868 d=  0.000000 [133.1 +  1.967] [  3] DATA0: 00 00

After the hub, they look as follows and may lead to broken
peripheral transfers (as in the case of a PWC-based webcam):

    11.332004 d=  0.000997 [ 30.0 +  3.417] [  3] IN   : 5.5
    11.332007 d=  0.000003 [ 30.0 +  6.833] [800] DATA0: 8a 1c [skipped...]
    11.334004 d=  0.001997 [ 32.0 +  3.417] [  3] IN   : 5.5
    11.334007 d=  0.000003 [ 32.0 +  6.750] [  3] DATA0: 00 00
    11.335004 d=  0.000997 [ 33   +  3.417] [  3] IN   : 5.5
    11.335007 d=  0.000003 [ 33   +  6.750] [  3] DATA0: 00 00

Removing these glitches makes it possible to run a 10fps video
stream from a webcam attached via a USB hub, which was previously
impossible.

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index ed99ecd4e63a..75be92873b5b 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -85,6 +85,11 @@ static bool musb_qh_empty(struct musb_qh *qh)
 	return list_empty(&qh->hep->urb_list);
 }
 
+static bool musb_qh_singular(struct musb_qh *qh)
+{
+	return list_is_singular(&qh->hep->urb_list);
+}
+
 static void musb_qh_unlink_hep(struct musb_qh *qh)
 {
 	if (!qh->hep)
@@ -362,6 +367,19 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 		break;
 	}
 
+	if (ready && !musb_qh_singular(qh)) {
+		struct urb *next_urb = list_next_entry(urb, urb_list);
+
+		musb_dbg(musb, "... next ep%d %cX urb %p", hw_ep->epnum, is_in ? 'R' : 'T', next_urb);
+		musb_start_urb(musb, is_in, qh, next_urb);
+
+		qh->is_ready = 0;
+		musb_giveback(musb, urb, status);
+		qh->is_ready = ready;
+
+		return;
+	}
+
 	qh->is_ready = 0;
 	musb_giveback(musb, urb, status);
 	qh->is_ready = ready;
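
As an illustration only (this is not part of the patch): the new
musb_qh_singular() helper wraps list_is_singular(), which is true only
for a list holding exactly one entry. Assuming the URB being completed
is still on hep->urb_list at this point (the patch takes
list_next_entry() of it), "not singular" then means that at least one
more URB is queued behind it. The userspace sketch below re-implements
the two list predicates to show the difference; all toy_* names are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct toy_list_head {
	struct toy_list_head *next, *prev;
};

/* An empty circular list points back at its own head. */
void toy_list_init(struct toy_list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Append item just before head, i.e. at the tail of the list. */
void toy_list_add_tail(struct toy_list_head *item, struct toy_list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Same test as the kernel's list_empty(). */
bool toy_list_empty(const struct toy_list_head *head)
{
	return head->next == head;
}

/* Same test as the kernel's list_is_singular(): exactly one entry. */
bool toy_list_is_singular(const struct toy_list_head *head)
{
	return !toy_list_empty(head) && head->next == head->prev;
}
```

With one URB queued the list is singular and the old path is taken;
with two or more, the entry after the current one is the URB that the
new path would start first.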


* Re: [PATCH 0/6] musb: Improve performance for hub-attached webcams
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
  2019-04-03 18:53 ` [6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
@ 2019-04-24 15:42 ` Matwey V. Kornilov
  2019-04-30 15:20   ` Bin Liu
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-04-24 15:42 UTC (permalink / raw)
  To: Bin Liu, Greg KH
  Cc: open list, open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER

Ping

On Wed, 3 Apr 2019 at 21:53, Matwey V. Kornilov <matwey@sai.msu.ru> wrote:
>
> This series addresses issues with isochronous transfers while
> streaming USB webcam data. I first discovered the issue when I
> attached a PWC USB webcam to an AM335x-based BeagleBone Black SBC.
> The root cause turned out to be numerous missed IN requests
> during isochronous transfer, each of which led to a dropped
> frame. Since the MUSB driver triggers every IN request
> individually, it is important to queue the next IN request as
> early as possible once the previous IN has completed. At the same
> time, the URB giveback handler of the device driver also has to be
> called there, which leads to an arbitrary delay depending on the
> device driver performance. The details, with references, are
> described in [1].
>
> The issue has two parts:
>
>   1) peripheral driver URB callback performance
>   2) MUSB host driver performance
>
> The first part turned out to be related to the wrong memory
> allocation strategy in most USB webcam drivers. Non-cached
> memory is used on the assumption that coherent DMA memory gives
> better performance than non-coherent memory combined with
> proper synchronization. Although that assumption might have been
> valid for x86 platforms some time ago, the issue was fixed for the
> PWC driver in:
>
>     1161db6776bd ("media: usb: pwc: Don't use coherent DMA buffers for ISO transfer")
>
> which leads to a 3.5x performance gain. A more generic fix for this
> common issue is coming for the remaining drivers [2].
>
> That patch made it possible to run full-speed USB PWC webcams
> attached directly to the BeagleBone Black USB port.
>
> However, the second part of the issue is still present for a
> peripheral device attached through a high-speed USB hub, due to
> its 125us microframe time. This patch series reorganizes
> musb_advance_schedule() to allow the host to send IN requests sooner.
>
> The patch series is organized as follows. The first three patches
> improve the readability of the existing code in
> musb_advance_schedule(). Patches 4 and 5 introduce an updated
> signature for musb_start_urb(). The last patch introduces a new
> code path in musb_advance_schedule() which allows for a faster
> response.
>
> References:
>
> [1] https://www.spinics.net/lists/linux-usb/msg165735.html
> [2] https://www.spinics.net/lists/linux-media/msg144279.html
>
> Matwey V. Kornilov (6):
>   usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
>   usb: musb: Introduce musb_qh_empty() helper function
>   usb: musb: Introduce musb_qh_free() helper function
>   usb: musb: Rename musb_start_urb() to musb_start_next_urb()
>   usb: musb: Introduce musb_start_urb()
>   usb: musb: Decrease URB starting latency in musb_advance_schedule()
>
>  drivers/usb/musb/musb_host.c | 114 +++++++++++++++++++++++++++----------------
>  1 file changed, 71 insertions(+), 43 deletions(-)
>
> --
> 2.16.4
>


-- 
With best regards,
Matwey V. Kornilov


* Re: [PATCH 0/6] musb: Improve performance for hub-attached webcams
  2019-04-24 15:42 ` [PATCH 0/6] musb: Improve performance for hub-attached webcams Matwey V. Kornilov
@ 2019-04-30 15:20   ` Bin Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Bin Liu @ 2019-04-30 15:20 UTC (permalink / raw)
  To: Matwey V. Kornilov
  Cc: Greg KH, open list,
	open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER

Hi Matwey,

On Wed, Apr 24, 2019 at 06:42:30PM +0300, Matwey V. Kornilov wrote:
> Ping

Sorry for my late response. This series does improve isoch transfers for
webcams on musb. A few of my cameras used to fail to stream
640x480@30fps; now the test passes with your patches. Thank you for the
work.

But since the patch changes the way urb giveback is handled, I need
more time to run more tests.

-Bin.

> 
> On Wed, 3 Apr 2019 at 21:53, Matwey V. Kornilov <matwey@sai.msu.ru> wrote:
> >
> > This series addresses issues with isochronous transfers while
> > streaming USB webcam data. I first discovered the issue when I
> > attached a PWC USB webcam to an AM335x-based BeagleBone Black SBC.
> > The root cause turned out to be numerous missed IN requests
> > during isochronous transfer, each of which led to a dropped
> > frame. Since the MUSB driver triggers every IN request
> > individually, it is important to queue the next IN request as
> > early as possible once the previous IN has completed. At the same
> > time, the URB giveback handler of the device driver also has to be
> > called there, which leads to an arbitrary delay depending on the
> > device driver performance. The details, with references, are
> > described in [1].
> >
> > The issue has two parts:
> >
> >   1) peripheral driver URB callback performance
> >   2) MUSB host driver performance
> >
> > The first part turned out to be related to the wrong memory
> > allocation strategy in most USB webcam drivers. Non-cached
> > memory is used on the assumption that coherent DMA memory gives
> > better performance than non-coherent memory combined with
> > proper synchronization. Although that assumption might have been
> > valid for x86 platforms some time ago, the issue was fixed for the
> > PWC driver in:
> >
> >     1161db6776bd ("media: usb: pwc: Don't use coherent DMA buffers for ISO transfer")
> >
> > which leads to a 3.5x performance gain. A more generic fix for this
> > common issue is coming for the remaining drivers [2].
> >
> > That patch made it possible to run full-speed USB PWC webcams
> > attached directly to the BeagleBone Black USB port.
> >
> > However, the second part of the issue is still present for a
> > peripheral device attached through a high-speed USB hub, due to
> > its 125us microframe time. This patch series reorganizes
> > musb_advance_schedule() to allow the host to send IN requests sooner.
> >
> > The patch series is organized as follows. The first three patches
> > improve the readability of the existing code in
> > musb_advance_schedule(). Patches 4 and 5 introduce an updated
> > signature for musb_start_urb(). The last patch introduces a new
> > code path in musb_advance_schedule() which allows for a faster
> > response.
> >
> > References:
> >
> > [1] https://www.spinics.net/lists/linux-usb/msg165735.html
> > [2] https://www.spinics.net/lists/linux-media/msg144279.html
> >
> > Matwey V. Kornilov (6):
> >   usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
> >   usb: musb: Introduce musb_qh_empty() helper function
> >   usb: musb: Introduce musb_qh_free() helper function
> >   usb: musb: Rename musb_start_urb() to musb_start_next_urb()
> >   usb: musb: Introduce musb_start_urb()
> >   usb: musb: Decrease URB starting latency in musb_advance_schedule()
> >
> >  drivers/usb/musb/musb_host.c | 114 +++++++++++++++++++++++++++----------------
> >  1 file changed, 71 insertions(+), 43 deletions(-)
> >
> > --
> > 2.16.4
> >
> 
> 
> -- 
> With best regards,
> Matwey V. Kornilov


* [6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule()
@ 2019-04-30 15:31   ` Bin Liu
  2019-04-30 15:31     ` [PATCH 6/6] " Bin Liu
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Bin Liu @ 2019-04-30 15:31 UTC (permalink / raw)
  To: Matwey V. Kornilov; +Cc: gregkh, matwey.kornilov, linux-kernel, linux-usb

Hi Greg and all devs,

On Wed, Apr 03, 2019 at 09:53:10PM +0300, Matwey V. Kornilov wrote:
> Previously, the algorithm was the following:
> 
>  1. giveback current URB
>  2. if current qh is not empty
>     then start next URB
>  3. if current qh is empty
>     then dispose the qh, find next qh if any, and start URB.
> 
> Running urb->callback inside URB giveback may take a while, and musb
> runs giveback synchronously. To reduce the latency, we rearrange the
> function for the case when the qh is not empty: the next URB is started
> before the current URB is given back. When the qh is empty, the existing
> behaviour is intentionally kept so as not to break inter-qh scheduling:
> URB giveback could potentially enqueue another URB to the empty qh,
> preventing it from being disposed.

This patch changes the sequence of urb giveback in musb.

	before				after
	------				-----
1. giveback current urb			1. start next urb if qh != empty
2. start next urb if qh != empty	2. giveback current urb

I see a potential for urb giveback to happen out of order, for
example, if urb giveback is done in a BH and the next urb finishes
before the BH runs.

If this is possible, is it a problem for any class driver?

Thanks,
-Bin.
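
The before/after ordering in the table above can be modelled in
userspace as a rough sketch. It assumes a qh reduced to a counter of
queued URBs plus an event log; toy_qh, toy_advance_old() and
toy_advance_new() are hypothetical names, and both the empty-qh
handling (disposing the qh, picking the next one) and the "ready" flag
are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: 'S' = start next URB, 'G' = give back current URB. */
struct toy_qh {
	int queued;     /* URBs on the endpoint list, including the current one */
	char log[8];    /* order in which events happened, e.g. "SG" */
	size_t nlog;
};

static void toy_log(struct toy_qh *qh, char event)
{
	assert(qh->nlog + 1 < sizeof(qh->log));
	qh->log[qh->nlog++] = event;
	qh->log[qh->nlog] = '\0';
}

/* Old order: give back first, then start whatever is still queued. */
void toy_advance_old(struct toy_qh *qh)
{
	toy_log(qh, 'G');
	qh->queued--;
	if (qh->queued > 0)
		toy_log(qh, 'S');
}

/* New order: if another URB is queued, start it before the giveback. */
void toy_advance_new(struct toy_qh *qh)
{
	if (qh->queued > 1) {   /* plays the role of !musb_qh_singular(qh) */
		toy_log(qh, 'S');
		toy_log(qh, 'G');
		qh->queued--;
		return;
	}
	toy_advance_old(qh);    /* single URB: behaviour is unchanged */
}

/* Helper for checking the recorded order. */
bool toy_order_is(const struct toy_qh *qh, const char *expected)
{
	return strcmp(qh->log, expected) == 0;
}
```

In this model the giveback callback runs between 'G' and the next
event, which is why the old order delays the start of the next URB by
the callback's run time.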

> 
> Before this patch, time spent in urb->callback led to the following
> glitches between the host and a hub during isoc transfer (line 4):
> 
>     11.624492 d=  0.000124 [130.6 +  1.050] [  4] SPLIT
>     11.624492 d=  0.000000 [130.6 +  1.467] [  3] IN   : 3.5
>     11.624493 d=  0.000000 [130.6 +  1.967] [ 37] DATA0: aa 08 [skipped...]
>     11.625617 d=  0.001124 [131.7 +  1.050] [  4] SPLIT
>     11.625617 d=  0.000000 [131.7 +  1.467] [  3] IN   : 3.5
>     11.625867 d=  0.000250 [132.1 +  1.050] [  4] SPLIT
>     11.625867 d=  0.000000 [132.1 +  1.467] [  3] IN   : 3.5
>     11.625868 d=  0.000001 [132.1 +  1.983] [  3] DATA0: 00 00
>     11.626617 d=  0.000749 [132.7 +  1.050] [  4] SPLIT
>     11.626617 d=  0.000000 [132.7 +  1.467] [  3] IN   : 3.5
>     11.626867 d=  0.000250 [133.1 +  1.050] [  4] SPLIT
>     11.626867 d=  0.000000 [133.1 +  1.467] [  3] IN   : 3.5
>     11.626868 d=  0.000000 [133.1 +  1.967] [  3] DATA0: 00 00
> 
> After the hub, they look as follows and may lead to broken
> peripheral transfers (as in the case of a PWC-based webcam):
> 
>     11.332004 d=  0.000997 [ 30.0 +  3.417] [  3] IN   : 5.5
>     11.332007 d=  0.000003 [ 30.0 +  6.833] [800] DATA0: 8a 1c [skipped...]
>     11.334004 d=  0.001997 [ 32.0 +  3.417] [  3] IN   : 5.5
>     11.334007 d=  0.000003 [ 32.0 +  6.750] [  3] DATA0: 00 00
>     11.335004 d=  0.000997 [ 33   +  3.417] [  3] IN   : 5.5
>     11.335007 d=  0.000003 [ 33   +  6.750] [  3] DATA0: 00 00
> 
> Removing these glitches makes it possible to run a 10fps video
> stream from a webcam attached via a USB hub, which was previously
> impossible.
> 
> Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
> ---
>  drivers/usb/musb/musb_host.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
> index ed99ecd4e63a..75be92873b5b 100644
> --- a/drivers/usb/musb/musb_host.c
> +++ b/drivers/usb/musb/musb_host.c
> @@ -85,6 +85,11 @@ static bool musb_qh_empty(struct musb_qh *qh)
>  	return list_empty(&qh->hep->urb_list);
>  }
>  
> +static bool musb_qh_singular(struct musb_qh *qh)
> +{
> +	return list_is_singular(&qh->hep->urb_list);
> +}
> +
>  static void musb_qh_unlink_hep(struct musb_qh *qh)
>  {
>  	if (!qh->hep)
> @@ -362,6 +367,19 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
>  		break;
>  	}
>  
> +	if (ready && !musb_qh_singular(qh)) {
> +		struct urb *next_urb = list_next_entry(urb, urb_list);
> +
> +		musb_dbg(musb, "... next ep%d %cX urb %p", hw_ep->epnum, is_in ? 'R' : 'T', next_urb);
> +		musb_start_urb(musb, is_in, qh, next_urb);
> +
> +		qh->is_ready = 0;
> +		musb_giveback(musb, urb, status);
> +		qh->is_ready = ready;
> +
> +		return;
> +	}
> +
>  	qh->is_ready = 0;
>  	musb_giveback(musb, urb, status);
>  	qh->is_ready = ready;
> -- 
> 2.16.4
>


* [6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule()
@ 2019-04-30 17:29     ` Alan Stern
  2019-04-30 17:29       ` [PATCH 6/6] " Alan Stern
  0 siblings, 1 reply; 27+ messages in thread
From: Alan Stern @ 2019-04-30 17:29 UTC (permalink / raw)
  To: Bin Liu
  Cc: Matwey V. Kornilov, gregkh, matwey.kornilov, linux-kernel, linux-usb

On Tue, 30 Apr 2019, Bin Liu wrote:

> Hi Greg and all devs,
> 
> On Wed, Apr 03, 2019 at 09:53:10PM +0300, Matwey V. Kornilov wrote:
> > Previously, the algorithm was the following:
> > 
> >  1. giveback current URB
> >  2. if current qh is not empty
> >     then start next URB
> >  3. if current qh is empty
> >     then dispose the qh, find next qh if any, and start URB.
> > 
> > Running urb->callback inside URB giveback may take a while, and musb
> > runs giveback synchronously. To reduce the latency, we rearrange the
> > function for the case when the qh is not empty: the next URB is started
> > before the current URB is given back. When the qh is empty, the existing
> > behaviour is intentionally kept so as not to break inter-qh scheduling:
> > URB giveback could potentially enqueue another URB to the empty qh,
> > preventing it from being disposed.
> 
> This patch changes the sequence of urb giveback in musb.
> 
> 	before				after
> 	------				-----
> 1. giveback current urb			1. start next urb if qh != empty
> 2. start next urb if qh != empty	2. giveback current urb
> 
> I see a potential for urb giveback to happen out of order, for
> example, if urb giveback is done in a BH and the next urb finishes
> before the BH runs.
> 
> If this is possible, is it a problem for any class driver?

I don't know of any specific examples where this would be a problem.  
But it definitely goes against the guarantee that except for unlinks, 
URBs for each endpoint are always given back in order.

There's also a guarantee that when an URB has an error status, it
causes the endpoint queue to stop.  This is necessary so that the class
driver can cancel any outstanding URBs before they run and cause even
more trouble.  Your brief outline above doesn't mention this.

On the other hand, it shouldn't be hard to maintain the order even
here.  For example, you could have a FIFO list of URBs waiting to be
given back, and the BH could always give back the URB at the front of
the list.

Alan Stern
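
The FIFO idea above might look roughly like the following userspace
sketch. It is only a model under stated assumptions: a fixed-size ring
stands in for the list of URBs waiting to be given back, and toy_urb,
toy_giveback_fifo, toy_fifo_complete() and toy_fifo_giveback() are
hypothetical names, not existing kernel or musb interfaces. Locking and
the "stop the queue on error" guarantee are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FIFO_CAP 8          /* power of two, so index wrap-around is safe */

struct toy_urb {
	int id;
};

struct toy_giveback_fifo {
	struct toy_urb *slot[TOY_FIFO_CAP];
	size_t head;            /* index of the next URB to give back */
	size_t tail;            /* index of the next free slot */
};

/* Completion time: append the finished URB at the tail of the FIFO. */
int toy_fifo_complete(struct toy_giveback_fifo *f, struct toy_urb *urb)
{
	assert(urb != NULL);
	if (f->tail - f->head == TOY_FIFO_CAP)
		return -1;      /* FIFO is full */
	f->slot[f->tail++ % TOY_FIFO_CAP] = urb;
	return 0;
}

/* Bottom half: always hand back the oldest completed URB, or NULL if none. */
struct toy_urb *toy_fifo_giveback(struct toy_giveback_fifo *f)
{
	if (f->head == f->tail)
		return NULL;
	return f->slot[f->head++ % TOY_FIFO_CAP];
}
```

Because the bottom half only ever takes from the front, URBs come back
in completion order no matter how late the bottom half runs.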


* [6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule()
@ 2019-05-04  9:38     ` Matwey V. Kornilov
  2019-05-04  9:38       ` [PATCH 6/6] " Matwey V. Kornilov
  0 siblings, 1 reply; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-05-04  9:38 UTC (permalink / raw)
  To: Bin Liu, Matwey V. Kornilov, Greg KH,
	Матвей
	Корнилов,
	open list,
	open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER

On Tue, 30 Apr 2019 at 18:31, Bin Liu <b-liu@ti.com> wrote:
>
> Hi Greg and all devs,
>
> On Wed, Apr 03, 2019 at 09:53:10PM +0300, Matwey V. Kornilov wrote:
> > Previously, the algorithm was the following:
> >
> >  1. giveback current URB
> >  2. if current qh is not empty
> >     then start next URB
> >  3. if current qh is empty
> >     then dispose the qh, find next qh if any, and start URB.
> >
> > Running urb->callback inside URB giveback may take a while, and musb
> > runs giveback synchronously. To reduce the latency, we rearrange the
> > function for the case when the qh is not empty: the next URB is started
> > before the current URB is given back. When the qh is empty, the existing
> > behaviour is intentionally kept so as not to break inter-qh scheduling:
> > URB giveback could potentially enqueue another URB to the empty qh,
> > preventing it from being disposed.
>
> This patch changes the sequence of urb giveback in musb.
>
>         before                          after
>         ------                          -----
> 1. giveback current urb                 1. start next urb if qh != empty
> 2. start next urb if qh != empty        2. giveback current urb
>
> I see a potential for urb giveback to happen out of order, for
> example, if urb giveback is done in a BH and the next urb finishes
> before the BH runs.

Could you please give more details? Frankly speaking, I am not sure
that I understand the origin of the reordering issue correctly.
In the existing implementation, I see the following function call
order:

1. glue interrupt handler (for instance dsps_interrupt() in my am335x
case) holds musb->lock;
2. musb_interrupt()
3. musb_host_rx() (or *_tx())
4. musb_advance_schedule()
5. musb_giveback() releases and reacquires musb->lock around:
6. usb_hcd_giveback_urb()

So, when musb_giveback() is called inside musb_advance_schedule(), a
second instance of musb_advance_schedule() can start simultaneously
while the next interrupt is being handled on another CPU core, and we
can see two usb_hcd_giveback_urb() calls running concurrently.
Is that correct?
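
The lock handling in steps 1 to 6 above can be pictured with a small
single-threaded model. This is only a sketch: a boolean stands in for
musb->lock, and toy_musb, toy_interrupt(), toy_giveback() and
toy_callback() are hypothetical names. It shows nothing more than the
window in which the callback runs with the lock dropped.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_musb {
	bool lock_held;         /* stands in for musb->lock */
	int callbacks_unlocked; /* callbacks that ran while the lock was dropped */
};

void toy_lock(struct toy_musb *m)
{
	assert(!m->lock_held);
	m->lock_held = true;
}

void toy_unlock(struct toy_musb *m)
{
	assert(m->lock_held);
	m->lock_held = false;
}

/* Stand-in for the class driver's completion handler. */
void toy_callback(struct toy_musb *m)
{
	if (!m->lock_held)
		m->callbacks_unlocked++;
}

/* Steps 5 and 6: the lock is released around the callback, then retaken. */
void toy_giveback(struct toy_musb *m)
{
	toy_unlock(m);
	toy_callback(m);        /* another context could take the lock here */
	toy_lock(m);
}

/* Steps 1 to 4: the interrupt path holds the lock and ends in giveback. */
void toy_interrupt(struct toy_musb *m)
{
	toy_lock(m);
	toy_giveback(m);
	toy_unlock(m);
}
```

While toy_callback() runs, nothing in this model prevents a second
toy_interrupt() on another CPU from taking the lock, which is the
window the question above is about.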

>
> If this is possible, is it a problem for any class driver?
>
> Thanks,
> -Bin.
>
> >
> > Before this patch, time spent in urb->callback led to the following
> > glitches between the host and a hub during isoc transfer (line 4):
> >
> >     11.624492 d=  0.000124 [130.6 +  1.050] [  4] SPLIT
> >     11.624492 d=  0.000000 [130.6 +  1.467] [  3] IN   : 3.5
> >     11.624493 d=  0.000000 [130.6 +  1.967] [ 37] DATA0: aa 08 [skipped...]
> >     11.625617 d=  0.001124 [131.7 +  1.050] [  4] SPLIT
> >     11.625617 d=  0.000000 [131.7 +  1.467] [  3] IN   : 3.5
> >     11.625867 d=  0.000250 [132.1 +  1.050] [  4] SPLIT
> >     11.625867 d=  0.000000 [132.1 +  1.467] [  3] IN   : 3.5
> >     11.625868 d=  0.000001 [132.1 +  1.983] [  3] DATA0: 00 00
> >     11.626617 d=  0.000749 [132.7 +  1.050] [  4] SPLIT
> >     11.626617 d=  0.000000 [132.7 +  1.467] [  3] IN   : 3.5
> >     11.626867 d=  0.000250 [133.1 +  1.050] [  4] SPLIT
> >     11.626867 d=  0.000000 [133.1 +  1.467] [  3] IN   : 3.5
> >     11.626868 d=  0.000000 [133.1 +  1.967] [  3] DATA0: 00 00
> >
> > After the hub, they look as follows and may lead to broken
> > peripheral transfers (as in the case of a PWC-based webcam):
> >
> >     11.332004 d=  0.000997 [ 30.0 +  3.417] [  3] IN   : 5.5
> >     11.332007 d=  0.000003 [ 30.0 +  6.833] [800] DATA0: 8a 1c [skipped...]
> >     11.334004 d=  0.001997 [ 32.0 +  3.417] [  3] IN   : 5.5
> >     11.334007 d=  0.000003 [ 32.0 +  6.750] [  3] DATA0: 00 00
> >     11.335004 d=  0.000997 [ 33   +  3.417] [  3] IN   : 5.5
> >     11.335007 d=  0.000003 [ 33   +  6.750] [  3] DATA0: 00 00
> >
> > Removing these glitches makes it possible to run a 10fps video
> > stream from a webcam attached via a USB hub, which was previously
> > impossible.
> >
> > Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
> > ---
> >  drivers/usb/musb/musb_host.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
> > index ed99ecd4e63a..75be92873b5b 100644
> > --- a/drivers/usb/musb/musb_host.c
> > +++ b/drivers/usb/musb/musb_host.c
> > @@ -85,6 +85,11 @@ static bool musb_qh_empty(struct musb_qh *qh)
> >       return list_empty(&qh->hep->urb_list);
> >  }
> >
> > +static bool musb_qh_singular(struct musb_qh *qh)
> > +{
> > +     return list_is_singular(&qh->hep->urb_list);
> > +}
> > +
> >  static void musb_qh_unlink_hep(struct musb_qh *qh)
> >  {
> >       if (!qh->hep)
> > @@ -362,6 +367,19 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
> >               break;
> >       }
> >
> > +     if (ready && !musb_qh_singular(qh)) {
> > +             struct urb *next_urb = list_next_entry(urb, urb_list);
> > +
> > +             musb_dbg(musb, "... next ep%d %cX urb %p", hw_ep->epnum, is_in ? 'R' : 'T', next_urb);
> > +             musb_start_urb(musb, is_in, qh, next_urb);
> > +
> > +             qh->is_ready = 0;
> > +             musb_giveback(musb, urb, status);
> > +             qh->is_ready = ready;
> > +
> > +             return;
> > +     }
> > +
> >       qh->is_ready = 0;
> >       musb_giveback(musb, urb, status);
> >       qh->is_ready = ready;
> > --
> > 2.16.4
> >


* Re: [PATCH 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule()
  2019-05-04  9:38     ` [6/6] " Matwey V. Kornilov
@ 2019-05-04  9:38       ` Matwey V. Kornilov
  0 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-05-04  9:38 UTC (permalink / raw)
  To: Bin Liu, Matwey V. Kornilov, Greg KH,
	Матвей
	Корнилов,
	open list,
	open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER

вт, 30 апр. 2019 г. в 18:31, Bin Liu <b-liu@ti.com>:
>
> Hi Greg and all devs,
>
> On Wed, Apr 03, 2019 at 09:53:10PM +0300, Matwey V. Kornilov wrote:
> > Previously, the algorithm was the following:
> >
> >  1. giveback current URB
> >  2. if current qh is not empty
> >     then start next URB
> >  3. if current qh is empty
> >     then dispose the qh, find next qh if any, and start URB.
> >
> > It may take a while to run urb->callback inside URB giveback which is
> > run synchronously in musb. In order to improve the latency we rearrange
> > the function behaviour for the case when qh is not empty: next URB is
> > started before URB giveback. When qh is empty then the behaviour is
> > intentionally kept in order not to break existing inter qh scheduling:
> > URB giveback could potentially enqueue another URB to the empty qh
> > preventing it from being disposed.
>
> This patch changes the sequence of urb giveback in musb.
>
>         before                          after
>         ------                          -----
> 1. giveback current urb                 1. start next urb if qh != empty
> 2. start next urb if qh != empty        2. giveback current urb
>
> I see there is a potential that the urb giveback could be out of order,
> for example, if urb giveback in BH and the next urb finishes before BH
> runs.

Could you please give more details? Frankly speaking, I am not sure
I correctly understand where the reordering would come from.
In the existing implementation, I see the following function call
order:

1. glue interrupt handler (for instance dsps_interrupt() in my am335x
case) holds musb->lock;
2. musb_interrupt()
3. musb_host_rx() (or *_tx())
4. musb_advance_schedule()
5. musb_giveback() releases and reacquires musb->lock around:
6. usb_hcd_giveback_urb()

So, while musb_giveback() is being called inside
musb_advance_schedule(), a second instance of musb_advance_schedule()
can start simultaneously if the next interrupt is handled on another
CPU core, and we can see two usb_hcd_giveback_urb() calls running
concurrently.
Is that correct?
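
As a rough illustration of the call order listed above, here is a toy,
single-threaded model. It is not driver code: every toy_* name is made
up for this sketch, and the controller lock is modelled as a plain flag
rather than a real spinlock. It only shows that the class driver
callback runs in the window where the controller lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct musb; lock_held models musb->lock. */
struct toy_musb {
	bool lock_held;
	bool callback_saw_lock_held;
	int callbacks_run;
};

static void toy_lock(struct toy_musb *m)
{
	assert(!m->lock_held);
	m->lock_held = true;
}

static void toy_unlock(struct toy_musb *m)
{
	assert(m->lock_held);
	m->lock_held = false;
}

/* Step 6: stand-in for usb_hcd_giveback_urb() running the callback. */
static void toy_hcd_giveback_urb(struct toy_musb *m)
{
	m->callback_saw_lock_held = m->lock_held;
	m->callbacks_run++;
}

/* Step 5: the controller lock is released around the giveback. */
static void toy_giveback(struct toy_musb *m)
{
	toy_unlock(m);
	toy_hcd_giveback_urb(m);
	toy_lock(m);
}

/* Steps 1-4 collapsed: the glue interrupt handler holds the lock. */
static void toy_interrupt(struct toy_musb *m)
{
	toy_lock(m);
	toy_giveback(m);	/* reached via musb_advance_schedule() */
	toy_unlock(m);
}
```

Calling toy_interrupt() once on a zero-initialised struct toy_musb
leaves callback_saw_lock_held false, which is the window in which a
second interrupt on another core could take the lock in the real
driver.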

>
> If this potential is possible, is it a problem for any class driver?
>
> Thanks,
> -Bin.
>
> >
> > Before this patch, time spent in urb->callback led to the following
> > glitches between the host and a hub during isoc transfer (line 4):
> >
> >     11.624492 d=  0.000124 [130.6 +  1.050] [  4] SPLIT
> >     11.624492 d=  0.000000 [130.6 +  1.467] [  3] IN   : 3.5
> >     11.624493 d=  0.000000 [130.6 +  1.967] [ 37] DATA0: aa 08 [skipped...]
> >     11.625617 d=  0.001124 [131.7 +  1.050] [  4] SPLIT
> >     11.625617 d=  0.000000 [131.7 +  1.467] [  3] IN   : 3.5
> >     11.625867 d=  0.000250 [132.1 +  1.050] [  4] SPLIT
> >     11.625867 d=  0.000000 [132.1 +  1.467] [  3] IN   : 3.5
> >     11.625868 d=  0.000001 [132.1 +  1.983] [  3] DATA0: 00 00
> >     11.626617 d=  0.000749 [132.7 +  1.050] [  4] SPLIT
> >     11.626617 d=  0.000000 [132.7 +  1.467] [  3] IN   : 3.5
> >     11.626867 d=  0.000250 [133.1 +  1.050] [  4] SPLIT
> >     11.626867 d=  0.000000 [133.1 +  1.467] [  3] IN   : 3.5
> >     11.626868 d=  0.000000 [133.1 +  1.967] [  3] DATA0: 00 00
> >
> > After the hub, they look as follows and may lead to broken
> > peripheral transfers (as in the case of a PWC-based webcam):
> >
> >     11.332004 d=  0.000997 [ 30.0 +  3.417] [  3] IN   : 5.5
> >     11.332007 d=  0.000003 [ 30.0 +  6.833] [800] DATA0: 8a 1c [skipped...]
> >     11.334004 d=  0.001997 [ 32.0 +  3.417] [  3] IN   : 5.5
> >     11.334007 d=  0.000003 [ 32.0 +  6.750] [  3] DATA0: 00 00
> >     11.335004 d=  0.000997 [ 33   +  3.417] [  3] IN   : 5.5
> >     11.335007 d=  0.000003 [ 33   +  6.750] [  3] DATA0: 00 00
> >
> > Removing these glitches makes it possible to run a 10fps
> > video stream from the webcam attached via a USB hub, which was
> > previously impossible.
> >
> > Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
> > ---
> >  drivers/usb/musb/musb_host.c | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
> > index ed99ecd4e63a..75be92873b5b 100644
> > --- a/drivers/usb/musb/musb_host.c
> > +++ b/drivers/usb/musb/musb_host.c
> > @@ -85,6 +85,11 @@ static bool musb_qh_empty(struct musb_qh *qh)
> >       return list_empty(&qh->hep->urb_list);
> >  }
> >
> > +static bool musb_qh_singular(struct musb_qh *qh)
> > +{
> > +     return list_is_singular(&qh->hep->urb_list);
> > +}
> > +
> >  static void musb_qh_unlink_hep(struct musb_qh *qh)
> >  {
> >       if (!qh->hep)
> > @@ -362,6 +367,19 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
> >               break;
> >       }
> >
> > +     if (ready && !musb_qh_singular(qh)) {
> > +             struct urb *next_urb = list_next_entry(urb, urb_list);
> > +
> > +             musb_dbg(musb, "... next ep%d %cX urb %p", hw_ep->epnum, is_in ? 'R' : 'T', next_urb);
> > +             musb_start_urb(musb, is_in, qh, next_urb);
> > +
> > +             qh->is_ready = 0;
> > +             musb_giveback(musb, urb, status);
> > +             qh->is_ready = ready;
> > +
> > +             return;
> > +     }
> > +
> >       qh->is_ready = 0;
> >       musb_giveback(musb, urb, status);
> >       qh->is_ready = ready;
> > --
> > 2.16.4
> >



-- 
With best regards,
Matwey V. Kornilov.
Sternberg Astronomical Institute, Lomonosov Moscow State University, Russia
119234, Moscow, Universitetsky pr-k 13, +7 (495) 9392382


* [PATCH v2 0/6] musb: Improve performance for hub-attached webcams
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
  2019-04-03 18:53 ` [6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
  2019-04-24 15:42 ` [PATCH 0/6] musb: Improve performance for hub-attached webcams Matwey V. Kornilov
@ 2019-06-14 16:45 ` Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule() Matwey V. Kornilov
                     ` (6 more replies)
  2020-01-01 16:26 ` [PATCH RESEND " Matwey V. Kornilov
                   ` (6 subsequent siblings)
  9 siblings, 7 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-06-14 16:45 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

This series addresses issues with isochronous transfers while
streaming USB webcam data. I first discovered the issue when I
attached a PWC USB webcam to an AM335x-based BeagleBone Black SBC.
The root cause turned out to be numerous missed IN requests during
isochronous transfer, each of which led to a dropped frame. Since
every IN request is triggered individually in the MUSB driver, it is
important to queue the next IN request as early as possible after
the previous IN has completed. At the same time, the URB giveback
handler of the device driver also has to be called there, which
leads to an arbitrary delay depending on the device driver
performance. The details, with references, are described in [1].

The issue has two parts:

  1) peripheral driver URB callback performance
  2) MUSB host driver performance

The first part turned out to be related to the wrong memory
allocation strategy in most USB webcam drivers. Non-cached memory is
used on the assumption that coherent DMA memory performs better than
non-coherent memory combined with proper synchronization. Although
that assumption may have been valid for x86 platforms some time ago,
it does not hold in this case. The issue was fixed for the PWC driver in:

    1161db6776bd ("media: usb: pwc: Don't use coherent DMA buffers for ISO transfer")

which leads to a 3.5x performance gain. A more generic fix for this
common issue is coming for the rest of the drivers [2].

That patch made it possible to run full-speed USB PWC webcams
attached directly to the BeagleBone Black USB port.

However, the second part of the issue is still present for a
peripheral device attached through a high-speed USB hub, due to its
125us microframe time. This patch series reorganizes
musb_advance_schedule() to allow the host to send the IN request
sooner.

The patch series is organized as follows. The first three patches
improve readability of the existing code in
musb_advance_schedule(). Patches 4 and 5 introduce an updated
signature for musb_start_urb(). The last patch introduces a new
code path in musb_advance_schedule() which allows for a faster
response.

References:

[1] https://www.spinics.net/lists/linux-usb/msg165735.html
[2] https://www.spinics.net/lists/linux-media/msg144279.html

Changes since v1:
 - Patch 6 was redone to keep URB giveback order and stop transmission at
   erroneous URB.

Matwey V. Kornilov (6):
  usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
  usb: musb: Introduce musb_qh_empty() helper function
  usb: musb: Introduce musb_qh_free() helper function
  usb: musb: Rename musb_start_urb() to musb_start_next_urb()
  usb: musb: Introduce musb_start_urb()
  usb: musb: Decrease URB starting latency in musb_advance_schedule()

 drivers/usb/musb/musb_host.c | 132 ++++++++++++++++++++++++++++---------------
 drivers/usb/musb/musb_host.h |   1 +
 2 files changed, 86 insertions(+), 47 deletions(-)

-- 
2.16.4



* [PATCH v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
@ 2019-06-14 16:45   ` Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 2/6] usb: musb: Introduce musb_qh_empty() helper function Matwey V. Kornilov
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-06-14 16:45 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Use USB_DIR_IN instead of 1 when calling musb_advance_schedule().
This is consistent with the rest of the musb_host.c code and improves
readability.

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index eb308ec35c66..3ffea6a5e022 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -1195,7 +1195,7 @@ irqreturn_t musb_h_ep0_irq(struct musb *musb)
 
 	/* call completion handler if done */
 	if (complete)
-		musb_advance_schedule(musb, urb, hw_ep, 1);
+		musb_advance_schedule(musb, urb, hw_ep, USB_DIR_IN);
 done:
 	return retval;
 }
-- 
2.16.4



* [PATCH v2 2/6] usb: musb: Introduce musb_qh_empty() helper function
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule() Matwey V. Kornilov
@ 2019-06-14 16:45   ` Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 3/6] usb: musb: Introduce musb_qh_free() " Matwey V. Kornilov
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-06-14 16:45 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Use musb_qh_empty() instead of open-coding
list_empty(&qh->hep->urb_list) to avoid code duplication.

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 3ffea6a5e022..37aa9f6155d9 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -80,6 +80,11 @@ static void musb_ep_program(struct musb *musb, u8 epnum,
 			struct urb *urb, int is_out,
 			u8 *buf, u32 offset, u32 len);
 
+static bool musb_qh_empty(struct musb_qh *qh)
+{
+	return list_empty(&qh->hep->urb_list);
+}
+
 /*
  * Clear TX fifo. Needed to avoid BABBLE errors.
  */
@@ -342,7 +347,7 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 	/* reclaim resources (and bandwidth) ASAP; deschedule it, and
 	 * invalidate qh as soon as list_empty(&hep->urb_list)
 	 */
-	if (list_empty(&qh->hep->urb_list)) {
+	if (musb_qh_empty(qh)) {
 		struct list_head	*head;
 		struct dma_controller	*dma = musb->dma_controller;
 
@@ -2430,7 +2435,7 @@ static int musb_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 		/* If nothing else (usually musb_giveback) is using it
 		 * and its URB list has emptied, recycle this qh.
 		 */
-		if (ready && list_empty(&qh->hep->urb_list)) {
+		if (ready && musb_qh_empty(qh)) {
 			qh->hep->hcpriv = NULL;
 			list_del(&qh->ring);
 			kfree(qh);
@@ -2475,7 +2480,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		/* Then nuke all the others ... and advance the
 		 * queue on hw_ep (e.g. bulk ring) when we're done.
 		 */
-		while (!list_empty(&hep->urb_list)) {
+		while (!musb_qh_empty(qh)) {
 			urb = next_urb(qh);
 			urb->status = -ESHUTDOWN;
 			musb_advance_schedule(musb, urb, qh->hw_ep, is_in);
@@ -2485,7 +2490,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		 * other transfers, and since !qh->is_ready nothing
 		 * will activate any of these as it advances.
 		 */
-		while (!list_empty(&hep->urb_list))
+		while (!musb_qh_empty(qh))
 			musb_giveback(musb, next_urb(qh), -ESHUTDOWN);
 
 		hep->hcpriv = NULL;
-- 
2.16.4



* [PATCH v2 3/6] usb: musb: Introduce musb_qh_free() helper function
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule() Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 2/6] usb: musb: Introduce musb_qh_empty() helper function Matwey V. Kornilov
@ 2019-06-14 16:45   ` Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb() Matwey V. Kornilov
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-06-14 16:45 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Replace the following repeated snippet with a call to musb_qh_free().

    qh->hep->hcpriv = NULL;
    list_del(&qh->ring);
    kfree(qh);

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 66 +++++++++++++++++++++-----------------------
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 37aa9f6155d9..5d23c950a21b 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -85,6 +85,21 @@ static bool musb_qh_empty(struct musb_qh *qh)
 	return list_empty(&qh->hep->urb_list);
 }
 
+static void musb_qh_unlink_hep(struct musb_qh *qh)
+{
+	if (!qh->hep)
+		return;
+
+	qh->hep->hcpriv = NULL;
+}
+
+static void musb_qh_free(struct musb_qh *qh)
+{
+	musb_qh_unlink_hep(qh);
+	list_del(&qh->ring);
+	kfree(qh);
+}
+
 /*
  * Clear TX fifo. Needed to avoid BABBLE errors.
  */
@@ -348,7 +363,7 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 	 * invalidate qh as soon as list_empty(&hep->urb_list)
 	 */
 	if (musb_qh_empty(qh)) {
-		struct list_head	*head;
+		struct list_head	*head = NULL;
 		struct dma_controller	*dma = musb->dma_controller;
 
 		if (is_in) {
@@ -367,34 +382,22 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 
 		/* Clobber old pointers to this qh */
 		musb_ep_set_qh(ep, is_in, NULL);
-		qh->hep->hcpriv = NULL;
 
-		switch (qh->type) {
+		/* USB_ENDPOINT_XFER_CONTROL and USB_ENDPOINT_XFER_BULK: fifo
+		 * policy for these lists, except that NAKing should rotate
+		 * a qh to the end (for fairness).
+		 * USB_ENDPOINT_XFER_ISOC and USB_ENDPOINT_XFER_INT: this is
+		 * where periodic bandwidth should be de-allocated if it's
+		 * tracked and allocated; and where we'd update the schedule
+		 * tree...
+		 */
+		if (qh->mux == 1
+		    && (qh->type == USB_ENDPOINT_XFER_CONTROL || qh->type == USB_ENDPOINT_XFER_BULK))
+			head = qh->ring.prev;
 
-		case USB_ENDPOINT_XFER_CONTROL:
-		case USB_ENDPOINT_XFER_BULK:
-			/* fifo policy for these lists, except that NAKing
-			 * should rotate a qh to the end (for fairness).
-			 */
-			if (qh->mux == 1) {
-				head = qh->ring.prev;
-				list_del(&qh->ring);
-				kfree(qh);
-				qh = first_qh(head);
-				break;
-			}
-			/* fall through */
+		musb_qh_free(qh);
 
-		case USB_ENDPOINT_XFER_ISOC:
-		case USB_ENDPOINT_XFER_INT:
-			/* this is where periodic bandwidth should be
-			 * de-allocated if it's tracked and allocated;
-			 * and where we'd update the schedule tree...
-			 */
-			kfree(qh);
-			qh = NULL;
-			break;
-		}
+		qh = head ? first_qh(head) : NULL;
 	}
 
 	if (qh != NULL && qh->is_ready) {
@@ -2435,11 +2438,8 @@ static int musb_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 		/* If nothing else (usually musb_giveback) is using it
 		 * and its URB list has emptied, recycle this qh.
 		 */
-		if (ready && musb_qh_empty(qh)) {
-			qh->hep->hcpriv = NULL;
-			list_del(&qh->ring);
-			kfree(qh);
-		}
+		if (ready && musb_qh_empty(qh))
+			musb_qh_free(qh);
 	} else
 		ret = musb_cleanup_urb(urb, qh);
 done:
@@ -2493,9 +2493,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		while (!musb_qh_empty(qh))
 			musb_giveback(musb, next_urb(qh), -ESHUTDOWN);
 
-		hep->hcpriv = NULL;
-		list_del(&qh->ring);
-		kfree(qh);
+		musb_qh_free(qh);
 	}
 exit:
 	spin_unlock_irqrestore(&musb->lock, flags);
-- 
2.16.4



* [PATCH v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb()
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
                     ` (2 preceding siblings ...)
  2019-06-14 16:45   ` [PATCH v2 3/6] usb: musb: Introduce musb_qh_free() " Matwey V. Kornilov
@ 2019-06-14 16:45   ` Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 5/6] usb: musb: Introduce musb_start_urb() Matwey V. Kornilov
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-06-14 16:45 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

In the following commit we introduce a new musb_start_urb() function
which will be able to start an arbitrary URB. In order to keep the
function names intuitive, we rename the existing musb_start_urb() to
musb_start_next_urb().

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 5d23c950a21b..3a202a2a521d 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -213,7 +213,7 @@ static struct musb_qh *musb_ep_get_qh(struct musb_hw_ep *ep, int is_in)
  * Context: controller locked, irqs blocked
  */
 static void
-musb_start_urb(struct musb *musb, int is_in, struct musb_qh *qh)
+musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
 {
 	u32			len;
 	void __iomem		*mbase =  musb->mregs;
@@ -403,7 +403,7 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 	if (qh != NULL && qh->is_ready) {
 		musb_dbg(musb, "... next ep%d %cX urb %p",
 		    hw_ep->epnum, is_in ? 'R' : 'T', next_urb(qh));
-		musb_start_urb(musb, is_in, qh);
+		musb_start_next_urb(musb, is_in, qh);
 	}
 }
 
@@ -1001,7 +1001,7 @@ static void musb_bulk_nak_timeout(struct musb *musb, struct musb_hw_ep *ep,
 		}
 
 		if (next_qh)
-			musb_start_urb(musb, is_in, next_qh);
+			musb_start_next_urb(musb, is_in, next_qh);
 	}
 }
 
@@ -2141,7 +2141,7 @@ static int musb_schedule(
 	qh->hw_ep = hw_ep;
 	qh->hep->hcpriv = qh;
 	if (idle)
-		musb_start_urb(musb, is_in, qh);
+		musb_start_next_urb(musb, is_in, qh);
 	return 0;
 }
 
-- 
2.16.4



* [PATCH v2 5/6] usb: musb: Introduce musb_start_urb()
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
                     ` (3 preceding siblings ...)
  2019-06-14 16:45   ` [PATCH v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb() Matwey V. Kornilov
@ 2019-06-14 16:45   ` Matwey V. Kornilov
  2019-06-14 16:45   ` [PATCH v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
  2019-07-02 17:29   ` [PATCH v2 0/6] musb: Improve performance for hub-attached webcams Matwey V. Kornilov
  6 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-06-14 16:45 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

This function allows us to start an arbitrary URB.

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 3a202a2a521d..ed99ecd4e63a 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -213,11 +213,10 @@ static struct musb_qh *musb_ep_get_qh(struct musb_hw_ep *ep, int is_in)
  * Context: controller locked, irqs blocked
  */
 static void
-musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
+musb_start_urb(struct musb *musb, int is_in, struct musb_qh *qh, struct urb *urb)
 {
 	u32			len;
 	void __iomem		*mbase =  musb->mregs;
-	struct urb		*urb = next_urb(qh);
 	void			*buf = urb->transfer_buffer;
 	u32			offset = 0;
 	struct musb_hw_ep	*hw_ep = qh->hw_ep;
@@ -293,6 +292,14 @@ musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
 	}
 }
 
+static void
+musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
+{
+	struct urb *urb = next_urb(qh);
+
+	musb_start_urb(musb, is_in, qh, urb);
+}
+
 /* Context: caller owns controller lock, IRQs are blocked */
 static void musb_giveback(struct musb *musb, struct urb *urb, int status)
 __releases(musb->lock)
-- 
2.16.4



* [PATCH v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule()
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
                     ` (4 preceding siblings ...)
  2019-06-14 16:45   ` [PATCH v2 5/6] usb: musb: Introduce musb_start_urb() Matwey V. Kornilov
@ 2019-06-14 16:45   ` Matwey V. Kornilov
  2019-07-02 17:29   ` [PATCH v2 0/6] musb: Improve performance for hub-attached webcams Matwey V. Kornilov
  6 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-06-14 16:45 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Previously, the algorithm was the following:

 1. giveback current URB
 2. if current qh is not empty
    then start next URB
 3. if current qh is empty
    then dispose the qh, find next qh if any, and start URB.

It may take a while to run urb->callback inside URB giveback, which
runs synchronously in musb. To improve the latency, we rearrange the
function behaviour for the case when the qh is not empty: the next URB
is started before URB giveback. When the qh is empty or the current URB
has an error, the existing behaviour is intentionally kept in order
  a) not to break existing inter-qh scheduling: URB giveback could
potentially enqueue another URB to the empty qh, preventing it from
being disposed;
  b) to allow the class driver to cancel outstanding URBs in the queue.
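
The difference between the two orders can be sketched with a toy model.
This is not the driver code: struct toy_qh and the toy_* helpers are
hypothetical, the log letters are invented for the sketch, and qh
disposal for an empty queue is left out entirely. The fast-path
condition mirrors the one in the diff below (ready, more than one URB
queued, no error status).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_LOG_MAX 16

struct toy_qh {
	int pending;		/* queued URBs, including the current one */
	char log[TOY_LOG_MAX];	/* 'S' = next URB started, 'G' = giveback */
	size_t n;
};

static void toy_log(struct toy_qh *qh, char ev)
{
	if (qh->n + 1 < TOY_LOG_MAX) {
		qh->log[qh->n++] = ev;
		qh->log[qh->n] = '\0';
	}
}

/* Old order: give back the current URB, then start the next one. */
static void toy_advance_old(struct toy_qh *qh)
{
	toy_log(qh, 'G');
	qh->pending--;
	if (qh->pending > 0)
		toy_log(qh, 'S');
}

/* New order: start the next URB first when the fast path applies. */
static void toy_advance_new(struct toy_qh *qh, int ready, int status)
{
	if (ready && qh->pending > 1 && status == 0) {
		toy_log(qh, 'S');
		toy_log(qh, 'G');
		qh->pending--;
		return;
	}
	toy_advance_old(qh);	/* last URB or error: keep the old order */
}
```

With two URBs queued and no error, the old order logs "GS" and the new
one logs "SG"; with a single URB or a non-zero status the new function
falls back to the old behaviour.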

Correct URB giveback order is guaranteed as follows. For each qh
there can be at most three ready URBs being processed by the driver.
Indeed, every ready URB can start at most one URB in
musb_advance_schedule(), and in the worst case we have the
following ready URBs:
  1) a URB in the giveback-lock-protected section inside musb_giveback()
  2) a URB waiting to acquire the giveback lock in musb_giveback()
  3) a URB waiting to acquire the controller lock in the glue layer
     interrupt handler
Here URB #2 and URB #3 were triggered by URB #1 and URB #2
respectively when those passed through musb_advance_schedule().
Since URB #3 is waiting before musb_advance_schedule(), no new URBs
will be sent until URB #1 is finished, URB #2 enters the
giveback-lock-protected section, and URB #3 enters the
controller-lock-protected musb_advance_schedule().
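
The lock hand-over inside musb_giveback() can be sketched as a
single-threaded trace. This is only a model under stated assumptions:
struct toy_locks and the toy_* helpers are hypothetical, both locks are
plain flags, and the trace tokens are invented for the sketch. The
point is that the giveback lock is taken while the controller lock is
still held, so the order in which URBs reach the callback follows the
order in which they passed through the controller lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct toy_locks {
	bool controller;	/* models musb->lock */
	bool giveback;		/* models qh->giveback_lock */
	char trace[32];
	size_t n;
};

static void toy_ev(struct toy_locks *l, const char *ev)
{
	size_t len = strlen(ev);

	if (l->n + len < sizeof(l->trace)) {
		memcpy(l->trace + l->n, ev, len + 1);
		l->n += len;
	}
}

/* Caller holds the controller lock, as in the patched musb_giveback(). */
static void toy_giveback(struct toy_locks *l)
{
	assert(l->controller && !l->giveback);

	l->giveback = true;	/* taken under the controller lock */
	toy_ev(l, "G+");
	l->controller = false;	/* only now is the controller lock dropped */
	toy_ev(l, "C-");

	assert(l->giveback && !l->controller);
	toy_ev(l, "cb");	/* class driver callback runs here */

	l->giveback = false;
	toy_ev(l, "G-");
	l->controller = true;
	toy_ev(l, "C+");
}
```

Starting with only the controller flag set, one call leaves the trace
"G+C-cbG-C+": the callback sits strictly between taking and releasing
the giveback lock, and the controller lock is dropped only after the
giveback lock is held.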

Before this patch, time spent in urb->callback led to the following
glitches between the host and a hub during isoc transfer (line 4):

    11.624492 d=  0.000124 [130.6 +  1.050] [  4] SPLIT
    11.624492 d=  0.000000 [130.6 +  1.467] [  3] IN   : 3.5
    11.624493 d=  0.000000 [130.6 +  1.967] [ 37] DATA0: aa 08 [skipped...]
    11.625617 d=  0.001124 [131.7 +  1.050] [  4] SPLIT
    11.625617 d=  0.000000 [131.7 +  1.467] [  3] IN   : 3.5
    11.625867 d=  0.000250 [132.1 +  1.050] [  4] SPLIT
    11.625867 d=  0.000000 [132.1 +  1.467] [  3] IN   : 3.5
    11.625868 d=  0.000001 [132.1 +  1.983] [  3] DATA0: 00 00
    11.626617 d=  0.000749 [132.7 +  1.050] [  4] SPLIT
    11.626617 d=  0.000000 [132.7 +  1.467] [  3] IN   : 3.5
    11.626867 d=  0.000250 [133.1 +  1.050] [  4] SPLIT
    11.626867 d=  0.000000 [133.1 +  1.467] [  3] IN   : 3.5
    11.626868 d=  0.000000 [133.1 +  1.967] [  3] DATA0: 00 00

After the hub, they look as follows and may lead to broken
peripheral transfers (as in the case of a PWC-based webcam):

    11.332004 d=  0.000997 [ 30.0 +  3.417] [  3] IN   : 5.5
    11.332007 d=  0.000003 [ 30.0 +  6.833] [800] DATA0: 8a 1c [skipped...]
    11.334004 d=  0.001997 [ 32.0 +  3.417] [  3] IN   : 5.5
    11.334007 d=  0.000003 [ 32.0 +  6.750] [  3] DATA0: 00 00
    11.335004 d=  0.000997 [ 33   +  3.417] [  3] IN   : 5.5
    11.335007 d=  0.000003 [ 33   +  6.750] [  3] DATA0: 00 00

Removing these glitches makes it possible to run a 10fps
video stream from the webcam attached via a USB hub, which was
previously impossible.

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 36 ++++++++++++++++++++++++++++++++----
 drivers/usb/musb/musb_host.h |  1 +
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index ed99ecd4e63a..5c43996f2de5 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -85,6 +85,11 @@ static bool musb_qh_empty(struct musb_qh *qh)
 	return list_empty(&qh->hep->urb_list);
 }
 
+static bool musb_qh_singular(struct musb_qh *qh)
+{
+	return list_is_singular(&qh->hep->urb_list);
+}
+
 static void musb_qh_unlink_hep(struct musb_qh *qh)
 {
 	if (!qh->hep)
@@ -301,15 +306,24 @@ musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
 }
 
 /* Context: caller owns controller lock, IRQs are blocked */
-static void musb_giveback(struct musb *musb, struct urb *urb, int status)
+static void musb_giveback(struct musb *musb, struct musb_qh *qh, struct urb *urb, int status)
 __releases(musb->lock)
 __acquires(musb->lock)
 {
 	trace_musb_urb_gb(musb, urb);
 
+	/*
+	 * This line is protected by the controller lock: at most
+	 * one thread waiting on the giveback lock.
+	 */
+	spin_lock(&qh->giveback_lock);
 	usb_hcd_unlink_urb_from_ep(musb->hcd, urb);
+
 	spin_unlock(&musb->lock);
+
 	usb_hcd_giveback_urb(musb->hcd, urb, status);
+	spin_unlock(&qh->giveback_lock);
+
 	spin_lock(&musb->lock);
 }
 
@@ -362,8 +376,21 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 		break;
 	}
 
+	if (ready && !musb_qh_singular(qh) && !status) {
+		struct urb *next_urb = list_next_entry(urb, urb_list);
+
+		musb_dbg(musb, "... next ep%d %cX urb %p", hw_ep->epnum, is_in ? 'R' : 'T', next_urb);
+		musb_start_urb(musb, is_in, qh, next_urb);
+
+		qh->is_ready = 0;
+		musb_giveback(musb, qh, urb, status);
+		qh->is_ready = ready;
+
+		return;
+	}
+
 	qh->is_ready = 0;
-	musb_giveback(musb, urb, status);
+	musb_giveback(musb, qh, urb, status);
 	qh->is_ready = ready;
 
 	/* reclaim resources (and bandwidth) ASAP; deschedule it, and
@@ -2207,6 +2234,7 @@ static int musb_urb_enqueue(
 	qh->hep = hep;
 	qh->dev = urb->dev;
 	INIT_LIST_HEAD(&qh->ring);
+	spin_lock_init(&qh->giveback_lock);
 	qh->is_ready = 1;
 
 	qh->maxpacket = usb_endpoint_maxp(epd);
@@ -2439,7 +2467,7 @@ static int musb_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 		int	ready = qh->is_ready;
 
 		qh->is_ready = 0;
-		musb_giveback(musb, urb, 0);
+		musb_giveback(musb, qh, urb, 0);
 		qh->is_ready = ready;
 
 		/* If nothing else (usually musb_giveback) is using it
@@ -2498,7 +2526,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		 * will activate any of these as it advances.
 		 */
 		while (!musb_qh_empty(qh))
-			musb_giveback(musb, next_urb(qh), -ESHUTDOWN);
+			musb_giveback(musb, qh, next_urb(qh), -ESHUTDOWN);
 
 		musb_qh_free(qh);
 	}
diff --git a/drivers/usb/musb/musb_host.h b/drivers/usb/musb/musb_host.h
index 2999845632ce..6223b0177c68 100644
--- a/drivers/usb/musb/musb_host.h
+++ b/drivers/usb/musb/musb_host.h
@@ -19,6 +19,7 @@ struct musb_qh {
 	struct musb_hw_ep	*hw_ep;		/* current binding */
 
 	struct list_head	ring;		/* of musb_qh */
+	spinlock_t		giveback_lock;	/* to keep URB giveback order */
 	/* struct musb_qh		*next; */	/* for periodic tree */
 	u8			mux;		/* qh multiplexed to hw_ep */
 
-- 
2.16.4



* Re: [PATCH v2 0/6] musb: Improve performance for hub-attached webcams
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
                     ` (5 preceding siblings ...)
  2019-06-14 16:45   ` [PATCH v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
@ 2019-07-02 17:29   ` Matwey V. Kornilov
  2019-07-02 17:33     ` Bin Liu
  6 siblings, 1 reply; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-07-02 17:29 UTC (permalink / raw)
  To: Bin Liu, Greg KH, Alan Stern
  Cc: open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER, open list

Ping?

On Fri, 14 Jun 2019 at 19:47, Matwey V. Kornilov <matwey@sai.msu.ru> wrote:
>
> This series addresses issues with isochronous transfers while
> streaming USB webcam data. I first discovered the issue when I
> attached a PWC USB webcam to an AM335x-based BeagleBone Black SBC.
> The root cause turned out to be numerous missed IN requests during
> isochronous transfer, each of which led to a dropped frame. Since
> every IN request is triggered individually in the MUSB driver, it is
> important to queue the next IN request as early as possible after
> the previous IN has completed. At the same time, the URB giveback
> handler of the device driver also has to be called there, which
> leads to an arbitrary delay depending on the device driver
> performance. The details, with references, are described in [1].
>
> The issue has two parts:
>
>   1) peripheral driver URB callback performance
>   2) MUSB host driver performance
>
> The first part turned out to be related to the wrong memory
> allocation strategy in most USB webcam drivers. Non-cached memory is
> used on the assumption that coherent DMA memory performs better than
> non-coherent memory combined with proper synchronization. Although
> that assumption may have been valid for x86 platforms some time ago,
> it does not hold in this case. The issue was fixed for the PWC driver in:
>
>     1161db6776bd ("media: usb: pwc: Don't use coherent DMA buffers for ISO transfer")
>
> which leads to a 3.5x performance gain. A more generic fix for this
> common issue is coming for the rest of the drivers [2].
>
> That patch made it possible to run full-speed USB PWC webcams
> attached directly to the BeagleBone Black USB port.
>
> However, the second part of the issue is still present for a
> peripheral device attached through a high-speed USB hub, due to its
> 125us microframe time. This patch series reorganizes
> musb_advance_schedule() to allow the host to send the IN request
> sooner.
>
> The patch series is organized as follows. The first three patches
> improve readability of the existing code in
> musb_advance_schedule(). Patches 4 and 5 introduce an updated
> signature for musb_start_urb(). The last patch introduces a new
> code path in musb_advance_schedule() which allows for a faster
> response.
>
> References:
>
> [1] https://www.spinics.net/lists/linux-usb/msg165735.html
> [2] https://www.spinics.net/lists/linux-media/msg144279.html
>
> Changes since v1:
>  - Patch 6 was redone to keep URB giveback order and stop transmission at
>    erroneous URB.
>
> Matwey V. Kornilov (6):
>   usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
>   usb: musb: Introduce musb_qh_empty() helper function
>   usb: musb: Introduce musb_qh_free() helper function
>   usb: musb: Rename musb_start_urb() to musb_start_next_urb()
>   usb: musb: Introduce musb_start_urb()
>   usb: musb: Decrease URB starting latency in musb_advance_schedule()
>
>  drivers/usb/musb/musb_host.c | 132 ++++++++++++++++++++++++++++---------------
>  drivers/usb/musb/musb_host.h |   1 +
>  2 files changed, 86 insertions(+), 47 deletions(-)
>
> --
> 2.16.4
>


-- 
With best regards,
Matwey V. Kornilov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/6] musb: Improve performance for hub-attached webcams
  2019-07-02 17:29   ` [PATCH v2 0/6] musb: Improve performance for hub-attached webcams Matwey V. Kornilov
@ 2019-07-02 17:33     ` Bin Liu
  2019-09-09 16:33       ` Matwey V. Kornilov
  0 siblings, 1 reply; 27+ messages in thread
From: Bin Liu @ 2019-07-02 17:33 UTC (permalink / raw)
  To: Matwey V. Kornilov
  Cc: Greg KH, Alan Stern,
	open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER,
	open list

Matwey,

On Tue, Jul 02, 2019 at 08:29:03PM +0300, Matwey V. Kornilov wrote:
> Ping?

I was offline and just got back. I will review it soon. Sorry for the
delay.

-Bin.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/6] musb: Improve performance for hub-attached webcams
  2019-07-02 17:33     ` Bin Liu
@ 2019-09-09 16:33       ` Matwey V. Kornilov
  2019-10-23  8:12         ` Matwey V. Kornilov
  0 siblings, 1 reply; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-09-09 16:33 UTC (permalink / raw)
  To: Bin Liu
  Cc: Greg KH, Alan Stern,
	open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER,
	open list

On Tue, 2 Jul 2019 at 20:33, Bin Liu <b-liu@ti.com> wrote:
>
> Matwey,
>
> On Tue, Jul 02, 2019 at 08:29:03PM +0300, Matwey V. Kornilov wrote:
> > Ping?
>
> I was offline and just got back. I will review it soon. Sorry for the
> delay.

Ping?

>
> -Bin.



-- 
With best regards,
Matwey V. Kornilov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/6] musb: Improve performance for hub-attached webcams
  2019-09-09 16:33       ` Matwey V. Kornilov
@ 2019-10-23  8:12         ` Matwey V. Kornilov
  0 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2019-10-23  8:12 UTC (permalink / raw)
  To: Bin Liu
  Cc: Greg KH, Alan Stern,
	open list:MUSB MULTIPOINT HIGH SPEED DUAL-ROLE CONTROLLER,
	open list

On Mon, 9 Sep 2019 at 19:33, Matwey V. Kornilov <matwey.kornilov@gmail.com> wrote:
>
> On Tue, 2 Jul 2019 at 20:33, Bin Liu <b-liu@ti.com> wrote:
> >
> > Matwey,
> >
> > On Tue, Jul 02, 2019 at 08:29:03PM +0300, Matwey V. Kornilov wrote:
> > > Ping?
> >
> > I was offline and just got back. I will review it soon. Sorry for the
> > delay.
>
> Ping?
>

Ping?

> >
> > -Bin.



-- 
With best regards,
Matwey V. Kornilov

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH RESEND v2 0/6] musb: Improve performance for hub-attached webcams
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
                   ` (2 preceding siblings ...)
  2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
@ 2020-01-01 16:26 ` Matwey V. Kornilov
  2020-01-01 16:26 ` [PATCH RESEND v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule() Matwey V. Kornilov
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2020-01-01 16:26 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

This series addresses issues with isochronous transfers while
streaming USB webcam data. I first discovered the issue when I
attached a PWC USB webcam to an AM335x-based BeagleBone Black SBC.
It turned out that the root cause was numerous missed IN requests
during isochronous transfer, each of which led to a frame drop.
Since the MUSB driver triggers every IN request individually, it is
important to queue the next IN request as early as possible once
the previous IN has completed. At the same time, the URB giveback
handler of the device driver also has to be called there, which
leads to an arbitrary delay depending on the device driver
performance. The details, with references, are described in [1].

The issue has two parts:

  1) peripheral driver URB callback performance
  2) MUSB host driver performance

It turned out that the first part is related to the wrong memory
allocation strategy in most USB webcam drivers. Non-cached memory
is used on the assumption that coherent DMA memory gives better
performance than non-coherent memory combined with proper
synchronization. Although this assumption might have been valid for
x86 platforms some time ago, the issue was fixed for the PWC driver
in:

    1161db6776bd ("media: usb: pwc: Don't use coherent DMA buffers for ISO transfer")

which leads to a 3.5x performance gain. A more generic fix for this
common issue is coming for the rest of the drivers [2].

That patch made it possible to run full-speed USB PWC webcams
attached directly to the BeagleBone Black USB port.

However, the second part of the issue is still present for a
peripheral device attached through a high-speed USB hub due to its
125us microframe time. This patch series reorganizes
musb_advance_schedule() to allow the host to send IN requests
sooner.

The patch series is organized as follows. The first three patches
improve the readability of the existing code in
musb_advance_schedule(). Patches 4 and 5 introduce an updated
signature for musb_start_urb(). The last patch introduces a new
code path in musb_advance_schedule() which allows for a faster
response.

References:

[1] https://www.spinics.net/lists/linux-usb/msg165735.html
[2] https://www.spinics.net/lists/linux-media/msg144279.html

Changes since v1:
 - Patch 6 was redone to keep URB giveback order and stop transmission at
   erroneous URB.

Matwey V. Kornilov (6):
  usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
  usb: musb: Introduce musb_qh_empty() helper function
  usb: musb: Introduce musb_qh_free() helper function
  usb: musb: Rename musb_start_urb() to musb_start_next_urb()
  usb: musb: Introduce musb_start_urb()
  usb: musb: Decrease URB starting latency in musb_advance_schedule()

 drivers/usb/musb/musb_host.c | 132 ++++++++++++++++++++++++++++---------------
 drivers/usb/musb/musb_host.h |   1 +
 2 files changed, 86 insertions(+), 47 deletions(-)

-- 
2.16.4


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH RESEND v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
                   ` (3 preceding siblings ...)
  2020-01-01 16:26 ` [PATCH RESEND " Matwey V. Kornilov
@ 2020-01-01 16:26 ` Matwey V. Kornilov
  2020-01-01 16:26 ` [PATCH RESEND v2 2/6] usb: musb: Introduce musb_qh_empty() helper function Matwey V. Kornilov
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2020-01-01 16:26 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Use USB_DIR_IN instead of 1 when calling musb_advance_schedule().
This is consistent with the rest of the musb_host.c code and improves
readability.

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index eb308ec35c66..3ffea6a5e022 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -1195,7 +1195,7 @@ irqreturn_t musb_h_ep0_irq(struct musb *musb)
 
 	/* call completion handler if done */
 	if (complete)
-		musb_advance_schedule(musb, urb, hw_ep, 1);
+		musb_advance_schedule(musb, urb, hw_ep, USB_DIR_IN);
 done:
 	return retval;
 }
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RESEND v2 2/6] usb: musb: Introduce musb_qh_empty() helper function
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
                   ` (4 preceding siblings ...)
  2020-01-01 16:26 ` [PATCH RESEND v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule() Matwey V. Kornilov
@ 2020-01-01 16:26 ` Matwey V. Kornilov
  2020-01-01 16:26 ` [PATCH RESEND v2 3/6] usb: musb: Introduce musb_qh_free() " Matwey V. Kornilov
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2020-01-01 16:26 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Use musb_qh_empty() instead of the open-coded
list_empty(&qh->hep->urb_list) check to avoid code duplication.
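
For readers without the kernel tree at hand, the helper can be
illustrated with a small userspace sketch. The list_head stand-in and
the names host_endpoint, qh_sketch and qh_sketch_empty() are
hypothetical and exist only for this illustration; the real helper is
the musb_qh_empty() shown in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular list_head. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical, heavily reduced versions of the driver structures. */
struct host_endpoint {
	struct list_head urb_list;	/* URBs queued on this endpoint */
};

struct qh_sketch {
	struct host_endpoint *hep;
};

/* One named predicate instead of repeating the list_empty() call. */
bool qh_sketch_empty(const struct qh_sketch *qh)
{
	assert(qh != NULL && qh->hep != NULL);
	return list_empty(&qh->hep->urb_list);
}
```

Callers then read as "if (qh_sketch_empty(qh))" at every site, which is
the same substitution the patch performs on the four call sites.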

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 3ffea6a5e022..37aa9f6155d9 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -80,6 +80,11 @@ static void musb_ep_program(struct musb *musb, u8 epnum,
 			struct urb *urb, int is_out,
 			u8 *buf, u32 offset, u32 len);
 
+static bool musb_qh_empty(struct musb_qh *qh)
+{
+	return list_empty(&qh->hep->urb_list);
+}
+
 /*
  * Clear TX fifo. Needed to avoid BABBLE errors.
  */
@@ -342,7 +347,7 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 	/* reclaim resources (and bandwidth) ASAP; deschedule it, and
 	 * invalidate qh as soon as list_empty(&hep->urb_list)
 	 */
-	if (list_empty(&qh->hep->urb_list)) {
+	if (musb_qh_empty(qh)) {
 		struct list_head	*head;
 		struct dma_controller	*dma = musb->dma_controller;
 
@@ -2430,7 +2435,7 @@ static int musb_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 		/* If nothing else (usually musb_giveback) is using it
 		 * and its URB list has emptied, recycle this qh.
 		 */
-		if (ready && list_empty(&qh->hep->urb_list)) {
+		if (ready && musb_qh_empty(qh)) {
 			qh->hep->hcpriv = NULL;
 			list_del(&qh->ring);
 			kfree(qh);
@@ -2475,7 +2480,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		/* Then nuke all the others ... and advance the
 		 * queue on hw_ep (e.g. bulk ring) when we're done.
 		 */
-		while (!list_empty(&hep->urb_list)) {
+		while (!musb_qh_empty(qh)) {
 			urb = next_urb(qh);
 			urb->status = -ESHUTDOWN;
 			musb_advance_schedule(musb, urb, qh->hw_ep, is_in);
@@ -2485,7 +2490,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		 * other transfers, and since !qh->is_ready nothing
 		 * will activate any of these as it advances.
 		 */
-		while (!list_empty(&hep->urb_list))
+		while (!musb_qh_empty(qh))
 			musb_giveback(musb, next_urb(qh), -ESHUTDOWN);
 
 		hep->hcpriv = NULL;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RESEND v2 3/6] usb: musb: Introduce musb_qh_free() helper function
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
                   ` (5 preceding siblings ...)
  2020-01-01 16:26 ` [PATCH RESEND v2 2/6] usb: musb: Introduce musb_qh_empty() helper function Matwey V. Kornilov
@ 2020-01-01 16:26 ` Matwey V. Kornilov
  2020-01-01 16:26 ` [PATCH RESEND v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb() Matwey V. Kornilov
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2020-01-01 16:26 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Replace the following repeated snippet with a call to musb_qh_free():

    qh->hep->hcpriv = NULL;
    list_del(&qh->ring);
    kfree(qh);
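
The three steps above can be sketched in userspace C as follows. This
is only an illustration under assumptions: malloc()/free() stand in for
the kernel allocator, the list helpers are simplified re-implementations,
and host_endpoint, qh_sketch, qh_sketch_alloc() and qh_sketch_free() are
hypothetical names, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal userspace stand-in for the kernel's circular list_head. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Unlink and re-initialise; safe even if the item is not on any list. */
void list_del(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	init_list_head(item);
}

bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical, heavily reduced versions of the driver structures. */
struct host_endpoint {
	void *hcpriv;			/* back-pointer to the qh */
};

struct qh_sketch {
	struct host_endpoint *hep;
	struct list_head ring;
};

struct qh_sketch *qh_sketch_alloc(struct host_endpoint *hep,
				  struct list_head *ring_head)
{
	struct qh_sketch *qh = calloc(1, sizeof(*qh));

	if (!qh)
		return NULL;
	qh->hep = hep;
	init_list_head(&qh->ring);
	if (hep)
		hep->hcpriv = qh;
	if (ring_head)
		list_add_tail(&qh->ring, ring_head);
	return qh;
}

/* Clear the endpoint's back-pointer, unlink from the ring, free. */
void qh_sketch_free(struct qh_sketch *qh)
{
	assert(qh != NULL);
	if (qh->hep)
		qh->hep->hcpriv = NULL;
	list_del(&qh->ring);
	free(qh);
}
```

Keeping the teardown in one function means every call site clears the
back-pointer and unlinks the ring entry in the same order.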

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 66 +++++++++++++++++++++-----------------------
 1 file changed, 32 insertions(+), 34 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 37aa9f6155d9..5d23c950a21b 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -85,6 +85,21 @@ static bool musb_qh_empty(struct musb_qh *qh)
 	return list_empty(&qh->hep->urb_list);
 }
 
+static void musb_qh_unlink_hep(struct musb_qh *qh)
+{
+	if (!qh->hep)
+		return;
+
+	qh->hep->hcpriv = NULL;
+}
+
+static void musb_qh_free(struct musb_qh *qh)
+{
+	musb_qh_unlink_hep(qh);
+	list_del(&qh->ring);
+	kfree(qh);
+}
+
 /*
  * Clear TX fifo. Needed to avoid BABBLE errors.
  */
@@ -348,7 +363,7 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 	 * invalidate qh as soon as list_empty(&hep->urb_list)
 	 */
 	if (musb_qh_empty(qh)) {
-		struct list_head	*head;
+		struct list_head	*head = NULL;
 		struct dma_controller	*dma = musb->dma_controller;
 
 		if (is_in) {
@@ -367,34 +382,22 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 
 		/* Clobber old pointers to this qh */
 		musb_ep_set_qh(ep, is_in, NULL);
-		qh->hep->hcpriv = NULL;
 
-		switch (qh->type) {
+		/* USB_ENDPOINT_XFER_CONTROL and USB_ENDPOINT_XFER_BULK: fifo
+		 * policy for these lists, except that NAKing should rotate
+		 * a qh to the end (for fairness).
+		 * USB_ENDPOINT_XFER_ISOC and USB_ENDPOINT_XFER_INT: this is
+		 * where periodic bandwidth should be de-allocated if it's
+		 * tracked and allocated; and where we'd update the schedule
+		 * tree...
+		 */
+		if (qh->mux == 1
+		    && (qh->type == USB_ENDPOINT_XFER_CONTROL || qh->type == USB_ENDPOINT_XFER_BULK))
+			head = qh->ring.prev;
 
-		case USB_ENDPOINT_XFER_CONTROL:
-		case USB_ENDPOINT_XFER_BULK:
-			/* fifo policy for these lists, except that NAKing
-			 * should rotate a qh to the end (for fairness).
-			 */
-			if (qh->mux == 1) {
-				head = qh->ring.prev;
-				list_del(&qh->ring);
-				kfree(qh);
-				qh = first_qh(head);
-				break;
-			}
-			/* fall through */
+		musb_qh_free(qh);
 
-		case USB_ENDPOINT_XFER_ISOC:
-		case USB_ENDPOINT_XFER_INT:
-			/* this is where periodic bandwidth should be
-			 * de-allocated if it's tracked and allocated;
-			 * and where we'd update the schedule tree...
-			 */
-			kfree(qh);
-			qh = NULL;
-			break;
-		}
+		qh = head ? first_qh(head) : NULL;
 	}
 
 	if (qh != NULL && qh->is_ready) {
@@ -2435,11 +2438,8 @@ static int musb_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 		/* If nothing else (usually musb_giveback) is using it
 		 * and its URB list has emptied, recycle this qh.
 		 */
-		if (ready && musb_qh_empty(qh)) {
-			qh->hep->hcpriv = NULL;
-			list_del(&qh->ring);
-			kfree(qh);
-		}
+		if (ready && musb_qh_empty(qh))
+			musb_qh_free(qh);
 	} else
 		ret = musb_cleanup_urb(urb, qh);
 done:
@@ -2493,9 +2493,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		while (!musb_qh_empty(qh))
 			musb_giveback(musb, next_urb(qh), -ESHUTDOWN);
 
-		hep->hcpriv = NULL;
-		list_del(&qh->ring);
-		kfree(qh);
+		musb_qh_free(qh);
 	}
 exit:
 	spin_unlock_irqrestore(&musb->lock, flags);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RESEND v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb()
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
                   ` (6 preceding siblings ...)
  2020-01-01 16:26 ` [PATCH RESEND v2 3/6] usb: musb: Introduce musb_qh_free() " Matwey V. Kornilov
@ 2020-01-01 16:26 ` Matwey V. Kornilov
  2020-01-01 16:26 ` [PATCH RESEND v2 5/6] usb: musb: Introduce musb_start_urb() Matwey V. Kornilov
  2020-01-01 16:26 ` [PATCH RESEND v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
  9 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2020-01-01 16:26 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

The following commit introduces a new musb_start_urb() function that
can start an arbitrary URB. To keep the function names intuitive,
rename the existing musb_start_urb() to musb_start_next_urb().

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 5d23c950a21b..3a202a2a521d 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -213,7 +213,7 @@ static struct musb_qh *musb_ep_get_qh(struct musb_hw_ep *ep, int is_in)
  * Context: controller locked, irqs blocked
  */
 static void
-musb_start_urb(struct musb *musb, int is_in, struct musb_qh *qh)
+musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
 {
 	u32			len;
 	void __iomem		*mbase =  musb->mregs;
@@ -403,7 +403,7 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 	if (qh != NULL && qh->is_ready) {
 		musb_dbg(musb, "... next ep%d %cX urb %p",
 		    hw_ep->epnum, is_in ? 'R' : 'T', next_urb(qh));
-		musb_start_urb(musb, is_in, qh);
+		musb_start_next_urb(musb, is_in, qh);
 	}
 }
 
@@ -1001,7 +1001,7 @@ static void musb_bulk_nak_timeout(struct musb *musb, struct musb_hw_ep *ep,
 		}
 
 		if (next_qh)
-			musb_start_urb(musb, is_in, next_qh);
+			musb_start_next_urb(musb, is_in, next_qh);
 	}
 }
 
@@ -2141,7 +2141,7 @@ static int musb_schedule(
 	qh->hw_ep = hw_ep;
 	qh->hep->hcpriv = qh;
 	if (idle)
-		musb_start_urb(musb, is_in, qh);
+		musb_start_next_urb(musb, is_in, qh);
 	return 0;
 }
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RESEND v2 5/6] usb: musb: Introduce musb_start_urb()
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
                   ` (7 preceding siblings ...)
  2020-01-01 16:26 ` [PATCH RESEND v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb() Matwey V. Kornilov
@ 2020-01-01 16:26 ` Matwey V. Kornilov
  2020-01-01 16:26 ` [PATCH RESEND v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
  9 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2020-01-01 16:26 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

This function allows us to start an arbitrary URB, not only the next
one in the qh.
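
The resulting split can be sketched in userspace C as follows: one
function starts a caller-supplied URB, and a thin wrapper keeps the old
"start whatever is next" behaviour. All names here (urb_sketch,
qh_sketch, start_urb_sketch() and so on) are hypothetical, the array
queue replaces the real URB list, and the NULL check in the wrapper is
an addition of this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 4

/* Hypothetical, heavily reduced stand-ins for struct urb and musb_qh. */
struct urb_sketch {
	int id;
};

struct qh_sketch {
	struct urb_sketch *queue[QUEUE_MAX];	/* pending URBs, oldest first */
	size_t count;
	int last_started;			/* id of the URB started last */
};

/* Counterpart of next_urb(): the URB at the head of the queue, if any. */
struct urb_sketch *next_urb_sketch(struct qh_sketch *qh)
{
	return qh->count ? qh->queue[0] : NULL;
}

/* Start the URB chosen by the caller; it need not be the queue head. */
void start_urb_sketch(struct qh_sketch *qh, struct urb_sketch *urb)
{
	assert(qh != NULL && urb != NULL);
	qh->last_started = urb->id;	/* real code programs the endpoint */
}

/* Old entry point, now a thin wrapper around the general function. */
void start_next_urb_sketch(struct qh_sketch *qh)
{
	struct urb_sketch *urb = next_urb_sketch(qh);

	if (urb)
		start_urb_sketch(qh, urb);
}
```

Existing callers keep using the wrapper unchanged, while a caller that
already knows which URB it wants can pass it in directly.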

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index 3a202a2a521d..ed99ecd4e63a 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -213,11 +213,10 @@ static struct musb_qh *musb_ep_get_qh(struct musb_hw_ep *ep, int is_in)
  * Context: controller locked, irqs blocked
  */
 static void
-musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
+musb_start_urb(struct musb *musb, int is_in, struct musb_qh *qh, struct urb *urb)
 {
 	u32			len;
 	void __iomem		*mbase =  musb->mregs;
-	struct urb		*urb = next_urb(qh);
 	void			*buf = urb->transfer_buffer;
 	u32			offset = 0;
 	struct musb_hw_ep	*hw_ep = qh->hw_ep;
@@ -293,6 +292,14 @@ musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
 	}
 }
 
+static void
+musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
+{
+	struct urb *urb = next_urb(qh);
+
+	musb_start_urb(musb, is_in, qh, urb);
+}
+
 /* Context: caller owns controller lock, IRQs are blocked */
 static void musb_giveback(struct musb *musb, struct urb *urb, int status)
 __releases(musb->lock)
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH RESEND v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule()
       [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
                   ` (8 preceding siblings ...)
  2020-01-01 16:26 ` [PATCH RESEND v2 5/6] usb: musb: Introduce musb_start_urb() Matwey V. Kornilov
@ 2020-01-01 16:26 ` Matwey V. Kornilov
  9 siblings, 0 replies; 27+ messages in thread
From: Matwey V. Kornilov @ 2020-01-01 16:26 UTC (permalink / raw)
  To: b-liu, gregkh, stern
  Cc: matwey.kornilov, Matwey V. Kornilov, linux-usb, linux-kernel

Previously, the algorithm was the following:

 1. giveback current URB
 2. if current qh is not empty
    then start next URB
 3. if current qh is empty
    then dispose the qh, find next qh if any, and start URB.

Running urb->callback inside URB giveback may take a while, and giveback
is run synchronously in musb. To improve the latency, we rearrange the
function behaviour for the case when the qh is not empty: the next URB is
started before URB giveback. When the qh is empty or the current URB has
an error, the previous behaviour is intentionally kept in order
  a) not to break existing inter-qh scheduling: URB giveback could
potentially enqueue another URB to the empty qh, preventing it from being
disposed;
  b) to allow the class driver to cancel outstanding URBs in the queue.
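
The reordering described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the driver
code: model_qh, model_advance() and the event log are hypothetical names
invented for illustration, the queue is a plain array, and the qh
"ready" flag and all locking are left out. It records whether the next
URB is started before or after the current one is given back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: none of these names exist in the musb driver. */
enum model_event { EV_START, EV_GIVEBACK };

struct model_log_entry {
	enum model_event ev;
	int urb;
};

struct model_qh {
	int urbs[8];			/* queued URB ids, head at index 0 */
	size_t count;
	struct model_log_entry log[16];	/* order of starts and givebacks */
	size_t nlog;
};

static void model_log(struct model_qh *qh, enum model_event ev, int urb)
{
	assert(qh->nlog < 16);
	qh->log[qh->nlog].ev = ev;
	qh->log[qh->nlog].urb = urb;
	qh->nlog++;
}

/* Unlink the head URB from the queue and record its giveback. */
static void model_giveback(struct model_qh *qh)
{
	int urb = qh->urbs[0];
	size_t i;

	for (i = 1; i < qh->count; i++)
		qh->urbs[i - 1] = qh->urbs[i];
	qh->count--;
	model_log(qh, EV_GIVEBACK, urb);
}

/* Complete the head URB with the given status (0 means success). */
static void model_advance(struct model_qh *qh, int status)
{
	assert(qh->count > 0);

	if (qh->count > 1 && status == 0) {
		/* New path: start the next URB first, then give back. */
		model_log(qh, EV_START, qh->urbs[1]);
		model_giveback(qh);
		return;
	}

	/* Old order kept: give back first, then start whatever is queued. */
	model_giveback(qh);
	if (qh->count > 0)
		model_log(qh, EV_START, qh->urbs[0]);
}
```

With two URBs queued and a successful completion, the model logs the
start of the second URB before the giveback of the first. With a single
URB, or with a non-zero status, it logs the giveback first, which
corresponds to cases a) and b) above.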

Correct URB giveback order is guaranteed as follows. For each qh there
can be at most three ready URBs being processed by the driver. Indeed,
every ready URB can send at most one URB in musb_advance_schedule(),
and in the worst-case scenario we have the following ready URBs:
  1) URB in the giveback lock protected section inside musb_giveback()
  2) URB waiting at the giveback lock acquisition in musb_giveback()
  3) URB waiting at the controller lock acquisition in the glue layer
     interrupt handler
Here URB #2 and URB #3 are triggered by URB #1 and URB #2
correspondingly when they passed through musb_advance_schedule().
Since URB #3 is waiting before musb_advance_schedule(), no other new
URBs will be sent until URB #1 is finished, URB #2 goes to the giveback
lock protected section, and URB #3 goes to the controller lock protected
musb_advance_schedule().

Before this patch, time spent in urb->callback led to the following
glitches between the host and a hub during isoc transfer (line 4):

    11.624492 d=  0.000124 [130.6 +  1.050] [  4] SPLIT
    11.624492 d=  0.000000 [130.6 +  1.467] [  3] IN   : 3.5
    11.624493 d=  0.000000 [130.6 +  1.967] [ 37] DATA0: aa 08 [skipped...]
    11.625617 d=  0.001124 [131.7 +  1.050] [  4] SPLIT
    11.625617 d=  0.000000 [131.7 +  1.467] [  3] IN   : 3.5
    11.625867 d=  0.000250 [132.1 +  1.050] [  4] SPLIT
    11.625867 d=  0.000000 [132.1 +  1.467] [  3] IN   : 3.5
    11.625868 d=  0.000001 [132.1 +  1.983] [  3] DATA0: 00 00
    11.626617 d=  0.000749 [132.7 +  1.050] [  4] SPLIT
    11.626617 d=  0.000000 [132.7 +  1.467] [  3] IN   : 3.5
    11.626867 d=  0.000250 [133.1 +  1.050] [  4] SPLIT
    11.626867 d=  0.000000 [133.1 +  1.467] [  3] IN   : 3.5
    11.626868 d=  0.000000 [133.1 +  1.967] [  3] DATA0: 00 00

After the hub, they look as follows and may lead to broken peripheral
transfers (as in the case of a PWC-based webcam):

    11.332004 d=  0.000997 [ 30.0 +  3.417] [  3] IN   : 5.5
    11.332007 d=  0.000003 [ 30.0 +  6.833] [800] DATA0: 8a 1c [skipped...]
    11.334004 d=  0.001997 [ 32.0 +  3.417] [  3] IN   : 5.5
    11.334007 d=  0.000003 [ 32.0 +  6.750] [  3] DATA0: 00 00
    11.335004 d=  0.000997 [ 33   +  3.417] [  3] IN   : 5.5
    11.335007 d=  0.000003 [ 33   +  6.750] [  3] DATA0: 00 00

Removing these glitches allows us to successfully run a 10fps video
stream from the webcam attached via a USB hub. That was previously
impossible.

Signed-off-by: Matwey V. Kornilov <matwey@sai.msu.ru>
---
 drivers/usb/musb/musb_host.c | 36 ++++++++++++++++++++++++++++++++----
 drivers/usb/musb/musb_host.h |  1 +
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/musb/musb_host.c b/drivers/usb/musb/musb_host.c
index ed99ecd4e63a..5c43996f2de5 100644
--- a/drivers/usb/musb/musb_host.c
+++ b/drivers/usb/musb/musb_host.c
@@ -85,6 +85,11 @@ static bool musb_qh_empty(struct musb_qh *qh)
 	return list_empty(&qh->hep->urb_list);
 }
 
+static bool musb_qh_singular(struct musb_qh *qh)
+{
+	return list_is_singular(&qh->hep->urb_list);
+}
+
 static void musb_qh_unlink_hep(struct musb_qh *qh)
 {
 	if (!qh->hep)
@@ -301,15 +306,24 @@ musb_start_next_urb(struct musb *musb, int is_in, struct musb_qh *qh)
 }
 
 /* Context: caller owns controller lock, IRQs are blocked */
-static void musb_giveback(struct musb *musb, struct urb *urb, int status)
+static void musb_giveback(struct musb *musb, struct musb_qh *qh, struct urb *urb, int status)
 __releases(musb->lock)
 __acquires(musb->lock)
 {
 	trace_musb_urb_gb(musb, urb);
 
+	/*
+	 * This line is protected by the controller lock: at most
+	 * one thread waiting on the giveback lock.
+	 */
+	spin_lock(&qh->giveback_lock);
 	usb_hcd_unlink_urb_from_ep(musb->hcd, urb);
+
 	spin_unlock(&musb->lock);
+
 	usb_hcd_giveback_urb(musb->hcd, urb, status);
+	spin_unlock(&qh->giveback_lock);
+
 	spin_lock(&musb->lock);
 }
 
@@ -362,8 +376,21 @@ static void musb_advance_schedule(struct musb *musb, struct urb *urb,
 		break;
 	}
 
+	if (ready && !musb_qh_singular(qh) && !status) {
+		struct urb *next_urb = list_next_entry(urb, urb_list);
+
+		musb_dbg(musb, "... next ep%d %cX urb %p", hw_ep->epnum, is_in ? 'R' : 'T', next_urb);
+		musb_start_urb(musb, is_in, qh, next_urb);
+
+		qh->is_ready = 0;
+		musb_giveback(musb, qh, urb, status);
+		qh->is_ready = ready;
+
+		return;
+	}
+
 	qh->is_ready = 0;
-	musb_giveback(musb, urb, status);
+	musb_giveback(musb, qh, urb, status);
 	qh->is_ready = ready;
 
 	/* reclaim resources (and bandwidth) ASAP; deschedule it, and
@@ -2207,6 +2234,7 @@ static int musb_urb_enqueue(
 	qh->hep = hep;
 	qh->dev = urb->dev;
 	INIT_LIST_HEAD(&qh->ring);
+	spin_lock_init(&qh->giveback_lock);
 	qh->is_ready = 1;
 
 	qh->maxpacket = usb_endpoint_maxp(epd);
@@ -2439,7 +2467,7 @@ static int musb_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 		int	ready = qh->is_ready;
 
 		qh->is_ready = 0;
-		musb_giveback(musb, urb, 0);
+		musb_giveback(musb, qh, urb, 0);
 		qh->is_ready = ready;
 
 		/* If nothing else (usually musb_giveback) is using it
@@ -2498,7 +2526,7 @@ musb_h_disable(struct usb_hcd *hcd, struct usb_host_endpoint *hep)
 		 * will activate any of these as it advances.
 		 */
 		while (!musb_qh_empty(qh))
-			musb_giveback(musb, next_urb(qh), -ESHUTDOWN);
+			musb_giveback(musb, qh, next_urb(qh), -ESHUTDOWN);
 
 		musb_qh_free(qh);
 	}
diff --git a/drivers/usb/musb/musb_host.h b/drivers/usb/musb/musb_host.h
index 2999845632ce..6223b0177c68 100644
--- a/drivers/usb/musb/musb_host.h
+++ b/drivers/usb/musb/musb_host.h
@@ -19,6 +19,7 @@ struct musb_qh {
 	struct musb_hw_ep	*hw_ep;		/* current binding */
 
 	struct list_head	ring;		/* of musb_qh */
+	spinlock_t		giveback_lock;	/* to keep URB giveback order */
 	/* struct musb_qh		*next; */	/* for periodic tree */
 	u8			mux;		/* qh multiplexed to hw_ep */
 
-- 
2.16.4



end of thread, other threads:[~2020-01-01 17:23 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190403185310.8437-1-matwey@sai.msu.ru>
2019-04-03 18:53 ` [6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
2019-04-30 15:31   ` Bin Liu
2019-04-30 15:31     ` [PATCH 6/6] " Bin Liu
2019-04-30 17:29     ` [6/6] " Alan Stern
2019-04-30 17:29       ` [PATCH 6/6] " Alan Stern
2019-05-04  9:38     ` [6/6] " Matwey V. Kornilov
2019-05-04  9:38       ` [PATCH 6/6] " Matwey V. Kornilov
2019-04-24 15:42 ` [PATCH 0/6] musb: Improve performance for hub-attached webcams Matwey V. Kornilov
2019-04-30 15:20   ` Bin Liu
2019-06-14 16:45 ` [PATCH v2 " Matwey V. Kornilov
2019-06-14 16:45   ` [PATCH v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule() Matwey V. Kornilov
2019-06-14 16:45   ` [PATCH v2 2/6] usb: musb: Introduce musb_qh_empty() helper function Matwey V. Kornilov
2019-06-14 16:45   ` [PATCH v2 3/6] usb: musb: Introduce musb_qh_free() " Matwey V. Kornilov
2019-06-14 16:45   ` [PATCH v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb() Matwey V. Kornilov
2019-06-14 16:45   ` [PATCH v2 5/6] usb: musb: Introduce musb_start_urb() Matwey V. Kornilov
2019-06-14 16:45   ` [PATCH v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
2019-07-02 17:29   ` [PATCH v2 0/6] musb: Improve performance for hub-attached webcams Matwey V. Kornilov
2019-07-02 17:33     ` Bin Liu
2019-09-09 16:33       ` Matwey V. Kornilov
2019-10-23  8:12         ` Matwey V. Kornilov
2020-01-01 16:26 ` [PATCH RESEND " Matwey V. Kornilov
2020-01-01 16:26 ` [PATCH RESEND v2 1/6] usb: musb: Use USB_DIR_IN when calling musb_advance_schedule() Matwey V. Kornilov
2020-01-01 16:26 ` [PATCH RESEND v2 2/6] usb: musb: Introduce musb_qh_empty() helper function Matwey V. Kornilov
2020-01-01 16:26 ` [PATCH RESEND v2 3/6] usb: musb: Introduce musb_qh_free() " Matwey V. Kornilov
2020-01-01 16:26 ` [PATCH RESEND v2 4/6] usb: musb: Rename musb_start_urb() to musb_start_next_urb() Matwey V. Kornilov
2020-01-01 16:26 ` [PATCH RESEND v2 5/6] usb: musb: Introduce musb_start_urb() Matwey V. Kornilov
2020-01-01 16:26 ` [PATCH RESEND v2 6/6] usb: musb: Decrease URB starting latency in musb_advance_schedule() Matwey V. Kornilov
