linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] usb: cdns3: improve handling of unaligned address case
@ 2023-05-18 20:49 Frank Li
  2023-05-18 20:49 ` [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data Frank Li
  2023-06-04 23:12 ` [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Peter Chen
  0 siblings, 2 replies; 4+ messages in thread
From: Frank Li @ 2023-05-18 20:49 UTC
  To: Peter Chen, Pawel Laszczak, Roger Quadros, Aswath Govindraju,
	Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
	open list
  Cc: imx

When the address of a request is not aligned to an 8-byte boundary, the
USB DMA cannot process it, so the driver has to use an internal bounce
buffer.

In that case request->buf is copied to/from the bounce buffer, and the
DMA never touches request->buf itself. Heavy cache maintenance operations
such as usb_gadget_map_request_by_dev() and
usb_gadget_unmap_request_by_dev() on request->buf are therefore
unnecessary and can be skipped.

iperf3 tests on the rndis case:

Transmit speed (TX): Improved from 299Mbps to 440Mbps
Receive speed (RX): Improved from 290Mbps to 500Mbps

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/usb/cdns3/cdns3-gadget.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 1dcadef933e3..09a0882a4e97 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -800,7 +800,8 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
 	if (request->status == -EINPROGRESS)
 		request->status = status;
 
-	usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
+	if (likely(!(priv_req->flags & REQUEST_UNALIGNED)))
+		usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
 					priv_ep->dir);
 
 	if ((priv_req->flags & REQUEST_UNALIGNED) &&
@@ -2543,10 +2544,12 @@ static int __cdns3_gadget_ep_queue(struct usb_ep *ep,
 	if (ret < 0)
 		return ret;
 
-	ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
+	if (likely(!(priv_req->flags & REQUEST_UNALIGNED))) {
+		ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
 					    usb_endpoint_dir_in(ep->desc));
-	if (ret)
-		return ret;
+		if (ret)
+			return ret;
+	}
 
 	list_add_tail(&request->list, &priv_ep->deferred_req_list);
 
-- 
2.34.1
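
Editor's sketch (not part of the patch): the decision the patch makes in
the queue path can be modelled in user space as below. All names prefixed
with demo_ or DEMO_ are hypothetical stand-ins, the map_calls counter
merely simulates the cost of usb_gadget_map_request_by_dev(), and setting
the flag inside demo_queue() is an assumption about where the driver
detects the unaligned buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's flag and alignment requirement. */
#define DEMO_REQUEST_UNALIGNED	(1u << 0)
#define DEMO_DMA_ALIGN		8u

struct demo_request {
	void *buf;		/* caller's buffer (request->buf) */
	unsigned int flags;
	int map_calls;		/* simulated cache-maintenance operations */
};

/* True when the buffer address is not on an 8-byte boundary. */
static bool demo_buf_is_unaligned(const void *buf)
{
	return ((uintptr_t)buf & (DEMO_DMA_ALIGN - 1)) != 0;
}

/*
 * Queue path: an unaligned buffer is served through a bounce buffer, so
 * the DMA never sees req->buf and mapping it would be wasted work.
 */
static void demo_queue(struct demo_request *req)
{
	assert(req && req->buf);

	if (demo_buf_is_unaligned(req->buf))
		req->flags |= DEMO_REQUEST_UNALIGNED;

	if (!(req->flags & DEMO_REQUEST_UNALIGNED))
		req->map_calls++;	/* stands in for the map call */
}
```

An aligned buffer is mapped once; a buffer offset by one byte is flagged
and never mapped, which is the work the patch avoids.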



* [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data
  2023-05-18 20:49 [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Frank Li
@ 2023-05-18 20:49 ` Frank Li
  2023-06-04 23:03   ` Peter Chen
  2023-06-04 23:12 ` [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Peter Chen
  1 sibling, 1 reply; 4+ messages in thread
From: Frank Li @ 2023-05-18 20:49 UTC
  To: Peter Chen, Pawel Laszczak, Roger Quadros, Aswath Govindraju,
	Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
	open list
  Cc: imx

Previously, when the bounce buffer was used, the entire length of the
request, which is equal to or greater than the amount of data actually
received, was DMA-synced and copied with memcpy(). Only the received data,
as indicated by request->actual, needs to be synced and copied.

Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/usb/cdns3/cdns3-gadget.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 09a0882a4e97..ea19253fd2d0 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -809,10 +809,10 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
 		/* Make DMA buffer CPU accessible */
 		dma_sync_single_for_cpu(priv_dev->sysdev,
 			priv_req->aligned_buf->dma,
-			priv_req->aligned_buf->size,
+			request->actual,
 			priv_req->aligned_buf->dir);
 		memcpy(request->buf, priv_req->aligned_buf->buf,
-		       request->length);
+		       request->actual);
 	}
 
 	priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
-- 
2.34.1
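
Editor's sketch (not part of the patch): a minimal user-space model of
copying back only the received bytes. The structure and function names
are hypothetical; only the length/actual distinction comes from the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an OUT request served via a bounce buffer. */
struct demo_out_request {
	unsigned char *buf;	/* caller's buffer (request->buf) */
	size_t length;		/* bytes the caller asked for */
	size_t actual;		/* bytes the controller actually received */
};

/*
 * Copy only the received bytes from the bounce buffer to the caller's
 * buffer and return the number of bytes copied.
 */
static size_t demo_copy_back(struct demo_out_request *req,
			     const unsigned char *bounce)
{
	assert(req->actual <= req->length);
	memcpy(req->buf, bounce, req->actual);
	return req->actual;
}
```

With length = 8 and actual = 3, only three bytes are copied and the rest
of the caller's buffer is left untouched.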



* Re: [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data
  2023-05-18 20:49 ` [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data Frank Li
@ 2023-06-04 23:03   ` Peter Chen
  0 siblings, 0 replies; 4+ messages in thread
From: Peter Chen @ 2023-06-04 23:03 UTC
  To: Frank Li
  Cc: Pawel Laszczak, Roger Quadros, Aswath Govindraju,
	Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
	open list, imx

On 23-05-18 16:49:46, Frank Li wrote:
> Previously, when the bounce buffer was used, the entire length of the
> request, which is equal to or greater than the amount of data actually
> received, was DMA-synced and copied with memcpy(). Only the received data,
> as indicated by request->actual, needs to be synced and copied.
> 
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
>  drivers/usb/cdns3/cdns3-gadget.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
> index 09a0882a4e97..ea19253fd2d0 100644
> --- a/drivers/usb/cdns3/cdns3-gadget.c
> +++ b/drivers/usb/cdns3/cdns3-gadget.c
> @@ -809,10 +809,10 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
>  		/* Make DMA buffer CPU accessible */
>  		dma_sync_single_for_cpu(priv_dev->sysdev,
>  			priv_req->aligned_buf->dma,
> -			priv_req->aligned_buf->size,
> +			request->actual,
>  			priv_req->aligned_buf->dir);
>  		memcpy(request->buf, priv_req->aligned_buf->buf,
> -		       request->length);
> +		       request->actual);
>  	}

Acked-by: Peter Chen <peter.chen@kernel.org>

>  
>  	priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
> -- 
> 2.34.1
> 

-- 

Thanks,
Peter Chen


* Re: [PATCH 1/2] usb: cdns3: improve handling of unaligned address case
  2023-05-18 20:49 [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Frank Li
  2023-05-18 20:49 ` [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data Frank Li
@ 2023-06-04 23:12 ` Peter Chen
  1 sibling, 0 replies; 4+ messages in thread
From: Peter Chen @ 2023-06-04 23:12 UTC
  To: Frank Li
  Cc: Pawel Laszczak, Roger Quadros, Aswath Govindraju,
	Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
	open list, imx

On 23-05-18 16:49:45, Frank Li wrote:
> When the address of a request is not aligned to an 8-byte boundary, the
> USB DMA cannot process it, so the driver has to use an internal bounce
> buffer.
> 
> In that case request->buf is copied to/from the bounce buffer, and the
> DMA never touches request->buf itself. Heavy cache maintenance operations
> such as usb_gadget_map_request_by_dev() and
> usb_gadget_unmap_request_by_dev() on request->buf are therefore
> unnecessary and can be skipped.
> 
> iperf3 tests on the rndis case:
> 
> Transmit speed (TX): Improved from 299Mbps to 440Mbps
> Receive speed (RX): Improved from 290Mbps to 500Mbps
> 
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
>  drivers/usb/cdns3/cdns3-gadget.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
> index 1dcadef933e3..09a0882a4e97 100644
> --- a/drivers/usb/cdns3/cdns3-gadget.c
> +++ b/drivers/usb/cdns3/cdns3-gadget.c
> @@ -800,7 +800,8 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
>  	if (request->status == -EINPROGRESS)
>  		request->status = status;
>  
> -	usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
> +	if (likely(!(priv_req->flags & REQUEST_UNALIGNED)))
> +		usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
>  					priv_ep->dir);
>  
>  	if ((priv_req->flags & REQUEST_UNALIGNED) &&
> @@ -2543,10 +2544,12 @@ static int __cdns3_gadget_ep_queue(struct usb_ep *ep,
>  	if (ret < 0)
>  		return ret;
>  
> -	ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
> +	if (likely(!(priv_req->flags & REQUEST_UNALIGNED))) {
> +		ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
>  					    usb_endpoint_dir_in(ep->desc));

So the likely reason for the performance drop is that cache coherency
operations were done twice for unaligned buffers?

Peter

> -	if (ret)
> -		return ret;
> +		if (ret)
> +			return ret;
> +	}
>  
>  	list_add_tail(&request->list, &priv_ep->deferred_req_list);
>  
> -- 
> 2.34.1
> 

-- 

Thanks,
Peter Chen
