* [PATCH 1/2] usb: cdns3: improve handling of unaligned address case
@ 2023-05-18 20:49 Frank Li
2023-05-18 20:49 ` [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data Frank Li
2023-06-04 23:12 ` [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Peter Chen
0 siblings, 2 replies; 4+ messages in thread
From: Frank Li @ 2023-05-18 20:49 UTC
To: Peter Chen, Pawel Laszczak, Roger Quadros, Aswath Govindraju,
Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
open list
Cc: imx
When the address of a request is not aligned to an 8-byte boundary, the
USB DMA cannot process it, so the driver has to use an internal bounce
buffer.
In these cases, request->buf is copied to/from the bounce buffer. Because
the DMA never touches request->buf in the unaligned case, the heavy cache
maintenance done by usb_gadget_map_request_by_dev() and
usb_gadget_unmap_request_by_dev() on request->buf is unnecessary and can
be skipped.
iperf3 tests on the rndis case:
Transmit speed (TX): Improved from 299Mbps to 440Mbps
Receive speed (RX): Improved from 290Mbps to 500Mbps
Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
drivers/usb/cdns3/cdns3-gadget.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 1dcadef933e3..09a0882a4e97 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -800,7 +800,8 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
if (request->status == -EINPROGRESS)
request->status = status;
- usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
+ if (likely(!(priv_req->flags & REQUEST_UNALIGNED)))
+ usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
priv_ep->dir);
if ((priv_req->flags & REQUEST_UNALIGNED) &&
@@ -2543,10 +2544,12 @@ static int __cdns3_gadget_ep_queue(struct usb_ep *ep,
if (ret < 0)
return ret;
- ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
+ if (likely(!(priv_req->flags & REQUEST_UNALIGNED))) {
+ ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
usb_endpoint_dir_in(ep->desc));
- if (ret)
- return ret;
+ if (ret)
+ return ret;
+ }
list_add_tail(&request->list, &priv_ep->deferred_req_list);
--
2.34.1
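The decision the patch makes on the queue path can be sketched in plain C outside the kernel. This is only an illustration under stated assumptions: `sketch_req`, `sketch_queue()`, `buf_is_dma_aligned()` and the `map_calls` counter are hypothetical names that are not part of the driver, and the counter merely stands in for the call to usb_gadget_map_request_by_dev().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 8-byte boundary taken from the commit message. */
#define SKETCH_DMA_ALIGN 8u

/* Hypothetical stand-in for a request; not the driver's structure. */
struct sketch_req {
	void *buf;          /* caller's buffer (request->buf in the driver) */
	bool unaligned;     /* plays the role of REQUEST_UNALIGNED */
	unsigned map_calls; /* counts simulated map operations */
};

/* True when buf sits on an 8-byte boundary. */
static bool buf_is_dma_aligned(const void *buf)
{
	return ((uintptr_t)buf & (SKETCH_DMA_ALIGN - 1)) == 0;
}

/*
 * Simulated queue path: an unaligned buffer is served through a bounce
 * buffer, so the map step (and its cache maintenance) on the caller's
 * buffer is skipped.
 */
static void sketch_queue(struct sketch_req *req)
{
	req->unaligned = !buf_is_dma_aligned(req->buf);

	if (!req->unaligned)
		req->map_calls++; /* stands in for usb_gadget_map_request_by_dev() */
}
```

With an 8-byte-aligned buffer the simulated map step runs once; with the same buffer offset by one byte it is skipped entirely, which mirrors the new `if (likely(!(priv_req->flags & REQUEST_UNALIGNED)))` guard.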
* [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data
2023-05-18 20:49 [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Frank Li
@ 2023-05-18 20:49 ` Frank Li
2023-06-04 23:03 ` Peter Chen
2023-06-04 23:12 ` [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Peter Chen
1 sibling, 1 reply; 4+ messages in thread
From: Frank Li @ 2023-05-18 20:49 UTC
To: Peter Chen, Pawel Laszczak, Roger Quadros, Aswath Govindraju,
Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
open list
Cc: imx
Previously, when the bounce buffer was used, the entire length of the
request, which is equal to or greater than the amount of data actually
received, was DMA-synced and copied with memcpy(). Only the data actually
received, as indicated by request->actual, needs to be synced and copied.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
---
drivers/usb/cdns3/cdns3-gadget.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 09a0882a4e97..ea19253fd2d0 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -809,10 +809,10 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
/* Make DMA buffer CPU accessible */
dma_sync_single_for_cpu(priv_dev->sysdev,
priv_req->aligned_buf->dma,
- priv_req->aligned_buf->size,
+ request->actual,
priv_req->aligned_buf->dir);
memcpy(request->buf, priv_req->aligned_buf->buf,
- request->length);
+ request->actual);
}
priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
--
2.34.1
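The effect of the memcpy() change can be shown with a small userspace sketch. It is an illustration only: `sketch_copy_back()` is a hypothetical helper, not a driver function, and it assumes the received byte count never exceeds the request length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy only the bytes that were actually received from the bounce buffer
 * back to the caller's buffer. "length" is the size of the request and
 * "actual" is the number of bytes received (request->actual in the patch).
 * Returns the number of bytes copied.
 */
static size_t sketch_copy_back(void *dst, const void *bounce,
			       size_t length, size_t actual)
{
	/* Assumption of this sketch: a transfer never exceeds the request. */
	assert(actual <= length);

	memcpy(dst, bounce, actual);
	return actual;
}
```

For an 8-byte request that received 3 bytes, only the first 3 bytes of the destination are overwritten and the rest is left untouched, whereas copying `length` bytes would move 8.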
* Re: [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data
2023-05-18 20:49 ` [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data Frank Li
@ 2023-06-04 23:03 ` Peter Chen
0 siblings, 0 replies; 4+ messages in thread
From: Peter Chen @ 2023-06-04 23:03 UTC
To: Frank Li
Cc: Pawel Laszczak, Roger Quadros, Aswath Govindraju,
Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
open list, imx
On 23-05-18 16:49:46, Frank Li wrote:
> Previously, the entire length of the request, which is equal to or greater
> than the actual data, was dma synced and memcpy when using the bounce
> buffer. Actually only the actual data indicated by request->actual need be
> synced and copied.
>
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
> drivers/usb/cdns3/cdns3-gadget.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
> index 09a0882a4e97..ea19253fd2d0 100644
> --- a/drivers/usb/cdns3/cdns3-gadget.c
> +++ b/drivers/usb/cdns3/cdns3-gadget.c
> @@ -809,10 +809,10 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
> /* Make DMA buffer CPU accessible */
> dma_sync_single_for_cpu(priv_dev->sysdev,
> priv_req->aligned_buf->dma,
> - priv_req->aligned_buf->size,
> + request->actual,
> priv_req->aligned_buf->dir);
> memcpy(request->buf, priv_req->aligned_buf->buf,
> - request->length);
> + request->actual);
> }
Acked-by: Peter Chen <peter.chen@kernel.org>
>
> priv_req->flags &= ~(REQUEST_PENDING | REQUEST_UNALIGNED);
> --
> 2.34.1
>
--
Thanks,
Peter Chen
* Re: [PATCH 1/2] usb: cdns3: improve handling of unaligned address case
2023-05-18 20:49 [PATCH 1/2] usb: cdns3: improve handling of unaligned address case Frank Li
2023-05-18 20:49 ` [PATCH 2/2] usb: cdns3: optimize OUT transfer by copying only actual received data Frank Li
@ 2023-06-04 23:12 ` Peter Chen
1 sibling, 0 replies; 4+ messages in thread
From: Peter Chen @ 2023-06-04 23:12 UTC
To: Frank Li
Cc: Pawel Laszczak, Roger Quadros, Aswath Govindraju,
Greg Kroah-Hartman, open list:CADENCE USB3 DRD IP DRIVER,
open list, imx
On 23-05-18 16:49:45, Frank Li wrote:
> When the address of a request was not aligned with an 8-byte boundary, the
> USB DMA was unable to process it, necessitating the use of an internal
> bounce buffer.
>
> In these cases, the request->buf had to be copied to/from this bounce
> buffer. However, if this unaligned address scenario arises, it is
> unnecessary to perform heavy cache maintenance operations like
> usb_gadget_map(unmap)_request_by_dev() on the request->buf, as the DMA
> does not utilize it at all. it can be skipped at this case.
>
> iperf3 tests on the rndis case:
>
> Transmit speed (TX): Improved from 299Mbps to 440Mbps
> Receive speed (RX): Improved from 290Mbps to 500Mbps
>
> Signed-off-by: Frank Li <Frank.Li@nxp.com>
> ---
> drivers/usb/cdns3/cdns3-gadget.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
> index 1dcadef933e3..09a0882a4e97 100644
> --- a/drivers/usb/cdns3/cdns3-gadget.c
> +++ b/drivers/usb/cdns3/cdns3-gadget.c
> @@ -800,7 +800,8 @@ void cdns3_gadget_giveback(struct cdns3_endpoint *priv_ep,
> if (request->status == -EINPROGRESS)
> request->status = status;
>
> - usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
> + if (likely(!(priv_req->flags & REQUEST_UNALIGNED)))
> + usb_gadget_unmap_request_by_dev(priv_dev->sysdev, request,
> priv_ep->dir);
>
> if ((priv_req->flags & REQUEST_UNALIGNED) &&
> @@ -2543,10 +2544,12 @@ static int __cdns3_gadget_ep_queue(struct usb_ep *ep,
> if (ret < 0)
> return ret;
>
> - ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
> + if (likely(!(priv_req->flags & REQUEST_UNALIGNED))) {
> + ret = usb_gadget_map_request_by_dev(priv_dev->sysdev, request,
> usb_endpoint_dir_in(ep->desc));
So, the possible reason for the performance drop is that the cache
coherency operation is done twice for unaligned buffers?
Peter
> - if (ret)
> - return ret;
> + if (ret)
> + return ret;
> + }
>
> list_add_tail(&request->list, &priv_ep->deferred_req_list);
>
> --
> 2.34.1
>
--
Thanks,
Peter Chen