linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/4] Add logic to consolidate TRBs for Synopsys xHC
@ 2020-04-21  9:47 Tejas Joglekar
  2020-04-21  9:48 ` [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk Tejas Joglekar
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-04-21  9:47 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman, Tejas Joglekar, linux-usb,
	devicetree, Rob Herring, Mathias Nyman
  Cc: John Youn

The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
for HS. The controller loads and updates the TRB cache from the
transfer ring in system memory whenever the driver issues a start
transfer or update transfer command.

For chained TRBs, the Synopsys xHC requires that the total amount of
bytes for all TRBs loaded in the TRB cache be greater than or equal to
1 MPS. Or the chain ends within the TRB cache (with a last TRB).

If this requirement is not met, the controller will not be able to
send or receive a packet and it will hang causing a driver timeout and
error.

This patch set adds logic to the XHCI driver to detect and prevent this
from happening along with the quirk to enable this logic for Synopsys
HAPS platform.

Based on Mathias's feedback on previous implementation where consolidation
was done in TRB cache, with this patch series the implementation is done
during mapping of the URB by consolidating the SG list into a temporary
buffer if the SG list buffer sizes within TRB_CACHE_SIZE is less than MPS.

Changes since v1:
 - Comments from Greg are addressed on [PATCH 4/4] and [PATCH 1/4]
 - Renamed the property and quirk as in other patches based on [PATCH 1/4]

Tejas Joglekar (4):
  dt-bindings: usb: Add documentation for SG trb cache size quirk
  usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
  usb: dwc3: Add device property sgl-trb-cache-size-quirk
  usb: xhci: Use temporary buffer to consolidate SG

 Documentation/devicetree/bindings/usb/dwc3.txt     |   4 +
 Documentation/devicetree/bindings/usb/usb-xhci.txt |   3 +
 drivers/usb/dwc3/core.c                            |   2 +
 drivers/usb/dwc3/core.h                            |   2 +
 drivers/usb/dwc3/dwc3-haps.c                       |   1 +
 drivers/usb/dwc3/host.c                            |   6 +-
 drivers/usb/host/xhci-pci.c                        |   3 +
 drivers/usb/host/xhci-plat.c                       |   4 +
 drivers/usb/host/xhci-ring.c                       |   2 +-
 drivers/usb/host/xhci.c                            | 125 +++++++++++++++++++++
 drivers/usb/host/xhci.h                            |   5 +
 11 files changed, 155 insertions(+), 2 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk
  2020-04-21  9:47 [RFC PATCH v2 0/4] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
@ 2020-04-21  9:48 ` Tejas Joglekar
  2020-05-06 20:15   ` Rob Herring
  2020-05-06 20:16   ` Rob Herring
  2020-04-21  9:48 ` [RFC PATCH v2 2/4] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK Tejas Joglekar
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-04-21  9:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tejas Joglekar, linux-usb, devicetree, Rob Herring
  Cc: John Youn

This commit adds the documentation for sgl-trb-cache-size-quirk, and
snps,sgl-trb-cache-size-quirk property. These when set enables the
quirk for XHCI driver for consolidation of sg list into a temporary
buffer when small buffer sizes are scattered over the sg list not
making up to MPS or total transfer size within TRB cache size with
Synopsys xHC.

Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
---
 Changes in v2:
 - Renamed the property

 Documentation/devicetree/bindings/usb/dwc3.txt     | 4 ++++
 Documentation/devicetree/bindings/usb/usb-xhci.txt | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt b/Documentation/devicetree/bindings/usb/dwc3.txt
index 9946ff9ba735..6d0418ee4dbd 100644
--- a/Documentation/devicetree/bindings/usb/dwc3.txt
+++ b/Documentation/devicetree/bindings/usb/dwc3.txt
@@ -104,6 +104,10 @@ Optional properties:
 			this and tx-thr-num-pkt-prd to a valid, non-zero value
 			1-16 (DWC_usb31 programming guide section 1.2.3) to
 			enable periodic ESS TX threshold.
+ - snps,sgl-trb-cache-size-quirk: enable sg list consolidation - host mode
+			only. Set to use SG buffers of at least MPS size
+			by consolidating smaller SG buffers list into a
+			single buffer.
 
  - <DEPRECATED> tx-fifo-resize: determines if the FIFO *has* to be reallocated.
  - snps,incr-burst-type-adjustment: Value for INCR burst type of GSBUSCFG0
diff --git a/Documentation/devicetree/bindings/usb/usb-xhci.txt b/Documentation/devicetree/bindings/usb/usb-xhci.txt
index 3f378951d624..14d900474894 100644
--- a/Documentation/devicetree/bindings/usb/usb-xhci.txt
+++ b/Documentation/devicetree/bindings/usb/usb-xhci.txt
@@ -43,6 +43,9 @@ Optional properties:
   - quirk-broken-port-ped: set if the controller has broken port disable mechanism
   - imod-interval-ns: default interrupt moderation interval is 5000ns
   - phys : see usb-hcd.yaml in the current directory
+  - sgl-trb-cache-size-quirk: set if you need to consolidate sg list into a
+    temporary buffer when small SG buffer sizes does not make upto MPS
+    size or total transfer size across the TRB cache size.
 
 additionally the properties from usb-hcd.yaml (in the current directory) are
 supported.
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 2/4] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
  2020-04-21  9:47 [RFC PATCH v2 0/4] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
  2020-04-21  9:48 ` [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk Tejas Joglekar
@ 2020-04-21  9:48 ` Tejas Joglekar
  2020-04-21  9:48 ` [RFC PATCH v2 3/4] usb: dwc3: Add device property sgl-trb-cache-size-quirk Tejas Joglekar
  2020-04-21  9:49 ` [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
  3 siblings, 0 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-04-21  9:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tejas Joglekar, linux-usb, Mathias Nyman; +Cc: John Youn

This commit enables the sgl-trb-cache-size-quirk property is set
for the Synopsys xHC. This patch fixes the SNPS xHC hang issue
when the data is scattered across small buffers which does not make
atleast MPS size for given TRB cache size of SNPS xHC.

Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
---
 Changes in v2:
 - Renamed the quirk
 - Renamed the property

 drivers/usb/host/xhci-pci.c  | 3 +++
 drivers/usb/host/xhci-plat.c | 4 ++++
 drivers/usb/host/xhci.h      | 1 +
 3 files changed, 8 insertions(+)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 766b74723e64..05b743a819ac 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -268,6 +268,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 	     pdev->device == 0x9026)
 		xhci->quirks |= XHCI_RESET_PLL_ON_DISCONNECT;
 
+	if (pdev->vendor == PCI_VENDOR_ID_SYNOPSYS)
+		xhci->quirks |= XHCI_SG_TRB_CACHE_SIZE_QUIRK;
+
 	if (xhci->quirks & XHCI_RESET_ON_RESUME)
 		xhci_dbg_trace(xhci, trace_xhci_dbg_quirks,
 				"QUIRK: Resetting on resume");
diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index 1d4f6f85f0fe..a8526fe50811 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -288,6 +288,10 @@ static int xhci_plat_probe(struct platform_device *pdev)
 		if (device_property_read_bool(tmpdev, "quirk-broken-port-ped"))
 			xhci->quirks |= XHCI_BROKEN_PORT_PED;
 
+		if (device_property_read_bool(tmpdev,
+					      "sgl-trb-cache-size-quirk"))
+			xhci->quirks |= XHCI_SG_TRB_CACHE_SIZE_QUIRK;
+
 		device_property_read_u32(tmpdev, "imod-interval-ns",
 					 &xhci->imod_interval);
 	}
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 3289bb516201..4db825459b40 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1873,6 +1873,7 @@ struct xhci_hcd {
 #define XHCI_DEFAULT_PM_RUNTIME_ALLOW	BIT_ULL(33)
 #define XHCI_RESET_PLL_ON_DISCONNECT	BIT_ULL(34)
 #define XHCI_SNPS_BROKEN_SUSPEND    BIT_ULL(35)
+#define XHCI_SG_TRB_CACHE_SIZE_QUIRK	BIT_ULL(36)
 
 	unsigned int		num_active_eps;
 	unsigned int		limit_active_eps;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 3/4] usb: dwc3: Add device property sgl-trb-cache-size-quirk
  2020-04-21  9:47 [RFC PATCH v2 0/4] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
  2020-04-21  9:48 ` [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk Tejas Joglekar
  2020-04-21  9:48 ` [RFC PATCH v2 2/4] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK Tejas Joglekar
@ 2020-04-21  9:48 ` Tejas Joglekar
  2020-04-21  9:49 ` [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
  3 siblings, 0 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-04-21  9:48 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman, Tejas Joglekar, linux-usb; +Cc: John Youn

This commit adds the sgl-trb-cache-size-quirk property to enable
quirk for the XHCI driver with Synopsys xHC. This property is
enabled as initial property for the dwc3-haps driver.

Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
---
 Changes in v2:
 - Renamed the property

 drivers/usb/dwc3/core.c      | 2 ++
 drivers/usb/dwc3/core.h      | 2 ++
 drivers/usb/dwc3/dwc3-haps.c | 1 +
 drivers/usb/dwc3/host.c      | 6 +++++-
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index ab6323b8e323..beb95f5fdce3 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1303,6 +1303,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
 				"snps,usb3_lpm_capable");
 	dwc->usb2_lpm_disable = device_property_read_bool(dev,
 				"snps,usb2-lpm-disable");
+	dwc->sgl_trb_cache_size_quirk = device_property_read_bool(dev,
+				"snps,sgl-trb-cache-size-quirk");
 	device_property_read_u8(dev, "snps,rx-thr-num-pkt-prd",
 				&rx_thr_num_pkt_prd);
 	device_property_read_u8(dev, "snps,rx-max-burst-prd",
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index bfc5c780a963..10647cbeea68 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1007,6 +1007,7 @@ struct dwc3_scratchpad_array {
  *			not needed for DWC_usb31 version 1.70a-ea06 and below
  * @usb3_lpm_capable: set if hadrware supports Link Power Management
  * @usb2_lpm_disable: set to disable usb2 lpm
+ * @sgl_trb_cache_size_quirk: set to enable the SG list consolidation
  * @disable_scramble_quirk: set if we enable the disable scramble quirk
  * @u2exit_lfps_quirk: set if we enable u2exit lfps quirk
  * @u2ss_inp3_quirk: set if we enable P3 OK for U2/SS Inactive quirk
@@ -1206,6 +1207,7 @@ struct dwc3 {
 	unsigned		dis_start_transfer_quirk:1;
 	unsigned		usb3_lpm_capable:1;
 	unsigned		usb2_lpm_disable:1;
+	unsigned		sgl_trb_cache_size_quirk:1;
 
 	unsigned		disable_scramble_quirk:1;
 	unsigned		u2exit_lfps_quirk:1;
diff --git a/drivers/usb/dwc3/dwc3-haps.c b/drivers/usb/dwc3/dwc3-haps.c
index 3cecbf169452..9311cbe5f264 100644
--- a/drivers/usb/dwc3/dwc3-haps.c
+++ b/drivers/usb/dwc3/dwc3-haps.c
@@ -29,6 +29,7 @@ static const struct property_entry initial_properties[] = {
 	PROPERTY_ENTRY_BOOL("snps,usb3_lpm_capable"),
 	PROPERTY_ENTRY_BOOL("snps,has-lpm-erratum"),
 	PROPERTY_ENTRY_BOOL("snps,dis_enblslpm_quirk"),
+	PROPERTY_ENTRY_BOOL("snps,sgl-trb-cache-size-quirk"),
 	PROPERTY_ENTRY_BOOL("linux,sysdev_is_parent"),
 	{ },
 };
diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c
index 86dbd012b984..c1d851ff39b1 100644
--- a/drivers/usb/dwc3/host.c
+++ b/drivers/usb/dwc3/host.c
@@ -44,7 +44,7 @@ static int dwc3_host_get_irq(struct dwc3 *dwc)
 
 int dwc3_host_init(struct dwc3 *dwc)
 {
-	struct property_entry	props[4];
+	struct property_entry	props[5];
 	struct platform_device	*xhci;
 	int			ret, irq;
 	struct resource		*res;
@@ -95,6 +95,10 @@ int dwc3_host_init(struct dwc3 *dwc)
 	if (dwc->usb2_lpm_disable)
 		props[prop_idx++] = PROPERTY_ENTRY_BOOL("usb2-lpm-disable");
 
+	if (dwc->sgl_trb_cache_size_quirk)
+		props[prop_idx++] =
+			PROPERTY_ENTRY_BOOL("sgl-trb-cache-size-quirk");
+
 	/**
 	 * WORKAROUND: dwc3 revisions <=3.00a have a limitation
 	 * where Port Disable command doesn't work.
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-04-21  9:47 [RFC PATCH v2 0/4] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
                   ` (2 preceding siblings ...)
  2020-04-21  9:48 ` [RFC PATCH v2 3/4] usb: dwc3: Add device property sgl-trb-cache-size-quirk Tejas Joglekar
@ 2020-04-21  9:49 ` Tejas Joglekar
  2020-05-18  8:38   ` Tejas Joglekar
  2020-05-18 16:21   ` Mathias Nyman
  3 siblings, 2 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-04-21  9:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tejas Joglekar, linux-usb, Mathias Nyman; +Cc: John Youn

The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
for HS. The controller loads and updates the TRB cache from the transfer
ring in system memory whenever the driver issues a start transfer or
update transfer command.

For chained TRBs, the Synopsys xHC requires that the total amount of
bytes for all TRBs loaded in the TRB cache be greater than or equal to 1
MPS. Or the chain ends within the TRB cache (with a last TRB).

If this requirement is not met, the controller will not be able to send
or receive a packet and it will hang causing a driver timeout and error.

This can be a problem if a class driver queues SG requests with many
small-buffer entries. The XHCI driver will create a chained TRB for each
entry which may trigger this issue.

This patch adds logic to the XHCI driver to detect and prevent this from
happening.

For every (TRB_CACHE_SIZE - 2), we check the total buffer size of
the SG list and if the last window of (TRB_CACHE_SIZE - 2) SG list length
and we don't make up at least 1 MPS, we create a temporary buffer to
consolidate full SG list into the buffer.

We check at (TRB_CACHE_SIZE - 2) window because it is possible that there
would be a link and/or event data TRB that take up to 2 of the cache
entries.

We discovered this issue with devices on other platforms but have not
yet come across any device that triggers this on Linux. But it could be
a real problem now or in the future. All it takes is N number of small
chained TRBs. And other instances of the Synopsys IP may have smaller
values for the TRB_CACHE_SIZE which would exacerbate the problem.

Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
---
 Changes in v2:
 - Removed redundant debug messages
 - Modified logic to remove unnecessary changes in hcd.c
 - Rename the quirk

 drivers/usb/host/xhci-ring.c |   2 +-
 drivers/usb/host/xhci.c      | 125 +++++++++++++++++++++++++++++++++++++++++++
 drivers/usb/host/xhci.h      |   4 ++
 3 files changed, 130 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index a78787bb5133..2fad9474912a 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3291,7 +3291,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 
 	full_len = urb->transfer_buffer_length;
 	/* If we have scatter/gather list, we use it. */
-	if (urb->num_sgs) {
+	if (urb->num_sgs && !(urb->transfer_flags & URB_DMA_MAP_SINGLE)) {
 		num_sgs = urb->num_mapped_sgs;
 		sg = urb->sg;
 		addr = (u64) sg_dma_address(sg);
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index fe38275363e0..15f06bc6b1ad 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1256,6 +1256,106 @@ EXPORT_SYMBOL_GPL(xhci_resume);
 
 /*-------------------------------------------------------------------------*/
 
+static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
+{
+	void *temp;
+	int ret = 0;
+	unsigned int len;
+	unsigned int buf_len;
+	enum dma_data_direction dir;
+	struct xhci_hcd *xhci;
+
+	xhci = hcd_to_xhci(hcd);
+	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+	buf_len = urb->transfer_buffer_length;
+
+	temp = kzalloc_node(buf_len, GFP_ATOMIC,
+			    dev_to_node(hcd->self.sysdev));
+
+	if (usb_urb_dir_out(urb))
+		len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
+					 temp, buf_len, 0);
+
+	urb->transfer_buffer = temp;
+	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
+					   urb->transfer_buffer,
+					   urb->transfer_buffer_length,
+					   dir);
+
+	if (dma_mapping_error(hcd->self.sysdev,
+			      urb->transfer_dma)) {
+		ret = -EAGAIN;
+		kfree(temp);
+	} else {
+		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
+	}
+
+	return ret;
+}
+
+static bool xhci_urb_temp_buffer_required(struct usb_hcd *hcd,
+					  struct urb *urb)
+{
+	bool ret = false;
+	unsigned int i;
+	unsigned int len = 0;
+	unsigned int buf_len;
+	unsigned int trb_size;
+	unsigned int max_pkt;
+	struct scatterlist *sg;
+	struct scatterlist *tail_sg;
+
+	sg = urb->sg;
+	tail_sg = urb->sg;
+	buf_len = urb->transfer_buffer_length;
+	max_pkt = usb_endpoint_maxp(&urb->ep->desc);
+
+	if (!urb->num_sgs)
+		return ret;
+
+	if (urb->dev->speed >= USB_SPEED_SUPER)
+		trb_size = TRB_CACHE_SIZE_SS;
+	else
+		trb_size = TRB_CACHE_SIZE_HS;
+
+	if (urb->transfer_buffer_length != 0 &&
+	    !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
+		for_each_sg(urb->sg, sg, urb->num_sgs, i) {
+			len = len + sg->length;
+			if (i > trb_size - 2) {
+				len = len - tail_sg->length;
+				if (len < max_pkt) {
+					ret = true;
+					break;
+				}
+
+				tail_sg = sg_next(tail_sg);
+			}
+		}
+	}
+	return ret;
+}
+
+static void xhci_unmap_temp_buf(struct urb *urb)
+{
+	struct scatterlist *sg;
+	unsigned int len;
+	unsigned int buf_len;
+
+	sg = urb->sg;
+	buf_len = urb->transfer_buffer_length;
+
+	if (usb_urb_dir_in(urb)) {
+		len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
+					   urb->transfer_buffer,
+					   buf_len,
+					   0);
+	}
+
+	kfree(urb->transfer_buffer);
+	urb->transfer_buffer = NULL;
+}
+
 /*
  * Bypass the DMA mapping if URB is suitable for Immediate Transfer (IDT),
  * we'll copy the actual data into the TRB address register. This is limited to
@@ -1265,12 +1365,36 @@ EXPORT_SYMBOL_GPL(xhci_resume);
 static int xhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
 				gfp_t mem_flags)
 {
+	struct xhci_hcd *xhci;
+
+	xhci = hcd_to_xhci(hcd);
+
 	if (xhci_urb_suitable_for_idt(urb))
 		return 0;
 
+	if (xhci->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK) {
+		if (xhci_urb_temp_buffer_required(hcd, urb))
+			return xhci_map_temp_buffer(hcd, urb);
+	}
 	return usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
 }
 
+static void xhci_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
+{
+	struct xhci_hcd *xhci;
+	bool unmap_temp_buf = false;
+
+	xhci = hcd_to_xhci(hcd);
+
+	if (urb->num_sgs && (urb->transfer_flags & URB_DMA_MAP_SINGLE))
+		unmap_temp_buf = true;
+
+	usb_hcd_unmap_urb_for_dma(hcd, urb);
+
+	if ((xhci->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK) && unmap_temp_buf)
+		xhci_unmap_temp_buf(urb);
+}
+
 /**
  * xhci_get_endpoint_index - Used for passing endpoint bitmasks between the core and
  * HCDs.  Find the index for an endpoint given its descriptor.  Use the return
@@ -5315,6 +5439,7 @@ static const struct hc_driver xhci_hc_driver = {
 	 * managing i/o requests and associated device resources
 	 */
 	.map_urb_for_dma =      xhci_map_urb_for_dma,
+	.unmap_urb_for_dma =    xhci_unmap_urb_for_dma,
 	.urb_enqueue =		xhci_urb_enqueue,
 	.urb_dequeue =		xhci_urb_dequeue,
 	.alloc_dev =		xhci_alloc_dev,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 4db825459b40..14600214188f 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1330,6 +1330,10 @@ enum xhci_setup_dev {
 #define TRB_SIA			(1<<31)
 #define TRB_FRAME_ID(p)		(((p) & 0x7ff) << 20)
 
+/* TRB cache size for xHC with TRB cache */
+#define TRB_CACHE_SIZE_HS	8
+#define TRB_CACHE_SIZE_SS	16
+
 struct xhci_generic_trb {
 	__le32 field[4];
 };
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk
  2020-04-21  9:48 ` [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk Tejas Joglekar
@ 2020-05-06 20:15   ` Rob Herring
  2020-05-07 16:03     ` Tejas Joglekar
  2020-05-06 20:16   ` Rob Herring
  1 sibling, 1 reply; 18+ messages in thread
From: Rob Herring @ 2020-05-06 20:15 UTC (permalink / raw)
  To: Tejas Joglekar; +Cc: Greg Kroah-Hartman, linux-usb, devicetree, John Youn

On Tue, Apr 21, 2020 at 03:18:09PM +0530, Tejas Joglekar wrote:
> This commit adds the documentation for sgl-trb-cache-size-quirk, and
> snps,sgl-trb-cache-size-quirk property. These when set enables the
> quirk for XHCI driver for consolidation of sg list into a temporary
> buffer when small buffer sizes are scattered over the sg list not
> making up to MPS or total transfer size within TRB cache size with
> Synopsys xHC.
> 
> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
> ---
>  Changes in v2:
>  - Renamed the property
> 
>  Documentation/devicetree/bindings/usb/dwc3.txt     | 4 ++++
>  Documentation/devicetree/bindings/usb/usb-xhci.txt | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt b/Documentation/devicetree/bindings/usb/dwc3.txt
> index 9946ff9ba735..6d0418ee4dbd 100644
> --- a/Documentation/devicetree/bindings/usb/dwc3.txt
> +++ b/Documentation/devicetree/bindings/usb/dwc3.txt
> @@ -104,6 +104,10 @@ Optional properties:
>  			this and tx-thr-num-pkt-prd to a valid, non-zero value
>  			1-16 (DWC_usb31 programming guide section 1.2.3) to
>  			enable periodic ESS TX threshold.
> + - snps,sgl-trb-cache-size-quirk: enable sg list consolidation - host mode
> +			only. Set to use SG buffers of at least MPS size
> +			by consolidating smaller SG buffers list into a
> +			single buffer.
>  
>   - <DEPRECATED> tx-fifo-resize: determines if the FIFO *has* to be reallocated.
>   - snps,incr-burst-type-adjustment: Value for INCR burst type of GSBUSCFG0
> diff --git a/Documentation/devicetree/bindings/usb/usb-xhci.txt b/Documentation/devicetree/bindings/usb/usb-xhci.txt
> index 3f378951d624..14d900474894 100644
> --- a/Documentation/devicetree/bindings/usb/usb-xhci.txt
> +++ b/Documentation/devicetree/bindings/usb/usb-xhci.txt
> @@ -43,6 +43,9 @@ Optional properties:
>    - quirk-broken-port-ped: set if the controller has broken port disable mechanism
>    - imod-interval-ns: default interrupt moderation interval is 5000ns
>    - phys : see usb-hcd.yaml in the current directory
> +  - sgl-trb-cache-size-quirk: set if you need to consolidate sg list into a
> +    temporary buffer when small SG buffer sizes does not make upto MPS
> +    size or total transfer size across the TRB cache size.

Why do we have 2 different names?

>  
>  additionally the properties from usb-hcd.yaml (in the current directory) are
>  supported.
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk
  2020-04-21  9:48 ` [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk Tejas Joglekar
  2020-05-06 20:15   ` Rob Herring
@ 2020-05-06 20:16   ` Rob Herring
  2020-05-07 16:07     ` Tejas Joglekar
  1 sibling, 1 reply; 18+ messages in thread
From: Rob Herring @ 2020-05-06 20:16 UTC (permalink / raw)
  To: Tejas Joglekar; +Cc: Greg Kroah-Hartman, linux-usb, devicetree, John Youn

On Tue, Apr 21, 2020 at 03:18:09PM +0530, Tejas Joglekar wrote:
> This commit adds the documentation for sgl-trb-cache-size-quirk, and
> snps,sgl-trb-cache-size-quirk property. These when set enables the
> quirk for XHCI driver for consolidation of sg list into a temporary
> buffer when small buffer sizes are scattered over the sg list not
> making up to MPS or total transfer size within TRB cache size with
> Synopsys xHC.
> 
> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>

The author and S-o-b emails don't match.

> ---
>  Changes in v2:
>  - Renamed the property
> 
>  Documentation/devicetree/bindings/usb/dwc3.txt     | 4 ++++
>  Documentation/devicetree/bindings/usb/usb-xhci.txt | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt b/Documentation/devicetree/bindings/usb/dwc3.txt
> index 9946ff9ba735..6d0418ee4dbd 100644
> --- a/Documentation/devicetree/bindings/usb/dwc3.txt
> +++ b/Documentation/devicetree/bindings/usb/dwc3.txt
> @@ -104,6 +104,10 @@ Optional properties:
>  			this and tx-thr-num-pkt-prd to a valid, non-zero value
>  			1-16 (DWC_usb31 programming guide section 1.2.3) to
>  			enable periodic ESS TX threshold.
> + - snps,sgl-trb-cache-size-quirk: enable sg list consolidation - host mode
> +			only. Set to use SG buffers of at least MPS size
> +			by consolidating smaller SG buffers list into a
> +			single buffer.
>  
>   - <DEPRECATED> tx-fifo-resize: determines if the FIFO *has* to be reallocated.
>   - snps,incr-burst-type-adjustment: Value for INCR burst type of GSBUSCFG0
> diff --git a/Documentation/devicetree/bindings/usb/usb-xhci.txt b/Documentation/devicetree/bindings/usb/usb-xhci.txt
> index 3f378951d624..14d900474894 100644
> --- a/Documentation/devicetree/bindings/usb/usb-xhci.txt
> +++ b/Documentation/devicetree/bindings/usb/usb-xhci.txt
> @@ -43,6 +43,9 @@ Optional properties:
>    - quirk-broken-port-ped: set if the controller has broken port disable mechanism
>    - imod-interval-ns: default interrupt moderation interval is 5000ns
>    - phys : see usb-hcd.yaml in the current directory
> +  - sgl-trb-cache-size-quirk: set if you need to consolidate sg list into a
> +    temporary buffer when small SG buffer sizes does not make upto MPS
> +    size or total transfer size across the TRB cache size.
>  
>  additionally the properties from usb-hcd.yaml (in the current directory) are
>  supported.
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk
  2020-05-06 20:15   ` Rob Herring
@ 2020-05-07 16:03     ` Tejas Joglekar
  0 siblings, 0 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-05-07 16:03 UTC (permalink / raw)
  To: Rob Herring, Tejas Joglekar
  Cc: Greg Kroah-Hartman, linux-usb, devicetree, John Youn

Hi,
On 5/7/2020 1:45 AM, Rob Herring wrote:
> On Tue, Apr 21, 2020 at 03:18:09PM +0530, Tejas Joglekar wrote:
>> This commit adds the documentation for sgl-trb-cache-size-quirk, and
>> snps,sgl-trb-cache-size-quirk property. These when set enables the
>> quirk for XHCI driver for consolidation of sg list into a temporary
>> buffer when small buffer sizes are scattered over the sg list not
>> making up to MPS or total transfer size within TRB cache size with
>> Synopsys xHC.
>>
>> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
>> ---
>>  Changes in v2:
>>  - Renamed the property
>>
>>  Documentation/devicetree/bindings/usb/dwc3.txt     | 4 ++++
>>  Documentation/devicetree/bindings/usb/usb-xhci.txt | 3 +++
>>  2 files changed, 7 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt b/Documentation/devicetree/bindings/usb/dwc3.txt
>> index 9946ff9ba735..6d0418ee4dbd 100644
>> --- a/Documentation/devicetree/bindings/usb/dwc3.txt
>> +++ b/Documentation/devicetree/bindings/usb/dwc3.txt
>> @@ -104,6 +104,10 @@ Optional properties:
>>  			this and tx-thr-num-pkt-prd to a valid, non-zero value
>>  			1-16 (DWC_usb31 programming guide section 1.2.3) to
>>  			enable periodic ESS TX threshold.
>> + - snps,sgl-trb-cache-size-quirk: enable sg list consolidation - host mode
>> +			only. Set to use SG buffers of at least MPS size
>> +			by consolidating smaller SG buffers list into a
>> +			single buffer.
>>  
>>   - <DEPRECATED> tx-fifo-resize: determines if the FIFO *has* to be reallocated.
>>   - snps,incr-burst-type-adjustment: Value for INCR burst type of GSBUSCFG0
>> diff --git a/Documentation/devicetree/bindings/usb/usb-xhci.txt b/Documentation/devicetree/bindings/usb/usb-xhci.txt
>> index 3f378951d624..14d900474894 100644
>> --- a/Documentation/devicetree/bindings/usb/usb-xhci.txt
>> +++ b/Documentation/devicetree/bindings/usb/usb-xhci.txt
>> @@ -43,6 +43,9 @@ Optional properties:
>>    - quirk-broken-port-ped: set if the controller has broken port disable mechanism
>>    - imod-interval-ns: default interrupt moderation interval is 5000ns
>>    - phys : see usb-hcd.yaml in the current directory
>> +  - sgl-trb-cache-size-quirk: set if you need to consolidate sg list into a
>> +    temporary buffer when small SG buffer sizes does not make upto MPS
>> +    size or total transfer size across the TRB cache size.
> 
> Why do we have 2 different names?
> 
I tried to follow the usb2-lpm-disable property implementation where in usb-xhci
the snps is not added to the property name. Should I use snps,sgl-trb-cache-size-quirk
in usb-xhci too ?
>>  
>>  additionally the properties from usb-hcd.yaml (in the current directory) are
>>  supported.
>> -- 
>> 2.11.0
>>

Thanks & Regards,
  Tejas Joglekar


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk
  2020-05-06 20:16   ` Rob Herring
@ 2020-05-07 16:07     ` Tejas Joglekar
  0 siblings, 0 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-05-07 16:07 UTC (permalink / raw)
  To: Rob Herring, Tejas Joglekar
  Cc: Greg Kroah-Hartman, linux-usb, devicetree, John Youn

Hi,
On 5/7/2020 1:46 AM, Rob Herring wrote:
> On Tue, Apr 21, 2020 at 03:18:09PM +0530, Tejas Joglekar wrote:
>> This commit adds the documentation for sgl-trb-cache-size-quirk, and
>> snps,sgl-trb-cache-size-quirk property. These when set enables the
>> quirk for XHCI driver for consolidation of sg list into a temporary
>> buffer when small buffer sizes are scattered over the sg list not
>> making up to MPS or total transfer size within TRB cache size with
>> Synopsys xHC.
>>
>> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
> 
> The author and S-o-b emails don't match.
> 
I intend to use short email, but the server mangles the full email even if we
set it manually to short email. I have been using the same in the past to send
the patches to community. Do you want me to change the sign-off to sender email address? 

>> ---
>>  Changes in v2:
>>  - Renamed the property
>>
>>  Documentation/devicetree/bindings/usb/dwc3.txt     | 4 ++++
>>  Documentation/devicetree/bindings/usb/usb-xhci.txt | 3 +++
>>  2 files changed, 7 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/usb/dwc3.txt b/Documentation/devicetree/bindings/usb/dwc3.txt
>> index 9946ff9ba735..6d0418ee4dbd 100644
>> --- a/Documentation/devicetree/bindings/usb/dwc3.txt
>> +++ b/Documentation/devicetree/bindings/usb/dwc3.txt
>> @@ -104,6 +104,10 @@ Optional properties:
>>  			this and tx-thr-num-pkt-prd to a valid, non-zero value
>>  			1-16 (DWC_usb31 programming guide section 1.2.3) to
>>  			enable periodic ESS TX threshold.
>> + - snps,sgl-trb-cache-size-quirk: enable sg list consolidation - host mode
>> +			only. Set to use SG buffers of at least MPS size
>> +			by consolidating smaller SG buffers list into a
>> +			single buffer.
>>  
>>   - <DEPRECATED> tx-fifo-resize: determines if the FIFO *has* to be reallocated.
>>   - snps,incr-burst-type-adjustment: Value for INCR burst type of GSBUSCFG0
>> diff --git a/Documentation/devicetree/bindings/usb/usb-xhci.txt b/Documentation/devicetree/bindings/usb/usb-xhci.txt
>> index 3f378951d624..14d900474894 100644
>> --- a/Documentation/devicetree/bindings/usb/usb-xhci.txt
>> +++ b/Documentation/devicetree/bindings/usb/usb-xhci.txt
>> @@ -43,6 +43,9 @@ Optional properties:
>>    - quirk-broken-port-ped: set if the controller has broken port disable mechanism
>>    - imod-interval-ns: default interrupt moderation interval is 5000ns
>>    - phys : see usb-hcd.yaml in the current directory
>> +  - sgl-trb-cache-size-quirk: set if you need to consolidate sg list into a
>> +    temporary buffer when small SG buffer sizes does not make upto MPS
>> +    size or total transfer size across the TRB cache size.
>>  
>>  additionally the properties from usb-hcd.yaml (in the current directory) are
>>  supported.
>> -- 
>> 2.11.0
>>
Thanks & Regards,
  Tejas Joglekar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-04-21  9:49 ` [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
@ 2020-05-18  8:38   ` Tejas Joglekar
  2020-05-18 16:21   ` Mathias Nyman
  1 sibling, 0 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-05-18  8:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-usb, Mathias Nyman; +Cc: John Youn

Hi Mathias,
 Can you please review and comment?
On 4/21/2020 3:19 PM, Tejas Joglekar wrote:
> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
> for HS. The controller loads and updates the TRB cache from the transfer
> ring in system memory whenever the driver issues a start transfer or
> update transfer command.
> 
> For chained TRBs, the Synopsys xHC requires that the total amount of
> bytes for all TRBs loaded in the TRB cache be greater than or equal to 1
> MPS. Or the chain ends within the TRB cache (with a last TRB).
> 
> If this requirement is not met, the controller will not be able to send
> or receive a packet and it will hang causing a driver timeout and error.
> 
> This can be a problem if a class driver queues SG requests with many
> small-buffer entries. The XHCI driver will create a chained TRB for each
> entry which may trigger this issue.
> 
> This patch adds logic to the XHCI driver to detect and prevent this from
> happening.
> 
> For every (TRB_CACHE_SIZE - 2), we check the total buffer size of
> the SG list and if the last window of (TRB_CACHE_SIZE - 2) SG list length
> and we don't make up at least 1 MPS, we create a temporary buffer to
> consolidate full SG list into the buffer.
> 
> We check at (TRB_CACHE_SIZE - 2) window because it is possible that there
> would be a link and/or event data TRB that take up to 2 of the cache
> entries.
> 
> We discovered this issue with devices on other platforms but have not
> yet come across any device that triggers this on Linux. But it could be
> a real problem now or in the future. All it takes is N number of small
> chained TRBs. And other instances of the Synopsys IP may have smaller
> values for the TRB_CACHE_SIZE which would exacerbate the problem.
> 
> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
> ---
>  Changes in v2:
>  - Removed redundant debug messages
>  - Modified logic to remove unnecessary changes in hcd.c
>  - Rename the quirk
> 
>  drivers/usb/host/xhci-ring.c |   2 +-
>  drivers/usb/host/xhci.c      | 125 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/usb/host/xhci.h      |   4 ++
>  3 files changed, 130 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index a78787bb5133..2fad9474912a 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -3291,7 +3291,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
>  
>  	full_len = urb->transfer_buffer_length;
>  	/* If we have scatter/gather list, we use it. */
> -	if (urb->num_sgs) {
> +	if (urb->num_sgs && !(urb->transfer_flags & URB_DMA_MAP_SINGLE)) {
>  		num_sgs = urb->num_mapped_sgs;
>  		sg = urb->sg;
>  		addr = (u64) sg_dma_address(sg);
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index fe38275363e0..15f06bc6b1ad 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1256,6 +1256,106 @@ EXPORT_SYMBOL_GPL(xhci_resume);
>  
>  /*-------------------------------------------------------------------------*/
>  
> +static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
> +{
> +	void *temp;
> +	int ret = 0;
> +	unsigned int len;
> +	unsigned int buf_len;
> +	enum dma_data_direction dir;
> +	struct xhci_hcd *xhci;
> +
> +	xhci = hcd_to_xhci(hcd);
> +	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> +	buf_len = urb->transfer_buffer_length;
> +
> +	temp = kzalloc_node(buf_len, GFP_ATOMIC,
> +			    dev_to_node(hcd->self.sysdev));
> +
> +	if (usb_urb_dir_out(urb))
> +		len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
> +					 temp, buf_len, 0);
> +
> +	urb->transfer_buffer = temp;
> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
> +					   urb->transfer_buffer,
> +					   urb->transfer_buffer_length,
> +					   dir);
> +
> +	if (dma_mapping_error(hcd->self.sysdev,
> +			      urb->transfer_dma)) {
> +		ret = -EAGAIN;
> +		kfree(temp);
> +	} else {
> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
> +	}
> +
> +	return ret;
> +}
> +
> +static bool xhci_urb_temp_buffer_required(struct usb_hcd *hcd,
> +					  struct urb *urb)
> +{
> +	bool ret = false;
> +	unsigned int i;
> +	unsigned int len = 0;
> +	unsigned int buf_len;
> +	unsigned int trb_size;
> +	unsigned int max_pkt;
> +	struct scatterlist *sg;
> +	struct scatterlist *tail_sg;
> +
> +	sg = urb->sg;
> +	tail_sg = urb->sg;
> +	buf_len = urb->transfer_buffer_length;
> +	max_pkt = usb_endpoint_maxp(&urb->ep->desc);
> +
> +	if (!urb->num_sgs)
> +		return ret;
> +
> +	if (urb->dev->speed >= USB_SPEED_SUPER)
> +		trb_size = TRB_CACHE_SIZE_SS;
> +	else
> +		trb_size = TRB_CACHE_SIZE_HS;
> +
> +	if (urb->transfer_buffer_length != 0 &&
> +	    !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
> +		for_each_sg(urb->sg, sg, urb->num_sgs, i) {
> +			len = len + sg->length;
> +			if (i > trb_size - 2) {
> +				len = len - tail_sg->length;
> +				if (len < max_pkt) {
> +					ret = true;
> +					break;
> +				}
> +
> +				tail_sg = sg_next(tail_sg);
> +			}
> +		}
> +	}
> +	return ret;
> +}
> +
> +static void xhci_unmap_temp_buf(struct urb *urb)
> +{
> +	struct scatterlist *sg;
> +	unsigned int len;
> +	unsigned int buf_len;
> +
> +	sg = urb->sg;
> +	buf_len = urb->transfer_buffer_length;
> +
> +	if (usb_urb_dir_in(urb)) {
> +		len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
> +					   urb->transfer_buffer,
> +					   buf_len,
> +					   0);
> +	}
> +
> +	kfree(urb->transfer_buffer);
> +	urb->transfer_buffer = NULL;
> +}
> +
>  /*
>   * Bypass the DMA mapping if URB is suitable for Immediate Transfer (IDT),
>   * we'll copy the actual data into the TRB address register. This is limited to
> @@ -1265,12 +1365,36 @@ EXPORT_SYMBOL_GPL(xhci_resume);
>  static int xhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
>  				gfp_t mem_flags)
>  {
> +	struct xhci_hcd *xhci;
> +
> +	xhci = hcd_to_xhci(hcd);
> +
>  	if (xhci_urb_suitable_for_idt(urb))
>  		return 0;
>  
> +	if (xhci->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK) {
> +		if (xhci_urb_temp_buffer_required(hcd, urb))
> +			return xhci_map_temp_buffer(hcd, urb);
> +	}
>  	return usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
>  }
>  
> +static void xhci_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
> +{
> +	struct xhci_hcd *xhci;
> +	bool unmap_temp_buf = false;
> +
> +	xhci = hcd_to_xhci(hcd);
> +
> +	if (urb->num_sgs && (urb->transfer_flags & URB_DMA_MAP_SINGLE))
> +		unmap_temp_buf = true;
> +
> +	usb_hcd_unmap_urb_for_dma(hcd, urb);
> +
> +	if ((xhci->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK) && unmap_temp_buf)
> +		xhci_unmap_temp_buf(urb);
> +}
> +
>  /**
>   * xhci_get_endpoint_index - Used for passing endpoint bitmasks between the core and
>   * HCDs.  Find the index for an endpoint given its descriptor.  Use the return
> @@ -5315,6 +5439,7 @@ static const struct hc_driver xhci_hc_driver = {
>  	 * managing i/o requests and associated device resources
>  	 */
>  	.map_urb_for_dma =      xhci_map_urb_for_dma,
> +	.unmap_urb_for_dma =    xhci_unmap_urb_for_dma,
>  	.urb_enqueue =		xhci_urb_enqueue,
>  	.urb_dequeue =		xhci_urb_dequeue,
>  	.alloc_dev =		xhci_alloc_dev,
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 4db825459b40..14600214188f 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1330,6 +1330,10 @@ enum xhci_setup_dev {
>  #define TRB_SIA			(1<<31)
>  #define TRB_FRAME_ID(p)		(((p) & 0x7ff) << 20)
>  
> +/* TRB cache size for xHC with TRB cache */
> +#define TRB_CACHE_SIZE_HS	8
> +#define TRB_CACHE_SIZE_SS	16
> +
>  struct xhci_generic_trb {
>  	__le32 field[4];
>  };
> 

Thanks & Regards,
 Tejas Joglekar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-04-21  9:49 ` [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
  2020-05-18  8:38   ` Tejas Joglekar
@ 2020-05-18 16:21   ` Mathias Nyman
  2020-05-20  6:41     ` Tejas Joglekar
  1 sibling, 1 reply; 18+ messages in thread
From: Mathias Nyman @ 2020-05-18 16:21 UTC (permalink / raw)
  To: Tejas Joglekar, Greg Kroah-Hartman, USB, Mathias Nyman
  Cc: John Youn, Alan Stern

On 21.4.2020 12.49, Tejas Joglekar wrote:
> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
> for HS. The controller loads and updates the TRB cache from the transfer
> ring in system memory whenever the driver issues a start transfer or
> update transfer command.
> 
> For chained TRBs, the Synopsys xHC requires that the total amount of
> bytes for all TRBs loaded in the TRB cache be greater than or equal to 1
> MPS. Or the chain ends within the TRB cache (with a last TRB).
> 
> If this requirement is not met, the controller will not be able to send
> or receive a packet and it will hang causing a driver timeout and error.
> 
> This can be a problem if a class driver queues SG requests with many
> small-buffer entries. The XHCI driver will create a chained TRB for each
> entry which may trigger this issue.
> 
> This patch adds logic to the XHCI driver to detect and prevent this from
> happening.
> 
> For every (TRB_CACHE_SIZE - 2), we check the total buffer size of
> the SG list and if the last window of (TRB_CACHE_SIZE - 2) SG list length
> and we don't make up at least 1 MPS, we create a temporary buffer to
> consolidate full SG list into the buffer.
> 
> We check at (TRB_CACHE_SIZE - 2) window because it is possible that there
> would be a link and/or event data TRB that take up to 2 of the cache
> entries.
> 
> We discovered this issue with devices on other platforms but have not
> yet come across any device that triggers this on Linux. But it could be
> a real problem now or in the future. All it takes is N number of small
> chained TRBs. And other instances of the Synopsys IP may have smaller
> values for the TRB_CACHE_SIZE which would exacerbate the problem.
> 
> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
> ---
>  Changes in v2:
>  - Removed redundant debug messages
>  - Modified logic to remove unnecessary changes in hcd.c
>  - Rename the quirk
> 
>  drivers/usb/host/xhci-ring.c |   2 +-
>  drivers/usb/host/xhci.c      | 125 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/usb/host/xhci.h      |   4 ++
>  3 files changed, 130 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index a78787bb5133..2fad9474912a 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -3291,7 +3291,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
>  
>  	full_len = urb->transfer_buffer_length;
>  	/* If we have scatter/gather list, we use it. */
> -	if (urb->num_sgs) {
> +	if (urb->num_sgs && !(urb->transfer_flags & URB_DMA_MAP_SINGLE)) {
>  		num_sgs = urb->num_mapped_sgs;
>  		sg = urb->sg;
>  		addr = (u64) sg_dma_address(sg);
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index fe38275363e0..15f06bc6b1ad 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1256,6 +1256,106 @@ EXPORT_SYMBOL_GPL(xhci_resume);
>  
>  /*-------------------------------------------------------------------------*/
>  
> +static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
> +{
> +	void *temp;
> +	int ret = 0;
> +	unsigned int len;
> +	unsigned int buf_len;
> +	enum dma_data_direction dir;
> +	struct xhci_hcd *xhci;
> +
> +	xhci = hcd_to_xhci(hcd);
> +	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> +	buf_len = urb->transfer_buffer_length;
> +
> +	temp = kzalloc_node(buf_len, GFP_ATOMIC,
> +			    dev_to_node(hcd->self.sysdev));
> +
> +	if (usb_urb_dir_out(urb))
> +		len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
> +					 temp, buf_len, 0);
> +
> +	urb->transfer_buffer = temp;
> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
> +					   urb->transfer_buffer,
> +					   urb->transfer_buffer_length,
> +					   dir);
> +
> +	if (dma_mapping_error(hcd->self.sysdev,
> +			      urb->transfer_dma)) {
> +		ret = -EAGAIN;
> +		kfree(temp);
> +	} else {
> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;

Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
appropriate.

If Greg or Alan don't object then it's fine for me as well.



> +	}
> +
> +	return ret;
> +}
> +
> +static bool xhci_urb_temp_buffer_required(struct usb_hcd *hcd,
> +					  struct urb *urb)
> +{
> +	bool ret = false;
> +	unsigned int i;
> +	unsigned int len = 0;
> +	unsigned int buf_len;
> +	unsigned int trb_size;
> +	unsigned int max_pkt;
> +	struct scatterlist *sg;
> +	struct scatterlist *tail_sg;
> +
> +	sg = urb->sg;
> +	tail_sg = urb->sg;
> +	buf_len = urb->transfer_buffer_length;
> +	max_pkt = usb_endpoint_maxp(&urb->ep->desc);
> +
> +	if (!urb->num_sgs)
> +		return ret;
> +
> +	if (urb->dev->speed >= USB_SPEED_SUPER)
> +		trb_size = TRB_CACHE_SIZE_SS;
> +	else
> +		trb_size = TRB_CACHE_SIZE_HS;
> +
> +	if (urb->transfer_buffer_length != 0 &&
> +	    !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
> +		for_each_sg(urb->sg, sg, urb->num_sgs, i) {
> +			len = len + sg->length;
> +			if (i > trb_size - 2) {
> +				len = len - tail_sg->length;
> +				if (len < max_pkt) {
> +					ret = true;
> +					break;
> +				}
> +
> +				tail_sg = sg_next(tail_sg);
> +			}
> +		}
> +	}
> +	return ret;
> +}
> +
> +static void xhci_unmap_temp_buf(struct urb *urb)
> +{
> +	struct scatterlist *sg;
> +	unsigned int len;
> +	unsigned int buf_len;
> +
> +	sg = urb->sg;
> +	buf_len = urb->transfer_buffer_length;
> +
> +	if (usb_urb_dir_in(urb)) {
> +		len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
> +					   urb->transfer_buffer,
> +					   buf_len,
> +					   0);
> +	}
> +
> +	kfree(urb->transfer_buffer);
> +	urb->transfer_buffer = NULL;

clear URB_DMA_MAP_SINGLE from urb->transfer_flags?

Everything else looks good do me.
-Mathias

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-05-18 16:21   ` Mathias Nyman
@ 2020-05-20  6:41     ` Tejas Joglekar
  2020-05-20 16:50       ` Alan Stern
  0 siblings, 1 reply; 18+ messages in thread
From: Tejas Joglekar @ 2020-05-20  6:41 UTC (permalink / raw)
  To: Mathias Nyman, Tejas Joglekar, Greg Kroah-Hartman, USB, Mathias Nyman
  Cc: John Youn, Alan Stern

Hi,
On 5/18/2020 9:51 PM, Mathias Nyman wrote:
> On 21.4.2020 12.49, Tejas Joglekar wrote:
>> The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
>> each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
>> for HS. The controller loads and updates the TRB cache from the transfer
>> ring in system memory whenever the driver issues a start transfer or
>> update transfer command.
>>
>> For chained TRBs, the Synopsys xHC requires that the total amount of
>> bytes for all TRBs loaded in the TRB cache be greater than or equal to 1
>> MPS. Or the chain ends within the TRB cache (with a last TRB).
>>
>> If this requirement is not met, the controller will not be able to send
>> or receive a packet and it will hang causing a driver timeout and error.
>>
>> This can be a problem if a class driver queues SG requests with many
>> small-buffer entries. The XHCI driver will create a chained TRB for each
>> entry which may trigger this issue.
>>
>> This patch adds logic to the XHCI driver to detect and prevent this from
>> happening.
>>
>> For every (TRB_CACHE_SIZE - 2), we check the total buffer size of
>> the SG list and if the last window of (TRB_CACHE_SIZE - 2) SG list length
>> and we don't make up at least 1 MPS, we create a temporary buffer to
>> consolidate full SG list into the buffer.
>>
>> We check at (TRB_CACHE_SIZE - 2) window because it is possible that there
>> would be a link and/or event data TRB that take up to 2 of the cache
>> entries.
>>
>> We discovered this issue with devices on other platforms but have not
>> yet come across any device that triggers this on Linux. But it could be
>> a real problem now or in the future. All it takes is N number of small
>> chained TRBs. And other instances of the Synopsys IP may have smaller
>> values for the TRB_CACHE_SIZE which would exacerbate the problem.
>>
>> Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
>> ---
>>  Changes in v2:
>>  - Removed redundant debug messages
>>  - Modified logic to remove unnecessary changes in hcd.c
>>  - Rename the quirk
>>
>>  drivers/usb/host/xhci-ring.c |   2 +-
>>  drivers/usb/host/xhci.c      | 125 +++++++++++++++++++++++++++++++++++++++++++
>>  drivers/usb/host/xhci.h      |   4 ++
>>  3 files changed, 130 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>> index a78787bb5133..2fad9474912a 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -3291,7 +3291,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
>>  
>>  	full_len = urb->transfer_buffer_length;
>>  	/* If we have scatter/gather list, we use it. */
>> -	if (urb->num_sgs) {
>> +	if (urb->num_sgs && !(urb->transfer_flags & URB_DMA_MAP_SINGLE)) {
>>  		num_sgs = urb->num_mapped_sgs;
>>  		sg = urb->sg;
>>  		addr = (u64) sg_dma_address(sg);
>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
>> index fe38275363e0..15f06bc6b1ad 100644
>> --- a/drivers/usb/host/xhci.c
>> +++ b/drivers/usb/host/xhci.c
>> @@ -1256,6 +1256,106 @@ EXPORT_SYMBOL_GPL(xhci_resume);
>>  
>>  /*-------------------------------------------------------------------------*/
>>  
>> +static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
>> +{
>> +	void *temp;
>> +	int ret = 0;
>> +	unsigned int len;
>> +	unsigned int buf_len;
>> +	enum dma_data_direction dir;
>> +	struct xhci_hcd *xhci;
>> +
>> +	xhci = hcd_to_xhci(hcd);
>> +	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
>> +	buf_len = urb->transfer_buffer_length;
>> +
>> +	temp = kzalloc_node(buf_len, GFP_ATOMIC,
>> +			    dev_to_node(hcd->self.sysdev));
>> +
>> +	if (usb_urb_dir_out(urb))
>> +		len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
>> +					 temp, buf_len, 0);
>> +
>> +	urb->transfer_buffer = temp;
>> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
>> +					   urb->transfer_buffer,
>> +					   urb->transfer_buffer_length,
>> +					   dir);
>> +
>> +	if (dma_mapping_error(hcd->self.sysdev,
>> +			      urb->transfer_dma)) {
>> +		ret = -EAGAIN;
>> +		kfree(temp);
>> +	} else {
>> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
> 
> Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
> appropriate.
> 
> If Greg or Alan don't object then it's fine for me as well.
> 
> 
> 
@Greg/Alan do you suggest any change for the flag here?
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static bool xhci_urb_temp_buffer_required(struct usb_hcd *hcd,
>> +					  struct urb *urb)
>> +{
>> +	bool ret = false;
>> +	unsigned int i;
>> +	unsigned int len = 0;
>> +	unsigned int buf_len;
>> +	unsigned int trb_size;
>> +	unsigned int max_pkt;
>> +	struct scatterlist *sg;
>> +	struct scatterlist *tail_sg;
>> +
>> +	sg = urb->sg;
>> +	tail_sg = urb->sg;
>> +	buf_len = urb->transfer_buffer_length;
>> +	max_pkt = usb_endpoint_maxp(&urb->ep->desc);
>> +
>> +	if (!urb->num_sgs)
>> +		return ret;
>> +
>> +	if (urb->dev->speed >= USB_SPEED_SUPER)
>> +		trb_size = TRB_CACHE_SIZE_SS;
>> +	else
>> +		trb_size = TRB_CACHE_SIZE_HS;
>> +
>> +	if (urb->transfer_buffer_length != 0 &&
>> +	    !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
>> +		for_each_sg(urb->sg, sg, urb->num_sgs, i) {
>> +			len = len + sg->length;
>> +			if (i > trb_size - 2) {
>> +				len = len - tail_sg->length;
>> +				if (len < max_pkt) {
>> +					ret = true;
>> +					break;
>> +				}
>> +
>> +				tail_sg = sg_next(tail_sg);
>> +			}
>> +		}
>> +	}
>> +	return ret;
>> +}
>> +
>> +static void xhci_unmap_temp_buf(struct urb *urb)
>> +{
>> +	struct scatterlist *sg;
>> +	unsigned int len;
>> +	unsigned int buf_len;
>> +
>> +	sg = urb->sg;
>> +	buf_len = urb->transfer_buffer_length;
>> +
>> +	if (usb_urb_dir_in(urb)) {
>> +		len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
>> +					   urb->transfer_buffer,
>> +					   buf_len,
>> +					   0);
>> +	}
>> +
>> +	kfree(urb->transfer_buffer);
>> +	urb->transfer_buffer = NULL;
> 
> clear URB_DMA_MAP_SINGLE from urb->transfer_flags?
>
In current implmentation the dma_unmap is done with existing 
function usb_hcd_unmap_urb_for_dma() and then buffer is copied
in the xhci_unmap_temp_buf(), so URB_DMA_MAP_SINGLE flag gets
cleared with usb_hcd_unmap_urb_for_dma().
I think in next patch set I will add logic for dma_unmap 
in xhci_unmap_temp_buf() function.  

 
> Everything else looks good do me.
> -Mathias
> 

Thanks & Regards,
 Tejas Joglekar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-05-20  6:41     ` Tejas Joglekar
@ 2020-05-20 16:50       ` Alan Stern
  2020-05-20 17:00         ` Tejas Joglekar
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Stern @ 2020-05-20 16:50 UTC (permalink / raw)
  To: Tejas Joglekar
  Cc: Mathias Nyman, Greg Kroah-Hartman, USB, Mathias Nyman, John Youn

On Wed, May 20, 2020 at 06:41:16AM +0000, Tejas Joglekar wrote:
> >> +	urb->transfer_buffer = temp;
> >> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
> >> +					   urb->transfer_buffer,
> >> +					   urb->transfer_buffer_length,
> >> +					   dir);
> >> +
> >> +	if (dma_mapping_error(hcd->self.sysdev,
> >> +			      urb->transfer_dma)) {
> >> +		ret = -EAGAIN;
> >> +		kfree(temp);
> >> +	} else {
> >> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
> > 
> > Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
> > appropriate.
> > 
> > If Greg or Alan don't object then it's fine for me as well.
> > 
> > 
> > 
> @Greg/Alan do you suggest any change for the flag here?

This requires care, because the core will already have set other flags 
for this URB.  I don't remember the details and I don't have time to 
check them now.  But it wouldn't be at all surprising if the core 
doesn't expect an URB both to use SG and to use DMA_MAP_SINGLE.

Alan Stern

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-05-20 16:50       ` Alan Stern
@ 2020-05-20 17:00         ` Tejas Joglekar
  2020-05-20 17:44           ` Alan Stern
  0 siblings, 1 reply; 18+ messages in thread
From: Tejas Joglekar @ 2020-05-20 17:00 UTC (permalink / raw)
  To: Alan Stern, Tejas Joglekar
  Cc: Mathias Nyman, Greg Kroah-Hartman, USB, Mathias Nyman, John Youn

Hi,
On 5/20/2020 10:20 PM, Alan Stern wrote:
> On Wed, May 20, 2020 at 06:41:16AM +0000, Tejas Joglekar wrote:
>>>> +	urb->transfer_buffer = temp;
>>>> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
>>>> +					   urb->transfer_buffer,
>>>> +					   urb->transfer_buffer_length,
>>>> +					   dir);
>>>> +
>>>> +	if (dma_mapping_error(hcd->self.sysdev,
>>>> +			      urb->transfer_dma)) {
>>>> +		ret = -EAGAIN;
>>>> +		kfree(temp);
>>>> +	} else {
>>>> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
>>>
>>> Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
>>> appropriate.
>>>
>>> If Greg or Alan don't object then it's fine for me as well.
>>>
>>>
>>>
>> @Greg/Alan do you suggest any change for the flag here?
> 
> This requires care, because the core will already have set other flags 
> for this URB.  I don't remember the details and I don't have time to 
> check them now.  But it wouldn't be at all surprising if the core 
> doesn't expect an URB both to use SG and to use DMA_MAP_SINGLE.
> 
When this quirk is enabled and it is required to consolidate SG list to 
single buffer no other flags are set for the URB. Only question here is 
if it is good to use DMA_MAP_SINGLE flag for the consolidated SG list 
into a single temporary buffer.

> Alan Stern
> 
Thanks & Regards,
 Tejas Joglekar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-05-20 17:00         ` Tejas Joglekar
@ 2020-05-20 17:44           ` Alan Stern
  2020-05-22  1:58             ` Tejas Joglekar
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Stern @ 2020-05-20 17:44 UTC (permalink / raw)
  To: Tejas Joglekar
  Cc: Mathias Nyman, Greg Kroah-Hartman, USB, Mathias Nyman, John Youn

On Wed, May 20, 2020 at 05:00:50PM +0000, Tejas Joglekar wrote:
> Hi,
> On 5/20/2020 10:20 PM, Alan Stern wrote:
> > On Wed, May 20, 2020 at 06:41:16AM +0000, Tejas Joglekar wrote:
> >>>> +	urb->transfer_buffer = temp;
> >>>> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
> >>>> +					   urb->transfer_buffer,
> >>>> +					   urb->transfer_buffer_length,
> >>>> +					   dir);
> >>>> +
> >>>> +	if (dma_mapping_error(hcd->self.sysdev,
> >>>> +			      urb->transfer_dma)) {
> >>>> +		ret = -EAGAIN;
> >>>> +		kfree(temp);
> >>>> +	} else {
> >>>> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
> >>>
> >>> Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
> >>> appropriate.
> >>>
> >>> If Greg or Alan don't object then it's fine for me as well.
> >>>
> >>>
> >>>
> >> @Greg/Alan do you suggest any change for the flag here?
> > 
> > This requires care, because the core will already have set other flags 
> > for this URB.  I don't remember the details and I don't have time to 
> > check them now.  But it wouldn't be at all surprising if the core 
> > doesn't expect an URB both to use SG and to use DMA_MAP_SINGLE.
> > 
> When this quirk is enabled and it is required to consolidate SG list to 
> single buffer no other flags are set for the URB.

I don't believe that.  What about URB_DMA_MAP_SG or 
URB_DMA_MAP_PAGE?  The USB core certainly would have set one of them 
already.  Not only would you need to clear that flag, you want also need 
to undo the mapping.

> Only question here is 
> if it is good to use DMA_MAP_SINGLE flag for the consolidated SG list 
> into a single temporary buffer.

If you call dma_map_single() then USB_DMA_MAP_SINGLE is the appropriate 
flag to sest.

Alan Stern

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-05-20 17:44           ` Alan Stern
@ 2020-05-22  1:58             ` Tejas Joglekar
  2020-05-22 14:21               ` Alan Stern
  0 siblings, 1 reply; 18+ messages in thread
From: Tejas Joglekar @ 2020-05-22  1:58 UTC (permalink / raw)
  To: Alan Stern, Tejas Joglekar
  Cc: Mathias Nyman, Greg Kroah-Hartman, USB, Mathias Nyman, John Youn

Hi,
On 5/20/2020 11:14 PM, Alan Stern wrote:
> On Wed, May 20, 2020 at 05:00:50PM +0000, Tejas Joglekar wrote:
>> Hi,
>> On 5/20/2020 10:20 PM, Alan Stern wrote:
>>> On Wed, May 20, 2020 at 06:41:16AM +0000, Tejas Joglekar wrote:
>>>>>> +	urb->transfer_buffer = temp;
>>>>>> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
>>>>>> +					   urb->transfer_buffer,
>>>>>> +					   urb->transfer_buffer_length,
>>>>>> +					   dir);
>>>>>> +
>>>>>> +	if (dma_mapping_error(hcd->self.sysdev,
>>>>>> +			      urb->transfer_dma)) {
>>>>>> +		ret = -EAGAIN;
>>>>>> +		kfree(temp);
>>>>>> +	} else {
>>>>>> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
>>>>>
>>>>> Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
>>>>> appropriate.
>>>>>
>>>>> If Greg or Alan don't object then it's fine for me as well.
>>>>>
>>>>>
>>>>>
>>>> @Greg/Alan do you suggest any change for the flag here?
>>>
>>> This requires care, because the core will already have set other flags 
>>> for this URB.  I don't remember the details and I don't have time to 
>>> check them now.  But it wouldn't be at all surprising if the core 
>>> doesn't expect an URB both to use SG and to use DMA_MAP_SINGLE.
>>>
>> When this quirk is enabled and it is required to consolidate SG list to 
>> single buffer no other flags are set for the URB.
> 
> I don't believe that.  What about URB_DMA_MAP_SG or 
> URB_DMA_MAP_PAGE?  The USB core certainly would have set one of them 
> already.  Not only would you need to clear that flag, you want also need 
> to undo the mapping.
>
URB_DMA_MAP_SG or URB_DMA_MAP_PAGE flags are set for an URB in core within
function usb_hcd_map_urb_for_dma(), with my patch the xhci_map_urb_for_dma()
does not call usb_hcd_map_urb_for_dma() when SG list consolidation is required
but it calls the newly added function xhci_map_temp_buffer() within that function
only dma_map_single() is called to map consolidated SG list.
 
>> Only question here is 
>> if it is good to use DMA_MAP_SINGLE flag for the consolidated SG list 
>> into a single temporary buffer.
> 
> If you call dma_map_single() then USB_DMA_MAP_SINGLE is the appropriate 
> flag to sest.
> 
Yes, within xhci_map_temp_buffer(), after consolidating the SG list in a temp
buffer I call dma_map_single() to map the temp buffer.

> Alan Stern
> 

Thanks & Regards,
 Tejas Joglekar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-05-22  1:58             ` Tejas Joglekar
@ 2020-05-22 14:21               ` Alan Stern
  2020-05-22 17:22                 ` Tejas Joglekar
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Stern @ 2020-05-22 14:21 UTC (permalink / raw)
  To: Tejas Joglekar
  Cc: Mathias Nyman, Greg Kroah-Hartman, USB, Mathias Nyman, John Youn

On Fri, May 22, 2020 at 01:58:43AM +0000, Tejas Joglekar wrote:
> Hi,
> On 5/20/2020 11:14 PM, Alan Stern wrote:
> > On Wed, May 20, 2020 at 05:00:50PM +0000, Tejas Joglekar wrote:
> >> Hi,
> >> On 5/20/2020 10:20 PM, Alan Stern wrote:
> >>> On Wed, May 20, 2020 at 06:41:16AM +0000, Tejas Joglekar wrote:
> >>>>>> +	urb->transfer_buffer = temp;
> >>>>>> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
> >>>>>> +					   urb->transfer_buffer,
> >>>>>> +					   urb->transfer_buffer_length,
> >>>>>> +					   dir);
> >>>>>> +
> >>>>>> +	if (dma_mapping_error(hcd->self.sysdev,
> >>>>>> +			      urb->transfer_dma)) {
> >>>>>> +		ret = -EAGAIN;
> >>>>>> +		kfree(temp);
> >>>>>> +	} else {
> >>>>>> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
> >>>>>
> >>>>> Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
> >>>>> appropriate.
> >>>>>
> >>>>> If Greg or Alan don't object then it's fine for me as well.
> >>>>>
> >>>>>
> >>>>>
> >>>> @Greg/Alan do you suggest any change for the flag here?
> >>>
> >>> This requires care, because the core will already have set other flags 
> >>> for this URB.  I don't remember the details and I don't have time to 
> >>> check them now.  But it wouldn't be at all surprising if the core 
> >>> doesn't expect an URB both to use SG and to use DMA_MAP_SINGLE.
> >>>
> >> When this quirk is enabled and it is required to consolidate SG list to 
> >> single buffer no other flags are set for the URB.
> > 
> > I don't believe that.  What about URB_DMA_MAP_SG or 
> > URB_DMA_MAP_PAGE?  The USB core certainly would have set one of them 
> > already.  Not only would you need to clear that flag, you want also need 
> > to undo the mapping.
> >
> URB_DMA_MAP_SG or URB_DMA_MAP_PAGE flags are set for an URB in core within
> function usb_hcd_map_urb_for_dma(), with my patch the xhci_map_urb_for_dma()
> does not call usb_hcd_map_urb_for_dma() when SG list consolidation is required
> but it calls the newly added function xhci_map_temp_buffer() within that function
> only dma_map_single() is called to map consolidated SG list.

Ah, in your patch xhci-hcd's map_urb_for_dma() routine bypasses 
usb_hcd_map_urb_for_dma().  I hadn't noticed that.

Then yes, your use of the URB flags is okay.

Alan Stern

> >> Only question here is 
> >> if it is good to use DMA_MAP_SINGLE flag for the consolidated SG list 
> >> into a single temporary buffer.
> > 
> > If you call dma_map_single() then USB_DMA_MAP_SINGLE is the appropriate 
> > flag to sest.
> > 
> Yes, within xhci_map_temp_buffer(), after consolidating the SG list in a temp
> buffer I call dma_map_single() to map the temp buffer.
> 
> > Alan Stern
> > 
> 
> Thanks & Regards,
>  Tejas Joglekar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG
  2020-05-22 14:21               ` Alan Stern
@ 2020-05-22 17:22                 ` Tejas Joglekar
  0 siblings, 0 replies; 18+ messages in thread
From: Tejas Joglekar @ 2020-05-22 17:22 UTC (permalink / raw)
  To: Alan Stern, Tejas Joglekar, Mathias Nyman
  Cc: Greg Kroah-Hartman, USB, Mathias Nyman, John Youn

Hi,
On 5/22/2020 7:51 PM, Alan Stern wrote:
> On Fri, May 22, 2020 at 01:58:43AM +0000, Tejas Joglekar wrote:
>> Hi,
>> On 5/20/2020 11:14 PM, Alan Stern wrote:
>>> On Wed, May 20, 2020 at 05:00:50PM +0000, Tejas Joglekar wrote:
>>>> Hi,
>>>> On 5/20/2020 10:20 PM, Alan Stern wrote:
>>>>> On Wed, May 20, 2020 at 06:41:16AM +0000, Tejas Joglekar wrote:
>>>>>>>> +	urb->transfer_buffer = temp;
>>>>>>>> +	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
>>>>>>>> +					   urb->transfer_buffer,
>>>>>>>> +					   urb->transfer_buffer_length,
>>>>>>>> +					   dir);
>>>>>>>> +
>>>>>>>> +	if (dma_mapping_error(hcd->self.sysdev,
>>>>>>>> +			      urb->transfer_dma)) {
>>>>>>>> +		ret = -EAGAIN;
>>>>>>>> +		kfree(temp);
>>>>>>>> +	} else {
>>>>>>>> +		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
>>>>>>>
>>>>>>> Not sure if using URB_DMA_MAP_SINGLE to flag that this urb is boucebuffered is
>>>>>>> appropriate.
>>>>>>>
>>>>>>> If Greg or Alan don't object then it's fine for me as well.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> @Greg/Alan do you suggest any change for the flag here?
>>>>>
>>>>> This requires care, because the core will already have set other flags 
>>>>> for this URB.  I don't remember the details and I don't have time to 
>>>>> check them now.  But it wouldn't be at all surprising if the core 
>>>>> doesn't expect an URB both to use SG and to use DMA_MAP_SINGLE.
>>>>>
>>>> When this quirk is enabled and it is required to consolidate SG list to 
>>>> single buffer no other flags are set for the URB.
>>>
>>> I don't believe that.  What about URB_DMA_MAP_SG or 
>>> URB_DMA_MAP_PAGE?  The USB core certainly would have set one of them 
>>> already.  Not only would you need to clear that flag, you want also need 
>>> to undo the mapping.
>>>
>> URB_DMA_MAP_SG or URB_DMA_MAP_PAGE flags are set for an URB in core within
>> function usb_hcd_map_urb_for_dma(), with my patch the xhci_map_urb_for_dma()
>> does not call usb_hcd_map_urb_for_dma() when SG list consolidation is required
>> but it calls the newly added function xhci_map_temp_buffer() within that function
>> only dma_map_single() is called to map consolidated SG list.
> 
> Ah, in your patch xhci-hcd's map_urb_for_dma() routine bypasses 
> usb_hcd_map_urb_for_dma().  I hadn't noticed that.
> 
> Then yes, your use of the URB flags is okay.
> 
> Alan Stern
> 
Thanks for the review Alan
@Mathias: I will send v3 after updating the unmap function.

Thanks & Regards,
Tejas Joglekar




^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-05-22 17:23 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-21  9:47 [RFC PATCH v2 0/4] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
2020-04-21  9:48 ` [RFC PATCH v2 1/4] dt-bindings: usb: Add documentation for SG trb cache size quirk Tejas Joglekar
2020-05-06 20:15   ` Rob Herring
2020-05-07 16:03     ` Tejas Joglekar
2020-05-06 20:16   ` Rob Herring
2020-05-07 16:07     ` Tejas Joglekar
2020-04-21  9:48 ` [RFC PATCH v2 2/4] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK Tejas Joglekar
2020-04-21  9:48 ` [RFC PATCH v2 3/4] usb: dwc3: Add device property sgl-trb-cache-size-quirk Tejas Joglekar
2020-04-21  9:49 ` [RFC PATCH v2 4/4] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
2020-05-18  8:38   ` Tejas Joglekar
2020-05-18 16:21   ` Mathias Nyman
2020-05-20  6:41     ` Tejas Joglekar
2020-05-20 16:50       ` Alan Stern
2020-05-20 17:00         ` Tejas Joglekar
2020-05-20 17:44           ` Alan Stern
2020-05-22  1:58             ` Tejas Joglekar
2020-05-22 14:21               ` Alan Stern
2020-05-22 17:22                 ` Tejas Joglekar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).