Linux-USB Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v5 0/3] Add logic to consolidate TRBs for Synopsys xHC
@ 2020-10-16 13:04 Tejas Joglekar
  2020-10-16 13:04 ` [PATCH v5 1/3] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK Tejas Joglekar
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Tejas Joglekar @ 2020-10-16 13:04 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman, Tejas Joglekar, linux-usb,
	Mathias Nyman
  Cc: John Youn

The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
for HS. The controller loads and updates the TRB cache from the
transfer ring in system memory whenever the driver issues a start
transfer or update transfer command.

For chained TRBs, the Synopsys xHC requires that the total amount of
bytes for all TRBs loaded in the TRB cache be greater than or equal to
1 MPS. Or the chain ends within the TRB cache (with a last TRB).

If this requirement is not met, the controller will not be able to
send or receive a packet and it will hang causing a driver timeout and
error.

This patch set adds logic to the XHCI driver to detect and prevent this
from happening along with the quirk to enable this logic for Synopsys
HAPS platform.

Based on Mathias's feedback on previous implementation where consolidation
was done in TRB cache, with this patch series the implementation is done
during mapping of the URB by consolidating the SG list into a temporary
buffer if the SG list buffer sizes within TRB_CACHE_SIZE is less than MPS.

Changes since v4:
 - Updated [Patch 3/3] platform data structure initialization
 - Trivial changes in commit wordings

Changes since v3:
 - Removed the dt-binding patch
 - Added new patch to pass the quirk as platform data
 - Modified the patch to set the quirk

Changes since v2:
 - Modified the xhci_unmap_temp_buffer function to unmap dma and c
   the dma flag
 - Removed RFC tag

Changes since v1:
 - Comments from Greg are addressed on [PATCH 4/4] and [PATCH 1/4]
 - Renamed the property and quirk as in other patches based on [PATCH 1/4]



Tejas Joglekar (3):
  usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
  usb: xhci: Use temporary buffer to consolidate SG
  usb: dwc3: Pass quirk as platform data

 drivers/usb/dwc3/host.c      |  10 +++
 drivers/usb/host/xhci-plat.c |   3 +
 drivers/usb/host/xhci-ring.c |   2 +-
 drivers/usb/host/xhci.c      | 137 ++++++++++++++++++++++++++++++++++-
 drivers/usb/host/xhci.h      |   5 ++
 5 files changed, 155 insertions(+), 2 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v5 1/3] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK
  2020-10-16 13:04 [PATCH v5 0/3] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
@ 2020-10-16 13:04 ` Tejas Joglekar
  2020-10-16 13:05 ` [PATCH v5 2/3] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
  2020-10-16 13:05 ` [PATCH v5 3/3] usb: dwc3: Pass quirk as platform data Tejas Joglekar
  2 siblings, 0 replies; 4+ messages in thread
From: Tejas Joglekar @ 2020-10-16 13:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tejas Joglekar, linux-usb, Mathias Nyman; +Cc: John Youn

This commit uses the private data passed by parent device
to set the quirk for Synopsys xHC. This patch fixes the
SNPS xHC hang issue when the data is scattered across
small buffers which does not make atleast MPS size for
given TRB cache size of SNPS xHC.

Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
---
 drivers/usb/host/xhci-plat.c | 3 +++
 drivers/usb/host/xhci.h      | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index aa2d35f98200..4d34f6005381 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -333,6 +333,9 @@ static int xhci_plat_probe(struct platform_device *pdev)
 	if (priv && (priv->quirks & XHCI_SKIP_PHY_INIT))
 		hcd->skip_phy_initialization = 1;
 
+	if (priv && (priv->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK))
+		xhci->quirks |= XHCI_SG_TRB_CACHE_SIZE_QUIRK;
+
 	ret = usb_add_hcd(hcd, irq, IRQF_SHARED);
 	if (ret)
 		goto disable_usb_phy;
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 8be88379c0fb..f8e453a2674d 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1877,6 +1877,7 @@ struct xhci_hcd {
 #define XHCI_SNPS_BROKEN_SUSPEND    BIT_ULL(35)
 #define XHCI_RENESAS_FW_QUIRK	BIT_ULL(36)
 #define XHCI_SKIP_PHY_INIT	BIT_ULL(37)
+#define XHCI_SG_TRB_CACHE_SIZE_QUIRK	BIT_ULL(38)
 
 	unsigned int		num_active_eps;
 	unsigned int		limit_active_eps;
-- 
2.28.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v5 2/3] usb: xhci: Use temporary buffer to consolidate SG
  2020-10-16 13:04 [PATCH v5 0/3] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
  2020-10-16 13:04 ` [PATCH v5 1/3] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK Tejas Joglekar
@ 2020-10-16 13:05 ` Tejas Joglekar
  2020-10-16 13:05 ` [PATCH v5 3/3] usb: dwc3: Pass quirk as platform data Tejas Joglekar
  2 siblings, 0 replies; 4+ messages in thread
From: Tejas Joglekar @ 2020-10-16 13:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tejas Joglekar, linux-usb, Mathias Nyman; +Cc: John Youn

The Synopsys xHC has an internal TRB cache of size TRB_CACHE_SIZE for
each endpoint. The default value for TRB_CACHE_SIZE is 16 for SS and 8
for HS. The controller loads and updates the TRB cache from the transfer
ring in system memory whenever the driver issues a start transfer or
update transfer command.

For chained TRBs, the Synopsys xHC requires that the total amount of
bytes for all TRBs loaded in the TRB cache be greater than or equal to 1
MPS. Or the chain ends within the TRB cache (with a last TRB).

If this requirement is not met, the controller will not be able to send
or receive a packet and it will hang causing a driver timeout and error.

This can be a problem if a class driver queues SG requests with many
small-buffer entries. The XHCI driver will create a chained TRB for each
entry which may trigger this issue.

This patch adds logic to the XHCI driver to detect and prevent this from
happening.

For every (TRB_CACHE_SIZE - 2), we check the total buffer size of
the SG list and if the last window of (TRB_CACHE_SIZE - 2) SG list length
and we don't make up at least 1 MPS, we create a temporary buffer to
consolidate full SG list into the buffer.

We check at (TRB_CACHE_SIZE - 2) window because it is possible that there
would be a link and/or event data TRB that take up to 2 of the cache
entries.

We discovered this issue with devices on other platforms but have not
yet come across any device that triggers this on Linux. But it could be
a real problem now or in the future. All it takes is N number of small
chained TRBs. And other instances of the Synopsys IP may have smaller
values for the TRB_CACHE_SIZE which would exacerbate the problem.

Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
---
 drivers/usb/host/xhci-ring.c |   2 +-
 drivers/usb/host/xhci.c      | 137 ++++++++++++++++++++++++++++++++++-
 drivers/usb/host/xhci.h      |   4 +
 3 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 167dae117f73..6d4dae5e5f21 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -3325,7 +3325,7 @@ int xhci_queue_bulk_tx(struct xhci_hcd *xhci, gfp_t mem_flags,
 
 	full_len = urb->transfer_buffer_length;
 	/* If we have scatter/gather list, we use it. */
-	if (urb->num_sgs) {
+	if (urb->num_sgs && !(urb->transfer_flags & URB_DMA_MAP_SINGLE)) {
 		num_sgs = urb->num_mapped_sgs;
 		sg = urb->sg;
 		addr = (u64) sg_dma_address(sg);
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 482fe8c5e3b4..8c6089a4a6a4 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1259,6 +1259,116 @@ EXPORT_SYMBOL_GPL(xhci_resume);
 
 /*-------------------------------------------------------------------------*/
 
+static int xhci_map_temp_buffer(struct usb_hcd *hcd, struct urb *urb)
+{
+	void *temp;
+	int ret = 0;
+	unsigned int len;
+	unsigned int buf_len;
+	enum dma_data_direction dir;
+	struct xhci_hcd *xhci;
+
+	xhci = hcd_to_xhci(hcd);
+	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+	buf_len = urb->transfer_buffer_length;
+
+	temp = kzalloc_node(buf_len, GFP_ATOMIC,
+			    dev_to_node(hcd->self.sysdev));
+
+	if (usb_urb_dir_out(urb))
+		len = sg_pcopy_to_buffer(urb->sg, urb->num_sgs,
+					 temp, buf_len, 0);
+
+	urb->transfer_buffer = temp;
+	urb->transfer_dma = dma_map_single(hcd->self.sysdev,
+					   urb->transfer_buffer,
+					   urb->transfer_buffer_length,
+					   dir);
+
+	if (dma_mapping_error(hcd->self.sysdev,
+			      urb->transfer_dma)) {
+		ret = -EAGAIN;
+		kfree(temp);
+	} else {
+		urb->transfer_flags |= URB_DMA_MAP_SINGLE;
+	}
+
+	return ret;
+}
+
+static bool xhci_urb_temp_buffer_required(struct usb_hcd *hcd,
+					  struct urb *urb)
+{
+	bool ret = false;
+	unsigned int i;
+	unsigned int len = 0;
+	unsigned int buf_len;
+	unsigned int trb_size;
+	unsigned int max_pkt;
+	struct scatterlist *sg;
+	struct scatterlist *tail_sg;
+
+	sg = urb->sg;
+	tail_sg = urb->sg;
+	buf_len = urb->transfer_buffer_length;
+	max_pkt = usb_endpoint_maxp(&urb->ep->desc);
+
+	if (!urb->num_sgs)
+		return ret;
+
+	if (urb->dev->speed >= USB_SPEED_SUPER)
+		trb_size = TRB_CACHE_SIZE_SS;
+	else
+		trb_size = TRB_CACHE_SIZE_HS;
+
+	if (urb->transfer_buffer_length != 0 &&
+	    !(urb->transfer_flags & URB_NO_TRANSFER_DMA_MAP)) {
+		for_each_sg(urb->sg, sg, urb->num_sgs, i) {
+			len = len + sg->length;
+			if (i > trb_size - 2) {
+				len = len - tail_sg->length;
+				if (len < max_pkt) {
+					ret = true;
+					break;
+				}
+
+				tail_sg = sg_next(tail_sg);
+			}
+		}
+	}
+	return ret;
+}
+
+static void xhci_unmap_temp_buf(struct usb_hcd *hcd, struct urb *urb)
+{
+	struct scatterlist *sg;
+	unsigned int len;
+	unsigned int buf_len;
+	enum dma_data_direction dir;
+
+	dir = usb_urb_dir_in(urb) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+
+	sg = urb->sg;
+	buf_len = urb->transfer_buffer_length;
+
+	if (IS_ENABLED(CONFIG_HAS_DMA) &&
+	    (urb->transfer_flags & URB_DMA_MAP_SINGLE))
+		dma_unmap_single(hcd->self.sysdev,
+				 urb->transfer_dma,
+				 urb->transfer_buffer_length,
+				 dir);
+
+	if (usb_urb_dir_in(urb))
+		len = sg_pcopy_from_buffer(urb->sg, urb->num_sgs,
+					   urb->transfer_buffer,
+					   buf_len,
+					   0);
+
+	urb->transfer_flags &= ~URB_DMA_MAP_SINGLE;
+	kfree(urb->transfer_buffer);
+	urb->transfer_buffer = NULL;
+}
+
 /*
  * Bypass the DMA mapping if URB is suitable for Immediate Transfer (IDT),
  * we'll copy the actual data into the TRB address register. This is limited to
@@ -1268,13 +1378,37 @@ EXPORT_SYMBOL_GPL(xhci_resume);
 static int xhci_map_urb_for_dma(struct usb_hcd *hcd, struct urb *urb,
 				gfp_t mem_flags)
 {
+	struct xhci_hcd *xhci;
+
+	xhci = hcd_to_xhci(hcd);
+
 	if (xhci_urb_suitable_for_idt(urb))
 		return 0;
 
+	if (xhci->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK) {
+		if (xhci_urb_temp_buffer_required(hcd, urb))
+			return xhci_map_temp_buffer(hcd, urb);
+	}
 	return usb_hcd_map_urb_for_dma(hcd, urb, mem_flags);
 }
 
-/*
+static void xhci_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
+{
+	struct xhci_hcd *xhci;
+	bool unmap_temp_buf = false;
+
+	xhci = hcd_to_xhci(hcd);
+
+	if (urb->num_sgs && (urb->transfer_flags & URB_DMA_MAP_SINGLE))
+		unmap_temp_buf = true;
+
+	if ((xhci->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK) && unmap_temp_buf)
+		xhci_unmap_temp_buf(hcd, urb);
+	else
+		usb_hcd_unmap_urb_for_dma(hcd, urb);
+}
+
+/**
  * xhci_get_endpoint_index - Used for passing endpoint bitmasks between the core and
  * HCDs.  Find the index for an endpoint given its descriptor.  Use the return
  * value to right shift 1 for the bitmask.
@@ -5326,6 +5460,7 @@ static const struct hc_driver xhci_hc_driver = {
 	 * managing i/o requests and associated device resources
 	 */
 	.map_urb_for_dma =      xhci_map_urb_for_dma,
+	.unmap_urb_for_dma =    xhci_unmap_urb_for_dma,
 	.urb_enqueue =		xhci_urb_enqueue,
 	.urb_dequeue =		xhci_urb_dequeue,
 	.alloc_dev =		xhci_alloc_dev,
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index f8e453a2674d..f69504f91b5e 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1330,6 +1330,10 @@ enum xhci_setup_dev {
 #define TRB_SIA			(1<<31)
 #define TRB_FRAME_ID(p)		(((p) & 0x7ff) << 20)
 
+/* TRB cache size for xHC with TRB cache */
+#define TRB_CACHE_SIZE_HS	8
+#define TRB_CACHE_SIZE_SS	16
+
 struct xhci_generic_trb {
 	__le32 field[4];
 };
-- 
2.28.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v5 3/3] usb: dwc3: Pass quirk as platform data
  2020-10-16 13:04 [PATCH v5 0/3] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
  2020-10-16 13:04 ` [PATCH v5 1/3] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK Tejas Joglekar
  2020-10-16 13:05 ` [PATCH v5 2/3] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
@ 2020-10-16 13:05 ` Tejas Joglekar
  2 siblings, 0 replies; 4+ messages in thread
From: Tejas Joglekar @ 2020-10-16 13:05 UTC (permalink / raw)
  To: Felipe Balbi, Greg Kroah-Hartman, Tejas Joglekar, linux-usb; +Cc: John Youn

This commit adds the platform device data to setup
the XHCI_SG_TRB_CACHE_SIZE_QUIRK quirk. DWC3 hosts
which are PCI devices does not use OF to create platform device
but create xhci-plat platform device at runtime. So
this patch allows parent device to supply the quirk
through platform data.

Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
---
 drivers/usb/dwc3/host.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/usb/dwc3/host.c b/drivers/usb/dwc3/host.c
index e195176580de..0434bc8cec12 100644
--- a/drivers/usb/dwc3/host.c
+++ b/drivers/usb/dwc3/host.c
@@ -11,6 +11,11 @@
 #include <linux/platform_device.h>
 
 #include "core.h"
+#include "../host/xhci-plat.h"
+
+static const struct xhci_plat_priv dwc3_pdata = {
+	.quirks = XHCI_SG_TRB_CACHE_SIZE_QUIRK,
+};
 
 static int dwc3_host_get_irq(struct dwc3 *dwc)
 {
@@ -87,6 +92,11 @@ int dwc3_host_init(struct dwc3 *dwc)
 		goto err;
 	}
 
+	ret = platform_device_add_data(xhci, &dwc3_pdata, sizeof(dwc3_pdata));
+	if (ret) {
+		dev_err(dwc->dev, "couldn't add platform data to xHCI device\n");
+		goto err;
+	}
 	memset(props, 0, sizeof(struct property_entry) * ARRAY_SIZE(props));
 
 	if (dwc->usb3_lpm_capable)
-- 
2.28.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-16 13:04 [PATCH v5 0/3] Add logic to consolidate TRBs for Synopsys xHC Tejas Joglekar
2020-10-16 13:04 ` [PATCH v5 1/3] usb: xhci: Set quirk for XHCI_SG_TRB_CACHE_SIZE_QUIRK Tejas Joglekar
2020-10-16 13:05 ` [PATCH v5 2/3] usb: xhci: Use temporary buffer to consolidate SG Tejas Joglekar
2020-10-16 13:05 ` [PATCH v5 3/3] usb: dwc3: Pass quirk as platform data Tejas Joglekar

Linux-USB Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-usb/0 linux-usb/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-usb linux-usb/ https://lore.kernel.org/linux-usb \
		linux-usb@vger.kernel.org
	public-inbox-index linux-usb

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-usb


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git