All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation
@ 2020-03-22 13:00 Lukasz Majewski
  2020-03-22 13:00 ` [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running" Lukasz Majewski
                   ` (5 more replies)
  0 siblings, 6 replies; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-22 13:00 UTC (permalink / raw)
  To: u-boot

This patch set is rather a request for testing (and a starting point for the
discussion), as it may improve the robustness of USB with some pendrives - and
yes sacrifice some performance for reliability.
The previous version of this patch: https://patchwork.ozlabs.org/patch/1244928/
fixed issue for some network USB adapters and improved stability on TI boards.
This patch also provides very detailed explanation of the problem in the commit
message.

With the async support patch applied (
SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 ), the qhtoken variable
has value 0x00 when token shows errors. As a result the error handling path
is not executed. This looks like some missing/broken cache flushing - for easier
bisecting this patch has been reverted for now


Test setup:
===========
Data:
VFAT on the pendrive
16MiB FitImage on USB pen drive (to be read in U-Boot)
400 MiB recovery image (but not read)

Pen drives:
-----------
1. ID 0930:6544 Toshiba Corp. TransMemory-Mini / Kingston DataTraveler 2.0 Stick
2. ID 21c4:8005 GOODRAM 8GB

HW:
---
IMX6Q -> TPC70 board

Procedure:
----------
Boot U-Boot, execute: loadusb=usb start; fatload usb 0 ${loadaddr} ${upd_image}
(Then the image has been crafted to OOPs and WDT after 4 seconds causes reset).
When we fail - the "normal" execution path is followed and we boot up till
prompt.

Results:
========

1. Current mainline - SHA1: 63b2ef407de2f0997deef3b54b7e7ab9c7a7cb27
- The USB error (EHCI timed out on TD - token=0x80008d80) is present after ~10
  minutes - pendrive [1]

- Error after a few minutes (1,2) - pendrive [2]

2. When USB async support is reverted (commit
SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 [*])
- Error is detected after ~1h - pendrive [1]


3. With this patchset applied
- 12h of testing - no error - pendrive [1]

With the pendrive [2] I do observe that very often it is not recognized at all.
Even more strange - there is a difference in the reliability of being recognized
between identical pendrives (used one vs. just unboxed one).



Lukasz Majewski (5):
  Revert "usb: ehci-hcd: Keep async schedule running"
  usb: Handle XACTERR error in DATA phase of USB storage
  usb: Add some delay to wait for slow USB devices to be operational
  usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data

 common/usb.c                | 10 ++++-
 common/usb_storage.c        | 46 ++++++++++++++++++++++
 drivers/usb/host/ehci-hcd.c | 78 ++++++++++++++++++++++++++-----------
 include/usb.h               |  1 +
 include/usb_defs.h          |  1 +
 5 files changed, 112 insertions(+), 24 deletions(-)

-- 
2.20.1

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-22 13:00 [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Lukasz Majewski
@ 2020-03-22 13:00 ` Lukasz Majewski
  2020-03-22 13:18   ` Marek Vasut
  2020-03-22 13:00 ` [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage Lukasz Majewski
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-22 13:00 UTC (permalink / raw)
  To: u-boot

This reverts commit 02b0e1a36c5bc20174299312556ec4e266872bd6.

Signed-off-by: Lukasz Majewski <lukma@denx.de>
---

 drivers/usb/host/ehci-hcd.c | 51 ++++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 1cc02052f5..0a77111f80 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -309,7 +309,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	volatile struct qTD *vtd;
 	unsigned long ts;
 	uint32_t *tdp;
-	uint32_t endpt, maxpacket, token, usbsts, qhtoken;
+	uint32_t endpt, maxpacket, token, usbsts;
 	uint32_t c, toggle;
 	uint32_t cmd;
 	int timeout;
@@ -553,21 +553,22 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	flush_dcache_range((unsigned long)qtd,
 			   ALIGN_END_ADDR(struct qTD, qtd, qtd_count));
 
+	/* Set async. queue head pointer. */
+	ehci_writel(&ctrl->hcor->or_asynclistaddr, virt_to_phys(&ctrl->qh_list));
+
 	usbsts = ehci_readl(&ctrl->hcor->or_usbsts);
 	ehci_writel(&ctrl->hcor->or_usbsts, (usbsts & 0x3f));
 
 	/* Enable async. schedule. */
 	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
-	if (!(cmd & CMD_ASE)) {
-		cmd |= CMD_ASE;
-		ehci_writel(&ctrl->hcor->or_usbcmd, cmd);
+	cmd |= CMD_ASE;
+	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);
 
-		ret = handshake((uint32_t *)&ctrl->hcor->or_usbsts, STS_ASS, STS_ASS,
-				100 * 1000);
-		if (ret < 0) {
-			printf("EHCI fail timeout STS_ASS set\n");
-			goto fail;
-		}
+	ret = handshake((uint32_t *)&ctrl->hcor->or_usbsts, STS_ASS, STS_ASS,
+			100 * 1000);
+	if (ret < 0) {
+		printf("EHCI fail timeout STS_ASS set\n");
+		goto fail;
 	}
 
 	/* Wait for TDs to be processed. */
@@ -588,11 +589,6 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 			break;
 		WATCHDOG_RESET();
 	} while (get_timer(ts) < timeout);
-	qhtoken = hc32_to_cpu(qh->qh_overlay.qt_token);
-
-	ctrl->qh_list.qh_link = cpu_to_hc32(virt_to_phys(&ctrl->qh_list) | QH_LINK_TYPE_QH);
-	flush_dcache_range((unsigned long)&ctrl->qh_list,
-		ALIGN_END_ADDR(struct QH, &ctrl->qh_list, 1));
 
 	/*
 	 * Invalidate the memory area occupied by buffer
@@ -611,12 +607,25 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	if (QT_TOKEN_GET_STATUS(token) & QT_TOKEN_STATUS_ACTIVE)
 		printf("EHCI timed out on TD - token=%#x\n", token);
 
-	if (!(QT_TOKEN_GET_STATUS(qhtoken) & QT_TOKEN_STATUS_ACTIVE)) {
-		debug("TOKEN=%#x\n", qhtoken);
-		switch (QT_TOKEN_GET_STATUS(qhtoken) &
+	/* Disable async schedule. */
+	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
+	cmd &= ~CMD_ASE;
+	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);
+
+	ret = handshake((uint32_t *)&ctrl->hcor->or_usbsts, STS_ASS, 0,
+			100 * 1000);
+	if (ret < 0) {
+		printf("EHCI fail timeout STS_ASS reset\n");
+		goto fail;
+	}
+
+	token = hc32_to_cpu(qh->qh_overlay.qt_token);
+	if (!(QT_TOKEN_GET_STATUS(token) & QT_TOKEN_STATUS_ACTIVE)) {
+		debug("TOKEN=%#x\n", token);
+		switch (QT_TOKEN_GET_STATUS(token) &
 			~(QT_TOKEN_STATUS_SPLITXSTATE | QT_TOKEN_STATUS_PERR)) {
 		case 0:
-			toggle = QT_TOKEN_GET_DT(qhtoken);
+			toggle = QT_TOKEN_GET_DT(token);
 			usb_settoggle(dev, usb_pipeendpoint(pipe),
 				       usb_pipeout(pipe), toggle);
 			dev->status = 0;
@@ -634,11 +643,11 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 			break;
 		default:
 			dev->status = USB_ST_CRC_ERR;
-			if (QT_TOKEN_GET_STATUS(qhtoken) & QT_TOKEN_STATUS_HALTED)
+			if (QT_TOKEN_GET_STATUS(token) & QT_TOKEN_STATUS_HALTED)
 				dev->status |= USB_ST_STALLED;
 			break;
 		}
-		dev->act_len = length - QT_TOKEN_GET_TOTALBYTES(qhtoken);
+		dev->act_len = length - QT_TOKEN_GET_TOTALBYTES(token);
 	} else {
 		dev->act_len = 0;
 #ifndef CONFIG_USB_EHCI_FARADAY
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage
  2020-03-22 13:00 [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Lukasz Majewski
  2020-03-22 13:00 ` [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running" Lukasz Majewski
@ 2020-03-22 13:00 ` Lukasz Majewski
  2020-03-22 13:23   ` Marek Vasut
  2020-03-22 13:00 ` [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational Lukasz Majewski
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-22 13:00 UTC (permalink / raw)
  To: u-boot

This change brings support for handling XACTERR error during DATA phase
of USB BBB (bulk) transmission.

The fix is to clear stall'ed endpoint and reset recovery on the USB mass
storage class.

This code is the port to newest U-Boot of the fix from - "rayvt" (from [1]).

Links:
[1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
[2] - https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0

Signed-off-by: Lukasz Majewski <lukma@denx.de>
[Unfortunately, the original patch [2] did not contain S-o-B from the original
author - "rayvt"]
---

 common/usb_storage.c | 10 ++++++++++
 include/usb_defs.h   |  1 +
 2 files changed, 11 insertions(+)

diff --git a/common/usb_storage.c b/common/usb_storage.c
index 097b6729c1..92e1e54d1c 100644
--- a/common/usb_storage.c
+++ b/common/usb_storage.c
@@ -740,6 +740,16 @@ static int usb_stor_BBB_transport(struct scsi_cmd *srb, struct us_data *us)
 
 	result = usb_bulk_msg(us->pusb_dev, pipe, srb->pdata, srb->datalen,
 			      &data_actlen, USB_CNTL_TIMEOUT * 5);
+
+	/* special handling of XACTERR in DATA phase */
+	if (result < 0 && (us->pusb_dev->status & USB_ST_XACTERR)) {
+		debug("XACTERR in data phase - clr, reset, and return fail.\n");
+		usb_stor_BBB_clear_endpt_stall(us, dir_in ?
+					       us->ep_in : us->ep_out);
+		usb_stor_BBB_reset(us);
+		return USB_STOR_TRANSPORT_FAILED;
+	}
+
 	/* special handling of STALL in DATA phase */
 	if ((result < 0) && (us->pusb_dev->status & USB_ST_STALLED)) {
 		debug("DATA:stall\n");
diff --git a/include/usb_defs.h b/include/usb_defs.h
index 6dd2c997f9..572f6ab296 100644
--- a/include/usb_defs.h
+++ b/include/usb_defs.h
@@ -197,6 +197,7 @@
 #define USB_ST_NAK_REC          0x10	/* NAK Received*/
 #define USB_ST_CRC_ERR          0x20	/* CRC/timeout Error */
 #define USB_ST_BIT_ERR          0x40	/* Bitstuff error */
+#define USB_ST_XACTERR          0x80	/* XACTERR error */
 #define USB_ST_NOT_PROC         0x80000000L	/* Not yet processed */
 
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational
  2020-03-22 13:00 [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Lukasz Majewski
  2020-03-22 13:00 ` [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running" Lukasz Majewski
  2020-03-22 13:00 ` [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage Lukasz Majewski
@ 2020-03-22 13:00 ` Lukasz Majewski
  2020-03-22 13:29   ` Marek Vasut
  2020-03-22 13:00 ` [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs) Lukasz Majewski
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-22 13:00 UTC (permalink / raw)
  To: u-boot

This change provides some extra time for some slow (or degraded) USB devices
to become fully operational.

This code is the port to newest U-Boot of the fix from - "rayvt" (from [1]).

Links:
[1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
[2] - https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0

Signed-off-by: Lukasz Majewski <lukma@denx.de>
[Unfortunately, the original patch [2] did not contain S-o-B from the original
author - "rayvt"]
---

 common/usb.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/common/usb.c b/common/usb.c
index 349e838f1d..305482b5bb 100644
--- a/common/usb.c
+++ b/common/usb.c
@@ -925,14 +925,20 @@ static int get_descriptor_len(struct usb_device *dev, int len, int expect_len)
 	__maybe_unused struct usb_device_descriptor *desc;
 	ALLOC_CACHE_ALIGN_BUFFER(unsigned char, tmpbuf, USB_BUFSIZ);
 	int err;
+	int retry = 5;
 
 	desc = (struct usb_device_descriptor *)tmpbuf;
 
+again:
 	err = usb_get_descriptor(dev, USB_DT_DEVICE, 0, desc, len);
 	if (err < expect_len) {
 		if (err < 0) {
-			printf("unable to get device descriptor (error=%d)\n",
-				err);
+			printf("unable to get device descriptor (error=%d) retry: %d\n",
+			       err, retry);
+			mdelay(50);
+			if (--retry >= 0)
+				/* Some drives are just slow to wake up. */
+				goto again;
 			return err;
 		} else {
 			printf("USB device descriptor short read (expected %i, got %i)\n",
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  2020-03-22 13:00 [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Lukasz Majewski
                   ` (2 preceding siblings ...)
  2020-03-22 13:00 ` [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational Lukasz Majewski
@ 2020-03-22 13:00 ` Lukasz Majewski
  2020-03-22 13:32   ` Marek Vasut
  2020-03-22 13:00 ` [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data Lukasz Majewski
  2020-03-23 20:58 ` [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Tom Rini
  5 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-22 13:00 UTC (permalink / raw)
  To: u-boot

After this change USB mass storage devices - mostly HDDs connected via USB
- will gain handling of extra time needed for their spin up.

This operation is realized with issuing SCSI start/stop unit (0x1B) command.

This code is the port to newest U-Boot of the fix from - "rayvt" (from [1]).

Links:
[1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
[2] - https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0

Signed-off-by: Lukasz Majewski <lukma@denx.de>
[Unfortunately, the original patch [2] did not contain S-o-B from the original
author - "rayvt"]
---

 common/usb_storage.c | 36 ++++++++++++++++++++++++++++++++++++
 include/usb.h        |  1 +
 2 files changed, 37 insertions(+)

diff --git a/common/usb_storage.c b/common/usb_storage.c
index 92e1e54d1c..3c2324fa1a 100644
--- a/common/usb_storage.c
+++ b/common/usb_storage.c
@@ -729,6 +729,7 @@ static int usb_stor_BBB_transport(struct scsi_cmd *srb, struct us_data *us)
 	pipeout = usb_sndbulkpipe(us->pusb_dev, us->ep_out);
 	/* DATA phase + error handling */
 	data_actlen = 0;
+	mdelay(10);		/* Like linux does. */
 	/* no data, go immediately to the STATUS phase */
 	if (srb->datalen == 0)
 		goto st;
@@ -1023,9 +1024,32 @@ static int usb_request_sense(struct scsi_cmd *srb, struct us_data *ss)
 	return 0;
 }
 
+/*
+ * This spins up the disk and also consumes the time that the
+ * disk takes to become active and ready to read data.
+ * Some drives (like Western Digital) can take more than 5 seconds.
+ * The delay occurs on the 1st data read from the disk.
+ * Extending the timeout here works better than handling the timeout
+ * as an error on a "real" read.
+ */
+static int usb_spinup(struct scsi_cmd *srb, struct us_data *ss)
+{
+	memset(&srb->cmd[0], 0, 12);
+	srb->cmd[0] = SCSI_START_STP;
+	srb->cmd[1] = srb->lun << 5;
+	srb->cmd[4] = 1;	/* Start spinup. */
+	srb->datalen = 0;
+	srb->cmdlen = 6;
+	ss->pusb_dev->extra_timeout = 9876;
+	ss->transport(srb, ss);
+	ss->pusb_dev->extra_timeout = 0;
+	return 0;
+}
+
 static int usb_test_unit_ready(struct scsi_cmd *srb, struct us_data *ss)
 {
 	int retries = 10;
+	int gave_extra_time = 0;
 
 	do {
 		memset(&srb->cmd[0], 0, 12);
@@ -1048,6 +1072,17 @@ static int usb_test_unit_ready(struct scsi_cmd *srb, struct us_data *ss)
 		if ((srb->sense_buf[2] == 0x02) &&
 		    (srb->sense_buf[12] == 0x3a))
 			return -1;
+		/*
+		 * If the status is "Not Ready - becoming ready", give it
+		 * more time.  Linux issues a spinup command (once) and gives
+		 * it 100 seconds.
+		 */
+		if (srb->sense_buf[2] == 0x02 &&
+		    srb->sense_buf[12] == 0x04 &&
+		    gave_extra_time == 0) {
+			retries = 100; /* Allow 10 seconds. */
+			gave_extra_time = retries;
+		}
 		mdelay(100);
 	} while (retries--);
 
@@ -1502,6 +1537,7 @@ int usb_stor_get_info(struct usb_device *dev, struct us_data *ss,
 	dev_desc->log2blksz = LOG2(dev_desc->blksz);
 	dev_desc->type = perq;
 	debug(" address %d\n", dev_desc->target);
+	usb_spinup(pccb, ss);
 
 	return 1;
 }
diff --git a/include/usb.h b/include/usb.h
index efb67ea33f..5b0f5ce5d6 100644
--- a/include/usb.h
+++ b/include/usb.h
@@ -140,6 +140,7 @@ struct usb_device {
 	int act_len;			/* transferred bytes */
 	int maxchild;			/* Number of ports if hub */
 	int portnr;			/* Port number, 1=first */
+	int extra_timeout; /* Add to timeout in ehci_submit_async or spinup */
 #if !CONFIG_IS_ENABLED(DM_USB)
 	/* parent hub, or NULL if this is the root hub */
 	struct usb_device *parent;
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data
  2020-03-22 13:00 [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Lukasz Majewski
                   ` (3 preceding siblings ...)
  2020-03-22 13:00 ` [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs) Lukasz Majewski
@ 2020-03-22 13:00 ` Lukasz Majewski
  2020-03-22 13:45   ` Marek Vasut
  2020-03-23 20:58 ` [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Tom Rini
  5 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-22 13:00 UTC (permalink / raw)
  To: u-boot

This code adds check if QT_TOKEN_STATUS_XACTERR error occurred. When it is
detected the token is reconfigured and transmission is retried.

This code is the port to newest U-Boot of the fix from - "rayvt" (from [1]).

Links:
[1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
[2] - https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0


Signed-off-by: Lukasz Majewski <lukma@denx.de>
[Unfortunately, the original patch [2] did not contain S-o-B from the original
author - "rayvt"]
---

 drivers/usb/host/ehci-hcd.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 0a77111f80..45eda7ad24 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -315,6 +315,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	int timeout;
 	int ret = 0;
 	struct ehci_ctrl *ctrl = ehci_get_ctrl(dev);
+	int trynum;
 
 	debug("dev=%p, pipe=%lx, buffer=%p, length=%d, req=%p\n", dev, pipe,
 	      buffer, length, req);
@@ -560,6 +561,10 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	ehci_writel(&ctrl->hcor->or_usbsts, (usbsts & 0x3f));
 
 	/* Enable async. schedule. */
+	trynum = 1;	/* No more than 2 tries, in case of XACTERR. */
+retry_xacterr:;
+	vtd = &qtd[qtd_counter - 1];
+
 	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
 	cmd |= CMD_ASE;
 	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);
@@ -573,8 +578,8 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 
 	/* Wait for TDs to be processed. */
 	ts = get_timer(0);
-	vtd = &qtd[qtd_counter - 1];
 	timeout = USB_TIMEOUT_MS(pipe);
+	timeout += dev->extra_timeout;
 	do {
 		/* Invalidate dcache */
 		invalidate_dcache_range((unsigned long)&ctrl->qh_list,
@@ -589,6 +594,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 			break;
 		WATCHDOG_RESET();
 	} while (get_timer(ts) < timeout);
+	debug("took %4lu ms of %4d\n", get_timer(ts), timeout);
 
 	/*
 	 * Invalidate the memory area occupied by buffer
@@ -622,6 +628,25 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	token = hc32_to_cpu(qh->qh_overlay.qt_token);
 	if (!(QT_TOKEN_GET_STATUS(token) & QT_TOKEN_STATUS_ACTIVE)) {
 		debug("TOKEN=%#x\n", token);
+		if (token & QT_TOKEN_STATUS_XACTERR) {
+			if (--trynum >= 0) {
+				/*
+				 * It is necessary to do this, otherwise the
+				 * disk is clagged.
+				 */
+				debug("reset the TD and redo, because of XACTERR\n");
+				token &= ~QT_TOKEN_STATUS_HALTED;
+				token |= QT_TOKEN_STATUS_ACTIVE |
+					QT_TOKEN_CERR(2);
+				vtd->qt_token = cpu_to_hc32(token);
+				qh->qh_overlay.qt_token = cpu_to_hc32(token);
+				goto retry_xacterr;
+			}
+			dev->status = USB_ST_XACTERR;
+			dev->act_len = length - QT_TOKEN_GET_TOTALBYTES(token);
+			goto fail;
+		}
+
 		switch (QT_TOKEN_GET_STATUS(token) &
 			~(QT_TOKEN_STATUS_SPLITXSTATE | QT_TOKEN_STATUS_PERR)) {
 		case 0:
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-22 13:00 ` [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running" Lukasz Majewski
@ 2020-03-22 13:18   ` Marek Vasut
  2020-03-23  6:57     ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-22 13:18 UTC (permalink / raw)
  To: u-boot

On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> This reverts commit 02b0e1a36c5bc20174299312556ec4e266872bd6.
> 
> Signed-off-by: Lukasz Majewski <lukma@denx.de>

This patch lacks any and all explanation why this is being reverted. The
patch you are reverting here explains why it was added and what real
issues it was fixing, so instead of reverting it, if there is an issue
with that patch, you should identify the issue and fix it.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage
  2020-03-22 13:00 ` [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage Lukasz Majewski
@ 2020-03-22 13:23   ` Marek Vasut
  2020-03-23  7:00     ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-22 13:23 UTC (permalink / raw)
  To: u-boot

On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> This change brings support for handling XACTERR error during DATA phase
> of USB BBB (bulk) transmission.
> 
> The fix is to clear stall'ed endpoint and reset recovery on the USB mass
> storage class.
> 
> This code is the port to newest U-Boot of the fix from - "rayvt" (from [1]).
> 
> Links:
> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> [2] - https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> 
> Signed-off-by: Lukasz Majewski <lukma@denx.de>
> [Unfortunately, the original patch [2] did not contain S-o-B from the original
> author - "rayvt"]
> ---
> 
>  common/usb_storage.c | 10 ++++++++++
>  include/usb_defs.h   |  1 +
>  2 files changed, 11 insertions(+)
> 
> diff --git a/common/usb_storage.c b/common/usb_storage.c
> index 097b6729c1..92e1e54d1c 100644
> --- a/common/usb_storage.c
> +++ b/common/usb_storage.c
> @@ -740,6 +740,16 @@ static int usb_stor_BBB_transport(struct scsi_cmd *srb, struct us_data *us)
>  
>  	result = usb_bulk_msg(us->pusb_dev, pipe, srb->pdata, srb->datalen,
>  			      &data_actlen, USB_CNTL_TIMEOUT * 5);
> +
> +	/* special handling of XACTERR in DATA phase */
> +	if (result < 0 && (us->pusb_dev->status & USB_ST_XACTERR)) {
> +		debug("XACTERR in data phase - clr, reset, and return fail.\n");
> +		usb_stor_BBB_clear_endpt_stall(us, dir_in ?
> +					       us->ep_in : us->ep_out);
> +		usb_stor_BBB_reset(us);
> +		return USB_STOR_TRANSPORT_FAILED;
> +	}
> +

Can resetting the endpoint result in data corruption of some sort here ?
Especially if this happens on filesystem write for example ?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational
  2020-03-22 13:00 ` [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational Lukasz Majewski
@ 2020-03-22 13:29   ` Marek Vasut
  2020-03-23  7:04     ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-22 13:29 UTC (permalink / raw)
  To: u-boot

On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> This change provides some extra time for some slow (or degraded) USB devices
> to become fully operational.
> 
> This code is the port to newest U-Boot of the fix from - "rayvt" (from [1]).
> 
> Links:
> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> [2] - https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> 
> Signed-off-by: Lukasz Majewski <lukma@denx.de>
> [Unfortunately, the original patch [2] did not contain S-o-B from the original
> author - "rayvt"]
> ---
> 
>  common/usb.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/common/usb.c b/common/usb.c
> index 349e838f1d..305482b5bb 100644
> --- a/common/usb.c
> +++ b/common/usb.c
> @@ -925,14 +925,20 @@ static int get_descriptor_len(struct usb_device *dev, int len, int expect_len)
>  	__maybe_unused struct usb_device_descriptor *desc;
>  	ALLOC_CACHE_ALIGN_BUFFER(unsigned char, tmpbuf, USB_BUFSIZ);
>  	int err;
> +	int retry = 5;
>  
>  	desc = (struct usb_device_descriptor *)tmpbuf;
>  
> +again:
>  	err = usb_get_descriptor(dev, USB_DT_DEVICE, 0, desc, len);
>  	if (err < expect_len) {
>  		if (err < 0) {
> -			printf("unable to get device descriptor (error=%d)\n",
> -				err);
> +			printf("unable to get device descriptor (error=%d) retry: %d\n",
> +			       err, retry);
> +			mdelay(50);

Why 50 mSec and not some other value, like 100 mSec ?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  2020-03-22 13:00 ` [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs) Lukasz Majewski
@ 2020-03-22 13:32   ` Marek Vasut
  2020-03-23  7:53     ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-22 13:32 UTC (permalink / raw)
  To: u-boot

On 3/22/20 2:00 PM, Lukasz Majewski wrote:
[...]
> diff --git a/common/usb_storage.c b/common/usb_storage.c
> index 92e1e54d1c..3c2324fa1a 100644
> --- a/common/usb_storage.c
> +++ b/common/usb_storage.c
> @@ -729,6 +729,7 @@ static int usb_stor_BBB_transport(struct scsi_cmd *srb, struct us_data *us)
>  	pipeout = usb_sndbulkpipe(us->pusb_dev, us->ep_out);
>  	/* DATA phase + error handling */
>  	data_actlen = 0;
> +	mdelay(10);		/* Like linux does. */

Does this add delay to every single transfer ?
That would mean a massive slowdown if you use short data transfers,
which is needed for old/limited USB sticks.

>  	/* no data, go immediately to the STATUS phase */
>  	if (srb->datalen == 0)
>  		goto st;
> @@ -1023,9 +1024,32 @@ static int usb_request_sense(struct scsi_cmd *srb, struct us_data *ss)
>  	return 0;
>  }
>  
> +/*
> + * This spins up the disk and also consumes the time that the
> + * disk takes to become active and ready to read data.
> + * Some drives (like Western Digital) can take more than 5 seconds.
> + * The delay occurs on the 1st data read from the disk.
> + * Extending the timeout here works better than handling the timeout
> + * as an error on a "real" read.
> + */
> +static int usb_spinup(struct scsi_cmd *srb, struct us_data *ss)
> +{
> +	memset(&srb->cmd[0], 0, 12);
> +	srb->cmd[0] = SCSI_START_STP;
> +	srb->cmd[1] = srb->lun << 5;
> +	srb->cmd[4] = 1;	/* Start spinup. */
> +	srb->datalen = 0;
> +	srb->cmdlen = 6;
> +	ss->pusb_dev->extra_timeout = 9876;

What is this magic number ?

> +	ss->transport(srb, ss);
> +	ss->pusb_dev->extra_timeout = 0;
> +	return 0;
> +}

[...]

> diff --git a/include/usb.h b/include/usb.h
> index efb67ea33f..5b0f5ce5d6 100644
> --- a/include/usb.h
> +++ b/include/usb.h
> @@ -140,6 +140,7 @@ struct usb_device {
>  	int act_len;			/* transferred bytes */
>  	int maxchild;			/* Number of ports if hub */
>  	int portnr;			/* Port number, 1=first */
> +	int extra_timeout; /* Add to timeout in ehci_submit_async or spinup */

Does this work with xhci too ?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data
  2020-03-22 13:00 ` [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data Lukasz Majewski
@ 2020-03-22 13:45   ` Marek Vasut
  2020-03-23  7:18     ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-22 13:45 UTC (permalink / raw)
  To: u-boot

On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> This code adds check if QT_TOKEN_STATUS_XACTERR error occurred. When it is
> detected the token is reconfigured and transmission is retried.
> 
> This code is the port to newest U-Boot of the fix from - "rayvt" (from [1]).
> 
> Links:
> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> [2] - https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> 
> 
> Signed-off-by: Lukasz Majewski <lukma@denx.de>
> [Unfortunately, the original patch [2] did not contain S-o-B from the original
> author - "rayvt"]
> ---
> 
>  drivers/usb/host/ehci-hcd.c | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
> index 0a77111f80..45eda7ad24 100644
> --- a/drivers/usb/host/ehci-hcd.c
> +++ b/drivers/usb/host/ehci-hcd.c
> @@ -315,6 +315,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
>  	int timeout;
>  	int ret = 0;
>  	struct ehci_ctrl *ctrl = ehci_get_ctrl(dev);
> +	int trynum;
>  
>  	debug("dev=%p, pipe=%lx, buffer=%p, length=%d, req=%p\n", dev, pipe,
>  	      buffer, length, req);
> @@ -560,6 +561,10 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
>  	ehci_writel(&ctrl->hcor->or_usbsts, (usbsts & 0x3f));
>  
>  	/* Enable async. schedule. */
> +	trynum = 1;	/* No more than 2 tries, in case of XACTERR. */
> +retry_xacterr:;
> +	vtd = &qtd[qtd_counter - 1];
> +
>  	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
>  	cmd |= CMD_ASE;
>  	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);

Are you sure always retrying the transfer if you get XACTERR is the
right thing to do, even if you get that e.g. on filesystem writes ?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-22 13:18   ` Marek Vasut
@ 2020-03-23  6:57     ` Lukasz Majewski
  2020-03-23 11:46       ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23  6:57 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> > This reverts commit 02b0e1a36c5bc20174299312556ec4e266872bd6.
> > 
> > Signed-off-by: Lukasz Majewski <lukma@denx.de>  
> 
> This patch lacks any and all explanation why this is being reverted.
> The patch you are reverting here explains why it was added and what
> real issues it was fixing, so instead of reverting it, if there is an
> issue with that patch, you should identify the issue and fix it.

Marek, have you received the cover letter for this patch series?

In the cover letter I've written the rationale for reverting this
patch. 

In short - qhtoken has value of 0x0, when the token variable shows
errors. As a result the error handling is broken.
Could you comment on those arguments?

Moreover, I've explicitly stated that this is a Request For
Testing like patch series with a detailed report of testing procedure
(for my use case) for the USB in U-Boot (as Tom has tested the patch
with some ETH dongles). 



Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/02cc5973/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage
  2020-03-22 13:23   ` Marek Vasut
@ 2020-03-23  7:00     ` Lukasz Majewski
  2020-03-23 11:50       ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23  7:00 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> > This change brings support for handling XACTERR error during DATA
> > phase of USB BBB (bulk) transmission.
> > 
> > The fix is to clear stall'ed endpoint and reset recovery on the USB
> > mass storage class.
> > 
> > This code is the port to newest U-Boot of the fix from - "rayvt"
> > (from [1]).
> > 
> > Links:
> > [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> > [2] -
> > https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> > 
> > Signed-off-by: Lukasz Majewski <lukma@denx.de>
> > [Unfortunately, the original patch [2] did not contain S-o-B from
> > the original author - "rayvt"]
> > ---
> > 
> >  common/usb_storage.c | 10 ++++++++++
> >  include/usb_defs.h   |  1 +
> >  2 files changed, 11 insertions(+)
> > 
> > diff --git a/common/usb_storage.c b/common/usb_storage.c
> > index 097b6729c1..92e1e54d1c 100644
> > --- a/common/usb_storage.c
> > +++ b/common/usb_storage.c
> > @@ -740,6 +740,16 @@ static int usb_stor_BBB_transport(struct
> > scsi_cmd *srb, struct us_data *us) 
> >  	result = usb_bulk_msg(us->pusb_dev, pipe, srb->pdata,
> > srb->datalen, &data_actlen, USB_CNTL_TIMEOUT * 5);
> > +
> > +	/* special handling of XACTERR in DATA phase */
> > +	if (result < 0 && (us->pusb_dev->status & USB_ST_XACTERR))
> > {
> > +		debug("XACTERR in data phase - clr, reset, and
> > return fail.\n");
> > +		usb_stor_BBB_clear_endpt_stall(us, dir_in ?
> > +					       us->ep_in :
> > us->ep_out);
> > +		usb_stor_BBB_reset(us);
> > +		return USB_STOR_TRANSPORT_FAILED;
> > +	}
> > +  
> 
> Can resetting the endpoint result in data corruption of some sort
> here ?

Please correct me if I'm wrong, but this code is executed when we do
receive data, not writing them. Those operations shall (and are?)
orthogonal.

> Especially if this happens on filesystem write for example ?

Do we write data here?


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/fe801a2d/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational
  2020-03-22 13:29   ` Marek Vasut
@ 2020-03-23  7:04     ` Lukasz Majewski
  2020-03-23 11:52       ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23  7:04 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> > This change provides some extra time for some slow (or degraded)
> > USB devices to become fully operational.
> > 
> > This code is the port to newest U-Boot of the fix from - "rayvt"
> > (from [1]).
> > 
> > Links:
> > [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> > [2] -
> > https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> > 
> > Signed-off-by: Lukasz Majewski <lukma@denx.de>
> > [Unfortunately, the original patch [2] did not contain S-o-B from
> > the original author - "rayvt"]
> > ---
> > 
> >  common/usb.c | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/common/usb.c b/common/usb.c
> > index 349e838f1d..305482b5bb 100644
> > --- a/common/usb.c
> > +++ b/common/usb.c
> > @@ -925,14 +925,20 @@ static int get_descriptor_len(struct
> > usb_device *dev, int len, int expect_len) __maybe_unused struct
> > usb_device_descriptor *desc; ALLOC_CACHE_ALIGN_BUFFER(unsigned
> > char, tmpbuf, USB_BUFSIZ); int err;
> > +	int retry = 5;
> >  
> >  	desc = (struct usb_device_descriptor *)tmpbuf;
> >  
> > +again:
> >  	err = usb_get_descriptor(dev, USB_DT_DEVICE, 0, desc, len);
> >  	if (err < expect_len) {
> >  		if (err < 0) {
> > -			printf("unable to get device descriptor
> > (error=%d)\n",
> > -				err);
> > +			printf("unable to get device descriptor
> > (error=%d) retry: %d\n",
> > +			       err, retry);
> > +			mdelay(50);  
> 
> Why 50 mSec and not some other value, like 100 mSec ?

I think that this value (50 ms) was took from Linux in some point and
with the retry set to 5 was the ported heuristics.

If you ask why exactly there is 50 ms - I cannot say, as I've just
ported the patch.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/1f4f6231/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data
  2020-03-22 13:45   ` Marek Vasut
@ 2020-03-23  7:18     ` Lukasz Majewski
  2020-03-23 11:59       ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23  7:18 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> > This code adds check if QT_TOKEN_STATUS_XACTERR error occurred.
> > When it is detected the token is reconfigured and transmission is
> > retried.
> > 
> > This code is the port to newest U-Boot of the fix from - "rayvt"
> > (from [1]).
> > 
> > Links:
> > [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> > [2] -
> > https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> > 
> > 
> > Signed-off-by: Lukasz Majewski <lukma@denx.de>
> > [Unfortunately, the original patch [2] did not contain S-o-B from
> > the original author - "rayvt"]
> > ---
> > 
> >  drivers/usb/host/ehci-hcd.c | 27 ++++++++++++++++++++++++++-
> >  1 file changed, 26 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/usb/host/ehci-hcd.c
> > b/drivers/usb/host/ehci-hcd.c index 0a77111f80..45eda7ad24 100644
> > --- a/drivers/usb/host/ehci-hcd.c
> > +++ b/drivers/usb/host/ehci-hcd.c
> > @@ -315,6 +315,7 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned long pipe, void *buffer, int timeout;
> >  	int ret = 0;
> >  	struct ehci_ctrl *ctrl = ehci_get_ctrl(dev);
> > +	int trynum;
> >  
> >  	debug("dev=%p, pipe=%lx, buffer=%p, length=%d, req=%p\n",
> > dev, pipe, buffer, length, req);
> > @@ -560,6 +561,10 @@ ehci_submit_async(struct usb_device *dev,
> > unsigned long pipe, void *buffer,
> > ehci_writel(&ctrl->hcor->or_usbsts, (usbsts & 0x3f)); 
> >  	/* Enable async. schedule. */
> > +	trynum = 1;	/* No more than 2 tries, in case of
> > XACTERR. */ +retry_xacterr:;
> > +	vtd = &qtd[qtd_counter - 1];
> > +
> >  	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
> >  	cmd |= CMD_ASE;
> >  	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);  
> 
> Are you sure always retrying the transfer if you get XACTERR is the
> right thing to do, even if you get that e.g. on filesystem writes ?

Please correct me if I'm wrong, but this function - ehci_submit_async
- doesn't work with filesystem. It just setups proper EHCI descriptor
  and checks if it was sent with or without errors.

When the XACTERR happens, proper flag is cleared and the transmission
is retried.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/d6546ec7/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  2020-03-22 13:32   ` Marek Vasut
@ 2020-03-23  7:53     ` Lukasz Majewski
  2020-03-23 11:57       ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23  7:53 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> [...]
> > diff --git a/common/usb_storage.c b/common/usb_storage.c
> > index 92e1e54d1c..3c2324fa1a 100644
> > --- a/common/usb_storage.c
> > +++ b/common/usb_storage.c
> > @@ -729,6 +729,7 @@ static int usb_stor_BBB_transport(struct
> > scsi_cmd *srb, struct us_data *us) pipeout =
> > usb_sndbulkpipe(us->pusb_dev, us->ep_out); /* DATA phase + error
> > handling */ data_actlen = 0;
> > +	mdelay(10);		/* Like linux does. */  
> 
> Does this add delay to every single transfer ?

It brings the slowdown, yes.

However, without it I very often see the error that the USB address
couldn't be assigned.

> That would mean a massive slowdown if you use short data transfers,
> which is needed for old/limited USB sticks.
> 
> >  	/* no data, go immediately to the STATUS phase */
> >  	if (srb->datalen == 0)
> >  		goto st;
> > @@ -1023,9 +1024,32 @@ static int usb_request_sense(struct scsi_cmd
> > *srb, struct us_data *ss) return 0;
> >  }
> >  
> > +/*
> > + * This spins up the disk and also consumes the time that the
> > + * disk takes to become active and ready to read data.
> > + * Some drives (like Western Digital) can take more than 5 seconds.
> > + * The delay occurs on the 1st data read from the disk.
> > + * Extending the timeout here works better than handling the
> > timeout
> > + * as an error on a "real" read.
> > + */
> > +static int usb_spinup(struct scsi_cmd *srb, struct us_data *ss)
> > +{
> > +	memset(&srb->cmd[0], 0, 12);
> > +	srb->cmd[0] = SCSI_START_STP;
> > +	srb->cmd[1] = srb->lun << 5;
> > +	srb->cmd[4] = 1;	/* Start spinup. */
> > +	srb->datalen = 0;
> > +	srb->cmdlen = 6;
> > +	ss->pusb_dev->extra_timeout = 9876;  
> 
> What is this magic number ?

This number is added to the timeout up to which ehci controller waits
for EHCI TD to be sent.

Why there is 9876 - I do guess that it was took from Linux in some
point in time.

> 
> > +	ss->transport(srb, ss);
> > +	ss->pusb_dev->extra_timeout = 0;
> > +	return 0;
> > +}  
> 
> [...]
> 
> > diff --git a/include/usb.h b/include/usb.h
> > index efb67ea33f..5b0f5ce5d6 100644
> > --- a/include/usb.h
> > +++ b/include/usb.h
> > @@ -140,6 +140,7 @@ struct usb_device {
> >  	int act_len;			/* transferred bytes */
> >  	int maxchild;			/* Number of ports if
> > hub */ int portnr;			/* Port number, 1=first */
> > +	int extra_timeout; /* Add to timeout in ehci_submit_async
> > or spinup */  
> 
> Does this work with xhci too ?

Yes, it is used in patch 5/5.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/d7fd11d3/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-23  6:57     ` Lukasz Majewski
@ 2020-03-23 11:46       ` Marek Vasut
  2020-03-23 12:41         ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-23 11:46 UTC (permalink / raw)
  To: u-boot

On 3/23/20 7:57 AM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
>>> This reverts commit 02b0e1a36c5bc20174299312556ec4e266872bd6.
>>>
>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>  
>>
>> This patch lacks any and all explanation why this is being reverted.
>> The patch you are reverting here explains why it was added and what
>> real issues it was fixing, so instead of reverting it, if there is an
>> issue with that patch, you should identify the issue and fix it.
> 
> Marek, have you received the cover letter for this patch series?
> 
> In the cover letter I've written the rationale for reverting this
> patch. 

That should have been explained in this patch description.

> In short - qhtoken has value of 0x0, when the token variable shows
> errors. As a result the error handling is broken.
> Could you comment on those arguments?

Maybe you are referencing/reading the wrong token ?

You should probably figure out why this doesn't work first and then add
fixes on top.

> Moreover, I've explicitly stated that this is a Request For
> Testing like patch series with a detailed report of testing procedure
> (for my use case) for the USB in U-Boot (as Tom has tested the patch
> with some ETH dongles). 

I was still unable to replicate the ethernet device failure.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage
  2020-03-23  7:00     ` Lukasz Majewski
@ 2020-03-23 11:50       ` Marek Vasut
  2020-03-23 13:03         ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-23 11:50 UTC (permalink / raw)
  To: u-boot

On 3/23/20 8:00 AM, Lukasz Majewski wrote:
> Hi Marek,
> 
>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
>>> This change brings support for handling XACTERR error during DATA
>>> phase of USB BBB (bulk) transmission.
>>>
>>> The fix is to clear stall'ed endpoint and reset recovery on the USB
>>> mass storage class.
>>>
>>> This code is the port to newest U-Boot of the fix from - "rayvt"
>>> (from [1]).
>>>
>>> Links:
>>> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
>>> [2] -
>>> https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
>>>
>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>
>>> [Unfortunately, the original patch [2] did not contain S-o-B from
>>> the original author - "rayvt"]
>>> ---
>>>
>>>  common/usb_storage.c | 10 ++++++++++
>>>  include/usb_defs.h   |  1 +
>>>  2 files changed, 11 insertions(+)
>>>
>>> diff --git a/common/usb_storage.c b/common/usb_storage.c
>>> index 097b6729c1..92e1e54d1c 100644
>>> --- a/common/usb_storage.c
>>> +++ b/common/usb_storage.c
>>> @@ -740,6 +740,16 @@ static int usb_stor_BBB_transport(struct
>>> scsi_cmd *srb, struct us_data *us) 
>>>  	result = usb_bulk_msg(us->pusb_dev, pipe, srb->pdata,
>>> srb->datalen, &data_actlen, USB_CNTL_TIMEOUT * 5);
>>> +
>>> +	/* special handling of XACTERR in DATA phase */
>>> +	if (result < 0 && (us->pusb_dev->status & USB_ST_XACTERR))
>>> {
>>> +		debug("XACTERR in data phase - clr, reset, and
>>> return fail.\n");
>>> +		usb_stor_BBB_clear_endpt_stall(us, dir_in ?
>>> +					       us->ep_in :
>>> us->ep_out);
>>> +		usb_stor_BBB_reset(us);
>>> +		return USB_STOR_TRANSPORT_FAILED;
>>> +	}
>>> +  
>>
>> Can resetting the endpoint result in data corruption of some sort
>> here ?
> 
> Please correct me if I'm wrong, but this code is executed when we do
> receive data, not writing them. Those operations shall (and are?)
> orthogonal.
> 
>> Especially if this happens on filesystem write for example ?
> 
> Do we write data here?

I only did a very quick look into the code, but there I see

1082 static int usb_write_10(struct scsi_cmd *srb, struct us_data *ss,
...
1096         return ss->transport(srb, ss);

1338         case US_PR_BULK:
1339                 debug("Bulk/Bulk/Bulk\n");
1340                 ss->transport = usb_stor_BBB_transport;

So I would say, the answer is -- yes.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational
  2020-03-23  7:04     ` Lukasz Majewski
@ 2020-03-23 11:52       ` Marek Vasut
  0 siblings, 0 replies; 35+ messages in thread
From: Marek Vasut @ 2020-03-23 11:52 UTC (permalink / raw)
  To: u-boot

On 3/23/20 8:04 AM, Lukasz Majewski wrote:
> Hi Marek,
> 
>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
>>> This change provides some extra time for some slow (or degraded)
>>> USB devices to become fully operational.
>>>
>>> This code is the port to newest U-Boot of the fix from - "rayvt"
>>> (from [1]).
>>>
>>> Links:
>>> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
>>> [2] -
>>> https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
>>>
>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>
>>> [Unfortunately, the original patch [2] did not contain S-o-B from
>>> the original author - "rayvt"]
>>> ---
>>>
>>>  common/usb.c | 10 ++++++++--
>>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/common/usb.c b/common/usb.c
>>> index 349e838f1d..305482b5bb 100644
>>> --- a/common/usb.c
>>> +++ b/common/usb.c
>>> @@ -925,14 +925,20 @@ static int get_descriptor_len(struct
>>> usb_device *dev, int len, int expect_len) __maybe_unused struct
>>> usb_device_descriptor *desc; ALLOC_CACHE_ALIGN_BUFFER(unsigned
>>> char, tmpbuf, USB_BUFSIZ); int err;
>>> +	int retry = 5;
>>>  
>>>  	desc = (struct usb_device_descriptor *)tmpbuf;
>>>  
>>> +again:
>>>  	err = usb_get_descriptor(dev, USB_DT_DEVICE, 0, desc, len);
>>>  	if (err < expect_len) {
>>>  		if (err < 0) {
>>> -			printf("unable to get device descriptor
>>> (error=%d)\n",
>>> -				err);
>>> +			printf("unable to get device descriptor
>>> (error=%d) retry: %d\n",
>>> +			       err, retry);
>>> +			mdelay(50);  
>>
>> Why 50 mSec and not some other value, like 100 mSec ?
> 
> I think that this value (50 ms) was took from Linux in some point and
> with the retry set to 5 was the ported heuristics.
> 
> If you ask why exactly there is 50 ms - I cannot say, as I've just
> ported the patch.

I see, then please research this. The USB stack has enough ad-hoc random
values in it already, no need to add new ones.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  2020-03-23  7:53     ` Lukasz Majewski
@ 2020-03-23 11:57       ` Marek Vasut
  2020-03-23 12:54         ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-23 11:57 UTC (permalink / raw)
  To: u-boot

On 3/23/20 8:53 AM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
>> [...]
>>> diff --git a/common/usb_storage.c b/common/usb_storage.c
>>> index 92e1e54d1c..3c2324fa1a 100644
>>> --- a/common/usb_storage.c
>>> +++ b/common/usb_storage.c
>>> @@ -729,6 +729,7 @@ static int usb_stor_BBB_transport(struct
>>> scsi_cmd *srb, struct us_data *us) pipeout =
>>> usb_sndbulkpipe(us->pusb_dev, us->ep_out); /* DATA phase + error
>>> handling */ data_actlen = 0;
>>> +	mdelay(10);		/* Like linux does. */  
>>
>> Does this add delay to every single transfer ?
> 
> It brings the slowdown, yes.
> 
> However, without it I very often see the error that the USB address
> couldn't be assigned.

Seems like this is hiding some real error then.

If I do basic math, then I reach a conclusion that the comment is bogus.
Look, assume the transfer itself takes 0 time, then if you have 10 mS
delays between transfers, you can do 100 transfer per second. If one
transfer is 240 blocks * 512 bytes , then you are limited to 12.2 MB/s.
And I am positive USB 2.0 sticks in Linux can transfer faster than that.

>> That would mean a massive slowdown if you use short data transfers,
>> which is needed for old/limited USB sticks.
>>
>>>  	/* no data, go immediately to the STATUS phase */
>>>  	if (srb->datalen == 0)
>>>  		goto st;
>>> @@ -1023,9 +1024,32 @@ static int usb_request_sense(struct scsi_cmd
>>> *srb, struct us_data *ss) return 0;
>>>  }
>>>  
>>> +/*
>>> + * This spins up the disk and also consumes the time that the
>>> + * disk takes to become active and ready to read data.
>>> + * Some drives (like Western Digital) can take more than 5 seconds.
>>> + * The delay occurs on the 1st data read from the disk.
>>> + * Extending the timeout here works better than handling the
>>> timeout
>>> + * as an error on a "real" read.
>>> + */
>>> +static int usb_spinup(struct scsi_cmd *srb, struct us_data *ss)
>>> +{
>>> +	memset(&srb->cmd[0], 0, 12);
>>> +	srb->cmd[0] = SCSI_START_STP;
>>> +	srb->cmd[1] = srb->lun << 5;
>>> +	srb->cmd[4] = 1;	/* Start spinup. */
>>> +	srb->datalen = 0;
>>> +	srb->cmdlen = 6;
>>> +	ss->pusb_dev->extra_timeout = 9876;  
>>
>> What is this magic number ?
> 
> This number is added to the timeout up to which ehci controller waits
> for EHCI TD to be sent.

This is generic code and has to work with OHCI/UHCI/xHCI too.

> Why there is 9876 - I do guess that it was took from Linux in some
> point in time.

Please, research it.

>>> +	ss->transport(srb, ss);
>>> +	ss->pusb_dev->extra_timeout = 0;
>>> +	return 0;
>>> +}  
>>
>> [...]
>>
>>> diff --git a/include/usb.h b/include/usb.h
>>> index efb67ea33f..5b0f5ce5d6 100644
>>> --- a/include/usb.h
>>> +++ b/include/usb.h
>>> @@ -140,6 +140,7 @@ struct usb_device {
>>>  	int act_len;			/* transferred bytes */
>>>  	int maxchild;			/* Number of ports if
>>> hub */ int portnr;			/* Port number, 1=first */
>>> +	int extra_timeout; /* Add to timeout in ehci_submit_async
>>> or spinup */  
>>
>> Does this work with xhci too ?
> 
> Yes, it is used in patch 5/5.

Does xhci need it ?

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data
  2020-03-23  7:18     ` Lukasz Majewski
@ 2020-03-23 11:59       ` Marek Vasut
  2020-03-23 12:58         ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-23 11:59 UTC (permalink / raw)
  To: u-boot

On 3/23/20 8:18 AM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
>>> This code adds check if QT_TOKEN_STATUS_XACTERR error occurred.
>>> When it is detected the token is reconfigured and transmission is
>>> retried.
>>>
>>> This code is the port to newest U-Boot of the fix from - "rayvt"
>>> (from [1]).
>>>
>>> Links:
>>> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
>>> [2] -
>>> https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
>>>
>>>
>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>
>>> [Unfortunately, the original patch [2] did not contain S-o-B from
>>> the original author - "rayvt"]
>>> ---
>>>
>>>  drivers/usb/host/ehci-hcd.c | 27 ++++++++++++++++++++++++++-
>>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/usb/host/ehci-hcd.c
>>> b/drivers/usb/host/ehci-hcd.c index 0a77111f80..45eda7ad24 100644
>>> --- a/drivers/usb/host/ehci-hcd.c
>>> +++ b/drivers/usb/host/ehci-hcd.c
>>> @@ -315,6 +315,7 @@ ehci_submit_async(struct usb_device *dev,
>>> unsigned long pipe, void *buffer, int timeout;
>>>  	int ret = 0;
>>>  	struct ehci_ctrl *ctrl = ehci_get_ctrl(dev);
>>> +	int trynum;
>>>  
>>>  	debug("dev=%p, pipe=%lx, buffer=%p, length=%d, req=%p\n",
>>> dev, pipe, buffer, length, req);
>>> @@ -560,6 +561,10 @@ ehci_submit_async(struct usb_device *dev,
>>> unsigned long pipe, void *buffer,
>>> ehci_writel(&ctrl->hcor->or_usbsts, (usbsts & 0x3f)); 
>>>  	/* Enable async. schedule. */
>>> +	trynum = 1;	/* No more than 2 tries, in case of
>>> XACTERR. */ +retry_xacterr:;
>>> +	vtd = &qtd[qtd_counter - 1];
>>> +
>>>  	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
>>>  	cmd |= CMD_ASE;
>>>  	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);  
>>
>> Are you sure always retrying the transfer if you get XACTERR is the
>> right thing to do, even if you get that e.g. on filesystem writes ?
> 
> Please correct me if I'm wrong, but this function - ehci_submit_async
> - doesn't work with filesystem. It just setups proper EHCI descriptor
>   and checks if it was sent with or without errors.

If you're writing into a block device, this function is used. If you get
XACTERR during the transfer and you retry, is there a chance the data
get actually written to the device ?

> When the XACTERR happens, proper flag is cleared and the transmission
> is retried.

What happens to the payload if it's a host-to-device transfer, do the
data get discarded or do they get written to the storage ?

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-23 11:46       ` Marek Vasut
@ 2020-03-23 12:41         ` Lukasz Majewski
  2020-03-24  0:58           ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23 12:41 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/23/20 7:57 AM, Lukasz Majewski wrote:
> > Hi Marek,  
> 
> Hi,
> 
> >> On 3/22/20 2:00 PM, Lukasz Majewski wrote:  
> >>> This reverts commit 02b0e1a36c5bc20174299312556ec4e266872bd6.
> >>>
> >>> Signed-off-by: Lukasz Majewski <lukma@denx.de>    
> >>
> >> This patch lacks any and all explanation why this is being
> >> reverted. The patch you are reverting here explains why it was
> >> added and what real issues it was fixing, so instead of reverting
> >> it, if there is an issue with that patch, you should identify the
> >> issue and fix it.  
> > 
> > Marek, have you received the cover letter for this patch series?
> > 
> > In the cover letter I've written the rationale for reverting this
> > patch.   
> 
> That should have been explained in this patch description.
> 
> > In short - qhtoken has value of 0x0, when the token variable shows
> > errors. As a result the error handling is broken.
> > Could you comment on those arguments?  
> 
> Maybe you are referencing/reading the wrong token ?

I'm printing the token which is used afterwards for reacting on possible
errors.

> 
> You should probably figure out why this doesn't work first and then
> add fixes on top.

Haven't you seen such problem during code development on your setup
when developing this patch? 

> 
> > Moreover, I've explicitly stated that this is a Request For
> > Testing like patch series with a detailed report of testing
> > procedure (for my use case) for the USB in U-Boot (as Tom has
> > tested the patch with some ETH dongles).   
> 
> I was still unable to replicate the ethernet device failure.
> 

Which boards and SoCs do you used for your test setup?

For me the issue is visible on i.MX53 and i.MX6Q.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/3a45d9e7/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  2020-03-23 11:57       ` Marek Vasut
@ 2020-03-23 12:54         ` Lukasz Majewski
  2020-03-24  1:04           ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23 12:54 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/23/20 8:53 AM, Lukasz Majewski wrote:
> > Hi Marek,  
> 
> Hi,
> 
> >> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
> >> [...]  
> >>> diff --git a/common/usb_storage.c b/common/usb_storage.c
> >>> index 92e1e54d1c..3c2324fa1a 100644
> >>> --- a/common/usb_storage.c
> >>> +++ b/common/usb_storage.c
> >>> @@ -729,6 +729,7 @@ static int usb_stor_BBB_transport(struct
> >>> scsi_cmd *srb, struct us_data *us) pipeout =
> >>> usb_sndbulkpipe(us->pusb_dev, us->ep_out); /* DATA phase + error
> >>> handling */ data_actlen = 0;
> >>> +	mdelay(10);		/* Like linux does. */    
> >>
> >> Does this add delay to every single transfer ?  
> > 
> > It brings the slowdown, yes.
> > 
> > However, without it I very often see the error that the USB address
> > couldn't be assigned.  
> 
> Seems like this is hiding some real error then.
> 
> If I do basic math, then I reach a conclusion that the comment is
> bogus. Look, assume the transfer itself takes 0 time, then if you
> have 10 mS delays between transfers, you can do 100 transfer per
> second. If one transfer is 240 blocks * 512 bytes , then you are
> limited to 12.2 MB/s. And I am positive USB 2.0 sticks in Linux can
> transfer faster than that.

Theoretically USB 2.0 can have up to 60MiB/s transfer rate. The 12MiB/s
is for USB 1.x. 

I cannot say why this delay helps with recognition of some devices. I
also start wondering why I do see some strange problems in U-Boot (like
not assigning address, timeouts), as those errors are not present on
Linux.

> 
> >> That would mean a massive slowdown if you use short data transfers,
> >> which is needed for old/limited USB sticks.
> >>  
> >>>  	/* no data, go immediately to the STATUS phase */
> >>>  	if (srb->datalen == 0)
> >>>  		goto st;
> >>> @@ -1023,9 +1024,32 @@ static int usb_request_sense(struct
> >>> scsi_cmd *srb, struct us_data *ss) return 0;
> >>>  }
> >>>  
> >>> +/*
> >>> + * This spins up the disk and also consumes the time that the
> >>> + * disk takes to become active and ready to read data.
> >>> + * Some drives (like Western Digital) can take more than 5
> >>> seconds.
> >>> + * The delay occurs on the 1st data read from the disk.
> >>> + * Extending the timeout here works better than handling the
> >>> timeout
> >>> + * as an error on a "real" read.
> >>> + */
> >>> +static int usb_spinup(struct scsi_cmd *srb, struct us_data *ss)
> >>> +{
> >>> +	memset(&srb->cmd[0], 0, 12);
> >>> +	srb->cmd[0] = SCSI_START_STP;
> >>> +	srb->cmd[1] = srb->lun << 5;
> >>> +	srb->cmd[4] = 1;	/* Start spinup. */
> >>> +	srb->datalen = 0;
> >>> +	srb->cmdlen = 6;
> >>> +	ss->pusb_dev->extra_timeout = 9876;    
> >>
> >> What is this magic number ?  
> > 
> > This number is added to the timeout up to which ehci controller
> > waits for EHCI TD to be sent.  
> 
> This is generic code and has to work with OHCI/UHCI/xHCI too.
> 
> > Why there is 9876 - I do guess that it was took from Linux in some
> > point in time.  
> 
> Please, research it.
> 
> >>> +	ss->transport(srb, ss);
> >>> +	ss->pusb_dev->extra_timeout = 0;
> >>> +	return 0;
> >>> +}    
> >>
> >> [...]
> >>  
> >>> diff --git a/include/usb.h b/include/usb.h
> >>> index efb67ea33f..5b0f5ce5d6 100644
> >>> --- a/include/usb.h
> >>> +++ b/include/usb.h
> >>> @@ -140,6 +140,7 @@ struct usb_device {
> >>>  	int act_len;			/* transferred bytes
> >>> */ int maxchild;			/* Number of ports if
> >>> hub */ int portnr;			/* Port number, 1=first
> >>> */
> >>> +	int extra_timeout; /* Add to timeout in ehci_submit_async
> >>> or spinup */    
> >>
> >> Does this work with xhci too ?  
> > 
> > Yes, it is used in patch 5/5.  
> 
> Does xhci need it ?
> 

It is hard to say. The original fix was for EHCI.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/e0c86ad7/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data
  2020-03-23 11:59       ` Marek Vasut
@ 2020-03-23 12:58         ` Lukasz Majewski
  2020-03-24  1:06           ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23 12:58 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/23/20 8:18 AM, Lukasz Majewski wrote:
> > Hi Marek,  
> 
> Hi,
> 
> >> On 3/22/20 2:00 PM, Lukasz Majewski wrote:  
> >>> This code adds check if QT_TOKEN_STATUS_XACTERR error occurred.
> >>> When it is detected the token is reconfigured and transmission is
> >>> retried.
> >>>
> >>> This code is the port to newest U-Boot of the fix from - "rayvt"
> >>> (from [1]).
> >>>
> >>> Links:
> >>> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> >>> [2] -
> >>> https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> >>>
> >>>
> >>> Signed-off-by: Lukasz Majewski <lukma@denx.de>
> >>> [Unfortunately, the original patch [2] did not contain S-o-B from
> >>> the original author - "rayvt"]
> >>> ---
> >>>
> >>>  drivers/usb/host/ehci-hcd.c | 27 ++++++++++++++++++++++++++-
> >>>  1 file changed, 26 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/usb/host/ehci-hcd.c
> >>> b/drivers/usb/host/ehci-hcd.c index 0a77111f80..45eda7ad24 100644
> >>> --- a/drivers/usb/host/ehci-hcd.c
> >>> +++ b/drivers/usb/host/ehci-hcd.c
> >>> @@ -315,6 +315,7 @@ ehci_submit_async(struct usb_device *dev,
> >>> unsigned long pipe, void *buffer, int timeout;
> >>>  	int ret = 0;
> >>>  	struct ehci_ctrl *ctrl = ehci_get_ctrl(dev);
> >>> +	int trynum;
> >>>  
> >>>  	debug("dev=%p, pipe=%lx, buffer=%p, length=%d, req=%p\n",
> >>> dev, pipe, buffer, length, req);
> >>> @@ -560,6 +561,10 @@ ehci_submit_async(struct usb_device *dev,
> >>> unsigned long pipe, void *buffer,
> >>> ehci_writel(&ctrl->hcor->or_usbsts, (usbsts & 0x3f)); 
> >>>  	/* Enable async. schedule. */
> >>> +	trynum = 1;	/* No more than 2 tries, in case of
> >>> XACTERR. */ +retry_xacterr:;
> >>> +	vtd = &qtd[qtd_counter - 1];
> >>> +
> >>>  	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
> >>>  	cmd |= CMD_ASE;
> >>>  	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);    
> >>
> >> Are you sure always retrying the transfer if you get XACTERR is the
> >> right thing to do, even if you get that e.g. on filesystem writes
> >> ?  
> > 
> > Please correct me if I'm wrong, but this function -
> > ehci_submit_async
> > - doesn't work with filesystem. It just setups proper EHCI
> > descriptor and checks if it was sent with or without errors.  
> 
> If you're writing into a block device, this function is used. If you
> get XACTERR during the transfer and you retry, is there a chance the
> data get actually written to the device ?

As fair as I can tell this code snippet doesn't touch file systems.

> 
> > When the XACTERR happens, proper flag is cleared and the
> > transmission is retried.  
> 
> What happens to the payload if it's a host-to-device transfer, do the
> data get discarded or do they get written to the storage ?
> 

As state above - the resent shall only result is data being to some
in-memory buffer. 


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/c11fbcea/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage
  2020-03-23 11:50       ` Marek Vasut
@ 2020-03-23 13:03         ` Lukasz Majewski
  2020-03-24  1:01           ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23 13:03 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/23/20 8:00 AM, Lukasz Majewski wrote:
> > Hi Marek,
> >   
> >> On 3/22/20 2:00 PM, Lukasz Majewski wrote:  
> >>> This change brings support for handling XACTERR error during DATA
> >>> phase of USB BBB (bulk) transmission.
> >>>
> >>> The fix is to clear stall'ed endpoint and reset recovery on the
> >>> USB mass storage class.
> >>>
> >>> This code is the port to newest U-Boot of the fix from - "rayvt"
> >>> (from [1]).
> >>>
> >>> Links:
> >>> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
> >>> [2] -
> >>> https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
> >>>
> >>> Signed-off-by: Lukasz Majewski <lukma@denx.de>
> >>> [Unfortunately, the original patch [2] did not contain S-o-B from
> >>> the original author - "rayvt"]
> >>> ---
> >>>
> >>>  common/usb_storage.c | 10 ++++++++++
> >>>  include/usb_defs.h   |  1 +
> >>>  2 files changed, 11 insertions(+)
> >>>
> >>> diff --git a/common/usb_storage.c b/common/usb_storage.c
> >>> index 097b6729c1..92e1e54d1c 100644
> >>> --- a/common/usb_storage.c
> >>> +++ b/common/usb_storage.c
> >>> @@ -740,6 +740,16 @@ static int usb_stor_BBB_transport(struct
> >>> scsi_cmd *srb, struct us_data *us) 
> >>>  	result = usb_bulk_msg(us->pusb_dev, pipe, srb->pdata,
> >>> srb->datalen, &data_actlen, USB_CNTL_TIMEOUT * 5);
> >>> +
> >>> +	/* special handling of XACTERR in DATA phase */
> >>> +	if (result < 0 && (us->pusb_dev->status &
> >>> USB_ST_XACTERR)) {
> >>> +		debug("XACTERR in data phase - clr, reset, and
> >>> return fail.\n");
> >>> +		usb_stor_BBB_clear_endpt_stall(us, dir_in ?
> >>> +					       us->ep_in :
> >>> us->ep_out);
> >>> +		usb_stor_BBB_reset(us);
> >>> +		return USB_STOR_TRANSPORT_FAILED;
> >>> +	}
> >>> +    
> >>
> >> Can resetting the endpoint result in data corruption of some sort
> >> here ?  
> > 
> > Please correct me if I'm wrong, but this code is executed when we do
> > receive data, not writing them. Those operations shall (and are?)
> > orthogonal.
> >   
> >> Especially if this happens on filesystem write for example ?  
> > 
> > Do we write data here?  
> 
> I only did a very quick look into the code, but there I see
> 
> 1082 static int usb_write_10(struct scsi_cmd *srb, struct us_data *ss,
> ...
> 1096         return ss->transport(srb, ss);
> 
> 1338         case US_PR_BULK:
> 1339                 debug("Bulk/Bulk/Bulk\n");
> 1340                 ss->transport = usb_stor_BBB_transport;
> 
> So I would say, the answer is -- yes.
> 

A few lines down (usb_storage.c @ L757) you have the USB_ST_STALLED
status handled in the same way (it also aborts after a single retry).


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/ddac4d0d/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation
  2020-03-22 13:00 [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Lukasz Majewski
                   ` (4 preceding siblings ...)
  2020-03-22 13:00 ` [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data Lukasz Majewski
@ 2020-03-23 20:58 ` Tom Rini
  2020-03-23 22:11   ` Lukasz Majewski
  5 siblings, 1 reply; 35+ messages in thread
From: Tom Rini @ 2020-03-23 20:58 UTC (permalink / raw)
  To: u-boot

On Sun, Mar 22, 2020 at 02:00:26PM +0100, Lukasz Majewski wrote:

> This patch set is rather a request for testing (and a starting point for the
> discussion), as it may improve the robustness of USB with some pendrives - and
> yes sacrifice some performance for reliability.
> The previous version of this patch: https://patchwork.ozlabs.org/patch/1244928/
> fixed issue for some network USB adapters and improved stability on TI boards.
> This patch also provides very detailed explanation of the problem in the commit
> message.
> 
> With the async support patch applied (
> SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 ), the qhtoken variable
> has value 0x00 when token shows errors. As a result the error handling path
> is not executed. This looks like some missing/broken cache flushing - for easier
> bisecting this patch has been reverted for now

Note that while the original patch returned USB ethernet on the
Beagleboard to a functional state with this series applied it's back to
non-functional.

-- 
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 659 bytes
Desc: not available
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/f3e60009/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation
  2020-03-23 20:58 ` [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Tom Rini
@ 2020-03-23 22:11   ` Lukasz Majewski
  0 siblings, 0 replies; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-23 22:11 UTC (permalink / raw)
  To: u-boot

Hi Tom,

> On Sun, Mar 22, 2020 at 02:00:26PM +0100, Lukasz Majewski wrote:
> 
> > This patch set is rather a request for testing (and a starting
> > point for the discussion), as it may improve the robustness of USB
> > with some pendrives - and yes sacrifice some performance for
> > reliability. The previous version of this patch:
> > https://patchwork.ozlabs.org/patch/1244928/ fixed issue for some
> > network USB adapters and improved stability on TI boards. This
> > patch also provides very detailed explanation of the problem in the
> > commit message.
> > 
> > With the async support patch applied (
> > SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 ), the qhtoken
> > variable has value 0x00 when token shows errors. As a result the
> > error handling path is not executed. This looks like some
> > missing/broken cache flushing - for easier bisecting this patch has
> > been reverted for now  
> 
> Note that while the original patch returned USB ethernet on the
> Beagleboard to a functional state with this series applied it's back
> to non-functional.
> 

Thanks for testing.

The _only_ difference between the first version of this patch and this
one is the lack of dynamic reduction of transfer size for the latter.
The former reduces the transfer size to 64 blocks (instead of default
240). And with 64 blocks it retries two times the transmission.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200323/3c9e4559/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-23 12:41         ` Lukasz Majewski
@ 2020-03-24  0:58           ` Marek Vasut
  2020-03-24  7:06             ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-24  0:58 UTC (permalink / raw)
  To: u-boot

On 3/23/20 1:41 PM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/23/20 7:57 AM, Lukasz Majewski wrote:
>>> Hi Marek,  
>>
>> Hi,
>>
>>>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:  
>>>>> This reverts commit 02b0e1a36c5bc20174299312556ec4e266872bd6.
>>>>>
>>>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>    
>>>>
>>>> This patch lacks any and all explanation why this is being
>>>> reverted. The patch you are reverting here explains why it was
>>>> added and what real issues it was fixing, so instead of reverting
>>>> it, if there is an issue with that patch, you should identify the
>>>> issue and fix it.  
>>>
>>> Marek, have you received the cover letter for this patch series?
>>>
>>> In the cover letter I've written the rationale for reverting this
>>> patch.   
>>
>> That should have been explained in this patch description.
>>
>>> In short - qhtoken has value of 0x0, when the token variable shows
>>> errors. As a result the error handling is broken.
>>> Could you comment on those arguments?  
>>
>> Maybe you are referencing/reading the wrong token ?
> 
> I'm printing the token which is used afterwards for reacting on possible
> errors.
> 
>>
>> You should probably figure out why this doesn't work first and then
>> add fixes on top.
> 
> Haven't you seen such problem during code development on your setup
> when developing this patch? 

During the development of the patch, I don't remember, sorry. I most
certainly saw various failure modes, however those should not be present
mainline.

I tested this patch with the problematic USB sticks on R-Car Gen3 and
with SMSC95xx USB ethernet adapter last weekend and I didn't see a problem.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage
  2020-03-23 13:03         ` Lukasz Majewski
@ 2020-03-24  1:01           ` Marek Vasut
  0 siblings, 0 replies; 35+ messages in thread
From: Marek Vasut @ 2020-03-24  1:01 UTC (permalink / raw)
  To: u-boot

On 3/23/20 2:03 PM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/23/20 8:00 AM, Lukasz Majewski wrote:
>>> Hi Marek,
>>>   
>>>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:  
>>>>> This change brings support for handling XACTERR error during DATA
>>>>> phase of USB BBB (bulk) transmission.
>>>>>
>>>>> The fix is to clear stall'ed endpoint and reset recovery on the
>>>>> USB mass storage class.
>>>>>
>>>>> This code is the port to newest U-Boot of the fix from - "rayvt"
>>>>> (from [1]).
>>>>>
>>>>> Links:
>>>>> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
>>>>> [2] -
>>>>> https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
>>>>>
>>>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>
>>>>> [Unfortunately, the original patch [2] did not contain S-o-B from
>>>>> the original author - "rayvt"]
>>>>> ---
>>>>>
>>>>>  common/usb_storage.c | 10 ++++++++++
>>>>>  include/usb_defs.h   |  1 +
>>>>>  2 files changed, 11 insertions(+)
>>>>>
>>>>> diff --git a/common/usb_storage.c b/common/usb_storage.c
>>>>> index 097b6729c1..92e1e54d1c 100644
>>>>> --- a/common/usb_storage.c
>>>>> +++ b/common/usb_storage.c
>>>>> @@ -740,6 +740,16 @@ static int usb_stor_BBB_transport(struct
>>>>> scsi_cmd *srb, struct us_data *us) 
>>>>>  	result = usb_bulk_msg(us->pusb_dev, pipe, srb->pdata,
>>>>> srb->datalen, &data_actlen, USB_CNTL_TIMEOUT * 5);
>>>>> +
>>>>> +	/* special handling of XACTERR in DATA phase */
>>>>> +	if (result < 0 && (us->pusb_dev->status &
>>>>> USB_ST_XACTERR)) {
>>>>> +		debug("XACTERR in data phase - clr, reset, and
>>>>> return fail.\n");
>>>>> +		usb_stor_BBB_clear_endpt_stall(us, dir_in ?
>>>>> +					       us->ep_in :
>>>>> us->ep_out);
>>>>> +		usb_stor_BBB_reset(us);
>>>>> +		return USB_STOR_TRANSPORT_FAILED;
>>>>> +	}
>>>>> +    
>>>>
>>>> Can resetting the endpoint result in data corruption of some sort
>>>> here ?  
>>>
>>> Please correct me if I'm wrong, but this code is executed when we do
>>> receive data, not writing them. Those operations shall (and are?)
>>> orthogonal.
>>>   
>>>> Especially if this happens on filesystem write for example ?  
>>>
>>> Do we write data here?  
>>
>> I only did a very quick look into the code, but there I see
>>
>> 1082 static int usb_write_10(struct scsi_cmd *srb, struct us_data *ss,
>> ...
>> 1096         return ss->transport(srb, ss);
>>
>> 1338         case US_PR_BULK:
>> 1339                 debug("Bulk/Bulk/Bulk\n");
>> 1340                 ss->transport = usb_stor_BBB_transport;
>>
>> So I would say, the answer is -- yes.
>>
> 
> A few lines down (usb_storage.c @ L757) you have the USB_ST_STALLED
> status handled in the same way (it also aborts after a single retry).

But stall isn't xfer error. Which makes me wonder, why is the xfer error
here handled the same way as a stall, shouldn't the handling be different ?

Also, this doesn't answer the question whether such handling can cause
data corruption during write.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs)
  2020-03-23 12:54         ` Lukasz Majewski
@ 2020-03-24  1:04           ` Marek Vasut
  0 siblings, 0 replies; 35+ messages in thread
From: Marek Vasut @ 2020-03-24  1:04 UTC (permalink / raw)
  To: u-boot

On 3/23/20 1:54 PM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/23/20 8:53 AM, Lukasz Majewski wrote:
>>> Hi Marek,  
>>
>> Hi,
>>
>>>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:
>>>> [...]  
>>>>> diff --git a/common/usb_storage.c b/common/usb_storage.c
>>>>> index 92e1e54d1c..3c2324fa1a 100644
>>>>> --- a/common/usb_storage.c
>>>>> +++ b/common/usb_storage.c
>>>>> @@ -729,6 +729,7 @@ static int usb_stor_BBB_transport(struct
>>>>> scsi_cmd *srb, struct us_data *us) pipeout =
>>>>> usb_sndbulkpipe(us->pusb_dev, us->ep_out); /* DATA phase + error
>>>>> handling */ data_actlen = 0;
>>>>> +	mdelay(10);		/* Like linux does. */    
>>>>
>>>> Does this add delay to every single transfer ?  
>>>
>>> It brings the slowdown, yes.
>>>
>>> However, without it I very often see the error that the USB address
>>> couldn't be assigned.  
>>
>> Seems like this is hiding some real error then.
>>
>> If I do basic math, then I reach a conclusion that the comment is
>> bogus. Look, assume the transfer itself takes 0 time, then if you
>> have 10 mS delays between transfers, you can do 100 transfer per
>> second. If one transfer is 240 blocks * 512 bytes , then you are
>> limited to 12.2 MB/s. And I am positive USB 2.0 sticks in Linux can
>> transfer faster than that.
> 
> Theoretically USB 2.0 can have up to 60MiB/s transfer rate. The 12MiB/s
> is for USB 1.x. 

If you add such a massive 10 mS delay between each and every single
read, then the performance will be much lower, see the simple
calculation above.

> I cannot say why this delay helps with recognition of some devices. I
> also start wondering why I do see some strange problems in U-Boot (like
> not assigning address, timeouts), as those errors are not present on
> Linux.

Maybe something to investigate in more depth ?

[...]

>>>>> diff --git a/include/usb.h b/include/usb.h
>>>>> index efb67ea33f..5b0f5ce5d6 100644
>>>>> --- a/include/usb.h
>>>>> +++ b/include/usb.h
>>>>> @@ -140,6 +140,7 @@ struct usb_device {
>>>>>  	int act_len;			/* transferred bytes
>>>>> */ int maxchild;			/* Number of ports if
>>>>> hub */ int portnr;			/* Port number, 1=first
>>>>> */
>>>>> +	int extra_timeout; /* Add to timeout in ehci_submit_async
>>>>> or spinup */    
>>>>
>>>> Does this work with xhci too ?  
>>>
>>> Yes, it is used in patch 5/5.  
>>
>> Does xhci need it ?
>>
> 
> It is hard to say. The original fix was for EHCI.

Please check. This needs to be fixed properly instead of just papering
over the problem and hacking it up somehow to make it somehow work.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data
  2020-03-23 12:58         ` Lukasz Majewski
@ 2020-03-24  1:06           ` Marek Vasut
  0 siblings, 0 replies; 35+ messages in thread
From: Marek Vasut @ 2020-03-24  1:06 UTC (permalink / raw)
  To: u-boot

On 3/23/20 1:58 PM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/23/20 8:18 AM, Lukasz Majewski wrote:
>>> Hi Marek,  
>>
>> Hi,
>>
>>>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:  
>>>>> This code adds check if QT_TOKEN_STATUS_XACTERR error occurred.
>>>>> When it is detected the token is reconfigured and transmission is
>>>>> retried.
>>>>>
>>>>> This code is the port to newest U-Boot of the fix from - "rayvt"
>>>>> (from [1]).
>>>>>
>>>>> Links:
>>>>> [1] - https://forum.doozan.com/read.php?3,35295,35295#msg-35295
>>>>> [2] -
>>>>> https://www.dropbox.com/s/nrkrd1no63viuu8/uboot-bodhi-2016.05-timeoutTD.patch?dl=0
>>>>>
>>>>>
>>>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>
>>>>> [Unfortunately, the original patch [2] did not contain S-o-B from
>>>>> the original author - "rayvt"]
>>>>> ---
>>>>>
>>>>>  drivers/usb/host/ehci-hcd.c | 27 ++++++++++++++++++++++++++-
>>>>>  1 file changed, 26 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/usb/host/ehci-hcd.c
>>>>> b/drivers/usb/host/ehci-hcd.c index 0a77111f80..45eda7ad24 100644
>>>>> --- a/drivers/usb/host/ehci-hcd.c
>>>>> +++ b/drivers/usb/host/ehci-hcd.c
>>>>> @@ -315,6 +315,7 @@ ehci_submit_async(struct usb_device *dev,
>>>>> unsigned long pipe, void *buffer, int timeout;
>>>>>  	int ret = 0;
>>>>>  	struct ehci_ctrl *ctrl = ehci_get_ctrl(dev);
>>>>> +	int trynum;
>>>>>  
>>>>>  	debug("dev=%p, pipe=%lx, buffer=%p, length=%d, req=%p\n",
>>>>> dev, pipe, buffer, length, req);
>>>>> @@ -560,6 +561,10 @@ ehci_submit_async(struct usb_device *dev,
>>>>> unsigned long pipe, void *buffer,
>>>>> ehci_writel(&ctrl->hcor->or_usbsts, (usbsts & 0x3f)); 
>>>>>  	/* Enable async. schedule. */
>>>>> +	trynum = 1;	/* No more than 2 tries, in case of
>>>>> XACTERR. */ +retry_xacterr:;
>>>>> +	vtd = &qtd[qtd_counter - 1];
>>>>> +
>>>>>  	cmd = ehci_readl(&ctrl->hcor->or_usbcmd);
>>>>>  	cmd |= CMD_ASE;
>>>>>  	ehci_writel(&ctrl->hcor->or_usbcmd, cmd);    
>>>>
>>>> Are you sure always retrying the transfer if you get XACTERR is the
>>>> right thing to do, even if you get that e.g. on filesystem writes
>>>> ?  
>>>
>>> Please correct me if I'm wrong, but this function -
>>> ehci_submit_async
>>> - doesn't work with filesystem. It just setups proper EHCI
>>> descriptor and checks if it was sent with or without errors.  
>>
>> If you're writing into a block device, this function is used. If you
>> get XACTERR during the transfer and you retry, is there a chance the
>> data get actually written to the device ?
> 
> As fair as I can tell this code snippet doesn't touch file systems.

It could be any sort of write into block storage, it does not have to be
filesystem. The filesystem was just an example of write type we trigger
more often.

>>> When the XACTERR happens, proper flag is cleared and the
>>> transmission is retried.  
>>
>> What happens to the payload if it's a host-to-device transfer, do the
>> data get discarded or do they get written to the storage ?
>>
> 
> As state above - the resent shall only result is data being to some
> in-memory buffer. 

If you trigger a write from device memory to remote storage, then it is
the other way around.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-24  0:58           ` Marek Vasut
@ 2020-03-24  7:06             ` Lukasz Majewski
  2020-03-24 15:07               ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-24  7:06 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/23/20 1:41 PM, Lukasz Majewski wrote:
> > Hi Marek,  
> 
> Hi,
> 
> >> On 3/23/20 7:57 AM, Lukasz Majewski wrote:  
> >>> Hi Marek,    
> >>
> >> Hi,
> >>  
> >>>> On 3/22/20 2:00 PM, Lukasz Majewski wrote:    
> >>>>> This reverts commit 02b0e1a36c5bc20174299312556ec4e266872bd6.
> >>>>>
> >>>>> Signed-off-by: Lukasz Majewski <lukma@denx.de>      
> >>>>
> >>>> This patch lacks any and all explanation why this is being
> >>>> reverted. The patch you are reverting here explains why it was
> >>>> added and what real issues it was fixing, so instead of reverting
> >>>> it, if there is an issue with that patch, you should identify the
> >>>> issue and fix it.    
> >>>
> >>> Marek, have you received the cover letter for this patch series?
> >>>
> >>> In the cover letter I've written the rationale for reverting this
> >>> patch.     
> >>
> >> That should have been explained in this patch description.
> >>  
> >>> In short - qhtoken has value of 0x0, when the token variable shows
> >>> errors. As a result the error handling is broken.
> >>> Could you comment on those arguments?    
> >>
> >> Maybe you are referencing/reading the wrong token ?  
> > 
> > I'm printing the token which is used afterwards for reacting on
> > possible errors.
> >   
> >>
> >> You should probably figure out why this doesn't work first and then
> >> add fixes on top.  
> > 
> > Haven't you seen such problem during code development on your setup
> > when developing this patch?   
> 
> During the development of the patch, I don't remember, sorry. I most
> certainly saw various failure modes, however those should not be
> present mainline.

The issue is that the qhtoken is not updated at all. 

Maybe you remember - is Linux using async setup by default (as
introduced in SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6) ?

> 
> I tested this patch with the problematic USB sticks on R-Car Gen3 and
> with SMSC95xx USB ethernet adapter last weekend and I didn't see a
> problem.

Ok.

For i.MX6Q:
The SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 patch causes the
iMX6Q to fail after a few minutes of testing. General in i.MX6Q the
usb is NOT robust at all.



For i.MX53:
With patch 02b0e1a36c5bc20174299312556ec4e266872bd6 applied it also
breaks after a few minutes.

With this patch series applied it works for 2 days now without any
issue.


Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200324/717bba31/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-24  7:06             ` Lukasz Majewski
@ 2020-03-24 15:07               ` Marek Vasut
  2020-03-24 18:11                 ` Lukasz Majewski
  0 siblings, 1 reply; 35+ messages in thread
From: Marek Vasut @ 2020-03-24 15:07 UTC (permalink / raw)
  To: u-boot

On 3/24/20 8:06 AM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

[...]

>>>> You should probably figure out why this doesn't work first and then
>>>> add fixes on top.  
>>>
>>> Haven't you seen such problem during code development on your setup
>>> when developing this patch?   
>>
>> During the development of the patch, I don't remember, sorry. I most
>> certainly saw various failure modes, however those should not be
>> present mainline.
> 
> The issue is that the qhtoken is not updated at all. 
> 
> Maybe you remember - is Linux using async setup by default (as
> introduced in SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6) ?

If I recall correctly, it is using async schedule for bulk transfers.
But the code is available, so you can double-check that.

>> I tested this patch with the problematic USB sticks on R-Car Gen3 and
>> with SMSC95xx USB ethernet adapter last weekend and I didn't see a
>> problem.
> 
> Ok.
> 
> For i.MX6Q:
> The SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 patch causes the
> iMX6Q to fail after a few minutes of testing. General in i.MX6Q the
> usb is NOT robust at all.
> 
> For i.MX53:
> With patch 02b0e1a36c5bc20174299312556ec4e266872bd6 applied it also
> breaks after a few minutes.

So on CI HDRC , there is some difference in behavior. That is what you
need to find and fix then.

> With this patch series applied it works for 2 days now without any
> issue.

Except performance is totally degraded and there is still no clear
explanation _why_ any of these patches are needed and/or whether doing
write to a block device with these patches may cause data corruption.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-24 15:07               ` Marek Vasut
@ 2020-03-24 18:11                 ` Lukasz Majewski
  2020-03-24 18:33                   ` Marek Vasut
  0 siblings, 1 reply; 35+ messages in thread
From: Lukasz Majewski @ 2020-03-24 18:11 UTC (permalink / raw)
  To: u-boot

Hi Marek,

> On 3/24/20 8:06 AM, Lukasz Majewski wrote:
> > Hi Marek,  
> 
> Hi,
> 
> [...]
> 
> >>>> You should probably figure out why this doesn't work first and
> >>>> then add fixes on top.    
> >>>
> >>> Haven't you seen such problem during code development on your
> >>> setup when developing this patch?     
> >>
> >> During the development of the patch, I don't remember, sorry. I
> >> most certainly saw various failure modes, however those should not
> >> be present mainline.  
> > 
> > The issue is that the qhtoken is not updated at all. 
> > 
> > Maybe you remember - is Linux using async setup by default (as
> > introduced in SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6) ?  
> 
> If I recall correctly, it is using async schedule for bulk transfers.
> But the code is available, so you can double-check that.
> 
> >> I tested this patch with the problematic USB sticks on R-Car Gen3
> >> and with SMSC95xx USB ethernet adapter last weekend and I didn't
> >> see a problem.  
> > 
> > Ok.
> > 
> > For i.MX6Q:
> > The SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 patch causes the
> > iMX6Q to fail after a few minutes of testing. General in i.MX6Q the
> > usb is NOT robust at all.
> > 
> > For i.MX53:
> > With patch 02b0e1a36c5bc20174299312556ec4e266872bd6 applied it also
> > breaks after a few minutes.  
> 
> So on CI HDRC , there is some difference in behavior. That is what you
> need to find and fix then.

The conclusion is that some boards/implementations are broken.

> 
> > With this patch series applied it works for 2 days now without any
> > issue.  
> 
> Except performance is totally degraded

So we do have _very_ fast USB which breaks after a few minutes of
constant testing (with procedure stated earlier).

> and there is still no clear
> explanation _why_ any of these patches are needed

Haven't I explicitly explained in previous mails why XACTARR error shall
be handled? Nor the original thread did it? Wasn't the cover-letter
verbose enough?

> and/or whether doing
> write to a block device with these patches may cause data corruption.

So I will ask differently - what _may_ happen when the "TD -
token=XXXX" error shows up and the board hangs? Wouldn't we risk some
unwanted storage corruption?



Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma at denx.de
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <https://lists.denx.de/pipermail/u-boot/attachments/20200324/a0b2a941/attachment.sig>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running"
  2020-03-24 18:11                 ` Lukasz Majewski
@ 2020-03-24 18:33                   ` Marek Vasut
  0 siblings, 0 replies; 35+ messages in thread
From: Marek Vasut @ 2020-03-24 18:33 UTC (permalink / raw)
  To: u-boot

On 3/24/20 7:11 PM, Lukasz Majewski wrote:
> Hi Marek,

Hi,

>> On 3/24/20 8:06 AM, Lukasz Majewski wrote:
>>> Hi Marek,  
>>
>> Hi,
>>
>> [...]
>>
>>>>>> You should probably figure out why this doesn't work first and
>>>>>> then add fixes on top.    
>>>>>
>>>>> Haven't you seen such problem during code development on your
>>>>> setup when developing this patch?     
>>>>
>>>> During the development of the patch, I don't remember, sorry. I
>>>> most certainly saw various failure modes, however those should not
>>>> be present mainline.  
>>>
>>> The issue is that the qhtoken is not updated at all. 
>>>
>>> Maybe you remember - is Linux using async setup by default (as
>>> introduced in SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6) ?  
>>
>> If I recall correctly, it is using async schedule for bulk transfers.
>> But the code is available, so you can double-check that.
>>
>>>> I tested this patch with the problematic USB sticks on R-Car Gen3
>>>> and with SMSC95xx USB ethernet adapter last weekend and I didn't
>>>> see a problem.  
>>>
>>> Ok.
>>>
>>> For i.MX6Q:
>>> The SHA1: 02b0e1a36c5bc20174299312556ec4e266872bd6 patch causes the
>>> iMX6Q to fail after a few minutes of testing. General in i.MX6Q the
>>> usb is NOT robust at all.
>>>
>>> For i.MX53:
>>> With patch 02b0e1a36c5bc20174299312556ec4e266872bd6 applied it also
>>> breaks after a few minutes.  
>>
>> So on CI HDRC , there is some difference in behavior. That is what you
>> need to find and fix then.
> 
> The conclusion is that some boards/implementations are broken.

At least systems with CI HDRC.

>>> With this patch series applied it works for 2 days now without any
>>> issue.  
>>
>> Except performance is totally degraded
> 
> So we do have _very_ fast USB which breaks after a few minutes of
> constant testing (with procedure stated earlier).

No, we have a controller which manifests some problem and that problem
needs to be identified and fixed. Whether it's in the stack or in the
controller driver is to be seen.

>> and there is still no clear
>> explanation _why_ any of these patches are needed
> 
> Haven't I explicitly explained in previous mails why XACTARR error shall
> be handled? Nor the original thread did it? Wasn't the cover-letter
> verbose enough?

Regarding XACTERR, I agree it should be handled somehow.

However, I don't think handling XACTERR is what fixes your problems with
the USB sticks, is it ?

Also, it is still unclear whether handling XACTERR exactly the same as
STALL is the right thing to do. Is it ? I think it's not.

>> and/or whether doing
>> write to a block device with these patches may cause data corruption.
> 
> So I will ask differently - what _may_ happen when the "TD -
> token=XXXX" error shows up and the board hangs? Wouldn't we risk some
> unwanted storage corruption?

The behavior is undefined.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2020-03-24 18:33 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-22 13:00 [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Lukasz Majewski
2020-03-22 13:00 ` [RFT PATCH v1 1/5] Revert "usb: ehci-hcd: Keep async schedule running" Lukasz Majewski
2020-03-22 13:18   ` Marek Vasut
2020-03-23  6:57     ` Lukasz Majewski
2020-03-23 11:46       ` Marek Vasut
2020-03-23 12:41         ` Lukasz Majewski
2020-03-24  0:58           ` Marek Vasut
2020-03-24  7:06             ` Lukasz Majewski
2020-03-24 15:07               ` Marek Vasut
2020-03-24 18:11                 ` Lukasz Majewski
2020-03-24 18:33                   ` Marek Vasut
2020-03-22 13:00 ` [RFT PATCH v1 2/5] usb: Handle XACTERR error in DATA phase of USB storage Lukasz Majewski
2020-03-22 13:23   ` Marek Vasut
2020-03-23  7:00     ` Lukasz Majewski
2020-03-23 11:50       ` Marek Vasut
2020-03-23 13:03         ` Lukasz Majewski
2020-03-24  1:01           ` Marek Vasut
2020-03-22 13:00 ` [RFT PATCH v1 3/5] usb: Add some delay to wait for slow USB devices to be operational Lukasz Majewski
2020-03-22 13:29   ` Marek Vasut
2020-03-23  7:04     ` Lukasz Majewski
2020-03-23 11:52       ` Marek Vasut
2020-03-22 13:00 ` [RFT PATCH v1 4/5] usb: Provide code to handle spinup of USB usb devices (mostly HDDs) Lukasz Majewski
2020-03-22 13:32   ` Marek Vasut
2020-03-23  7:53     ` Lukasz Majewski
2020-03-23 11:57       ` Marek Vasut
2020-03-23 12:54         ` Lukasz Majewski
2020-03-24  1:04           ` Marek Vasut
2020-03-22 13:00 ` [RFT PATCH v1 5/5] usb: Handle QT_TOKEN_STATUS_XACTERR error when sending data Lukasz Majewski
2020-03-22 13:45   ` Marek Vasut
2020-03-23  7:18     ` Lukasz Majewski
2020-03-23 11:59       ` Marek Vasut
2020-03-23 12:58         ` Lukasz Majewski
2020-03-24  1:06           ` Marek Vasut
2020-03-23 20:58 ` [RFT PATCH v1 0/5] usb: Improve robustness of ehci-hcd controller operation Tom Rini
2020-03-23 22:11   ` Lukasz Majewski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.