linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] refactor command timeout handling
@ 2016-12-23  6:52 Lu Baolu
  2016-12-23  6:52 ` [PATCH 1/4] usb: xhci: remove unnecessary second abort try Lu Baolu
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Lu Baolu @ 2016-12-23  6:52 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, linux-kernel, Lu Baolu

Hi Mathias,

This patch series is for command timeout refactoring.

Patches
  usb: xhci: remove unnecessary second abort try
  usb: xhci: remove CRR polling in xhci_abort_cmd_ring()
are follow-ups of your comments of
  "remove unnecessary second abort try as a separate patch, send to usb-next"
and
   "remove polling for the Command ring running (CRR), waiting for completion
   is enough, if completion times out then we can check CRR. for usb-next" 
in below discussion thread.
   https://lkml.org/lkml/2016/12/21/186

Patches
  usb: xhci: add XHCI_MISS_CA_EVENT quirk bit
  usb: xhci: warn on command timeout in stopped command ring
are my proposals.

They base on the top of your timeout_race_fixes branch.

(git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes)

Lu Baolu (4):
  usb: xhci: remove unnecessary second abort try
  usb: xhci: remove CRR polling in xhci_abort_cmd_ring()
  usb: xhci: add XHCI_MISS_CA_EVENT quirk bit
  usb: xhci: warn on command timeout in stopped command ring

 drivers/usb/host/xhci-ring.c | 64 ++++++++++++++++++--------------------------
 drivers/usb/host/xhci.h      |  6 +++++
 2 files changed, 32 insertions(+), 38 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] usb: xhci: remove unnecessary second abort try
  2016-12-23  6:52 [PATCH 0/4] refactor command timeout handling Lu Baolu
@ 2016-12-23  6:52 ` Lu Baolu
  2016-12-23  6:52 ` [PATCH 2/4] usb: xhci: remove CRR polling in xhci_abort_cmd_ring() Lu Baolu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Lu Baolu @ 2016-12-23  6:52 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, linux-kernel, Lu Baolu

The second try was a workaround for (what we thought was) command
ring failing to stop in the first place. But this turns out to be
due to the race that we have fixed(see "xhci: Fix race related to
abort operation"). With that fix, it is time to remove the second
try.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 2f4ea21..e3bcf6d 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -343,19 +343,11 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags)
 	ret = xhci_handshake(&xhci->op_regs->cmd_ring,
 			CMD_RING_RUNNING, 0, 5 * 1000 * 1000);
 	if (ret < 0) {
-		/* we are about to kill xhci, give it one more chance */
-		xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
-			      &xhci->op_regs->cmd_ring);
-		udelay(1000);
-		ret = xhci_handshake(&xhci->op_regs->cmd_ring,
-				     CMD_RING_RUNNING, 0, 3 * 1000 * 1000);
-		if (ret < 0) {
-			xhci_err(xhci, "Stopped the command ring failed, "
-				 "maybe the host is dead\n");
-			xhci->xhc_state |= XHCI_STATE_DYING;
-			xhci_halt(xhci);
-			return -ESHUTDOWN;
-		}
+		xhci_err(xhci,
+			 "Stop command ring failed, maybe the host is dead\n");
+		xhci->xhc_state |= XHCI_STATE_DYING;
+		xhci_halt(xhci);
+		return -ESHUTDOWN;
 	}
 	/*
 	 * Writing the CMD_RING_ABORT bit should cause a cmd completion event,
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] usb: xhci: remove CRR polling in xhci_abort_cmd_ring()
  2016-12-23  6:52 [PATCH 0/4] refactor command timeout handling Lu Baolu
  2016-12-23  6:52 ` [PATCH 1/4] usb: xhci: remove unnecessary second abort try Lu Baolu
@ 2016-12-23  6:52 ` Lu Baolu
  2016-12-23  6:52 ` [PATCH 3/4] usb: xhci: add XHCI_MISS_CA_EVENT quirk bit Lu Baolu
  2016-12-23  6:52 ` [PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring Lu Baolu
  3 siblings, 0 replies; 6+ messages in thread
From: Lu Baolu @ 2016-12-23  6:52 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, linux-kernel, Lu Baolu

xHCI driver aborts the command ring by setting CA (Command
Abort) bit in the command register. With this setting, host
should abort all command executions and generate a command
completion event with the completion code set to command
ring stopped. Software should time the completion of command
abort by checking the CRR (Command Ring Running) and waiting
for the command ring stopped event.

Current xhci_abort_cmd_ring() does this in the following
way: setting CA bit; busy polling CRR bit until it negates;
sleep and wait for the command ring stopped event. This is
not an efficient way since cpu cycles are wasted on polling
registers. This patch removes polling for CRR (Command Ring
Running). Wait for completion, and check CRR if completion
times out is enough.

Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 49 ++++++++++++++++++++++----------------------
 1 file changed, 24 insertions(+), 25 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index e3bcf6d..5935dce 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -323,7 +323,6 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags)
 {
 	unsigned long timeleft;
 	u64 temp_64;
-	int ret;
 
 	xhci_dbg(xhci, "Abort command ring\n");
 
@@ -333,34 +332,34 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags)
 	xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
 			&xhci->op_regs->cmd_ring);
 
-	/* Section 4.6.1.2 of xHCI 1.0 spec says software should
-	 * time the completion od all xHCI commands, including
-	 * the Command Abort operation. If software doesn't see
-	 * CRR negated in a timely manner (e.g. longer than 5
-	 * seconds), then it should assume that the there are
-	 * larger problems with the xHC and assert HCRST.
-	 */
-	ret = xhci_handshake(&xhci->op_regs->cmd_ring,
-			CMD_RING_RUNNING, 0, 5 * 1000 * 1000);
-	if (ret < 0) {
-		xhci_err(xhci,
-			 "Stop command ring failed, maybe the host is dead\n");
-		xhci->xhc_state |= XHCI_STATE_DYING;
-		xhci_halt(xhci);
-		return -ESHUTDOWN;
-	}
-	/*
-	 * Writing the CMD_RING_ABORT bit should cause a cmd completion event,
-	 * however on some host hw the CMD_RING_RUNNING bit is correctly cleared
-	 * but the completion event in never sent. Wait 2 secs (arbitrary
-	 * number) to handle those cases after negation of CMD_RING_RUNNING.
-	 */
 	spin_unlock_irqrestore(&xhci->lock, *flags);
 	timeleft = wait_for_completion_timeout(&xhci->cmd_ring_stop_completion,
-					       2 * HZ);
+					       XHCI_CMD_DEFAULT_TIMEOUT);
 	spin_lock_irqsave(&xhci->lock, *flags);
 	if (!timeleft) {
-		xhci_dbg(xhci, "No stop event for abort, ring start fail?\n");
+		/* Section 4.6.1.2 of xHCI 1.0 spec says software should
+		 * time the completion of all xHCI commands, including
+		 * the Command Abort operation. If software doesn't see
+		 * CRR negated in a timely manner (e.g. longer than 5
+		 * seconds), then it should assume that the there are
+		 * larger problems with the xHC and assert HCRST.
+		 */
+		temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
+		if (temp_64 & CMD_RING_RUNNING) {
+			xhci_err(xhci,
+				 "Stop command ring failed, maybe the host is dead\n");
+			xhci->xhc_state |= XHCI_STATE_DYING;
+			xhci_halt(xhci);
+			return -ESHUTDOWN;
+		}
+
+		/*
+		 * Writing the CMD_RING_ABORT bit should cause a cmd
+		 * completion event, however on some hosts the bit is
+		 * correctly cleared but the completion event is never
+		 * sent.
+		 */
+		xhci_warn(xhci, "No stop event for abort, ring start fail?\n");
 		xhci_cleanup_command_queue(xhci);
 	} else {
 		xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci));
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] usb: xhci: add XHCI_MISS_CA_EVENT quirk bit
  2016-12-23  6:52 [PATCH 0/4] refactor command timeout handling Lu Baolu
  2016-12-23  6:52 ` [PATCH 1/4] usb: xhci: remove unnecessary second abort try Lu Baolu
  2016-12-23  6:52 ` [PATCH 2/4] usb: xhci: remove CRR polling in xhci_abort_cmd_ring() Lu Baolu
@ 2016-12-23  6:52 ` Lu Baolu
  2016-12-23  6:52 ` [PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring Lu Baolu
  3 siblings, 0 replies; 6+ messages in thread
From: Lu Baolu @ 2016-12-23  6:52 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, linux-kernel, Lu Baolu

Writing the CMD_RING_ABORT bit in xhci command register should
cause a command completion event with the command completion
code set to command ring stopped. However on some hosts, the
CMD_RING_RUNNING bit is correctly cleared but the completion
event is never sent.

Current xhci driver treats the behavior of "CMD_RING_RUNNING
bit is correctly cleared but the completion event is missed"
as a correct behavior for all hosts. This is different from
that defined in xhci spec. Refer to 4.6.1.2 in xhci spec.

This patch introduces a quirk bit for those quirky hardwares.
For normal hosts, if the command completion for abort command
ring misses, we shall assume that there are larger problems
with the host and assert HCRST.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 17 +++++------------
 drivers/usb/host/xhci.h      |  6 ++++++
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 5935dce..6a23c37 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -345,25 +345,18 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci, unsigned long *flags)
 		 * larger problems with the xHC and assert HCRST.
 		 */
 		temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
-		if (temp_64 & CMD_RING_RUNNING) {
+		if ((temp_64 & CMD_RING_RUNNING) ||
+		    !(xhci->quirks & XHCI_MISS_CA_EVENT)) {
 			xhci_err(xhci,
 				 "Stop command ring failed, maybe the host is dead\n");
 			xhci->xhc_state |= XHCI_STATE_DYING;
 			xhci_halt(xhci);
 			return -ESHUTDOWN;
 		}
-
-		/*
-		 * Writing the CMD_RING_ABORT bit should cause a cmd
-		 * completion event, however on some hosts the bit is
-		 * correctly cleared but the completion event is never
-		 * sent.
-		 */
-		xhci_warn(xhci, "No stop event for abort, ring start fail?\n");
-		xhci_cleanup_command_queue(xhci);
-	} else {
-		xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci));
 	}
+
+	xhci_handle_stopped_cmd_ring(xhci, xhci_next_queued_cmd(xhci));
+
 	return 0;
 }
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 09fe63f..c550bf0 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1656,6 +1656,12 @@ struct xhci_hcd {
 #define XHCI_SSIC_PORT_UNUSED	(1 << 22)
 #define XHCI_NO_64BIT_SUPPORT	(1 << 23)
 #define XHCI_MISSING_CAS	(1 << 24)
+/*
+ * Writing the CMD_RING_ABORT bit should cause a cmd completion event,
+ * however on some host hw the CMD_RING_RUNNING bit is correctly cleared
+ * but the completion event is never sent.
+ */
+#define XHCI_MISS_CA_EVENT	(1 << 25)
 	unsigned int		num_active_eps;
 	unsigned int		limit_active_eps;
 	/* There are two roothubs to keep track of bus suspend info for */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring
  2016-12-23  6:52 [PATCH 0/4] refactor command timeout handling Lu Baolu
                   ` (2 preceding siblings ...)
  2016-12-23  6:52 ` [PATCH 3/4] usb: xhci: add XHCI_MISS_CA_EVENT quirk bit Lu Baolu
@ 2016-12-23  6:52 ` Lu Baolu
  2016-12-23  9:49   ` Sergei Shtylyov
  3 siblings, 1 reply; 6+ messages in thread
From: Lu Baolu @ 2016-12-23  6:52 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb, linux-kernel, Lu Baolu

If xhci host fails to response to a command, the command
watchdog timer will be fired. The callback function will
abort and clear current command and restart the command
execution. If driver fails to restart command execution,
it will assume there is a larger problem in host and
report the situation to the upper layer. In rare cases,
if the driver sees a command timeout in a stopped command
ring, driver should let the user know it.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 6a23c37..16baefc 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1308,8 +1308,12 @@ void xhci_handle_command_timeout(struct work_struct *work)
 		goto time_out_completed;
 	}
 
-	/* command timeout on stopped ring, ring can't be aborted */
-	xhci_dbg(xhci, "Command timeout on stopped ring\n");
+	/*
+	 * We should never reach here until a command times out on a stopped
+	 * ring and xhci driver has no idea about the reason why the command
+	 * ring was stopped.
+	 */
+	xhci_warn(xhci, "WARN command timeout on stopped ring\n");
 	xhci_handle_stopped_cmd_ring(xhci, xhci->current_cmd);
 
 time_out_completed:
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring
  2016-12-23  6:52 ` [PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring Lu Baolu
@ 2016-12-23  9:49   ` Sergei Shtylyov
  0 siblings, 0 replies; 6+ messages in thread
From: Sergei Shtylyov @ 2016-12-23  9:49 UTC (permalink / raw)
  To: Lu Baolu, Mathias Nyman; +Cc: linux-usb, linux-kernel

Hello!

On 12/23/2016 9:52 AM, Lu Baolu wrote:

> If xhci host fails to response to a command, the command

    s/response/respond/.

> watchdog timer will be fired. The callback function will
> abort and clear current command and restart the command
> execution. If driver fails to restart command execution,
> it will assume there is a larger problem in host and
> report the situation to the upper layer. In rare cases,
> if the driver sees a command timeout in a stopped command
> ring, driver should let the user know it.
>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
>  drivers/usb/host/xhci-ring.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 6a23c37..16baefc 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -1308,8 +1308,12 @@ void xhci_handle_command_timeout(struct work_struct *work)
>  		goto time_out_completed;
>  	}
>
> -	/* command timeout on stopped ring, ring can't be aborted */
> -	xhci_dbg(xhci, "Command timeout on stopped ring\n");
> +	/*
> +	 * We should never reach here until a command times out on a stopped
> +	 * ring and xhci driver has no idea about the reason why the command
> +	 * ring was stopped.
> +	 */
> +	xhci_warn(xhci, "WARN command timeout on stopped ring\n");

    WARNING, maybe?

[...]

MBR, Sergei

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-12-23  9:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-23  6:52 [PATCH 0/4] refactor command timeout handling Lu Baolu
2016-12-23  6:52 ` [PATCH 1/4] usb: xhci: remove unnecessary second abort try Lu Baolu
2016-12-23  6:52 ` [PATCH 2/4] usb: xhci: remove CRR polling in xhci_abort_cmd_ring() Lu Baolu
2016-12-23  6:52 ` [PATCH 3/4] usb: xhci: add XHCI_MISS_CA_EVENT quirk bit Lu Baolu
2016-12-23  6:52 ` [PATCH 4/4] usb: xhci: warn on command timeout in stopped command ring Lu Baolu
2016-12-23  9:49   ` Sergei Shtylyov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).