linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
@ 2022-02-17 13:56 Mathias Nyman
  2022-02-18  9:41 ` Pavan Kondeti
  2022-03-01  1:49 ` Jack Pham
  0 siblings, 2 replies; 11+ messages in thread
From: Mathias Nyman @ 2022-02-17 13:56 UTC (permalink / raw)
  To: quic_pkondeti; +Cc: s.shtylyov, linux-usb, Mathias Nyman

xhci_reset() timeout was increased from 250ms to 10 seconds in order to
give Renesas 720201 xHC enough time to get ready in probe.

xhci_reset() is called with interrupts disabled in other places, and
waiting for 10 seconds there is not acceptable.

Add a timeout parameter to xhci_reset(), and adjust it back to 250ms
when called from xhci_stop() or xhci_shutdown() where interrupts are
disabled, and successful reset isn't that critical.

Additionally change the signed integer timeout parameter in
xhci_handshake() to a u64 to match the timeout value we pass to
readl_poll_timeout_atomic()

Reported-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
Fixes: 22ceac191211 ("xhci: Increase reset timeout for Renesas 720201 host.")
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-hub.c |  2 +-
 drivers/usb/host/xhci-mem.c |  2 +-
 drivers/usb/host/xhci.c     | 20 +++++++++-----------
 drivers/usb/host/xhci.h     |  7 +++++--
 4 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index df3522dab31b..bb4f01ce90e3 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -762,7 +762,7 @@ static int xhci_exit_test_mode(struct xhci_hcd *xhci)
 	}
 	pm_runtime_allow(xhci_to_hcd(xhci)->self.controller);
 	xhci->test_mode = 0;
-	return xhci_reset(xhci);
+	return xhci_reset(xhci, XHCI_RESET_SHORT_USEC);
 }
 
 void xhci_set_link_state(struct xhci_hcd *xhci, struct xhci_port *port,
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index f8c2b6c79543..803c89700ed3 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -2575,7 +2575,7 @@ int xhci_mem_init(struct xhci_hcd *xhci, gfp_t flags)
 
 fail:
 	xhci_halt(xhci);
-	xhci_reset(xhci);
+	xhci_reset(xhci, XHCI_RESET_SHORT_USEC);
 	xhci_mem_cleanup(xhci);
 	return -ENOMEM;
 }
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 33bae434aa94..2567176b9bdb 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -65,7 +65,7 @@ static bool td_on_ring(struct xhci_td *td, struct xhci_ring *ring)
  * handshake done).  There are two failure modes:  "usec" have passed (major
  * hardware flakeout), or the register reads as all-ones (hardware removed).
  */
-int xhci_handshake(void __iomem *ptr, u32 mask, u32 done, int usec)
+int xhci_handshake(void __iomem *ptr, u32 mask, u32 done, u64 timeout_us)
 {
 	u32	result;
 	int	ret;
@@ -73,7 +73,7 @@ int xhci_handshake(void __iomem *ptr, u32 mask, u32 done, int usec)
 	ret = readl_poll_timeout_atomic(ptr, result,
 					(result & mask) == done ||
 					result == U32_MAX,
-					1, usec);
+					1, timeout_us);
 	if (result == U32_MAX)		/* card removed */
 		return -ENODEV;
 
@@ -162,7 +162,7 @@ int xhci_start(struct xhci_hcd *xhci)
  * Transactions will be terminated immediately, and operational registers
  * will be set to their defaults.
  */
-int xhci_reset(struct xhci_hcd *xhci)
+int xhci_reset(struct xhci_hcd *xhci, u64 timeout_us)
 {
 	u32 command;
 	u32 state;
@@ -195,8 +195,7 @@ int xhci_reset(struct xhci_hcd *xhci)
 	if (xhci->quirks & XHCI_INTEL_HOST)
 		udelay(1000);
 
-	ret = xhci_handshake(&xhci->op_regs->command,
-			CMD_RESET, 0, 10 * 1000 * 1000);
+	ret = xhci_handshake(&xhci->op_regs->command, CMD_RESET, 0, timeout_us);
 	if (ret)
 		return ret;
 
@@ -209,8 +208,7 @@ int xhci_reset(struct xhci_hcd *xhci)
 	 * xHCI cannot write to any doorbells or operational registers other
 	 * than status until the "Controller Not Ready" flag is cleared.
 	 */
-	ret = xhci_handshake(&xhci->op_regs->status,
-			STS_CNR, 0, 10 * 1000 * 1000);
+	ret = xhci_handshake(&xhci->op_regs->status, STS_CNR, 0, timeout_us);
 
 	xhci->usb2_rhub.bus_state.port_c_suspend = 0;
 	xhci->usb2_rhub.bus_state.suspended_ports = 0;
@@ -731,7 +729,7 @@ static void xhci_stop(struct usb_hcd *hcd)
 	xhci->xhc_state |= XHCI_STATE_HALTED;
 	xhci->cmd_ring_state = CMD_RING_STATE_STOPPED;
 	xhci_halt(xhci);
-	xhci_reset(xhci);
+	xhci_reset(xhci, XHCI_RESET_SHORT_USEC);
 	spin_unlock_irq(&xhci->lock);
 
 	xhci_cleanup_msix(xhci);
@@ -784,7 +782,7 @@ void xhci_shutdown(struct usb_hcd *hcd)
 	xhci_halt(xhci);
 	/* Workaround for spurious wakeups at shutdown with HSW */
 	if (xhci->quirks & XHCI_SPURIOUS_WAKEUP)
-		xhci_reset(xhci);
+		xhci_reset(xhci, XHCI_RESET_SHORT_USEC);
 	spin_unlock_irq(&xhci->lock);
 
 	xhci_cleanup_msix(xhci);
@@ -1163,7 +1161,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
 		xhci_dbg(xhci, "Stop HCD\n");
 		xhci_halt(xhci);
 		xhci_zero_64b_regs(xhci);
-		retval = xhci_reset(xhci);
+		retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
 		spin_unlock_irq(&xhci->lock);
 		if (retval)
 			return retval;
@@ -5297,7 +5295,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
 
 	xhci_dbg(xhci, "Resetting HCD\n");
 	/* Reset the internal HC memory state and registers. */
-	retval = xhci_reset(xhci);
+	retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
 	if (retval)
 		return retval;
 	xhci_dbg(xhci, "Reset complete\n");
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 8a0026ee9524..fce32f8ea9d0 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -229,6 +229,9 @@ struct xhci_op_regs {
 #define CMD_ETE		(1 << 14)
 /* bits 15:31 are reserved (and should be preserved on writes). */
 
+#define XHCI_RESET_LONG_USEC		(10 * 1000 * 1000)
+#define XHCI_RESET_SHORT_USEC		(250 * 1000)
+
 /* IMAN - Interrupt Management Register */
 #define IMAN_IE		(1 << 1)
 #define IMAN_IP		(1 << 0)
@@ -2081,11 +2084,11 @@ void xhci_free_container_ctx(struct xhci_hcd *xhci,
 
 /* xHCI host controller glue */
 typedef void (*xhci_get_quirks_t)(struct device *, struct xhci_hcd *);
-int xhci_handshake(void __iomem *ptr, u32 mask, u32 done, int usec);
+int xhci_handshake(void __iomem *ptr, u32 mask, u32 done, u64 timeout_us);
 void xhci_quiesce(struct xhci_hcd *xhci);
 int xhci_halt(struct xhci_hcd *xhci);
 int xhci_start(struct xhci_hcd *xhci);
-int xhci_reset(struct xhci_hcd *xhci);
+int xhci_reset(struct xhci_hcd *xhci, u64 timeout_us);
 int xhci_run(struct usb_hcd *hcd);
 int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks);
 void xhci_shutdown(struct usb_hcd *hcd);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-02-17 13:56 [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable Mathias Nyman
@ 2022-02-18  9:41 ` Pavan Kondeti
  2022-02-24  8:44   ` Udipto Goswami
  2022-03-01  1:49 ` Jack Pham
  1 sibling, 1 reply; 11+ messages in thread
From: Pavan Kondeti @ 2022-02-18  9:41 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: quic_pkondeti, s.shtylyov, linux-usb

On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
> give Renesas 720201 xHC enough time to get ready in probe.
> 
> xhci_reset() is called with interrupts disabled in other places, and
> waiting for 10 seconds there is not acceptable.
> 
> Add a timeout parameter to xhci_reset(), and adjust it back to 250ms
> when called from xhci_stop() or xhci_shutdown() where interrupts are
> disabled, and successful reset isn't that critical.
> 
> Additionally change the signed integer timeout parameter in
> xhci_handshake() to a u64 to match the timeout value we pass to
> readl_poll_timeout_atomic()
> 
> Reported-by: Sergey Shtylyov <s.shtylyov@omp.ru>
> Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
> Fixes: 22ceac191211 ("xhci: Increase reset timeout for Renesas 720201 host.")
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>

We have tested the patch and found no issues. Since the issue happens
very rarely, we have included in our builds for the wider testing.

Thanks,
Pavan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-02-18  9:41 ` Pavan Kondeti
@ 2022-02-24  8:44   ` Udipto Goswami
  2022-02-28 11:39     ` Mathias Nyman
  0 siblings, 1 reply; 11+ messages in thread
From: Udipto Goswami @ 2022-02-24  8:44 UTC (permalink / raw)
  To: Pavan Kondeti, Mathias Nyman; +Cc: s.shtylyov, linux-usb

Hi Pavan, Mathias,

we have tested the patch in the testing environment where initially we 
were hitting the issue. We don't see any issue after including this.

On 18-02-2022 03:11 pm, Pavan Kondeti wrote:
> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
>> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
>> give Renesas 720201 xHC enough time to get ready in probe.
>>
>> xhci_reset() is called with interrupts disabled in other places, and
>> waiting for 10 seconds there is not acceptable.
>>
>> Add a timeout parameter to xhci_reset(), and adjust it back to 250ms
>> when called from xhci_stop() or xhci_shutdown() where interrupts are
>> disabled, and successful reset isn't that critical.
>>
>> Additionally change the signed integer timeout parameter in
>> xhci_handshake() to a u64 to match the timeout value we pass to
>> readl_poll_timeout_atomic()
>>
>> Reported-by: Sergey Shtylyov <s.shtylyov@omp.ru>
>> Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
>> Fixes: 22ceac191211 ("xhci: Increase reset timeout for Renesas 720201 host.")
>> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> We have tested the patch and found no issues. Since the issue happens
> very rarely, we have included in our builds for the wider testing.
>
> Thanks,
> Pavan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-02-24  8:44   ` Udipto Goswami
@ 2022-02-28 11:39     ` Mathias Nyman
  2022-02-28 12:10       ` Pavan Kondeti
  0 siblings, 1 reply; 11+ messages in thread
From: Mathias Nyman @ 2022-02-28 11:39 UTC (permalink / raw)
  To: Udipto Goswami, Pavan Kondeti; +Cc: s.shtylyov, linux-usb

On 24.2.2022 10.44, Udipto Goswami wrote:
> Hi Pavan, Mathias,
> 
> we have tested the patch in the testing environment where initially we were hitting the issue. We don't see any issue after including this.
> 
> On 18-02-2022 03:11 pm, Pavan Kondeti wrote:
>> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
>>> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
>>> give Renesas 720201 xHC enough time to get ready in probe.
>>>
>>> xhci_reset() is called with interrupts disabled in other places, and
>>> waiting for 10 seconds there is not acceptable.
>>>
>>> Add a timeout parameter to xhci_reset(), and adjust it back to 250ms
>>> when called from xhci_stop() or xhci_shutdown() where interrupts are
>>> disabled, and successful reset isn't that critical.
>>>
>>> Additionally change the signed integer timeout parameter in
>>> xhci_handshake() to a u64 to match the timeout value we pass to
>>> readl_poll_timeout_atomic()
>>>
>>> Reported-by: Sergey Shtylyov <s.shtylyov@omp.ru>
>>> Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
>>> Fixes: 22ceac191211 ("xhci: Increase reset timeout for Renesas 720201 host.")
>>> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
>> We have tested the patch and found no issues. Since the issue happens
>> very rarely, we have included in our builds for the wider testing.
>>
>> Thanks,
>> Pavan

Thanks 
Can I add a "Tested-by: Pavan Kondeti <quic_pkondeti@quicinc.com>" tag?

-Mathias

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-02-28 11:39     ` Mathias Nyman
@ 2022-02-28 12:10       ` Pavan Kondeti
  0 siblings, 0 replies; 11+ messages in thread
From: Pavan Kondeti @ 2022-02-28 12:10 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: Udipto Goswami, Pavan Kondeti, s.shtylyov, linux-usb

Hi Mathias,

On Mon, Feb 28, 2022 at 01:39:10PM +0200, Mathias Nyman wrote:
> On 24.2.2022 10.44, Udipto Goswami wrote:
> > Hi Pavan, Mathias,
> > 
> > we have tested the patch in the testing environment where initially we were hitting the issue. We don't see any issue after including this.
> > 
> > On 18-02-2022 03:11 pm, Pavan Kondeti wrote:
> >> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
> >>> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
> >>> give Renesas 720201 xHC enough time to get ready in probe.
> >>>
> >>> xhci_reset() is called with interrupts disabled in other places, and
> >>> waiting for 10 seconds there is not acceptable.
> >>>
> >>> Add a timeout parameter to xhci_reset(), and adjust it back to 250ms
> >>> when called from xhci_stop() or xhci_shutdown() where interrupts are
> >>> disabled, and successful reset isn't that critical.
> >>>
> >>> Additionally change the signed integer timeout parameter in
> >>> xhci_handshake() to a u64 to match the timeout value we pass to
> >>> readl_poll_timeout_atomic()
> >>>
> >>> Reported-by: Sergey Shtylyov <s.shtylyov@omp.ru>
> >>> Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
> >>> Fixes: 22ceac191211 ("xhci: Increase reset timeout for Renesas 720201 host.")
> >>> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> >> We have tested the patch and found no issues. Since the issue happens
> >> very rarely, we have included in our builds for the wider testing.
> >>
> >> Thanks,
> >> Pavan
> 
> Thanks 
> Can I add a "Tested-by: Pavan Kondeti <quic_pkondeti@quicinc.com>" tag?
> 
Yes, please. Thanks for your help.

Thanks,
Pavan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-02-17 13:56 [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable Mathias Nyman
  2022-02-18  9:41 ` Pavan Kondeti
@ 2022-03-01  1:49 ` Jack Pham
  2022-03-01  4:03   ` Pavan Kondeti
  1 sibling, 1 reply; 11+ messages in thread
From: Jack Pham @ 2022-03-01  1:49 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: quic_pkondeti, s.shtylyov, linux-usb

Hi Mathias,

On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
> give Renesas 720201 xHC enough time to get ready in probe.

This suggests that the only place we really need the long timeout is
in xhci_gen_setup().

> @@ -1163,7 +1161,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
>  		xhci_dbg(xhci, "Stop HCD\n");
>  		xhci_halt(xhci);
>  		xhci_zero_64b_regs(xhci);
> -		retval = xhci_reset(xhci);
> +		retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);

Since preemption is disabled here (spin_lock_irq() is called near the
start of this function), shouldn't we also limit this to 250ms here in
xhci_resume() as well?

>  		spin_unlock_irq(&xhci->lock);
>  		if (retval)
>  			return retval;
> @@ -5297,7 +5295,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
>  
>  	xhci_dbg(xhci, "Resetting HCD\n");
>  	/* Reset the internal HC memory state and registers. */
> -	retval = xhci_reset(xhci);
> +	retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
>  	if (retval)
>  		return retval;
>  	xhci_dbg(xhci, "Reset complete\n");

Thanks,
Jack

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-03-01  1:49 ` Jack Pham
@ 2022-03-01  4:03   ` Pavan Kondeti
  2022-03-01  8:47     ` Mathias Nyman
  0 siblings, 1 reply; 11+ messages in thread
From: Pavan Kondeti @ 2022-03-01  4:03 UTC (permalink / raw)
  To: Jack Pham; +Cc: Mathias Nyman, quic_pkondeti, s.shtylyov, linux-usb

On Mon, Feb 28, 2022 at 05:49:49PM -0800, Jack Pham wrote:
> Hi Mathias,
> 
> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
> > xhci_reset() timeout was increased from 250ms to 10 seconds in order to
> > give Renesas 720201 xHC enough time to get ready in probe.
> 
> This suggests that the only place we really need the long timeout is
> in xhci_gen_setup().
> 
> > @@ -1163,7 +1161,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
> >  		xhci_dbg(xhci, "Stop HCD\n");
> >  		xhci_halt(xhci);
> >  		xhci_zero_64b_regs(xhci);
> > -		retval = xhci_reset(xhci);
> > +		retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
> 
> Since preemption is disabled here (spin_lock_irq() is called near the
> start of this function), shouldn't we also limit this to 250ms here in
> xhci_resume() as well?
> 
The rationale of decreasing the timeout to 250 in certain places is based
on the criticality of the operation but not on the preemption/irq state.
Since xHC reset is critical in startup and resume, the 10 seconds timeout
is enforced so that we don't break Renesas 720201 xHC.

Since all of our internl test reports indicate that the timeout is happening
from stop hcd, this patch is helping.

Thanks,
Pavan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-03-01  4:03   ` Pavan Kondeti
@ 2022-03-01  8:47     ` Mathias Nyman
  2022-03-02  3:23       ` Pavan Kondeti
  0 siblings, 1 reply; 11+ messages in thread
From: Mathias Nyman @ 2022-03-01  8:47 UTC (permalink / raw)
  To: Pavan Kondeti, Jack Pham; +Cc: s.shtylyov, linux-usb

On 1.3.2022 6.03, Pavan Kondeti wrote:
> On Mon, Feb 28, 2022 at 05:49:49PM -0800, Jack Pham wrote:
>> Hi Mathias,
>>
>> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
>>> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
>>> give Renesas 720201 xHC enough time to get ready in probe.
>>
>> This suggests that the only place we really need the long timeout is
>> in xhci_gen_setup().
>>
>>> @@ -1163,7 +1161,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
>>>  		xhci_dbg(xhci, "Stop HCD\n");
>>>  		xhci_halt(xhci);
>>>  		xhci_zero_64b_regs(xhci);
>>> -		retval = xhci_reset(xhci);
>>> +		retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
>>
>> Since preemption is disabled here (spin_lock_irq() is called near the
>> start of this function), shouldn't we also limit this to 250ms here in
>> xhci_resume() as well?
>>> The rationale of decreasing the timeout to 250 in certain places is based
> on the criticality of the operation but not on the preemption/irq state.
> Since xHC reset is critical in startup and resume, the 10 seconds timeout
> is enforced so that we don't break Renesas 720201 xHC.
> 
> Since all of our internl test reports indicate that the timeout is happening
> from stop hcd, this patch is helping.
> 

This was pretty much my reasoning as well.
I could add a comment about this to the commit message

In addition we want a targeted fix for a real world issue that we can send to
stable without changing too much, risking regressions.

I also think we should focus more on fixing the locking (preemption/irq state)
around xhci_reset() in xhci_resume() than tuning the timeout, but this needs more
thought and should be a separate patch for later.

Additionally I guess xhci_reset() is more likely to fail in xhci_stop() and
xhci_shutdown() as power or clocks for xHC may be disabled, or entire xHC removed from the
bus by then.   

Thanks
-Mathias

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-03-01  8:47     ` Mathias Nyman
@ 2022-03-02  3:23       ` Pavan Kondeti
  2022-03-02  7:49         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 11+ messages in thread
From: Pavan Kondeti @ 2022-03-02  3:23 UTC (permalink / raw)
  To: Mathias Nyman, Greg Kroah-Hartman
  Cc: Pavan Kondeti, Jack Pham, s.shtylyov, linux-usb

On Tue, Mar 01, 2022 at 10:47:36AM +0200, Mathias Nyman wrote:
> On 1.3.2022 6.03, Pavan Kondeti wrote:
> > On Mon, Feb 28, 2022 at 05:49:49PM -0800, Jack Pham wrote:
> >> Hi Mathias,
> >>
> >> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
> >>> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
> >>> give Renesas 720201 xHC enough time to get ready in probe.
> >>
> >> This suggests that the only place we really need the long timeout is
> >> in xhci_gen_setup().
> >>
> >>> @@ -1163,7 +1161,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
> >>>  		xhci_dbg(xhci, "Stop HCD\n");
> >>>  		xhci_halt(xhci);
> >>>  		xhci_zero_64b_regs(xhci);
> >>> -		retval = xhci_reset(xhci);
> >>> +		retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
> >>
> >> Since preemption is disabled here (spin_lock_irq() is called near the
> >> start of this function), shouldn't we also limit this to 250ms here in
> >> xhci_resume() as well?
> >>> The rationale of decreasing the timeout to 250 in certain places is based
> > on the criticality of the operation but not on the preemption/irq state.
> > Since xHC reset is critical in startup and resume, the 10 seconds timeout
> > is enforced so that we don't break Renesas 720201 xHC.
> > 
> > Since all of our internl test reports indicate that the timeout is happening
> > from stop hcd, this patch is helping.
> > 
> 
> This was pretty much my reasoning as well.
> I could add a comment about this to the commit message
> 
> In addition we want a targeted fix for a real world issue that we can send to
> stable without changing too much, risking regressions.
> 
Makes complete sense. 

Greg,

Do you plan to include this patch?

Thanks,
Pavan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-03-02  3:23       ` Pavan Kondeti
@ 2022-03-02  7:49         ` Greg Kroah-Hartman
  2022-03-02  7:57           ` Pavan Kondeti
  0 siblings, 1 reply; 11+ messages in thread
From: Greg Kroah-Hartman @ 2022-03-02  7:49 UTC (permalink / raw)
  To: Pavan Kondeti; +Cc: Mathias Nyman, Jack Pham, s.shtylyov, linux-usb

On Wed, Mar 02, 2022 at 08:53:06AM +0530, Pavan Kondeti wrote:
> On Tue, Mar 01, 2022 at 10:47:36AM +0200, Mathias Nyman wrote:
> > On 1.3.2022 6.03, Pavan Kondeti wrote:
> > > On Mon, Feb 28, 2022 at 05:49:49PM -0800, Jack Pham wrote:
> > >> Hi Mathias,
> > >>
> > >> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
> > >>> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
> > >>> give Renesas 720201 xHC enough time to get ready in probe.
> > >>
> > >> This suggests that the only place we really need the long timeout is
> > >> in xhci_gen_setup().
> > >>
> > >>> @@ -1163,7 +1161,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
> > >>>  		xhci_dbg(xhci, "Stop HCD\n");
> > >>>  		xhci_halt(xhci);
> > >>>  		xhci_zero_64b_regs(xhci);
> > >>> -		retval = xhci_reset(xhci);
> > >>> +		retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
> > >>
> > >> Since preemption is disabled here (spin_lock_irq() is called near the
> > >> start of this function), shouldn't we also limit this to 250ms here in
> > >> xhci_resume() as well?
> > >>> The rationale of decreasing the timeout to 250 in certain places is based
> > > on the criticality of the operation but not on the preemption/irq state.
> > > Since xHC reset is critical in startup and resume, the 10 seconds timeout
> > > is enforced so that we don't break Renesas 720201 xHC.
> > > 
> > > Since all of our internl test reports indicate that the timeout is happening
> > > from stop hcd, this patch is helping.
> > > 
> > 
> > This was pretty much my reasoning as well.
> > I could add a comment about this to the commit message
> > 
> > In addition we want a targeted fix for a real world issue that we can send to
> > stable without changing too much, risking regressions.
> > 
> Makes complete sense. 
> 
> Greg,
> 
> Do you plan to include this patch?

I need it to be forwarded to me by the maintainer with their
signed-off-by, like any other xhci core patch :)

What is the rush here?  Your hardware is obviously broken somehow if you
are hitting this problem just now.  Why not fix the root cause here,
this patch should only affect rare situations, right?  Are you having
this problem a lot?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable
  2022-03-02  7:49         ` Greg Kroah-Hartman
@ 2022-03-02  7:57           ` Pavan Kondeti
  0 siblings, 0 replies; 11+ messages in thread
From: Pavan Kondeti @ 2022-03-02  7:57 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Pavan Kondeti, Mathias Nyman, Jack Pham, s.shtylyov, linux-usb

Hi Greg,

On Wed, Mar 02, 2022 at 08:49:08AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Mar 02, 2022 at 08:53:06AM +0530, Pavan Kondeti wrote:
> > On Tue, Mar 01, 2022 at 10:47:36AM +0200, Mathias Nyman wrote:
> > > On 1.3.2022 6.03, Pavan Kondeti wrote:
> > > > On Mon, Feb 28, 2022 at 05:49:49PM -0800, Jack Pham wrote:
> > > >> Hi Mathias,
> > > >>
> > > >> On Thu, Feb 17, 2022 at 03:56:43PM +0200, Mathias Nyman wrote:
> > > >>> xhci_reset() timeout was increased from 250ms to 10 seconds in order to
> > > >>> give Renesas 720201 xHC enough time to get ready in probe.
> > > >>
> > > >> This suggests that the only place we really need the long timeout is
> > > >> in xhci_gen_setup().
> > > >>
> > > >>> @@ -1163,7 +1161,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
> > > >>>  		xhci_dbg(xhci, "Stop HCD\n");
> > > >>>  		xhci_halt(xhci);
> > > >>>  		xhci_zero_64b_regs(xhci);
> > > >>> -		retval = xhci_reset(xhci);
> > > >>> +		retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC);
> > > >>
> > > >> Since preemption is disabled here (spin_lock_irq() is called near the
> > > >> start of this function), shouldn't we also limit this to 250ms here in
> > > >> xhci_resume() as well?
> > > >>> The rationale of decreasing the timeout to 250 in certain places is based
> > > > on the criticality of the operation but not on the preemption/irq state.
> > > > Since xHC reset is critical in startup and resume, the 10 seconds timeout
> > > > is enforced so that we don't break Renesas 720201 xHC.
> > > > 
> > > > Since all of our internl test reports indicate that the timeout is happening
> > > > from stop hcd, this patch is helping.
> > > > 
> > > 
> > > This was pretty much my reasoning as well.
> > > I could add a comment about this to the commit message
> > > 
> > > In addition we want a targeted fix for a real world issue that we can send to
> > > stable without changing too much, risking regressions.
> > > 
> > Makes complete sense. 
> > 
> > Greg,
> > 
> > Do you plan to include this patch?
> 
> I need it to be forwarded to me by the maintainer with their
> signed-off-by, like any other xhci core patch :)

Got it. I will wait for Mathia's final patch.

> 
> What is the rush here?  Your hardware is obviously broken somehow if you
> are hitting this problem just now.  Why not fix the root cause here,
> this patch should only affect rare situations, right?  Are you having
> this problem a lot?
> 
Sure, we will work with hardware designers to understand the root cause but
it is very unlikely that this can be fixed with a firmware/control setting
change. so this workaround would atleast solve the stability issues. It
happens rarely but we test with lots of devices in our farm and it shows up
eventually on one or the other device in a day or so. Why not include a
harmless patch and improve user experience is what driving us to request
Mathias and you to help.

Thanks,
Pavan

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-03-02  7:57 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-17 13:56 [RFT PATCH] xhci: make xhci_handshake timeout for xhci_reset() adjustable Mathias Nyman
2022-02-18  9:41 ` Pavan Kondeti
2022-02-24  8:44   ` Udipto Goswami
2022-02-28 11:39     ` Mathias Nyman
2022-02-28 12:10       ` Pavan Kondeti
2022-03-01  1:49 ` Jack Pham
2022-03-01  4:03   ` Pavan Kondeti
2022-03-01  8:47     ` Mathias Nyman
2022-03-02  3:23       ` Pavan Kondeti
2022-03-02  7:49         ` Greg Kroah-Hartman
2022-03-02  7:57           ` Pavan Kondeti

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).