linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] scsi: ufs: clear doorbell for hibern8 errors when using ah8
       [not found] <CGME20211007075635epcas2p16af95ce29750417f34f8490b0d8000d4@epcas2p1.samsung.com>
@ 2021-10-07  7:40 ` Kiwoong Kim
  2021-10-07 21:47   ` Bart Van Assche
  0 siblings, 1 reply; 3+ messages in thread
From: Kiwoong Kim @ 2021-10-07  7:40 UTC (permalink / raw)
  To: linux-scsi, linux-kernel, alim.akhtar, avri.altman, jejb,
	martin.petersen, beanhuo, cang, adrian.hunter, sc.suh, hy50.seo,
	sh425.lee, bhoon95.kim, vkumar.1997
  Cc: Kiwoong Kim

When an scsi command is dispatched right after host complete
all the pending requests goes to idle and ufs driver tries
to ring a doorbell, host might be still during entering into
hibern8. If an error occurrs during that period, the doorbell
might not be zero. In this case, clearing it should be needed
to requeue its command.
Currently, ufshcd_err_handler goes directly to reset when the
driver's link state is broken. This patch is to make it clear
doorbells in the case. In the situation, communication would
be disabled, so TM isn't necessary or we can say it's not
available.

Here's an actual symptom that I've faced. At the time, tag #17
is still pended even after host reset. And then the block timer
is expired.

exynos-ufs 11100000.ufs: ufshcd_check_errors: Auto Hibern8
Enter failed - status: 0x00000040, upmcrs: 0x00000001
..
host_regs: 00000050: b8671000 00000008 00020000 00000000
..
exynos-ufs 11100000.ufs: ufshcd_abort: Device abort task at tag 17

Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
---
 drivers/scsi/ufs/ufshcd.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 9faf02c..13f406d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6076,6 +6076,7 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
 	int err = 0, pmc_err;
 	int tag;
 	bool needs_reset = false, needs_restore = false;
+	bool link_broken_in_ah8 = false;
 
 	down(&hba->host_sem);
 	spin_lock_irqsave(hba->host->host_lock, flags);
@@ -6138,7 +6139,12 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
 	     (hba->saved_uic_err & (UFSHCD_UIC_DL_NAC_RECEIVED_ERROR |
 				    UFSHCD_UIC_DL_TCx_REPLAY_ERROR)))) {
 		needs_reset = true;
-		goto do_reset;
+		spin_lock_irqsave(hba->host->host_lock, flags);
+		if (!hba->ahit && ufshcd_is_link_broken(hba))
+			link_broken_in_ah8 = true;
+		spin_unlock_irqrestore(hba->host->host_lock, flags);
+		if (!link_broken_in_ah8)
+			goto do_reset;
 	}
 
 	/*
@@ -6168,6 +6174,9 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
 		}
 	}
 
+	if (link_broken_in_ah8)
+		goto lock_skip_pending_xfer_clear;
+
 	/* Clear pending task management requests */
 	for_each_set_bit(tag, &hba->outstanding_tasks, hba->nutmrs) {
 		if (ufshcd_clear_tm_cmd(hba, tag)) {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v1] scsi: ufs: clear doorbell for hibern8 errors when using ah8
  2021-10-07  7:40 ` [PATCH v1] scsi: ufs: clear doorbell for hibern8 errors when using ah8 Kiwoong Kim
@ 2021-10-07 21:47   ` Bart Van Assche
  2021-10-08  2:02     ` Kiwoong Kim
  0 siblings, 1 reply; 3+ messages in thread
From: Bart Van Assche @ 2021-10-07 21:47 UTC (permalink / raw)
  To: Kiwoong Kim, linux-scsi, linux-kernel, alim.akhtar, avri.altman,
	jejb, martin.petersen, beanhuo, cang, adrian.hunter, sc.suh,
	hy50.seo, sh425.lee, bhoon95.kim, vkumar.1997

On 10/7/21 12:40 AM, Kiwoong Kim wrote:
> When an scsi command is dispatched right after host complete
> all the pending requests goes to idle and ufs driver tries
> to ring a doorbell, host might be still during entering into
> hibern8. If an error occurrs during that period, the doorbell
> might not be zero. In this case, clearing it should be needed
> to requeue its command.
> Currently, ufshcd_err_handler goes directly to reset when the
> driver's link state is broken. This patch is to make it clear
> doorbells in the case. In the situation, communication would
> be disabled, so TM isn't necessary or we can say it's not
> available.

The above text is too hard to comprehend. Please make it more clear. As 
an example, in the first sentence, what does "goes to idle" apply to? 
Does it apply to "SCSI command", "pending requests" or something else?

What mechanism does "If an error occurs" refer to? A SCSI command 
processing error, a link error or another type of error?

> Here's an actual symptom that I've faced. At the time, tag #17
> is still pended even after host reset. And then the block timer
> is expired.

pended -> pending.

> exynos-ufs 11100000.ufs: ufshcd_check_errors: Auto Hibern8
> Enter failed - status: 0x00000040, upmcrs: 0x00000001
> ..
> host_regs: 00000050: b8671000 00000008 00020000 00000000
> ..
> exynos-ufs 11100000.ufs: ufshcd_abort: Device abort task at tag 17

The relationship between the example and the description above the 
example is not clear. If a command is pending before the error handler 
starts, aborting that command fails and the host is not reset then the 
command will still be pending after the error handler has finished.

> @@ -6138,7 +6139,12 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
>   	     (hba->saved_uic_err & (UFSHCD_UIC_DL_NAC_RECEIVED_ERROR |
>   				    UFSHCD_UIC_DL_TCx_REPLAY_ERROR)))) {
>   		needs_reset = true;
> -		goto do_reset;
> +		spin_lock_irqsave(hba->host->host_lock, flags);
> +		if (!hba->ahit && ufshcd_is_link_broken(hba))
> +			link_broken_in_ah8 = true;
> +		spin_unlock_irqrestore(hba->host->host_lock, flags);
> +		if (!link_broken_in_ah8)
> +			goto do_reset;
>   	}
>   

My understanding is that hba->ahit == 0 means that auto-hibernation is 
disabled. Hence, the above code sets 'link_broken_in_ah8' only if 
auto-hibernation is disabled. So what does the name of the variable 
'link_broken_in_ah8' mean?

> @@ -6168,6 +6174,9 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
>   		}
>   	}
>   
> +	if (link_broken_in_ah8)
> +		goto lock_skip_pending_xfer_clear;
> +
>   	/* Clear pending task management requests */
>   	for_each_set_bit(tag, &hba->outstanding_tasks, hba->nutmrs) {
>   		if (ufshcd_clear_tm_cmd(hba, tag)) {

Why is skipping the ufshcd_clear_tm_cmd() calls useful in this case?

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [PATCH v1] scsi: ufs: clear doorbell for hibern8 errors when using ah8
  2021-10-07 21:47   ` Bart Van Assche
@ 2021-10-08  2:02     ` Kiwoong Kim
  0 siblings, 0 replies; 3+ messages in thread
From: Kiwoong Kim @ 2021-10-08  2:02 UTC (permalink / raw)
  To: 'Bart Van Assche',
	linux-scsi, linux-kernel, alim.akhtar, avri.altman, jejb,
	martin.petersen, beanhuo, cang, adrian.hunter, sc.suh, hy50.seo,
	sh425.lee, bhoon95.kim, vkumar.1997

> On 10/7/21 12:40 AM, Kiwoong Kim wrote:
> > When an scsi command is dispatched right after host complete all the
> > pending requests goes to idle and ufs driver tries to ring a doorbell,
> > host might be still during entering into hibern8. If an error occurrs
> > during that period, the doorbell might not be zero. In this case,
> > clearing it should be needed to requeue its command.
> > Currently, ufshcd_err_handler goes directly to reset when the driver's
> > link state is broken. This patch is to make it clear doorbells in the
> > case. In the situation, communication would be disabled, so TM isn't
> > necessary or we can say it's not available.
> 
> The above text is too hard to comprehend. Please make it more clear. As an
> example, in the first sentence, what does "goes to idle" apply to?
> Does it apply to "SCSI command", "pending requests" or something else?

My 'goes to idle' means a state of no pended UTP requests.
I write 'scsi command' because the symptom that I saw is related with scsi command.

> 
> What mechanism does "If an error occurs" refer to? A SCSI command
> processing error, a link error or another type of error?

Hibern8 errors written in the title.

> 
> > Here's an actual symptom that I've faced. At the time, tag #17 is
> > still pended even after host reset. And then the block timer is
> > expired.
> 
> pended -> pending.

Got it.
> 
> > exynos-ufs 11100000.ufs: ufshcd_check_errors: Auto Hibern8 Enter
> > failed - status: 0x00000040, upmcrs: 0x00000001 ..
> > host_regs: 00000050: b8671000 00000008 00020000 00000000 ..
> > exynos-ufs 11100000.ufs: ufshcd_abort: Device abort task at tag 17
> 
> The relationship between the example and the description above the example
> is not clear. If a command is pending before the error handler starts,
> aborting that command fails and the host is not reset then the command
> will still be pending after the error handler has finished.
> 
> > @@ -6138,7 +6139,12 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
> >   	     (hba->saved_uic_err & (UFSHCD_UIC_DL_NAC_RECEIVED_ERROR |
> >   				    UFSHCD_UIC_DL_TCx_REPLAY_ERROR)))) {
> >   		needs_reset = true;
> > -		goto do_reset;
> > +		spin_lock_irqsave(hba->host->host_lock, flags);
> > +		if (!hba->ahit && ufshcd_is_link_broken(hba))
> > +			link_broken_in_ah8 = true;
> > +		spin_unlock_irqrestore(hba->host->host_lock, flags);
> > +		if (!link_broken_in_ah8)
> > +			goto do_reset;
> >   	}
> >
> 
> My understanding is that hba->ahit == 0 means that auto-hibernation is
> disabled. Hence, the above code sets 'link_broken_in_ah8' only if auto-
> hibernation is disabled. So what does the name of the variable
> 'link_broken_in_ah8' mean?

Mistake. And while I'm commenting, I get a better idea and will revise this patch.

> 
> > @@ -6168,6 +6174,9 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
> >   		}
> >   	}
> >
> > +	if (link_broken_in_ah8)
> > +		goto lock_skip_pending_xfer_clear;
> > +
> >   	/* Clear pending task management requests */
> >   	for_each_set_bit(tag, &hba->outstanding_tasks, hba->nutmrs) {
> >   		if (ufshcd_clear_tm_cmd(hba, tag)) {
> 
> Why is skipping the ufshcd_clear_tm_cmd() calls useful in this case?
> 
> Thanks,
> 
> Bart.
> 



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-10-08  2:03 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20211007075635epcas2p16af95ce29750417f34f8490b0d8000d4@epcas2p1.samsung.com>
2021-10-07  7:40 ` [PATCH v1] scsi: ufs: clear doorbell for hibern8 errors when using ah8 Kiwoong Kim
2021-10-07 21:47   ` Bart Van Assche
2021-10-08  2:02     ` Kiwoong Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).