All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational
@ 2021-10-02 15:45 Adrian Hunter
  2021-10-02 15:45 ` [PATCH 1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead Adrian Hunter
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Adrian Hunter @ 2021-10-02 15:45 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Bart Van Assche, linux-scsi

Hi

Callers of ufshcd_reset_and_restore() and ufshcd_err_handler() expect them
to return in an operational state. However, the code does not check the
state before exiting.  Here are a couple of patches to correct that.


Adrian Hunter (2):
      scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead
      scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead

 drivers/scsi/ufs/ufshcd.c | 66 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 48 insertions(+), 18 deletions(-)


Regards
Adrian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead
  2021-10-02 15:45 [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Adrian Hunter
@ 2021-10-02 15:45 ` Adrian Hunter
  2021-10-03  7:26   ` Avri Altman
  2021-10-02 15:45 ` [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() " Adrian Hunter
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Adrian Hunter @ 2021-10-02 15:45 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Bart Van Assche, linux-scsi

Callers of ufshcd_reset_and_restore() expect it to return in an operational
state. However, the code only checks direct errors and so the ufshcd_state
may not be UFSHCD_STATE_OPERATIONAL due to error interrupts.

Fix by checking also ufshcd_state, still allowing non-fatal errors which
are left for the error handler to deal with.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/scsi/ufs/ufshcd.c | 36 +++++++++++++++++++++++-------------
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 9faf02cbb9ad..16492779d3a6 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -7156,31 +7156,41 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
  */
 static int ufshcd_reset_and_restore(struct ufs_hba *hba)
 {
-	u32 saved_err;
-	u32 saved_uic_err;
+	u32 saved_err = 0;
+	u32 saved_uic_err = 0;
 	int err = 0;
 	unsigned long flags;
 	int retries = MAX_HOST_RESET_RETRIES;
 
-	/*
-	 * This is a fresh start, cache and clear saved error first,
-	 * in case new error generated during reset and restore.
-	 */
 	spin_lock_irqsave(hba->host->host_lock, flags);
-	saved_err = hba->saved_err;
-	saved_uic_err = hba->saved_uic_err;
-	hba->saved_err = 0;
-	hba->saved_uic_err = 0;
-	spin_unlock_irqrestore(hba->host->host_lock, flags);
-
 	do {
+		/*
+		 * This is a fresh start, cache and clear saved error first,
+		 * in case new error generated during reset and restore.
+		 */
+		saved_err |= hba->saved_err;
+		saved_uic_err |= hba->saved_uic_err;
+		hba->saved_err = 0;
+		hba->saved_uic_err = 0;
+		hba->force_reset = false;
+		hba->ufshcd_state = UFSHCD_STATE_RESET;
+		spin_unlock_irqrestore(hba->host->host_lock, flags);
+
 		/* Reset the attached device */
 		ufshcd_device_reset(hba);
 
 		err = ufshcd_host_reset_and_restore(hba);
+
+		spin_lock_irqsave(hba->host->host_lock, flags);
+		if (err)
+			continue;
+		/* Do not exit unless operational or dead */
+		if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL &&
+		    hba->ufshcd_state != UFSHCD_STATE_ERROR &&
+		    hba->ufshcd_state != UFSHCD_STATE_EH_SCHEDULED_NON_FATAL)
+			err = -EAGAIN;
 	} while (err && --retries);
 
-	spin_lock_irqsave(hba->host->host_lock, flags);
 	/*
 	 * Inform scsi mid-layer that we did reset and allow to handle
 	 * Unit Attention properly.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead
  2021-10-02 15:45 [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Adrian Hunter
  2021-10-02 15:45 ` [PATCH 1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead Adrian Hunter
@ 2021-10-02 15:45 ` Adrian Hunter
  2021-10-03  6:47   ` Avri Altman
  2021-10-03  7:26   ` Avri Altman
  2021-10-05  2:21 ` [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Martin K. Petersen
  2021-10-12 20:35 ` Martin K. Petersen
  3 siblings, 2 replies; 10+ messages in thread
From: Adrian Hunter @ 2021-10-02 15:45 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Avri Altman, Alim Akhtar,
	Can Guo, Asutosh Das, Bart Van Assche, linux-scsi

Callers of ufshcd_err_handler() expect it to return in an operational
state. However, the code does not check the state before exiting.

Add a check for the state and perform retries until either success or the
maximum number of retries is reached.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/scsi/ufs/ufshcd.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 16492779d3a6..33f55ecf43de 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -64,6 +64,9 @@
 /* maximum number of reset retries before giving up */
 #define MAX_HOST_RESET_RETRIES 5
 
+/* Maximum number of error handler retries before giving up */
+#define MAX_ERR_HANDLER_RETRIES 5
+
 /* Expose the flag value from utp_upiu_query.value */
 #define MASK_QUERY_UPIU_FLAG_LOC 0xFF
 
@@ -6070,12 +6073,14 @@ static bool ufshcd_is_pwr_mode_restore_needed(struct ufs_hba *hba)
 static void ufshcd_err_handler(struct Scsi_Host *host)
 {
 	struct ufs_hba *hba = shost_priv(host);
+	int retries = MAX_ERR_HANDLER_RETRIES;
 	unsigned long flags;
-	bool err_xfer = false;
-	bool err_tm = false;
-	int err = 0, pmc_err;
-	int tag;
-	bool needs_reset = false, needs_restore = false;
+	bool needs_restore;
+	bool needs_reset;
+	bool err_xfer;
+	bool err_tm;
+	int pmc_err;
+ 	int tag;
 
 	down(&hba->host_sem);
 	spin_lock_irqsave(hba->host->host_lock, flags);
@@ -6093,6 +6098,12 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
 	/* Complete requests that have door-bell cleared by h/w */
 	ufshcd_complete_requests(hba);
 	spin_lock_irqsave(hba->host->host_lock, flags);
+again:
+	needs_restore = false;
+	needs_reset = false;
+	err_xfer = false;
+	err_tm = false;
+
 	if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
 		hba->ufshcd_state = UFSHCD_STATE_RESET;
 	/*
@@ -6213,6 +6224,8 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
 do_reset:
 	/* Fatal errors need reset */
 	if (needs_reset) {
+		int err;
+
 		hba->force_reset = false;
 		spin_unlock_irqrestore(hba->host->host_lock, flags);
 		err = ufshcd_reset_and_restore(hba);
@@ -6232,6 +6245,13 @@ static void ufshcd_err_handler(struct Scsi_Host *host)
 			dev_err_ratelimited(hba->dev, "%s: exit: saved_err 0x%x saved_uic_err 0x%x",
 			    __func__, hba->saved_err, hba->saved_uic_err);
 	}
+	/* Exit in an operational state or dead */
+	if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL &&
+	    hba->ufshcd_state != UFSHCD_STATE_ERROR) {
+		if (--retries)
+			goto again;
+		hba->ufshcd_state = UFSHCD_STATE_ERROR;
+	}
 	ufshcd_clear_eh_in_progress(hba);
 	spin_unlock_irqrestore(hba->host->host_lock, flags);
 	ufshcd_err_handling_unprepare(hba);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* RE: [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead
  2021-10-02 15:45 ` [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() " Adrian Hunter
@ 2021-10-03  6:47   ` Avri Altman
  2021-10-03  7:10     ` Adrian Hunter
  2021-10-03  7:26   ` Avri Altman
  1 sibling, 1 reply; 10+ messages in thread
From: Avri Altman @ 2021-10-03  6:47 UTC (permalink / raw)
  To: Adrian Hunter, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Alim Akhtar, Can Guo,
	Asutosh Das, Bart Van Assche, linux-scsi

> Callers of ufshcd_err_handler() expect it to return in an operational
> state. However, the code does not check the state before exiting.
> 
> Add a check for the state and perform retries until either success or the
> maximum number of retries is reached.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  drivers/scsi/ufs/ufshcd.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 16492779d3a6..33f55ecf43de 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -64,6 +64,9 @@
>  /* maximum number of reset retries before giving up */
>  #define MAX_HOST_RESET_RETRIES 5
> 
> +/* Maximum number of error handler retries before giving up */
> +#define MAX_ERR_HANDLER_RETRIES 5
> +
>  /* Expose the flag value from utp_upiu_query.value */
>  #define MASK_QUERY_UPIU_FLAG_LOC 0xFF
> 
> @@ -6070,12 +6073,14 @@ static bool
> ufshcd_is_pwr_mode_restore_needed(struct ufs_hba *hba)
>  static void ufshcd_err_handler(struct Scsi_Host *host)
>  {
>         struct ufs_hba *hba = shost_priv(host);
> +       int retries = MAX_ERR_HANDLER_RETRIES;
>         unsigned long flags;
> -       bool err_xfer = false;
> -       bool err_tm = false;
> -       int err = 0, pmc_err;
> -       int tag;
> -       bool needs_reset = false, needs_restore = false;
> +       bool needs_restore;
> +       bool needs_reset;
> +       bool err_xfer;
> +       bool err_tm;
> +       int pmc_err;
> +       int tag;
> 
>         down(&hba->host_sem);
>         spin_lock_irqsave(hba->host->host_lock, flags);
> @@ -6093,6 +6098,12 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
>         /* Complete requests that have door-bell cleared by h/w */
>         ufshcd_complete_requests(hba);
>         spin_lock_irqsave(hba->host->host_lock, flags);
> +again:
> +       needs_restore = false;
> +       needs_reset = false;
> +       err_xfer = false;
> +       err_tm = false;
> +
>         if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
>                 hba->ufshcd_state = UFSHCD_STATE_RESET;
>         /*
> @@ -6213,6 +6224,8 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
>  do_reset:
>         /* Fatal errors need reset */
>         if (needs_reset) {
> +               int err;
> +
>                 hba->force_reset = false;
>                 spin_unlock_irqrestore(hba->host->host_lock, flags);
>                 err = ufshcd_reset_and_restore(hba);
> @@ -6232,6 +6245,13 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
>                         dev_err_ratelimited(hba->dev, "%s: exit: saved_err 0x%x
> saved_uic_err 0x%x",
>                             __func__, hba->saved_err, hba->saved_uic_err);
>         }
> +       /* Exit in an operational state or dead */
> +       if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL &&
> +           hba->ufshcd_state != UFSHCD_STATE_ERROR) {
> +               if (--retries)
> +                       goto again;
Why do you need to retry here as well?
ufshcd_reset_and_restore() already exists only if operational or dead?

Thanks,
Avri

> +               hba->ufshcd_state = UFSHCD_STATE_ERROR;
> +       }
>         ufshcd_clear_eh_in_progress(hba);
>         spin_unlock_irqrestore(hba->host->host_lock, flags);
>         ufshcd_err_handling_unprepare(hba);
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead
  2021-10-03  6:47   ` Avri Altman
@ 2021-10-03  7:10     ` Adrian Hunter
  2021-10-03  7:25       ` Avri Altman
  0 siblings, 1 reply; 10+ messages in thread
From: Adrian Hunter @ 2021-10-03  7:10 UTC (permalink / raw)
  To: Avri Altman, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Alim Akhtar, Can Guo,
	Asutosh Das, Bart Van Assche, linux-scsi

On 03/10/2021 09:47, Avri Altman wrote:
>> Callers of ufshcd_err_handler() expect it to return in an operational
>> state. However, the code does not check the state before exiting.
>>
>> Add a check for the state and perform retries until either success or the
>> maximum number of retries is reached.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  drivers/scsi/ufs/ufshcd.c | 30 +++++++++++++++++++++++++-----
>>  1 file changed, 25 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index 16492779d3a6..33f55ecf43de 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -64,6 +64,9 @@
>>  /* maximum number of reset retries before giving up */
>>  #define MAX_HOST_RESET_RETRIES 5
>>
>> +/* Maximum number of error handler retries before giving up */
>> +#define MAX_ERR_HANDLER_RETRIES 5
>> +
>>  /* Expose the flag value from utp_upiu_query.value */
>>  #define MASK_QUERY_UPIU_FLAG_LOC 0xFF
>>
>> @@ -6070,12 +6073,14 @@ static bool
>> ufshcd_is_pwr_mode_restore_needed(struct ufs_hba *hba)
>>  static void ufshcd_err_handler(struct Scsi_Host *host)
>>  {
>>         struct ufs_hba *hba = shost_priv(host);
>> +       int retries = MAX_ERR_HANDLER_RETRIES;
>>         unsigned long flags;
>> -       bool err_xfer = false;
>> -       bool err_tm = false;
>> -       int err = 0, pmc_err;
>> -       int tag;
>> -       bool needs_reset = false, needs_restore = false;
>> +       bool needs_restore;
>> +       bool needs_reset;
>> +       bool err_xfer;
>> +       bool err_tm;
>> +       int pmc_err;
>> +       int tag;
>>
>>         down(&hba->host_sem);
>>         spin_lock_irqsave(hba->host->host_lock, flags);
>> @@ -6093,6 +6098,12 @@ static void ufshcd_err_handler(struct Scsi_Host
>> *host)
>>         /* Complete requests that have door-bell cleared by h/w */
>>         ufshcd_complete_requests(hba);
>>         spin_lock_irqsave(hba->host->host_lock, flags);
>> +again:
>> +       needs_restore = false;
>> +       needs_reset = false;
>> +       err_xfer = false;
>> +       err_tm = false;
>> +
>>         if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
>>                 hba->ufshcd_state = UFSHCD_STATE_RESET;
>>         /*
>> @@ -6213,6 +6224,8 @@ static void ufshcd_err_handler(struct Scsi_Host
>> *host)
>>  do_reset:
>>         /* Fatal errors need reset */
>>         if (needs_reset) {
>> +               int err;
>> +
>>                 hba->force_reset = false;
>>                 spin_unlock_irqrestore(hba->host->host_lock, flags);
>>                 err = ufshcd_reset_and_restore(hba);
>> @@ -6232,6 +6245,13 @@ static void ufshcd_err_handler(struct Scsi_Host
>> *host)
>>                         dev_err_ratelimited(hba->dev, "%s: exit: saved_err 0x%x
>> saved_uic_err 0x%x",
>>                             __func__, hba->saved_err, hba->saved_uic_err);
>>         }
>> +       /* Exit in an operational state or dead */
>> +       if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL &&
>> +           hba->ufshcd_state != UFSHCD_STATE_ERROR) {
>> +               if (--retries)
>> +                       goto again;
> Why do you need to retry here as well?

Thanks for looking at this.

It shouldn't hurt to retry bringing the device back to life.  The
alternative is UFSHCD_STATE_ERROR which means dead.

> ufshcd_reset_and_restore() already exists only if operational or dead?

ufshcd_reset_and_restore() isn't the only path.  There are also
ufshcd_quirk_dl_nac_errors() and ufshcd_config_pwr_mode() and in
the future perhaps others.

This seems the right place to ensure that the error handler
guarantees operational (or dead) status.

> 
> Thanks,
> Avri
> 
>> +               hba->ufshcd_state = UFSHCD_STATE_ERROR;
>> +       }
>>         ufshcd_clear_eh_in_progress(hba);
>>         spin_unlock_irqrestore(hba->host->host_lock, flags);
>>         ufshcd_err_handling_unprepare(hba);
>> --
>> 2.25.1
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead
  2021-10-03  7:10     ` Adrian Hunter
@ 2021-10-03  7:25       ` Avri Altman
  0 siblings, 0 replies; 10+ messages in thread
From: Avri Altman @ 2021-10-03  7:25 UTC (permalink / raw)
  To: Adrian Hunter, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Alim Akhtar, Can Guo,
	Asutosh Das, Bart Van Assche, linux-scsi

> >>                 hba->force_reset = false;
> >>                 spin_unlock_irqrestore(hba->host->host_lock, flags);
> >>                 err = ufshcd_reset_and_restore(hba); @@ -6232,6
> >> +6245,13 @@ static void ufshcd_err_handler(struct Scsi_Host
> >> *host)
> >>                         dev_err_ratelimited(hba->dev, "%s: exit:
> >> saved_err 0x%x saved_uic_err 0x%x",
> >>                             __func__, hba->saved_err, hba->saved_uic_err);
> >>         }
> >> +       /* Exit in an operational state or dead */
> >> +       if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL &&
> >> +           hba->ufshcd_state != UFSHCD_STATE_ERROR) {
> >> +               if (--retries)
> >> +                       goto again;
> > Why do you need to retry here as well?
> 
> Thanks for looking at this.
> 
> It shouldn't hurt to retry bringing the device back to life.  The alternative is
> UFSHCD_STATE_ERROR which means dead.
> 
> > ufshcd_reset_and_restore() already exists only if operational or dead?
> 
> ufshcd_reset_and_restore() isn't the only path.  There are also
> ufshcd_quirk_dl_nac_errors() and ufshcd_config_pwr_mode() and in the
> future perhaps others.
> 
> This seems the right place to ensure that the error handler guarantees
> operational (or dead) status.
OK.  Thanks.
Avri

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead
  2021-10-02 15:45 ` [PATCH 1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead Adrian Hunter
@ 2021-10-03  7:26   ` Avri Altman
  0 siblings, 0 replies; 10+ messages in thread
From: Avri Altman @ 2021-10-03  7:26 UTC (permalink / raw)
  To: Adrian Hunter, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Alim Akhtar, Can Guo,
	Asutosh Das, Bart Van Assche, linux-scsi

> 
> Callers of ufshcd_reset_and_restore() expect it to return in an operational
> state. However, the code only checks direct errors and so the ufshcd_state
> may not be UFSHCD_STATE_OPERATIONAL due to error interrupts.
> 
> Fix by checking also ufshcd_state, still allowing non-fatal errors which are left
> for the error handler to deal with.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Avri altman <avri.altman@wdc.com>


> ---
>  drivers/scsi/ufs/ufshcd.c | 36 +++++++++++++++++++++++-------------
>  1 file changed, 23 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index
> 9faf02cbb9ad..16492779d3a6 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -7156,31 +7156,41 @@ static int ufshcd_host_reset_and_restore(struct
> ufs_hba *hba)
>   */
>  static int ufshcd_reset_and_restore(struct ufs_hba *hba)  {
> -       u32 saved_err;
> -       u32 saved_uic_err;
> +       u32 saved_err = 0;
> +       u32 saved_uic_err = 0;
>         int err = 0;
>         unsigned long flags;
>         int retries = MAX_HOST_RESET_RETRIES;
> 
> -       /*
> -        * This is a fresh start, cache and clear saved error first,
> -        * in case new error generated during reset and restore.
> -        */
>         spin_lock_irqsave(hba->host->host_lock, flags);
> -       saved_err = hba->saved_err;
> -       saved_uic_err = hba->saved_uic_err;
> -       hba->saved_err = 0;
> -       hba->saved_uic_err = 0;
> -       spin_unlock_irqrestore(hba->host->host_lock, flags);
> -
>         do {
> +               /*
> +                * This is a fresh start, cache and clear saved error first,
> +                * in case new error generated during reset and restore.
> +                */
> +               saved_err |= hba->saved_err;
> +               saved_uic_err |= hba->saved_uic_err;
> +               hba->saved_err = 0;
> +               hba->saved_uic_err = 0;
> +               hba->force_reset = false;
> +               hba->ufshcd_state = UFSHCD_STATE_RESET;
> +               spin_unlock_irqrestore(hba->host->host_lock, flags);
> +
>                 /* Reset the attached device */
>                 ufshcd_device_reset(hba);
> 
>                 err = ufshcd_host_reset_and_restore(hba);
> +
> +               spin_lock_irqsave(hba->host->host_lock, flags);
> +               if (err)
> +                       continue;
> +               /* Do not exit unless operational or dead */
> +               if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL &&
> +                   hba->ufshcd_state != UFSHCD_STATE_ERROR &&
> +                   hba->ufshcd_state !=
> UFSHCD_STATE_EH_SCHEDULED_NON_FATAL)
> +                       err = -EAGAIN;
>         } while (err && --retries);
> 
> -       spin_lock_irqsave(hba->host->host_lock, flags);
>         /*
>          * Inform scsi mid-layer that we did reset and allow to handle
>          * Unit Attention properly.
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead
  2021-10-02 15:45 ` [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() " Adrian Hunter
  2021-10-03  6:47   ` Avri Altman
@ 2021-10-03  7:26   ` Avri Altman
  1 sibling, 0 replies; 10+ messages in thread
From: Avri Altman @ 2021-10-03  7:26 UTC (permalink / raw)
  To: Adrian Hunter, Martin K . Petersen
  Cc: James E . J . Bottomley, Bean Huo, Alim Akhtar, Can Guo,
	Asutosh Das, Bart Van Assche, linux-scsi

> 
> Callers of ufshcd_err_handler() expect it to return in an operational
> state. However, the code does not check the state before exiting.
> 
> Add a check for the state and perform retries until either success or the
> maximum number of retries is reached.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>


> ---
>  drivers/scsi/ufs/ufshcd.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 16492779d3a6..33f55ecf43de 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -64,6 +64,9 @@
>  /* maximum number of reset retries before giving up */
>  #define MAX_HOST_RESET_RETRIES 5
> 
> +/* Maximum number of error handler retries before giving up */
> +#define MAX_ERR_HANDLER_RETRIES 5
> +
>  /* Expose the flag value from utp_upiu_query.value */
>  #define MASK_QUERY_UPIU_FLAG_LOC 0xFF
> 
> @@ -6070,12 +6073,14 @@ static bool
> ufshcd_is_pwr_mode_restore_needed(struct ufs_hba *hba)
>  static void ufshcd_err_handler(struct Scsi_Host *host)
>  {
>         struct ufs_hba *hba = shost_priv(host);
> +       int retries = MAX_ERR_HANDLER_RETRIES;
>         unsigned long flags;
> -       bool err_xfer = false;
> -       bool err_tm = false;
> -       int err = 0, pmc_err;
> -       int tag;
> -       bool needs_reset = false, needs_restore = false;
> +       bool needs_restore;
> +       bool needs_reset;
> +       bool err_xfer;
> +       bool err_tm;
> +       int pmc_err;
> +       int tag;
> 
>         down(&hba->host_sem);
>         spin_lock_irqsave(hba->host->host_lock, flags);
> @@ -6093,6 +6098,12 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
>         /* Complete requests that have door-bell cleared by h/w */
>         ufshcd_complete_requests(hba);
>         spin_lock_irqsave(hba->host->host_lock, flags);
> +again:
> +       needs_restore = false;
> +       needs_reset = false;
> +       err_xfer = false;
> +       err_tm = false;
> +
>         if (hba->ufshcd_state != UFSHCD_STATE_ERROR)
>                 hba->ufshcd_state = UFSHCD_STATE_RESET;
>         /*
> @@ -6213,6 +6224,8 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
>  do_reset:
>         /* Fatal errors need reset */
>         if (needs_reset) {
> +               int err;
> +
>                 hba->force_reset = false;
>                 spin_unlock_irqrestore(hba->host->host_lock, flags);
>                 err = ufshcd_reset_and_restore(hba);
> @@ -6232,6 +6245,13 @@ static void ufshcd_err_handler(struct Scsi_Host
> *host)
>                         dev_err_ratelimited(hba->dev, "%s: exit: saved_err 0x%x
> saved_uic_err 0x%x",
>                             __func__, hba->saved_err, hba->saved_uic_err);
>         }
> +       /* Exit in an operational state or dead */
> +       if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL &&
> +           hba->ufshcd_state != UFSHCD_STATE_ERROR) {
> +               if (--retries)
> +                       goto again;
> +               hba->ufshcd_state = UFSHCD_STATE_ERROR;
> +       }
>         ufshcd_clear_eh_in_progress(hba);
>         spin_unlock_irqrestore(hba->host->host_lock, flags);
>         ufshcd_err_handling_unprepare(hba);
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational
  2021-10-02 15:45 [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Adrian Hunter
  2021-10-02 15:45 ` [PATCH 1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead Adrian Hunter
  2021-10-02 15:45 ` [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() " Adrian Hunter
@ 2021-10-05  2:21 ` Martin K. Petersen
  2021-10-12 20:35 ` Martin K. Petersen
  3 siblings, 0 replies; 10+ messages in thread
From: Martin K. Petersen @ 2021-10-05  2:21 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Martin K . Petersen, James E . J . Bottomley, Bean Huo,
	Avri Altman, Alim Akhtar, Can Guo, Asutosh Das, Bart Van Assche,
	linux-scsi


Adrian,

> Callers of ufshcd_reset_and_restore() and ufshcd_err_handler() expect
> them to return in an operational state. However, the code does not
> check the state before exiting.  Here are a couple of patches to
> correct that.

Applied to 5.16/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational
  2021-10-02 15:45 [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Adrian Hunter
                   ` (2 preceding siblings ...)
  2021-10-05  2:21 ` [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Martin K. Petersen
@ 2021-10-12 20:35 ` Martin K. Petersen
  3 siblings, 0 replies; 10+ messages in thread
From: Martin K. Petersen @ 2021-10-12 20:35 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Martin K . Petersen, Bart Van Assche, James E . J . Bottomley,
	Bean Huo, Alim Akhtar, linux-scsi, Asutosh Das, Can Guo,
	Avri Altman

On Sat, 2 Oct 2021 18:45:48 +0300, Adrian Hunter wrote:

> Callers of ufshcd_reset_and_restore() and ufshcd_err_handler() expect them
> to return in an operational state. However, the code does not check the
> state before exiting.  Here are a couple of patches to correct that.
> 
> 
> Adrian Hunter (2):
>       scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead
>       scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead
> 
> [...]

Applied to 5.16/scsi-queue, thanks!

[1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead
      https://git.kernel.org/mkp/scsi/c/54a4045342a8
[2/2] scsi: ufs: Do not exit ufshcd_err_handler() unless operational or dead
      https://git.kernel.org/mkp/scsi/c/87bf6a6bbe8b

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-12 20:35 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-02 15:45 [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Adrian Hunter
2021-10-02 15:45 ` [PATCH 1/2] scsi: ufs: Do not exit ufshcd_reset_and_restore() unless operational or dead Adrian Hunter
2021-10-03  7:26   ` Avri Altman
2021-10-02 15:45 ` [PATCH 2/2] scsi: ufs: Do not exit ufshcd_err_handler() " Adrian Hunter
2021-10-03  6:47   ` Avri Altman
2021-10-03  7:10     ` Adrian Hunter
2021-10-03  7:25       ` Avri Altman
2021-10-03  7:26   ` Avri Altman
2021-10-05  2:21 ` [PATCH 0/2] scsi: ufs: Do not exit reset of error functions unless operational Martin K. Petersen
2021-10-12 20:35 ` Martin K. Petersen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.