linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold
@ 2020-07-29  2:40 Stanley Chu
  2020-07-29  8:43 ` Can Guo
  0 siblings, 1 reply; 6+ messages in thread
From: Stanley Chu @ 2020-07-29  2:40 UTC (permalink / raw)
  To: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb, bvanassche
  Cc: beanhuo, asutoshd, cang, matthias.bgg, linux-mediatek,
	linux-arm-kernel, linux-kernel, kuohong.wang, peter.wang,
	chun-hung.wu, andy.teng, chaotian.jing, cc.chou, Stanley Chu

In ufshcd_suspend(), after clk-gating is suspended and link is set
as Hibern8 state, ufshcd_hold() is still possibly invoked before
ufshcd_suspend() returns. For example, MediaTek's suspend vops may
issue UIC commands which would call ufshcd_hold() during the command
issuing flow.

Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
then ufshcd_hold() may enter infinite loops because there is no
clk-ungating work scheduled or pending. In this case, ufshcd_hold()
shall just bypass, and keep the link as Hibern8 state.

Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Andy Teng <andy.teng@mediatek.com>

---

Changes since v1:
- Fix return value: Use unique bool variable to get the result of flush_work(). Thcan prevent incorrect returned value, i.e., rc, if flush_work() returns true
- Fix commit message

---
 drivers/scsi/ufs/ufshcd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 577cc0d7487f..acba2271c5d3 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1561,6 +1561,7 @@ static void ufshcd_ungate_work(struct work_struct *work)
 int ufshcd_hold(struct ufs_hba *hba, bool async)
 {
 	int rc = 0;
+	bool flush_result;
 	unsigned long flags;
 
 	if (!ufshcd_is_clkgating_allowed(hba))
@@ -1592,7 +1593,9 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
 				break;
 			}
 			spin_unlock_irqrestore(hba->host->host_lock, flags);
-			flush_work(&hba->clk_gating.ungate_work);
+			flush_result = flush_work(&hba->clk_gating.ungate_work);
+			if (hba->clk_gating.is_suspended && !flush_result)
+				goto out;
 			spin_lock_irqsave(hba->host->host_lock, flags);
 			goto start;
 		}
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold
  2020-07-29  2:40 [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold Stanley Chu
@ 2020-07-29  8:43 ` Can Guo
  2020-07-29 10:26   ` Stanley Chu
  0 siblings, 1 reply; 6+ messages in thread
From: Can Guo @ 2020-07-29  8:43 UTC (permalink / raw)
  To: Stanley Chu
  Cc: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb,
	bvanassche, beanhuo, asutoshd, matthias.bgg, linux-mediatek,
	linux-arm-kernel, linux-kernel, kuohong.wang, peter.wang,
	chun-hung.wu, andy.teng, chaotian.jing, cc.chou

Hi Stanley,

On 2020-07-29 10:40, Stanley Chu wrote:
> In ufshcd_suspend(), after clk-gating is suspended and link is set
> as Hibern8 state, ufshcd_hold() is still possibly invoked before
> ufshcd_suspend() returns. For example, MediaTek's suspend vops may
> issue UIC commands which would call ufshcd_hold() during the command
> issuing flow.
> 
> Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
> then ufshcd_hold() may enter infinite loops because there is no
> clk-ungating work scheduled or pending. In this case, ufshcd_hold()
> shall just bypass, and keep the link as Hibern8 state.
> 

The infinite loop is expected as ufshcd_hold is called again after
link is put to hibern8 state, so in QCOM's code, we never do this.
The cap UFSHCD_CAP_HIBERN8_WITH_CLK_GATING means UIC link state
must not be HIBERN8 after ufshcd_hold(async=false) returns.

Instead of bailing out from that loop, which makes the logic of
ufshcd_hold and clk gating even more complex, how about removing
ufshcd_hold/release from ufshcd_send_uic_cmd()? I think they are
redundant and we should never send DME cmds if clocks/powers are
not ready. I mean callers should make sure they are ready to send
DME cmds (and only callers know when), but not leave that job to
ufshcd_send_uic_cmd(). It is convenient to remove ufshcd_hold/
release from ufshcd_send_uic_cmd() as there are not many places
sending DME cmds without holding the clocks, ufs_bsg.c is one.
And I have tested my idea on my setup, it worked well for me.
Another benefit is that it also allows us to use DME cmds
in clk gating/ungating contexts if we need to in the future.

Please let me know your idea, thanks.

Can Guo.

> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
> Signed-off-by: Andy Teng <andy.teng@mediatek.com>
> 
> ---
> 
> Changes since v1:
> - Fix return value: Use unique bool variable to get the result of
> flush_work(). Thcan prevent incorrect returned value, i.e., rc, if
> flush_work() returns true
> - Fix commit message
> 
> ---
>  drivers/scsi/ufs/ufshcd.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 577cc0d7487f..acba2271c5d3 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -1561,6 +1561,7 @@ static void ufshcd_ungate_work(struct work_struct 
> *work)
>  int ufshcd_hold(struct ufs_hba *hba, bool async)
>  {
>  	int rc = 0;
> +	bool flush_result;
>  	unsigned long flags;
> 
>  	if (!ufshcd_is_clkgating_allowed(hba))
> @@ -1592,7 +1593,9 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
>  				break;
>  			}
>  			spin_unlock_irqrestore(hba->host->host_lock, flags);
> -			flush_work(&hba->clk_gating.ungate_work);
> +			flush_result = flush_work(&hba->clk_gating.ungate_work);
> +			if (hba->clk_gating.is_suspended && !flush_result)
> +				goto out;
>  			spin_lock_irqsave(hba->host->host_lock, flags);
>  			goto start;
>  		}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold
  2020-07-29  8:43 ` Can Guo
@ 2020-07-29 10:26   ` Stanley Chu
  2020-07-29 10:53     ` Can Guo
  0 siblings, 1 reply; 6+ messages in thread
From: Stanley Chu @ 2020-07-29 10:26 UTC (permalink / raw)
  To: Can Guo
  Cc: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb,
	bvanassche, beanhuo, asutoshd, matthias.bgg, linux-mediatek,
	linux-arm-kernel, linux-kernel, kuohong.wang, peter.wang,
	chun-hung.wu, andy.teng, chaotian.jing, cc.chou

Hi Can,

On Wed, 2020-07-29 at 16:43 +0800, Can Guo wrote:
> Hi Stanley,
> 
> On 2020-07-29 10:40, Stanley Chu wrote:
> > In ufshcd_suspend(), after clk-gating is suspended and link is set
> > as Hibern8 state, ufshcd_hold() is still possibly invoked before
> > ufshcd_suspend() returns. For example, MediaTek's suspend vops may
> > issue UIC commands which would call ufshcd_hold() during the command
> > issuing flow.
> > 
> > Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
> > then ufshcd_hold() may enter infinite loops because there is no
> > clk-ungating work scheduled or pending. In this case, ufshcd_hold()
> > shall just bypass, and keep the link as Hibern8 state.
> > 
> 
> The infinite loop is expected as ufshcd_hold is called again after
> link is put to hibern8 state, so in QCOM's code, we never do this.

Sadly MediaTek have to do this to make our UniPro to enter low-power
mode.

> The cap UFSHCD_CAP_HIBERN8_WITH_CLK_GATING means UIC link state
> must not be HIBERN8 after ufshcd_hold(async=false) returns.

If driver is not in PM scenarios, e.g., suspended, above statement shall
be always followed. But two obvious violations are existed,

1. In ufshcd_suspend(), link is set as HIBERN8 behind ufshcd_hold()
2. In ufshcd_resume(), link is set back as Active before
ufshcd_release() is invoked 

So as my understanding, special conditions are allowed in PM scenarios,
and this is why "hba->clk_gating.is_suspended" is introduced. By this
thought, I used "hba->clk_gating.is_suspended" in this patch as the
mandatory condition to allow ufshcd_hold() usage in vendor suspend and
resume callbacks.


> Instead of bailing out from that loop, which makes the logic of
> ufshcd_hold and clk gating even more complex, how about removing
> ufshcd_hold/release from ufshcd_send_uic_cmd()? I think they are
> redundant and we should never send DME cmds if clocks/powers are
> not ready. I mean callers should make sure they are ready to send
> DME cmds (and only callers know when), but not leave that job to
> ufshcd_send_uic_cmd(). It is convenient to remove ufshcd_hold/
> release from ufshcd_send_uic_cmd() as there are not many places
> sending DME cmds without holding the clocks, ufs_bsg.c is one.
> And I have tested my idea on my setup, it worked well for me.
> Another benefit is that it also allows us to use DME cmds
> in clk gating/ungating contexts if we need to in the future.
> 

Brilliant idea! But this may not solve problems if vendor callbacks need
more than UIC commands in the future.

This simple patch could make all vendor operations on UFSHCI in PM
callbacks possible with UFSHCD_CAP_HIBERN8_WITH_CLK_GATING enabled, and
again, it allows those operations in PM scenarios only.

> Please let me know your idea, thanks.
> 
> Can Guo.

Thanks,
Stanley Chu

> 
> > Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
> > Signed-off-by: Andy Teng <andy.teng@mediatek.com>
> > 
> > ---
> > 
> > Changes since v1:
> > - Fix return value: Use unique bool variable to get the result of
> > flush_work(). Thcan prevent incorrect returned value, i.e., rc, if
> > flush_work() returns true
> > - Fix commit message
> > 
> > ---
> >  drivers/scsi/ufs/ufshcd.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > index 577cc0d7487f..acba2271c5d3 100644
> > --- a/drivers/scsi/ufs/ufshcd.c
> > +++ b/drivers/scsi/ufs/ufshcd.c
> > @@ -1561,6 +1561,7 @@ static void ufshcd_ungate_work(struct work_struct 
> > *work)
> >  int ufshcd_hold(struct ufs_hba *hba, bool async)
> >  {
> >  	int rc = 0;
> > +	bool flush_result;
> >  	unsigned long flags;
> > 
> >  	if (!ufshcd_is_clkgating_allowed(hba))
> > @@ -1592,7 +1593,9 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
> >  				break;
> >  			}
> >  			spin_unlock_irqrestore(hba->host->host_lock, flags);
> > -			flush_work(&hba->clk_gating.ungate_work);
> > +			flush_result = flush_work(&hba->clk_gating.ungate_work);
> > +			if (hba->clk_gating.is_suspended && !flush_result)
> > +				goto out;
> >  			spin_lock_irqsave(hba->host->host_lock, flags);
> >  			goto start;
> >  		}


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold
  2020-07-29 10:26   ` Stanley Chu
@ 2020-07-29 10:53     ` Can Guo
  2020-07-30  7:59       ` Stanley Chu
  0 siblings, 1 reply; 6+ messages in thread
From: Can Guo @ 2020-07-29 10:53 UTC (permalink / raw)
  To: Stanley Chu
  Cc: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb,
	bvanassche, beanhuo, asutoshd, matthias.bgg, linux-mediatek,
	linux-arm-kernel, linux-kernel, kuohong.wang, peter.wang,
	chun-hung.wu, andy.teng, chaotian.jing, cc.chou

Hi Stanley,

On 2020-07-29 18:26, Stanley Chu wrote:
> Hi Can,
> 
> On Wed, 2020-07-29 at 16:43 +0800, Can Guo wrote:
>> Hi Stanley,
>> 
>> On 2020-07-29 10:40, Stanley Chu wrote:
>> > In ufshcd_suspend(), after clk-gating is suspended and link is set
>> > as Hibern8 state, ufshcd_hold() is still possibly invoked before
>> > ufshcd_suspend() returns. For example, MediaTek's suspend vops may
>> > issue UIC commands which would call ufshcd_hold() during the command
>> > issuing flow.
>> >
>> > Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
>> > then ufshcd_hold() may enter infinite loops because there is no
>> > clk-ungating work scheduled or pending. In this case, ufshcd_hold()
>> > shall just bypass, and keep the link as Hibern8 state.
>> >
>> 
>> The infinite loop is expected as ufshcd_hold is called again after
>> link is put to hibern8 state, so in QCOM's code, we never do this.
> 
> Sadly MediaTek have to do this to make our UniPro to enter low-power
> mode.
> 
>> The cap UFSHCD_CAP_HIBERN8_WITH_CLK_GATING means UIC link state
>> must not be HIBERN8 after ufshcd_hold(async=false) returns.
> 
> If driver is not in PM scenarios, e.g., suspended, above statement 
> shall
> be always followed. But two obvious violations are existed,
> 
> 1. In ufshcd_suspend(), link is set as HIBERN8 behind ufshcd_hold()
> 2. In ufshcd_resume(), link is set back as Active before
> ufshcd_release() is invoked
> 
> So as my understanding, special conditions are allowed in PM scenarios,
> and this is why "hba->clk_gating.is_suspended" is introduced. By this
> thought, I used "hba->clk_gating.is_suspended" in this patch as the
> mandatory condition to allow ufshcd_hold() usage in vendor suspend and
> resume callbacks.
> 
> 
>> Instead of bailing out from that loop, which makes the logic of
>> ufshcd_hold and clk gating even more complex, how about removing
>> ufshcd_hold/release from ufshcd_send_uic_cmd()? I think they are
>> redundant and we should never send DME cmds if clocks/powers are
>> not ready. I mean callers should make sure they are ready to send
>> DME cmds (and only callers know when), but not leave that job to
>> ufshcd_send_uic_cmd(). It is convenient to remove ufshcd_hold/
>> release from ufshcd_send_uic_cmd() as there are not many places
>> sending DME cmds without holding the clocks, ufs_bsg.c is one.
>> And I have tested my idea on my setup, it worked well for me.
>> Another benefit is that it also allows us to use DME cmds
>> in clk gating/ungating contexts if we need to in the future.
>> 
> 
> Brilliant idea! But this may not solve problems if vendor callbacks 
> need
> more than UIC commands in the future.
> 
> This simple patch could make all vendor operations on UFSHCI in PM
> callbacks possible with UFSHCD_CAP_HIBERN8_WITH_CLK_GATING enabled, and
> again, it allows those operations in PM scenarios only.
> 

Other than UIC cmds, I can only think of device manangement cmds (like 
query).
If device management cmds come into the way in the future, we fix it as 
well.
I mean that is the right thing to do in my opinion - just like we don't 
call
pm_runtime_get_sync() in ufshcd_send_uic_cmd().

I can understand that you want a simple/quick fix to get it work for you 
once
for all, but from my point of view, debugging clk gating/ungating really 
takes
huge efforts sometime (I've spent a lot of time on it). Some flash 
vendors also
use it in their own driver widely which makes some failure scenes even 
harder to
undertand/debug. So the first thing comes to my head is that we should 
avoid
making it more complex or giving it more exceptions.

 From functionality point of view, it looks ok to me. It is just that I 
cannot
predict it won't cause new problems since the clk gating/ungating 
sequeces are
like magic in some use cases sometime.

Thanks,

Can Guo.

>> Please let me know your idea, thanks.
>> 
>> Can Guo.
> 
> Thanks,
> Stanley Chu
> 
>> 
>> > Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
>> > Signed-off-by: Andy Teng <andy.teng@mediatek.com>
>> >
>> > ---
>> >
>> > Changes since v1:
>> > - Fix return value: Use unique bool variable to get the result of
>> > flush_work(). Thcan prevent incorrect returned value, i.e., rc, if
>> > flush_work() returns true
>> > - Fix commit message
>> >
>> > ---
>> >  drivers/scsi/ufs/ufshcd.c | 5 ++++-
>> >  1 file changed, 4 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> > index 577cc0d7487f..acba2271c5d3 100644
>> > --- a/drivers/scsi/ufs/ufshcd.c
>> > +++ b/drivers/scsi/ufs/ufshcd.c
>> > @@ -1561,6 +1561,7 @@ static void ufshcd_ungate_work(struct work_struct
>> > *work)
>> >  int ufshcd_hold(struct ufs_hba *hba, bool async)
>> >  {
>> >  	int rc = 0;
>> > +	bool flush_result;
>> >  	unsigned long flags;
>> >
>> >  	if (!ufshcd_is_clkgating_allowed(hba))
>> > @@ -1592,7 +1593,9 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
>> >  				break;
>> >  			}
>> >  			spin_unlock_irqrestore(hba->host->host_lock, flags);
>> > -			flush_work(&hba->clk_gating.ungate_work);
>> > +			flush_result = flush_work(&hba->clk_gating.ungate_work);
>> > +			if (hba->clk_gating.is_suspended && !flush_result)
>> > +				goto out;
>> >  			spin_lock_irqsave(hba->host->host_lock, flags);
>> >  			goto start;
>> >  		}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold
  2020-07-29 10:53     ` Can Guo
@ 2020-07-30  7:59       ` Stanley Chu
       [not found]         ` <1596518850.27829.5.camel@mtkswgap22>
  0 siblings, 1 reply; 6+ messages in thread
From: Stanley Chu @ 2020-07-30  7:59 UTC (permalink / raw)
  To: Can Guo
  Cc: linux-scsi, martin.petersen, avri.altman, alim.akhtar, jejb,
	bvanassche, beanhuo, asutoshd, matthias.bgg, linux-mediatek,
	linux-arm-kernel, linux-kernel, kuohong.wang, peter.wang,
	chun-hung.wu, andy.teng, chaotian.jing, cc.chou

Hi Can,

On Wed, 2020-07-29 at 18:53 +0800, Can Guo wrote:
> Hi Stanley,
> 
> On 2020-07-29 18:26, Stanley Chu wrote:
> > Hi Can,
> > 
> > On Wed, 2020-07-29 at 16:43 +0800, Can Guo wrote:
> >> Hi Stanley,
> >> 
> >> On 2020-07-29 10:40, Stanley Chu wrote:
> >> > In ufshcd_suspend(), after clk-gating is suspended and link is set
> >> > as Hibern8 state, ufshcd_hold() is still possibly invoked before
> >> > ufshcd_suspend() returns. For example, MediaTek's suspend vops may
> >> > issue UIC commands which would call ufshcd_hold() during the command
> >> > issuing flow.
> >> >
> >> > Now if UFSHCD_CAP_HIBERN8_WITH_CLK_GATING capability is enabled,
> >> > then ufshcd_hold() may enter infinite loops because there is no
> >> > clk-ungating work scheduled or pending. In this case, ufshcd_hold()
> >> > shall just bypass, and keep the link as Hibern8 state.
> >> >
> >> 
> >> The infinite loop is expected as ufshcd_hold is called again after
> >> link is put to hibern8 state, so in QCOM's code, we never do this.
> > 
> > Sadly MediaTek have to do this to make our UniPro to enter low-power
> > mode.
> > 
> >> The cap UFSHCD_CAP_HIBERN8_WITH_CLK_GATING means UIC link state
> >> must not be HIBERN8 after ufshcd_hold(async=false) returns.
> > 
> > If driver is not in PM scenarios, e.g., suspended, above statement 
> > shall
> > be always followed. But two obvious violations are existed,
> > 
> > 1. In ufshcd_suspend(), link is set as HIBERN8 behind ufshcd_hold()
> > 2. In ufshcd_resume(), link is set back as Active before
> > ufshcd_release() is invoked
> > 
> > So as my understanding, special conditions are allowed in PM scenarios,
> > and this is why "hba->clk_gating.is_suspended" is introduced. By this
> > thought, I used "hba->clk_gating.is_suspended" in this patch as the
> > mandatory condition to allow ufshcd_hold() usage in vendor suspend and
> > resume callbacks.
> > 
> > 
> >> Instead of bailing out from that loop, which makes the logic of
> >> ufshcd_hold and clk gating even more complex, how about removing
> >> ufshcd_hold/release from ufshcd_send_uic_cmd()? I think they are
> >> redundant and we should never send DME cmds if clocks/powers are
> >> not ready. I mean callers should make sure they are ready to send
> >> DME cmds (and only callers know when), but not leave that job to
> >> ufshcd_send_uic_cmd(). It is convenient to remove ufshcd_hold/
> >> release from ufshcd_send_uic_cmd() as there are not many places
> >> sending DME cmds without holding the clocks, ufs_bsg.c is one.
> >> And I have tested my idea on my setup, it worked well for me.
> >> Another benefit is that it also allows us to use DME cmds
> >> in clk gating/ungating contexts if we need to in the future.
> >> 
> > 
> > Brilliant idea! But this may not solve problems if vendor callbacks 
> > need
> > more than UIC commands in the future.
> > 
> > This simple patch could make all vendor operations on UFSHCI in PM
> > callbacks possible with UFSHCD_CAP_HIBERN8_WITH_CLK_GATING enabled, and
> > again, it allows those operations in PM scenarios only.
> > 
> 
> Other than UIC cmds, I can only think of device manangement cmds (like 
> query).
> If device management cmds come into the way in the future, we fix it as 
> well.
> I mean that is the right thing to do in my opinion - just like we don't 
> call
> pm_runtime_get_sync() in ufshcd_send_uic_cmd().
> 
> I can understand that you want a simple/quick fix to get it work for you 
> once
> for all, but from my point of view, debugging clk gating/ungating really 
> takes
> huge efforts sometime (I've spent a lot of time on it). Some flash 
> vendors also
> use it in their own driver widely which makes some failure scenes even 
> harder to
> undertand/debug. So the first thing comes to my head is that we should 
> avoid
> making it more complex or giving it more exceptions.
> 
>  From functionality point of view, it looks ok to me. It is just that I 
> cannot
> predict it won't cause new problems since the clk gating/ungating 
> sequeces are
> like magic in some use cases sometime.

Thanks for the functionality review.

I totally understand what you mentioned above about the clk-gating
debugging because we also spent lots of time for issue analysis.

I just finished some fault injection for this patch in our platform, the
results are fine.

The active window of this patch is limited: Starting from
ufshcd_link_state_transition() in ufshcd_suspend to ufshcd_vops_resume()
in ufshcd_resume() because the link is back to LINKUP state in MediaTek
resume callback. So I was focus on injecting errors in our callbacks
between this period and most of injected fails triggered host and device
reset flow.

For example,
Suspend: UniPro PowerDownControl timeout
Resume: hba_enable timeout
Resume: UniPro PowerDownControl timeout
Resume: HIBERN8 Exit timeout

Hope these tests can ease your concerns.

Thanks,
Stanley Chu

> 
> Thanks,
> 
> Can Guo.
> 
> >> Please let me know your idea, thanks.
> >> 
> >> Can Guo.
> > 
> > Thanks,
> > Stanley Chu
> > 
> >> 
> >> > Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
> >> > Signed-off-by: Andy Teng <andy.teng@mediatek.com>
> >> >
> >> > ---
> >> >
> >> > Changes since v1:
> >> > - Fix return value: Use unique bool variable to get the result of
> >> > flush_work(). Thcan prevent incorrect returned value, i.e., rc, if
> >> > flush_work() returns true
> >> > - Fix commit message
> >> >
> >> > ---
> >> >  drivers/scsi/ufs/ufshcd.c | 5 ++++-
> >> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> >> > index 577cc0d7487f..acba2271c5d3 100644
> >> > --- a/drivers/scsi/ufs/ufshcd.c
> >> > +++ b/drivers/scsi/ufs/ufshcd.c
> >> > @@ -1561,6 +1561,7 @@ static void ufshcd_ungate_work(struct work_struct
> >> > *work)
> >> >  int ufshcd_hold(struct ufs_hba *hba, bool async)
> >> >  {
> >> >  	int rc = 0;
> >> > +	bool flush_result;
> >> >  	unsigned long flags;
> >> >
> >> >  	if (!ufshcd_is_clkgating_allowed(hba))
> >> > @@ -1592,7 +1593,9 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
> >> >  				break;
> >> >  			}
> >> >  			spin_unlock_irqrestore(hba->host->host_lock, flags);
> >> > -			flush_work(&hba->clk_gating.ungate_work);
> >> > +			flush_result = flush_work(&hba->clk_gating.ungate_work);
> >> > +			if (hba->clk_gating.is_suspended && !flush_result)
> >> > +				goto out;
> >> >  			spin_lock_irqsave(hba->host->host_lock, flags);
> >> >  			goto start;
> >> >  		}


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold
       [not found]           ` <BYAPR04MB4629D947CAC7E07FF94D55F2FC4A0@BYAPR04MB4629.namprd04.prod.outlook.com>
@ 2020-08-07  0:39             ` Stanley Chu
  0 siblings, 0 replies; 6+ messages in thread
From: Stanley Chu @ 2020-08-07  0:39 UTC (permalink / raw)
  To: Avri Altman
  Cc: Can Guo, linux-scsi, martin.petersen, alim.akhtar, jejb,
	bvanassche, beanhuo, asutoshd, matthias.bgg, linux-mediatek,
	linux-arm-kernel, linux-kernel,
	Kuohong Wang (王國鴻),
	Peter Wang (王信友),
	Chun-Hung Wu (巫駿宏),
	Andy Teng (鄧如宏),
	Chaotian Jing (井朝天),
	CC Chou (周志杰)

Hi Avri,


On Tue, 2020-08-04 at 06:10 +0000, Avri Altman wrote:
> > Hi Avri,
> > 
> > Would you have any suggestions for this patch?
> > 
> > We need this patch to enable UFSHCD_CAP_HIBERN8_WITH_CLK_GATING in
> > MediaTek platform.
> > 
> > Thanks a lot,
> > 
> > Stanley Chu
> Let's use it for now as a short-term WA.
> Please work with Can on a proper fix that will utilize his idea to remove ufshcd_hold/release
> From ufshcd_send_uic_cmd, as you are both in agreement that this should work as the long term solution.
> 
> 
> > > > >
> > > > >>
> > > > >> > Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
> > > > >> > Signed-off-by: Andy Teng <andy.teng@mediatek.com>
> Reviewed-by: Avri Altman <avri.altman@wdc.com>

Thanks!

I will continue with the long term solution with Can.

BTW, this mail seems not showing up in LKML or Patchwork, just reply
this mail with removing "[SPAM]" keyword in title.

Thanks,
Stanley Chu


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-08-07  0:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-29  2:40 [PATCH v2] scsi: ufs: Fix possible infinite loop in ufshcd_hold Stanley Chu
2020-07-29  8:43 ` Can Guo
2020-07-29 10:26   ` Stanley Chu
2020-07-29 10:53     ` Can Guo
2020-07-30  7:59       ` Stanley Chu
     [not found]         ` <1596518850.27829.5.camel@mtkswgap22>
     [not found]           ` <BYAPR04MB4629D947CAC7E07FF94D55F2FC4A0@BYAPR04MB4629.namprd04.prod.outlook.com>
2020-08-07  0:39             ` Stanley Chu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).