linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
@ 2020-06-16  4:04 Chen Yu
  2020-06-22 16:19 ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Chen Yu @ 2020-06-16  4:04 UTC (permalink / raw)
  To: Rafael J. Wysocki, Peter Zijlstra
  Cc: Len Brown, Daniel Lezcano, Ingo Molnar, linux-pm, linux-kernel,
	Rui Zhang, Chen Yu

Suspend to idle was found to not work on Goldmont CPU recently.
And the issue was triggered due to:

1. On Goldmont the CPU in idle can only be woken up via IPIs,
   not POLLING mode:
   Commit 08e237fa56a1 ("x86/cpu: Add workaround for MONITOR
   instruction erratum on Goldmont based CPUs")
2. When the CPU is entering suspend to idle process, the
   _TIF_POLLING_NRFLAG is kept on, due to cpuidle_enter_s2idle()
   doesn't properly match call_cpuidle().
3. Commit b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()")
   makes use of _TIF_POLLING_NRFLAG to avoid sending IPIs to
   idle CPUs.
4. As a result, some IPIs related functions might not work
   well during suspend to idle on Goldmont. For example, one
   suspected victim:
   tick_unfreeze() -> timekeeping_resume() -> hrtimers_resume()
   -> clock_was_set() -> on_each_cpu() might wait forever,
   because the IPIs will not be sent to the CPUs which are
   sleeping with _TIF_POLLING_NRFLAG set, and Goldmont CPU
   could not be woken up by only setting _TIF_NEED_RESCHED
   on the monitor address.

I don't find a way in Ubuntu to update the firmware of Goldmont
and check if the issue was gone, a fix patch would do no harm.
Clear the _TIF_POLLING_NRFLAG flag before entering suspend to idle,
and let the driver's enter_s2idle() to decide whether to set
_TIF_POLLING_NRFLAG or not. So that to avoid the scenario described
above and keep the context consistent with before. Also adjust
the naming to be consistent with call_cpuidle().

Fixes: b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()")
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: kbuild test robot <lkp@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
v2: According to Peter's review, v1 is racy, if someone already
    set TIF_NEED_RESCHED this patch just clear POLLING and go to sleep.
    Check TIF_NEED_RESCHED before entering suspend to idle and
    adjust the naming to be consistent with call_cpuidle().
--
 drivers/cpuidle/cpuidle.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index c149d9e20dfd..b003767abebd 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -13,6 +13,7 @@
 #include <linux/mutex.h>
 #include <linux/sched.h>
 #include <linux/sched/clock.h>
+#include <linux/sched/idle.h>
 #include <linux/notifier.h>
 #include <linux/pm_qos.h>
 #include <linux/cpu.h>
@@ -133,8 +134,8 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
 }
 
 #ifdef CONFIG_SUSPEND
-static void enter_s2idle_proper(struct cpuidle_driver *drv,
-				struct cpuidle_device *dev, int index)
+static void s2idle_enter(struct cpuidle_driver *drv,
+			 struct cpuidle_device *dev, int index)
 {
 	ktime_t time_start, time_end;
 
@@ -168,6 +169,15 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv,
 	dev->states_usage[index].s2idle_usage++;
 }
 
+static int call_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev,
+		       int index)
+{
+	if (!current_clr_polling_and_test())
+		s2idle_enter(drv, dev, index);
+
+	return index;
+}
+
 /**
  * cpuidle_enter_s2idle - Enter an idle state suitable for suspend-to-idle.
  * @drv: cpuidle driver for the given CPU.
@@ -187,7 +197,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 	 */
 	index = find_deepest_state(drv, dev, U64_MAX, 0, true);
 	if (index > 0)
-		enter_s2idle_proper(drv, dev, index);
+		call_s2idle(drv, dev, index);
 
 	return index;
 }
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
  2020-06-16  4:04 [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle Chen Yu
@ 2020-06-22 16:19 ` Rafael J. Wysocki
  2020-06-22 17:18   ` Chen Yu
  2020-06-22 17:29   ` Peter Zijlstra
  0 siblings, 2 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-06-22 16:19 UTC (permalink / raw)
  To: Chen Yu, Peter Zijlstra
  Cc: Rafael J. Wysocki, Len Brown, Daniel Lezcano, Ingo Molnar,
	Linux PM, Linux Kernel Mailing List, Rui Zhang

On Tue, Jun 16, 2020 at 6:03 AM Chen Yu <yu.c.chen@intel.com> wrote:
>
> Suspend to idle was found to not work on Goldmont CPU recently.
> And the issue was triggered due to:
>
> 1. On Goldmont the CPU in idle can only be woken up via IPIs,
>    not POLLING mode:
>    Commit 08e237fa56a1 ("x86/cpu: Add workaround for MONITOR
>    instruction erratum on Goldmont based CPUs")
> 2. When the CPU is entering suspend to idle process, the
>    _TIF_POLLING_NRFLAG is kept on, due to cpuidle_enter_s2idle()
>    doesn't properly match call_cpuidle().
> 3. Commit b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()")
>    makes use of _TIF_POLLING_NRFLAG to avoid sending IPIs to
>    idle CPUs.
> 4. As a result, some IPIs related functions might not work
>    well during suspend to idle on Goldmont. For example, one
>    suspected victim:
>    tick_unfreeze() -> timekeeping_resume() -> hrtimers_resume()
>    -> clock_was_set() -> on_each_cpu() might wait forever,
>    because the IPIs will not be sent to the CPUs which are
>    sleeping with _TIF_POLLING_NRFLAG set, and Goldmont CPU
>    could not be woken up by only setting _TIF_NEED_RESCHED
>    on the monitor address.
>
> I don't find a way in Ubuntu to update the firmware of Goldmont
> and check if the issue was gone, a fix patch would do no harm.
> Clear the _TIF_POLLING_NRFLAG flag before entering suspend to idle,
> and let the driver's enter_s2idle() to decide whether to set
> _TIF_POLLING_NRFLAG or not. So that to avoid the scenario described
> above and keep the context consistent with before. Also adjust
> the naming to be consistent with call_cpuidle().
>
> Fixes: b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()")
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Reported-by: kbuild test robot <lkp@intel.com>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>

Peter, any more comments here?

> ---
> v2: According to Peter's review, v1 is racy, if someone already
>     set TIF_NEED_RESCHED this patch just clear POLLING and go to sleep.
>     Check TIF_NEED_RESCHED before entering suspend to idle and
>     adjust the naming to be consistent with call_cpuidle().
> --
>  drivers/cpuidle/cpuidle.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index c149d9e20dfd..b003767abebd 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -13,6 +13,7 @@
>  #include <linux/mutex.h>
>  #include <linux/sched.h>
>  #include <linux/sched/clock.h>
> +#include <linux/sched/idle.h>
>  #include <linux/notifier.h>
>  #include <linux/pm_qos.h>
>  #include <linux/cpu.h>
> @@ -133,8 +134,8 @@ int cpuidle_find_deepest_state(struct cpuidle_driver *drv,
>  }
>
>  #ifdef CONFIG_SUSPEND
> -static void enter_s2idle_proper(struct cpuidle_driver *drv,
> -                               struct cpuidle_device *dev, int index)
> +static void s2idle_enter(struct cpuidle_driver *drv,
> +                        struct cpuidle_device *dev, int index)
>  {
>         ktime_t time_start, time_end;
>
> @@ -168,6 +169,15 @@ static void enter_s2idle_proper(struct cpuidle_driver *drv,
>         dev->states_usage[index].s2idle_usage++;
>  }
>
> +static int call_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> +                      int index)
> +{
> +       if (!current_clr_polling_and_test())
> +               s2idle_enter(drv, dev, index);
> +
> +       return index;

Is the value returned here used at all?

> +}
> +
>  /**
>   * cpuidle_enter_s2idle - Enter an idle state suitable for suspend-to-idle.
>   * @drv: cpuidle driver for the given CPU.
> @@ -187,7 +197,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>          */
>         index = find_deepest_state(drv, dev, U64_MAX, 0, true);
>         if (index > 0)
> -               enter_s2idle_proper(drv, dev, index);
> +               call_s2idle(drv, dev, index);

I'm wondering why this can't be

    if (index > 0 && !current_clr_polling_and_test())
            enter_s2idle_proper(drv, dev, index);

>         return index;
>  }
> --

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
  2020-06-22 16:19 ` Rafael J. Wysocki
@ 2020-06-22 17:18   ` Chen Yu
  2020-06-22 19:45     ` Rafael J. Wysocki
  2020-06-22 17:29   ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: Chen Yu @ 2020-06-22 17:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Zijlstra, Len Brown, Daniel Lezcano, Ingo Molnar, Linux PM,
	Linux Kernel Mailing List, Rui Zhang

Hi Rafael,
On Mon, Jun 22, 2020 at 06:19:35PM +0200, Rafael J. Wysocki wrote:
[cut]
> > +{
> > +       if (!current_clr_polling_and_test())
> > +               s2idle_enter(drv, dev, index);
> > +
> > +       return index;
> 
> Is the value returned here used at all?
>
It is not used for now IMO.
> >          */
> >         index = find_deepest_state(drv, dev, U64_MAX, 0, true);
> >         if (index > 0)
> > -               enter_s2idle_proper(drv, dev, index);
> > +               call_s2idle(drv, dev, index);
> 
> I'm wondering why this can't be
> 
>     if (index > 0 && !current_clr_polling_and_test())
>             enter_s2idle_proper(drv, dev, index);
> 
Yes, it should be simpler, but I guess Peter was trying to
make call_s2idle() consistent with call_cpuidle(),
and also s2idle_enter() is analogous to cpuidle_enter().

Thanks,
Chenyu

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
  2020-06-22 16:19 ` Rafael J. Wysocki
  2020-06-22 17:18   ` Chen Yu
@ 2020-06-22 17:29   ` Peter Zijlstra
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2020-06-22 17:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Chen Yu, Len Brown, Daniel Lezcano, Ingo Molnar, Linux PM,
	Linux Kernel Mailing List, Rui Zhang

On Mon, Jun 22, 2020 at 06:19:35PM +0200, Rafael J. Wysocki wrote:
> > Fixes: b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()")
> > Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Reported-by: kbuild test robot <lkp@intel.com>
> > Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> > Cc: Len Brown <lenb@kernel.org>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> 
> Peter, any more comments here?

Only that the whole s2idle stuff could do with a cleanup :-)

> > +static int call_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> > +                      int index)
> > +{
> > +       if (!current_clr_polling_and_test())
> > +               s2idle_enter(drv, dev, index);
> > +
> > +       return index;
> 
> Is the value returned here used at all?
> 
> > +}
> > +
> >  /**
> >   * cpuidle_enter_s2idle - Enter an idle state suitable for suspend-to-idle.
> >   * @drv: cpuidle driver for the given CPU.
> > @@ -187,7 +197,7 @@ int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev)
> >          */
> >         index = find_deepest_state(drv, dev, U64_MAX, 0, true);
> >         if (index > 0)
> > -               enter_s2idle_proper(drv, dev, index);
> > +               call_s2idle(drv, dev, index);
> 
> I'm wondering why this can't be
> 
>     if (index > 0 && !current_clr_polling_and_test())
>             enter_s2idle_proper(drv, dev, index);

Works for me. Some Wysocki guy wrote much of it, best ask him :-)

The thing that confused me is that all this is way different from the
normal idle path and didn't keep the invariants.

Ideally; much of that gets folded back into the normal patch somehow.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
  2020-06-22 17:18   ` Chen Yu
@ 2020-06-22 19:45     ` Rafael J. Wysocki
  2020-06-23  1:56       ` Chen Yu
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-06-22 19:45 UTC (permalink / raw)
  To: Chen Yu
  Cc: Rafael J. Wysocki, Peter Zijlstra, Len Brown, Daniel Lezcano,
	Ingo Molnar, Linux PM, Linux Kernel Mailing List, Rui Zhang

On Mon, Jun 22, 2020 at 7:16 PM Chen Yu <yu.c.chen@intel.com> wrote:
>
> Hi Rafael,
> On Mon, Jun 22, 2020 at 06:19:35PM +0200, Rafael J. Wysocki wrote:
> [cut]
> > > +{
> > > +       if (!current_clr_polling_and_test())
> > > +               s2idle_enter(drv, dev, index);
> > > +
> > > +       return index;
> >
> > Is the value returned here used at all?
> >
> It is not used for now IMO.
> > >          */
> > >         index = find_deepest_state(drv, dev, U64_MAX, 0, true);
> > >         if (index > 0)
> > > -               enter_s2idle_proper(drv, dev, index);
> > > +               call_s2idle(drv, dev, index);
> >
> > I'm wondering why this can't be
> >
> >     if (index > 0 && !current_clr_polling_and_test())
> >             enter_s2idle_proper(drv, dev, index);
> >
> Yes, it should be simpler, but I guess Peter was trying to
> make call_s2idle() consistent with call_cpuidle(),
> and also s2idle_enter() is analogous to cpuidle_enter().

So IMO it would be better to do the simplest fix first and then do the
cleanup on top of it.

Thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle
  2020-06-22 19:45     ` Rafael J. Wysocki
@ 2020-06-23  1:56       ` Chen Yu
  0 siblings, 0 replies; 6+ messages in thread
From: Chen Yu @ 2020-06-23  1:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Zijlstra, Len Brown, Daniel Lezcano, Ingo Molnar, Linux PM,
	Linux Kernel Mailing List, Rui Zhang

On Mon, Jun 22, 2020 at 09:45:35PM +0200, Rafael J. Wysocki wrote:
> On Mon, Jun 22, 2020 at 7:16 PM Chen Yu <yu.c.chen@intel.com> wrote:
> >
> > Hi Rafael,
> > On Mon, Jun 22, 2020 at 06:19:35PM +0200, Rafael J. Wysocki wrote:
> > [cut]
> > > > +{
> > > > +       if (!current_clr_polling_and_test())
> > > > +               s2idle_enter(drv, dev, index);
> > > > +
> > > > +       return index;
> > >
> > > Is the value returned here used at all?
> > >
> > It is not used for now IMO.
> > > >          */
> > > >         index = find_deepest_state(drv, dev, U64_MAX, 0, true);
> > > >         if (index > 0)
> > > > -               enter_s2idle_proper(drv, dev, index);
> > > > +               call_s2idle(drv, dev, index);
> > >
> > > I'm wondering why this can't be
> > >
> > >     if (index > 0 && !current_clr_polling_and_test())
> > >             enter_s2idle_proper(drv, dev, index);
> > >
> > Yes, it should be simpler, but I guess Peter was trying to
> > make call_s2idle() consistent with call_cpuidle(),
> > and also s2idle_enter() is analogous to cpuidle_enter().
> 
> So IMO it would be better to do the simplest fix first and then do the
> cleanup on top of it.
>
Okay, I'll do that.

Thanks,
Chenyu

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-06-23  1:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-16  4:04 [PATCH][v2] PM / s2idle: Clear _TIF_POLLING_NRFLAG before suspend to idle Chen Yu
2020-06-22 16:19 ` Rafael J. Wysocki
2020-06-22 17:18   ` Chen Yu
2020-06-22 19:45     ` Rafael J. Wysocki
2020-06-23  1:56       ` Chen Yu
2020-06-22 17:29   ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).