* [PATCH] x86: PM: Register syscore_ops for scale invariance
@ 2021-01-08 18:05 Rafael J. Wysocki
  2021-01-11 18:36 ` Giovanni Gherdovich
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2021-01-08 18:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linux PM, LKML, x86 Maintainers, Srinivas Pandruvada,
	Giovanni Gherdovich, Giovanni Gherdovich

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

On x86 scale invariance tends to be disabled during resume from
suspend-to-RAM, because the MPERF or APERF MSR values are not as
expected then due to updates taking place after the platform
firmware has been invoked to complete the suspend transition.

That, of course, is not desirable, especially if the schedutil
scaling governor is in use, because the lack of scale invariance
causes it to be less reliable.

To counter that effect, modify init_freq_invariance() to register
a syscore_ops object for scale invariance with the ->resume callback
pointing to init_counter_refs() which will run on the CPU starting
the resume transition (the other CPUs will be taken care of by the
"online" operations taking place later).

Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 arch/x86/kernel/smpboot.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Index: linux-pm/arch/x86/kernel/smpboot.c
===================================================================
--- linux-pm.orig/arch/x86/kernel/smpboot.c
+++ linux-pm/arch/x86/kernel/smpboot.c
@@ -56,6 +56,7 @@
 #include <linux/numa.h>
 #include <linux/pgtable.h>
 #include <linux/overflow.h>
+#include <linux/syscore_ops.h>
 
 #include <asm/acpi.h>
 #include <asm/desc.h>
@@ -2083,6 +2084,23 @@ static void init_counter_refs(void)
 	this_cpu_write(arch_prev_mperf, mperf);
 }
 
+#ifdef CONFIG_PM_SLEEP
+static struct syscore_ops freq_invariance_syscore_ops = {
+	.resume = init_counter_refs,
+};
+
+static void register_freq_invariance_syscore_ops(void)
+{
+	/* Bail out if registered already. */
+	if (freq_invariance_syscore_ops.node.prev)
+		return;
+
+	register_syscore_ops(&freq_invariance_syscore_ops);
+}
+#else
+static inline void register_freq_invariance_syscore_ops(void) {}
+#endif
+
 static void init_freq_invariance(bool secondary, bool cppc_ready)
 {
 	bool ret = false;
@@ -2109,6 +2127,7 @@ static void init_freq_invariance(bool se
 	if (ret) {
 		init_counter_refs();
 		static_branch_enable(&arch_scale_freq_key);
+		register_freq_invariance_syscore_ops();
 		pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
 	} else {
 		pr_debug("Couldn't determine max cpu frequency, necessary for scale-invariant accounting.\n");
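
The registration guard in the patch above can be illustrated outside the
kernel. What follows is a minimal userspace sketch, not kernel code: every
name prefixed with model_ is hypothetical, the list type is a stand-in for
the kernel's list_head, and the resume walk is assumed to simply call each
registered ->resume hook in list order. It shows why checking node.prev is
enough to make registration idempotent: a statically initialized object
starts with a NULL prev pointer, and linking it into the list makes that
pointer non-NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (hypothetical name). */
struct model_list {
	struct model_list *next, *prev;
};

/* Simplified model of struct syscore_ops: only the ->resume hook is kept. */
struct model_syscore_ops {
	struct model_list node;
	void (*resume)(void);
};

/* Empty circular list: the head points at itself. */
static struct model_list model_ops_list = { &model_ops_list, &model_ops_list };

/* Append ops at the tail of the list. */
static void model_register_syscore_ops(struct model_syscore_ops *ops)
{
	ops->node.prev = model_ops_list.prev;
	ops->node.next = &model_ops_list;
	model_ops_list.prev->next = &ops->node;
	model_ops_list.prev = &ops->node;
}

/* Walk the list and run every ->resume hook; returns how many hooks ran. */
static int model_syscore_resume(void)
{
	int calls = 0;
	struct model_list *n;

	for (n = model_ops_list.next; n != &model_ops_list; n = n->next) {
		struct model_syscore_ops *ops = (struct model_syscore_ops *)
			((char *)n - offsetof(struct model_syscore_ops, node));

		if (ops->resume) {
			ops->resume();
			calls++;
		}
	}
	return calls;
}

/* Counts how often the (modelled) counter references were re-read. */
static int model_refs_reinitialized;

static void model_init_counter_refs(void)
{
	model_refs_reinitialized++;
}

/* Statically initialized, so node.next and node.prev start out NULL. */
static struct model_syscore_ops model_freq_ops = {
	.resume = model_init_counter_refs,
};

static void model_register_freq_invariance_ops(void)
{
	/* Bail out if registered already: a linked node has a non-NULL prev. */
	if (model_freq_ops.node.prev)
		return;

	model_register_syscore_ops(&model_freq_ops);
}
```

Calling model_register_freq_invariance_ops() twice and then
model_syscore_resume() once runs the hook a single time. Without the guard,
the second tail insertion would make the node its own predecessor and
corrupt the list.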





* Re: [PATCH] x86: PM: Register syscore_ops for scale invariance
  2021-01-08 18:05 [PATCH] x86: PM: Register syscore_ops for scale invariance Rafael J. Wysocki
@ 2021-01-11 18:36 ` Giovanni Gherdovich
  2021-01-12 15:01 ` Peter Zijlstra
  2021-01-19 16:09 ` [tip: sched/urgent] " tip-bot2 for Rafael J. Wysocki
  2 siblings, 0 replies; 7+ messages in thread
From: Giovanni Gherdovich @ 2021-01-11 18:36 UTC (permalink / raw)
  To: Rafael J. Wysocki, Peter Zijlstra
  Cc: Linux PM, LKML, x86 Maintainers, Srinivas Pandruvada

On Fri, 2021-01-08 at 19:05 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> On x86 scale invariance tends to be disabled during resume from
> suspend-to-RAM, because the MPERF or APERF MSR values are not as
> expected then due to updates taking place after the platform
> firmware has been invoked to complete the suspend transition.
> 
> That, of course, is not desirable, especially if the schedutil
> scaling governor is in use, because the lack of scale invariance
> causes it to be less reliable.
> 
> To counter that effect, modify init_freq_invariance() to register
> a syscore_ops object for scale invariance with the ->resume callback
> pointing to init_counter_refs() which will run on the CPU starting
> the resume transition (the other CPUs will be taken care of by the
> "online" operations taking place later).
> 
> Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> <snip>

Thanks for writing this, Rafael.

Peter Zijlstra asked for this problem to be fixed months ago; I started
but got stuck and never finished.


Giovanni Gherdovich


* Re: [PATCH] x86: PM: Register syscore_ops for scale invariance
  2021-01-08 18:05 [PATCH] x86: PM: Register syscore_ops for scale invariance Rafael J. Wysocki
  2021-01-11 18:36 ` Giovanni Gherdovich
@ 2021-01-12 15:01 ` Peter Zijlstra
  2021-01-12 15:10   ` Rafael J. Wysocki
  2021-01-19 16:09 ` [tip: sched/urgent] " tip-bot2 for Rafael J. Wysocki
  2 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2021-01-12 15:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, LKML, x86 Maintainers, Srinivas Pandruvada,
	Giovanni Gherdovich, Giovanni Gherdovich

On Fri, Jan 08, 2021 at 07:05:59PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> On x86 scale invariance tends to be disabled during resume from
> suspend-to-RAM, because the MPERF or APERF MSR values are not as
> expected then due to updates taking place after the platform
> firmware has been invoked to complete the suspend transition.
> 
> That, of course, is not desirable, especially if the schedutil
> scaling governor is in use, because the lack of scale invariance
> causes it to be less reliable.
> 
> To counter that effect, modify init_freq_invariance() to register
> a syscore_ops object for scale invariance with the ->resume callback
> pointing to init_counter_refs() which will run on the CPU starting
> the resume transition (the other CPUs will be taken care of by the
> "online" operations taking place later).
> 
> Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Thanks! I'll take it through the sched/urgent tree?


* Re: [PATCH] x86: PM: Register syscore_ops for scale invariance
  2021-01-12 15:01 ` Peter Zijlstra
@ 2021-01-12 15:10   ` Rafael J. Wysocki
  2021-01-19 15:12     ` Rafael J. Wysocki
  0 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2021-01-12 15:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Rafael J. Wysocki, Linux PM, LKML, x86 Maintainers,
	Srinivas Pandruvada, Giovanni Gherdovich, Giovanni Gherdovich

On Tue, Jan 12, 2021 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Jan 08, 2021 at 07:05:59PM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > On x86 scale invariance tends to be disabled during resume from
> > suspend-to-RAM, because the MPERF or APERF MSR values are not as
> > expected then due to updates taking place after the platform
> > firmware has been invoked to complete the suspend transition.
> >
> > That, of course, is not desirable, especially if the schedutil
> > scaling governor is in use, because the lack of scale invariance
> > causes it to be less reliable.
> >
> > To counter that effect, modify init_freq_invariance() to register
> > a syscore_ops object for scale invariance with the ->resume callback
> > pointing to init_counter_refs() which will run on the CPU starting
> > the resume transition (the other CPUs will be taken care of by the
> > "online" operations taking place later).
> >
> > Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Thanks! I'll take it through the sched/urgent tree?

That works, thanks!


* Re: [PATCH] x86: PM: Register syscore_ops for scale invariance
  2021-01-12 15:10   ` Rafael J. Wysocki
@ 2021-01-19 15:12     ` Rafael J. Wysocki
  2021-01-19 16:03       ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2021-01-19 15:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Peter Zijlstra, Rafael J. Wysocki, Linux PM, LKML,
	x86 Maintainers, Srinivas Pandruvada, Giovanni Gherdovich,
	Giovanni Gherdovich

On Tue, Jan 12, 2021 at 4:10 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Jan 12, 2021 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Fri, Jan 08, 2021 at 07:05:59PM +0100, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > On x86 scale invariance tends to be disabled during resume from
> > > suspend-to-RAM, because the MPERF or APERF MSR values are not as
> > > expected then due to updates taking place after the platform
> > > firmware has been invoked to complete the suspend transition.
> > >
> > > That, of course, is not desirable, especially if the schedutil
> > > scaling governor is in use, because the lack of scale invariance
> > > causes it to be less reliable.
> > >
> > > To counter that effect, modify init_freq_invariance() to register
> > > a syscore_ops object for scale invariance with the ->resume callback
> > > pointing to init_counter_refs() which will run on the CPU starting
> > > the resume transition (the other CPUs will be taken care of by the
> > > "online" operations taking place later).
> > >
> > > Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Thanks! I'll take it through the sched/urgent tree?
>
> That works, thanks!

Any news on this front?  It's been a few days ...


* Re: [PATCH] x86: PM: Register syscore_ops for scale invariance
  2021-01-19 15:12     ` Rafael J. Wysocki
@ 2021-01-19 16:03       ` Peter Zijlstra
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2021-01-19 16:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, LKML, x86 Maintainers,
	Srinivas Pandruvada, Giovanni Gherdovich, Giovanni Gherdovich

On Tue, Jan 19, 2021 at 04:12:20PM +0100, Rafael J. Wysocki wrote:
> On Tue, Jan 12, 2021 at 4:10 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Tue, Jan 12, 2021 at 4:02 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Fri, Jan 08, 2021 at 07:05:59PM +0100, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > >
> > > > On x86 scale invariance tends to be disabled during resume from
> > > > suspend-to-RAM, because the MPERF or APERF MSR values are not as
> > > > expected then due to updates taking place after the platform
> > > > firmware has been invoked to complete the suspend transition.
> > > >
> > > > That, of course, is not desirable, especially if the schedutil
> > > > scaling governor is in use, because the lack of scale invariance
> > > > causes it to be less reliable.
> > > >
> > > > To counter that effect, modify init_freq_invariance() to register
> > > > a syscore_ops object for scale invariance with the ->resume callback
> > > > pointing to init_counter_refs() which will run on the CPU starting
> > > > the resume transition (the other CPUs will be taken care of by the
> > > > "online" operations taking place later).
> > > >
> > > > Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > Thanks! I'll take it through the sched/urgent tree?
> >
> > That works, thanks!
> 
> Any news on this front?  It's been a few days ...

My bad, it's been held up behind me trying to fix another sched
regression. Lemme push out just this one so it doesn't go walk-about.


* [tip: sched/urgent] x86: PM: Register syscore_ops for scale invariance
  2021-01-08 18:05 [PATCH] x86: PM: Register syscore_ops for scale invariance Rafael J. Wysocki
  2021-01-11 18:36 ` Giovanni Gherdovich
  2021-01-12 15:01 ` Peter Zijlstra
@ 2021-01-19 16:09 ` tip-bot2 for Rafael J. Wysocki
  2 siblings, 0 replies; 7+ messages in thread
From: tip-bot2 for Rafael J. Wysocki @ 2021-01-19 16:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Rafael J. Wysocki, Peter Zijlstra (Intel),
	Giovanni Gherdovich, x86, linux-kernel

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     9c7d9017a49fb8516c13b7bff59b7da2abed23e1
Gitweb:        https://git.kernel.org/tip/9c7d9017a49fb8516c13b7bff59b7da2abed23e1
Author:        Rafael J. Wysocki <rafael.j.wysocki@intel.com>
AuthorDate:    Fri, 08 Jan 2021 19:05:59 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 19 Jan 2021 17:04:03 +01:00

x86: PM: Register syscore_ops for scale invariance

On x86 scale invariance tends to be disabled during resume from
suspend-to-RAM, because the MPERF or APERF MSR values are not as
expected then due to updates taking place after the platform
firmware has been invoked to complete the suspend transition.

That, of course, is not desirable, especially if the schedutil
scaling governor is in use, because the lack of scale invariance
causes it to be less reliable.

To counter that effect, modify init_freq_invariance() to register
a syscore_ops object for scale invariance with the ->resume callback
pointing to init_counter_refs() which will run on the CPU starting
the resume transition (the other CPUs will be taken care of by the
"online" operations taking place later).

Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Link: https://lkml.kernel.org/r/1803209.Mvru99baaF@kreacher
---
 arch/x86/kernel/smpboot.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8ca66af..117e24f 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -56,6 +56,7 @@
 #include <linux/numa.h>
 #include <linux/pgtable.h>
 #include <linux/overflow.h>
+#include <linux/syscore_ops.h>
 
 #include <asm/acpi.h>
 #include <asm/desc.h>
@@ -2083,6 +2084,23 @@ static void init_counter_refs(void)
 	this_cpu_write(arch_prev_mperf, mperf);
 }
 
+#ifdef CONFIG_PM_SLEEP
+static struct syscore_ops freq_invariance_syscore_ops = {
+	.resume = init_counter_refs,
+};
+
+static void register_freq_invariance_syscore_ops(void)
+{
+	/* Bail out if registered already. */
+	if (freq_invariance_syscore_ops.node.prev)
+		return;
+
+	register_syscore_ops(&freq_invariance_syscore_ops);
+}
+#else
+static inline void register_freq_invariance_syscore_ops(void) {}
+#endif
+
 static void init_freq_invariance(bool secondary, bool cppc_ready)
 {
 	bool ret = false;
@@ -2109,6 +2127,7 @@ static void init_freq_invariance(bool secondary, bool cppc_ready)
 	if (ret) {
 		init_counter_refs();
 		static_branch_enable(&arch_scale_freq_key);
+		register_freq_invariance_syscore_ops();
 		pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
 	} else {
 		pr_debug("Couldn't determine max cpu frequency, necessary for scale-invariant accounting.\n");
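
To see why stale counter references matter, the arithmetic can be sketched
in userspace. This is a simplified model under stated assumptions, not the
kernel's implementation: the model_ names, the 10-bit capacity shift and
the scenario of counters restarting from low values across suspend are all
illustrative, and __builtin_mul_overflow is a GCC/Clang builtin. The idea is
that the scale factor is derived from APERF and MPERF deltas against saved
references, with overflow checks of the kind the Fixes: commit title refers
to. If the references predate the suspend, the unsigned deltas can wrap to
huge values and the checks reject them, which is the situation re-reading
the references on resume is meant to avoid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CAPACITY_SHIFT 10
#define MODEL_CAPACITY_SCALE (1ULL << MODEL_CAPACITY_SHIFT)

/* Saved counter values from the previous sample (hypothetical struct). */
struct model_counter_refs {
	uint64_t prev_aperf;
	uint64_t prev_mperf;
};

/*
 * Compute a frequency scale factor in [1, MODEL_CAPACITY_SCALE] from the
 * current counter readings. Returns false when the deltas are unusable.
 */
static bool model_freq_scale(struct model_counter_refs *refs,
			     uint64_t aperf, uint64_t mperf,
			     uint64_t max_freq_ratio, uint64_t *scale)
{
	/* Unsigned subtraction wraps if a counter is below its reference. */
	uint64_t acnt = aperf - refs->prev_aperf;
	uint64_t mcnt = mperf - refs->prev_mperf;
	uint64_t result;

	refs->prev_aperf = aperf;
	refs->prev_mperf = mperf;

	/* Reject acnt if shifting it left by 2 * SHIFT bits would overflow. */
	if (acnt >> (64 - 2 * MODEL_CAPACITY_SHIFT))
		return false;
	acnt <<= 2 * MODEL_CAPACITY_SHIFT;

	/* Reject an overflowing or zero denominator. */
	if (__builtin_mul_overflow(mcnt, max_freq_ratio, &mcnt) || mcnt == 0)
		return false;

	result = acnt / mcnt;
	if (result == 0)
		return false;

	/* Clamp to the maximum capacity. */
	*scale = result > MODEL_CAPACITY_SCALE ? MODEL_CAPACITY_SCALE : result;
	return true;
}
```

With references of 5000000 and readings of 100 (counters that restarted
after the references were taken), the APERF delta wraps and the function
returns false. With references refreshed to 100 and later readings of 1100
(APERF) and 2100 (MPERF) at a ratio of 1024, it yields a scale of 512.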
