linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86: add workaround monitor bug
@ 2016-07-06 17:01 Jacob Pan
  2016-07-08  8:55 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Jacob Pan @ 2016-07-06 17:01 UTC (permalink / raw)
  To: LKML, Peter Zijlstra, Ingo Molnar, H. Peter Anvin, X86 Kernel
  Cc: Arjan van de Ven, Len Brown, Jacob Pan

From: Peter Zijlstra <peterz@infradead.org>

Monitored cached line may not wake up from mwait on certain
Goldmont based CPUs. This patch will avoid calling
current_set_polling_and_test() and thereby not set the TIF_ flag.
The result is that we'll always send IPIs for wakeups.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/mwait.h       | 2 +-
 arch/x86/kernel/cpu/intel.c        | 5 +++++
 arch/x86/kernel/process.c          | 2 +-
 4 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 78dbd28..197a3f4 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -304,6 +304,7 @@
 #define X86_BUG_SYSRET_SS_ATTRS	X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
 #define X86_BUG_NULL_SEG	X86_BUG(9) /* Nulling a selector preserves the base */
 #define X86_BUG_SWAPGS_FENCE	X86_BUG(10) /* SWAPGS without input dep on GS */
+#define X86_BUG_MONITOR		X86_BUG(11) /* IPI required to wake up remote cpu */
 
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 0deeb2d..f37f2d8 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -97,7 +97,7 @@ static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
  */
 static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
 {
-	if (!current_set_polling_and_test()) {
+	if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
 		if (static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)) {
 			mb();
 			clflush((void *)&current_thread_info()->flags);
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 6e2ffbe..77c6b3e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -13,6 +13,7 @@
 #include <asm/msr.h>
 #include <asm/bugs.h>
 #include <asm/cpu.h>
+#include <asm/intel-family.h>
 
 #ifdef CONFIG_X86_64
 #include <linux/topology.h>
@@ -509,6 +510,10 @@ static void init_intel(struct cpuinfo_x86 *c)
 	    (c->x86_model == 29 || c->x86_model == 46 || c->x86_model == 47))
 		set_cpu_bug(c, X86_BUG_CLFLUSH_MONITOR);
 
+	if (c->x86 == 6 && boot_cpu_has(X86_FEATURE_MWAIT) &&
+		((c->x86_model == INTEL_FAM6_ATOM_GOLDMONT)))
+		set_cpu_bug(c, X86_BUG_MONITOR);
+
 #ifdef CONFIG_X86_64
 	if (c->x86 == 15)
 		c->x86_cache_alignment = c->x86_clflush_size * 2;
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 96becbb..59f68f1 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -404,7 +404,7 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 	if (c->x86_vendor != X86_VENDOR_INTEL)
 		return 0;
 
-	if (!cpu_has(c, X86_FEATURE_MWAIT))
+	if (!cpu_has(c, X86_FEATURE_MWAIT) || static_cpu_has_bug(X86_BUG_MONITOR))
 		return 0;
 
 	return 1;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: add workaround monitor bug
  2016-07-06 17:01 [PATCH] x86: add workaround monitor bug Jacob Pan
@ 2016-07-08  8:55 ` Ingo Molnar
  2016-07-08 11:45   ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2016-07-08  8:55 UTC (permalink / raw)
  To: Jacob Pan
  Cc: LKML, Peter Zijlstra, Ingo Molnar, H. Peter Anvin, X86 Kernel,
	Arjan van de Ven, Len Brown


* Jacob Pan <jacob.jun.pan@linux.intel.com> wrote:

> From: Peter Zijlstra <peterz@infradead.org>
> 
> Monitored cached line may not wake up from mwait on certain
> Goldmont based CPUs. This patch will avoid calling
> current_set_polling_and_test() and thereby not set the TIF_ flag.
> The result is that we'll always send IPIs for wakeups.
> 
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/include/asm/mwait.h       | 2 +-
>  arch/x86/kernel/cpu/intel.c        | 5 +++++
>  arch/x86/kernel/process.c          | 2 +-
>  4 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 78dbd28..197a3f4 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -304,6 +304,7 @@
>  #define X86_BUG_SYSRET_SS_ATTRS	X86_BUG(8) /* SYSRET doesn't fix up SS attrs */
>  #define X86_BUG_NULL_SEG	X86_BUG(9) /* Nulling a selector preserves the base */
>  #define X86_BUG_SWAPGS_FENCE	X86_BUG(10) /* SWAPGS without input dep on GS */
> +#define X86_BUG_MONITOR		X86_BUG(11) /* IPI required to wake up remote cpu */
>  
>  
>  #ifdef CONFIG_X86_32
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index 0deeb2d..f37f2d8 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -97,7 +97,7 @@ static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
>   */
>  static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
>  {
> -	if (!current_set_polling_and_test()) {
> +	if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {

Hm, this might be suboptimal: if MONITOR/MWAIT is implemented by setting the 
exclusive flag for the monitored memory address and then snooping for cache 
invalidation requests for that cache line, then not modifying the ->flags value 
with TIF_POLLING_NRFLAG makes MWAIT not wake up - only the IPI would wake it up.

I think a better approach would be to still optimistically modify the ->flags 
value _AND_ to also send an IPI, to make sure the wakeup is not lost. This means 
that the woken CPU will wake up much faster (no IPI latency).

(The system will still bear the ovehread of sending and receiving the IPI, but 
that cost is unavoidable if there's no other workaround for this erratum.)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: add workaround monitor bug
  2016-07-08  8:55 ` Ingo Molnar
@ 2016-07-08 11:45   ` Peter Zijlstra
  2016-07-08 12:07     ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2016-07-08 11:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jacob Pan, LKML, Ingo Molnar, H. Peter Anvin, X86 Kernel,
	Arjan van de Ven, Len Brown

On Fri, Jul 08, 2016 at 10:55:15AM +0200, Ingo Molnar wrote:

> >  static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
> >  {
> > -	if (!current_set_polling_and_test()) {
> > +	if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
> 
> Hm, this might be suboptimal: if MONITOR/MWAIT is implemented by setting the 
> exclusive flag for the monitored memory address and then snooping for cache 
> invalidation requests for that cache line, then not modifying the ->flags value 
> with TIF_POLLING_NRFLAG makes MWAIT not wake up - only the IPI would wake it up.

Confused.. POLLING_NRFLAGS is not used to wake up ever. It is only used
to determine if we want to send IPIs or not.

And since we _must_ send an IPI in this case, because the monitor is
busted, we cannot set this.

> I think a better approach would be to still optimistically modify the ->flags 
> value _AND_ to also send an IPI, to make sure the wakeup is not lost. This means 
> that the woken CPU will wake up much faster (no IPI latency).

This is exactly what is done. See resched_curr()'s use of
set_nr_and_not_polling(). That does:

	if (!(fetch_or(&flags, NEED_RESCHED) & POLLING_NRFLAG))
		smp_send_reschedule(cpu);

So we unconditionally set NEED_RESCHED, if, when we set that, POLLING
was set, we skip the IPI.

So again, since monitor is busted, simply setting NEED_RESCHED will not
wake us, we must send the IPI, this is achieved by not setting
POLLING_NRFLAG.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: add workaround monitor bug
  2016-07-08 11:45   ` Peter Zijlstra
@ 2016-07-08 12:07     ` Ingo Molnar
  2016-07-08 19:14       ` Jacob Pan
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2016-07-08 12:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jacob Pan, LKML, Ingo Molnar, H. Peter Anvin, X86 Kernel,
	Arjan van de Ven, Len Brown


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Jul 08, 2016 at 10:55:15AM +0200, Ingo Molnar wrote:
> 
> > >  static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
> > >  {
> > > -	if (!current_set_polling_and_test()) {
> > > +	if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
> > 
> > Hm, this might be suboptimal: if MONITOR/MWAIT is implemented by setting the 
> > exclusive flag for the monitored memory address and then snooping for cache 
> > invalidation requests for that cache line, then not modifying the ->flags value 
> > with TIF_POLLING_NRFLAG makes MWAIT not wake up - only the IPI would wake it up.
> 
> Confused.. POLLING_NRFLAGS is not used to wake up ever. It is only used
> to determine if we want to send IPIs or not.

I called the IPI the 'wakeup' - it's the 'CPU wakeup' :-)

> And since we _must_ send an IPI in this case, because the monitor is
> busted, we cannot set this.
> 
> > I think a better approach would be to still optimistically modify the ->flags 
> > value _AND_ to also send an IPI, to make sure the wakeup is not lost. This means 
> > that the woken CPU will wake up much faster (no IPI latency).
> 
> This is exactly what is done. See resched_curr()'s use of
> set_nr_and_not_polling(). That does:
> 
> 	if (!(fetch_or(&flags, NEED_RESCHED) & POLLING_NRFLAG))
> 		smp_send_reschedule(cpu);
> 
> So we unconditionally set NEED_RESCHED, if, when we set that, POLLING
> was set, we skip the IPI.

Ah, indeed, we set NEED_RESCHED in the same memory address that __monitor() is 
watching so all is good.

> So again, since monitor is busted, simply setting NEED_RESCHED will not
> wake us, we must send the IPI, this is achieved by not setting
> POLLING_NRFLAG.

Yeah, so I got the impression that it might be broken in only certain 
circumstances, or is it completely busted?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: add workaround monitor bug
  2016-07-08 12:07     ` Ingo Molnar
@ 2016-07-08 19:14       ` Jacob Pan
  2016-07-18 18:45         ` Jacob Pan
  0 siblings, 1 reply; 6+ messages in thread
From: Jacob Pan @ 2016-07-08 19:14 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, LKML, Ingo Molnar, H. Peter Anvin, X86 Kernel,
	Arjan van de Ven, Len Brown, jacob.jun.pan

On Fri, 8 Jul 2016 14:07:12 +0200
Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Fri, Jul 08, 2016 at 10:55:15AM +0200, Ingo Molnar wrote:
> > 
> > > >  static inline void mwait_idle_with_hints(unsigned long eax,
> > > > unsigned long ecx) {
> > > > -	if (!current_set_polling_and_test()) {
> > > > +	if (static_cpu_has_bug(X86_BUG_MONITOR)
> > > > || !current_set_polling_and_test()) {
> > > 
> > > Hm, this might be suboptimal: if MONITOR/MWAIT is implemented by
> > > setting the exclusive flag for the monitored memory address and
> > > then snooping for cache invalidation requests for that cache
> > > line, then not modifying the ->flags value with
> > > TIF_POLLING_NRFLAG makes MWAIT not wake up - only the IPI would
> > > wake it up.
> > 
> > Confused.. POLLING_NRFLAGS is not used to wake up ever. It is only
> > used to determine if we want to send IPIs or not.
> 
> I called the IPI the 'wakeup' - it's the 'CPU wakeup' :-)
> 
> > And since we _must_ send an IPI in this case, because the monitor is
> > busted, we cannot set this.
> > 
> > > I think a better approach would be to still optimistically modify
> > > the ->flags value _AND_ to also send an IPI, to make sure the
> > > wakeup is not lost. This means that the woken CPU will wake up
> > > much faster (no IPI latency).
> > 
> > This is exactly what is done. See resched_curr()'s use of
> > set_nr_and_not_polling(). That does:
> > 
> > 	if (!(fetch_or(&flags, NEED_RESCHED) & POLLING_NRFLAG))
> > 		smp_send_reschedule(cpu);
> > 
> > So we unconditionally set NEED_RESCHED, if, when we set that,
> > POLLING was set, we skip the IPI.
> 
> Ah, indeed, we set NEED_RESCHED in the same memory address that
> __monitor() is watching so all is good.
> 
> > So again, since monitor is busted, simply setting NEED_RESCHED will
> > not wake us, we must send the IPI, this is achieved by not setting
> > POLLING_NRFLAG.
> 
> Yeah, so I got the impression that it might be broken in only certain 
> circumstances, or is it completely busted?
> 
That is right, monitor is only partially broken not completely busted. I
don't have the statistics but it is not rare to miss wakeup. The
typical symptom is random slowness and fails to boot, without this
patch.

So doing both can speed up wake up in some cases.

Jacob
> Thanks,
> 
> 	Ingo

[Jacob Pan]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] x86: add workaround monitor bug
  2016-07-08 19:14       ` Jacob Pan
@ 2016-07-18 18:45         ` Jacob Pan
  0 siblings, 0 replies; 6+ messages in thread
From: Jacob Pan @ 2016-07-18 18:45 UTC (permalink / raw)
  To: Jacob Pan
  Cc: Ingo Molnar, Peter Zijlstra, LKML, Ingo Molnar, H. Peter Anvin,
	X86 Kernel, Arjan van de Ven, Len Brown

On Fri, 8 Jul 2016 12:14:33 -0700
Jacob Pan <jacob.jun.pan@linux.intel.com> wrote:

> On Fri, 8 Jul 2016 14:07:12 +0200
> Ingo Molnar <mingo@kernel.org> wrote:
> 
> > 
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > > On Fri, Jul 08, 2016 at 10:55:15AM +0200, Ingo Molnar wrote:
> > > 
> > > > >  static inline void mwait_idle_with_hints(unsigned long eax,
> > > > > unsigned long ecx) {
> > > > > -	if (!current_set_polling_and_test()) {
> > > > > +	if (static_cpu_has_bug(X86_BUG_MONITOR)
> > > > > || !current_set_polling_and_test()) {
> > > > 
> > > > Hm, this might be suboptimal: if MONITOR/MWAIT is implemented by
> > > > setting the exclusive flag for the monitored memory address and
> > > > then snooping for cache invalidation requests for that cache
> > > > line, then not modifying the ->flags value with
> > > > TIF_POLLING_NRFLAG makes MWAIT not wake up - only the IPI would
> > > > wake it up.
> > > 
> > > Confused.. POLLING_NRFLAGS is not used to wake up ever. It is only
> > > used to determine if we want to send IPIs or not.
> > 
> > I called the IPI the 'wakeup' - it's the 'CPU wakeup' :-)
> > 
> > > And since we _must_ send an IPI in this case, because the monitor
> > > is busted, we cannot set this.
> > > 
> > > > I think a better approach would be to still optimistically
> > > > modify the ->flags value _AND_ to also send an IPI, to make
> > > > sure the wakeup is not lost. This means that the woken CPU will
> > > > wake up much faster (no IPI latency).
> > > 
> > > This is exactly what is done. See resched_curr()'s use of
> > > set_nr_and_not_polling(). That does:
> > > 
> > > 	if (!(fetch_or(&flags, NEED_RESCHED) & POLLING_NRFLAG))
> > > 		smp_send_reschedule(cpu);
> > > 
> > > So we unconditionally set NEED_RESCHED, if, when we set that,
> > > POLLING was set, we skip the IPI.
> > 
> > Ah, indeed, we set NEED_RESCHED in the same memory address that
> > __monitor() is watching so all is good.
> > 
> > > So again, since monitor is busted, simply setting NEED_RESCHED
> > > will not wake us, we must send the IPI, this is achieved by not
> > > setting POLLING_NRFLAG.
> > 
> > Yeah, so I got the impression that it might be broken in only
> > certain circumstances, or is it completely busted?
> > 
> That is right, monitor is only partially broken not completely
> busted. I don't have the statistics but it is not rare to miss
> wakeup. The typical symptom is random slowness and fails to boot,
> without this patch.
> 
> So doing both can speed up wake up in some cases.
> 
> Jacob
> > Thanks,
> > 
> > 	Ingo
> 
BTW, I just rebased and sent out v2 to avoid conflict with the commit
below. Could you take this patch for v4.8? it is a critical fix for
affected CPUs.

commit 8709ed4d4b0eab04561c1ec9e6ea50fd1e3897ff
Author: Dave Hansen <dave.hansen@linux.intel.com>
Date:   Fri Jun 17 17:15:03 2016 -0700

    x86/cpu: Fix duplicated X86_BUG(9) macro
    


Thanks,

Jacob

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-07-18 18:47 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-06 17:01 [PATCH] x86: add workaround monitor bug Jacob Pan
2016-07-08  8:55 ` Ingo Molnar
2016-07-08 11:45   ` Peter Zijlstra
2016-07-08 12:07     ` Ingo Molnar
2016-07-08 19:14       ` Jacob Pan
2016-07-18 18:45         ` Jacob Pan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).