linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] X86: don't report PAT on CPUs that don't support it
@ 2017-04-18 19:07 Mikulas Patocka
  2017-04-18 19:28 ` H. Peter Anvin
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Mikulas Patocka @ 2017-04-18 19:07 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin; +Cc: x86, linux-kernel

In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.

However, on AMD K6-3 CPU, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.

The result of this bug is that this warning is produced when attemting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers.

This patch changes pat_enabled() so that it returns true only if pat
initialization was actually done.

Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT
feature on the processor that is being initialized.

x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix
CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35
Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
Call Trace:
 ? __warn+0xab/0xc0
 ? untrack_pfn+0x5c/0x9f
 ? warn_slowpath_null+0xf/0x13
 ? untrack_pfn+0x5c/0x9f
 ? unmap_single_vma+0x43/0x66
 ? unmap_vmas+0x24/0x30
 ? exit_mmap+0x3c/0xa5
 ? __mmput+0xf/0x76
 ? copy_process.part.76+0xb43/0x1129
 ? _do_fork+0x96/0x177
 ? do_int80_syscall_32+0x3e/0x4c
 ? entry_INT80_32+0x2a/0x2a
---[ end trace 8726f9d9fa90d702 ]---
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v4.2+

---
 arch/x86/mm/pat.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Index: linux-2.6/arch/x86/mm/pat.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pat.c
+++ linux-2.6/arch/x86/mm/pat.c
@@ -64,9 +64,11 @@ static int __init nopat(char *str)
 }
 early_param("nopat", nopat);
 
+static bool __read_mostly __pat_initialized = false;
+
 bool pat_enabled(void)
 {
-	return !!__pat_enabled;
+	return __pat_initialized;
 }
 EXPORT_SYMBOL_GPL(pat_enabled);
 
@@ -224,13 +226,14 @@ static void pat_bsp_init(u64 pat)
 	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
+	__pat_initialized = true;
 
 	__init_cache_modes(pat);
 }
 
 static void pat_ap_init(u64 pat)
 {
-	if (!boot_cpu_has(X86_FEATURE_PAT)) {
+	if (!this_cpu_has(X86_FEATURE_PAT)) {
 		/*
 		 * If this happens we are on a secondary CPU, but switched to
 		 * PAT on the boot CPU. We have no way to undo PAT.
@@ -305,7 +308,7 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
+	if (!__pat_enabled) {
 		init_cache_modes();
 		return;
 	}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] X86: don't report PAT on CPUs that don't support it
  2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
@ 2017-04-18 19:28 ` H. Peter Anvin
  2017-04-18 20:47   ` Mikulas Patocka
  2017-05-16 13:57 ` H. Peter Anvin
  2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka
  2 siblings, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2017-04-18 19:28 UTC (permalink / raw)
  To: Mikulas Patocka, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel

On 04/18/17 12:07, Mikulas Patocka wrote:
> In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
> variable is set to 1 by default and the function pat_init() sets
> __pat_enabled to 0 if the CPU doesn't support PAT.
> 
> However, on AMD K6-3 CPU, the processor initialization code never calls
> pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> returns true, even though the K6-3 CPU doesn't support PAT.
> 
> The result of this bug is that this warning is produced when attemting to
> start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> Another symptom of this bug is that the framebuffer driver doesn't set the
> K6-3 MTRR registers.
> 
> This patch changes pat_enabled() so that it returns true only if pat
> initialization was actually done.
> 
> Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
> this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT
> feature on the processor that is being initialized.
> 

I'm thinking it would be better to replace __pat_enabled with
static_cpu_has(X86_FEATURE_PAT) and disable the feature bit if the user
has disabled it on the command line, just as we do with other features.

	-hpa

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] X86: don't report PAT on CPUs that don't support it
  2017-04-18 19:28 ` H. Peter Anvin
@ 2017-04-18 20:47   ` Mikulas Patocka
  2017-05-14 22:07     ` Mikulas Patocka
  0 siblings, 1 reply; 27+ messages in thread
From: Mikulas Patocka @ 2017-04-18 20:47 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel



On Tue, 18 Apr 2017, H. Peter Anvin wrote:

> On 04/18/17 12:07, Mikulas Patocka wrote:
> > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
> > variable is set to 1 by default and the function pat_init() sets
> > __pat_enabled to 0 if the CPU doesn't support PAT.
> > 
> > However, on AMD K6-3 CPU, the processor initialization code never calls
> > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> > returns true, even though the K6-3 CPU doesn't support PAT.
> > 
> > The result of this bug is that this warning is produced when attemting to
> > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> > Another symptom of this bug is that the framebuffer driver doesn't set the
> > K6-3 MTRR registers.
> > 
> > This patch changes pat_enabled() so that it returns true only if pat
> > initialization was actually done.
> > 
> > Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
> > this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT
> > feature on the processor that is being initialized.
> > 
> 
> I'm thinking it would be better to replace __pat_enabled with
> static_cpu_has(X86_FEATURE_PAT) and disable the feature bit if the user
> has disabled it on the command line, just as we do with other features.
> 
> 	-hpa

If MTRR initialization fails for whatever reason, then pat_init() won't be 
called and the kernel would mistakenly believe that PAT is working 
(because there would be no one to clear X86_FEATURE_PAT).

I think that pat should be reported only if pat_init() is called and 
succeeds.


Another strange thing: pat_disable() calls init_cache_modes() - but since 
pat_disable() may not be called at all, it is possible that 
init_cache_modes() is also not called at all. It doesn't produce any 
visible misbehavior on my machine, but it doesn't seem right - we should 
not call init_cache_modes() from pat_disable() and do the initialization 
elsewhere, where it is always called.

Mikulas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] X86: don't report PAT on CPUs that don't support it
  2017-04-18 20:47   ` Mikulas Patocka
@ 2017-05-14 22:07     ` Mikulas Patocka
  0 siblings, 0 replies; 27+ messages in thread
From: Mikulas Patocka @ 2017-05-14 22:07 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel



On Tue, 18 Apr 2017, Mikulas Patocka wrote:

> 
> 
> On Tue, 18 Apr 2017, H. Peter Anvin wrote:
> 
> > On 04/18/17 12:07, Mikulas Patocka wrote:
> > > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
> > > variable is set to 1 by default and the function pat_init() sets
> > > __pat_enabled to 0 if the CPU doesn't support PAT.
> > > 
> > > However, on AMD K6-3 CPU, the processor initialization code never calls
> > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> > > returns true, even though the K6-3 CPU doesn't support PAT.
> > > 
> > > The result of this bug is that this warning is produced when attemting to
> > > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> > > Another symptom of this bug is that the framebuffer driver doesn't set the
> > > K6-3 MTRR registers.
> > > 
> > > This patch changes pat_enabled() so that it returns true only if pat
> > > initialization was actually done.
> > > 
> > > Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
> > > this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT
> > > feature on the processor that is being initialized.
> > > 
> > 
> > I'm thinking it would be better to replace __pat_enabled with
> > static_cpu_has(X86_FEATURE_PAT) and disable the feature bit if the user
> > has disabled it on the command line, just as we do with other features.
> > 
> > 	-hpa

What's the status of this bug report? Will you accept the patch that I 
sent or do you want a different patch?

If you want a special X86 capability for PAT, you can use the patch below. 
Note, that we can't use X86_FEATURE_PAT for this (because if CPU supports 
PAT, but pat_init() is not called, pat_enabled() would incorrectly report 
true), so we need a different capability X86_FEATURE_PAT_ENABLED.

Mikulas

> If MTRR initialization fails for whatever reason, then pat_init() won't be 
> called and the kernel would mistakenly believe that PAT is working 
> (because there would be no one to clear X86_FEATURE_PAT).
> 
> I think that pat should be reported only if pat_init() is called and 
> succeeds.
> 
> 
> Another strange thing: pat_disable() calls init_cache_modes() - but since 
> pat_disable() may not be called at all, it is possible that 
> init_cache_modes() is also not called at all. It doesn't produce any 
> visible misbehavior on my machine, but it doesn't seem right - we should 
> not call init_cache_modes() from pat_disable() and do the initialization 
> elsewhere, where it is always called.
> 
> Mikulas

X86: don't report PAT on CPUs that don't support it

In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.

However, on AMD K6-3 CPU, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.

The result of this bug is that this warning is produced when attemting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers.

This patch introduces a new CPU capability X86_FEATURE_PAT_ENABLED (which
is set when pat_init() is called and succeeds) and changes pat_enabled()
to return this capability. Note that we can use X86_FEATURE_PAT for this
purpose, because if pat_init() is not called, there would be no one to
clear X86_FEATURE_PAT and the kernel would mistakenly behave as if PAT is
working.

x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix
CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35
Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
Call Trace:
 ? __warn+0xab/0xc0
 ? untrack_pfn+0x5c/0x9f
 ? warn_slowpath_null+0xf/0x13
 ? untrack_pfn+0x5c/0x9f
 ? unmap_single_vma+0x43/0x66
 ? unmap_vmas+0x24/0x30
 ? exit_mmap+0x3c/0xa5
 ? __mmput+0xf/0x76
 ? copy_process.part.76+0xb43/0x1129
 ? _do_fork+0x96/0x177
 ? do_int80_syscall_32+0x3e/0x4c
 ? entry_INT80_32+0x2a/0x2a
---[ end trace 8726f9d9fa90d702 ]---
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v4.2+

---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/include/asm/pat.h         |    3 ++-
 arch/x86/mm/pat.c                  |   12 ++++--------
 3 files changed, 7 insertions(+), 9 deletions(-)

Index: linux-2.6/arch/x86/mm/pat.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pat.c
+++ linux-2.6/arch/x86/mm/pat.c
@@ -64,12 +64,6 @@ static int __init nopat(char *str)
 }
 early_param("nopat", nopat);
 
-bool pat_enabled(void)
-{
-	return !!__pat_enabled;
-}
-EXPORT_SYMBOL_GPL(pat_enabled);
-
 int pat_debug_enable;
 
 static int __init pat_debug_setup(char *str)
@@ -225,12 +219,14 @@ static void pat_bsp_init(u64 pat)
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 
+	setup_force_cpu_cap(X86_FEATURE_PAT_ENABLED);
+
 	__init_cache_modes(pat);
 }
 
 static void pat_ap_init(u64 pat)
 {
-	if (!boot_cpu_has(X86_FEATURE_PAT)) {
+	if (!this_cpu_has(X86_FEATURE_PAT)) {
 		/*
 		 * If this happens we are on a secondary CPU, but switched to
 		 * PAT on the boot CPU. We have no way to undo PAT.
@@ -305,7 +301,7 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
+	if (!__pat_enabled) {
 		init_cache_modes();
 		return;
 	}
Index: linux-2.6/arch/x86/include/asm/pat.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/pat.h
+++ linux-2.6/arch/x86/include/asm/pat.h
@@ -4,7 +4,8 @@
 #include <linux/types.h>
 #include <asm/pgtable_types.h>
 
-bool pat_enabled(void);
+#define pat_enabled()	static_cpu_has(X86_FEATURE_PAT_ENABLED)
+
 void pat_disable(const char *reason);
 extern void pat_init(void);
 
Index: linux-2.6/arch/x86/include/asm/cpufeatures.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/cpufeatures.h
+++ linux-2.6/arch/x86/include/asm/cpufeatures.h
@@ -104,6 +104,7 @@
 #define X86_FEATURE_EXTD_APICID	( 3*32+26) /* has extended APICID (8 bits) */
 #define X86_FEATURE_AMD_DCM     ( 3*32+27) /* multi-node processor */
 #define X86_FEATURE_APERFMPERF	( 3*32+28) /* APERFMPERF */
+#define X86_FEATURE_PAT_ENABLED	( 3*32+29) /* PAT enabled */
 #define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
 #define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */
 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] X86: don't report PAT on CPUs that don't support it
  2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
  2017-04-18 19:28 ` H. Peter Anvin
@ 2017-05-16 13:57 ` H. Peter Anvin
  2017-05-16 15:49   ` Mikulas Patocka
  2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka
  2 siblings, 1 reply; 27+ messages in thread
From: H. Peter Anvin @ 2017-05-16 13:57 UTC (permalink / raw)
  To: Mikulas Patocka, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel

On 04/18/17 12:07, Mikulas Patocka wrote:
> 
> However, on AMD K6-3 CPU, the processor initialization code never calls
> pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> returns true, even though the K6-3 CPU doesn't support PAT.
> 

OK, now I'm wondering: are you actually *using* said K6-3 machine, and
if so, are you actually dependent on write combining on it?  The reason
I'm asking is because I would personally like to completely remove the
support for using MTRRs to create WC mappings, as it only affects a
handful of ancient CPUs: Pentium Pro, Pentium II, K6-*, and possibly
some Cyrix/Centaur part.  Earlier CPUs didn't have WC, but could set WB,
WT or UC via the page tables without needing the PAT MSR, and newer CPUs
have PAT.

	-hpa

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] X86: don't report PAT on CPUs that don't support it
  2017-05-16 13:57 ` H. Peter Anvin
@ 2017-05-16 15:49   ` Mikulas Patocka
  2017-05-18  7:17     ` Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: Mikulas Patocka @ 2017-05-16 15:49 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel



On Tue, 16 May 2017, H. Peter Anvin wrote:

> On 04/18/17 12:07, Mikulas Patocka wrote:
> > 
> > However, on AMD K6-3 CPU, the processor initialization code never calls
> > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> > returns true, even though the K6-3 CPU doesn't support PAT.
> > 
> 
> OK, now I'm wondering: are you actually *using* said K6-3 machine, and

I use it for playing music, browsing with the links browser and connecting 
to other machines with ssh. That machine is slow but it is completely 
quiet.

It is also good to run my own software on a slow CPU to make sure that 
there are no obvious inefficiencies.

> if so, are you actually dependent on write combining on it?  The reason

Those K6-3 MTRRs improve framebuffer write throughput by 33%.

> I'm asking is because I would personally like to completely remove the
> support for using MTRRs to create WC mappings, as it only affects a
> handful of ancient CPUs: Pentium Pro, Pentium II, K6-*, and possibly
> some Cyrix/Centaur part.  Earlier CPUs didn't have WC, but could set WB,
> WT or UC via the page tables without needing the PAT MSR, and newer CPUs
> have PAT.

MTRRs are also needed on Pentium 3, Core Solo and Core Duo due to an 
erratum that makes it not possible to set WC with PAT. See the comment 
before "clear_cpu_cap(c, X86_FEATURE_PAT)" in early_init_intel().

Mikulas

> 	-hpa
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] X86: don't report PAT on CPUs that don't support it
  2017-05-16 15:49   ` Mikulas Patocka
@ 2017-05-18  7:17     ` Ingo Molnar
  0 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2017-05-18  7:17 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel


* Mikulas Patocka <mpatocka@redhat.com> wrote:

> On Tue, 16 May 2017, H. Peter Anvin wrote:
> 
> > On 04/18/17 12:07, Mikulas Patocka wrote:
> > > 
> > > However, on AMD K6-3 CPU, the processor initialization code never calls
> > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> > > returns true, even though the K6-3 CPU doesn't support PAT.
> > > 
> > 
> > OK, now I'm wondering: are you actually *using* said K6-3 machine, and
> 
> I use it for playing music, browsing with the links browser and connecting 
> to other machines with ssh. That machine is slow but it is completely 
> quiet.
> 
> It is also good to run my own software on a slow CPU to make sure that 
> there are no obvious inefficiencies.
> 
> > if so, are you actually dependent on write combining on it?  The reason
> 
> Those K6-3 MTRRs improve framebuffer write throughput by 33%.
> 
> > I'm asking is because I would personally like to completely remove the
> > support for using MTRRs to create WC mappings, as it only affects a
> > handful of ancient CPUs: Pentium Pro, Pentium II, K6-*, and possibly
> > some Cyrix/Centaur part.  Earlier CPUs didn't have WC, but could set WB,
> > WT or UC via the page tables without needing the PAT MSR, and newer CPUs
> > have PAT.
> 
> MTRRs are also needed on Pentium 3, Core Solo and Core Duo due to an 
> erratum that makes it not possible to set WC with PAT. See the comment 
> before "clear_cpu_cap(c, X86_FEATURE_PAT)" in early_init_intel().

Ok, I'm inclined to apply your regression fix - hpa do you concur?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
  2017-04-18 19:28 ` H. Peter Anvin
  2017-05-16 13:57 ` H. Peter Anvin
@ 2017-05-24 10:21 ` tip-bot for Mikulas Patocka
  2017-05-28 18:18   ` Bernhard Held
  2 siblings, 1 reply; 27+ messages in thread
From: tip-bot for Mikulas Patocka @ 2017-05-24 10:21 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mpatocka, linux-kernel, peterz, mcgrof, tglx, dvlasenk, jpoimboe,
	toshi.kani, luto, akpm, brgerst, bp, torvalds, hpa, mingo

Commit-ID:  cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4
Gitweb:     http://git.kernel.org/tip/cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4
Author:     Mikulas Patocka <mpatocka@redhat.com>
AuthorDate: Tue, 18 Apr 2017 15:07:11 -0400
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 10:17:23 +0200

x86/PAT: Fix Xorg regression on CPUs that don't support PAT

In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.

However, on AMD K6-3 CPUs, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.

The result of this bug is that a kernel warning is produced when attempting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers:

  x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
  ...
  x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining

To fix the bug change pat_enabled() so that it returns true only if PAT
initialization was actually done.

Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
feature on the processor that is being initialized.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/mm/pat.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 9b78685..83a59a6 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -65,9 +65,11 @@ static int __init nopat(char *str)
 }
 early_param("nopat", nopat);
 
+static bool __read_mostly __pat_initialized = false;
+
 bool pat_enabled(void)
 {
-	return !!__pat_enabled;
+	return __pat_initialized;
 }
 EXPORT_SYMBOL_GPL(pat_enabled);
 
@@ -225,13 +227,14 @@ static void pat_bsp_init(u64 pat)
 	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
+	__pat_initialized = true;
 
 	__init_cache_modes(pat);
 }
 
 static void pat_ap_init(u64 pat)
 {
-	if (!boot_cpu_has(X86_FEATURE_PAT)) {
+	if (!this_cpu_has(X86_FEATURE_PAT)) {
 		/*
 		 * If this happens we are on a secondary CPU, but switched to
 		 * PAT on the boot CPU. We have no way to undo PAT.
@@ -306,7 +309,7 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
+	if (!__pat_enabled) {
 		init_cache_modes();
 		return;
 	}

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka
@ 2017-05-28 18:18   ` Bernhard Held
  2017-05-28 18:43     ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Bernhard Held @ 2017-05-28 18:18 UTC (permalink / raw)
  To: bp, akpm, brgerst, torvalds, hpa, mingo, mpatocka, tglx, peterz,
	mcgrof, dvlasenk, jpoimboe, toshi.kani, luto
  Cc: linux-kernel

Hi,

this patch breaks the boot of my kernel. The last message is "Booting
the kernel.".

My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
microcode of the E5450 and recognizes the CPU.

Please find below the dmesg of a the latest kernel w/o the PAT-patch.
I'm happy to provide more information or to test patches.

Have fun,
Bernhard

[    0.000000] Linux version 4.12.0-rc2-linus+ (berny@quad) (gcc version 6.3.1 20170202 [gcc-6-branch revision 245119] (SUSE Linux) ) #152 SMP PREEMPT Sun May 28 19:26:20 CEST 2017
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.12.0-rc2-linus+ root=/dev/mapper/VGMX300-root resume=/dev/sda2 showopts radeon.dpm=1 memmap=1$0xe4fd net.ifnames=0
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Enabled xstate features 0x3, context size is 576 bytes, using 'standard' format.
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000cfedffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000cfee0000-0x00000000cfee2fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000cfee3000-0x00000000cfeeffff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000cfef0000-0x00000000cfefffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000d0000000-0x00000000dfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x00000001afffffff] usable
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] e820: user-defined physical RAM map:
[    0.000000] user: [mem 0x0000000000000000-0x000000000000e4fc] usable
[    0.000000] user: [mem 0x000000000000e4fd-0x000000000000e4fd] reserved
[    0.000000] user: [mem 0x000000000000e4fe-0x000000000009dbff] usable
[    0.000000] user: [mem 0x000000000009f800-0x000000000009ffff] reserved
[    0.000000] user: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] user: [mem 0x0000000000100000-0x00000000cfedffff] usable
[    0.000000] user: [mem 0x00000000cfee0000-0x00000000cfee2fff] ACPI NVS
[    0.000000] user: [mem 0x00000000cfee3000-0x00000000cfeeffff] ACPI data
[    0.000000] user: [mem 0x00000000cfef0000-0x00000000cfefffff] reserved
[    0.000000] user: [mem 0x00000000d0000000-0x00000000dfffffff] reserved
[    0.000000] user: [mem 0x00000000fec00000-0x00000000ffffffff] reserved
[    0.000000] user: [mem 0x0000000100000000-0x00000001afffffff] usable
[    0.000000] SMBIOS 2.4 present.
[    0.000000] DMI: Gigabyte Technology Co., Ltd. G33-DS3R/G33-DS3R, BIOS F7L 07/31/2009
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] e820: last_pfn = 0x1b0000 max_arch_pfn = 0x400000000
[    0.000000] MTRR default type: uncachable
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-BFFFF uncachable
[    0.000000]   C0000-CDFFF write-protect
[    0.000000]   CE000-EFFFF uncachable
[    0.000000]   F0000-FFFFF write-through
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 0000000000 mask 0F00000000 write-back
[    0.000000]   1 base 00E0000000 mask 0FE0000000 uncachable
[    0.000000]   2 base 00D0000000 mask 0FF0000000 uncachable
[    0.000000]   3 base 0100000000 mask 0F00000000 write-back
[    0.000000]   4 base 01C0000000 mask 0FC0000000 uncachable
[    0.000000]   5 base 01B0000000 mask 0FF0000000 uncachable
[    0.000000]   6 base 00CFF00000 mask 0FFFF00000 uncachable
[    0.000000]   7 disabled
[    0.000000] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT
[    0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[    0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[    0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[    0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[    0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[    0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[    0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it.
[    0.000000] total RAM covered: 6143M
[    0.000000] Found optimal setting for mtrr clean up
[    0.000000]  gran_size: 64K  chunk_size: 2M  num_reg: 7      lose cover RAM: 0G
[    0.000000] e820: update [mem 0xcff00000-0xffffffff] usable ==> reserved
[    0.000000] e820: last_pfn = 0xcfee0 max_arch_pfn = 0x400000000
[    0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576
[    0.000000] BRK [0x01e34000, 0x01e34fff] PGTABLE
[    0.000000] BRK [0x01e35000, 0x01e35fff] PGTABLE
[    0.000000] BRK [0x01e36000, 0x01e36fff] PGTABLE
[    0.000000] BRK [0x01e37000, 0x01e37fff] PGTABLE
[    0.000000] BRK [0x01e38000, 0x01e38fff] PGTABLE
[    0.000000] BRK [0x01e39000, 0x01e39fff] PGTABLE
[    0.000000] RAMDISK: [mem 0x36c5f000-0x37626fff]


On 05/24/2017 at 12:21 PM, tip-bot for Mikulas Patocka wrote:
> Commit-ID:  cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4
> Gitweb:     http://git.kernel.org/tip/cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4
> Author:     Mikulas Patocka <mpatocka@redhat.com>
> AuthorDate: Tue, 18 Apr 2017 15:07:11 -0400
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Wed, 24 May 2017 10:17:23 +0200
> 
> x86/PAT: Fix Xorg regression on CPUs that don't support PAT
> 
> In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The
> variable is set to 1 by default and the function pat_init() sets
> __pat_enabled to 0 if the CPU doesn't support PAT.
> 
> However, on AMD K6-3 CPUs, the processor initialization code never calls
> pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> returns true, even though the K6-3 CPU doesn't support PAT.
> 
> The result of this bug is that a kernel warning is produced when attempting to
> start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> Another symptom of this bug is that the framebuffer driver doesn't set the
> K6-3 MTRR registers:
> 
>    x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
>    ------------[ cut here ]------------
>    WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
>    ...
>    x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
> 
> To fix the bug change pat_enabled() so that it returns true only if PAT
> initialization was actually done.
> 
> Also, I changed boot_cpu_has(X86_FEATURE_PAT) to
> this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT
> feature on the processor that is being initialized.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Luis R. Rodriguez <mcgrof@suse.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Toshi Kani <toshi.kani@hp.com>
> Cc: stable@vger.kernel.org # v4.2+
> Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> ---
>   arch/x86/mm/pat.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> index 9b78685..83a59a6 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -65,9 +65,11 @@ static int __init nopat(char *str)
>   }
>   early_param("nopat", nopat);
>   
> +static bool __read_mostly __pat_initialized = false;
> +
>   bool pat_enabled(void)
>   {
> -	return !!__pat_enabled;
> +	return __pat_initialized;
>   }
>   EXPORT_SYMBOL_GPL(pat_enabled);
>   
> @@ -225,13 +227,14 @@ static void pat_bsp_init(u64 pat)
>   	}
>   
>   	wrmsrl(MSR_IA32_CR_PAT, pat);
> +	__pat_initialized = true;
>   
>   	__init_cache_modes(pat);
>   }
>   
>   static void pat_ap_init(u64 pat)
>   {
> -	if (!boot_cpu_has(X86_FEATURE_PAT)) {
> +	if (!this_cpu_has(X86_FEATURE_PAT)) {
>   		/*
>   		 * If this happens we are on a secondary CPU, but switched to
>   		 * PAT on the boot CPU. We have no way to undo PAT.
> @@ -306,7 +309,7 @@ void pat_init(void)
>   	u64 pat;
>   	struct cpuinfo_x86 *c = &boot_cpu_data;
>   
> -	if (!pat_enabled()) {
> +	if (!__pat_enabled) {
>   		init_cache_modes();
>   		return;
>   	}
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-28 18:18   ` Bernhard Held
@ 2017-05-28 18:43     ` Andy Lutomirski
  2017-05-29 22:50       ` Mikulas Patocka
  2017-06-06 22:49       ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
  0 siblings, 2 replies; 27+ messages in thread
From: Andy Lutomirski @ 2017-05-28 18:43 UTC (permalink / raw)
  To: Bernhard Held, Toshi Kani
  Cc: Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds,
	H. Peter Anvin, Ingo Molnar, Mikulas Patocka, Thomas Gleixner,
	Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko,
	Josh Poimboeuf, Andrew Lutomirski, linux-kernel

On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote:
> Hi,
>
> this patch breaks the boot of my kernel. The last message is "Booting
> the kernel.".
>
> My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
> Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
> microcode of the E5450 and recognizes the CPU.
>
> Please find below the dmesg of a the latest kernel w/o the PAT-patch.
> I'm happy to provide more information or to test patches.

I think this patch is bogus.  pat_enabled() sure looks like it's
supposed to return true if PAT is *enabled*, and these days PAT is
"enabled" even if there's no HW PAT support.  Even if the patch were
somehow correct, it should have been split up into two patches, one to
change pat_enabled() and one to use this_cpu_has().

Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
-stable knows not to backport it, and starting over with the fix.
>From very brief inspection, the right fix is to make sure that
pat_init(), or at least init_cache_modes(), gets called on the
affected CPUs.

As a future cleanup, I think that pat_enabled() could be deleted
outright and, if needed, replaced by functions like have_memtype_wc()
or similar.  (Do we already have helpers like that?)  Toshi, am I
right?

--Andy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-28 18:43     ` Andy Lutomirski
@ 2017-05-29 22:50       ` Mikulas Patocka
  2017-05-30 17:14         ` Dominik Brodowski
  2017-06-06 22:49       ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
  1 sibling, 1 reply; 27+ messages in thread
From: Mikulas Patocka @ 2017-05-29 22:50 UTC (permalink / raw)
  To: Bernhard Held, Andy Lutomirski
  Cc: Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst,
	Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko,
	Josh Poimboeuf, linux-kernel



On Sun, 28 May 2017, Andy Lutomirski wrote:

> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote:
> > Hi,
> >
> > this patch breaks the boot of my kernel. The last message is "Booting
> > the kernel.".
> >
> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
> > microcode of the E5450 and recognizes the CPU.
> >
> > Please find below the dmesg of a the latest kernel w/o the PAT-patch.
> > I'm happy to provide more information or to test patches.

Hi

Please do the following three tests and test if the kernel boots.

1. use the PAT patch and revert the change to the function pat_enabled()
- i.e. change it to the original:
bool pat_enabled(void)
{
	return !!__pat_enabled;
}

2. use the PAT patch and revert the change to the function pat_ap_init
- i.e. change it to the original:
static void pat_ap_init(u64 pat)
{
	if (!boot_cpu_has(X86_FEATURE_PAT)) {

3. use the full PAT patch and apply the below patch on the top of it.

> I think this patch is bogus.  pat_enabled() sure looks like it's
> supposed to return true if PAT is *enabled*, and these days PAT is
> "enabled" even if there's no HW PAT support.  Even if the patch were
> somehow correct, it should have been split up into two patches, one to
> change pat_enabled() and one to use this_cpu_has().
> 
> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
> -stable knows not to backport it, and starting over with the fix.
> >From very brief inspection, the right fix is to make sure that
> pat_init(), or at least init_cache_modes(), gets called on the

pat_init() needs to be called with cache disabled - and the cache disable 
code (functions prepare_set() and post_set()) exists in 
arch/x86/kernel/cpu/mtrr/generic.c - it may not be compiled if CONFIG_MTRR 
is not set.

Though, it is possible to call init_cache_modes() - see the patch below. 
init_cache_modes() does nothing if it is called multiple times.

> affected CPUs.
> 
> As a future cleanup, I think that pat_enabled() could be deleted
> outright and, if needed, replaced by functions like have_memtype_wc()
> or similar.  (Do we already have helpers like that?)  Toshi, am I
> right?
> 
> --Andy


---
 arch/x86/include/asm/pat.h |    1 +
 arch/x86/kernel/setup.c    |    1 +
 arch/x86/mm/pat.c          |    3 +--
 3 files changed, 3 insertions(+), 2 deletions(-)

Index: linux-stable/arch/x86/include/asm/pat.h
===================================================================
--- linux-stable.orig/arch/x86/include/asm/pat.h
+++ linux-stable/arch/x86/include/asm/pat.h
@@ -8,6 +8,7 @@
 
 void pat_disable(const char *reason);
 extern void pat_init(void);
+extern void init_cache_modes(void);
 
 extern int reserve_memtype(u64 start, u64 end,
 		enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
Index: linux-stable/arch/x86/kernel/setup.c
===================================================================
--- linux-stable.orig/arch/x86/kernel/setup.c
+++ linux-stable/arch/x86/kernel/setup.c
@@ -1074,6 +1074,7 @@ void __init setup_arch(char **cmdline_p)
 
 	/* update e820 for memory not covered by WB MTRRs */
 	mtrr_bp_init();
+	init_cache_modes();
 	if (mtrr_trim_uncached_memory(max_pfn))
 		max_pfn = e820_end_of_ram_pfn();
 
Index: linux-stable/arch/x86/mm/pat.c
===================================================================
--- linux-stable.orig/arch/x86/mm/pat.c
+++ linux-stable/arch/x86/mm/pat.c
@@ -39,7 +39,6 @@
 static bool boot_cpu_done;
 
 static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
-static void init_cache_modes(void);
 
 void pat_disable(const char *reason)
 {
@@ -237,7 +236,7 @@ static void pat_ap_init(u64 pat)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-static void init_cache_modes(void)
+void init_cache_modes(void)
 {
 	u64 pat = 0;
 	static int init_cm_done;

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-29 22:50       ` Mikulas Patocka
@ 2017-05-30 17:14         ` Dominik Brodowski
  2017-05-30 17:59           ` Mikulas Patocka
  0 siblings, 1 reply; 27+ messages in thread
From: Dominik Brodowski @ 2017-05-30 17:14 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov,
	Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 750 bytes --]

Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13),
git-bisected to the same patch...

On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote:
> Please do the following three tests and test if the kernel boots.
> 
> 1. use the PAT patch and revert the change to the function pat_enabled()
> - i.e. change it to the original:
> bool pat_enabled(void)
> {
> 	return !!__pat_enabled;
> }

No joy.

> 2. use the PAT patch and revert the change to the function pat_ap_init
> - i.e. change it to the original:
> static void pat_ap_init(u64 pat)
> {
> 	if (!boot_cpu_has(X86_FEATURE_PAT)) {

Joy.

> 3. use the full PAT patch and apply the below patch on the top of it.

No joy.


Best,
	Dominik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-30 17:14         ` Dominik Brodowski
@ 2017-05-30 17:59           ` Mikulas Patocka
  2017-05-30 18:47             ` Dominik Brodowski
                               ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Mikulas Patocka @ 2017-05-30 17:59 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov,
	Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel



On Tue, 30 May 2017, Dominik Brodowski wrote:

> Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13),
> git-bisected to the same patch...
> 
> On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote:
> > Please do the following three tests and test if the kernel boots.
> > 
> > 1. use the PAT patch and revert the change to the function pat_enabled()
> > - i.e. change it to the original:
> > bool pat_enabled(void)
> > {
> > 	return !!__pat_enabled;
> > }
> 
> No joy.
> 
> > 2. use the PAT patch and revert the change to the function pat_ap_init
> > - i.e. change it to the original:
> > static void pat_ap_init(u64 pat)
> > {
> > 	if (!boot_cpu_has(X86_FEATURE_PAT)) {
> 
> Joy.

It is interesting - does it mean that the boot cpu does have PAT and the 
secondary CPUs don't? Please send /proc/cpuinfo with all the cores active.

This part of the patch is not required anyway, so I will resubmit the 
patch with this part disabled (and with an added call to 
init_cache_modes() as Andy suggested).

Mikulas

> > 3. use the full PAT patch and apply the below patch on the top of it.
> 
> No joy.
> 
> 
> Best,
> 	Dominik
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-30 17:59           ` Mikulas Patocka
@ 2017-05-30 18:47             ` Dominik Brodowski
  2017-05-30 19:30             ` Bernhard Held
  2017-05-31  9:39             ` Junichi Nomura
  2 siblings, 0 replies; 27+ messages in thread
From: Dominik Brodowski @ 2017-05-30 18:47 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov,
	Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5466 bytes --]

On Tue, May 30, 2017 at 01:59:41PM -0400, Mikulas Patocka wrote:
> On Tue, 30 May 2017, Dominik Brodowski wrote:
> 
> > Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13),
> > git-bisected to the same patch...
> > 
> > On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote:
> > > Please do the following three tests and test if the kernel boots.
> > > 
> > > 1. use the PAT patch and revert the change to the function pat_enabled()
> > > - i.e. change it to the original:
> > > bool pat_enabled(void)
> > > {
> > > 	return !!__pat_enabled;
> > > }
> > 
> > No joy.
> > 
> > > 2. use the PAT patch and revert the change to the function pat_ap_init
> > > - i.e. change it to the original:
> > > static void pat_ap_init(u64 pat)
> > > {
> > > 	if (!boot_cpu_has(X86_FEATURE_PAT)) {
> > 
> > Joy.
> 
> It is interesting - does it mean that the boot cpu does have PAT and the 
> secondary CPUs don't? Please send /proc/cpuinfo with all the cores active.

This physical CPU has PAT on all cores / siblings:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
stepping	: 4
microcode	: 0x24
cpu MHz		: 800.158
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 20
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts
bugs		:
bogomips	: 4389.80
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
stepping	: 4
microcode	: 0x24
cpu MHz		: 800.158
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 20
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts
bugs		:
bogomips	: 4436.14
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
stepping	: 4
microcode	: 0x24
cpu MHz		: 800.024
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 20
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts
bugs		:
bogomips	: 4397.30
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 61
model name	: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
stepping	: 4
microcode	: 0x24
cpu MHz		: 799.890
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 20
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts
bugs		:
bogomips	: 4396.84
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

Best,
	Dominik

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-30 17:59           ` Mikulas Patocka
  2017-05-30 18:47             ` Dominik Brodowski
@ 2017-05-30 19:30             ` Bernhard Held
  2017-05-31  9:39             ` Junichi Nomura
  2 siblings, 0 replies; 27+ messages in thread
From: Bernhard Held @ 2017-05-30 19:30 UTC (permalink / raw)
  To: Mikulas Patocka, Dominik Brodowski
  Cc: Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton,
	Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel

On 05/30/2017 at 07:59 PM, Mikulas Patocka wrote:
> 
> 
> On Tue, 30 May 2017, Dominik Brodowski wrote:
> 
>> Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13),
>> git-bisected to the same patch...
>>
>> On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote:
>>> Please do the following three tests and test if the kernel boots.
>>>
>>> 1. use the PAT patch and revert the change to the function pat_enabled()
>>> - i.e. change it to the original:
>>> bool pat_enabled(void)
>>> {
>>> 	return !!__pat_enabled;
>>> }
>>
>> No joy.
>>
>>> 2. use the PAT patch and revert the change to the function pat_ap_init
>>> - i.e. change it to the original:
>>> static void pat_ap_init(u64 pat)
>>> {
>>> 	if (!boot_cpu_has(X86_FEATURE_PAT)) {
>>
>> Joy.
> 
> It is interesting - does it mean that the boot cpu does have PAT and the
> secondary CPUs don't? Please send /proc/cpuinfo with all the cores active.
> 
> This part of the patch is not required anyway, so I will resubmit the
> patch with this part disabled (and with an added call to
> init_cache_modes() as Andy suggested).
> 
> Mikulas
> 
>>> 3. use the full PAT patch and apply the below patch on the top of it.
>>
>> No joy.

Same result here., only #2 boots.

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           E5450  @ 3.00GHz
stepping        : 10
microcode       : 0xa0b
cpu MHz         : 2000.000
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm
bugs            :
bogomips        : 5999.70
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           E5450  @ 3.00GHz
stepping        : 10
microcode       : 0xa0b
cpu MHz         : 2000.000
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 3
initial apicid  : 3
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm
bugs            :
bogomips        : 5999.98
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           E5450  @ 3.00GHz
stepping        : 10
microcode       : 0xa0b
cpu MHz         : 2000.000
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm
bugs            :
bogomips        : 5900.00
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU           E5450  @ 3.00GHz
stepping        : 10
microcode       : 0xa0b
cpu MHz         : 2000.000
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm
bugs            :
bogomips        : 5999.98
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

Cheers,
Bernhard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  2017-05-30 17:59           ` Mikulas Patocka
  2017-05-30 18:47             ` Dominik Brodowski
  2017-05-30 19:30             ` Bernhard Held
@ 2017-05-31  9:39             ` Junichi Nomura
  2 siblings, 0 replies; 27+ messages in thread
From: Junichi Nomura @ 2017-05-31  9:39 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Dominik Brodowski, Bernhard Held, Andy Lutomirski, Toshi Kani,
	Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds,
	H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
	Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel

On 05/31/17 02:59, Mikulas Patocka wrote:
>>> 2. use the PAT patch and revert the change to the function pat_ap_init
>>> - i.e. change it to the original:
>>> static void pat_ap_init(u64 pat)
>>> {
>>> 	if (!boot_cpu_has(X86_FEATURE_PAT)) {
>>
>> Joy.
> 
> It is interesting - does it mean that the boot cpu does have PAT and the 
> secondary CPUs don't?

It seems pat_init() is called twice for boot cpu, from mtrr_bp_pat_init()
and generic_set_all(). The 2nd call ends up calling pat_ap_init() and it's
before boot_cpu_data is copied to cpu_data[0].

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-05-28 18:43     ` Andy Lutomirski
  2017-05-29 22:50       ` Mikulas Patocka
@ 2017-06-06 22:49       ` Mikulas Patocka
  2017-06-06 22:51         ` Andy Lutomirski
                           ` (2 more replies)
  1 sibling, 3 replies; 27+ messages in thread
From: Mikulas Patocka @ 2017-06-06 22:49 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton,
	Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel



On Sun, 28 May 2017, Andy Lutomirski wrote:

> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote:
> > Hi,
> >
> > this patch breaks the boot of my kernel. The last message is "Booting
> > the kernel.".
> >
> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
> > microcode of the E5450 and recognizes the CPU.
> >
> > Please find below the dmesg of a the latest kernel w/o the PAT-patch.
> > I'm happy to provide more information or to test patches.
> 
> I think this patch is bogus.  pat_enabled() sure looks like it's
> supposed to return true if PAT is *enabled*, and these days PAT is
> "enabled" even if there's no HW PAT support.  Even if the patch were
> somehow correct, it should have been split up into two patches, one to
> change pat_enabled() and one to use this_cpu_has().
> 
> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
> -stable knows not to backport it, and starting over with the fix.
> >From very brief inspection, the right fix is to make sure that
> pat_init(), or at least init_cache_modes(), gets called on the
> affected CPUs.
> 
> --Andy

Hi

Here I send the second version of the patch. It drops the change from 
boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that 
caused kernel to be unbootable for some people).

Another change is that setup_arch() calls init_cache_modes() if PAT is 
disabled, so that init_cache_modes() is always called.

Mikulas



From: Mikulas Patocka <mpatocka@redhat.com>

In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.

However, on AMD K6-3 CPU, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.

The result of this bug is that this warning is produced when attemting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers.

This patch changes pat_enabled() so that it returns true only if pat
initialization was actually done.

As Andy Lutomirski suggested, we also need to call init_cache_modes() if
pat was not initialized, so we call it after mtrr_bp_init()
(mtrr_bp_init() would or wouldn't call pat_init()). Note that
init_cache_modes() detects if it was called more than once and does
nothing in that case.

x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix
CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35
Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
Call Trace:
 ? __warn+0xab/0xc0
 ? untrack_pfn+0x5c/0x9f
 ? warn_slowpath_null+0xf/0x13
 ? untrack_pfn+0x5c/0x9f
 ? unmap_single_vma+0x43/0x66
 ? unmap_vmas+0x24/0x30
 ? exit_mmap+0x3c/0xa5
 ? __mmput+0xf/0x76
 ? copy_process.part.76+0xb43/0x1129
 ? _do_fork+0x96/0x177
 ? do_int80_syscall_32+0x3e/0x4c
 ? entry_INT80_32+0x2a/0x2a
---[ end trace 8726f9d9fa90d702 ]---
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v4.2+

---
 arch/x86/include/asm/pat.h |    1 +
 arch/x86/kernel/setup.c    |    2 ++
 arch/x86/mm/pat.c          |   10 ++++++----
 3 files changed, 9 insertions(+), 4 deletions(-)

Index: linux-stable/arch/x86/mm/pat.c
===================================================================
--- linux-stable.orig/arch/x86/mm/pat.c
+++ linux-stable/arch/x86/mm/pat.c
@@ -40,7 +40,6 @@
 static bool boot_cpu_done;
 
 static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
-static void init_cache_modes(void);
 
 void pat_disable(const char *reason)
 {
@@ -65,9 +64,11 @@ static int __init nopat(char *str)
 }
 early_param("nopat", nopat);
 
+static bool __read_mostly __pat_initialized = false;
+
 bool pat_enabled(void)
 {
-	return !!__pat_enabled;
+	return __pat_initialized;
 }
 EXPORT_SYMBOL_GPL(pat_enabled);
 
@@ -225,6 +226,7 @@ static void pat_bsp_init(u64 pat)
 	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
+	__pat_initialized = true;
 
 	__init_cache_modes(pat);
 }
@@ -242,7 +244,7 @@ static void pat_ap_init(u64 pat)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-static void init_cache_modes(void)
+void init_cache_modes(void)
 {
 	u64 pat = 0;
 	static int init_cm_done;
@@ -306,7 +308,7 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
+	if (!__pat_enabled) {
 		init_cache_modes();
 		return;
 	}
Index: linux-stable/arch/x86/include/asm/pat.h
===================================================================
--- linux-stable.orig/arch/x86/include/asm/pat.h
+++ linux-stable/arch/x86/include/asm/pat.h
@@ -7,6 +7,7 @@
 bool pat_enabled(void);
 void pat_disable(const char *reason);
 extern void pat_init(void);
+extern void init_cache_modes(void);
 
 extern int reserve_memtype(u64 start, u64 end,
 		enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
Index: linux-stable/arch/x86/kernel/setup.c
===================================================================
--- linux-stable.orig/arch/x86/kernel/setup.c
+++ linux-stable/arch/x86/kernel/setup.c
@@ -1070,6 +1070,8 @@ void __init setup_arch(char **cmdline_p)
 
 	/* update e820 for memory not covered by WB MTRRs */
 	mtrr_bp_init();
+	if (!pat_enabled())
+		init_cache_modes();
 	if (mtrr_trim_uncached_memory(max_pfn))
 		max_pfn = e820__end_of_ram_pfn();
 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-06-06 22:49       ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
@ 2017-06-06 22:51         ` Andy Lutomirski
  2017-06-06 23:21           ` Mikulas Patocka
  2017-06-07 19:54         ` Bernhard Held
  2017-07-03  5:05         ` Mikulas Patocka
  2 siblings, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2017-06-06 22:51 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Andy Lutomirski, Bernhard Held, Toshi Kani, Borislav Petkov,
	Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel

On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote:
>
>
> On Sun, 28 May 2017, Andy Lutomirski wrote:
>
>> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote:
>> > Hi,
>> >
>> > this patch breaks the boot of my kernel. The last message is "Booting
>> > the kernel.".
>> >
>> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
>> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
>> > microcode of the E5450 and recognizes the CPU.
>> >
>> > Please find below the dmesg of a the latest kernel w/o the PAT-patch.
>> > I'm happy to provide more information or to test patches.
>>
>> I think this patch is bogus.  pat_enabled() sure looks like it's
>> supposed to return true if PAT is *enabled*, and these days PAT is
>> "enabled" even if there's no HW PAT support.  Even if the patch were
>> somehow correct, it should have been split up into two patches, one to
>> change pat_enabled() and one to use this_cpu_has().
>>
>> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
>> -stable knows not to backport it, and starting over with the fix.
>> >From very brief inspection, the right fix is to make sure that
>> pat_init(), or at least init_cache_modes(), gets called on the
>> affected CPUs.
>>
>> --Andy
>
> Hi
>
> Here I send the second version of the patch. It drops the change from
> boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that
> caused kernel to be unbootable for some people).
>
> Another change is that setup_arch() calls init_cache_modes() if PAT is
> disabled, so that init_cache_modes() is always called.
>
> Mikulas
>
>
>
> From: Mikulas Patocka <mpatocka@redhat.com>
>
> In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
> variable is set to 1 by default and the function pat_init() sets
> __pat_enabled to 0 if the CPU doesn't support PAT.
>
> However, on AMD K6-3 CPU, the processor initialization code never calls
> pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> returns true, even though the K6-3 CPU doesn't support PAT.
>
> The result of this bug is that this warning is produced when attemting to
> start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> Another symptom of this bug is that the framebuffer driver doesn't set the
> K6-3 MTRR registers.
>
> This patch changes pat_enabled() so that it returns true only if pat
> initialization was actually done.

Why?  Shouldn't calling init_cache_modes() be sufficient?

--Andy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-06-06 22:51         ` Andy Lutomirski
@ 2017-06-06 23:21           ` Mikulas Patocka
  2017-06-13 15:54             ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Mikulas Patocka @ 2017-06-06 23:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton,
	Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel



On Tue, 6 Jun 2017, Andy Lutomirski wrote:

> On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote:
> >
> >
> > On Sun, 28 May 2017, Andy Lutomirski wrote:
> >
> >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote:
> >> > Hi,
> >> >
> >> > this patch breaks the boot of my kernel. The last message is "Booting
> >> > the kernel.".
> >> >
> >> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
> >> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
> >> > microcode of the E5450 and recognizes the CPU.
> >> >
> >> > Please find below the dmesg of a the latest kernel w/o the PAT-patch.
> >> > I'm happy to provide more information or to test patches.
> >>
> >> I think this patch is bogus.  pat_enabled() sure looks like it's
> >> supposed to return true if PAT is *enabled*, and these days PAT is
> >> "enabled" even if there's no HW PAT support.  Even if the patch were
> >> somehow correct, it should have been split up into two patches, one to
> >> change pat_enabled() and one to use this_cpu_has().
> >>
> >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
> >> -stable knows not to backport it, and starting over with the fix.
> >> >From very brief inspection, the right fix is to make sure that
> >> pat_init(), or at least init_cache_modes(), gets called on the
> >> affected CPUs.
> >>
> >> --Andy
> >
> > Hi
> >
> > Here I send the second version of the patch. It drops the change from
> > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that
> > caused kernel to be unbootable for some people).
> >
> > Another change is that setup_arch() calls init_cache_modes() if PAT is
> > disabled, so that init_cache_modes() is always called.
> >
> > Mikulas
> >
> >
> >
> > From: Mikulas Patocka <mpatocka@redhat.com>
> >
> > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
> > variable is set to 1 by default and the function pat_init() sets
> > __pat_enabled to 0 if the CPU doesn't support PAT.
> >
> > However, on AMD K6-3 CPU, the processor initialization code never calls
> > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> > returns true, even though the K6-3 CPU doesn't support PAT.
> >
> > The result of this bug is that this warning is produced when attemting to
> > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> > Another symptom of this bug is that the framebuffer driver doesn't set the
> > K6-3 MTRR registers.
> >
> > This patch changes pat_enabled() so that it returns true only if pat
> > initialization was actually done.
> 
> Why?  Shouldn't calling init_cache_modes() be sufficient?
> 
> --Andy

See the function arch_phys_wc_add():

if (pat_enabled() || !mtrr_enabled())
	return 0;  /* Success!  (We don't need to do anything.) */
ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);

- if pat_enabled() returns true, that function doesn't set MTRRs. 
pat_enabled() must return false on systems without PAT, so that MTRRs are 
set.

Mikulas

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-06-06 22:49       ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
  2017-06-06 22:51         ` Andy Lutomirski
@ 2017-06-07 19:54         ` Bernhard Held
  2017-07-03  5:05         ` Mikulas Patocka
  2 siblings, 0 replies; 27+ messages in thread
From: Bernhard Held @ 2017-06-07 19:54 UTC (permalink / raw)
  To: Mikulas Patocka, Andy Lutomirski
  Cc: Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst,
	Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko,
	Josh Poimboeuf, linux-kernel

On 06/07/2017 at 12:49 AM, Mikulas Patocka wrote:
> Hi
> 
> Here I send the second version of the patch. It drops the change from
> boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that
> caused kernel to be unbootable for some people).
> 
> Another change is that setup_arch() calls init_cache_modes() if PAT is
> disabled, so that init_cache_modes() is always called.
> 
> Mikulas

[PATCH v2] Works for me!

CHeers,
Bernhard

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-06-06 23:21           ` Mikulas Patocka
@ 2017-06-13 15:54             ` Andy Lutomirski
  2017-06-14 20:24               ` Mikulas Patocka
  0 siblings, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2017-06-13 15:54 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Andy Lutomirski, Bernhard Held, Toshi Kani, Borislav Petkov,
	Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel

On Tue, Jun 6, 2017 at 4:21 PM, Mikulas Patocka <mpatocka@redhat.com> wrote:
>
>
> On Tue, 6 Jun 2017, Andy Lutomirski wrote:
>
>> On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote:
>> >
>> >
>> > On Sun, 28 May 2017, Andy Lutomirski wrote:
>> >
>> >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote:
>> >> > Hi,
>> >> >
>> >> > this patch breaks the boot of my kernel. The last message is "Booting
>> >> > the kernel.".
>> >> >
>> >> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a
>> >> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the
>> >> > microcode of the E5450 and recognizes the CPU.
>> >> >
>> >> > Please find below the dmesg of a the latest kernel w/o the PAT-patch.
>> >> > I'm happy to provide more information or to test patches.
>> >>
>> >> I think this patch is bogus.  pat_enabled() sure looks like it's
>> >> supposed to return true if PAT is *enabled*, and these days PAT is
>> >> "enabled" even if there's no HW PAT support.  Even if the patch were
>> >> somehow correct, it should have been split up into two patches, one to
>> >> change pat_enabled() and one to use this_cpu_has().
>> >>
>> >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so
>> >> -stable knows not to backport it, and starting over with the fix.
>> >> >From very brief inspection, the right fix is to make sure that
>> >> pat_init(), or at least init_cache_modes(), gets called on the
>> >> affected CPUs.
>> >>
>> >> --Andy
>> >
>> > Hi
>> >
>> > Here I send the second version of the patch. It drops the change from
>> > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that
>> > caused kernel to be unbootable for some people).
>> >
>> > Another change is that setup_arch() calls init_cache_modes() if PAT is
>> > disabled, so that init_cache_modes() is always called.
>> >
>> > Mikulas
>> >
>> >
>> >
>> > From: Mikulas Patocka <mpatocka@redhat.com>
>> >
>> > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
>> > variable is set to 1 by default and the function pat_init() sets
>> > __pat_enabled to 0 if the CPU doesn't support PAT.
>> >
>> > However, on AMD K6-3 CPU, the processor initialization code never calls
>> > pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
>> > returns true, even though the K6-3 CPU doesn't support PAT.
>> >
>> > The result of this bug is that this warning is produced when attemting to
>> > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
>> > Another symptom of this bug is that the framebuffer driver doesn't set the
>> > K6-3 MTRR registers.
>> >
>> > This patch changes pat_enabled() so that it returns true only if pat
>> > initialization was actually done.
>>
>> Why?  Shouldn't calling init_cache_modes() be sufficient?
>>
>> --Andy
>
> See the function arch_phys_wc_add():
>
> if (pat_enabled() || !mtrr_enabled())
>         return 0;  /* Success!  (We don't need to do anything.) */
> ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
>
> - if pat_enabled() returns true, that function doesn't set MTRRs.
> pat_enabled() must return false on systems without PAT, so that MTRRs are
> set.

It still sounds to me like there are two bugs here that should be
treated separately.

Bug 1: A warning fires.  Have you figured out why the warning fires?

Bug 2: arch_phys_wc_add() appears to be checking the wrong condition.
How about checking the right condition?  It doesn't actually want to
know if PAT is enabled -- it wants to know if the PAT contains a
usable WC entry.  Something like pat_has_wc() would be better, I
think.

--Andy

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-06-13 15:54             ` Andy Lutomirski
@ 2017-06-14 20:24               ` Mikulas Patocka
  0 siblings, 0 replies; 27+ messages in thread
From: Mikulas Patocka @ 2017-06-14 20:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton,
	Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel



On Tue, 13 Jun 2017, Andy Lutomirski wrote:

> On Tue, Jun 6, 2017 at 4:21 PM, Mikulas Patocka <mpatocka@redhat.com> wrote:
> >
> > On Tue, 6 Jun 2017, Andy Lutomirski wrote:
> >
> >> On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote:
> >> >
> >> > This patch changes pat_enabled() so that it returns true only if pat
> >> > initialization was actually done.
> >>
> >> Why?  Shouldn't calling init_cache_modes() be sufficient?
> >>
> >> --Andy
> >
> > See the function arch_phys_wc_add():
> >
> > if (pat_enabled() || !mtrr_enabled())
> >         return 0;  /* Success!  (We don't need to do anything.) */
> > ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
> >
> > - if pat_enabled() returns true, that function doesn't set MTRRs.
> > pat_enabled() must return false on systems without PAT, so that MTRRs are
> > set.
> 
> It still sounds to me like there are two bugs here that should be
> treated separately.
> 
> Bug 1: A warning fires.  Have you figured out why the warning fires?

The Xserver maps videoram and attempts to call fork() to spawn some 
utility.

reserve_pfn_range is called. At the beginning of reserve_pfn_range, 
want_pcm is set to 2 (_PAGE_CACHE_MODE_UC_MINUS).

reserve_pfn_range calls reserve_memtype(paddr, paddr + size, want_pcm, 
&pcm);

reserve_memtype contains this code at the beginning:
if (!pat_enabled()) {
	/* This is identical to page table setting without PAT */
	if (new_type)
		*new_type = req_type;
	return 0;
} --- so, if pat_enabled() return false, it sets pcm equal to want_pcm and 
there is no problem.

However, if pat_enabled() returns true, reserve_memtype goes on, it sets
actual_type = pat_x_mtrr_type(start, end, req_type); (actual_type is 2)
if (new_type)
	*new_type = actual_type; (*new_type is 2)

and finally it calls rbt_memtype_check_insert(new, new_type); 

rbt_memtype_check_insert calls memtype_rb_check_conflict 
memtype_rb_check_conflict sets *newtype to the value 1 
(_PAGE_CACHE_MODE_WC) (the pointer newtype points to the variable "pcm" in 
the function reserve_pfn_range)

Then, we go back to reserve_pfn_range, the variable want_pcm is 2 and the 
variable pcm is 1. reserve_pfn_range checks "if (pcm != want_pcm)", prints 
a warning "x86/PAT: Xorg:3986 map pfn expected mapping type uncached-minus 
for [mem 0xe4000000-0xe5ffffff], got write-combining", returns an error - 
and this causes fork failure.

> Bug 2: arch_phys_wc_add() appears to be checking the wrong condition.
> How about checking the right condition?  It doesn't actually want to
> know if PAT is enabled -- it wants to know if the PAT contains a
> usable WC entry.  Something like pat_has_wc() would be better, I
> think.
>
> --Andy

The function pat_init() always programs the PAT with the WC type. So - 
surely we can rename pat_enabled() to pat_has_wc(). But renaming the 
function doesn't change functionality.

This is the same patch with pat_enabled() renamed to pat_has_wc(). Note 
also that this renaming causes conflicts when backporting the patch to 
stable kernels.

Mikulas





X86: don't report PAT on CPUs that don't support it

In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
variable is set to 1 by default and the function pat_init() sets
__pat_enabled to 0 if the CPU doesn't support PAT.

However, on AMD K6-3 CPU, the processor initialization code never calls
pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
returns true, even though the K6-3 CPU doesn't support PAT.

The result of this bug is that this warning is produced when attemting to
start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
Another symptom of this bug is that the framebuffer driver doesn't set the
K6-3 MTRR registers.

This patch changes pat_enabled() so that it returns true only if pat
initialization was actually done.

As Andy Lutomirski suggested, we also need to call init_cache_modes() if
pat was not initialized, so we call it after mtrr_bp_init()
(mtrr_bp_init() would or wouldn't call pat_init()). Note that
init_cache_modes() detects if it was called more than once and does
nothing in that case.

x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
------------[ cut here ]------------
WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix
CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35
Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
Call Trace:
 ? __warn+0xab/0xc0
 ? untrack_pfn+0x5c/0x9f
 ? warn_slowpath_null+0xf/0x13
 ? untrack_pfn+0x5c/0x9f
 ? unmap_single_vma+0x43/0x66
 ? unmap_vmas+0x24/0x30
 ? exit_mmap+0x3c/0xa5
 ? __mmput+0xf/0x76
 ? copy_process.part.76+0xb43/0x1129
 ? _do_fork+0x96/0x177
 ? do_int80_syscall_32+0x3e/0x4c
 ? entry_INT80_32+0x2a/0x2a
---[ end trace 8726f9d9fa90d702 ]---
x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org	# v4.2+

---
 arch/x86/include/asm/pat.h        |    3 ++-
 arch/x86/include/asm/pci.h        |    2 +-
 arch/x86/kernel/cpu/mtrr/main.c   |    2 +-
 arch/x86/kernel/setup.c           |    2 ++
 arch/x86/mm/iomap_32.c            |    2 +-
 arch/x86/mm/ioremap.c             |    2 +-
 arch/x86/mm/pat.c                 |   28 +++++++++++++++-------------
 drivers/infiniband/hw/mlx5/main.c |    2 +-
 drivers/media/pci/ivtv/ivtvfb.c   |    2 +-
 9 files changed, 25 insertions(+), 20 deletions(-)

Index: linux-stable/arch/x86/mm/pat.c
===================================================================
--- linux-stable.orig/arch/x86/mm/pat.c
+++ linux-stable/arch/x86/mm/pat.c
@@ -40,7 +40,6 @@
 static bool boot_cpu_done;
 
 static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
-static void init_cache_modes(void);
 
 void pat_disable(const char *reason)
 {
@@ -65,11 +64,13 @@ static int __init nopat(char *str)
 }
 early_param("nopat", nopat);
 
-bool pat_enabled(void)
+static bool __read_mostly __pat_has_wc = false;
+
+bool pat_has_wc(void)
 {
-	return !!__pat_enabled;
+	return __pat_has_wc;
 }
-EXPORT_SYMBOL_GPL(pat_enabled);
+EXPORT_SYMBOL_GPL(pat_has_wc);
 
 int pat_debug_enable;
 
@@ -225,6 +226,7 @@ static void pat_bsp_init(u64 pat)
 	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
+	__pat_has_wc = true;
 
 	__init_cache_modes(pat);
 }
@@ -242,7 +244,7 @@ static void pat_ap_init(u64 pat)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-static void init_cache_modes(void)
+void init_cache_modes(void)
 {
 	u64 pat = 0;
 	static int init_cm_done;
@@ -306,7 +308,7 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
+	if (!__pat_enabled) {
 		init_cache_modes();
 		return;
 	}
@@ -539,7 +541,7 @@ int reserve_memtype(u64 start, u64 end,
 
 	BUG_ON(start >= end); /* end is exclusive */
 
-	if (!pat_enabled()) {
+	if (!pat_has_wc()) {
 		/* This is identical to page table setting without PAT */
 		if (new_type)
 			*new_type = req_type;
@@ -610,7 +612,7 @@ int free_memtype(u64 start, u64 end)
 	int is_range_ram;
 	struct memtype *entry;
 
-	if (!pat_enabled())
+	if (!pat_has_wc())
 		return 0;
 
 	/* Low ISA region is always mapped WB. No need to track */
@@ -765,7 +767,7 @@ static inline int range_is_allowed(unsig
 	u64 to = from + size;
 	u64 cursor = from;
 
-	if (!pat_enabled())
+	if (!pat_has_wc())
 		return 1;
 
 	while (cursor < to) {
@@ -848,7 +850,7 @@ static int reserve_pfn_range(u64 paddr,
 	 * the type requested matches the type of first page in the range.
 	 */
 	if (is_ram) {
-		if (!pat_enabled())
+		if (!pat_has_wc())
 			return 0;
 
 		pcm = lookup_memtype(paddr);
@@ -964,7 +966,7 @@ int track_pfn_remap(struct vm_area_struc
 		return ret;
 	}
 
-	if (!pat_enabled())
+	if (!pat_has_wc())
 		return 0;
 
 	/*
@@ -991,7 +993,7 @@ void track_pfn_insert(struct vm_area_str
 {
 	enum page_cache_mode pcm;
 
-	if (!pat_enabled())
+	if (!pat_has_wc())
 		return;
 
 	/* Set prot based on lookup */
@@ -1128,7 +1130,7 @@ static const struct file_operations memt
 
 static int __init pat_memtype_list_init(void)
 {
-	if (pat_enabled()) {
+	if (pat_has_wc()) {
 		debugfs_create_file("pat_memtype_list", S_IRUSR,
 				    arch_debugfs_dir, NULL, &memtype_fops);
 	}
Index: linux-stable/arch/x86/include/asm/pat.h
===================================================================
--- linux-stable.orig/arch/x86/include/asm/pat.h
+++ linux-stable/arch/x86/include/asm/pat.h
@@ -4,9 +4,10 @@
 #include <linux/types.h>
 #include <asm/pgtable_types.h>
 
-bool pat_enabled(void);
+bool pat_has_wc(void);
 void pat_disable(const char *reason);
 extern void pat_init(void);
+extern void init_cache_modes(void);
 
 extern int reserve_memtype(u64 start, u64 end,
 		enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
Index: linux-stable/arch/x86/kernel/setup.c
===================================================================
--- linux-stable.orig/arch/x86/kernel/setup.c
+++ linux-stable/arch/x86/kernel/setup.c
@@ -1070,6 +1070,8 @@ void __init setup_arch(char **cmdline_p)
 
 	/* update e820 for memory not covered by WB MTRRs */
 	mtrr_bp_init();
+	if (!pat_has_wc())
+		init_cache_modes();
 	if (mtrr_trim_uncached_memory(max_pfn))
 		max_pfn = e820__end_of_ram_pfn();
 
Index: linux-stable/arch/x86/include/asm/pci.h
===================================================================
--- linux-stable.orig/arch/x86/include/asm/pci.h
+++ linux-stable/arch/x86/include/asm/pci.h
@@ -103,7 +103,7 @@ int pcibios_set_irq_routing(struct pci_d
 
 
 #define HAVE_PCI_MMAP
-#define arch_can_pci_mmap_wc()	pat_enabled()
+#define arch_can_pci_mmap_wc()	pat_has_wc()
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE
 
 #ifdef CONFIG_PCI
Index: linux-stable/arch/x86/kernel/cpu/mtrr/main.c
===================================================================
--- linux-stable.orig/arch/x86/kernel/cpu/mtrr/main.c
+++ linux-stable/arch/x86/kernel/cpu/mtrr/main.c
@@ -556,7 +556,7 @@ int arch_phys_wc_add(unsigned long base,
 {
 	int ret;
 
-	if (pat_enabled() || !mtrr_enabled())
+	if (pat_has_wc() || !mtrr_enabled())
 		return 0;  /* Success!  (We don't need to do anything.) */
 
 	ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true);
Index: linux-stable/arch/x86/mm/iomap_32.c
===================================================================
--- linux-stable.orig/arch/x86/mm/iomap_32.c
+++ linux-stable/arch/x86/mm/iomap_32.c
@@ -84,7 +84,7 @@ iomap_atomic_prot_pfn(unsigned long pfn,
 	 * is UC or WC. UC- gets the real intention, of the user, which is
 	 * "WC if the MTRR is WC, UC if you can't do that."
 	 */
-	if (!pat_enabled() && pgprot2cachemode(prot) != _PAGE_CACHE_MODE_WB)
+	if (!pat_has_wc() && pgprot2cachemode(prot) != _PAGE_CACHE_MODE_WB)
 		prot = __pgprot(__PAGE_KERNEL |
 				cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS));
 
Index: linux-stable/arch/x86/mm/ioremap.c
===================================================================
--- linux-stable.orig/arch/x86/mm/ioremap.c
+++ linux-stable/arch/x86/mm/ioremap.c
@@ -230,7 +230,7 @@ void __iomem *ioremap_nocache(resource_s
 {
 	/*
 	 * Ideally, this should be:
-	 *	pat_enabled() ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS;
+	 *	pat_has_wc() ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS;
 	 *
 	 * Till we fix all X drivers to use ioremap_wc(), we will use
 	 * UC MINUS. Drivers that are certain they need or can already
Index: linux-stable/drivers/infiniband/hw/mlx5/main.c
===================================================================
--- linux-stable.orig/drivers/infiniband/hw/mlx5/main.c
+++ linux-stable/drivers/infiniband/hw/mlx5/main.c
@@ -1602,7 +1602,7 @@ static int uar_mmap(struct mlx5_ib_dev *
 	case MLX5_IB_MMAP_WC_PAGE:
 /* Some architectures don't support WC memory */
 #if defined(CONFIG_X86)
-		if (!pat_enabled())
+		if (!pat_has_wc())
 			return -EPERM;
 #elif !(defined(CONFIG_PPC) || (defined(CONFIG_ARM) && defined(CONFIG_MMU)))
 			return -EPERM;
Index: linux-stable/drivers/media/pci/ivtv/ivtvfb.c
===================================================================
--- linux-stable.orig/drivers/media/pci/ivtv/ivtvfb.c
+++ linux-stable/drivers/media/pci/ivtv/ivtvfb.c
@@ -1168,7 +1168,7 @@ static int ivtvfb_init_card(struct ivtv
 	int rc;
 
 #ifdef CONFIG_X86_64
-	if (pat_enabled()) {
+	if (pat_has_wc()) {
 		pr_warn("ivtvfb needs PAT disabled, boot with nopat kernel parameter\n");
 		return -ENODEV;
 	}

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-06-06 22:49       ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
  2017-06-06 22:51         ` Andy Lutomirski
  2017-06-07 19:54         ` Bernhard Held
@ 2017-07-03  5:05         ` Mikulas Patocka
  2017-07-04 13:41           ` Thomas Gleixner
  2 siblings, 1 reply; 27+ messages in thread
From: Mikulas Patocka @ 2017-07-03  5:05 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov,
	Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin,
	Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel



On Tue, 6 Jun 2017, Mikulas Patocka wrote:

> Hi
> 
> Here I send the second version of the patch. It drops the change from 
> boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that 
> caused kernel to be unbootable for some people).
> 
> Another change is that setup_arch() calls init_cache_modes() if PAT is 
> disabled, so that init_cache_modes() is always called.
> 
> Mikulas

Is there any progress with this patch? Will you accept it or do you want 
some changes to it?

> From: Mikulas Patocka <mpatocka@redhat.com>
> 
> In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The
> variable is set to 1 by default and the function pat_init() sets
> __pat_enabled to 0 if the CPU doesn't support PAT.
> 
> However, on AMD K6-3 CPU, the processor initialization code never calls
> pat_init() and so __pat_enabled stays 1 and the function pat_enabled()
> returns true, even though the K6-3 CPU doesn't support PAT.
> 
> The result of this bug is that this warning is produced when attemting to
> start the Xserver and the Xserver doesn't start (fork() returns ENOMEM).
> Another symptom of this bug is that the framebuffer driver doesn't set the
> K6-3 MTRR registers.
> 
> This patch changes pat_enabled() so that it returns true only if pat
> initialization was actually done.
> 
> As Andy Lutomirski suggested, we also need to call init_cache_modes() if
> pat was not initialized, so we call it after mtrr_bp_init()
> (mtrr_bp_init() would or wouldn't call pat_init()). Note that
> init_cache_modes() detects if it was called more than once and does
> nothing in that case.
> 
> x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f
> Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix
> CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35
> Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00
> Call Trace:
>  ? __warn+0xab/0xc0
>  ? untrack_pfn+0x5c/0x9f
>  ? warn_slowpath_null+0xf/0x13
>  ? untrack_pfn+0x5c/0x9f
>  ? unmap_single_vma+0x43/0x66
>  ? unmap_vmas+0x24/0x30
>  ? exit_mmap+0x3c/0xa5
>  ? __mmput+0xf/0x76
>  ? copy_process.part.76+0xb43/0x1129
>  ? _do_fork+0x96/0x177
>  ? do_int80_syscall_32+0x3e/0x4c
>  ? entry_INT80_32+0x2a/0x2a
> ---[ end trace 8726f9d9fa90d702 ]---
> x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Cc: stable@vger.kernel.org	# v4.2+
> 
> ---
>  arch/x86/include/asm/pat.h |    1 +
>  arch/x86/kernel/setup.c    |    2 ++
>  arch/x86/mm/pat.c          |   10 ++++++----
>  3 files changed, 9 insertions(+), 4 deletions(-)
> 
> Index: linux-stable/arch/x86/mm/pat.c
> ===================================================================
> --- linux-stable.orig/arch/x86/mm/pat.c
> +++ linux-stable/arch/x86/mm/pat.c
> @@ -40,7 +40,6 @@
>  static bool boot_cpu_done;
>  
>  static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
> -static void init_cache_modes(void);
>  
>  void pat_disable(const char *reason)
>  {
> @@ -65,9 +64,11 @@ static int __init nopat(char *str)
>  }
>  early_param("nopat", nopat);
>  
> +static bool __read_mostly __pat_initialized = false;
> +
>  bool pat_enabled(void)
>  {
> -	return !!__pat_enabled;
> +	return __pat_initialized;
>  }
>  EXPORT_SYMBOL_GPL(pat_enabled);
>  
> @@ -225,6 +226,7 @@ static void pat_bsp_init(u64 pat)
>  	}
>  
>  	wrmsrl(MSR_IA32_CR_PAT, pat);
> +	__pat_initialized = true;
>  
>  	__init_cache_modes(pat);
>  }
> @@ -242,7 +244,7 @@ static void pat_ap_init(u64 pat)
>  	wrmsrl(MSR_IA32_CR_PAT, pat);
>  }
>  
> -static void init_cache_modes(void)
> +void init_cache_modes(void)
>  {
>  	u64 pat = 0;
>  	static int init_cm_done;
> @@ -306,7 +308,7 @@ void pat_init(void)
>  	u64 pat;
>  	struct cpuinfo_x86 *c = &boot_cpu_data;
>  
> -	if (!pat_enabled()) {
> +	if (!__pat_enabled) {
>  		init_cache_modes();
>  		return;
>  	}
> Index: linux-stable/arch/x86/include/asm/pat.h
> ===================================================================
> --- linux-stable.orig/arch/x86/include/asm/pat.h
> +++ linux-stable/arch/x86/include/asm/pat.h
> @@ -7,6 +7,7 @@
>  bool pat_enabled(void);
>  void pat_disable(const char *reason);
>  extern void pat_init(void);
> +extern void init_cache_modes(void);
>  
>  extern int reserve_memtype(u64 start, u64 end,
>  		enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
> Index: linux-stable/arch/x86/kernel/setup.c
> ===================================================================
> --- linux-stable.orig/arch/x86/kernel/setup.c
> +++ linux-stable/arch/x86/kernel/setup.c
> @@ -1070,6 +1070,8 @@ void __init setup_arch(char **cmdline_p)
>  
>  	/* update e820 for memory not covered by WB MTRRs */
>  	mtrr_bp_init();
> +	if (!pat_enabled())
> +		init_cache_modes();
>  	if (mtrr_trim_uncached_memory(max_pfn))
>  		max_pfn = e820__end_of_ram_pfn();
>  
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-07-03  5:05         ` Mikulas Patocka
@ 2017-07-04 13:41           ` Thomas Gleixner
  2017-07-04 13:48             ` Thomas Gleixner
  2017-07-04 23:04             ` [PATCH v3] " Mikulas Patocka
  0 siblings, 2 replies; 27+ messages in thread
From: Thomas Gleixner @ 2017-07-04 13:41 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Ingo Molnar, Bernhard Held, Andy Lutomirski, Toshi Kani,
	Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds,
	H. Peter Anvin, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel

On Mon, 3 Jul 2017, Mikulas Patocka wrote:
> Is there any progress with this patch? Will you accept it or do you want 
> some changes to it?

Aside of the unparseable changelog, that patch is mostly duct tape.

1) __pat_enabled should be renamed to pat_disabled, as that is the actual
   purpose of that variable

2) Making the call to init_cache_modes() conditional in setup_arch() is
   pointless. init_cache_modes() has it's own protection against multiple
   invocations.

3) It adds yet another invocation of init_cache_modes() instead of getting
   rid of the ones in pat_disable() and the pat disabled case in pat_init().

I've reworked the whole thing into the patch below.

Thanks,

	tglx

8<---------------------
Subject: x86/mm/pat: Don't report PAT on CPUs that don't support it
From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 6 Jun 2017 18:49:39 -0400 (EDT)

The pat_enabled() logic is broken on CPUs which do not support PAT and
where the initialization code fails to call pat_init(). Due to that the
enabled flag stays true and pat_enabled() returns true wrongfully.

As a consequence the mappings, e.g. for Xorg, are set up with the wrong
caching mode and the required MTRR setups are omitted.

To cure this the following changes are required:

  1) Make pat_enabled() return true only if PAT initialization was
     invoked and successful.

  2) Invoke init_cache_modes() unconditionally in setup_arch() and
     remove the extra callsites in pat_disable() and the pat disabled
     code path in pat_init().

Also rename __pat_enabled to pat_disabled to reflect the real purpose of
this variable.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: "Luis R. Rodriguez" <mcgrof@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>

---
 arch/x86/include/asm/pat.h |    1 +
 arch/x86/kernel/setup.c    |    7 +++++++
 arch/x86/mm/pat.c          |   22 +++++++++-------------
 3 files changed, 17 insertions(+), 13 deletions(-)

--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -7,6 +7,7 @@
 bool pat_enabled(void);
 void pat_disable(const char *reason);
 extern void pat_init(void);
+extern void init_cache_modes(void);
 
 extern int reserve_memtype(u64 start, u64 end,
 		enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1076,6 +1076,13 @@ void __init setup_arch(char **cmdline_p)
 	max_possible_pfn = max_pfn;
 
 	/*
+	 * This call is required when the CPU does not support PAT. If
+	 * mtrr_bp_init() invoked it already via pat_init() the call has no
+	 * effect.
+	 */
+	init_cache_modes();
+
+	/*
 	 * Define random base addresses for memory sections after max_pfn is
 	 * defined and before each memory section base is used.
 	 */
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -37,14 +37,13 @@
 #undef pr_fmt
 #define pr_fmt(fmt) "" fmt
 
-static bool boot_cpu_done;
-
-static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
-static void init_cache_modes(void);
+static bool __read_mostly boot_cpu_done;
+static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
+static bool __read_mostly pat_initialized;
 
 void pat_disable(const char *reason)
 {
-	if (!__pat_enabled)
+	if (pat_disabled)
 		return;
 
 	if (boot_cpu_done) {
@@ -52,10 +51,8 @@ void pat_disable(const char *reason)
 		return;
 	}
 
-	__pat_enabled = 0;
+	pat_disabled = true;
 	pr_info("x86/PAT: %s\n", reason);
-
-	init_cache_modes();
 }
 
 static int __init nopat(char *str)
@@ -67,7 +64,7 @@ early_param("nopat", nopat);
 
 bool pat_enabled(void)
 {
-	return !!__pat_enabled;
+	return pat_initialized;
 }
 EXPORT_SYMBOL_GPL(pat_enabled);
 
@@ -225,6 +222,7 @@ static void pat_bsp_init(u64 pat)
 	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
+	__pat_initialized = true;
 
 	__init_cache_modes(pat);
 }
@@ -242,7 +240,7 @@ static void pat_ap_init(u64 pat)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-static void init_cache_modes(void)
+void init_cache_modes(void)
 {
 	u64 pat = 0;
 	static int init_cm_done;
@@ -306,10 +304,8 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
-		init_cache_modes();
+	if (pat_disabled)
 		return;
-	}
 
 	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
 	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it
  2017-07-04 13:41           ` Thomas Gleixner
@ 2017-07-04 13:48             ` Thomas Gleixner
  2017-07-04 23:04             ` [PATCH v3] " Mikulas Patocka
  1 sibling, 0 replies; 27+ messages in thread
From: Thomas Gleixner @ 2017-07-04 13:48 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Ingo Molnar, Bernhard Held, Andy Lutomirski, Toshi Kani,
	Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds,
	H. Peter Anvin, Peter Zijlstra, Luis R. Rodriguez,
	Denys Vlasenko, Josh Poimboeuf, linux-kernel

On Tue, 4 Jul 2017, Thomas Gleixner wrote:
>  	wrmsrl(MSR_IA32_CR_PAT, pat);
> +	__pat_initialized = true;

That should be

> +	pat_initialized = true;

of course.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3] X86: don't report PAT on CPUs that don't support it
  2017-07-04 13:41           ` Thomas Gleixner
  2017-07-04 13:48             ` Thomas Gleixner
@ 2017-07-04 23:04             ` Mikulas Patocka
  2017-07-05  7:03               ` [tip:x86/urgent] x86/mm/pat: Don't " tip-bot for Mikulas Patocka
  1 sibling, 1 reply; 27+ messages in thread
From: Mikulas Patocka @ 2017-07-04 23:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, Bernhard Held, Andy Lutomirski, Borislav Petkov,
	Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin,
	Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko,
	Josh Poimboeuf, linux-kernel



On Tue, 4 Jul 2017, Thomas Gleixner wrote:

> On Mon, 3 Jul 2017, Mikulas Patocka wrote:
> > Is there any progress with this patch? Will you accept it or do you want 
> > some changes to it?
> 
> Aside of the unparseable changelog, that patch is mostly duct tape.
> 
> 1) __pat_enabled should be renamed to pat_disabled, as that is the actual
>    purpose of that variable
> 
> 2) Making the call to init_cache_modes() conditional in setup_arch() is
>    pointless. init_cache_modes() has it's own protection against multiple
>    invocations.
> 
> 3) It adds yet another invocation of init_cache_modes() instead of getting
>    rid of the ones in pat_disable() and the pat disabled case in pat_init().
> 
> I've reworked the whole thing into the patch below.
> 
> Thanks,
> 
> 	tglx

Yes - renaming __pat_enabled to pat_disabled is a good thing.

Just one more change - init_cache_modes() is protected against multiple 
calls, but pat_bsp_init() calls __init_cache_modes() (not 
init_cacha_modes()). The generic code would call init_cache_modes() later 
and init_cache_modes() would do the initialization again - it would be 
mostly harmless because it would just read the pat MSR that pat_bsp_init 
have written and call __init_cache_modes() with the same value - the 
symptom is that on machines with PAT we see this line twice in the syslog:

[    0.000000] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT
[    0.000000] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- WT

I fixed this double initialization by moving the variable init_cm_done to 
file scope and setting it in __init_cache_modes().

Mikulas


8<---------------------
Subject: x86/mm/pat: Don't report PAT on CPUs that don't support it
From: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue, 6 Jun 2017 18:49:39 -0400 (EDT)

The pat_enabled() logic is broken on CPUs which do not support PAT and
where the initialization code fails to call pat_init(). Due to that the
enabled flag stays true and pat_enabled() returns true wrongfully.

As a consequence the mappings, e.g. for Xorg, are set up with the wrong
caching mode and the required MTRR setups are omitted.

To cure this the following changes are required:

  1) Make pat_enabled() return true only if PAT initialization was
     invoked and successful.

  2) Invoke init_cache_modes() unconditionally in setup_arch() and
     remove the extra callsites in pat_disable() and the pat disabled
     code path in pat_init().

Also rename __pat_enabled to pat_disabled to reflect the real purpose of
this variable.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: "Luis R. Rodriguez" <mcgrof@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org	# v4.2+

---
 arch/x86/include/asm/pat.h |    1 +
 arch/x86/kernel/setup.c    |    7 +++++++
 arch/x86/mm/pat.c          |   28 ++++++++++++----------------
 3 files changed, 20 insertions(+), 16 deletions(-)

Index: linux-2.6/arch/x86/include/asm/pat.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/pat.h
+++ linux-2.6/arch/x86/include/asm/pat.h
@@ -7,6 +7,7 @@
 bool pat_enabled(void);
 void pat_disable(const char *reason);
 extern void pat_init(void);
+extern void init_cache_modes(void);
 
 extern int reserve_memtype(u64 start, u64 end,
 		enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
Index: linux-2.6/arch/x86/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup.c
+++ linux-2.6/arch/x86/kernel/setup.c
@@ -1076,6 +1076,13 @@ void __init setup_arch(char **cmdline_p)
 	max_possible_pfn = max_pfn;
 
 	/*
+	 * This call is required when the CPU does not support PAT. If
+	 * mtrr_bp_init() invoked it already via pat_init() the call has no
+	 * effect.
+	 */
+	init_cache_modes();
+
+	/*
 	 * Define random base addresses for memory sections after max_pfn is
 	 * defined and before each memory section base is used.
 	 */
Index: linux-2.6/arch/x86/mm/pat.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/pat.c
+++ linux-2.6/arch/x86/mm/pat.c
@@ -37,14 +37,14 @@
 #undef pr_fmt
 #define pr_fmt(fmt) "" fmt
 
-static bool boot_cpu_done;
-
-static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
-static void init_cache_modes(void);
+static bool __read_mostly boot_cpu_done;
+static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
+static bool __read_mostly pat_initialized;
+static bool __read_mostly init_cm_done;
 
 void pat_disable(const char *reason)
 {
-	if (!__pat_enabled)
+	if (pat_disabled)
 		return;
 
 	if (boot_cpu_done) {
@@ -52,10 +52,8 @@ void pat_disable(const char *reason)
 		return;
 	}
 
-	__pat_enabled = 0;
+	pat_disabled = true;
 	pr_info("x86/PAT: %s\n", reason);
-
-	init_cache_modes();
 }
 
 static int __init nopat(char *str)
@@ -67,7 +65,7 @@ early_param("nopat", nopat);
 
 bool pat_enabled(void)
 {
-	return !!__pat_enabled;
+	return pat_initialized;
 }
 EXPORT_SYMBOL_GPL(pat_enabled);
 
@@ -205,6 +203,8 @@ static void __init_cache_modes(u64 pat)
 		update_cache_mode_entry(i, cache);
 	}
 	pr_info("x86/PAT: Configuration [0-7]: %s\n", pat_msg);
+
+	init_cm_done = true;
 }
 
 #define PAT(x, y)	((u64)PAT_ ## y << ((x)*8))
@@ -225,6 +225,7 @@ static void pat_bsp_init(u64 pat)
 	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
+	pat_initialized = true;
 
 	__init_cache_modes(pat);
 }
@@ -242,10 +243,9 @@ static void pat_ap_init(u64 pat)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-static void init_cache_modes(void)
+void init_cache_modes(void)
 {
 	u64 pat = 0;
-	static int init_cm_done;
 
 	if (init_cm_done)
 		return;
@@ -287,8 +287,6 @@ static void init_cache_modes(void)
 	}
 
 	__init_cache_modes(pat);
-
-	init_cm_done = 1;
 }
 
 /**
@@ -306,10 +304,8 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
-		init_cache_modes();
+	if (pat_disabled)
 		return;
-	}
 
 	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
 	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [tip:x86/urgent] x86/mm/pat: Don't report PAT on CPUs that don't support it
  2017-07-04 23:04             ` [PATCH v3] " Mikulas Patocka
@ 2017-07-05  7:03               ` tip-bot for Mikulas Patocka
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Mikulas Patocka @ 2017-07-05  7:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, bp, dvlasenk, jpoimboe, akpm, peterz, brgerst,
	mcgrof, mingo, berny156, hpa, mpatocka, torvalds, luto, tglx

Commit-ID:  99c13b8c8896d7bcb92753bf0c63a8de4326e78d
Gitweb:     http://git.kernel.org/tip/99c13b8c8896d7bcb92753bf0c63a8de4326e78d
Author:     Mikulas Patocka <mpatocka@redhat.com>
AuthorDate: Tue, 4 Jul 2017 19:04:23 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 5 Jul 2017 09:01:24 +0200

x86/mm/pat: Don't report PAT on CPUs that don't support it

The pat_enabled() logic is broken on CPUs which do not support PAT and
where the initialization code fails to call pat_init(). Due to that the
enabled flag stays true and pat_enabled() returns true wrongfully.

As a consequence the mappings, e.g. for Xorg, are set up with the wrong
caching mode and the required MTRR setups are omitted.

To cure this the following changes are required:

  1) Make pat_enabled() return true only if PAT initialization was
     invoked and successful.

  2) Invoke init_cache_modes() unconditionally in setup_arch() and
     remove the extra callsites in pat_disable() and the pat disabled
     code path in pat_init().

Also rename __pat_enabled to pat_disabled to reflect the real purpose of
this variable.

Fixes: 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: "Luis R. Rodriguez" <mcgrof@suse.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707041749300.3456@file01.intranet.prod.int.rdu2.redhat.com

---
 arch/x86/include/asm/pat.h |  1 +
 arch/x86/kernel/setup.c    |  7 +++++++
 arch/x86/mm/pat.c          | 28 ++++++++++++----------------
 3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h
index 0b1ff4c..fffb279 100644
--- a/arch/x86/include/asm/pat.h
+++ b/arch/x86/include/asm/pat.h
@@ -7,6 +7,7 @@
 bool pat_enabled(void);
 void pat_disable(const char *reason);
 extern void pat_init(void);
+extern void init_cache_modes(void);
 
 extern int reserve_memtype(u64 start, u64 end,
 		enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 65622f0..3486d04 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1076,6 +1076,13 @@ void __init setup_arch(char **cmdline_p)
 	max_possible_pfn = max_pfn;
 
 	/*
+	 * This call is required when the CPU does not support PAT. If
+	 * mtrr_bp_init() invoked it already via pat_init() the call has no
+	 * effect.
+	 */
+	init_cache_modes();
+
+	/*
 	 * Define random base addresses for memory sections after max_pfn is
 	 * defined and before each memory section base is used.
 	 */
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 9b78685..4597950 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -37,14 +37,14 @@
 #undef pr_fmt
 #define pr_fmt(fmt) "" fmt
 
-static bool boot_cpu_done;
-
-static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT);
-static void init_cache_modes(void);
+static bool __read_mostly boot_cpu_done;
+static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT);
+static bool __read_mostly pat_initialized;
+static bool __read_mostly init_cm_done;
 
 void pat_disable(const char *reason)
 {
-	if (!__pat_enabled)
+	if (pat_disabled)
 		return;
 
 	if (boot_cpu_done) {
@@ -52,10 +52,8 @@ void pat_disable(const char *reason)
 		return;
 	}
 
-	__pat_enabled = 0;
+	pat_disabled = true;
 	pr_info("x86/PAT: %s\n", reason);
-
-	init_cache_modes();
 }
 
 static int __init nopat(char *str)
@@ -67,7 +65,7 @@ early_param("nopat", nopat);
 
 bool pat_enabled(void)
 {
-	return !!__pat_enabled;
+	return pat_initialized;
 }
 EXPORT_SYMBOL_GPL(pat_enabled);
 
@@ -205,6 +203,8 @@ static void __init_cache_modes(u64 pat)
 		update_cache_mode_entry(i, cache);
 	}
 	pr_info("x86/PAT: Configuration [0-7]: %s\n", pat_msg);
+
+	init_cm_done = true;
 }
 
 #define PAT(x, y)	((u64)PAT_ ## y << ((x)*8))
@@ -225,6 +225,7 @@ static void pat_bsp_init(u64 pat)
 	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
+	pat_initialized = true;
 
 	__init_cache_modes(pat);
 }
@@ -242,10 +243,9 @@ static void pat_ap_init(u64 pat)
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 }
 
-static void init_cache_modes(void)
+void init_cache_modes(void)
 {
 	u64 pat = 0;
-	static int init_cm_done;
 
 	if (init_cm_done)
 		return;
@@ -287,8 +287,6 @@ static void init_cache_modes(void)
 	}
 
 	__init_cache_modes(pat);
-
-	init_cm_done = 1;
 }
 
 /**
@@ -306,10 +304,8 @@ void pat_init(void)
 	u64 pat;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled()) {
-		init_cache_modes();
+	if (pat_disabled)
 		return;
-	}
 
 	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
 	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||

^ permalink raw reply related	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-07-05  7:09 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
2017-04-18 19:28 ` H. Peter Anvin
2017-04-18 20:47   ` Mikulas Patocka
2017-05-14 22:07     ` Mikulas Patocka
2017-05-16 13:57 ` H. Peter Anvin
2017-05-16 15:49   ` Mikulas Patocka
2017-05-18  7:17     ` Ingo Molnar
2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka
2017-05-28 18:18   ` Bernhard Held
2017-05-28 18:43     ` Andy Lutomirski
2017-05-29 22:50       ` Mikulas Patocka
2017-05-30 17:14         ` Dominik Brodowski
2017-05-30 17:59           ` Mikulas Patocka
2017-05-30 18:47             ` Dominik Brodowski
2017-05-30 19:30             ` Bernhard Held
2017-05-31  9:39             ` Junichi Nomura
2017-06-06 22:49       ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka
2017-06-06 22:51         ` Andy Lutomirski
2017-06-06 23:21           ` Mikulas Patocka
2017-06-13 15:54             ` Andy Lutomirski
2017-06-14 20:24               ` Mikulas Patocka
2017-06-07 19:54         ` Bernhard Held
2017-07-03  5:05         ` Mikulas Patocka
2017-07-04 13:41           ` Thomas Gleixner
2017-07-04 13:48             ` Thomas Gleixner
2017-07-04 23:04             ` [PATCH v3] " Mikulas Patocka
2017-07-05  7:03               ` [tip:x86/urgent] x86/mm/pat: Don't " tip-bot for Mikulas Patocka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).