* [PATCH] X86: don't report PAT on CPUs that don't support it @ 2017-04-18 19:07 Mikulas Patocka 2017-04-18 19:28 ` H. Peter Anvin ` (2 more replies) 0 siblings, 3 replies; 31+ messages in thread From: Mikulas Patocka @ 2017-04-18 19:07 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H. Peter Anvin; +Cc: x86, linux-kernel In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The variable is set to 1 by default and the function pat_init() sets __pat_enabled to 0 if the CPU doesn't support PAT. However, on AMD K6-3 CPU, the processor initialization code never calls pat_init() and so __pat_enabled stays 1 and the function pat_enabled() returns true, even though the K6-3 CPU doesn't support PAT. The result of this bug is that this warning is produced when attemting to start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). Another symptom of this bug is that the framebuffer driver doesn't set the K6-3 MTRR registers. This patch changes pat_enabled() so that it returns true only if pat initialization was actually done. Also, I changed boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT feature on the processor that is being initialized. x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining ------------[ cut here ]------------ WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35 Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00 Call Trace: ? __warn+0xab/0xc0 ? untrack_pfn+0x5c/0x9f ? warn_slowpath_null+0xf/0x13 ? untrack_pfn+0x5c/0x9f ? unmap_single_vma+0x43/0x66 ? unmap_vmas+0x24/0x30 ? exit_mmap+0x3c/0xa5 ? __mmput+0xf/0x76 ? copy_process.part.76+0xb43/0x1129 ? _do_fork+0x96/0x177 ? do_int80_syscall_32+0x3e/0x4c ? entry_INT80_32+0x2a/0x2a ---[ end trace 8726f9d9fa90d702 ]--- x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v4.2+ --- arch/x86/mm/pat.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) Index: linux-2.6/arch/x86/mm/pat.c =================================================================== --- linux-2.6.orig/arch/x86/mm/pat.c +++ linux-2.6/arch/x86/mm/pat.c @@ -64,9 +64,11 @@ static int __init nopat(char *str) } early_param("nopat", nopat); +static bool __read_mostly __pat_initialized = false; + bool pat_enabled(void) { - return !!__pat_enabled; + return __pat_initialized; } EXPORT_SYMBOL_GPL(pat_enabled); @@ -224,13 +226,14 @@ static void pat_bsp_init(u64 pat) } wrmsrl(MSR_IA32_CR_PAT, pat); + __pat_initialized = true; __init_cache_modes(pat); } static void pat_ap_init(u64 pat) { - if (!boot_cpu_has(X86_FEATURE_PAT)) { + if (!this_cpu_has(X86_FEATURE_PAT)) { /* * If this happens we are on a secondary CPU, but switched to * PAT on the boot CPU. We have no way to undo PAT. @@ -305,7 +308,7 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { + if (!__pat_enabled) { init_cache_modes(); return; } ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] X86: don't report PAT on CPUs that don't support it 2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka @ 2017-04-18 19:28 ` H. Peter Anvin 2017-04-18 20:47 ` Mikulas Patocka 2017-05-16 13:57 ` H. Peter Anvin 2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka 2 siblings, 1 reply; 31+ messages in thread From: H. Peter Anvin @ 2017-04-18 19:28 UTC (permalink / raw) To: Mikulas Patocka, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel On 04/18/17 12:07, Mikulas Patocka wrote: > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The > variable is set to 1 by default and the function pat_init() sets > __pat_enabled to 0 if the CPU doesn't support PAT. > > However, on AMD K6-3 CPU, the processor initialization code never calls > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > returns true, even though the K6-3 CPU doesn't support PAT. > > The result of this bug is that this warning is produced when attemting to > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). > Another symptom of this bug is that the framebuffer driver doesn't set the > K6-3 MTRR registers. > > This patch changes pat_enabled() so that it returns true only if pat > initialization was actually done. > > Also, I changed boot_cpu_has(X86_FEATURE_PAT) to > this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT > feature on the processor that is being initialized. > I'm thinking it would be better to replace __pat_enabled with static_cpu_has(X86_FEATURE_PAT) and disable the feature bit if the user has disabled it on the command line, just as we do with other features. -hpa ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] X86: don't report PAT on CPUs that don't support it 2017-04-18 19:28 ` H. Peter Anvin @ 2017-04-18 20:47 ` Mikulas Patocka 2017-05-14 22:07 ` Mikulas Patocka 0 siblings, 1 reply; 31+ messages in thread From: Mikulas Patocka @ 2017-04-18 20:47 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Tue, 18 Apr 2017, H. Peter Anvin wrote: > On 04/18/17 12:07, Mikulas Patocka wrote: > > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The > > variable is set to 1 by default and the function pat_init() sets > > __pat_enabled to 0 if the CPU doesn't support PAT. > > > > However, on AMD K6-3 CPU, the processor initialization code never calls > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > > returns true, even though the K6-3 CPU doesn't support PAT. > > > > The result of this bug is that this warning is produced when attemting to > > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). > > Another symptom of this bug is that the framebuffer driver doesn't set the > > K6-3 MTRR registers. > > > > This patch changes pat_enabled() so that it returns true only if pat > > initialization was actually done. > > > > Also, I changed boot_cpu_has(X86_FEATURE_PAT) to > > this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT > > feature on the processor that is being initialized. > > > > I'm thinking it would be better to replace __pat_enabled with > static_cpu_has(X86_FEATURE_PAT) and disable the feature bit if the user > has disabled it on the command line, just as we do with other features. > > -hpa If MTRR initialization fails for whatever reason, then pat_init() won't be called and the kernel would mistakenly believe that PAT is working (because there would be no one to clear X86_FEATURE_PAT). I think that pat should be reported only if pat_init() is called and succeeds. Another strange thing: pat_disable() calls init_cache_modes() - but since pat_disable() may not be called at all, it is possible that init_cache_modes() is also not called at all. It doesn't produce any visible misbehavior on my machine, but it doesn't seem right - we should not call init_cache_modes() from pat_disable() and do the initialization elsewhere, where it is always called. Mikulas ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] X86: don't report PAT on CPUs that don't support it 2017-04-18 20:47 ` Mikulas Patocka @ 2017-05-14 22:07 ` Mikulas Patocka 0 siblings, 0 replies; 31+ messages in thread From: Mikulas Patocka @ 2017-05-14 22:07 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Tue, 18 Apr 2017, Mikulas Patocka wrote: > > > On Tue, 18 Apr 2017, H. Peter Anvin wrote: > > > On 04/18/17 12:07, Mikulas Patocka wrote: > > > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The > > > variable is set to 1 by default and the function pat_init() sets > > > __pat_enabled to 0 if the CPU doesn't support PAT. > > > > > > However, on AMD K6-3 CPU, the processor initialization code never calls > > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > > > returns true, even though the K6-3 CPU doesn't support PAT. > > > > > > The result of this bug is that this warning is produced when attemting to > > > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). > > > Another symptom of this bug is that the framebuffer driver doesn't set the > > > K6-3 MTRR registers. > > > > > > This patch changes pat_enabled() so that it returns true only if pat > > > initialization was actually done. > > > > > > Also, I changed boot_cpu_has(X86_FEATURE_PAT) to > > > this_cpu_has(X86_FEATURE_PAT) in pat_ap_init, so that we check the PAT > > > feature on the processor that is being initialized. > > > > > > > I'm thinking it would be better to replace __pat_enabled with > > static_cpu_has(X86_FEATURE_PAT) and disable the feature bit if the user > > has disabled it on the command line, just as we do with other features. > > > > -hpa What's the status of this bug report? Will you accept the patch that I sent or do you want a different patch? If you want a special X86 capability for PAT, you can use the patch below. Note, that we can't use X86_FEATURE_PAT for this (because if CPU supports PAT, but pat_init() is not called, pat_enabled() would incorrectly report true), so we need a different capability X86_FEATURE_PAT_ENABLED. Mikulas > If MTRR initialization fails for whatever reason, then pat_init() won't be > called and the kernel would mistakenly believe that PAT is working > (because there would be no one to clear X86_FEATURE_PAT). > > I think that pat should be reported only if pat_init() is called and > succeeds. > > > Another strange thing: pat_disable() calls init_cache_modes() - but since > pat_disable() may not be called at all, it is possible that > init_cache_modes() is also not called at all. It doesn't produce any > visible misbehavior on my machine, but it doesn't seem right - we should > not call init_cache_modes() from pat_disable() and do the initialization > elsewhere, where it is always called. > > Mikulas X86: don't report PAT on CPUs that don't support it In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The variable is set to 1 by default and the function pat_init() sets __pat_enabled to 0 if the CPU doesn't support PAT. However, on AMD K6-3 CPU, the processor initialization code never calls pat_init() and so __pat_enabled stays 1 and the function pat_enabled() returns true, even though the K6-3 CPU doesn't support PAT. The result of this bug is that this warning is produced when attemting to start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). Another symptom of this bug is that the framebuffer driver doesn't set the K6-3 MTRR registers. This patch introduces a new CPU capability X86_FEATURE_PAT_ENABLED (which is set when pat_init() is called and succeeds) and changes pat_enabled() to return this capability. Note that we can use X86_FEATURE_PAT for this purpose, because if pat_init() is not called, there would be no one to clear X86_FEATURE_PAT and the kernel would mistakenly behave as if PAT is working. x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining ------------[ cut here ]------------ WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35 Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00 Call Trace: ? __warn+0xab/0xc0 ? untrack_pfn+0x5c/0x9f ? warn_slowpath_null+0xf/0x13 ? untrack_pfn+0x5c/0x9f ? unmap_single_vma+0x43/0x66 ? unmap_vmas+0x24/0x30 ? exit_mmap+0x3c/0xa5 ? __mmput+0xf/0x76 ? copy_process.part.76+0xb43/0x1129 ? _do_fork+0x96/0x177 ? do_int80_syscall_32+0x3e/0x4c ? entry_INT80_32+0x2a/0x2a ---[ end trace 8726f9d9fa90d702 ]--- x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v4.2+ --- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/pat.h | 3 ++- arch/x86/mm/pat.c | 12 ++++-------- 3 files changed, 7 insertions(+), 9 deletions(-) Index: linux-2.6/arch/x86/mm/pat.c =================================================================== --- linux-2.6.orig/arch/x86/mm/pat.c +++ linux-2.6/arch/x86/mm/pat.c @@ -64,12 +64,6 @@ static int __init nopat(char *str) } early_param("nopat", nopat); -bool pat_enabled(void) -{ - return !!__pat_enabled; -} -EXPORT_SYMBOL_GPL(pat_enabled); - int pat_debug_enable; static int __init pat_debug_setup(char *str) @@ -225,12 +219,14 @@ static void pat_bsp_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); + setup_force_cpu_cap(X86_FEATURE_PAT_ENABLED); + __init_cache_modes(pat); } static void pat_ap_init(u64 pat) { - if (!boot_cpu_has(X86_FEATURE_PAT)) { + if (!this_cpu_has(X86_FEATURE_PAT)) { /* * If this happens we are on a secondary CPU, but switched to * PAT on the boot CPU. We have no way to undo PAT. @@ -305,7 +301,7 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { + if (!__pat_enabled) { init_cache_modes(); return; } Index: linux-2.6/arch/x86/include/asm/pat.h =================================================================== --- linux-2.6.orig/arch/x86/include/asm/pat.h +++ linux-2.6/arch/x86/include/asm/pat.h @@ -4,7 +4,8 @@ #include <linux/types.h> #include <asm/pgtable_types.h> -bool pat_enabled(void); +#define pat_enabled() static_cpu_has(X86_FEATURE_PAT_ENABLED) + void pat_disable(const char *reason); extern void pat_init(void); Index: linux-2.6/arch/x86/include/asm/cpufeatures.h =================================================================== --- linux-2.6.orig/arch/x86/include/asm/cpufeatures.h +++ linux-2.6/arch/x86/include/asm/cpufeatures.h @@ -104,6 +104,7 @@ #define X86_FEATURE_EXTD_APICID ( 3*32+26) /* has extended APICID (8 bits) */ #define X86_FEATURE_AMD_DCM ( 3*32+27) /* multi-node processor */ #define X86_FEATURE_APERFMPERF ( 3*32+28) /* APERFMPERF */ +#define X86_FEATURE_PAT_ENABLED ( 3*32+29) /* PAT enabled */ #define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */ #define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */ ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] X86: don't report PAT on CPUs that don't support it 2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 2017-04-18 19:28 ` H. Peter Anvin @ 2017-05-16 13:57 ` H. Peter Anvin 2017-05-16 15:49 ` Mikulas Patocka 2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka 2 siblings, 1 reply; 31+ messages in thread From: H. Peter Anvin @ 2017-05-16 13:57 UTC (permalink / raw) To: Mikulas Patocka, Thomas Gleixner, Ingo Molnar; +Cc: x86, linux-kernel On 04/18/17 12:07, Mikulas Patocka wrote: > > However, on AMD K6-3 CPU, the processor initialization code never calls > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > returns true, even though the K6-3 CPU doesn't support PAT. > OK, now I'm wondering: are you actually *using* said K6-3 machine, and if so, are you actually dependent on write combining on it? The reason I'm asking is because I would personally like to completely remove the support for using MTRRs to create WC mappings, as it only affects a handful of ancient CPUs: Pentium Pro, Pentium II, K6-*, and possibly some Cyrix/Centaur part. Earlier CPUs didn't have WC, but could set WB, WT or UC via the page tables without needing the PAT MSR, and newer CPUs have PAT. -hpa ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] X86: don't report PAT on CPUs that don't support it 2017-05-16 13:57 ` H. Peter Anvin @ 2017-05-16 15:49 ` Mikulas Patocka 2017-05-18 7:17 ` Ingo Molnar 0 siblings, 1 reply; 31+ messages in thread From: Mikulas Patocka @ 2017-05-16 15:49 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Thomas Gleixner, Ingo Molnar, x86, linux-kernel On Tue, 16 May 2017, H. Peter Anvin wrote: > On 04/18/17 12:07, Mikulas Patocka wrote: > > > > However, on AMD K6-3 CPU, the processor initialization code never calls > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > > returns true, even though the K6-3 CPU doesn't support PAT. > > > > OK, now I'm wondering: are you actually *using* said K6-3 machine, and I use it for playing music, browsing with the links browser and connecting to other machines with ssh. That machine is slow but it is completely quiet. It is also good to run my own software on a slow CPU to make sure that there are no obvious inefficiencies. > if so, are you actually dependent on write combining on it? The reason Those K6-3 MTRRs improve framebuffer write throughput by 33%. > I'm asking is because I would personally like to completely remove the > support for using MTRRs to create WC mappings, as it only affects a > handful of ancient CPUs: Pentium Pro, Pentium II, K6-*, and possibly > some Cyrix/Centaur part. Earlier CPUs didn't have WC, but could set WB, > WT or UC via the page tables without needing the PAT MSR, and newer CPUs > have PAT. MTRRs are also needed on Pentium 3, Core Solo and Core Duo due to an erratum that makes it not possible to set WC with PAT. See the comment before "clear_cpu_cap(c, X86_FEATURE_PAT)" in early_init_intel(). Mikulas > -hpa > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH] X86: don't report PAT on CPUs that don't support it 2017-05-16 15:49 ` Mikulas Patocka @ 2017-05-18 7:17 ` Ingo Molnar 0 siblings, 0 replies; 31+ messages in thread From: Ingo Molnar @ 2017-05-18 7:17 UTC (permalink / raw) To: Mikulas Patocka Cc: H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, linux-kernel * Mikulas Patocka <mpatocka@redhat.com> wrote: > On Tue, 16 May 2017, H. Peter Anvin wrote: > > > On 04/18/17 12:07, Mikulas Patocka wrote: > > > > > > However, on AMD K6-3 CPU, the processor initialization code never calls > > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > > > returns true, even though the K6-3 CPU doesn't support PAT. > > > > > > > OK, now I'm wondering: are you actually *using* said K6-3 machine, and > > I use it for playing music, browsing with the links browser and connecting > to other machines with ssh. That machine is slow but it is completely > quiet. > > It is also good to run my own software on a slow CPU to make sure that > there are no obvious inefficiencies. > > > if so, are you actually dependent on write combining on it? The reason > > Those K6-3 MTRRs improve framebuffer write throughput by 33%. > > > I'm asking is because I would personally like to completely remove the > > support for using MTRRs to create WC mappings, as it only affects a > > handful of ancient CPUs: Pentium Pro, Pentium II, K6-*, and possibly > > some Cyrix/Centaur part. Earlier CPUs didn't have WC, but could set WB, > > WT or UC via the page tables without needing the PAT MSR, and newer CPUs > > have PAT. > > MTRRs are also needed on Pentium 3, Core Solo and Core Duo due to an > erratum that makes it not possible to set WC with PAT. See the comment > before "clear_cpu_cap(c, X86_FEATURE_PAT)" in early_init_intel(). Ok, I'm inclined to apply your regression fix - hpa do you concur? Thanks, Ingo ^ permalink raw reply [flat|nested] 31+ messages in thread
* [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 2017-04-18 19:28 ` H. Peter Anvin 2017-05-16 13:57 ` H. Peter Anvin @ 2017-05-24 10:21 ` tip-bot for Mikulas Patocka 2017-05-28 18:18 ` Bernhard Held 2 siblings, 1 reply; 31+ messages in thread From: tip-bot for Mikulas Patocka @ 2017-05-24 10:21 UTC (permalink / raw) To: linux-tip-commits Cc: mpatocka, linux-kernel, peterz, mcgrof, tglx, dvlasenk, jpoimboe, toshi.kani, luto, akpm, brgerst, bp, torvalds, hpa, mingo Commit-ID: cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4 Gitweb: http://git.kernel.org/tip/cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4 Author: Mikulas Patocka <mpatocka@redhat.com> AuthorDate: Tue, 18 Apr 2017 15:07:11 -0400 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 May 2017 10:17:23 +0200 x86/PAT: Fix Xorg regression on CPUs that don't support PAT In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The variable is set to 1 by default and the function pat_init() sets __pat_enabled to 0 if the CPU doesn't support PAT. However, on AMD K6-3 CPUs, the processor initialization code never calls pat_init() and so __pat_enabled stays 1 and the function pat_enabled() returns true, even though the K6-3 CPU doesn't support PAT. The result of this bug is that a kernel warning is produced when attempting to start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). Another symptom of this bug is that the framebuffer driver doesn't set the K6-3 MTRR registers: x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining ------------[ cut here ]------------ WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f ... x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining To fix the bug change pat_enabled() so that it returns true only if PAT initialization was actually done. Also, I changed boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT feature on the processor that is being initialized. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: stable@vger.kernel.org # v4.2+ Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/mm/pat.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 9b78685..83a59a6 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -65,9 +65,11 @@ static int __init nopat(char *str) } early_param("nopat", nopat); +static bool __read_mostly __pat_initialized = false; + bool pat_enabled(void) { - return !!__pat_enabled; + return __pat_initialized; } EXPORT_SYMBOL_GPL(pat_enabled); @@ -225,13 +227,14 @@ static void pat_bsp_init(u64 pat) } wrmsrl(MSR_IA32_CR_PAT, pat); + __pat_initialized = true; __init_cache_modes(pat); } static void pat_ap_init(u64 pat) { - if (!boot_cpu_has(X86_FEATURE_PAT)) { + if (!this_cpu_has(X86_FEATURE_PAT)) { /* * If this happens we are on a secondary CPU, but switched to * PAT on the boot CPU. We have no way to undo PAT. @@ -306,7 +309,7 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { + if (!__pat_enabled) { init_cache_modes(); return; } ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka @ 2017-05-28 18:18 ` Bernhard Held 2017-05-28 18:43 ` Andy Lutomirski 0 siblings, 1 reply; 31+ messages in thread From: Bernhard Held @ 2017-05-28 18:18 UTC (permalink / raw) To: bp, akpm, brgerst, torvalds, hpa, mingo, mpatocka, tglx, peterz, mcgrof, dvlasenk, jpoimboe, toshi.kani, luto Cc: linux-kernel Hi, this patch breaks the boot of my kernel. The last message is "Booting the kernel.". My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the microcode of the E5450 and recognizes the CPU. Please find below the dmesg of a the latest kernel w/o the PAT-patch. I'm happy to provide more information or to test patches. Have fun, Bernhard [ 0.000000] Linux version 4.12.0-rc2-linus+ (berny@quad) (gcc version 6.3.1 20170202 [gcc-6-branch revision 245119] (SUSE Linux) ) #152 SMP PREEMPT Sun May 28 19:26:20 CEST 2017 [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.12.0-rc2-linus+ root=/dev/mapper/VGMX300-root resume=/dev/sda2 showopts radeon.dpm=1 memmap=1$0xe4fd net.ifnames=0 [ 0.000000] KERNEL supported cpus: [ 0.000000] Intel GenuineIntel [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [ 0.000000] x86/fpu: Enabled xstate features 0x3, context size is 576 bytes, using 'standard' format. [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dbff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000cfedffff] usable [ 0.000000] BIOS-e820: [mem 0x00000000cfee0000-0x00000000cfee2fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000cfee3000-0x00000000cfeeffff] ACPI data [ 0.000000] BIOS-e820: [mem 0x00000000cfef0000-0x00000000cfefffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000d0000000-0x00000000dfffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x00000001afffffff] usable [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] e820: user-defined physical RAM map: [ 0.000000] user: [mem 0x0000000000000000-0x000000000000e4fc] usable [ 0.000000] user: [mem 0x000000000000e4fd-0x000000000000e4fd] reserved [ 0.000000] user: [mem 0x000000000000e4fe-0x000000000009dbff] usable [ 0.000000] user: [mem 0x000000000009f800-0x000000000009ffff] reserved [ 0.000000] user: [mem 0x00000000000f0000-0x00000000000fffff] reserved [ 0.000000] user: [mem 0x0000000000100000-0x00000000cfedffff] usable [ 0.000000] user: [mem 0x00000000cfee0000-0x00000000cfee2fff] ACPI NVS [ 0.000000] user: [mem 0x00000000cfee3000-0x00000000cfeeffff] ACPI data [ 0.000000] user: [mem 0x00000000cfef0000-0x00000000cfefffff] reserved [ 0.000000] user: [mem 0x00000000d0000000-0x00000000dfffffff] reserved [ 0.000000] user: [mem 0x00000000fec00000-0x00000000ffffffff] reserved [ 0.000000] user: [mem 0x0000000100000000-0x00000001afffffff] usable [ 0.000000] SMBIOS 2.4 present. [ 0.000000] DMI: Gigabyte Technology Co., Ltd. G33-DS3R/G33-DS3R, BIOS F7L 07/31/2009 [ 0.000000] tsc: Fast TSC calibration using PIT [ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved [ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable [ 0.000000] e820: last_pfn = 0x1b0000 max_arch_pfn = 0x400000000 [ 0.000000] MTRR default type: uncachable [ 0.000000] MTRR fixed ranges enabled: [ 0.000000] 00000-9FFFF write-back [ 0.000000] A0000-BFFFF uncachable [ 0.000000] C0000-CDFFF write-protect [ 0.000000] CE000-EFFFF uncachable [ 0.000000] F0000-FFFFF write-through [ 0.000000] MTRR variable ranges enabled: [ 0.000000] 0 base 0000000000 mask 0F00000000 write-back [ 0.000000] 1 base 00E0000000 mask 0FE0000000 uncachable [ 0.000000] 2 base 00D0000000 mask 0FF0000000 uncachable [ 0.000000] 3 base 0100000000 mask 0F00000000 write-back [ 0.000000] 4 base 01C0000000 mask 0FC0000000 uncachable [ 0.000000] 5 base 01B0000000 mask 0FF0000000 uncachable [ 0.000000] 6 base 00CFF00000 mask 0FFFF00000 uncachable [ 0.000000] 7 disabled [ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT [ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it. [ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it. [ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it. [ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it. [ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it. [ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it. [ 0.000000] mtrr: your BIOS has configured an incorrect mask, fixing it. [ 0.000000] total RAM covered: 6143M [ 0.000000] Found optimal setting for mtrr clean up [ 0.000000] gran_size: 64K chunk_size: 2M num_reg: 7 lose cover RAM: 0G [ 0.000000] e820: update [mem 0xcff00000-0xffffffff] usable ==> reserved [ 0.000000] e820: last_pfn = 0xcfee0 max_arch_pfn = 0x400000000 [ 0.000000] Base memory trampoline at [ffff880000097000] 97000 size 24576 [ 0.000000] BRK [0x01e34000, 0x01e34fff] PGTABLE [ 0.000000] BRK [0x01e35000, 0x01e35fff] PGTABLE [ 0.000000] BRK [0x01e36000, 0x01e36fff] PGTABLE [ 0.000000] BRK [0x01e37000, 0x01e37fff] PGTABLE [ 0.000000] BRK [0x01e38000, 0x01e38fff] PGTABLE [ 0.000000] BRK [0x01e39000, 0x01e39fff] PGTABLE [ 0.000000] RAMDISK: [mem 0x36c5f000-0x37626fff] On 05/24/2017 at 12:21 PM, tip-bot for Mikulas Patocka wrote: > Commit-ID: cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4 > Gitweb: http://git.kernel.org/tip/cbed27cdf0e3f7ea3b2259e86b9e34df02be3fe4 > Author: Mikulas Patocka <mpatocka@redhat.com> > AuthorDate: Tue, 18 Apr 2017 15:07:11 -0400 > Committer: Ingo Molnar <mingo@kernel.org> > CommitDate: Wed, 24 May 2017 10:17:23 +0200 > > x86/PAT: Fix Xorg regression on CPUs that don't support PAT > > In the file arch/x86/mm/pat.c, there's a '__pat_enabled' variable. The > variable is set to 1 by default and the function pat_init() sets > __pat_enabled to 0 if the CPU doesn't support PAT. > > However, on AMD K6-3 CPUs, the processor initialization code never calls > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > returns true, even though the K6-3 CPU doesn't support PAT. > > The result of this bug is that a kernel warning is produced when attempting to > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). > Another symptom of this bug is that the framebuffer driver doesn't set the > K6-3 MTRR registers: > > x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f > ... > x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining > > To fix the bug change pat_enabled() so that it returns true only if PAT > initialization was actually done. > > Also, I changed boot_cpu_has(X86_FEATURE_PAT) to > this_cpu_has(X86_FEATURE_PAT) in pat_ap_init(), so that we check the PAT > feature on the processor that is being initialized. > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Andy Lutomirski <luto@kernel.org> > Cc: Borislav Petkov <bp@alien8.de> > Cc: Brian Gerst <brgerst@gmail.com> > Cc: Denys Vlasenko <dvlasenk@redhat.com> > Cc: H. Peter Anvin <hpa@zytor.com> > Cc: Josh Poimboeuf <jpoimboe@redhat.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Luis R. Rodriguez <mcgrof@suse.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Toshi Kani <toshi.kani@hp.com> > Cc: stable@vger.kernel.org # v4.2+ > Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1704181501450.26399@file01.intranet.prod.int.rdu2.redhat.com > Signed-off-by: Ingo Molnar <mingo@kernel.org> > --- > arch/x86/mm/pat.c | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c > index 9b78685..83a59a6 100644 > --- a/arch/x86/mm/pat.c > +++ b/arch/x86/mm/pat.c > @@ -65,9 +65,11 @@ static int __init nopat(char *str) > } > early_param("nopat", nopat); > > +static bool __read_mostly __pat_initialized = false; > + > bool pat_enabled(void) > { > - return !!__pat_enabled; > + return __pat_initialized; > } > EXPORT_SYMBOL_GPL(pat_enabled); > > @@ -225,13 +227,14 @@ static void pat_bsp_init(u64 pat) > } > > wrmsrl(MSR_IA32_CR_PAT, pat); > + __pat_initialized = true; > > __init_cache_modes(pat); > } > > static void pat_ap_init(u64 pat) > { > - if (!boot_cpu_has(X86_FEATURE_PAT)) { > + if (!this_cpu_has(X86_FEATURE_PAT)) { > /* > * If this happens we are on a secondary CPU, but switched to > * PAT on the boot CPU. We have no way to undo PAT. > @@ -306,7 +309,7 @@ void pat_init(void) > u64 pat; > struct cpuinfo_x86 *c = &boot_cpu_data; > > - if (!pat_enabled()) { > + if (!__pat_enabled) { > init_cache_modes(); > return; > } > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-28 18:18 ` Bernhard Held @ 2017-05-28 18:43 ` Andy Lutomirski 2017-05-29 22:50 ` Mikulas Patocka 2017-06-06 22:49 ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 0 siblings, 2 replies; 31+ messages in thread From: Andy Lutomirski @ 2017-05-28 18:43 UTC (permalink / raw) To: Bernhard Held, Toshi Kani Cc: Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Mikulas Patocka, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, Andrew Lutomirski, linux-kernel On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: > Hi, > > this patch breaks the boot of my kernel. The last message is "Booting > the kernel.". > > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the > microcode of the E5450 and recognizes the CPU. > > Please find below the dmesg of a the latest kernel w/o the PAT-patch. > I'm happy to provide more information or to test patches. I think this patch is bogus. pat_enabled() sure looks like it's supposed to return true if PAT is *enabled*, and these days PAT is "enabled" even if there's no HW PAT support. Even if the patch were somehow correct, it should have been split up into two patches, one to change pat_enabled() and one to use this_cpu_has(). Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so -stable knows not to backport it, and starting over with the fix. >From very brief inspection, the right fix is to make sure that pat_init(), or at least init_cache_modes(), gets called on the affected CPUs. As a future cleanup, I think that pat_enabled() could be deleted outright and, if needed, replaced by functions like have_memtype_wc() or similar. (Do we already have helpers like that?) Toshi, am I right? --Andy ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-28 18:43 ` Andy Lutomirski @ 2017-05-29 22:50 ` Mikulas Patocka 2017-05-30 17:14 ` Dominik Brodowski 2017-06-06 22:49 ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 1 sibling, 1 reply; 31+ messages in thread From: Mikulas Patocka @ 2017-05-29 22:50 UTC (permalink / raw) To: Bernhard Held, Andy Lutomirski Cc: Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Sun, 28 May 2017, Andy Lutomirski wrote: > On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: > > Hi, > > > > this patch breaks the boot of my kernel. The last message is "Booting > > the kernel.". > > > > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a > > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the > > microcode of the E5450 and recognizes the CPU. > > > > Please find below the dmesg of a the latest kernel w/o the PAT-patch. > > I'm happy to provide more information or to test patches. Hi Please do the following three tests and test if the kernel boots. 1. use the PAT patch and revert the change to the function pat_enabled() - i.e. change it to the original: bool pat_enabled(void) { return !!__pat_enabled; } 2. use the PAT patch and revert the change to the function pat_ap_init - i.e. change it to the original: static void pat_ap_init(u64 pat) { if (!boot_cpu_has(X86_FEATURE_PAT)) { 3. use the full PAT patch and apply the below patch on the top of it. > I think this patch is bogus. pat_enabled() sure looks like it's > supposed to return true if PAT is *enabled*, and these days PAT is > "enabled" even if there's no HW PAT support. Even if the patch were > somehow correct, it should have been split up into two patches, one to > change pat_enabled() and one to use this_cpu_has(). > > Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so > -stable knows not to backport it, and starting over with the fix. > >From very brief inspection, the right fix is to make sure that > pat_init(), or at least init_cache_modes(), gets called on the pat_init() needs to be called with cache disabled - and the cache disable code (functions prepare_set() and post_set()) exists in arch/x86/kernel/cpu/mtrr/generic.c - it may not be compiled if CONFIG_MTRR is not set. Though, it is possible to call init_cache_modes() - see the patch below. init_cache_modes() does nothing if it is called multiple times. > affected CPUs. > > As a future cleanup, I think that pat_enabled() could be deleted > outright and, if needed, replaced by functions like have_memtype_wc() > or similar. (Do we already have helpers like that?) Toshi, am I > right? > > --Andy --- arch/x86/include/asm/pat.h | 1 + arch/x86/kernel/setup.c | 1 + arch/x86/mm/pat.c | 3 +-- 3 files changed, 3 insertions(+), 2 deletions(-) Index: linux-stable/arch/x86/include/asm/pat.h =================================================================== --- linux-stable.orig/arch/x86/include/asm/pat.h +++ linux-stable/arch/x86/include/asm/pat.h @@ -8,6 +8,7 @@ void pat_disable(const char *reason); extern void pat_init(void); +extern void init_cache_modes(void); extern int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm); Index: linux-stable/arch/x86/kernel/setup.c =================================================================== --- linux-stable.orig/arch/x86/kernel/setup.c +++ linux-stable/arch/x86/kernel/setup.c @@ -1074,6 +1074,7 @@ void __init setup_arch(char **cmdline_p) /* update e820 for memory not covered by WB MTRRs */ mtrr_bp_init(); + init_cache_modes(); if (mtrr_trim_uncached_memory(max_pfn)) max_pfn = e820_end_of_ram_pfn(); Index: linux-stable/arch/x86/mm/pat.c =================================================================== --- linux-stable.orig/arch/x86/mm/pat.c +++ linux-stable/arch/x86/mm/pat.c @@ -39,7 +39,6 @@ static bool boot_cpu_done; static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); -static void init_cache_modes(void); void pat_disable(const char *reason) { @@ -237,7 +236,7 @@ static void pat_ap_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); } -static void init_cache_modes(void) +void init_cache_modes(void) { u64 pat = 0; static int init_cm_done; ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-29 22:50 ` Mikulas Patocka @ 2017-05-30 17:14 ` Dominik Brodowski 2017-05-30 17:59 ` Mikulas Patocka 0 siblings, 1 reply; 31+ messages in thread From: Dominik Brodowski @ 2017-05-30 17:14 UTC (permalink / raw) To: Mikulas Patocka Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel [-- Attachment #1: Type: text/plain, Size: 750 bytes --] Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13), git-bisected to the same patch... On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote: > Please do the following three tests and test if the kernel boots. > > 1. use the PAT patch and revert the change to the function pat_enabled() > - i.e. change it to the original: > bool pat_enabled(void) > { > return !!__pat_enabled; > } No joy. > 2. use the PAT patch and revert the change to the function pat_ap_init > - i.e. change it to the original: > static void pat_ap_init(u64 pat) > { > if (!boot_cpu_has(X86_FEATURE_PAT)) { Joy. > 3. use the full PAT patch and apply the below patch on the top of it. No joy. Best, Dominik [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-30 17:14 ` Dominik Brodowski @ 2017-05-30 17:59 ` Mikulas Patocka 2017-05-30 18:47 ` Dominik Brodowski ` (2 more replies) 0 siblings, 3 replies; 31+ messages in thread From: Mikulas Patocka @ 2017-05-30 17:59 UTC (permalink / raw) To: Dominik Brodowski Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, 30 May 2017, Dominik Brodowski wrote: > Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13), > git-bisected to the same patch... > > On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote: > > Please do the following three tests and test if the kernel boots. > > > > 1. use the PAT patch and revert the change to the function pat_enabled() > > - i.e. change it to the original: > > bool pat_enabled(void) > > { > > return !!__pat_enabled; > > } > > No joy. > > > 2. use the PAT patch and revert the change to the function pat_ap_init > > - i.e. change it to the original: > > static void pat_ap_init(u64 pat) > > { > > if (!boot_cpu_has(X86_FEATURE_PAT)) { > > Joy. It is interesting - does it mean that the boot cpu does have PAT and the secondary CPUs don't? Please send /proc/cpuinfo with all the cores active. This part of the patch is not required anyway, so I will resubmit the patch with this part disabled (and with an added call to init_cache_modes() as Andy suggested). Mikulas > > 3. use the full PAT patch and apply the below patch on the top of it. > > No joy. > > > Best, > Dominik > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-30 17:59 ` Mikulas Patocka @ 2017-05-30 18:47 ` Dominik Brodowski 2017-05-30 19:30 ` Bernhard Held 2017-05-31 9:39 ` Junichi Nomura 2 siblings, 0 replies; 31+ messages in thread From: Dominik Brodowski @ 2017-05-30 18:47 UTC (permalink / raw) To: Mikulas Patocka Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel [-- Attachment #1: Type: text/plain, Size: 5466 bytes --] On Tue, May 30, 2017 at 01:59:41PM -0400, Mikulas Patocka wrote: > On Tue, 30 May 2017, Dominik Brodowski wrote: > > > Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13), > > git-bisected to the same patch... > > > > On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote: > > > Please do the following three tests and test if the kernel boots. > > > > > > 1. use the PAT patch and revert the change to the function pat_enabled() > > > - i.e. change it to the original: > > > bool pat_enabled(void) > > > { > > > return !!__pat_enabled; > > > } > > > > No joy. > > > > > 2. use the PAT patch and revert the change to the function pat_ap_init > > > - i.e. change it to the original: > > > static void pat_ap_init(u64 pat) > > > { > > > if (!boot_cpu_has(X86_FEATURE_PAT)) { > > > > Joy. > > It is interesting - does it mean that the boot cpu does have PAT and the > secondary CPUs don't? Please send /proc/cpuinfo with all the cores active. This physical CPU has PAT on all cores / siblings: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 61 model name : Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz stepping : 4 microcode : 0x24 cpu MHz : 800.158 cache size : 3072 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 20 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts bugs : bogomips : 4389.80 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 61 model name : Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz stepping : 4 microcode : 0x24 cpu MHz : 800.158 cache size : 3072 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 20 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts bugs : bogomips : 4436.14 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 61 model name : Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz stepping : 4 microcode : 0x24 cpu MHz : 800.024 cache size : 3072 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 20 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts bugs : bogomips : 4397.30 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 61 model name : Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz stepping : 4 microcode : 0x24 cpu MHz : 799.890 cache size : 3072 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 20 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap xsaveopt dtherm ida arat pln pts bugs : bogomips : 4396.84 clflush size : 64 cache_alignment : 64 address sizes : 39 bits physical, 48 bits virtual power management: Best, Dominik [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-30 17:59 ` Mikulas Patocka 2017-05-30 18:47 ` Dominik Brodowski @ 2017-05-30 19:30 ` Bernhard Held 2017-05-31 9:39 ` Junichi Nomura 2 siblings, 0 replies; 31+ messages in thread From: Bernhard Held @ 2017-05-30 19:30 UTC (permalink / raw) To: Mikulas Patocka, Dominik Brodowski Cc: Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On 05/30/2017 at 07:59 PM, Mikulas Patocka wrote: > > > On Tue, 30 May 2017, Dominik Brodowski wrote: > >> Same boot problem here (Intel(R) Core(TM) i5-5200U CPU on a Dell XPS 13), >> git-bisected to the same patch... >> >> On Mon, May 29, 2017 at 06:50:57PM -0400, Mikulas Patocka wrote: >>> Please do the following three tests and test if the kernel boots. >>> >>> 1. use the PAT patch and revert the change to the function pat_enabled() >>> - i.e. change it to the original: >>> bool pat_enabled(void) >>> { >>> return !!__pat_enabled; >>> } >> >> No joy. >> >>> 2. use the PAT patch and revert the change to the function pat_ap_init >>> - i.e. change it to the original: >>> static void pat_ap_init(u64 pat) >>> { >>> if (!boot_cpu_has(X86_FEATURE_PAT)) { >> >> Joy. > > It is interesting - does it mean that the boot cpu does have PAT and the > secondary CPUs don't? Please send /proc/cpuinfo with all the cores active. > > This part of the patch is not required anyway, so I will resubmit the > patch with this part disabled (and with an added call to > init_cache_modes() as Andy suggested). > > Mikulas > >>> 3. use the full PAT patch and apply the below patch on the top of it. >> >> No joy. Same result here., only #2 boots. processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 23 model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz stepping : 10 microcode : 0xa0b cpu MHz : 2000.000 cache size : 6144 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm bugs : bogomips : 5999.70 clflush size : 64 cache_alignment : 64 address sizes : 38 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 23 model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz stepping : 10 microcode : 0xa0b cpu MHz : 2000.000 cache size : 6144 KB physical id : 0 siblings : 4 core id : 3 cpu cores : 4 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm bugs : bogomips : 5999.98 clflush size : 64 cache_alignment : 64 address sizes : 38 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 23 model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz stepping : 10 microcode : 0xa0b cpu MHz : 2000.000 cache size : 6144 KB physical id : 0 siblings : 4 core id : 2 cpu cores : 4 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm bugs : bogomips : 5900.00 clflush size : 64 cache_alignment : 64 address sizes : 38 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 23 model name : Intel(R) Xeon(R) CPU E5450 @ 3.00GHz stepping : 10 microcode : 0xa0b cpu MHz : 2000.000 cache size : 6144 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 4 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm tpr_shadow vnmi flexpriority dtherm bugs : bogomips : 5999.98 clflush size : 64 cache_alignment : 64 address sizes : 38 bits physical, 48 bits virtual power management: Cheers, Bernhard ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-30 17:59 ` Mikulas Patocka 2017-05-30 18:47 ` Dominik Brodowski 2017-05-30 19:30 ` Bernhard Held @ 2017-05-31 9:39 ` Junichi Nomura 2 siblings, 0 replies; 31+ messages in thread From: Junichi Nomura @ 2017-05-31 9:39 UTC (permalink / raw) To: Mikulas Patocka Cc: Dominik Brodowski, Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On 05/31/17 02:59, Mikulas Patocka wrote: >>> 2. use the PAT patch and revert the change to the function pat_ap_init >>> - i.e. change it to the original: >>> static void pat_ap_init(u64 pat) >>> { >>> if (!boot_cpu_has(X86_FEATURE_PAT)) { >> >> Joy. > > It is interesting - does it mean that the boot cpu does have PAT and the > secondary CPUs don't? It seems pat_init() is called twice for boot cpu, from mtrr_bp_pat_init() and generic_set_all(). The 2nd call ends up calling pat_ap_init() and it's before boot_cpu_data is copied to cpu_data[0]. -- Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-05-28 18:43 ` Andy Lutomirski 2017-05-29 22:50 ` Mikulas Patocka @ 2017-06-06 22:49 ` Mikulas Patocka 2017-06-06 22:51 ` Andy Lutomirski ` (2 more replies) 1 sibling, 3 replies; 31+ messages in thread From: Mikulas Patocka @ 2017-06-06 22:49 UTC (permalink / raw) To: Andy Lutomirski Cc: Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Sun, 28 May 2017, Andy Lutomirski wrote: > On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: > > Hi, > > > > this patch breaks the boot of my kernel. The last message is "Booting > > the kernel.". > > > > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a > > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the > > microcode of the E5450 and recognizes the CPU. > > > > Please find below the dmesg of a the latest kernel w/o the PAT-patch. > > I'm happy to provide more information or to test patches. > > I think this patch is bogus. pat_enabled() sure looks like it's > supposed to return true if PAT is *enabled*, and these days PAT is > "enabled" even if there's no HW PAT support. Even if the patch were > somehow correct, it should have been split up into two patches, one to > change pat_enabled() and one to use this_cpu_has(). > > Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so > -stable knows not to backport it, and starting over with the fix. > >From very brief inspection, the right fix is to make sure that > pat_init(), or at least init_cache_modes(), gets called on the > affected CPUs. > > --Andy Hi Here I send the second version of the patch. It drops the change from boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that caused kernel to be unbootable for some people). Another change is that setup_arch() calls init_cache_modes() if PAT is disabled, so that init_cache_modes() is always called. Mikulas From: Mikulas Patocka <mpatocka@redhat.com> In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The variable is set to 1 by default and the function pat_init() sets __pat_enabled to 0 if the CPU doesn't support PAT. However, on AMD K6-3 CPU, the processor initialization code never calls pat_init() and so __pat_enabled stays 1 and the function pat_enabled() returns true, even though the K6-3 CPU doesn't support PAT. The result of this bug is that this warning is produced when attemting to start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). Another symptom of this bug is that the framebuffer driver doesn't set the K6-3 MTRR registers. This patch changes pat_enabled() so that it returns true only if pat initialization was actually done. As Andy Lutomirski suggested, we also need to call init_cache_modes() if pat was not initialized, so we call it after mtrr_bp_init() (mtrr_bp_init() would or wouldn't call pat_init()). Note that init_cache_modes() detects if it was called more than once and does nothing in that case. x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining ------------[ cut here ]------------ WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35 Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00 Call Trace: ? __warn+0xab/0xc0 ? untrack_pfn+0x5c/0x9f ? warn_slowpath_null+0xf/0x13 ? untrack_pfn+0x5c/0x9f ? unmap_single_vma+0x43/0x66 ? unmap_vmas+0x24/0x30 ? exit_mmap+0x3c/0xa5 ? __mmput+0xf/0x76 ? copy_process.part.76+0xb43/0x1129 ? _do_fork+0x96/0x177 ? do_int80_syscall_32+0x3e/0x4c ? entry_INT80_32+0x2a/0x2a ---[ end trace 8726f9d9fa90d702 ]--- x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v4.2+ --- arch/x86/include/asm/pat.h | 1 + arch/x86/kernel/setup.c | 2 ++ arch/x86/mm/pat.c | 10 ++++++---- 3 files changed, 9 insertions(+), 4 deletions(-) Index: linux-stable/arch/x86/mm/pat.c =================================================================== --- linux-stable.orig/arch/x86/mm/pat.c +++ linux-stable/arch/x86/mm/pat.c @@ -40,7 +40,6 @@ static bool boot_cpu_done; static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); -static void init_cache_modes(void); void pat_disable(const char *reason) { @@ -65,9 +64,11 @@ static int __init nopat(char *str) } early_param("nopat", nopat); +static bool __read_mostly __pat_initialized = false; + bool pat_enabled(void) { - return !!__pat_enabled; + return __pat_initialized; } EXPORT_SYMBOL_GPL(pat_enabled); @@ -225,6 +226,7 @@ static void pat_bsp_init(u64 pat) } wrmsrl(MSR_IA32_CR_PAT, pat); + __pat_initialized = true; __init_cache_modes(pat); } @@ -242,7 +244,7 @@ static void pat_ap_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); } -static void init_cache_modes(void) +void init_cache_modes(void) { u64 pat = 0; static int init_cm_done; @@ -306,7 +308,7 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { + if (!__pat_enabled) { init_cache_modes(); return; } Index: linux-stable/arch/x86/include/asm/pat.h =================================================================== --- linux-stable.orig/arch/x86/include/asm/pat.h +++ linux-stable/arch/x86/include/asm/pat.h @@ -7,6 +7,7 @@ bool pat_enabled(void); void pat_disable(const char *reason); extern void pat_init(void); +extern void init_cache_modes(void); extern int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm); Index: linux-stable/arch/x86/kernel/setup.c =================================================================== --- linux-stable.orig/arch/x86/kernel/setup.c +++ linux-stable/arch/x86/kernel/setup.c @@ -1070,6 +1070,8 @@ void __init setup_arch(char **cmdline_p) /* update e820 for memory not covered by WB MTRRs */ mtrr_bp_init(); + if (!pat_enabled()) + init_cache_modes(); if (mtrr_trim_uncached_memory(max_pfn)) max_pfn = e820__end_of_ram_pfn(); ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-06-06 22:49 ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka @ 2017-06-06 22:51 ` Andy Lutomirski 2017-06-06 23:21 ` Mikulas Patocka 2017-06-07 19:54 ` Bernhard Held 2017-07-03 5:05 ` Mikulas Patocka 2 siblings, 1 reply; 31+ messages in thread From: Andy Lutomirski @ 2017-06-06 22:51 UTC (permalink / raw) To: Mikulas Patocka Cc: Andy Lutomirski, Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Sun, 28 May 2017, Andy Lutomirski wrote: > >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: >> > Hi, >> > >> > this patch breaks the boot of my kernel. The last message is "Booting >> > the kernel.". >> > >> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a >> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the >> > microcode of the E5450 and recognizes the CPU. >> > >> > Please find below the dmesg of a the latest kernel w/o the PAT-patch. >> > I'm happy to provide more information or to test patches. >> >> I think this patch is bogus. pat_enabled() sure looks like it's >> supposed to return true if PAT is *enabled*, and these days PAT is >> "enabled" even if there's no HW PAT support. Even if the patch were >> somehow correct, it should have been split up into two patches, one to >> change pat_enabled() and one to use this_cpu_has(). >> >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so >> -stable knows not to backport it, and starting over with the fix. >> >From very brief inspection, the right fix is to make sure that >> pat_init(), or at least init_cache_modes(), gets called on the >> affected CPUs. >> >> --Andy > > Hi > > Here I send the second version of the patch. It drops the change from > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that > caused kernel to be unbootable for some people). > > Another change is that setup_arch() calls init_cache_modes() if PAT is > disabled, so that init_cache_modes() is always called. > > Mikulas > > > > From: Mikulas Patocka <mpatocka@redhat.com> > > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The > variable is set to 1 by default and the function pat_init() sets > __pat_enabled to 0 if the CPU doesn't support PAT. > > However, on AMD K6-3 CPU, the processor initialization code never calls > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > returns true, even though the K6-3 CPU doesn't support PAT. > > The result of this bug is that this warning is produced when attemting to > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). > Another symptom of this bug is that the framebuffer driver doesn't set the > K6-3 MTRR registers. > > This patch changes pat_enabled() so that it returns true only if pat > initialization was actually done. Why? Shouldn't calling init_cache_modes() be sufficient? --Andy ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-06-06 22:51 ` Andy Lutomirski @ 2017-06-06 23:21 ` Mikulas Patocka 2017-06-13 15:54 ` Andy Lutomirski 0 siblings, 1 reply; 31+ messages in thread From: Mikulas Patocka @ 2017-06-06 23:21 UTC (permalink / raw) To: Andy Lutomirski Cc: Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, 6 Jun 2017, Andy Lutomirski wrote: > On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > > > On Sun, 28 May 2017, Andy Lutomirski wrote: > > > >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: > >> > Hi, > >> > > >> > this patch breaks the boot of my kernel. The last message is "Booting > >> > the kernel.". > >> > > >> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a > >> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the > >> > microcode of the E5450 and recognizes the CPU. > >> > > >> > Please find below the dmesg of a the latest kernel w/o the PAT-patch. > >> > I'm happy to provide more information or to test patches. > >> > >> I think this patch is bogus. pat_enabled() sure looks like it's > >> supposed to return true if PAT is *enabled*, and these days PAT is > >> "enabled" even if there's no HW PAT support. Even if the patch were > >> somehow correct, it should have been split up into two patches, one to > >> change pat_enabled() and one to use this_cpu_has(). > >> > >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so > >> -stable knows not to backport it, and starting over with the fix. > >> >From very brief inspection, the right fix is to make sure that > >> pat_init(), or at least init_cache_modes(), gets called on the > >> affected CPUs. > >> > >> --Andy > > > > Hi > > > > Here I send the second version of the patch. It drops the change from > > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that > > caused kernel to be unbootable for some people). > > > > Another change is that setup_arch() calls init_cache_modes() if PAT is > > disabled, so that init_cache_modes() is always called. > > > > Mikulas > > > > > > > > From: Mikulas Patocka <mpatocka@redhat.com> > > > > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The > > variable is set to 1 by default and the function pat_init() sets > > __pat_enabled to 0 if the CPU doesn't support PAT. > > > > However, on AMD K6-3 CPU, the processor initialization code never calls > > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > > returns true, even though the K6-3 CPU doesn't support PAT. > > > > The result of this bug is that this warning is produced when attemting to > > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). > > Another symptom of this bug is that the framebuffer driver doesn't set the > > K6-3 MTRR registers. > > > > This patch changes pat_enabled() so that it returns true only if pat > > initialization was actually done. > > Why? Shouldn't calling init_cache_modes() be sufficient? > > --Andy See the function arch_phys_wc_add(): if (pat_enabled() || !mtrr_enabled()) return 0; /* Success! (We don't need to do anything.) */ ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true); - if pat_enabled() returns true, that function doesn't set MTRRs. pat_enabled() must return false on systems without PAT, so that MTRRs are set. Mikulas ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-06-06 23:21 ` Mikulas Patocka @ 2017-06-13 15:54 ` Andy Lutomirski 2017-06-14 20:24 ` Mikulas Patocka 0 siblings, 1 reply; 31+ messages in thread From: Andy Lutomirski @ 2017-06-13 15:54 UTC (permalink / raw) To: Mikulas Patocka Cc: Andy Lutomirski, Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, Jun 6, 2017 at 4:21 PM, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > On Tue, 6 Jun 2017, Andy Lutomirski wrote: > >> On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote: >> > >> > >> > On Sun, 28 May 2017, Andy Lutomirski wrote: >> > >> >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: >> >> > Hi, >> >> > >> >> > this patch breaks the boot of my kernel. The last message is "Booting >> >> > the kernel.". >> >> > >> >> > My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a >> >> > Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the >> >> > microcode of the E5450 and recognizes the CPU. >> >> > >> >> > Please find below the dmesg of a the latest kernel w/o the PAT-patch. >> >> > I'm happy to provide more information or to test patches. >> >> >> >> I think this patch is bogus. pat_enabled() sure looks like it's >> >> supposed to return true if PAT is *enabled*, and these days PAT is >> >> "enabled" even if there's no HW PAT support. Even if the patch were >> >> somehow correct, it should have been split up into two patches, one to >> >> change pat_enabled() and one to use this_cpu_has(). >> >> >> >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so >> >> -stable knows not to backport it, and starting over with the fix. >> >> >From very brief inspection, the right fix is to make sure that >> >> pat_init(), or at least init_cache_modes(), gets called on the >> >> affected CPUs. >> >> >> >> --Andy >> > >> > Hi >> > >> > Here I send the second version of the patch. It drops the change from >> > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that >> > caused kernel to be unbootable for some people). >> > >> > Another change is that setup_arch() calls init_cache_modes() if PAT is >> > disabled, so that init_cache_modes() is always called. >> > >> > Mikulas >> > >> > >> > >> > From: Mikulas Patocka <mpatocka@redhat.com> >> > >> > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The >> > variable is set to 1 by default and the function pat_init() sets >> > __pat_enabled to 0 if the CPU doesn't support PAT. >> > >> > However, on AMD K6-3 CPU, the processor initialization code never calls >> > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() >> > returns true, even though the K6-3 CPU doesn't support PAT. >> > >> > The result of this bug is that this warning is produced when attemting to >> > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). >> > Another symptom of this bug is that the framebuffer driver doesn't set the >> > K6-3 MTRR registers. >> > >> > This patch changes pat_enabled() so that it returns true only if pat >> > initialization was actually done. >> >> Why? Shouldn't calling init_cache_modes() be sufficient? >> >> --Andy > > See the function arch_phys_wc_add(): > > if (pat_enabled() || !mtrr_enabled()) > return 0; /* Success! (We don't need to do anything.) */ > ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true); > > - if pat_enabled() returns true, that function doesn't set MTRRs. > pat_enabled() must return false on systems without PAT, so that MTRRs are > set. It still sounds to me like there are two bugs here that should be treated separately. Bug 1: A warning fires. Have you figured out why the warning fires? Bug 2: arch_phys_wc_add() appears to be checking the wrong condition. How about checking the right condition? It doesn't actually want to know if PAT is enabled -- it wants to know if the PAT contains a usable WC entry. Something like pat_has_wc() would be better, I think. --Andy ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-06-13 15:54 ` Andy Lutomirski @ 2017-06-14 20:24 ` Mikulas Patocka 0 siblings, 0 replies; 31+ messages in thread From: Mikulas Patocka @ 2017-06-14 20:24 UTC (permalink / raw) To: Andy Lutomirski Cc: Bernhard Held, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, 13 Jun 2017, Andy Lutomirski wrote: > On Tue, Jun 6, 2017 at 4:21 PM, Mikulas Patocka <mpatocka@redhat.com> wrote: > > > > On Tue, 6 Jun 2017, Andy Lutomirski wrote: > > > >> On Tue, Jun 6, 2017 at 3:49 PM, Mikulas Patocka <mpatocka@redhat.com> wrote: > >> > > >> > This patch changes pat_enabled() so that it returns true only if pat > >> > initialization was actually done. > >> > >> Why? Shouldn't calling init_cache_modes() be sufficient? > >> > >> --Andy > > > > See the function arch_phys_wc_add(): > > > > if (pat_enabled() || !mtrr_enabled()) > > return 0; /* Success! (We don't need to do anything.) */ > > ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true); > > > > - if pat_enabled() returns true, that function doesn't set MTRRs. > > pat_enabled() must return false on systems without PAT, so that MTRRs are > > set. > > It still sounds to me like there are two bugs here that should be > treated separately. > > Bug 1: A warning fires. Have you figured out why the warning fires? The Xserver maps videoram and attempts to call fork() to spawn some utility. reserve_pfn_range is called. At the beginning of reserve_pfn_range, want_pcm is set to 2 (_PAGE_CACHE_MODE_UC_MINUS). reserve_pfn_range calls reserve_memtype(paddr, paddr + size, want_pcm, &pcm); reserve_memtype contains this code at the beginning: if (!pat_enabled()) { /* This is identical to page table setting without PAT */ if (new_type) *new_type = req_type; return 0; } --- so, if pat_enabled() return false, it sets pcm equal to want_pcm and there is no problem. However, if pat_enabled() returns true, reserve_memtype goes on, it sets actual_type = pat_x_mtrr_type(start, end, req_type); (actual_type is 2) if (new_type) *new_type = actual_type; (*new_type is 2) and finally it calls rbt_memtype_check_insert(new, new_type); rbt_memtype_check_insert calls memtype_rb_check_conflict memtype_rb_check_conflict sets *newtype to the value 1 (_PAGE_CACHE_MODE_WC) (the pointer newtype points to the variable "pcm" in the function reserve_pfn_range) Then, we go back to reserve_pfn_range, the variable want_pcm is 2 and the variable pcm is 1. reserve_pfn_range checks "if (pcm != want_pcm)", prints a warning "x86/PAT: Xorg:3986 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining", returns an error - and this causes fork failure. > Bug 2: arch_phys_wc_add() appears to be checking the wrong condition. > How about checking the right condition? It doesn't actually want to > know if PAT is enabled -- it wants to know if the PAT contains a > usable WC entry. Something like pat_has_wc() would be better, I > think. > > --Andy The function pat_init() always programs the PAT with the WC type. So - surely we can rename pat_enabled() to pat_has_wc(). But renaming the function doesn't change functionality. This is the same patch with pat_enabled() renamed to pat_has_wc(). Note also that this renaming causes conflicts when backporting the patch to stable kernels. Mikulas X86: don't report PAT on CPUs that don't support it In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The variable is set to 1 by default and the function pat_init() sets __pat_enabled to 0 if the CPU doesn't support PAT. However, on AMD K6-3 CPU, the processor initialization code never calls pat_init() and so __pat_enabled stays 1 and the function pat_enabled() returns true, even though the K6-3 CPU doesn't support PAT. The result of this bug is that this warning is produced when attemting to start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). Another symptom of this bug is that the framebuffer driver doesn't set the K6-3 MTRR registers. This patch changes pat_enabled() so that it returns true only if pat initialization was actually done. As Andy Lutomirski suggested, we also need to call init_cache_modes() if pat was not initialized, so we call it after mtrr_bp_init() (mtrr_bp_init() would or wouldn't call pat_init()). Note that init_cache_modes() detects if it was called more than once and does nothing in that case. x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining ------------[ cut here ]------------ WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35 Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00 Call Trace: ? __warn+0xab/0xc0 ? untrack_pfn+0x5c/0x9f ? warn_slowpath_null+0xf/0x13 ? untrack_pfn+0x5c/0x9f ? unmap_single_vma+0x43/0x66 ? unmap_vmas+0x24/0x30 ? exit_mmap+0x3c/0xa5 ? __mmput+0xf/0x76 ? copy_process.part.76+0xb43/0x1129 ? _do_fork+0x96/0x177 ? do_int80_syscall_32+0x3e/0x4c ? entry_INT80_32+0x2a/0x2a ---[ end trace 8726f9d9fa90d702 ]--- x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # v4.2+ --- arch/x86/include/asm/pat.h | 3 ++- arch/x86/include/asm/pci.h | 2 +- arch/x86/kernel/cpu/mtrr/main.c | 2 +- arch/x86/kernel/setup.c | 2 ++ arch/x86/mm/iomap_32.c | 2 +- arch/x86/mm/ioremap.c | 2 +- arch/x86/mm/pat.c | 28 +++++++++++++++------------- drivers/infiniband/hw/mlx5/main.c | 2 +- drivers/media/pci/ivtv/ivtvfb.c | 2 +- 9 files changed, 25 insertions(+), 20 deletions(-) Index: linux-stable/arch/x86/mm/pat.c =================================================================== --- linux-stable.orig/arch/x86/mm/pat.c +++ linux-stable/arch/x86/mm/pat.c @@ -40,7 +40,6 @@ static bool boot_cpu_done; static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); -static void init_cache_modes(void); void pat_disable(const char *reason) { @@ -65,11 +64,13 @@ static int __init nopat(char *str) } early_param("nopat", nopat); -bool pat_enabled(void) +static bool __read_mostly __pat_has_wc = false; + +bool pat_has_wc(void) { - return !!__pat_enabled; + return __pat_has_wc; } -EXPORT_SYMBOL_GPL(pat_enabled); +EXPORT_SYMBOL_GPL(pat_has_wc); int pat_debug_enable; @@ -225,6 +226,7 @@ static void pat_bsp_init(u64 pat) } wrmsrl(MSR_IA32_CR_PAT, pat); + __pat_has_wc = true; __init_cache_modes(pat); } @@ -242,7 +244,7 @@ static void pat_ap_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); } -static void init_cache_modes(void) +void init_cache_modes(void) { u64 pat = 0; static int init_cm_done; @@ -306,7 +308,7 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { + if (!__pat_enabled) { init_cache_modes(); return; } @@ -539,7 +541,7 @@ int reserve_memtype(u64 start, u64 end, BUG_ON(start >= end); /* end is exclusive */ - if (!pat_enabled()) { + if (!pat_has_wc()) { /* This is identical to page table setting without PAT */ if (new_type) *new_type = req_type; @@ -610,7 +612,7 @@ int free_memtype(u64 start, u64 end) int is_range_ram; struct memtype *entry; - if (!pat_enabled()) + if (!pat_has_wc()) return 0; /* Low ISA region is always mapped WB. No need to track */ @@ -765,7 +767,7 @@ static inline int range_is_allowed(unsig u64 to = from + size; u64 cursor = from; - if (!pat_enabled()) + if (!pat_has_wc()) return 1; while (cursor < to) { @@ -848,7 +850,7 @@ static int reserve_pfn_range(u64 paddr, * the type requested matches the type of first page in the range. */ if (is_ram) { - if (!pat_enabled()) + if (!pat_has_wc()) return 0; pcm = lookup_memtype(paddr); @@ -964,7 +966,7 @@ int track_pfn_remap(struct vm_area_struc return ret; } - if (!pat_enabled()) + if (!pat_has_wc()) return 0; /* @@ -991,7 +993,7 @@ void track_pfn_insert(struct vm_area_str { enum page_cache_mode pcm; - if (!pat_enabled()) + if (!pat_has_wc()) return; /* Set prot based on lookup */ @@ -1128,7 +1130,7 @@ static const struct file_operations memt static int __init pat_memtype_list_init(void) { - if (pat_enabled()) { + if (pat_has_wc()) { debugfs_create_file("pat_memtype_list", S_IRUSR, arch_debugfs_dir, NULL, &memtype_fops); } Index: linux-stable/arch/x86/include/asm/pat.h =================================================================== --- linux-stable.orig/arch/x86/include/asm/pat.h +++ linux-stable/arch/x86/include/asm/pat.h @@ -4,9 +4,10 @@ #include <linux/types.h> #include <asm/pgtable_types.h> -bool pat_enabled(void); +bool pat_has_wc(void); void pat_disable(const char *reason); extern void pat_init(void); +extern void init_cache_modes(void); extern int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm); Index: linux-stable/arch/x86/kernel/setup.c =================================================================== --- linux-stable.orig/arch/x86/kernel/setup.c +++ linux-stable/arch/x86/kernel/setup.c @@ -1070,6 +1070,8 @@ void __init setup_arch(char **cmdline_p) /* update e820 for memory not covered by WB MTRRs */ mtrr_bp_init(); + if (!pat_has_wc()) + init_cache_modes(); if (mtrr_trim_uncached_memory(max_pfn)) max_pfn = e820__end_of_ram_pfn(); Index: linux-stable/arch/x86/include/asm/pci.h =================================================================== --- linux-stable.orig/arch/x86/include/asm/pci.h +++ linux-stable/arch/x86/include/asm/pci.h @@ -103,7 +103,7 @@ int pcibios_set_irq_routing(struct pci_d #define HAVE_PCI_MMAP -#define arch_can_pci_mmap_wc() pat_enabled() +#define arch_can_pci_mmap_wc() pat_has_wc() #define ARCH_GENERIC_PCI_MMAP_RESOURCE #ifdef CONFIG_PCI Index: linux-stable/arch/x86/kernel/cpu/mtrr/main.c =================================================================== --- linux-stable.orig/arch/x86/kernel/cpu/mtrr/main.c +++ linux-stable/arch/x86/kernel/cpu/mtrr/main.c @@ -556,7 +556,7 @@ int arch_phys_wc_add(unsigned long base, { int ret; - if (pat_enabled() || !mtrr_enabled()) + if (pat_has_wc() || !mtrr_enabled()) return 0; /* Success! (We don't need to do anything.) */ ret = mtrr_add(base, size, MTRR_TYPE_WRCOMB, true); Index: linux-stable/arch/x86/mm/iomap_32.c =================================================================== --- linux-stable.orig/arch/x86/mm/iomap_32.c +++ linux-stable/arch/x86/mm/iomap_32.c @@ -84,7 +84,7 @@ iomap_atomic_prot_pfn(unsigned long pfn, * is UC or WC. UC- gets the real intention, of the user, which is * "WC if the MTRR is WC, UC if you can't do that." */ - if (!pat_enabled() && pgprot2cachemode(prot) != _PAGE_CACHE_MODE_WB) + if (!pat_has_wc() && pgprot2cachemode(prot) != _PAGE_CACHE_MODE_WB) prot = __pgprot(__PAGE_KERNEL | cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS)); Index: linux-stable/arch/x86/mm/ioremap.c =================================================================== --- linux-stable.orig/arch/x86/mm/ioremap.c +++ linux-stable/arch/x86/mm/ioremap.c @@ -230,7 +230,7 @@ void __iomem *ioremap_nocache(resource_s { /* * Ideally, this should be: - * pat_enabled() ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS; + * pat_has_wc() ? _PAGE_CACHE_MODE_UC : _PAGE_CACHE_MODE_UC_MINUS; * * Till we fix all X drivers to use ioremap_wc(), we will use * UC MINUS. Drivers that are certain they need or can already Index: linux-stable/drivers/infiniband/hw/mlx5/main.c =================================================================== --- linux-stable.orig/drivers/infiniband/hw/mlx5/main.c +++ linux-stable/drivers/infiniband/hw/mlx5/main.c @@ -1602,7 +1602,7 @@ static int uar_mmap(struct mlx5_ib_dev * case MLX5_IB_MMAP_WC_PAGE: /* Some architectures don't support WC memory */ #if defined(CONFIG_X86) - if (!pat_enabled()) + if (!pat_has_wc()) return -EPERM; #elif !(defined(CONFIG_PPC) || (defined(CONFIG_ARM) && defined(CONFIG_MMU))) return -EPERM; Index: linux-stable/drivers/media/pci/ivtv/ivtvfb.c =================================================================== --- linux-stable.orig/drivers/media/pci/ivtv/ivtvfb.c +++ linux-stable/drivers/media/pci/ivtv/ivtvfb.c @@ -1168,7 +1168,7 @@ static int ivtvfb_init_card(struct ivtv int rc; #ifdef CONFIG_X86_64 - if (pat_enabled()) { + if (pat_has_wc()) { pr_warn("ivtvfb needs PAT disabled, boot with nopat kernel parameter\n"); return -ENODEV; } ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-06-06 22:49 ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 2017-06-06 22:51 ` Andy Lutomirski @ 2017-06-07 19:54 ` Bernhard Held 2017-07-03 5:05 ` Mikulas Patocka 2 siblings, 0 replies; 31+ messages in thread From: Bernhard Held @ 2017-06-07 19:54 UTC (permalink / raw) To: Mikulas Patocka, Andy Lutomirski Cc: Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On 06/07/2017 at 12:49 AM, Mikulas Patocka wrote: > Hi > > Here I send the second version of the patch. It drops the change from > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that > caused kernel to be unbootable for some people). > > Another change is that setup_arch() calls init_cache_modes() if PAT is > disabled, so that init_cache_modes() is always called. > > Mikulas [PATCH v2] Works for me! CHeers, Bernhard ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-06-06 22:49 ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 2017-06-06 22:51 ` Andy Lutomirski 2017-06-07 19:54 ` Bernhard Held @ 2017-07-03 5:05 ` Mikulas Patocka 2017-07-04 13:41 ` Thomas Gleixner 2 siblings, 1 reply; 31+ messages in thread From: Mikulas Patocka @ 2017-07-03 5:05 UTC (permalink / raw) To: Ingo Molnar Cc: Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Thomas Gleixner, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, 6 Jun 2017, Mikulas Patocka wrote: > Hi > > Here I send the second version of the patch. It drops the change from > boot_cpu_has(X86_FEATURE_PAT) to this_cpu_has(X86_FEATURE_PAT) (that > caused kernel to be unbootable for some people). > > Another change is that setup_arch() calls init_cache_modes() if PAT is > disabled, so that init_cache_modes() is always called. > > Mikulas Is there any progress with this patch? Will you accept it or do you want some changes to it? > From: Mikulas Patocka <mpatocka@redhat.com> > > In the file arch/x86/mm/pat.c, there's a variable __pat_enabled. The > variable is set to 1 by default and the function pat_init() sets > __pat_enabled to 0 if the CPU doesn't support PAT. > > However, on AMD K6-3 CPU, the processor initialization code never calls > pat_init() and so __pat_enabled stays 1 and the function pat_enabled() > returns true, even though the K6-3 CPU doesn't support PAT. > > The result of this bug is that this warning is produced when attemting to > start the Xserver and the Xserver doesn't start (fork() returns ENOMEM). > Another symptom of this bug is that the framebuffer driver doesn't set the > K6-3 MTRR registers. > > This patch changes pat_enabled() so that it returns true only if pat > initialization was actually done. > > As Andy Lutomirski suggested, we also need to call init_cache_modes() if > pat was not initialized, so we call it after mtrr_bp_init() > (mtrr_bp_init() would or wouldn't call pat_init()). Note that > init_cache_modes() detects if it was called more than once and does > nothing in that case. > > x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 3891 at arch/x86/mm/pat.c:1020 untrack_pfn+0x5c/0x9f > Modules linked in: blowfish_generic blowfish_common cbc dm_crypt dm_loop msr configfs cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative ext4 crc16 jbd2 mbcache hpfs nls_cp852 msdos fat matroxfb_base matroxfb_g450 matroxfb_accel cfbfillrect cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc floppy snd_usb_audio snd_hwdep snd_usbmidi_lib snd_seq_midi snd_seq_midi_event snd_rawmidi snd_pcm snd_seq snd_seq_device snd_timer hid_generic snd usbhid hid soundcore powernow_k6 pcspkr evdev e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common dm_mod pata_it821x unix > CPU: 0 PID: 3891 Comm: Xorg Not tainted 4.11.0-rc6-test+ #35 > Hardware name: System Manufacturer Product Name/VA-503A, BIOS 4.51 PG 08/02/00 > Call Trace: > ? __warn+0xab/0xc0 > ? untrack_pfn+0x5c/0x9f > ? warn_slowpath_null+0xf/0x13 > ? untrack_pfn+0x5c/0x9f > ? unmap_single_vma+0x43/0x66 > ? unmap_vmas+0x24/0x30 > ? exit_mmap+0x3c/0xa5 > ? __mmput+0xf/0x76 > ? copy_process.part.76+0xb43/0x1129 > ? _do_fork+0x96/0x177 > ? do_int80_syscall_32+0x3e/0x4c > ? entry_INT80_32+0x2a/0x2a > ---[ end trace 8726f9d9fa90d702 ]--- > x86/PAT: Xorg:3891 map pfn expected mapping type uncached-minus for [mem 0xe4000000-0xe5ffffff], got write-combining > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > Cc: stable@vger.kernel.org # v4.2+ > > --- > arch/x86/include/asm/pat.h | 1 + > arch/x86/kernel/setup.c | 2 ++ > arch/x86/mm/pat.c | 10 ++++++---- > 3 files changed, 9 insertions(+), 4 deletions(-) > > Index: linux-stable/arch/x86/mm/pat.c > =================================================================== > --- linux-stable.orig/arch/x86/mm/pat.c > +++ linux-stable/arch/x86/mm/pat.c > @@ -40,7 +40,6 @@ > static bool boot_cpu_done; > > static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); > -static void init_cache_modes(void); > > void pat_disable(const char *reason) > { > @@ -65,9 +64,11 @@ static int __init nopat(char *str) > } > early_param("nopat", nopat); > > +static bool __read_mostly __pat_initialized = false; > + > bool pat_enabled(void) > { > - return !!__pat_enabled; > + return __pat_initialized; > } > EXPORT_SYMBOL_GPL(pat_enabled); > > @@ -225,6 +226,7 @@ static void pat_bsp_init(u64 pat) > } > > wrmsrl(MSR_IA32_CR_PAT, pat); > + __pat_initialized = true; > > __init_cache_modes(pat); > } > @@ -242,7 +244,7 @@ static void pat_ap_init(u64 pat) > wrmsrl(MSR_IA32_CR_PAT, pat); > } > > -static void init_cache_modes(void) > +void init_cache_modes(void) > { > u64 pat = 0; > static int init_cm_done; > @@ -306,7 +308,7 @@ void pat_init(void) > u64 pat; > struct cpuinfo_x86 *c = &boot_cpu_data; > > - if (!pat_enabled()) { > + if (!__pat_enabled) { > init_cache_modes(); > return; > } > Index: linux-stable/arch/x86/include/asm/pat.h > =================================================================== > --- linux-stable.orig/arch/x86/include/asm/pat.h > +++ linux-stable/arch/x86/include/asm/pat.h > @@ -7,6 +7,7 @@ > bool pat_enabled(void); > void pat_disable(const char *reason); > extern void pat_init(void); > +extern void init_cache_modes(void); > > extern int reserve_memtype(u64 start, u64 end, > enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm); > Index: linux-stable/arch/x86/kernel/setup.c > =================================================================== > --- linux-stable.orig/arch/x86/kernel/setup.c > +++ linux-stable/arch/x86/kernel/setup.c > @@ -1070,6 +1070,8 @@ void __init setup_arch(char **cmdline_p) > > /* update e820 for memory not covered by WB MTRRs */ > mtrr_bp_init(); > + if (!pat_enabled()) > + init_cache_modes(); > if (mtrr_trim_uncached_memory(max_pfn)) > max_pfn = e820__end_of_ram_pfn(); > > ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-07-03 5:05 ` Mikulas Patocka @ 2017-07-04 13:41 ` Thomas Gleixner 2017-07-04 13:48 ` Thomas Gleixner 2017-07-04 23:04 ` [PATCH v3] " Mikulas Patocka 0 siblings, 2 replies; 31+ messages in thread From: Thomas Gleixner @ 2017-07-04 13:41 UTC (permalink / raw) To: Mikulas Patocka Cc: Ingo Molnar, Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Mon, 3 Jul 2017, Mikulas Patocka wrote: > Is there any progress with this patch? Will you accept it or do you want > some changes to it? Aside of the unparseable changelog, that patch is mostly duct tape. 1) __pat_enabled should be renamed to pat_disabled, as that is the actual purpose of that variable 2) Making the call to init_cache_modes() conditional in setup_arch() is pointless. init_cache_modes() has it's own protection against multiple invocations. 3) It adds yet another invocation of init_cache_modes() instead of getting rid of the ones in pat_disable() and the pat disabled case in pat_init(). I've reworked the whole thing into the patch below. Thanks, tglx 8<--------------------- Subject: x86/mm/pat: Don't report PAT on CPUs that don't support it From: Mikulas Patocka <mpatocka@redhat.com> Date: Tue, 6 Jun 2017 18:49:39 -0400 (EDT) The pat_enabled() logic is broken on CPUs which do not support PAT and where the initialization code fails to call pat_init(). Due to that the enabled flag stays true and pat_enabled() returns true wrongfully. As a consequence the mappings, e.g. for Xorg, are set up with the wrong caching mode and the required MTRR setups are omitted. To cure this the following changes are required: 1) Make pat_enabled() return true only if PAT initialization was invoked and successful. 2) Invoke init_cache_modes() unconditionally in setup_arch() and remove the extra callsites in pat_disable() and the pat disabled code path in pat_init(). Also rename __pat_enabled to pat_disabled to reflect the real purpose of this variable. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Bernhard Held <berny156@gmx.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> --- arch/x86/include/asm/pat.h | 1 + arch/x86/kernel/setup.c | 7 +++++++ arch/x86/mm/pat.c | 22 +++++++++------------- 3 files changed, 17 insertions(+), 13 deletions(-) --- a/arch/x86/include/asm/pat.h +++ b/arch/x86/include/asm/pat.h @@ -7,6 +7,7 @@ bool pat_enabled(void); void pat_disable(const char *reason); extern void pat_init(void); +extern void init_cache_modes(void); extern int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm); --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1076,6 +1076,13 @@ void __init setup_arch(char **cmdline_p) max_possible_pfn = max_pfn; /* + * This call is required when the CPU does not support PAT. If + * mtrr_bp_init() invoked it already via pat_init() the call has no + * effect. + */ + init_cache_modes(); + + /* * Define random base addresses for memory sections after max_pfn is * defined and before each memory section base is used. */ --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -37,14 +37,13 @@ #undef pr_fmt #define pr_fmt(fmt) "" fmt -static bool boot_cpu_done; - -static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); -static void init_cache_modes(void); +static bool __read_mostly boot_cpu_done; +static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT); +static bool __read_mostly pat_initialized; void pat_disable(const char *reason) { - if (!__pat_enabled) + if (pat_disabled) return; if (boot_cpu_done) { @@ -52,10 +51,8 @@ void pat_disable(const char *reason) return; } - __pat_enabled = 0; + pat_disabled = true; pr_info("x86/PAT: %s\n", reason); - - init_cache_modes(); } static int __init nopat(char *str) @@ -67,7 +64,7 @@ early_param("nopat", nopat); bool pat_enabled(void) { - return !!__pat_enabled; + return pat_initialized; } EXPORT_SYMBOL_GPL(pat_enabled); @@ -225,6 +222,7 @@ static void pat_bsp_init(u64 pat) } wrmsrl(MSR_IA32_CR_PAT, pat); + __pat_initialized = true; __init_cache_modes(pat); } @@ -242,7 +240,7 @@ static void pat_ap_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); } -static void init_cache_modes(void) +void init_cache_modes(void) { u64 pat = 0; static int init_cm_done; @@ -306,10 +304,8 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { - init_cache_modes(); + if (pat_disabled) return; - } if ((c->x86_vendor == X86_VENDOR_INTEL) && (((c->x86 == 0x6) && (c->x86_model <= 0xd)) || ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2] X86: don't report PAT on CPUs that don't support it 2017-07-04 13:41 ` Thomas Gleixner @ 2017-07-04 13:48 ` Thomas Gleixner 2017-07-04 23:04 ` [PATCH v3] " Mikulas Patocka 1 sibling, 0 replies; 31+ messages in thread From: Thomas Gleixner @ 2017-07-04 13:48 UTC (permalink / raw) To: Mikulas Patocka Cc: Ingo Molnar, Bernhard Held, Andy Lutomirski, Toshi Kani, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, 4 Jul 2017, Thomas Gleixner wrote: > wrmsrl(MSR_IA32_CR_PAT, pat); > + __pat_initialized = true; That should be > + pat_initialized = true; of course. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v3] X86: don't report PAT on CPUs that don't support it 2017-07-04 13:41 ` Thomas Gleixner 2017-07-04 13:48 ` Thomas Gleixner @ 2017-07-04 23:04 ` Mikulas Patocka 2017-07-05 7:03 ` [tip:x86/urgent] x86/mm/pat: Don't " tip-bot for Mikulas Patocka 1 sibling, 1 reply; 31+ messages in thread From: Mikulas Patocka @ 2017-07-04 23:04 UTC (permalink / raw) To: Thomas Gleixner Cc: Ingo Molnar, Bernhard Held, Andy Lutomirski, Borislav Petkov, Andrew Morton, Brian Gerst, Linus Torvalds, H. Peter Anvin, Peter Zijlstra, Luis R. Rodriguez, Denys Vlasenko, Josh Poimboeuf, linux-kernel On Tue, 4 Jul 2017, Thomas Gleixner wrote: > On Mon, 3 Jul 2017, Mikulas Patocka wrote: > > Is there any progress with this patch? Will you accept it or do you want > > some changes to it? > > Aside of the unparseable changelog, that patch is mostly duct tape. > > 1) __pat_enabled should be renamed to pat_disabled, as that is the actual > purpose of that variable > > 2) Making the call to init_cache_modes() conditional in setup_arch() is > pointless. init_cache_modes() has it's own protection against multiple > invocations. > > 3) It adds yet another invocation of init_cache_modes() instead of getting > rid of the ones in pat_disable() and the pat disabled case in pat_init(). > > I've reworked the whole thing into the patch below. > > Thanks, > > tglx Yes - renaming __pat_enabled to pat_disabled is a good thing. Just one more change - init_cache_modes() is protected against multiple calls, but pat_bsp_init() calls __init_cache_modes() (not init_cacha_modes()). The generic code would call init_cache_modes() later and init_cache_modes() would do the initialization again - it would be mostly harmless because it would just read the pat MSR that pat_bsp_init have written and call __init_cache_modes() with the same value - the symptom is that on machines with PAT we see this line twice in the syslog: [ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT [ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT I fixed this double initialization by moving the variable init_cm_done to file scope and setting it in __init_cache_modes(). Mikulas 8<--------------------- Subject: x86/mm/pat: Don't report PAT on CPUs that don't support it From: Mikulas Patocka <mpatocka@redhat.com> Date: Tue, 6 Jun 2017 18:49:39 -0400 (EDT) The pat_enabled() logic is broken on CPUs which do not support PAT and where the initialization code fails to call pat_init(). Due to that the enabled flag stays true and pat_enabled() returns true wrongfully. As a consequence the mappings, e.g. for Xorg, are set up with the wrong caching mode and the required MTRR setups are omitted. To cure this the following changes are required: 1) Make pat_enabled() return true only if PAT initialization was invoked and successful. 2) Invoke init_cache_modes() unconditionally in setup_arch() and remove the extra callsites in pat_disable() and the pat disabled code path in pat_init(). Also rename __pat_enabled to pat_disabled to reflect the real purpose of this variable. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Bernhard Held <berny156@gmx.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org # v4.2+ --- arch/x86/include/asm/pat.h | 1 + arch/x86/kernel/setup.c | 7 +++++++ arch/x86/mm/pat.c | 28 ++++++++++++---------------- 3 files changed, 20 insertions(+), 16 deletions(-) Index: linux-2.6/arch/x86/include/asm/pat.h =================================================================== --- linux-2.6.orig/arch/x86/include/asm/pat.h +++ linux-2.6/arch/x86/include/asm/pat.h @@ -7,6 +7,7 @@ bool pat_enabled(void); void pat_disable(const char *reason); extern void pat_init(void); +extern void init_cache_modes(void); extern int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm); Index: linux-2.6/arch/x86/kernel/setup.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/setup.c +++ linux-2.6/arch/x86/kernel/setup.c @@ -1076,6 +1076,13 @@ void __init setup_arch(char **cmdline_p) max_possible_pfn = max_pfn; /* + * This call is required when the CPU does not support PAT. If + * mtrr_bp_init() invoked it already via pat_init() the call has no + * effect. + */ + init_cache_modes(); + + /* * Define random base addresses for memory sections after max_pfn is * defined and before each memory section base is used. */ Index: linux-2.6/arch/x86/mm/pat.c =================================================================== --- linux-2.6.orig/arch/x86/mm/pat.c +++ linux-2.6/arch/x86/mm/pat.c @@ -37,14 +37,14 @@ #undef pr_fmt #define pr_fmt(fmt) "" fmt -static bool boot_cpu_done; - -static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); -static void init_cache_modes(void); +static bool __read_mostly boot_cpu_done; +static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT); +static bool __read_mostly pat_initialized; +static bool __read_mostly init_cm_done; void pat_disable(const char *reason) { - if (!__pat_enabled) + if (pat_disabled) return; if (boot_cpu_done) { @@ -52,10 +52,8 @@ void pat_disable(const char *reason) return; } - __pat_enabled = 0; + pat_disabled = true; pr_info("x86/PAT: %s\n", reason); - - init_cache_modes(); } static int __init nopat(char *str) @@ -67,7 +65,7 @@ early_param("nopat", nopat); bool pat_enabled(void) { - return !!__pat_enabled; + return pat_initialized; } EXPORT_SYMBOL_GPL(pat_enabled); @@ -205,6 +203,8 @@ static void __init_cache_modes(u64 pat) update_cache_mode_entry(i, cache); } pr_info("x86/PAT: Configuration [0-7]: %s\n", pat_msg); + + init_cm_done = true; } #define PAT(x, y) ((u64)PAT_ ## y << ((x)*8)) @@ -225,6 +225,7 @@ static void pat_bsp_init(u64 pat) } wrmsrl(MSR_IA32_CR_PAT, pat); + pat_initialized = true; __init_cache_modes(pat); } @@ -242,10 +243,9 @@ static void pat_ap_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); } -static void init_cache_modes(void) +void init_cache_modes(void) { u64 pat = 0; - static int init_cm_done; if (init_cm_done) return; @@ -287,8 +287,6 @@ static void init_cache_modes(void) } __init_cache_modes(pat); - - init_cm_done = 1; } /** @@ -306,10 +304,8 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { - init_cache_modes(); + if (pat_disabled) return; - } if ((c->x86_vendor == X86_VENDOR_INTEL) && (((c->x86 == 0x6) && (c->x86_model <= 0xd)) || ^ permalink raw reply [flat|nested] 31+ messages in thread
* [tip:x86/urgent] x86/mm/pat: Don't report PAT on CPUs that don't support it 2017-07-04 23:04 ` [PATCH v3] " Mikulas Patocka @ 2017-07-05 7:03 ` tip-bot for Mikulas Patocka 0 siblings, 0 replies; 31+ messages in thread From: tip-bot for Mikulas Patocka @ 2017-07-05 7:03 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, bp, dvlasenk, jpoimboe, akpm, peterz, brgerst, mcgrof, mingo, berny156, hpa, mpatocka, torvalds, luto, tglx Commit-ID: 99c13b8c8896d7bcb92753bf0c63a8de4326e78d Gitweb: http://git.kernel.org/tip/99c13b8c8896d7bcb92753bf0c63a8de4326e78d Author: Mikulas Patocka <mpatocka@redhat.com> AuthorDate: Tue, 4 Jul 2017 19:04:23 -0400 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Wed, 5 Jul 2017 09:01:24 +0200 x86/mm/pat: Don't report PAT on CPUs that don't support it The pat_enabled() logic is broken on CPUs which do not support PAT and where the initialization code fails to call pat_init(). Due to that the enabled flag stays true and pat_enabled() returns true wrongfully. As a consequence the mappings, e.g. for Xorg, are set up with the wrong caching mode and the required MTRR setups are omitted. To cure this the following changes are required: 1) Make pat_enabled() return true only if PAT initialization was invoked and successful. 2) Invoke init_cache_modes() unconditionally in setup_arch() and remove the extra callsites in pat_disable() and the pat disabled code path in pat_init(). Also rename __pat_enabled to pat_disabled to reflect the real purpose of this variable. Fixes: 9cd25aac1f44 ("x86/mm/pat: Emulate PAT when it is disabled") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bernhard Held <berny156@gmx.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707041749300.3456@file01.intranet.prod.int.rdu2.redhat.com --- arch/x86/include/asm/pat.h | 1 + arch/x86/kernel/setup.c | 7 +++++++ arch/x86/mm/pat.c | 28 ++++++++++++---------------- 3 files changed, 20 insertions(+), 16 deletions(-) diff --git a/arch/x86/include/asm/pat.h b/arch/x86/include/asm/pat.h index 0b1ff4c..fffb279 100644 --- a/arch/x86/include/asm/pat.h +++ b/arch/x86/include/asm/pat.h @@ -7,6 +7,7 @@ bool pat_enabled(void); void pat_disable(const char *reason); extern void pat_init(void); +extern void init_cache_modes(void); extern int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_pcm, enum page_cache_mode *ret_pcm); diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 65622f0..3486d04 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -1076,6 +1076,13 @@ void __init setup_arch(char **cmdline_p) max_possible_pfn = max_pfn; /* + * This call is required when the CPU does not support PAT. If + * mtrr_bp_init() invoked it already via pat_init() the call has no + * effect. + */ + init_cache_modes(); + + /* * Define random base addresses for memory sections after max_pfn is * defined and before each memory section base is used. */ diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 9b78685..4597950 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -37,14 +37,14 @@ #undef pr_fmt #define pr_fmt(fmt) "" fmt -static bool boot_cpu_done; - -static int __read_mostly __pat_enabled = IS_ENABLED(CONFIG_X86_PAT); -static void init_cache_modes(void); +static bool __read_mostly boot_cpu_done; +static bool __read_mostly pat_disabled = !IS_ENABLED(CONFIG_X86_PAT); +static bool __read_mostly pat_initialized; +static bool __read_mostly init_cm_done; void pat_disable(const char *reason) { - if (!__pat_enabled) + if (pat_disabled) return; if (boot_cpu_done) { @@ -52,10 +52,8 @@ void pat_disable(const char *reason) return; } - __pat_enabled = 0; + pat_disabled = true; pr_info("x86/PAT: %s\n", reason); - - init_cache_modes(); } static int __init nopat(char *str) @@ -67,7 +65,7 @@ early_param("nopat", nopat); bool pat_enabled(void) { - return !!__pat_enabled; + return pat_initialized; } EXPORT_SYMBOL_GPL(pat_enabled); @@ -205,6 +203,8 @@ static void __init_cache_modes(u64 pat) update_cache_mode_entry(i, cache); } pr_info("x86/PAT: Configuration [0-7]: %s\n", pat_msg); + + init_cm_done = true; } #define PAT(x, y) ((u64)PAT_ ## y << ((x)*8)) @@ -225,6 +225,7 @@ static void pat_bsp_init(u64 pat) } wrmsrl(MSR_IA32_CR_PAT, pat); + pat_initialized = true; __init_cache_modes(pat); } @@ -242,10 +243,9 @@ static void pat_ap_init(u64 pat) wrmsrl(MSR_IA32_CR_PAT, pat); } -static void init_cache_modes(void) +void init_cache_modes(void) { u64 pat = 0; - static int init_cm_done; if (init_cm_done) return; @@ -287,8 +287,6 @@ static void init_cache_modes(void) } __init_cache_modes(pat); - - init_cm_done = 1; } /** @@ -306,10 +304,8 @@ void pat_init(void) u64 pat; struct cpuinfo_x86 *c = &boot_cpu_data; - if (!pat_enabled()) { - init_cache_modes(); + if (pat_disabled) return; - } if ((c->x86_vendor == X86_VENDOR_INTEL) && (((c->x86 == 0x6) && (c->x86_model <= 0xd)) || ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT @ 2017-05-31 0:53 Doug Smythies 2017-06-01 7:49 ` Ian W MORRISON 0 siblings, 1 reply; 31+ messages in thread From: Doug Smythies @ 2017-05-31 0:53 UTC (permalink / raw) To: 'Bernhard Held', 'Mikulas Patocka', 'Andy Lutomirski', 'Ingo Molnar' Cc: linux-kernel, Doug Smythies Note Before: I might not have the address list correct, as I have re-created this e-mail from the web page archive, having found the thread after bisecting the kernel. On 2017.05.29 18:50:57 -0400 (EDT) Mikulas Patocka wrote: > On Sun, 28 May 2017, Andy Lutomirski wrote: >> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: >>> Hi, >>> >>> this patch breaks the boot of my kernel. The last message is "Booting >>> the kernel.". It breaks my kernel boot also, and I know of at least two others with the same, or similar, problem as of kernel 4.12-rc3. >>> My setup might be unusual: I'm running a Xenon E5450 (LGA 771) in a >>> Gigbayte G33-DS3R board (LGA 775). The BIOS is patched with the >>> microcode of the E5450 and recognizes the CPU. I do not think my test server setup is unusual. I use Ubuntu 16.04.2 server edition as my distro, and steal Ubuntu kernel configurations for compiling. My processor is an older model i7 (Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz) > Hi > > Please do the following three tests and test if the kernel boots. > > 1. use the PAT patch and revert the change to the function pat_enabled() > - i.e. change it to the original: > bool pat_enabled(void) > { > return !!__pat_enabled; > } Test 1 result: fail > > 2. use the PAT patch and revert the change to the function pat_ap_init > - i.e. change it to the original: > static void pat_ap_init(u64 pat) > { > if (!boot_cpu_has(X86_FEATURE_PAT)) { Test 2 result: pass > 3. use the full PAT patch and apply the below patch on the top of it. > Test 3 result: fail >> I think this patch is bogus. pat_enabled() sure looks like it's >> supposed to return true if PAT is *enabled*, and these days PAT is >> "enabled" even if there's no HW PAT support. Even if the patch were >> somehow correct, it should have been split up into two patches, one to >> change pat_enabled() and one to use this_cpu_has(). >> >> Ingo, I'd suggest reverting the patch, cc-ing stable on the revert so >> -stable knows not to backport it, and starting over with the fix. >> From very brief inspection, the right fix is to make sure that >> pat_init(), or at least init_cache_modes(), gets called on the >> > pat_init() needs to be called with cache disabled - and the cache disable > code (functions prepare_set() and post_set()) exists in > arch/x86/kernel/cpu/mtrr/generic.c - it may not be compiled if CONFIG_MTRR > is not set. > > Though, it is possible to call init_cache_modes() - see the patch below. > init_cache_modes() does nothing if it is called multiple times. > >> affected CPUs. >> >> As a future cleanup, I think that pat_enabled() could be deleted >> outright and, if needed, replaced by functions like have_memtype_wc() >> or similar. (Do we already have helpers like that?) Toshi, am I >> right? >> >> --Andy ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-05-31 0:53 [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT Doug Smythies @ 2017-06-01 7:49 ` Ian W MORRISON 2017-06-01 14:48 ` Ian W MORRISON 0 siblings, 1 reply; 31+ messages in thread From: Ian W MORRISON @ 2017-06-01 7:49 UTC (permalink / raw) To: Doug Smythies Cc: Bernhard Held, Mikulas Patocka, Andy Lutomirski, Ingo Molnar, linux-kernel On 31 May 2017 at 10:53, Doug Smythies <dsmythies@telus.net> wrote: > Note Before: > I might not have the address list correct, as I have re-created this > e-mail from the web page archive, having found the thread after bisecting the > kernel. > > On 2017.05.29 18:50:57 -0400 (EDT) Mikulas Patocka wrote: >> On Sun, 28 May 2017, Andy Lutomirski wrote: >>> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: >>>> Hi, >>>> >>>> this patch breaks the boot of my kernel. The last message is "Booting >>>> the kernel.". > > It breaks my kernel boot also, and I know of at least two others with > the same, or similar, problem as of kernel 4.12-rc3. > Just to add that I cannot boot v4.12-rc3 kernel with any Intel Atom (BYT and CHT) Intel Compute Sticks. Adding 'earlyprintk=efi' confirms kernel panic with [ 0.000000] Kernel panic - not syncing: x86/PAT: PAT enabled, but not supported by secondary CPU <snip> ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT 2017-06-01 7:49 ` Ian W MORRISON @ 2017-06-01 14:48 ` Ian W MORRISON 0 siblings, 0 replies; 31+ messages in thread From: Ian W MORRISON @ 2017-06-01 14:48 UTC (permalink / raw) To: Doug Smythies Cc: Bernhard Held, Mikulas Patocka, Andy Lutomirski, Ingo Molnar, linux-kernel On 6/1/17 5:49 PM, Ian W MORRISON wrote: > On 31 May 2017 at 10:53, Doug Smythies <dsmythies@telus.net> wrote: >> Note Before: >> I might not have the address list correct, as I have re-created this >> e-mail from the web page archive, having found the thread after bisecting the >> kernel. >> >> On 2017.05.29 18:50:57 -0400 (EDT) Mikulas Patocka wrote: >>> On Sun, 28 May 2017, Andy Lutomirski wrote: >>>> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote: >>>>> Hi, >>>>> >>>>> this patch breaks the boot of my kernel. The last message is "Booting >>>>> the kernel.". >> >> It breaks my kernel boot also, and I know of at least two others with >> the same, or similar, problem as of kernel 4.12-rc3. >> > > Just to add that I cannot boot v4.12-rc3 kernel with any Intel Atom > (BYT and CHT) Intel Compute Sticks. Adding 'earlyprintk=efi' confirms > kernel panic with > [ 0.000000] Kernel panic - not syncing: x86/PAT: PAT enabled, but not > supported by secondary CPU > > > <snip> > Confirmed that following patch fixes boot of v4.12-rc3 for Intel Atom Compute Sticks: [PATCH] Fix X86_FEATURE_PAT regression bug Early kernel panic caused by checking for X86_FEATURE_PAT when enabled but not supported by secondary CPU. Fixes cbed27cdf0e3 ("x86/PAT: Fix Xorg regression on CPUs that don't support PAT") Signed-off-by: Ian W Morrison <ianwmorrison@gmail.com> --- arch/x86/mm/pat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 83a59a6..c537bfb 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -234,7 +234,7 @@ static void pat_bsp_init(u64 pat) static void pat_ap_init(u64 pat) { - if (!this_cpu_has(X86_FEATURE_PAT)) { + if (!boot_cpu_has(X86_FEATURE_PAT)) { /* * If this happens we are on a secondary CPU, but switched to * PAT on the boot CPU. We have no way to undo PAT. -- 1.9.1 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT
@ 2017-06-02 6:32 Ian W MORRISON
0 siblings, 0 replies; 31+ messages in thread
From: Ian W MORRISON @ 2017-06-02 6:32 UTC (permalink / raw)
To: linux-kernel
Cc: Bernhard Held, Mikulas Patocka, Andy Lutomirski, Ingo Molnar,
chris, Doug Smythies
On 2 June 2017 at 00:48, Ian W MORRISON <ianwmorrison@gmail.com> wrote:
> On 6/1/17 5:49 PM, Ian W MORRISON wrote:
>> On 31 May 2017 at 10:53, Doug Smythies <dsmythies@telus.net> wrote:
>>> Note Before:
>>> I might not have the address list correct, as I have re-created this
>>> e-mail from the web page archive, having found the thread after bisecting the
>>> kernel.
>>>
>>> On 2017.05.29 18:50:57 -0400 (EDT) Mikulas Patocka wrote:
>>>> On Sun, 28 May 2017, Andy Lutomirski wrote:
>>>>> On Sun, May 28, 2017 at 11:18 AM, Bernhard Held <berny156@gmx.de> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> this patch breaks the boot of my kernel. The last message is "Booting
>>>>>> the kernel.".
>>>
>>> It breaks my kernel boot also, and I know of at least two others with
>>> the same, or similar, problem as of kernel 4.12-rc3.
>>>
>>
>> Just to add that I cannot boot v4.12-rc3 kernel with any Intel Atom
>> (BYT and CHT) Intel Compute Sticks. Adding 'earlyprintk=efi' confirms
>> kernel panic with
>> [ 0.000000] Kernel panic - not syncing: x86/PAT: PAT enabled, but not
>> supported by secondary CPU
>>
>>
>> <snip>
>>
>
> Confirmed that following patch fixes boot of v4.12-rc3 for Intel Atom Compute Sticks:
>
> [PATCH] Fix X86_FEATURE_PAT regression bug
>
<snip>
And for the sake of completeness, the revert patch
(https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent)
also fixes the boot of v4.12-rc3 for Intel Atom Compute Sticks (as
expected and now tested - yay).
^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2017-07-05 7:09 UTC | newest] Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-04-18 19:07 [PATCH] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 2017-04-18 19:28 ` H. Peter Anvin 2017-04-18 20:47 ` Mikulas Patocka 2017-05-14 22:07 ` Mikulas Patocka 2017-05-16 13:57 ` H. Peter Anvin 2017-05-16 15:49 ` Mikulas Patocka 2017-05-18 7:17 ` Ingo Molnar 2017-05-24 10:21 ` [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT tip-bot for Mikulas Patocka 2017-05-28 18:18 ` Bernhard Held 2017-05-28 18:43 ` Andy Lutomirski 2017-05-29 22:50 ` Mikulas Patocka 2017-05-30 17:14 ` Dominik Brodowski 2017-05-30 17:59 ` Mikulas Patocka 2017-05-30 18:47 ` Dominik Brodowski 2017-05-30 19:30 ` Bernhard Held 2017-05-31 9:39 ` Junichi Nomura 2017-06-06 22:49 ` [PATCH v2] X86: don't report PAT on CPUs that don't support it Mikulas Patocka 2017-06-06 22:51 ` Andy Lutomirski 2017-06-06 23:21 ` Mikulas Patocka 2017-06-13 15:54 ` Andy Lutomirski 2017-06-14 20:24 ` Mikulas Patocka 2017-06-07 19:54 ` Bernhard Held 2017-07-03 5:05 ` Mikulas Patocka 2017-07-04 13:41 ` Thomas Gleixner 2017-07-04 13:48 ` Thomas Gleixner 2017-07-04 23:04 ` [PATCH v3] " Mikulas Patocka 2017-07-05 7:03 ` [tip:x86/urgent] x86/mm/pat: Don't " tip-bot for Mikulas Patocka 2017-05-31 0:53 [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT Doug Smythies 2017-06-01 7:49 ` Ian W MORRISON 2017-06-01 14:48 ` Ian W MORRISON 2017-06-02 6:32 Ian W MORRISON
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).