linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] 5-level paging changes from v4.18
@ 2018-05-11 12:08 Kirill A. Shutemov
  2018-05-11 12:08 ` [PATCH 1/2] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
  2018-05-11 12:08 ` [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
  0 siblings, 2 replies; 6+ messages in thread
From: Kirill A. Shutemov @ 2018-05-11 12:08 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov

Couple of changes for v4.18. One fix for a non-critical bug and the new
kernel command line option to disable 5-level paging.

Kirill A. Shutemov (2):
  x86/boot/compressed/64: Fix trampoline page table address calculation
  x86/mm: Introduce 'no5lvl' kernel parameter

 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 arch/x86/boot/compressed/cmdline.c              |  2 +-
 arch/x86/boot/compressed/head_64.S              |  1 +
 arch/x86/boot/compressed/pgtable_64.c           | 14 +++++++++++---
 arch/x86/kernel/cpu/common.c                    |  6 ++++++
 arch/x86/kernel/head64.c                        | 10 ++++++----
 6 files changed, 28 insertions(+), 8 deletions(-)

-- 
2.17.0

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] x86/boot/compressed/64: Fix trampoline page table address calculation
  2018-05-11 12:08 [PATCH 0/2] 5-level paging changes from v4.18 Kirill A. Shutemov
@ 2018-05-11 12:08 ` Kirill A. Shutemov
  2018-05-11 12:08 ` [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
  1 sibling, 0 replies; 6+ messages in thread
From: Kirill A. Shutemov @ 2018-05-11 12:08 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov

Hugh noticied that I calculate address of trampoline page table wrong in
cleanup_trampoline(). TRAMPOLINE_32BIT_PGTABLE_OFFSET has to be divided
by sizeof(unsigned long) since trampoline_32bit is unsigned long
pointer.

TRAMPOLINE_32BIT_PGTABLE_OFFSET is zero so the bug doesn't have a
visible effect.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Hugh Dickins <hughd@google.com>
Fixes: e9d0e6330eb8 ("x86/boot/compressed/64: Prepare new top-level page table for trampoline")
---
 arch/x86/boot/compressed/pgtable_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index a362fa0b849c..23707e1da1ff 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -130,7 +130,7 @@ void cleanup_trampoline(void *pgtable)
 {
 	void *trampoline_pgtable;
 
-	trampoline_pgtable = trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET;
+	trampoline_pgtable = trampoline_32bit + TRAMPOLINE_32BIT_PGTABLE_OFFSET / sizeof(unsigned long);
 
 	/*
 	 * Move the top level page table out of trampoline memory,
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter
  2018-05-11 12:08 [PATCH 0/2] 5-level paging changes from v4.18 Kirill A. Shutemov
  2018-05-11 12:08 ` [PATCH 1/2] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
@ 2018-05-11 12:08 ` Kirill A. Shutemov
  2018-05-13 21:55   ` Thomas Gleixner
  1 sibling, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2018-05-11 12:08 UTC (permalink / raw)
  To: Ingo Molnar, x86, Thomas Gleixner, H. Peter Anvin
  Cc: Hugh Dickins, linux-kernel, Kirill A. Shutemov

The kernel parameter allows to force kernel to use 4-level paging even
if hardware and kernel support 5-level paging.

The option may be useful to workaround regressions related to 5-level
paging.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 arch/x86/boot/compressed/cmdline.c              |  2 +-
 arch/x86/boot/compressed/head_64.S              |  1 +
 arch/x86/boot/compressed/pgtable_64.c           | 12 ++++++++++--
 arch/x86/kernel/cpu/common.c                    |  6 ++++++
 arch/x86/kernel/head64.c                        | 10 ++++++----
 6 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 11fc28ecdb6d..364a33c1534d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2600,6 +2600,9 @@
 			emulation library even if a 387 maths coprocessor
 			is present.
 
+	no5lvl		[X86-64] Disable 5-level paging mode. Forces
+			kernel to use 4-level paging instead.
+
 	no_console_suspend
 			[HW] Never suspend the console
 			Disable suspending of consoles during suspend and
diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
index 0cb325734cfb..af6cda0b7900 100644
--- a/arch/x86/boot/compressed/cmdline.c
+++ b/arch/x86/boot/compressed/cmdline.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "misc.h"
 
-#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE
+#if CONFIG_EARLY_PRINTK || CONFIG_RANDOMIZE_BASE || CONFIG_X86_5LEVEL
 
 static unsigned long fs;
 static inline void set_fs(unsigned long seg)
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 1acf7b5d16cf..3ecd8399ad1f 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -361,6 +361,7 @@ ENTRY(startup_64)
 	 * this function call.
 	 */
 	pushq	%rsi
+	movq	%rsi, %rdi		/* real mode address */
 	call	paging_prepare
 	popq	%rsi
 
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index 23707e1da1ff..8c5107545251 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -31,16 +31,23 @@ static char trampoline_save[TRAMPOLINE_32BIT_SIZE];
  */
 unsigned long *trampoline_32bit __section(.data);
 
-struct paging_config paging_prepare(void)
+extern struct boot_params *boot_params;
+int cmdline_find_option_bool(const char *option);
+
+struct paging_config paging_prepare(void *rmode)
 {
 	struct paging_config paging_config = {};
 	unsigned long bios_start, ebda_start;
 
+	/* Initialize boot_params. Required for cmdline_find_option_bool(). */
+	boot_params = rmode;
+
 	/*
 	 * Check if LA57 is desired and supported.
 	 *
-	 * There are two parts to the check:
+	 * There are several parts to the check:
 	 *   - if the kernel supports 5-level paging: CONFIG_X86_5LEVEL=y
+	 *   - if user asked to disable 5-level paging: no5lvl in cmdline
 	 *   - if the machine supports 5-level paging:
 	 *     + CPUID leaf 7 is supported
 	 *     + the leaf has the feature bit set
@@ -48,6 +55,7 @@ struct paging_config paging_prepare(void)
 	 * That's substitute for boot_cpu_has() in early boot code.
 	 */
 	if (IS_ENABLED(CONFIG_X86_5LEVEL) &&
+			!cmdline_find_option_bool("no5lvl") &&
 			native_cpuid_eax(0) >= 7 &&
 			(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) {
 		paging_config.l5_required = 1;
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ce243f7d2d4e..1e91ec6293de 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1008,6 +1008,12 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 	 */
 	setup_clear_cpu_cap(X86_FEATURE_PCID);
 #endif
+
+#ifdef CONFIG_X86_5LEVEL
+	/* Clear the 5-level paging feature if user asked for 'no5lvl' */
+	if (!__pgtable_l5_enabled)
+		setup_clear_cpu_cap(X86_FEATURE_LA57);
+#endif
 }
 
 void __init early_cpu_init(void)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 0c408f8c4ed4..8ca65d35b93a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -82,10 +82,12 @@ static unsigned int __head *fixup_int(void *ptr, unsigned long physaddr)
 
 static bool __head check_la57_support(unsigned long physaddr)
 {
-	if (native_cpuid_eax(0) < 7)
-		return false;
-
-	if (!(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31))))
+	/*
+	 * 5-level paging is detected and enabled on kernel decomression
+	 * stage. Back off if 5-level paging mode has not yet enabled by
+	 * this point.
+	 */
+	if (!(native_read_cr4() & X86_CR4_LA57))
 		return false;
 
 	*fixup_int(&pgtable_l5_enabled, physaddr) = 1;
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter
  2018-05-11 12:08 ` [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
@ 2018-05-13 21:55   ` Thomas Gleixner
  2018-05-14 21:50     ` Kirill A. Shutemov
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2018-05-13 21:55 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel

On Fri, 11 May 2018, Kirill A. Shutemov wrote:
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1008,6 +1008,12 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
>  	 */
>  	setup_clear_cpu_cap(X86_FEATURE_PCID);
>  #endif
> +
> +#ifdef CONFIG_X86_5LEVEL
> +	/* Clear the 5-level paging feature if user asked for 'no5lvl' */

no5lvl is only one reason why 5 level paging is not available.

> +	if (!__pgtable_l5_enabled)
> +		setup_clear_cpu_cap(X86_FEATURE_LA57);
> +#endif

And that #ifdeffery can be avoided by simply doing:

	if (IS_ENABLED(CONFIG_X86_5LEVEL) && !pgtable_l5_enabled)
		setup_clear_cpu_cap(X86_FEATURE_LA57);

Hmm?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter
  2018-05-13 21:55   ` Thomas Gleixner
@ 2018-05-14 21:50     ` Kirill A. Shutemov
  2018-05-15 23:45       ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2018-05-14 21:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel

On Sun, May 13, 2018 at 09:55:00PM +0000, Thomas Gleixner wrote:
> On Fri, 11 May 2018, Kirill A. Shutemov wrote:
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -1008,6 +1008,12 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
> >  	 */
> >  	setup_clear_cpu_cap(X86_FEATURE_PCID);
> >  #endif
> > +
> > +#ifdef CONFIG_X86_5LEVEL
> > +	/* Clear the 5-level paging feature if user asked for 'no5lvl' */
> 
> no5lvl is only one reason why 5 level paging is not available.

The 5-level paging may not be available for few reasons:
 - CONFIG_X86_5LEVEL=n
 - Machine doesn't support X86_FEATURE_LA57 -- clearing is nop;
 - User asked for 'no5lvl'.

To me the one-line comment reflects the situation reasonably well.

Do you want more verbose comment here?

> > +	if (!__pgtable_l5_enabled)
> > +		setup_clear_cpu_cap(X86_FEATURE_LA57);
> > +#endif
> 
> And that #ifdeffery can be avoided by simply doing:
> 
> 	if (IS_ENABLED(CONFIG_X86_5LEVEL) && !pgtable_l5_enabled)
> 		setup_clear_cpu_cap(X86_FEATURE_LA57);
> 
> Hmm?

Unfortunately, no.

pgtable_l5_enabled is defined as cpu_feature_enabled(X86_FEATURE_LA57).
So with such change we would only clear the feature bit if it's already
clear.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter
  2018-05-14 21:50     ` Kirill A. Shutemov
@ 2018-05-15 23:45       ` Thomas Gleixner
  0 siblings, 0 replies; 6+ messages in thread
From: Thomas Gleixner @ 2018-05-15 23:45 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Ingo Molnar, x86, H. Peter Anvin, Hugh Dickins, linux-kernel

On Tue, 15 May 2018, Kirill A. Shutemov wrote:

> On Sun, May 13, 2018 at 09:55:00PM +0000, Thomas Gleixner wrote:
> > On Fri, 11 May 2018, Kirill A. Shutemov wrote:
> > > --- a/arch/x86/kernel/cpu/common.c
> > > +++ b/arch/x86/kernel/cpu/common.c
> > > @@ -1008,6 +1008,12 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
> > >  	 */
> > >  	setup_clear_cpu_cap(X86_FEATURE_PCID);
> > >  #endif
> > > +
> > > +#ifdef CONFIG_X86_5LEVEL
> > > +	/* Clear the 5-level paging feature if user asked for 'no5lvl' */
> > 
> > no5lvl is only one reason why 5 level paging is not available.
> 
> The 5-level paging may not be available for few reasons:
>  - CONFIG_X86_5LEVEL=n
>  - Machine doesn't support X86_FEATURE_LA57 -- clearing is nop;
>  - User asked for 'no5lvl'.
> 
> To me the one-line comment reflects the situation reasonably well.
> Do you want more verbose comment here?

The question is what of that set __pgtable_l5_enabled to 0? Is it only the
no5lvl command line option? If not, then the comment is misleading.

> > > +	if (!__pgtable_l5_enabled)
> > > +		setup_clear_cpu_cap(X86_FEATURE_LA57);
> > > +#endif
> > 
> > And that #ifdeffery can be avoided by simply doing:
> > 
> > 	if (IS_ENABLED(CONFIG_X86_5LEVEL) && !pgtable_l5_enabled)
> > 		setup_clear_cpu_cap(X86_FEATURE_LA57);
> > 
> > Hmm?
> 
> Unfortunately, no.
> 
> pgtable_l5_enabled is defined as cpu_feature_enabled(X86_FEATURE_LA57).
> So with such change we would only clear the feature bit if it's already
> clear.

Bah. I really hate stuff which looks like a variable but in fact is a
function. The whole pgtable_l5_enabled vs. __pgtable_l5_enabled stuff is
pretty horrid to be honest.

arch/x86/kernel/head64.c:

#ifdef CONFIG_X86_5LEVEL
#undef pgtable_l5_enabled
#define pgtable_l5_enabled __pgtable_l5_enabled
#endif

arch/x86/mm/kasan_init_64.c

#ifdef CONFIG_X86_5LEVEL
/* Too early to use cpu_feature_enabled() */
#define pgtable_l5_enabled __pgtable_l5_enabled
#endif

The same hack is in arch/x86/boot/compressed/misc.h:

#ifdef CONFIG_X86_5LEVEL
/* Too early to use cpu_feature_enabled() */
#define pgtable_l5_enabled __pgtable_l5_enabled
#endif

while arch/x86/boot/compressed/kaslr.c has:

#ifdef CONFIG_X86_5LEVEl
unsigned int pgtable_l5_enabled __ro_after_init;

_AFTER_ including mish.h 

Groan. That's just 'Yay, the compiler does not barf' programming, It's the
macro substitution magic which makes it compile

#define WTF wtf
#define USE_WTF WTF

int WTF;

int fun()
{
	return USE_WTF;
}

actually compiles and returns the content of the variable WTF because the
first define makes the compiler to substitute all instances of WTF with the
lower case 'wtf'.

Its clever to abuse that, but it's mind boggling enough if you look at it
in a single file, but when it's all over the place it just becomes
incomprehensible.

Can we pretty please clean all of this up and make it in a way that a
simple grep is sufficient to grok all the magic behind that?

The issues with the boot/compressed code can be avoided by having a define
in the relevant files before including the pgtable headers, or any
header. boot/compressed undefines already CONFIG_ symbols and defines other
stuff for similar reasons.

So either have a define in misc.h or add something like this in the
boot/compressed/Makefile

KBUILD_CFLAGS	+= -DBUILD_COMPRESSED

The same applies to all the other files where you have to rely on the same
macro substitution stuff or even undefine it and the #ifndef
pgtable_l5_enabled guard in pgtable_64_types.h.

You can just have something like

#define USE_EARLY_PGTABLE_L5

before including any header file and the whole mess becomes a single place
in include/asm/pgtable.h

#ifndef CONFIG_X86_5LEVEL
static inline bool pgtable_l5_enabled(void)
{
	return false;
}
#elif defined(BUILD_COMPRESSED)
static inline bool pgtable_l5_enabled(void)
{
	return boot_pgtable_l5_enabled;
}
#elif defined(USE_EARLY_PGTABLE_L5)
static inline bool pgtable_l5_enabled(void)
{
	return __pgtable_l5_enabled;
}
#else
static inline bool pgtable_l5_enabled(void)
{
	return cpu_feature_enabled(X86_FEATURE_LA57);
}
#endif

And that makes every use case immidiately obvious and clear. No nested
layers of macro susbstitutions, not ambigousities. No need for the same
#ifdef blocks and no #ifndef pgtable_l5_enabled guards either. Not SIX
places which define pgtable_l5_enabled. The same thing for 32 and 64 bit.

The fast path case which needs cpu_feature_enabled() might not work with
the inline due to include dependency hell, but it still can be made a macro
which looks like a function,

One other thing:

__pgtable_l5_enabled is defined in head64.c

#ifdef CONFIG_X86_5LEVEL
unsigned int __pgtable_l5_enabled __ro_after_init;
EXPORT_SYMBOL(__pgtable_l5_enabled);

First of all this should be __init_data because after init nothing needs
that anymore. Plus the export is completely pointless.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-05-15 23:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-11 12:08 [PATCH 0/2] 5-level paging changes from v4.18 Kirill A. Shutemov
2018-05-11 12:08 ` [PATCH 1/2] x86/boot/compressed/64: Fix trampoline page table address calculation Kirill A. Shutemov
2018-05-11 12:08 ` [PATCH 2/2] x86/mm: Introduce 'no5lvl' kernel parameter Kirill A. Shutemov
2018-05-13 21:55   ` Thomas Gleixner
2018-05-14 21:50     ` Kirill A. Shutemov
2018-05-15 23:45       ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).