linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/mm/KASLR: Do not adapt the size of the direct mapping section for SGI UV system
@ 2017-05-18  6:52 Baoquan He
  2017-05-18 20:26 ` Mike Travis
  0 siblings, 1 reply; 4+ messages in thread
From: Baoquan He @ 2017-05-18  6:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Matt Fleming, Ard Biesheuvel, travis, Dimitri Sivanich,
	Thomas Garnier, Kees Cook, Andrew Morton, Masahiro Yamada,
	Russ Anderson, Frank Ramsay

On SGI UV system, kernel casually hang with kaslr enabled.

The back trace is:

kernel BUG at arch/x86/mm/init_64.c:311!
invalid opcode: 0000 [#1] SMP
[...]
RIP: 0010:__init_extra_mapping+0x188/0x196
[...]
Call Trace:
 init_extra_mapping_uc+0x13/0x15
 map_high+0x67/0x75
 map_mmioh_high_uv3+0x20a/0x219
 uv_system_init_hub+0x12d9/0x1496
 uv_system_init+0x27/0x29
 native_smp_prepare_cpus+0x28d/0x2d8
 kernel_init_freeable+0xdd/0x253
 ? rest_init+0x80/0x80
 kernel_init+0xe/0x110
 ret_from_fork+0x2c/0x40

The root cause is SGI UV system needs map its MMIOH region to direct
mapping section.

When kaslr disabled, there are 64TB space for system RAM to do direct
mapping. Both system RAM and SGI UV MMIOH region share this 64TB space.
However with kaslr enabled, mm KASLR only reserves the actual size of
system RAM plus 10TB for direct mapping usage. Then MMIOH mapping of
SGI UV could go beyond the upper bound of direct mapping section to step
into VMALLOC or VMEMMAP area. Then the BUG_ON() in __init_extra_mapping()
will be triggered.

E.g on the SGI UV3 machine where this bug is reported , there are two MMIOH
regions:

[    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
[    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000

They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G ram are
spread out to 1TB regions. Then above two SGI MMIOH regions also will be
mapped into the direct mapping section. The point is that SGI UV maps its
MMIOH regions to the direct mapping section in uv_system_init() which is
called during kernel_init_freeable(). It's much later than
kernel_randomize_memory(). Mm KASLR has no chance to take it into
consideration.

To fix it, we need detect if it's SGI UV system early, then do not adapt
the size of the direct mapping section in kernel_randomize_memory() if yes.
According to discussion with SGI developer, "The SGI bios adds UVsystab.
Only systems running SGI bios (and now HPE Hawks2) will have UVsystab."

Hence check if UVsystab is present, if yes, do not adapt the size of the
direct mapping section in kernel_randomize_memory()

Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: "travis@sgi.com" <travis@sgi.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Frank Ramsay <frank.ramsay@hpe.com>
linux-efi@vger.kernel.org
---
 arch/x86/include/asm/uv/uv.h | 2 ++
 arch/x86/mm/kaslr.c          | 4 +++-
 arch/x86/platform/efi/efi.c  | 7 +++++++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/uv/uv.h b/arch/x86/include/asm/uv/uv.h
index 6686820..0275776 100644
--- a/arch/x86/include/asm/uv/uv.h
+++ b/arch/x86/include/asm/uv/uv.h
@@ -11,6 +11,7 @@ struct mm_struct;
 extern enum uv_system_type get_uv_system_type(void);
 extern int is_uv_system(void);
 extern int is_uv_hubless(void);
+extern int is_early_uv_system(void);
 extern void uv_cpu_init(void);
 extern void uv_nmi_init(void);
 extern void uv_system_init(void);
@@ -25,6 +26,7 @@ extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
 static inline int is_uv_system(void)	{ return 0; }
 static inline int is_uv_hubless(void)	{ return 0; }
+static inline int is_early_uv_system(void)	{ return 0; }
 static inline void uv_cpu_init(void)	{ }
 static inline void uv_system_init(void)	{ }
 static inline const struct cpumask *
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index aed2064..b75e1f5 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -22,11 +22,13 @@
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/random.h>
+#include <linux/efi.h>
 
 #include <asm/pgalloc.h>
 #include <asm/pgtable.h>
 #include <asm/setup.h>
 #include <asm/kaslr.h>
+#include <asm/uv/uv.h>
 
 #include "mm_internal.h"
 
@@ -123,7 +125,7 @@ void __init kernel_randomize_memory(void)
 		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
 	/* Adapt phyiscal memory region size based on available memory */
-	if (memory_tb < kaslr_regions[0].size_tb)
+	if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
 		kaslr_regions[0].size_tb = memory_tb;
 
 	/* Calculate entropy available between regions */
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 7e76a4d..b71e01d 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -74,6 +74,13 @@ static int __init setup_add_efi_memmap(char *arg)
 }
 early_param("add_efi_memmap", setup_add_efi_memmap);
 
+#ifdef CONFIG_X86_UV
+int is_early_uv_system(void)
+{
+	return !((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab);
+}
+#endif
+
 static efi_status_t __init phys_efi_set_virtual_address_map(
 	unsigned long memory_map_size,
 	unsigned long descriptor_size,
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86/mm/KASLR: Do not adapt the size of the direct mapping section for SGI UV system
  2017-05-18  6:52 [PATCH] x86/mm/KASLR: Do not adapt the size of the direct mapping section for SGI UV system Baoquan He
@ 2017-05-18 20:26 ` Mike Travis
  2017-05-18 20:28   ` Mike Travis
  2017-05-19 11:33   ` Baoquan He
  0 siblings, 2 replies; 4+ messages in thread
From: Mike Travis @ 2017-05-18 20:26 UTC (permalink / raw)
  To: Baoquan He, linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Matt Fleming,
	Ard Biesheuvel, travis, Dimitri Sivanich, Thomas Garnier,
	Kees Cook, Andrew Morton, Masahiro Yamada, Russ Anderson,
	Frank Ramsay

Hi Baoquan He,

The concept of the patch is correct (since it is I as sent :),
but I would prefer that the entire is_early_uv_system() be put
into the uv.h file.  Primarily because we may need to change
that in the future so having UV specific code in other places
than under platform/uv is not that desirable.  Plus I think it's
better to consolidate the logic in one place.

Making it an inline function in uv.h is fine for now.  If we
need to change it, I'll move it to the UV specific init
function.

And if it's alright with others, you should add the #include
of the efi.h header file under the part of uv.h that has
#ifdef CONFIG_X86_UV defined.

That would mean the KASLR function would only add an
#include <asm/uv/uv.h> to access that UV specific funtion.

Thanks,
Mike

On 5/17/2017 11:52 PM, Baoquan He wrote:
> On SGI UV system, kernel casually hang with kaslr enabled.
> 
> The back trace is:
> 
> kernel BUG at arch/x86/mm/init_64.c:311!
> invalid opcode: 0000 [#1] SMP
> [...]
> RIP: 0010:__init_extra_mapping+0x188/0x196
> [...]
> Call Trace:
>  init_extra_mapping_uc+0x13/0x15
>  map_high+0x67/0x75
>  map_mmioh_high_uv3+0x20a/0x219
>  uv_system_init_hub+0x12d9/0x1496
>  uv_system_init+0x27/0x29
>  native_smp_prepare_cpus+0x28d/0x2d8
>  kernel_init_freeable+0xdd/0x253
>  ? rest_init+0x80/0x80
>  kernel_init+0xe/0x110
>  ret_from_fork+0x2c/0x40
> 
> The root cause is SGI UV system needs map its MMIOH region to direct
> mapping section.
> 
> When kaslr disabled, there are 64TB space for system RAM to do direct
> mapping. Both system RAM and SGI UV MMIOH region share this 64TB space.
> However with kaslr enabled, mm KASLR only reserves the actual size of
> system RAM plus 10TB for direct mapping usage. Then MMIOH mapping of
> SGI UV could go beyond the upper bound of direct mapping section to step
> into VMALLOC or VMEMMAP area. Then the BUG_ON() in __init_extra_mapping()
> will be triggered.
> 
> E.g on the SGI UV3 machine where this bug is reported , there are two MMIOH
> regions:
> 
> [    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
> [    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000
> 
> They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G ram are
> spread out to 1TB regions. Then above two SGI MMIOH regions also will be
> mapped into the direct mapping section. The point is that SGI UV maps its
> MMIOH regions to the direct mapping section in uv_system_init() which is
> called during kernel_init_freeable(). It's much later than
> kernel_randomize_memory(). Mm KASLR has no chance to take it into
> consideration.
> 
> To fix it, we need detect if it's SGI UV system early, then do not adapt
> the size of the direct mapping section in kernel_randomize_memory() if yes.
> According to discussion with SGI developer, "The SGI bios adds UVsystab.
> Only systems running SGI bios (and now HPE Hawks2) will have UVsystab."
> 
> Hence check if UVsystab is present, if yes, do not adapt the size of the
> direct mapping section in kernel_randomize_memory()
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Matt Fleming <matt@codeblueprint.co.uk>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: "travis@sgi.com" <travis@sgi.com>
> Cc: Dimitri Sivanich <sivanich@hpe.com>
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Russ Anderson <rja@sgi.com>
> Cc: Frank Ramsay <frank.ramsay@hpe.com>
> linux-efi@vger.kernel.org
> ---
>  arch/x86/include/asm/uv/uv.h | 2 ++
>  arch/x86/mm/kaslr.c          | 4 +++-
>  arch/x86/platform/efi/efi.c  | 7 +++++++
>  3 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/uv/uv.h b/arch/x86/include/asm/uv/uv.h
> index 6686820..0275776 100644
> --- a/arch/x86/include/asm/uv/uv.h
> +++ b/arch/x86/include/asm/uv/uv.h
> @@ -11,6 +11,7 @@ struct mm_struct;
>  extern enum uv_system_type get_uv_system_type(void);
>  extern int is_uv_system(void);
>  extern int is_uv_hubless(void);
> +extern int is_early_uv_system(void);
>  extern void uv_cpu_init(void);
>  extern void uv_nmi_init(void);
>  extern void uv_system_init(void);
> @@ -25,6 +26,7 @@ extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
>  static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
>  static inline int is_uv_system(void)	{ return 0; }
>  static inline int is_uv_hubless(void)	{ return 0; }
> +static inline int is_early_uv_system(void)	{ return 0; }
>  static inline void uv_cpu_init(void)	{ }
>  static inline void uv_system_init(void)	{ }
>  static inline const struct cpumask *
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index aed2064..b75e1f5 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -22,11 +22,13 @@
>  #include <linux/kernel.h>
>  #include <linux/init.h>
>  #include <linux/random.h>
> +#include <linux/efi.h>
>  
>  #include <asm/pgalloc.h>
>  #include <asm/pgtable.h>
>  #include <asm/setup.h>
>  #include <asm/kaslr.h>
> +#include <asm/uv/uv.h>
>  
>  #include "mm_internal.h"
>  
> @@ -123,7 +125,7 @@ void __init kernel_randomize_memory(void)
>  		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>  
>  	/* Adapt phyiscal memory region size based on available memory */
> -	if (memory_tb < kaslr_regions[0].size_tb)
> +	if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
>  		kaslr_regions[0].size_tb = memory_tb;
>  
>  	/* Calculate entropy available between regions */
> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> index 7e76a4d..b71e01d 100644
> --- a/arch/x86/platform/efi/efi.c
> +++ b/arch/x86/platform/efi/efi.c
> @@ -74,6 +74,13 @@ static int __init setup_add_efi_memmap(char *arg)
>  }
>  early_param("add_efi_memmap", setup_add_efi_memmap);
>  
> +#ifdef CONFIG_X86_UV
> +int is_early_uv_system(void)
> +{
> +	return !((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab);
> +}
> +#endif
> +
>  static efi_status_t __init phys_efi_set_virtual_address_map(
>  	unsigned long memory_map_size,
>  	unsigned long descriptor_size,
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86/mm/KASLR: Do not adapt the size of the direct mapping section for SGI UV system
  2017-05-18 20:26 ` Mike Travis
@ 2017-05-18 20:28   ` Mike Travis
  2017-05-19 11:33   ` Baoquan He
  1 sibling, 0 replies; 4+ messages in thread
From: Mike Travis @ 2017-05-18 20:28 UTC (permalink / raw)
  To: Baoquan He, linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Matt Fleming,
	Ard Biesheuvel, travis, Dimitri Sivanich, Thomas Garnier,
	Kees Cook, Andrew Morton, Masahiro Yamada, Russ Anderson,
	Frank Ramsay


Sorry, I forgot to add that perhaps the adding of the
is_early_uv_system() function be put into a separate
patch as well.

Thanks.

On 5/18/2017 1:26 PM, Mike Travis wrote:
> Hi Baoquan He,
> 
> The concept of the patch is correct (since it is I as sent :),
> but I would prefer that the entire is_early_uv_system() be put
> into the uv.h file.  Primarily because we may need to change
> that in the future so having UV specific code in other places
> than under platform/uv is not that desirable.  Plus I think it's
> better to consolidate the logic in one place.
> 
> Making it an inline function in uv.h is fine for now.  If we
> need to change it, I'll move it to the UV specific init
> function.
> 
> And if it's alright with others, you should add the #include
> of the efi.h header file under the part of uv.h that has
> #ifdef CONFIG_X86_UV defined.
> 
> That would mean the KASLR function would only add an
> #include <asm/uv/uv.h> to access that UV specific funtion.
> 
> Thanks,
> Mike
> 
> On 5/17/2017 11:52 PM, Baoquan He wrote:
>> On SGI UV system, kernel casually hang with kaslr enabled.
>>
>> The back trace is:
>>
>> kernel BUG at arch/x86/mm/init_64.c:311!
>> invalid opcode: 0000 [#1] SMP
>> [...]
>> RIP: 0010:__init_extra_mapping+0x188/0x196
>> [...]
>> Call Trace:
>>  init_extra_mapping_uc+0x13/0x15
>>  map_high+0x67/0x75
>>  map_mmioh_high_uv3+0x20a/0x219
>>  uv_system_init_hub+0x12d9/0x1496
>>  uv_system_init+0x27/0x29
>>  native_smp_prepare_cpus+0x28d/0x2d8
>>  kernel_init_freeable+0xdd/0x253
>>  ? rest_init+0x80/0x80
>>  kernel_init+0xe/0x110
>>  ret_from_fork+0x2c/0x40
>>
>> The root cause is SGI UV system needs map its MMIOH region to direct
>> mapping section.
>>
>> When kaslr disabled, there are 64TB space for system RAM to do direct
>> mapping. Both system RAM and SGI UV MMIOH region share this 64TB space.
>> However with kaslr enabled, mm KASLR only reserves the actual size of
>> system RAM plus 10TB for direct mapping usage. Then MMIOH mapping of
>> SGI UV could go beyond the upper bound of direct mapping section to step
>> into VMALLOC or VMEMMAP area. Then the BUG_ON() in __init_extra_mapping()
>> will be triggered.
>>
>> E.g on the SGI UV3 machine where this bug is reported , there are two MMIOH
>> regions:
>>
>> [    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
>> [    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000
>>
>> They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G ram are
>> spread out to 1TB regions. Then above two SGI MMIOH regions also will be
>> mapped into the direct mapping section. The point is that SGI UV maps its
>> MMIOH regions to the direct mapping section in uv_system_init() which is
>> called during kernel_init_freeable(). It's much later than
>> kernel_randomize_memory(). Mm KASLR has no chance to take it into
>> consideration.
>>
>> To fix it, we need detect if it's SGI UV system early, then do not adapt
>> the size of the direct mapping section in kernel_randomize_memory() if yes.
>> According to discussion with SGI developer, "The SGI bios adds UVsystab.
>> Only systems running SGI bios (and now HPE Hawks2) will have UVsystab."
>>
>> Hence check if UVsystab is present, if yes, do not adapt the size of the
>> direct mapping section in kernel_randomize_memory()
>>
>> Signed-off-by: Baoquan He <bhe@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: x86@kernel.org
>> Cc: Matt Fleming <matt@codeblueprint.co.uk>
>> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> Cc: "travis@sgi.com" <travis@sgi.com>
>> Cc: Dimitri Sivanich <sivanich@hpe.com>
>> Cc: Thomas Garnier <thgarnie@google.com>
>> Cc: Kees Cook <keescook@chromium.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
>> Cc: Russ Anderson <rja@sgi.com>
>> Cc: Frank Ramsay <frank.ramsay@hpe.com>
>> linux-efi@vger.kernel.org
>> ---
>>  arch/x86/include/asm/uv/uv.h | 2 ++
>>  arch/x86/mm/kaslr.c          | 4 +++-
>>  arch/x86/platform/efi/efi.c  | 7 +++++++
>>  3 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/include/asm/uv/uv.h b/arch/x86/include/asm/uv/uv.h
>> index 6686820..0275776 100644
>> --- a/arch/x86/include/asm/uv/uv.h
>> +++ b/arch/x86/include/asm/uv/uv.h
>> @@ -11,6 +11,7 @@ struct mm_struct;
>>  extern enum uv_system_type get_uv_system_type(void);
>>  extern int is_uv_system(void);
>>  extern int is_uv_hubless(void);
>> +extern int is_early_uv_system(void);
>>  extern void uv_cpu_init(void);
>>  extern void uv_nmi_init(void);
>>  extern void uv_system_init(void);
>> @@ -25,6 +26,7 @@ extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
>>  static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
>>  static inline int is_uv_system(void)	{ return 0; }
>>  static inline int is_uv_hubless(void)	{ return 0; }
>> +static inline int is_early_uv_system(void)	{ return 0; }
>>  static inline void uv_cpu_init(void)	{ }
>>  static inline void uv_system_init(void)	{ }
>>  static inline const struct cpumask *
>> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
>> index aed2064..b75e1f5 100644
>> --- a/arch/x86/mm/kaslr.c
>> +++ b/arch/x86/mm/kaslr.c
>> @@ -22,11 +22,13 @@
>>  #include <linux/kernel.h>
>>  #include <linux/init.h>
>>  #include <linux/random.h>
>> +#include <linux/efi.h>
>>  
>>  #include <asm/pgalloc.h>
>>  #include <asm/pgtable.h>
>>  #include <asm/setup.h>
>>  #include <asm/kaslr.h>
>> +#include <asm/uv/uv.h>
>>  
>>  #include "mm_internal.h"
>>  
>> @@ -123,7 +125,7 @@ void __init kernel_randomize_memory(void)
>>  		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>>  
>>  	/* Adapt phyiscal memory region size based on available memory */
>> -	if (memory_tb < kaslr_regions[0].size_tb)
>> +	if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
>>  		kaslr_regions[0].size_tb = memory_tb;
>>  
>>  	/* Calculate entropy available between regions */
>> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
>> index 7e76a4d..b71e01d 100644
>> --- a/arch/x86/platform/efi/efi.c
>> +++ b/arch/x86/platform/efi/efi.c
>> @@ -74,6 +74,13 @@ static int __init setup_add_efi_memmap(char *arg)
>>  }
>>  early_param("add_efi_memmap", setup_add_efi_memmap);
>>  
>> +#ifdef CONFIG_X86_UV
>> +int is_early_uv_system(void)
>> +{
>> +	return !((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab);
>> +}
>> +#endif
>> +
>>  static efi_status_t __init phys_efi_set_virtual_address_map(
>>  	unsigned long memory_map_size,
>>  	unsigned long descriptor_size,
>>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86/mm/KASLR: Do not adapt the size of the direct mapping section for SGI UV system
  2017-05-18 20:26 ` Mike Travis
  2017-05-18 20:28   ` Mike Travis
@ 2017-05-19 11:33   ` Baoquan He
  1 sibling, 0 replies; 4+ messages in thread
From: Baoquan He @ 2017-05-19 11:33 UTC (permalink / raw)
  To: Mike Travis
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Matt Fleming, Ard Biesheuvel, travis, Dimitri Sivanich,
	Thomas Garnier, Kees Cook, Andrew Morton, Masahiro Yamada,
	Russ Anderson, Frank Ramsay

On 05/18/17 at 01:26pm, Mike Travis wrote:
> Hi Baoquan He,
> 
> The concept of the patch is correct (since it is I as sent :),
> but I would prefer that the entire is_early_uv_system() be put
> into the uv.h file.  Primarily because we may need to change
> that in the future so having UV specific code in other places
> than under platform/uv is not that desirable.  Plus I think it's
> better to consolidate the logic in one place.
> 
> Making it an inline function in uv.h is fine for now.  If we
> need to change it, I'll move it to the UV specific init
> function.
> 
> And if it's alright with others, you should add the #include
> of the efi.h header file under the part of uv.h that has
> #ifdef CONFIG_X86_UV defined.
> 
> That would mean the KASLR function would only add an
> #include <asm/uv/uv.h> to access that UV specific funtion.

Yes, sounds better. Will change according to your suggestion, and make a
patchset to contain uv part code and kaslr part separately.

Thanks for your reviewing and suggestion, Mike! This is a regression on
rhel7 with kaslr enabled now. I used dmi_match to do the early check of
UV system, your SGI colleagues suggested the UVsystab way.

Thanks
Baoquan
> 
> On 5/17/2017 11:52 PM, Baoquan He wrote:
> > On SGI UV system, kernel casually hang with kaslr enabled.
> > 
> > The back trace is:
> > 
> > kernel BUG at arch/x86/mm/init_64.c:311!
> > invalid opcode: 0000 [#1] SMP
> > [...]
> > RIP: 0010:__init_extra_mapping+0x188/0x196
> > [...]
> > Call Trace:
> >  init_extra_mapping_uc+0x13/0x15
> >  map_high+0x67/0x75
> >  map_mmioh_high_uv3+0x20a/0x219
> >  uv_system_init_hub+0x12d9/0x1496
> >  uv_system_init+0x27/0x29
> >  native_smp_prepare_cpus+0x28d/0x2d8
> >  kernel_init_freeable+0xdd/0x253
> >  ? rest_init+0x80/0x80
> >  kernel_init+0xe/0x110
> >  ret_from_fork+0x2c/0x40
> > 
> > The root cause is SGI UV system needs map its MMIOH region to direct
> > mapping section.
> > 
> > When kaslr disabled, there are 64TB space for system RAM to do direct
> > mapping. Both system RAM and SGI UV MMIOH region share this 64TB space.
> > However with kaslr enabled, mm KASLR only reserves the actual size of
> > system RAM plus 10TB for direct mapping usage. Then MMIOH mapping of
> > SGI UV could go beyond the upper bound of direct mapping section to step
> > into VMALLOC or VMEMMAP area. Then the BUG_ON() in __init_extra_mapping()
> > will be triggered.
> > 
> > E.g on the SGI UV3 machine where this bug is reported , there are two MMIOH
> > regions:
> > 
> > [    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
> > [    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000
> > 
> > They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G ram are
> > spread out to 1TB regions. Then above two SGI MMIOH regions also will be
> > mapped into the direct mapping section. The point is that SGI UV maps its
> > MMIOH regions to the direct mapping section in uv_system_init() which is
> > called during kernel_init_freeable(). It's much later than
> > kernel_randomize_memory(). Mm KASLR has no chance to take it into
> > consideration.
> > 
> > To fix it, we need detect if it's SGI UV system early, then do not adapt
> > the size of the direct mapping section in kernel_randomize_memory() if yes.
> > According to discussion with SGI developer, "The SGI bios adds UVsystab.
> > Only systems running SGI bios (and now HPE Hawks2) will have UVsystab."
> > 
> > Hence check if UVsystab is present, if yes, do not adapt the size of the
> > direct mapping section in kernel_randomize_memory()
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Matt Fleming <matt@codeblueprint.co.uk>
> > Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > Cc: "travis@sgi.com" <travis@sgi.com>
> > Cc: Dimitri Sivanich <sivanich@hpe.com>
> > Cc: Thomas Garnier <thgarnie@google.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> > Cc: Russ Anderson <rja@sgi.com>
> > Cc: Frank Ramsay <frank.ramsay@hpe.com>
> > linux-efi@vger.kernel.org
> > ---
> >  arch/x86/include/asm/uv/uv.h | 2 ++
> >  arch/x86/mm/kaslr.c          | 4 +++-
> >  arch/x86/platform/efi/efi.c  | 7 +++++++
> >  3 files changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/include/asm/uv/uv.h b/arch/x86/include/asm/uv/uv.h
> > index 6686820..0275776 100644
> > --- a/arch/x86/include/asm/uv/uv.h
> > +++ b/arch/x86/include/asm/uv/uv.h
> > @@ -11,6 +11,7 @@ struct mm_struct;
> >  extern enum uv_system_type get_uv_system_type(void);
> >  extern int is_uv_system(void);
> >  extern int is_uv_hubless(void);
> > +extern int is_early_uv_system(void);
> >  extern void uv_cpu_init(void);
> >  extern void uv_nmi_init(void);
> >  extern void uv_system_init(void);
> > @@ -25,6 +26,7 @@ extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
> >  static inline enum uv_system_type get_uv_system_type(void) { return UV_NONE; }
> >  static inline int is_uv_system(void)	{ return 0; }
> >  static inline int is_uv_hubless(void)	{ return 0; }
> > +static inline int is_early_uv_system(void)	{ return 0; }
> >  static inline void uv_cpu_init(void)	{ }
> >  static inline void uv_system_init(void)	{ }
> >  static inline const struct cpumask *
> > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > index aed2064..b75e1f5 100644
> > --- a/arch/x86/mm/kaslr.c
> > +++ b/arch/x86/mm/kaslr.c
> > @@ -22,11 +22,13 @@
> >  #include <linux/kernel.h>
> >  #include <linux/init.h>
> >  #include <linux/random.h>
> > +#include <linux/efi.h>
> >  
> >  #include <asm/pgalloc.h>
> >  #include <asm/pgtable.h>
> >  #include <asm/setup.h>
> >  #include <asm/kaslr.h>
> > +#include <asm/uv/uv.h>
> >  
> >  #include "mm_internal.h"
> >  
> > @@ -123,7 +125,7 @@ void __init kernel_randomize_memory(void)
> >  		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> >  
> >  	/* Adapt phyiscal memory region size based on available memory */
> > -	if (memory_tb < kaslr_regions[0].size_tb)
> > +	if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
> >  		kaslr_regions[0].size_tb = memory_tb;
> >  
> >  	/* Calculate entropy available between regions */
> > diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
> > index 7e76a4d..b71e01d 100644
> > --- a/arch/x86/platform/efi/efi.c
> > +++ b/arch/x86/platform/efi/efi.c
> > @@ -74,6 +74,13 @@ static int __init setup_add_efi_memmap(char *arg)
> >  }
> >  early_param("add_efi_memmap", setup_add_efi_memmap);
> >  
> > +#ifdef CONFIG_X86_UV
> > +int is_early_uv_system(void)
> > +{
> > +	return !((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab);
> > +}
> > +#endif
> > +
> >  static efi_status_t __init phys_efi_set_virtual_address_map(
> >  	unsigned long memory_map_size,
> >  	unsigned long descriptor_size,
> > 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-05-19 11:33 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-18  6:52 [PATCH] x86/mm/KASLR: Do not adapt the size of the direct mapping section for SGI UV system Baoquan He
2017-05-18 20:26 ` Mike Travis
2017-05-18 20:28   ` Mike Travis
2017-05-19 11:33   ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).