* [PATCH v2 0/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
@ 2017-05-20 12:02 Baoquan He
  2017-05-20 12:02 ` [PATCH v2 1/2] x86/UV: Introduce a helper function to check UV system at earlier stage Baoquan He
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Baoquan He @ 2017-05-20 12:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Baoquan He

This is the v2 post.

This patchset fixes a bug where an SGI UV system occasionally hangs
during boot with KASLR enabled. The root cause is that mm KASLR adapts
the size of the direct mapping section based only on the system RAM size.
Later, when the SGI UV MMIOH region is mapped into the direct mapping
during the rest_init() invocation, the mapping may go beyond the direct
mapping section and step into the VMALLOC or VMEMMAP area, which triggers
a BUG_ON().

The fix adds a helper function, is_early_uv_system(), to detect a UV
system earlier, and calls it in kernel_randomize_memory(). If it is an
SGI UV system, we keep the size of the direct mapping section at 64TB,
just as with nokaslr.

With this fix, an SGI UV system always has a 64TB direct mapping, while
the starting addresses of the direct mapping, vmalloc and vmemmap regions
and the padding between them can still be randomized to enhance system
security.

v1->v2:
    1. Mike suggested making is_early_uv_system() an inline function and
    putting it in include/asm/uv/uv.h so that they can adjust it more
    easily in the future.

    2. Split the v1 code into a UV part and an mm KASLR part, as Mike
    suggested.

Baoquan He (2):
  x86/UV: Introduce a helper function to check UV system at earlier
    stage
  x86/mm/KASLR: Do not adapt the size of the direct mapping section for
    SGI UV system

 arch/x86/include/asm/uv/uv.h | 6 ++++++
 arch/x86/mm/kaslr.c          | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

-- 
2.5.5

* [PATCH v2 1/2] x86/UV: Introduce a helper function to check UV system at earlier stage
  2017-05-20 12:02 [PATCH v2 0/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system Baoquan He
@ 2017-05-20 12:02 ` Baoquan He
  2017-05-23 15:17   ` Mike Travis
  2017-05-20 12:02 ` [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system Baoquan He
  2017-06-07  4:35 ` [PATCH v2 0/2] " Baoquan He
  2 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2017-05-20 12:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Russ Anderson, Dimitri Sivanich, travis, Mike Travis,
	Frank Ramsay

The SGI BIOS adds UVsystab, and only systems running the SGI BIOS
(and now HPE Hawks2) have UVsystab. UVsystab is detected in efi_init(),
which runs at a very early stage. So introduce a new helper function,
is_early_uv_system(), for later use.

Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Russ Anderson <rja@hpe.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: "travis@sgi.com" <travis@sgi.com>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Frank Ramsay <frank.ramsay@hpe.com>
---
 arch/x86/include/asm/uv/uv.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/uv/uv.h b/arch/x86/include/asm/uv/uv.h
index 6686820..159f698 100644
--- a/arch/x86/include/asm/uv/uv.h
+++ b/arch/x86/include/asm/uv/uv.h
@@ -19,6 +19,11 @@ extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 						 unsigned long start,
 						 unsigned long end,
 						 unsigned int cpu);
+#include <linux/efi.h>
+static inline int is_early_uv_system(void)
+{
+	return !((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab);
+}
 
 #else	/* X86_UV */
 
@@ -31,6 +36,7 @@ static inline const struct cpumask *
 uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
 		    unsigned long start, unsigned long end, unsigned int cpu)
 { return cpumask; }
+static inline int is_early_uv_system(void)	{ return 0; }
 
 #endif	/* X86_UV */
 
-- 
2.5.5
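
As an aside for readers, the check in is_early_uv_system() can be
exercised outside the kernel. The sketch below is a minimal userspace
model, not kernel code: MODEL_INVALID_TABLE_ADDR stands in for
EFI_INVALID_TABLE_ADDR (assumed here to be ~0UL, as in
include/linux/efi.h), and the uv_systab value is passed as a parameter
instead of being read from the global efi struct. Both function names
are hypothetical. The second function is the De Morgan form of the
expression in the patch and should accept exactly the same inputs.

```c
#include <stdbool.h>

/* Assumption: stand-in for the kernel's EFI_INVALID_TABLE_ADDR. */
#define MODEL_INVALID_TABLE_ADDR (~0UL)

/* Same expression shape as the helper added by the patch. */
static inline bool uv_systab_present_patch_form(unsigned long uv_systab)
{
	return !((uv_systab == MODEL_INVALID_TABLE_ADDR) || !uv_systab);
}

/* Equivalent form: the table address is neither invalid nor zero. */
static inline bool uv_systab_present(unsigned long uv_systab)
{
	return uv_systab != MODEL_INVALID_TABLE_ADDR && uv_systab != 0;
}
```

With either form, a zero address and the invalid marker both report "not
a UV system", and any other table address reports a UV system.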

* [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
  2017-05-20 12:02 [PATCH v2 0/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system Baoquan He
  2017-05-20 12:02 ` [PATCH v2 1/2] x86/UV: Introduce a helper function to check UV system at earlier stage Baoquan He
@ 2017-05-20 12:02 ` Baoquan He
  2017-05-21 20:38   ` Thomas Garnier
  2017-08-31  6:21   ` Baoquan He
  2017-06-07  4:35 ` [PATCH v2 0/2] " Baoquan He
  2 siblings, 2 replies; 10+ messages in thread
From: Baoquan He @ 2017-05-20 12:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Thomas Garnier, Kees Cook, Andrew Morton, Masahiro Yamada

On an SGI UV system, the kernel occasionally hangs with KASLR enabled.

The back trace is:

kernel BUG at arch/x86/mm/init_64.c:311!
invalid opcode: 0000 [#1] SMP
[...]
RIP: 0010:__init_extra_mapping+0x188/0x196
[...]
Call Trace:
 init_extra_mapping_uc+0x13/0x15
 map_high+0x67/0x75
 map_mmioh_high_uv3+0x20a/0x219
 uv_system_init_hub+0x12d9/0x1496
 uv_system_init+0x27/0x29
 native_smp_prepare_cpus+0x28d/0x2d8
 kernel_init_freeable+0xdd/0x253
 ? rest_init+0x80/0x80
 kernel_init+0xe/0x110
 ret_from_fork+0x2c/0x40

The root cause is that an SGI UV system needs to map its MMIOH region
into the direct mapping section, and that mapping happens in rest_init().
However, mm KASLR is done in kernel_randomize_memory(), which runs much
earlier than the SGI UV MMIOH mapping and does not take the MMIOH regions
into account. With KASLR disabled, there is 64TB of space for the direct
mapping, shared by system RAM and the SGI UV MMIOH region. With KASLR
enabled, mm KASLR reserves only the actual size of system RAM plus 10TB
for the direct mapping. The later SGI UV MMIOH mapping can then go beyond
the upper bound of the direct mapping section and step into the VMALLOC
or VMEMMAP area, which triggers the BUG_ON() in __init_extra_mapping().

E.g. on the SGI UV3 machine where this bug was reported, there are two
MMIOH regions:

[    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
[    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000

They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G of RAM
is spread out across 1TB regions. The two SGI MMIOH regions above will
also be mapped into the direct mapping section.
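
The two ranges can be checked with a little arithmetic. The sketch below
is a standalone userspace illustration, not part of the patch: the
constants are copied from the two log lines, fits_in_direct_map() is a
hypothetical helper, and the 11TB figure in the note after the code is
an assumption (1TB of RAM address span plus the 10TB padding).

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/* Physical ranges copied from the two "UV: Map MMIOH" log lines. */
static const uint64_t mmioh0_start = 0xffc00000000ULL;
static const uint64_t mmioh0_end   = 0x100000000000ULL;
static const uint64_t mmioh1_start = 0x100000000000ULL;
static const uint64_t mmioh1_end   = 0x200000000000ULL;

/* True when [0, end) lies inside a direct mapping of size_tb terabytes. */
static bool fits_in_direct_map(uint64_t end, uint64_t size_tb)
{
	return end <= size_tb * TiB;
}
```

Under the assumed 11TB direct mapping, fits_in_direct_map(mmioh1_end, 11)
is false, while with the full 64TB it is true.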

To fix it, check whether this is an SGI UV system by calling
is_early_uv_system() in kernel_randomize_memory(). If it is, do not adapt
the size of the direct mapping section. Do it now.

Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org 
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
---
 arch/x86/mm/kaslr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index aed2064..20b0456 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -27,6 +27,7 @@
 #include <asm/pgtable.h>
 #include <asm/setup.h>
 #include <asm/kaslr.h>
+#include <asm/uv/uv.h>
 
 #include "mm_internal.h"
 
@@ -123,7 +124,7 @@ void __init kernel_randomize_memory(void)
 		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
 	/* Adapt phyiscal memory region size based on available memory */
-	if (memory_tb < kaslr_regions[0].size_tb)
+	if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
 		kaslr_regions[0].size_tb = memory_tb;
 
 	/* Calculate entropy available between regions */
-- 
2.5.5
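
For readers who want to experiment with the condition changed above, the
following is a minimal userspace sketch of the size selection, not the
kernel implementation. The function name, its parameters and the two
constants are hypothetical stand-ins: 64 for the initial
kaslr_regions[0].size_tb and 10 for
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING, both taken from the figures in
the commit message.

```c
#include <stdbool.h>

#define MODEL_DIRECT_MAP_MAX_TB 64UL	/* assumed initial size_tb */
#define MODEL_PHYSICAL_PADDING_TB 10UL	/* assumed padding config value */

/*
 * Return the direct mapping size in TB that the patched condition would
 * choose for a machine with ram_tb of RAM address span.
 */
static unsigned long model_direct_map_size_tb(unsigned long ram_tb,
					      bool is_uv_system)
{
	unsigned long size_tb = MODEL_DIRECT_MAP_MAX_TB;
	unsigned long memory_tb = ram_tb + MODEL_PHYSICAL_PADDING_TB;

	/* Shrink only on non-UV systems, mirroring the patched check. */
	if (memory_tb < size_tb && !is_uv_system)
		size_tb = memory_tb;

	return size_tb;
}
```

In this model a non-UV machine with 1TB of RAM gets an 11TB direct
mapping, while a UV machine with the same RAM keeps the full 64TB.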

* Re: [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
  2017-05-20 12:02 ` [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system Baoquan He
@ 2017-05-21 20:38   ` Thomas Garnier
  2017-05-21 23:14     ` Baoquan He
  2017-08-31  6:21   ` Baoquan He
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Garnier @ 2017-05-21 20:38 UTC (permalink / raw)
  To: Baoquan He
  Cc: LKML, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Kees Cook, Andrew Morton,
	Masahiro Yamada

On Sat, May 20, 2017 at 5:02 AM, Baoquan He <bhe@redhat.com> wrote:
> On SGI UV system, kernel casually hang with kaslr enabled.
>
> The back trace is:
>
> kernel BUG at arch/x86/mm/init_64.c:311!
> invalid opcode: 0000 [#1] SMP
> [...]
> RIP: 0010:__init_extra_mapping+0x188/0x196
> [...]
> Call Trace:
>  init_extra_mapping_uc+0x13/0x15
>  map_high+0x67/0x75
>  map_mmioh_high_uv3+0x20a/0x219
>  uv_system_init_hub+0x12d9/0x1496
>  uv_system_init+0x27/0x29
>  native_smp_prepare_cpus+0x28d/0x2d8
>  kernel_init_freeable+0xdd/0x253
>  ? rest_init+0x80/0x80
>  kernel_init+0xe/0x110
>  ret_from_fork+0x2c/0x40
>
> The root cause is that SGI UV system needs map its MMIOH region to direct
> mapping section and the mapping happens in rest_init(). However mm KASLR
> is done in kernel_randomize_memory() which is much earlier than MMIOH
> mapping of SGI UV and doesn't count in the MMIOH regions. When kaslr
> disabled, there are 64TB space for system RAM to do direct mapping. Both
> system RAM and SGI UV MMIOH region share this 64TB space. With kaslr
> enabled, mm KASLR only reserves the actual size of system RAM plus 10TB
> for direct mapping usage. Then later MMIOH mapping of SGI UV could go
> beyond the upper bound of direct mapping section to step into VMALLOC or
> VMEMMAP area. Then the BUG_ON() in __init_extra_mapping() will be
> triggered.
>
> E.g on the SGI UV3 machine where this bug is reported , there are two MMIOH
> regions:
>
> [    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
> [    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000
>
> They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G ram are
> spread out to 1TB regions. Then above two SGI MMIOH regions also will be
> mapped into the direct mapping section.
>
> To fix it, we need check if it's SGI UV system by calling
> is_early_uv_system() in kernel_randomize_memory(). If yes, do not adapt the
> size of the direct mapping section. Do it now.
>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
>  arch/x86/mm/kaslr.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index aed2064..20b0456 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -27,6 +27,7 @@
>  #include <asm/pgtable.h>
>  #include <asm/setup.h>
>  #include <asm/kaslr.h>
> +#include <asm/uv/uv.h>
>
>  #include "mm_internal.h"
>
> @@ -123,7 +124,7 @@ void __init kernel_randomize_memory(void)
>                 CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>
>         /* Adapt phyiscal memory region size based on available memory */
> -       if (memory_tb < kaslr_regions[0].size_tb)
> +       if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())

Given your example, is there any way we could just restrict memory_tb to
32TB? Or will different configurations result in different mappings?

>                 kaslr_regions[0].size_tb = memory_tb;
>
>         /* Calculate entropy available between regions */
> --
> 2.5.5
>



-- 
Thomas

* Re: [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
  2017-05-21 20:38   ` Thomas Garnier
@ 2017-05-21 23:14     ` Baoquan He
  2017-05-21 23:17       ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2017-05-21 23:14 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: LKML, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Kees Cook, Andrew Morton,
	Masahiro Yamada

On 05/21/17 at 01:38pm, Thomas Garnier wrote:
> On Sat, May 20, 2017 at 5:02 AM, Baoquan He <bhe@redhat.com> wrote:
> >  arch/x86/mm/kaslr.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > index aed2064..20b0456 100644
> > --- a/arch/x86/mm/kaslr.c
> > +++ b/arch/x86/mm/kaslr.c
> > @@ -27,6 +27,7 @@
> >  #include <asm/pgtable.h>
> >  #include <asm/setup.h>
> >  #include <asm/kaslr.h>
> > +#include <asm/uv/uv.h>
> >
> >  #include "mm_internal.h"
> >
> > @@ -123,7 +124,7 @@ void __init kernel_randomize_memory(void)
> >                 CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> >
> >         /* Adapt phyiscal memory region size based on available memory */
> > -       if (memory_tb < kaslr_regions[0].size_tb)
> > +       if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
> 
> Given your example, any way we could just restrict memory_tb to be
> 32TB? Or different configurations will result in different mappings?

Thanks for looking into this, Thomas!

On the machine I used to reproduce the bug and test the fix, 32TB of
memory needs to be mapped into the direct mapping region. I am not sure
whether SGI UV systems have a larger MMIOH region on other machines, now
or in the future. If a machine has an MMIOH region bigger than 64TB, that
is a problem SGI UV needs to fix, because it will break the system
whether KASLR is enabled or not.

Hi Mike, Russ and Frank,

Could you help answer Thomas's question? Could other SGI UV systems have
an MMIOH region bigger than 32TB?

Thanks
Baoquan

> 
> >                 kaslr_regions[0].size_tb = memory_tb;
> >
> >         /* Calculate entropy available between regions */

* Re: [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
  2017-05-21 23:14     ` Baoquan He
@ 2017-05-21 23:17       ` Baoquan He
       [not found]         ` <19a8e832-49db-cb68-bbfd-a5ba1cb8be1e@hpe.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2017-05-21 23:17 UTC (permalink / raw)
  To: Thomas Garnier, mike.travis, rja, frank.ramsay
  Cc: LKML, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Kees Cook, Andrew Morton,
	Masahiro Yamada

Sorry, I forgot to put Mike, Russ and Frank in 'To'.

On 05/22/17 at 07:14am, Baoquan He wrote:
> On 05/21/17 at 01:38pm, Thomas Garnier wrote:
> > On Sat, May 20, 2017 at 5:02 AM, Baoquan He <bhe@redhat.com> wrote:
> > >  arch/x86/mm/kaslr.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > > index aed2064..20b0456 100644
> > > --- a/arch/x86/mm/kaslr.c
> > > +++ b/arch/x86/mm/kaslr.c
> > > @@ -27,6 +27,7 @@
> > >  #include <asm/pgtable.h>
> > >  #include <asm/setup.h>
> > >  #include <asm/kaslr.h>
> > > +#include <asm/uv/uv.h>
> > >
> > >  #include "mm_internal.h"
> > >
> > > @@ -123,7 +124,7 @@ void __init kernel_randomize_memory(void)
> > >                 CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
> > >
> > >         /* Adapt phyiscal memory region size based on available memory */
> > > -       if (memory_tb < kaslr_regions[0].size_tb)
> > > +       if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
> > 
> > Given your example, any way we could just restrict memory_tb to be
> > 32TB? Or different configurations will result in different mappings?
> 
> Thanks for looking into this, Thomas!
> 
> For that machine where I used to reproduce the bug and test, 32TB memory
> need be mapped to the direct mapping region. I am not sure if SGI UV
> system has larger MMIOH region now or in the future in different machine.
> If they have machine owning MMIOH region bigger than 64TB, then it's a
> problem SGI UV need fix because that will break system whether kaslr
> enabled or not.
> 
> Hi Mike, Russ and Frank,
> 
> About Thomas's question, could you help answer it? Could other SGI UV
> system has MMIOH region bigger than 32TB?
> 
> Thanks
> Baoquan
> 
> > 
> > >                 kaslr_regions[0].size_tb = memory_tb;
> > >
> > >         /* Calculate entropy available between regions */

* Re: [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
       [not found]         ` <19a8e832-49db-cb68-bbfd-a5ba1cb8be1e@hpe.com>
@ 2017-05-22 17:00           ` Thomas Garnier
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Garnier @ 2017-05-22 17:00 UTC (permalink / raw)
  To: Mike Travis
  Cc: Baoquan He, rja, frank.ramsay, LKML, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers, Kees Cook,
	Andrew Morton, Masahiro Yamada

On Mon, May 22, 2017 at 9:30 AM, Mike Travis <mike.travis@hpe.com> wrote:
>
>
> On 5/21/2017 4:17 PM, Baoquan He wrote:
>
> Sorry, forget 'To' Mike, Russ and Frank
>
> On 05/22/17 at 07:14am, Baoquan He wrote:
>
> On 05/21/17 at 01:38pm, Thomas Garnier wrote:
>
> On Sat, May 20, 2017 at 5:02 AM, Baoquan He <bhe@redhat.com> wrote:
>
>  arch/x86/mm/kaslr.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index aed2064..20b0456 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -27,6 +27,7 @@
>  #include <asm/pgtable.h>
>  #include <asm/setup.h>
>  #include <asm/kaslr.h>
> +#include <asm/uv/uv.h>
>
>  #include "mm_internal.h"
>
> @@ -123,7 +124,7 @@ void __init kernel_randomize_memory(void)
>                 CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>
>         /* Adapt phyiscal memory region size based on available memory */
> -       if (memory_tb < kaslr_regions[0].size_tb)
> +       if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
>
> Given your example, any way we could just restrict memory_tb to be
> 32TB? Or different configurations will result in different mappings?
>
> Thanks for looking into this, Thomas!
>
> For that machine where I used to reproduce the bug and test, 32TB memory
> need be mapped to the direct mapping region. I am not sure if SGI UV
> system has larger MMIOH region now or in the future in different machine.
> If they have machine owning MMIOH region bigger than 64TB, then it's a
> problem SGI UV need fix because that will break system whether kaslr
> enabled or not.
>
> Hi Mike, Russ and Frank,
>
> About Thomas's question, could you help answer it? Could other SGI UV
> system has MMIOH region bigger than 32TB?
>
>
> While the region is much smaller it can occupy address space > 32TB, up to
> 64TB - <MMIOH size>.
> On a system with 64TB, part of the address space is taken from RAM to
> accommodate this region.
> This has been true since UV1.

I see. It would be better to know the different places so that memory_tb
could be tailored accordingly. I understand that might be difficult to
do, and I would rather have KASLR memory randomization working for now.

Reviewed-by: thgarnie@google.com
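
To illustrate Mike's point, here is a small userspace sketch (an
illustration only, with hypothetical helper names). It assumes, as Mike
states, that the MMIOH region may start as high as 64TB - <MMIOH size>,
and checks whether a given direct mapping size would still cover that
worst-case placement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)
#define MODEL_ADDR_SPACE_TB 64ULL	/* upper limit quoted in the thread */

/* Highest start address Mike describes: 64TB - <MMIOH size>. */
static uint64_t highest_mmioh_start(uint64_t mmioh_size)
{
	assert(mmioh_size <= MODEL_ADDR_SPACE_TB * TiB); /* no underflow */
	return MODEL_ADDR_SPACE_TB * TiB - mmioh_size;
}

/* True if a direct mapping of size_tb covers the worst-case placement. */
static bool size_covers_worst_case(uint64_t size_tb, uint64_t mmioh_size)
{
	return highest_mmioh_start(mmioh_size) + mmioh_size <= size_tb * TiB;
}
```

With a 16G region, for instance, a 32TB limit does not cover the worst
case in this model, whereas the full 64TB does.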

>
> Thanks
> Baoquan
>
>                 kaslr_regions[0].size_tb = memory_tb;
>
>         /* Calculate entropy available between regions */
>
>



-- 
Thomas

* Re: [PATCH v2 1/2] x86/UV: Introduce a helper function to check UV system at earlier stage
  2017-05-20 12:02 ` [PATCH v2 1/2] x86/UV: Introduce a helper function to check UV system at earlier stage Baoquan He
@ 2017-05-23 15:17   ` Mike Travis
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Travis @ 2017-05-23 15:17 UTC (permalink / raw)
  To: Baoquan He, linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Russ Anderson,
	Dimitri Sivanich, travis, Frank Ramsay

Acked-by: Mike Travis <travis@sgi.com>

On 5/20/2017 5:02 AM, Baoquan He wrote:
> The SGI BIOS adds UVsystab, and only systems running SGI BIOS
> (and now HPE Hawks2) will have UVsystab. And UVsystab is detected in
> efi_init() which is at very early stage. So introduce a new helper
> function is_early_uv_system() for later usage.
>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Russ Anderson <rja@hpe.com>
> Cc: Dimitri Sivanich <sivanich@hpe.com>
> Cc: "travis@sgi.com" <travis@sgi.com>
> Cc: Mike Travis <mike.travis@hpe.com>
> Cc: Frank Ramsay <frank.ramsay@hpe.com>
> ---
>   arch/x86/include/asm/uv/uv.h | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/include/asm/uv/uv.h b/arch/x86/include/asm/uv/uv.h
> index 6686820..159f698 100644
> --- a/arch/x86/include/asm/uv/uv.h
> +++ b/arch/x86/include/asm/uv/uv.h
> @@ -19,6 +19,11 @@ extern const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
>   						 unsigned long start,
>   						 unsigned long end,
>   						 unsigned int cpu);
> +#include <linux/efi.h>
> +static inline int is_early_uv_system(void)
> +{
> +	return !((efi.uv_systab == EFI_INVALID_TABLE_ADDR) || !efi.uv_systab);
> +}
>   
>   #else	/* X86_UV */
>   
> @@ -31,6 +36,7 @@ static inline const struct cpumask *
>   uv_flush_tlb_others(const struct cpumask *cpumask, struct mm_struct *mm,
>   		    unsigned long start, unsigned long end, unsigned int cpu)
>   { return cpumask; }
> +static inline int is_early_uv_system(void)	{ return 0; }
>   
>   #endif	/* X86_UV */
>   

* Re: [PATCH v2 0/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
  2017-05-20 12:02 [PATCH v2 0/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system Baoquan He
  2017-05-20 12:02 ` [PATCH v2 1/2] x86/UV: Introduce a helper function to check UV system at earlier stage Baoquan He
  2017-05-20 12:02 ` [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system Baoquan He
@ 2017-06-07  4:35 ` Baoquan He
  2 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2017-06-07  4:35 UTC (permalink / raw)
  To: linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Russ Anderson,
	Dimitri Sivanich, travis, Mike Travis, Frank Ramsay,
	Thomas Garnier, Kees Cook, Andrew Morton, Masahiro Yamada

Hi all,

PING!

Are there any further comments or suggestions about this patchset?

Thanks
Baoquan

On 05/20/17 at 08:02pm, Baoquan He wrote:
> This is v2 post.
> 
> This patchset is trying to fix a bug that SGI UV system casually hang
> during boot with KASLR enabled. The root cause is that mm KASLR adapts
> size of the direct mapping section only based on the system RAM size.
> Then later when map SGI UV MMIOH region into the direct mapping during
> rest_init() invocation, it might go beyond of the directing mapping
> section and step into VMALLOC or VMEMMAP area, then BUG_ON triggered.
> 
> The fix is adding a helper function is_early_uv_system to check UV system
> earlier, then call the helper function in kernel_randomize_memory() to
> check if it's a SGI UV system, if yes, we keep the size of direct mapping
> section to be 64TB just as nokslr.
> 
> With this fix, SGI UV system can have 64TB direct mapping size always,
> and the starting address of direct mapping/vmalloc/vmemmap and the padding
> between them can still be randomized to enhance the system security.
> 
> v1->v2:
>     1. Mike suggested making is_early_uv_system() an inline function and be
>     put in include/asm/uv/uv.h so that they can adjust them easier in the
>     future.
> 
>     2. Split the v1 code into uv part and mm KASLR part as Mike suggested.
> 
> Baoquan He (2):
>   x86/UV: Introduce a helper function to check UV system at earlier
>     stage
>   x86/mm/KASLR: Do not adapt the size of the direct mapping section for
>     SGI UV system
> 
>  arch/x86/include/asm/uv/uv.h | 6 ++++++
>  arch/x86/mm/kaslr.c          | 3 ++-
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> -- 
> 2.5.5
> 

* Re: [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system
  2017-05-20 12:02 ` [PATCH v2 2/2] x86/mm/KASLR: Do not adapt size of the direct mapping section for SGI UV system Baoquan He
  2017-05-21 20:38   ` Thomas Garnier
@ 2017-08-31  6:21   ` Baoquan He
  1 sibling, 0 replies; 10+ messages in thread
From: Baoquan He @ 2017-08-31  6:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Thomas Garnier, Kees Cook, Andrew Morton, Masahiro Yamada,
	mike.travis

Hi all,

This is a blocker bug that was found on SGI UV systems and only happens
on them. Mike Travis, an expert from the HPE SGI UV dev team, told me in
a private mail that I can add his Acked-by to this patchset if I repost
it, so I will repost with an updated patch log. Currently, without this
fix, an SGI UV system is very likely to panic during boot.

On 05/20/17 at 08:02pm, Baoquan He wrote:
> On SGI UV system, kernel casually hang with kaslr enabled.
> 
> The back trace is:
> 
> kernel BUG at arch/x86/mm/init_64.c:311!
> invalid opcode: 0000 [#1] SMP
> [...]
> RIP: 0010:__init_extra_mapping+0x188/0x196
> [...]
> Call Trace:
>  init_extra_mapping_uc+0x13/0x15
>  map_high+0x67/0x75
>  map_mmioh_high_uv3+0x20a/0x219
>  uv_system_init_hub+0x12d9/0x1496
>  uv_system_init+0x27/0x29
>  native_smp_prepare_cpus+0x28d/0x2d8
>  kernel_init_freeable+0xdd/0x253
>  ? rest_init+0x80/0x80
>  kernel_init+0xe/0x110
>  ret_from_fork+0x2c/0x40
> 
> The root cause is that SGI UV system needs map its MMIOH region to direct
> mapping section and the mapping happens in rest_init(). However mm KASLR
> is done in kernel_randomize_memory() which is much earlier than MMIOH
> mapping of SGI UV and doesn't count in the MMIOH regions. When kaslr
> disabled, there are 64TB space for system RAM to do direct mapping. Both
> system RAM and SGI UV MMIOH region share this 64TB space. With kaslr
> enabled, mm KASLR only reserves the actual size of system RAM plus 10TB
> for direct mapping usage. Then later MMIOH mapping of SGI UV could go
> beyond the upper bound of direct mapping section to step into VMALLOC or
> VMEMMAP area. Then the BUG_ON() in __init_extra_mapping() will be
> triggered.
> 
> E.g on the SGI UV3 machine where this bug is reported , there are two MMIOH
> regions:
> 
> [    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
> [    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000
> 
> They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G ram are
> spread out to 1TB regions. Then above two SGI MMIOH regions also will be
> mapped into the direct mapping section.
> 
> To fix it, we need check if it's SGI UV system by calling
> is_early_uv_system() in kernel_randomize_memory(). If yes, do not adapt the
> size of the direct mapping section. Do it now.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org 
> Cc: Thomas Garnier <thgarnie@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> ---
>  arch/x86/mm/kaslr.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index aed2064..20b0456 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -27,6 +27,7 @@
>  #include <asm/pgtable.h>
>  #include <asm/setup.h>
>  #include <asm/kaslr.h>
> +#include <asm/uv/uv.h>
>  
>  #include "mm_internal.h"
>  
> @@ -123,7 +124,7 @@ void __init kernel_randomize_memory(void)
>  		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
>  
>  	/* Adapt phyiscal memory region size based on available memory */
> -	if (memory_tb < kaslr_regions[0].size_tb)
> +	if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
>  		kaslr_regions[0].size_tb = memory_tb;
>  
>  	/* Calculate entropy available between regions */
> -- 
> 2.5.5
> 
