linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size
@ 2018-08-29  2:17 Baoquan He
  2018-08-29  2:17 ` [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode Baoquan He
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Baoquan He @ 2018-08-29  2:17 UTC (permalink / raw)
  To: tglx, mingo, hpa, thgarnie, kirill.shutemov; +Cc: x86, linux-kernel, Baoquan He

In memory KASLR, __PHYSICAL_MASK_SHIFT is used to calculate the
initial size of the direct mapping region. This was right in the
old code, where __PHYSICAL_MASK_SHIFT was equal to MAX_PHYSMEM_BITS
(46 bits) and only 4-level paging was supported.

Later, in commit:
b83ce5ee91471d ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52"),
__PHYSICAL_MASK_SHIFT was changed to always be 52, no matter whether
5-level or 4-level paging is in use. This is wrong for 4-level paging:
when the physical memory region size is later adapted based on
available memory, the region will overflow if the amount of system
RAM plus the padding is bigger than 64TB.
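
A rough sketch of the arithmetic for 4-level paging, where
MAX_PHYSMEM_BITS is 46 and TB_SHIFT is 40:

	1 << (MAX_PHYSMEM_BITS      - TB_SHIFT)   /* 1 << 6  =   64TB */
	1 << (__PHYSICAL_MASK_SHIFT - TB_SHIFT)   /* 1 << 12 = 4096TB */

so the initial size no longer acts as the 64TB upper bound that the
4-level layout relies on.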

In fact, MAX_PHYSMEM_BITS should be used here instead. Fix it by
replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/mm/kaslr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 61db77b0eda9..0988971069c9 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -93,7 +93,7 @@ void __init kernel_randomize_memory(void)
 	if (!kaslr_memory_enabled())
 		return;
 
-	kaslr_regions[0].size_tb = 1 << (__PHYSICAL_MASK_SHIFT - TB_SHIFT);
+	kaslr_regions[0].size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
 	kaslr_regions[1].size_tb = VMALLOC_SIZE_TB;
 
 	/*
-- 
2.13.6



* [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-29  2:17 [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size Baoquan He
@ 2018-08-29  2:17 ` Baoquan He
  2018-08-29 12:05   ` Kirill A. Shutemov
  2018-08-29 11:49 ` [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size Kirill A. Shutemov
  2018-09-08 12:10 ` Thomas Gleixner
  2 siblings, 1 reply; 20+ messages in thread
From: Baoquan He @ 2018-08-29  2:17 UTC (permalink / raw)
  To: tglx, mingo, hpa, thgarnie, kirill.shutemov; +Cc: x86, linux-kernel, Baoquan He

The vmemmap area has a different base and size depending on the
paging mode. Currently its size is just hardcoded as 1TB in memory
KASLR, which is not right for 5-level paging mode.

Adjust it according to the paging mode and use it during memory KASLR.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/include/asm/pgtable_64_types.h | 5 +++++
 arch/x86/mm/kaslr.c                     | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 04edd2d58211..fa759d2d3186 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -126,14 +126,19 @@ extern unsigned int ptrs_per_p4d;
 #define __VMEMMAP_BASE_L4	0xffffea0000000000UL
 #define __VMEMMAP_BASE_L5	0xffd4000000000000UL
 
+#define VMEMMAP_SIZE_TB_L4	1UL
+#define VMEMMAP_SIZE_TB_L5	512UL
+
 #ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
 # define VMALLOC_START		vmalloc_base
 # define VMALLOC_SIZE_TB	(pgtable_l5_enabled() ? VMALLOC_SIZE_TB_L5 : VMALLOC_SIZE_TB_L4)
 # define VMEMMAP_START		vmemmap_base
+# define VMEMMAP_SIZE_TB	(pgtable_l5_enabled() ? VMEMMAP_SIZE_TB_L5 : VMEMMAP_SIZE_TB_L4)
 #else
 # define VMALLOC_START		__VMALLOC_BASE_L4
 # define VMALLOC_SIZE_TB	VMALLOC_SIZE_TB_L4
 # define VMEMMAP_START		__VMEMMAP_BASE_L4
+# define VMEMMAP_SIZE_TB	VMEMMAP_SIZE_TB_L4
 #endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */
 
 #define VMALLOC_END		(VMALLOC_START + (VMALLOC_SIZE_TB << 40) - 1)
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 0988971069c9..69228af4c7d7 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -51,7 +51,7 @@ static __initdata struct kaslr_memory_region {
 } kaslr_regions[] = {
 	{ &page_offset_base, 0 },
 	{ &vmalloc_base, 0 },
-	{ &vmemmap_base, 1 },
+	{ &vmemmap_base, 0 },
 };
 
 /* Get size in bytes used by the memory region */
@@ -95,6 +95,7 @@ void __init kernel_randomize_memory(void)
 
 	kaslr_regions[0].size_tb = 1 << (MAX_PHYSMEM_BITS - TB_SHIFT);
 	kaslr_regions[1].size_tb = VMALLOC_SIZE_TB;
+	kaslr_regions[2].size_tb = VMEMMAP_SIZE_TB;
 
 	/*
 	 * Update Physical memory mapping to available and
-- 
2.13.6



* Re: [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size
  2018-08-29  2:17 [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size Baoquan He
  2018-08-29  2:17 ` [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode Baoquan He
@ 2018-08-29 11:49 ` Kirill A. Shutemov
  2018-09-04 17:57   ` Thomas Garnier
  2018-09-08 12:10 ` Thomas Gleixner
  2 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2018-08-29 11:49 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On Wed, Aug 29, 2018 at 10:17:53AM +0800, Baoquan He wrote:
> In memory KASLR, __PHYSICAL_MASK_SHIFT is taken to calculate the
> initial size of the direct mapping region. This is right in the
> old code where __PHYSICAL_MASK_SHIFT was equal to MAX_PHYSMEM_BITS,
> 46bit, and only 4-level mode was supported.
> 
> Later, in commit:
> b83ce5ee91471d ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52"),
> __PHYSICAL_MASK_SHIFT was changed to be 52 always, no matter it's
> 5-level or 4-level. This is wrong for 4-level paging. Then when
> adapt physical memory region size based on available memory, it
> will overflow if the amount of system RAM and the padding is bigger
> than 64TB.
> 
> In fact, here MAX_PHYSMEM_BITS should be used instead. Fix it by
> replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-29  2:17 ` [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode Baoquan He
@ 2018-08-29 12:05   ` Kirill A. Shutemov
  2018-08-29 12:16     ` Baoquan He
                       ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Kirill A. Shutemov @ 2018-08-29 12:05 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On Wed, Aug 29, 2018 at 10:17:54AM +0800, Baoquan He wrote:
> Vmemmap area has different base and size depending on paging mode.
> Now we just hardcode its size as 1TB in memory KASLR, it's not
> right for 5-level paging mode.
> 
> Adjust it according to paging mode and use it during memory KASLR.
> 

I think 512TiB is wasteful for 5-level paging. We don't need that much.

The 1TiB limit with 4-level paging is required to fit struct pages for
all 64TiB of physical memory, assuming each struct page is 64 bytes.

With 5-level paging the limit on physical memory is not 512 times
bigger: we cap at a 52-bit physical address space. So it's just 64
times bigger, and we need only 64TiB in the worst case.
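
A quick back-of-the-envelope check of those numbers, assuming 4KiB
pages and a 64-byte struct page:

	64TiB / 4KiB * 64 bytes =  1TiB of vmemmap  (4-level)
	 4PiB / 4KiB * 64 bytes = 64TiB of vmemmap  (5-level, 52-bit)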

I think we can limit it further by taking memory_tb into account. Most
machines will be fine with 1TiB there, and we save a few more bits for
KASLR.

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-29 12:05   ` Kirill A. Shutemov
@ 2018-08-29 12:16     ` Baoquan He
  2018-08-29 12:18     ` Baoquan He
  2018-08-30 15:25     ` Baoquan He
  2 siblings, 0 replies; 20+ messages in thread
From: Baoquan He @ 2018-08-29 12:16 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On 08/29/18 at 03:05pm, Kirill A. Shutemov wrote:
> On Wed, Aug 29, 2018 at 10:17:54AM +0800, Baoquan He wrote:
> > Vmemmap area has different base and size depending on paging mode.
> > Now we just hardcode its size as 1TB in memory KASLR, it's not
> > right for 5-level paging mode.
> > 
> > Adjust it according to paging mode and use it during memory KASLR.
> > 
> 
> I think 512TiB is wasteful for 5-level paging. We don't need that much.
> 
> 1TiB limit with 4-level paging is required to fit struct pages for all
> 64TiB of physical memory, assuming each struct page is 64 bytes.
> 
> With 5-level paging the limit on physical memory is not 512-times bigger:
> we cap at 52-bit physical address space. So it's just 64 times bigger and
> we need only 64TiB in worst case.

Thanks, Kirill.

Exactly, the current 5-level paging costs at most 64TB for vmemmap. I
just copied 512TB from Documentation/x86/x86_64/mm.txt. Things might
develop quickly enough that RAM sizes need to exceed 52 bits in the
future. Also, KASLR and KASAN are mutually exclusive; with KASLR
enabled, we can take over the 8PB space otherwise reserved for KASAN.
So taking 512TB here might not weaken KASLR. That was my thinking for
this fix, but I am also fine with changing it to 64TB for 5-level
paging mode. I can change it along the lines of your idea and repost.

Thanks
Baoquan



* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-29 12:05   ` Kirill A. Shutemov
  2018-08-29 12:16     ` Baoquan He
@ 2018-08-29 12:18     ` Baoquan He
  2018-08-29 12:26       ` Kirill A. Shutemov
  2018-08-30 15:25     ` Baoquan He
  2 siblings, 1 reply; 20+ messages in thread
From: Baoquan He @ 2018-08-29 12:18 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On 08/29/18 at 03:05pm, Kirill A. Shutemov wrote:
> On Wed, Aug 29, 2018 at 10:17:54AM +0800, Baoquan He wrote:
> > Vmemmap area has different base and size depending on paging mode.
> > Now we just hardcode its size as 1TB in memory KASLR, it's not
> > right for 5-level paging mode.
> > 
> > Adjust it according to paging mode and use it during memory KASLR.
> > 
> 
> I think 512TiB is wasteful for 5-level paging. We don't need that much.
> 
> 1TiB limit with 4-level paging is required to fit struct pages for all
> 64TiB of physical memory, assuming each struct page is 64 bytes.
> 
> With 5-level paging the limit on physical memory is not 512-times bigger:
> we cap at 52-bit physical address space. So it's just 64 times bigger and
> we need only 64TiB in worst case.
> 
> I think we can limit it further by taking into account memory_tb. Most of
> machines will be fine with 1TiB there and we save few more bits from
> KASLR.

Oh, do you mean to make the calculation according to the actual size of
system RAM? And still take 1TB as the default, then adapt it later based
on the RAM size?

Then there's no need to introduce VMEMMAP_SIZE_TB; this also looks good to me.

Thanks
Baoquan


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-29 12:18     ` Baoquan He
@ 2018-08-29 12:26       ` Kirill A. Shutemov
  2018-08-29 12:51         ` Baoquan He
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2018-08-29 12:26 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On Wed, Aug 29, 2018 at 08:18:56PM +0800, Baoquan He wrote:
> On 08/29/18 at 03:05pm, Kirill A. Shutemov wrote:
> > On Wed, Aug 29, 2018 at 10:17:54AM +0800, Baoquan He wrote:
> > > Vmemmap area has different base and size depending on paging mode.
> > > Now we just hardcode its size as 1TB in memory KASLR, it's not
> > > right for 5-level paging mode.
> > > 
> > > Adjust it according to paging mode and use it during memory KASLR.
> > > 
> > 
> > I think 512TiB is wasteful for 5-level paging. We don't need that much.
> > 
> > 1TiB limit with 4-level paging is required to fit struct pages for all
> > 64TiB of physical memory, assuming each struct page is 64 bytes.
> > 
> > With 5-level paging the limit on physical memory is not 512-times bigger:
> > we cap at 52-bit physical address space. So it's just 64 times bigger and
> > we need only 64TiB in worst case.
> > 
> > I think we can limit it further by taking into account memory_tb. Most of
> > machines will be fine with 1TiB there and we save few more bits from
> > KASLR.
> 
> Oh, do you mean to make a calculation according to the actual size of
> system RAM? And still taking 1TB as default, then adapt later by RAM? 

Right, actual system RAM plus padding for memory hotplug.

> Then, no need to introduce VMEMMAP_SIZE_TB, this also looks good to me.

There's a tricky part that we ignore now: struct page might be larger
than 64 bytes depending on which debug options are enabled. We may want
to include the actual size of struct page in the calculation.

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-29 12:26       ` Kirill A. Shutemov
@ 2018-08-29 12:51         ` Baoquan He
  0 siblings, 0 replies; 20+ messages in thread
From: Baoquan He @ 2018-08-29 12:51 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On 08/29/18 at 03:26pm, Kirill A. Shutemov wrote:
> On Wed, Aug 29, 2018 at 08:18:56PM +0800, Baoquan He wrote:
> > On 08/29/18 at 03:05pm, Kirill A. Shutemov wrote:
> > > On Wed, Aug 29, 2018 at 10:17:54AM +0800, Baoquan He wrote:
> > > > Vmemmap area has different base and size depending on paging mode.
> > > > Now we just hardcode its size as 1TB in memory KASLR, it's not
> > > > right for 5-level paging mode.
> > > > 
> > > > Adjust it according to paging mode and use it during memory KASLR.
> > > > 
> > > 
> > > I think 512TiB is wasteful for 5-level paging. We don't need that much.
> > > 
> > > 1TiB limit with 4-level paging is required to fit struct pages for all
> > > 64TiB of physical memory, assuming each struct page is 64 bytes.
> > > 
> > > With 5-level paging the limit on physical memory is not 512-times bigger:
> > > we cap at 52-bit physical address space. So it's just 64 times bigger and
> > > we need only 64TiB in worst case.
> > > 
> > > I think we can limit it further by taking into account memory_tb. Most of
> > > machines will be fine with 1TiB there and we save few more bits from
> > > KASLR.
> > 
> > Oh, do you mean to make a calculation according to the actual size of
> > system RAM? And still taking 1TB as default, then adapt later by RAM? 
> 
> Right, actual system RAM plus padding for memory hotplug.

OK, this may need to be applied on top of Masa's padding adjusting patches:

x86/mm: Add an option to change the padding used for the physical memory mapping
lkml.kernel.org/r/20180821212305.20214-1-msys.mizuma@gmail.com

> 
> > Then, no need to introduce VMEMMAP_SIZE_TB, this also looks good to me.
> 
> There's a tricky part that we ignore now. struct page might be larger than
> 64 bytes depending on debug options enabled. We may include the actual
> size of struct page in calculation.

OK, I will consider this.

Thanks
Baoquan


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-29 12:05   ` Kirill A. Shutemov
  2018-08-29 12:16     ` Baoquan He
  2018-08-29 12:18     ` Baoquan He
@ 2018-08-30 15:25     ` Baoquan He
  2018-09-02 20:52       ` Kirill A. Shutemov
  2 siblings, 1 reply; 20+ messages in thread
From: Baoquan He @ 2018-08-30 15:25 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

Hi Kirill,

I made a new version according to your suggestion, with one small
difference: I didn't make 1TB the default, but instead calculate with
the actual size, then align up to a 1TB boundary. I also found that
kcore is printing more entries than before; I first thought it was
caused by my code, but later found it was touched by other people's
changes.

Any comments on this? I can change it accordingly.

From b19955ec5d439f7cb580331c83a27ad5753b06b6 Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Thu, 30 Aug 2018 11:45:13 +0800
Subject: [PATCH] x86/mm/KASLR: Calculate the actual size of vmemmap region

The vmemmap region has a different maximum size depending on the
paging mode, but its size is currently hardcoded as 1TB in memory
KASLR. This is not right for 5-level paging mode: it will cause an
overflow if the vmemmap region is randomized to be adjacent to the
cpu_entry_area region and its actual size is bigger than 1TB.

So calculate how many TB the vmemmap region actually needs and align
it up to a 1TB boundary.
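
For example, with the default 64TB direct mapping size and a 64-byte
struct page, this calculation gives:

	64TB / 4KB * 64 bytes = 1TB

i.e. a single 1TB slot, matching the old hardcoded value.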

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/mm/kaslr.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 0988971069c9..1db8e166455e 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -51,7 +51,7 @@ static __initdata struct kaslr_memory_region {
 } kaslr_regions[] = {
 	{ &page_offset_base, 0 },
 	{ &vmalloc_base, 0 },
-	{ &vmemmap_base, 1 },
+	{ &vmemmap_base, 0 },
 };
 
 /* Get size in bytes used by the memory region */
@@ -77,6 +77,7 @@ void __init kernel_randomize_memory(void)
 	unsigned long rand, memory_tb;
 	struct rnd_state rand_state;
 	unsigned long remain_entropy;
+	unsigned long vmemmap_size;
 
 	vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
 	vaddr = vaddr_start;
@@ -108,6 +109,14 @@ void __init kernel_randomize_memory(void)
 	if (memory_tb < kaslr_regions[0].size_tb)
 		kaslr_regions[0].size_tb = memory_tb;
 
+	/*
+	 * Calculate how many terabytes the vmemmap region needs, and
+	 * align it up to a 1TB boundary.
+	 */
+	vmemmap_size = (kaslr_regions[0].size_tb << (TB_SHIFT - PAGE_SHIFT)) *
+		sizeof(struct page);
+	kaslr_regions[2].size_tb = DIV_ROUND_UP(vmemmap_size, 1UL << TB_SHIFT);
+
 	/* Calculate entropy available between regions */
 	remain_entropy = vaddr_end - vaddr_start;
 	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++)
-- 
2.13.6



* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-08-30 15:25     ` Baoquan He
@ 2018-09-02 20:52       ` Kirill A. Shutemov
  2018-09-03  7:47         ` Baoquan He
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2018-09-02 20:52 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On Thu, Aug 30, 2018 at 11:25:12PM +0800, Baoquan He wrote:
> Hi Kirill,
> 
> I made a new version according to your suggestion, just a little
> different, I didn't make 1TB as default, just calculate with the actual
> size, then align up to 1TB boundary.  Just found kcore is printing more
> entries than before, I thought it's caused by my code, later got it was
> touched by other people.
> 
> Any comment about this? I can change accordingly.

Looks good to me.

But there's a corner case when struct page is unreasonably large and
vmemmap_size will be way too large. We probably have to report an error
if we cannot fit vmemmap properly into the virtual memory layout.

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-09-02 20:52       ` Kirill A. Shutemov
@ 2018-09-03  7:47         ` Baoquan He
  2018-09-03 10:26           ` Kirill A. Shutemov
  0 siblings, 1 reply; 20+ messages in thread
From: Baoquan He @ 2018-09-03  7:47 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On 09/02/18 at 11:52pm, Kirill A. Shutemov wrote:
> On Thu, Aug 30, 2018 at 11:25:12PM +0800, Baoquan He wrote:
> > Hi Kirill,
> > 
> > I made a new version according to your suggestion, just a little
> > different, I didn't make 1TB as default, just calculate with the actual
> > size, then align up to 1TB boundary.  Just found kcore is printing more
> > entries than before, I thought it's caused by my code, later got it was
> > touchde by other people.
> > 
> > Any comment about this? I can change accordingly.
> 
> Looks good to me.
> 
> But there's corner case when struct page is unreasonably large and
> vmemmap_size will be way too large. We probably have to report an error if
> we cannot fit vmemmap properly into virtual memory layout.

Hmm, sizeof(struct page) surely can't exceed one whole page, otherwise
system bootup can't get past vmemmap initialization. Apart from this,
we may need to think about how much of the virtual memory layout
vmemmap can be allowed to occupy.

If KASAN enabled, KASLR disabled,
4-level 1TB + 1TB hole (2TB)
5-level 512TB + 2034TB hole (2.5PB)

If KASAN disabled, KASLR enabled,
4-level 1TB + 1TB hole + 16TB  (18TB)
5-level 512TB + 2034TB hole + 8PB (10.5PB)

So, as you can see, if we add a check in the memory KASLR code, we only
need to consider the KASLR-enabled case. We possibly don't need to worry
about the 5-level case since the size 10.5PB is even bigger than the
maximum physical RAM mapping size. For 4-level, aligning 18TB up to a
power of two gives 32 times the current 1TB; we usually assume 64 bytes
as the default sizeof(struct page), and 64*32 == 1024. So we can add a
check like this, what do you think? Or any other idea?

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 1db8e166455e..776ec759a87c 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -90,6 +90,7 @@ void __init kernel_randomize_memory(void)
        BUILD_BUG_ON(vaddr_start >= vaddr_end);
        BUILD_BUG_ON(vaddr_end != CPU_ENTRY_AREA_BASE);
        BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
+       BUILD_BUG_ON(sizeof(struct page ) > PAGE_SIZE/4);
 
        if (!kaslr_memory_enabled())
                return;


For 5-level paging mode, we may not need to worry about that. For
reference, here are the vmemmap and KASAN regions in both modes:

***4-level***
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
... unused hole ...
ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
... unused hole ...



***5-level***
ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
... unused hole ...
ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB) 

> 
> -- 
>  Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-09-03  7:47         ` Baoquan He
@ 2018-09-03 10:26           ` Kirill A. Shutemov
  2018-09-03 14:52             ` Baoquan He
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2018-09-03 10:26 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On Mon, Sep 03, 2018 at 03:47:18PM +0800, Baoquan He wrote:
> On 09/02/18 at 11:52pm, Kirill A. Shutemov wrote:
> > On Thu, Aug 30, 2018 at 11:25:12PM +0800, Baoquan He wrote:
> > > Hi Kirill,
> > > 
> > > I made a new version according to your suggestion, just a little
> > > different, I didn't make 1TB as default, just calculate with the actual
> > > size, then align up to 1TB boundary.  Just found kcore is printing more
> > > entries than before, I thought it's caused by my code, later got it was
> > > touchde by other people.
> > > 
> > > Any comment about this? I can change accordingly.
> > 
> > Looks good to me.
> > 
> > But there's corner case when struct page is unreasonably large and
> > vmemmap_size will be way to large. We probably have to report an error if
> > we cannot fit vmemmap properly into virtual memory layout.
> 
> Hmm, sizeof(struct page) can't exceed one whole page surely, otherwise
> system bootup can't go over vmemmap initialization. Except of this, we may
> need think about the virtual memory layout which vmemmap can be allowed
> to occupy.
> 
> If KASAN enabled, KASLR disabled,
> 4-level 1TB + 1TB hole (2TB)
> 5-level 512TB + 2034TB hole (2.5PB)
> 
> If KASAN disabled, KASLR enabled,
> 4-level 1TB + 1TB hole + 16TB  (18TB)
> 5-level 512TB + 2034TB hole + 8PB (10.5PB)
> 
> So, as you can see, if add check in memory KASLR code, we should only
> consider KASLR enabled case. We possibly don't need to worry about
> 5-level case since the size 10.5PB is even bigger than the maximum
> physical RAM mapping size. For 4-level, 18TB align to multiples of 2, it
> will be 32 times of the current 1TB, then we usually assume 64 as the
> default value of sizeof(struct page), then 64*32 == 1024. So we can add
> check like this, what do you think? Or any other idea?

Looks reasonable to me.

But I would have the BUILD_BUG_ON() in generic code. If your struct page
is more than 1/4 of PAGE_SIZE, something is horribly broken.

> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index 1db8e166455e..776ec759a87c 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -90,6 +90,7 @@ void __init kernel_randomize_memory(void)
>         BUILD_BUG_ON(vaddr_start >= vaddr_end);
>         BUILD_BUG_ON(vaddr_end != CPU_ENTRY_AREA_BASE);
>         BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
> +       BUILD_BUG_ON(sizeof(struct page ) > PAGE_SIZE/4);

Nitpick: redundant space before ')'.

>  
>         if (!kaslr_memory_enabled())
>                 return;
> 
> 
> For 5-level paging mode, we
> may not need to worry about that. Since KASAN 
> 
> ***4-level***
> ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
> ... unused hole ...
> ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
> ... unused hole ...
> 
> 
> 
> ***5-level***
> ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
> ... unused hole ...
> ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB) 
> 
> > 
> > -- 
> >  Kirill A. Shutemov

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-09-03 10:26           ` Kirill A. Shutemov
@ 2018-09-03 14:52             ` Baoquan He
  2018-09-04  8:13               ` Kirill A. Shutemov
  0 siblings, 1 reply; 20+ messages in thread
From: Baoquan He @ 2018-09-03 14:52 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On 09/03/18 at 01:26pm, Kirill A. Shutemov wrote:
> On Mon, Sep 03, 2018 at 03:47:18PM +0800, Baoquan He wrote:
> > On 09/02/18 at 11:52pm, Kirill A. Shutemov wrote:
> > > On Thu, Aug 30, 2018 at 11:25:12PM +0800, Baoquan He wrote:
> > > > Hi Kirill,
> > > > 
> > > > I made a new version according to your suggestion, just a little
> > > > different, I didn't make 1TB as default, just calculate with the actual
> > > > size, then align up to 1TB boundary.  Just found kcore is printing more
> > > > entries than before, I thought it's caused by my code, later got it was
> > > > touchde by other people.
> > > > 
> > > > Any comment about this? I can change accordingly.
> > > 
> > > Looks good to me.
> > > 
> > > But there's corner case when struct page is unreasonably large and
> > > vmemmap_size will be way to large. We probably have to report an error if
> > > we cannot fit vmemmap properly into virtual memory layout.
> > 
> > Hmm, sizeof(struct page) can't exceed one whole page surely, otherwise
> > system bootup can't go over vmemmap initlization. Except of this, we may
> > need think about the virtual memory layout which vmemmap can be allowed
> > to occupy.
> > 
> > If KASAN enabled, KASLR disabled,
> > 4-level 1TB + 1TB hole (2TB)
> > 5-level 512TB + 2034TB hole (2.5PB)
> > 
> > If KASAN disabled, KASLR enabled,
> > 4-level 1TB + 1TB hole + 16TB  (18TB)
> > 5-level 512TB + 2034TB hole + 8PB (10.5PB)
> > 
> > So, as you can see, if add check in memory KASLR code, we should only
> > consider KASLR enabled case. We possibly don't need to worry about
> > 5-level case since the size 10.5PB is even bigger than the maximum
> > physical RAM mapping size. For 4-level, 18TB align to multiples of 2, it
> > will be 32 times of the current 1TB, then we usually assume 64 as the
> > default value of sizeof(struct page), then 64*32 == 1024. So we can add
> > check like this, what do you think? Or any other idea?
> 
> Looks reasonable to me.
> 
> But I would have the BUILD_BUG_ON() in generic code. If you struct page is
> more than 1/4 of PAGE_SIZE something is horribly broken.

The 1/4 of PAGE_SIZE is just based on analysis of the KASLR case; in
the non-KASLR case it may not be that value. I'm not sure whether it's
OK to put it in generic code, and I haven't thought of a good place.
Maybe in setup_arch(), just at the beginning?

> 
> > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > index 1db8e166455e..776ec759a87c 100644
> > --- a/arch/x86/mm/kaslr.c
> > +++ b/arch/x86/mm/kaslr.c
> > @@ -90,6 +90,7 @@ void __init kernel_randomize_memory(void)
> >         BUILD_BUG_ON(vaddr_start >= vaddr_end);
> >         BUILD_BUG_ON(vaddr_end != CPU_ENTRY_AREA_BASE);
> >         BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
> > +       BUILD_BUG_ON(sizeof(struct page ) > PAGE_SIZE/4);
> 
> Nitpick: redundant space before ')'.
> 
> >  
> >         if (!kaslr_memory_enabled())
> >                 return;
> > 
> > 
> > For 5-level paging mode, we
> > may not need to worry about that. Since KASAN 
> > 
> > ***4-level***
> > ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
> > ... unused hole ...
> > ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
> > ... unused hole ...
> > 
> > 
> > 
> > ***5-level***
> > ffd4000000000000 - ffd5ffffffffffff (=49 bits) virtual memory map (512TB)
> > ... unused hole ...
> > ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB) 
> > 
> > > 
> > > -- 
> > >  Kirill A. Shutemov
> 
> -- 
>  Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-09-03 14:52             ` Baoquan He
@ 2018-09-04  8:13               ` Kirill A. Shutemov
  2018-09-05  8:15                 ` Baoquan He
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2018-09-04  8:13 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On Mon, Sep 03, 2018 at 10:52:13PM +0800, Baoquan He wrote:
> On 09/03/18 at 01:26pm, Kirill A. Shutemov wrote:
> > On Mon, Sep 03, 2018 at 03:47:18PM +0800, Baoquan He wrote:
> > > On 09/02/18 at 11:52pm, Kirill A. Shutemov wrote:
> > > > On Thu, Aug 30, 2018 at 11:25:12PM +0800, Baoquan He wrote:
> > > > > Hi Kirill,
> > > > > 
> > > > > I made a new version according to your suggestion, just a little
> > > > > different, I didn't make 1TB as default, just calculate with the actual
> > > > > size, then align up to 1TB boundary.  Just found kcore is printing more
> > > > > entries than before, I thought it's caused by my code, later got it was
> > > > > touchde by other people.
> > > > > 
> > > > > Any comment about this? I can change accordingly.
> > > > 
> > > > Looks good to me.
> > > > 
> > > > But there's corner case when struct page is unreasonably large and
> > > > vmemmap_size will be way to large. We probably have to report an error if
> > > > we cannot fit vmemmap properly into virtual memory layout.
> > > 
> > > Hmm, sizeof(struct page) can't exceed one whole page surely, otherwise
> > > system bootup can't go over vmemmap initlization. Except of this, we may
> > > need think about the virtual memory layout which vmemmap can be allowed
> > > to occupy.
> > > 
> > > If KASAN enabled, KASLR disabled,
> > > 4-level 1TB + 1TB hole (2TB)
> > > 5-level 512TB + 2034TB hole (2.5PB)
> > > 
> > > If KASAN disabled, KASLR enabled,
> > > 4-level 1TB + 1TB hole + 16TB  (18TB)
> > > 5-level 512TB + 2034TB hole + 8PB (10.5PB)
> > > 
> > > So, as you can see, if add check in memory KASLR code, we should only
> > > consider KASLR enabled case. We possibly don't need to worry about
> > > 5-level case since the size 10.5PB is even bigger than the maximum
> > > physical RAM mapping size. For 4-level, 18TB align to multiples of 2, it
> > > will be 32 times of the current 1TB, then we usually assume 64 as the
> > > default value of sizeof(struct page), then 64*32 == 1024. So we can add
> > > check like this, what do you think? Or any other idea?
> > 
> > Looks reasonable to me.
> > 
> > But I would have the BUILD_BUG_ON() in generic code. If you struct page is
> > more than 1/4 of PAGE_SIZE something is horribly broken.
> 
> Just the 1/4 of PAGE_SIZE is based on analysis of KASLR case. If
> non-KASLR case, it may not be that value.

Even if it is technically possible to have struct page larger than
PAGE_SIZE/4, it's just insane.

> Not sure if it's OK to put it in generic code, and haven't thought of a
> good place, maybe in setup_arch(), just at the beginning?

I don't see an obvious place either. Maybe free_area_init_nodes()?

-- 
 Kirill A. Shutemov


* Re: [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size
  2018-08-29 11:49 ` [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size Kirill A. Shutemov
@ 2018-09-04 17:57   ` Thomas Garnier
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas Garnier @ 2018-09-04 17:57 UTC (permalink / raw)
  To: kirill
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Kirill A. Shutemov, the arch/x86 maintainers, LKML

Thanks Baoquan!

Reviewed-by: Thomas Garnier <thgarnie@google.com>

On Wed, Aug 29, 2018 at 4:49 AM Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Wed, Aug 29, 2018 at 10:17:53AM +0800, Baoquan He wrote:
> > In memory KASLR, __PHYSICAL_MASK_SHIFT is taken to calculate the
> > initial size of the direct mapping region. This is right in the
> > old code where __PHYSICAL_MASK_SHIFT was equal to MAX_PHYSMEM_BITS,
> > 46bit, and only 4-level mode was supported.
> >
> > Later, in commit:
> > b83ce5ee91471d ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52"),
> > __PHYSICAL_MASK_SHIFT was changed to be 52 always, no matter it's
> > 5-level or 4-level. This is wrong for 4-level paging. Then when
> > adapt phyiscal memory region size based on available memory, it
> > will overflow if the amount of system RAM and the padding is bigger
> > than 64TB.
> >
> > In fact, here MAX_PHYSMEM_BITS should be used instead. Fix it by
> > by replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.
> >
> > Signed-off-by: Baoquan He <bhe@redhat.com>
>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> --
>  Kirill A. Shutemov



-- 
Thomas


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-09-04  8:13               ` Kirill A. Shutemov
@ 2018-09-05  8:15                 ` Baoquan He
  2018-09-05 12:09                   ` Kirill A. Shutemov
  0 siblings, 1 reply; 20+ messages in thread
From: Baoquan He @ 2018-09-05  8:15 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On 09/04/18 at 11:13am, Kirill A. Shutemov wrote:
> On Mon, Sep 03, 2018 at 10:52:13PM +0800, Baoquan He wrote:
> > On 09/03/18 at 01:26pm, Kirill A. Shutemov wrote:
> > > > > But there's corner case when struct page is unreasonably large and
> > > > > vmemmap_size will be way to large. We probably have to report an error if
> > > > > we cannot fit vmemmap properly into virtual memory layout.
> > > > 
> > > > Hmm, sizeof(struct page) can't exceed one whole page surely, otherwise
> > > > system bootup can't go over vmemmap initlization. Except of this, we may
> > > > need think about the virtual memory layout which vmemmap can be allowed
> > > > to occupy.
> > > > 
> > > > If KASAN enabled, KASLR disabled,
> > > > 4-level 1TB + 1TB hole (2TB)
> > > > 5-level 512TB + 2034TB hole (2.5PB)
> > > > 
> > > > If KASAN disabled, KASLR enabled,
> > > > 4-level 1TB + 1TB hole + 16TB  (18TB)
> > > > 5-level 512TB + 2034TB hole + 8PB (10.5PB)
> > > > 
> > > > So, as you can see, if add check in memory KASLR code, we should only
> > > > consider KASLR enabled case. We possibly don't need to worry about
> > > > 5-level case since the size 10.5PB is even bigger than the maximum
> > > > physical RAM mapping size. For 4-level, 18TB align to multiples of 2, it
> > > > will be 32 times of the current 1TB, then we usually assume 64 as the
> > > > default value of sizeof(struct page), then 64*32 == 1024. So we can add
						~~~64*32 = 2048
				Sorry, I made a mistake here.
> > > > check like this, what do you think? Or any other idea?
> > > 
> > > Looks reasonable to me.
> > > 
> > > But I would have the BUILD_BUG_ON() in generic code. If you struct page is
> > > more than 1/4 of PAGE_SIZE something is horribly broken.
> > 
> > Just the 1/4 of PAGE_SIZE is based on analysis of KASLR case. If
> > non-KASLR case, it may not be that value.
> 
> Even if it technically possible to have struct page larger than
> PAGE_SIZE/4, it's just insane.
> 
> > Not sure if it's OK to put it in generic code, and haven't thought of a
> > good place, maybe in setup_arch(), just at the beginning?
> 
> I don't see an obvious place too. Maybe free_area_init_nodes()?

OK, you mean a more generic place; I had only considered a generic place
within x86. The thing is that not all architectures set PAGE_SIZE to 4KB;
e.g. powerpc and arm64 can have a PAGE_SIZE of 64KB. For them, PAGE_SIZE/4,
namely 16KB, is hardly ever reached. So my thought is to either use
PAGE_SIZE/4 in x86 arch code only, or use SZ_1K in free_area_init_nodes()
as you suggested. What do you think?

Thanks
Baoquan 


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-09-05  8:15                 ` Baoquan He
@ 2018-09-05 12:09                   ` Kirill A. Shutemov
  2018-09-05 12:37                     ` Baoquan He
  0 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2018-09-05 12:09 UTC (permalink / raw)
  To: Baoquan He
  Cc: Kirill A. Shutemov, tglx, mingo, hpa, thgarnie, x86, linux-kernel

On Wed, Sep 05, 2018 at 08:15:31AM +0000, Baoquan He wrote:
> On 09/04/18 at 11:13am, Kirill A. Shutemov wrote:
> > On Mon, Sep 03, 2018 at 10:52:13PM +0800, Baoquan He wrote:
> > > On 09/03/18 at 01:26pm, Kirill A. Shutemov wrote:
> > > > > > But there's corner case when struct page is unreasonably large and
> > > > > > vmemmap_size will be way to large. We probably have to report an error if
> > > > > > we cannot fit vmemmap properly into virtual memory layout.
> > > > > 
> > > > > Hmm, sizeof(struct page) can't exceed one whole page surely, otherwise
> > > > > system bootup can't go over vmemmap initlization. Except of this, we may
> > > > > need think about the virtual memory layout which vmemmap can be allowed
> > > > > to occupy.
> > > > > 
> > > > > If KASAN enabled, KASLR disabled,
> > > > > 4-level 1TB + 1TB hole (2TB)
> > > > > 5-level 512TB + 2034TB hole (2.5PB)
> > > > > 
> > > > > If KASAN disabled, KASLR enabled,
> > > > > 4-level 1TB + 1TB hole + 16TB  (18TB)
> > > > > 5-level 512TB + 2034TB hole + 8PB (10.5PB)
> > > > > 
> > > > > So, as you can see, if add check in memory KASLR code, we should only
> > > > > consider KASLR enabled case. We possibly don't need to worry about
> > > > > 5-level case since the size 10.5PB is even bigger than the maximum
> > > > > physical RAM mapping size. For 4-level, 18TB align to multiples of 2, it
> > > > > will be 32 times of the current 1TB, then we usually assume 64 as the
> > > > > default value of sizeof(struct page), then 64*32 == 1024. So we can add
> 						~~~64*32 = 2048
> 				Sorry, I made mistake here.
> > > > > check like this, what do you think? Or any other idea?
> > > > 
> > > > Looks reasonable to me.
> > > > 
> > > > But I would have the BUILD_BUG_ON() in generic code. If you struct page is
> > > > more than 1/4 of PAGE_SIZE something is horribly broken.
> > > 
> > > Just the 1/4 of PAGE_SIZE is based on analysis of KASLR case. If
> > > non-KASLR case, it may not be that value.
> > 
> > Even if it technically possible to have struct page larger than
> > PAGE_SIZE/4, it's just insane.
> > 
> > > Not sure if it's OK to put it in generic code, and haven't thought of a
> > > good place, maybe in setup_arch(), just at the beginning?
> > 
> > I don't see an obvious place too. Maybe free_area_init_nodes()?
> 
> OK, you mean a more generic place, I only considered generic place in
> x86. The thing is not all ARCH-es set PAGE_SIZE as 4KB, e.g power and
> arm64 can have PAGE_SIZE of 64KB. For them, PAGE_SIZE/4, namely 16KB,
> is hardly reached. So my thought is either taking PAGE_SIZE/4 in x86
> arch only, or using SZ_1K in free_area_init_nodes() as you suggested.
> What do you think?

BUILD_BUG_ON() on min(SZ_1K, PAGE_SIZE/4)?
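
A minimal sketch of what such a check might look like, with the two
limits spelled out as separate assertions rather than an actual min()
expression (the exact placement is only illustrative here):

	/* Catch an insanely large struct page at build time */
	BUILD_BUG_ON(sizeof(struct page) > SZ_1K);
	BUILD_BUG_ON(sizeof(struct page) > PAGE_SIZE / 4);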

-- 
 Kirill A. Shutemov


* Re: [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode
  2018-09-05 12:09                   ` Kirill A. Shutemov
@ 2018-09-05 12:37                     ` Baoquan He
  0 siblings, 0 replies; 20+ messages in thread
From: Baoquan He @ 2018-09-05 12:37 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Kirill A. Shutemov, tglx, mingo, hpa, thgarnie, x86, linux-kernel

On 09/05/18 at 03:09pm, Kirill A. Shutemov wrote:
> On Wed, Sep 05, 2018 at 08:15:31AM +0000, Baoquan He wrote:
> > On 09/04/18 at 11:13am, Kirill A. Shutemov wrote:
> > > On Mon, Sep 03, 2018 at 10:52:13PM +0800, Baoquan He wrote:
> > > > On 09/03/18 at 01:26pm, Kirill A. Shutemov wrote:
> > > > > > > But there's corner case when struct page is unreasonably large and
> > > > > > > vmemmap_size will be way to large. We probably have to report an error if
> > > > > > > we cannot fit vmemmap properly into virtual memory layout.
> > > > > > 
> > > > > > Hmm, sizeof(struct page) can't exceed one whole page surely, otherwise
> > > > > > system bootup can't go over vmemmap initlization. Except of this, we may
> > > > > > need think about the virtual memory layout which vmemmap can be allowed
> > > > > > to occupy.
> > > > > > 
> > > > > > If KASAN enabled, KASLR disabled,
> > > > > > 4-level 1TB + 1TB hole (2TB)
> > > > > > 5-level 512TB + 2034TB hole (2.5PB)
> > > > > > 
> > > > > > If KASAN disabled, KASLR enabled,
> > > > > > 4-level 1TB + 1TB hole + 16TB  (18TB)
> > > > > > 5-level 512TB + 2034TB hole + 8PB (10.5PB)
> > > > > > 
> > > > > > So, as you can see, if add check in memory KASLR code, we should only
> > > > > > consider KASLR enabled case. We possibly don't need to worry about
> > > > > > 5-level case since the size 10.5PB is even bigger than the maximum
> > > > > > physical RAM mapping size. For 4-level, 18TB align to multiples of 2, it
> > > > > > will be 32 times of the current 1TB, then we usually assume 64 as the
> > > > > > default value of sizeof(struct page), then 64*32 == 1024. So we can add
> > 						~~~64*32 = 2048
> > 				Sorry, I made mistake here.
> > > > > > check like this, what do you think? Or any other idea?
> > > > > 
> > > > > Looks reasonable to me.
> > > > > 
> > > > > But I would have the BUILD_BUG_ON() in generic code. If you struct page is
> > > > > more than 1/4 of PAGE_SIZE something is horribly broken.
> > > > 
> > > > Just the 1/4 of PAGE_SIZE is based on analysis of KASLR case. If
> > > > non-KASLR case, it may not be that value.
> > > 
> > > Even if it technically possible to have struct page larger than
> > > PAGE_SIZE/4, it's just insane.
> > > 
> > > > Not sure if it's OK to put it in generic code, and haven't thought of a
> > > > good place, maybe in setup_arch(), just at the beginning?
> > > 
> > > I don't see an obvious place too. Maybe free_area_init_nodes()?
> > 
> > OK, you mean a more generic place, I only considered generic place in
> > x86. The thing is not all ARCH-es set PAGE_SIZE as 4KB, e.g power and
> > arm64 can have PAGE_SIZE of 64KB. For them, PAGE_SIZE/4, namely 16KB,
> > is hardly reached. So my thought is either taking PAGE_SIZE/4 in x86
> > arch only, or using SZ_1K in free_area_init_nodes() as you suggested.
> > What do you think?
> 
> BUILD_BUG_ON() on min(SZ_1K, PAGE_SIZE/4)?

I am fine with that. SZ_1K will just always win, though, since 4K is the
smallest known PAGE_SIZE granularity :-)



* Re: [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size
  2018-08-29  2:17 [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size Baoquan He
  2018-08-29  2:17 ` [PATCH 2/2] x86/mm/KASLR: Adjust the vmemmap size according to paging mode Baoquan He
  2018-08-29 11:49 ` [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size Kirill A. Shutemov
@ 2018-09-08 12:10 ` Thomas Gleixner
  2018-09-08 13:33   ` Baoquan He
  2 siblings, 1 reply; 20+ messages in thread
From: Thomas Gleixner @ 2018-09-08 12:10 UTC (permalink / raw)
  To: Baoquan He; +Cc: mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On Wed, 29 Aug 2018, Baoquan He wrote:

> In memory KASLR, __PHYSICAL_MASK_SHIFT is taken to calculate the
> initial size of the direct mapping region. This is right in the
> old code where __PHYSICAL_MASK_SHIFT was equal to MAX_PHYSMEM_BITS,
> 46bit, and only 4-level mode was supported.
> 
> Later, in commit:
> b83ce5ee91471d ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52"),
> __PHYSICAL_MASK_SHIFT was changed to be 52 always, no matter it's
> 5-level or 4-level. This is wrong for 4-level paging. Then when
> adapt physical memory region size based on available memory, it
> will overflow if the amount of system RAM and the padding is bigger
> than 64TB.
> 
> In fact, here MAX_PHYSMEM_BITS should be used instead. Fix it by
> replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>

This lacks a Fixes tag .....

Thanks,

	tglx


* Re: [PATCH 1/2] x86/mm/KASLR: Fix the wrong calculation of KASLR region initial size
  2018-09-08 12:10 ` Thomas Gleixner
@ 2018-09-08 13:33   ` Baoquan He
  0 siblings, 0 replies; 20+ messages in thread
From: Baoquan He @ 2018-09-08 13:33 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: mingo, hpa, thgarnie, kirill.shutemov, x86, linux-kernel

On 09/08/18 at 02:10pm, Thomas Gleixner wrote:
> On Wed, 29 Aug 2018, Baoquan He wrote:
> 
> > In memory KASLR, __PHYSICAL_MASK_SHIFT is taken to calculate the
> > initial size of the direct mapping region. This is right in the
> > old code where __PHYSICAL_MASK_SHIFT was equal to MAX_PHYSMEM_BITS,
> > 46bit, and only 4-level mode was supported.
> > 
> > Later, in commit:
> > b83ce5ee91471d ("x86/mm/64: Make __PHYSICAL_MASK_SHIFT always 52"),
> > __PHYSICAL_MASK_SHIFT was changed to be 52 always, no matter it's
> > 5-level or 4-level. This is wrong for 4-level paging. Then when
> > adapt phyiscal memory region size based on available memory, it
> > will overflow if the amount of system RAM and the padding is bigger
> > than 64TB.
> > 
> > In fact, here MAX_PHYSMEM_BITS should be used instead. Fix it by
> > by replacing __PHYSICAL_MASK_SHIFT with MAX_PHYSMEM_BITS.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> 
> This lacks a fixes tag .....

Sure, I will add a Fixes tag to this patch, and arrange a new patchset
according to the discussion with Kirill, after testing.

Thanks
Baoquan

