linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
@ 2019-02-28  0:35 Baoquan He
  2019-02-28  0:35 ` [PATCH v2 1/2] x86/mm/KASLR: Only build one PUD entry of area for real mode trampoline Baoquan He
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Baoquan He @ 2019-02-28  0:35 UTC (permalink / raw)
  To: linux-kernel, kirill.shutemov
  Cc: dave.hansen, luto, peterz, tglx, mingo, bp, hpa, x86, keescook,
	thgarnie, Baoquan He

This is the v2 post. v1 can be found here:
http://lkml.kernel.org/r/20190224132231.4878-1-bhe@redhat.com

Background:
***
Earlier, while reviewing a series of KASLR patches, Ingo noticed that the
current memory region KASLR only randomizes at PUD granularity in 4-level
paging mode, and at P4D granularity in 5-level paging mode. He suggested
changing both to PMD granularity:

  http://lkml.kernel.org/r/20180912100135.GB3333@gmail.com

Later, I changed the code to support PMD-level randomization for both
4-level and 5-level paging:

  https://github.com/baoquan-he/linux/commits/mm-kaslr-2m-aligned

The test passed on my KVM guest with 1 GB RAM, but failed when I
increased the RAM to 4 GB, and kept failing with larger RAM.

After analyzing, it turned out that a 1 GB page must be mapped at a
1 GB aligned physical address on Intel CPUs. Randomizing at 2 MB
granularity breaks that alignment and causes errors. Please check the
table below in the Intel IA-32 manual:

  Table 4-15. Format of an IA-32e Page-Directory-Pointer-Table Entry (PDPTE) that Maps a 1-GByte Page

So PMD level of randomization for mm KASLR is not doable.
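
As a rough illustration (not code from this series), the constraint can
be written as a check that the physical address backing a 1 GB mapping
has its low 30 bits clear:

/*
 * Illustrative sketch, not kernel code: per Table 4-15, a PDPTE that
 * maps a 1 GByte page stores the physical address in bits 51:30, so
 * the backing physical address must be 1 GB aligned. A 2 MB-granular
 * random shift usually violates that.
 */
#include <stdbool.h>

#define PUD_SIZE (1UL << 30)	/* 1 GB */
#define PMD_SIZE (1UL << 21)	/* 2 MB */

static bool fits_1gb_page(unsigned long paddr)
{
	return (paddr & (PUD_SIZE - 1)) == 0;
}

/* fits_1gb_page(0) is true; fits_1gb_page(3 * PMD_SIZE) is false. */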

However, while investigating and testing the above code, it turned out
that the current code builds an unnecessarily large identity mapping for
the real mode trampoline when KASLR is enabled. From the code, only a
small area (smaller than 1 MB) needs to be identity mapped. Please check
the patch below, taken from the mm-kaslr-2m-aligned series above; it
builds only a 2 MB identity mapping for the real mode trampoline, and
the test passed on machines with 32 GB RAM in 4-level mode and on a KVM
guest in 5-level mode.

https://github.com/baoquan-he/linux/commit/e120e67fbf9a5aa818d20084d8dea5b4a27ecf97

Result:
Make a patchset to:
  1) Change the code to build only a 1 GB area of mapping for the real
     mode trampoline, namely copy only the one PUD entry where physical
     address 0 resides;

  2) Improve the randomization granularity of 5-level paging from P4D
     size to PUD size; a rough sketch of the masking that sets this
     granularity follows below.
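
As a rough, self-contained sketch (not taken from the patches), the
granularity follows from how the random offset is masked before it is
applied:

/*
 * Illustrative sketch only: clearing the low bits of the raw entropy
 * with PUD_MASK aligns every region base to 1 GB; P4D_MASK, which
 * 5-level used before this series, gives 512 GB alignment. Constants
 * mirror x86-64 but are redefined here for a standalone build.
 */
#include <stdio.h>

#define PUD_SHIFT 30
#define P4D_SHIFT 39
#define PUD_MASK (~((1UL << PUD_SHIFT) - 1))
#define P4D_MASK (~((1UL << P4D_SHIFT) - 1))

int main(void)
{
	unsigned long rand = 0x123456789abUL;	/* raw entropy */

	printf("PUD-aligned: %#lx\n", rand & PUD_MASK);	/* 1 GB steps */
	printf("P4D-aligned: %#lx\n", rand & P4D_MASK);	/* 512 GB steps */
	return 0;
}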

Changelog:
v1->v2:
  Improve patch according to Kirill's suggestions:
    *) Add more information to code comment for better understanding;
    *) Improve code to save one low memory page in 4-level;

Baoquan He (2):
  x86/mm/KASLR: Only build one PUD entry of area for real mode
    trampoline
  x86/mm/KASLR: Change the granularity of randomization to PUD size in
    5-level

 arch/x86/mm/kaslr.c | 92 ++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 52 deletions(-)

-- 
2.17.2



* [PATCH v2 1/2] x86/mm/KASLR: Only build one PUD entry of area for real mode trampoline
  2019-02-28  0:35 [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level Baoquan He
@ 2019-02-28  0:35 ` Baoquan He
  2019-02-28  0:35 ` [PATCH v2 2/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level Baoquan He
  2019-02-28  9:10 ` [PATCH v2 0/2] " Kirill A. Shutemov
  2 siblings, 0 replies; 13+ messages in thread
From: Baoquan He @ 2019-02-28  0:35 UTC (permalink / raw)
  To: linux-kernel, kirill.shutemov
  Cc: dave.hansen, luto, peterz, tglx, mingo, bp, hpa, x86, keescook,
	thgarnie, Baoquan He

The current code builds the identity mapping for the real mode
trampoline by borrowing page tables from the direct mapping section if
KASLR is enabled. It copies the present entries of the first PUD table
in 4-level paging mode, or of the first P4D table in 5-level paging
mode.

However, only a very small area under the low 1 MB is reserved for the
real mode trampoline in reserve_real_mode(). It makes no sense to build
such a large mapping for it. Since the randomization granularity is
1 GB in 4-level and 512 GB in 5-level, copying one PUD entry is enough.

Hence, copy only the one PUD entry covering the area where physical
address 0 resides. This is also preparation for later changing the
randomization granularity of 5-level paging mode from 512 GB to 1 GB.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/mm/kaslr.c | 82 +++++++++++++++++++++------------------------
 1 file changed, 38 insertions(+), 44 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 754b5da91d43..131e08a10a68 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -226,74 +226,68 @@ void __init kernel_randomize_memory(void)
 
 static void __meminit init_trampoline_pud(void)
 {
-	unsigned long paddr, paddr_next;
+	unsigned long paddr, vaddr;
 	pgd_t *pgd;
-	pud_t *pud_page, *pud_page_tramp;
-	int i;
+
+	pud_t *pud_page, *pud_page_tramp, *pud, *pud_tramp;
 
 	pud_page_tramp = alloc_low_page();
 
 	paddr = 0;
+	vaddr = (unsigned long)__va(paddr);
 	pgd = pgd_offset_k((unsigned long)__va(paddr));
-	pud_page = (pud_t *) pgd_page_vaddr(*pgd);
-
-	for (i = pud_index(paddr); i < PTRS_PER_PUD; i++, paddr = paddr_next) {
-		pud_t *pud, *pud_tramp;
-		unsigned long vaddr = (unsigned long)__va(paddr);
 
-		pud_tramp = pud_page_tramp + pud_index(paddr);
-		pud = pud_page + pud_index(vaddr);
-		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
-
-		*pud_tramp = *pud;
-	}
+	if (pgtable_l5_enabled()) {
+		p4d_t *p4d_page, *p4d_page_tramp, *p4d, *p4d_tramp;

+		p4d_page_tramp = alloc_low_page();
 
-	set_pgd(&trampoline_pgd_entry,
-		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
-}
+		p4d_page = (p4d_t *) pgd_page_vaddr(*pgd);
+		p4d = p4d_page + p4d_index(vaddr);
 
-static void __meminit init_trampoline_p4d(void)
-{
-	unsigned long paddr, paddr_next;
-	pgd_t *pgd;
-	p4d_t *p4d_page, *p4d_page_tramp;
-	int i;
+		pud_page = (pud_t *) p4d_page_vaddr(*p4d);
+		pud = pud_page + pud_index(vaddr);
 
-	p4d_page_tramp = alloc_low_page();
+		p4d_tramp = p4d_page_tramp + p4d_index(paddr);
+		pud_tramp = pud_page_tramp + pud_index(paddr);
 
-	paddr = 0;
-	pgd = pgd_offset_k((unsigned long)__va(paddr));
-	p4d_page = (p4d_t *) pgd_page_vaddr(*pgd);
+		*pud_tramp = *pud;
 
-	for (i = p4d_index(paddr); i < PTRS_PER_P4D; i++, paddr = paddr_next) {
-		p4d_t *p4d, *p4d_tramp;
-		unsigned long vaddr = (unsigned long)__va(paddr);
+		set_p4d(p4d_tramp,
+			__p4d(_KERNPG_TABLE | __pa(pud_page_tramp)));
 
-		p4d_tramp = p4d_page_tramp + p4d_index(paddr);
-		p4d = p4d_page + p4d_index(vaddr);
-		paddr_next = (paddr & P4D_MASK) + P4D_SIZE;
+		set_pgd(&trampoline_pgd_entry,
+			__pgd(_KERNPG_TABLE | __pa(p4d_page_tramp)));
+	} else {
+		pud_page = (pud_t *) pgd_page_vaddr(*pgd);
+		pud = pud_page + pud_index(vaddr);
 
-		*p4d_tramp = *p4d;
+		pud_tramp = pud_page_tramp + pud_index(paddr);
+		*pud_tramp = *pud;
+		set_pgd(&trampoline_pgd_entry,
+			__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
 	}
-
-	set_pgd(&trampoline_pgd_entry,
-		__pgd(_KERNPG_TABLE | __pa(p4d_page_tramp)));
 }
 
 /*
- * Create PGD aligned trampoline table to allow real mode initialization
- * of additional CPUs. Consume only 1 low memory page.
+ * Real mode trampoline only occupies a small area under low 1 MB
+ * (please check codes in reserve_real_mode() for details). For
+ * APs' booting up, we just borrow as few page tables as possible
+ * from the direct physical mapping to build 1:1 mapping to cover
+ * that area. In case KASLR disabled, the 1st PGD entry of the
+ * direct mapping is copied directly. If KASLR is enabled, only
+ * copy the 1st PUD entry where physical address 0 resides since
+ * the granularity of randomization is PUD size in 4-level, and
+ * P4D size in 5-level.
+ *
+ * This consumes one low memory page in 4-level case, and extra one
+ * in 5-level.
  */
 void __meminit init_trampoline(void)
 {
-
 	if (!kaslr_memory_enabled()) {
 		init_trampoline_default();
 		return;
 	}
 
-	if (pgtable_l5_enabled())
-		init_trampoline_p4d();
-	else
-		init_trampoline_pud();
+	init_trampoline_pud();
 }
-- 
2.17.2



* [PATCH v2 2/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28  0:35 [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level Baoquan He
  2019-02-28  0:35 ` [PATCH v2 1/2] x86/mm/KASLR: Only build one PUD entry of area for real mode trampoline Baoquan He
@ 2019-02-28  0:35 ` Baoquan He
  2019-02-28  9:10 ` [PATCH v2 0/2] " Kirill A. Shutemov
  2 siblings, 0 replies; 13+ messages in thread
From: Baoquan He @ 2019-02-28  0:35 UTC (permalink / raw)
  To: linux-kernel, kirill.shutemov
  Cc: dave.hansen, luto, peterz, tglx, mingo, bp, hpa, x86, keescook,
	thgarnie, Baoquan He

The current randomization granularity of 5-level paging is 512 GB.
Improve it to 1 GB. This adds more randomness to memory region KASLR
in 5-level paging mode.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/mm/kaslr.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 131e08a10a68..99ec75634613 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -204,10 +204,7 @@ void __init kernel_randomize_memory(void)
 		 */
 		entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);
 		prandom_bytes_state(&rand_state, &rand, sizeof(rand));
-		if (pgtable_l5_enabled())
-			entropy = (rand % (entropy + 1)) & P4D_MASK;
-		else
-			entropy = (rand % (entropy + 1)) & PUD_MASK;
+		entropy = (rand % (entropy + 1)) & PUD_MASK;
 		vaddr += entropy;
 		*kaslr_regions[i].base = vaddr;
 
@@ -216,10 +213,7 @@ void __init kernel_randomize_memory(void)
 		 * randomization alignment.
 		 */
 		vaddr += kaslr_regions[i].size_tb << TB_SHIFT;
-		if (pgtable_l5_enabled())
-			vaddr = round_up(vaddr + 1, P4D_SIZE);
-		else
-			vaddr = round_up(vaddr + 1, PUD_SIZE);
+		vaddr = round_up(vaddr + 1, PUD_SIZE);
 		remain_entropy -= entropy;
 	}
 }
@@ -276,8 +270,8 @@ static void __meminit init_trampoline_pud(void)
  * that area. In case KASLR disabled, the 1st PGD entry of the
  * direct mapping is copied directly. If KASLR is enabled, only
  * copy the 1st PUD entry where physical address 0 resides since
- * the granularity of randomization is PUD size in 4-level, and
- * P4D size in 5-level.
+ * the granularity of randomization is PUD size in both 4-level and
+ * 5-level.
  *
  * This consumes one low memory page in 4-level case, and extra one
  * in 5-level.
-- 
2.17.2



* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28  0:35 [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level Baoquan He
  2019-02-28  0:35 ` [PATCH v2 1/2] x86/mm/KASLR: Only build one PUD entry of area for real mode trampoline Baoquan He
  2019-02-28  0:35 ` [PATCH v2 2/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level Baoquan He
@ 2019-02-28  9:10 ` Kirill A. Shutemov
  2019-02-28  9:23   ` Baoquan He
  2019-02-28  9:29   ` [PATCH v3 " Baoquan He
  2 siblings, 2 replies; 13+ messages in thread
From: Kirill A. Shutemov @ 2019-02-28  9:10 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On Thu, Feb 28, 2019 at 08:35:20AM +0800, Baoquan He wrote:
>  arch/x86/mm/kaslr.c | 92 ++++++++++++++++++++-------------------------
>  1 file changed, 40 insertions(+), 52 deletions(-)

What tree is it against?

I cannot apply it on the current Linus' tree or tip/master.

-- 
 Kirill A. Shutemov


* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28  9:10 ` [PATCH v2 0/2] " Kirill A. Shutemov
@ 2019-02-28  9:23   ` Baoquan He
  2019-02-28  9:55     ` Kirill A. Shutemov
  2019-02-28  9:29   ` [PATCH v3 " Baoquan He
  1 sibling, 1 reply; 13+ messages in thread
From: Baoquan He @ 2019-02-28  9:23 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On 02/28/19 at 12:10pm, Kirill A. Shutemov wrote:
> On Thu, Feb 28, 2019 at 08:35:20AM +0800, Baoquan He wrote:
> >  arch/x86/mm/kaslr.c | 92 ++++++++++++++++++++-------------------------
> >  1 file changed, 40 insertions(+), 52 deletions(-)
> 
> What tree is it against?
> 
> I cannot apply it on the current Linus' tree or tip/master.

Oops, I based them on my previous kaslr fixing patches:
http://lkml.kernel.org/r/20190216140008.28671-1-bhe@redhat.com
[PATCH v3 0/6] Several patches to fix code bugs, improve documents and clean up

There's a cleanup there that open codes get_padding(), which causes a
conflict in patch 0002. Let me rearrange and repost.

Thanks
Baoquan



* [PATCH v3 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28  9:10 ` [PATCH v2 0/2] " Kirill A. Shutemov
  2019-02-28  9:23   ` Baoquan He
@ 2019-02-28  9:29   ` Baoquan He
  1 sibling, 0 replies; 13+ messages in thread
From: Baoquan He @ 2019-02-28  9:29 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

The current randomization granularity of 5-level paging is 512 GB.
Improve it to 1 GB. This adds more randomness to memory region KASLR
in 5-level paging mode.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
v2->v3:
  The v2 was based on another patchset. Rebased it onto the latest
  Linus' tree.

 arch/x86/mm/kaslr.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 4e7f07b466c5..4ef9a27a1361 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -125,10 +125,7 @@ void __init kernel_randomize_memory(void)
 		 */
 		entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);
 		prandom_bytes_state(&rand_state, &rand, sizeof(rand));
-		if (pgtable_l5_enabled())
-			entropy = (rand % (entropy + 1)) & P4D_MASK;
-		else
-			entropy = (rand % (entropy + 1)) & PUD_MASK;
+		entropy = (rand % (entropy + 1)) & PUD_MASK;
 		vaddr += entropy;
 		*kaslr_regions[i].base = vaddr;
 
@@ -137,10 +134,7 @@ void __init kernel_randomize_memory(void)
 		 * randomization alignment.
 		 */
 		vaddr += get_padding(&kaslr_regions[i]);
-		if (pgtable_l5_enabled())
-			vaddr = round_up(vaddr + 1, P4D_SIZE);
-		else
-			vaddr = round_up(vaddr + 1, PUD_SIZE);
+		vaddr = round_up(vaddr + 1, PUD_SIZE);
 		remain_entropy -= entropy;
 	}
 }
@@ -197,8 +191,8 @@ static void __meminit init_trampoline_pud(void)
  * that area. In case KASLR disabled, the 1st PGD entry of the
  * direct mapping is copied directly. If KASLR is enabled, only
  * copy the 1st PUD entry where physical address 0 resides since
- * the granularity of randomization is PUD size in 4-level, and
- * P4D size in 5-level.
+ * the granularity of randomization is PUD size in both 4-level and
+ * 5-level.
  *
  * This consumes one low memory page in 4-level case, and extra one
  * in 5-level.
-- 
2.17.2



* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28  9:23   ` Baoquan He
@ 2019-02-28  9:55     ` Kirill A. Shutemov
  2019-02-28 10:04       ` Baoquan He
  0 siblings, 1 reply; 13+ messages in thread
From: Kirill A. Shutemov @ 2019-02-28  9:55 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On Thu, Feb 28, 2019 at 05:23:46PM +0800, Baoquan He wrote:
> On 02/28/19 at 12:10pm, Kirill A. Shutemov wrote:
> > On Thu, Feb 28, 2019 at 08:35:20AM +0800, Baoquan He wrote:
> > >  arch/x86/mm/kaslr.c | 92 ++++++++++++++++++++-------------------------
> > >  1 file changed, 40 insertions(+), 52 deletions(-)
> > 
> > What tree is it against?
> > 
> > I cannot apply it on the current Linus' tree or tip/master.
> 
> Oops, I based them on my previous kaslr fixing patches:
> http://lkml.kernel.org/r/20190216140008.28671-1-bhe@redhat.com
> [PATCH v3 0/6] Several patches to fix code bugs, improve documents and clean up
> 
> There's a cleanup there that open codes get_padding(), which causes
> a conflict in patch 0002. Let me rearrange and repost.

I cannot apply the first patch either. Hm?

-- 
 Kirill A. Shutemov


* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28  9:55     ` Kirill A. Shutemov
@ 2019-02-28 10:04       ` Baoquan He
  2019-02-28 10:30         ` Kirill A. Shutemov
  0 siblings, 1 reply; 13+ messages in thread
From: Baoquan He @ 2019-02-28 10:04 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On 02/28/19 at 12:55pm, Kirill A. Shutemov wrote:
> I cannot apply the first patch either. Hm?

It's weird. I used 'git cherry-pick' and it applied cleanly.

And the git-formatted patch can be applied too. Could you try the one
below? git am works for me with it, based on this commit on Linus's
tree:
7d762d69145a  afs: Fix manually set volume location server list.


From dcbe8e4846102fdd051deb3c2946125c17b270fb Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Thu, 21 Feb 2019 11:46:59 +0800
Subject: [PATCH v3] x86/mm/KASLR: Only build one PUD entry of area for real mode
 trampoline

The current code builds the identity mapping for the real mode
trampoline by borrowing page tables from the direct mapping section if
KASLR is enabled. It copies the present entries of the first PUD table
in 4-level paging mode, or of the first P4D table in 5-level paging
mode.

However, only a very small area under the low 1 MB is reserved for the
real mode trampoline in reserve_real_mode(). It makes no sense to build
such a large mapping for it. Since the randomization granularity is
1 GB in 4-level and 512 GB in 5-level, copying one PUD entry is enough.

Hence, copy only the one PUD entry covering the area where physical
address 0 resides. This is also preparation for later changing the
randomization granularity of 5-level paging mode from 512 GB to 1 GB.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/mm/kaslr.c | 82 +++++++++++++++++++++------------------------
 1 file changed, 38 insertions(+), 44 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 754b5da91d43..131e08a10a68 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -226,74 +226,68 @@ void __init kernel_randomize_memory(void)
 
 static void __meminit init_trampoline_pud(void)
 {
-	unsigned long paddr, paddr_next;
+	unsigned long paddr, vaddr;
 	pgd_t *pgd;
-	pud_t *pud_page, *pud_page_tramp;
-	int i;
+
+	pud_t *pud_page, *pud_page_tramp, *pud, *pud_tramp;
 
 	pud_page_tramp = alloc_low_page();
 
 	paddr = 0;
+	vaddr = (unsigned long)__va(paddr);
 	pgd = pgd_offset_k((unsigned long)__va(paddr));
-	pud_page = (pud_t *) pgd_page_vaddr(*pgd);
-
-	for (i = pud_index(paddr); i < PTRS_PER_PUD; i++, paddr = paddr_next) {
-		pud_t *pud, *pud_tramp;
-		unsigned long vaddr = (unsigned long)__va(paddr);
 
-		pud_tramp = pud_page_tramp + pud_index(paddr);
-		pud = pud_page + pud_index(vaddr);
-		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
-
-		*pud_tramp = *pud;
-	}
+	if (pgtable_l5_enabled()) {
+		p4d_t *p4d_page, *p4d_page_tramp, *p4d, *p4d_tramp;
+		p4d_page_tramp = alloc_low_page();
 
-	set_pgd(&trampoline_pgd_entry,
-		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
-}
+		p4d_page = (p4d_t *) pgd_page_vaddr(*pgd);
+		p4d = p4d_page + p4d_index(vaddr);
 
-static void __meminit init_trampoline_p4d(void)
-{
-	unsigned long paddr, paddr_next;
-	pgd_t *pgd;
-	p4d_t *p4d_page, *p4d_page_tramp;
-	int i;
+		pud_page = (pud_t *) p4d_page_vaddr(*p4d);
+		pud = pud_page + pud_index(vaddr);
 
-	p4d_page_tramp = alloc_low_page();
+		p4d_tramp = p4d_page_tramp + p4d_index(paddr);
+		pud_tramp = pud_page_tramp + pud_index(paddr);
 
-	paddr = 0;
-	pgd = pgd_offset_k((unsigned long)__va(paddr));
-	p4d_page = (p4d_t *) pgd_page_vaddr(*pgd);
+		*pud_tramp = *pud;
 
-	for (i = p4d_index(paddr); i < PTRS_PER_P4D; i++, paddr = paddr_next) {
-		p4d_t *p4d, *p4d_tramp;
-		unsigned long vaddr = (unsigned long)__va(paddr);
+		set_p4d(p4d_tramp,
+			__p4d(_KERNPG_TABLE | __pa(pud_page_tramp)));
 
-		p4d_tramp = p4d_page_tramp + p4d_index(paddr);
-		p4d = p4d_page + p4d_index(vaddr);
-		paddr_next = (paddr & P4D_MASK) + P4D_SIZE;
+		set_pgd(&trampoline_pgd_entry,
+			__pgd(_KERNPG_TABLE | __pa(p4d_page_tramp)));
+	} else {
+		pud_page = (pud_t *) pgd_page_vaddr(*pgd);
+		pud = pud_page + pud_index(vaddr);
 
-		*p4d_tramp = *p4d;
+		pud_tramp = pud_page_tramp + pud_index(paddr);
+		*pud_tramp = *pud;
+		set_pgd(&trampoline_pgd_entry,
+			__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
 	}
-
-	set_pgd(&trampoline_pgd_entry,
-		__pgd(_KERNPG_TABLE | __pa(p4d_page_tramp)));
 }
 
 /*
- * Create PGD aligned trampoline table to allow real mode initialization
- * of additional CPUs. Consume only 1 low memory page.
+ * Real mode trampoline only occupies a small area under low 1 MB
+ * (please check codes in reserve_real_mode() for details). For
+ * APs' booting up, we just borrow as few page tables as possible
+ * from the direct physical mapping to build 1:1 mapping to cover
+ * that area. In case KASLR disabled, the 1st PGD entry of the
+ * direct mapping is copied directly. If KASLR is enabled, only
+ * copy the 1st PUD entry where physical address 0 resides since
+ * the granularity of randomization is PUD size in 4-level, and
+ * P4D size in 5-level.
+ *
+ * This consumes one low memory page in 4-level case, and extra one
+ * in 5-level.
  */
 void __meminit init_trampoline(void)
 {
-
 	if (!kaslr_memory_enabled()) {
 		init_trampoline_default();
 		return;
 	}
 
-	if (pgtable_l5_enabled())
-		init_trampoline_p4d();
-	else
-		init_trampoline_pud();
+	init_trampoline_pud();
 }
-- 
2.17.2



* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28 10:04       ` Baoquan He
@ 2019-02-28 10:30         ` Kirill A. Shutemov
  2019-02-28 13:01           ` Baoquan He
  2019-03-01 14:45           ` Baoquan He
  0 siblings, 2 replies; 13+ messages in thread
From: Kirill A. Shutemov @ 2019-02-28 10:30 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On Thu, Feb 28, 2019 at 06:04:19PM +0800, Baoquan He wrote:
> On 02/28/19 at 12:55pm, Kirill A. Shutemov wrote:
> > I cannot apply the first patch either. Hm?
> 
> It's weird. I used 'git cherry-pick' and it applied cleanly.
> 
> And the git-formatted patch can be applied too. Could you try the one
> below? git am works for me with it, based on this commit on Linus's
> tree:
> 7d762d69145a  afs: Fix manually set volume location server list.
> 
> 
> From dcbe8e4846102fdd051deb3c2946125c17b270fb Mon Sep 17 00:00:00 2001
> From: Baoquan He <bhe@redhat.com>
> Date: Thu, 21 Feb 2019 11:46:59 +0800
> Subject: [PATCH v3] x86/mm/KASLR: Only build one PUD entry of area for real mode
>  trampoline
> 
> The current code builds the identity mapping for the real mode
> trampoline by borrowing page tables from the direct mapping section if
> KASLR is enabled. It copies the present entries of the first PUD table
> in 4-level paging mode, or of the first P4D table in 5-level paging
> mode.
> 
> However, only a very small area under the low 1 MB is reserved for the
> real mode trampoline in reserve_real_mode(). It makes no sense to build
> such a large mapping for it. Since the randomization granularity is
> 1 GB in 4-level and 512 GB in 5-level, copying one PUD entry is enough.
> 
> Hence, copy only the one PUD entry covering the area where physical
> address 0 resides. This is also preparation for later changing the
> randomization granularity of 5-level paging mode from 512 GB to 1 GB.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  arch/x86/mm/kaslr.c | 82 +++++++++++++++++++++------------------------
>  1 file changed, 38 insertions(+), 44 deletions(-)
> 
> diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> index 754b5da91d43..131e08a10a68 100644
> --- a/arch/x86/mm/kaslr.c
> +++ b/arch/x86/mm/kaslr.c
> @@ -226,74 +226,68 @@ void __init kernel_randomize_memory(void)
>  
>  static void __meminit init_trampoline_pud(void)
>  {
> -	unsigned long paddr, paddr_next;
> +	unsigned long paddr, vaddr;
>  	pgd_t *pgd;
> -	pud_t *pud_page, *pud_page_tramp;
> -	int i;
> +

Unneeded empty line.

> +	pud_t *pud_page, *pud_page_tramp, *pud, *pud_tramp;
>  
>  	pud_page_tramp = alloc_low_page();
>  
>  	paddr = 0;
> +	vaddr = (unsigned long)__va(paddr);

Maybe just

	vaddr = PAGE_OFFSET;

?

>  	pgd = pgd_offset_k((unsigned long)__va(paddr));

We do have 'vaddr' already.

	pgd = pgd_offset_k(vaddr);

> -	pud_page = (pud_t *) pgd_page_vaddr(*pgd);
> -
> -	for (i = pud_index(paddr); i < PTRS_PER_PUD; i++, paddr = paddr_next) {
> -		pud_t *pud, *pud_tramp;
> -		unsigned long vaddr = (unsigned long)__va(paddr);
>  
> -		pud_tramp = pud_page_tramp + pud_index(paddr);
> -		pud = pud_page + pud_index(vaddr);
> -		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
> -
> -		*pud_tramp = *pud;
> -	}
> +	if (pgtable_l5_enabled()) {
> +		p4d_t *p4d_page, *p4d_page_tramp, *p4d, *p4d_tramp;
> +		p4d_page_tramp = alloc_low_page();
>  
> -	set_pgd(&trampoline_pgd_entry,
> -		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
> -}
> +		p4d_page = (p4d_t *) pgd_page_vaddr(*pgd);
> +		p4d = p4d_page + p4d_index(vaddr);
>  
> -static void __meminit init_trampoline_p4d(void)
> -{
> -	unsigned long paddr, paddr_next;
> -	pgd_t *pgd;
> -	p4d_t *p4d_page, *p4d_page_tramp;
> -	int i;
> +		pud_page = (pud_t *) p4d_page_vaddr(*p4d);
> +		pud = pud_page + pud_index(vaddr);
>  
> -	p4d_page_tramp = alloc_low_page();
> +		p4d_tramp = p4d_page_tramp + p4d_index(paddr);
> +		pud_tramp = pud_page_tramp + pud_index(paddr);

p?d_index() has to be called on the virtual address. It shouldn't break
anything, but it's wrong from a conceptual point of view.

I don't think we need paddr at all in this function.

BTW, any reason we cannot use p?d_offset() instead of playing with
p?d_index() directly?
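
As a minimal sketch with mock types (not the kernel's definitions,
which live in the pgtable headers), p?d_offset() folds exactly the
index arithmetic that the patch open-codes with p?d_index():

/*
 * Mock sketch only: pud_index() turns a virtual address into a table
 * index; a pud_offset()-style helper returns the entry pointer
 * directly and hides that arithmetic.
 */
typedef struct { unsigned long val; } pud_t;

#define PUD_SHIFT	30
#define PTRS_PER_PUD	512

static unsigned long pud_index(unsigned long vaddr)
{
	return (vaddr >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
}

static pud_t *pud_offset_mock(pud_t *pud_page, unsigned long vaddr)
{
	return pud_page + pud_index(vaddr);	/* what pud_offset() hides */
}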


-- 
 Kirill A. Shutemov


* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28 10:30         ` Kirill A. Shutemov
@ 2019-02-28 13:01           ` Baoquan He
  2019-03-01 14:45           ` Baoquan He
  1 sibling, 0 replies; 13+ messages in thread
From: Baoquan He @ 2019-02-28 13:01 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On 02/28/19 at 01:30pm, Kirill A. Shutemov wrote:
> > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > index 754b5da91d43..131e08a10a68 100644
> > --- a/arch/x86/mm/kaslr.c
> > +++ b/arch/x86/mm/kaslr.c
> > @@ -226,74 +226,68 @@ void __init kernel_randomize_memory(void)
> >  
> >  static void __meminit init_trampoline_pud(void)
> >  {
> > -	unsigned long paddr, paddr_next;
> > +	unsigned long paddr, vaddr;
> >  	pgd_t *pgd;
> > -	pud_t *pud_page, *pud_page_tramp;
> > -	int i;
> > +
> 
> Unneeded empty line.

Right, I will remove it, and move it to the top since it's the longest
line.

> 
> > +	pud_t *pud_page, *pud_page_tramp, *pud, *pud_tramp;
> >  
> >  	pud_page_tramp = alloc_low_page();
> >  
> >  	paddr = 0;
> > +	vaddr = (unsigned long)__va(paddr);
> 
> Maybe just
> 
> 	vaddr = PAGE_OFFSET;
> 
> ?

An explicit 0 to PAGE_OFFSET conversion explains the page table
borrowing from the direct mapping to some extent. But either is fine
with me if you prefer PAGE_OFFSET.

> 
> >  	pgd = pgd_offset_k((unsigned long)__va(paddr));
> 
> We do have 'vaddr' already.

Yeah, will replace it.

> 
> 	pgd = pgd_offset_k(vaddr);
> 
> > -	pud_page = (pud_t *) pgd_page_vaddr(*pgd);
> > -
> > -	for (i = pud_index(paddr); i < PTRS_PER_PUD; i++, paddr = paddr_next) {
> > -		pud_t *pud, *pud_tramp;
> > -		unsigned long vaddr = (unsigned long)__va(paddr);
> >  
> > -		pud_tramp = pud_page_tramp + pud_index(paddr);
> > -		pud = pud_page + pud_index(vaddr);
> > -		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
> > -
> > -		*pud_tramp = *pud;
> > -	}
> > +	if (pgtable_l5_enabled()) {
> > +		p4d_t *p4d_page, *p4d_page_tramp, *p4d, *p4d_tramp;
> > +		p4d_page_tramp = alloc_low_page();
> >  
> > -	set_pgd(&trampoline_pgd_entry,
> > -		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
> > -}
> > +		p4d_page = (p4d_t *) pgd_page_vaddr(*pgd);
> > +		p4d = p4d_page + p4d_index(vaddr);
> >  
> > -static void __meminit init_trampoline_p4d(void)
> > -{
> > -	unsigned long paddr, paddr_next;
> > -	pgd_t *pgd;
> > -	p4d_t *p4d_page, *p4d_page_tramp;
> > -	int i;
> > +		pud_page = (pud_t *) p4d_page_vaddr(*p4d);
> > +		pud = pud_page + pud_index(vaddr);
> >  
> > -	p4d_page_tramp = alloc_low_page();
> > +		p4d_tramp = p4d_page_tramp + p4d_index(paddr);
> > +		pud_tramp = pud_page_tramp + pud_index(paddr);
> 
> p?d_index() has to be called on the virtual address. It shouldn't break
> anything, but it's wrong from a conceptual point of view.
> I don't think we need paddr at all in this function.

Hmm, here the tramp table is for identity mapping the real mode area.
Its paddr equals its vaddr. Using paddr here makes it clear that it is
the identity mapping being handled.

> 
> BTW, any reason we cannot use p?d_offset() instead of playing with
> p?d_index() directly?

Sure, p?d_offset() looks better. I just took it from the old code and
will use it instead.

Thanks for the careful review.


* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-02-28 10:30         ` Kirill A. Shutemov
  2019-02-28 13:01           ` Baoquan He
@ 2019-03-01 14:45           ` Baoquan He
  2019-03-04  8:15             ` Kirill A. Shutemov
  1 sibling, 1 reply; 13+ messages in thread
From: Baoquan He @ 2019-03-01 14:45 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

Hi Kirill,

I updated the patch per your comments, but kept the 'paddr' variable
and the '0' initialization. My thought is that there are two kinds of
mapping being handled, so keeping these names from the old code makes
it more understandable. What do you think?

Kind               Physical address            Virtual address
----------------------------------------------------------------
Direct mapping     paddr                vaddr = paddr+PAGE_OFFSET
1:1 mapping        paddr                    paddr
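
As an illustrative sketch of the two relations (the PAGE_OFFSET value
below is the 4-level x86-64 default and is itself randomized when
KASLR is on, so treat it as an example only):

/* Sketch only: the two address relations in the table above. */
#define PAGE_OFFSET 0xffff888000000000UL	/* example base */

static unsigned long direct_map_vaddr(unsigned long paddr)
{
	return paddr + PAGE_OFFSET;	/* direct mapping */
}

static unsigned long identity_map_vaddr(unsigned long paddr)
{
	return paddr;			/* 1:1 trampoline mapping */
}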

Thanks
Baoquan

From 52c6c80b2e3ffe153902800793c0e732c3c6bf1d Mon Sep 17 00:00:00 2001
From: Baoquan He <bhe@redhat.com>
Date: Thu, 21 Feb 2019 11:46:59 +0800
Subject: [PATCH] x86/mm/KASLR: Only build one PUD entry of area for real mode
 trampoline

The current code builds the identity mapping for the real mode
trampoline by borrowing page tables from the direct mapping section if
KASLR is enabled. It copies the present entries of the first PUD table
in 4-level paging mode, or of the first P4D table in 5-level paging
mode.

However, only a very small area under the low 1 MB is reserved for the
real mode trampoline in reserve_real_mode(). It makes no sense to build
such a large mapping for it. Since the randomization granularity is
1 GB in 4-level and 512 GB in 5-level, copying one PUD entry is enough.

Hence, copy only the one PUD entry covering the area where physical
address 0 resides. This is also preparation for later changing the
randomization granularity of 5-level paging mode from 512 GB to 1 GB.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/mm/kaslr.c | 79 ++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 47 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 3f452ffed7e9..c6fab7a77439 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -147,74 +147,59 @@ void __init kernel_randomize_memory(void)
 
 static void __meminit init_trampoline_pud(void)
 {
-	unsigned long paddr, paddr_next;
+	pud_t *pud_page_tramp, *pud, *pud_tramp;
+	p4d_t *p4d_page_tramp, *p4d, *p4d_tramp;
+	unsigned long paddr, vaddr;
 	pgd_t *pgd;
-	pud_t *pud_page, *pud_page_tramp;
-	int i;
 
 	pud_page_tramp = alloc_low_page();
 
 	paddr = 0;
-	pgd = pgd_offset_k((unsigned long)__va(paddr));
-	pud_page = (pud_t *) pgd_page_vaddr(*pgd);
+	vaddr = (unsigned long)__va(paddr);
+	pgd = pgd_offset_k(vaddr);
 
-	for (i = pud_index(paddr); i < PTRS_PER_PUD; i++, paddr = paddr_next) {
-		pud_t *pud, *pud_tramp;
-		unsigned long vaddr = (unsigned long)__va(paddr);
+	p4d = p4d_offset(pgd, vaddr);
+	pud = pud_offset(p4d, vaddr);
 
-		pud_tramp = pud_page_tramp + pud_index(paddr);
-		pud = pud_page + pud_index(vaddr);
-		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
+	pud_tramp = pud_page_tramp + pud_index(paddr);
+	*pud_tramp = *pud;
 
-		*pud_tramp = *pud;
-	}
-
-	set_pgd(&trampoline_pgd_entry,
-		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
-}
-
-static void __meminit init_trampoline_p4d(void)
-{
-	unsigned long paddr, paddr_next;
-	pgd_t *pgd;
-	p4d_t *p4d_page, *p4d_page_tramp;
-	int i;
-
-	p4d_page_tramp = alloc_low_page();
-
-	paddr = 0;
-	pgd = pgd_offset_k((unsigned long)__va(paddr));
-	p4d_page = (p4d_t *) pgd_page_vaddr(*pgd);
-
-	for (i = p4d_index(paddr); i < PTRS_PER_P4D; i++, paddr = paddr_next) {
-		p4d_t *p4d, *p4d_tramp;
-		unsigned long vaddr = (unsigned long)__va(paddr);
+	if (pgtable_l5_enabled()) {
+		p4d_page_tramp = alloc_low_page();
 
 		p4d_tramp = p4d_page_tramp + p4d_index(paddr);
-		p4d = p4d_page + p4d_index(vaddr);
-		paddr_next = (paddr & P4D_MASK) + P4D_SIZE;
 
-		*p4d_tramp = *p4d;
-	}
+		set_p4d(p4d_tramp,
+			__p4d(_KERNPG_TABLE | __pa(pud_page_tramp)));
 
-	set_pgd(&trampoline_pgd_entry,
-		__pgd(_KERNPG_TABLE | __pa(p4d_page_tramp)));
+		set_pgd(&trampoline_pgd_entry,
+			__pgd(_KERNPG_TABLE | __pa(p4d_page_tramp)));
+	} else {
+		set_pgd(&trampoline_pgd_entry,
+			__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
+	}
 }
 
 /*
- * Create PGD aligned trampoline table to allow real mode initialization
- * of additional CPUs. Consume only 1 low memory page.
+ * Real mode trampoline only occupies a small area under low 1 MB
+ * (please check codes in reserve_real_mode() for details). For
+ * APs' booting up, we just borrow as few page tables as possible
+ * from the direct physical mapping to build 1:1 mapping to cover
+ * that area. In case KASLR disabled, the 1st PGD entry of the
+ * direct mapping is copied directly. If KASLR is enabled, only
+ * copy the 1st PUD entry where physical address 0 resides since
+ * the granularity of randomization is PUD size in 4-level, and
+ * P4D size in 5-level.
+ *
+ * This consumes one low memory page in 4-level case, and extra one
+ * in 5-level.
  */
 void __meminit init_trampoline(void)
 {
-
 	if (!kaslr_memory_enabled()) {
 		init_trampoline_default();
 		return;
 	}
 
-	if (pgtable_l5_enabled())
-		init_trampoline_p4d();
-	else
-		init_trampoline_pud();
+	init_trampoline_pud();
 }
-- 
2.17.2



* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-03-01 14:45           ` Baoquan He
@ 2019-03-04  8:15             ` Kirill A. Shutemov
  2019-03-04  8:52               ` Baoquan He
  0 siblings, 1 reply; 13+ messages in thread
From: Kirill A. Shutemov @ 2019-03-04  8:15 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On Fri, Mar 01, 2019 at 10:45:02PM +0800, Baoquan He wrote:
> Hi Kirill,
> 
> I updated the patch per your comments, but kept the 'paddr' variable
> and the '0' initialization. My thought is that there are two kinds of
> mapping being handled, so keeping these names from the old code makes
> it more understandable. What do you think?
> 
> Kind               Physical address            Virtual address
> ----------------------------------------------------------------
> Direct mapping     paddr                vaddr = paddr+PAGE_OFFSET
> 1:1 mapping        paddr                    paddr

I agree with your logic and the patch looks good to me. But maybe you
should explain this in the comments too?

Anyway:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov


* Re: [PATCH v2 0/2] x86/mm/KASLR: Change the granularity of randomization to PUD size in 5-level
  2019-03-04  8:15             ` Kirill A. Shutemov
@ 2019-03-04  8:52               ` Baoquan He
  0 siblings, 0 replies; 13+ messages in thread
From: Baoquan He @ 2019-03-04  8:52 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-kernel, kirill.shutemov, dave.hansen, luto, peterz, tglx,
	mingo, bp, hpa, x86, keescook, thgarnie

On 03/04/19 at 11:15am, Kirill A. Shutemov wrote:
> On Fri, Mar 01, 2019 at 10:45:02PM +0800, Baoquan He wrote:
> > Hi Kirill,
> > 
> > I updated the patch per your comments, but kept the 'paddr' variable
> > and the '0' initialization. My thought is that there are two kinds of
> > mapping being handled, so keeping these names from the old code makes
> > it more understandable. What do you think?
> > 
> > Kind               Physical address            Virtual address
> > ----------------------------------------------------------------
> > Direct mapping     paddr                vaddr = paddr+PAGE_OFFSET
> > 1:1 mapping        paddr                    paddr
> 
> I agree with your logic and the patch looks good to me. But maybe you
> should explain this in the comments too?

OK, I will add this to the code comment and send a v3 patchset
wrapping up all these changes according to your suggestions. Thanks a
lot.

Thanks
Baoquan


