* [PATCH v5] x86/mm/KASLR: Fix the size of vmemmap section
@ 2019-05-23  2:57 Baoquan He
       [not found] ` <20190529131454.9818321019@mail.kernel.org>
  2019-06-07 21:16 ` [tip:x86/urgent] x86/mm/KASLR: Compute the size of the vmemmap section properly tip-bot for Baoquan He
  0 siblings, 2 replies; 5+ messages in thread
From: Baoquan He @ 2019-05-23  2:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, tglx, mingo, bp, hpa, kirill.shutemov, keescook, Baoquan He, stable

kernel_randomize_memory() hardcodes the size of vmemmap section as 1 TB,
to support the maximum amount of system RAM in 4-level paging mode, 64 TB.

However, 1 TB is not enough for vmemmap in 5-level paging mode. Assuming
the size of struct page is 64 bytes, supporting 4 PB of system RAM in
5-level mode requires 64 TB of vmemmap area. The hardcoded size may cause
vmemmap to overrun into the following cpu_entry_area section if KASLR
places vmemmap very close to cpu_entry_area and the actual vmemmap area
is much bigger than 1 TB.

So calculate the actual size of the vmemmap region, then align it up to a
1 TB boundary. In 4-level mode it is always 1 TB. In 5-level mode it is
adjusted on demand. The current code reserves 0.5 PB for vmemmap in 5-level
mode. With this change, the space left over can be given to randomization
to increase the entropy.

Fixes: eedb92abb9bb ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y")
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Kirill A. Shutemov <kirill@linux.intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
---
v4->v5:
  Add Fixes tag and Cc to stable.
v3->v4:
  Fix the incorrect style of code comment;
  Add ack tags from Kirill and Kees.
v3 discussion is here:
  http://lkml.kernel.org/r/20190422091045.GB3584@localhost.localdomain
 arch/x86/mm/kaslr.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index dc3f058bdf9b..c0eedb85a92f 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -52,7 +52,7 @@ static __initdata struct kaslr_memory_region {
 } kaslr_regions[] = {
 	{ &page_offset_base, 0 },
 	{ &vmalloc_base, 0 },
-	{ &vmemmap_base, 1 },
+	{ &vmemmap_base, 0 },
 };
 
 /* Get size in bytes used by the memory region */
@@ -78,6 +78,7 @@ void __init kernel_randomize_memory(void)
 	unsigned long rand, memory_tb;
 	struct rnd_state rand_state;
 	unsigned long remain_entropy;
+	unsigned long vmemmap_size;
 
 	vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
 	vaddr = vaddr_start;
@@ -109,6 +110,14 @@ void __init kernel_randomize_memory(void)
 	if (memory_tb < kaslr_regions[0].size_tb)
 		kaslr_regions[0].size_tb = memory_tb;
 
+	/*
+	 * Calculate how many TB vmemmap region needs, and aligned to
+	 * 1TB boundary.
+	 */
+	vmemmap_size = (kaslr_regions[0].size_tb << (TB_SHIFT - PAGE_SHIFT)) *
+		sizeof(struct page);
+	kaslr_regions[2].size_tb = DIV_ROUND_UP(vmemmap_size, 1UL << TB_SHIFT);
+
 	/* Calculate entropy available between regions */
 	remain_entropy = vaddr_end - vaddr_start;
 	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++)
-- 
2.17.2
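
The calculation this patch adds can be checked outside the kernel. Below is a minimal standalone sketch, not the kernel code itself. It assumes 4 KiB pages (a PAGE_SHIFT of 12) and a 64-byte struct page, as the changelog does. The helper name vmemmap_size_tb() and the STRUCT_PAGE_SIZE macro are made up for illustration; TB_SHIFT is 40, as in arch/x86/mm/kaslr.c.

```c
#include <assert.h>
#include <stdint.h>

#define TB_SHIFT         40      /* 1 TB == 1 << 40 bytes, as in kaslr.c */
#define PAGE_SHIFT       12      /* assumption: 4 KiB pages */
#define STRUCT_PAGE_SIZE 64ULL   /* assumption: sizeof(struct page) == 64 */

/*
 * Hypothetical helper mirroring the patch: number of TB of vmemmap
 * needed to describe memory_tb TB of direct-mapped RAM, rounded up
 * to a whole TB.
 */
static uint64_t vmemmap_size_tb(uint64_t memory_tb)
{
    /* Guard against overflow of the 64-bit intermediate below. */
    assert(memory_tb < (1ULL << 30));

    /* Pages in the direct mapping, then bytes of struct page for them. */
    uint64_t pages = memory_tb << (TB_SHIFT - PAGE_SHIFT);
    uint64_t bytes = pages * STRUCT_PAGE_SIZE;

    /* Same effect as DIV_ROUND_UP(bytes, 1UL << TB_SHIFT). */
    uint64_t tb = 1ULL << TB_SHIFT;
    return (bytes + tb - 1) / tb;
}
```

Under these assumptions, vmemmap_size_tb(64) gives 1 (the 4-level case) and vmemmap_size_tb(4096) gives 64 (4 PB of RAM, counted in binary units, in 5-level mode).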



* Re: [PATCH v5] x86/mm/KASLR: Fix the size of vmemmap section
       [not found] ` <20190529131454.9818321019@mail.kernel.org>
@ 2019-05-30  0:46   ` Baoquan He
  0 siblings, 0 replies; 5+ messages in thread
From: Baoquan He @ 2019-05-30  0:46 UTC (permalink / raw)
  To: Sasha Levin; +Cc: linux-kernel, x86, tglx, mingo, bp, stable

Hi,

On 05/29/19 at 01:14pm, Sasha Levin wrote:
> Hi,
> 
> [This is an automated email]
> 
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
> 
> The bot has tested the following trees: v5.1.4, v5.0.18, v4.19.45, v4.14.121, v4.9.178, v4.4.180, v3.18.140.

I marked the commit below with the 'Fixes' tag:
Fixes: eedb92abb9bb ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y")

[bhe@ linux]$ git describe --contains eedb92abb9bb
v4.17-rc1~171^2~51

You can see that it was added in kernel 4.17-rc1, as above. Can we just
apply this patch to stable trees after 4.17?

> 
> v5.1.4: Build OK!
> v5.0.18: Build OK!
> v4.19.45: Build OK!

Can we apply it only to the three trees above? They are all after 4.17,
and the build for them is OK.

Thanks
Baoquan

> v4.14.121: Failed to apply! Possible dependencies:
>     4c2b4058ab325 ("x86/mm: Initialize 'pgtable_l5_enabled' at boot-time")
>     4fa5662b6b496 ("x86/mm: Initialize 'page_offset_base' at boot-time")
>     5c7919bb1994f ("x86/mm: Make LDT_BASE_ADDR dynamic")
>     a7412546d8cb5 ("x86/mm: Adjust vmalloc base and size at boot-time")
>     b16e770bfa534 ("x86/mm: Initialize 'pgdir_shift' and 'ptrs_per_p4d' at boot-time")
>     c65e774fb3f6a ("x86/mm: Make PGDIR_SHIFT and PTRS_PER_P4D variable")
>     e626e6bb0dfac ("x86/mm: Introduce 'pgtable_l5_enabled'")
>     eedb92abb9bb0 ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y")
> 
> v4.9.178: Failed to apply! Possible dependencies:
>     4c7c44837be77 ("x86/mm: Define virtual memory map for 5-level paging")
>     5c7919bb1994f ("x86/mm: Make LDT_BASE_ADDR dynamic")
>     69218e47994da ("x86: Remap GDT tables in the fixmap section")
>     92a0f81d89571 ("x86/cpu_entry_area: Move it out of the fixmap")
>     a7412546d8cb5 ("x86/mm: Adjust vmalloc base and size at boot-time")
>     aaeed3aeb39c1 ("x86/entry/gdt: Put per-CPU GDT remaps in ascending order")
>     b23adb7d3f7d1 ("x86/xen/gdt: Use X86_FEATURE_XENPV instead of globals for the GDT fixup")
>     b7ffc44d5b2ea ("x86/kvm/vmx: Defer TR reload after VM exit")
>     b9b1a9c363ff7 ("x86/boot/smp/32: Fix initial idle stack location on 32-bit kernels")
>     ed1bbc40a0d10 ("x86/cpu_entry_area: Move it to a separate unit")
>     ef8813ab28050 ("x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area")
> 
> v4.4.180: Failed to apply! Possible dependencies:
>     021182e52fe01 ("x86/mm: Enable KASLR for physical mapping memory regions")
>     0483e1fa6e09d ("x86/mm: Implement ASLR for kernel memory regions")
>     071a74930e60d ("x86/KASLR: Add virtual address choosing function")
>     206f25a8319b3 ("x86/KASLR: Remove unneeded boot_params argument")
>     2bc1cd39fa9f6 ("x86/boot: Clean up pointer casting")
>     3a94707d7a7bb ("x86/KASLR: Build identity mappings on demand")
>     4252db10559fc ("x86/KASLR: Update description for decompressor worst case size")
>     4c7c44837be77 ("x86/mm: Define virtual memory map for 5-level paging")
>     5c7919bb1994f ("x86/mm: Make LDT_BASE_ADDR dynamic")
>     6655e0aaf768c ("x86/boot: Rename "real_mode" to "boot_params"")
>     7de828dfe6070 ("x86/KASLR: Clarify purpose of kaslr.c")
>     8665e6ff21072 ("x86/boot: Clean up indenting for asm/boot.h")
>     9016875df408f ("x86/KASLR: Rename "random" to "random_addr"")
>     92a0f81d89571 ("x86/cpu_entry_area: Move it out of the fixmap")
>     9b238748cb6e9 ("x86/KASLR: Rename aslr.c to kaslr.c")
>     9dc1969c24eff ("x86/KASLR: Consolidate mem_avoid[] entries")
>     a7412546d8cb5 ("x86/mm: Adjust vmalloc base and size at boot-time")
>     d2d3462f9f08d ("x86/KASLR: Clarify purpose of each get_random_long()")
>     d899a7d146a2e ("x86/mm: Refactor KASLR entropy functions")
>     ed09acde44e30 ("x86/KASLR: Improve comments around the mem_avoid[] logic")
> 
> v3.18.140: Failed to apply! Possible dependencies:
>     021182e52fe01 ("x86/mm: Enable KASLR for physical mapping memory regions")
>     0b24becc810dc ("kasan: add kernel address sanitizer infrastructure")
>     2aa79af642631 ("locking/qspinlock: Revert to test-and-set on hypervisors")
>     3a94707d7a7bb ("x86/KASLR: Build identity mappings on demand")
>     4c7c44837be77 ("x86/mm: Define virtual memory map for 5-level paging")
>     4ea1636b04dbd ("x86/asm/tsc: Rename native_read_tsc() to rdtsc()")
>     5c7919bb1994f ("x86/mm: Make LDT_BASE_ADDR dynamic")
>     87be28aaf1458 ("x86/asm/tsc: Replace rdtscll() with native_read_tsc()")
>     9261e050b686c ("x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() paravirt hooks")
>     92a0f81d89571 ("x86/cpu_entry_area: Move it out of the fixmap")
>     9b238748cb6e9 ("x86/KASLR: Rename aslr.c to kaslr.c")
>     a33fda35e3a76 ("locking/qspinlock: Introduce a simple generic 4-byte queued spinlock")
>     a7412546d8cb5 ("x86/mm: Adjust vmalloc base and size at boot-time")
>     c6e5ca35c4685 ("x86/asm/tsc: Inline native_read_tsc() and remove __native_read_tsc()")
>     cf991de2f614f ("x86/asm/msr: Make wrmsrl_safe() a function")
>     d6f2d75a7ae06 ("x86/kasan: Move KASAN_SHADOW_OFFSET to the arch Kconfig")
>     d73a33973f16a ("locking/qspinlock, x86: Enable x86-64 to use queued spinlocks")
>     d84b6728c54dc ("locking/mcs: Better differentiate between MCS variants")
>     ef7f0d6a6ca8c ("x86_64: add KASan support")
>     f233f7f1581e7 ("locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching")
> 
> 
> How should we proceed with this patch?
> 
> --
> Thanks,
> Sasha


* [tip:x86/urgent] x86/mm/KASLR: Compute the size of the vmemmap section properly
  2019-05-23  2:57 [PATCH v5] x86/mm/KASLR: Fix the size of vmemmap section Baoquan He
       [not found] ` <20190529131454.9818321019@mail.kernel.org>
@ 2019-06-07 21:16 ` tip-bot for Baoquan He
  2019-06-08  2:14   ` Baoquan He
  1 sibling, 1 reply; 5+ messages in thread
From: tip-bot for Baoquan He @ 2019-06-07 21:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: keescook, kirill, dave.hansen, bp, luto, hpa, x86, linux-kernel,
	peterz, tglx, bhe, mingo, stable

Commit-ID:  00e5a2bbcc31d5fea853f8daeba0f06c1c88c3ff
Gitweb:     https://git.kernel.org/tip/00e5a2bbcc31d5fea853f8daeba0f06c1c88c3ff
Author:     Baoquan He <bhe@redhat.com>
AuthorDate: Thu, 23 May 2019 10:57:44 +0800
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 7 Jun 2019 23:12:13 +0200

x86/mm/KASLR: Compute the size of the vmemmap section properly

The size of the vmemmap section is hardcoded to 1 TB to support the
maximum amount of system RAM in 4-level paging mode - 64 TB.

However, 1 TB is not enough for vmemmap in 5-level paging mode. Assuming
the size of struct page is 64 Bytes, to support 4 PB system RAM in 5-level,
64 TB of vmemmap area is needed:

  4 * 1000^5 PB / 4096 bytes page size * 64 bytes per page struct / 1000^4 TB = 62.5 TB.

This hardcoding may cause vmemmap to corrupt the following
cpu_entry_area section, if KASLR puts vmemmap very close to it and the
actual vmemmap size is bigger than 1 TB.

So calculate the actual size of the vmemmap region needed and then align
it up to 1 TB boundary.

In 4-level paging mode it is always 1 TB. In 5-level it's adjusted on
demand. The current code reserves 0.5 PB for vmemmap on 5-level. With
this change, the space can be saved and thus used to increase entropy
for the randomization.

 [ bp: Spell out how the 64 TB needed for vmemmap is computed and massage commit
   message. ]

Fixes: eedb92abb9bb ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y")
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Kirill A. Shutemov <kirill@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: kirill.shutemov@linux.intel.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190523025744.3756-1-bhe@redhat.com
---
 arch/x86/mm/kaslr.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index dc3f058bdf9b..dc6182eecefa 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -52,7 +52,7 @@ static __initdata struct kaslr_memory_region {
 } kaslr_regions[] = {
 	{ &page_offset_base, 0 },
 	{ &vmalloc_base, 0 },
-	{ &vmemmap_base, 1 },
+	{ &vmemmap_base, 0 },
 };
 
 /* Get size in bytes used by the memory region */
@@ -78,6 +78,7 @@ void __init kernel_randomize_memory(void)
 	unsigned long rand, memory_tb;
 	struct rnd_state rand_state;
 	unsigned long remain_entropy;
+	unsigned long vmemmap_size;
 
 	vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
 	vaddr = vaddr_start;
@@ -109,6 +110,14 @@ void __init kernel_randomize_memory(void)
 	if (memory_tb < kaslr_regions[0].size_tb)
 		kaslr_regions[0].size_tb = memory_tb;
 
+	/*
+	 * Calculate the vmemmap region size in TBs, aligned to a TB
+	 * boundary.
+	 */
+	vmemmap_size = (kaslr_regions[0].size_tb << (TB_SHIFT - PAGE_SHIFT)) *
+			sizeof(struct page);
+	kaslr_regions[2].size_tb = DIV_ROUND_UP(vmemmap_size, 1UL << TB_SHIFT);
+
 	/* Calculate entropy available between regions */
 	remain_entropy = vaddr_end - vaddr_start;
 	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++)


* Re: [tip:x86/urgent] x86/mm/KASLR: Compute the size of the vmemmap section properly
  2019-06-07 21:16 ` [tip:x86/urgent] x86/mm/KASLR: Compute the size of the vmemmap section properly tip-bot for Baoquan He
@ 2019-06-08  2:14   ` Baoquan He
  2019-06-08  7:49     ` Borislav Petkov
  0 siblings, 1 reply; 5+ messages in thread
From: Baoquan He @ 2019-06-08  2:14 UTC (permalink / raw)
  To: tglx, stable, mingo, x86, linux-kernel, peterz, keescook, bp,
	luto, hpa, kirill, dave.hansen
  Cc: linux-tip-commits

On 06/07/19 at 02:16pm, tip-bot for Baoquan He wrote:
> Commit-ID:  00e5a2bbcc31d5fea853f8daeba0f06c1c88c3ff
> Gitweb:     https://git.kernel.org/tip/00e5a2bbcc31d5fea853f8daeba0f06c1c88c3ff
> Author:     Baoquan He <bhe@redhat.com>
> AuthorDate: Thu, 23 May 2019 10:57:44 +0800
> Committer:  Borislav Petkov <bp@suse.de>
> CommitDate: Fri, 7 Jun 2019 23:12:13 +0200
> 
> x86/mm/KASLR: Compute the size of the vmemmap section properly
> 
> The size of the vmemmap section is hardcoded to 1 TB to support the
> maximum amount of system RAM in 4-level paging mode - 64 TB.
> 
> However, 1 TB is not enough for vmemmap in 5-level paging mode. Assuming
> the size of struct page is 64 Bytes, to support 4 PB system RAM in 5-level,
> 64 TB of vmemmap area is needed:
> 
>   4 * 1000^5 PB / 4096 bytes page size * 64 bytes per page struct / 1000^4 TB = 62.5 TB.

Thanks for picking this, Boris.

Here, 4PB = 4*2^50 = 4*1024^5, the vmemmap should be 64 TB, am I right?

> 
> This hardcoding may cause vmemmap to corrupt the following
> cpu_entry_area section, if KASLR puts vmemmap very close to it and the
> actual vmemmap size is bigger than 1 TB.


* Re: [tip:x86/urgent] x86/mm/KASLR: Compute the size of the vmemmap section properly
  2019-06-08  2:14   ` Baoquan He
@ 2019-06-08  7:49     ` Borislav Petkov
  0 siblings, 0 replies; 5+ messages in thread
From: Borislav Petkov @ 2019-06-08  7:49 UTC (permalink / raw)
  To: Baoquan He
  Cc: tglx, stable, mingo, x86, linux-kernel, peterz, keescook, luto,
	hpa, kirill, dave.hansen, linux-tip-commits

On Sat, Jun 08, 2019 at 10:14:04AM +0800, Baoquan He wrote:
> Here, 4PB = 4*2^50 = 4*1024^5, the vmemmap should be 64 TB, am I right?

PB is petabytes: 1 PB is 1000^5 bytes.

1024^5 bytes is 1 PiB, or pebibyte.

https://en.wikipedia.org/wiki/Petabyte
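
The two figures in this exchange come from the same formula applied to different definitions of 4 PB. A small sketch, assuming 4096-byte pages and a 64-byte struct page as in the changelog; vmemmap_bytes() is a hypothetical helper, not a kernel function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes of struct page metadata needed to cover
 * ram_bytes of RAM. Assumes the product fits in 64 bits.
 */
static uint64_t vmemmap_bytes(uint64_t ram_bytes, uint64_t page_size,
                              uint64_t struct_page_size)
{
    assert(page_size > 0);

    /* One struct page per page of RAM. */
    return ram_bytes / page_size * struct_page_size;
}
```

Called with 4 * 1000^5 bytes of RAM (4 PB), it returns 62,500,000,000,000 bytes, which is 62.5 decimal TB. Called with 4 * 1024^5 bytes (4 PiB), it returns 2^46 bytes, which is 64 TiB. The kernel code works in power-of-two units (TB_SHIFT is a shift by 40), so the 64 figure corresponds to the binary reading.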

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
-- 

