linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/mm: increase pgt_buf size for 5-level page tables
@ 2020-12-15 20:56 Lorenzo Stoakes
  2020-12-16 10:56 ` Kirill A. Shutemov
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Lorenzo Stoakes @ 2020-12-15 20:56 UTC (permalink / raw)
  To: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H . Peter Anvin
  Cc: linux-kernel, Lorenzo Stoakes

pgt_buf is used to allocate page tables during the initial direct page
mapping, bootstrapping us into being able to allocate these before the
direct mapping makes further pages available.

INIT_PGD_PAGE_COUNT is set to 6 pages (doubled for KASLR) - 3 (PUD, PMD,
PTE) for the 1 MiB ISA mapping and 3 more for the first direct mapping
assignment, in each case providing 2 MiB of address space.

This has not been updated for 5-level page tables, which additionally have
a P4D page table level above PUD.

In most instances this will not have a material impact as the first 4 page
levels allocated for the ISA mapping will provide sufficient address space
to encompass all further address mappings. If the first direct mapping is
within 512 GiB of the ISA mapping we need only add a PMD and PTE in the
instance where we are using 4 KiB page tables (e.g. CONFIG_DEBUG_PAGEALLOC
is enabled) and only a PMD if we can use 2 MiB pages (the first allocation
is limited to PMD_SIZE so we can't use a GiB page there).

However, if we have more than 512 GiB of RAM and are allocating at 4 KiB
page size we require 3 further page tables, and if we have more than 256
TiB of RAM at 4 KiB or 2 MiB page size we require a further 4 or 3 page
tables respectively (P4D, PUD and PMD, plus a PTE in the 4 KiB case).

This patch updates INIT_PGD_PAGE_COUNT to reflect this.

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
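For reference, the 512 GiB and 256 TiB thresholds above follow from the
x86-64 paging geometry (4 KiB pages, 512 entries per table). The sketch
below only spells out that arithmetic; TABLE_SPAN and the LVL_* names are
illustrative and are not kernel identifiers.

```c
#include <stdint.h>

/* x86-64 paging geometry: 4 KiB pages, 9 index bits (512 entries) per table. */
#define PAGE_SHIFT      12
#define BITS_PER_LEVEL  9

/* Hypothetical level names for this sketch, lowest level first. */
enum { LVL_PTE, LVL_PMD, LVL_PUD, LVL_P4D };

/* Bytes of address space covered by ONE whole table at the given level. */
#define TABLE_SPAN(lvl) \
	(UINT64_C(1) << (PAGE_SHIFT + BITS_PER_LEVEL * ((lvl) + 1)))

/* One PTE table maps 2 MiB, which is why 3 tables "provide 2 MiB". */
_Static_assert(TABLE_SPAN(LVL_PTE) == (UINT64_C(2) << 20), "2 MiB");
/* One PMD table maps 1 GiB. */
_Static_assert(TABLE_SPAN(LVL_PMD) == (UINT64_C(1) << 30), "1 GiB");
/* One PUD table maps 512 GiB: beyond that a second PUD table is needed. */
_Static_assert(TABLE_SPAN(LVL_PUD) == (UINT64_C(512) << 30), "512 GiB");
/* One P4D table maps 256 TiB: beyond that a second P4D table is needed. */
_Static_assert(TABLE_SPAN(LVL_P4D) == (UINT64_C(256) << 40), "256 TiB");
```

Compiling this as C11 is the check: the static assertions fail the build
if any of the spans is wrong.
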
 arch/x86/mm/init.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index e26f5c5c6565..0ee7dc9a5a65 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -157,16 +157,25 @@ __ref void *alloc_low_pages(unsigned int num)
 }
 
 /*
- * By default need 3 4k for initial PMD_SIZE,  3 4k for 0-ISA_END_ADDRESS.
- * With KASLR memory randomization, depending on the machine e820 memory
- * and the PUD alignment. We may need twice more pages when KASLR memory
+ * By default we need to be able to allocate page tables below PGD firstly for
+ * the 0-ISA_END_ADDRESS range and secondly for the initial PMD_SIZE mapping.
+ * With KASLR memory randomization, depending on the machine e820 memory and the
+ * PUD alignment, we may need twice that many pages when KASLR memory
  * randomization is enabled.
  */
+
+#ifndef CONFIG_X86_5LEVEL
+#define INIT_PGD_PAGE_TABLES    3
+#else
+#define INIT_PGD_PAGE_TABLES    4
+#endif
+
 #ifndef CONFIG_RANDOMIZE_MEMORY
-#define INIT_PGD_PAGE_COUNT      6
+#define INIT_PGD_PAGE_COUNT      (2 * INIT_PGD_PAGE_TABLES)
 #else
-#define INIT_PGD_PAGE_COUNT      12
+#define INIT_PGD_PAGE_COUNT      (4 * INIT_PGD_PAGE_TABLES)
 #endif
+
 #define INIT_PGT_BUF_SIZE	(INIT_PGD_PAGE_COUNT * PAGE_SIZE)
 RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
 void  __init early_alloc_pgt_buf(void)
-- 
2.29.2



* Re: [PATCH] x86/mm: increase pgt_buf size for 5-level page tables
  2020-12-15 20:56 [PATCH] x86/mm: increase pgt_buf size for 5-level page tables Lorenzo Stoakes
@ 2020-12-16 10:56 ` Kirill A. Shutemov
  2020-12-16 18:04 ` Dave Hansen
  2021-01-04 18:29 ` [tip: x86/mm] x86/mm: Increase " tip-bot2 for Lorenzo Stoakes
  2 siblings, 0 replies; 5+ messages in thread
From: Kirill A. Shutemov @ 2020-12-16 10:56 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H . Peter Anvin, linux-kernel

On Tue, Dec 15, 2020 at 08:56:41PM +0000, Lorenzo Stoakes wrote:
> pgt_buf is used to allocate page tables on initial direct page mapping
> bootstrapping us into being able to allocate these before the direct
> mapping makes further pages available.
> 
> INIT_PGD_PAGE_COUNT is set to 6 pages (doubled for KASLR) - 3 (PUD, PMD,
> PTE) for the 1 MiB ISA mapping and 3 more for the first direct mapping
> assignment in each case providing 2 MiB of address space.
> 
> This has not been updated for 5-level page tables which additionally has a
> P4D page table level above PUD.
> 
> In most instances this will not have a material impact as the first 4 page
> levels allocated for the ISA mapping will provide sufficient address space
> to encompass all further address mappings. If the first direct mapping is
> within 512 GiB of the ISA mapping we need only add a PMD and PTE in the
> instance where we are using 4 KiB page tables (e.g. CONFIG_DEBUG_PAGEALLOC
> is enabled) and only a PMD if we can use 2 MiB pages (the first allocation
> is limited to PMD_SIZE so we can't use a GiB page there).
> 
> However, if we have more than 512 GiB of RAM and are allocating at 4 KiB
> page size we require 3 further page tables, and if we have more than 256
> TiB of RAM at 4 KiB or 2 MiB page size we require a further 4 or 3 page
> tables respectively (P4D, PUD and PMD, plus a PTE in the 4 KiB case).
> 
> This patch updates INIT_PGD_PAGE_COUNT to reflect this.
> 
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov


* Re: [PATCH] x86/mm: increase pgt_buf size for 5-level page tables
  2020-12-15 20:56 [PATCH] x86/mm: increase pgt_buf size for 5-level page tables Lorenzo Stoakes
  2020-12-16 10:56 ` Kirill A. Shutemov
@ 2020-12-16 18:04 ` Dave Hansen
  2021-01-04 14:52   ` Lorenzo Stoakes
  2021-01-04 18:29 ` [tip: x86/mm] x86/mm: Increase " tip-bot2 for Lorenzo Stoakes
  2 siblings, 1 reply; 5+ messages in thread
From: Dave Hansen @ 2020-12-16 18:04 UTC (permalink / raw)
  To: Lorenzo Stoakes, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H . Peter Anvin
  Cc: linux-kernel

On 12/15/20 12:56 PM, Lorenzo Stoakes wrote:
> +#ifndef CONFIG_X86_5LEVEL
> +#define INIT_PGD_PAGE_TABLES    3
> +#else
> +#define INIT_PGD_PAGE_TABLES    4
> +#endif
> +
>  #ifndef CONFIG_RANDOMIZE_MEMORY
> -#define INIT_PGD_PAGE_COUNT      6
> +#define INIT_PGD_PAGE_COUNT      (2 * INIT_PGD_PAGE_TABLES)
>  #else
> -#define INIT_PGD_PAGE_COUNT      12
> +#define INIT_PGD_PAGE_COUNT      (4 * INIT_PGD_PAGE_TABLES)
>  #endif
> +

Lorenzo, thanks for the patch.  That was a very nice changelog, and it
all seems sane to me, especially with Kirill's ack.

Acked-by: Dave Hansen <dave.hansen@intel.com>
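
As an illustration of what the quoted macros evaluate to in each
configuration, here is a minimal sketch. The MODEL_* macros are
hypothetical parameterized stand-ins for the #ifdef'd kernel macros, and a
4 KiB PAGE_SIZE is assumed.

```c
/* Assumption: 4 KiB base pages, as on x86-64. */
#define MODEL_PAGE_SIZE 4096UL

/* Stand-in for INIT_PGD_PAGE_TABLES: tables needed below PGD per mapping. */
#define MODEL_PAGE_TABLES(five_level)   ((five_level) ? 4 : 3)

/*
 * Stand-in for INIT_PGD_PAGE_COUNT: two mappings (ISA range and the
 * initial PMD_SIZE mapping), doubled when memory randomization is on.
 */
#define MODEL_PAGE_COUNT(five_level, kaslr) \
	(((kaslr) ? 4 : 2) * MODEL_PAGE_TABLES(five_level))

/* Stand-in for INIT_PGT_BUF_SIZE: bytes reserved in brk for pgt_buf. */
#define MODEL_BUF_SIZE(five_level, kaslr) \
	(MODEL_PAGE_COUNT(five_level, kaslr) * MODEL_PAGE_SIZE)

/* 4-level: unchanged from the old hard-coded 6 and 12 pages. */
_Static_assert(MODEL_PAGE_COUNT(0, 0) == 6, "4-level, no KASLR");
_Static_assert(MODEL_PAGE_COUNT(0, 1) == 12, "4-level, KASLR");
/* 5-level: one extra table (P4D) per mapping. */
_Static_assert(MODEL_PAGE_COUNT(1, 0) == 8, "5-level, no KASLR");
_Static_assert(MODEL_PAGE_COUNT(1, 1) == 16, "5-level, KASLR");
/* Largest reservation: 16 pages of 4 KiB. */
_Static_assert(MODEL_BUF_SIZE(1, 1) == 65536UL, "64 KiB");
```

Under these assumptions the 4-level values stay at 6 and 12 pages, and
only CONFIG_X86_5LEVEL builds grow, to 8 and 16 pages.
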


* Re: [PATCH] x86/mm: increase pgt_buf size for 5-level page tables
  2020-12-16 18:04 ` Dave Hansen
@ 2021-01-04 14:52   ` Lorenzo Stoakes
  0 siblings, 0 replies; 5+ messages in thread
From: Lorenzo Stoakes @ 2021-01-04 14:52 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H . Peter Anvin,
	Linux Kernel Mailing List

Happy NY all, thanks for the review! I haven't contributed an
x86-specific mm patch before so am not sure of the process - usually I
submit patches to the mm mailing list and Andrew gathers them up into
his tree, is there anything else I need to do with this?

Thanks, Lorenzo


* [tip: x86/mm] x86/mm: Increase pgt_buf size for 5-level page tables
  2020-12-15 20:56 [PATCH] x86/mm: increase pgt_buf size for 5-level page tables Lorenzo Stoakes
  2020-12-16 10:56 ` Kirill A. Shutemov
  2020-12-16 18:04 ` Dave Hansen
@ 2021-01-04 18:29 ` tip-bot2 for Lorenzo Stoakes
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Lorenzo Stoakes @ 2021-01-04 18:29 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Lorenzo Stoakes, Borislav Petkov, Kirill A. Shutemov,
	Dave Hansen, x86, linux-kernel

The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     167dcfc08b0b1f964ea95d410aa496fd78adf475
Gitweb:        https://git.kernel.org/tip/167dcfc08b0b1f964ea95d410aa496fd78adf475
Author:        Lorenzo Stoakes <lstoakes@gmail.com>
AuthorDate:    Tue, 15 Dec 2020 20:56:41 
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Mon, 04 Jan 2021 18:07:50 +01:00

x86/mm: Increase pgt_buf size for 5-level page tables

pgt_buf is used to allocate page tables on initial direct page mapping
which bootstraps the kernel into being able to allocate these before the
direct mapping makes further pages available.

INIT_PGD_PAGE_COUNT is set to 6 pages (doubled for KASLR) - 3 (PUD, PMD,
PTE) for the 1 MiB ISA mapping and 3 more for the first direct mapping
assignment in each case providing 2 MiB of address space.

This has not been updated for 5-level page tables which have an
additional P4D page table level above PUD.

In most instances, this will not have a material impact as the first
4 page levels allocated for the ISA mapping will provide sufficient
address space to encompass all further address mappings.

If the first direct mapping is within 512 GiB of the ISA mapping, only
a PMD and PTE need to be added when the kernel is using 4 KiB page
tables (e.g. CONFIG_DEBUG_PAGEALLOC is enabled) and only a PMD if the
kernel can use 2 MiB pages (the first allocation is limited to PMD_SIZE
so a GiB page cannot be used there).

However, if the machine has more than 512 GiB of RAM and the kernel is
allocating 4 KiB page size, 3 further page tables are required.

If the machine has more than 256 TiB of RAM at 4 KiB or 2 MiB page size,
a further 4 or 3 page tables are required respectively.

Update INIT_PGD_PAGE_COUNT to reflect this.

 [ bp: Sanitize text into passive voice without ambiguous personal pronouns. ]

Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Link: https://lkml.kernel.org/r/20201215205641.34096-1-lstoakes@gmail.com
---
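The case analysis in the changelog can be modelled roughly as below. This
is a sketch only, not code from arch/x86/mm/init.c: extra_tables() is a
hypothetical helper, the ISA mapping is assumed to sit at offset 0 with
one table already allocated per level, addr is the offset of the first
direct mapping from it (assumed 2 MiB aligned), and KASLR alignment
effects are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Address bits above which a different table of each level is selected. */
#define PTE_TABLE_SHIFT 21	/* one PTE table spans 2 MiB   */
#define PMD_TABLE_SHIFT 30	/* one PMD table spans 1 GiB   */
#define PUD_TABLE_SHIFT 39	/* one PUD table spans 512 GiB */
#define P4D_TABLE_SHIFT 48	/* one P4D table spans 256 TiB */

/*
 * Count the page tables that must be newly allocated to map a PMD_SIZE
 * region at offset addr, given the tables already set up for offset 0.
 * A table is shared with the ISA mapping only if addr falls inside the
 * span that the existing table covers.
 */
unsigned int extra_tables(uint64_t addr, bool five_level, bool use_4k)
{
	unsigned int n = 0;

	/* Model assumption: the mapping starts on a 2 MiB boundary. */
	assert((addr & ((UINT64_C(1) << PTE_TABLE_SHIFT) - 1)) == 0);

	if (five_level && (addr >> P4D_TABLE_SHIFT) != 0)
		n++;		/* new P4D table */
	if ((addr >> PUD_TABLE_SHIFT) != 0)
		n++;		/* new PUD table */
	if ((addr >> PMD_TABLE_SHIFT) != 0)
		n++;		/* new PMD table */
	if (use_4k && (addr >> PTE_TABLE_SHIFT) != 0)
		n++;		/* new PTE table, only for 4 KiB pages */

	return n;
}
```

For example, extra_tables(UINT64_C(1) << 40, false, true) models a mapping
1 TiB away with 4 KiB pages and yields 3 (PUD, PMD, PTE), while
extra_tables(UINT64_C(1) << 49, true, true) models one 512 TiB away under
5-level paging and yields 4, the worst case the new INIT_PGD_PAGE_TABLES
value of 4 provides for.
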
 arch/x86/mm/init.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index e26f5c5..dd694fb 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -157,16 +157,25 @@ __ref void *alloc_low_pages(unsigned int num)
 }
 
 /*
- * By default need 3 4k for initial PMD_SIZE,  3 4k for 0-ISA_END_ADDRESS.
- * With KASLR memory randomization, depending on the machine e820 memory
- * and the PUD alignment. We may need twice more pages when KASLR memory
+ * By default need to be able to allocate page tables below PGD firstly for
+ * the 0-ISA_END_ADDRESS range and secondly for the initial PMD_SIZE mapping.
+ * With KASLR memory randomization, depending on the machine e820 memory and the
+ * PUD alignment, twice that many pages may be needed when KASLR memory
  * randomization is enabled.
  */
+
+#ifndef CONFIG_X86_5LEVEL
+#define INIT_PGD_PAGE_TABLES    3
+#else
+#define INIT_PGD_PAGE_TABLES    4
+#endif
+
 #ifndef CONFIG_RANDOMIZE_MEMORY
-#define INIT_PGD_PAGE_COUNT      6
+#define INIT_PGD_PAGE_COUNT      (2 * INIT_PGD_PAGE_TABLES)
 #else
-#define INIT_PGD_PAGE_COUNT      12
+#define INIT_PGD_PAGE_COUNT      (4 * INIT_PGD_PAGE_TABLES)
 #endif
+
 #define INIT_PGT_BUF_SIZE	(INIT_PGD_PAGE_COUNT * PAGE_SIZE)
 RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
 void  __init early_alloc_pgt_buf(void)

