linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] x86: 5-level paging enabling for v4.12, Part 4
@ 2017-04-06 14:00 Kirill A. Shutemov
  2017-04-06 14:00 ` [PATCH 1/8] x86/boot/64: Rewrite startup_64 in C Kirill A. Shutemov
                   ` (7 more replies)
  0 siblings, 8 replies; 51+ messages in thread
From: Kirill A. Shutemov @ 2017-04-06 14:00 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, x86, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin
  Cc: Andi Kleen, Dave Hansen, Andy Lutomirski, linux-arch, linux-mm,
	linux-kernel, Kirill A. Shutemov

Here's the fourth and the last bunch of of patches that brings initial
5-level paging enabling.

Please review and consider applying.

As Ingo requested I've tried to rewrite assembly parts of boot process
into C before bringing 5-level paging support. The only part where I
succeed is startup_64 in arch/x86/kernel/head_64.S. Most of the logic is
now in C.

I failed to rewrite startup_32 in arch/x86/boot/compressed/head_64.S in C.
The code I need to modify in still in 32-bit mode, but if I would move it
to C it will be compiled as 64-bit. I've tried to move it into separate
translation unit and compile it with -m32, but then linking phase fails
due to type mismatch of object files.

I also have trouble with rewriting secondary_startup_64. Stack breaks as
soon as we switch to new page tables when onlining secondary CPUs. I don't
know how to get around this.

I hope it's not show-stopper.

If you know how to get around these issues, let me know.

Kirill A. Shutemov (8):
  x86/boot/64: Rewrite startup_64 in C
  x86/boot/64: Rename init_level4_pgt and early_level4_pgt
  x86/boot/64: Add support of additional page table level during early
    boot
  x86/mm: Add sync_global_pgds() for configuration with 5-level paging
  x86/mm: Make kernel_physical_mapping_init() support 5-level paging
  x86/mm: Add support for 5-level paging for KASLR
  x86: Enable 5-level paging support
  x86/mm: Allow to have userspace mappings above 47-bits

 arch/x86/Kconfig                            |   5 +
 arch/x86/boot/compressed/head_64.S          |  23 ++++-
 arch/x86/include/asm/elf.h                  |   2 +-
 arch/x86/include/asm/mpx.h                  |   9 ++
 arch/x86/include/asm/pgtable.h              |   2 +-
 arch/x86/include/asm/pgtable_64.h           |   6 +-
 arch/x86/include/asm/processor.h            |   9 +-
 arch/x86/include/uapi/asm/processor-flags.h |   2 +
 arch/x86/kernel/espfix_64.c                 |   2 +-
 arch/x86/kernel/head64.c                    | 137 +++++++++++++++++++++++++---
 arch/x86/kernel/head_64.S                   | 132 ++++++---------------------
 arch/x86/kernel/machine_kexec_64.c          |   2 +-
 arch/x86/kernel/sys_x86_64.c                |  28 +++++-
 arch/x86/mm/dump_pagetables.c               |   2 +-
 arch/x86/mm/hugetlbpage.c                   |  27 +++++-
 arch/x86/mm/init_64.c                       | 104 +++++++++++++++++++--
 arch/x86/mm/kasan_init_64.c                 |  12 +--
 arch/x86/mm/kaslr.c                         |  81 ++++++++++++----
 arch/x86/mm/mmap.c                          |   2 +-
 arch/x86/mm/mpx.c                           |  33 ++++++-
 arch/x86/realmode/init.c                    |   2 +-
 arch/x86/xen/Kconfig                        |   1 +
 arch/x86/xen/mmu.c                          |  18 ++--
 arch/x86/xen/xen-pvh.S                      |   2 +-
 24 files changed, 463 insertions(+), 180 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 51+ messages in thread
* [PATCHv7 07/14] x86/boot/64: Rewrite startup_64 in C
@ 2017-06-06 11:31 Kirill A. Shutemov
  2017-06-13 10:07 ` [tip:x86/mm] x86/boot/64: Rewrite startup_64() " tip-bot for Kirill A. Shutemov
  0 siblings, 1 reply; 51+ messages in thread
From: Kirill A. Shutemov @ 2017-06-06 11:31 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton, x86, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin
  Cc: Andi Kleen, Dave Hansen, Andy Lutomirski, linux-arch, linux-mm,
	linux-kernel, Kirill A. Shutemov

The patch write most of startup_64 logic in C.

This is preparation for 5-level paging enabling.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/kernel/head64.c  | 85 +++++++++++++++++++++++++++++++++++++++++-
 arch/x86/kernel/head_64.S | 95 ++---------------------------------------------
 2 files changed, 87 insertions(+), 93 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 43b7002f44fb..b59c550b1d3a 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -35,9 +35,92 @@
  */
 extern pgd_t early_level4_pgt[PTRS_PER_PGD];
 extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
-static unsigned int __initdata next_early_pgt = 2;
+static unsigned int __initdata next_early_pgt;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
 
+static void __init *fixup_pointer(void *ptr, unsigned long physaddr)
+{
+	return ptr - (void *)_text + (void *)physaddr;
+}
+
+void __init __startup_64(unsigned long physaddr)
+{
+	unsigned long load_delta, *p;
+	pgdval_t *pgd;
+	pudval_t *pud;
+	pmdval_t *pmd, pmd_entry;
+	int i;
+
+	/* Is the address too large? */
+	if (physaddr >> MAX_PHYSMEM_BITS)
+		for (;;);
+
+	/*
+	 * Compute the delta between the address I am compiled to run at
+	 * and the address I am actually running at.
+	 */
+	load_delta = physaddr - (unsigned long)(_text - __START_KERNEL_map);
+
+	/* Is the address not 2M aligned? */
+	if (load_delta & ~PMD_PAGE_MASK)
+		for (;;);
+
+	/* Fixup the physical addresses in the page table */
+
+	pgd = fixup_pointer(&early_level4_pgt, physaddr);
+	pgd[pgd_index(__START_KERNEL_map)] += load_delta;
+
+	pud = fixup_pointer(&level3_kernel_pgt, physaddr);
+	pud[510] += load_delta;
+	pud[511] += load_delta;
+
+	pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
+	pmd[506] += load_delta;
+
+	/*
+	 * Set up the identity mapping for the switchover.  These
+	 * entries should *NOT* have the global bit set!  This also
+	 * creates a bunch of nonsense entries but that is fine --
+	 * it avoids problems around wraparound.
+	 */
+
+	pud = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);
+	pmd = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);
+
+	i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
+	pgd[i + 0] = (pgdval_t)pud + _KERNPG_TABLE;
+	pgd[i + 1] = (pgdval_t)pud + _KERNPG_TABLE;
+
+	i = (physaddr >> PUD_SHIFT) % PTRS_PER_PUD;
+	pud[i + 0] = (pudval_t)pmd + _KERNPG_TABLE;
+	pud[i + 1] = (pudval_t)pmd + _KERNPG_TABLE;
+
+	pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL;
+	pmd_entry +=  physaddr;
+
+	for (i = 0; i < DIV_ROUND_UP(_end - _text, PMD_SIZE); i++) {
+		int idx = i + (physaddr >> PMD_SHIFT) % PTRS_PER_PMD;
+		pmd[idx] = pmd_entry + i * PMD_SIZE;
+	}
+
+	/*
+	 * Fixup the kernel text+data virtual addresses. Note that
+	 * we might write invalid pmds, when the kernel is relocated
+	 * cleanup_highmap() fixes this up along with the mappings
+	 * beyond _end.
+	 */
+
+	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
+	for (i = 0; i < PTRS_PER_PMD; i++) {
+		if (pmd[i] & _PAGE_PRESENT)
+			pmd[i] += load_delta;
+	}
+
+	/* Fixup phys_base */
+	p = fixup_pointer(&phys_base, physaddr);
+	*p += load_delta;
+}
+
 /* Wipe all early page tables except for the kernel symbol map */
 static void __init reset_early_page_tables(void)
 {
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index ac9d327d2e42..1432d530fa35 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -72,100 +72,11 @@ startup_64:
 	/* Sanitize CPU configuration */
 	call verify_cpu
 
-	/*
-	 * Compute the delta between the address I am compiled to run at and the
-	 * address I am actually running at.
-	 */
-	leaq	_text(%rip), %rbp
-	subq	$_text - __START_KERNEL_map, %rbp
-
-	/* Is the address not 2M aligned? */
-	testl	$~PMD_PAGE_MASK, %ebp
-	jnz	bad_address
-
-	/*
-	 * Is the address too large?
-	 */
-	leaq	_text(%rip), %rax
-	shrq	$MAX_PHYSMEM_BITS, %rax
-	jnz	bad_address
-
-	/*
-	 * Fixup the physical addresses in the page table
-	 */
-	addq	%rbp, early_level4_pgt + (L4_START_KERNEL*8)(%rip)
-
-	addq	%rbp, level3_kernel_pgt + (510*8)(%rip)
-	addq	%rbp, level3_kernel_pgt + (511*8)(%rip)
-
-	addq	%rbp, level2_fixmap_pgt + (506*8)(%rip)
-
-	/*
-	 * Set up the identity mapping for the switchover.  These
-	 * entries should *NOT* have the global bit set!  This also
-	 * creates a bunch of nonsense entries but that is fine --
-	 * it avoids problems around wraparound.
-	 */
 	leaq	_text(%rip), %rdi
-	leaq	early_level4_pgt(%rip), %rbx
-
-	movq	%rdi, %rax
-	shrq	$PGDIR_SHIFT, %rax
-
-	leaq	(PAGE_SIZE + _KERNPG_TABLE)(%rbx), %rdx
-	movq	%rdx, 0(%rbx,%rax,8)
-	movq	%rdx, 8(%rbx,%rax,8)
-
-	addq	$PAGE_SIZE, %rdx
-	movq	%rdi, %rax
-	shrq	$PUD_SHIFT, %rax
-	andl	$(PTRS_PER_PUD-1), %eax
-	movq	%rdx, PAGE_SIZE(%rbx,%rax,8)
-	incl	%eax
-	andl	$(PTRS_PER_PUD-1), %eax
-	movq	%rdx, PAGE_SIZE(%rbx,%rax,8)
-
-	addq	$PAGE_SIZE * 2, %rbx
-	movq	%rdi, %rax
-	shrq	$PMD_SHIFT, %rdi
-	addq	$(__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL), %rax
-	leaq	(_end - 1)(%rip), %rcx
-	shrq	$PMD_SHIFT, %rcx
-	subq	%rdi, %rcx
-	incl	%ecx
+	pushq	%rsi
+	call	__startup_64
+	popq	%rsi
 
-1:
-	andq	$(PTRS_PER_PMD - 1), %rdi
-	movq	%rax, (%rbx,%rdi,8)
-	incq	%rdi
-	addq	$PMD_SIZE, %rax
-	decl	%ecx
-	jnz	1b
-
-	test %rbp, %rbp
-	jz .Lskip_fixup
-
-	/*
-	 * Fixup the kernel text+data virtual addresses. Note that
-	 * we might write invalid pmds, when the kernel is relocated
-	 * cleanup_highmap() fixes this up along with the mappings
-	 * beyond _end.
-	 */
-	leaq	level2_kernel_pgt(%rip), %rdi
-	leaq	PAGE_SIZE(%rdi), %r8
-	/* See if it is a valid page table entry */
-1:	testb	$_PAGE_PRESENT, 0(%rdi)
-	jz	2f
-	addq	%rbp, 0(%rdi)
-	/* Go to the next page */
-2:	addq	$8, %rdi
-	cmp	%r8, %rdi
-	jne	1b
-
-	/* Fixup phys_base */
-	addq	%rbp, phys_base(%rip)
-
-.Lskip_fixup:
 	movq	$(early_level4_pgt - __START_KERNEL_map), %rax
 	jmp 1f
 ENTRY(secondary_startup_64)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2017-06-13 10:14 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-06 14:00 [PATCH 0/8] x86: 5-level paging enabling for v4.12, Part 4 Kirill A. Shutemov
2017-04-06 14:00 ` [PATCH 1/8] x86/boot/64: Rewrite startup_64 in C Kirill A. Shutemov
2017-04-11  7:58   ` [tip:x86/mm] x86/boot/64: Rewrite startup_64() " tip-bot for Kirill A. Shutemov
2017-04-11  8:54     ` Ingo Molnar
2017-04-11 12:29       ` Kirill A. Shutemov
2017-04-06 14:01 ` [PATCH 2/8] x86/boot/64: Rename init_level4_pgt and early_level4_pgt Kirill A. Shutemov
2017-04-11  7:59   ` [tip:x86/mm] x86/boot/64: Rename init_level4_pgt() and early_level4_pgt[] tip-bot for Kirill A. Shutemov
2017-04-06 14:01 ` [PATCH 3/8] x86/boot/64: Add support of additional page table level during early boot Kirill A. Shutemov
2017-04-11  7:02   ` Ingo Molnar
2017-04-11 10:51     ` Kirill A. Shutemov
2017-04-11 11:28       ` Ingo Molnar
2017-04-11 11:46         ` Kirill A. Shutemov
2017-04-11 14:09           ` Andi Kleen
2017-04-12 10:18             ` Kirill A. Shutemov
2017-04-17 10:32               ` Ingo Molnar
2017-04-18  8:59                 ` Kirill A. Shutemov
2017-04-18 10:15                   ` Kirill A. Shutemov
2017-04-18 11:10                     ` Kirill A. Shutemov
2017-04-06 14:01 ` [PATCH 4/8] x86/mm: Add sync_global_pgds() for configuration with 5-level paging Kirill A. Shutemov
2017-04-06 14:01 ` [PATCH 5/8] x86/mm: Make kernel_physical_mapping_init() support " Kirill A. Shutemov
2017-04-06 14:01 ` [PATCH 6/8] x86/mm: Add support for 5-level paging for KASLR Kirill A. Shutemov
2017-04-06 14:01 ` [PATCH 7/8] x86: Enable 5-level paging support Kirill A. Shutemov
2017-04-06 14:52   ` Juergen Gross
2017-04-06 15:24     ` Kirill A. Shutemov
2017-04-06 15:56       ` Juergen Gross
2017-04-06 14:01 ` [PATCH 8/8] x86/mm: Allow to have userspace mappings above 47-bits Kirill A. Shutemov
2017-04-06 18:43   ` Dmitry Safonov
2017-04-06 19:15     ` Dmitry Safonov
2017-04-06 23:21       ` Kirill A. Shutemov
2017-04-06 23:24         ` [PATCHv2 " Kirill A. Shutemov
2017-04-07 11:32           ` Dmitry Safonov
2017-04-07 15:44             ` [PATCHv3 " Kirill A. Shutemov
2017-04-07 16:37               ` Dmitry Safonov
2017-04-13 11:30             ` [PATCHv4 0/9] x86: 5-level paging enabling for v4.12, Part 4 Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 1/9] x86/asm: Fix comment in return_from_SYSCALL_64 Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 2/9] x86/boot/64: Rewrite startup_64 in C Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 3/9] x86/boot/64: Rename init_level4_pgt and early_level4_pgt Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 4/9] x86/boot/64: Add support of additional page table level during early boot Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 5/9] x86/mm: Add sync_global_pgds() for configuration with 5-level paging Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 6/9] x86/mm: Make kernel_physical_mapping_init() support " Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 7/9] x86/mm: Add support for 5-level paging for KASLR Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 8/9] x86: Enable 5-level paging support Kirill A. Shutemov
2017-04-13 11:30               ` [PATCHv4 9/9] x86/mm: Allow to have userspace mappings above 47-bits Kirill A. Shutemov
2017-04-07 10:06         ` [PATCH 8/8] " Dmitry Safonov
2017-04-07 13:35   ` Anshuman Khandual
2017-04-07 15:59     ` Kirill A. Shutemov
2017-04-07 16:09       ` hpa
2017-04-07 16:20         ` Kirill A. Shutemov
2017-04-12 10:41       ` Michael Ellerman
2017-04-12 11:11         ` Kirill A. Shutemov
2017-06-06 11:31 [PATCHv7 07/14] x86/boot/64: Rewrite startup_64 in C Kirill A. Shutemov
2017-06-13 10:07 ` [tip:x86/mm] x86/boot/64: Rewrite startup_64() " tip-bot for Kirill A. Shutemov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).