* [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK
From: Mike Rapoport @ 2020-04-14 15:34 UTC
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

These patches convert several architectures to use page table folding and
remove __ARCH_HAS_5LEVEL_HACK along with include/asm-generic/5level-fixup.h
and include/asm-generic/pgtable-nop4d-hack.h. With that we'll have a single
and consistent way of dealing with page table folding instead of a mix of
three existing options.

The changes are mostly about mechanical replacement of pgd accessors with
p4d ones and the addition of higher levels to page table traversals.
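
For the most part the conversion amounts to inserting one extra lookup
between the pgd and pud levels. As an illustrative sketch (not taken from
any particular patch in the series), a walk that used to do

	pgd = pgd_offset(mm, addr);
	if (pgd_none_or_clear_bad(pgd))
		return 0;
	pud = pud_offset(pgd, addr);

now becomes

	pgd = pgd_offset(mm, addr);
	if (pgd_none_or_clear_bad(pgd))
		return 0;
	p4d = p4d_offset(pgd, addr);
	if (p4d_none_or_clear_bad(p4d))
		return 0;
	pud = pud_offset(p4d, addr);

On architectures that fold the p4d level, p4d_offset() simply returns its
pgd argument cast to p4d_t *, so the extra step costs nothing at runtime.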

v4:
* rebase on top of v5.7-rc1
* split arm and arm64 changes as there is no KVM host support on 32-bit
  arm anymore
* update powerpc patches to reflect recent changes in powerpc page table
  handling

v3:
* add Christophe's patch that removes ppc32 get_pteptr()
* reduce amount of upper layer walks in powerpc

v2:
* collect per-arch patches into a single set
* include Geert's patch modernizing the printing of kernel messages on 'sh'
* rebase on v5.6-rc1+

Geert Uytterhoeven (1):
  sh: fault: Modernize printing of kernel messages

Mike Rapoport (13):
  h8300: remove usage of __ARCH_USE_5LEVEL_HACK
  arm: add support for folded p4d page tables
  arm64: add support for folded p4d page tables
  hexagon: remove __ARCH_USE_5LEVEL_HACK
  ia64: add support for folded p4d page tables
  nios2: add support for folded p4d page tables
  openrisc: add support for folded p4d page tables
  powerpc: add support for folded p4d page tables
  sh: drop __pXd_offset() macros that duplicate pXd_index() ones
  sh: add support for folded p4d page tables
  unicore32: remove __ARCH_USE_5LEVEL_HACK
  asm-generic: remove pgtable-nop4d-hack.h
  mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h

 arch/arm/include/asm/pgtable.h                |   1 -
 arch/arm/lib/uaccess_with_memcpy.c            |   7 +-
 arch/arm/mach-sa1100/assabet.c                |   2 +-
 arch/arm/mm/dump.c                            |  29 ++-
 arch/arm/mm/fault-armv.c                      |   7 +-
 arch/arm/mm/fault.c                           |  22 +-
 arch/arm/mm/idmap.c                           |   3 +-
 arch/arm/mm/init.c                            |   2 +-
 arch/arm/mm/ioremap.c                         |  12 +-
 arch/arm/mm/mm.h                              |   2 +-
 arch/arm/mm/mmu.c                             |  35 ++-
 arch/arm/mm/pgd.c                             |  40 +++-
 arch/arm64/include/asm/kvm_mmu.h              |  10 +-
 arch/arm64/include/asm/pgalloc.h              |  10 +-
 arch/arm64/include/asm/pgtable-types.h        |   5 +-
 arch/arm64/include/asm/pgtable.h              |  37 ++--
 arch/arm64/include/asm/stage2_pgtable.h       |  48 +++-
 arch/arm64/kernel/hibernate.c                 |  44 +++-
 arch/arm64/mm/fault.c                         |   9 +-
 arch/arm64/mm/hugetlbpage.c                   |  15 +-
 arch/arm64/mm/kasan_init.c                    |  26 ++-
 arch/arm64/mm/mmu.c                           |  52 +++--
 arch/arm64/mm/pageattr.c                      |   7 +-
 arch/h8300/include/asm/pgtable.h              |   1 -
 arch/hexagon/include/asm/fixmap.h             |   4 +-
 arch/hexagon/include/asm/pgtable.h            |   1 -
 arch/ia64/include/asm/pgalloc.h               |   4 +-
 arch/ia64/include/asm/pgtable.h               |  17 +-
 arch/ia64/mm/fault.c                          |   7 +-
 arch/ia64/mm/hugetlbpage.c                    |  18 +-
 arch/ia64/mm/init.c                           |  28 ++-
 arch/nios2/include/asm/pgtable.h              |   3 +-
 arch/nios2/mm/fault.c                         |   9 +-
 arch/nios2/mm/ioremap.c                       |   6 +-
 arch/openrisc/include/asm/pgtable.h           |   1 -
 arch/openrisc/mm/fault.c                      |  10 +-
 arch/openrisc/mm/init.c                       |   4 +-
 arch/powerpc/include/asm/book3s/32/pgtable.h  |   1 -
 arch/powerpc/include/asm/book3s/64/hash.h     |   4 +-
 arch/powerpc/include/asm/book3s/64/pgalloc.h  |   4 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  60 ++---
 arch/powerpc/include/asm/book3s/64/radix.h    |   6 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h  |   1 -
 arch/powerpc/include/asm/nohash/64/pgalloc.h  |   2 +-
 .../include/asm/nohash/64/pgtable-4k.h        |  32 +--
 arch/powerpc/include/asm/nohash/64/pgtable.h  |   6 +-
 arch/powerpc/include/asm/pgtable.h            |  10 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c        |  32 +--
 arch/powerpc/lib/code-patching.c              |   7 +-
 arch/powerpc/mm/book3s64/hash_pgtable.c       |   4 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c      |  26 ++-
 arch/powerpc/mm/book3s64/subpage_prot.c       |   6 +-
 arch/powerpc/mm/hugetlbpage.c                 |  28 ++-
 arch/powerpc/mm/nohash/book3e_pgtable.c       |  15 +-
 arch/powerpc/mm/pgtable.c                     |  30 ++-
 arch/powerpc/mm/pgtable_64.c                  |  10 +-
 arch/powerpc/mm/ptdump/hashpagetable.c        |  20 +-
 arch/powerpc/mm/ptdump/ptdump.c               |  14 +-
 arch/powerpc/xmon/xmon.c                      |  18 +-
 arch/sh/include/asm/pgtable-2level.h          |   1 -
 arch/sh/include/asm/pgtable-3level.h          |   1 -
 arch/sh/include/asm/pgtable_32.h              |   5 +-
 arch/sh/include/asm/pgtable_64.h              |   5 +-
 arch/sh/kernel/io_trapped.c                   |   7 +-
 arch/sh/mm/cache-sh4.c                        |   4 +-
 arch/sh/mm/cache-sh5.c                        |   7 +-
 arch/sh/mm/fault.c                            |  65 ++++--
 arch/sh/mm/hugetlbpage.c                      |  28 ++-
 arch/sh/mm/init.c                             |  15 +-
 arch/sh/mm/kmap.c                             |   2 +-
 arch/sh/mm/tlbex_32.c                         |   6 +-
 arch/sh/mm/tlbex_64.c                         |   7 +-
 arch/unicore32/include/asm/pgtable.h          |   1 -
 arch/unicore32/kernel/hibernate.c             |   4 +-
 include/asm-generic/5level-fixup.h            |  58 -----
 include/asm-generic/pgtable-nop4d-hack.h      |  64 ------
 include/asm-generic/pgtable-nopud.h           |   4 -
 include/linux/mm.h                            |   6 -
 mm/kasan/init.c                               |  11 -
 mm/memory.c                                   |   8 -
 virt/kvm/arm/mmu.c                            | 209 +++++++++++++++---
 81 files changed, 872 insertions(+), 520 deletions(-)
 delete mode 100644 include/asm-generic/5level-fixup.h
 delete mode 100644 include/asm-generic/pgtable-nop4d-hack.h

-- 
2.25.1

* [PATCH v4 01/14] h8300: remove usage of __ARCH_USE_5LEVEL_HACK
From: Mike Rapoport @ 2020-04-14 15:34 UTC
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

h8300 is a nommu architecture and does not require fixups for the upper
layers of the page tables because those are already handled by the generic
nommu implementation; without the hack, the <asm-generic/pgtable-nopud.h>
include that remains folds the p4d and pud levels through the generic
headers.

Remove the definition of __ARCH_USE_5LEVEL_HACK from
arch/h8300/include/asm/pgtable.h.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/h8300/include/asm/pgtable.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/h8300/include/asm/pgtable.h b/arch/h8300/include/asm/pgtable.h
index 4d00152fab58..f00828720dc4 100644
--- a/arch/h8300/include/asm/pgtable.h
+++ b/arch/h8300/include/asm/pgtable.h
@@ -1,7 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _H8300_PGTABLE_H
 #define _H8300_PGTABLE_H
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #include <asm-generic/pgtable.h>
 extern void paging_init(void);
-- 
2.25.1

* [PATCH v4 02/14] arm: add support for folded p4d page tables
From: Mike Rapoport @ 2020-04-14 15:34 UTC
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Implement the primitives necessary for folding of the p4d level, add walks
of the p4d level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
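
All the new p4d-level walkers added below follow the same shape. A minimal
sketch of the pattern (illustrative only; walk_pud() stands in for
whatever the next-level helper is in a given walker):

	static void walk_p4d(pgd_t *pgd, unsigned long addr, unsigned long end)
	{
		p4d_t *p4d = p4d_offset(pgd, addr);
		unsigned long next;

		do {
			/* clamp to the end of this p4d entry */
			next = p4d_addr_end(addr, end);
			if (!p4d_none(*p4d))
				walk_pud(p4d, addr, next);
		} while (p4d++, addr = next, addr != end);
	}

Since arm folds the p4d level, PTRS_PER_P4D is 1 and p4d_addr_end()
returns 'end', so the loop body runs exactly once per pgd entry.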

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm/include/asm/pgtable.h     |  1 -
 arch/arm/lib/uaccess_with_memcpy.c |  7 +++++-
 arch/arm/mach-sa1100/assabet.c     |  2 +-
 arch/arm/mm/dump.c                 | 29 +++++++++++++++++-----
 arch/arm/mm/fault-armv.c           |  7 +++++-
 arch/arm/mm/fault.c                | 22 ++++++++++------
 arch/arm/mm/idmap.c                |  3 ++-
 arch/arm/mm/init.c                 |  2 +-
 arch/arm/mm/ioremap.c              | 12 ++++++---
 arch/arm/mm/mm.h                   |  2 +-
 arch/arm/mm/mmu.c                  | 35 +++++++++++++++++++++-----
 arch/arm/mm/pgd.c                  | 40 ++++++++++++++++++++++++------
 12 files changed, 125 insertions(+), 37 deletions(-)

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index befc8fcec98f..fba20607c53c 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -17,7 +17,6 @@
 
 #else
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
diff --git a/arch/arm/lib/uaccess_with_memcpy.c b/arch/arm/lib/uaccess_with_memcpy.c
index c9450982a155..d72b14c96670 100644
--- a/arch/arm/lib/uaccess_with_memcpy.c
+++ b/arch/arm/lib/uaccess_with_memcpy.c
@@ -24,6 +24,7 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
 {
 	unsigned long addr = (unsigned long)_addr;
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pmd_t *pmd;
 	pte_t *pte;
 	pud_t *pud;
@@ -33,7 +34,11 @@ pin_page_for_write(const void __user *_addr, pte_t **ptep, spinlock_t **ptlp)
 	if (unlikely(pgd_none(*pgd) || pgd_bad(*pgd)))
 		return 0;
 
-	pud = pud_offset(pgd, addr);
+	p4d = p4d_offset(pgd, addr);
+	if (unlikely(p4d_none(*p4d) || p4d_bad(*p4d)))
+		return 0;
+
+	pud = pud_offset(p4d, addr);
 	if (unlikely(pud_none(*pud) || pud_bad(*pud)))
 		return 0;
 
diff --git a/arch/arm/mach-sa1100/assabet.c b/arch/arm/mach-sa1100/assabet.c
index d96a101e5504..0631a7b02678 100644
--- a/arch/arm/mach-sa1100/assabet.c
+++ b/arch/arm/mach-sa1100/assabet.c
@@ -633,7 +633,7 @@ static void __init map_sa1100_gpio_regs( void )
 	int prot = PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_DOMAIN(DOMAIN_IO);
 	pmd_t *pmd;
 
-	pmd = pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
+	pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset_k(virt), virt), virt), virt);
 	*pmd = __pmd(phys | prot);
 	flush_pmd_entry(pmd);
 }
diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index 7d6291f23251..677549d6854c 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -207,6 +207,7 @@ struct pg_level {
 static struct pg_level pg_level[] = {
 	{
 	}, { /* pgd */
+	}, { /* p4d */
 	}, { /* pud */
 	}, { /* pmd */
 		.bits	= section_bits,
@@ -308,7 +309,7 @@ static void walk_pte(struct pg_state *st, pmd_t *pmd, unsigned long start,
 
 	for (i = 0; i < PTRS_PER_PTE; i++, pte++) {
 		addr = start + i * PAGE_SIZE;
-		note_page(st, addr, 4, pte_val(*pte), domain);
+		note_page(st, addr, 5, pte_val(*pte), domain);
 	}
 }
 
@@ -350,14 +351,14 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
 			addr += SECTION_SIZE;
 			pmd++;
 			domain = get_domain_name(pmd);
-			note_page(st, addr, 3, pmd_val(*pmd), domain);
+			note_page(st, addr, 4, pmd_val(*pmd), domain);
 		}
 	}
 }
 
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
 {
-	pud_t *pud = pud_offset(pgd, 0);
+	pud_t *pud = pud_offset(p4d, 0);
 	unsigned long addr;
 	unsigned i;
 
@@ -366,7 +367,23 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
 		if (!pud_none(*pud)) {
 			walk_pmd(st, pud, addr);
 		} else {
-			note_page(st, addr, 2, pud_val(*pud), NULL);
+			note_page(st, addr, 3, pud_val(*pud), NULL);
+		}
+	}
+}
+
+static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
+{
+	p4d_t *p4d = p4d_offset(pgd, 0);
+	unsigned long addr;
+	unsigned i;
+
+	for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
+		addr = start + i * P4D_SIZE;
+		if (!p4d_none(*p4d)) {
+			walk_pud(st, p4d, addr);
+		} else {
+			note_page(st, addr, 2, p4d_val(*p4d), NULL);
 		}
 	}
 }
@@ -381,7 +398,7 @@ static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
 	for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
 		addr = start + i * PGDIR_SIZE;
 		if (!pgd_none(*pgd)) {
-			walk_pud(st, pgd, addr);
+			walk_p4d(st, pgd, addr);
 		} else {
 			note_page(st, addr, 1, pgd_val(*pgd), NULL);
 		}
diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c
index ae857f41f68d..489aaafa6ebd 100644
--- a/arch/arm/mm/fault-armv.c
+++ b/arch/arm/mm/fault-armv.c
@@ -91,6 +91,7 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
 {
 	spinlock_t *ptl;
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
@@ -100,7 +101,11 @@ static int adjust_pte(struct vm_area_struct *vma, unsigned long address,
 	if (pgd_none_or_clear_bad(pgd))
 		return 0;
 
-	pud = pud_offset(pgd, address);
+	p4d = p4d_offset(pgd, address);
+	if (p4d_none_or_clear_bad(p4d))
+		return 0;
+
+	pud = pud_offset(p4d, address);
 	if (pud_none_or_clear_bad(pud))
 		return 0;
 
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 2dd5c41cbb8d..ff230e9affc4 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -43,19 +43,21 @@ void show_pte(const char *lvl, struct mm_struct *mm, unsigned long addr)
 	printk("%s[%08lx] *pgd=%08llx", lvl, addr, (long long)pgd_val(*pgd));
 
 	do {
+		p4d_t *p4d;
 		pud_t *pud;
 		pmd_t *pmd;
 		pte_t *pte;
 
-		if (pgd_none(*pgd))
+		p4d = p4d_offset(pgd, addr);
+		if (p4d_none(*p4d))
 			break;
 
-		if (pgd_bad(*pgd)) {
+		if (p4d_bad(*p4d)) {
 			pr_cont("(bad)");
 			break;
 		}
 
-		pud = pud_offset(pgd, addr);
+		pud = pud_offset(p4d, addr);
 		if (PTRS_PER_PUD != 1)
 			pr_cont(", *pud=%08llx", (long long)pud_val(*pud));
 
@@ -405,6 +407,7 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
 {
 	unsigned int index;
 	pgd_t *pgd, *pgd_k;
+	p4d_t *p4d, *p4d_k;
 	pud_t *pud, *pud_k;
 	pmd_t *pmd, *pmd_k;
 
@@ -419,13 +422,16 @@ do_translation_fault(unsigned long addr, unsigned int fsr,
 	pgd = cpu_get_pgd() + index;
 	pgd_k = init_mm.pgd + index;
 
-	if (pgd_none(*pgd_k))
+	p4d = p4d_offset(pgd, addr);
+	p4d_k = p4d_offset(pgd_k, addr);
+
+	if (p4d_none(*p4d_k))
 		goto bad_area;
-	if (!pgd_present(*pgd))
-		set_pgd(pgd, *pgd_k);
+	if (!p4d_present(*p4d))
+		set_p4d(p4d, *p4d_k);
 
-	pud = pud_offset(pgd, addr);
-	pud_k = pud_offset(pgd_k, addr);
+	pud = pud_offset(p4d, addr);
+	pud_k = pud_offset(p4d_k, addr);
 
 	if (pud_none(*pud_k))
 		goto bad_area;
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index a033f6134a64..cd54411ef1b8 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -68,7 +68,8 @@ static void idmap_add_pmd(pud_t *pud, unsigned long addr, unsigned long end,
 static void idmap_add_pud(pgd_t *pgd, unsigned long addr, unsigned long end,
 	unsigned long prot)
 {
-	pud_t *pud = pud_offset(pgd, addr);
+	p4d_t *p4d = p4d_offset(pgd, addr);
+	pud_t *pud = pud_offset(p4d, addr);
 	unsigned long next;
 
 	do {
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 054be44d1cdb..963b5284d284 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -571,7 +571,7 @@ static inline void section_update(unsigned long addr, pmdval_t mask,
 {
 	pmd_t *pmd;
 
-	pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);
+	pmd = pmd_off_k(addr);
 
 #ifdef CONFIG_ARM_LPAE
 	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index 72286f9a4d30..75529d76d28c 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -142,12 +142,14 @@ static void unmap_area_sections(unsigned long virt, unsigned long size)
 {
 	unsigned long addr = virt, end = virt + (size & ~(SZ_1M - 1));
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmdp;
 
 	flush_cache_vunmap(addr, end);
 	pgd = pgd_offset_k(addr);
-	pud = pud_offset(pgd, addr);
+	p4d = p4d_offset(pgd, addr);
+	pud = pud_offset(p4d, addr);
 	pmdp = pmd_offset(pud, addr);
 	do {
 		pmd_t pmd = *pmdp;
@@ -190,6 +192,7 @@ remap_area_sections(unsigned long virt, unsigned long pfn,
 {
 	unsigned long addr = virt, end = virt + size;
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 
@@ -200,7 +203,8 @@ remap_area_sections(unsigned long virt, unsigned long pfn,
 	unmap_area_sections(virt, size);
 
 	pgd = pgd_offset_k(addr);
-	pud = pud_offset(pgd, addr);
+	p4d = p4d_offset(pgd, addr);
+	pud = pud_offset(p4d, addr);
 	pmd = pmd_offset(pud, addr);
 	do {
 		pmd[0] = __pmd(__pfn_to_phys(pfn) | type->prot_sect);
@@ -222,6 +226,7 @@ remap_area_supersections(unsigned long virt, unsigned long pfn,
 {
 	unsigned long addr = virt, end = virt + size;
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 
@@ -232,7 +237,8 @@ remap_area_supersections(unsigned long virt, unsigned long pfn,
 	unmap_area_sections(virt, size);
 
 	pgd = pgd_offset_k(virt);
-	pud = pud_offset(pgd, addr);
+	p4d = p4d_offset(pgd, addr);
+	pud = pud_offset(p4d, addr);
 	pmd = pmd_offset(pud, addr);
 	do {
 		unsigned long super_pmd_val, i;
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index 88c121ac14b3..4f1f72b75890 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -38,7 +38,7 @@ static inline pte_t get_top_pte(unsigned long va)
 
 static inline pmd_t *pmd_off_k(unsigned long virt)
 {
-	return pmd_offset(pud_offset(pgd_offset_k(virt), virt), virt);
+	return pmd_offset(pud_offset(p4d_offset(pgd_offset_k(virt), virt), virt), virt);
 }
 
 struct mem_type {
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index ec8d0008bfa1..c425288f1a86 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -357,7 +357,8 @@ static pte_t *pte_offset_late_fixmap(pmd_t *dir, unsigned long addr)
 static inline pmd_t * __init fixmap_pmd(unsigned long addr)
 {
 	pgd_t *pgd = pgd_offset_k(addr);
-	pud_t *pud = pud_offset(pgd, addr);
+	p4d_t *p4d = p4d_offset(pgd, addr);
+	pud_t *pud = pud_offset(p4d, addr);
 	pmd_t *pmd = pmd_offset(pud, addr);
 
 	return pmd;
@@ -801,12 +802,12 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 	} while (pmd++, addr = next, addr != end);
 }
 
-static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
+static void __init alloc_init_pud(p4d_t *p4d, unsigned long addr,
 				  unsigned long end, phys_addr_t phys,
 				  const struct mem_type *type,
 				  void *(*alloc)(unsigned long sz), bool ng)
 {
-	pud_t *pud = pud_offset(pgd, addr);
+	pud_t *pud = pud_offset(p4d, addr);
 	unsigned long next;
 
 	do {
@@ -816,6 +817,21 @@ static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
 	} while (pud++, addr = next, addr != end);
 }
 
+static void __init alloc_init_p4d(pgd_t *pgd, unsigned long addr,
+				  unsigned long end, phys_addr_t phys,
+				  const struct mem_type *type,
+				  void *(*alloc)(unsigned long sz), bool ng)
+{
+	p4d_t *p4d = p4d_offset(pgd, addr);
+	unsigned long next;
+
+	do {
+		next = p4d_addr_end(addr, end);
+		alloc_init_pud(p4d, addr, next, phys, type, alloc, ng);
+		phys += next - addr;
+	} while (p4d++, addr = next, addr != end);
+}
+
 #ifndef CONFIG_ARM_LPAE
 static void __init create_36bit_mapping(struct mm_struct *mm,
 					struct map_desc *md,
@@ -863,7 +879,8 @@ static void __init create_36bit_mapping(struct mm_struct *mm,
 	pgd = pgd_offset(mm, addr);
 	end = addr + length;
 	do {
-		pud_t *pud = pud_offset(pgd, addr);
+		p4d_t *p4d = p4d_offset(pgd, addr);
+		pud_t *pud = pud_offset(p4d, addr);
 		pmd_t *pmd = pmd_offset(pud, addr);
 		int i;
 
@@ -914,7 +931,7 @@ static void __init __create_mapping(struct mm_struct *mm, struct map_desc *md,
 	do {
 		unsigned long next = pgd_addr_end(addr, end);
 
-		alloc_init_pud(pgd, addr, next, phys, type, alloc, ng);
+		alloc_init_p4d(pgd, addr, next, phys, type, alloc, ng);
 
 		phys += next - addr;
 		addr = next;
@@ -950,7 +967,13 @@ void __init create_mapping_late(struct mm_struct *mm, struct map_desc *md,
 				bool ng)
 {
 #ifdef CONFIG_ARM_LPAE
-	pud_t *pud = pud_alloc(mm, pgd_offset(mm, md->virtual), md->virtual);
+	p4d_t *p4d;
+	pud_t *pud;
+
+	p4d = p4d_alloc(mm, pgd_offset(mm, md->virtual), md->virtual);
+	if (WARN_ON(!p4d))
+		return;
+	pud = pud_alloc(mm, p4d, md->virtual);
 	if (WARN_ON(!pud))
 		return;
 	pmd_alloc(mm, pud, 0);
diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index 478bd2c6aa50..c5e1b27046a8 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -30,6 +30,7 @@
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *new_pgd, *init_pgd;
+	p4d_t *new_p4d, *init_p4d;
 	pud_t *new_pud, *init_pud;
 	pmd_t *new_pmd, *init_pmd;
 	pte_t *new_pte, *init_pte;
@@ -53,8 +54,12 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 	/*
 	 * Allocate PMD table for modules and pkmap mappings.
 	 */
-	new_pud = pud_alloc(mm, new_pgd + pgd_index(MODULES_VADDR),
+	new_p4d = p4d_alloc(mm, new_pgd + pgd_index(MODULES_VADDR),
 			    MODULES_VADDR);
+	if (!new_p4d)
+		goto no_p4d;
+
+	new_pud = pud_alloc(mm, new_p4d, MODULES_VADDR);
 	if (!new_pud)
 		goto no_pud;
 
@@ -69,7 +74,11 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 		 * contains the machine vectors. The vectors are always high
 		 * with LPAE.
 		 */
-		new_pud = pud_alloc(mm, new_pgd, 0);
+		new_p4d = p4d_alloc(mm, new_pgd, 0);
+		if (!new_p4d)
+			goto no_p4d;
+
+		new_pud = pud_alloc(mm, new_p4d, 0);
 		if (!new_pud)
 			goto no_pud;
 
@@ -91,7 +100,8 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 		pmd_val(*new_pmd) |= PMD_DOMAIN(DOMAIN_VECTORS);
 #endif
 
-		init_pud = pud_offset(init_pgd, 0);
+		init_p4d = p4d_offset(init_pgd, 0);
+		init_pud = pud_offset(init_p4d, 0);
 		init_pmd = pmd_offset(init_pud, 0);
 		init_pte = pte_offset_map(init_pmd, 0);
 		set_pte_ext(new_pte + 0, init_pte[0], 0);
@@ -108,6 +118,8 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 no_pmd:
 	pud_free(mm, new_pud);
 no_pud:
+	p4d_free(mm, new_p4d);
+no_p4d:
 	__pgd_free(new_pgd);
 no_pgd:
 	return NULL;
@@ -116,6 +128,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pgtable_t pte;
@@ -127,7 +140,11 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
 	if (pgd_none_or_clear_bad(pgd))
 		goto no_pgd;
 
-	pud = pud_offset(pgd, 0);
+	p4d = p4d_offset(pgd, 0);
+	if (p4d_none_or_clear_bad(p4d))
+		goto no_p4d;
+
+	pud = pud_offset(p4d, 0);
 	if (pud_none_or_clear_bad(pud))
 		goto no_pud;
 
@@ -144,8 +161,11 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
 	pmd_free(mm, pmd);
 	mm_dec_nr_pmds(mm);
 no_pud:
-	pgd_clear(pgd);
+	p4d_clear(p4d);
 	pud_free(mm, pud);
+no_p4d:
+	pgd_clear(pgd);
+	p4d_free(mm, p4d);
 no_pgd:
 #ifdef CONFIG_ARM_LPAE
 	/*
@@ -156,15 +176,21 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
 			continue;
 		if (pgd_val(*pgd) & L_PGD_SWAPPER)
 			continue;
-		pud = pud_offset(pgd, 0);
+		p4d = p4d_offset(pgd, 0);
+		if (p4d_none_or_clear_bad(p4d))
+			continue;
+		pud = pud_offset(p4d, 0);
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		pmd = pmd_offset(pud, 0);
 		pud_clear(pud);
 		pmd_free(mm, pmd);
 		mm_dec_nr_pmds(mm);
-		pgd_clear(pgd);
+		p4d_clear(p4d);
 		pud_free(mm, pud);
+		mm_dec_nr_puds(mm);
+		pgd_clear(pgd);
+		p4d_free(mm, p4d);
 	}
 #endif
 	__pgd_free(pgd_base);
-- 
2.25.1

* [PATCH v4 03/14] arm64: add support for folded p4d page tables
From: Mike Rapoport @ 2020-04-14 15:34 UTC
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Implement the primitives necessary for folding of the p4d level, add walks
of the p4d level where appropriate, replace 5level-fixup.h with
pgtable-nop4d.h, and remove __ARCH_USE_5LEVEL_HACK.
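
For reference, the folding that <asm-generic/pgtable-nop4d.h> provides
looks roughly like this (abridged sketch; see the header itself for the
authoritative version):

	/* a folded p4d entry is just a wrapper around the pgd entry */
	typedef struct { pgd_t pgd; } p4d_t;

	#define P4D_SHIFT	PGDIR_SHIFT
	#define PTRS_PER_P4D	1
	#define P4D_SIZE	(1UL << P4D_SHIFT)
	#define P4D_MASK	(~(P4D_SIZE - 1))

	/* the single p4d entry aliases the pgd entry it is folded into */
	static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
	{
		return (p4d_t *)pgd;
	}

With this in place the p4d accessors degenerate to their pgd counterparts
and the extra level disappears at compile time.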

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/arm64/include/asm/kvm_mmu.h        |  10 +-
 arch/arm64/include/asm/pgalloc.h        |  10 +-
 arch/arm64/include/asm/pgtable-types.h  |   5 +-
 arch/arm64/include/asm/pgtable.h        |  37 +++--
 arch/arm64/include/asm/stage2_pgtable.h |  48 ++++--
 arch/arm64/kernel/hibernate.c           |  44 ++++-
 arch/arm64/mm/fault.c                   |   9 +-
 arch/arm64/mm/hugetlbpage.c             |  15 +-
 arch/arm64/mm/kasan_init.c              |  26 ++-
 arch/arm64/mm/mmu.c                     |  52 ++++--
 arch/arm64/mm/pageattr.c                |   7 +-
 virt/kvm/arm/mmu.c                      | 209 ++++++++++++++++++++----
 12 files changed, 368 insertions(+), 104 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 30b0e8d6b895..8255fab2e441 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -172,8 +172,8 @@ void kvm_clear_hyp_idmap(void);
 	__pmd(__phys_to_pmd_val(__pa(ptep)) | PMD_TYPE_TABLE)
 #define kvm_mk_pud(pmdp)					\
 	__pud(__phys_to_pud_val(__pa(pmdp)) | PMD_TYPE_TABLE)
-#define kvm_mk_pgd(pudp)					\
-	__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
+#define kvm_mk_p4d(pmdp)					\
+	__p4d(__phys_to_p4d_val(__pa(pmdp)) | PUD_TYPE_TABLE)
 
 #define kvm_set_pud(pudp, pud)		set_pud(pudp, pud)
 
@@ -299,6 +299,12 @@ static inline bool kvm_s2pud_young(pud_t pud)
 #define hyp_pud_table_empty(pudp) kvm_page_empty(pudp)
 #endif
 
+#ifdef __PAGETABLE_P4D_FOLDED
+#define hyp_p4d_table_empty(p4dp) (0)
+#else
+#define hyp_p4d_table_empty(p4dp) kvm_page_empty(p4dp)
+#endif
+
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)	__flush_dcache_area((a), (l))
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 172d76fa0245..58e93583ddb6 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -73,17 +73,17 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pudp)
 	free_page((unsigned long)pudp);
 }
 
-static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
+static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
 {
-	set_pgd(pgdp, __pgd(__phys_to_pgd_val(pudp) | prot));
+	set_p4d(p4dp, __p4d(__phys_to_p4d_val(pudp) | prot));
 }
 
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
 {
-	__pgd_populate(pgdp, __pa(pudp), PUD_TYPE_TABLE);
+	__p4d_populate(p4dp, __pa(pudp), PUD_TYPE_TABLE);
 }
 #else
-static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
+static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
 {
 	BUILD_BUG();
 }
diff --git a/arch/arm64/include/asm/pgtable-types.h b/arch/arm64/include/asm/pgtable-types.h
index acb0751a6606..b8f158ae2527 100644
--- a/arch/arm64/include/asm/pgtable-types.h
+++ b/arch/arm64/include/asm/pgtable-types.h
@@ -14,6 +14,7 @@
 typedef u64 pteval_t;
 typedef u64 pmdval_t;
 typedef u64 pudval_t;
+typedef u64 p4dval_t;
 typedef u64 pgdval_t;
 
 /*
@@ -44,13 +45,11 @@ typedef struct { pteval_t pgprot; } pgprot_t;
 #define __pgprot(x)	((pgprot_t) { (x) } )
 
 #if CONFIG_PGTABLE_LEVELS == 2
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 #elif CONFIG_PGTABLE_LEVELS == 3
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #elif CONFIG_PGTABLE_LEVELS == 4
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
 #endif
 
 #endif	/* __ASM_PGTABLE_TYPES_H */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 538c85e62f86..c23c5a4e6dc6 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -298,6 +298,11 @@ static inline pte_t pgd_pte(pgd_t pgd)
 	return __pte(pgd_val(pgd));
 }
 
+static inline pte_t p4d_pte(p4d_t p4d)
+{
+	return __pte(p4d_val(p4d));
+}
+
 static inline pte_t pud_pte(pud_t pud)
 {
 	return __pte(pud_val(pud));
@@ -401,6 +406,9 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
 
 #define set_pmd_at(mm, addr, pmdp, pmd)	set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd))
 
+#define __p4d_to_phys(p4d)	__pte_to_phys(p4d_pte(p4d))
+#define __phys_to_p4d_val(phys)	__phys_to_pte_val(phys)
+
 #define __pgd_to_phys(pgd)	__pte_to_phys(pgd_pte(pgd))
 #define __phys_to_pgd_val(phys)	__phys_to_pte_val(phys)
 
@@ -588,49 +596,50 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
 
 #define pud_ERROR(pud)		__pud_error(__FILE__, __LINE__, pud_val(pud))
 
-#define pgd_none(pgd)		(!pgd_val(pgd))
-#define pgd_bad(pgd)		(!(pgd_val(pgd) & 2))
-#define pgd_present(pgd)	(pgd_val(pgd))
+#define p4d_none(p4d)		(!p4d_val(p4d))
+#define p4d_bad(p4d)		(!(p4d_val(p4d) & 2))
+#define p4d_present(p4d)	(p4d_val(p4d))
 
-static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
+static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
-	if (in_swapper_pgdir(pgdp)) {
-		set_swapper_pgd(pgdp, pgd);
+	if (in_swapper_pgdir(p4dp)) {
+		set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
 		return;
 	}
 
-	WRITE_ONCE(*pgdp, pgd);
+	WRITE_ONCE(*p4dp, p4d);
 	dsb(ishst);
 	isb();
 }
 
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
 {
-	set_pgd(pgdp, __pgd(0));
+	set_p4d(p4dp, __p4d(0));
 }
 
-static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
+static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
 {
-	return __pgd_to_phys(pgd);
+	return __p4d_to_phys(p4d);
 }
 
 /* Find an entry in the frst-level page table. */
 #define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
 
-#define pud_offset_phys(dir, addr)	(pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
+#define pud_offset_phys(dir, addr)	(p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
 #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
 
 #define pud_set_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
-#define pud_set_fixmap_offset(pgd, addr)	pud_set_fixmap(pud_offset_phys(pgd, addr))
+#define pud_set_fixmap_offset(p4d, addr)	pud_set_fixmap(pud_offset_phys(p4d, addr))
 #define pud_clear_fixmap()		clear_fixmap(FIX_PUD)
 
-#define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd)))
+#define p4d_page(p4d)		pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
 
 /* use ONLY for statically allocated translation tables */
 #define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
 
 #else
 
+#define p4d_page_paddr(p4d)	({ BUILD_BUG(); 0;})
 #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
 
 /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
diff --git a/arch/arm64/include/asm/stage2_pgtable.h b/arch/arm64/include/asm/stage2_pgtable.h
index 326aac658b9d..9a364aeae5fb 100644
--- a/arch/arm64/include/asm/stage2_pgtable.h
+++ b/arch/arm64/include/asm/stage2_pgtable.h
@@ -68,41 +68,67 @@ static inline bool kvm_stage2_has_pud(struct kvm *kvm)
 #define S2_PUD_SIZE			(1UL << S2_PUD_SHIFT)
 #define S2_PUD_MASK			(~(S2_PUD_SIZE - 1))
 
-static inline bool stage2_pgd_none(struct kvm *kvm, pgd_t pgd)
+#define stage2_pgd_none(kvm, pgd)		pgd_none(pgd)
+#define stage2_pgd_clear(kvm, pgd)		pgd_clear(pgd)
+#define stage2_pgd_present(kvm, pgd)		pgd_present(pgd)
+#define stage2_pgd_populate(kvm, pgd, p4d)	pgd_populate(NULL, pgd, p4d)
+
+static inline p4d_t *stage2_p4d_offset(struct kvm *kvm,
+				       pgd_t *pgd, unsigned long address)
+{
+	return p4d_offset(pgd, address);
+}
+
+static inline void stage2_p4d_free(struct kvm *kvm, p4d_t *p4d)
+{
+}
+
+static inline bool stage2_p4d_table_empty(struct kvm *kvm, p4d_t *p4dp)
+{
+	return false;
+}
+
+static inline phys_addr_t stage2_p4d_addr_end(struct kvm *kvm,
+					      phys_addr_t addr, phys_addr_t end)
+{
+	return end;
+}
+
+static inline bool stage2_p4d_none(struct kvm *kvm, p4d_t p4d)
 {
 	if (kvm_stage2_has_pud(kvm))
-		return pgd_none(pgd);
+		return p4d_none(p4d);
 	else
 		return 0;
 }
 
-static inline void stage2_pgd_clear(struct kvm *kvm, pgd_t *pgdp)
+static inline void stage2_p4d_clear(struct kvm *kvm, p4d_t *p4dp)
 {
 	if (kvm_stage2_has_pud(kvm))
-		pgd_clear(pgdp);
+		p4d_clear(p4dp);
 }
 
-static inline bool stage2_pgd_present(struct kvm *kvm, pgd_t pgd)
+static inline bool stage2_p4d_present(struct kvm *kvm, p4d_t p4d)
 {
 	if (kvm_stage2_has_pud(kvm))
-		return pgd_present(pgd);
+		return p4d_present(p4d);
 	else
 		return 1;
 }
 
-static inline void stage2_pgd_populate(struct kvm *kvm, pgd_t *pgd, pud_t *pud)
+static inline void stage2_p4d_populate(struct kvm *kvm, p4d_t *p4d, pud_t *pud)
 {
 	if (kvm_stage2_has_pud(kvm))
-		pgd_populate(NULL, pgd, pud);
+		p4d_populate(NULL, p4d, pud);
 }
 
 static inline pud_t *stage2_pud_offset(struct kvm *kvm,
-				       pgd_t *pgd, unsigned long address)
+				       p4d_t *p4d, unsigned long address)
 {
 	if (kvm_stage2_has_pud(kvm))
-		return pud_offset(pgd, address);
+		return pud_offset(p4d, address);
 	else
-		return (pud_t *)pgd;
+		return (pud_t *)p4d;
 }
 
 static inline void stage2_pud_free(struct kvm *kvm, pud_t *pud)
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 5b73e92c99e3..a8a4b55f3a09 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -184,6 +184,7 @@ static int trans_pgd_map_page(pgd_t *trans_pgd, void *page,
 		       pgprot_t pgprot)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
@@ -196,7 +197,15 @@ static int trans_pgd_map_page(pgd_t *trans_pgd, void *page,
 		pgd_populate(&init_mm, pgdp, pudp);
 	}
 
-	pudp = pud_offset(pgdp, dst_addr);
+	p4dp = p4d_offset(pgdp, dst_addr);
+	if (p4d_none(READ_ONCE(*p4dp))) {
+		pudp = (void *)get_safe_page(GFP_ATOMIC);
+		if (!pudp)
+			return -ENOMEM;
+		p4d_populate(&init_mm, p4dp, pudp);
+	}
+
+	pudp = pud_offset(p4dp, dst_addr);
 	if (pud_none(READ_ONCE(*pudp))) {
 		pmdp = (void *)get_safe_page(GFP_ATOMIC);
 		if (!pmdp)
@@ -419,7 +428,7 @@ static int copy_pmd(pud_t *dst_pudp, pud_t *src_pudp, unsigned long start,
 	return 0;
 }
 
-static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
+static int copy_pud(p4d_t *dst_p4dp, p4d_t *src_p4dp, unsigned long start,
 		    unsigned long end)
 {
 	pud_t *dst_pudp;
@@ -427,15 +436,15 @@ static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
 	unsigned long next;
 	unsigned long addr = start;
 
-	if (pgd_none(READ_ONCE(*dst_pgdp))) {
+	if (p4d_none(READ_ONCE(*dst_p4dp))) {
 		dst_pudp = (pud_t *)get_safe_page(GFP_ATOMIC);
 		if (!dst_pudp)
 			return -ENOMEM;
-		pgd_populate(&init_mm, dst_pgdp, dst_pudp);
+		p4d_populate(&init_mm, dst_p4dp, dst_pudp);
 	}
-	dst_pudp = pud_offset(dst_pgdp, start);
+	dst_pudp = pud_offset(dst_p4dp, start);
 
-	src_pudp = pud_offset(src_pgdp, start);
+	src_pudp = pud_offset(src_p4dp, start);
 	do {
 		pud_t pud = READ_ONCE(*src_pudp);
 
@@ -454,6 +463,27 @@ static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
 	return 0;
 }
 
+static int copy_p4d(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
+		    unsigned long end)
+{
+	p4d_t *dst_p4dp;
+	p4d_t *src_p4dp;
+	unsigned long next;
+	unsigned long addr = start;
+
+	dst_p4dp = p4d_offset(dst_pgdp, start);
+	src_p4dp = p4d_offset(src_pgdp, start);
+	do {
+		next = p4d_addr_end(addr, end);
+		if (p4d_none(READ_ONCE(*src_p4dp)))
+			continue;
+		if (copy_pud(dst_p4dp, src_p4dp, addr, next))
+			return -ENOMEM;
+	} while (dst_p4dp++, src_p4dp++, addr = next, addr != end);
+
+	return 0;
+}
+
 static int copy_page_tables(pgd_t *dst_pgdp, unsigned long start,
 			    unsigned long end)
 {
@@ -466,7 +496,7 @@ static int copy_page_tables(pgd_t *dst_pgdp, unsigned long start,
 		next = pgd_addr_end(addr, end);
 		if (pgd_none(READ_ONCE(*src_pgdp)))
 			continue;
-		if (copy_pud(dst_pgdp, src_pgdp, addr, next))
+		if (copy_p4d(dst_pgdp, src_pgdp, addr, next))
 			return -ENOMEM;
 	} while (dst_pgdp++, src_pgdp++, addr = next, addr != end);
 
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index c9cedc0432d2..a529877daa13 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -145,6 +145,7 @@ static void show_pte(unsigned long addr)
 	pr_alert("[%016lx] pgd=%016llx", addr, pgd_val(pgd));
 
 	do {
+		p4d_t *p4dp, p4d;
 		pud_t *pudp, pud;
 		pmd_t *pmdp, pmd;
 		pte_t *ptep, pte;
@@ -152,7 +153,13 @@ static void show_pte(unsigned long addr)
 		if (pgd_none(pgd) || pgd_bad(pgd))
 			break;
 
-		pudp = pud_offset(pgdp, addr);
+		p4dp = p4d_offset(pgdp, addr);
+		p4d = READ_ONCE(*p4dp);
+		pr_cont(", p4d=%016llx", p4d_val(p4d));
+		if (p4d_none(p4d) || p4d_bad(p4d))
+			break;
+
+		pudp = pud_offset(p4dp, addr);
 		pud = READ_ONCE(*pudp);
 		pr_cont(", pud=%016llx", pud_val(pud));
 		if (pud_none(pud) || pud_bad(pud))
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index bbeb6a5a6ba6..b8a9f26f3790 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -67,11 +67,13 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
 			   pte_t *ptep, size_t *pgsize)
 {
 	pgd_t *pgdp = pgd_offset(mm, addr);
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 
 	*pgsize = PAGE_SIZE;
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	pudp = pud_offset(p4dp, addr);
 	pmdp = pmd_offset(pudp, addr);
 	if ((pte_t *)pmdp == ptep) {
 		*pgsize = PMD_SIZE;
@@ -217,12 +219,14 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
 		      unsigned long addr, unsigned long sz)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep = NULL;
 
 	pgdp = pgd_offset(mm, addr);
-	pudp = pud_alloc(mm, pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	pudp = pud_alloc(mm, p4dp, addr);
 	if (!pudp)
 		return NULL;
 
@@ -259,6 +263,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 		       unsigned long addr, unsigned long sz)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp, pud;
 	pmd_t *pmdp, pmd;
 
@@ -266,7 +271,11 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 	if (!pgd_present(READ_ONCE(*pgdp)))
 		return NULL;
 
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	if (!p4d_present(READ_ONCE(*p4dp)))
+		return NULL;
+
+	pudp = pud_offset(p4dp, addr);
 	pud = READ_ONCE(*pudp);
 	if (sz != PUD_SIZE && pud_none(pud))
 		return NULL;
diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index f87a32484ea8..2339811f317b 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -84,17 +84,17 @@ static pmd_t *__init kasan_pmd_offset(pud_t *pudp, unsigned long addr, int node,
 	return early ? pmd_offset_kimg(pudp, addr) : pmd_offset(pudp, addr);
 }
 
-static pud_t *__init kasan_pud_offset(pgd_t *pgdp, unsigned long addr, int node,
+static pud_t *__init kasan_pud_offset(p4d_t *p4dp, unsigned long addr, int node,
 				      bool early)
 {
-	if (pgd_none(READ_ONCE(*pgdp))) {
+	if (p4d_none(READ_ONCE(*p4dp))) {
 		phys_addr_t pud_phys = early ?
 				__pa_symbol(kasan_early_shadow_pud)
 					: kasan_alloc_zeroed_page(node);
-		__pgd_populate(pgdp, pud_phys, PMD_TYPE_TABLE);
+		__p4d_populate(p4dp, pud_phys, PMD_TYPE_TABLE);
 	}
 
-	return early ? pud_offset_kimg(pgdp, addr) : pud_offset(pgdp, addr);
+	return early ? pud_offset_kimg(p4dp, addr) : pud_offset(p4dp, addr);
 }
 
 static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
@@ -126,11 +126,11 @@ static void __init kasan_pmd_populate(pud_t *pudp, unsigned long addr,
 	} while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)));
 }
 
-static void __init kasan_pud_populate(pgd_t *pgdp, unsigned long addr,
+static void __init kasan_pud_populate(p4d_t *p4dp, unsigned long addr,
 				      unsigned long end, int node, bool early)
 {
 	unsigned long next;
-	pud_t *pudp = kasan_pud_offset(pgdp, addr, node, early);
+	pud_t *pudp = kasan_pud_offset(p4dp, addr, node, early);
 
 	do {
 		next = pud_addr_end(addr, end);
@@ -138,6 +138,18 @@ static void __init kasan_pud_populate(pgd_t *pgdp, unsigned long addr,
 	} while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)));
 }
 
+static void __init kasan_p4d_populate(pgd_t *pgdp, unsigned long addr,
+				      unsigned long end, int node, bool early)
+{
+	unsigned long next;
+	p4d_t *p4dp = p4d_offset(pgdp, addr);
+
+	do {
+		next = p4d_addr_end(addr, end);
+		kasan_pud_populate(p4dp, addr, next, node, early);
+	} while (p4dp++, addr = next, addr != end);
+}
+
 static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
 				      int node, bool early)
 {
@@ -147,7 +159,7 @@ static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
 	pgdp = pgd_offset_k(addr);
 	do {
 		next = pgd_addr_end(addr, end);
-		kasan_pud_populate(pgdp, addr, next, node, early);
+		kasan_p4d_populate(pgdp, addr, next, node, early);
 	} while (pgdp++, addr = next, addr != end);
 }
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a374e4f51a62..c4c2e36b80ab 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -290,18 +290,19 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
 {
 	unsigned long next;
 	pud_t *pudp;
-	pgd_t pgd = READ_ONCE(*pgdp);
+	p4d_t *p4dp = p4d_offset(pgdp, addr);
+	p4d_t p4d = READ_ONCE(*p4dp);
 
-	if (pgd_none(pgd)) {
+	if (p4d_none(p4d)) {
 		phys_addr_t pud_phys;
 		BUG_ON(!pgtable_alloc);
 		pud_phys = pgtable_alloc(PUD_SHIFT);
-		__pgd_populate(pgdp, pud_phys, PUD_TYPE_TABLE);
-		pgd = READ_ONCE(*pgdp);
+		__p4d_populate(p4dp, pud_phys, PUD_TYPE_TABLE);
+		p4d = READ_ONCE(*p4dp);
 	}
-	BUG_ON(pgd_bad(pgd));
+	BUG_ON(p4d_bad(p4d));
 
-	pudp = pud_set_fixmap_offset(pgdp, addr);
+	pudp = pud_set_fixmap_offset(p4dp, addr);
 	do {
 		pud_t old_pud = READ_ONCE(*pudp);
 
@@ -648,6 +649,7 @@ static void __init map_kernel(pgd_t *pgdp)
 			READ_ONCE(*pgd_offset_k(FIXADDR_START)));
 	} else if (CONFIG_PGTABLE_LEVELS > 3) {
 		pgd_t *bm_pgdp;
+		p4d_t *bm_p4dp;
 		pud_t *bm_pudp;
 		/*
 		 * The fixmap shares its top level pgd entry with the kernel
@@ -657,7 +659,8 @@ static void __init map_kernel(pgd_t *pgdp)
 		 */
 		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
 		bm_pgdp = pgd_offset_raw(pgdp, FIXADDR_START);
-		bm_pudp = pud_set_fixmap_offset(bm_pgdp, FIXADDR_START);
+		bm_p4dp = p4d_offset(bm_pgdp, FIXADDR_START);
+		bm_pudp = pud_set_fixmap_offset(bm_p4dp, FIXADDR_START);
 		pud_populate(&init_mm, bm_pudp, lm_alias(bm_pmd));
 		pud_clear_fixmap();
 	} else {
@@ -691,6 +694,7 @@ void __init paging_init(void)
 int kern_addr_valid(unsigned long addr)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp, pud;
 	pmd_t *pmdp, pmd;
 	pte_t *ptep, pte;
@@ -702,7 +706,11 @@ int kern_addr_valid(unsigned long addr)
 	if (pgd_none(READ_ONCE(*pgdp)))
 		return 0;
 
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	if (p4d_none(READ_ONCE(*p4dp)))
+		return 0;
+
+	pudp = pud_offset(p4dp, addr);
 	pud = READ_ONCE(*pudp);
 	if (pud_none(pud))
 		return 0;
@@ -1045,6 +1053,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 	unsigned long addr = start;
 	unsigned long next;
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 
@@ -1055,7 +1064,11 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 		if (!pgdp)
 			return -ENOMEM;
 
-		pudp = vmemmap_pud_populate(pgdp, addr, node);
+		p4dp = vmemmap_p4d_populate(pgdp, addr, node);
+		if (!p4dp)
+			return -ENOMEM;
+
+		pudp = vmemmap_pud_populate(p4dp, addr, node);
 		if (!pudp)
 			return -ENOMEM;
 
@@ -1090,11 +1103,12 @@ void vmemmap_free(unsigned long start, unsigned long end,
 static inline pud_t * fixmap_pud(unsigned long addr)
 {
 	pgd_t *pgdp = pgd_offset_k(addr);
-	pgd_t pgd = READ_ONCE(*pgdp);
+	p4d_t *p4dp = p4d_offset(pgdp, addr);
+	p4d_t p4d = READ_ONCE(*p4dp);
 
-	BUG_ON(pgd_none(pgd) || pgd_bad(pgd));
+	BUG_ON(p4d_none(p4d) || p4d_bad(p4d));
 
-	return pud_offset_kimg(pgdp, addr);
+	return pud_offset_kimg(p4dp, addr);
 }
 
 static inline pmd_t * fixmap_pmd(unsigned long addr)
@@ -1120,25 +1134,27 @@ static inline pte_t * fixmap_pte(unsigned long addr)
  */
 void __init early_fixmap_init(void)
 {
-	pgd_t *pgdp, pgd;
+	pgd_t *pgdp;
+	p4d_t *p4dp, p4d;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	unsigned long addr = FIXADDR_START;
 
 	pgdp = pgd_offset_k(addr);
-	pgd = READ_ONCE(*pgdp);
+	p4dp = p4d_offset(pgdp, addr);
+	p4d = READ_ONCE(*p4dp);
 	if (CONFIG_PGTABLE_LEVELS > 3 &&
-	    !(pgd_none(pgd) || pgd_page_paddr(pgd) == __pa_symbol(bm_pud))) {
+	    !(p4d_none(p4d) || p4d_page_paddr(p4d) == __pa_symbol(bm_pud))) {
 		/*
 		 * We only end up here if the kernel mapping and the fixmap
 		 * share the top level pgd entry, which should only happen on
 		 * 16k/4 levels configurations.
 		 */
 		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
-		pudp = pud_offset_kimg(pgdp, addr);
+		pudp = pud_offset_kimg(p4dp, addr);
 	} else {
-		if (pgd_none(pgd))
-			__pgd_populate(pgdp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
+		if (p4d_none(p4d))
+			__p4d_populate(p4dp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
 		pudp = fixmap_pud(addr);
 	}
 	if (pud_none(READ_ONCE(*pudp)))
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 250c49008d73..5a310991ff73 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -198,6 +198,7 @@ void __kernel_map_pages(struct page *page, int numpages, int enable)
 bool kernel_page_present(struct page *page)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp, pud;
 	pmd_t *pmdp, pmd;
 	pte_t *ptep;
@@ -210,7 +211,11 @@ bool kernel_page_present(struct page *page)
 	if (pgd_none(READ_ONCE(*pgdp)))
 		return false;
 
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	if (p4d_none(READ_ONCE(*p4dp)))
+		return false;
+
+	pudp = pud_offset(p4dp, addr);
 	pud = READ_ONCE(*pudp);
 	if (pud_none(pud))
 		return false;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index e3b9ee268823..48d4288c5f1b 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -158,13 +158,22 @@ static void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)
 
 static void clear_stage2_pgd_entry(struct kvm *kvm, pgd_t *pgd, phys_addr_t addr)
 {
-	pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, pgd, 0UL);
+	p4d_t *p4d_table __maybe_unused = stage2_p4d_offset(kvm, pgd, 0UL);
 	stage2_pgd_clear(kvm, pgd);
 	kvm_tlb_flush_vmid_ipa(kvm, addr);
-	stage2_pud_free(kvm, pud_table);
+	stage2_p4d_free(kvm, p4d_table);
 	put_page(virt_to_page(pgd));
 }
 
+static void clear_stage2_p4d_entry(struct kvm *kvm, p4d_t *p4d, phys_addr_t addr)
+{
+	pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, p4d, 0);
+	stage2_p4d_clear(kvm, p4d);
+	kvm_tlb_flush_vmid_ipa(kvm, addr);
+	stage2_pud_free(kvm, pud_table);
+	put_page(virt_to_page(p4d));
+}
+
 static void clear_stage2_pud_entry(struct kvm *kvm, pud_t *pud, phys_addr_t addr)
 {
 	pmd_t *pmd_table __maybe_unused = stage2_pmd_offset(kvm, pud, 0);
@@ -208,12 +217,20 @@ static inline void kvm_pud_populate(pud_t *pudp, pmd_t *pmdp)
 	dsb(ishst);
 }
 
-static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
+static inline void kvm_p4d_populate(p4d_t *p4dp, pud_t *pudp)
 {
-	WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
+	WRITE_ONCE(*p4dp, kvm_mk_p4d(pudp));
 	dsb(ishst);
 }
 
+static inline void kvm_pgd_populate(pgd_t *pgdp, p4d_t *p4dp)
+{
+#ifndef __PAGETABLE_P4D_FOLDED
+	WRITE_ONCE(*pgdp, kvm_mk_pgd(p4dp));
+	dsb(ishst);
+#endif
+}
+
 /*
  * Unmapping vs dcache management:
  *
@@ -293,13 +310,13 @@ static void unmap_stage2_pmds(struct kvm *kvm, pud_t *pud,
 		clear_stage2_pud_entry(kvm, pud, start_addr);
 }
 
-static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
+static void unmap_stage2_puds(struct kvm *kvm, p4d_t *p4d,
 		       phys_addr_t addr, phys_addr_t end)
 {
 	phys_addr_t next, start_addr = addr;
 	pud_t *pud, *start_pud;
 
-	start_pud = pud = stage2_pud_offset(kvm, pgd, addr);
+	start_pud = pud = stage2_pud_offset(kvm, p4d, addr);
 	do {
 		next = stage2_pud_addr_end(kvm, addr, end);
 		if (!stage2_pud_none(kvm, *pud)) {
@@ -317,6 +334,23 @@ static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
 	} while (pud++, addr = next, addr != end);
 
 	if (stage2_pud_table_empty(kvm, start_pud))
+		clear_stage2_p4d_entry(kvm, p4d, start_addr);
+}
+
+static void unmap_stage2_p4ds(struct kvm *kvm, pgd_t *pgd,
+		       phys_addr_t addr, phys_addr_t end)
+{
+	phys_addr_t next, start_addr = addr;
+	p4d_t *p4d, *start_p4d;
+
+	start_p4d = p4d = stage2_p4d_offset(kvm, pgd, addr);
+	do {
+		next = stage2_p4d_addr_end(kvm, addr, end);
+		if (!stage2_p4d_none(kvm, *p4d))
+			unmap_stage2_puds(kvm, p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+
+	if (stage2_p4d_table_empty(kvm, start_p4d))
 		clear_stage2_pgd_entry(kvm, pgd, start_addr);
 }
 
@@ -351,7 +385,7 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
 			break;
 		next = stage2_pgd_addr_end(kvm, addr, end);
 		if (!stage2_pgd_none(kvm, *pgd))
-			unmap_stage2_puds(kvm, pgd, addr, next);
+			unmap_stage2_p4ds(kvm, pgd, addr, next);
 		/*
 		 * If the range is too large, release the kvm->mmu_lock
 		 * to prevent starvation and lockup detector warnings.
@@ -391,13 +425,13 @@ static void stage2_flush_pmds(struct kvm *kvm, pud_t *pud,
 	} while (pmd++, addr = next, addr != end);
 }
 
-static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
+static void stage2_flush_puds(struct kvm *kvm, p4d_t *p4d,
 			      phys_addr_t addr, phys_addr_t end)
 {
 	pud_t *pud;
 	phys_addr_t next;
 
-	pud = stage2_pud_offset(kvm, pgd, addr);
+	pud = stage2_pud_offset(kvm, p4d, addr);
 	do {
 		next = stage2_pud_addr_end(kvm, addr, end);
 		if (!stage2_pud_none(kvm, *pud)) {
@@ -409,6 +443,20 @@ static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
 	} while (pud++, addr = next, addr != end);
 }
 
+static void stage2_flush_p4ds(struct kvm *kvm, pgd_t *pgd,
+			      phys_addr_t addr, phys_addr_t end)
+{
+	p4d_t *p4d;
+	phys_addr_t next;
+
+	p4d = stage2_p4d_offset(kvm, pgd, addr);
+	do {
+		next = stage2_p4d_addr_end(kvm, addr, end);
+		if (!stage2_p4d_none(kvm, *p4d))
+			stage2_flush_puds(kvm, p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+}
+
 static void stage2_flush_memslot(struct kvm *kvm,
 				 struct kvm_memory_slot *memslot)
 {
@@ -421,7 +469,7 @@ static void stage2_flush_memslot(struct kvm *kvm,
 	do {
 		next = stage2_pgd_addr_end(kvm, addr, end);
 		if (!stage2_pgd_none(kvm, *pgd))
-			stage2_flush_puds(kvm, pgd, addr, next);
+			stage2_flush_p4ds(kvm, pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
 
@@ -451,12 +499,21 @@ static void stage2_flush_vm(struct kvm *kvm)
 
 static void clear_hyp_pgd_entry(pgd_t *pgd)
 {
-	pud_t *pud_table __maybe_unused = pud_offset(pgd, 0UL);
+	p4d_t *p4d_table __maybe_unused = p4d_offset(pgd, 0UL);
 	pgd_clear(pgd);
-	pud_free(NULL, pud_table);
+	p4d_free(NULL, p4d_table);
 	put_page(virt_to_page(pgd));
 }
 
+static void clear_hyp_p4d_entry(p4d_t *p4d)
+{
+	pud_t *pud_table __maybe_unused = pud_offset(p4d, 0);
+	VM_BUG_ON(p4d_huge(*p4d));
+	p4d_clear(p4d);
+	pud_free(NULL, pud_table);
+	put_page(virt_to_page(p4d));
+}
+
 static void clear_hyp_pud_entry(pud_t *pud)
 {
 	pmd_t *pmd_table __maybe_unused = pmd_offset(pud, 0);
@@ -508,12 +565,12 @@ static void unmap_hyp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
 		clear_hyp_pud_entry(pud);
 }
 
-static void unmap_hyp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+static void unmap_hyp_puds(p4d_t *p4d, phys_addr_t addr, phys_addr_t end)
 {
 	phys_addr_t next;
 	pud_t *pud, *start_pud;
 
-	start_pud = pud = pud_offset(pgd, addr);
+	start_pud = pud = pud_offset(p4d, addr);
 	do {
 		next = pud_addr_end(addr, end);
 		/* Hyp doesn't use huge puds */
@@ -522,6 +579,23 @@ static void unmap_hyp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
 	} while (pud++, addr = next, addr != end);
 
 	if (hyp_pud_table_empty(start_pud))
+		clear_hyp_p4d_entry(p4d);
+}
+
+static void unmap_hyp_p4ds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+{
+	phys_addr_t next;
+	p4d_t *p4d, *start_p4d;
+
+	start_p4d = p4d = p4d_offset(pgd, addr);
+	do {
+		next = p4d_addr_end(addr, end);
+		/* Hyp doesn't use huge p4ds */
+		if (!p4d_none(*p4d))
+			unmap_hyp_puds(p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+
+	if (hyp_p4d_table_empty(start_p4d))
 		clear_hyp_pgd_entry(pgd);
 }
 
@@ -545,7 +619,7 @@ static void __unmap_hyp_range(pgd_t *pgdp, unsigned long ptrs_per_pgd,
 	do {
 		next = pgd_addr_end(addr, end);
 		if (!pgd_none(*pgd))
-			unmap_hyp_puds(pgd, addr, next);
+			unmap_hyp_p4ds(pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
 
@@ -655,7 +729,7 @@ static int create_hyp_pmd_mappings(pud_t *pud, unsigned long start,
 	return 0;
 }
 
-static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
+static int create_hyp_pud_mappings(p4d_t *p4d, unsigned long start,
 				   unsigned long end, unsigned long pfn,
 				   pgprot_t prot)
 {
@@ -666,7 +740,7 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
 
 	addr = start;
 	do {
-		pud = pud_offset(pgd, addr);
+		pud = pud_offset(p4d, addr);
 
 		if (pud_none_or_clear_bad(pud)) {
 			pmd = pmd_alloc_one(NULL, addr);
@@ -688,12 +762,45 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
 	return 0;
 }
 
+static int create_hyp_p4d_mappings(pgd_t *pgd, unsigned long start,
+				   unsigned long end, unsigned long pfn,
+				   pgprot_t prot)
+{
+	p4d_t *p4d;
+	pud_t *pud;
+	unsigned long addr, next;
+	int ret;
+
+	addr = start;
+	do {
+		p4d = p4d_offset(pgd, addr);
+
+		if (p4d_none(*p4d)) {
+			pud = pud_alloc_one(NULL, addr);
+			if (!pud) {
+				kvm_err("Cannot allocate Hyp pud\n");
+				return -ENOMEM;
+			}
+			kvm_p4d_populate(p4d, pud);
+			get_page(virt_to_page(p4d));
+		}
+
+		next = p4d_addr_end(addr, end);
+		ret = create_hyp_pud_mappings(p4d, addr, next, pfn, prot);
+		if (ret)
+			return ret;
+		pfn += (next - addr) >> PAGE_SHIFT;
+	} while (addr = next, addr != end);
+
+	return 0;
+}
+
 static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
 				 unsigned long start, unsigned long end,
 				 unsigned long pfn, pgprot_t prot)
 {
 	pgd_t *pgd;
-	pud_t *pud;
+	p4d_t *p4d;
 	unsigned long addr, next;
 	int err = 0;
 
@@ -704,18 +811,18 @@ static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
 		pgd = pgdp + kvm_pgd_index(addr, ptrs_per_pgd);
 
 		if (pgd_none(*pgd)) {
-			pud = pud_alloc_one(NULL, addr);
-			if (!pud) {
-				kvm_err("Cannot allocate Hyp pud\n");
+			p4d = p4d_alloc_one(NULL, addr);
+			if (!p4d) {
+				kvm_err("Cannot allocate Hyp p4d\n");
 				err = -ENOMEM;
 				goto out;
 			}
-			kvm_pgd_populate(pgd, pud);
+			kvm_pgd_populate(pgd, p4d);
 			get_page(virt_to_page(pgd));
 		}
 
 		next = pgd_addr_end(addr, end);
-		err = create_hyp_pud_mappings(pgd, addr, next, pfn, prot);
+		err = create_hyp_p4d_mappings(pgd, addr, next, pfn, prot);
 		if (err)
 			goto out;
 		pfn += (next - addr) >> PAGE_SHIFT;
@@ -1012,22 +1119,40 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
 		free_pages_exact(pgd, stage2_pgd_size(kvm));
 }
 
-static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+static p4d_t *stage2_get_p4d(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 			     phys_addr_t addr)
 {
 	pgd_t *pgd;
-	pud_t *pud;
+	p4d_t *p4d;
 
 	pgd = kvm->arch.pgd + stage2_pgd_index(kvm, addr);
 	if (stage2_pgd_none(kvm, *pgd)) {
 		if (!cache)
 			return NULL;
-		pud = mmu_memory_cache_alloc(cache);
-		stage2_pgd_populate(kvm, pgd, pud);
+		p4d = mmu_memory_cache_alloc(cache);
+		stage2_pgd_populate(kvm, pgd, p4d);
 		get_page(virt_to_page(pgd));
 	}
 
-	return stage2_pud_offset(kvm, pgd, addr);
+	return stage2_p4d_offset(kvm, pgd, addr);
+}
+
+static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+			     phys_addr_t addr)
+{
+	p4d_t *p4d;
+	pud_t *pud;
+
+	p4d = stage2_get_p4d(kvm, cache, addr);
+	if (stage2_p4d_none(kvm, *p4d)) {
+		if (!cache)
+			return NULL;
+		pud = mmu_memory_cache_alloc(cache);
+		stage2_p4d_populate(kvm, p4d, pud);
+		get_page(virt_to_page(p4d));
+	}
+
+	return stage2_pud_offset(kvm, p4d, addr);
 }
 
 static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
@@ -1461,18 +1586,18 @@ static void stage2_wp_pmds(struct kvm *kvm, pud_t *pud,
 }
 
 /**
- * stage2_wp_puds - write protect PGD range
+ * stage2_wp_puds - write protect P4D range
- * @pgd:	pointer to pgd entry
+ * @p4d:	pointer to p4d entry
  * @addr:	range start address
  * @end:	range end address
  */
-static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
+static void  stage2_wp_puds(struct kvm *kvm, p4d_t *p4d,
 			    phys_addr_t addr, phys_addr_t end)
 {
 	pud_t *pud;
 	phys_addr_t next;
 
-	pud = stage2_pud_offset(kvm, pgd, addr);
+	pud = stage2_pud_offset(kvm, p4d, addr);
 	do {
 		next = stage2_pud_addr_end(kvm, addr, end);
 		if (!stage2_pud_none(kvm, *pud)) {
@@ -1486,6 +1611,26 @@ static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
 	} while (pud++, addr = next, addr != end);
 }
 
+/**
+ * stage2_wp_p4ds - write protect PGD range
+ * @pgd:	pointer to pgd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void  stage2_wp_p4ds(struct kvm *kvm, pgd_t *pgd,
+			    phys_addr_t addr, phys_addr_t end)
+{
+	p4d_t *p4d;
+	phys_addr_t next;
+
+	p4d = stage2_p4d_offset(kvm, pgd, addr);
+	do {
+		next = stage2_p4d_addr_end(kvm, addr, end);
+		if (!stage2_p4d_none(kvm, *p4d))
+			stage2_wp_puds(kvm, p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+}
+
 /**
  * stage2_wp_range() - write protect stage2 memory region range
  * @kvm:	The KVM pointer
@@ -1513,7 +1658,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
 			break;
 		next = stage2_pgd_addr_end(kvm, addr, end);
 		if (stage2_pgd_present(kvm, *pgd))
-			stage2_wp_puds(kvm, pgd, addr, next);
+			stage2_wp_p4ds(kvm, pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
 
-- 
2.25.1


* [PATCH v4 04/14] hexagon: remove __ARCH_USE_5LEVEL_HACK
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (2 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 03/14] arm64: " Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 05/14] ia64: add support for folded p4d page tables Mike Rapoport
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

The hexagon architecture has two-level page tables, so most of the page
table folding is already provided by asm-generic/pgtable-nopmd.h.

Fix up the only place in arch/hexagon that needs to walk the p4d level
explicitly and remove __ARCH_USE_5LEVEL_HACK.
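
To see why this is safe: with the hack gone, the generic folding
headers collapse each unused level into a pointer cast, so the new
p4d_offset() in the fixmap walk compiles away entirely.  Roughly (an
abridged paraphrase of asm-generic/pgtable-nop4d.h and
pgtable-nopud.h, not a verbatim quote):

	/* pgtable-nop4d.h: the p4d level is folded into the pgd */
	typedef struct { pgd_t pgd; } p4d_t;

	#define PTRS_PER_P4D	1

	static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
	{
		return (p4d_t *)pgd;	/* no dereference, just a cast */
	}

	/* pgtable-nopud.h: the pud level is folded into the p4d ... */
	typedef struct { p4d_t p4d; } pud_t;

	/* ... so a p4d entry always "exists" and is never none/bad */
	static inline int p4d_none(p4d_t p4d)		{ return 0; }
	static inline int p4d_present(p4d_t p4d)	{ return 1; }

	static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
	{
		return (pud_t *)p4d;
	}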

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/hexagon/include/asm/fixmap.h  | 4 ++--
 arch/hexagon/include/asm/pgtable.h | 1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/hexagon/include/asm/fixmap.h b/arch/hexagon/include/asm/fixmap.h
index 933dac167504..97b1b062e750 100644
--- a/arch/hexagon/include/asm/fixmap.h
+++ b/arch/hexagon/include/asm/fixmap.h
@@ -16,7 +16,7 @@
 #include <asm-generic/fixmap.h>
 
 #define kmap_get_fixmap_pte(vaddr) \
-	pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr), \
-				(vaddr)), (vaddr)), (vaddr))
+	pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr), \
+				(vaddr)), (vaddr)), (vaddr)), (vaddr))
 
 #endif
diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
index d383e8bea5b2..2a17d4eb2fa4 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -12,7 +12,6 @@
  * Page table definitions for Qualcomm Hexagon processor.
  */
 #include <asm/page.h>
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 /* A handy thing to have if one has the RAM. Declared in head.S */
-- 
2.25.1


* [PATCH v4 05/14] ia64: add support for folded p4d page tables
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (3 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 04/14] hexagon: remove __ARCH_USE_5LEVEL_HACK Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 06/14] nios2: " Mike Rapoport
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Implement the primitives necessary for folding the 4th level, add walks
of the p4d level where appropriate, remove usage of
__ARCH_USE_5LEVEL_HACK and replace 5level-fixup.h with pgtable-nop4d.h.
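
The caller-side change is the same mechanical pattern across the whole
series.  As a minimal sketch (an illustrative helper, not code from
this patch), a software walk that used to go pgd->pud->pmd->pte now
takes one extra step through the folded p4d:

	static pte_t *walk_to_pte(struct mm_struct *mm, unsigned long addr)
	{
		pgd_t *pgd = pgd_offset(mm, addr);
		p4d_t *p4d;
		pud_t *pud;
		pmd_t *pmd;

		if (!pgd_present(*pgd))
			return NULL;
		p4d = p4d_offset(pgd, addr);	/* new: step through the p4d */
		if (!p4d_present(*p4d))
			return NULL;
		pud = pud_offset(p4d, addr);	/* was: pud_offset(pgd, addr) */
		if (!pud_present(*pud))
			return NULL;
		pmd = pmd_offset(pud, addr);
		if (!pmd_present(*pmd))
			return NULL;
		return pte_offset_map(pmd, addr);
	}

On ia64 the p4d step costs nothing at run time, since the level is
folded and p4d_offset() is only a cast.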

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/ia64/include/asm/pgalloc.h |  4 ++--
 arch/ia64/include/asm/pgtable.h | 17 ++++++++---------
 arch/ia64/mm/fault.c            |  7 ++++++-
 arch/ia64/mm/hugetlbpage.c      | 18 ++++++++++++------
 arch/ia64/mm/init.c             | 28 ++++++++++++++++++++++++----
 5 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/arch/ia64/include/asm/pgalloc.h b/arch/ia64/include/asm/pgalloc.h
index f4c491044882..2a3050345099 100644
--- a/arch/ia64/include/asm/pgalloc.h
+++ b/arch/ia64/include/asm/pgalloc.h
@@ -36,9 +36,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 
 #if CONFIG_PGTABLE_LEVELS == 4
 static inline void
-pgd_populate(struct mm_struct *mm, pgd_t * pgd_entry, pud_t * pud)
+p4d_populate(struct mm_struct *mm, p4d_t * p4d_entry, pud_t * pud)
 {
-	pgd_val(*pgd_entry) = __pa(pud);
+	p4d_val(*p4d_entry) = __pa(pud);
 }
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/ia64/include/asm/pgtable.h b/arch/ia64/include/asm/pgtable.h
index 0e7b645b76c6..787b0a91d255 100644
--- a/arch/ia64/include/asm/pgtable.h
+++ b/arch/ia64/include/asm/pgtable.h
@@ -283,12 +283,12 @@ extern unsigned long VMALLOC_END;
 #define pud_page(pud)			virt_to_page((pud_val(pud) + PAGE_OFFSET))
 
 #if CONFIG_PGTABLE_LEVELS == 4
-#define pgd_none(pgd)			(!pgd_val(pgd))
-#define pgd_bad(pgd)			(!ia64_phys_addr_valid(pgd_val(pgd)))
-#define pgd_present(pgd)		(pgd_val(pgd) != 0UL)
-#define pgd_clear(pgdp)			(pgd_val(*(pgdp)) = 0UL)
-#define pgd_page_vaddr(pgd)		((unsigned long) __va(pgd_val(pgd) & _PFN_MASK))
-#define pgd_page(pgd)			virt_to_page((pgd_val(pgd) + PAGE_OFFSET))
+#define p4d_none(p4d)			(!p4d_val(p4d))
+#define p4d_bad(p4d)			(!ia64_phys_addr_valid(p4d_val(p4d)))
+#define p4d_present(p4d)		(p4d_val(p4d) != 0UL)
+#define p4d_clear(p4dp)			(p4d_val(*(p4dp)) = 0UL)
+#define p4d_page_vaddr(p4d)		((unsigned long) __va(p4d_val(p4d) & _PFN_MASK))
+#define p4d_page(p4d)			virt_to_page((p4d_val(p4d) + PAGE_OFFSET))
 #endif
 
 /*
@@ -386,7 +386,7 @@ pgd_offset (const struct mm_struct *mm, unsigned long address)
 #if CONFIG_PGTABLE_LEVELS == 4
 /* Find an entry in the second-level page table.. */
 #define pud_offset(dir,addr) \
-	((pud_t *) pgd_page_vaddr(*(dir)) + (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
+	((pud_t *) p4d_page_vaddr(*(dir)) + (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
 #endif
 
 /* Find an entry in the third-level page table.. */
@@ -580,10 +580,9 @@ extern struct page *zero_page_memmap_ptr;
 
 
 #if CONFIG_PGTABLE_LEVELS == 3
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #endif
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
 #include <asm-generic/pgtable.h>
 
 #endif /* _ASM_IA64_PGTABLE_H */
diff --git a/arch/ia64/mm/fault.c b/arch/ia64/mm/fault.c
index 30d0c1fca99e..12242aa0dad1 100644
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -29,6 +29,7 @@ static int
 mapped_kernel_page_is_present (unsigned long address)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *ptep, pte;
@@ -37,7 +38,11 @@ mapped_kernel_page_is_present (unsigned long address)
 	if (pgd_none(*pgd) || pgd_bad(*pgd))
 		return 0;
 
-	pud = pud_offset(pgd, address);
+	p4d = p4d_offset(pgd, address);
+	if (p4d_none(*p4d) || p4d_bad(*p4d))
+		return 0;
+
+	pud = pud_offset(p4d, address);
 	if (pud_none(*pud) || pud_bad(*pud))
 		return 0;
 
diff --git a/arch/ia64/mm/hugetlbpage.c b/arch/ia64/mm/hugetlbpage.c
index d16e419fd712..32352a73df0c 100644
--- a/arch/ia64/mm/hugetlbpage.c
+++ b/arch/ia64/mm/hugetlbpage.c
@@ -30,12 +30,14 @@ huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
 {
 	unsigned long taddr = htlbpage_to_page(addr);
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte = NULL;
 
 	pgd = pgd_offset(mm, taddr);
-	pud = pud_alloc(mm, pgd, taddr);
+	p4d = p4d_offset(pgd, taddr);
+	pud = pud_alloc(mm, p4d, taddr);
 	if (pud) {
 		pmd = pmd_alloc(mm, pud, taddr);
 		if (pmd)
@@ -49,17 +51,21 @@ huge_pte_offset (struct mm_struct *mm, unsigned long addr, unsigned long sz)
 {
 	unsigned long taddr = htlbpage_to_page(addr);
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte = NULL;
 
 	pgd = pgd_offset(mm, taddr);
 	if (pgd_present(*pgd)) {
-		pud = pud_offset(pgd, taddr);
-		if (pud_present(*pud)) {
-			pmd = pmd_offset(pud, taddr);
-			if (pmd_present(*pmd))
-				pte = pte_offset_map(pmd, taddr);
+		p4d = p4d_offset(pgd, taddr);
+		if (p4d_present(*p4d)) {
+			pud = pud_offset(p4d, taddr);
+			if (pud_present(*pud)) {
+				pmd = pmd_offset(pud, taddr);
+				if (pmd_present(*pmd))
+					pte = pte_offset_map(pmd, taddr);
+			}
 		}
 	}
 
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index d637b4ea3147..ca760f6cb18f 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -208,6 +208,7 @@ static struct page * __init
 put_kernel_page (struct page *page, unsigned long address, pgprot_t pgprot)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
@@ -215,7 +216,10 @@ put_kernel_page (struct page *page, unsigned long address, pgprot_t pgprot)
 	pgd = pgd_offset_k(address);		/* note: this is NOT pgd_offset()! */
 
 	{
-		pud = pud_alloc(&init_mm, pgd, address);
+		p4d = p4d_alloc(&init_mm, pgd, address);
+		if (!p4d)
+			goto out;
+		pud = pud_alloc(&init_mm, p4d, address);
 		if (!pud)
 			goto out;
 		pmd = pmd_alloc(&init_mm, pud, address);
@@ -382,6 +386,7 @@ int vmemmap_find_next_valid_pfn(int node, int i)
 
 	do {
 		pgd_t *pgd;
+		p4d_t *p4d;
 		pud_t *pud;
 		pmd_t *pmd;
 		pte_t *pte;
@@ -392,7 +397,13 @@ int vmemmap_find_next_valid_pfn(int node, int i)
 			continue;
 		}
 
-		pud = pud_offset(pgd, end_address);
+		p4d = p4d_offset(pgd, end_address);
+		if (p4d_none(*p4d)) {
+			end_address += P4D_SIZE;
+			continue;
+		}
+
+		pud = pud_offset(p4d, end_address);
 		if (pud_none(*pud)) {
 			end_address += PUD_SIZE;
 			continue;
@@ -430,6 +441,7 @@ int __init create_mem_map_page_table(u64 start, u64 end, void *arg)
 	struct page *map_start, *map_end;
 	int node;
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
@@ -444,12 +456,20 @@ int __init create_mem_map_page_table(u64 start, u64 end, void *arg)
 	for (address = start_page; address < end_page; address += PAGE_SIZE) {
 		pgd = pgd_offset_k(address);
 		if (pgd_none(*pgd)) {
+			p4d = memblock_alloc_node(PAGE_SIZE, PAGE_SIZE, node);
+			if (!p4d)
+				goto err_alloc;
+			pgd_populate(&init_mm, pgd, p4d);
+		}
+		p4d = p4d_offset(pgd, address);
+
+		if (p4d_none(*p4d)) {
 			pud = memblock_alloc_node(PAGE_SIZE, PAGE_SIZE, node);
 			if (!pud)
 				goto err_alloc;
-			pgd_populate(&init_mm, pgd, pud);
+			p4d_populate(&init_mm, p4d, pud);
 		}
-		pud = pud_offset(pgd, address);
+		pud = pud_offset(p4d, address);
 
 		if (pud_none(*pud)) {
 			pmd = memblock_alloc_node(PAGE_SIZE, PAGE_SIZE, node);
-- 
2.25.1


* [PATCH v4 06/14] nios2: add support for folded p4d page tables
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (4 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 05/14] ia64: add support for folded p4d page tables Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 07/14] openrisc: " Mike Rapoport
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Implement the primitives necessary for folding the 4th level, add walks
of the p4d level where appropriate and remove usage of
__ARCH_USE_5LEVEL_HACK.
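
The allocation-side counterpart of the walk is worth spelling out: a
p4d_alloc() call is inserted ahead of pud_alloc(), and pud_alloc()'s
table argument changes from the pgd to the p4d.  A minimal sketch (an
illustrative helper mirroring the remap_area_pages() hunk below, not
code from this patch):

	static pmd_t *alloc_to_pmd(unsigned long addr)
	{
		pgd_t *pgd = pgd_offset_k(addr);
		p4d_t *p4d;
		pud_t *pud;

		p4d = p4d_alloc(&init_mm, pgd, addr);	/* no-op when folded */
		if (!p4d)
			return NULL;
		pud = pud_alloc(&init_mm, p4d, addr);	/* was pud_alloc(.., pgd, ..) */
		if (!pud)
			return NULL;
		return pmd_alloc(&init_mm, pud, addr);
	}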

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/nios2/include/asm/pgtable.h | 3 +--
 arch/nios2/mm/fault.c            | 9 +++++++--
 arch/nios2/mm/ioremap.c          | 6 +++++-
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/nios2/include/asm/pgtable.h b/arch/nios2/include/asm/pgtable.h
index f98b7f4519ba..47a1a3ea5734 100644
--- a/arch/nios2/include/asm/pgtable.h
+++ b/arch/nios2/include/asm/pgtable.h
@@ -22,7 +22,6 @@
 #include <asm/tlbflush.h>
 
 #include <asm/pgtable-bits.h>
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 #define FIRST_USER_ADDRESS	0UL
@@ -100,7 +99,7 @@ extern pte_t invalid_pte_table[PAGE_SIZE/sizeof(pte_t)];
  */
 static inline void set_pmd(pmd_t *pmdptr, pmd_t pmdval)
 {
-	pmdptr->pud.pgd.pgd = pmdval.pud.pgd.pgd;
+	*pmdptr = pmdval;
 }
 
 /* to find an entry in a page-table-directory */
diff --git a/arch/nios2/mm/fault.c b/arch/nios2/mm/fault.c
index ec9d8a9c426f..964eac1a21d0 100644
--- a/arch/nios2/mm/fault.c
+++ b/arch/nios2/mm/fault.c
@@ -242,6 +242,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 		 */
 		int offset = pgd_index(address);
 		pgd_t *pgd, *pgd_k;
+		p4d_t *p4d, *p4d_k;
 		pud_t *pud, *pud_k;
 		pmd_t *pmd, *pmd_k;
 		pte_t *pte_k;
@@ -253,8 +254,12 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long cause,
 			goto no_context;
 		set_pgd(pgd, *pgd_k);
 
-		pud = pud_offset(pgd, address);
-		pud_k = pud_offset(pgd_k, address);
+		p4d = p4d_offset(pgd, address);
+		p4d_k = p4d_offset(pgd_k, address);
+		if (!p4d_present(*p4d_k))
+			goto no_context;
+		pud = pud_offset(p4d, address);
+		pud_k = pud_offset(p4d_k, address);
 		if (!pud_present(*pud_k))
 			goto no_context;
 		pmd = pmd_offset(pud, address);
diff --git a/arch/nios2/mm/ioremap.c b/arch/nios2/mm/ioremap.c
index 819bdfcc2e71..fe821efb9a99 100644
--- a/arch/nios2/mm/ioremap.c
+++ b/arch/nios2/mm/ioremap.c
@@ -86,11 +86,15 @@ static int remap_area_pages(unsigned long address, unsigned long phys_addr,
 	if (address >= end)
 		BUG();
 	do {
+		p4d_t *p4d;
 		pud_t *pud;
 		pmd_t *pmd;
 
 		error = -ENOMEM;
-		pud = pud_alloc(&init_mm, dir, address);
+		p4d = p4d_alloc(&init_mm, dir, address);
+		if (!p4d)
+			break;
+		pud = pud_alloc(&init_mm, p4d, address);
 		if (!pud)
 			break;
 		pmd = pmd_alloc(&init_mm, pud, address);
-- 
2.25.1


* [PATCH v4 07/14] openrisc: add support for folded p4d page tables
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (5 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 06/14] nios2: " Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 08/14] powerpc: " Mike Rapoport
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Implement the primitives necessary for folding the 4th level, add walks
of the p4d level where appropriate and remove usage of
__ARCH_USE_5LEVEL_HACK.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/openrisc/include/asm/pgtable.h |  1 -
 arch/openrisc/mm/fault.c            | 10 ++++++++--
 arch/openrisc/mm/init.c             |  4 +++-
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h
index 7f3fb9ceb083..219979e57790 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -21,7 +21,6 @@
 #ifndef __ASM_OPENRISC_PGTABLE_H
 #define __ASM_OPENRISC_PGTABLE_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 #ifndef __ASSEMBLY__
diff --git a/arch/openrisc/mm/fault.c b/arch/openrisc/mm/fault.c
index 8af1cc78c4fb..6e0a11ac4c00 100644
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -295,6 +295,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
 
 		int offset = pgd_index(address);
 		pgd_t *pgd, *pgd_k;
+		p4d_t *p4d, *p4d_k;
 		pud_t *pud, *pud_k;
 		pmd_t *pmd, *pmd_k;
 		pte_t *pte_k;
@@ -321,8 +322,13 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long address,
 		 * it exists.
 		 */
 
-		pud = pud_offset(pgd, address);
-		pud_k = pud_offset(pgd_k, address);
+		p4d = p4d_offset(pgd, address);
+		p4d_k = p4d_offset(pgd_k, address);
+		if (!p4d_present(*p4d_k))
+			goto no_context;
+
+		pud = pud_offset(p4d, address);
+		pud_k = pud_offset(p4d_k, address);
 		if (!pud_present(*pud_k))
 			goto no_context;
 
diff --git a/arch/openrisc/mm/init.c b/arch/openrisc/mm/init.c
index 1f87b524db78..2536aeae0975 100644
--- a/arch/openrisc/mm/init.c
+++ b/arch/openrisc/mm/init.c
@@ -71,6 +71,7 @@ static void __init map_ram(void)
 	unsigned long v, p, e;
 	pgprot_t prot;
 	pgd_t *pge;
+	p4d_t *p4e;
 	pud_t *pue;
 	pmd_t *pme;
 	pte_t *pte;
@@ -90,7 +91,8 @@ static void __init map_ram(void)
 
 		while (p < e) {
 			int j;
-			pue = pud_offset(pge, v);
+			p4e = p4d_offset(pge, v);
+			pue = pud_offset(p4e, v);
 			pme = pmd_offset(pue, v);
 
 			if ((u32) pue != (u32) pge || (u32) pme != (u32) pge) {
-- 
2.25.1


* [PATCH v4 08/14] powerpc: add support for folded p4d page tables
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (6 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 07/14] openrisc: " Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-06-03 19:05   ` Andrew Morton
  2020-04-14 15:34 ` [PATCH v4 09/14] sh: fault: Modernize printing of kernel messages Mike Rapoport
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Implement the primitives necessary for folding the 4th level, add walks
of the p4d level where appropriate and replace 5level-fixup.h with
pgtable-nop4d.h.
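
One detail worth calling out: powerpc's lockless walker,
__find_linux_pte(), deliberately operates on a stack copy of each
entry so that a parallel THP split/collapse or unmap cannot change it
mid-walk.  The conversion preserves that by snapshotting the p4d entry
instead of the pgd one, e.g. (a condensed sketch of the pattern, not
the full function):

	p4d_t p4d, *p4dp;
	pud_t pud, *pudp;

	p4dp = p4d_offset(pgdp, ea);
	p4d = READ_ONCE(*p4dp);		/* snapshot the entry */
	if (p4d_none(p4d))
		return NULL;

	pudp = pud_offset(&p4d, ea);	/* walk from the local copy */
	pud = READ_ONCE(*pudp);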

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> # 8xx and 83xx
---
 arch/powerpc/include/asm/book3s/32/pgtable.h  |  1 -
 arch/powerpc/include/asm/book3s/64/hash.h     |  4 +-
 arch/powerpc/include/asm/book3s/64/pgalloc.h  |  4 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 60 ++++++++++---------
 arch/powerpc/include/asm/book3s/64/radix.h    |  6 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h  |  1 -
 arch/powerpc/include/asm/nohash/64/pgalloc.h  |  2 +-
 .../include/asm/nohash/64/pgtable-4k.h        | 32 +++++-----
 arch/powerpc/include/asm/nohash/64/pgtable.h  |  6 +-
 arch/powerpc/include/asm/pgtable.h            | 10 ++--
 arch/powerpc/kvm/book3s_64_mmu_radix.c        | 32 ++++++----
 arch/powerpc/lib/code-patching.c              |  7 ++-
 arch/powerpc/mm/book3s64/hash_pgtable.c       |  4 +-
 arch/powerpc/mm/book3s64/radix_pgtable.c      | 26 +++++---
 arch/powerpc/mm/book3s64/subpage_prot.c       |  6 +-
 arch/powerpc/mm/hugetlbpage.c                 | 28 +++++----
 arch/powerpc/mm/nohash/book3e_pgtable.c       | 15 ++---
 arch/powerpc/mm/pgtable.c                     | 30 ++++++----
 arch/powerpc/mm/pgtable_64.c                  | 10 ++--
 arch/powerpc/mm/ptdump/hashpagetable.c        | 20 ++++++-
 arch/powerpc/mm/ptdump/ptdump.c               | 14 +++--
 arch/powerpc/xmon/xmon.c                      | 18 +++---
 22 files changed, 197 insertions(+), 139 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 7549393c4c43..6052b72216a6 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -2,7 +2,6 @@
 #ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
 #define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 #include <asm/book3s/32/hash.h>
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 6fc4520092c7..73ad038ed10b 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -134,9 +134,9 @@ static inline int get_region_id(unsigned long ea)
 
 #define	hash__pmd_bad(pmd)		(pmd_val(pmd) & H_PMD_BAD_BITS)
 #define	hash__pud_bad(pud)		(pud_val(pud) & H_PUD_BAD_BITS)
-static inline int hash__pgd_bad(pgd_t pgd)
+static inline int hash__p4d_bad(p4d_t p4d)
 {
-	return (pgd_val(pgd) == 0);
+	return (p4d_val(p4d) == 0);
 }
 #ifdef CONFIG_STRICT_KERNEL_RWX
 extern void hash__mark_rodata_ro(void);
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index a41e91bd0580..69c5b051734f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -85,9 +85,9 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 	kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
 }
 
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *pgd, pud_t *pud)
 {
-	*pgd =  __pgd(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
+	*pgd =  __p4d(__pgtable_ptr_val(pud) | PGD_VAL_BITS);
 }
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 368b136517e0..bc047514724c 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
 #define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
 
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
 
 #ifndef __ASSEMBLY__
 #include <linux/mmdebug.h>
@@ -251,7 +251,7 @@ extern unsigned long __pmd_frag_size_shift;
 /* Bits to mask out from a PUD to get to the PMD page */
 #define PUD_MASKED_BITS		0xc0000000000000ffUL
 /* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS		0xc0000000000000ffUL
+#define P4D_MASKED_BITS		0xc0000000000000ffUL
 
 /*
  * Used as an indicator for rcu callback functions
@@ -949,54 +949,60 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
 	return pte_access_permitted(pud_pte(pud), write);
 }
 
-#define pgd_write(pgd)		pte_write(pgd_pte(pgd))
+#define __p4d_raw(x)	((p4d_t) { __pgd_raw(x) })
+static inline __be64 p4d_raw(p4d_t x)
+{
+	return pgd_raw(x.pgd);
+}
+
+#define p4d_write(p4d)		pte_write(p4d_pte(p4d))
 
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
 {
-	*pgdp = __pgd(0);
+	*p4dp = __p4d(0);
 }
 
-static inline int pgd_none(pgd_t pgd)
+static inline int p4d_none(p4d_t p4d)
 {
-	return !pgd_raw(pgd);
+	return !p4d_raw(p4d);
 }
 
-static inline int pgd_present(pgd_t pgd)
+static inline int p4d_present(p4d_t p4d)
 {
-	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT));
+	return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PRESENT));
 }
 
-static inline pte_t pgd_pte(pgd_t pgd)
+static inline pte_t p4d_pte(p4d_t p4d)
 {
-	return __pte_raw(pgd_raw(pgd));
+	return __pte_raw(p4d_raw(p4d));
 }
 
-static inline pgd_t pte_pgd(pte_t pte)
+static inline p4d_t pte_p4d(pte_t pte)
 {
-	return __pgd_raw(pte_raw(pte));
+	return __p4d_raw(pte_raw(pte));
 }
 
-static inline int pgd_bad(pgd_t pgd)
+static inline int p4d_bad(p4d_t p4d)
 {
 	if (radix_enabled())
-		return radix__pgd_bad(pgd);
-	return hash__pgd_bad(pgd);
+		return radix__p4d_bad(p4d);
+	return hash__p4d_bad(p4d);
 }
 
-#define pgd_access_permitted pgd_access_permitted
-static inline bool pgd_access_permitted(pgd_t pgd, bool write)
+#define p4d_access_permitted p4d_access_permitted
+static inline bool p4d_access_permitted(p4d_t p4d, bool write)
 {
-	return pte_access_permitted(pgd_pte(pgd), write);
+	return pte_access_permitted(p4d_pte(p4d), write);
 }
 
-extern struct page *pgd_page(pgd_t pgd);
+extern struct page *p4d_page(p4d_t p4d);
 
 /* Pointers in the page table tree are physical addresses */
 #define __pgtable_ptr_val(ptr)	__pa(ptr)
 
 #define pmd_page_vaddr(pmd)	__va(pmd_val(pmd) & ~PMD_MASKED_BITS)
 #define pud_page_vaddr(pud)	__va(pud_val(pud) & ~PUD_MASKED_BITS)
-#define pgd_page_vaddr(pgd)	__va(pgd_val(pgd) & ~PGD_MASKED_BITS)
+#define p4d_page_vaddr(p4d)	__va(p4d_val(p4d) & ~P4D_MASKED_BITS)
 
 #define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
 #define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
@@ -1010,8 +1016,8 @@ extern struct page *pgd_page(pgd_t pgd);
 
 #define pgd_offset(mm, address)	 ((mm)->pgd + pgd_index(address))
 
-#define pud_offset(pgdp, addr)	\
-	(((pud_t *) pgd_page_vaddr(*(pgdp))) + pud_index(addr))
+#define pud_offset(p4dp, addr)	\
+	(((pud_t *) p4d_page_vaddr(*(p4dp))) + pud_index(addr))
 #define pmd_offset(pudp,addr) \
 	(((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
 #define pte_offset_kernel(dir,addr) \
@@ -1370,11 +1376,11 @@ static inline bool pud_is_leaf(pud_t pud)
 	return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
 }
 
-#define pgd_is_leaf pgd_is_leaf
-#define pgd_leaf pgd_is_leaf
-static inline bool pgd_is_leaf(pgd_t pgd)
+#define p4d_is_leaf p4d_is_leaf
+#define p4d_leaf p4d_is_leaf
+static inline bool p4d_is_leaf(p4d_t p4d)
 {
-	return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
+	return !!(p4d_raw(p4d) & cpu_to_be64(_PAGE_PTE));
 }
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 08c222d5b764..0cba794c4fb8 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -30,7 +30,7 @@
 /* Don't have anything in the reserved bits and leaf bits */
 #define RADIX_PMD_BAD_BITS		0x60000000000000e0UL
 #define RADIX_PUD_BAD_BITS		0x60000000000000e0UL
-#define RADIX_PGD_BAD_BITS		0x60000000000000e0UL
+#define RADIX_P4D_BAD_BITS		0x60000000000000e0UL
 
 #define RADIX_PMD_SHIFT		(PAGE_SHIFT + RADIX_PTE_INDEX_SIZE)
 #define RADIX_PUD_SHIFT		(RADIX_PMD_SHIFT + RADIX_PMD_INDEX_SIZE)
@@ -227,9 +227,9 @@ static inline int radix__pud_bad(pud_t pud)
 }
 
 
-static inline int radix__pgd_bad(pgd_t pgd)
+static inline int radix__p4d_bad(p4d_t p4d)
 {
-	return !!(pgd_val(pgd) & RADIX_PGD_BAD_BITS);
+	return !!(p4d_val(p4d) & RADIX_P4D_BAD_BITS);
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index b04ba257fddb..3d0bc99dd520 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -2,7 +2,6 @@
 #ifndef _ASM_POWERPC_NOHASH_32_PGTABLE_H
 #define _ASM_POWERPC_NOHASH_32_PGTABLE_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 #ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index b9534a793293..668aee6017e7 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -15,7 +15,7 @@ struct vmemmap_backing {
 };
 extern struct vmemmap_backing *vmemmap_list;
 
-#define pgd_populate(MM, PGD, PUD)	pgd_set(PGD, (unsigned long)PUD)
+#define p4d_populate(MM, P4D, PUD)	p4d_set(P4D, (unsigned long)PUD)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
index c40ec32b8194..81b1c54e3cf1 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
 #define _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
 
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
 
 /*
  * Entries per page directory level.  The PTE level must use a 64b record
@@ -45,41 +45,41 @@
 #define PMD_MASKED_BITS		0
 /* Bits to mask out from a PUD to get to the PMD page */
 #define PUD_MASKED_BITS		0
-/* Bits to mask out from a PGD to get to the PUD page */
-#define PGD_MASKED_BITS		0
+/* Bits to mask out from a P4D to get to the PUD page */
+#define P4D_MASKED_BITS		0
 
 
 /*
  * 4-level page tables related bits
  */
 
-#define pgd_none(pgd)		(!pgd_val(pgd))
-#define pgd_bad(pgd)		(pgd_val(pgd) == 0)
-#define pgd_present(pgd)	(pgd_val(pgd) != 0)
-#define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
+#define p4d_none(p4d)		(!p4d_val(p4d))
+#define p4d_bad(p4d)		(p4d_val(p4d) == 0)
+#define p4d_present(p4d)	(p4d_val(p4d) != 0)
+#define p4d_page_vaddr(p4d)	(p4d_val(p4d) & ~P4D_MASKED_BITS)
 
 #ifndef __ASSEMBLY__
 
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
 {
-	*pgdp = __pgd(0);
+	*p4dp = __p4d(0);
 }
 
-static inline pte_t pgd_pte(pgd_t pgd)
+static inline pte_t p4d_pte(p4d_t p4d)
 {
-	return __pte(pgd_val(pgd));
+	return __pte(p4d_val(p4d));
 }
 
-static inline pgd_t pte_pgd(pte_t pte)
+static inline p4d_t pte_p4d(pte_t pte)
 {
-	return __pgd(pte_val(pte));
+	return __p4d(pte_val(pte));
 }
-extern struct page *pgd_page(pgd_t pgd);
+extern struct page *p4d_page(p4d_t p4d);
 
 #endif /* !__ASSEMBLY__ */
 
-#define pud_offset(pgdp, addr)	\
-  (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
+#define pud_offset(p4dp, addr)	\
+  (((pud_t *) p4d_page_vaddr(*(p4dp))) + \
     (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
 
 #define pud_ERROR(e) \
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 9a33b8bd842d..b360f262b9c6 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -175,11 +175,11 @@ static inline pud_t pte_pud(pte_t pte)
 	return __pud(pte_val(pte));
 }
 #define pud_write(pud)		pte_write(pud_pte(pud))
-#define pgd_write(pgd)		pte_write(pgd_pte(pgd))
+#define p4d_write(p4d)		pte_write(p4d_pte(p4d))
 
-static inline void pgd_set(pgd_t *pgdp, unsigned long val)
+static inline void p4d_set(p4d_t *p4dp, unsigned long val)
 {
-	*pgdp = __pgd(val);
+	*p4dp = __p4d(val);
 }
 
 /*
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index b1f1d5339735..bad9b324559d 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -44,12 +44,12 @@ struct mm_struct;
 #ifdef CONFIG_PPC32
 static inline pmd_t *pmd_ptr(struct mm_struct *mm, unsigned long va)
 {
-	return pmd_offset(pud_offset(pgd_offset(mm, va), va), va);
+	return pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, va), va), va), va);
 }
 
 static inline pmd_t *pmd_ptr_k(unsigned long va)
 {
-	return pmd_offset(pud_offset(pgd_offset_k(va), va), va);
+	return pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
 }
 
 static inline pte_t *virt_to_kpte(unsigned long vaddr)
@@ -158,9 +158,9 @@ static inline bool pud_is_leaf(pud_t pud)
 }
 #endif
 
-#ifndef pgd_is_leaf
-#define pgd_is_leaf pgd_is_leaf
-static inline bool pgd_is_leaf(pgd_t pgd)
+#ifndef p4d_is_leaf
+#define p4d_is_leaf p4d_is_leaf
+static inline bool p4d_is_leaf(p4d_t p4d)
 {
 	return false;
 }
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 9f050064d2a2..454cf0dd1b5e 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -499,13 +499,14 @@ void kvmppc_free_pgtable_radix(struct kvm *kvm, pgd_t *pgd, unsigned int lpid)
 	unsigned long ig;
 
 	for (ig = 0; ig < PTRS_PER_PGD; ++ig, ++pgd) {
+		p4d_t *p4d = p4d_offset(pgd, 0);
 		pud_t *pud;
 
-		if (!pgd_present(*pgd))
+		if (!p4d_present(*p4d))
 			continue;
-		pud = pud_offset(pgd, 0);
+		pud = pud_offset(p4d, 0);
 		kvmppc_unmap_free_pud(kvm, pud, lpid);
-		pgd_clear(pgd);
+		p4d_clear(p4d);
 	}
 }
 
@@ -566,6 +567,7 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
 		      unsigned long *rmapp, struct rmap_nested **n_rmap)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud, *new_pud = NULL;
 	pmd_t *pmd, *new_pmd = NULL;
 	pte_t *ptep, *new_ptep = NULL;
@@ -573,9 +575,11 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
 
 	/* Traverse the guest's 2nd-level tree, allocate new levels needed */
 	pgd = pgtable + pgd_index(gpa);
+	p4d = p4d_offset(pgd, gpa);
+
 	pud = NULL;
-	if (pgd_present(*pgd))
-		pud = pud_offset(pgd, gpa);
+	if (p4d_present(*p4d))
+		pud = pud_offset(p4d, gpa);
 	else
 		new_pud = pud_alloc_one(kvm->mm, gpa);
 
@@ -596,13 +600,13 @@ int kvmppc_create_pte(struct kvm *kvm, pgd_t *pgtable, pte_t pte,
 
 	/* Now traverse again under the lock and change the tree */
 	ret = -ENOMEM;
-	if (pgd_none(*pgd)) {
+	if (p4d_none(*p4d)) {
 		if (!new_pud)
 			goto out_unlock;
-		pgd_populate(kvm->mm, pgd, new_pud);
+		p4d_populate(kvm->mm, p4d, new_pud);
 		new_pud = NULL;
 	}
-	pud = pud_offset(pgd, gpa);
+	pud = pud_offset(p4d, gpa);
 	if (pud_is_leaf(*pud)) {
 		unsigned long hgpa = gpa & PUD_MASK;
 
@@ -1219,7 +1223,8 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
 	unsigned long gpa;
 	pgd_t *pgt;
 	struct kvm_nested_guest *nested;
-	pgd_t pgd, *pgdp;
+	pgd_t *pgdp;
+	p4d_t p4d, *p4dp;
 	pud_t pud, *pudp;
 	pmd_t pmd, *pmdp;
 	pte_t *ptep;
@@ -1292,13 +1297,14 @@ static ssize_t debugfs_radix_read(struct file *file, char __user *buf,
 		}
 
 		pgdp = pgt + pgd_index(gpa);
-		pgd = READ_ONCE(*pgdp);
-		if (!(pgd_val(pgd) & _PAGE_PRESENT)) {
-			gpa = (gpa & PGDIR_MASK) + PGDIR_SIZE;
+		p4dp = p4d_offset(pgdp, gpa);
+		p4d = READ_ONCE(*p4dp);
+		if (!(p4d_val(p4d) & _PAGE_PRESENT)) {
+			gpa = (gpa & P4D_MASK) + P4D_SIZE;
 			continue;
 		}
 
-		pudp = pud_offset(&pgd, gpa);
+		pudp = pud_offset(&p4d, gpa);
 		pud = READ_ONCE(*pudp);
 		if (!(pud_val(pud) & _PAGE_PRESENT)) {
 			gpa = (gpa & PUD_MASK) + PUD_SIZE;
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 3345f039a876..7a59f6863cec 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -107,13 +107,18 @@ static inline int unmap_patch_area(unsigned long addr)
 	pte_t *ptep;
 	pmd_t *pmdp;
 	pud_t *pudp;
+	p4d_t *p4dp;
 	pgd_t *pgdp;
 
 	pgdp = pgd_offset_k(addr);
 	if (unlikely(!pgdp))
 		return -EINVAL;
 
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	if (unlikely(!p4dp))
+		return -EINVAL;
+
+	pudp = pud_offset(p4dp, addr);
 	if (unlikely(!pudp))
 		return -EINVAL;
 
diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
index 64733b9cb20a..9cd15937e88a 100644
--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
+++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
@@ -148,6 +148,7 @@ void hash__vmemmap_remove_mapping(unsigned long start,
 int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
@@ -155,7 +156,8 @@ int hash__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
 	BUILD_BUG_ON(TASK_SIZE_USER64 > H_PGTABLE_RANGE);
 	if (slab_is_available()) {
 		pgdp = pgd_offset_k(ea);
-		pudp = pud_alloc(&init_mm, pgdp, ea);
+		p4dp = p4d_offset(pgdp, ea);
+		pudp = pud_alloc(&init_mm, p4dp, ea);
 		if (!pudp)
 			return -ENOMEM;
 		pmdp = pmd_alloc(&init_mm, pudp, ea);
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 8f9edf07063a..97891ca0d428 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -65,17 +65,19 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
 {
 	unsigned long pfn = pa >> PAGE_SHIFT;
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
 
 	pgdp = pgd_offset_k(ea);
-	if (pgd_none(*pgdp)) {
+	p4dp = p4d_offset(pgdp, ea);
+	if (p4d_none(*p4dp)) {
 		pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
 						region_start, region_end);
-		pgd_populate(&init_mm, pgdp, pudp);
+		p4d_populate(&init_mm, p4dp, pudp);
 	}
-	pudp = pud_offset(pgdp, ea);
+	pudp = pud_offset(p4dp, ea);
 	if (map_page_size == PUD_SIZE) {
 		ptep = (pte_t *)pudp;
 		goto set_the_pte;
@@ -115,6 +117,7 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
 {
 	unsigned long pfn = pa >> PAGE_SHIFT;
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
@@ -137,7 +140,8 @@ static int __map_kernel_page(unsigned long ea, unsigned long pa,
 	 * boot.
 	 */
 	pgdp = pgd_offset_k(ea);
-	pudp = pud_alloc(&init_mm, pgdp, ea);
+	p4dp = p4d_offset(pgdp, ea);
+	pudp = pud_alloc(&init_mm, p4dp, ea);
 	if (!pudp)
 		return -ENOMEM;
 	if (map_page_size == PUD_SIZE) {
@@ -174,6 +178,7 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
 {
 	unsigned long idx;
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
@@ -186,7 +191,8 @@ void radix__change_memory_range(unsigned long start, unsigned long end,
 
 	for (idx = start; idx < end; idx += PAGE_SIZE) {
 		pgdp = pgd_offset_k(idx);
-		pudp = pud_alloc(&init_mm, pgdp, idx);
+		p4dp = p4d_offset(pgdp, idx);
+		pudp = pud_alloc(&init_mm, p4dp, idx);
 		if (!pudp)
 			continue;
 		if (pud_is_leaf(*pudp)) {
@@ -850,6 +856,7 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
 	unsigned long addr, next;
 	pud_t *pud_base;
 	pgd_t *pgd;
+	p4d_t *p4d;
 
 	spin_lock(&init_mm.page_table_lock);
 
@@ -857,15 +864,16 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
 		next = pgd_addr_end(addr, end);
 
 		pgd = pgd_offset_k(addr);
-		if (!pgd_present(*pgd))
+		p4d = p4d_offset(pgd, addr);
+		if (!p4d_present(*p4d))
 			continue;
 
-		if (pgd_is_leaf(*pgd)) {
-			split_kernel_mapping(addr, end, PGDIR_SIZE, (pte_t *)pgd);
+		if (p4d_is_leaf(*p4d)) {
+			split_kernel_mapping(addr, end, P4D_SIZE, (pte_t *)p4d);
 			continue;
 		}
 
-		pud_base = (pud_t *)pgd_page_vaddr(*pgd);
+		pud_base = (pud_t *)p4d_page_vaddr(*p4d);
 		remove_pud_table(pud_base, addr, next);
 	}
 
diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
index 2ef24a53f4c9..25a0c044bd93 100644
--- a/arch/powerpc/mm/book3s64/subpage_prot.c
+++ b/arch/powerpc/mm/book3s64/subpage_prot.c
@@ -54,15 +54,17 @@ static void hpte_flush_range(struct mm_struct *mm, unsigned long addr,
 			     int npages)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 	spinlock_t *ptl;
 
 	pgd = pgd_offset(mm, addr);
-	if (pgd_none(*pgd))
+	p4d = p4d_offset(pgd, addr);
+	if (p4d_none(*p4d))
 		return;
-	pud = pud_offset(pgd, addr);
+	pud = pud_offset(p4d, addr);
 	if (pud_none(*pud))
 		return;
 	pmd = pmd_offset(pud, addr);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 33b3461d91e8..54f5994d4cbb 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -119,6 +119,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
 {
 	pgd_t *pg;
+	p4d_t *p4;
 	pud_t *pu;
 	pmd_t *pm;
 	hugepd_t *hpdp = NULL;
@@ -128,20 +129,21 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 
 	addr &= ~(sz-1);
 	pg = pgd_offset(mm, addr);
+	p4 = p4d_offset(pg, addr);
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (pshift == PGDIR_SHIFT)
 		/* 16GB huge page */
-		return (pte_t *) pg;
+		return (pte_t *) p4;
 	else if (pshift > PUD_SHIFT) {
 		/*
 		 * We need to use hugepd table
 		 */
 		ptl = &mm->page_table_lock;
-		hpdp = (hugepd_t *)pg;
+		hpdp = (hugepd_t *)p4;
 	} else {
 		pdshift = PUD_SHIFT;
-		pu = pud_alloc(mm, pg, addr);
+		pu = pud_alloc(mm, p4, addr);
 		if (!pu)
 			return NULL;
 		if (pshift == PUD_SHIFT)
@@ -166,10 +168,10 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 #else
 	if (pshift >= PGDIR_SHIFT) {
 		ptl = &mm->page_table_lock;
-		hpdp = (hugepd_t *)pg;
+		hpdp = (hugepd_t *)p4;
 	} else {
 		pdshift = PUD_SHIFT;
-		pu = pud_alloc(mm, pg, addr);
+		pu = pud_alloc(mm, p4, addr);
 		if (!pu)
 			return NULL;
 		if (pshift >= PUD_SHIFT) {
@@ -390,7 +392,7 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
 	mm_dec_nr_pmds(tlb->mm);
 }
 
-static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
+static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
 				   unsigned long addr, unsigned long end,
 				   unsigned long floor, unsigned long ceiling)
 {
@@ -400,7 +402,7 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
 
 	start = addr;
 	do {
-		pud = pud_offset(pgd, addr);
+		pud = pud_offset(p4d, addr);
 		next = pud_addr_end(addr, end);
 		if (!is_hugepd(__hugepd(pud_val(*pud)))) {
 			if (pud_none_or_clear_bad(pud))
@@ -435,8 +437,8 @@ static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
 	if (end - 1 > ceiling - 1)
 		return;
 
-	pud = pud_offset(pgd, start);
-	pgd_clear(pgd);
+	pud = pud_offset(p4d, start);
+	p4d_clear(p4d);
 	pud_free_tlb(tlb, pud, start);
 	mm_dec_nr_puds(tlb->mm);
 }
@@ -449,6 +451,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 			    unsigned long floor, unsigned long ceiling)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	unsigned long next;
 
 	/*
@@ -471,10 +474,11 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 	do {
 		next = pgd_addr_end(addr, end);
 		pgd = pgd_offset(tlb->mm, addr);
+		p4d = p4d_offset(pgd, addr);
 		if (!is_hugepd(__hugepd(pgd_val(*pgd)))) {
-			if (pgd_none_or_clear_bad(pgd))
+			if (p4d_none_or_clear_bad(p4d))
 				continue;
-			hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling);
+			hugetlb_free_pud_range(tlb, p4d, addr, next, floor, ceiling);
 		} else {
 			unsigned long more;
 			/*
@@ -487,7 +491,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
 			if (more > next)
 				next = more;
 
-			free_hugepd_range(tlb, (hugepd_t *)pgd, PGDIR_SHIFT,
+			free_hugepd_range(tlb, (hugepd_t *)p4d, PGDIR_SHIFT,
 					  addr, next, floor, ceiling);
 		}
 	} while (addr = next, addr != end);
diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c
index 4637fdd469cf..77884e24281d 100644
--- a/arch/powerpc/mm/nohash/book3e_pgtable.c
+++ b/arch/powerpc/mm/nohash/book3e_pgtable.c
@@ -73,6 +73,7 @@ static void __init *early_alloc_pgtable(unsigned long size)
 int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
@@ -80,7 +81,8 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
 	BUILD_BUG_ON(TASK_SIZE_USER64 > PGTABLE_RANGE);
 	if (slab_is_available()) {
 		pgdp = pgd_offset_k(ea);
-		pudp = pud_alloc(&init_mm, pgdp, ea);
+		p4dp = p4d_offset(pgdp, ea);
+		pudp = pud_alloc(&init_mm, p4dp, ea);
 		if (!pudp)
 			return -ENOMEM;
 		pmdp = pmd_alloc(&init_mm, pudp, ea);
@@ -91,13 +93,12 @@ int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
 			return -ENOMEM;
 	} else {
 		pgdp = pgd_offset_k(ea);
-#ifndef __PAGETABLE_PUD_FOLDED
-		if (pgd_none(*pgdp)) {
-			pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
-			pgd_populate(&init_mm, pgdp, pudp);
+		p4dp = p4d_offset(pgdp, ea);
+		if (p4d_none(*p4dp)) {
+			pudp = early_alloc_pgtable(PUD_TABLE_SIZE);
+			p4d_populate(&init_mm, p4dp, pudp);
 		}
-#endif /* !__PAGETABLE_PUD_FOLDED */
-		pudp = pud_offset(pgdp, ea);
+		pudp = pud_offset(p4dp, ea);
 		if (pud_none(*pudp)) {
 			pmdp = early_alloc_pgtable(PMD_TABLE_SIZE);
 			pud_populate(&init_mm, pudp, pmdp);
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index e3759b69f81b..c2499271f6c1 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -265,6 +265,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 
@@ -272,7 +273,9 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
 		return;
 	pgd = mm->pgd + pgd_index(addr);
 	BUG_ON(pgd_none(*pgd));
-	pud = pud_offset(pgd, addr);
+	p4d = p4d_offset(pgd, addr);
+	BUG_ON(p4d_none(*p4d));
+	pud = pud_offset(p4d, addr);
 	BUG_ON(pud_none(*pud));
 	pmd = pmd_offset(pud, addr);
 	/*
@@ -312,12 +315,13 @@ EXPORT_SYMBOL_GPL(vmalloc_to_phys);
 pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
 			bool *is_thp, unsigned *hpage_shift)
 {
-	pgd_t pgd, *pgdp;
+	pgd_t *pgdp;
+	p4d_t p4d, *p4dp;
 	pud_t pud, *pudp;
 	pmd_t pmd, *pmdp;
 	pte_t *ret_pte;
 	hugepd_t *hpdp = NULL;
-	unsigned pdshift = PGDIR_SHIFT;
+	unsigned pdshift;
 
 	if (hpage_shift)
 		*hpage_shift = 0;
@@ -325,24 +329,28 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
 	if (is_thp)
 		*is_thp = false;
 
-	pgdp = pgdir + pgd_index(ea);
-	pgd  = READ_ONCE(*pgdp);
 	/*
 	 * Always operate on the local stack value. This make sure the
 	 * value don't get updated by a parallel THP split/collapse,
 	 * page fault or a page unmap. The return pte_t * is still not
 	 * stable. So should be checked there for above conditions.
+	 * Top level is an exception because it is folded into p4d.
 	 */
-	if (pgd_none(pgd))
+	pgdp = pgdir + pgd_index(ea);
+	p4dp = p4d_offset(pgdp, ea);
+	p4d  = READ_ONCE(*p4dp);
+	pdshift = P4D_SHIFT;
+
+	if (p4d_none(p4d))
 		return NULL;
 
-	if (pgd_is_leaf(pgd)) {
-		ret_pte = (pte_t *)pgdp;
+	if (p4d_is_leaf(p4d)) {
+		ret_pte = (pte_t *)p4dp;
 		goto out;
 	}
 
-	if (is_hugepd(__hugepd(pgd_val(pgd)))) {
-		hpdp = (hugepd_t *)&pgd;
+	if (is_hugepd(__hugepd(p4d_val(p4d)))) {
+		hpdp = (hugepd_t *)&p4d;
 		goto out_huge;
 	}
 
@@ -352,7 +360,7 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
 	 * irq disabled
 	 */
 	pdshift = PUD_SHIFT;
-	pudp = pud_offset(&pgd, ea);
+	pudp = pud_offset(&p4d, ea);
 	pud  = READ_ONCE(*pudp);
 
 	if (pud_none(pud))
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e78832dce7bb..1f86a88fd4bb 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -101,13 +101,13 @@ EXPORT_SYMBOL(__pte_frag_size_shift);
 
 #ifndef __PAGETABLE_PUD_FOLDED
 /* 4 level page table */
-struct page *pgd_page(pgd_t pgd)
+struct page *p4d_page(p4d_t p4d)
 {
-	if (pgd_is_leaf(pgd)) {
-		VM_WARN_ON(!pgd_huge(pgd));
-		return pte_page(pgd_pte(pgd));
+	if (p4d_is_leaf(p4d)) {
+		VM_WARN_ON(!p4d_huge(p4d));
+		return pte_page(p4d_pte(p4d));
 	}
-	return virt_to_page(pgd_page_vaddr(pgd));
+	return virt_to_page(p4d_page_vaddr(p4d));
 }
 #endif
 
diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c b/arch/powerpc/mm/ptdump/hashpagetable.c
index b6ed9578382f..6aaeb1eb3b9c 100644
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc/mm/ptdump/hashpagetable.c
@@ -417,9 +417,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
 	}
 }
 
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
 {
-	pud_t *pud = pud_offset(pgd, 0);
+	pud_t *pud = pud_offset(p4d, 0);
 	unsigned long addr;
 	unsigned int i;
 
@@ -431,6 +431,20 @@ static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
 	}
 }
 
+static void walk_p4d(struct pg_state *st, pgd_t *pgd, unsigned long start)
+{
+	p4d_t *p4d = p4d_offset(pgd, 0);
+	unsigned long addr;
+	unsigned int i;
+
+	for (i = 0; i < PTRS_PER_P4D; i++, p4d++) {
+		addr = start + i * P4D_SIZE;
+		if (!p4d_none(*p4d))
+			/* p4d exists */
+			walk_pud(st, p4d, addr);
+	}
+}
+
 static void walk_pagetables(struct pg_state *st)
 {
 	pgd_t *pgd = pgd_offset_k(0UL);
@@ -445,7 +459,7 @@ static void walk_pagetables(struct pg_state *st)
 		addr = KERN_VIRT_START + i * PGDIR_SIZE;
 		if (!pgd_none(*pgd))
 			/* pgd exists */
-			walk_pud(st, pgd, addr);
+			walk_p4d(st, pgd, addr);
 	}
 }
 
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index d92bb8ea229c..507cb9793b26 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -277,9 +277,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start)
 	}
 }
 
-static void walk_pud(struct pg_state *st, pgd_t *pgd, unsigned long start)
+static void walk_pud(struct pg_state *st, p4d_t *p4d, unsigned long start)
 {
-	pud_t *pud = pud_offset(pgd, 0);
+	pud_t *pud = pud_offset(p4d, 0);
 	unsigned long addr;
 	unsigned int i;
 
@@ -304,11 +304,13 @@ static void walk_pagetables(struct pg_state *st)
 	 * the hash pagetable.
 	 */
 	for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
-		if (!pgd_none(*pgd) && !pgd_is_leaf(*pgd))
-			/* pgd exists */
-			walk_pud(st, pgd, addr);
+		p4d_t *p4d = p4d_offset(pgd, 0);
+
+		if (!p4d_none(*p4d) && !p4d_is_leaf(*p4d))
+			/* p4d exists */
+			walk_pud(st, p4d, addr);
 		else
-			note_page(st, addr, 1, pgd_val(*pgd));
+			note_page(st, addr, 1, p4d_val(*p4d));
 	}
 }
 
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 7af840c0fc93..64be69cb0b13 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3136,6 +3136,7 @@ static void show_pte(unsigned long addr)
 	struct task_struct *tsk = NULL;
 	struct mm_struct *mm;
 	pgd_t *pgdp, *pgdir;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
@@ -3167,20 +3168,21 @@ static void show_pte(unsigned long addr)
 		pgdir = pgd_offset(mm, 0);
 	}
 
-	if (pgd_none(*pgdp)) {
-		printf("no linux page table for address\n");
+	p4dp = p4d_offset(pgdp, addr);
+
+	if (p4d_none(*p4dp)) {
+		printf("No valid P4D\n");
 		return;
 	}
 
-	printf("pgd  @ 0x%px\n", pgdir);
-
-	if (pgd_is_leaf(*pgdp)) {
-		format_pte(pgdp, pgd_val(*pgdp));
+	if (p4d_is_leaf(*p4dp)) {
+		format_pte(p4dp, p4d_val(*p4dp));
 		return;
 	}
-	printf("pgdp @ 0x%px = 0x%016lx\n", pgdp, pgd_val(*pgdp));
 
-	pudp = pud_offset(pgdp, addr);
+	printf("p4dp @ 0x%px = 0x%016lx\n", p4dp, p4d_val(*p4dp));
+
+	pudp = pud_offset(p4dp, addr);
 
 	if (pud_none(*pudp)) {
 		printf("No valid PUD\n");
-- 
2.25.1


* [PATCH v4 09/14] sh: fault: Modernize printing of kernel messages
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (7 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 08/14] powerpc: " Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 10/14] sh: drop __pXd_offset() macros that duplicate pXd_index() ones Mike Rapoport
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Geert Uytterhoeven <geert+renesas@glider.be>

  - Convert from printk() to pr_*(),
  - Add missing continuations,
  - Use "%llx" to format u64,
  - Join multiple prints in show_fault_oops() into a single print (a
    minimal sketch of the conversion pattern follows below).
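
A minimal sketch of the conversion pattern (illustration only; the
function and values here are made up, not taken from the patch):
pr_alert() opens a new KERN_ALERT record, while pr_cont() appends to the
record that the preceding print opened, so multi-part lines stay intact:

	#include <linux/printk.h>

	static void show_entry(unsigned long addr, u64 val)
	{
		pr_alert("[%08lx] *pgd=%016llx", addr, val);	/* opens the line */
		if (!val)
			pr_cont(" (none)");			/* continues it */
		pr_cont("\n");					/* terminates it */
	}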

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/sh/mm/fault.c | 39 ++++++++++++++++++---------------------
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 5f23d7907597..7b74e18b71d7 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -47,10 +47,10 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
 			pgd = swapper_pg_dir;
 	}
 
-	printk(KERN_ALERT "pgd = %p\n", pgd);
+	pr_alert("pgd = %p\n", pgd);
 	pgd += pgd_index(addr);
-	printk(KERN_ALERT "[%08lx] *pgd=%0*Lx", addr,
-	       (u32)(sizeof(*pgd) * 2), (u64)pgd_val(*pgd));
+	pr_alert("[%08lx] *pgd=%0*llx", addr, (u32)(sizeof(*pgd) * 2),
+		 (u64)pgd_val(*pgd));
 
 	do {
 		pud_t *pud;
@@ -61,33 +61,33 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
 			break;
 
 		if (pgd_bad(*pgd)) {
-			printk("(bad)");
+			pr_cont("(bad)");
 			break;
 		}
 
 		pud = pud_offset(pgd, addr);
 		if (PTRS_PER_PUD != 1)
-			printk(", *pud=%0*Lx", (u32)(sizeof(*pud) * 2),
-			       (u64)pud_val(*pud));
+			pr_cont(", *pud=%0*llx", (u32)(sizeof(*pud) * 2),
+				(u64)pud_val(*pud));
 
 		if (pud_none(*pud))
 			break;
 
 		if (pud_bad(*pud)) {
-			printk("(bad)");
+			pr_cont("(bad)");
 			break;
 		}
 
 		pmd = pmd_offset(pud, addr);
 		if (PTRS_PER_PMD != 1)
-			printk(", *pmd=%0*Lx", (u32)(sizeof(*pmd) * 2),
-			       (u64)pmd_val(*pmd));
+			pr_cont(", *pmd=%0*llx", (u32)(sizeof(*pmd) * 2),
+				(u64)pmd_val(*pmd));
 
 		if (pmd_none(*pmd))
 			break;
 
 		if (pmd_bad(*pmd)) {
-			printk("(bad)");
+			pr_cont("(bad)");
 			break;
 		}
 
@@ -96,11 +96,11 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
 			break;
 
 		pte = pte_offset_kernel(pmd, addr);
-		printk(", *pte=%0*Lx", (u32)(sizeof(*pte) * 2),
-		       (u64)pte_val(*pte));
+		pr_cont(", *pte=%0*llx", (u32)(sizeof(*pte) * 2),
+			(u64)pte_val(*pte));
 	} while (0);
 
-	printk("\n");
+	pr_cont("\n");
 }
 
 static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
@@ -188,14 +188,11 @@ show_fault_oops(struct pt_regs *regs, unsigned long address)
 	if (!oops_may_print())
 		return;
 
-	printk(KERN_ALERT "BUG: unable to handle kernel ");
-	if (address < PAGE_SIZE)
-		printk(KERN_CONT "NULL pointer dereference");
-	else
-		printk(KERN_CONT "paging request");
-
-	printk(KERN_CONT " at %08lx\n", address);
-	printk(KERN_ALERT "PC:");
+	pr_alert("BUG: unable to handle kernel %s at %08lx\n",
+		 address < PAGE_SIZE ? "NULL pointer dereference"
+				     : "paging request",
+		 address);
+	pr_alert("PC:");
 	printk_address(regs->pc, 1);
 
 	show_pte(NULL, address);
-- 
2.25.1


* [PATCH v4 10/14] sh: drop __pXd_offset() macros that duplicate pXd_index() ones
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (8 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 09/14] sh: fault: Modernize printing of kernel messages Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 11/14] sh: add support for folded p4d page tables Mike Rapoport
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

The __pXd_offset() macros are identical to the pXd_index() macros and there
is no point in keeping both. All architectures define and use pXd_index(),
so let's keep only those to make sh consistent with the rest of the kernel.
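
For illustration only (not part of the patch), the two spellings expand
to the same index computation, so a page table walk reads identically
with either one:

	/* Hypothetical walk; pgd_base and vaddr stand in for real variables. */
	pgd_t *pgd = pgd_base + pgd_index(vaddr);	/* was: pgd_base + __pgd_offset(vaddr) */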

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/sh/include/asm/pgtable_32.h | 5 ++---
 arch/sh/include/asm/pgtable_64.h | 5 ++---
 arch/sh/mm/init.c                | 6 +++---
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index 29274f0e428e..4acce5f2cbf9 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -407,13 +407,12 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 /* to find an entry in a page-table-directory. */
 #define pgd_index(address)	(((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
 #define pgd_offset(mm, address)	((mm)->pgd + pgd_index(address))
-#define __pgd_offset(address)	pgd_index(address)
 
 /* to find an entry in a kernel page-table-directory */
 #define pgd_offset_k(address)	pgd_offset(&init_mm, address)
 
-#define __pud_offset(address)	(((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
-#define __pmd_offset(address)	(((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
+#define pud_index(address)	(((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+#define pmd_index(address)	(((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
 
 /* Find an entry in the third-level page table.. */
 #define pte_index(address)	((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
diff --git a/arch/sh/include/asm/pgtable_64.h b/arch/sh/include/asm/pgtable_64.h
index 1778bc5971e7..27cc282ec6c0 100644
--- a/arch/sh/include/asm/pgtable_64.h
+++ b/arch/sh/include/asm/pgtable_64.h
@@ -46,14 +46,13 @@ static __inline__ void set_pte(pte_t *pteptr, pte_t pteval)
 
 /* To find an entry in a generic PGD. */
 #define pgd_index(address) (((address) >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
-#define __pgd_offset(address) pgd_index(address)
 #define pgd_offset(mm, address) ((mm)->pgd+pgd_index(address))
 
 /* To find an entry in a kernel PGD. */
 #define pgd_offset_k(address) pgd_offset(&init_mm, address)
 
-#define __pud_offset(address)	(((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
-#define __pmd_offset(address)	(((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))
+#define pud_index(address)	(((address) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
+/* #define pmd_index(address)	(((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1)) */
 
 /*
  * PMD level access routines. Same notes as above.
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index b9de2d4fa57e..f445ba630790 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -172,9 +172,9 @@ void __init page_table_range_init(unsigned long start, unsigned long end,
 	unsigned long vaddr;
 
 	vaddr = start;
-	i = __pgd_offset(vaddr);
-	j = __pud_offset(vaddr);
-	k = __pmd_offset(vaddr);
+	i = pgd_index(vaddr);
+	j = pud_index(vaddr);
+	k = pmd_index(vaddr);
 	pgd = pgd_base + i;
 
 	for ( ; (i < PTRS_PER_PGD) && (vaddr != end); pgd++, i++) {
-- 
2.25.1


* [PATCH v4 11/14] sh: add support for folded p4d page tables
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (9 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 10/14] sh: drop __pXd_offset() macros that duplicate pXd_index() ones Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 12/14] unicore32: remove __ARCH_USE_5LEVEL_HACK Mike Rapoport
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Implement primitives necessary for the 4th level folding, add walks of the
p4d level where appropriate, and remove usage of __ARCH_USE_5LEVEL_HACK.
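
The recurring pattern, sketched here for illustration rather than copied
from any single hunk below, is to insert an explicit p4d step between the
pgd and pud levels; since sh now pulls in the folding from
<asm-generic/pgtable-nop4d.h>, the extra step compiles down to the old
two-step walk:

	/* Before (__ARCH_USE_5LEVEL_HACK): the p4d level was hidden. */
	pud = pud_offset(pgd, addr);

	/* After: the folded p4d level is walked explicitly. */
	p4d = p4d_offset(pgd, addr);
	if (p4d_none(*p4d) || p4d_bad(*p4d))
		return NULL;
	pud = pud_offset(p4d, addr);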

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/sh/include/asm/pgtable-2level.h |  1 -
 arch/sh/include/asm/pgtable-3level.h |  1 -
 arch/sh/kernel/io_trapped.c          |  7 ++++++-
 arch/sh/mm/cache-sh4.c               |  4 +++-
 arch/sh/mm/cache-sh5.c               |  7 ++++++-
 arch/sh/mm/fault.c                   | 26 +++++++++++++++++++++++---
 arch/sh/mm/hugetlbpage.c             | 28 ++++++++++++++++++----------
 arch/sh/mm/init.c                    |  9 ++++++++-
 arch/sh/mm/kmap.c                    |  2 +-
 arch/sh/mm/tlbex_32.c                |  6 +++++-
 arch/sh/mm/tlbex_64.c                |  7 ++++++-
 11 files changed, 76 insertions(+), 22 deletions(-)

diff --git a/arch/sh/include/asm/pgtable-2level.h b/arch/sh/include/asm/pgtable-2level.h
index bf1eb51c3ee5..08bff93927ff 100644
--- a/arch/sh/include/asm/pgtable-2level.h
+++ b/arch/sh/include/asm/pgtable-2level.h
@@ -2,7 +2,6 @@
 #ifndef __ASM_SH_PGTABLE_2LEVEL_H
 #define __ASM_SH_PGTABLE_2LEVEL_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 
 /*
diff --git a/arch/sh/include/asm/pgtable-3level.h b/arch/sh/include/asm/pgtable-3level.h
index 779260b721ca..0f80097e5c9c 100644
--- a/arch/sh/include/asm/pgtable-3level.h
+++ b/arch/sh/include/asm/pgtable-3level.h
@@ -2,7 +2,6 @@
 #ifndef __ASM_SH_PGTABLE_3LEVEL_H
 #define __ASM_SH_PGTABLE_3LEVEL_H
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 
 /*
diff --git a/arch/sh/kernel/io_trapped.c b/arch/sh/kernel/io_trapped.c
index 60c828a2b8a2..037aab2708b7 100644
--- a/arch/sh/kernel/io_trapped.c
+++ b/arch/sh/kernel/io_trapped.c
@@ -136,6 +136,7 @@ EXPORT_SYMBOL_GPL(match_trapped_io_handler);
 static struct trapped_io *lookup_tiop(unsigned long address)
 {
 	pgd_t *pgd_k;
+	p4d_t *p4d_k;
 	pud_t *pud_k;
 	pmd_t *pmd_k;
 	pte_t *pte_k;
@@ -145,7 +146,11 @@ static struct trapped_io *lookup_tiop(unsigned long address)
 	if (!pgd_present(*pgd_k))
 		return NULL;
 
-	pud_k = pud_offset(pgd_k, address);
+	p4d_k = p4d_offset(pgd_k, address);
+	if (!p4d_present(*p4d_k))
+		return NULL;
+
+	pud_k = pud_offset(p4d_k, address);
 	if (!pud_present(*pud_k))
 		return NULL;
 
diff --git a/arch/sh/mm/cache-sh4.c b/arch/sh/mm/cache-sh4.c
index eee911422cf9..45943bcb7042 100644
--- a/arch/sh/mm/cache-sh4.c
+++ b/arch/sh/mm/cache-sh4.c
@@ -209,6 +209,7 @@ static void sh4_flush_cache_page(void *args)
 	unsigned long address, pfn, phys;
 	int map_coherent = 0;
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
@@ -224,7 +225,8 @@ static void sh4_flush_cache_page(void *args)
 		return;
 
 	pgd = pgd_offset(vma->vm_mm, address);
-	pud = pud_offset(pgd, address);
+	p4d = p4d_offset(pgd, address);
+	pud = pud_offset(p4d, address);
 	pmd = pmd_offset(pud, address);
 	pte = pte_offset_kernel(pmd, address);
 
diff --git a/arch/sh/mm/cache-sh5.c b/arch/sh/mm/cache-sh5.c
index 445b5e69b73c..442a77cc2957 100644
--- a/arch/sh/mm/cache-sh5.c
+++ b/arch/sh/mm/cache-sh5.c
@@ -383,6 +383,7 @@ static void sh64_dcache_purge_user_pages(struct mm_struct *mm,
 				unsigned long addr, unsigned long end)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
@@ -397,7 +398,11 @@ static void sh64_dcache_purge_user_pages(struct mm_struct *mm,
 	if (pgd_bad(*pgd))
 		return;
 
-	pud = pud_offset(pgd, addr);
+	p4d = p4d_offset(pgd, addr);
+	if (p4d_none(*p4d) || p4d_bad(*p4d))
+		return;
+
+	pud = pud_offset(p4d, addr);
 	if (pud_none(*pud) || pud_bad(*pud))
 		return;
 
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index 7b74e18b71d7..8b3ab65c81c4 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -53,6 +53,7 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
 		 (u64)pgd_val(*pgd));
 
 	do {
+		p4d_t *p4d;
 		pud_t *pud;
 		pmd_t *pmd;
 		pte_t *pte;
@@ -65,7 +66,20 @@ static void show_pte(struct mm_struct *mm, unsigned long addr)
 			break;
 		}
 
-		pud = pud_offset(pgd, addr);
+		p4d = p4d_offset(pgd, addr);
+		if (PTRS_PER_P4D != 1)
+			pr_cont(", *p4d=%0*Lx", (u32)(sizeof(*p4d) * 2),
+			        (u64)p4d_val(*p4d));
+
+		if (p4d_none(*p4d))
+			break;
+
+		if (p4d_bad(*p4d)) {
+			pr_cont("(bad)");
+			break;
+		}
+
+		pud = pud_offset(p4d, addr);
 		if (PTRS_PER_PUD != 1)
 			pr_cont(", *pud=%0*llx", (u32)(sizeof(*pud) * 2),
 				(u64)pud_val(*pud));
@@ -107,6 +121,7 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
 {
 	unsigned index = pgd_index(address);
 	pgd_t *pgd_k;
+	p4d_t *p4d, *p4d_k;
 	pud_t *pud, *pud_k;
 	pmd_t *pmd, *pmd_k;
 
@@ -116,8 +131,13 @@ static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
 	if (!pgd_present(*pgd_k))
 		return NULL;
 
-	pud = pud_offset(pgd, address);
-	pud_k = pud_offset(pgd_k, address);
+	p4d = p4d_offset(pgd, address);
+	p4d_k = p4d_offset(pgd_k, address);
+	if (!p4d_present(*p4d_k))
+		return NULL;
+
+	pud = pud_offset(p4d, address);
+	pud_k = pud_offset(p4d_k, address);
 	if (!pud_present(*pud_k))
 		return NULL;
 
diff --git a/arch/sh/mm/hugetlbpage.c b/arch/sh/mm/hugetlbpage.c
index 960deb1f24a1..acd5652a0de3 100644
--- a/arch/sh/mm/hugetlbpage.c
+++ b/arch/sh/mm/hugetlbpage.c
@@ -26,17 +26,21 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
 			unsigned long addr, unsigned long sz)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte = NULL;
 
 	pgd = pgd_offset(mm, addr);
 	if (pgd) {
-		pud = pud_alloc(mm, pgd, addr);
-		if (pud) {
-			pmd = pmd_alloc(mm, pud, addr);
-			if (pmd)
-				pte = pte_alloc_map(mm, pmd, addr);
+		p4d = p4d_alloc(mm, pgd, addr);
+		if (p4d) {
+			pud = pud_alloc(mm, p4d, addr);
+			if (pud) {
+				pmd = pmd_alloc(mm, pud, addr);
+				if (pmd)
+					pte = pte_alloc_map(mm, pmd, addr);
+			}
 		}
 	}
 
@@ -47,17 +51,21 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
 		       unsigned long addr, unsigned long sz)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte = NULL;
 
 	pgd = pgd_offset(mm, addr);
 	if (pgd) {
-		pud = pud_offset(pgd, addr);
-		if (pud) {
-			pmd = pmd_offset(pud, addr);
-			if (pmd)
-				pte = pte_offset_map(pmd, addr);
+		p4d = p4d_offset(pgd, addr);
+		if (p4d) {
+			pud = pud_offset(p4d, addr);
+			if (pud) {
+				pmd = pmd_offset(pud, addr);
+				if (pmd)
+					pte = pte_offset_map(pmd, addr);
+			}
 		}
 	}
 
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index f445ba630790..60c33cf7acd1 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -45,6 +45,7 @@ void __init __weak plat_mem_setup(void)
 static pte_t *__get_pte_phys(unsigned long addr)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 
@@ -54,7 +55,13 @@ static pte_t *__get_pte_phys(unsigned long addr)
 		return NULL;
 	}
 
-	pud = pud_alloc(NULL, pgd, addr);
+	p4d = p4d_alloc(NULL, pgd, addr);
+	if (unlikely(!p4d)) {
+		p4d_ERROR(*p4d);
+		return NULL;
+	}
+
+	pud = pud_alloc(NULL, p4d, addr);
 	if (unlikely(!pud)) {
 		pud_ERROR(*pud);
 		return NULL;
diff --git a/arch/sh/mm/kmap.c b/arch/sh/mm/kmap.c
index 9e6b38b03cf7..0e7039137f5a 100644
--- a/arch/sh/mm/kmap.c
+++ b/arch/sh/mm/kmap.c
@@ -15,7 +15,7 @@
 #include <asm/cacheflush.h>
 
 #define kmap_get_fixmap_pte(vaddr)                                     \
-	pte_offset_kernel(pmd_offset(pud_offset(pgd_offset_k(vaddr), (vaddr)), (vaddr)), (vaddr))
+	pte_offset_kernel(pmd_offset(pud_offset(p4d_offset(pgd_offset_k(vaddr), (vaddr)), (vaddr)), (vaddr)), vaddr)
 
 static pte_t *kmap_coherent_pte;
 
diff --git a/arch/sh/mm/tlbex_32.c b/arch/sh/mm/tlbex_32.c
index 382262dc0c4b..1c53868632ee 100644
--- a/arch/sh/mm/tlbex_32.c
+++ b/arch/sh/mm/tlbex_32.c
@@ -23,6 +23,7 @@ handle_tlbmiss(struct pt_regs *regs, unsigned long error_code,
 	       unsigned long address)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
@@ -42,7 +43,10 @@ handle_tlbmiss(struct pt_regs *regs, unsigned long error_code,
 		pgd = pgd_offset(current->mm, address);
 	}
 
-	pud = pud_offset(pgd, address);
+	p4d = p4d_offset(pgd, address);
+	if (p4d_none_or_clear_bad(p4d))
+		return 1;
+	pud = pud_offset(p4d, address);
 	if (pud_none_or_clear_bad(pud))
 		return 1;
 	pmd = pmd_offset(pud, address);
diff --git a/arch/sh/mm/tlbex_64.c b/arch/sh/mm/tlbex_64.c
index 8ff966dd0c74..0d015f7556fa 100644
--- a/arch/sh/mm/tlbex_64.c
+++ b/arch/sh/mm/tlbex_64.c
@@ -44,6 +44,7 @@ static int handle_tlbmiss(unsigned long long protection_flags,
 			  unsigned long address)
 {
 	pgd_t *pgd;
+	p4d_t *p4d;
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
@@ -58,7 +59,11 @@ static int handle_tlbmiss(unsigned long long protection_flags,
 		pgd = pgd_offset(current->mm, address);
 	}
 
-	pud = pud_offset(pgd, address);
+	p4d = p4d_offset(pgd, address);
+	if (p4d_none(*p4d) || !p4d_present(*p4d))
+		return 1;
+
+	pud = pud_offset(p4d, address);
 	if (pud_none(*pud) || !pud_present(*pud))
 		return 1;
 
-- 
2.25.1


* [PATCH v4 12/14] unicore32: remove __ARCH_USE_5LEVEL_HACK
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (10 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 11/14] sh: add support for folded p4d page tables Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 13/14] asm-generic: remove pgtable-nop4d-hack.h Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 14/14] mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h Mike Rapoport
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

The unicore32 architecture has 2-level page tables and uses
asm-generic/pgtable-nopmd.h along with explicit casts from pud_t to pgd_t
for page table folding.

Add a p4d walk in the only place that actually unfolds the pud level, and
remove __ARCH_USE_5LEVEL_HACK.
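
For reference, the generic folding in <asm-generic/pgtable-nop4d.h> makes
the added step free at runtime: the p4d type just wraps a pgd entry and
p4d_offset() reinterprets the pointer (paraphrased from the generic
header):

	typedef struct { pgd_t pgd; } p4d_t;

	static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
	{
		return (p4d_t *)pgd;
	}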

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/unicore32/include/asm/pgtable.h | 1 -
 arch/unicore32/kernel/hibernate.c    | 4 +++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/unicore32/include/asm/pgtable.h b/arch/unicore32/include/asm/pgtable.h
index 3b8731b3a937..826f49edd94e 100644
--- a/arch/unicore32/include/asm/pgtable.h
+++ b/arch/unicore32/include/asm/pgtable.h
@@ -9,7 +9,6 @@
 #ifndef __UNICORE_PGTABLE_H__
 #define __UNICORE_PGTABLE_H__
 
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 #include <asm/cpu-single.h>
 
diff --git a/arch/unicore32/kernel/hibernate.c b/arch/unicore32/kernel/hibernate.c
index f3812245cc00..ccad051a79b6 100644
--- a/arch/unicore32/kernel/hibernate.c
+++ b/arch/unicore32/kernel/hibernate.c
@@ -33,9 +33,11 @@ struct swsusp_arch_regs swsusp_arch_regs_cpu0;
 static pmd_t *resume_one_md_table_init(pgd_t *pgd)
 {
 	pud_t *pud;
+	p4d_t *p4d;
 	pmd_t *pmd_table;
 
-	pud = pud_offset(pgd, 0);
+	p4d = p4d_offset(pgd, 0);
+	pud = pud_offset(p4d, 0);
 	pmd_table = pmd_offset(pud, 0);
 
 	return pmd_table;
-- 
2.25.1


* [PATCH v4 13/14] asm-generic: remove pgtable-nop4d-hack.h
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (11 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 12/14] unicore32: remove __ARCH_USE_5LEVEL_HACK Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  2020-04-14 15:34 ` [PATCH v4 14/14] mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h Mike Rapoport
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

No architecture defines __ARCH_USE_5LEVEL_HACK any more, so
pgtable-nop4d-hack.h will never actually be included.

Remove it.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 include/asm-generic/pgtable-nop4d-hack.h | 64 ------------------------
 include/asm-generic/pgtable-nopud.h      |  4 --
 2 files changed, 68 deletions(-)
 delete mode 100644 include/asm-generic/pgtable-nop4d-hack.h

diff --git a/include/asm-generic/pgtable-nop4d-hack.h b/include/asm-generic/pgtable-nop4d-hack.h
deleted file mode 100644
index 829bdb0d6327..000000000000
--- a/include/asm-generic/pgtable-nop4d-hack.h
+++ /dev/null
@@ -1,64 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _PGTABLE_NOP4D_HACK_H
-#define _PGTABLE_NOP4D_HACK_H
-
-#ifndef __ASSEMBLY__
-#include <asm-generic/5level-fixup.h>
-
-#define __PAGETABLE_PUD_FOLDED 1
-
-/*
- * Having the pud type consist of a pgd gets the size right, and allows
- * us to conceptually access the pgd entry that this pud is folded into
- * without casting.
- */
-typedef struct { pgd_t pgd; } pud_t;
-
-#define PUD_SHIFT	PGDIR_SHIFT
-#define PTRS_PER_PUD	1
-#define PUD_SIZE	(1UL << PUD_SHIFT)
-#define PUD_MASK	(~(PUD_SIZE-1))
-
-/*
- * The "pgd_xxx()" functions here are trivial for a folded two-level
- * setup: the pud is never bad, and a pud always exists (as it's folded
- * into the pgd entry)
- */
-static inline int pgd_none(pgd_t pgd)		{ return 0; }
-static inline int pgd_bad(pgd_t pgd)		{ return 0; }
-static inline int pgd_present(pgd_t pgd)	{ return 1; }
-static inline void pgd_clear(pgd_t *pgd)	{ }
-#define pud_ERROR(pud)				(pgd_ERROR((pud).pgd))
-
-#define pgd_populate(mm, pgd, pud)		do { } while (0)
-#define pgd_populate_safe(mm, pgd, pud)		do { } while (0)
-/*
- * (puds are folded into pgds so this doesn't get actually called,
- * but the define is needed for a generic inline function.)
- */
-#define set_pgd(pgdptr, pgdval)	set_pud((pud_t *)(pgdptr), (pud_t) { pgdval })
-
-static inline pud_t *pud_offset(pgd_t *pgd, unsigned long address)
-{
-	return (pud_t *)pgd;
-}
-
-#define pud_val(x)				(pgd_val((x).pgd))
-#define __pud(x)				((pud_t) { __pgd(x) })
-
-#define pgd_page(pgd)				(pud_page((pud_t){ pgd }))
-#define pgd_page_vaddr(pgd)			(pud_page_vaddr((pud_t){ pgd }))
-
-/*
- * allocating and freeing a pud is trivial: the 1-entry pud is
- * inside the pgd, so has no extra memory associated with it.
- */
-#define pud_alloc_one(mm, address)		NULL
-#define pud_free(mm, x)				do { } while (0)
-#define __pud_free_tlb(tlb, x, a)		do { } while (0)
-
-#undef  pud_addr_end
-#define pud_addr_end(addr, end)			(end)
-
-#endif /* __ASSEMBLY__ */
-#endif /* _PGTABLE_NOP4D_HACK_H */
diff --git a/include/asm-generic/pgtable-nopud.h b/include/asm-generic/pgtable-nopud.h
index d3776cb494c0..ad05c1684bfc 100644
--- a/include/asm-generic/pgtable-nopud.h
+++ b/include/asm-generic/pgtable-nopud.h
@@ -4,9 +4,6 @@
 
 #ifndef __ASSEMBLY__
 
-#ifdef __ARCH_USE_5LEVEL_HACK
-#include <asm-generic/pgtable-nop4d-hack.h>
-#else
 #include <asm-generic/pgtable-nop4d.h>
 
 #define __PAGETABLE_PUD_FOLDED 1
@@ -65,5 +62,4 @@ static inline pud_t *pud_offset(p4d_t *p4d, unsigned long address)
 #define pud_addr_end(addr, end)			(end)
 
 #endif /* __ASSEMBLY__ */
-#endif /* !__ARCH_USE_5LEVEL_HACK */
 #endif /* _PGTABLE_NOPUD_H */
-- 
2.25.1


* [PATCH v4 14/14] mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h
  2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
                   ` (12 preceding siblings ...)
  2020-04-14 15:34 ` [PATCH v4 13/14] asm-generic: remove pgtable-nop4d-hack.h Mike Rapoport
@ 2020-04-14 15:34 ` Mike Rapoport
  13 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-04-14 15:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

There are no architectures that use include/asm-generic/5level-fixup.h,
so it can be removed along with the __ARCH_HAS_5LEVEL_HACK define and the
code it guards.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 include/asm-generic/5level-fixup.h | 58 ------------------------------
 include/linux/mm.h                 |  6 ----
 mm/kasan/init.c                    | 11 ------
 mm/memory.c                        |  8 -----
 4 files changed, 83 deletions(-)
 delete mode 100644 include/asm-generic/5level-fixup.h

diff --git a/include/asm-generic/5level-fixup.h b/include/asm-generic/5level-fixup.h
deleted file mode 100644
index 4c74b1c1d13b..000000000000
--- a/include/asm-generic/5level-fixup.h
+++ /dev/null
@@ -1,58 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _5LEVEL_FIXUP_H
-#define _5LEVEL_FIXUP_H
-
-#define __ARCH_HAS_5LEVEL_HACK
-#define __PAGETABLE_P4D_FOLDED 1
-
-#define P4D_SHIFT			PGDIR_SHIFT
-#define P4D_SIZE			PGDIR_SIZE
-#define P4D_MASK			PGDIR_MASK
-#define MAX_PTRS_PER_P4D		1
-#define PTRS_PER_P4D			1
-
-#define p4d_t				pgd_t
-
-#define pud_alloc(mm, p4d, address) \
-	((unlikely(pgd_none(*(p4d))) && __pud_alloc(mm, p4d, address)) ? \
-		NULL : pud_offset(p4d, address))
-
-#define p4d_alloc(mm, pgd, address)	(pgd)
-#define p4d_offset(pgd, start)		(pgd)
-
-#ifndef __ASSEMBLY__
-static inline int p4d_none(p4d_t p4d)
-{
-	return 0;
-}
-
-static inline int p4d_bad(p4d_t p4d)
-{
-	return 0;
-}
-
-static inline int p4d_present(p4d_t p4d)
-{
-	return 1;
-}
-#endif
-
-#define p4d_ERROR(p4d)			do { } while (0)
-#define p4d_clear(p4d)			pgd_clear(p4d)
-#define p4d_val(p4d)			pgd_val(p4d)
-#define p4d_populate(mm, p4d, pud)	pgd_populate(mm, p4d, pud)
-#define p4d_populate_safe(mm, p4d, pud)	pgd_populate(mm, p4d, pud)
-#define p4d_page(p4d)			pgd_page(p4d)
-#define p4d_page_vaddr(p4d)		pgd_page_vaddr(p4d)
-
-#define __p4d(x)			__pgd(x)
-#define set_p4d(p4dp, p4d)		set_pgd(p4dp, p4d)
-
-#undef p4d_free_tlb
-#define p4d_free_tlb(tlb, x, addr)	do { } while (0)
-#define p4d_free(mm, x)			do { } while (0)
-
-#undef  p4d_addr_end
-#define p4d_addr_end(addr, end)		(end)
-
-#endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5a323422d783..f794b77df1ca 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2060,11 +2060,6 @@ int __pte_alloc_kernel(pmd_t *pmd);
 
 #if defined(CONFIG_MMU)
 
-/*
- * The following ifdef needed to get the 5level-fixup.h header to work.
- * Remove it when 5level-fixup.h has been removed.
- */
-#ifndef __ARCH_HAS_5LEVEL_HACK
 static inline p4d_t *p4d_alloc(struct mm_struct *mm, pgd_t *pgd,
 		unsigned long address)
 {
@@ -2078,7 +2073,6 @@ static inline pud_t *pud_alloc(struct mm_struct *mm, p4d_t *p4d,
 	return (unlikely(p4d_none(*p4d)) && __pud_alloc(mm, p4d, address)) ?
 		NULL : pud_offset(p4d, address);
 }
-#endif /* !__ARCH_HAS_5LEVEL_HACK */
 
 static inline pmd_t *pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 {
diff --git a/mm/kasan/init.c b/mm/kasan/init.c
index ce45c491ebcd..fe6be0be1f76 100644
--- a/mm/kasan/init.c
+++ b/mm/kasan/init.c
@@ -250,20 +250,9 @@ int __ref kasan_populate_early_shadow(const void *shadow_start,
 			 * 3,2 - level page tables where we don't have
 			 * puds,pmds, so pgd_populate(), pud_populate()
 			 * is noops.
-			 *
-			 * The ifndef is required to avoid build breakage.
-			 *
-			 * With 5level-fixup.h, pgd_populate() is not nop and
-			 * we reference kasan_early_shadow_p4d. It's not defined
-			 * unless 5-level paging enabled.
-			 *
-			 * The ifndef can be dropped once all KASAN-enabled
-			 * architectures will switch to pgtable-nop4d.h.
 			 */
-#ifndef __ARCH_HAS_5LEVEL_HACK
 			pgd_populate(&init_mm, pgd,
 					lm_alias(kasan_early_shadow_p4d));
-#endif
 			p4d = p4d_offset(pgd, addr);
 			p4d_populate(&init_mm, p4d,
 					lm_alias(kasan_early_shadow_pud));
diff --git a/mm/memory.c b/mm/memory.c
index f703fe8c8346..379277c631b4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4434,19 +4434,11 @@ int __pud_alloc(struct mm_struct *mm, p4d_t *p4d, unsigned long address)
 	smp_wmb(); /* See comment in __pte_alloc */
 
 	spin_lock(&mm->page_table_lock);
-#ifndef __ARCH_HAS_5LEVEL_HACK
 	if (!p4d_present(*p4d)) {
 		mm_inc_nr_puds(mm);
 		p4d_populate(mm, p4d, new);
 	} else	/* Another has populated it */
 		pud_free(mm, new);
-#else
-	if (!pgd_present(*p4d)) {
-		mm_inc_nr_puds(mm);
-		pgd_populate(mm, p4d, new);
-	} else	/* Another has populated it */
-		pud_free(mm, new);
-#endif /* __ARCH_HAS_5LEVEL_HACK */
 	spin_unlock(&mm->page_table_lock);
 	return 0;
 }
-- 
2.25.1


* Re: [PATCH v4 02/14] arm: add support for folded p4d page tables
       [not found]   ` <CGME20200507121658eucas1p240cf4a3e0fe5c22dda5ec4f72734149f@eucas1p2.samsung.com>
@ 2020-05-07 12:16     ` Marek Szyprowski
  2020-05-07 16:11       ` Mike Rapoport
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Szyprowski @ 2020-05-07 12:16 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, Fenghua Yu,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras,
	Michael Ellerman, Will Deacon, kvmarm, Jonas Bonn, Brian Cain,
	linux-hexagon, linux-sh, Russell King, Ley Foon Tan,
	Mike Rapoport, Catalin Marinas, uclinux-h8-devel, linux-arch,
	Arnd Bergmann, Bartlomiej Zolnierkiewicz, Łukasz Stelmach,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Marc Zyngier, nios2-dev,
	linuxppc-dev

Hi

On 14.04.2020 17:34, Mike Rapoport wrote:
> From: Mike Rapoport <rppt@linux.ibm.com>
>
> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
>
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

Today I've noticed that kexec is broken on 32-bit ARM. Bisecting between 
current linux-next and v5.7-rc1 pointed to this commit. I've tested this 
on Odroid XU4 and Raspberry Pi4 boards. Here is the relevant log:

# kexec --kexec-syscall -l zImage --append "$(cat /proc/cmdline)"
memory_range[0]:0x40000000..0xbe9fffff
memory_range[0]:0x40000000..0xbe9fffff
# kexec -e
kexec_core: Starting new kernel
8<--- cut here ---
Unable to handle kernel paging request at virtual address c010f1f4
pgd = c6817793
[c010f1f4] *pgd=4000041e(bad)
Internal error: Oops: 80d [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 1329 Comm: kexec Tainted: G        W 5.7.0-rc3-00127-g6cba81ed0f62 #611
Hardware name: Samsung Exynos (Flattened Device Tree)
PC is at machine_kexec+0x40/0xfc
LR is at 0xffffffff
pc : [<c010f0b4>]    lr : [<ffffffff>]    psr: 60000013
sp : ebc13e60  ip : 40008000  fp : 00000001
r10: 00000058  r9 : fee1dead  r8 : 00000001
r7 : c121387c  r6 : 6c224000  r5 : ece40c00  r4 : ec222000
r3 : c010f1f4  r2 : c1100000  r1 : c1100000  r0 : 418d0000
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 6bc14059  DAC: 00000051
Process kexec (pid: 1329, stack limit = 0x366bb4dc)
Stack: (0xebc13e60 to 0xebc14000)
...
[<c010f0b4>] (machine_kexec) from [<c01c0d84>] (kernel_kexec+0x74/0x7c)
[<c01c0d84>] (kernel_kexec) from [<c014b1bc>] (__do_sys_reboot+0x1f8/0x210)
[<c014b1bc>] (__do_sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x28)
Exception stack(0xebc13fa8 to 0xebc13ff0)
...
---[ end trace 3e8d6c81723c778d ]---
1329 Segmentation fault      ./kexec -e

> ---
>   arch/arm/include/asm/pgtable.h     |  1 -
>   arch/arm/lib/uaccess_with_memcpy.c |  7 +++++-
>   arch/arm/mach-sa1100/assabet.c     |  2 +-
>   arch/arm/mm/dump.c                 | 29 +++++++++++++++++-----
>   arch/arm/mm/fault-armv.c           |  7 +++++-
>   arch/arm/mm/fault.c                | 22 ++++++++++------
>   arch/arm/mm/idmap.c                |  3 ++-
>   arch/arm/mm/init.c                 |  2 +-
>   arch/arm/mm/ioremap.c              | 12 ++++++---
>   arch/arm/mm/mm.h                   |  2 +-
>   arch/arm/mm/mmu.c                  | 35 +++++++++++++++++++++-----
>   arch/arm/mm/pgd.c                  | 40 ++++++++++++++++++++++++------
>   12 files changed, 125 insertions(+), 37 deletions(-)
>
> ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


* Re: [PATCH v4 02/14] arm: add support for folded p4d page tables
  2020-05-07 12:16     ` Marek Szyprowski
@ 2020-05-07 16:11       ` Mike Rapoport
  2020-05-08  6:53         ` Marek Szyprowski
  0 siblings, 1 reply; 25+ messages in thread
From: Mike Rapoport @ 2020-05-07 16:11 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, Fenghua Yu,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras,
	Michael Ellerman, Will Deacon, kvmarm, Jonas Bonn, Brian Cain,
	linux-hexagon, linux-sh, Russell King, Ley Foon Tan,
	Catalin Marinas, uclinux-h8-devel, linux-arch, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Łukasz Stelmach, kvm-ppc,
	Stefan Kristiansson, openrisc, Stafford Horne, Guan Xuetao,
	linux-arm-kernel, Christophe Leroy, Tony Luck, Yoshinori Sato,
	linux-kernel, Marc Zyngier, nios2-dev, Andrew Morton,
	linuxppc-dev, Mike Rapoport

Hi,

On Thu, May 07, 2020 at 02:16:56PM +0200, Marek Szyprowski wrote:
> Hi
> 
> On 14.04.2020 17:34, Mike Rapoport wrote:
> > From: Mike Rapoport <rppt@linux.ibm.com>
> >
> > Implement primitives necessary for the 4th level folding, add walks of p4d
> > level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
> >
> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> 
> Today I've noticed that kexec is broken on ARM 32bit. Bisecting between 
> current linux-next and v5.7-rc1 pointed to this commit. I've tested this 
> on Odroid XU4 and Raspberry Pi4 boards. Here is the relevant log:
> 
> # kexec --kexec-syscall -l zImage --append "$(cat /proc/cmdline)"
> memory_range[0]:0x40000000..0xbe9fffff
> memory_range[0]:0x40000000..0xbe9fffff
> # kexec -e
> kexec_core: Starting new kernel
> 8<--- cut here ---
> Unable to handle kernel paging request at virtual address c010f1f4
> pgd = c6817793
> [c010f1f4] *pgd=4000041e(bad)
> Internal error: Oops: 80d [#1] PREEMPT ARM
> Modules linked in:
> CPU: 0 PID: 1329 Comm: kexec Tainted: G        W 5.7.0-rc3-00127-g6cba81ed0f62 #611
> Hardware name: Samsung Exynos (Flattened Device Tree)
> PC is at machine_kexec+0x40/0xfc

Any chance you have the debug info in this kernel?
scripts/faddr2line would come in handy here.

> LR is at 0xffffffff
> pc : [<c010f0b4>]    lr : [<ffffffff>]    psr: 60000013
> sp : ebc13e60  ip : 40008000  fp : 00000001
> r10: 00000058  r9 : fee1dead  r8 : 00000001
> r7 : c121387c  r6 : 6c224000  r5 : ece40c00  r4 : ec222000
> r3 : c010f1f4  r2 : c1100000  r1 : c1100000  r0 : 418d0000
> Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> Control: 10c5387d  Table: 6bc14059  DAC: 00000051
> Process kexec (pid: 1329, stack limit = 0x366bb4dc)
> Stack: (0xebc13e60 to 0xebc14000)
> ...
> [<c010f0b4>] (machine_kexec) from [<c01c0d84>] (kernel_kexec+0x74/0x7c)
> [<c01c0d84>] (kernel_kexec) from [<c014b1bc>] (__do_sys_reboot+0x1f8/0x210)
> [<c014b1bc>] (__do_sys_reboot) from [<c0100060>] (ret_fast_syscall+0x0/0x28)
> Exception stack(0xebc13fa8 to 0xebc13ff0)
> ...
> ---[ end trace 3e8d6c81723c778d ]---
> 1329 Segmentation fault      ./kexec -e
> 
> > ---
> >   arch/arm/include/asm/pgtable.h     |  1 -
> >   arch/arm/lib/uaccess_with_memcpy.c |  7 +++++-
> >   arch/arm/mach-sa1100/assabet.c     |  2 +-
> >   arch/arm/mm/dump.c                 | 29 +++++++++++++++++-----
> >   arch/arm/mm/fault-armv.c           |  7 +++++-
> >   arch/arm/mm/fault.c                | 22 ++++++++++------
> >   arch/arm/mm/idmap.c                |  3 ++-
> >   arch/arm/mm/init.c                 |  2 +-
> >   arch/arm/mm/ioremap.c              | 12 ++++++---
> >   arch/arm/mm/mm.h                   |  2 +-
> >   arch/arm/mm/mmu.c                  | 35 +++++++++++++++++++++-----
> >   arch/arm/mm/pgd.c                  | 40 ++++++++++++++++++++++++------
> >   12 files changed, 125 insertions(+), 37 deletions(-)
> >
> > ...
> 
> Best regards
> -- 
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
> 

-- 
Sincerely yours,
Mike.

* Re: [PATCH v4 02/14] arm: add support for folded p4d page tables
  2020-05-07 16:11       ` Mike Rapoport
@ 2020-05-08  6:53         ` Marek Szyprowski
  2020-05-08 17:42           ` Mike Rapoport
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Szyprowski @ 2020-05-08  6:53 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, Fenghua Yu,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras,
	Michael Ellerman, Will Deacon, kvmarm, Jonas Bonn, Brian Cain,
	linux-hexagon, linux-sh, Russell King, Ley Foon Tan,
	Catalin Marinas, uclinux-h8-devel, linux-arch, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Łukasz Stelmach, kvm-ppc,
	Stefan Kristiansson, openrisc, Stafford Horne, Guan Xuetao,
	linux-arm-kernel, Christophe Leroy, Tony Luck, Yoshinori Sato,
	linux-kernel, Marc Zyngier, nios2-dev, Andrew Morton,
	linuxppc-dev, Mike Rapoport

Hi Mike,

On 07.05.2020 18:11, Mike Rapoport wrote:
> On Thu, May 07, 2020 at 02:16:56PM +0200, Marek Szyprowski wrote:
>> On 14.04.2020 17:34, Mike Rapoport wrote:
>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>> Implement primitives necessary for the 4th level folding, add walks of p4d
>>> level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
>>>
>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>> Today I've noticed that kexec is broken on ARM 32bit. Bisecting between
>> current linux-next and v5.7-rc1 pointed to this commit. I've tested this
>> on Odroid XU4 and Raspberry Pi4 boards. Here is the relevant log:
>>
>> # kexec --kexec-syscall -l zImage --append "$(cat /proc/cmdline)"
>> memory_range[0]:0x40000000..0xbe9fffff
>> memory_range[0]:0x40000000..0xbe9fffff
>> # kexec -e
>> kexec_core: Starting new kernel
>> 8<--- cut here ---
>> Unable to handle kernel paging request at virtual address c010f1f4
>> pgd = c6817793
>> [c010f1f4] *pgd=4000041e(bad)
>> Internal error: Oops: 80d [#1] PREEMPT ARM
>> Modules linked in:
>> CPU: 0 PID: 1329 Comm: kexec Tainted: G        W
>> 5.7.0-rc3-00127-g6cba81ed0f62 #611
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>> PC is at machine_kexec+0x40/0xfc
> Any chance you have the debug info in this kernel?
> scripts/faddr2line would come handy here.

# ./scripts/faddr2line --list vmlinux machine_kexec+0x40
machine_kexec+0x40/0xf8:

machine_kexec at arch/arm/kernel/machine_kexec.c:182
  177            reboot_code_buffer = page_address(image->control_code_page);
  178
  179            /* Prepare parameters for reboot_code_buffer*/
  180            set_kernel_text_rw();
  181            kexec_start_address = image->start;
 >182<           kexec_indirection_page = page_list;
  183            kexec_mach_type = machine_arch_type;
  184            kexec_boot_atags = image->arch.kernel_r2;
  185
  186            /* copy our kernel relocation code to the control code page */
  187            reboot_entry = fncpy(reboot_code_buffer,

 > ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


* Re: [PATCH v4 02/14] arm: add support for folded p4d page tables
  2020-05-08  6:53         ` Marek Szyprowski
@ 2020-05-08 17:42           ` Mike Rapoport
  2020-05-11  6:36             ` Marek Szyprowski
  0 siblings, 1 reply; 25+ messages in thread
From: Mike Rapoport @ 2020-05-08 17:42 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, Fenghua Yu,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras,
	Michael Ellerman, Will Deacon, kvmarm, Jonas Bonn, Brian Cain,
	linux-hexagon, linux-sh, Russell King, Ley Foon Tan,
	Catalin Marinas, uclinux-h8-devel, linux-arch, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Łukasz Stelmach, kvm-ppc,
	Stefan Kristiansson, openrisc, Stafford Horne, Guan Xuetao,
	linux-arm-kernel, Christophe Leroy, Tony Luck, Yoshinori Sato,
	linux-kernel, Marc Zyngier, nios2-dev, Andrew Morton,
	linuxppc-dev, Mike Rapoport

On Fri, May 08, 2020 at 08:53:27AM +0200, Marek Szyprowski wrote:
> Hi Mike,
> 
> On 07.05.2020 18:11, Mike Rapoport wrote:
> > On Thu, May 07, 2020 at 02:16:56PM +0200, Marek Szyprowski wrote:
> >> On 14.04.2020 17:34, Mike Rapoport wrote:
> >>> From: Mike Rapoport <rppt@linux.ibm.com>
> >>>
> >>> Implement primitives necessary for the 4th level folding, add walks of p4d
> >>> level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
> >>>
> >>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> >> Today I've noticed that kexec is broken on ARM 32bit. Bisecting between
> >> current linux-next and v5.7-rc1 pointed to this commit. I've tested this
> >> on Odroid XU4 and Raspberry Pi4 boards. Here is the relevant log:
> >>
> >> # kexec --kexec-syscall -l zImage --append "$(cat /proc/cmdline)"
> >> memory_range[0]:0x40000000..0xbe9fffff
> >> memory_range[0]:0x40000000..0xbe9fffff
> >> # kexec -e
> >> kexec_core: Starting new kernel
> >> 8<--- cut here ---
> >> Unable to handle kernel paging request at virtual address c010f1f4
> >> pgd = c6817793
> >> [c010f1f4] *pgd=4000041e(bad)
> >> Internal error: Oops: 80d [#1] PREEMPT ARM
> >> Modules linked in:
> >> CPU: 0 PID: 1329 Comm: kexec Tainted: G        W 5.7.0-rc3-00127-g6cba81ed0f62 #611
> >> Hardware name: Samsung Exynos (Flattened Device Tree)
> >> PC is at machine_kexec+0x40/0xfc
> > Any chance you have the debug info in this kernel?
> > scripts/faddr2line would come handy here.
> 
> # ./scripts/faddr2line --list vmlinux machine_kexec+0x40
> machine_kexec+0x40/0xf8:
> 
> machine_kexec at arch/arm/kernel/machine_kexec.c:182
>   177            reboot_code_buffer = page_address(image->control_code_page);
>   178
>   179            /* Prepare parameters for reboot_code_buffer*/
>   180            set_kernel_text_rw();
>   181            kexec_start_address = image->start;
>  >182<           kexec_indirection_page = page_list;
>   183            kexec_mach_type = machine_arch_type;
>   184            kexec_boot_atags = image->arch.kernel_r2;
>   185
>   186            /* copy our kernel relocation code to the control code page */
>   187            reboot_entry = fncpy(reboot_code_buffer,

Can you please try the patch below:

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 963b5284d284..f86b3d17928e 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -571,7 +571,7 @@ static inline void section_update(unsigned long addr, pmdval_t mask,
 {
 	pmd_t *pmd;
 
-	pmd = pmd_off_k(addr);
+	pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, addr), addr), addr), addr);
 
 #ifdef CONFIG_ARM_LPAE
 	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
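
(For context, a sketch of the distinction, assuming the pmd_off_k()
definition from the accessor-consolidation series in mmotm: pmd_off_k()
always walks init_mm's kernel page tables, roughly

	static inline pmd_t *pmd_off_k(unsigned long va)
	{
		return pmd_offset(pud_offset(p4d_offset(pgd_offset_k(va), va), va), va);
	}

while section_update() must modify the page tables of the mm it was
handed, hence the open-coded walk from pgd_offset(mm, addr) above.)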

>  > ...
> 
> Best regards
> -- 
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
> 

-- 
Sincerely yours,
Mike.

* Re: [PATCH v4 02/14] arm: add support for folded p4d page tables
  2020-05-08 17:42           ` Mike Rapoport
@ 2020-05-11  6:36             ` Marek Szyprowski
  2020-05-11 14:15               ` Mike Rapoport
  0 siblings, 1 reply; 25+ messages in thread
From: Marek Szyprowski @ 2020-05-11  6:36 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, Fenghua Yu,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras,
	Michael Ellerman, Will Deacon, kvmarm, Jonas Bonn, Brian Cain,
	linux-hexagon, linux-sh, Russell King, Ley Foon Tan,
	Catalin Marinas, uclinux-h8-devel, linux-arch, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Łukasz Stelmach, kvm-ppc,
	Stefan Kristiansson, openrisc, Stafford Horne, Guan Xuetao,
	linux-arm-kernel, Christophe Leroy, Tony Luck, Yoshinori Sato,
	linux-kernel, Marc Zyngier, nios2-dev, Andrew Morton,
	linuxppc-dev, Mike Rapoport

Hi Mike,

On 08.05.2020 19:42, Mike Rapoport wrote:
> On Fri, May 08, 2020 at 08:53:27AM +0200, Marek Szyprowski wrote:
>> On 07.05.2020 18:11, Mike Rapoport wrote:
>>> On Thu, May 07, 2020 at 02:16:56PM +0200, Marek Szyprowski wrote:
>>>> On 14.04.2020 17:34, Mike Rapoport wrote:
>>>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>>>
>>>>> Implement primitives necessary for the 4th level folding, add walks of p4d
>>>>> level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
>>>>>
>>>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>>> Today I've noticed that kexec is broken on ARM 32bit. Bisecting between
>>>> current linux-next and v5.7-rc1 pointed to this commit. I've tested this
>>>> on Odroid XU4 and Raspberry Pi4 boards. Here is the relevant log:
>>>>
>>>> # kexec --kexec-syscall -l zImage --append "$(cat /proc/cmdline)"
>>>> memory_range[0]:0x40000000..0xbe9fffff
>>>> memory_range[0]:0x40000000..0xbe9fffff
>>>> # kexec -e
>>>> kexec_core: Starting new kernel
>>>> 8<--- cut here ---
>>>> Unable to handle kernel paging request at virtual address c010f1f4
>>>> pgd = c6817793
>>>> [c010f1f4] *pgd=4000041e(bad)
>>>> Internal error: Oops: 80d [#1] PREEMPT ARM
>>>> Modules linked in:
>>>> CPU: 0 PID: 1329 Comm: kexec Tainted: G        W 5.7.0-rc3-00127-g6cba81ed0f62 #611
>>>> Hardware name: Samsung Exynos (Flattened Device Tree)
>>>> PC is at machine_kexec+0x40/0xfc
>>> Any chance you have the debug info in this kernel?
>>> scripts/faddr2line would come handy here.
>> # ./scripts/faddr2line --list vmlinux machine_kexec+0x40
>> machine_kexec+0x40/0xf8:
>>
>> machine_kexec at arch/arm/kernel/machine_kexec.c:182
>>    177            reboot_code_buffer = page_address(image->control_code_page);
>>    178
>>    179            /* Prepare parameters for reboot_code_buffer*/
>>    180            set_kernel_text_rw();
>>    181            kexec_start_address = image->start;
>>   >182<           kexec_indirection_page = page_list;
>>    183            kexec_mach_type = machine_arch_type;
>>    184            kexec_boot_atags = image->arch.kernel_r2;
>>    185
>>    186            /* copy our kernel relocation code to the control code page */
>>    187            reboot_entry = fncpy(reboot_code_buffer,
> Can you please try the patch below:
>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 963b5284d284..f86b3d17928e 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -571,7 +571,7 @@ static inline void section_update(unsigned long addr, pmdval_t mask,
>   {
>   	pmd_t *pmd;
>   
> -	pmd = pmd_off_k(addr);
> +	pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, addr), addr), addr), addr);
>   
>   #ifdef CONFIG_ARM_LPAE
>   	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
This fixes kexec issue! Thanks!


Feel free to add:

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 218f1c390557 ("arm: add support for folded p4d page tables")
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


* Re: [PATCH v4 02/14] arm: add support for folded p4d page tables
  2020-05-11  6:36             ` Marek Szyprowski
@ 2020-05-11 14:15               ` Mike Rapoport
  0 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-05-11 14:15 UTC (permalink / raw)
  To: Marek Szyprowski, Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, Fenghua Yu,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras,
	Michael Ellerman, Will Deacon, kvmarm, Jonas Bonn, Brian Cain,
	linux-hexagon, linux-sh, Russell King, Ley Foon Tan,
	Catalin Marinas, uclinux-h8-devel, linux-arch, Arnd Bergmann,
	Bartlomiej Zolnierkiewicz, Łukasz Stelmach, kvm-ppc,
	Stefan Kristiansson, openrisc, Stafford Horne, Guan Xuetao,
	linux-arm-kernel, Christophe Leroy, Tony Luck, Yoshinori Sato,
	linux-kernel, Marc Zyngier, nios2-dev, linuxppc-dev,
	Mike Rapoport

Hi Marek,

On Mon, May 11, 2020 at 08:36:41AM +0200, Marek Szyprowski wrote:
> Hi Mike,
> 
> On 08.05.2020 19:42, Mike Rapoport wrote:
> > On Fri, May 08, 2020 at 08:53:27AM +0200, Marek Szyprowski wrote:
> >> On 07.05.2020 18:11, Mike Rapoport wrote:
> >>> On Thu, May 07, 2020 at 02:16:56PM +0200, Marek Szyprowski wrote:
> >>>> On 14.04.2020 17:34, Mike Rapoport wrote:
> >>>>> From: Mike Rapoport <rppt@linux.ibm.com>
> >>>>>
> >>>>> Implement primitives necessary for the 4th level folding, add walks of p4d
> >>>>> level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
> >>>>>
> >>>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> > Can you please try the patch below:
> >
> > diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> > index 963b5284d284..f86b3d17928e 100644
> > --- a/arch/arm/mm/init.c
> > +++ b/arch/arm/mm/init.c
> > @@ -571,7 +571,7 @@ static inline void section_update(unsigned long addr, pmdval_t mask,
> >   {
> >   	pmd_t *pmd;
> >   
> > -	pmd = pmd_off_k(addr);
> > +	pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, addr), addr), addr), addr);
> >   
> >   #ifdef CONFIG_ARM_LPAE
> >   	pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);
> This fixes kexec issue! Thanks!
> 
> 
> Feel free to add:
> 
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Fixes: 218f1c390557 ("arm: add support for folded p4d page tables")
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

Thanks for testing!

The patch is still in the mmotm tree, so I don't think "Fixes" applies.

Andrew, would you like me to send the fix as a formal patch, or will you
pick it up as a fixup?

> Best regards
> -- 
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
> 

-- 
Sincerely yours,
Mike.

* Re: [PATCH v4 03/14] arm64: add support for folded p4d page tables
  2020-04-14 15:34 ` [PATCH v4 03/14] arm64: " Mike Rapoport
@ 2020-05-15 18:40   ` Andrew Morton
  2020-05-16 17:20     ` Mike Rapoport
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2020-05-15 18:40 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev

On Tue, 14 Apr 2020 18:34:44 +0300 Mike Rapoport <rppt@kernel.org> wrote:

> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
> remove __ARCH_USE_5LEVEL_HACK.

This needed some rework due to the arm changes in linux-next. Could you
please check my handiwork and test it once I've merged this into
linux-next?

Rejects were

--- arch/arm64/include/asm/pgtable.h~arm64-add-support-for-folded-p4d-page-tables
+++ arch/arm64/include/asm/pgtable.h
@@ -596,49 +604,50 @@ static inline phys_addr_t pud_page_paddr
 
 #define pud_ERROR(pud)		__pud_error(__FILE__, __LINE__, pud_val(pud))
 
-#define pgd_none(pgd)		(!pgd_val(pgd))
-#define pgd_bad(pgd)		(!(pgd_val(pgd) & 2))
-#define pgd_present(pgd)	(pgd_val(pgd))
+#define p4d_none(p4d)		(!p4d_val(p4d))
+#define p4d_bad(p4d)		(!(p4d_val(p4d) & 2))
+#define p4d_present(p4d)	(p4d_val(p4d))
 
-static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
+static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
-	if (in_swapper_pgdir(pgdp)) {
-		set_swapper_pgd(pgdp, pgd);
+	if (in_swapper_pgdir(p4dp)) {
+		set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
 		return;
 	}
 
-	WRITE_ONCE(*pgdp, pgd);
+	WRITE_ONCE(*p4dp, p4d);
 	dsb(ishst);
 	isb();
 }
 
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
 {
-	set_pgd(pgdp, __pgd(0));
+	set_p4d(p4dp, __p4d(0));
 }
 
-static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
+static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
 {
-	return __pgd_to_phys(pgd);
+	return __p4d_to_phys(p4d);
 }
 
 /* Find an entry in the first-level page table. */
 #define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
 
-#define pud_offset_phys(dir, addr)	(pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
+#define pud_offset_phys(dir, addr)	(p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
 #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
 
 #define pud_set_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
-#define pud_set_fixmap_offset(pgd, addr)	pud_set_fixmap(pud_offset_phys(pgd, addr))
+#define pud_set_fixmap_offset(p4d, addr)	pud_set_fixmap(pud_offset_phys(p4d, addr))
 #define pud_clear_fixmap()		clear_fixmap(FIX_PUD)
 
-#define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd)))
+#define p4d_page(p4d)		pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
 
 /* use ONLY for statically allocated translation tables */
 #define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
 
 #else
 
+#define p4d_page_paddr(p4d)	({ BUILD_BUG(); 0;})
 #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
 
 /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */



and

--- arch/arm64/kvm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
+++ arch/arm64/kvm/mmu.c
@@ -469,7 +517,7 @@ static void stage2_flush_memslot(struct
 	do {
 		next = stage2_pgd_addr_end(kvm, addr, end);
 		if (!stage2_pgd_none(kvm, *pgd))
-			stage2_flush_puds(kvm, pgd, addr, next);
+			stage2_flush_p4ds(kvm, pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
 


Result:

From: Mike Rapoport <rppt@linux.ibm.com>
Subject: arm64: add support for folded p4d page tables

Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
remove __ARCH_USE_5LEVEL_HACK.

Link: http://lkml.kernel.org/r/20200414153455.21744-4-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: James Morse <james.morse@arm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/include/asm/kvm_mmu.h        |   10 -
 arch/arm64/include/asm/pgalloc.h        |   10 -
 arch/arm64/include/asm/pgtable-types.h  |    5 
 arch/arm64/include/asm/pgtable.h        |   37 ++-
 arch/arm64/include/asm/stage2_pgtable.h |   48 +++--
 arch/arm64/kernel/hibernate.c           |   44 +++-
 arch/arm64/kvm/mmu.c                    |  209 ++++++++++++++++++----
 arch/arm64/mm/fault.c                   |    9 
 arch/arm64/mm/hugetlbpage.c             |   15 +
 arch/arm64/mm/kasan_init.c              |   26 ++
 arch/arm64/mm/mmu.c                     |   52 +++--
 arch/arm64/mm/pageattr.c                |    7 
 12 files changed, 368 insertions(+), 104 deletions(-)

--- a/arch/arm64/include/asm/kvm_mmu.h~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/include/asm/kvm_mmu.h
@@ -172,8 +172,8 @@ void kvm_clear_hyp_idmap(void);
 	__pmd(__phys_to_pmd_val(__pa(ptep)) | PMD_TYPE_TABLE)
 #define kvm_mk_pud(pmdp)					\
 	__pud(__phys_to_pud_val(__pa(pmdp)) | PMD_TYPE_TABLE)
-#define kvm_mk_pgd(pudp)					\
-	__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
+#define kvm_mk_p4d(pmdp)					\
+	__p4d(__phys_to_p4d_val(__pa(pmdp)) | PUD_TYPE_TABLE)
 
 #define kvm_set_pud(pudp, pud)		set_pud(pudp, pud)
 
@@ -299,6 +299,12 @@ static inline bool kvm_s2pud_young(pud_t
 #define hyp_pud_table_empty(pudp) kvm_page_empty(pudp)
 #endif
 
+#ifdef __PAGETABLE_P4D_FOLDED
+#define hyp_p4d_table_empty(p4dp) (0)
+#else
+#define hyp_p4d_table_empty(p4dp) kvm_page_empty(p4dp)
+#endif
+
 struct kvm;
 
 #define kvm_flush_dcache_to_poc(a,l)	__flush_dcache_area((a), (l))
--- a/arch/arm64/include/asm/pgalloc.h~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/include/asm/pgalloc.h
@@ -73,17 +73,17 @@ static inline void pud_free(struct mm_st
 	free_page((unsigned long)pudp);
 }
 
-static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
+static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
 {
-	set_pgd(pgdp, __pgd(__phys_to_pgd_val(pudp) | prot));
+	set_p4d(p4dp, __p4d(__phys_to_p4d_val(pudp) | prot));
 }
 
-static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
+static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
 {
-	__pgd_populate(pgdp, __pa(pudp), PUD_TYPE_TABLE);
+	__p4d_populate(p4dp, __pa(pudp), PUD_TYPE_TABLE);
 }
 #else
-static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
+static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
 {
 	BUILD_BUG();
 }
--- a/arch/arm64/include/asm/pgtable.h~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/include/asm/pgtable.h
@@ -298,6 +298,11 @@ static inline pte_t pgd_pte(pgd_t pgd)
 	return __pte(pgd_val(pgd));
 }
 
+static inline pte_t p4d_pte(p4d_t p4d)
+{
+	return __pte(p4d_val(p4d));
+}
+
 static inline pte_t pud_pte(pud_t pud)
 {
 	return __pte(pud_val(pud));
@@ -401,6 +406,9 @@ static inline pmd_t pmd_mkdevmap(pmd_t p
 
 #define set_pmd_at(mm, addr, pmdp, pmd)	set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd))
 
+#define __p4d_to_phys(p4d)	__pte_to_phys(p4d_pte(p4d))
+#define __phys_to_p4d_val(phys)	__phys_to_pte_val(phys)
+
 #define __pgd_to_phys(pgd)	__pte_to_phys(pgd_pte(pgd))
 #define __phys_to_pgd_val(phys)	__phys_to_pte_val(phys)
 
@@ -592,49 +600,50 @@ static inline phys_addr_t pud_page_paddr
 
 #define pud_ERROR(pud)		__pud_error(__FILE__, __LINE__, pud_val(pud))
 
-#define pgd_none(pgd)		(!pgd_val(pgd))
-#define pgd_bad(pgd)		(!(pgd_val(pgd) & 2))
-#define pgd_present(pgd)	(pgd_val(pgd))
+#define p4d_none(p4d)		(!p4d_val(p4d))
+#define p4d_bad(p4d)		(!(p4d_val(p4d) & 2))
+#define p4d_present(p4d)	(p4d_val(p4d))
 
-static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
+static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
-	if (in_swapper_pgdir(pgdp)) {
-		set_swapper_pgd(pgdp, pgd);
+	if (in_swapper_pgdir(p4dp)) {
+		set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
 		return;
 	}
 
-	WRITE_ONCE(*pgdp, pgd);
+	WRITE_ONCE(*p4dp, p4d);
 	dsb(ishst);
 	isb();
 }
 
-static inline void pgd_clear(pgd_t *pgdp)
+static inline void p4d_clear(p4d_t *p4dp)
 {
-	set_pgd(pgdp, __pgd(0));
+	set_p4d(p4dp, __p4d(0));
 }
 
-static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
+static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
 {
-	return __pgd_to_phys(pgd);
+	return __p4d_to_phys(p4d);
 }
 
 /* Find an entry in the first-level page table. */
 #define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
 
-#define pud_offset_phys(dir, addr)	(pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
+#define pud_offset_phys(dir, addr)	(p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
 #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
 
 #define pud_set_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
-#define pud_set_fixmap_offset(pgd, addr)	pud_set_fixmap(pud_offset_phys(pgd, addr))
+#define pud_set_fixmap_offset(p4d, addr)	pud_set_fixmap(pud_offset_phys(p4d, addr))
 #define pud_clear_fixmap()		clear_fixmap(FIX_PUD)
 
-#define pgd_page(pgd)			phys_to_page(__pgd_to_phys(pgd))
+#define p4d_page(p4d)		pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
 
 /* use ONLY for statically allocated translation tables */
 #define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
 
 #else
 
+#define p4d_page_paddr(p4d)	({ BUILD_BUG(); 0;})
 #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
 
 /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
--- a/arch/arm64/include/asm/pgtable-types.h~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/include/asm/pgtable-types.h
@@ -14,6 +14,7 @@
 typedef u64 pteval_t;
 typedef u64 pmdval_t;
 typedef u64 pudval_t;
+typedef u64 p4dval_t;
 typedef u64 pgdval_t;
 
 /*
@@ -44,13 +45,11 @@ typedef struct { pteval_t pgprot; } pgpr
 #define __pgprot(x)	((pgprot_t) { (x) } )
 
 #if CONFIG_PGTABLE_LEVELS == 2
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopmd.h>
 #elif CONFIG_PGTABLE_LEVELS == 3
-#define __ARCH_USE_5LEVEL_HACK
 #include <asm-generic/pgtable-nopud.h>
 #elif CONFIG_PGTABLE_LEVELS == 4
-#include <asm-generic/5level-fixup.h>
+#include <asm-generic/pgtable-nop4d.h>
 #endif
 
 #endif	/* __ASM_PGTABLE_TYPES_H */
--- a/arch/arm64/include/asm/stage2_pgtable.h~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/include/asm/stage2_pgtable.h
@@ -68,41 +68,67 @@ static inline bool kvm_stage2_has_pud(st
 #define S2_PUD_SIZE			(1UL << S2_PUD_SHIFT)
 #define S2_PUD_MASK			(~(S2_PUD_SIZE - 1))
 
-static inline bool stage2_pgd_none(struct kvm *kvm, pgd_t pgd)
+#define stage2_pgd_none(kvm, pgd)		pgd_none(pgd)
+#define stage2_pgd_clear(kvm, pgd)		pgd_clear(pgd)
+#define stage2_pgd_present(kvm, pgd)		pgd_present(pgd)
+#define stage2_pgd_populate(kvm, pgd, p4d)	pgd_populate(NULL, pgd, p4d)
+
+static inline p4d_t *stage2_p4d_offset(struct kvm *kvm,
+				       pgd_t *pgd, unsigned long address)
+{
+	return p4d_offset(pgd, address);
+}
+
+static inline void stage2_p4d_free(struct kvm *kvm, p4d_t *p4d)
+{
+}
+
+static inline bool stage2_p4d_table_empty(struct kvm *kvm, p4d_t *p4dp)
+{
+	return false;
+}
+
+static inline phys_addr_t stage2_p4d_addr_end(struct kvm *kvm,
+					      phys_addr_t addr, phys_addr_t end)
+{
+	return end;
+}
+
+static inline bool stage2_p4d_none(struct kvm *kvm, p4d_t p4d)
 {
 	if (kvm_stage2_has_pud(kvm))
-		return pgd_none(pgd);
+		return p4d_none(p4d);
 	else
 		return 0;
 }
 
-static inline void stage2_pgd_clear(struct kvm *kvm, pgd_t *pgdp)
+static inline void stage2_p4d_clear(struct kvm *kvm, p4d_t *p4dp)
 {
 	if (kvm_stage2_has_pud(kvm))
-		pgd_clear(pgdp);
+		p4d_clear(p4dp);
 }
 
-static inline bool stage2_pgd_present(struct kvm *kvm, pgd_t pgd)
+static inline bool stage2_p4d_present(struct kvm *kvm, p4d_t p4d)
 {
 	if (kvm_stage2_has_pud(kvm))
-		return pgd_present(pgd);
+		return p4d_present(p4d);
 	else
 		return 1;
 }
 
-static inline void stage2_pgd_populate(struct kvm *kvm, pgd_t *pgd, pud_t *pud)
+static inline void stage2_p4d_populate(struct kvm *kvm, p4d_t *p4d, pud_t *pud)
 {
 	if (kvm_stage2_has_pud(kvm))
-		pgd_populate(NULL, pgd, pud);
+		p4d_populate(NULL, p4d, pud);
 }
 
 static inline pud_t *stage2_pud_offset(struct kvm *kvm,
-				       pgd_t *pgd, unsigned long address)
+				       p4d_t *p4d, unsigned long address)
 {
 	if (kvm_stage2_has_pud(kvm))
-		return pud_offset(pgd, address);
+		return pud_offset(p4d, address);
 	else
-		return (pud_t *)pgd;
+		return (pud_t *)p4d;
 }
 
 static inline void stage2_pud_free(struct kvm *kvm, pud_t *pud)
--- a/arch/arm64/kernel/hibernate.c~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/kernel/hibernate.c
@@ -184,6 +184,7 @@ static int trans_pgd_map_page(pgd_t *tra
 		       pgprot_t pgprot)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep;
@@ -196,7 +197,15 @@ static int trans_pgd_map_page(pgd_t *tra
 		pgd_populate(&init_mm, pgdp, pudp);
 	}
 
-	pudp = pud_offset(pgdp, dst_addr);
+	p4dp = p4d_offset(pgdp, dst_addr);
+	if (p4d_none(READ_ONCE(*p4dp))) {
+		pudp = (void *)get_safe_page(GFP_ATOMIC);
+		if (!pudp)
+			return -ENOMEM;
+		p4d_populate(&init_mm, p4dp, pudp);
+	}
+
+	pudp = pud_offset(p4dp, dst_addr);
 	if (pud_none(READ_ONCE(*pudp))) {
 		pmdp = (void *)get_safe_page(GFP_ATOMIC);
 		if (!pmdp)
@@ -419,7 +428,7 @@ static int copy_pmd(pud_t *dst_pudp, pud
 	return 0;
 }
 
-static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
+static int copy_pud(p4d_t *dst_p4dp, p4d_t *src_p4dp, unsigned long start,
 		    unsigned long end)
 {
 	pud_t *dst_pudp;
@@ -427,15 +436,15 @@ static int copy_pud(pgd_t *dst_pgdp, pgd
 	unsigned long next;
 	unsigned long addr = start;
 
-	if (pgd_none(READ_ONCE(*dst_pgdp))) {
+	if (p4d_none(READ_ONCE(*dst_p4dp))) {
 		dst_pudp = (pud_t *)get_safe_page(GFP_ATOMIC);
 		if (!dst_pudp)
 			return -ENOMEM;
-		pgd_populate(&init_mm, dst_pgdp, dst_pudp);
+		p4d_populate(&init_mm, dst_p4dp, dst_pudp);
 	}
-	dst_pudp = pud_offset(dst_pgdp, start);
+	dst_pudp = pud_offset(dst_p4dp, start);
 
-	src_pudp = pud_offset(src_pgdp, start);
+	src_pudp = pud_offset(src_p4dp, start);
 	do {
 		pud_t pud = READ_ONCE(*src_pudp);
 
@@ -454,6 +463,27 @@ static int copy_pud(pgd_t *dst_pgdp, pgd
 	return 0;
 }
 
+static int copy_p4d(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
+		    unsigned long end)
+{
+	p4d_t *dst_p4dp;
+	p4d_t *src_p4dp;
+	unsigned long next;
+	unsigned long addr = start;
+
+	dst_p4dp = p4d_offset(dst_pgdp, start);
+	src_p4dp = p4d_offset(src_pgdp, start);
+	do {
+		next = p4d_addr_end(addr, end);
+		if (p4d_none(READ_ONCE(*src_p4dp)))
+			continue;
+		if (copy_pud(dst_p4dp, src_p4dp, addr, next))
+			return -ENOMEM;
+	} while (dst_p4dp++, src_p4dp++, addr = next, addr != end);
+
+	return 0;
+}
+
 static int copy_page_tables(pgd_t *dst_pgdp, unsigned long start,
 			    unsigned long end)
 {
@@ -466,7 +496,7 @@ static int copy_page_tables(pgd_t *dst_p
 		next = pgd_addr_end(addr, end);
 		if (pgd_none(READ_ONCE(*src_pgdp)))
 			continue;
-		if (copy_pud(dst_pgdp, src_pgdp, addr, next))
+		if (copy_p4d(dst_pgdp, src_pgdp, addr, next))
 			return -ENOMEM;
 	} while (dst_pgdp++, src_pgdp++, addr = next, addr != end);
 
--- a/arch/arm64/kvm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/kvm/mmu.c
@@ -158,13 +158,22 @@ static void *mmu_memory_cache_alloc(stru
 
 static void clear_stage2_pgd_entry(struct kvm *kvm, pgd_t *pgd, phys_addr_t addr)
 {
-	pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, pgd, 0UL);
+	p4d_t *p4d_table __maybe_unused = stage2_p4d_offset(kvm, pgd, 0UL);
 	stage2_pgd_clear(kvm, pgd);
 	kvm_tlb_flush_vmid_ipa(kvm, addr);
-	stage2_pud_free(kvm, pud_table);
+	stage2_p4d_free(kvm, p4d_table);
 	put_page(virt_to_page(pgd));
 }
 
+static void clear_stage2_p4d_entry(struct kvm *kvm, p4d_t *p4d, phys_addr_t addr)
+{
+	pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, p4d, 0);
+	stage2_p4d_clear(kvm, p4d);
+	kvm_tlb_flush_vmid_ipa(kvm, addr);
+	stage2_pud_free(kvm, pud_table);
+	put_page(virt_to_page(p4d));
+}
+
 static void clear_stage2_pud_entry(struct kvm *kvm, pud_t *pud, phys_addr_t addr)
 {
 	pmd_t *pmd_table __maybe_unused = stage2_pmd_offset(kvm, pud, 0);
@@ -208,12 +217,20 @@ static inline void kvm_pud_populate(pud_
 	dsb(ishst);
 }
 
-static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
+static inline void kvm_p4d_populate(p4d_t *p4dp, pud_t *pudp)
 {
-	WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
+	WRITE_ONCE(*p4dp, kvm_mk_p4d(pudp));
 	dsb(ishst);
 }
 
+static inline void kvm_pgd_populate(pgd_t *pgdp, p4d_t *p4dp)
+{
+#ifndef __PAGETABLE_P4D_FOLDED
+	WRITE_ONCE(*pgdp, kvm_mk_pgd(p4dp));
+	dsb(ishst);
+#endif
+}
+
 /*
  * Unmapping vs dcache management:
  *
@@ -293,13 +310,13 @@ static void unmap_stage2_pmds(struct kvm
 		clear_stage2_pud_entry(kvm, pud, start_addr);
 }
 
-static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
+static void unmap_stage2_puds(struct kvm *kvm, p4d_t *p4d,
 		       phys_addr_t addr, phys_addr_t end)
 {
 	phys_addr_t next, start_addr = addr;
 	pud_t *pud, *start_pud;
 
-	start_pud = pud = stage2_pud_offset(kvm, pgd, addr);
+	start_pud = pud = stage2_pud_offset(kvm, p4d, addr);
 	do {
 		next = stage2_pud_addr_end(kvm, addr, end);
 		if (!stage2_pud_none(kvm, *pud)) {
@@ -317,6 +334,23 @@ static void unmap_stage2_puds(struct kvm
 	} while (pud++, addr = next, addr != end);
 
 	if (stage2_pud_table_empty(kvm, start_pud))
+		clear_stage2_p4d_entry(kvm, p4d, start_addr);
+}
+
+static void unmap_stage2_p4ds(struct kvm *kvm, pgd_t *pgd,
+		       phys_addr_t addr, phys_addr_t end)
+{
+	phys_addr_t next, start_addr = addr;
+	p4d_t *p4d, *start_p4d;
+
+	start_p4d = p4d = stage2_p4d_offset(kvm, pgd, addr);
+	do {
+		next = stage2_p4d_addr_end(kvm, addr, end);
+		if (!stage2_p4d_none(kvm, *p4d))
+			unmap_stage2_puds(kvm, p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+
+	if (stage2_p4d_table_empty(kvm, start_p4d))
 		clear_stage2_pgd_entry(kvm, pgd, start_addr);
 }
 
@@ -351,7 +385,7 @@ static void unmap_stage2_range(struct kv
 			break;
 		next = stage2_pgd_addr_end(kvm, addr, end);
 		if (!stage2_pgd_none(kvm, *pgd))
-			unmap_stage2_puds(kvm, pgd, addr, next);
+			unmap_stage2_p4ds(kvm, pgd, addr, next);
 		/*
 		 * If the range is too large, release the kvm->mmu_lock
 		 * to prevent starvation and lockup detector warnings.
@@ -391,13 +425,13 @@ static void stage2_flush_pmds(struct kvm
 	} while (pmd++, addr = next, addr != end);
 }
 
-static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
+static void stage2_flush_puds(struct kvm *kvm, p4d_t *p4d,
 			      phys_addr_t addr, phys_addr_t end)
 {
 	pud_t *pud;
 	phys_addr_t next;
 
-	pud = stage2_pud_offset(kvm, pgd, addr);
+	pud = stage2_pud_offset(kvm, p4d, addr);
 	do {
 		next = stage2_pud_addr_end(kvm, addr, end);
 		if (!stage2_pud_none(kvm, *pud)) {
@@ -409,6 +443,20 @@ static void stage2_flush_puds(struct kvm
 	} while (pud++, addr = next, addr != end);
 }
 
+static void stage2_flush_p4ds(struct kvm *kvm, pgd_t *pgd,
+			      phys_addr_t addr, phys_addr_t end)
+{
+	p4d_t *p4d;
+	phys_addr_t next;
+
+	p4d = stage2_p4d_offset(kvm, pgd, addr);
+	do {
+		next = stage2_p4d_addr_end(kvm, addr, end);
+		if (!stage2_p4d_none(kvm, *p4d))
+			stage2_flush_puds(kvm, p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+}
+
 static void stage2_flush_memslot(struct kvm *kvm,
 				 struct kvm_memory_slot *memslot)
 {
@@ -421,7 +469,7 @@ static void stage2_flush_memslot(struct
 	do {
 		next = stage2_pgd_addr_end(kvm, addr, end);
 		if (!stage2_pgd_none(kvm, *pgd))
-			stage2_flush_puds(kvm, pgd, addr, next);
+			stage2_flush_p4ds(kvm, pgd, addr, next);
 
 		if (next != end)
 			cond_resched_lock(&kvm->mmu_lock);
@@ -454,12 +502,21 @@ static void stage2_flush_vm(struct kvm *
 
 static void clear_hyp_pgd_entry(pgd_t *pgd)
 {
-	pud_t *pud_table __maybe_unused = pud_offset(pgd, 0UL);
+	p4d_t *p4d_table __maybe_unused = p4d_offset(pgd, 0UL);
 	pgd_clear(pgd);
-	pud_free(NULL, pud_table);
+	p4d_free(NULL, p4d_table);
 	put_page(virt_to_page(pgd));
 }
 
+static void clear_hyp_p4d_entry(p4d_t *p4d)
+{
+	pud_t *pud_table __maybe_unused = pud_offset(p4d, 0);
+	VM_BUG_ON(p4d_huge(*p4d));
+	p4d_clear(p4d);
+	pud_free(NULL, pud_table);
+	put_page(virt_to_page(p4d));
+}
+
 static void clear_hyp_pud_entry(pud_t *pud)
 {
 	pmd_t *pmd_table __maybe_unused = pmd_offset(pud, 0);
@@ -511,12 +568,12 @@ static void unmap_hyp_pmds(pud_t *pud, p
 		clear_hyp_pud_entry(pud);
 }
 
-static void unmap_hyp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+static void unmap_hyp_puds(p4d_t *p4d, phys_addr_t addr, phys_addr_t end)
 {
 	phys_addr_t next;
 	pud_t *pud, *start_pud;
 
-	start_pud = pud = pud_offset(pgd, addr);
+	start_pud = pud = pud_offset(p4d, addr);
 	do {
 		next = pud_addr_end(addr, end);
 		/* Hyp doesn't use huge puds */
@@ -525,6 +582,23 @@ static void unmap_hyp_puds(pgd_t *pgd, p
 	} while (pud++, addr = next, addr != end);
 
 	if (hyp_pud_table_empty(start_pud))
+		clear_hyp_p4d_entry(p4d);
+}
+
+static void unmap_hyp_p4ds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+{
+	phys_addr_t next;
+	p4d_t *p4d, *start_p4d;
+
+	start_p4d = p4d = p4d_offset(pgd, addr);
+	do {
+		next = p4d_addr_end(addr, end);
+		/* Hyp doesn't use huge p4ds */
+		if (!p4d_none(*p4d))
+			unmap_hyp_puds(p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+
+	if (hyp_p4d_table_empty(start_p4d))
 		clear_hyp_pgd_entry(pgd);
 }
 
@@ -548,7 +622,7 @@ static void __unmap_hyp_range(pgd_t *pgd
 	do {
 		next = pgd_addr_end(addr, end);
 		if (!pgd_none(*pgd))
-			unmap_hyp_puds(pgd, addr, next);
+			unmap_hyp_p4ds(pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
 
@@ -658,7 +732,7 @@ static int create_hyp_pmd_mappings(pud_t
 	return 0;
 }
 
-static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
+static int create_hyp_pud_mappings(p4d_t *p4d, unsigned long start,
 				   unsigned long end, unsigned long pfn,
 				   pgprot_t prot)
 {
@@ -669,7 +743,7 @@ static int create_hyp_pud_mappings(pgd_t
 
 	addr = start;
 	do {
-		pud = pud_offset(pgd, addr);
+		pud = pud_offset(p4d, addr);
 
 		if (pud_none_or_clear_bad(pud)) {
 			pmd = pmd_alloc_one(NULL, addr);
@@ -691,12 +765,45 @@ static int create_hyp_pud_mappings(pgd_t
 	return 0;
 }
 
+static int create_hyp_p4d_mappings(pgd_t *pgd, unsigned long start,
+				   unsigned long end, unsigned long pfn,
+				   pgprot_t prot)
+{
+	p4d_t *p4d;
+	pud_t *pud;
+	unsigned long addr, next;
+	int ret;
+
+	addr = start;
+	do {
+		p4d = p4d_offset(pgd, addr);
+
+		if (p4d_none(*p4d)) {
+			pud = pud_alloc_one(NULL, addr);
+			if (!pud) {
+				kvm_err("Cannot allocate Hyp pud\n");
+				return -ENOMEM;
+			}
+			kvm_p4d_populate(p4d, pud);
+			get_page(virt_to_page(p4d));
+		}
+
+		next = p4d_addr_end(addr, end);
+		ret = create_hyp_pud_mappings(p4d, addr, next, pfn, prot);
+		if (ret)
+			return ret;
+		pfn += (next - addr) >> PAGE_SHIFT;
+	} while (addr = next, addr != end);
+
+	return 0;
+}
+
 static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
 				 unsigned long start, unsigned long end,
 				 unsigned long pfn, pgprot_t prot)
 {
 	pgd_t *pgd;
-	pud_t *pud;
+	p4d_t *p4d;
 	unsigned long addr, next;
 	int err = 0;
 
@@ -707,18 +814,18 @@ static int __create_hyp_mappings(pgd_t *
 		pgd = pgdp + kvm_pgd_index(addr, ptrs_per_pgd);
 
 		if (pgd_none(*pgd)) {
-			pud = pud_alloc_one(NULL, addr);
-			if (!pud) {
-				kvm_err("Cannot allocate Hyp pud\n");
+			p4d = p4d_alloc_one(NULL, addr);
+			if (!p4d) {
+				kvm_err("Cannot allocate Hyp p4d\n");
 				err = -ENOMEM;
 				goto out;
 			}
-			kvm_pgd_populate(pgd, pud);
+			kvm_pgd_populate(pgd, p4d);
 			get_page(virt_to_page(pgd));
 		}
 
 		next = pgd_addr_end(addr, end);
-		err = create_hyp_pud_mappings(pgd, addr, next, pfn, prot);
+		err = create_hyp_p4d_mappings(pgd, addr, next, pfn, prot);
 		if (err)
 			goto out;
 		pfn += (next - addr) >> PAGE_SHIFT;
@@ -1015,22 +1122,40 @@ void kvm_free_stage2_pgd(struct kvm *kvm
 		free_pages_exact(pgd, stage2_pgd_size(kvm));
 }
 
-static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+static p4d_t *stage2_get_p4d(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 			     phys_addr_t addr)
 {
 	pgd_t *pgd;
-	pud_t *pud;
+	p4d_t *p4d;
 
 	pgd = kvm->arch.pgd + stage2_pgd_index(kvm, addr);
 	if (stage2_pgd_none(kvm, *pgd)) {
 		if (!cache)
 			return NULL;
-		pud = mmu_memory_cache_alloc(cache);
-		stage2_pgd_populate(kvm, pgd, pud);
+		p4d = mmu_memory_cache_alloc(cache);
+		stage2_pgd_populate(kvm, pgd, p4d);
 		get_page(virt_to_page(pgd));
 	}
 
-	return stage2_pud_offset(kvm, pgd, addr);
+	return stage2_p4d_offset(kvm, pgd, addr);
+}
+
+static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
+			     phys_addr_t addr)
+{
+	p4d_t *p4d;
+	pud_t *pud;
+
+	p4d = stage2_get_p4d(kvm, cache, addr);
+	if (stage2_p4d_none(kvm, *p4d)) {
+		if (!cache)
+			return NULL;
+		pud = mmu_memory_cache_alloc(cache);
+		stage2_p4d_populate(kvm, p4d, pud);
+		get_page(virt_to_page(p4d));
+	}
+
+	return stage2_pud_offset(kvm, p4d, addr);
 }
 
 static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
@@ -1423,18 +1548,18 @@ static void stage2_wp_pmds(struct kvm *k
 }
 
 /**
- * stage2_wp_puds - write protect PGD range
+ * stage2_wp_puds - write protect P4D range
  * @pgd:	pointer to pgd entry
  * @addr:	range start address
  * @end:	range end address
  */
-static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
+static void  stage2_wp_puds(struct kvm *kvm, p4d_t *p4d,
 			    phys_addr_t addr, phys_addr_t end)
 {
 	pud_t *pud;
 	phys_addr_t next;
 
-	pud = stage2_pud_offset(kvm, pgd, addr);
+	pud = stage2_pud_offset(kvm, p4d, addr);
 	do {
 		next = stage2_pud_addr_end(kvm, addr, end);
 		if (!stage2_pud_none(kvm, *pud)) {
@@ -1449,6 +1574,26 @@ static void  stage2_wp_puds(struct kvm *
 }
 
 /**
+ * stage2_wp_p4ds - write protect PGD range
+ * @pgd:	pointer to pgd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void  stage2_wp_p4ds(struct kvm *kvm, pgd_t *pgd,
+			    phys_addr_t addr, phys_addr_t end)
+{
+	p4d_t *p4d;
+	phys_addr_t next;
+
+	p4d = stage2_p4d_offset(kvm, pgd, addr);
+	do {
+		next = stage2_p4d_addr_end(kvm, addr, end);
+		if (!stage2_p4d_none(kvm, *p4d))
+			stage2_wp_puds(kvm, p4d, addr, next);
+	} while (p4d++, addr = next, addr != end);
+}
+
+/**
  * stage2_wp_range() - write protect stage2 memory region range
  * @kvm:	The KVM pointer
  * @addr:	Start address of range
@@ -1475,7 +1620,7 @@ static void stage2_wp_range(struct kvm *
 			break;
 		next = stage2_pgd_addr_end(kvm, addr, end);
 		if (stage2_pgd_present(kvm, *pgd))
-			stage2_wp_puds(kvm, pgd, addr, next);
+			stage2_wp_p4ds(kvm, pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
 
--- a/arch/arm64/mm/fault.c~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/mm/fault.c
@@ -145,6 +145,7 @@ static void show_pte(unsigned long addr)
 	pr_alert("[%016lx] pgd=%016llx", addr, pgd_val(pgd));
 
 	do {
+		p4d_t *p4dp, p4d;
 		pud_t *pudp, pud;
 		pmd_t *pmdp, pmd;
 		pte_t *ptep, pte;
@@ -152,7 +153,13 @@ static void show_pte(unsigned long addr)
 		if (pgd_none(pgd) || pgd_bad(pgd))
 			break;
 
-		pudp = pud_offset(pgdp, addr);
+		p4dp = p4d_offset(pgdp, addr);
+		p4d = READ_ONCE(*p4dp);
+		pr_cont(", p4d=%016llx", p4d_val(p4d));
+		if (p4d_none(p4d) || p4d_bad(p4d))
+			break;
+
+		pudp = pud_offset(p4dp, addr);
 		pud = READ_ONCE(*pudp);
 		pr_cont(", pud=%016llx", pud_val(pud));
 		if (pud_none(pud) || pud_bad(pud))
--- a/arch/arm64/mm/hugetlbpage.c~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/mm/hugetlbpage.c
@@ -67,11 +67,13 @@ static int find_num_contig(struct mm_str
 			   pte_t *ptep, size_t *pgsize)
 {
 	pgd_t *pgdp = pgd_offset(mm, addr);
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 
 	*pgsize = PAGE_SIZE;
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	pudp = pud_offset(p4dp, addr);
 	pmdp = pmd_offset(pudp, addr);
 	if ((pte_t *)pmdp == ptep) {
 		*pgsize = PMD_SIZE;
@@ -217,12 +219,14 @@ pte_t *huge_pte_alloc(struct mm_struct *
 		      unsigned long addr, unsigned long sz)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	pte_t *ptep = NULL;
 
 	pgdp = pgd_offset(mm, addr);
-	pudp = pud_alloc(mm, pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	pudp = pud_alloc(mm, p4dp, addr);
 	if (!pudp)
 		return NULL;
 
@@ -261,6 +265,7 @@ pte_t *huge_pte_offset(struct mm_struct
 		       unsigned long addr, unsigned long sz)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp, pud;
 	pmd_t *pmdp, pmd;
 
@@ -268,7 +273,11 @@ pte_t *huge_pte_offset(struct mm_struct
 	if (!pgd_present(READ_ONCE(*pgdp)))
 		return NULL;
 
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	if (!p4d_present(READ_ONCE(*p4dp)))
+		return NULL;
+
+	pudp = pud_offset(p4dp, addr);
 	pud = READ_ONCE(*pudp);
 	if (sz != PUD_SIZE && pud_none(pud))
 		return NULL;
--- a/arch/arm64/mm/kasan_init.c~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/mm/kasan_init.c
@@ -84,17 +84,17 @@ static pmd_t *__init kasan_pmd_offset(pu
 	return early ? pmd_offset_kimg(pudp, addr) : pmd_offset(pudp, addr);
 }
 
-static pud_t *__init kasan_pud_offset(pgd_t *pgdp, unsigned long addr, int node,
+static pud_t *__init kasan_pud_offset(p4d_t *p4dp, unsigned long addr, int node,
 				      bool early)
 {
-	if (pgd_none(READ_ONCE(*pgdp))) {
+	if (p4d_none(READ_ONCE(*p4dp))) {
 		phys_addr_t pud_phys = early ?
 				__pa_symbol(kasan_early_shadow_pud)
 					: kasan_alloc_zeroed_page(node);
-		__pgd_populate(pgdp, pud_phys, PMD_TYPE_TABLE);
+		__p4d_populate(p4dp, pud_phys, PMD_TYPE_TABLE);
 	}
 
-	return early ? pud_offset_kimg(pgdp, addr) : pud_offset(pgdp, addr);
+	return early ? pud_offset_kimg(p4dp, addr) : pud_offset(p4dp, addr);
 }
 
 static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
@@ -126,11 +126,11 @@ static void __init kasan_pmd_populate(pu
 	} while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)));
 }
 
-static void __init kasan_pud_populate(pgd_t *pgdp, unsigned long addr,
+static void __init kasan_pud_populate(p4d_t *p4dp, unsigned long addr,
 				      unsigned long end, int node, bool early)
 {
 	unsigned long next;
-	pud_t *pudp = kasan_pud_offset(pgdp, addr, node, early);
+	pud_t *pudp = kasan_pud_offset(p4dp, addr, node, early);
 
 	do {
 		next = pud_addr_end(addr, end);
@@ -138,6 +138,18 @@ static void __init kasan_pud_populate(pg
 	} while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)));
 }
 
+static void __init kasan_p4d_populate(pgd_t *pgdp, unsigned long addr,
+				      unsigned long end, int node, bool early)
+{
+	unsigned long next;
+	p4d_t *p4dp = p4d_offset(pgdp, addr);
+
+	do {
+		next = p4d_addr_end(addr, end);
+		kasan_pud_populate(p4dp, addr, next, node, early);
+	} while (p4dp++, addr = next, addr != end);
+}
+
 static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
 				      int node, bool early)
 {
@@ -147,7 +159,7 @@ static void __init kasan_pgd_populate(un
 	pgdp = pgd_offset_k(addr);
 	do {
 		next = pgd_addr_end(addr, end);
-		kasan_pud_populate(pgdp, addr, next, node, early);
+		kasan_p4d_populate(pgdp, addr, next, node, early);
 	} while (pgdp++, addr = next, addr != end);
 }
 
--- a/arch/arm64/mm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/mm/mmu.c
@@ -290,18 +290,19 @@ static void alloc_init_pud(pgd_t *pgdp,
 {
 	unsigned long next;
 	pud_t *pudp;
-	pgd_t pgd = READ_ONCE(*pgdp);
+	p4d_t *p4dp = p4d_offset(pgdp, addr);
+	p4d_t p4d = READ_ONCE(*p4dp);
 
-	if (pgd_none(pgd)) {
+	if (p4d_none(p4d)) {
 		phys_addr_t pud_phys;
 		BUG_ON(!pgtable_alloc);
 		pud_phys = pgtable_alloc(PUD_SHIFT);
-		__pgd_populate(pgdp, pud_phys, PUD_TYPE_TABLE);
-		pgd = READ_ONCE(*pgdp);
+		__p4d_populate(p4dp, pud_phys, PUD_TYPE_TABLE);
+		p4d = READ_ONCE(*p4dp);
 	}
-	BUG_ON(pgd_bad(pgd));
+	BUG_ON(p4d_bad(p4d));
 
-	pudp = pud_set_fixmap_offset(pgdp, addr);
+	pudp = pud_set_fixmap_offset(p4dp, addr);
 	do {
 		pud_t old_pud = READ_ONCE(*pudp);
 
@@ -672,6 +673,7 @@ static void __init map_kernel(pgd_t *pgd
 			READ_ONCE(*pgd_offset_k(FIXADDR_START)));
 	} else if (CONFIG_PGTABLE_LEVELS > 3) {
 		pgd_t *bm_pgdp;
+		p4d_t *bm_p4dp;
 		pud_t *bm_pudp;
 		/*
 		 * The fixmap shares its top level pgd entry with the kernel
@@ -681,7 +683,8 @@ static void __init map_kernel(pgd_t *pgd
 		 */
 		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
 		bm_pgdp = pgd_offset_raw(pgdp, FIXADDR_START);
-		bm_pudp = pud_set_fixmap_offset(bm_pgdp, FIXADDR_START);
+		bm_p4dp = p4d_offset(bm_pgdp, FIXADDR_START);
+		bm_pudp = pud_set_fixmap_offset(bm_p4dp, FIXADDR_START);
 		pud_populate(&init_mm, bm_pudp, lm_alias(bm_pmd));
 		pud_clear_fixmap();
 	} else {
@@ -715,6 +718,7 @@ void __init paging_init(void)
 int kern_addr_valid(unsigned long addr)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp, pud;
 	pmd_t *pmdp, pmd;
 	pte_t *ptep, pte;
@@ -726,7 +730,11 @@ int kern_addr_valid(unsigned long addr)
 	if (pgd_none(READ_ONCE(*pgdp)))
 		return 0;
 
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	if (p4d_none(READ_ONCE(*p4dp)))
+		return 0;
+
+	pudp = pud_offset(p4dp, addr);
 	pud = READ_ONCE(*pudp);
 	if (pud_none(pud))
 		return 0;
@@ -1069,6 +1077,7 @@ int __meminit vmemmap_populate(unsigned
 	unsigned long addr = start;
 	unsigned long next;
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp;
 	pmd_t *pmdp;
 
@@ -1079,7 +1088,11 @@ int __meminit vmemmap_populate(unsigned
 		if (!pgdp)
 			return -ENOMEM;
 
-		pudp = vmemmap_pud_populate(pgdp, addr, node);
+		p4dp = vmemmap_p4d_populate(pgdp, addr, node);
+		if (!p4dp)
+			return -ENOMEM;
+
+		pudp = vmemmap_pud_populate(p4dp, addr, node);
 		if (!pudp)
 			return -ENOMEM;
 
@@ -1114,11 +1127,12 @@ void vmemmap_free(unsigned long start, u
 static inline pud_t * fixmap_pud(unsigned long addr)
 {
 	pgd_t *pgdp = pgd_offset_k(addr);
-	pgd_t pgd = READ_ONCE(*pgdp);
+	p4d_t *p4dp = p4d_offset(pgdp, addr);
+	p4d_t p4d = READ_ONCE(*p4dp);
 
-	BUG_ON(pgd_none(pgd) || pgd_bad(pgd));
+	BUG_ON(p4d_none(p4d) || p4d_bad(p4d));
 
-	return pud_offset_kimg(pgdp, addr);
+	return pud_offset_kimg(p4dp, addr);
 }
 
 static inline pmd_t * fixmap_pmd(unsigned long addr)
@@ -1144,25 +1158,27 @@ static inline pte_t * fixmap_pte(unsigne
  */
 void __init early_fixmap_init(void)
 {
-	pgd_t *pgdp, pgd;
+	pgd_t *pgdp;
+	p4d_t *p4dp, p4d;
 	pud_t *pudp;
 	pmd_t *pmdp;
 	unsigned long addr = FIXADDR_START;
 
 	pgdp = pgd_offset_k(addr);
-	pgd = READ_ONCE(*pgdp);
+	p4dp = p4d_offset(pgdp, addr);
+	p4d = READ_ONCE(*p4dp);
 	if (CONFIG_PGTABLE_LEVELS > 3 &&
-	    !(pgd_none(pgd) || pgd_page_paddr(pgd) == __pa_symbol(bm_pud))) {
+	    !(p4d_none(p4d) || p4d_page_paddr(p4d) == __pa_symbol(bm_pud))) {
 		/*
 		 * We only end up here if the kernel mapping and the fixmap
 		 * share the top level pgd entry, which should only happen on
 		 * 16k/4 levels configurations.
 		 */
 		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
-		pudp = pud_offset_kimg(pgdp, addr);
+		pudp = pud_offset_kimg(p4dp, addr);
 	} else {
-		if (pgd_none(pgd))
-			__pgd_populate(pgdp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
+		if (p4d_none(p4d))
+			__p4d_populate(p4dp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
 		pudp = fixmap_pud(addr);
 	}
 	if (pud_none(READ_ONCE(*pudp)))
--- a/arch/arm64/mm/pageattr.c~arm64-add-support-for-folded-p4d-page-tables
+++ a/arch/arm64/mm/pageattr.c
@@ -198,6 +198,7 @@ void __kernel_map_pages(struct page *pag
 bool kernel_page_present(struct page *page)
 {
 	pgd_t *pgdp;
+	p4d_t *p4dp;
 	pud_t *pudp, pud;
 	pmd_t *pmdp, pmd;
 	pte_t *ptep;
@@ -210,7 +211,11 @@ bool kernel_page_present(struct page *pa
 	if (pgd_none(READ_ONCE(*pgdp)))
 		return false;
 
-	pudp = pud_offset(pgdp, addr);
+	p4dp = p4d_offset(pgdp, addr);
+	if (p4d_none(READ_ONCE(*p4dp)))
+		return false;
+
+	pudp = pud_offset(p4dp, addr);
 	pud = READ_ONCE(*pudp);
 	if (pud_none(pud))
 		return false;
_


* Re: [PATCH v4 03/14] arm64: add support for folded p4d page tables
  2020-05-15 18:40   ` Andrew Morton
@ 2020-05-16 17:20     ` Mike Rapoport
  0 siblings, 0 replies; 25+ messages in thread
From: Mike Rapoport @ 2020-05-16 17:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev

On Fri, May 15, 2020 at 11:40:12AM -0700, Andrew Morton wrote:
> On Tue, 14 Apr 2020 18:34:44 +0300 Mike Rapoport <rppt@kernel.org> wrote:
> 
> > Implement primitives necessary for the 4th level folding, add walks of p4d
> > level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
> > remove __ARCH_USE_5LEVEL_HACK.
> 
> This needed some rework due to arm changes in linux-next. Could you please
> check my handiwork and test it once I've merged this into linux-next?

Looks OK to me. It passed defconfig and a couple of randconfig builds,
and qemu-system-aarch64 boots fine with this.
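
(The boot test was nothing fancy, something along these lines with a
defconfig Image, exact flags not important:

	qemu-system-aarch64 -M virt -cpu cortex-a57 -m 1G -nographic \
		-kernel arch/arm64/boot/Image -append "console=ttyAMA0"

enough to see it get through early boot without blowing up.)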

> Rejects were
> 
> --- arch/arm64/include/asm/pgtable.h~arm64-add-support-for-folded-p4d-page-tables
> +++ arch/arm64/include/asm/pgtable.h
> @@ -596,49 +604,50 @@ static inline phys_addr_t pud_page_paddr
>  
>  #define pud_ERROR(pud)		__pud_error(__FILE__, __LINE__, pud_val(pud))
>  
> -#define pgd_none(pgd)		(!pgd_val(pgd))
> -#define pgd_bad(pgd)		(!(pgd_val(pgd) & 2))
> -#define pgd_present(pgd)	(pgd_val(pgd))
> +#define p4d_none(p4d)		(!p4d_val(p4d))
> +#define p4d_bad(p4d)		(!(p4d_val(p4d) & 2))
> +#define p4d_present(p4d)	(p4d_val(p4d))
>  
> -static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
> +static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
>  {
> -	if (in_swapper_pgdir(pgdp)) {
> -		set_swapper_pgd(pgdp, pgd);
> +	if (in_swapper_pgdir(p4dp)) {
> +		set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
>  		return;
>  	}
>  
> -	WRITE_ONCE(*pgdp, pgd);
> +	WRITE_ONCE(*p4dp, p4d);
>  	dsb(ishst);
>  	isb();
>  }
>  
> -static inline void pgd_clear(pgd_t *pgdp)
> +static inline void p4d_clear(p4d_t *p4dp)
>  {
> -	set_pgd(pgdp, __pgd(0));
> +	set_p4d(p4dp, __p4d(0));
>  }
>  
> -static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
> +static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
>  {
> -	return __pgd_to_phys(pgd);
> +	return __p4d_to_phys(p4d);
>  }
>  
>  /* Find an entry in the first-level page table. */
>  #define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
>  
> -#define pud_offset_phys(dir, addr)	(pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
> +#define pud_offset_phys(dir, addr)	(p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
>  #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
>  
>  #define pud_set_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
> -#define pud_set_fixmap_offset(pgd, addr)	pud_set_fixmap(pud_offset_phys(pgd, addr))
> +#define pud_set_fixmap_offset(p4d, addr)	pud_set_fixmap(pud_offset_phys(p4d, addr))
>  #define pud_clear_fixmap()		clear_fixmap(FIX_PUD)
>  
> -#define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd)))
> +#define p4d_page(p4d)		pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
>  
>  /* use ONLY for statically allocated translation tables */
>  #define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
>  
>  #else
>  
> +#define p4d_page_paddr(p4d)	({ BUILD_BUG(); 0;})
>  #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
>  
>  /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
> 
> 
> 
> and
> 
> --- arch/arm64/kvm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
> +++ arch/arm64/kvm/mmu.c
> @@ -469,7 +517,7 @@ static void stage2_flush_memslot(struct
>  	do {
>  		next = stage2_pgd_addr_end(kvm, addr, end);
>  		if (!stage2_pgd_none(kvm, *pgd))
> -			stage2_flush_puds(kvm, pgd, addr, next);
> +			stage2_flush_p4ds(kvm, pgd, addr, next);
>  	} while (pgd++, addr = next, addr != end);
>  }
>  
> 
> 
> Result:
> 
> From: Mike Rapoport <rppt@linux.ibm.com>
> Subject: arm64: add support for folded p4d page tables
> 
> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
> remove __ARCH_USE_5LEVEL_HACK.
> 
> Link: http://lkml.kernel.org/r/20200414153455.21744-4-rppt@kernel.org
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Brian Cain <bcain@codeaurora.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Christophe Leroy <christophe.leroy@c-s.fr>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Guan Xuetao <gxt@pku.edu.cn>
> Cc: James Morse <james.morse@arm.com>
> Cc: Jonas Bonn <jonas@southpole.se>
> Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
> Cc: Ley Foon Tan <ley.foon.tan@intel.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Rich Felker <dalias@libc.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Stafford Horne <shorne@gmail.com>
> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  arch/arm64/include/asm/kvm_mmu.h        |   10 -
>  arch/arm64/include/asm/pgalloc.h        |   10 -
>  arch/arm64/include/asm/pgtable-types.h  |    5 
>  arch/arm64/include/asm/pgtable.h        |   37 ++-
>  arch/arm64/include/asm/stage2_pgtable.h |   48 +++--
>  arch/arm64/kernel/hibernate.c           |   44 +++-
>  arch/arm64/kvm/mmu.c                    |  209 ++++++++++++++++++----
>  arch/arm64/mm/fault.c                   |    9 
>  arch/arm64/mm/hugetlbpage.c             |   15 +
>  arch/arm64/mm/kasan_init.c              |   26 ++
>  arch/arm64/mm/mmu.c                     |   52 +++--
>  arch/arm64/mm/pageattr.c                |    7 
>  12 files changed, 368 insertions(+), 104 deletions(-)
> 
> --- a/arch/arm64/include/asm/kvm_mmu.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/kvm_mmu.h
> @@ -172,8 +172,8 @@ void kvm_clear_hyp_idmap(void);
>  	__pmd(__phys_to_pmd_val(__pa(ptep)) | PMD_TYPE_TABLE)
>  #define kvm_mk_pud(pmdp)					\
>  	__pud(__phys_to_pud_val(__pa(pmdp)) | PMD_TYPE_TABLE)
> -#define kvm_mk_pgd(pudp)					\
> -	__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
> +#define kvm_mk_p4d(pmdp)					\
> +	__p4d(__phys_to_p4d_val(__pa(pmdp)) | PUD_TYPE_TABLE)
>  
>  #define kvm_set_pud(pudp, pud)		set_pud(pudp, pud)
>  
> @@ -299,6 +299,12 @@ static inline bool kvm_s2pud_young(pud_t
>  #define hyp_pud_table_empty(pudp) kvm_page_empty(pudp)
>  #endif
>  
> +#ifdef __PAGETABLE_P4D_FOLDED
> +#define hyp_p4d_table_empty(p4dp) (0)
> +#else
> +#define hyp_p4d_table_empty(p4dp) kvm_page_empty(p4dp)
> +#endif
> +
>  struct kvm;
>  
>  #define kvm_flush_dcache_to_poc(a,l)	__flush_dcache_area((a), (l))
> --- a/arch/arm64/include/asm/pgalloc.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/pgalloc.h
> @@ -73,17 +73,17 @@ static inline void pud_free(struct mm_st
>  	free_page((unsigned long)pudp);
>  }
>  
> -static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
> +static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
>  {
> -	set_pgd(pgdp, __pgd(__phys_to_pgd_val(pudp) | prot));
> +	set_p4d(p4dp, __p4d(__phys_to_p4d_val(pudp) | prot));
>  }
>  
> -static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
> +static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
>  {
> -	__pgd_populate(pgdp, __pa(pudp), PUD_TYPE_TABLE);
> +	__p4d_populate(p4dp, __pa(pudp), PUD_TYPE_TABLE);
>  }
>  #else
> -static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
> +static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
>  {
>  	BUILD_BUG();
>  }
> --- a/arch/arm64/include/asm/pgtable.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/pgtable.h
> @@ -298,6 +298,11 @@ static inline pte_t pgd_pte(pgd_t pgd)
>  	return __pte(pgd_val(pgd));
>  }
>  
> +static inline pte_t p4d_pte(p4d_t p4d)
> +{
> +	return __pte(p4d_val(p4d));
> +}
> +
>  static inline pte_t pud_pte(pud_t pud)
>  {
>  	return __pte(pud_val(pud));
> @@ -401,6 +406,9 @@ static inline pmd_t pmd_mkdevmap(pmd_t p
>  
>  #define set_pmd_at(mm, addr, pmdp, pmd)	set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd))
>  
> +#define __p4d_to_phys(p4d)	__pte_to_phys(p4d_pte(p4d))
> +#define __phys_to_p4d_val(phys)	__phys_to_pte_val(phys)
> +
>  #define __pgd_to_phys(pgd)	__pte_to_phys(pgd_pte(pgd))
>  #define __phys_to_pgd_val(phys)	__phys_to_pte_val(phys)
>  
> @@ -592,49 +600,50 @@ static inline phys_addr_t pud_page_paddr
>  
>  #define pud_ERROR(pud)		__pud_error(__FILE__, __LINE__, pud_val(pud))
>  
> -#define pgd_none(pgd)		(!pgd_val(pgd))
> -#define pgd_bad(pgd)		(!(pgd_val(pgd) & 2))
> -#define pgd_present(pgd)	(pgd_val(pgd))
> +#define p4d_none(p4d)		(!p4d_val(p4d))
> +#define p4d_bad(p4d)		(!(p4d_val(p4d) & 2))
> +#define p4d_present(p4d)	(p4d_val(p4d))
>  
> -static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
> +static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
>  {
> -	if (in_swapper_pgdir(pgdp)) {
> -		set_swapper_pgd(pgdp, pgd);
> +	if (in_swapper_pgdir(p4dp)) {
> +		set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
>  		return;
>  	}
>  
> -	WRITE_ONCE(*pgdp, pgd);
> +	WRITE_ONCE(*p4dp, p4d);
>  	dsb(ishst);
>  	isb();
>  }
>  
> -static inline void pgd_clear(pgd_t *pgdp)
> +static inline void p4d_clear(p4d_t *p4dp)
>  {
> -	set_pgd(pgdp, __pgd(0));
> +	set_p4d(p4dp, __p4d(0));
>  }
>  
> -static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
> +static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
>  {
> -	return __pgd_to_phys(pgd);
> +	return __p4d_to_phys(p4d);
>  }
>  
>  /* Find an entry in the first-level page table. */
>  #define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
>  
> -#define pud_offset_phys(dir, addr)	(pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
> +#define pud_offset_phys(dir, addr)	(p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
>  #define pud_offset(dir, addr)		((pud_t *)__va(pud_offset_phys((dir), (addr))))
>  
>  #define pud_set_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
> -#define pud_set_fixmap_offset(pgd, addr)	pud_set_fixmap(pud_offset_phys(pgd, addr))
> +#define pud_set_fixmap_offset(p4d, addr)	pud_set_fixmap(pud_offset_phys(p4d, addr))
>  #define pud_clear_fixmap()		clear_fixmap(FIX_PUD)
>  
> -#define pgd_page(pgd)			phys_to_page(__pgd_to_phys(pgd))
> +#define p4d_page(p4d)		pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
>  
>  /* use ONLY for statically allocated translation tables */
>  #define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
>  
>  #else
>  
> +#define p4d_page_paddr(p4d)	({ BUILD_BUG(); 0;})
>  #define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
>  
>  /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
> --- a/arch/arm64/include/asm/pgtable-types.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/pgtable-types.h
> @@ -14,6 +14,7 @@
>  typedef u64 pteval_t;
>  typedef u64 pmdval_t;
>  typedef u64 pudval_t;
> +typedef u64 p4dval_t;
>  typedef u64 pgdval_t;
>  
>  /*
> @@ -44,13 +45,11 @@ typedef struct { pteval_t pgprot; } pgpr
>  #define __pgprot(x)	((pgprot_t) { (x) } )
>  
>  #if CONFIG_PGTABLE_LEVELS == 2
> -#define __ARCH_USE_5LEVEL_HACK
>  #include <asm-generic/pgtable-nopmd.h>
>  #elif CONFIG_PGTABLE_LEVELS == 3
> -#define __ARCH_USE_5LEVEL_HACK
>  #include <asm-generic/pgtable-nopud.h>
>  #elif CONFIG_PGTABLE_LEVELS == 4
> -#include <asm-generic/5level-fixup.h>
> +#include <asm-generic/pgtable-nop4d.h>
>  #endif
>  
>  #endif	/* __ASM_PGTABLE_TYPES_H */
> --- a/arch/arm64/include/asm/stage2_pgtable.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/stage2_pgtable.h
> @@ -68,41 +68,67 @@ static inline bool kvm_stage2_has_pud(st
>  #define S2_PUD_SIZE			(1UL << S2_PUD_SHIFT)
>  #define S2_PUD_MASK			(~(S2_PUD_SIZE - 1))
>  
> -static inline bool stage2_pgd_none(struct kvm *kvm, pgd_t pgd)
> +#define stage2_pgd_none(kvm, pgd)		pgd_none(pgd)
> +#define stage2_pgd_clear(kvm, pgd)		pgd_clear(pgd)
> +#define stage2_pgd_present(kvm, pgd)		pgd_present(pgd)
> +#define stage2_pgd_populate(kvm, pgd, p4d)	pgd_populate(NULL, pgd, p4d)
> +
> +static inline p4d_t *stage2_p4d_offset(struct kvm *kvm,
> +				       pgd_t *pgd, unsigned long address)
> +{
> +	return p4d_offset(pgd, address);
> +}
> +
> +static inline void stage2_p4d_free(struct kvm *kvm, p4d_t *p4d)
> +{
> +}
> +
> +static inline bool stage2_p4d_table_empty(struct kvm *kvm, p4d_t *p4dp)
> +{
> +	return false;
> +}
> +
> +static inline phys_addr_t stage2_p4d_addr_end(struct kvm *kvm,
> +					      phys_addr_t addr, phys_addr_t end)
> +{
> +	return end;
> +}
> +
> +static inline bool stage2_p4d_none(struct kvm *kvm, p4d_t p4d)
>  {
>  	if (kvm_stage2_has_pud(kvm))
> -		return pgd_none(pgd);
> +		return p4d_none(p4d);
>  	else
>  		return 0;
>  }
>  
> -static inline void stage2_pgd_clear(struct kvm *kvm, pgd_t *pgdp)
> +static inline void stage2_p4d_clear(struct kvm *kvm, p4d_t *p4dp)
>  {
>  	if (kvm_stage2_has_pud(kvm))
> -		pgd_clear(pgdp);
> +		p4d_clear(p4dp);
>  }
>  
> -static inline bool stage2_pgd_present(struct kvm *kvm, pgd_t pgd)
> +static inline bool stage2_p4d_present(struct kvm *kvm, p4d_t p4d)
>  {
>  	if (kvm_stage2_has_pud(kvm))
> -		return pgd_present(pgd);
> +		return p4d_present(p4d);
>  	else
>  		return 1;
>  }
>  
> -static inline void stage2_pgd_populate(struct kvm *kvm, pgd_t *pgd, pud_t *pud)
> +static inline void stage2_p4d_populate(struct kvm *kvm, p4d_t *p4d, pud_t *pud)
>  {
>  	if (kvm_stage2_has_pud(kvm))
> -		pgd_populate(NULL, pgd, pud);
> +		p4d_populate(NULL, p4d, pud);
>  }
>  
>  static inline pud_t *stage2_pud_offset(struct kvm *kvm,
> -				       pgd_t *pgd, unsigned long address)
> +				       p4d_t *p4d, unsigned long address)
>  {
>  	if (kvm_stage2_has_pud(kvm))
> -		return pud_offset(pgd, address);
> +		return pud_offset(p4d, address);
>  	else
> -		return (pud_t *)pgd;
> +		return (pud_t *)p4d;
>  }
>  
>  static inline void stage2_pud_free(struct kvm *kvm, pud_t *pud)
> --- a/arch/arm64/kernel/hibernate.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/kernel/hibernate.c
> @@ -184,6 +184,7 @@ static int trans_pgd_map_page(pgd_t *tra
>  		       pgprot_t pgprot)
>  {
>  	pgd_t *pgdp;
> +	p4d_t *p4dp;
>  	pud_t *pudp;
>  	pmd_t *pmdp;
>  	pte_t *ptep;
> @@ -196,7 +197,15 @@ static int trans_pgd_map_page(pgd_t *tra
>  		pgd_populate(&init_mm, pgdp, pudp);
>  	}
>  
> -	pudp = pud_offset(pgdp, dst_addr);
> +	p4dp = p4d_offset(pgdp, dst_addr);
> +	if (p4d_none(READ_ONCE(*p4dp))) {
> +		pudp = (void *)get_safe_page(GFP_ATOMIC);
> +		if (!pudp)
> +			return -ENOMEM;
> +		p4d_populate(&init_mm, p4dp, pudp);
> +	}
> +
> +	pudp = pud_offset(p4dp, dst_addr);
>  	if (pud_none(READ_ONCE(*pudp))) {
>  		pmdp = (void *)get_safe_page(GFP_ATOMIC);
>  		if (!pmdp)
> @@ -419,7 +428,7 @@ static int copy_pmd(pud_t *dst_pudp, pud
>  	return 0;
>  }
>  
> -static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
> +static int copy_pud(p4d_t *dst_p4dp, p4d_t *src_p4dp, unsigned long start,
>  		    unsigned long end)
>  {
>  	pud_t *dst_pudp;
> @@ -427,15 +436,15 @@ static int copy_pud(pgd_t *dst_pgdp, pgd
>  	unsigned long next;
>  	unsigned long addr = start;
>  
> -	if (pgd_none(READ_ONCE(*dst_pgdp))) {
> +	if (p4d_none(READ_ONCE(*dst_p4dp))) {
>  		dst_pudp = (pud_t *)get_safe_page(GFP_ATOMIC);
>  		if (!dst_pudp)
>  			return -ENOMEM;
> -		pgd_populate(&init_mm, dst_pgdp, dst_pudp);
> +		p4d_populate(&init_mm, dst_p4dp, dst_pudp);
>  	}
> -	dst_pudp = pud_offset(dst_pgdp, start);
> +	dst_pudp = pud_offset(dst_p4dp, start);
>  
> -	src_pudp = pud_offset(src_pgdp, start);
> +	src_pudp = pud_offset(src_p4dp, start);
>  	do {
>  		pud_t pud = READ_ONCE(*src_pudp);
>  
> @@ -454,6 +463,27 @@ static int copy_pud(pgd_t *dst_pgdp, pgd
>  	return 0;
>  }
>  
> +static int copy_p4d(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
> +		    unsigned long end)
> +{
> +	p4d_t *dst_p4dp;
> +	p4d_t *src_p4dp;
> +	unsigned long next;
> +	unsigned long addr = start;
> +
> +	dst_p4dp = p4d_offset(dst_pgdp, start);
> +	src_p4dp = p4d_offset(src_pgdp, start);
> +	do {
> +		next = p4d_addr_end(addr, end);
> +		if (p4d_none(READ_ONCE(*src_p4dp)))
> +			continue;
> +		if (copy_pud(dst_p4dp, src_p4dp, addr, next))
> +			return -ENOMEM;
> +	} while (dst_p4dp++, src_p4dp++, addr = next, addr != end);
> +
> +	return 0;
> +}
> +
>  static int copy_page_tables(pgd_t *dst_pgdp, unsigned long start,
>  			    unsigned long end)
>  {
> @@ -466,7 +496,7 @@ static int copy_page_tables(pgd_t *dst_p
>  		next = pgd_addr_end(addr, end);
>  		if (pgd_none(READ_ONCE(*src_pgdp)))
>  			continue;
> -		if (copy_pud(dst_pgdp, src_pgdp, addr, next))
> +		if (copy_p4d(dst_pgdp, src_pgdp, addr, next))
>  			return -ENOMEM;
>  	} while (dst_pgdp++, src_pgdp++, addr = next, addr != end);
>  
> --- a/arch/arm64/kvm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/kvm/mmu.c
> @@ -158,13 +158,22 @@ static void *mmu_memory_cache_alloc(stru
>  
>  static void clear_stage2_pgd_entry(struct kvm *kvm, pgd_t *pgd, phys_addr_t addr)
>  {
> -	pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, pgd, 0UL);
> +	p4d_t *p4d_table __maybe_unused = stage2_p4d_offset(kvm, pgd, 0UL);
>  	stage2_pgd_clear(kvm, pgd);
>  	kvm_tlb_flush_vmid_ipa(kvm, addr);
> -	stage2_pud_free(kvm, pud_table);
> +	stage2_p4d_free(kvm, p4d_table);
>  	put_page(virt_to_page(pgd));
>  }
>  
> +static void clear_stage2_p4d_entry(struct kvm *kvm, p4d_t *p4d, phys_addr_t addr)
> +{
> +	pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, p4d, 0);
> +	stage2_p4d_clear(kvm, p4d);
> +	kvm_tlb_flush_vmid_ipa(kvm, addr);
> +	stage2_pud_free(kvm, pud_table);
> +	put_page(virt_to_page(p4d));
> +}
> +
>  static void clear_stage2_pud_entry(struct kvm *kvm, pud_t *pud, phys_addr_t addr)
>  {
>  	pmd_t *pmd_table __maybe_unused = stage2_pmd_offset(kvm, pud, 0);
> @@ -208,12 +217,20 @@ static inline void kvm_pud_populate(pud_
>  	dsb(ishst);
>  }
>  
> -static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
> +static inline void kvm_p4d_populate(p4d_t *p4dp, pud_t *pudp)
>  {
> -	WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
> +	WRITE_ONCE(*p4dp, kvm_mk_p4d(pudp));
>  	dsb(ishst);
>  }
>  
> +static inline void kvm_pgd_populate(pgd_t *pgdp, p4d_t *p4dp)
> +{
> +#ifndef __PAGETABLE_P4D_FOLDED
> +	WRITE_ONCE(*pgdp, kvm_mk_pgd(p4dp));
> +	dsb(ishst);
> +#endif
> +}
> +
>  /*
>   * Unmapping vs dcache management:
>   *
> @@ -293,13 +310,13 @@ static void unmap_stage2_pmds(struct kvm
>  		clear_stage2_pud_entry(kvm, pud, start_addr);
>  }
>  
> -static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
> +static void unmap_stage2_puds(struct kvm *kvm, p4d_t *p4d,
>  		       phys_addr_t addr, phys_addr_t end)
>  {
>  	phys_addr_t next, start_addr = addr;
>  	pud_t *pud, *start_pud;
>  
> -	start_pud = pud = stage2_pud_offset(kvm, pgd, addr);
> +	start_pud = pud = stage2_pud_offset(kvm, p4d, addr);
>  	do {
>  		next = stage2_pud_addr_end(kvm, addr, end);
>  		if (!stage2_pud_none(kvm, *pud)) {
> @@ -317,6 +334,23 @@ static void unmap_stage2_puds(struct kvm
>  	} while (pud++, addr = next, addr != end);
>  
>  	if (stage2_pud_table_empty(kvm, start_pud))
> +		clear_stage2_p4d_entry(kvm, p4d, start_addr);
> +}
> +
> +static void unmap_stage2_p4ds(struct kvm *kvm, pgd_t *pgd,
> +		       phys_addr_t addr, phys_addr_t end)
> +{
> +	phys_addr_t next, start_addr = addr;
> +	p4d_t *p4d, *start_p4d;
> +
> +	start_p4d = p4d = stage2_p4d_offset(kvm, pgd, addr);
> +	do {
> +		next = stage2_p4d_addr_end(kvm, addr, end);
> +		if (!stage2_p4d_none(kvm, *p4d))
> +			unmap_stage2_puds(kvm, p4d, addr, next);
> +	} while (p4d++, addr = next, addr != end);
> +
> +	if (stage2_p4d_table_empty(kvm, start_p4d))
>  		clear_stage2_pgd_entry(kvm, pgd, start_addr);
>  }
>  
> @@ -351,7 +385,7 @@ static void unmap_stage2_range(struct kv
>  			break;
>  		next = stage2_pgd_addr_end(kvm, addr, end);
>  		if (!stage2_pgd_none(kvm, *pgd))
> -			unmap_stage2_puds(kvm, pgd, addr, next);
> +			unmap_stage2_p4ds(kvm, pgd, addr, next);
>  		/*
>  		 * If the range is too large, release the kvm->mmu_lock
>  		 * to prevent starvation and lockup detector warnings.
> @@ -391,13 +425,13 @@ static void stage2_flush_pmds(struct kvm
>  	} while (pmd++, addr = next, addr != end);
>  }
>  
> -static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
> +static void stage2_flush_puds(struct kvm *kvm, p4d_t *p4d,
>  			      phys_addr_t addr, phys_addr_t end)
>  {
>  	pud_t *pud;
>  	phys_addr_t next;
>  
> -	pud = stage2_pud_offset(kvm, pgd, addr);
> +	pud = stage2_pud_offset(kvm, p4d, addr);
>  	do {
>  		next = stage2_pud_addr_end(kvm, addr, end);
>  		if (!stage2_pud_none(kvm, *pud)) {
> @@ -409,6 +443,20 @@ static void stage2_flush_puds(struct kvm
>  	} while (pud++, addr = next, addr != end);
>  }
>  
> +static void stage2_flush_p4ds(struct kvm *kvm, pgd_t *pgd,
> +			      phys_addr_t addr, phys_addr_t end)
> +{
> +	p4d_t *p4d;
> +	phys_addr_t next;
> +
> +	p4d = stage2_p4d_offset(kvm, pgd, addr);
> +	do {
> +		next = stage2_p4d_addr_end(kvm, addr, end);
> +		if (!stage2_p4d_none(kvm, *p4d))
> +			stage2_flush_puds(kvm, p4d, addr, next);
> +	} while (p4d++, addr = next, addr != end);
> +}
> +
>  static void stage2_flush_memslot(struct kvm *kvm,
>  				 struct kvm_memory_slot *memslot)
>  {
> @@ -421,7 +469,7 @@ static void stage2_flush_memslot(struct
>  	do {
>  		next = stage2_pgd_addr_end(kvm, addr, end);
>  		if (!stage2_pgd_none(kvm, *pgd))
> -			stage2_flush_puds(kvm, pgd, addr, next);
> +			stage2_flush_p4ds(kvm, pgd, addr, next);
>  
>  		if (next != end)
>  			cond_resched_lock(&kvm->mmu_lock);
> @@ -454,12 +502,21 @@ static void stage2_flush_vm(struct kvm *
>  
>  static void clear_hyp_pgd_entry(pgd_t *pgd)
>  {
> -	pud_t *pud_table __maybe_unused = pud_offset(pgd, 0UL);
> +	p4d_t *p4d_table __maybe_unused = p4d_offset(pgd, 0UL);
>  	pgd_clear(pgd);
> -	pud_free(NULL, pud_table);
> +	p4d_free(NULL, p4d_table);
>  	put_page(virt_to_page(pgd));
>  }
>  
> +static void clear_hyp_p4d_entry(p4d_t *p4d)
> +{
> +	pud_t *pud_table __maybe_unused = pud_offset(p4d, 0);
> +	VM_BUG_ON(p4d_huge(*p4d));
> +	p4d_clear(p4d);
> +	pud_free(NULL, pud_table);
> +	put_page(virt_to_page(p4d));
> +}
> +
>  static void clear_hyp_pud_entry(pud_t *pud)
>  {
>  	pmd_t *pmd_table __maybe_unused = pmd_offset(pud, 0);
> @@ -511,12 +568,12 @@ static void unmap_hyp_pmds(pud_t *pud, p
>  		clear_hyp_pud_entry(pud);
>  }
>  
> -static void unmap_hyp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
> +static void unmap_hyp_puds(p4d_t *p4d, phys_addr_t addr, phys_addr_t end)
>  {
>  	phys_addr_t next;
>  	pud_t *pud, *start_pud;
>  
> -	start_pud = pud = pud_offset(pgd, addr);
> +	start_pud = pud = pud_offset(p4d, addr);
>  	do {
>  		next = pud_addr_end(addr, end);
>  		/* Hyp doesn't use huge puds */
> @@ -525,6 +582,23 @@ static void unmap_hyp_puds(pgd_t *pgd, p
>  	} while (pud++, addr = next, addr != end);
>  
>  	if (hyp_pud_table_empty(start_pud))
> +		clear_hyp_p4d_entry(p4d);
> +}
> +
> +static void unmap_hyp_p4ds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
> +{
> +	phys_addr_t next;
> +	p4d_t *p4d, *start_p4d;
> +
> +	start_p4d = p4d = p4d_offset(pgd, addr);
> +	do {
> +		next = p4d_addr_end(addr, end);
> +		/* Hyp doesn't use huge p4ds */
> +		if (!p4d_none(*p4d))
> +			unmap_hyp_puds(p4d, addr, next);
> +	} while (p4d++, addr = next, addr != end);
> +
> +	if (hyp_p4d_table_empty(start_p4d))
>  		clear_hyp_pgd_entry(pgd);
>  }
>  
> @@ -548,7 +622,7 @@ static void __unmap_hyp_range(pgd_t *pgd
>  	do {
>  		next = pgd_addr_end(addr, end);
>  		if (!pgd_none(*pgd))
> -			unmap_hyp_puds(pgd, addr, next);
> +			unmap_hyp_p4ds(pgd, addr, next);
>  	} while (pgd++, addr = next, addr != end);
>  }
>  
> @@ -658,7 +732,7 @@ static int create_hyp_pmd_mappings(pud_t
>  	return 0;
>  }
>  
> -static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
> +static int create_hyp_pud_mappings(p4d_t *p4d, unsigned long start,
>  				   unsigned long end, unsigned long pfn,
>  				   pgprot_t prot)
>  {
> @@ -669,7 +743,7 @@ static int create_hyp_pud_mappings(pgd_t
>  
>  	addr = start;
>  	do {
> -		pud = pud_offset(pgd, addr);
> +		pud = pud_offset(p4d, addr);
>  
>  		if (pud_none_or_clear_bad(pud)) {
>  			pmd = pmd_alloc_one(NULL, addr);
> @@ -691,12 +765,45 @@ static int create_hyp_pud_mappings(pgd_t
>  	return 0;
>  }
>  
> +static int create_hyp_p4d_mappings(pgd_t *pgd, unsigned long start,
> +				   unsigned long end, unsigned long pfn,
> +				   pgprot_t prot)
> +{
> +	p4d_t *p4d;
> +	pud_t *pud;
> +	unsigned long addr, next;
> +	int ret;
> +
> +	addr = start;
> +	do {
> +		p4d = p4d_offset(pgd, addr);
> +
> +		if (p4d_none(*p4d)) {
> +			pud = pud_alloc_one(NULL, addr);
> +			if (!pud) {
> +				kvm_err("Cannot allocate Hyp pud\n");
> +				return -ENOMEM;
> +			}
> +			kvm_p4d_populate(p4d, pud);
> +			get_page(virt_to_page(p4d));
> +		}
> +
> +		next = p4d_addr_end(addr, end);
> +		ret = create_hyp_pud_mappings(p4d, addr, next, pfn, prot);
> +		if (ret)
> +			return ret;
> +		pfn += (next - addr) >> PAGE_SHIFT;
> +	} while (addr = next, addr != end);
> +
> +	return 0;
> +}
> +
>  static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
>  				 unsigned long start, unsigned long end,
>  				 unsigned long pfn, pgprot_t prot)
>  {
>  	pgd_t *pgd;
> -	pud_t *pud;
> +	p4d_t *p4d;
>  	unsigned long addr, next;
>  	int err = 0;
>  
> @@ -707,18 +814,18 @@ static int __create_hyp_mappings(pgd_t *
>  		pgd = pgdp + kvm_pgd_index(addr, ptrs_per_pgd);
>  
>  		if (pgd_none(*pgd)) {
> -			pud = pud_alloc_one(NULL, addr);
> -			if (!pud) {
> -				kvm_err("Cannot allocate Hyp pud\n");
> +			p4d = p4d_alloc_one(NULL, addr);
> +			if (!p4d) {
> +				kvm_err("Cannot allocate Hyp p4d\n");
>  				err = -ENOMEM;
>  				goto out;
>  			}
> -			kvm_pgd_populate(pgd, pud);
> +			kvm_pgd_populate(pgd, p4d);
>  			get_page(virt_to_page(pgd));
>  		}
>  
>  		next = pgd_addr_end(addr, end);
> -		err = create_hyp_pud_mappings(pgd, addr, next, pfn, prot);
> +		err = create_hyp_p4d_mappings(pgd, addr, next, pfn, prot);
>  		if (err)
>  			goto out;
>  		pfn += (next - addr) >> PAGE_SHIFT;
> @@ -1015,22 +1122,40 @@ void kvm_free_stage2_pgd(struct kvm *kvm
>  		free_pages_exact(pgd, stage2_pgd_size(kvm));
>  }
>  
> -static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> +static p4d_t *stage2_get_p4d(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  			     phys_addr_t addr)
>  {
>  	pgd_t *pgd;
> -	pud_t *pud;
> +	p4d_t *p4d;
>  
>  	pgd = kvm->arch.pgd + stage2_pgd_index(kvm, addr);
>  	if (stage2_pgd_none(kvm, *pgd)) {
>  		if (!cache)
>  			return NULL;
> -		pud = mmu_memory_cache_alloc(cache);
> -		stage2_pgd_populate(kvm, pgd, pud);
> +		p4d = mmu_memory_cache_alloc(cache);
> +		stage2_pgd_populate(kvm, pgd, p4d);
>  		get_page(virt_to_page(pgd));
>  	}
>  
> -	return stage2_pud_offset(kvm, pgd, addr);
> +	return stage2_p4d_offset(kvm, pgd, addr);
> +}
> +
> +static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> +			     phys_addr_t addr)
> +{
> +	p4d_t *p4d;
> +	pud_t *pud;
> +
> +	p4d = stage2_get_p4d(kvm, cache, addr);
> +	if (stage2_p4d_none(kvm, *p4d)) {
> +		if (!cache)
> +			return NULL;
> +		pud = mmu_memory_cache_alloc(cache);
> +		stage2_p4d_populate(kvm, p4d, pud);
> +		get_page(virt_to_page(p4d));
> +	}
> +
> +	return stage2_pud_offset(kvm, p4d, addr);
>  }
>  
>  static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> @@ -1423,18 +1548,18 @@ static void stage2_wp_pmds(struct kvm *k
>  }
>  
>  /**
> - * stage2_wp_puds - write protect PGD range
> + * stage2_wp_puds - write protect P4D range
>   * @pgd:	pointer to pgd entry
>   * @addr:	range start address
>   * @end:	range end address
>   */
> -static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
> +static void  stage2_wp_puds(struct kvm *kvm, p4d_t *p4d,
>  			    phys_addr_t addr, phys_addr_t end)
>  {
>  	pud_t *pud;
>  	phys_addr_t next;
>  
> -	pud = stage2_pud_offset(kvm, pgd, addr);
> +	pud = stage2_pud_offset(kvm, p4d, addr);
>  	do {
>  		next = stage2_pud_addr_end(kvm, addr, end);
>  		if (!stage2_pud_none(kvm, *pud)) {
> @@ -1449,6 +1574,26 @@ static void  stage2_wp_puds(struct kvm *
>  }
>  
>  /**
> + * stage2_wp_p4ds - write protect PGD range
> + * @pgd:	pointer to pgd entry
> + * @addr:	range start address
> + * @end:	range end address
> + */
> +static void  stage2_wp_p4ds(struct kvm *kvm, pgd_t *pgd,
> +			    phys_addr_t addr, phys_addr_t end)
> +{
> +	p4d_t *p4d;
> +	phys_addr_t next;
> +
> +	p4d = stage2_p4d_offset(kvm, pgd, addr);
> +	do {
> +		next = stage2_p4d_addr_end(kvm, addr, end);
> +		if (!stage2_p4d_none(kvm, *p4d))
> +			stage2_wp_puds(kvm, p4d, addr, next);
> +	} while (p4d++, addr = next, addr != end);
> +}
> +
> +/**
>   * stage2_wp_range() - write protect stage2 memory region range
>   * @kvm:	The KVM pointer
>   * @addr:	Start address of range
> @@ -1475,7 +1620,7 @@ static void stage2_wp_range(struct kvm *
>  			break;
>  		next = stage2_pgd_addr_end(kvm, addr, end);
>  		if (stage2_pgd_present(kvm, *pgd))
> -			stage2_wp_puds(kvm, pgd, addr, next);
> +			stage2_wp_p4ds(kvm, pgd, addr, next);
>  	} while (pgd++, addr = next, addr != end);
>  }
>  
> --- a/arch/arm64/mm/fault.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/fault.c
> @@ -145,6 +145,7 @@ static void show_pte(unsigned long addr)
>  	pr_alert("[%016lx] pgd=%016llx", addr, pgd_val(pgd));
>  
>  	do {
> +		p4d_t *p4dp, p4d;
>  		pud_t *pudp, pud;
>  		pmd_t *pmdp, pmd;
>  		pte_t *ptep, pte;
> @@ -152,7 +153,13 @@ static void show_pte(unsigned long addr)
>  		if (pgd_none(pgd) || pgd_bad(pgd))
>  			break;
>  
> -		pudp = pud_offset(pgdp, addr);
> +		p4dp = p4d_offset(pgdp, addr);
> +		p4d = READ_ONCE(*p4dp);
> +		pr_cont(", p4d=%016llx", p4d_val(p4d));
> +		if (p4d_none(p4d) || p4d_bad(p4d))
> +			break;
> +
> +		pudp = pud_offset(p4dp, addr);
>  		pud = READ_ONCE(*pudp);
>  		pr_cont(", pud=%016llx", pud_val(pud));
>  		if (pud_none(pud) || pud_bad(pud))
> --- a/arch/arm64/mm/hugetlbpage.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/hugetlbpage.c
> @@ -67,11 +67,13 @@ static int find_num_contig(struct mm_str
>  			   pte_t *ptep, size_t *pgsize)
>  {
>  	pgd_t *pgdp = pgd_offset(mm, addr);
> +	p4d_t *p4dp;
>  	pud_t *pudp;
>  	pmd_t *pmdp;
>  
>  	*pgsize = PAGE_SIZE;
> -	pudp = pud_offset(pgdp, addr);
> +	p4dp = p4d_offset(pgdp, addr);
> +	pudp = pud_offset(p4dp, addr);
>  	pmdp = pmd_offset(pudp, addr);
>  	if ((pte_t *)pmdp == ptep) {
>  		*pgsize = PMD_SIZE;
> @@ -217,12 +219,14 @@ pte_t *huge_pte_alloc(struct mm_struct *
>  		      unsigned long addr, unsigned long sz)
>  {
>  	pgd_t *pgdp;
> +	p4d_t *p4dp;
>  	pud_t *pudp;
>  	pmd_t *pmdp;
>  	pte_t *ptep = NULL;
>  
>  	pgdp = pgd_offset(mm, addr);
> -	pudp = pud_alloc(mm, pgdp, addr);
> +	p4dp = p4d_offset(pgdp, addr);
> +	pudp = pud_alloc(mm, p4dp, addr);
>  	if (!pudp)
>  		return NULL;
>  
> @@ -261,6 +265,7 @@ pte_t *huge_pte_offset(struct mm_struct
>  		       unsigned long addr, unsigned long sz)
>  {
>  	pgd_t *pgdp;
> +	p4d_t *p4dp;
>  	pud_t *pudp, pud;
>  	pmd_t *pmdp, pmd;
>  
> @@ -268,7 +273,11 @@ pte_t *huge_pte_offset(struct mm_struct
>  	if (!pgd_present(READ_ONCE(*pgdp)))
>  		return NULL;
>  
> -	pudp = pud_offset(pgdp, addr);
> +	p4dp = p4d_offset(pgdp, addr);
> +	if (!p4d_present(READ_ONCE(*p4dp)))
> +		return NULL;
> +
> +	pudp = pud_offset(p4dp, addr);
>  	pud = READ_ONCE(*pudp);
>  	if (sz != PUD_SIZE && pud_none(pud))
>  		return NULL;
> --- a/arch/arm64/mm/kasan_init.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/kasan_init.c
> @@ -84,17 +84,17 @@ static pmd_t *__init kasan_pmd_offset(pu
>  	return early ? pmd_offset_kimg(pudp, addr) : pmd_offset(pudp, addr);
>  }
>  
> -static pud_t *__init kasan_pud_offset(pgd_t *pgdp, unsigned long addr, int node,
> +static pud_t *__init kasan_pud_offset(p4d_t *p4dp, unsigned long addr, int node,
>  				      bool early)
>  {
> -	if (pgd_none(READ_ONCE(*pgdp))) {
> +	if (p4d_none(READ_ONCE(*p4dp))) {
>  		phys_addr_t pud_phys = early ?
>  				__pa_symbol(kasan_early_shadow_pud)
>  					: kasan_alloc_zeroed_page(node);
> -		__pgd_populate(pgdp, pud_phys, PMD_TYPE_TABLE);
> +		__p4d_populate(p4dp, pud_phys, PMD_TYPE_TABLE);
>  	}
>  
> -	return early ? pud_offset_kimg(pgdp, addr) : pud_offset(pgdp, addr);
> +	return early ? pud_offset_kimg(p4dp, addr) : pud_offset(p4dp, addr);
>  }
>  
>  static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
> @@ -126,11 +126,11 @@ static void __init kasan_pmd_populate(pu
>  	} while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)));
>  }
>  
> -static void __init kasan_pud_populate(pgd_t *pgdp, unsigned long addr,
> +static void __init kasan_pud_populate(p4d_t *p4dp, unsigned long addr,
>  				      unsigned long end, int node, bool early)
>  {
>  	unsigned long next;
> -	pud_t *pudp = kasan_pud_offset(pgdp, addr, node, early);
> +	pud_t *pudp = kasan_pud_offset(p4dp, addr, node, early);
>  
>  	do {
>  		next = pud_addr_end(addr, end);
> @@ -138,6 +138,18 @@ static void __init kasan_pud_populate(pg
>  	} while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)));
>  }
>  
> +static void __init kasan_p4d_populate(pgd_t *pgdp, unsigned long addr,
> +				      unsigned long end, int node, bool early)
> +{
> +	unsigned long next;
> +	p4d_t *p4dp = p4d_offset(pgdp, addr);
> +
> +	do {
> +		next = p4d_addr_end(addr, end);
> +		kasan_pud_populate(p4dp, addr, next, node, early);
> +	} while (p4dp++, addr = next, addr != end);
> +}
> +
>  static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
>  				      int node, bool early)
>  {
> @@ -147,7 +159,7 @@ static void __init kasan_pgd_populate(un
>  	pgdp = pgd_offset_k(addr);
>  	do {
>  		next = pgd_addr_end(addr, end);
> -		kasan_pud_populate(pgdp, addr, next, node, early);
> +		kasan_p4d_populate(pgdp, addr, next, node, early);
>  	} while (pgdp++, addr = next, addr != end);
>  }
>  
> --- a/arch/arm64/mm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/mmu.c
> @@ -290,18 +290,19 @@ static void alloc_init_pud(pgd_t *pgdp,
>  {
>  	unsigned long next;
>  	pud_t *pudp;
> -	pgd_t pgd = READ_ONCE(*pgdp);
> +	p4d_t *p4dp = p4d_offset(pgdp, addr);
> +	p4d_t p4d = READ_ONCE(*p4dp);
>  
> -	if (pgd_none(pgd)) {
> +	if (p4d_none(p4d)) {
>  		phys_addr_t pud_phys;
>  		BUG_ON(!pgtable_alloc);
>  		pud_phys = pgtable_alloc(PUD_SHIFT);
> -		__pgd_populate(pgdp, pud_phys, PUD_TYPE_TABLE);
> -		pgd = READ_ONCE(*pgdp);
> +		__p4d_populate(p4dp, pud_phys, PUD_TYPE_TABLE);
> +		p4d = READ_ONCE(*p4dp);
>  	}
> -	BUG_ON(pgd_bad(pgd));
> +	BUG_ON(p4d_bad(p4d));
>  
> -	pudp = pud_set_fixmap_offset(pgdp, addr);
> +	pudp = pud_set_fixmap_offset(p4dp, addr);
>  	do {
>  		pud_t old_pud = READ_ONCE(*pudp);
>  
> @@ -672,6 +673,7 @@ static void __init map_kernel(pgd_t *pgd
>  			READ_ONCE(*pgd_offset_k(FIXADDR_START)));
>  	} else if (CONFIG_PGTABLE_LEVELS > 3) {
>  		pgd_t *bm_pgdp;
> +		p4d_t *bm_p4dp;
>  		pud_t *bm_pudp;
>  		/*
>  		 * The fixmap shares its top level pgd entry with the kernel
> @@ -681,7 +683,8 @@ static void __init map_kernel(pgd_t *pgd
>  		 */
>  		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
>  		bm_pgdp = pgd_offset_raw(pgdp, FIXADDR_START);
> -		bm_pudp = pud_set_fixmap_offset(bm_pgdp, FIXADDR_START);
> +		bm_p4dp = p4d_offset(bm_pgdp, FIXADDR_START);
> +		bm_pudp = pud_set_fixmap_offset(bm_p4dp, FIXADDR_START);
>  		pud_populate(&init_mm, bm_pudp, lm_alias(bm_pmd));
>  		pud_clear_fixmap();
>  	} else {
> @@ -715,6 +718,7 @@ void __init paging_init(void)
>  int kern_addr_valid(unsigned long addr)
>  {
>  	pgd_t *pgdp;
> +	p4d_t *p4dp;
>  	pud_t *pudp, pud;
>  	pmd_t *pmdp, pmd;
>  	pte_t *ptep, pte;
> @@ -726,7 +730,11 @@ int kern_addr_valid(unsigned long addr)
>  	if (pgd_none(READ_ONCE(*pgdp)))
>  		return 0;
>  
> -	pudp = pud_offset(pgdp, addr);
> +	p4dp = p4d_offset(pgdp, addr);
> +	if (p4d_none(READ_ONCE(*p4dp)))
> +		return 0;
> +
> +	pudp = pud_offset(p4dp, addr);
>  	pud = READ_ONCE(*pudp);
>  	if (pud_none(pud))
>  		return 0;
> @@ -1069,6 +1077,7 @@ int __meminit vmemmap_populate(unsigned
>  	unsigned long addr = start;
>  	unsigned long next;
>  	pgd_t *pgdp;
> +	p4d_t *p4dp;
>  	pud_t *pudp;
>  	pmd_t *pmdp;
>  
> @@ -1079,7 +1088,11 @@ int __meminit vmemmap_populate(unsigned
>  		if (!pgdp)
>  			return -ENOMEM;
>  
> -		pudp = vmemmap_pud_populate(pgdp, addr, node);
> +		p4dp = vmemmap_p4d_populate(pgdp, addr, node);
> +		if (!p4dp)
> +			return -ENOMEM;
> +
> +		pudp = vmemmap_pud_populate(p4dp, addr, node);
>  		if (!pudp)
>  			return -ENOMEM;
>  
> @@ -1114,11 +1127,12 @@ void vmemmap_free(unsigned long start, u
>  static inline pud_t * fixmap_pud(unsigned long addr)
>  {
>  	pgd_t *pgdp = pgd_offset_k(addr);
> -	pgd_t pgd = READ_ONCE(*pgdp);
> +	p4d_t *p4dp = p4d_offset(pgdp, addr);
> +	p4d_t p4d = READ_ONCE(*p4dp);
>  
> -	BUG_ON(pgd_none(pgd) || pgd_bad(pgd));
> +	BUG_ON(p4d_none(p4d) || p4d_bad(p4d));
>  
> -	return pud_offset_kimg(pgdp, addr);
> +	return pud_offset_kimg(p4dp, addr);
>  }
>  
>  static inline pmd_t * fixmap_pmd(unsigned long addr)
> @@ -1144,25 +1158,27 @@ static inline pte_t * fixmap_pte(unsigne
>   */
>  void __init early_fixmap_init(void)
>  {
> -	pgd_t *pgdp, pgd;
> +	pgd_t *pgdp;
> +	p4d_t *p4dp, p4d;
>  	pud_t *pudp;
>  	pmd_t *pmdp;
>  	unsigned long addr = FIXADDR_START;
>  
>  	pgdp = pgd_offset_k(addr);
> -	pgd = READ_ONCE(*pgdp);
> +	p4dp = p4d_offset(pgdp, addr);
> +	p4d = READ_ONCE(*p4dp);
>  	if (CONFIG_PGTABLE_LEVELS > 3 &&
> -	    !(pgd_none(pgd) || pgd_page_paddr(pgd) == __pa_symbol(bm_pud))) {
> +	    !(p4d_none(p4d) || p4d_page_paddr(p4d) == __pa_symbol(bm_pud))) {
>  		/*
>  		 * We only end up here if the kernel mapping and the fixmap
>  		 * share the top level pgd entry, which should only happen on
>  		 * 16k/4 levels configurations.
>  		 */
>  		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
> -		pudp = pud_offset_kimg(pgdp, addr);
> +		pudp = pud_offset_kimg(p4dp, addr);
>  	} else {
> -		if (pgd_none(pgd))
> -			__pgd_populate(pgdp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
> +		if (p4d_none(p4d))
> +			__p4d_populate(p4dp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
>  		pudp = fixmap_pud(addr);
>  	}
>  	if (pud_none(READ_ONCE(*pudp)))
> --- a/arch/arm64/mm/pageattr.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/pageattr.c
> @@ -198,6 +198,7 @@ void __kernel_map_pages(struct page *pag
>  bool kernel_page_present(struct page *page)
>  {
>  	pgd_t *pgdp;
> +	p4d_t *p4dp;
>  	pud_t *pudp, pud;
>  	pmd_t *pmdp, pmd;
>  	pte_t *ptep;
> @@ -210,7 +211,11 @@ bool kernel_page_present(struct page *pa
>  	if (pgd_none(READ_ONCE(*pgdp)))
>  		return false;
>  
> -	pudp = pud_offset(pgdp, addr);
> +	p4dp = p4d_offset(pgdp, addr);
> +	if (p4d_none(READ_ONCE(*p4dp)))
> +		return false;
> +
> +	pudp = pud_offset(p4dp, addr);
>  	pud = READ_ONCE(*pudp);
>  	if (pud_none(pud))
>  		return false;
> _
> 
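
For reference, every hunk in the patch above applies the same mechanical
transformation: walks that used to step straight from the pgd to the pud
level now go through the p4d level first. A minimal sketch of the resulting
lookup pattern -- walk_to_pte() is a hypothetical helper for illustration,
not part of the patch:

static pte_t *walk_to_pte(struct mm_struct *mm, unsigned long addr)
{
	pgd_t *pgd = pgd_offset(mm, addr);
	p4d_t *p4d;
	pud_t *pud;
	pmd_t *pmd;

	if (pgd_none(*pgd) || pgd_bad(*pgd))
		return NULL;
	/* A no-op cast on architectures where the p4d level is folded */
	p4d = p4d_offset(pgd, addr);
	if (p4d_none(*p4d) || p4d_bad(*p4d))
		return NULL;
	pud = pud_offset(p4d, addr);
	if (pud_none(*pud) || pud_bad(*pud))
		return NULL;
	pmd = pmd_offset(pud, addr);
	if (pmd_none(*pmd) || pmd_bad(*pmd))
		return NULL;
	return pte_offset_kernel(pmd, addr);
}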

-- 
Sincerely yours,
Mike.


* Re: [PATCH v4 08/14] powerpc: add support for folded p4d page tables
  2020-04-14 15:34 ` [PATCH v4 08/14] powerpc: " Mike Rapoport
@ 2020-06-03 19:05   ` Andrew Morton
  2020-06-04  9:50     ` Qian Cai
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2020-06-03 19:05 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev

On Tue, 14 Apr 2020 18:34:49 +0300 Mike Rapoport <rppt@kernel.org> wrote:

> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate and replace 5level-fixup.h with pgtable-nop4d.h.

A bunch of new material just landed in linux-next/powerpc.

The timing is awkward!  I trust this will be going into mainline during
this merge window?  If not, please drop it and repull after -rc1.

arch/powerpc/mm/ptdump/ptdump.c:walk_pagetables() was a problem. 
Here's what I ended up with - please check.

static void walk_pagetables(struct pg_state *st)
{
	unsigned int i;
	unsigned long addr = st->start_address & PGDIR_MASK;
	pgd_t *pgd = pgd_offset_k(addr);

	/*
	 * Traverse the linux pagetable structure and dump pages that are in
	 * the hash pagetable.
	 */
	for (i = pgd_index(addr); i < PTRS_PER_PGD; i++, pgd++, addr += PGDIR_SIZE) {
		p4d_t *p4d = p4d_offset(pgd, 0);

		if (pgd_none(*pgd) || pgd_is_leaf(*pgd))
			note_page(st, addr, 1, p4d_val(*p4d), PGDIR_SIZE);
		else if (is_hugepd(__hugepd(p4d_val(*p4d))))
			walk_hugepd(st, (hugepd_t *)pgd, addr, PGDIR_SHIFT, 1);
		else
			/* pgd exists */
			walk_pud(st, p4d, addr);
	}
}
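
(For readers checking the above: the p4d_offset(pgd, 0) / p4d_val(*p4d)
combination works because powerpc now folds the p4d level through the
generic include/asm-generic/pgtable-nop4d.h. Roughly paraphrased, the
relevant definitions there reduce to the following -- a sketch of the
generic header, not powerpc-specific code:

typedef struct { pgd_t pgd; } p4d_t;

/*
 * With p4d folded, a "p4d" is just a typed view of the pgd entry,
 * so taking its offset is a cast rather than a table lookup.
 */
static inline p4d_t *p4d_offset(pgd_t *pgd, unsigned long address)
{
	return (p4d_t *)pgd;
}

#define p4d_val(x)	pgd_val((x).pgd)

That is why passing 0 as the address above is safe: the address argument
is never used to index anything.)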

Mike's series "mm: consolidate definitions of page table accessors"
took quite a lot of damage as well.  Patches which needed rework as a
result of this were:

powerpc-add-support-for-folded-p4d-page-tables-fix.patch
mm-introduce-include-linux-pgtableh.patch
mm-reorder-includes-after-introduction-of-linux-pgtableh.patch
mm-pgtable-add-shortcuts-for-accessing-kernel-pmd-and-pte.patch
mm-pgtable-add-shortcuts-for-accessing-kernel-pmd-and-pte-fix-2.patch
mm-consolidate-pte_index-and-pte_offset_-definitions.patch
mm-consolidate-pmd_index-and-pmd_offset-definitions.patch
mm-consolidate-pud_index-and-pud_offset-definitions.patch
mm-consolidate-pgd_index-and-pgd_offset_k-definitions.patch



* Re: [PATCH v4 08/14] powerpc: add support for folded p4d page tables
  2020-06-03 19:05   ` Andrew Morton
@ 2020-06-04  9:50     ` Qian Cai
  0 siblings, 0 replies; 25+ messages in thread
From: Qian Cai @ 2020-06-04  9:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh,
	Benjamin Herrenschmidt, linux-mm, Paul Mackerras, linux-hexagon,
	Will Deacon, kvmarm, Jonas Bonn, linux-arch, Brian Cain,
	Marc Zyngier, Russell King, Ley Foon Tan, Mike Rapoport,
	Catalin Marinas, uclinux-h8-devel, Fenghua Yu, Arnd Bergmann,
	kvm-ppc, Stefan Kristiansson, openrisc, Stafford Horne,
	Guan Xuetao, linux-arm-kernel, Christophe Leroy, Tony Luck,
	Yoshinori Sato, linux-kernel, Michael Ellerman, nios2-dev,
	linuxppc-dev, Mike Rapoport



> On Jun 3, 2020, at 3:05 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> A bunch of new material just landed in linux-next/powerpc.
> 
> The timing is awkward!  I trust this will be going into mainline during
> this merge window?  If not, please drop it and repull after -rc1.

I have noticed the same pattern over and over again: a lot of new powerpc material shows up in linux-next only a few days before a pull request is sent to Linus.

There is absolutely no safety net for this kind of practice. The main problem is that Linus seems totally fine with it.


Thread overview: 25+ messages
2020-04-14 15:34 [PATCH v4 00/14] mm: remove __ARCH_HAS_5LEVEL_HACK Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 01/14] h8300: remove usage of __ARCH_USE_5LEVEL_HACK Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 02/14] arm: add support for folded p4d page tables Mike Rapoport
     [not found]   ` <CGME20200507121658eucas1p240cf4a3e0fe5c22dda5ec4f72734149f@eucas1p2.samsung.com>
2020-05-07 12:16     ` Marek Szyprowski
2020-05-07 16:11       ` Mike Rapoport
2020-05-08  6:53         ` Marek Szyprowski
2020-05-08 17:42           ` Mike Rapoport
2020-05-11  6:36             ` Marek Szyprowski
2020-05-11 14:15               ` Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 03/14] arm64: " Mike Rapoport
2020-05-15 18:40   ` Andrew Morton
2020-05-16 17:20     ` Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 04/14] hexagon: remove __ARCH_USE_5LEVEL_HACK Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 05/14] ia64: add support for folded p4d page tables Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 06/14] nios2: " Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 07/14] openrisc: " Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 08/14] powerpc: " Mike Rapoport
2020-06-03 19:05   ` Andrew Morton
2020-06-04  9:50     ` Qian Cai
2020-04-14 15:34 ` [PATCH v4 09/14] sh: fault: Modernize printing of kernel messages Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 10/14] sh: drop __pXd_offset() macros that duplicate pXd_index() ones Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 11/14] sh: add support for folded p4d page tables Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 12/14] unicore32: remove __ARCH_USE_5LEVEL_HACK Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 13/14] asm-generic: remove pgtable-nop4d-hack.h Mike Rapoport
2020-04-14 15:34 ` [PATCH v4 14/14] mm: remove __ARCH_HAS_5LEVEL_HACK and include/asm-generic/5level-fixup.h Mike Rapoport
