* [PATCH 00/13] Convert x86 & arm64 to use generic page walk
@ 2019-02-15 17:02 Steven Price
  2019-02-15 17:02 ` [PATCH 01/13] arm64: mm: Add p?d_large() definitions Steven Price
                   ` (12 more replies)
  0 siblings, 13 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

Most architectures currently have a debugfs file for dumping the kernel
page tables. Each architecture has to implement custom functions for
walking the page tables because the generic walk_page_range() function
is unable to walk the page tables used by the kernel.

This series extends the capabilities of walk_page_range() so that it can
deal with the page tables of the kernel (which have no VMAs and can
contain larger huge pages than exist for user space). x86 and arm64 are
then converted to make use of walk_page_range() removing the custom page
table walkers.

Potentially, future changes could unify the implementations of the
debugfs walkers further, moving the common functionality into common
code. This would require a common way of handling the effective
permissions (currently implemented only for x86) along with a per-arch
way of formatting the page table information for debugfs. One
immediate benefit would be getting the KASAN speed-up optimisation on
arm64 (and other arches), which is currently only implemented for x86.
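
For reference, here is a minimal sketch of what a consumer of the
extended walker can look like (illustrative only, not part of this
series; the callbacks are hypothetical and use the signatures
introduced in patches 4-6, walking the vmalloc area as an example):

  static int dump_pte_entry(pte_t *pte, unsigned long addr,
                            unsigned long next, struct mm_walk *walk)
  {
          struct seq_file *m = walk->private;

          seq_printf(m, "0x%016lx: pte 0x%016llx\n", addr,
                     (unsigned long long)pte_val(*pte));
          return 0;
  }

  static int dump_pte_hole(unsigned long addr, unsigned long next,
                           int depth, struct mm_walk *walk)
  {
          struct seq_file *m = walk->private;

          seq_printf(m, "0x%016lx-0x%016lx: hole (level %d)\n",
                     addr, next, depth);
          return 0;
  }

  static void dump_kernel_tables(struct seq_file *m)
  {
          struct mm_walk walk = {
                  .mm = &init_mm,
                  .private = m,
                  .pte_entry = dump_pte_entry,
                  .pte_hole = dump_pte_hole,
          };

          down_read(&init_mm.mmap_sem);
          walk_page_range(VMALLOC_START, VMALLOC_END, &walk);
          up_read(&init_mm.mmap_sem);
  }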

James Morse (2):
  arm64: mm: Add p?d_large() definitions
  mm: Add generic p?d_large() macros

Steven Price (11):
  x86/mm: Add p?d_large() definitions
  mm: pagewalk: Add p4d_entry() and pgd_entry()
  mm: pagewalk: Allow walking without vma
  mm: pagewalk: Add 'depth' parameter to pte_hole
  mm: pagewalk: Add test_p?d callbacks
  arm64: mm: Convert mm/dump.c to use walk_page_range()
  x86/mm: Point to struct seq_file from struct pg_state
  x86/mm+efi: Convert ptdump_walk_pgd_level() to take a mm_struct
  x86/mm: Convert ptdump_walk_pgd_level_debugfs() to take an mm_struct
  x86/mm: Convert ptdump_walk_pgd_level_core() to take an mm_struct
  x86: mm: Convert dump_pagetables to use walk_page_range

 arch/arm64/include/asm/pgtable.h |   2 +
 arch/arm64/mm/dump.c             | 108 +++++-----
 arch/x86/include/asm/pgtable.h   |   8 +-
 arch/x86/mm/debug_pagetables.c   |   8 +-
 arch/x86/mm/dump_pagetables.c    | 342 ++++++++++++++++---------------
 arch/x86/platform/efi/efi_32.c   |   2 +-
 arch/x86/platform/efi/efi_64.c   |   4 +-
 fs/proc/task_mmu.c               |   4 +-
 include/asm-generic/pgtable.h    |  10 +
 include/linux/mm.h               |  25 ++-
 mm/hmm.c                         |   2 +-
 mm/migrate.c                     |   1 +
 mm/mincore.c                     |   1 +
 mm/pagewalk.c                    |  92 ++++++---
 14 files changed, 350 insertions(+), 259 deletions(-)

-- 
2.20.1


* [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-18 11:16   ` Mark Rutland
  2019-02-18 11:29   ` Peter Zijlstra
  2019-02-15 17:02 ` [PATCH 02/13] x86/mm: " Steven Price
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

From: James Morse <james.morse@arm.com>

Exposing the pud/pgd levels of the page tables to walk_page_range() means
we may come across the exotic large mappings that come with large areas
of contiguous memory (such as the kernel's linear map).

Expose p?d_large() from each architecture to detect these large mappings.

arm64 already has macros with this meaning, but under a different name:
p?d_large() is used by s390, sparc and x86, while only arm/arm64 use
p?d_sect(). Add macros to allow both names.

By not providing a pgd_large(), we get the generic version that always
returns 0.
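
As an illustration (not part of this patch; note_entry() is a
hypothetical helper), generic code can now stop descending when it
hits a section mapping without knowing the arm64-specific name:

  static int walk_one_pud(pud_t *pud, unsigned long addr)
  {
          pud_t val = READ_ONCE(*pud);

          if (pud_none(val))
                  return 0;               /* hole */
          if (pud_large(val))             /* pud_sect() on arm64 */
                  return note_entry(addr, pud_val(val));
          /* otherwise a table entry: descend to the pmd level */
          return 0;
  }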

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index de70c1eabf33..09d308921625 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -428,6 +428,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				 PMD_TYPE_TABLE)
 #define pmd_sect(pmd)		((pmd_val(pmd) & PMD_TYPE_MASK) == \
 				 PMD_TYPE_SECT)
+#define pmd_large(x)		pmd_sect(x)
 
 #if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3
 #define pud_sect(pud)		(0)
@@ -435,6 +436,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 #else
 #define pud_sect(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
 				 PUD_TYPE_SECT)
+#define pud_large(x)		pud_sect(x)
 #define pud_table(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
 				 PUD_TYPE_TABLE)
 #endif
-- 
2.20.1


* [PATCH 02/13] x86/mm: Add p?d_large() definitions
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
  2019-02-15 17:02 ` [PATCH 01/13] arm64: mm: Add p?d_large() definitions Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 03/13] mm: Add generic p?d_large() macros Steven Price
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

Exposing the pud/pgd levels of the page tables to walk_page_range() means
we may come across the exotic large mappings that come with large areas
of contiguous memory (such as the kernel's linear map).

Expose p?d_large() from each architecture to detect these large mappings.

x86 already has these defined as inline functions; add a macro of the
same name so we don't also pick up the generic version.
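
The pattern relies on the generic fallback (added by the next patch)
only being used when the name is not already defined as a macro; in
sketch form:

  /* x86 header: function unchanged, the macro shadows the fallback */
  static inline int pmd_large(pmd_t pte)
  {
          return pmd_flags(pte) & _PAGE_PSE;
  }
  #define pmd_large(x)    pmd_large(x)

  /* include/asm-generic/pgtable.h (next patch): skipped on x86 */
  #ifndef pmd_large
  #define pmd_large(x)    0
  #endif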

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/x86/include/asm/pgtable.h | 3 +++
 arch/x86/mm/dump_pagetables.c  | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23..3695f6acb6af 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -234,6 +234,7 @@ static inline int pmd_large(pmd_t pte)
 {
 	return pmd_flags(pte) & _PAGE_PSE;
 }
+#define pmd_large(x)	pmd_large(x)
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline int pmd_trans_huge(pmd_t pmd)
@@ -873,6 +874,7 @@ static inline int pud_large(pud_t pud)
 	return 0;
 }
 #endif	/* CONFIG_PGTABLE_LEVELS > 2 */
+#define pud_large(x)	pud_large(x)
 
 static inline unsigned long pud_index(unsigned long address)
 {
@@ -1214,6 +1216,7 @@ static inline bool pgdp_maps_userspace(void *__ptr)
 }
 
 static inline int pgd_large(pgd_t pgd) { return 0; }
+#define pgd_large(x)	pgd_large(x)
 
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
 /*
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index e3cdc85ce5b6..695647dc9cb9 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -432,6 +432,7 @@ static void walk_pmd_level(struct seq_file *m, struct pg_state *st, pud_t addr,
 
 #else
 #define walk_pmd_level(m,s,a,e,p) walk_pte_level(m,s,__pmd(pud_val(a)),e,p)
+#undef pud_large
 #define pud_large(a) pmd_large(__pmd(pud_val(a)))
 #define pud_none(a)  pmd_none(__pmd(pud_val(a)))
 #endif
@@ -503,6 +504,7 @@ static void walk_p4d_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
 	}
 }
 
+#undef pgd_large
 #define pgd_large(a) (pgtable_l5_enabled() ? pgd_large(a) : p4d_large(__p4d(pgd_val(a))))
 #define pgd_none(a)  (pgtable_l5_enabled() ? pgd_none(a) : p4d_none(__p4d(pgd_val(a))))
 
-- 
2.20.1


* [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
  2019-02-15 17:02 ` [PATCH 01/13] arm64: mm: Add p?d_large() definitions Steven Price
  2019-02-15 17:02 ` [PATCH 02/13] x86/mm: " Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-18 11:14   ` Mark Rutland
  2019-02-18 11:31   ` Peter Zijlstra
  2019-02-15 17:02 ` [PATCH 04/13] mm: pagewalk: Add p4d_entry() and pgd_entry() Steven Price
                   ` (9 subsequent siblings)
  12 siblings, 2 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

From: James Morse <james.morse@arm.com>

Exposing the pud/pgd levels of the page tables to walk_page_range() means
we may come across the exotic large mappings that come with large areas
of contiguous memory (such as the kernel's linear map).

For architectures that don't provide p?d_large() macros, provide a
default that does nothing (always returns 0).

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 include/asm-generic/pgtable.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 05e61e6c843f..7630d663cd51 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1186,4 +1186,14 @@ static inline bool arch_has_pfn_modify_check(void)
 #define mm_pmd_folded(mm)	__is_defined(__PAGETABLE_PMD_FOLDED)
 #endif
 
+#ifndef pgd_large
+#define pgd_large(x)	0
+#endif
+#ifndef pud_large
+#define pud_large(x)	0
+#endif
+#ifndef pmd_large
+#define pmd_large(x)	0
+#endif
+
 #endif /* _ASM_GENERIC_PGTABLE_H */
-- 
2.20.1


* [PATCH 04/13] mm: pagewalk: Add p4d_entry() and pgd_entry()
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (2 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 03/13] mm: Add generic p?d_large() macros Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 05/13] mm: pagewalk: Allow walking without vma Steven Price
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

pgd_entry() and pud_entry() were removed by commit 0b1fbfe50006c410
("mm/pagewalk: remove pgd_entry() and pud_entry()") because there were
no users. We're about to add users so reintroduce them, along with
p4d_entry() as we now have 5 levels of tables.

Note that commit a00cc7d9dd93d66a ("mm, x86: add support for
PUD-sized transparent hugepages") already re-added pud_entry(), but with
different semantics from the other callbacks. Since there have never
been upstream users of this, revert its semantics to match the other
callbacks. This means pud_entry() is called for all entries, not just
transparent huge pages.
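
A caller that only wants huge PUDs can recover the old behaviour in
its callback; a sketch (pud_trans_huge_lock() is the existing helper
the core walker called before this change):

  static int my_pud_entry(pud_t *pud, unsigned long addr,
                          unsigned long next, struct mm_walk *walk)
  {
          spinlock_t *ptl = pud_trans_huge_lock(pud, walk->vma);

          if (!ptl)
                  return 0;       /* not huge: the walk descends */
          /* ... handle the huge PUD ... */
          spin_unlock(ptl);
          return 0;
  }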

Signed-off-by: Steven Price <steven.price@arm.com>
---
 include/linux/mm.h |  9 ++++++---
 mm/pagewalk.c      | 27 ++++++++++++++++-----------
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..1a4b1615d012 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1412,10 +1412,9 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
 
 /**
  * mm_walk - callbacks for walk_page_range
+ * @pgd_entry: if set, called for each non-empty PGD (top-level) entry
+ * @p4d_entry: if set, called for each non-empty P4D (1st-level) entry
  * @pud_entry: if set, called for each non-empty PUD (2nd-level) entry
- *	       this handler should only handle pud_trans_huge() puds.
- *	       the pmd_entry or pte_entry callbacks will be used for
- *	       regular PUDs.
  * @pmd_entry: if set, called for each non-empty PMD (3rd-level) entry
  *	       this handler is required to be able to handle
  *	       pmd_trans_huge() pmds.  They may simply choose to
@@ -1435,6 +1434,10 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
  * (see the comment on walk_page_range() for more details)
  */
 struct mm_walk {
+	int (*pgd_entry)(pgd_t *pgd, unsigned long addr,
+			 unsigned long next, struct mm_walk *walk);
+	int (*p4d_entry)(p4d_t *p4d, unsigned long addr,
+			 unsigned long next, struct mm_walk *walk);
 	int (*pud_entry)(pud_t *pud, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
 	int (*pmd_entry)(pmd_t *pmd, unsigned long addr,
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index c3084ff2569d..98373a9f88b8 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -90,15 +90,9 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 		}
 
 		if (walk->pud_entry) {
-			spinlock_t *ptl = pud_trans_huge_lock(pud, walk->vma);
-
-			if (ptl) {
-				err = walk->pud_entry(pud, addr, next, walk);
-				spin_unlock(ptl);
-				if (err)
-					break;
-				continue;
-			}
+			err = walk->pud_entry(pud, addr, next, walk);
+			if (err)
+				break;
 		}
 
 		split_huge_pud(walk->vma, pud, addr);
@@ -131,7 +125,12 @@ static int walk_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
 				break;
 			continue;
 		}
-		if (walk->pmd_entry || walk->pte_entry)
+		if (walk->p4d_entry) {
+			err = walk->p4d_entry(p4d, addr, next, walk);
+			if (err)
+				break;
+		}
+		if (walk->pud_entry || walk->pmd_entry || walk->pte_entry)
 			err = walk_pud_range(p4d, addr, next, walk);
 		if (err)
 			break;
@@ -157,7 +156,13 @@ static int walk_pgd_range(unsigned long addr, unsigned long end,
 				break;
 			continue;
 		}
-		if (walk->pmd_entry || walk->pte_entry)
+		if (walk->pgd_entry) {
+			err = walk->pgd_entry(pgd, addr, next, walk);
+			if (err)
+				break;
+		}
+		if (walk->p4d_entry || walk->pud_entry || walk->pmd_entry ||
+				walk->pte_entry)
 			err = walk_p4d_range(pgd, addr, next, walk);
 		if (err)
 			break;
-- 
2.20.1


* [PATCH 05/13] mm: pagewalk: Allow walking without vma
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (3 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 04/13] mm: pagewalk: Add p4d_entry() and pgd_entry() Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole Steven Price
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

Since 48684a65b4e3: "mm: pagewalk: fix misbehavior of walk_page_range
for vma(VM_PFNMAP)", walk_page_range() will report any kernel area as
a hole, because it lacks a vma.

This means each arch has re-implemented page table walking when needed,
for example in the per-arch ptdump walker.

Remove the requirement to have a vma except when trying to split huge
pages.
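
With this change a caller can hand walk_page_range() a range of
kernel addresses (e.g. with walk->mm = &init_mm). One point worth
noting, shown by this hypothetical callback (note_mapping() is made
up for the sketch): when walk->vma is NULL huge mappings are never
split, so leaf entries must be handled in pmd_entry()/pud_entry() and
the levels below them are simply skipped.

  static int my_pmd_entry(pmd_t *pmd, unsigned long addr,
                          unsigned long next, struct mm_walk *walk)
  {
          if (pmd_large(*pmd))    /* leaf: no pte table below */
                  return note_mapping(addr, next, pmd_val(*pmd));
          return 0;               /* table: pte_entry sees the ptes */
  }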

Signed-off-by: Steven Price <steven.price@arm.com>
---
 mm/pagewalk.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 98373a9f88b8..dac0c848b458 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -36,7 +36,7 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 	do {
 again:
 		next = pmd_addr_end(addr, end);
-		if (pmd_none(*pmd) || !walk->vma) {
+		if (pmd_none(*pmd)) {
 			if (walk->pte_hole)
 				err = walk->pte_hole(addr, next, walk);
 			if (err)
@@ -59,9 +59,14 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 		if (!walk->pte_entry)
 			continue;
 
-		split_huge_pmd(walk->vma, pmd, addr);
-		if (pmd_trans_unstable(pmd))
-			goto again;
+		if (walk->vma) {
+			split_huge_pmd(walk->vma, pmd, addr);
+			if (pmd_trans_unstable(pmd))
+				goto again;
+		} else if (pmd_large(*pmd)) {
+			continue;
+		}
+
 		err = walk_pte_range(pmd, addr, next, walk);
 		if (err)
 			break;
@@ -81,7 +86,7 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 	do {
  again:
 		next = pud_addr_end(addr, end);
-		if (pud_none(*pud) || !walk->vma) {
+		if (pud_none(*pud)) {
 			if (walk->pte_hole)
 				err = walk->pte_hole(addr, next, walk);
 			if (err)
@@ -95,9 +100,13 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 				break;
 		}
 
-		split_huge_pud(walk->vma, pud, addr);
-		if (pud_none(*pud))
-			goto again;
+		if (walk->vma) {
+			split_huge_pud(walk->vma, pud, addr);
+			if (pud_none(*pud))
+				goto again;
+		} else if (pud_large(*pud)) {
+			continue;
+		}
 
 		if (walk->pmd_entry || walk->pte_entry)
 			err = walk_pmd_range(pud, addr, next, walk);
-- 
2.20.1


* [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (4 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 05/13] mm: pagewalk: Allow walking without vma Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-18 11:23   ` Mark Rutland
  2019-02-20 11:35   ` William Kucharski
  2019-02-15 17:02 ` [PATCH 07/13] mm: pagewalk: Add test_p?d callbacks Steven Price
                   ` (6 subsequent siblings)
  12 siblings, 2 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

The pte_hole() callback is called at multiple levels of the page tables.
Code dumping the kernel page tables needs to know at what depth the
missing entry is. Add this as an extra parameter to pte_hole(). When the
depth isn't known (e.g. when processing a vma), -1 is passed.

Note that depth starts at 0 for a PGD so that PUD/PMD/PTE retain their
natural numbers as levels 2/3/4.
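
A callback can map the new argument onto a level name, for example
(sketch only; pr_info() just for illustration):

  static const char * const level_name[] =
          { "pgd", "p4d", "pud", "pmd", "pte" };

  static int my_pte_hole(unsigned long addr, unsigned long next,
                         int depth, struct mm_walk *walk)
  {
          if (depth < 0)          /* unknown, e.g. a VM_PFNMAP vma */
                  pr_info("%#lx-%#lx: hole\n", addr, next);
          else
                  pr_info("%#lx-%#lx: no %s entry\n", addr, next,
                          level_name[depth]);
          return 0;
  }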

Signed-off-by: Steven Price <steven.price@arm.com>
---
 fs/proc/task_mmu.c |  4 ++--
 include/linux/mm.h |  5 +++--
 mm/hmm.c           |  2 +-
 mm/migrate.c       |  1 +
 mm/mincore.c       |  1 +
 mm/pagewalk.c      | 16 ++++++++++------
 6 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f0ec9edab2f3..91131cd4e9e0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -474,7 +474,7 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 
 #ifdef CONFIG_SHMEM
 static int smaps_pte_hole(unsigned long addr, unsigned long end,
-		struct mm_walk *walk)
+			  __always_unused int depth, struct mm_walk *walk)
 {
 	struct mem_size_stats *mss = walk->private;
 
@@ -1203,7 +1203,7 @@ static int add_to_pagemap(unsigned long addr, pagemap_entry_t *pme,
 }
 
 static int pagemap_pte_hole(unsigned long start, unsigned long end,
-				struct mm_walk *walk)
+			    __always_unused int depth, struct mm_walk *walk)
 {
 	struct pagemapread *pm = walk->private;
 	unsigned long addr = start;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1a4b1615d012..0418a018d7b3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1420,7 +1420,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
  *	       pmd_trans_huge() pmds.  They may simply choose to
  *	       split_huge_page() instead of handling it explicitly.
  * @pte_entry: if set, called for each non-empty PTE (4th-level) entry
- * @pte_hole: if set, called for each hole at all levels
+ * @pte_hole: if set, called for each hole at all levels,
+ *            depth is -1 if not known
  * @hugetlb_entry: if set, called for each hugetlb entry
  * @test_walk: caller specific callback function to determine whether
  *             we walk over the current vma or not. Returning 0
@@ -1445,7 +1446,7 @@ struct mm_walk {
 	int (*pte_entry)(pte_t *pte, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
 	int (*pte_hole)(unsigned long addr, unsigned long next,
-			struct mm_walk *walk);
+			int depth, struct mm_walk *walk);
 	int (*hugetlb_entry)(pte_t *pte, unsigned long hmask,
 			     unsigned long addr, unsigned long next,
 			     struct mm_walk *walk);
diff --git a/mm/hmm.c b/mm/hmm.c
index a04e4b810610..e3e6b8fda437 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -440,7 +440,7 @@ static void hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
 }
 
 static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
-			     struct mm_walk *walk)
+			     __always_unused int depth, struct mm_walk *walk)
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
diff --git a/mm/migrate.c b/mm/migrate.c
index d4fd680be3b0..8b62a9fecb5c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2121,6 +2121,7 @@ struct migrate_vma {
 
 static int migrate_vma_collect_hole(unsigned long start,
 				    unsigned long end,
+				    __always_unused int depth,
 				    struct mm_walk *walk)
 {
 	struct migrate_vma *migrate = walk->private;
diff --git a/mm/mincore.c b/mm/mincore.c
index 218099b5ed31..c4edbc688241 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -104,6 +104,7 @@ static int __mincore_unmapped_range(unsigned long addr, unsigned long end,
 }
 
 static int mincore_unmapped_range(unsigned long addr, unsigned long end,
+				   __always_unused int depth,
 				   struct mm_walk *walk)
 {
 	walk->private += __mincore_unmapped_range(addr, end,
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index dac0c848b458..b8038f852f06 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -38,7 +38,7 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 		next = pmd_addr_end(addr, end);
 		if (pmd_none(*pmd)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, 3, walk);
 			if (err)
 				break;
 			continue;
@@ -88,7 +88,7 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 		next = pud_addr_end(addr, end);
 		if (pud_none(*pud)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, 2, walk);
 			if (err)
 				break;
 			continue;
@@ -123,13 +123,17 @@ static int walk_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
 	p4d_t *p4d;
 	unsigned long next;
 	int err = 0;
+	/* If the p4ds are actually just pgds then we should report a depth
+	 * of 0 not 1 (as a missing entry is really a missing pgd).
+	 */
+	int depth = (PTRS_PER_P4D == 1) ? 0 : 1;
 
 	p4d = p4d_offset(pgd, addr);
 	do {
 		next = p4d_addr_end(addr, end);
 		if (p4d_none_or_clear_bad(p4d)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, depth, walk);
 			if (err)
 				break;
 			continue;
@@ -160,7 +164,7 @@ static int walk_pgd_range(unsigned long addr, unsigned long end,
 		next = pgd_addr_end(addr, end);
 		if (pgd_none_or_clear_bad(pgd)) {
 			if (walk->pte_hole)
-				err = walk->pte_hole(addr, next, walk);
+				err = walk->pte_hole(addr, next, 0, walk);
 			if (err)
 				break;
 			continue;
@@ -206,7 +210,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
 		if (pte)
 			err = walk->hugetlb_entry(pte, hmask, addr, next, walk);
 		else if (walk->pte_hole)
-			err = walk->pte_hole(addr, next, walk);
+			err = walk->pte_hole(addr, next, -1, walk);
 
 		if (err)
 			break;
@@ -249,7 +253,7 @@ static int walk_page_test(unsigned long start, unsigned long end,
 	if (vma->vm_flags & VM_PFNMAP) {
 		int err = 1;
 		if (walk->pte_hole)
-			err = walk->pte_hole(start, end, walk);
+			err = walk->pte_hole(start, end, -1, walk);
 		return err ? err : 1;
 	}
 	return 0;
-- 
2.20.1


* [PATCH 07/13] mm: pagewalk: Add test_p?d callbacks
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (5 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 08/13] arm64: mm: Convert mm/dump.c to use walk_page_range() Steven Price
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

It is useful to be able to skip parts of the page table tree even when
walking without VMAs. Add test_p?d callbacks, similar to test_walk(),
which are called just before a table at that level is walked. If the
callback returns a positive value the entire table is skipped (a
negative value aborts the walk, as with the other callbacks).
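
For example, a walker could use this to skip the shared KASAN zero
shadow tables, in the style of the x86 optimisation in patch 13
(sketch only):

  static int my_test_pmd(unsigned long addr, unsigned long next,
                         pmd_t *pmd_start, struct mm_walk *walk)
  {
          /* the whole table is the shared zero shadow: skip it */
          if (__pa(pmd_start) == __pa(kasan_early_shadow_pmd))
                  return 1;
          return 0;       /* walk this table normally */
  }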

Signed-off-by: Steven Price <steven.price@arm.com>
---
 include/linux/mm.h | 11 +++++++++++
 mm/pagewalk.c      | 24 ++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0418a018d7b3..0e4c9b6b554d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1428,6 +1428,11 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
  *             value means "do page table walk over the current vma,"
  *             and a negative one means "abort current page table walk
  *             right now." 1 means "skip the current vma."
+ * @test_pmd:  similar to test_walk(), but called for every pmd.
+ * @test_pud:  similar to test_walk(), but called for every pud.
+ * @test_p4d:  similar to test_walk(), but called for every p4d.
+ *             Returning 0 means walk this part of the page tables,
+ *             returning 1 means to skip this range.
  * @mm:        mm_struct representing the target process of page table walk
  * @vma:       vma currently walked (NULL if walking outside vmas)
  * @private:   private data for callbacks' usage
@@ -1452,6 +1457,12 @@ struct mm_walk {
 			     struct mm_walk *walk);
 	int (*test_walk)(unsigned long addr, unsigned long next,
 			struct mm_walk *walk);
+	int (*test_pmd)(unsigned long addr, unsigned long next,
+			pmd_t *pmd_start, struct mm_walk *walk);
+	int (*test_pud)(unsigned long addr, unsigned long next,
+			pud_t *pud_start, struct mm_walk *walk);
+	int (*test_p4d)(unsigned long addr, unsigned long next,
+			p4d_t *p4d_start, struct mm_walk *walk);
 	struct mm_struct *mm;
 	struct vm_area_struct *vma;
 	void *private;
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index b8038f852f06..bce7571c6563 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -32,6 +32,14 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 	unsigned long next;
 	int err = 0;
 
+	if (walk->test_pmd) {
+		err = walk->test_pmd(addr, end, pmd_offset(pud, 0), walk);
+		if (err < 0)
+			return err;
+		if (err > 0)
+			return 0;
+	}
+
 	pmd = pmd_offset(pud, addr);
 	do {
 again:
@@ -82,6 +90,14 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 	unsigned long next;
 	int err = 0;
 
+	if (walk->test_pud) {
+		err = walk->test_pud(addr, end, pud_offset(p4d, 0), walk);
+		if (err < 0)
+			return err;
+		if (err > 0)
+			return 0;
+	}
+
 	pud = pud_offset(p4d, addr);
 	do {
  again:
@@ -128,6 +144,14 @@ static int walk_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
 	 */
 	int depth = (PTRS_PER_P4D == 1) ? 0 : 1;
 
+	if (walk->test_p4d) {
+		err = walk->test_p4d(addr, end, p4d_offset(pgd, 0), walk);
+		if (err < 0)
+			return err;
+		if (err > 0)
+			return 0;
+	}
+
 	p4d = p4d_offset(pgd, addr);
 	do {
 		next = p4d_addr_end(addr, end);
-- 
2.20.1


* [PATCH 08/13] arm64: mm: Convert mm/dump.c to use walk_page_range()
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (6 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 07/13] mm: pagewalk: Add test_p?d callbacks Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 09/13] x86/mm: Point to struct seq_file from struct pg_state Steven Price
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

Now that walk_page_range() can walk kernel page tables, we can switch
the arm64 ptdump code over to using it, simplifying the code.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/mm/dump.c | 108 +++++++++++++++++++++----------------------
 1 file changed, 53 insertions(+), 55 deletions(-)

diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 99bb8facb5cb..ee0bc1441dd0 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -286,73 +286,71 @@ static void note_page(struct pg_state *st, unsigned long addr, unsigned level,
 
 }
 
-static void walk_pte(struct pg_state *st, pmd_t *pmdp, unsigned long start,
-		     unsigned long end)
+static int pud_entry(pud_t *pud, unsigned long addr,
+		unsigned long next, struct mm_walk *walk)
 {
-	unsigned long addr = start;
-	pte_t *ptep = pte_offset_kernel(pmdp, start);
+	struct pg_state *st = walk->private;
+	pud_t val = READ_ONCE(*pud);
 
-	do {
-		note_page(st, addr, 4, READ_ONCE(pte_val(*ptep)));
-	} while (ptep++, addr += PAGE_SIZE, addr != end);
+	if (pud_table(val))
+		return 0;
+
+	note_page(st, addr, 2, pud_val(val));
+
+	return 0;
 }
 
-static void walk_pmd(struct pg_state *st, pud_t *pudp, unsigned long start,
-		     unsigned long end)
+static int pmd_entry(pmd_t *pmd, unsigned long addr,
+		unsigned long next, struct mm_walk *walk)
 {
-	unsigned long next, addr = start;
-	pmd_t *pmdp = pmd_offset(pudp, start);
-
-	do {
-		pmd_t pmd = READ_ONCE(*pmdp);
-		next = pmd_addr_end(addr, end);
-
-		if (pmd_none(pmd) || pmd_sect(pmd)) {
-			note_page(st, addr, 3, pmd_val(pmd));
-		} else {
-			BUG_ON(pmd_bad(pmd));
-			walk_pte(st, pmdp, addr, next);
-		}
-	} while (pmdp++, addr = next, addr != end);
+	struct pg_state *st = walk->private;
+	pmd_t val = READ_ONCE(*pmd);
+
+	if (pmd_table(val))
+		return 0;
+
+	note_page(st, addr, 3, pmd_val(val));
+
+	return 0;
 }
 
-static void walk_pud(struct pg_state *st, pgd_t *pgdp, unsigned long start,
-		     unsigned long end)
+static int pte_entry(pte_t *pte, unsigned long addr,
+		unsigned long next, struct mm_walk *walk)
 {
-	unsigned long next, addr = start;
-	pud_t *pudp = pud_offset(pgdp, start);
-
-	do {
-		pud_t pud = READ_ONCE(*pudp);
-		next = pud_addr_end(addr, end);
-
-		if (pud_none(pud) || pud_sect(pud)) {
-			note_page(st, addr, 2, pud_val(pud));
-		} else {
-			BUG_ON(pud_bad(pud));
-			walk_pmd(st, pudp, addr, next);
-		}
-	} while (pudp++, addr = next, addr != end);
+	struct pg_state *st = walk->private;
+	pte_t val = READ_ONCE(*pte);
+
+	note_page(st, addr, 4, pte_val(val));
+
+	return 0;
+}
+
+static int pte_hole(unsigned long addr, unsigned long next, int depth,
+		struct mm_walk *walk)
+{
+	struct pg_state *st = walk->private;
+
+	note_page(st, addr, depth + 1, 0);
+
+	return 0;
 }
 
 static void walk_pgd(struct pg_state *st, struct mm_struct *mm,
-		     unsigned long start)
+		unsigned long start)
 {
-	unsigned long end = (start < TASK_SIZE_64) ? TASK_SIZE_64 : 0;
-	unsigned long next, addr = start;
-	pgd_t *pgdp = pgd_offset(mm, start);
-
-	do {
-		pgd_t pgd = READ_ONCE(*pgdp);
-		next = pgd_addr_end(addr, end);
-
-		if (pgd_none(pgd)) {
-			note_page(st, addr, 1, pgd_val(pgd));
-		} else {
-			BUG_ON(pgd_bad(pgd));
-			walk_pud(st, pgdp, addr, next);
-		}
-	} while (pgdp++, addr = next, addr != end);
+	struct mm_walk walk = {
+		.mm = mm,
+		.private = st,
+		.pud_entry = pud_entry,
+		.pmd_entry = pmd_entry,
+		.pte_entry = pte_entry,
+		.pte_hole = pte_hole
+	};
+	down_read(&mm->mmap_sem);
+	walk_page_range(start, start | (((unsigned long)PTRS_PER_PGD <<
+					 PGDIR_SHIFT) - 1),
+			&walk);
+	up_read(&mm->mmap_sem);
 }
 
 void ptdump_walk_pgd(struct seq_file *m, struct ptdump_info *info)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 09/13] x86/mm: Point to struct seq_file from struct pg_state
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (7 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 08/13] arm64: mm: Convert mm/dump.c to use walk_page_range() Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 10/13] x86/mm+efi: Convert ptdump_walk_pgd_level() to take a mm_struct Steven Price
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

arch/x86/mm/dump_pagetables.c passes both struct seq_file and struct
pg_state down the chain of walk_*_level() functions to be passed to
note_page(). Instead, store the struct seq_file pointer in struct
pg_state (which is private to this file) and access it from there in
note_page().

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/x86/mm/dump_pagetables.c | 69 ++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 695647dc9cb9..ad6986c9e8e7 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -40,6 +40,7 @@ struct pg_state {
 	bool to_dmesg;
 	bool check_wx;
 	unsigned long wx_pages;
+	struct seq_file *seq;
 };
 
 struct addr_marker {
@@ -268,11 +269,12 @@ static void note_wx(struct pg_state *st)
  * of PTE entries; the next one is different so we need to
  * print what we collected so far.
  */
-static void note_page(struct seq_file *m, struct pg_state *st,
-		      pgprot_t new_prot, pgprotval_t new_eff, int level)
+static void note_page(struct pg_state *st, pgprot_t new_prot,
+		      pgprotval_t new_eff, int level)
 {
 	pgprotval_t prot, cur, eff;
 	static const char units[] = "BKMGTPE";
+	struct seq_file *m = st->seq;
 
 	/*
 	 * If we have a "break" in the series, we need to flush the state that
@@ -357,8 +359,8 @@ static inline pgprotval_t effective_prot(pgprotval_t prot1, pgprotval_t prot2)
 	       ((prot1 | prot2) & _PAGE_NX);
 }
 
-static void walk_pte_level(struct seq_file *m, struct pg_state *st, pmd_t addr,
-			   pgprotval_t eff_in, unsigned long P)
+static void walk_pte_level(struct pg_state *st, pmd_t addr, pgprotval_t eff_in,
+			   unsigned long P)
 {
 	int i;
 	pte_t *pte;
@@ -369,7 +371,7 @@ static void walk_pte_level(struct seq_file *m, struct pg_state *st, pmd_t addr,
 		pte = pte_offset_map(&addr, st->current_address);
 		prot = pte_flags(*pte);
 		eff = effective_prot(eff_in, prot);
-		note_page(m, st, __pgprot(prot), eff, 5);
+		note_page(st, __pgprot(prot), eff, 5);
 		pte_unmap(pte);
 	}
 }
@@ -382,22 +384,20 @@ static void walk_pte_level(struct seq_file *m, struct pg_state *st, pmd_t addr,
  * us dozens of seconds (minutes for 5-level config) while checking for
  * W+X mapping or reading kernel_page_tables debugfs file.
  */
-static inline bool kasan_page_table(struct seq_file *m, struct pg_state *st,
-				void *pt)
+static inline bool kasan_page_table(struct pg_state *st, void *pt)
 {
 	if (__pa(pt) == __pa(kasan_early_shadow_pmd) ||
 	    (pgtable_l5_enabled() &&
 			__pa(pt) == __pa(kasan_early_shadow_p4d)) ||
 	    __pa(pt) == __pa(kasan_early_shadow_pud)) {
 		pgprotval_t prot = pte_flags(kasan_early_shadow_pte[0]);
-		note_page(m, st, __pgprot(prot), 0, 5);
+		note_page(st, __pgprot(prot), 0, 5);
 		return true;
 	}
 	return false;
 }
 #else
-static inline bool kasan_page_table(struct seq_file *m, struct pg_state *st,
-				void *pt)
+static inline bool kasan_page_table(struct pg_state *st, void *pt)
 {
 	return false;
 }
@@ -405,7 +405,7 @@ static inline bool kasan_page_table(struct seq_file *m, struct pg_state *st,
 
 #if PTRS_PER_PMD > 1
 
-static void walk_pmd_level(struct seq_file *m, struct pg_state *st, pud_t addr,
+static void walk_pmd_level(struct pg_state *st, pud_t addr,
 			   pgprotval_t eff_in, unsigned long P)
 {
 	int i;
@@ -419,19 +419,19 @@ static void walk_pmd_level(struct seq_file *m, struct pg_state *st, pud_t addr,
 			prot = pmd_flags(*start);
 			eff = effective_prot(eff_in, prot);
 			if (pmd_large(*start) || !pmd_present(*start)) {
-				note_page(m, st, __pgprot(prot), eff, 4);
-			} else if (!kasan_page_table(m, st, pmd_start)) {
-				walk_pte_level(m, st, *start, eff,
+				note_page(st, __pgprot(prot), eff, 4);
+			} else if (!kasan_page_table(st, pmd_start)) {
+				walk_pte_level(st, *start, eff,
 					       P + i * PMD_LEVEL_MULT);
 			}
 		} else
-			note_page(m, st, __pgprot(0), 0, 4);
+			note_page(st, __pgprot(0), 0, 4);
 		start++;
 	}
 }
 
 #else
-#define walk_pmd_level(m,s,a,e,p) walk_pte_level(m,s,__pmd(pud_val(a)),e,p)
+#define walk_pmd_level(s,a,e,p) walk_pte_level(s,__pmd(pud_val(a)),e,p)
 #undef pud_large
 #define pud_large(a) pmd_large(__pmd(pud_val(a)))
 #define pud_none(a)  pmd_none(__pmd(pud_val(a)))
@@ -439,8 +439,8 @@ static void walk_pmd_level(struct seq_file *m, struct pg_state *st, pud_t addr,
 
 #if PTRS_PER_PUD > 1
 
-static void walk_pud_level(struct seq_file *m, struct pg_state *st, p4d_t addr,
-			   pgprotval_t eff_in, unsigned long P)
+static void walk_pud_level(struct pg_state *st, p4d_t addr, pgprotval_t eff_in,
+			   unsigned long P)
 {
 	int i;
 	pud_t *start, *pud_start;
@@ -455,13 +455,13 @@ static void walk_pud_level(struct seq_file *m, struct pg_state *st, p4d_t addr,
 			prot = pud_flags(*start);
 			eff = effective_prot(eff_in, prot);
 			if (pud_large(*start) || !pud_present(*start)) {
-				note_page(m, st, __pgprot(prot), eff, 3);
-			} else if (!kasan_page_table(m, st, pud_start)) {
-				walk_pmd_level(m, st, *start, eff,
+				note_page(st, __pgprot(prot), eff, 3);
+			} else if (!kasan_page_table(st, pud_start)) {
+				walk_pmd_level(st, *start, eff,
 					       P + i * PUD_LEVEL_MULT);
 			}
 		} else
-			note_page(m, st, __pgprot(0), 0, 3);
+			note_page(st, __pgprot(0), 0, 3);
 
 		prev_pud = start;
 		start++;
@@ -469,20 +469,20 @@ static void walk_pud_level(struct seq_file *m, struct pg_state *st, p4d_t addr,
 }
 
 #else
-#define walk_pud_level(m,s,a,e,p) walk_pmd_level(m,s,__pud(p4d_val(a)),e,p)
+#define walk_pud_level(s,a,e,p) walk_pmd_level(s,__pud(p4d_val(a)),e,p)
 #define p4d_large(a) pud_large(__pud(p4d_val(a)))
 #define p4d_none(a)  pud_none(__pud(p4d_val(a)))
 #endif
 
-static void walk_p4d_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
-			   pgprotval_t eff_in, unsigned long P)
+static void walk_p4d_level(struct pg_state *st, pgd_t addr, pgprotval_t eff_in,
+			   unsigned long P)
 {
 	int i;
 	p4d_t *start, *p4d_start;
 	pgprotval_t prot, eff;
 
 	if (PTRS_PER_P4D == 1)
-		return walk_pud_level(m, st, __p4d(pgd_val(addr)), eff_in, P);
+		return walk_pud_level(st, __p4d(pgd_val(addr)), eff_in, P);
 
 	p4d_start = start = (p4d_t *)pgd_page_vaddr(addr);
 
@@ -492,13 +492,13 @@ static void walk_p4d_level(struct seq_file *m, struct pg_state *st, pgd_t addr,
 			prot = p4d_flags(*start);
 			eff = effective_prot(eff_in, prot);
 			if (p4d_large(*start) || !p4d_present(*start)) {
-				note_page(m, st, __pgprot(prot), eff, 2);
-			} else if (!kasan_page_table(m, st, p4d_start)) {
-				walk_pud_level(m, st, *start, eff,
+				note_page(st, __pgprot(prot), eff, 2);
+			} else if (!kasan_page_table(st, p4d_start)) {
+				walk_pud_level(st, *start, eff,
 					       P + i * P4D_LEVEL_MULT);
 			}
 		} else
-			note_page(m, st, __pgprot(0), 0, 2);
+			note_page(st, __pgprot(0), 0, 2);
 
 		start++;
 	}
@@ -536,6 +536,7 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 	}
 
 	st.check_wx = checkwx;
+	st.seq = m;
 	if (checkwx)
 		st.wx_pages = 0;
 
@@ -549,13 +550,13 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 			eff = prot;
 #endif
 			if (pgd_large(*start) || !pgd_present(*start)) {
-				note_page(m, &st, __pgprot(prot), eff, 1);
+				note_page(&st, __pgprot(prot), eff, 1);
 			} else {
-				walk_p4d_level(m, &st, *start, eff,
+				walk_p4d_level(&st, *start, eff,
 					       i * PGD_LEVEL_MULT);
 			}
 		} else
-			note_page(m, &st, __pgprot(0), 0, 1);
+			note_page(&st, __pgprot(0), 0, 1);
 
 		cond_resched();
 		start++;
@@ -563,7 +564,7 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 
 	/* Flush out the last page */
 	st.current_address = normalize_addr(PTRS_PER_PGD*PGD_LEVEL_MULT);
-	note_page(m, &st, __pgprot(0), 0, 0);
+	note_page(&st, __pgprot(0), 0, 0);
 	if (!checkwx)
 		return;
 	if (st.wx_pages)
-- 
2.20.1


* [PATCH 10/13] x86/mm+efi: Convert ptdump_walk_pgd_level() to take a mm_struct
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (8 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 09/13] x86/mm: Point to struct seq_file from struct pg_state Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 11/13] x86/mm: Convert ptdump_walk_pgd_level_debugfs() to take an mm_struct Steven Price
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

To enable x86 to use the generic walk_page_range() function, the
callers of ptdump_walk_pgd_level() need to pass an mm_struct rather
than the raw pgd_t pointer. Luckily, since commit 7e904a91bf60
("efi: Use efi_mm in x86 as well as ARM"), we now have an mm_struct
for EFI on x86.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/x86/include/asm/pgtable.h | 2 +-
 arch/x86/mm/dump_pagetables.c  | 4 ++--
 arch/x86/platform/efi/efi_32.c | 2 +-
 arch/x86/platform/efi/efi_64.c | 4 ++--
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 3695f6acb6af..371901283d5f 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -27,7 +27,7 @@
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
 
-void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd);
+void ptdump_walk_pgd_level(struct seq_file *m, struct mm_struct *mm);
 void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool user);
 void ptdump_walk_pgd_level_checkwx(void);
 void ptdump_walk_user_pgd_level_checkwx(void);
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index ad6986c9e8e7..1a4a03e3a6bd 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -574,9 +574,9 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 		pr_info("x86/mm: Checked W+X mappings: passed, no W+X pages found.\n");
 }
 
-void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
+void ptdump_walk_pgd_level(struct seq_file *m, struct mm_struct *mm)
 {
-	ptdump_walk_pgd_level_core(m, pgd, false, true);
+	ptdump_walk_pgd_level_core(m, mm->pgd, false, true);
 }
 
 void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool user)
diff --git a/arch/x86/platform/efi/efi_32.c b/arch/x86/platform/efi/efi_32.c
index 9959657127f4..9175ceaa6e72 100644
--- a/arch/x86/platform/efi/efi_32.c
+++ b/arch/x86/platform/efi/efi_32.c
@@ -49,7 +49,7 @@ void efi_sync_low_kernel_mappings(void) {}
 void __init efi_dump_pagetable(void)
 {
 #ifdef CONFIG_EFI_PGT_DUMP
-	ptdump_walk_pgd_level(NULL, swapper_pg_dir);
+	ptdump_walk_pgd_level(NULL, &init_mm);
 #endif
 }
 
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index cf0347f61b21..a2e0f9800190 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -611,9 +611,9 @@ void __init efi_dump_pagetable(void)
 {
 #ifdef CONFIG_EFI_PGT_DUMP
 	if (efi_enabled(EFI_OLD_MEMMAP))
-		ptdump_walk_pgd_level(NULL, swapper_pg_dir);
+		ptdump_walk_pgd_level(NULL, &init_mm);
 	else
-		ptdump_walk_pgd_level(NULL, efi_mm.pgd);
+		ptdump_walk_pgd_level(NULL, &efi_mm);
 #endif
 }
 
-- 
2.20.1


* [PATCH 11/13] x86/mm: Convert ptdump_walk_pgd_level_debugfs() to take an mm_struct
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (9 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 10/13] x86/mm+efi: Convert ptdump_walk_pgd_level() to take a mm_struct Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 12/13] x86/mm: Convert ptdump_walk_pgd_level_core() " Steven Price
  2019-02-15 17:02 ` [PATCH 13/13] x86: mm: Convert dump_pagetables to use walk_page_range Steven Price
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

To enable x86 to use the generic walk_page_range() function, the
callers of ptdump_walk_pgd_level_debugfs() need to pass in the mm_struct.

This means that ptdump_walk_pgd_level_core() is now always passed a
valid pgd, so drop the support for pgd==NULL.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/x86/include/asm/pgtable.h |  3 ++-
 arch/x86/mm/debug_pagetables.c |  8 ++++----
 arch/x86/mm/dump_pagetables.c  | 14 ++++++--------
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 371901283d5f..ab2aa3eb05e9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -28,7 +28,8 @@ extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
 
 void ptdump_walk_pgd_level(struct seq_file *m, struct mm_struct *mm);
-void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool user);
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, struct mm_struct *mm,
+				   bool user);
 void ptdump_walk_pgd_level_checkwx(void);
 void ptdump_walk_user_pgd_level_checkwx(void);
 
diff --git a/arch/x86/mm/debug_pagetables.c b/arch/x86/mm/debug_pagetables.c
index cd84f067e41d..824131052574 100644
--- a/arch/x86/mm/debug_pagetables.c
+++ b/arch/x86/mm/debug_pagetables.c
@@ -6,7 +6,7 @@
 
 static int ptdump_show(struct seq_file *m, void *v)
 {
-	ptdump_walk_pgd_level_debugfs(m, NULL, false);
+	ptdump_walk_pgd_level_debugfs(m, &init_mm, false);
 	return 0;
 }
 
@@ -16,7 +16,7 @@ static int ptdump_curknl_show(struct seq_file *m, void *v)
 {
 	if (current->mm->pgd) {
 		down_read(&current->mm->mmap_sem);
-		ptdump_walk_pgd_level_debugfs(m, current->mm->pgd, false);
+		ptdump_walk_pgd_level_debugfs(m, current->mm, false);
 		up_read(&current->mm->mmap_sem);
 	}
 	return 0;
@@ -31,7 +31,7 @@ static int ptdump_curusr_show(struct seq_file *m, void *v)
 {
 	if (current->mm->pgd) {
 		down_read(&current->mm->mmap_sem);
-		ptdump_walk_pgd_level_debugfs(m, current->mm->pgd, true);
+		ptdump_walk_pgd_level_debugfs(m, current->mm, true);
 		up_read(&current->mm->mmap_sem);
 	}
 	return 0;
@@ -46,7 +46,7 @@ static struct dentry *pe_efi;
 static int ptdump_efi_show(struct seq_file *m, void *v)
 {
 	if (efi_mm.pgd)
-		ptdump_walk_pgd_level_debugfs(m, efi_mm.pgd, false);
+		ptdump_walk_pgd_level_debugfs(m, &efi_mm, false);
 	return 0;
 }
 
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 1a4a03e3a6bd..18fb6193311f 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -525,16 +525,12 @@ static inline bool is_hypervisor_range(int idx)
 static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 				       bool checkwx, bool dmesg)
 {
-	pgd_t *start = INIT_PGD;
+	pgd_t *start = pgd;
 	pgprotval_t prot, eff;
 	int i;
 	struct pg_state st = {};
 
-	if (pgd) {
-		start = pgd;
-		st.to_dmesg = dmesg;
-	}
-
+	st.to_dmesg = dmesg;
 	st.check_wx = checkwx;
 	st.seq = m;
 	if (checkwx)
@@ -579,8 +575,10 @@ void ptdump_walk_pgd_level(struct seq_file *m, struct mm_struct *mm)
 	ptdump_walk_pgd_level_core(m, mm->pgd, false, true);
 }
 
-void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool user)
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, struct mm_struct *mm,
+				   bool user)
 {
+	pgd_t *pgd = mm->pgd;
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
 	if (user && static_cpu_has(X86_FEATURE_PTI))
 		pgd = kernel_to_user_pgdp(pgd);
@@ -606,7 +604,7 @@ void ptdump_walk_user_pgd_level_checkwx(void)
 
 void ptdump_walk_pgd_level_checkwx(void)
 {
-	ptdump_walk_pgd_level_core(NULL, NULL, true, false);
+	ptdump_walk_pgd_level_core(NULL, INIT_PGD, true, false);
 }
 
 static int __init pt_dump_init(void)
-- 
2.20.1


* [PATCH 12/13] x86/mm: Convert ptdump_walk_pgd_level_core() to take an mm_struct
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (10 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 11/13] x86/mm: Convert ptdump_walk_pgd_level_debugfs() to take an mm_struct Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:02 ` [PATCH 13/13] x86: mm: Convert dump_pagetables to use walk_page_range Steven Price
  12 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

An mm_struct is needed to enable x86 to make use of the generic
walk_page_range() function.

In the case of walking the user page tables (when
CONFIG_PAGE_TABLE_ISOLATION is enabled), it is necessary to create a
fake_mm structure because there isn't an mm_struct with a pointer
to the pgd of the user page tables. This fake_mm structure is
initialised with the minimum necessary for the generic page walk code.
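(The generic walker only needs a valid ->pgd and ->mmap_sem; the
remaining fields can stay zero-initialised, which is why an on-stack
fake is sufficient.)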

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/x86/mm/dump_pagetables.c | 36 ++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 18fb6193311f..ddb22e7f81b3 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -111,8 +111,6 @@ static struct addr_marker address_markers[] = {
 	[END_OF_SPACE_NR]	= { -1,			NULL }
 };
 
-#define INIT_PGD	((pgd_t *) &init_top_pgt)
-
 #else /* CONFIG_X86_64 */
 
 enum address_markers_idx {
@@ -147,8 +145,6 @@ static struct addr_marker address_markers[] = {
 	[END_OF_SPACE_NR]	= { -1,			NULL }
 };
 
-#define INIT_PGD	(swapper_pg_dir)
-
 #endif /* !CONFIG_X86_64 */
 
 /* Multipliers for offsets within the PTEs */
@@ -522,10 +518,10 @@ static inline bool is_hypervisor_range(int idx)
 #endif
 }
 
-static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
+static void ptdump_walk_pgd_level_core(struct seq_file *m, struct mm_struct *mm,
 				       bool checkwx, bool dmesg)
 {
-	pgd_t *start = pgd;
+	pgd_t *start = mm->pgd;
 	pgprotval_t prot, eff;
 	int i;
 	struct pg_state st = {};
@@ -572,39 +568,49 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
 
 void ptdump_walk_pgd_level(struct seq_file *m, struct mm_struct *mm)
 {
-	ptdump_walk_pgd_level_core(m, mm->pgd, false, true);
+	ptdump_walk_pgd_level_core(m, mm, false, true);
 }
 
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+static void ptdump_walk_pgd_level_user_core(struct seq_file *m,
+					    struct mm_struct *mm,
+					    bool checkwx, bool dmesg)
+{
+	struct mm_struct fake_mm = {
+		.pgd = kernel_to_user_pgdp(mm->pgd)
+	};
+	init_rwsem(&fake_mm.mmap_sem);
+	ptdump_walk_pgd_level_core(m, &fake_mm, checkwx, dmesg);
+}
+#endif
+
 void ptdump_walk_pgd_level_debugfs(struct seq_file *m, struct mm_struct *mm,
 				   bool user)
 {
-	pgd_t *pgd = mm->pgd;
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
 	if (user && static_cpu_has(X86_FEATURE_PTI))
-		pgd = kernel_to_user_pgdp(pgd);
+		ptdump_walk_pgd_level_user_core(m, mm, false, false);
+	else
 #endif
-	ptdump_walk_pgd_level_core(m, pgd, false, false);
+		ptdump_walk_pgd_level_core(m, mm, false, false);
 }
 EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs);
 
 void ptdump_walk_user_pgd_level_checkwx(void)
 {
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
-	pgd_t *pgd = INIT_PGD;
-
 	if (!(__supported_pte_mask & _PAGE_NX) ||
 	    !static_cpu_has(X86_FEATURE_PTI))
 		return;
 
 	pr_info("x86/mm: Checking user space page tables\n");
-	pgd = kernel_to_user_pgdp(pgd);
-	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
+	ptdump_walk_pgd_level_user_core(NULL, &init_mm, true, false);
 #endif
 }
 
 void ptdump_walk_pgd_level_checkwx(void)
 {
-	ptdump_walk_pgd_level_core(NULL, INIT_PGD, true, false);
+	ptdump_walk_pgd_level_core(NULL, &init_mm, true, false);
 }
 
 static int __init pt_dump_init(void)
-- 
2.20.1


* [PATCH 13/13] x86: mm: Convert dump_pagetables to use walk_page_range
  2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
                   ` (11 preceding siblings ...)
  2019-02-15 17:02 ` [PATCH 12/13] x86/mm: Convert ptdump_walk_pgd_level_core() " Steven Price
@ 2019-02-15 17:02 ` Steven Price
  2019-02-15 17:16   ` Dave Hansen
  12 siblings, 1 reply; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:02 UTC (permalink / raw)
  To: linux-mm
  Cc: Steven Price, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

Make use of the new functionality in walk_page_range to remove the
arch page walking code and use the generic code to walk the page tables.

The effective permissions are passed down the chain using new fields
in struct pg_state.

The KASAN optimisation is implemented by including test_p?d callbacks
which can decide to skip an entire subtree of entries.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/x86/mm/dump_pagetables.c | 281 ++++++++++++++++++----------------
 1 file changed, 146 insertions(+), 135 deletions(-)

diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index ddb22e7f81b3..64d1619493a4 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -33,6 +33,10 @@ struct pg_state {
 	int level;
 	pgprot_t current_prot;
 	pgprotval_t effective_prot;
+	pgprotval_t effective_prot_pgd;
+	pgprotval_t effective_prot_p4d;
+	pgprotval_t effective_prot_pud;
+	pgprotval_t effective_prot_pmd;
 	unsigned long start_address;
 	unsigned long current_address;
 	const struct addr_marker *marker;
@@ -355,22 +359,21 @@ static inline pgprotval_t effective_prot(pgprotval_t prot1, pgprotval_t prot2)
 	       ((prot1 | prot2) & _PAGE_NX);
 }
 
-static void walk_pte_level(struct pg_state *st, pmd_t addr, pgprotval_t eff_in,
-			   unsigned long P)
+static int ptdump_pte_entry(pte_t *pte, unsigned long addr,
+			    unsigned long next, struct mm_walk *walk)
 {
-	int i;
-	pte_t *pte;
-	pgprotval_t prot, eff;
-
-	for (i = 0; i < PTRS_PER_PTE; i++) {
-		st->current_address = normalize_addr(P + i * PTE_LEVEL_MULT);
-		pte = pte_offset_map(&addr, st->current_address);
-		prot = pte_flags(*pte);
-		eff = effective_prot(eff_in, prot);
-		note_page(st, __pgprot(prot), eff, 5);
-		pte_unmap(pte);
-	}
+	struct pg_state *st = walk->private;
+	pgprotval_t eff, prot;
+
+	st->current_address = normalize_addr(addr);
+
+	prot = pte_flags(*pte);
+	eff = effective_prot(st->effective_prot_pmd, prot);
+	note_page(st, __pgprot(prot), eff, 5);
+
+	return 0;
 }
+
 #ifdef CONFIG_KASAN
 
 /*
@@ -399,132 +402,152 @@ static inline bool kasan_page_table(struct pg_state *st, void *pt)
 }
 #endif
 
-#if PTRS_PER_PMD > 1
-
-static void walk_pmd_level(struct pg_state *st, pud_t addr,
-			   pgprotval_t eff_in, unsigned long P)
+static int ptdump_test_pmd(unsigned long addr, unsigned long next,
+			   pmd_t *pmd, struct mm_walk *walk)
 {
-	int i;
-	pmd_t *start, *pmd_start;
-	pgprotval_t prot, eff;
-
-	pmd_start = start = (pmd_t *)pud_page_vaddr(addr);
-	for (i = 0; i < PTRS_PER_PMD; i++) {
-		st->current_address = normalize_addr(P + i * PMD_LEVEL_MULT);
-		if (!pmd_none(*start)) {
-			prot = pmd_flags(*start);
-			eff = effective_prot(eff_in, prot);
-			if (pmd_large(*start) || !pmd_present(*start)) {
-				note_page(st, __pgprot(prot), eff, 4);
-			} else if (!kasan_page_table(st, pmd_start)) {
-				walk_pte_level(st, *start, eff,
-					       P + i * PMD_LEVEL_MULT);
-			}
-		} else
-			note_page(st, __pgprot(0), 0, 4);
-		start++;
-	}
+	struct pg_state *st = walk->private;
+
+	st->current_address = normalize_addr(addr);
+
+	if (kasan_page_table(st, pmd))
+		return 1;
+	return 0;
 }
 
-#else
-#define walk_pmd_level(s,a,e,p) walk_pte_level(s,__pmd(pud_val(a)),e,p)
-#undef pud_large
-#define pud_large(a) pmd_large(__pmd(pud_val(a)))
-#define pud_none(a)  pmd_none(__pmd(pud_val(a)))
-#endif
+static int ptdump_pmd_entry(pmd_t *pmd, unsigned long addr,
+			    unsigned long next, struct mm_walk *walk)
+{
+	struct pg_state *st = walk->private;
+	pgprotval_t eff, prot;
+
+	prot = pmd_flags(*pmd);
+	eff = effective_prot(st->effective_prot_pud, prot);
+
+	st->current_address = normalize_addr(addr);
+
+	if (pmd_large(*pmd))
+		note_page(st, __pgprot(prot), eff, 4);
 
-#if PTRS_PER_PUD > 1
+	st->effective_prot_pmd = eff;
 
-static void walk_pud_level(struct pg_state *st, p4d_t addr, pgprotval_t eff_in,
-			   unsigned long P)
+	return 0;
+}
+
+static int ptdump_test_pud(unsigned long addr, unsigned long next,
+			   pud_t *pud, struct mm_walk *walk)
 {
-	int i;
-	pud_t *start, *pud_start;
-	pgprotval_t prot, eff;
-	pud_t *prev_pud = NULL;
-
-	pud_start = start = (pud_t *)p4d_page_vaddr(addr);
-
-	for (i = 0; i < PTRS_PER_PUD; i++) {
-		st->current_address = normalize_addr(P + i * PUD_LEVEL_MULT);
-		if (!pud_none(*start)) {
-			prot = pud_flags(*start);
-			eff = effective_prot(eff_in, prot);
-			if (pud_large(*start) || !pud_present(*start)) {
-				note_page(st, __pgprot(prot), eff, 3);
-			} else if (!kasan_page_table(st, pud_start)) {
-				walk_pmd_level(st, *start, eff,
-					       P + i * PUD_LEVEL_MULT);
-			}
-		} else
-			note_page(st, __pgprot(0), 0, 3);
+	struct pg_state *st = walk->private;
 
-		prev_pud = start;
-		start++;
-	}
+	st->current_address = normalize_addr(addr);
+
+	if (kasan_page_table(st, pud))
+		return 1;
+	return 0;
 }
 
-#else
-#define walk_pud_level(s,a,e,p) walk_pmd_level(s,__pud(p4d_val(a)),e,p)
-#define p4d_large(a) pud_large(__pud(p4d_val(a)))
-#define p4d_none(a)  pud_none(__pud(p4d_val(a)))
-#endif
+static int ptdump_pud_entry(pud_t *pud, unsigned long addr,
+			    unsigned long next, struct mm_walk *walk)
+{
+	struct pg_state *st = walk->private;
+	pgprotval_t eff, prot;
+
+	prot = pud_flags(*pud);
+	eff = effective_prot(st->effective_prot_p4d, prot);
+
+	st->current_address = normalize_addr(addr);
+
+	if (pud_large(*pud))
+		note_page(st, __pgprot(prot), eff, 3);
+
+	st->effective_prot_pud = eff;
 
-static void walk_p4d_level(struct pg_state *st, pgd_t addr, pgprotval_t eff_in,
-			   unsigned long P)
+	return 0;
+}
+
+static int ptdump_test_p4d(unsigned long addr, unsigned long next,
+			   p4d_t *p4d, struct mm_walk *walk)
 {
-	int i;
-	p4d_t *start, *p4d_start;
-	pgprotval_t prot, eff;
-
-	if (PTRS_PER_P4D == 1)
-		return walk_pud_level(st, __p4d(pgd_val(addr)), eff_in, P);
-
-	p4d_start = start = (p4d_t *)pgd_page_vaddr(addr);
-
-	for (i = 0; i < PTRS_PER_P4D; i++) {
-		st->current_address = normalize_addr(P + i * P4D_LEVEL_MULT);
-		if (!p4d_none(*start)) {
-			prot = p4d_flags(*start);
-			eff = effective_prot(eff_in, prot);
-			if (p4d_large(*start) || !p4d_present(*start)) {
-				note_page(st, __pgprot(prot), eff, 2);
-			} else if (!kasan_page_table(st, p4d_start)) {
-				walk_pud_level(st, *start, eff,
-					       P + i * P4D_LEVEL_MULT);
-			}
-		} else
-			note_page(st, __pgprot(0), 0, 2);
+	struct pg_state *st = walk->private;
 
-		start++;
-	}
+	st->current_address = normalize_addr(addr);
+
+	if (kasan_page_table(st, p4d))
+		return 1;
+	return 0;
 }
 
-#undef pgd_large
-#define pgd_large(a) (pgtable_l5_enabled() ? pgd_large(a) : p4d_large(__p4d(pgd_val(a))))
-#define pgd_none(a)  (pgtable_l5_enabled() ? pgd_none(a) : p4d_none(__p4d(pgd_val(a))))
+static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr,
+			    unsigned long next, struct mm_walk *walk)
+{
+	struct pg_state *st = walk->private;
+	pgprotval_t eff, prot;
+
+	prot = p4d_flags(*p4d);
+	eff = effective_prot(st->effective_prot_pgd, prot);
+
+	st->current_address = normalize_addr(addr);
+
+	if (p4d_large(*p4d))
+		note_page(st, __pgprot(prot), eff, 2);
+
+	st->effective_prot_p4d = eff;
+
+	return 0;
+}
 
-static inline bool is_hypervisor_range(int idx)
+static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr,
+			    unsigned long next, struct mm_walk *walk)
 {
-#ifdef CONFIG_X86_64
-	/*
-	 * A hole in the beginning of kernel address space reserved
-	 * for a hypervisor.
-	 */
-	return	(idx >= pgd_index(GUARD_HOLE_BASE_ADDR)) &&
-		(idx <  pgd_index(GUARD_HOLE_END_ADDR));
+	struct pg_state *st = walk->private;
+	pgprotval_t eff, prot;
+
+	prot = pgd_flags(*pgd);
+
+#ifdef CONFIG_X86_PAE
+	eff = _PAGE_USER | _PAGE_RW;
 #else
-	return false;
+	eff = prot;
 #endif
+
+	st->current_address = normalize_addr(addr);
+
+	if (pgd_large(*pgd))
+		note_page(st, __pgprot(prot), eff, 1);
+
+	st->effective_prot_pgd = eff;
+
+	return 0;
+}
+
+static int ptdump_hole(unsigned long addr, unsigned long next, int depth,
+		       struct mm_walk *walk)
+{
+	struct pg_state *st = walk->private;
+
+	st->current_address = normalize_addr(addr);
+
+	note_page(st, __pgprot(0), 0, depth + 1);
+
+	return 0;
 }
 
 static void ptdump_walk_pgd_level_core(struct seq_file *m, struct mm_struct *mm,
 				       bool checkwx, bool dmesg)
 {
-	pgd_t *start = mm->pgd;
-	pgprotval_t prot, eff;
-	int i;
 	struct pg_state st = {};
+	struct mm_walk walk = {
+		.mm		= mm,
+		.pgd_entry	= ptdump_pgd_entry,
+		.p4d_entry	= ptdump_p4d_entry,
+		.pud_entry	= ptdump_pud_entry,
+		.pmd_entry	= ptdump_pmd_entry,
+		.pte_entry	= ptdump_pte_entry,
+		.test_p4d	= ptdump_test_p4d,
+		.test_pud	= ptdump_test_pud,
+		.test_pmd	= ptdump_test_pmd,
+		.pte_hole	= ptdump_hole,
+		.private	= &st
+	};
 
 	st.to_dmesg = dmesg;
 	st.check_wx = checkwx;
@@ -532,27 +555,15 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m, struct mm_struct *mm,
 	if (checkwx)
 		st.wx_pages = 0;
 
-	for (i = 0; i < PTRS_PER_PGD; i++) {
-		st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
-		if (!pgd_none(*start) && !is_hypervisor_range(i)) {
-			prot = pgd_flags(*start);
-#ifdef CONFIG_X86_PAE
-			eff = _PAGE_USER | _PAGE_RW;
+	down_read(&mm->mmap_sem);
+#ifdef CONFIG_X86_64
+	walk_page_range(0, PTRS_PER_PGD*PGD_LEVEL_MULT/2, &walk);
+	walk_page_range(normalize_addr(PTRS_PER_PGD*PGD_LEVEL_MULT/2), ~0,
+			&walk);
 #else
-			eff = prot;
+	walk_page_range(0, ~0, &walk);
 #endif
-			if (pgd_large(*start) || !pgd_present(*start)) {
-				note_page(&st, __pgprot(prot), eff, 1);
-			} else {
-				walk_p4d_level(&st, *start, eff,
-					       i * PGD_LEVEL_MULT);
-			}
-		} else
-			note_page(&st, __pgprot(0), 0, 1);
-
-		cond_resched();
-		start++;
-	}
+	up_read(&mm->mmap_sem);
 
 	/* Flush out the last page */
 	st.current_address = normalize_addr(PTRS_PER_PGD*PGD_LEVEL_MULT);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 13/13] x86: mm: Convert dump_pagetables to use walk_page_range
  2019-02-15 17:02 ` [PATCH 13/13] x86: mm: Convert dump_pagetables to use walk_page_range Steven Price
@ 2019-02-15 17:16   ` Dave Hansen
  2019-02-15 17:32     ` Steven Price
  0 siblings, 1 reply; 36+ messages in thread
From: Dave Hansen @ 2019-02-15 17:16 UTC (permalink / raw)
  To: Steven Price, linux-mm
  Cc: Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann, Borislav Petkov,
	Catalin Marinas, Dave Hansen, Ingo Molnar, James Morse,
	Jérôme Glisse, Peter Zijlstra, Thomas Gleixner,
	Will Deacon, x86, H. Peter Anvin, linux-arm-kernel, linux-kernel

On 2/15/19 9:02 AM, Steven Price wrote:
>  arch/x86/mm/dump_pagetables.c | 281 ++++++++++++++++++----------------
>  1 file changed, 146 insertions(+), 135 deletions(-)

I'll look through this in more detail in a bit.  But, I'm a bit bummed
out by the diffstat.  When I see patches add a bunch of infrastructure,
I *really* hope for code simplification.

Looking at the diff, I think it gets there.  The code you add is
simpler than the code you remove.  But, the diffstat is misleading.

I'd probably address this disparity for each patch in the changelogs.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 13/13] x86: mm: Convert dump_pagetables to use walk_page_range
  2019-02-15 17:16   ` Dave Hansen
@ 2019-02-15 17:32     ` Steven Price
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-15 17:32 UTC (permalink / raw)
  To: Dave Hansen, linux-mm
  Cc: x86, Arnd Bergmann, Ard Biesheuvel, Peter Zijlstra,
	Catalin Marinas, Dave Hansen, Will Deacon, linux-kernel,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On 15/02/2019 17:16, Dave Hansen wrote:
> On 2/15/19 9:02 AM, Steven Price wrote:
>>  arch/x86/mm/dump_pagetables.c | 281 ++++++++++++++++++----------------
>>  1 file changed, 146 insertions(+), 135 deletions(-)
> 
> I'll look through this in more detail in a bit.  But, I'm a bit bummed
> out by the diffstat.  When I see patches add a bunch of infrastructure,
> I *really* hope for code simplification.
> 
> Looking at the diff, I think it gets there.  The code you add is
> simpler than the code you remove.  But, the diffstat is misleading.

I agree, I was disappointed myself by the diffstat. Hopefully the code
is simpler though - there's unfortunately quite a bit of boilerplate
code in the new version, but that code is at least easier to understand.

My hope is that it will be possible to move the boilerplate code into
common code (there's a lot of similarity with arm64 here). However I
wanted to get some agreement on the approach and make a non-functional
change before attempting to unify the different dump_pagetables
implementations (which is likely to cause minor functional changes).
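
Roughly the shape of common code I have in mind (purely a sketch at
this stage - all names and fields hypothetical):

struct ptdump_info {
	struct mm_struct		*mm;
	const struct addr_marker	*markers;
	unsigned long			base_addr;
	/* arch hook to format a single entry for the debugfs file */
	void (*note_page)(struct pg_state *st, unsigned long addr,
			  int level, u64 val);
};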

> I'd probably address this disparity for each patch in the changelogs.

Good point, I'll try to remember for next time.

Thanks for taking a look,

Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-15 17:02 ` [PATCH 03/13] mm: Add generic p?d_large() macros Steven Price
@ 2019-02-18 11:14   ` Mark Rutland
  2019-02-18 12:53     ` Peter Zijlstra
  2019-02-18 11:31   ` Peter Zijlstra
  1 sibling, 1 reply; 36+ messages in thread
From: Mark Rutland @ 2019-02-18 11:14 UTC (permalink / raw)
  To: Steven Price
  Cc: linux-mm, x86, Arnd Bergmann, Ard Biesheuvel, Peter Zijlstra,
	Catalin Marinas, Dave Hansen, Will Deacon, linux-kernel,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On Fri, Feb 15, 2019 at 05:02:24PM +0000, Steven Price wrote:
> From: James Morse <james.morse@arm.com>
>
> Exposing the pud/pgd levels of the page tables to walk_page_range() means
> we may come across the exotic large mappings that come with large areas
> of contiguous memory (such as the kernel's linear map).
>
> For architectures that don't provide p?d_large() macros, provide a
> do-nothing default.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  include/asm-generic/pgtable.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 05e61e6c843f..7630d663cd51 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -1186,4 +1186,14 @@ static inline bool arch_has_pfn_modify_check(void)
>  #define mm_pmd_folded(mm)	__is_defined(__PAGETABLE_PMD_FOLDED)
>  #endif
>
> +#ifndef pgd_large
> +#define pgd_large(x)	0
> +#endif
> +#ifndef pud_large
> +#define pud_large(x)	0
> +#endif
> +#ifndef pmd_large
> +#define pmd_large(x)	0
> +#endif

It might be worth a comment defining the semantics of these, e.g. how
they differ from p?d_huge() and p?d_trans_huge().
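
Something along these lines, perhaps (wording only a suggestion):

/*
 * p?d_large() - the entry maps a (possibly huge) block of memory at
 * this level, rather than pointing to a next-level table. Unlike
 * p?d_huge(), this must be usable even when CONFIG_HUGETLB_PAGE is
 * disabled.
 */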

Thanks,
Mark.

> +
>  #endif /* _ASM_GENERIC_PGTABLE_H */
> --
> 2.20.1
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-15 17:02 ` [PATCH 01/13] arm64: mm: Add p?d_large() definitions Steven Price
@ 2019-02-18 11:16   ` Mark Rutland
  2019-02-18 11:29   ` Peter Zijlstra
  1 sibling, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2019-02-18 11:16 UTC (permalink / raw)
  To: Steven Price
  Cc: linux-mm, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

On Fri, Feb 15, 2019 at 05:02:22PM +0000, Steven Price wrote:
> From: James Morse <james.morse@arm.com>
>
> Exposing the pud/pgd levels of the page tables to walk_page_range() means
> we may come across the exotic large mappings that come with large areas
> of contiguous memory (such as the kernel's linear map).
>
> Expose p?d_large() from each architecture to detect these large mappings.
>
> arm64 already has these macros defined, but with a different name.
> p?d_large() is used by s390, sparc and x86. Only arm/arm64 use p?d_sect().
> Add a macro to allow both names.
>
> By not providing a pgd_large(), we get the generic version that always
> returns 0.

This last sentence isn't true until a subsequent patch, so it's probably
worth dropping it to avoid confusion.

Thanks,
Mark.

>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/pgtable.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index de70c1eabf33..09d308921625 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -428,6 +428,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  				 PMD_TYPE_TABLE)
>  #define pmd_sect(pmd)		((pmd_val(pmd) & PMD_TYPE_MASK) == \
>  				 PMD_TYPE_SECT)
> +#define pmd_large(x)		pmd_sect(x)
>
>  #if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3
>  #define pud_sect(pud)		(0)
> @@ -435,6 +436,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  #else
>  #define pud_sect(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
>  				 PUD_TYPE_SECT)
> +#define pud_large(x)		pud_sect(x)
>  #define pud_table(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
>  				 PUD_TYPE_TABLE)
>  #endif
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole
  2019-02-15 17:02 ` [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole Steven Price
@ 2019-02-18 11:23   ` Mark Rutland
  2019-02-18 15:23     ` Steven Price
  2019-02-20 11:35   ` William Kucharski
  1 sibling, 1 reply; 36+ messages in thread
From: Mark Rutland @ 2019-02-18 11:23 UTC (permalink / raw)
  To: Steven Price
  Cc: linux-mm, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

On Fri, Feb 15, 2019 at 05:02:27PM +0000, Steven Price wrote:
> The pte_hole() callback is called at multiple levels of the page tables.
> Code dumping the kernel page tables needs to know at what depth
> the missing entry is. Add this as an extra parameter to pte_hole().
> When the depth isn't known (e.g. processing a vma) then -1 is passed.
>
> Note that depth starts at 0 for a PGD so that PUD/PMD/PTE retain their
> natural numbers as levels 2/3/4.
>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  fs/proc/task_mmu.c |  4 ++--
>  include/linux/mm.h |  5 +++--
>  mm/hmm.c           |  2 +-
>  mm/migrate.c       |  1 +
>  mm/mincore.c       |  1 +
>  mm/pagewalk.c      | 16 ++++++++++------
>  6 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index f0ec9edab2f3..91131cd4e9e0 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -474,7 +474,7 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>
>  #ifdef CONFIG_SHMEM
>  static int smaps_pte_hole(unsigned long addr, unsigned long end,
> -			  struct mm_walk *walk)
> +			  __always_unused int depth, struct mm_walk *walk)
>  {
>  	struct mem_size_stats *mss = walk->private;
>
> @@ -1203,7 +1203,7 @@ static int add_to_pagemap(unsigned long addr, pagemap_entry_t *pme,
>  }
>
>  static int pagemap_pte_hole(unsigned long start, unsigned long end,
> -			    struct mm_walk *walk)
> +			    __always_unused int depth, struct mm_walk *walk)
>  {
>  	struct pagemapread *pm = walk->private;
>  	unsigned long addr = start;
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 1a4b1615d012..0418a018d7b3 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1420,7 +1420,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct vm_area_struct *start_vma,
>   *       pmd_trans_huge() pmds.  They may simply choose to
>   *       split_huge_page() instead of handling it explicitly.
>   * @pte_entry: if set, called for each non-empty PTE (4th-level) entry
> - * @pte_hole: if set, called for each hole at all levels
> + * @pte_hole: if set, called for each hole at all levels,
> + *            depth is -1 if not known
>   * @hugetlb_entry: if set, called for each hugetlb entry
>   * @test_walk: caller specific callback function to determine whether
>   *             we walk over the current vma or not. Returning 0
> @@ -1445,7 +1446,7 @@ struct mm_walk {
>  	int (*pte_entry)(pte_t *pte, unsigned long addr,
>  			 unsigned long next, struct mm_walk *walk);
>  	int (*pte_hole)(unsigned long addr, unsigned long next,
> -			struct mm_walk *walk);
> +			int depth, struct mm_walk *walk);
>  	int (*hugetlb_entry)(pte_t *pte, unsigned long hmask,
>  			     unsigned long addr, unsigned long next,
>  			     struct mm_walk *walk);
> diff --git a/mm/hmm.c b/mm/hmm.c
> index a04e4b810610..e3e6b8fda437 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -440,7 +440,7 @@ static void hmm_range_need_fault(const struct hmm_vma_walk *hmm_vma_walk,
>  }
>
>  static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
> -			     struct mm_walk *walk)
> +			     __always_unused int depth, struct mm_walk *walk)
>  {
>  	struct hmm_vma_walk *hmm_vma_walk = walk->private;
>  	struct hmm_range *range = hmm_vma_walk->range;
> diff --git a/mm/migrate.c b/mm/migrate.c
> index d4fd680be3b0..8b62a9fecb5c 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2121,6 +2121,7 @@ struct migrate_vma {
>
>  static int migrate_vma_collect_hole(unsigned long start,
>  				    unsigned long end,
> +				    __always_unused int depth,
>  				    struct mm_walk *walk)
>  {
>  	struct migrate_vma *migrate = walk->private;
> diff --git a/mm/mincore.c b/mm/mincore.c
> index 218099b5ed31..c4edbc688241 100644
> --- a/mm/mincore.c
> +++ b/mm/mincore.c
> @@ -104,6 +104,7 @@ static int __mincore_unmapped_range(unsigned long addr, unsigned long end,
>  }
>
>  static int mincore_unmapped_range(unsigned long addr, unsigned long end,
> +				   __always_unused int depth,
>  				   struct mm_walk *walk)
>  {
>  	walk->private += __mincore_unmapped_range(addr, end,
> diff --git a/mm/pagewalk.c b/mm/pagewalk.c
> index dac0c848b458..b8038f852f06 100644
> --- a/mm/pagewalk.c
> +++ b/mm/pagewalk.c
> @@ -38,7 +38,7 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
>  		next = pmd_addr_end(addr, end);
>  		if (pmd_none(*pmd)) {
>  			if (walk->pte_hole)
> -				err = walk->pte_hole(addr, next, walk);
> +				err = walk->pte_hole(addr, next, 3, walk);
>  			if (err)
>  				break;
>  			continue;
> @@ -88,7 +88,7 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
>  		next = pud_addr_end(addr, end);
>  		if (pud_none(*pud)) {
>  			if (walk->pte_hole)
> -				err = walk->pte_hole(addr, next, walk);
> +				err = walk->pte_hole(addr, next, 2, walk);
>  			if (err)
>  				break;
>  			continue;
> @@ -123,13 +123,17 @@ static int walk_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end,
>  	p4d_t *p4d;
>  	unsigned long next;
>  	int err = 0;
> +	/* If the p4ds are actually just pgds then we should report a depth
> +	 * of 0 not 1 (as a missing entry is really a missing pgd
> +	 */

Nit: comment style violation. This should look like:

/*
 * If the p4ds are actually just pgds then we should report a depth
 * of 0 not 1 (as a missing entry is really a missing pgd
 */

> +	int depth = (PTRS_PER_P4D == 1)?0:1;

Nit: the ternary should have spacing.

We don't seem to do this at any other level that could be folded, so why
does p4d need special care?

For example, what happens on arm64 when using 64K pages and 3 level
paging, where puds are folded into pgds?

Thanks,
Mark.

>
>  	p4d = p4d_offset(pgd, addr);
>  	do {
>  		next = p4d_addr_end(addr, end);
>  		if (p4d_none_or_clear_bad(p4d)) {
>  			if (walk->pte_hole)
> -				err = walk->pte_hole(addr, next, walk);
> +				err = walk->pte_hole(addr, next, depth, walk);
>  			if (err)
>  				break;
>  			continue;
> @@ -160,7 +164,7 @@ static int walk_pgd_range(unsigned long addr, unsigned long end,
>  		next = pgd_addr_end(addr, end);
>  		if (pgd_none_or_clear_bad(pgd)) {
>  			if (walk->pte_hole)
> -				err = walk->pte_hole(addr, next, walk);
> +				err = walk->pte_hole(addr, next, 0, walk);
>  			if (err)
>  				break;
>  			continue;
> @@ -206,7 +210,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
>  		if (pte)
>  			err = walk->hugetlb_entry(pte, hmask, addr, next, walk);
>  		else if (walk->pte_hole)
> -			err = walk->pte_hole(addr, next, walk);
> +			err = walk->pte_hole(addr, next, -1, walk);
>
>  		if (err)
>  			break;
> @@ -249,7 +253,7 @@ static int walk_page_test(unsigned long start, unsigned long end,
>  	if (vma->vm_flags & VM_PFNMAP) {
>  		int err = 1;
>  		if (walk->pte_hole)
> -			err = walk->pte_hole(start, end, walk);
> +			err = walk->pte_hole(start, end, -1, walk);
>  		return err ? err : 1;
>  	}
>  	return 0;
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-15 17:02 ` [PATCH 01/13] arm64: mm: Add p?d_large() definitions Steven Price
  2019-02-18 11:16   ` Mark Rutland
@ 2019-02-18 11:29   ` Peter Zijlstra
  2019-02-18 13:45     ` Mark Rutland
  2019-02-18 14:11     ` Steven Price
  1 sibling, 2 replies; 36+ messages in thread
From: Peter Zijlstra @ 2019-02-18 11:29 UTC (permalink / raw)
  To: Steven Price
  Cc: linux-mm, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Thomas Gleixner,
	Will Deacon, x86, H. Peter Anvin, linux-arm-kernel, linux-kernel

On Fri, Feb 15, 2019 at 05:02:22PM +0000, Steven Price wrote:

> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index de70c1eabf33..09d308921625 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -428,6 +428,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  				 PMD_TYPE_TABLE)
>  #define pmd_sect(pmd)		((pmd_val(pmd) & PMD_TYPE_MASK) == \
>  				 PMD_TYPE_SECT)
> +#define pmd_large(x)		pmd_sect(x)
>  
>  #if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3
>  #define pud_sect(pud)		(0)
> @@ -435,6 +436,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  #else
>  #define pud_sect(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
>  				 PUD_TYPE_SECT)
> +#define pud_large(x)		pud_sect(x)
>  #define pud_table(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
>  				 PUD_TYPE_TABLE)
>  #endif

So on x86 p*d_large() also matches p*d_huge() and THP, but it is not
clear to me that this p*d_sect() thing does so, given your definitions.

See here why I care:

  http://lkml.kernel.org/r/20190201124741.GE31552@hirez.programming.kicks-ass.net

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-15 17:02 ` [PATCH 03/13] mm: Add generic p?d_large() macros Steven Price
  2019-02-18 11:14   ` Mark Rutland
@ 2019-02-18 11:31   ` Peter Zijlstra
  2019-02-18 14:19     ` Steven Price
  1 sibling, 1 reply; 36+ messages in thread
From: Peter Zijlstra @ 2019-02-18 11:31 UTC (permalink / raw)
  To: Steven Price
  Cc: linux-mm, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Thomas Gleixner,
	Will Deacon, x86, H. Peter Anvin, linux-arm-kernel, linux-kernel,
	kan.liang, kirill

On Fri, Feb 15, 2019 at 05:02:24PM +0000, Steven Price wrote:
> From: James Morse <james.morse@arm.com>
> 
> Exposing the pud/pgd levels of the page tables to walk_page_range() means
> we may come across the exotic large mappings that come with large areas
> of contiguous memory (such as the kernel's linear map).
> 
> For architectures that don't provide p?d_large() macros, provide a
> do-nothing default.

Kan was going to fix that for all archs, I think...

See:

  http://lkml.kernel.org/r/20190204105409.GA17550@hirez.programming.kicks-ass.net

> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  include/asm-generic/pgtable.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 05e61e6c843f..7630d663cd51 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -1186,4 +1186,14 @@ static inline bool arch_has_pfn_modify_check(void)
>  #define mm_pmd_folded(mm)	__is_defined(__PAGETABLE_PMD_FOLDED)
>  #endif
>  
> +#ifndef pgd_large
> +#define pgd_large(x)	0
> +#endif
> +#ifndef pud_large
> +#define pud_large(x)	0
> +#endif
> +#ifndef pmd_large
> +#define pmd_large(x)	0
> +#endif
> +
>  #endif /* _ASM_GENERIC_PGTABLE_H */
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-18 11:14   ` Mark Rutland
@ 2019-02-18 12:53     ` Peter Zijlstra
  2019-02-18 14:20       ` Steven Price
  0 siblings, 1 reply; 36+ messages in thread
From: Peter Zijlstra @ 2019-02-18 12:53 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Steven Price, linux-mm, x86, Arnd Bergmann, Ard Biesheuvel,
	Catalin Marinas, Dave Hansen, Will Deacon, linux-kernel,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On Mon, Feb 18, 2019 at 11:14:23AM +0000, Mark Rutland wrote:
> > +#ifndef pgd_large
> > +#define pgd_large(x)	0
> > +#endif
> > +#ifndef pud_large
> > +#define pud_large(x)	0
> > +#endif
> > +#ifndef pmd_large
> > +#define pmd_large(x)	0
> > +#endif
> 
> It might be worth a comment defining the semantics of these, e.g. how
> they differ from p?d_huge() and p?d_trans_huge().

Yes; I took it to mean any large page mapping, so it would explicitly
include huge and thp.



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-18 11:29   ` Peter Zijlstra
@ 2019-02-18 13:45     ` Mark Rutland
  2019-02-18 14:11     ` Steven Price
  1 sibling, 0 replies; 36+ messages in thread
From: Mark Rutland @ 2019-02-18 13:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Price, linux-mm, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Borislav Petkov, Catalin Marinas, Dave Hansen,
	Ingo Molnar, James Morse, Jérôme Glisse,
	Thomas Gleixner, Will Deacon, x86, H. Peter Anvin,
	linux-arm-kernel, linux-kernel

On Mon, Feb 18, 2019 at 12:29:22PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 15, 2019 at 05:02:22PM +0000, Steven Price wrote:
> 
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index de70c1eabf33..09d308921625 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -428,6 +428,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> >  				 PMD_TYPE_TABLE)
> >  #define pmd_sect(pmd)		((pmd_val(pmd) & PMD_TYPE_MASK) == \
> >  				 PMD_TYPE_SECT)
> > +#define pmd_large(x)		pmd_sect(x)
> >  
> >  #if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3
> >  #define pud_sect(pud)		(0)
> > @@ -435,6 +436,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> >  #else
> >  #define pud_sect(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
> >  				 PUD_TYPE_SECT)
> > +#define pud_large(x)		pud_sect(x)
> >  #define pud_table(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
> >  				 PUD_TYPE_TABLE)
> >  #endif
> 
> So on x86 p*d_large() also matches p*d_huge() and THP, but it is not
> clear to me that this p*d_sect() thing does so, given your definitions.
> 
> See here why I care:
> 
>   http://lkml.kernel.org/r/20190201124741.GE31552@hirez.programming.kicks-ass.net

I believe it does not.

IIUC our p?d_huge() helpers implicitly handle contiguous entries. That's
where you have $N entries in the current level of table that the TLB can
cache together as one.

Our p?d_sect() helpers only match section entries. That's where we map
an entire next-level-table's worth of VA space with a single entry at
the current level.
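
Concretely, with 4K pages (numbers from memory, just to illustrate):

	PMD section entry   -> one 2M block mapping
	PUD section entry   -> one 1G block mapping
	16 contiguous PTEs  -> 16 x 4K the TLB may cache as one 64K entry
	16 contiguous PMDs  -> 16 x 2M the TLB may cache as one 32M entry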

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-18 11:29   ` Peter Zijlstra
  2019-02-18 13:45     ` Mark Rutland
@ 2019-02-18 14:11     ` Steven Price
  2019-02-18 14:29       ` Mark Rutland
  1 sibling, 1 reply; 36+ messages in thread
From: Steven Price @ 2019-02-18 14:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-mm, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Thomas Gleixner,
	Will Deacon, x86, H. Peter Anvin, linux-arm-kernel, linux-kernel

On 18/02/2019 11:29, Peter Zijlstra wrote:
> On Fri, Feb 15, 2019 at 05:02:22PM +0000, Steven Price wrote:
> 
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index de70c1eabf33..09d308921625 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -428,6 +428,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>>  				 PMD_TYPE_TABLE)
>>  #define pmd_sect(pmd)		((pmd_val(pmd) & PMD_TYPE_MASK) == \
>>  				 PMD_TYPE_SECT)
>> +#define pmd_large(x)		pmd_sect(x)
>>  
>>  #if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3
>>  #define pud_sect(pud)		(0)
>> @@ -435,6 +436,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>>  #else
>>  #define pud_sect(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
>>  				 PUD_TYPE_SECT)
>> +#define pud_large(x)		pud_sect(x)
>>  #define pud_table(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
>>  				 PUD_TYPE_TABLE)
>>  #endif
> 
> So on x86 p*d_large() also matches p*d_huge() and THP, but it is not
> clear to me that this p*d_sect() thing does so, given your definitions.
> 
> See here why I care:
> 
>   http://lkml.kernel.org/r/20190201124741.GE31552@hirez.programming.kicks-ass.net
> 

pmd_huge()/pud_huge() unfortunately are currently defined as '0' if
!CONFIG_HUGETLB_PAGE and for this reason I was avoiding using them.
While most code would reasonably not care about huge pages in that build
configuration, the likes of the debugfs page table dump code needs to be
able to recognise them in all build configurations. I believe the
situation is the same on arm64 and x86.
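
For reference, the !CONFIG_HUGETLB_PAGE fallbacks are along these lines
(paraphrasing include/linux/hugetlb.h from memory):

#define pmd_huge(x)	0
#define pud_huge(x)	0

so anything that has to spot huge mappings in every configuration (like
the dump code) can't rely on them.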

The other quirk is that higher levels are not supported for HUGETLB on
some arch implementations. For example arm64 provides no definition for
pgd_huge() so falls back to the generic defined as '0' implementation.
The only architecture I can see that defines this is powerpc. Keeping
this as '0' ensures that the otherwise dead code in other places is
compiled out.

Where p?d_huge is defined by the arch code, the intention is that
p?d_large(x)==p?d_huge(x).

Thanks,

Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-18 11:31   ` Peter Zijlstra
@ 2019-02-18 14:19     ` Steven Price
  2019-02-19  3:44       ` Liang, Kan
  0 siblings, 1 reply; 36+ messages in thread
From: Steven Price @ 2019-02-18 14:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, Arnd Bergmann, Ard Biesheuvel, Catalin Marinas, kirill,
	Dave Hansen, Will Deacon, linux-kernel, linux-mm,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel, kan.liang

On 18/02/2019 11:31, Peter Zijlstra wrote:
> On Fri, Feb 15, 2019 at 05:02:24PM +0000, Steven Price wrote:
>> From: James Morse <james.morse@arm.com>
>>
>> Exposing the pud/pgd levels of the page tables to walk_page_range() means
>> we may come across the exotic large mappings that come with large areas
>> of contiguous memory (such as the kernel's linear map).
>>
>> For architectures that don't provide p?d_large() macros, provide a
>> do-nothing default.
> 
> Kan was going to fix that for all archs, I think...

The latest series I can find from Kan is still x86 specific. I'm happy
to rebase onto something else if Kan has an implementation already
(please point me in the right direction). Otherwise Kan is obviously
free to base on these changes.

Steve

> See:
> 
>   http://lkml.kernel.org/r/20190204105409.GA17550@hirez.programming.kicks-ass.net
> 
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  include/asm-generic/pgtable.h | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>> index 05e61e6c843f..7630d663cd51 100644
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -1186,4 +1186,14 @@ static inline bool arch_has_pfn_modify_check(void)
>>  #define mm_pmd_folded(mm)	__is_defined(__PAGETABLE_PMD_FOLDED)
>>  #endif
>>  
>> +#ifndef pgd_large
>> +#define pgd_large(x)	0
>> +#endif
>> +#ifndef pud_large
>> +#define pud_large(x)	0
>> +#endif
>> +#ifndef pmd_large
>> +#define pmd_large(x)	0
>> +#endif
>> +
>>  #endif /* _ASM_GENERIC_PGTABLE_H */
>> -- 
>> 2.20.1
>>
> 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-18 12:53     ` Peter Zijlstra
@ 2019-02-18 14:20       ` Steven Price
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-18 14:20 UTC (permalink / raw)
  To: Peter Zijlstra, Mark Rutland
  Cc: Dave Hansen, Arnd Bergmann, Ard Biesheuvel, Catalin Marinas, x86,
	Will Deacon, linux-kernel, linux-mm, Jérôme Glisse,
	Ingo Molnar, Borislav Petkov, Andy Lutomirski, H. Peter Anvin,
	James Morse, Thomas Gleixner, linux-arm-kernel

On 18/02/2019 12:53, Peter Zijlstra wrote:
> On Mon, Feb 18, 2019 at 11:14:23AM +0000, Mark Rutland wrote:
>>> +#ifndef pgd_large
>>> +#define pgd_large(x)	0
>>> +#endif
>>> +#ifndef pud_large
>>> +#define pud_large(x)	0
>>> +#endif
>>> +#ifndef pmd_large
>>> +#define pmd_large(x)	0
>>> +#endif
>>
>> It might be worth a comment defining the semantics of these, e.g. how
>> they differ from p?d_huge() and p?d_trans_huge().
> 
> Yes; I took it to mean any large page mapping, so it would explicitly
> include huge and thp.

Yes your interpretation is correct. I'll add a comment explaining the
semantics.

Thanks,

Steve


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-18 14:11     ` Steven Price
@ 2019-02-18 14:29       ` Mark Rutland
  2019-02-18 15:06         ` Peter Zijlstra
  0 siblings, 1 reply; 36+ messages in thread
From: Mark Rutland @ 2019-02-18 14:29 UTC (permalink / raw)
  To: Steven Price
  Cc: Peter Zijlstra, x86, Arnd Bergmann, Ard Biesheuvel,
	Catalin Marinas, Dave Hansen, Will Deacon, linux-kernel,
	linux-mm, Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On Mon, Feb 18, 2019 at 02:11:40PM +0000, Steven Price wrote:
> On 18/02/2019 11:29, Peter Zijlstra wrote:
> > On Fri, Feb 15, 2019 at 05:02:22PM +0000, Steven Price wrote:
> > 
> >> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> >> index de70c1eabf33..09d308921625 100644
> >> --- a/arch/arm64/include/asm/pgtable.h
> >> +++ b/arch/arm64/include/asm/pgtable.h
> >> @@ -428,6 +428,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> >>  				 PMD_TYPE_TABLE)
> >>  #define pmd_sect(pmd)		((pmd_val(pmd) & PMD_TYPE_MASK) == \
> >>  				 PMD_TYPE_SECT)
> >> +#define pmd_large(x)		pmd_sect(x)
> >>  
> >>  #if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS < 3
> >>  #define pud_sect(pud)		(0)
> >> @@ -435,6 +436,7 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> >>  #else
> >>  #define pud_sect(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
> >>  				 PUD_TYPE_SECT)
> >> +#define pud_large(x)		pud_sect(x)
> >>  #define pud_table(pud)		((pud_val(pud) & PUD_TYPE_MASK) == \
> >>  				 PUD_TYPE_TABLE)
> >>  #endif
> > 
> > So on x86 p*d_large() also matches p*d_huge() and THP, but it is not
> > clear to me that this p*d_sect() thing does so, given your definitions.
> > 
> > See here why I care:
> > 
> >   http://lkml.kernel.org/r/20190201124741.GE31552@hirez.programming.kicks-ass.net
> > 
> 
> pmd_huge()/pud_huge() unfortunately are currently defined as '0' if
> !CONFIG_HUGETLB_PAGE and for this reason I was avoiding using them.

I think that Peter means p?d_huge(x) should imply p?d_large(x), e.g.

#define pmd_large(x) \
	(pmd_sect(x) || pmd_huge(x) || pmd_trans_huge(x))

... which should work regardless of CONFIG_HUGETLB_PAGE.

> While most code would reasonably not care about huge pages in that build
> configuration, the likes of the debugfs page table dump code needs to be
> able to recognise them in all build configurations. I believe the
> situation is the same on arm64 and x86.

There's a very important distinction here between:

* section mappings, which are an architectural construct used in
  arm64-specific code (e.g. the kernel's own page tables).

* huge mappings, which are a Linux logical construct for mapping
  userspace memory. These are built using section mappings.

The existing arm64 debugfs pagetable dump code cares about section
mappings specifically in all cases, since it is not used to dump
userspace page tables.

The existing generic code doesn't care about section mappings
specifically, because they are not generic.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-18 14:29       ` Mark Rutland
@ 2019-02-18 15:06         ` Peter Zijlstra
  2019-02-18 15:30           ` Steven Price
  0 siblings, 1 reply; 36+ messages in thread
From: Peter Zijlstra @ 2019-02-18 15:06 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Steven Price, x86, Arnd Bergmann, Ard Biesheuvel,
	Catalin Marinas, Dave Hansen, Will Deacon, linux-kernel,
	linux-mm, Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On Mon, Feb 18, 2019 at 02:29:52PM +0000, Mark Rutland wrote:
> I think that Peter means p?d_huge(x) should imply p?d_large(x), e.g.
> 
> #define pmd_large(x) \
> 	(pmd_sect(x) || pmd_huge(x) || pmd_trans_huge(x))
> 
> ... which should work regardless of CONFIG_HUGETLB_PAGE.

Yep, that.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole
  2019-02-18 11:23   ` Mark Rutland
@ 2019-02-18 15:23     ` Steven Price
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-18 15:23 UTC (permalink / raw)
  To: Mark Rutland
  Cc: x86, Arnd Bergmann, Ard Biesheuvel, Peter Zijlstra,
	Catalin Marinas, Dave Hansen, Will Deacon, linux-kernel,
	linux-mm, Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On 18/02/2019 11:23, Mark Rutland wrote:
> On Fri, Feb 15, 2019 at 05:02:27PM +0000, Steven Price wrote:
>> +	/* If the p4ds are actually just pgds then we should report a depth
>> +	 * of 0 not 1 (as a missing entry is really a missing pgd
>> +	 */
> 
> Nit: comment style violation. This should look like:
> 
> /*
>  * If the p4ds are actually just pgds then we should report a depth
>  * of 0 not 1 (as a missing entry is really a missing pgd
>  */
> 
>> +	int depth = (PTRS_PER_P4D == 1)?0:1;
> 
> Nit: the ternary should have spacing.
> 
> We don't seem to do this at any other level that could be folded, so why
> does p4d need special care?
> 
> For example, what happens on arm64 when using 64K pages and 3 level
> paging, where puds are folded into pgds?
> 
> Thanks,
> Mark.

Yes, you are entirely correct - I've missed the other potential foldings.
I somehow imagined that p4d was special and was folded the opposite
direction (I'm not sure why!).

The best solution I can come up with is a function which will convert
from the level the entry is found at, back to the 'real' level the entry
was missing at. This is needed to produce the correct output in the
debugfs file. Something like:

static int real_depth(int depth)
{
	if (depth == 3 && PTRS_PER_PMD == 1)
		depth = 2;
	if (depth == 2 && PTRS_PER_PUD == 1)
		depth = 1;
	if (depth == 1 && PTRS_PER_P4D == 1)
		depth = 0;
	return depth;
}

This should of course get folded by the compiler and not actually
generate any code.
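
The callers would then pass the adjusted value instead of a hard-coded
level, e.g. in walk_pud_range() (sketch):

		if (walk->pte_hole)
			err = walk->pte_hole(addr, next, real_depth(2), walk);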

Thanks,

Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-18 15:06         ` Peter Zijlstra
@ 2019-02-18 15:30           ` Steven Price
  2019-02-18 17:04             ` Mark Rutland
  0 siblings, 1 reply; 36+ messages in thread
From: Steven Price @ 2019-02-18 15:30 UTC (permalink / raw)
  To: Peter Zijlstra, Mark Rutland
  Cc: Dave Hansen, Arnd Bergmann, Ard Biesheuvel, Catalin Marinas, x86,
	Will Deacon, linux-kernel, linux-mm, Jérôme Glisse,
	Ingo Molnar, Borislav Petkov, Andy Lutomirski, H. Peter Anvin,
	James Morse, Thomas Gleixner, linux-arm-kernel

On 18/02/2019 15:06, Peter Zijlstra wrote:
> On Mon, Feb 18, 2019 at 02:29:52PM +0000, Mark Rutland wrote:
>> I think that Peter means p?d_huge(x) should imply p?d_large(x), e.g.
>>
>> #define pmd_large(x) \
>> 	(pmd_sect(x) || pmd_huge(x) || pmd_trans_huge(x))
>>
>> ... which should work regardless of CONFIG_HUGETLB_PAGE.
> 
> Yep, that.

I'm not aware of a situation where pmd_huge(x) is true but pmd_sect(x)
isn't. Equally for pmd_huge(x) and pmd_trans_huge(x).

What am I missing?

Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-18 15:30           ` Steven Price
@ 2019-02-18 17:04             ` Mark Rutland
  2019-02-18 17:22               ` Steven Price
  0 siblings, 1 reply; 36+ messages in thread
From: Mark Rutland @ 2019-02-18 17:04 UTC (permalink / raw)
  To: Steven Price
  Cc: Peter Zijlstra, Dave Hansen, Arnd Bergmann, Ard Biesheuvel,
	Catalin Marinas, x86, Will Deacon, linux-kernel, linux-mm,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On Mon, Feb 18, 2019 at 03:30:38PM +0000, Steven Price wrote:
> On 18/02/2019 15:06, Peter Zijlstra wrote:
> > On Mon, Feb 18, 2019 at 02:29:52PM +0000, Mark Rutland wrote:
> >> I think that Peter means p?d_huge(x) should imply p?d_large(x), e.g.
> >>
> >> #define pmd_large(x) \
> >> 	(pmd_sect(x) || pmd_huge(x) || pmd_trans_huge(x))
> >>
> >> ... which should work regardless of CONFIG_HUGETLB_PAGE.
> > 
> > Yep, that.
> 
> I'm not aware of a situation where pmd_huge(x) is true but pmd_sect(x)
> isn't. Equally for pmd_huge(x) and pmd_trans_huge(x).
> 
> What am I missing?

Having dug for a bit, I think you're right in asserting that pmd_sect()
should cover those.

I had worried that wouldn't cater for contiguous pmd entries, but those
have to be contiguous section entries, so they get picked up.

That said, do we have any special handling for contiguous PTEs? We use
those in kernel mappings regardless of hugetlb support, and I didn't
spot a pte_large() helper.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/13] arm64: mm: Add p?d_large() definitions
  2019-02-18 17:04             ` Mark Rutland
@ 2019-02-18 17:22               ` Steven Price
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-18 17:22 UTC (permalink / raw)
  To: Mark Rutland
  Cc: x86, Arnd Bergmann, Ard Biesheuvel, Peter Zijlstra,
	Catalin Marinas, Dave Hansen, Will Deacon, linux-kernel,
	linux-mm, Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On 18/02/2019 17:04, Mark Rutland wrote:
> On Mon, Feb 18, 2019 at 03:30:38PM +0000, Steven Price wrote:
>> On 18/02/2019 15:06, Peter Zijlstra wrote:
>>> On Mon, Feb 18, 2019 at 02:29:52PM +0000, Mark Rutland wrote:
>>>> I think that Peter means p?d_huge(x) should imply p?d_large(x), e.g.
>>>>
>>>> #define pmd_large(x) \
>>>> 	(pmd_sect(x) || pmd_huge(x) || pmd_trans_huge(x))
>>>>
>>>> ... which should work regardless of CONFIG_HUGETLB_PAGE.
>>>
>>> Yep, that.
>>
>> I'm not aware of a situation where pmd_huge(x) is true but pmd_sect(x)
>> isn't. Equally for pmd_huge(x) and pmd_trans_huge(x).
>>
>> What am I missing?
> 
> Having dug for a bit, I think you're right in asserting that pmd_sect()
> should cover those.
> 
> I had worried that wouldn't cater for contiguous pmd entries, but those
> have to be contiguous section entries, so they get picked up.
> 
> That said, do we have any special handling for contiguous PTEs? We use
> those in kernel mappings regardless of hugetlb support, and I didn't
> spot a pte_large() helper.

There's no special handling for contiguous PTEs because the page walk
code doesn't care - each PTE is valid individually even if it is part of
a contiguous group. So the walker can descend all levels in this case.
pte_large(), if it existed, would therefore always return 0.

The pte_entry() callback obviously might go looking for the contiguous
bit so that it can annotate the output correctly, but that's different
from a 'large' page. The code in arch/arm64/mm/dump.c simply looks for
the PTE_CONT bit being set to do this annotation.
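
i.e. the prot_bits table in dump.c has an entry along these lines (from
memory, not the exact code):

	{
		.mask	= PTE_CONT,
		.val	= PTE_CONT,
		.set	= "CON",
		.clear	= "   ",
	},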

Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-18 14:19     ` Steven Price
@ 2019-02-19  3:44       ` Liang, Kan
  2019-02-20 13:56         ` Steven Price
  0 siblings, 1 reply; 36+ messages in thread
From: Liang, Kan @ 2019-02-19  3:44 UTC (permalink / raw)
  To: Steven Price, Peter Zijlstra
  Cc: x86, Arnd Bergmann, Ard Biesheuvel, Catalin Marinas, kirill,
	Dave Hansen, Will Deacon, linux-kernel, linux-mm,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel



On 2/18/2019 9:19 AM, Steven Price wrote:
> On 18/02/2019 11:31, Peter Zijlstra wrote:
>> On Fri, Feb 15, 2019 at 05:02:24PM +0000, Steven Price wrote:
>>> From: James Morse <james.morse@arm.com>
>>>
>>> Exposing the pud/pgd levels of the page tables to walk_page_range() means
>>> we may come across the exotic large mappings that come with large areas
>>> of contiguous memory (such as the kernel's linear map).
>>>
>>>> For architectures that don't provide p?d_large() macros, provide a
>>>> do-nothing default.
>>
>>> Kan was going to fix that for all archs, I think...
>

Yes, I'm still working on a generic function to retrieve page size.
The generic p?d_large() issue has been fixed. However, I found that the 
pgd_page() is not generic either. I'm still working on it.
I will update you on the other thread when all issues are fixed.



> The latest series I can find from Kan is still x86 specific. I'm happy
> to rebase onto something else if Kan has an implementation already
> (please point me in the right direction). Otherwise Kan is obviously
> free to base on these changes.
>

My implementation is similar to yours. I'm happy to rebase on your changes.

Could you please also add a generic p4d_large()?

Thanks,
Kan

> Steve
> 
>> See:
>>
>>    http://lkml.kernel.org/r/20190204105409.GA17550@hirez.programming.kicks-ass.net
>>
>>> Signed-off-by: James Morse <james.morse@arm.com>
>>> Signed-off-by: Steven Price <steven.price@arm.com>
>>> ---
>>>   include/asm-generic/pgtable.h | 10 ++++++++++
>>>   1 file changed, 10 insertions(+)
>>>
>>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
>>> index 05e61e6c843f..7630d663cd51 100644
>>> --- a/include/asm-generic/pgtable.h
>>> +++ b/include/asm-generic/pgtable.h
>>> @@ -1186,4 +1186,14 @@ static inline bool arch_has_pfn_modify_check(void)
>>>   #define mm_pmd_folded(mm)	__is_defined(__PAGETABLE_PMD_FOLDED)
>>>   #endif
>>>   
>>> +#ifndef pgd_large
>>> +#define pgd_large(x)	0
>>> +#endif
>>> +#ifndef pud_large
>>> +#define pud_large(x)	0
>>> +#endif
>>> +#ifndef pmd_large
>>> +#define pmd_large(x)	0
>>> +#endif
>>> +
>>>   #endif /* _ASM_GENERIC_PGTABLE_H */
>>> -- 
>>> 2.20.1
>>>
>>
>>
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole
  2019-02-15 17:02 ` [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole Steven Price
  2019-02-18 11:23   ` Mark Rutland
@ 2019-02-20 11:35   ` William Kucharski
  2019-02-20 14:10     ` Steven Price
  1 sibling, 1 reply; 36+ messages in thread
From: William Kucharski @ 2019-02-20 11:35 UTC (permalink / raw)
  To: Steven Price
  Cc: Linux-MM, Andy Lutomirski, Ard Biesheuvel, Arnd Bergmann,
	Borislav Petkov, Catalin Marinas, Dave Hansen, Ingo Molnar,
	James Morse, Jérôme Glisse, Peter Zijlstra,
	Thomas Gleixner, Will Deacon,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, linux-arm-kernel, linux-kernel



> On Feb 15, 2019, at 10:02 AM, Steven Price <Steven.Price@arm.com> wrote:
> 
> The pte_hole() callback is called at multiple levels of the page tables.
>> Code dumping the kernel page tables needs to know at what depth
>> the missing entry is. Add this as an extra parameter to pte_hole().
>> When the depth isn't known (e.g. processing a vma) then -1 is passed.
> 
> Note that depth starts at 0 for a PGD so that PUD/PMD/PTE retain their
> natural numbers as levels 2/3/4.

Nit: Could you add a comment noting this for anyone wondering how to
calculate the level numbers in the future?
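
Perhaps something like this in the header (wording just a suggestion):

 * @pte_hole: if set, called for each hole at all levels,
 *            depth is -1 if not known, 0 is PGD, 1 is P4D,
 *            2 is PUD, 3 is PMD and 4 is PTE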

Thanks!

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 03/13] mm: Add generic p?d_large() macros
  2019-02-19  3:44       ` Liang, Kan
@ 2019-02-20 13:56         ` Steven Price
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-20 13:56 UTC (permalink / raw)
  To: Liang, Kan, Peter Zijlstra
  Cc: Dave Hansen, James Morse, Arnd Bergmann, Ard Biesheuvel,
	Catalin Marinas, x86, Will Deacon, linux-kernel, linux-mm,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, kirill, Thomas Gleixner,
	linux-arm-kernel

On 19/02/2019 03:44, Liang, Kan wrote:
> 
> 
> On 2/18/2019 9:19 AM, Steven Price wrote:
>> On 18/02/2019 11:31, Peter Zijlstra wrote:
>>> On Fri, Feb 15, 2019 at 05:02:24PM +0000, Steven Price wrote:
>>>> From: James Morse <james.morse@arm.com>
>>>>
>>>> Exposing the pud/pgd levels of the page tables to walk_page_range()
>>>> means
>>>> we may come across the exotic large mappings that come with large areas
>>>> of contiguous memory (such as the kernel's linear map).
>>>>
>>>> For architectures that don't provide p?d_large() macros, provide a
>>>> do-nothing default.
>>>
>>> Kan was going to fix that for all archs, I think...
>>
> 
> Yes, I'm still working on a generic function to retrieve page size.
> The generic p?d_large() issue has been fixed. However, I found that the
> pgd_page() is not generic either. I'm still working on it.
> I will update you on the other thread when all issues are fixed.
> 
> 
> 
>> The latest series I can find from Kan is still x86 specific. I'm happy
>> to rebase onto something else if Kan has an implementation already
>> (please point me in the right direction). Otherwise Kan is obviously
>> free to base on these changes.
>>
> 
> My implementation is similar to yours. I'm happy to re-base on your
> changes.
> 
> Could you please also add a generic p4d_large()?

Sure, I'll include that in the next posting.
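
Presumably it just follows the same pattern as the other defaults,
i.e. something like (untested):

#ifndef p4d_large
#define p4d_large(x)	0
#endif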

Thanks,

Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole
  2019-02-20 11:35   ` William Kucharski
@ 2019-02-20 14:10     ` Steven Price
  0 siblings, 0 replies; 36+ messages in thread
From: Steven Price @ 2019-02-20 14:10 UTC (permalink / raw)
  To: William Kucharski
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Arnd Bergmann, Ard Biesheuvel, Peter Zijlstra, Catalin Marinas,
	Dave Hansen, Will Deacon, linux-kernel, Linux-MM,
	Jérôme Glisse, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, H. Peter Anvin, James Morse, Thomas Gleixner,
	linux-arm-kernel

On 20/02/2019 11:35, William Kucharski wrote:
> 
> 
>> On Feb 15, 2019, at 10:02 AM, Steven Price <Steven.Price@arm.com> wrote:
>>
>> The pte_hole() callback is called at multiple levels of the page tables.
>> Code dumping the kernel page tables needs to know at what depth the
>> missing entry is. Add this as an extra parameter to pte_hole(). When
>> the depth isn't known (e.g. when processing a vma) then -1 is passed.
>>
>> Note that depth starts at 0 for a PGD so that PUD/PMD/PTE retain their
>> natural numbers as levels 2/3/4.
> 
> Nit: Could you add a comment noting this for anyone wondering how to
> calculate the level numbers in the future?

Good point! I'll expand the comment in the header file.
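
Something along these lines, perhaps (wording not final):

/*
 * depth is the level at which the hole was found, numbered so that
 * the page-table levels keep their natural values:
 *   0 - PGD,  1 - P4D,  2 - PUD,  3 - PMD,  4 - PTE
 *  -1 - depth unknown (e.g. the hole was found while walking a vma
 *       rather than an explicit page-table range)
 */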

Thanks,

Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2019-02-20 14:10 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-15 17:02 [PATCH 00/13] Convert x86 & arm64 to use generic page walk Steven Price
2019-02-15 17:02 ` [PATCH 01/13] arm64: mm: Add p?d_large() definitions Steven Price
2019-02-18 11:16   ` Mark Rutland
2019-02-18 11:29   ` Peter Zijlstra
2019-02-18 13:45     ` Mark Rutland
2019-02-18 14:11     ` Steven Price
2019-02-18 14:29       ` Mark Rutland
2019-02-18 15:06         ` Peter Zijlstra
2019-02-18 15:30           ` Steven Price
2019-02-18 17:04             ` Mark Rutland
2019-02-18 17:22               ` Steven Price
2019-02-15 17:02 ` [PATCH 02/13] x86/mm: " Steven Price
2019-02-15 17:02 ` [PATCH 03/13] mm: Add generic p?d_large() macros Steven Price
2019-02-18 11:14   ` Mark Rutland
2019-02-18 12:53     ` Peter Zijlstra
2019-02-18 14:20       ` Steven Price
2019-02-18 11:31   ` Peter Zijlstra
2019-02-18 14:19     ` Steven Price
2019-02-19  3:44       ` Liang, Kan
2019-02-20 13:56         ` Steven Price
2019-02-15 17:02 ` [PATCH 04/13] mm: pagewalk: Add p4d_entry() and pgd_entry() Steven Price
2019-02-15 17:02 ` [PATCH 05/13] mm: pagewalk: Allow walking without vma Steven Price
2019-02-15 17:02 ` [PATCH 06/13] mm: pagewalk: Add 'depth' parameter to pte_hole Steven Price
2019-02-18 11:23   ` Mark Rutland
2019-02-18 15:23     ` Steven Price
2019-02-20 11:35   ` William Kucharski
2019-02-20 14:10     ` Steven Price
2019-02-15 17:02 ` [PATCH 07/13] mm: pagewalk: Add test_p?d callbacks Steven Price
2019-02-15 17:02 ` [PATCH 08/13] arm64: mm: Convert mm/dump.c to use walk_page_range() Steven Price
2019-02-15 17:02 ` [PATCH 09/13] x86/mm: Point to struct seq_file from struct pg_state Steven Price
2019-02-15 17:02 ` [PATCH 10/13] x86/mm+efi: Convert ptdump_walk_pgd_level() to take a mm_struct Steven Price
2019-02-15 17:02 ` [PATCH 11/13] x86/mm: Convert ptdump_walk_pgd_level_debugfs() to take an mm_struct Steven Price
2019-02-15 17:02 ` [PATCH 12/13] x86/mm: Convert ptdump_walk_pgd_level_core() " Steven Price
2019-02-15 17:02 ` [PATCH 13/13] x86: mm: Convert dump_pagetables to use walk_page_range Steven Price
2019-02-15 17:16   ` Dave Hansen
2019-02-15 17:32     ` Steven Price

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).