Linux-m68k Archive on lore.kernel.org
* [PATCH 0/7] Record the mm_struct in the page table pages
@ 2020-04-28 19:44 Matthew Wilcox
  2020-04-28 19:44 ` [PATCH 1/7] mm: Document x86 uses a linked list of pgds Matthew Wilcox
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Pages which are in use as page tables have some space unused in struct
page.  It would be handy to have a pointer to the struct mm_struct that
they belong to so that we can handle uncorrectable errors in page tables
more gracefully.  There are a few other things we could use it for too,
such as checking that the page table entry actually belongs to the task
we think it ought to.  This patch series does none of that, but does
lay the groundwork for it.
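
As a rough illustration of the "does this page table belong to the mm
we think it does" check mentioned above, here is a minimal userspace
sketch.  The struct pt_page type and pt_page_belongs_to() helper are
hypothetical stand-ins invented for this example; nothing in this
series adds such a helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct mm_struct { int id; };

struct pt_page {
	struct mm_struct *pt_mm;	/* owner recorded for a page table page */
};

/* Return 1 if the page table page is recorded as belonging to @mm. */
int pt_page_belongs_to(const struct pt_page *page, const struct mm_struct *mm)
{
	assert(page != NULL);
	return page->pt_mm == mm;
}
```

A caller walking page tables on behalf of some mm could use a check of
this shape as a debugging assertion.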

Matthew Wilcox (Oracle) (7):
  mm: Document x86 uses a linked list of pgds
  mm: Move pt_mm within struct page
  arm: Thread mm_struct throughout page table allocation
  arm64: Thread mm_struct throughout page table allocation
  m68k: Thread mm_struct throughout page table allocation
  mm: Set pt_mm in PTE constructor
  mm: Set pt_mm in PMD constructor

 arch/arc/include/asm/pgalloc.h           |  2 +-
 arch/arm/mm/mmu.c                        | 66 ++++++++---------
 arch/arm64/include/asm/pgalloc.h         |  2 +-
 arch/arm64/mm/mmu.c                      | 93 ++++++++++++------------
 arch/m68k/include/asm/mcf_pgalloc.h      |  2 +-
 arch/m68k/include/asm/motorola_pgalloc.h | 10 +--
 arch/m68k/mm/motorola.c                  |  4 +-
 arch/openrisc/include/asm/pgalloc.h      |  2 +-
 arch/powerpc/mm/book3s64/pgtable.c       |  2 +-
 arch/powerpc/mm/pgtable-frag.c           |  2 +-
 arch/s390/include/asm/pgalloc.h          |  2 +-
 arch/s390/mm/pgalloc.c                   |  2 +-
 arch/sparc/mm/init_64.c                  |  2 +-
 arch/sparc/mm/srmmu.c                    |  2 +-
 arch/x86/include/asm/pgalloc.h           |  2 +-
 arch/x86/mm/pgtable.c                    |  3 +-
 arch/xtensa/include/asm/pgalloc.h        |  2 +-
 include/asm-generic/pgalloc.h            |  2 +-
 include/linux/mm.h                       | 18 ++++-
 include/linux/mm_types.h                 | 12 +--
 20 files changed, 122 insertions(+), 110 deletions(-)


base-commit: 6a8b55ed4056ea5559ebe4f6a4b247f627870d4c
-- 
2.26.2



* [PATCH 1/7] mm: Document x86 uses a linked list of pgds
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
@ 2020-04-28 19:44 ` Matthew Wilcox
  2020-04-28 21:41   ` Ira Weiny
  2020-04-28 19:44 ` [PATCH 2/7] mm: Move pt_mm within struct page Matthew Wilcox
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

x86 uses page->lru of the pages used for pgds, but that's not immediately
obvious to anyone looking to make changes.  Add a struct list_head to
the union so it's clearly in use for pgds.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
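To make the overlay concrete, here is a small userspace sketch of the
layout this patch documents.  It assumes pgtable_t is pointer-sized
(it is arch-specific in the kernel), and struct pt_words is a
hypothetical cut-down model, not struct page itself.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };
typedef void *pgtable_t;	/* assumption: pointer-sized for this sketch */

/* Hypothetical model of the first words of the page table variant. */
struct pt_words {
	union {
		struct list_head pgd_list;	/* x86 */
		struct {
			unsigned long _pt_pad_1;
			pgtable_t pmd_huge_pte;
		};
	};
	unsigned long _pt_pad_2;
};

/* The list_head and the pad + pmd_huge_pte pair start at the same place. */
_Static_assert(offsetof(struct pt_words, pgd_list) ==
	       offsetof(struct pt_words, _pt_pad_1), "union members overlap");

/* Return 1 if a store to pgd_list.prev is visible through pmd_huge_pte. */
int pmd_huge_pte_aliases_prev(void)
{
	struct pt_words w = { 0 };

	w.pgd_list.prev = &w.pgd_list;
	return w.pmd_huge_pte == (pgtable_t)&w.pgd_list;
}
```

The point is only that a pgd page using the list cannot also use
pmd_huge_pte, which the union now spells out.
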
 include/linux/mm_types.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4aba6c0c2ba8..9bb34e2cd5a5 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -142,8 +142,13 @@ struct page {
 			struct list_head deferred_list;
 		};
 		struct {	/* Page table pages */
-			unsigned long _pt_pad_1;	/* compound_head */
-			pgtable_t pmd_huge_pte; /* protected by page->ptl */
+			union {
+				struct list_head pgd_list;	/* x86 */
+				struct {
+					unsigned long _pt_pad_1;
+					pgtable_t pmd_huge_pte;
+				};
+			};
 			unsigned long _pt_pad_2;	/* mapping */
 			union {
 				struct mm_struct *pt_mm; /* x86 pgds only */
-- 
2.26.2



* [PATCH 2/7] mm: Move pt_mm within struct page
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
  2020-04-28 19:44 ` [PATCH 1/7] mm: Document x86 uses a linked list of pgds Matthew Wilcox
@ 2020-04-28 19:44 ` Matthew Wilcox
  2020-04-29  7:34   ` Geert Uytterhoeven
  2020-04-28 19:44 ` [PATCH 3/7] arm: Thread mm_struct throughout page table allocation Matthew Wilcox
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

Instead of a per-arch word within struct page, use a formerly reserved
word.  This word is shared with page->mapping, so it must be cleared
before the page is freed, as it is checked in free_pages().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
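The aliasing constraint can be modelled in userspace as below.  This
is a sketch under assumptions: struct page_model, page_ok_to_free() and
pgd_page_release() are hypothetical names, and the check is only a
stand-in for the allocator's free-time sanity check.

```c
#include <assert.h>
#include <stddef.h>

struct mm_struct { int id; };

/* Hypothetical model: one word, viewed as either mapping or pt_mm. */
struct page_model {
	union {
		void *mapping;
		struct mm_struct *pt_mm;
	};
};

/* Stand-in for the free-time check: a stale word looks like a mapping. */
int page_ok_to_free(const struct page_model *page)
{
	return page->mapping == NULL;
}

/* Clear the owner so the shared word is NULL before the page is freed. */
void pgd_page_release(struct page_model *page)
{
	page->pt_mm = NULL;
}
```

Without the release step, a page that still carries pt_mm would fail
the modelled check, which is why pgd_list_del() gains the extra store.
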
 arch/x86/mm/pgtable.c    | 1 +
 include/linux/mm_types.h | 7 ++-----
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 7bd2c3a52297..f5f46737aea0 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -95,6 +95,7 @@ static inline void pgd_list_del(pgd_t *pgd)
 	struct page *page = virt_to_page(pgd);
 
 	list_del(&page->lru);
+	page->pt_mm = NULL;
 }
 
 #define UNSHARED_PTRS_PER_PGD				\
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 9bb34e2cd5a5..7efa12f4626f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -149,11 +149,8 @@ struct page {
 					pgtable_t pmd_huge_pte;
 				};
 			};
-			unsigned long _pt_pad_2;	/* mapping */
-			union {
-				struct mm_struct *pt_mm; /* x86 pgds only */
-				atomic_t pt_frag_refcount; /* powerpc */
-			};
+			struct mm_struct *pt_mm;
+			atomic_t pt_frag_refcount; /* powerpc */
 #if ALLOC_SPLIT_PTLOCKS
 			spinlock_t *ptl;
 #else
-- 
2.26.2



* [PATCH 3/7] arm: Thread mm_struct throughout page table allocation
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
  2020-04-28 19:44 ` [PATCH 1/7] mm: Document x86 uses a linked list of pgds Matthew Wilcox
  2020-04-28 19:44 ` [PATCH 2/7] mm: Move pt_mm within struct page Matthew Wilcox
@ 2020-04-28 19:44 ` Matthew Wilcox
  2020-04-28 19:44 ` [PATCH 4/7] arm64: " Matthew Wilcox
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

An upcoming patch will pass mm_struct to the page table constructor.
Make sure ARM has the appropriate mm_struct at the point it needs to
call the constructor.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
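For readers unfamiliar with the typedef style used here, a minimal
userspace sketch of threading an mm through an allocator callback
follows.  pt_alloc_t, counting_alloc() and table_alloc() are
hypothetical names, and calloc() stands in for the kernel allocators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct mm_struct { unsigned long nr_tables; };

/*
 * A function type, not a pointer type.  A parameter declared with it
 * is adjusted to a pointer to function, as with arm_pt_alloc_t.
 */
typedef void *(pt_alloc_t)(unsigned long size, struct mm_struct *mm);

/* Example allocator: zeroed memory, and count the table against @mm. */
void *counting_alloc(unsigned long size, struct mm_struct *mm)
{
	void *ptr = calloc(1, size);

	if (ptr)
		mm->nr_tables++;
	return ptr;
}

/* The mm is simply handed down to whichever allocator was chosen. */
void *table_alloc(struct mm_struct *mm, unsigned long size, pt_alloc_t alloc)
{
	return alloc(size, mm);
}
```

In the patch itself early_alloc() and late_alloc() accept the mm but
do not use it yet; a later patch in the series passes it on to the
constructor.
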
 arch/arm/mm/mmu.c | 64 +++++++++++++++++++++++------------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index ec8d0008bfa1..e5275bfbe695 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -690,7 +690,9 @@ EXPORT_SYMBOL(phys_mem_access_prot);
 
 #define vectors_base()	(vectors_high() ? 0xffff0000 : 0)
 
-static void __init *early_alloc(unsigned long sz)
+typedef void *(arm_pt_alloc_t)(unsigned long size, struct mm_struct *);
+
+static void __init *early_alloc(unsigned long sz, struct mm_struct *mm)
 {
 	void *ptr = memblock_alloc(sz, sz);
 
@@ -701,7 +703,7 @@ static void __init *early_alloc(unsigned long sz)
 	return ptr;
 }
 
-static void *__init late_alloc(unsigned long sz)
+static void *__init late_alloc(unsigned long sz, struct mm_struct *mm)
 {
 	void *ptr = (void *)__get_free_pages(GFP_PGTABLE_KERNEL, get_order(sz));
 
@@ -710,31 +712,30 @@ static void *__init late_alloc(unsigned long sz)
 	return ptr;
 }
 
-static pte_t * __init arm_pte_alloc(pmd_t *pmd, unsigned long addr,
-				unsigned long prot,
-				void *(*alloc)(unsigned long sz))
+static pte_t * __init arm_pte_alloc(struct mm_struct *mm, pmd_t *pmd,
+				unsigned long addr, unsigned long prot,
+				arm_pt_alloc_t alloc)
 {
 	if (pmd_none(*pmd)) {
-		pte_t *pte = alloc(PTE_HWTABLE_OFF + PTE_HWTABLE_SIZE);
+		pte_t *pte = alloc(PTE_HWTABLE_OFF + PTE_HWTABLE_SIZE, mm);
 		__pmd_populate(pmd, __pa(pte), prot);
 	}
 	BUG_ON(pmd_bad(*pmd));
 	return pte_offset_kernel(pmd, addr);
 }
 
-static pte_t * __init early_pte_alloc(pmd_t *pmd, unsigned long addr,
-				      unsigned long prot)
+static pte_t * __init early_pte_alloc(struct mm_struct *mm, pmd_t *pmd,
+					unsigned long addr, unsigned long prot)
 {
-	return arm_pte_alloc(pmd, addr, prot, early_alloc);
+	return arm_pte_alloc(mm, pmd, addr, prot, early_alloc);
 }
 
-static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
-				  unsigned long end, unsigned long pfn,
-				  const struct mem_type *type,
-				  void *(*alloc)(unsigned long sz),
-				  bool ng)
+static void __init alloc_init_pte(struct mm_struct *mm, pmd_t *pmd,
+				unsigned long addr, unsigned long end,
+				unsigned long pfn, const struct mem_type *type,
+				arm_pt_alloc_t alloc, bool ng)
 {
-	pte_t *pte = arm_pte_alloc(pmd, addr, type->prot_l1, alloc);
+	pte_t *pte = arm_pte_alloc(mm, pmd, addr, type->prot_l1, alloc);
 	do {
 		set_pte_ext(pte, pfn_pte(pfn, __pgprot(type->prot_pte)),
 			    ng ? PTE_EXT_NG : 0);
@@ -769,10 +770,10 @@ static void __init __map_init_section(pmd_t *pmd, unsigned long addr,
 	flush_pmd_entry(p);
 }
 
-static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
-				      unsigned long end, phys_addr_t phys,
-				      const struct mem_type *type,
-				      void *(*alloc)(unsigned long sz), bool ng)
+static void __init alloc_init_pmd(struct mm_struct *mm, pud_t *pud,
+				unsigned long addr, unsigned long end,
+				phys_addr_t phys, const struct mem_type *type,
+				arm_pt_alloc_t alloc, bool ng)
 {
 	pmd_t *pmd = pmd_offset(pud, addr);
 	unsigned long next;
@@ -792,7 +793,7 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 				((addr | next | phys) & ~SECTION_MASK) == 0) {
 			__map_init_section(pmd, addr, next, phys, type, ng);
 		} else {
-			alloc_init_pte(pmd, addr, next,
+			alloc_init_pte(mm, pmd, addr, next,
 				       __phys_to_pfn(phys), type, alloc, ng);
 		}
 
@@ -801,17 +802,17 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
 	} while (pmd++, addr = next, addr != end);
 }
 
-static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
-				  unsigned long end, phys_addr_t phys,
-				  const struct mem_type *type,
-				  void *(*alloc)(unsigned long sz), bool ng)
+static void __init alloc_init_pud(struct mm_struct *mm, pgd_t *pgd,
+				unsigned long addr, unsigned long end,
+				phys_addr_t phys, const struct mem_type *type,
+				arm_pt_alloc_t alloc, bool ng)
 {
 	pud_t *pud = pud_offset(pgd, addr);
 	unsigned long next;
 
 	do {
 		next = pud_addr_end(addr, end);
-		alloc_init_pmd(pud, addr, next, phys, type, alloc, ng);
+		alloc_init_pmd(mm, pud, addr, next, phys, type, alloc, ng);
 		phys += next - addr;
 	} while (pud++, addr = next, addr != end);
 }
@@ -879,8 +880,7 @@ static void __init create_36bit_mapping(struct mm_struct *mm,
 #endif	/* !CONFIG_ARM_LPAE */
 
 static void __init __create_mapping(struct mm_struct *mm, struct map_desc *md,
-				    void *(*alloc)(unsigned long sz),
-				    bool ng)
+					arm_pt_alloc_t alloc, bool ng)
 {
 	unsigned long addr, length, end;
 	phys_addr_t phys;
@@ -914,7 +914,7 @@ static void __init __create_mapping(struct mm_struct *mm, struct map_desc *md,
 	do {
 		unsigned long next = pgd_addr_end(addr, end);
 
-		alloc_init_pud(pgd, addr, next, phys, type, alloc, ng);
+		alloc_init_pud(mm, pgd, addr, next, phys, type, alloc, ng);
 
 		phys += next - addr;
 		addr = next;
@@ -1316,7 +1316,7 @@ static void __init devicemaps_init(const struct machine_desc *mdesc)
 	/*
 	 * Allocate the vector page early.
 	 */
-	vectors = early_alloc(PAGE_SIZE * 2);
+	vectors = early_alloc(PAGE_SIZE * 2, &init_mm);
 
 	early_trap_init(vectors);
 
@@ -1413,11 +1413,11 @@ static void __init devicemaps_init(const struct machine_desc *mdesc)
 static void __init kmap_init(void)
 {
 #ifdef CONFIG_HIGHMEM
-	pkmap_page_table = early_pte_alloc(pmd_off_k(PKMAP_BASE),
+	pkmap_page_table = early_pte_alloc(&init_mm, pmd_off_k(PKMAP_BASE),
 		PKMAP_BASE, _PAGE_KERNEL_TABLE);
 #endif
 
-	early_pte_alloc(pmd_off_k(FIXADDR_START), FIXADDR_START,
+	early_pte_alloc(&init_mm, pmd_off_k(FIXADDR_START), FIXADDR_START,
 			_PAGE_KERNEL_TABLE);
 }
 
@@ -1630,7 +1630,7 @@ void __init paging_init(const struct machine_desc *mdesc)
 	top_pmd = pmd_off_k(0xffff0000);
 
 	/* allocate the zero page. */
-	zero_page = early_alloc(PAGE_SIZE);
+	zero_page = early_alloc(PAGE_SIZE, &init_mm);
 
 	bootmem_init();
 
-- 
2.26.2



* [PATCH 4/7] arm64: Thread mm_struct throughout page table allocation
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
                   ` (2 preceding siblings ...)
  2020-04-28 19:44 ` [PATCH 3/7] arm: Thread mm_struct throughout page table allocation Matthew Wilcox
@ 2020-04-28 19:44 ` Matthew Wilcox
  2020-04-29  9:58   ` Mark Rutland
  2020-04-28 19:44 ` [PATCH 5/7] m68k: " Matthew Wilcox
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

An upcoming patch will pass mm_struct to the page table constructor.
Make sure arm64 has the appropriate mm_struct at the point it needs to
call the constructor.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/arm64/mm/mmu.c | 89 ++++++++++++++++++++++-----------------------
 1 file changed, 43 insertions(+), 46 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a374e4f51a62..69ecc83c3be0 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -88,7 +88,9 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 }
 EXPORT_SYMBOL(phys_mem_access_prot);
 
-static phys_addr_t __init early_pgtable_alloc(int shift)
+typedef phys_addr_t (arm_pt_alloc_t)(int size, struct mm_struct *);
+
+static phys_addr_t __init early_pgtable_alloc(int shift, struct mm_struct *mm)
 {
 	phys_addr_t phys;
 	void *ptr;
@@ -162,11 +164,9 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
 	pte_clear_fixmap();
 }
 
-static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
-				unsigned long end, phys_addr_t phys,
-				pgprot_t prot,
-				phys_addr_t (*pgtable_alloc)(int),
-				int flags)
+static void alloc_init_cont_pte(struct mm_struct *mm, pmd_t *pmdp,
+		unsigned long addr, unsigned long end, phys_addr_t phys,
+		pgprot_t prot, arm_pt_alloc_t pgtable_alloc, int flags)
 {
 	unsigned long next;
 	pmd_t pmd = READ_ONCE(*pmdp);
@@ -175,7 +175,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
 	if (pmd_none(pmd)) {
 		phys_addr_t pte_phys;
 		BUG_ON(!pgtable_alloc);
-		pte_phys = pgtable_alloc(PAGE_SHIFT);
+		pte_phys = pgtable_alloc(PAGE_SHIFT, mm);
 		__pmd_populate(pmdp, pte_phys, PMD_TYPE_TABLE);
 		pmd = READ_ONCE(*pmdp);
 	}
@@ -197,9 +197,9 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
 	} while (addr = next, addr != end);
 }
 
-static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
-		     phys_addr_t phys, pgprot_t prot,
-		     phys_addr_t (*pgtable_alloc)(int), int flags)
+static void init_pmd(struct mm_struct *mm, pud_t *pudp, unsigned long addr,
+		unsigned long end, phys_addr_t phys, pgprot_t prot,
+		arm_pt_alloc_t pgtable_alloc, int flags)
 {
 	unsigned long next;
 	pmd_t *pmdp;
@@ -222,7 +222,7 @@ static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
 			BUG_ON(!pgattr_change_is_safe(pmd_val(old_pmd),
 						      READ_ONCE(pmd_val(*pmdp))));
 		} else {
-			alloc_init_cont_pte(pmdp, addr, next, phys, prot,
+			alloc_init_cont_pte(mm, pmdp, addr, next, phys, prot,
 					    pgtable_alloc, flags);
 
 			BUG_ON(pmd_val(old_pmd) != 0 &&
@@ -234,10 +234,9 @@ static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
 	pmd_clear_fixmap();
 }
 
-static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
-				unsigned long end, phys_addr_t phys,
-				pgprot_t prot,
-				phys_addr_t (*pgtable_alloc)(int), int flags)
+static void alloc_init_cont_pmd(struct mm_struct *mm, pud_t *pudp,
+		unsigned long addr, unsigned long end, phys_addr_t phys,
+		pgprot_t prot, arm_pt_alloc_t pgtable_alloc, int flags)
 {
 	unsigned long next;
 	pud_t pud = READ_ONCE(*pudp);
@@ -249,7 +248,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
 	if (pud_none(pud)) {
 		phys_addr_t pmd_phys;
 		BUG_ON(!pgtable_alloc);
-		pmd_phys = pgtable_alloc(PMD_SHIFT);
+		pmd_phys = pgtable_alloc(PMD_SHIFT, mm);
 		__pud_populate(pudp, pmd_phys, PUD_TYPE_TABLE);
 		pud = READ_ONCE(*pudp);
 	}
@@ -265,7 +264,8 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
 		    (flags & NO_CONT_MAPPINGS) == 0)
 			__prot = __pgprot(pgprot_val(prot) | PTE_CONT);
 
-		init_pmd(pudp, addr, next, phys, __prot, pgtable_alloc, flags);
+		init_pmd(mm, pudp, addr, next, phys, __prot, pgtable_alloc,
+				flags);
 
 		phys += next - addr;
 	} while (addr = next, addr != end);
@@ -283,10 +283,9 @@ static inline bool use_1G_block(unsigned long addr, unsigned long next,
 	return true;
 }
 
-static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
-			   phys_addr_t phys, pgprot_t prot,
-			   phys_addr_t (*pgtable_alloc)(int),
-			   int flags)
+static void alloc_init_pud(struct mm_struct *mm, pgd_t *pgdp,
+		unsigned long addr, unsigned long end, phys_addr_t phys,
+		pgprot_t prot, arm_pt_alloc_t pgtable_alloc, int flags)
 {
 	unsigned long next;
 	pud_t *pudp;
@@ -295,7 +294,7 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
 	if (pgd_none(pgd)) {
 		phys_addr_t pud_phys;
 		BUG_ON(!pgtable_alloc);
-		pud_phys = pgtable_alloc(PUD_SHIFT);
+		pud_phys = pgtable_alloc(PUD_SHIFT, mm);
 		__pgd_populate(pgdp, pud_phys, PUD_TYPE_TABLE);
 		pgd = READ_ONCE(*pgdp);
 	}
@@ -321,7 +320,7 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
 			BUG_ON(!pgattr_change_is_safe(pud_val(old_pud),
 						      READ_ONCE(pud_val(*pudp))));
 		} else {
-			alloc_init_cont_pmd(pudp, addr, next, phys, prot,
+			alloc_init_cont_pmd(mm, pudp, addr, next, phys, prot,
 					    pgtable_alloc, flags);
 
 			BUG_ON(pud_val(old_pud) != 0 &&
@@ -333,11 +332,9 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
 	pud_clear_fixmap();
 }
 
-static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
-				 unsigned long virt, phys_addr_t size,
-				 pgprot_t prot,
-				 phys_addr_t (*pgtable_alloc)(int),
-				 int flags)
+static void __create_pgd_mapping(struct mm_struct *mm, pgd_t *pgdir,
+		phys_addr_t phys, unsigned long virt, phys_addr_t size,
+		pgprot_t prot, arm_pt_alloc_t pgtable_alloc, int flags)
 {
 	unsigned long addr, end, next;
 	pgd_t *pgdp = pgd_offset_raw(pgdir, virt);
@@ -355,13 +352,13 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
 
 	do {
 		next = pgd_addr_end(addr, end);
-		alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc,
+		alloc_init_pud(mm, pgdp, addr, next, phys, prot, pgtable_alloc,
 			       flags);
 		phys += next - addr;
 	} while (pgdp++, addr = next, addr != end);
 }
 
-static phys_addr_t __pgd_pgtable_alloc(int shift)
+static phys_addr_t __pgd_pgtable_alloc(int shift, struct mm_struct *mm)
 {
 	void *ptr = (void *)__get_free_page(GFP_PGTABLE_KERNEL);
 	BUG_ON(!ptr);
@@ -371,9 +368,9 @@ static phys_addr_t __pgd_pgtable_alloc(int shift)
 	return __pa(ptr);
 }
 
-static phys_addr_t pgd_pgtable_alloc(int shift)
+static phys_addr_t pgd_pgtable_alloc(int shift, struct mm_struct *mm)
 {
-	phys_addr_t pa = __pgd_pgtable_alloc(shift);
+	phys_addr_t pa = __pgd_pgtable_alloc(shift, mm);
 
 	/*
 	 * Call proper page table ctor in case later we need to
@@ -404,8 +401,8 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
 			&phys, virt);
 		return;
 	}
-	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
-			     NO_CONT_MAPPINGS);
+	__create_pgd_mapping(&init_mm, init_mm.pgd, phys, virt, size, prot,
+			NULL, NO_CONT_MAPPINGS);
 }
 
 void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
@@ -419,7 +416,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 	if (page_mappings_only)
 		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
-	__create_pgd_mapping(mm->pgd, phys, virt, size, prot,
+	__create_pgd_mapping(mm, mm->pgd, phys, virt, size, prot,
 			     pgd_pgtable_alloc, flags);
 }
 
@@ -432,8 +429,8 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
 		return;
 	}
 
-	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
-			     NO_CONT_MAPPINGS);
+	__create_pgd_mapping(&init_mm, init_mm.pgd, phys, virt, size, prot,
+			NULL, NO_CONT_MAPPINGS);
 
 	/* flush the TLBs after updating live kernel mappings */
 	flush_tlb_kernel_range(virt, virt + size);
@@ -442,8 +439,8 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
 static void __init __map_memblock(pgd_t *pgdp, phys_addr_t start,
 				  phys_addr_t end, pgprot_t prot, int flags)
 {
-	__create_pgd_mapping(pgdp, start, __phys_to_virt(start), end - start,
-			     prot, early_pgtable_alloc, flags);
+	__create_pgd_mapping(&init_mm, pgdp, start, __phys_to_virt(start),
+			end - start, prot, early_pgtable_alloc, flags);
 }
 
 void __init mark_linear_text_alias_ro(void)
@@ -547,8 +544,8 @@ static void __init map_kernel_segment(pgd_t *pgdp, void *va_start, void *va_end,
 	BUG_ON(!PAGE_ALIGNED(pa_start));
 	BUG_ON(!PAGE_ALIGNED(size));
 
-	__create_pgd_mapping(pgdp, pa_start, (unsigned long)va_start, size, prot,
-			     early_pgtable_alloc, flags);
+	__create_pgd_mapping(&init_mm, pgdp, pa_start, (unsigned long)va_start,
+			size, prot, early_pgtable_alloc, flags);
 
 	if (!(vm_flags & VM_NO_GUARD))
 		size += PAGE_SIZE;
@@ -591,8 +588,8 @@ static int __init map_entry_trampoline(void)
 
 	/* Map only the text into the trampoline page table */
 	memset(tramp_pg_dir, 0, PGD_SIZE);
-	__create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
-			     prot, __pgd_pgtable_alloc, 0);
+	__create_pgd_mapping(&init_mm, tramp_pg_dir, pa_start, TRAMP_VALIAS,
+			PAGE_SIZE, prot, __pgd_pgtable_alloc, 0);
 
 	/* Map both the text and data into the kernel page table */
 	__set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
@@ -1381,9 +1378,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	if (rodata_full || debug_pagealloc_enabled())
 		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
-	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
-			     size, params->pgprot, __pgd_pgtable_alloc,
-			     flags);
+	__create_pgd_mapping(&init_mm, swapper_pg_dir, start,
+			__phys_to_virt(start), size, params->pgprot,
+			__pgd_pgtable_alloc, flags);
 
 	memblock_clear_nomap(start, size);
 
-- 
2.26.2



* [PATCH 5/7] m68k: Thread mm_struct throughout page table allocation
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
                   ` (3 preceding siblings ...)
  2020-04-28 19:44 ` [PATCH 4/7] arm64: " Matthew Wilcox
@ 2020-04-28 19:44 ` Matthew Wilcox
  2020-04-29  7:44   ` Geert Uytterhoeven
  2020-04-28 19:44 ` [PATCH 6/7] mm: Set pt_mm in PTE constructor Matthew Wilcox
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

An upcoming patch will pass mm_struct to the page table constructor.
Make sure m68k has the appropriate mm_struct at the point it needs to
call the constructor.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/m68k/include/asm/motorola_pgalloc.h | 10 +++++-----
 arch/m68k/mm/motorola.c                  |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/m68k/include/asm/motorola_pgalloc.h b/arch/m68k/include/asm/motorola_pgalloc.h
index c66e42917912..dbac0c597397 100644
--- a/arch/m68k/include/asm/motorola_pgalloc.h
+++ b/arch/m68k/include/asm/motorola_pgalloc.h
@@ -15,12 +15,12 @@ enum m68k_table_types {
 };
 
 extern void init_pointer_table(void *table, int type);
-extern void *get_pointer_table(int type);
+extern void *get_pointer_table(int type, struct mm_struct *mm);
 extern int free_pointer_table(void *table, int type);
 
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
 {
-	return get_pointer_table(TABLE_PTE);
+	return get_pointer_table(TABLE_PTE, mm);
 }
 
 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
@@ -30,7 +30,7 @@ static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
 
 static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
 {
-	return get_pointer_table(TABLE_PTE);
+	return get_pointer_table(TABLE_PTE, mm);
 }
 
 static inline void pte_free(struct mm_struct *mm, pgtable_t pgtable)
@@ -47,7 +47,7 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pgtable,
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
 {
-	return get_pointer_table(TABLE_PMD);
+	return get_pointer_table(TABLE_PMD, mm);
 }
 
 static inline int pmd_free(struct mm_struct *mm, pmd_t *pmd)
@@ -69,7 +69,7 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 
 static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	return get_pointer_table(TABLE_PGD);
+	return get_pointer_table(TABLE_PGD, mm);
 }
 
 
diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index fc16190ec2d6..7743480be0cf 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -113,7 +113,7 @@ void __init init_pointer_table(void *table, int type)
 	return;
 }
 
-void *get_pointer_table(int type)
+void *get_pointer_table(int type, struct mm_struct *mm)
 {
 	ptable_desc *dp = ptable_list[type].next;
 	unsigned int mask = list_empty(&ptable_list[type]) ? 0 : PD_MARKBITS(dp);
-- 
2.26.2



* [PATCH 6/7] mm: Set pt_mm in PTE constructor
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
                   ` (4 preceding siblings ...)
  2020-04-28 19:44 ` [PATCH 5/7] m68k: " Matthew Wilcox
@ 2020-04-28 19:44 ` Matthew Wilcox
  2020-04-29  7:46   ` Geert Uytterhoeven
  2020-04-28 19:44 ` [PATCH 7/7] mm: Set pt_mm in PMD constructor Matthew Wilcox
  2020-04-29  0:26 ` [PATCH 0/7] Record the mm_struct in the page table pages Kirill A. Shutemov
  7 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

By setting pt_mm for pages in use as page tables, we can help with
debugging and lay the foundation for handling hardware errors in page
tables more gracefully.  It also opens up the possibility for adding
more sanity checks in the future.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
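The calling convention the arch changes below rely on can be sketched
in userspace as follows.  This is a model under assumptions, not the
kernel's pgtable_pte_page_ctor(): struct page_model and
pte_page_ctor_model() are hypothetical, and the lock_init_ok flag
stands in for whatever setup step may fail in the real constructor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_struct { int id; };

/* Hypothetical model of the fields the constructor touches. */
struct page_model {
	struct mm_struct *pt_mm;
	bool lock_ready;
};

/*
 * Modelled constructor: fail without recording anything if lock setup
 * fails, otherwise mark the lock ready and record the owning mm.
 */
bool pte_page_ctor_model(struct page_model *page, struct mm_struct *mm,
			 bool lock_init_ok)
{
	if (!lock_init_ok)
		return false;
	page->lock_ready = true;
	page->pt_mm = mm;
	return true;
}
```

Callers keep their existing pattern: on a false return they free the
page and report allocation failure, so only the extra argument changes.
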
 arch/arc/include/asm/pgalloc.h      | 2 +-
 arch/arm/mm/mmu.c                   | 2 +-
 arch/arm64/mm/mmu.c                 | 2 +-
 arch/m68k/include/asm/mcf_pgalloc.h | 2 +-
 arch/m68k/mm/motorola.c             | 2 +-
 arch/openrisc/include/asm/pgalloc.h | 2 +-
 arch/powerpc/mm/pgtable-frag.c      | 2 +-
 arch/s390/mm/pgalloc.c              | 2 +-
 arch/sparc/mm/init_64.c             | 2 +-
 arch/sparc/mm/srmmu.c               | 2 +-
 arch/xtensa/include/asm/pgalloc.h   | 2 +-
 include/asm-generic/pgalloc.h       | 2 +-
 include/linux/mm.h                  | 5 ++++-
 13 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/arch/arc/include/asm/pgalloc.h b/arch/arc/include/asm/pgalloc.h
index b747f2ec2928..5f6b1f3bc2a2 100644
--- a/arch/arc/include/asm/pgalloc.h
+++ b/arch/arc/include/asm/pgalloc.h
@@ -108,7 +108,7 @@ pte_alloc_one(struct mm_struct *mm)
 		return 0;
 	memzero((void *)pte_pg, PTRS_PER_PTE * sizeof(pte_t));
 	page = virt_to_page(pte_pg);
-	if (!pgtable_pte_page_ctor(page)) {
+	if (!pgtable_pte_page_ctor(page, mm)) {
 		__free_page(page);
 		return 0;
 	}
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index e5275bfbe695..9c16c45570ba 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -707,7 +707,7 @@ static void *__init late_alloc(unsigned long sz, struct mm_struct *mm)
 {
 	void *ptr = (void *)__get_free_pages(GFP_PGTABLE_KERNEL, get_order(sz));
 
-	if (!ptr || !pgtable_pte_page_ctor(virt_to_page(ptr)))
+	if (!ptr || !pgtable_pte_page_ctor(virt_to_page(ptr), mm))
 		BUG();
 	return ptr;
 }
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 69ecc83c3be0..c706bed1e496 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -381,7 +381,7 @@ static phys_addr_t pgd_pgtable_alloc(int shift, struct mm_struct *mm)
 	 * folded, and if so pgtable_pmd_page_ctor() becomes nop.
 	 */
 	if (shift == PAGE_SHIFT)
-		BUG_ON(!pgtable_pte_page_ctor(phys_to_page(pa)));
+		BUG_ON(!pgtable_pte_page_ctor(phys_to_page(pa), mm));
 	else if (shift == PMD_SHIFT)
 		BUG_ON(!pgtable_pmd_page_ctor(phys_to_page(pa)));
 
diff --git a/arch/m68k/include/asm/mcf_pgalloc.h b/arch/m68k/include/asm/mcf_pgalloc.h
index bc1228e00518..369a3523e834 100644
--- a/arch/m68k/include/asm/mcf_pgalloc.h
+++ b/arch/m68k/include/asm/mcf_pgalloc.h
@@ -50,7 +50,7 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
 
 	if (!page)
 		return NULL;
-	if (!pgtable_pte_page_ctor(page)) {
+	if (!pgtable_pte_page_ctor(page, mm)) {
 		__free_page(page);
 		return NULL;
 	}
diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index 7743480be0cf..6bb7c9f348ad 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -137,7 +137,7 @@ void *get_pointer_table(int type, struct mm_struct *mm)
 			 * m68k doesn't have SPLIT_PTE_PTLOCKS for not having
 			 * SMP.
 			 */
-			pgtable_pte_page_ctor(virt_to_page(page));
+			pgtable_pte_page_ctor(virt_to_page(page), mm);
 		}
 
 		mmu_page_ctor(page);
diff --git a/arch/openrisc/include/asm/pgalloc.h b/arch/openrisc/include/asm/pgalloc.h
index da12a4c38c4b..1a80dfc928b5 100644
--- a/arch/openrisc/include/asm/pgalloc.h
+++ b/arch/openrisc/include/asm/pgalloc.h
@@ -75,7 +75,7 @@ static inline struct page *pte_alloc_one(struct mm_struct *mm)
 	if (!pte)
 		return NULL;
 	clear_page(page_address(pte));
-	if (!pgtable_pte_page_ctor(pte)) {
+	if (!pgtable_pte_page_ctor(pte, mm)) {
 		__free_page(pte);
 		return NULL;
 	}
diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
index ee4bd6d38602..59a8c85e01ac 100644
--- a/arch/powerpc/mm/pgtable-frag.c
+++ b/arch/powerpc/mm/pgtable-frag.c
@@ -61,7 +61,7 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel)
 		page = alloc_page(PGALLOC_GFP | __GFP_ACCOUNT);
 		if (!page)
 			return NULL;
-		if (!pgtable_pte_page_ctor(page)) {
+		if (!pgtable_pte_page_ctor(page, mm)) {
 			__free_page(page);
 			return NULL;
 		}
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
index 498c98a312f4..0363828749e2 100644
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -208,7 +208,7 @@ unsigned long *page_table_alloc(struct mm_struct *mm)
 	page = alloc_page(GFP_KERNEL);
 	if (!page)
 		return NULL;
-	if (!pgtable_pte_page_ctor(page)) {
+	if (!pgtable_pte_page_ctor(page, mm)) {
 		__free_page(page);
 		return NULL;
 	}
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 1cf0d666dea3..d2cc80828415 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2928,7 +2928,7 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
 	struct page *page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 	if (!page)
 		return NULL;
-	if (!pgtable_pte_page_ctor(page)) {
+	if (!pgtable_pte_page_ctor(page, mm)) {
 		free_unref_page(page);
 		return NULL;
 	}
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index b7c94de70cca..019ff2019b55 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -382,7 +382,7 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
 	if ((pte = (unsigned long)pte_alloc_one_kernel(mm)) == 0)
 		return NULL;
 	page = pfn_to_page(__nocache_pa(pte) >> PAGE_SHIFT);
-	if (!pgtable_pte_page_ctor(page)) {
+	if (!pgtable_pte_page_ctor(page, mm)) {
 		__free_page(page);
 		return NULL;
 	}
diff --git a/arch/xtensa/include/asm/pgalloc.h b/arch/xtensa/include/asm/pgalloc.h
index 1d38f0e755ba..43cc05255832 100644
--- a/arch/xtensa/include/asm/pgalloc.h
+++ b/arch/xtensa/include/asm/pgalloc.h
@@ -55,7 +55,7 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm)
 	if (!pte)
 		return NULL;
 	page = virt_to_page(pte);
-	if (!pgtable_pte_page_ctor(page)) {
+	if (!pgtable_pte_page_ctor(page, mm)) {
 		__free_page(page);
 		return NULL;
 	}
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 73f7421413cb..24c2d6e194fb 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -63,7 +63,7 @@ static inline pgtable_t __pte_alloc_one(struct mm_struct *mm, gfp_t gfp)
 	pte = alloc_page(gfp);
 	if (!pte)
 		return NULL;
-	if (!pgtable_pte_page_ctor(pte)) {
+	if (!pgtable_pte_page_ctor(pte, mm)) {
 		__free_page(pte);
 		return NULL;
 	}
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5a323422d783..2a98eebeba91 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2157,11 +2157,13 @@ static inline void pgtable_init(void)
 	pgtable_cache_init();
 }
 
-static inline bool pgtable_pte_page_ctor(struct page *page)
+static inline
+bool pgtable_pte_page_ctor(struct page *page, struct mm_struct *mm)
 {
 	if (!ptlock_init(page))
 		return false;
 	__SetPageTable(page);
+	page->pt_mm = mm;
 	inc_zone_page_state(page, NR_PAGETABLE);
 	return true;
 }
@@ -2170,6 +2172,7 @@ static inline void pgtable_pte_page_dtor(struct page *page)
 {
 	ptlock_free(page);
 	__ClearPageTable(page);
+	page->pt_mm = NULL;
 	dec_zone_page_state(page, NR_PAGETABLE);
 }
 
-- 
2.26.2



* [PATCH 7/7] mm: Set pt_mm in PMD constructor
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
                   ` (5 preceding siblings ...)
  2020-04-28 19:44 ` [PATCH 6/7] mm: Set pt_mm in PTE constructor Matthew Wilcox
@ 2020-04-28 19:44 ` Matthew Wilcox
  2020-04-29  0:52   ` Kirill A. Shutemov
  2020-04-29  0:26 ` [PATCH 0/7] Record the mm_struct in the page table pages Kirill A. Shutemov
  7 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 19:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Matthew Wilcox (Oracle),
	linux-kernel, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Russell King, Geert Uytterhoeven, linux-m68k

From: "Matthew Wilcox (Oracle)" <willy@infradead.org>

By setting pt_mm for pages in use as page tables, we can help with
debugging and lay the foundation for handling hardware errors in page
tables more gracefully.  It also opens up the possibility for adding
more sanity checks in the future.

Also set and clear the PageTable bit so that we know these are page tables.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 arch/arm64/include/asm/pgalloc.h   |  2 +-
 arch/arm64/mm/mmu.c                |  2 +-
 arch/powerpc/mm/book3s64/pgtable.c |  2 +-
 arch/s390/include/asm/pgalloc.h    |  2 +-
 arch/x86/include/asm/pgalloc.h     |  2 +-
 arch/x86/mm/pgtable.c              |  2 +-
 include/linux/mm.h                 | 13 +++++++++++--
 7 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 172d76fa0245..920da9c5786c 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -30,7 +30,7 @@ static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 	page = alloc_page(gfp);
 	if (!page)
 		return NULL;
-	if (!pgtable_pmd_page_ctor(page)) {
+	if (!pgtable_pmd_page_ctor(page, mm)) {
 		__free_page(page);
 		return NULL;
 	}
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c706bed1e496..b7bdde1990be 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -383,7 +383,7 @@ static phys_addr_t pgd_pgtable_alloc(int shift, struct mm_struct *mm)
 	if (shift == PAGE_SHIFT)
 		BUG_ON(!pgtable_pte_page_ctor(phys_to_page(pa), mm));
 	else if (shift == PMD_SHIFT)
-		BUG_ON(!pgtable_pmd_page_ctor(phys_to_page(pa)));
+		BUG_ON(!pgtable_pmd_page_ctor(phys_to_page(pa), mm));
 
 	return pa;
 }
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index e0bb69c616e4..9fda5287c197 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -297,7 +297,7 @@ static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm)
 	page = alloc_page(gfp);
 	if (!page)
 		return NULL;
-	if (!pgtable_pmd_page_ctor(page)) {
+	if (!pgtable_pmd_page_ctor(page, mm)) {
 		__free_pages(page, 0);
 		return NULL;
 	}
diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index 74a352f8c0d1..bebad4e5d42a 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -86,7 +86,7 @@ static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long vmaddr)
 	if (!table)
 		return NULL;
 	crst_table_init(table, _SEGMENT_ENTRY_EMPTY);
-	if (!pgtable_pmd_page_ctor(virt_to_page(table))) {
+	if (!pgtable_pmd_page_ctor(virt_to_page(table), mm)) {
 		crst_table_free(mm, table);
 		return NULL;
 	}
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index 29aa7859bdee..33514f0a9e79 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -96,7 +96,7 @@ static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 	page = alloc_pages(gfp, 0);
 	if (!page)
 		return NULL;
-	if (!pgtable_pmd_page_ctor(page)) {
+	if (!pgtable_pmd_page_ctor(page, mm)) {
 		__free_pages(page, 0);
 		return NULL;
 	}
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index f5f46737aea0..8f4255662c5a 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -229,7 +229,7 @@ static int preallocate_pmds(struct mm_struct *mm, pmd_t *pmds[], int count)
 		pmd_t *pmd = (pmd_t *)__get_free_page(gfp);
 		if (!pmd)
 			failed = true;
-		if (pmd && !pgtable_pmd_page_ctor(virt_to_page(pmd))) {
+		if (pmd && !pgtable_pmd_page_ctor(virt_to_page(pmd), mm)) {
 			free_page((unsigned long)pmd);
 			pmd = NULL;
 			failed = true;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2a98eebeba91..e2924d900fc5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2216,11 +2216,14 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
 	return ptlock_ptr(pmd_to_page(pmd));
 }
 
-static inline bool pgtable_pmd_page_ctor(struct page *page)
+static inline
+bool pgtable_pmd_page_ctor(struct page *page, struct mm_struct *mm)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	page->pmd_huge_pte = NULL;
 #endif
+	__SetPageTable(page);
+	page->pt_mm = mm;
 	return ptlock_init(page);
 }
 
@@ -2229,6 +2232,8 @@ static inline void pgtable_pmd_page_dtor(struct page *page)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	VM_BUG_ON_PAGE(page->pmd_huge_pte, page);
 #endif
+	__ClearPageTable(page);
+	page->pt_mm = NULL;
 	ptlock_free(page);
 }
 
@@ -2241,7 +2246,11 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
 	return &mm->page_table_lock;
 }
 
-static inline bool pgtable_pmd_page_ctor(struct page *page) { return true; }
+static inline
+bool pgtable_pmd_page_ctor(struct page *page, struct mm_struct *mm)
+{
+	return true;
+}
 static inline void pgtable_pmd_page_dtor(struct page *page) {}
 
 #define pmd_huge_pte(mm, pmd) ((mm)->pmd_huge_pte)
-- 
2.26.2



* Re: [PATCH 1/7] mm: Document x86 uses a linked list of pgds
  2020-04-28 19:44 ` [PATCH 1/7] mm: Document x86 uses a linked list of pgds Matthew Wilcox
@ 2020-04-28 21:41   ` Ira Weiny
  2020-04-28 22:52     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: Ira Weiny @ 2020-04-28 21:41 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Will Deacon,
	Catalin Marinas, Russell King, Geert Uytterhoeven, linux-m68k

On Tue, Apr 28, 2020 at 12:44:43PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> x86 uses page->lru of the pages used for pgds, but that's not immediately
> obvious to anyone looking to make changes.  Add a struct list_head to
> the union so it's clearly in use for pgds.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  include/linux/mm_types.h | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 4aba6c0c2ba8..9bb34e2cd5a5 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -142,8 +142,13 @@ struct page {
>  			struct list_head deferred_list;
>  		};
>  		struct {	/* Page table pages */
> -			unsigned long _pt_pad_1;	/* compound_head */
> -			pgtable_t pmd_huge_pte; /* protected by page->ptl */
> +			union {
> +				struct list_head pgd_list;	/* x86 */

Shouldn't pgd_list_{add,del}() use this list head variable instead of lru to
complete the documentation?

Probably the list iteration loops arch/x86/* as well?

Ira

> +				struct {
> +					unsigned long _pt_pad_1;
> +					pgtable_t pmd_huge_pte;
> +				};
> +			};
>  			unsigned long _pt_pad_2;	/* mapping */
>  			union {
>  				struct mm_struct *pt_mm; /* x86 pgds only */
> -- 
> 2.26.2
> 
> 


* Re: [PATCH 1/7] mm: Document x86 uses a linked list of pgds
  2020-04-28 21:41   ` Ira Weiny
@ 2020-04-28 22:52     ` Matthew Wilcox
  2020-04-29 18:29       ` Ira Weiny
  0 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-28 22:52 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Will Deacon,
	Catalin Marinas, Russell King, Geert Uytterhoeven, linux-m68k

On Tue, Apr 28, 2020 at 02:41:09PM -0700, Ira Weiny wrote:
> On Tue, Apr 28, 2020 at 12:44:43PM -0700, Matthew Wilcox wrote:
> > x86 uses page->lru of the pages used for pgds, but that's not immediately
> > obvious to anyone looking to make changes.  Add a struct list_head to
> > the union so it's clearly in use for pgds.
> 
> Shouldn't pgd_list_{add,del}() use this list head variable instead of lru to
> complete the documentation?
> 
> Probably the list iteration loops arch/x86/* as well?

Yes, but I felt that was out of scope for this patchset.  Untangling the
uses of struct page is a long and messy business; if we have to fix
everything at once, we'll never get anywhere.  There's also the slab
users of page->lru instead of page->slab_list.

What I actually want to get to is:

struct page {
	unsigned long flags;
	union {
		struct file_page file;
		struct anon_page anon;
		struct pt_page pt;
		struct slab_page slab;
		struct tail_page tail;
		struct rcu_head rcu;
	};
	union {
		atomic_t _mapcount;
		...
	};
	atomic_t refcount;
	...
};

and then we can refer to page->pt.list and so on.
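
For illustration only, the layout described above might be sketched as
follows.  The contents of struct file_page and struct pt_page are
hypothetical (this message does not define them), and struct page_sketch
is a cut-down stand-in for struct page with most members omitted.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };
struct mm_struct { int id; };		/* stand-in, fields not modelled */

/* Hypothetical per-use views; the real member lists are not given above. */
struct file_page { struct list_head lru; void *mapping; unsigned long index; };
struct pt_page { struct list_head list; struct mm_struct *mm; };

struct page_sketch {
	unsigned long flags;
	union {				/* one view per use of the page */
		struct file_page file;
		struct pt_page pt;
	};
	int refcount;			/* atomic_t in the kernel */
};

/* All views overlay the same storage, so they must start at the same offset. */
static_assert(offsetof(struct page_sketch, file) ==
	      offsetof(struct page_sketch, pt), "views must overlap");

/* Initialise the page-table view: an empty list head plus the owning mm. */
static void pt_page_init(struct page_sketch *page, struct mm_struct *mm)
{
	page->pt.list.next = &page->pt.list;	/* empty list points at itself */
	page->pt.list.prev = &page->pt.list;
	page->pt.mm = mm;
}
```

With named views like these, a user of the page-table fields writes
page->pt.list rather than borrowing page->lru, so the compiler and grep
both show which use of the page is meant.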


* Re: [PATCH 0/7] Record the mm_struct in the page table pages
  2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
                   ` (6 preceding siblings ...)
  2020-04-28 19:44 ` [PATCH 7/7] mm: Set pt_mm in PMD constructor Matthew Wilcox
@ 2020-04-29  0:26 ` Kirill A. Shutemov
  2020-04-29  1:51   ` Matthew Wilcox
  7 siblings, 1 reply; 20+ messages in thread
From: Kirill A. Shutemov @ 2020-04-29  0:26 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Will Deacon,
	Catalin Marinas, Russell King, Geert Uytterhoeven, linux-m68k

On Tue, Apr 28, 2020 at 12:44:42PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> Pages which are in use as page tables have some space unused in struct
> page.  It would be handy to have a pointer to the struct mm_struct that
> they belong to so that we can handle uncorrectable errors in page tables
> more gracefully.  There are a few other things we could use it for too,
> such as checking that the page table entry actually belongs to the task
> we think it ought to.  This patch series does none of that, but does
> lay the groundwork for it.
> 
> Matthew Wilcox (Oracle) (7):

How does it work for kernel side of virtual address space?

And your employer may be interested in semantics around
CONFIG_ARCH_WANT_HUGE_PMD_SHARE :P

-- 
 Kirill A. Shutemov


* Re: [PATCH 7/7] mm: Set pt_mm in PMD constructor
  2020-04-28 19:44 ` [PATCH 7/7] mm: Set pt_mm in PMD constructor Matthew Wilcox
@ 2020-04-29  0:52   ` Kirill A. Shutemov
  0 siblings, 0 replies; 20+ messages in thread
From: Kirill A. Shutemov @ 2020-04-29  0:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Will Deacon,
	Catalin Marinas, Russell King, Geert Uytterhoeven, linux-m68k

On Tue, Apr 28, 2020 at 12:44:49PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> By setting pt_mm for pages in use as page tables, we can help with
> debugging and lay the foundation for handling hardware errors in page
> tables more gracefully.  It also opens up the possibility for adding
> more sanity checks in the future.
> 
> Also set and clear the PageTable bit so that we know these are page tables.

As far as I can see, you don't introduce any checks yet.  That makes the
patchset somewhat pointless.

I'm not entirely sure what such checks would look like.  The single page
table tree would have at least two pt_mm: the owner and init_mm. Hugetlb
shared page tables would make a mess here. Hm?
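
A minimal sketch of one shape such a check could take, assuming the
simplest rule: a page table page is acceptable if its pt_mm is either
the mm being walked or init_mm.  Everything here is a userspace
stand-in; pt_page_owner_ok() is a hypothetical helper that is not part
of the series, and the is_table flag models the PageTable bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel types; only the fields used here are modelled. */
struct mm_struct { int id; };
struct page { struct mm_struct *pt_mm; bool is_table; };

static struct mm_struct init_mm;	/* models the kernel's init_mm */

/* Same shape as the patched constructors: tag the page with its mm. */
static bool pt_page_ctor(struct page *page, struct mm_struct *mm)
{
	page->is_table = true;
	page->pt_mm = mm;
	return true;
}

/* pt_mm shares a word with page->mapping, so clear it before freeing. */
static void pt_page_dtor(struct page *page)
{
	page->is_table = false;
	page->pt_mm = NULL;
}

/* Hypothetical check: owned by the walked mm, or by the kernel (init_mm). */
static bool pt_page_owner_ok(const struct page *page,
			     const struct mm_struct *mm)
{
	if (!page->is_table)
		return false;
	return page->pt_mm == mm || page->pt_mm == &init_mm;
}
```

Under this rule a hugetlb shared PMD page would fail for every mm except
the one that allocated it, which is exactly the mess described above; a
real check would need an explicit exception for shared tables.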

-- 
 Kirill A. Shutemov


* Re: [PATCH 0/7] Record the mm_struct in the page table pages
  2020-04-29  0:26 ` [PATCH 0/7] Record the mm_struct in the page table pages Kirill A. Shutemov
@ 2020-04-29  1:51   ` Matthew Wilcox
  2020-05-24  7:42     ` Mike Rapoport
  0 siblings, 1 reply; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-29  1:51 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Will Deacon,
	Catalin Marinas, Russell King, Geert Uytterhoeven, linux-m68k

On Wed, Apr 29, 2020 at 03:26:24AM +0300, Kirill A. Shutemov wrote:
> On Tue, Apr 28, 2020 at 12:44:42PM -0700, Matthew Wilcox wrote:
> > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > 
> > Pages which are in use as page tables have some space unused in struct
> > page.  It would be handy to have a pointer to the struct mm_struct that
> > they belong to so that we can handle uncorrectable errors in page tables
> > more gracefully.  There are a few other things we could use it for too,
> > such as checking that the page table entry actually belongs to the task
> > we think it ought to.  This patch series does none of that, but does
> > lay the groundwork for it.
> > 
> > Matthew Wilcox (Oracle) (7):
> 
> How does it work for kernel side of virtual address space?

init_mm

> And your employer may be interested in semantics around
> CONFIG_ARCH_WANT_HUGE_PMD_SHARE :P

I was thinking about that.  Right now, it's only useful for debugging
purposes (as you point out in a later email).  I think it's OK if shared
PMDs aren't supported as well as regular PTEs, just because there are
so few of them that uncorrectable errors are less likely to strike in
those pages.


* Re: [PATCH 2/7] mm: Move pt_mm within struct page
  2020-04-28 19:44 ` [PATCH 2/7] mm: Move pt_mm within struct page Matthew Wilcox
@ 2020-04-29  7:34   ` Geert Uytterhoeven
  2020-04-29 12:53     ` Matthew Wilcox
  0 siblings, 1 reply; 20+ messages in thread
From: Geert Uytterhoeven @ 2020-04-29  7:34 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linux MM, Linux Kernel Mailing List, Linux ARM, Will Deacon,
	Catalin Marinas, Russell King, linux-m68k

Hi Matthew,

On Tue, Apr 28, 2020 at 9:44 PM Matthew Wilcox <willy@infradead.org> wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Instead of a per-arch word within struct page, use a formerly reserved
> word.  This word is shared with page->mapping, so it must be cleared
> before being freed as it is checked in free_pages().
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Thanks for your patch!

> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -149,11 +149,8 @@ struct page {
>                                         pgtable_t pmd_huge_pte;
>                                 };
>                         };
> -                       unsigned long _pt_pad_2;        /* mapping */
> -                       union {
> -                               struct mm_struct *pt_mm; /* x86 pgds only */
> -                               atomic_t pt_frag_refcount; /* powerpc */
> -                       };
> +                       struct mm_struct *pt_mm;
> +                       atomic_t pt_frag_refcount; /* powerpc */

So here is now an implicit hole on 64-bit platforms, right?
Do we have any where alignof(long) != 8?

>  #if ALLOC_SPLIT_PTLOCKS
>                         spinlock_t *ptl;
>  #else

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH 5/7] m68k: Thread mm_struct throughout page table allocation
  2020-04-28 19:44 ` [PATCH 5/7] m68k: " Matthew Wilcox
@ 2020-04-29  7:44   ` Geert Uytterhoeven
  0 siblings, 0 replies; 20+ messages in thread
From: Geert Uytterhoeven @ 2020-04-29  7:44 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linux MM, Linux Kernel Mailing List, Linux ARM, Will Deacon,
	Catalin Marinas, Russell King, linux-m68k

On Tue, Apr 28, 2020 at 9:45 PM Matthew Wilcox <willy@infradead.org> wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> An upcoming patch will pass mm_struct to the page table constructor.
> Make sure m68k has the appropriate mm_struct at the point it needs to
> call the constructor.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH 6/7] mm: Set pt_mm in PTE constructor
  2020-04-28 19:44 ` [PATCH 6/7] mm: Set pt_mm in PTE constructor Matthew Wilcox
@ 2020-04-29  7:46   ` Geert Uytterhoeven
  0 siblings, 0 replies; 20+ messages in thread
From: Geert Uytterhoeven @ 2020-04-29  7:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Linux MM, Linux Kernel Mailing List, Linux ARM, Will Deacon,
	Catalin Marinas, Russell King, linux-m68k

On Tue, Apr 28, 2020 at 9:45 PM Matthew Wilcox <willy@infradead.org> wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
>
> By setting pt_mm for pages in use as page tables, we can help with
> debugging and lay the foundation for handling hardware errors in page
> tables more gracefully.  It also opens up the possibility for adding
> more sanity checks in the future.
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

>  arch/m68k/include/asm/mcf_pgalloc.h | 2 +-
>  arch/m68k/mm/motorola.c             | 2 +-

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH 4/7] arm64: Thread mm_struct throughout page table allocation
  2020-04-28 19:44 ` [PATCH 4/7] arm64: " Matthew Wilcox
@ 2020-04-29  9:58   ` Mark Rutland
  0 siblings, 0 replies; 20+ messages in thread
From: Mark Rutland @ 2020-04-29  9:58 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Will Deacon,
	Catalin Marinas, Russell King, Geert Uytterhoeven, linux-m68k

Hi Matthew,

On Tue, Apr 28, 2020 at 12:44:46PM -0700, Matthew Wilcox wrote:
> From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> 
> An upcoming patch will pass mm_struct to the page table constructor.
> Make sure arm64 has the appropriate mm_struct at the point it needs to
> call the constructor.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

This generally looks good to me. I was a little scared that we'd need to
update the EFI mapping code, but I see that already passes its mm into
create_pgd_mapping(), and everything else uses init_mm today.

One small comment below.

> ---
>  arch/arm64/mm/mmu.c | 89 ++++++++++++++++++++++-----------------------
>  1 file changed, 43 insertions(+), 46 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index a374e4f51a62..69ecc83c3be0 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -88,7 +88,9 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  }
>  EXPORT_SYMBOL(phys_mem_access_prot);
>  
> -static phys_addr_t __init early_pgtable_alloc(int shift)
> +typedef phys_addr_t (arm_pt_alloc_t)(int size, struct mm_struct *);

Sorry to bikeshed, but for consistency with the naming scheme used here
could we please call this 'pgtable_alloc_fn' ?

We generally use 'pgtable' here, and 'fn' makes it clearer that this is
a function pointer rather than data. The 'arm_' prefix is also a bit
unusual here, and I don't think we need it.
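
As a rough userspace sketch of the suggested naming (not the arm64
code): the typedef below names a function type, so a parameter declared
with it is adjusted to a function pointer and NULL remains a valid "do
not allocate" argument.  fake_alloc() and map_one() are hypothetical
helpers, and phys_addr_t and struct mm_struct are stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;		/* stand-in for the kernel type */
struct mm_struct { int id; };		/* stand-in, fields not modelled */

/* A function type, as in the patch, but with the suggested name. */
typedef phys_addr_t (pgtable_alloc_fn)(int shift, struct mm_struct *mm);

/* Hypothetical allocator: returns a fake "physical address". */
static phys_addr_t fake_alloc(int shift, struct mm_struct *mm)
{
	(void)mm;
	return (phys_addr_t)1 << shift;
}

/*
 * A parameter of function type is adjusted to a pointer to that function,
 * so callers may pass NULL when no allocation is allowed.
 */
static phys_addr_t map_one(struct mm_struct *mm, int shift,
			   pgtable_alloc_fn alloc)
{
	if (!alloc)
		return 0;
	return alloc(shift, mm);
}
```

A call such as map_one(&mm, 12, fake_alloc) then reads the same way as
the __create_pgd_mapping() call sites in the patch, with the mm passed
through to the allocator.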

With that:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> @@ -333,11 +332,9 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
>  	pud_clear_fixmap();
>  }
>  
> -static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
> -				 unsigned long virt, phys_addr_t size,
> -				 pgprot_t prot,
> -				 phys_addr_t (*pgtable_alloc)(int),
> -				 int flags)
> +static void __create_pgd_mapping(struct mm_struct *mm, pgd_t *pgdir,
> +		phys_addr_t phys, unsigned long virt, phys_addr_t size,
> +		pgprot_t prot, arm_pt_alloc_t pgtable_alloc, int flags)
>  {
>  	unsigned long addr, end, next;
>  	pgd_t *pgdp = pgd_offset_raw(pgdir, virt);
> @@ -355,13 +352,13 @@ static void __create_pgd_mapping(pgd_t *pgdir, phys_addr_t phys,
>  
>  	do {
>  		next = pgd_addr_end(addr, end);
> -		alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc,
> +		alloc_init_pud(mm, pgdp, addr, next, phys, prot, pgtable_alloc,
>  			       flags);
>  		phys += next - addr;
>  	} while (pgdp++, addr = next, addr != end);
>  }
>  
> -static phys_addr_t __pgd_pgtable_alloc(int shift)
> +static phys_addr_t __pgd_pgtable_alloc(int shift, struct mm_struct *mm)
>  {
>  	void *ptr = (void *)__get_free_page(GFP_PGTABLE_KERNEL);
>  	BUG_ON(!ptr);
> @@ -371,9 +368,9 @@ static phys_addr_t __pgd_pgtable_alloc(int shift)
>  	return __pa(ptr);
>  }
>  
> -static phys_addr_t pgd_pgtable_alloc(int shift)
> +static phys_addr_t pgd_pgtable_alloc(int shift, struct mm_struct *mm)
>  {
> -	phys_addr_t pa = __pgd_pgtable_alloc(shift);
> +	phys_addr_t pa = __pgd_pgtable_alloc(shift, mm);
>  
>  	/*
>  	 * Call proper page table ctor in case later we need to
> @@ -404,8 +401,8 @@ static void __init create_mapping_noalloc(phys_addr_t phys, unsigned long virt,
>  			&phys, virt);
>  		return;
>  	}
> -	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
> -			     NO_CONT_MAPPINGS);
> +	__create_pgd_mapping(&init_mm, init_mm.pgd, phys, virt, size, prot,
> +			NULL, NO_CONT_MAPPINGS);
>  }
>  
>  void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
> @@ -419,7 +416,7 @@ void __init create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
>  	if (page_mappings_only)
>  		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>  
> -	__create_pgd_mapping(mm->pgd, phys, virt, size, prot,
> +	__create_pgd_mapping(mm, mm->pgd, phys, virt, size, prot,
>  			     pgd_pgtable_alloc, flags);
>  }
>  
> @@ -432,8 +429,8 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
>  		return;
>  	}
>  
> -	__create_pgd_mapping(init_mm.pgd, phys, virt, size, prot, NULL,
> -			     NO_CONT_MAPPINGS);
> +	__create_pgd_mapping(&init_mm, init_mm.pgd, phys, virt, size, prot,
> +			NULL, NO_CONT_MAPPINGS);
>  
>  	/* flush the TLBs after updating live kernel mappings */
>  	flush_tlb_kernel_range(virt, virt + size);
> @@ -442,8 +439,8 @@ static void update_mapping_prot(phys_addr_t phys, unsigned long virt,
>  static void __init __map_memblock(pgd_t *pgdp, phys_addr_t start,
>  				  phys_addr_t end, pgprot_t prot, int flags)
>  {
> -	__create_pgd_mapping(pgdp, start, __phys_to_virt(start), end - start,
> -			     prot, early_pgtable_alloc, flags);
> +	__create_pgd_mapping(&init_mm, pgdp, start, __phys_to_virt(start),
> +			end - start, prot, early_pgtable_alloc, flags);
>  }
>  
>  void __init mark_linear_text_alias_ro(void)
> @@ -547,8 +544,8 @@ static void __init map_kernel_segment(pgd_t *pgdp, void *va_start, void *va_end,
>  	BUG_ON(!PAGE_ALIGNED(pa_start));
>  	BUG_ON(!PAGE_ALIGNED(size));
>  
> -	__create_pgd_mapping(pgdp, pa_start, (unsigned long)va_start, size, prot,
> -			     early_pgtable_alloc, flags);
> +	__create_pgd_mapping(&init_mm, pgdp, pa_start, (unsigned long)va_start,
> +			size, prot, early_pgtable_alloc, flags);
>  
>  	if (!(vm_flags & VM_NO_GUARD))
>  		size += PAGE_SIZE;
> @@ -591,8 +588,8 @@ static int __init map_entry_trampoline(void)
>  
>  	/* Map only the text into the trampoline page table */
>  	memset(tramp_pg_dir, 0, PGD_SIZE);
> -	__create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
> -			     prot, __pgd_pgtable_alloc, 0);
> +	__create_pgd_mapping(&init_mm, tramp_pg_dir, pa_start, TRAMP_VALIAS,
> +			PAGE_SIZE, prot, __pgd_pgtable_alloc, 0);
>  
>  	/* Map both the text and data into the kernel page table */
>  	__set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
> @@ -1381,9 +1378,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
>  	if (rodata_full || debug_pagealloc_enabled())
>  		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>  
> -	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> -			     size, params->pgprot, __pgd_pgtable_alloc,
> -			     flags);
> +	__create_pgd_mapping(&init_mm, swapper_pg_dir, start,
> +			__phys_to_virt(start), size, params->pgprot,
> +			__pgd_pgtable_alloc, flags);
>  
>  	memblock_clear_nomap(start, size);
>  
> -- 
> 2.26.2
> 


* Re: [PATCH 2/7] mm: Move pt_mm within struct page
  2020-04-29  7:34   ` Geert Uytterhoeven
@ 2020-04-29 12:53     ` Matthew Wilcox
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Wilcox @ 2020-04-29 12:53 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linux MM, Linux Kernel Mailing List, Linux ARM, Will Deacon,
	Catalin Marinas, Russell King, linux-m68k

On Wed, Apr 29, 2020 at 09:34:02AM +0200, Geert Uytterhoeven wrote:
> > +++ b/include/linux/mm_types.h
> > @@ -149,11 +149,8 @@ struct page {
> >                                         pgtable_t pmd_huge_pte;
> >                                 };
> >                         };
> > -                       unsigned long _pt_pad_2;        /* mapping */
> > -                       union {
> > -                               struct mm_struct *pt_mm; /* x86 pgds only */
> > -                               atomic_t pt_frag_refcount; /* powerpc */
> > -                       };
> > +                       struct mm_struct *pt_mm;
> > +                       atomic_t pt_frag_refcount; /* powerpc */
> 
> So here is now an implicit hole on 64-bit platforms, right?
> Do we have any where alignof(long) != 8?

There's an implicit hole if someone's turned on spinlock debugging and
has split pagetable locks.  Without the need to allocate the spinlock
separately, the ptl will actually move from the same word as 'private'
to the same word as 'index', freeing up 'private' entirely.  I don't
intend to depend on that, but it's not quite as critical to line up the
various members of struct page as it used to be.
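
The hole in question can be measured with a small userspace sketch.
struct pt_fields_sketch is a hypothetical cut-down model of the three
adjacent members in the case where ptl is a pointer (split page table
locks with the spinlock allocated separately); it is not the real
struct page, and plain int stands in for atomic_t.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down model of the members discussed above, in declaration order. */
struct pt_fields_sketch {
	void *pt_mm;		/* struct mm_struct * in the patch */
	int pt_frag_refcount;	/* atomic_t wraps an int */
	void *ptl;		/* spinlock_t * when ALLOC_SPLIT_PTLOCKS */
};

/* Bytes of padding between pt_frag_refcount and ptl on this ABI. */
static size_t pt_hole_bytes(void)
{
	size_t end = offsetof(struct pt_fields_sketch, pt_frag_refcount) +
		     sizeof(int);

	return offsetof(struct pt_fields_sketch, ptl) - end;
}
```

On a typical LP64 ABI, where pointers are 8 bytes with 8-byte alignment
and int is 4 bytes, this should report a 4-byte hole; on a typical
32-bit ABI the members pack with no hole.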



* Re: [PATCH 1/7] mm: Document x86 uses a linked list of pgds
  2020-04-28 22:52     ` Matthew Wilcox
@ 2020-04-29 18:29       ` Ira Weiny
  0 siblings, 0 replies; 20+ messages in thread
From: Ira Weiny @ 2020-04-29 18:29 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-mm, linux-kernel, linux-arm-kernel, Will Deacon,
	Catalin Marinas, Russell King, Geert Uytterhoeven, linux-m68k

On Tue, Apr 28, 2020 at 03:52:51PM -0700, Matthew Wilcox wrote:
> On Tue, Apr 28, 2020 at 02:41:09PM -0700, Ira Weiny wrote:
> > On Tue, Apr 28, 2020 at 12:44:43PM -0700, Matthew Wilcox wrote:
> > > x86 uses page->lru of the pages used for pgds, but that's not immediately
> > > obvious to anyone looking to make changes.  Add a struct list_head to
> > > the union so it's clearly in use for pgds.
> > 
> > Shouldn't pgd_list_{add,del}() use this list head variable instead of lru to
> > complete the documentation?
> > 
> > Probably the list iteration loops arch/x86/* as well?
> 
> Yes, but I felt that was out of scope for this patchset.  Untangling the
> uses of struct page is a long and messy business; if we have to fix
> everything at once, we'll never get anywhere.  There's also the slab
> users of page->lru instead of page->slab_list.

But wouldn't switching the code from lru to the new name also help to
identify the users?

> 
> What I actually want to get to is:
> 
> struct page {
> 	unsigned long flags;
> 	union {
> 		struct file_page file;
> 		struct anon_page anon;
> 		struct pt_page pt;
> 		struct slab_page slab;
> 		struct tail_page tail;
> 		struct rcu_head rcu;
> 	};
> 	union {
> 		atomic_t _mapcount;
> 		...
> 	};
> 	atomic_t refcount;
> 	...
> };
> 
> and then we can refer to page->pt.list and so on.

Then later on we know exactly where page->pt.list needs to be inserted.
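
The aliasing under discussion can be sketched in a few lines.  This is an
illustration only: struct page_sketch is a hypothetical, heavily reduced
stand-in for struct page, and the list insertion is open-coded rather than
using the kernel's list_add().

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical, heavily reduced stand-in for struct page. */
struct page_sketch {
	unsigned long flags;
	union {
		struct list_head lru;		/* generic name */
		struct list_head pgd_list;	/* same storage, named for pgds */
	};
};

/* Returns 1 when both names refer to the same bytes. */
int pgd_list_aliases_lru(void)
{
	return offsetof(struct page_sketch, lru) ==
	       offsetof(struct page_sketch, pgd_list);
}

/* Link through one name, then observe the link through the other. */
int linked_via_either_name(void)
{
	struct list_head head = { &head, &head };
	struct page_sketch pg = { 0 };

	/* Open-coded insertion of pg right after head. */
	pg.pgd_list.next = head.next;
	pg.pgd_list.prev = &head;
	head.next->prev = &pg.pgd_list;
	head.next = &pg.pgd_list;

	return head.next == &pg.lru && pg.lru.prev == &head;
}
```

Because both members occupy the same storage, renaming the users changes
no behaviour; it only makes the pgd use visible at each call site.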

I'm not opposed to the patch as it is.  But as someone newer to this code, it
seems to me that the following documents the use of lru at least as well, if
not better.

Compile tested only but feel free to merge if you like.
Ira

From 63fa92a940fa17567ab45a64b7ac058d4d41a54d Mon Sep 17 00:00:00 2001
From: Ira Weiny <ira.weiny@intel.com>
Date: Wed, 29 Apr 2020 11:10:59 -0700
Subject: [PATCH] mm: Complete documenting the use of lru for pgd_list

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 arch/x86/mm/fault.c          | 2 +-
 arch/x86/mm/init_64.c        | 4 ++--
 arch/x86/mm/pat/set_memory.c | 2 +-
 arch/x86/mm/pgtable.c        | 4 ++--
 arch/x86/xen/mmu_pv.c        | 4 ++--
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a51df516b87b..f07d477f8787 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -203,7 +203,7 @@ static void vmalloc_sync(void)
 		struct page *page;
 
 		spin_lock(&pgd_lock);
-		list_for_each_entry(page, &pgd_list, lru) {
+		list_for_each_entry(page, &pgd_list, pgd_list) {
 			spinlock_t *pgt_lock;
 
 			/* the pgt_lock only for Xen */
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3b289c2f75cd..e2ae3618a65d 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -140,7 +140,7 @@ static void sync_global_pgds_l5(unsigned long start, unsigned long end)
 			continue;
 
 		spin_lock(&pgd_lock);
-		list_for_each_entry(page, &pgd_list, lru) {
+		list_for_each_entry(page, &pgd_list, pgd_list) {
 			pgd_t *pgd;
 			spinlock_t *pgt_lock;
 
@@ -181,7 +181,7 @@ static void sync_global_pgds_l4(unsigned long start, unsigned long end)
 			continue;
 
 		spin_lock(&pgd_lock);
-		list_for_each_entry(page, &pgd_list, lru) {
+		list_for_each_entry(page, &pgd_list, pgd_list) {
 			pgd_t *pgd;
 			p4d_t *p4d;
 			spinlock_t *pgt_lock;
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 59eca6a94ce7..a1edfc593141 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -723,7 +723,7 @@ static void __set_pmd_pte(pte_t *kpte, unsigned long address, pte_t pte)
 	if (!SHARED_KERNEL_PMD) {
 		struct page *page;
 
-		list_for_each_entry(page, &pgd_list, lru) {
+		list_for_each_entry(page, &pgd_list, pgd_list) {
 			pgd_t *pgd;
 			p4d_t *p4d;
 			pud_t *pud;
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 8f4255662c5a..28ea8cc3f3a2 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -87,14 +87,14 @@ static inline void pgd_list_add(pgd_t *pgd)
 {
 	struct page *page = virt_to_page(pgd);
 
-	list_add(&page->lru, &pgd_list);
+	list_add(&page->pgd_list, &pgd_list);
 }
 
 static inline void pgd_list_del(pgd_t *pgd)
 {
 	struct page *page = virt_to_page(pgd);
 
-	list_del(&page->lru);
+	list_del(&page->pgd_list);
 	page->pt_mm = NULL;
 }
 
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index bbba8b17829a..df6592be3208 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -844,7 +844,7 @@ void xen_mm_pin_all(void)
 
 	spin_lock(&pgd_lock);
 
-	list_for_each_entry(page, &pgd_list, lru) {
+	list_for_each_entry(page, &pgd_list, pgd_list) {
 		if (!PagePinned(page)) {
 			__xen_pgd_pin(&init_mm, (pgd_t *)page_address(page));
 			SetPageSavePinned(page);
@@ -963,7 +963,7 @@ void xen_mm_unpin_all(void)
 
 	spin_lock(&pgd_lock);
 
-	list_for_each_entry(page, &pgd_list, lru) {
+	list_for_each_entry(page, &pgd_list, pgd_list) {
 		if (PageSavePinned(page)) {
 			BUG_ON(!PagePinned(page));
 			__xen_pgd_unpin(&init_mm, (pgd_t *)page_address(page));
-- 
2.25.1



* Re: [PATCH 0/7] Record the mm_struct in the page table pages
  2020-04-29  1:51   ` Matthew Wilcox
@ 2020-05-24  7:42     ` Mike Rapoport
  0 siblings, 0 replies; 20+ messages in thread
From: Mike Rapoport @ 2020-05-24  7:42 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Kirill A. Shutemov, linux-mm, linux-kernel, linux-arm-kernel,
	Will Deacon, Catalin Marinas, Russell King, Geert Uytterhoeven,
	linux-m68k

On Tue, Apr 28, 2020 at 06:51:26PM -0700, Matthew Wilcox wrote:
> On Wed, Apr 29, 2020 at 03:26:24AM +0300, Kirill A. Shutemov wrote:
> > On Tue, Apr 28, 2020 at 12:44:42PM -0700, Matthew Wilcox wrote:
> > > From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> > > 
> > > Pages which are in use as page tables have some space unused in struct
> > > page.  It would be handy to have a pointer to the struct mm_struct that
> > > they belong to so that we can handle uncorrectable errors in page tables
> > > more gracefully.  There are a few other things we could use it for too,
> > > such as checking that the page table entry actually belongs to the task
> > > we think it ought to.  This patch series does none of that, but does
> > > lay the groundwork for it.
> > > 
> > > Matthew Wilcox (Oracle) (7):
> > 
> > How does it work for kernel side of virtual address space?
> 
> init_mm

One thing to keep in mind is that most of the kernel page tables are seen
as PG_reserved rather than PageTable().

> > And your employer may be interested in semantics around
> > CONFIG_ARCH_WANT_HUGE_PMD_SHARE :P
> 
> I was thinking about that.  Right now, it's only useful for debugging
> purposes (as you point out in a later email).  I think it's OK if shared
> PMDs aren't supported as well as regular PTEs, just because there are
> so few of them that uncorrectable errors are less likely to strike in
> those pages.
> 

-- 
Sincerely yours,
Mike.


Thread overview: 20+ messages
2020-04-28 19:44 [PATCH 0/7] Record the mm_struct in the page table pages Matthew Wilcox
2020-04-28 19:44 ` [PATCH 1/7] mm: Document x86 uses a linked list of pgds Matthew Wilcox
2020-04-28 21:41   ` Ira Weiny
2020-04-28 22:52     ` Matthew Wilcox
2020-04-29 18:29       ` Ira Weiny
2020-04-28 19:44 ` [PATCH 2/7] mm: Move pt_mm within struct page Matthew Wilcox
2020-04-29  7:34   ` Geert Uytterhoeven
2020-04-29 12:53     ` Matthew Wilcox
2020-04-28 19:44 ` [PATCH 3/7] arm: Thread mm_struct throughout page table allocation Matthew Wilcox
2020-04-28 19:44 ` [PATCH 4/7] arm64: " Matthew Wilcox
2020-04-29  9:58   ` Mark Rutland
2020-04-28 19:44 ` [PATCH 5/7] m68k: " Matthew Wilcox
2020-04-29  7:44   ` Geert Uytterhoeven
2020-04-28 19:44 ` [PATCH 6/7] mm: Set pt_mm in PTE constructor Matthew Wilcox
2020-04-29  7:46   ` Geert Uytterhoeven
2020-04-28 19:44 ` [PATCH 7/7] mm: Set pt_mm in PMD constructor Matthew Wilcox
2020-04-29  0:52   ` Kirill A. Shutemov
2020-04-29  0:26 ` [PATCH 0/7] Record the mm_struct in the page table pages Kirill A. Shutemov
2020-04-29  1:51   ` Matthew Wilcox
2020-05-24  7:42     ` Mike Rapoport
