LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v4 00/13] mm/debug_vm_pgtable fixes
@ 2020-09-02 11:42 Aneesh Kumar K.V
  2020-09-02 11:42 ` [PATCH v4 01/13] powerpc/mm: Add DEBUG_VM WARN for pmd_clear Aneesh Kumar K.V
                   ` (16 more replies)
  0 siblings, 17 replies; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

This patch series includes fixes for debug_vm_pgtable test code so that
they follow page table updates rules correctly. The first two patches introduce
changes w.r.t ppc64. The patches are included in this series for completeness. We can
merge them via ppc64 tree if required.

Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
page table update rules.

These tests are broken w.r.t page table update rules and results in kernel
crash as below. 

[   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
    pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
    lr: c0000000005eeeec: pte_update+0x11c/0x190
    sp: c000000c6d1e7950
   msr: 8000000002029033
  current = 0xc000000c6d172c80
  paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
kernel BUG at arch/powerpc/mm/pgtable.c:304!
[link register   ] c0000000005eeeec pte_update+0x11c/0x190
[c000000c6d1e7950] 0000000000000001 (unreliable)
[c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
[c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
[c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
[c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

With DEBUG_VM disabled

[   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
[   20.530183] Faulting instruction address: 0xc0000000000df330
cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
    pc: c0000000000df330: memset+0x68/0x104
    lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
    sp: c000000c6d19f990
   msr: 8000000002009033
   dar: 0
  current = 0xc000000c6d177480
  paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/0
[link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
[c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
[c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
[c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
[c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

Changes from v3:
* Address review feedback
* Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.

Changes from v2:
* Fix build failure with different configs and architecture.

Changes from v1:
* Address review feedback
* drop test specific pfn_pte and pfn_pmd.
* Update ppc64 page table helper to add _PAGE_PTE 


Aneesh Kumar K.V (13):
  powerpc/mm: Add DEBUG_VM WARN for pmd_clear
  powerpc/mm: Move setting pte specific flags to pfn_pte
  mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
  mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge
    vmap support.
  mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with
    CONFIG_NUMA_BALANCING
  mm/debug_vm_pgtable/THP: Mark the pte entry huge before using
    set_pmd/pud_at
  mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an
    existing pte entry
  mm/debug_vm_pgtable/locks: Move non page table modifying test together
  mm/debug_vm_pgtable/locks: Take correct page table lock
  mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
  mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
  mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
  mm/debug_vm_pgtable: Avoid none pte in pte_clear_test

 arch/powerpc/include/asm/book3s/64/pgtable.h |  29 +++-
 arch/powerpc/include/asm/nohash/pgtable.h    |   5 -
 arch/powerpc/mm/pgtable.c                    |   5 -
 mm/debug_vm_pgtable.c                        | 171 ++++++++++++-------
 4 files changed, 131 insertions(+), 79 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 01/13] powerpc/mm: Add DEBUG_VM WARN for pmd_clear
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-02 11:42 ` [PATCH v4 02/13] powerpc/mm: Move setting pte specific flags to pfn_pte Aneesh Kumar K.V
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

With the hash page table, the kernel should not use pmd_clear for clearing
huge pte entries. Add a DEBUG_VM WARN to catch the wrong usage.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 6de56c3b33c4..079211968987 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -868,6 +868,13 @@ static inline bool pte_ci(pte_t pte)
 
 static inline void pmd_clear(pmd_t *pmdp)
 {
+	if (IS_ENABLED(CONFIG_DEBUG_VM) && !radix_enabled()) {
+		/*
+		 * Don't use this if we can possibly have a hash page table
+		 * entry mapping this.
+		 */
+		WARN_ON((pmd_val(*pmdp) & (H_PAGE_HASHPTE | _PAGE_PTE)) == (H_PAGE_HASHPTE | _PAGE_PTE));
+	}
 	*pmdp = __pmd(0);
 }
 
@@ -916,6 +923,13 @@ static inline int pmd_bad(pmd_t pmd)
 
 static inline void pud_clear(pud_t *pudp)
 {
+	if (IS_ENABLED(CONFIG_DEBUG_VM) && !radix_enabled()) {
+		/*
+		 * Don't use this if we can possibly have a hash page table
+		 * entry mapping this.
+		 */
+		WARN_ON((pud_val(*pudp) & (H_PAGE_HASHPTE | _PAGE_PTE)) == (H_PAGE_HASHPTE | _PAGE_PTE));
+	}
 	*pudp = __pud(0);
 }
 
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 02/13] powerpc/mm: Move setting pte specific flags to pfn_pte
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
  2020-09-02 11:42 ` [PATCH v4 01/13] powerpc/mm: Add DEBUG_VM WARN for pmd_clear Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-02 12:30   ` Christophe Leroy
  2020-09-02 11:42 ` [PATCH v4 03/13] mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value Aneesh Kumar K.V
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

powerpc used to set the pte specific flags in set_pte_at(). This is
different from other architectures. To be consistent with other
architecture update pfn_pte to set _PAGE_PTE on ppc64. Also, drop now
unused pte_mkpte.

We add a VM_WARN_ON() to catch the usage of calling set_pte_at()
without setting _PAGE_PTE bit. We will remove that after a few releases.

With respect to huge pmd entries, pmd_mkhuge() takes care of adding the
_PAGE_PTE bit.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 15 +++++++++------
 arch/powerpc/include/asm/nohash/pgtable.h    |  5 -----
 arch/powerpc/mm/pgtable.c                    |  5 -----
 3 files changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 079211968987..2382fd516f6b 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -619,7 +619,7 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
 	VM_BUG_ON(pfn >> (64 - PAGE_SHIFT));
 	VM_BUG_ON((pfn << PAGE_SHIFT) & ~PTE_RPN_MASK);
 
-	return __pte(((pte_basic_t)pfn << PAGE_SHIFT) | pgprot_val(pgprot));
+	return __pte(((pte_basic_t)pfn << PAGE_SHIFT) | pgprot_val(pgprot) | _PAGE_PTE);
 }
 
 static inline unsigned long pte_pfn(pte_t pte)
@@ -655,11 +655,6 @@ static inline pte_t pte_mkexec(pte_t pte)
 	return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_EXEC));
 }
 
-static inline pte_t pte_mkpte(pte_t pte)
-{
-	return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PTE));
-}
-
 static inline pte_t pte_mkwrite(pte_t pte)
 {
 	/*
@@ -823,6 +818,14 @@ static inline int pte_none(pte_t pte)
 static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 				pte_t *ptep, pte_t pte, int percpu)
 {
+
+	VM_WARN_ON(!(pte_raw(pte) & cpu_to_be64(_PAGE_PTE)));
+	/*
+	 * Keep the _PAGE_PTE added till we are sure we handle _PAGE_PTE
+	 * in all the callers.
+	 */
+	 pte = __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PTE));
+
 	if (radix_enabled())
 		return radix__set_pte_at(mm, addr, ptep, pte, percpu);
 	return hash__set_pte_at(mm, addr, ptep, pte, percpu);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 4b7c3472eab1..6277e7596ae5 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -140,11 +140,6 @@ static inline pte_t pte_mkold(pte_t pte)
 	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
 }
 
-static inline pte_t pte_mkpte(pte_t pte)
-{
-	return pte;
-}
-
 static inline pte_t pte_mkspecial(pte_t pte)
 {
 	return __pte(pte_val(pte) | _PAGE_SPECIAL);
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 9c0547d77af3..ab57b07ef39a 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -184,9 +184,6 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 	 */
 	VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
 
-	/* Add the pte bit when trying to set a pte */
-	pte = pte_mkpte(pte);
-
 	/* Note: mm->context.id might not yet have been assigned as
 	 * this context might not have been activated yet when this
 	 * is called.
@@ -275,8 +272,6 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_
 	 */
 	VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
 
-	pte = pte_mkpte(pte);
-
 	pte = set_pte_filter(pte);
 
 	val = pte_val(pte);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 03/13] mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
  2020-09-02 11:42 ` [PATCH v4 01/13] powerpc/mm: Add DEBUG_VM WARN for pmd_clear Aneesh Kumar K.V
  2020-09-02 11:42 ` [PATCH v4 02/13] powerpc/mm: Move setting pte specific flags to pfn_pte Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  4:03   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support Aneesh Kumar K.V
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

ppc64 use bit 62 to indicate a pte entry (_PAGE_PTE). Avoid setting
that bit in random value.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 086309fb9b6f..00649b47f6e0 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -44,10 +44,17 @@
  * entry type. But these bits might affect the ability to clear entries with
  * pxx_clear() because of how dynamic page table folding works on s390. So
  * while loading up the entries do not change the lower 4 bits. It does not
- * have affect any other platform.
+ * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
+ * used to mark a pte entry.
  */
-#define S390_MASK_BITS	4
-#define RANDOM_ORVALUE	GENMASK(BITS_PER_LONG - 1, S390_MASK_BITS)
+#define S390_SKIP_MASK		GENMASK(3, 0)
+#if __BITS_PER_LONG == 64
+#define PPC64_SKIP_MASK		GENMASK(62, 62)
+#else
+#define PPC64_SKIP_MASK		0x0
+#endif
+#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
+#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
 #define RANDOM_NZVALUE	GENMASK(7, 0)
 
 static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support.
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 03/13] mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-02 12:40   ` Christophe Leroy
  2020-09-04  4:08   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 05/13] mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with CONFIG_NUMA_BALANCING Aneesh Kumar K.V
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

ppc64 supports huge vmap only with radix translation. Hence use arch helper
to determine the huge vmap support.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 00649b47f6e0..4c73e63b4ceb 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -28,6 +28,7 @@
 #include <linux/swapops.h>
 #include <linux/start_kernel.h>
 #include <linux/sched/mm.h>
+#include <linux/io.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 
@@ -206,11 +207,12 @@ static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
 	WARN_ON(!pmd_leaf(pmd));
 }
 
+#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
 static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
 {
 	pmd_t pmd;
 
-	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+	if (!arch_ioremap_pmd_supported())
 		return;
 
 	pr_debug("Validating PMD huge\n");
@@ -224,6 +226,9 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(!pmd_none(pmd));
 }
+#else /* CONFIG_HAVE_ARCH_HUGE_VMAP */
+static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot) { }
+#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
 
 static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
 {
@@ -320,11 +325,12 @@ static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
 	WARN_ON(!pud_leaf(pud));
 }
 
+#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
 static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
 {
 	pud_t pud;
 
-	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
+	if (!arch_ioremap_pud_supported())
 		return;
 
 	pr_debug("Validating PUD huge\n");
@@ -338,6 +344,10 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
 	pud = READ_ONCE(*pudp);
 	WARN_ON(!pud_none(pud));
 }
+#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
+static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot) { }
+#endif /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
+
 #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
 static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
 static void __init pud_advanced_tests(struct mm_struct *mm,
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 05/13] mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with CONFIG_NUMA_BALANCING
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  4:09   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 06/13] mm/debug_vm_pgtable/THP: Mark the pte entry huge before using set_pmd/pud_at Aneesh Kumar K.V
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

Saved write support was added to track the write bit of a pte after
marking the pte protnone. This was done so that AUTONUMA can convert
a write pte to protnone and still track the old write bit. When converting
it back we set the pte write bit correctly thereby avoiding a write fault
again. Hence enable the test only when CONFIG_NUMA_BALANCING is enabled and
use protnone protflags.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 4c73e63b4ceb..8704901f6bd8 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -119,10 +119,14 @@ static void __init pte_savedwrite_tests(unsigned long pfn, pgprot_t prot)
 {
 	pte_t pte = pfn_pte(pfn, prot);
 
+	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+		return;
+
 	pr_debug("Validating PTE saved write\n");
 	WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
 	WARN_ON(pte_savedwrite(pte_clear_savedwrite(pte_mk_savedwrite(pte))));
 }
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
 {
@@ -234,6 +238,9 @@ static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
 {
 	pmd_t pmd = pfn_pmd(pfn, prot);
 
+	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+		return;
+
 	pr_debug("Validating PMD saved write\n");
 	WARN_ON(!pmd_savedwrite(pmd_mk_savedwrite(pmd_clear_savedwrite(pmd))));
 	WARN_ON(pmd_savedwrite(pmd_clear_savedwrite(pmd_mk_savedwrite(pmd))));
@@ -1019,8 +1026,8 @@ static int __init debug_vm_pgtable(void)
 	pmd_huge_tests(pmdp, pmd_aligned, prot);
 	pud_huge_tests(pudp, pud_aligned, prot);
 
-	pte_savedwrite_tests(pte_aligned, prot);
-	pmd_savedwrite_tests(pmd_aligned, prot);
+	pte_savedwrite_tests(pte_aligned, protnone);
+	pmd_savedwrite_tests(pmd_aligned, protnone);
 
 	pte_unmap_unlock(ptep, ptl);
 
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 06/13] mm/debug_vm_pgtable/THP: Mark the pte entry huge before using set_pmd/pud_at
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (4 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 05/13] mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with CONFIG_NUMA_BALANCING Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  5:21   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry Aneesh Kumar K.V
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

kernel expects entries to be marked huge before we use
set_pmd_at()/set_pud_at().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 8704901f6bd8..9cafed39c236 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -155,7 +155,7 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
 				      unsigned long pfn, unsigned long vaddr,
 				      pgprot_t prot)
 {
-	pmd_t pmd = pfn_pmd(pfn, prot);
+	pmd_t pmd;
 
 	if (!has_transparent_hugepage())
 		return;
@@ -164,19 +164,19 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
 	/* Align the address wrt HPAGE_PMD_SIZE */
 	vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
 
-	pmd = pfn_pmd(pfn, prot);
+	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
 	set_pmd_at(mm, vaddr, pmdp, pmd);
 	pmdp_set_wrprotect(mm, vaddr, pmdp);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(pmd_write(pmd));
 
-	pmd = pfn_pmd(pfn, prot);
+	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
 	set_pmd_at(mm, vaddr, pmdp, pmd);
 	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(!pmd_none(pmd));
 
-	pmd = pfn_pmd(pfn, prot);
+	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
 	pmd = pmd_wrprotect(pmd);
 	pmd = pmd_mkclean(pmd);
 	set_pmd_at(mm, vaddr, pmdp, pmd);
@@ -236,7 +236,7 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
 
 static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
 {
-	pmd_t pmd = pfn_pmd(pfn, prot);
+	pmd_t pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
 
 	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
 		return;
@@ -276,7 +276,7 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
 				      unsigned long pfn, unsigned long vaddr,
 				      pgprot_t prot)
 {
-	pud_t pud = pfn_pud(pfn, prot);
+	pud_t pud;
 
 	if (!has_transparent_hugepage())
 		return;
@@ -285,25 +285,27 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
 	/* Align the address wrt HPAGE_PUD_SIZE */
 	vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;
 
+	pud = pud_mkhuge(pfn_pud(pfn, prot));
 	set_pud_at(mm, vaddr, pudp, pud);
 	pudp_set_wrprotect(mm, vaddr, pudp);
 	pud = READ_ONCE(*pudp);
 	WARN_ON(pud_write(pud));
 
 #ifndef __PAGETABLE_PMD_FOLDED
-	pud = pfn_pud(pfn, prot);
+	pud = pud_mkhuge(pfn_pud(pfn, prot));
 	set_pud_at(mm, vaddr, pudp, pud);
 	pudp_huge_get_and_clear(mm, vaddr, pudp);
 	pud = READ_ONCE(*pudp);
 	WARN_ON(!pud_none(pud));
 
-	pud = pfn_pud(pfn, prot);
+	pud = pud_mkhuge(pfn_pud(pfn, prot));
 	set_pud_at(mm, vaddr, pudp, pud);
 	pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
 	pud = READ_ONCE(*pudp);
 	WARN_ON(!pud_none(pud));
 #endif /* __PAGETABLE_PMD_FOLDED */
-	pud = pfn_pud(pfn, prot);
+
+	pud = pud_mkhuge(pfn_pud(pfn, prot));
 	pud = pud_wrprotect(pud);
 	pud = pud_mkclean(pud);
 	set_pud_at(mm, vaddr, pudp, pud);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (5 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 06/13] mm/debug_vm_pgtable/THP: Mark the pte entry huge before using set_pmd/pud_at Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  5:34   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 08/13] mm/debug_vm_pgtable/locks: Move non page table modifying test together Aneesh Kumar K.V
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

set_pte_at() should not be used to set a pte entry at locations that
already holds a valid pte entry. Architectures like ppc64 don't do TLB
invalidate in set_pte_at() and hence expect it to be used to set locations
that are not a valid PTE.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 35 +++++++++++++++--------------------
 1 file changed, 15 insertions(+), 20 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 9cafed39c236..de333871f407 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -79,15 +79,18 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
 {
 	pte_t pte = pfn_pte(pfn, prot);
 
+	/*
+	 * Architectures optimize set_pte_at by avoiding TLB flush.
+	 * This requires set_pte_at to be not used to update an
+	 * existing pte entry. Clear pte before we do set_pte_at
+	 */
+
 	pr_debug("Validating PTE advanced\n");
 	pte = pfn_pte(pfn, prot);
 	set_pte_at(mm, vaddr, ptep, pte);
 	ptep_set_wrprotect(mm, vaddr, ptep);
 	pte = ptep_get(ptep);
 	WARN_ON(pte_write(pte));
-
-	pte = pfn_pte(pfn, prot);
-	set_pte_at(mm, vaddr, ptep, pte);
 	ptep_get_and_clear(mm, vaddr, ptep);
 	pte = ptep_get(ptep);
 	WARN_ON(!pte_none(pte));
@@ -101,13 +104,11 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
 	ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
 	pte = ptep_get(ptep);
 	WARN_ON(!(pte_write(pte) && pte_dirty(pte)));
-
-	pte = pfn_pte(pfn, prot);
-	set_pte_at(mm, vaddr, ptep, pte);
 	ptep_get_and_clear_full(mm, vaddr, ptep, 1);
 	pte = ptep_get(ptep);
 	WARN_ON(!pte_none(pte));
 
+	pte = pfn_pte(pfn, prot);
 	pte = pte_mkyoung(pte);
 	set_pte_at(mm, vaddr, ptep, pte);
 	ptep_test_and_clear_young(vma, vaddr, ptep);
@@ -169,9 +170,6 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
 	pmdp_set_wrprotect(mm, vaddr, pmdp);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(pmd_write(pmd));
-
-	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
-	set_pmd_at(mm, vaddr, pmdp, pmd);
 	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(!pmd_none(pmd));
@@ -185,13 +183,11 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
 	pmdp_set_access_flags(vma, vaddr, pmdp, pmd, 1);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(!(pmd_write(pmd) && pmd_dirty(pmd)));
-
-	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
-	set_pmd_at(mm, vaddr, pmdp, pmd);
 	pmdp_huge_get_and_clear_full(vma, vaddr, pmdp, 1);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(!pmd_none(pmd));
 
+	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
 	pmd = pmd_mkyoung(pmd);
 	set_pmd_at(mm, vaddr, pmdp, pmd);
 	pmdp_test_and_clear_young(vma, vaddr, pmdp);
@@ -292,17 +288,9 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
 	WARN_ON(pud_write(pud));
 
 #ifndef __PAGETABLE_PMD_FOLDED
-	pud = pud_mkhuge(pfn_pud(pfn, prot));
-	set_pud_at(mm, vaddr, pudp, pud);
 	pudp_huge_get_and_clear(mm, vaddr, pudp);
 	pud = READ_ONCE(*pudp);
 	WARN_ON(!pud_none(pud));
-
-	pud = pud_mkhuge(pfn_pud(pfn, prot));
-	set_pud_at(mm, vaddr, pudp, pud);
-	pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
-	pud = READ_ONCE(*pudp);
-	WARN_ON(!pud_none(pud));
 #endif /* __PAGETABLE_PMD_FOLDED */
 
 	pud = pud_mkhuge(pfn_pud(pfn, prot));
@@ -315,6 +303,13 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
 	pud = READ_ONCE(*pudp);
 	WARN_ON(!(pud_write(pud) && pud_dirty(pud)));
 
+#ifndef __PAGETABLE_PMD_FOLDED
+	pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(!pud_none(pud));
+#endif /* __PAGETABLE_PMD_FOLDED */
+
+	pud = pud_mkhuge(pfn_pud(pfn, prot));
 	pud = pud_mkyoung(pud);
 	set_pud_at(mm, vaddr, pudp, pud);
 	pudp_test_and_clear_young(vma, vaddr, pudp);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 08/13] mm/debug_vm_pgtable/locks: Move non page table modifying test together
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (6 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  3:58   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 09/13] mm/debug_vm_pgtable/locks: Take correct page table lock Aneesh Kumar K.V
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

This will help in adding proper locks in a later patch

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 51 ++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 23 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index de333871f407..f59cf6a9b05e 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -986,7 +986,7 @@ static int __init debug_vm_pgtable(void)
 	p4dp = p4d_alloc(mm, pgdp, vaddr);
 	pudp = pud_alloc(mm, p4dp, vaddr);
 	pmdp = pmd_alloc(mm, pudp, vaddr);
-	ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
+	ptep = pte_alloc_map(mm, pmdp, vaddr);
 
 	/*
 	 * Save all the page table page addresses as the page table
@@ -1006,33 +1006,12 @@ static int __init debug_vm_pgtable(void)
 	p4d_basic_tests(p4d_aligned, prot);
 	pgd_basic_tests(pgd_aligned, prot);
 
-	pte_clear_tests(mm, ptep, vaddr);
-	pmd_clear_tests(mm, pmdp);
-	pud_clear_tests(mm, pudp);
-	p4d_clear_tests(mm, p4dp);
-	pgd_clear_tests(mm, pgdp);
-
-	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
-	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
-	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
-	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
-
 	pmd_leaf_tests(pmd_aligned, prot);
 	pud_leaf_tests(pud_aligned, prot);
 
-	pmd_huge_tests(pmdp, pmd_aligned, prot);
-	pud_huge_tests(pudp, pud_aligned, prot);
-
 	pte_savedwrite_tests(pte_aligned, protnone);
 	pmd_savedwrite_tests(pmd_aligned, protnone);
 
-	pte_unmap_unlock(ptep, ptl);
-
-	pmd_populate_tests(mm, pmdp, saved_ptep);
-	pud_populate_tests(mm, pudp, saved_pmdp);
-	p4d_populate_tests(mm, p4dp, saved_pudp);
-	pgd_populate_tests(mm, pgdp, saved_p4dp);
-
 	pte_special_tests(pte_aligned, prot);
 	pte_protnone_tests(pte_aligned, protnone);
 	pmd_protnone_tests(pmd_aligned, protnone);
@@ -1050,11 +1029,37 @@ static int __init debug_vm_pgtable(void)
 	pmd_swap_tests(pmd_aligned, prot);
 
 	swap_migration_tests();
-	hugetlb_basic_tests(pte_aligned, prot);
 
 	pmd_thp_tests(pmd_aligned, prot);
 	pud_thp_tests(pud_aligned, prot);
 
+	hugetlb_basic_tests(pte_aligned, prot);
+
+	pte_clear_tests(mm, ptep, vaddr);
+	pmd_clear_tests(mm, pmdp);
+	pud_clear_tests(mm, pudp);
+	p4d_clear_tests(mm, p4dp);
+	pgd_clear_tests(mm, pgdp);
+
+	ptl = pte_lockptr(mm, pmdp);
+	spin_lock(ptl);
+
+	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
+	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
+	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+
+
+	pmd_huge_tests(pmdp, pmd_aligned, prot);
+	pud_huge_tests(pudp, pud_aligned, prot);
+
+	pte_unmap_unlock(ptep, ptl);
+
+	pmd_populate_tests(mm, pmdp, saved_ptep);
+	pud_populate_tests(mm, pudp, saved_pmdp);
+	p4d_populate_tests(mm, p4dp, saved_pudp);
+	pgd_populate_tests(mm, pgdp, saved_p4dp);
+
 	p4d_free(mm, saved_p4dp);
 	pud_free(mm, saved_pudp);
 	pmd_free(mm, saved_pmdp);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 09/13] mm/debug_vm_pgtable/locks: Take correct page table lock
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (7 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 08/13] mm/debug_vm_pgtable/locks: Move non page table modifying test together Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  5:39   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 10/13] mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP Aneesh Kumar K.V
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

Make sure we call pte accessors with correct lock held.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index f59cf6a9b05e..2bc1952e5f83 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -1035,30 +1035,39 @@ static int __init debug_vm_pgtable(void)
 
 	hugetlb_basic_tests(pte_aligned, prot);
 
-	pte_clear_tests(mm, ptep, vaddr);
-	pmd_clear_tests(mm, pmdp);
-	pud_clear_tests(mm, pudp);
-	p4d_clear_tests(mm, p4dp);
-	pgd_clear_tests(mm, pgdp);
+	/*
+	 * Page table modifying tests. They need to hold
+	 * proper page table lock.
+	 */
 
 	ptl = pte_lockptr(mm, pmdp);
 	spin_lock(ptl);
-
+	pte_clear_tests(mm, ptep, vaddr);
 	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
-	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
-	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
-	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
-
+	pte_unmap_unlock(ptep, ptl);
 
+	ptl = pmd_lock(mm, pmdp);
+	pmd_clear_tests(mm, pmdp);
+	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
 	pmd_huge_tests(pmdp, pmd_aligned, prot);
+	pmd_populate_tests(mm, pmdp, saved_ptep);
+	spin_unlock(ptl);
+
+	ptl = pud_lock(mm, pudp);
+	pud_clear_tests(mm, pudp);
+	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
 	pud_huge_tests(pudp, pud_aligned, prot);
+	pud_populate_tests(mm, pudp, saved_pmdp);
+	spin_unlock(ptl);
 
-	pte_unmap_unlock(ptep, ptl);
+	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
 
-	pmd_populate_tests(mm, pmdp, saved_ptep);
-	pud_populate_tests(mm, pudp, saved_pmdp);
+	spin_lock(&mm->page_table_lock);
+	p4d_clear_tests(mm, p4dp);
+	pgd_clear_tests(mm, pgdp);
 	p4d_populate_tests(mm, p4dp, saved_pudp);
 	pgd_populate_tests(mm, pgdp, saved_p4dp);
+	spin_unlock(&mm->page_table_lock);
 
 	p4d_free(mm, saved_p4dp);
 	pud_free(mm, saved_pudp);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 10/13] mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (8 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 09/13] mm/debug_vm_pgtable/locks: Take correct page table lock Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  4:21   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 11/13] mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries Aneesh Kumar K.V
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

Architectures like ppc64 use deposited page table while updating the
huge pte entries.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 2bc1952e5f83..26023d990bd0 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -154,7 +154,7 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
 static void __init pmd_advanced_tests(struct mm_struct *mm,
 				      struct vm_area_struct *vma, pmd_t *pmdp,
 				      unsigned long pfn, unsigned long vaddr,
-				      pgprot_t prot)
+				      pgprot_t prot, pgtable_t pgtable)
 {
 	pmd_t pmd;
 
@@ -165,6 +165,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
 	/* Align the address wrt HPAGE_PMD_SIZE */
 	vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
 
+	pgtable_trans_huge_deposit(mm, pmdp, pgtable);
+
 	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
 	set_pmd_at(mm, vaddr, pmdp, pmd);
 	pmdp_set_wrprotect(mm, vaddr, pmdp);
@@ -193,6 +195,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
 	pmdp_test_and_clear_young(vma, vaddr, pmdp);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(pmd_young(pmd));
+
+	pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
 }
 
 static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
@@ -371,7 +375,7 @@ static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
 static void __init pmd_advanced_tests(struct mm_struct *mm,
 				      struct vm_area_struct *vma, pmd_t *pmdp,
 				      unsigned long pfn, unsigned long vaddr,
-				      pgprot_t prot)
+				      pgprot_t prot, pgtable_t pgtable)
 {
 }
 static void __init pud_advanced_tests(struct mm_struct *mm,
@@ -1048,7 +1052,7 @@ static int __init debug_vm_pgtable(void)
 
 	ptl = pmd_lock(mm, pmdp);
 	pmd_clear_tests(mm, pmdp);
-	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
+	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
 	pmd_huge_tests(pmdp, pmd_aligned, prot);
 	pmd_populate_tests(mm, pmdp, saved_ptep);
 	spin_unlock(ptl);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 11/13] mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (9 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 10/13] mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  6:03   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 12/13] mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64 Aneesh Kumar K.V
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

pmd_clear() should not be used to clear pmd level pte entries.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 26023d990bd0..b53903fdee85 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -196,6 +196,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(pmd_young(pmd));
 
+	/*  Clear the pte entries  */
+	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
 	pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
 }
 
@@ -319,6 +321,8 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
 	pudp_test_and_clear_young(vma, vaddr, pudp);
 	pud = READ_ONCE(*pudp);
 	WARN_ON(pud_young(pud));
+
+	pudp_huge_get_and_clear(mm, vaddr, pudp);
 }
 
 static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
@@ -442,8 +446,6 @@ static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
 	 * This entry points to next level page table page.
 	 * Hence this must not qualify as pud_bad().
 	 */
-	pmd_clear(pmdp);
-	pud_clear(pudp);
 	pud_populate(mm, pudp, pmdp);
 	pud = READ_ONCE(*pudp);
 	WARN_ON(pud_bad(pud));
@@ -575,7 +577,6 @@ static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
 	 * This entry points to next level page table page.
 	 * Hence this must not qualify as pmd_bad().
 	 */
-	pmd_clear(pmdp);
 	pmd_populate(mm, pmdp, pgtable);
 	pmd = READ_ONCE(*pmdp);
 	WARN_ON(pmd_bad(pmd));
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 12/13] mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (10 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 11/13] mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-04  6:19   ` Anshuman Khandual
  2020-09-02 11:42 ` [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test Aneesh Kumar K.V
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

The seems to be missing quite a lot of details w.r.t allocating
the correct pgtable_t page (huge_pte_alloc()), holding the right
lock (huge_pte_lock()) etc. The vma used is also not a hugetlb VMA.

ppc64 do have runtime checks within CONFIG_DEBUG_VM for most of these.
Hence disable the test on ppc64.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index b53903fdee85..9afa1354326b 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -811,6 +811,7 @@ static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot)
 #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
 }
 
+#ifndef CONFIG_PPC_BOOK3S_64
 static void __init hugetlb_advanced_tests(struct mm_struct *mm,
 					  struct vm_area_struct *vma,
 					  pte_t *ptep, unsigned long pfn,
@@ -853,6 +854,7 @@ static void __init hugetlb_advanced_tests(struct mm_struct *mm,
 	pte = huge_ptep_get(ptep);
 	WARN_ON(!(huge_pte_write(pte) && huge_pte_dirty(pte)));
 }
+#endif
 #else  /* !CONFIG_HUGETLB_PAGE */
 static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot) { }
 static void __init hugetlb_advanced_tests(struct mm_struct *mm,
@@ -1065,7 +1067,9 @@ static int __init debug_vm_pgtable(void)
 	pud_populate_tests(mm, pudp, saved_pmdp);
 	spin_unlock(ptl);
 
+#ifndef CONFIG_PPC_BOOK3S_64
 	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+#endif
 
 	spin_lock(&mm->page_table_lock);
 	p4d_clear_tests(mm, p4dp);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (11 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 12/13] mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64 Aneesh Kumar K.V
@ 2020-09-02 11:42 ` Aneesh Kumar K.V
  2020-09-11  2:13   ` Nathan Chancellor
  2020-10-11 20:02   ` Guenter Roeck
  2020-09-02 11:46 ` Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  16 siblings, 2 replies; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:42 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

pte_clear_tests operate on an existing pte entry. Make sure that
is not a none pte entry.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 9afa1354326b..c36530c69e33 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
 #endif /* PAGETABLE_P4D_FOLDED */
 
 static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
-				   unsigned long vaddr)
+				   unsigned long pfn, unsigned long vaddr,
+				   pgprot_t prot)
 {
-	pte_t pte = ptep_get(ptep);
+	pte_t pte = pfn_pte(pfn, prot);
 
 	pr_debug("Validating PTE clear\n");
 	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
@@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void)
 
 	ptl = pte_lockptr(mm, pmdp);
 	spin_lock(ptl);
-	pte_clear_tests(mm, ptep, vaddr);
+	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
 	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
 	pte_unmap_unlock(ptep, ptl);
 
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (12 preceding siblings ...)
  2020-09-02 11:42 ` [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test Aneesh Kumar K.V
@ 2020-09-02 11:46 ` Aneesh Kumar K.V
  2020-09-04  4:16   ` Anshuman Khandual
  2020-09-04  6:48 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Anshuman Khandual
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 11:46 UTC (permalink / raw)
  To: linux-mm, akpm; +Cc: linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

pte_clear_tests operate on an existing pte entry. Make sure that
is not a none pte entry.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 9afa1354326b..c36530c69e33 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
 #endif /* PAGETABLE_P4D_FOLDED */
 
 static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
-				   unsigned long vaddr)
+				   unsigned long pfn, unsigned long vaddr,
+				   pgprot_t prot)
 {
-	pte_t pte = ptep_get(ptep);
+	pte_t pte = pfn_pte(pfn, prot);
 
 	pr_debug("Validating PTE clear\n");
 	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
@@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void)
 
 	ptl = pte_lockptr(mm, pmdp);
 	spin_lock(ptl);
-	pte_clear_tests(mm, ptep, vaddr);
+	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
 	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
 	pte_unmap_unlock(ptep, ptl);
 
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 02/13] powerpc/mm: Move setting pte specific flags to pfn_pte
  2020-09-02 11:42 ` [PATCH v4 02/13] powerpc/mm: Move setting pte specific flags to pfn_pte Aneesh Kumar K.V
@ 2020-09-02 12:30   ` Christophe Leroy
  0 siblings, 0 replies; 52+ messages in thread
From: Christophe Leroy @ 2020-09-02 12:30 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev, Anshuman Khandual



Le 02/09/2020 à 13:42, Aneesh Kumar K.V a écrit :
> powerpc used to set the pte specific flags in set_pte_at(). This is
> different from other architectures. To be consistent with other
> architecture update pfn_pte to set _PAGE_PTE on ppc64. Also, drop now
> unused pte_mkpte.
> 
> We add a VM_WARN_ON() to catch the usage of calling set_pte_at()
> without setting _PAGE_PTE bit. We will remove that after a few releases.
> 
> With respect to huge pmd entries, pmd_mkhuge() takes care of adding the
> _PAGE_PTE bit.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>

Small nit below.

> ---
>   arch/powerpc/include/asm/book3s/64/pgtable.h | 15 +++++++++------
>   arch/powerpc/include/asm/nohash/pgtable.h    |  5 -----
>   arch/powerpc/mm/pgtable.c                    |  5 -----
>   3 files changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 079211968987..2382fd516f6b 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -823,6 +818,14 @@ static inline int pte_none(pte_t pte)
>   static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
>   				pte_t *ptep, pte_t pte, int percpu)
>   {
> +
> +	VM_WARN_ON(!(pte_raw(pte) & cpu_to_be64(_PAGE_PTE)));
> +	/*
> +	 * Keep the _PAGE_PTE added till we are sure we handle _PAGE_PTE
> +	 * in all the callers.
> +	 */
> +	 pte = __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PTE));

Wrong alignment, there is a leading space.

> +
>   	if (radix_enabled())
>   		return radix__set_pte_at(mm, addr, ptep, pte, percpu);
>   	return hash__set_pte_at(mm, addr, ptep, pte, percpu);

Christophe

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support.
  2020-09-02 11:42 ` [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support Aneesh Kumar K.V
@ 2020-09-02 12:40   ` Christophe Leroy
  2020-09-02 12:53     ` Aneesh Kumar K.V
  2020-09-04  4:08   ` Anshuman Khandual
  1 sibling, 1 reply; 52+ messages in thread
From: Christophe Leroy @ 2020-09-02 12:40 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev, Anshuman Khandual



Le 02/09/2020 à 13:42, Aneesh Kumar K.V a écrit :
> ppc64 supports huge vmap only with radix translation. Hence use arch helper
> to determine the huge vmap support.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>   mm/debug_vm_pgtable.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 00649b47f6e0..4c73e63b4ceb 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -28,6 +28,7 @@
>   #include <linux/swapops.h>
>   #include <linux/start_kernel.h>
>   #include <linux/sched/mm.h>
> +#include <linux/io.h>
>   #include <asm/pgalloc.h>
>   #include <asm/tlbflush.h>
>   
> @@ -206,11 +207,12 @@ static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
>   	WARN_ON(!pmd_leaf(pmd));
>   }
>   
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
>   static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>   {
>   	pmd_t pmd;
>   
> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
> +	if (!arch_ioremap_pmd_supported())

What about moving ioremap_pmd_enabled() from mm/ioremap.c to some .h, 
and using it ?
As ioremap_pmd_enabled() is defined at all time, no need of #ifdef

>   		return;
>   
>   	pr_debug("Validating PMD huge\n");
> @@ -224,6 +226,9 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>   	pmd = READ_ONCE(*pmdp);
>   	WARN_ON(!pmd_none(pmd));
>   }
> +#else /* CONFIG_HAVE_ARCH_HUGE_VMAP */
> +static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot) { }
> +#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
>   
>   static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>   {
> @@ -320,11 +325,12 @@ static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
>   	WARN_ON(!pud_leaf(pud));
>   }
>   
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
>   static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
>   {
>   	pud_t pud;
>   
> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
> +	if (!arch_ioremap_pud_supported())

What about moving ioremap_pud_enabled() from mm/ioremap.c to some .h, 
and using it ?
As ioremap_pud_enabled() is defined at all time, no need of #ifdef

>   		return;
>   
>   	pr_debug("Validating PUD huge\n");
> @@ -338,6 +344,10 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
>   	pud = READ_ONCE(*pudp);
>   	WARN_ON(!pud_none(pud));
>   }
> +#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
> +static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot) { }
> +#endif /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
> +
>   #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>   static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
>   static void __init pud_advanced_tests(struct mm_struct *mm,
> 

Christophe

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support.
  2020-09-02 12:40   ` Christophe Leroy
@ 2020-09-02 12:53     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-02 12:53 UTC (permalink / raw)
  To: Christophe Leroy, linux-mm, akpm; +Cc: linuxppc-dev, Anshuman Khandual

On 9/2/20 6:10 PM, Christophe Leroy wrote:
> 
> 
> Le 02/09/2020 à 13:42, Aneesh Kumar K.V a écrit :
>> ppc64 supports huge vmap only with radix translation. Hence use arch 
>> helper
>> to determine the huge vmap support.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   mm/debug_vm_pgtable.c | 14 ++++++++++++--
>>   1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 00649b47f6e0..4c73e63b4ceb 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -28,6 +28,7 @@
>>   #include <linux/swapops.h>
>>   #include <linux/start_kernel.h>
>>   #include <linux/sched/mm.h>
>> +#include <linux/io.h>
>>   #include <asm/pgalloc.h>
>>   #include <asm/tlbflush.h>
>> @@ -206,11 +207,12 @@ static void __init pmd_leaf_tests(unsigned long 
>> pfn, pgprot_t prot)
>>       WARN_ON(!pmd_leaf(pmd));
>>   }
>> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
>>   static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, 
>> pgprot_t prot)
>>   {
>>       pmd_t pmd;
>> -    if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
>> +    if (!arch_ioremap_pmd_supported())
> 
> What about moving ioremap_pmd_enabled() from mm/ioremap.c to some .h, 
> and using it ?
> As ioremap_pmd_enabled() is defined at all time, no need of #ifdef
> 

yes. This was discussed earlier too. IMHO we should do that outside this 
series. I guess figuring out ioremap_pmd/pud support can definitely be 
simplified. With a generic version like

#ifndef arch_ioremap_pmd_supported
static inline bool arch_ioremap_pmd_supported(void)
{
	return false;
}
#endif


>>           return;
>>       pr_debug("Validating PMD huge\n");
>> @@ -224,6 +226,9 @@ static void __init pmd_huge_tests(pmd_t *pmdp, 
>> unsigned long pfn, pgprot_t prot)
>>       pmd = READ_ONCE(*pmdp);
>>       WARN_ON(!pmd_none(pmd));
>>   }
>> +#else /* CONFIG_HAVE_ARCH_HUGE_VMAP */
>> +static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, 
>> pgprot_t prot) { }
>> +#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
>>   static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t 
>> prot)
>>   {
>> @@ -320,11 +325,12 @@ static void __init pud_leaf_tests(unsigned long 
>> pfn, pgprot_t prot)
>>       WARN_ON(!pud_leaf(pud));
>>   }
>> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
>>   static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, 
>> pgprot_t prot)
>>   {
>>       pud_t pud;
>> -    if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
>> +    if (!arch_ioremap_pud_supported())
> 
> What about moving ioremap_pud_enabled() from mm/ioremap.c to some .h, 
> and using it ?
> As ioremap_pud_enabled() is defined at all time, no need of #ifdef
> 
>>           return;
>>       pr_debug("Validating PUD huge\n");
>> @@ -338,6 +344,10 @@ static void __init pud_huge_tests(pud_t *pudp, 
>> unsigned long pfn, pgprot_t prot)
>>       pud = READ_ONCE(*pudp);
>>       WARN_ON(!pud_none(pud));
>>   }
>> +#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
>> +static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, 
>> pgprot_t prot) { }
>> +#endif /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
>> +
>>   #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>>   static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) 
>> { }
>>   static void __init pud_advanced_tests(struct mm_struct *mm,
>>
> 
> Christophe


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 08/13] mm/debug_vm_pgtable/locks: Move non page table modifying test together
  2020-09-02 11:42 ` [PATCH v4 08/13] mm/debug_vm_pgtable/locks: Move non page table modifying test together Aneesh Kumar K.V
@ 2020-09-04  3:58   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  3:58 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> This will help in adding proper locks in a later patch
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 51 ++++++++++++++++++++++++-------------------
>  1 file changed, 28 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index de333871f407..f59cf6a9b05e 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -986,7 +986,7 @@ static int __init debug_vm_pgtable(void)
>  	p4dp = p4d_alloc(mm, pgdp, vaddr);
>  	pudp = pud_alloc(mm, p4dp, vaddr);
>  	pmdp = pmd_alloc(mm, pudp, vaddr);
> -	ptep = pte_alloc_map_lock(mm, pmdp, vaddr, &ptl);
> +	ptep = pte_alloc_map(mm, pmdp, vaddr);
>  
>  	/*
>  	 * Save all the page table page addresses as the page table
> @@ -1006,33 +1006,12 @@ static int __init debug_vm_pgtable(void)
>  	p4d_basic_tests(p4d_aligned, prot);
>  	pgd_basic_tests(pgd_aligned, prot);
>  
> -	pte_clear_tests(mm, ptep, vaddr);
> -	pmd_clear_tests(mm, pmdp);
> -	pud_clear_tests(mm, pudp);
> -	p4d_clear_tests(mm, p4dp);
> -	pgd_clear_tests(mm, pgdp);
> -
> -	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> -	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
> -	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
> -	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> -
>  	pmd_leaf_tests(pmd_aligned, prot);
>  	pud_leaf_tests(pud_aligned, prot);
>  
> -	pmd_huge_tests(pmdp, pmd_aligned, prot);
> -	pud_huge_tests(pudp, pud_aligned, prot);
> -
>  	pte_savedwrite_tests(pte_aligned, protnone);
>  	pmd_savedwrite_tests(pmd_aligned, protnone);
>  
> -	pte_unmap_unlock(ptep, ptl);
> -
> -	pmd_populate_tests(mm, pmdp, saved_ptep);
> -	pud_populate_tests(mm, pudp, saved_pmdp);
> -	p4d_populate_tests(mm, p4dp, saved_pudp);
> -	pgd_populate_tests(mm, pgdp, saved_p4dp);
> -
>  	pte_special_tests(pte_aligned, prot);
>  	pte_protnone_tests(pte_aligned, protnone);
>  	pmd_protnone_tests(pmd_aligned, protnone);
> @@ -1050,11 +1029,37 @@ static int __init debug_vm_pgtable(void)
>  	pmd_swap_tests(pmd_aligned, prot);
>  
>  	swap_migration_tests();
> -	hugetlb_basic_tests(pte_aligned, prot);
>  
>  	pmd_thp_tests(pmd_aligned, prot);
>  	pud_thp_tests(pud_aligned, prot);
>  
> +	hugetlb_basic_tests(pte_aligned, prot);
> +
> +	pte_clear_tests(mm, ptep, vaddr);
> +	pmd_clear_tests(mm, pmdp);
> +	pud_clear_tests(mm, pudp);
> +	p4d_clear_tests(mm, p4dp);
> +	pgd_clear_tests(mm, pgdp);
> +
> +	ptl = pte_lockptr(mm, pmdp);
> +	spin_lock(ptl);
> +
> +	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> +	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
> +	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
> +	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> +
> +
> +	pmd_huge_tests(pmdp, pmd_aligned, prot);
> +	pud_huge_tests(pudp, pud_aligned, prot);
> +
> +	pte_unmap_unlock(ptep, ptl);
> +
> +	pmd_populate_tests(mm, pmdp, saved_ptep);
> +	pud_populate_tests(mm, pudp, saved_pmdp);
> +	p4d_populate_tests(mm, p4dp, saved_pudp);
> +	pgd_populate_tests(mm, pgdp, saved_p4dp);
> +
>  	p4d_free(mm, saved_p4dp);
>  	pud_free(mm, saved_pudp);
>  	pmd_free(mm, saved_pmdp);
>

Patches applied till here [i.e PATCH_1..PATCH_8] does not boot on arm64
platform, which might create a potential git bisect problem later on.

static void __init pte_advanced_tests(struct mm_struct *mm,
                                      struct vm_area_struct *vma, pte_t *ptep,
                                      unsigned long pfn, unsigned long vaddr,
                                      pgprot_t prot)
{
        pte_t pte = pfn_pte(pfn, prot);

        /*
         * Architectures optimize set_pte_at by avoiding TLB flush.
         * This requires set_pte_at to be not used to update an
         * existing pte entry. Clear pte before we do set_pte_at
         */

        pr_debug("Validating PTE advanced\n");
        pte = pfn_pte(pfn, prot);
        set_pte_at(mm, vaddr, ptep, pte);   ----------> Crashed here.
............
............

[   19.031831] Unable to handle kernel paging request at virtual address fffffdfffde00028
[   19.033181] Mem abort info:
[   19.033627]   ESR = 0x96000006
[   19.034129]   EC = 0x25: DABT (current EL), IL = 32 bits
[   19.035002]   SET = 0, FnV = 0
[   19.035523]   EA = 0, S1PTW = 0
[   19.036054] Data abort info:
[   19.036538]   ISV = 0, ISS = 0x00000006
[   19.037170]   CM = 0, WnR = 0
[   19.037662] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000008158b000
[   19.038749] [fffffdfffde00028] pgd=0000000081d69003, p4d=0000000081d69003, pud=0000000081d6a003, pmd=0000000000000000
[   19.040560] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[   19.041467] Modules linked in:
[   19.041968] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W         5.9.0-rc3-00231-gdef09f62540f #62
[   19.043472] Hardware name: linux,dummy-virt (DT)
[   19.044292] pstate: 20400005 (nzCv daif +PAN -UAO BTYPE=--)
[   19.045235] pc : _raw_spin_lock+0x34/0x70
[   19.045944] lr : debug_vm_pgtable+0x3c8/0x8b8
[   19.046713] sp : ffff80001219bcf0
[   19.047295] x29: ffff80001219bcf0 x28: ffff8000114ac000 
[   19.048224] x27: ffff0005df61f080 x26: ffff8000114ac000 
[   19.049148] x25: 0020000000000fd3 x24: 0020000081400fd3 
[   19.050076] x23: 00003df483b17000 x22: ffff0005ddd63e90 
[   19.051011] x21: ffff0005dddc8000 x20: ffff0005ddd5f0e8 
[   19.051937] x19: ffff0005de3718b8 x18: 0000000000000010 
[   19.052846] x17: 00000000dbb99b48 x16: 00000000cc138a43 
[   19.053774] x15: 0000000000000001 x14: 0000000000000002 
[   19.054703] x13: 000000000055e46d x12: fffffe0017577200 
[   19.055630] x11: 0000000000000008 x10: ffff0005fc455210 
[   19.056553] x9 : ffff0005fc455210 x8 : 0000000000000010 
[   19.057484] x7 : ffff8005ead60000 x6 : ffff0005fcfcee00 
[   19.058410] x5 : ffff0005fc455200 x4 : 0000000000000000 
[   19.059344] x3 : fffffdfffde00028 x2 : 0000000000000001 
[   19.060272] x1 : 0000000000000000 x0 : fffffdfffde00028 
[   19.061203] Call trace:
[   19.061644]  _raw_spin_lock+0x34/0x70
[   19.062289]  debug_vm_pgtable+0x3c8/0x8b8
[   19.063001]  do_one_initcall+0x74/0x1cc
[   19.063680]  kernel_init_freeable+0x1d0/0x238
[   19.064448]  kernel_init+0x14/0x118
[   19.065067]  ret_from_fork+0x10/0x34
[   19.065703] Code: d503201f 52800001 52800022 2a0103e4 (88e47c02) 
[   19.066835] ---[ end trace ff33eeb13d2a13af ]---

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 03/13] mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value
  2020-09-02 11:42 ` [PATCH v4 03/13] mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value Aneesh Kumar K.V
@ 2020-09-04  4:03   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  4:03 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> ppc64 use bit 62 to indicate a pte entry (_PAGE_PTE). Avoid setting
> that bit in random value.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 086309fb9b6f..00649b47f6e0 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -44,10 +44,17 @@
>   * entry type. But these bits might affect the ability to clear entries with
>   * pxx_clear() because of how dynamic page table folding works on s390. So
>   * while loading up the entries do not change the lower 4 bits. It does not
> - * have affect any other platform.
> + * have affect any other platform. Also avoid the 62nd bit on ppc64 that is
> + * used to mark a pte entry.
>   */
> -#define S390_MASK_BITS	4
> -#define RANDOM_ORVALUE	GENMASK(BITS_PER_LONG - 1, S390_MASK_BITS)
> +#define S390_SKIP_MASK		GENMASK(3, 0)
> +#if __BITS_PER_LONG == 64
> +#define PPC64_SKIP_MASK		GENMASK(62, 62)
> +#else
> +#define PPC64_SKIP_MASK		0x0
> +#endif
> +#define ARCH_SKIP_MASK (S390_SKIP_MASK | PPC64_SKIP_MASK)
> +#define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
>  #define RANDOM_NZVALUE	GENMASK(7, 0)
>  
>  static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support.
  2020-09-02 11:42 ` [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support Aneesh Kumar K.V
  2020-09-02 12:40   ` Christophe Leroy
@ 2020-09-04  4:08   ` Anshuman Khandual
  1 sibling, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  4:08 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> ppc64 supports huge vmap only with radix translation. Hence use arch helper
> to determine the huge vmap support.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 00649b47f6e0..4c73e63b4ceb 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -28,6 +28,7 @@
>  #include <linux/swapops.h>
>  #include <linux/start_kernel.h>
>  #include <linux/sched/mm.h>
> +#include <linux/io.h>
>  #include <asm/pgalloc.h>
>  #include <asm/tlbflush.h>
>  
> @@ -206,11 +207,12 @@ static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
>  	WARN_ON(!pmd_leaf(pmd));
>  }
>  
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
>  static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>  {
>  	pmd_t pmd;
>  
> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
> +	if (!arch_ioremap_pmd_supported())
>  		return;
>  
>  	pr_debug("Validating PMD huge\n");
> @@ -224,6 +226,9 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(!pmd_none(pmd));
>  }
> +#else /* CONFIG_HAVE_ARCH_HUGE_VMAP */
> +static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot) { }
> +#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
>  
>  static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>  {
> @@ -320,11 +325,12 @@ static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
>  	WARN_ON(!pud_leaf(pud));
>  }
>  
> +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
>  static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
>  {
>  	pud_t pud;
>  
> -	if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP))
> +	if (!arch_ioremap_pud_supported())
>  		return;
>  
>  	pr_debug("Validating PUD huge\n");
> @@ -338,6 +344,10 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(!pud_none(pud));
>  }
> +#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
> +static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot) { }
> +#endif /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
> +
>  #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>  static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
>  static void __init pud_advanced_tests(struct mm_struct *mm,
> 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 05/13] mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with CONFIG_NUMA_BALANCING
  2020-09-02 11:42 ` [PATCH v4 05/13] mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with CONFIG_NUMA_BALANCING Aneesh Kumar K.V
@ 2020-09-04  4:09   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  4:09 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> Saved write support was added to track the write bit of a pte after
> marking the pte protnone. This was done so that AUTONUMA can convert
> a write pte to protnone and still track the old write bit. When converting
> it back we set the pte write bit correctly thereby avoiding a write fault
> again. Hence enable the test only when CONFIG_NUMA_BALANCING is enabled and
> use protnone protflags.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 4c73e63b4ceb..8704901f6bd8 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -119,10 +119,14 @@ static void __init pte_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>  {
>  	pte_t pte = pfn_pte(pfn, prot);
>  
> +	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
> +		return;
> +
>  	pr_debug("Validating PTE saved write\n");
>  	WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
>  	WARN_ON(pte_savedwrite(pte_clear_savedwrite(pte_mk_savedwrite(pte))));
>  }
> +
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>  {
> @@ -234,6 +238,9 @@ static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>  {
>  	pmd_t pmd = pfn_pmd(pfn, prot);
>  
> +	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
> +		return;
> +
>  	pr_debug("Validating PMD saved write\n");
>  	WARN_ON(!pmd_savedwrite(pmd_mk_savedwrite(pmd_clear_savedwrite(pmd))));
>  	WARN_ON(pmd_savedwrite(pmd_clear_savedwrite(pmd_mk_savedwrite(pmd))));
> @@ -1019,8 +1026,8 @@ static int __init debug_vm_pgtable(void)
>  	pmd_huge_tests(pmdp, pmd_aligned, prot);
>  	pud_huge_tests(pudp, pud_aligned, prot);
>  
> -	pte_savedwrite_tests(pte_aligned, prot);
> -	pmd_savedwrite_tests(pmd_aligned, prot);
> +	pte_savedwrite_tests(pte_aligned, protnone);
> +	pmd_savedwrite_tests(pmd_aligned, protnone);
>  
>  	pte_unmap_unlock(ptep, ptl);
>  
> 

No more checkpatch.pl warnings.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-02 11:46 ` Aneesh Kumar K.V
@ 2020-09-04  4:16   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  4:16 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:16 PM, Aneesh Kumar K.V wrote:
> pte_clear_tests operate on an existing pte entry. Make sure that
> is not a none pte entry.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 9afa1354326b..c36530c69e33 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>  #endif /* PAGETABLE_P4D_FOLDED */
>  
>  static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
> -				   unsigned long vaddr)
> +				   unsigned long pfn, unsigned long vaddr,
> +				   pgprot_t prot)
>  {
> -	pte_t pte = ptep_get(ptep);
> +	pte_t pte = pfn_pte(pfn, prot);
>  
>  	pr_debug("Validating PTE clear\n");
>  	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> @@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void)
>  
>  	ptl = pte_lockptr(mm, pmdp);
>  	spin_lock(ptl);
> -	pte_clear_tests(mm, ptep, vaddr);
> +	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
>  	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>  	pte_unmap_unlock(ptep, ptl);
>  
> 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 10/13] mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP
  2020-09-02 11:42 ` [PATCH v4 10/13] mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP Aneesh Kumar K.V
@ 2020-09-04  4:21   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  4:21 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> Architectures like ppc64 use deposited page table while updating the
> huge pte entries.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 2bc1952e5f83..26023d990bd0 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -154,7 +154,7 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>  static void __init pmd_advanced_tests(struct mm_struct *mm,
>  				      struct vm_area_struct *vma, pmd_t *pmdp,
>  				      unsigned long pfn, unsigned long vaddr,
> -				      pgprot_t prot)
> +				      pgprot_t prot, pgtable_t pgtable)
>  {
>  	pmd_t pmd;
>  
> @@ -165,6 +165,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  	/* Align the address wrt HPAGE_PMD_SIZE */
>  	vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
>  
> +	pgtable_trans_huge_deposit(mm, pmdp, pgtable);
> +
>  	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>  	set_pmd_at(mm, vaddr, pmdp, pmd);
>  	pmdp_set_wrprotect(mm, vaddr, pmdp);
> @@ -193,6 +195,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  	pmdp_test_and_clear_young(vma, vaddr, pmdp);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(pmd_young(pmd));
> +
> +	pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
>  }
>  
>  static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
> @@ -371,7 +375,7 @@ static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
>  static void __init pmd_advanced_tests(struct mm_struct *mm,
>  				      struct vm_area_struct *vma, pmd_t *pmdp,
>  				      unsigned long pfn, unsigned long vaddr,
> -				      pgprot_t prot)
> +				      pgprot_t prot, pgtable_t pgtable)
>  {
>  }
>  static void __init pud_advanced_tests(struct mm_struct *mm,
> @@ -1048,7 +1052,7 @@ static int __init debug_vm_pgtable(void)
>  
>  	ptl = pmd_lock(mm, pmdp);
>  	pmd_clear_tests(mm, pmdp);
> -	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
> +	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
>  	pmd_huge_tests(pmdp, pmd_aligned, prot);
>  	pmd_populate_tests(mm, pmdp, saved_ptep);
>  	spin_unlock(ptl);
> 

Moving this down further in the series has not really helped the possible
git bisect problem on arm64. Nonetheless, the patch in itself, makes sense.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 06/13] mm/debug_vm_pgtable/THP: Mark the pte entry huge before using set_pmd/pud_at
  2020-09-02 11:42 ` [PATCH v4 06/13] mm/debug_vm_pgtable/THP: Mark the pte entry huge before using set_pmd/pud_at Aneesh Kumar K.V
@ 2020-09-04  5:21   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  5:21 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> kernel expects entries to be marked huge before we use
> set_pmd_at()/set_pud_at().
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 8704901f6bd8..9cafed39c236 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -155,7 +155,7 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  				      unsigned long pfn, unsigned long vaddr,
>  				      pgprot_t prot)
>  {
> -	pmd_t pmd = pfn_pmd(pfn, prot);
> +	pmd_t pmd;
>  
>  	if (!has_transparent_hugepage())
>  		return;
> @@ -164,19 +164,19 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  	/* Align the address wrt HPAGE_PMD_SIZE */
>  	vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
>  
> -	pmd = pfn_pmd(pfn, prot);
> +	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>  	set_pmd_at(mm, vaddr, pmdp, pmd);
>  	pmdp_set_wrprotect(mm, vaddr, pmdp);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(pmd_write(pmd));
>  
> -	pmd = pfn_pmd(pfn, prot);
> +	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>  	set_pmd_at(mm, vaddr, pmdp, pmd);
>  	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(!pmd_none(pmd));
>  
> -	pmd = pfn_pmd(pfn, prot);
> +	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>  	pmd = pmd_wrprotect(pmd);
>  	pmd = pmd_mkclean(pmd);
>  	set_pmd_at(mm, vaddr, pmdp, pmd);
> @@ -236,7 +236,7 @@ static void __init pmd_huge_tests(pmd_t *pmdp, unsigned long pfn, pgprot_t prot)
>  
>  static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>  {
> -	pmd_t pmd = pfn_pmd(pfn, prot);
> +	pmd_t pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>  
>  	if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
>  		return;
> @@ -276,7 +276,7 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>  				      unsigned long pfn, unsigned long vaddr,
>  				      pgprot_t prot)
>  {
> -	pud_t pud = pfn_pud(pfn, prot);
> +	pud_t pud;
>  
>  	if (!has_transparent_hugepage())
>  		return;
> @@ -285,25 +285,27 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>  	/* Align the address wrt HPAGE_PUD_SIZE */
>  	vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;
>  
> +	pud = pud_mkhuge(pfn_pud(pfn, prot));
>  	set_pud_at(mm, vaddr, pudp, pud);
>  	pudp_set_wrprotect(mm, vaddr, pudp);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(pud_write(pud));
>  
>  #ifndef __PAGETABLE_PMD_FOLDED
> -	pud = pfn_pud(pfn, prot);
> +	pud = pud_mkhuge(pfn_pud(pfn, prot));
>  	set_pud_at(mm, vaddr, pudp, pud);
>  	pudp_huge_get_and_clear(mm, vaddr, pudp);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(!pud_none(pud));
>  
> -	pud = pfn_pud(pfn, prot);
> +	pud = pud_mkhuge(pfn_pud(pfn, prot));
>  	set_pud_at(mm, vaddr, pudp, pud);
>  	pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(!pud_none(pud));
>  #endif /* __PAGETABLE_PMD_FOLDED */
> -	pud = pfn_pud(pfn, prot);
> +
> +	pud = pud_mkhuge(pfn_pud(pfn, prot));
>  	pud = pud_wrprotect(pud);
>  	pud = pud_mkclean(pud);
>  	set_pud_at(mm, vaddr, pudp, pud);
> 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry
  2020-09-02 11:42 ` [PATCH v4 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry Aneesh Kumar K.V
@ 2020-09-04  5:34   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  5:34 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> set_pte_at() should not be used to set a pte entry at locations that
> already holds a valid pte entry. Architectures like ppc64 don't do TLB
> invalidate in set_pte_at() and hence expect it to be used to set locations
> that are not a valid PTE.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 35 +++++++++++++++--------------------
>  1 file changed, 15 insertions(+), 20 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 9cafed39c236..de333871f407 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -79,15 +79,18 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
>  {
>  	pte_t pte = pfn_pte(pfn, prot);
>  
> +	/*
> +	 * Architectures optimize set_pte_at by avoiding TLB flush.
> +	 * This requires set_pte_at to be not used to update an
> +	 * existing pte entry. Clear pte before we do set_pte_at
> +	 */
> +
>  	pr_debug("Validating PTE advanced\n");
>  	pte = pfn_pte(pfn, prot);
>  	set_pte_at(mm, vaddr, ptep, pte);
>  	ptep_set_wrprotect(mm, vaddr, ptep);
>  	pte = ptep_get(ptep);
>  	WARN_ON(pte_write(pte));
> -
> -	pte = pfn_pte(pfn, prot);
> -	set_pte_at(mm, vaddr, ptep, pte);
>  	ptep_get_and_clear(mm, vaddr, ptep);
>  	pte = ptep_get(ptep);
>  	WARN_ON(!pte_none(pte));
> @@ -101,13 +104,11 @@ static void __init pte_advanced_tests(struct mm_struct *mm,
>  	ptep_set_access_flags(vma, vaddr, ptep, pte, 1);
>  	pte = ptep_get(ptep);
>  	WARN_ON(!(pte_write(pte) && pte_dirty(pte)));
> -
> -	pte = pfn_pte(pfn, prot);
> -	set_pte_at(mm, vaddr, ptep, pte);
>  	ptep_get_and_clear_full(mm, vaddr, ptep, 1);
>  	pte = ptep_get(ptep);
>  	WARN_ON(!pte_none(pte));
>  
> +	pte = pfn_pte(pfn, prot);
>  	pte = pte_mkyoung(pte);
>  	set_pte_at(mm, vaddr, ptep, pte);
>  	ptep_test_and_clear_young(vma, vaddr, ptep);
> @@ -169,9 +170,6 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  	pmdp_set_wrprotect(mm, vaddr, pmdp);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(pmd_write(pmd));
> -
> -	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
> -	set_pmd_at(mm, vaddr, pmdp, pmd);
>  	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(!pmd_none(pmd));
> @@ -185,13 +183,11 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  	pmdp_set_access_flags(vma, vaddr, pmdp, pmd, 1);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(!(pmd_write(pmd) && pmd_dirty(pmd)));
> -
> -	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
> -	set_pmd_at(mm, vaddr, pmdp, pmd);
>  	pmdp_huge_get_and_clear_full(vma, vaddr, pmdp, 1);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(!pmd_none(pmd));
>  
> +	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>  	pmd = pmd_mkyoung(pmd);
>  	set_pmd_at(mm, vaddr, pmdp, pmd);
>  	pmdp_test_and_clear_young(vma, vaddr, pmdp);
> @@ -292,17 +288,9 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>  	WARN_ON(pud_write(pud));
>  
>  #ifndef __PAGETABLE_PMD_FOLDED
> -	pud = pud_mkhuge(pfn_pud(pfn, prot));
> -	set_pud_at(mm, vaddr, pudp, pud);
>  	pudp_huge_get_and_clear(mm, vaddr, pudp);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(!pud_none(pud));
> -
> -	pud = pud_mkhuge(pfn_pud(pfn, prot));
> -	set_pud_at(mm, vaddr, pudp, pud);
> -	pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
> -	pud = READ_ONCE(*pudp);
> -	WARN_ON(!pud_none(pud));
>  #endif /* __PAGETABLE_PMD_FOLDED */
>  
>  	pud = pud_mkhuge(pfn_pud(pfn, prot));
> @@ -315,6 +303,13 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(!(pud_write(pud) && pud_dirty(pud)));
>  
> +#ifndef __PAGETABLE_PMD_FOLDED
> +	pudp_huge_get_and_clear_full(mm, vaddr, pudp, 1);
> +	pud = READ_ONCE(*pudp);
> +	WARN_ON(!pud_none(pud));
> +#endif /* __PAGETABLE_PMD_FOLDED */
> +
> +	pud = pud_mkhuge(pfn_pud(pfn, prot));
>  	pud = pud_mkyoung(pud);
>  	set_pud_at(mm, vaddr, pudp, pud);
>  	pudp_test_and_clear_young(vma, vaddr, pudp);
> 

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 09/13] mm/debug_vm_pgtable/locks: Take correct page table lock
  2020-09-02 11:42 ` [PATCH v4 09/13] mm/debug_vm_pgtable/locks: Take correct page table lock Aneesh Kumar K.V
@ 2020-09-04  5:39   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  5:39 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> Make sure we call pte accessors with correct lock held.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 35 ++++++++++++++++++++++-------------
>  1 file changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index f59cf6a9b05e..2bc1952e5f83 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -1035,30 +1035,39 @@ static int __init debug_vm_pgtable(void)
>  
>  	hugetlb_basic_tests(pte_aligned, prot);
>  
> -	pte_clear_tests(mm, ptep, vaddr);
> -	pmd_clear_tests(mm, pmdp);
> -	pud_clear_tests(mm, pudp);
> -	p4d_clear_tests(mm, p4dp);
> -	pgd_clear_tests(mm, pgdp);
> +	/*
> +	 * Page table modifying tests. They need to hold
> +	 * proper page table lock.
> +	 */
>  
>  	ptl = pte_lockptr(mm, pmdp);
>  	spin_lock(ptl);
> -
> +	pte_clear_tests(mm, ptep, vaddr);
>  	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> -	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
> -	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
> -	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> -
> +	pte_unmap_unlock(ptep, ptl);
>  
> +	ptl = pmd_lock(mm, pmdp);
> +	pmd_clear_tests(mm, pmdp);
> +	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
>  	pmd_huge_tests(pmdp, pmd_aligned, prot);
> +	pmd_populate_tests(mm, pmdp, saved_ptep);
> +	spin_unlock(ptl);
> +
> +	ptl = pud_lock(mm, pudp);
> +	pud_clear_tests(mm, pudp);
> +	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
>  	pud_huge_tests(pudp, pud_aligned, prot);
> +	pud_populate_tests(mm, pudp, saved_pmdp);
> +	spin_unlock(ptl);
>  
> -	pte_unmap_unlock(ptep, ptl);
> +	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>  
> -	pmd_populate_tests(mm, pmdp, saved_ptep);
> -	pud_populate_tests(mm, pudp, saved_pmdp);
> +	spin_lock(&mm->page_table_lock);
> +	p4d_clear_tests(mm, p4dp);
> +	pgd_clear_tests(mm, pgdp);
>  	p4d_populate_tests(mm, p4dp, saved_pudp);
>  	pgd_populate_tests(mm, pgdp, saved_p4dp);
> +	spin_unlock(&mm->page_table_lock);
>  
>  	p4d_free(mm, saved_p4dp);
>  	pud_free(mm, saved_pudp);
> 

This patch, in itself looks good but will probably require folding with
the previous one to prevent the git bisect problem on arm64.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 11/13] mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
  2020-09-02 11:42 ` [PATCH v4 11/13] mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries Aneesh Kumar K.V
@ 2020-09-04  6:03   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  6:03 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> pmd_clear() should not be used to clear pmd level pte entries.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 26023d990bd0..b53903fdee85 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -196,6 +196,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(pmd_young(pmd));
>  
> +	/*  Clear the pte entries  */
> +	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
>  	pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
>  }
>  
> @@ -319,6 +321,8 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>  	pudp_test_and_clear_young(vma, vaddr, pudp);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(pud_young(pud));
> +
> +	pudp_huge_get_and_clear(mm, vaddr, pudp);
>  }
>  
>  static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
> @@ -442,8 +446,6 @@ static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
>  	 * This entry points to next level page table page.
>  	 * Hence this must not qualify as pud_bad().
>  	 */
> -	pmd_clear(pmdp);
> -	pud_clear(pudp);
>  	pud_populate(mm, pudp, pmdp);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(pud_bad(pud));
> @@ -575,7 +577,6 @@ static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
>  	 * This entry points to next level page table page.
>  	 * Hence this must not qualify as pmd_bad().
>  	 */
> -	pmd_clear(pmdp);
>  	pmd_populate(mm, pmdp, pgtable);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(pmd_bad(pmd));
> 

Why pxxp_huge_get_and_clear() cannot be called inside pxx_populate_tests()
functions itself ? Nonetheless, this does not seem to cause any problem.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 12/13] mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64
  2020-09-02 11:42 ` [PATCH v4 12/13] mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64 Aneesh Kumar K.V
@ 2020-09-04  6:19   ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  6:19 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> The seems to be missing quite a lot of details w.r.t allocating
> the correct pgtable_t page (huge_pte_alloc()), holding the right
> lock (huge_pte_lock()) etc. The vma used is also not a hugetlb VMA.
> 
> ppc64 do have runtime checks within CONFIG_DEBUG_VM for most of these.
> Hence disable the test on ppc64.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index b53903fdee85..9afa1354326b 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -811,6 +811,7 @@ static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot)
>  #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */
>  }
>  
> +#ifndef CONFIG_PPC_BOOK3S_64
>  static void __init hugetlb_advanced_tests(struct mm_struct *mm,
>  					  struct vm_area_struct *vma,
>  					  pte_t *ptep, unsigned long pfn,
> @@ -853,6 +854,7 @@ static void __init hugetlb_advanced_tests(struct mm_struct *mm,
>  	pte = huge_ptep_get(ptep);
>  	WARN_ON(!(huge_pte_write(pte) && huge_pte_dirty(pte)));
>  }
> +#endif
>  #else  /* !CONFIG_HUGETLB_PAGE */
>  static void __init hugetlb_basic_tests(unsigned long pfn, pgprot_t prot) { }
>  static void __init hugetlb_advanced_tests(struct mm_struct *mm,
> @@ -1065,7 +1067,9 @@ static int __init debug_vm_pgtable(void)
>  	pud_populate_tests(mm, pudp, saved_pmdp);
>  	spin_unlock(ptl);
>  
> +#ifndef CONFIG_PPC_BOOK3S_64
>  	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> +#endif
>  
>  	spin_lock(&mm->page_table_lock);
>  	p4d_clear_tests(mm, p4dp);
> 

Is it still required now that DEBUG_VM_PGTABLE has been dropped from powerpc
or you would like to re-enabled it back ?

https://lore.kernel.org/linuxppc-dev/159913592797.5893.5829441560236719450.b4-ty@ellerman.id.au/T/#m6d890e2fe84cf180cb875fae5f791e9c83db8d30

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (13 preceding siblings ...)
  2020-09-02 11:46 ` Aneesh Kumar K.V
@ 2020-09-04  6:48 ` Anshuman Khandual
  2020-09-04 15:26   ` Gerald Schaefer
  2020-09-13 11:03 ` [PATCH] mm/debug_vm_pgtable: Avoid doing memory allocation with pgtable_t mapped Aneesh Kumar K.V
  2020-10-13 20:58 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Andrew Morton
  16 siblings, 1 reply; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-04  6:48 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm
  Cc: linux-s390, Vineet Gupta, linux-riscv, linux-snps-arc,
	linuxppc-dev, Gerald Schaefer



On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> This patch series includes fixes for debug_vm_pgtable test code so that
> they follow page table updates rules correctly. The first two patches introduce
> changes w.r.t ppc64. The patches are included in this series for completeness. We can
> merge them via ppc64 tree if required.
> 
> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
> page table update rules.
> 
> These tests are broken w.r.t page table update rules and results in kernel
> crash as below. 
> 
> [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
>     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
>     lr: c0000000005eeeec: pte_update+0x11c/0x190
>     sp: c000000c6d1e7950
>    msr: 8000000002029033
>   current = 0xc000000c6d172c80
>   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
>     pid   = 1, comm = swapper/0
> kernel BUG at arch/powerpc/mm/pgtable.c:304!
> [link register   ] c0000000005eeeec pte_update+0x11c/0x190
> [c000000c6d1e7950] 0000000000000001 (unreliable)
> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> 
> With DEBUG_VM disabled
> 
> [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
> [   20.530183] Faulting instruction address: 0xc0000000000df330
> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
>     pc: c0000000000df330: memset+0x68/0x104
>     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>     sp: c000000c6d19f990
>    msr: 8000000002009033
>    dar: 0
>   current = 0xc000000c6d177480
>   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
>     pid   = 1, comm = swapper/0
> [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> 
> Changes from v3:
> * Address review feedback
> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.

This version

- Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
- Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
- Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well

+ linux-riscv <linux-riscv@lists.infradead.org>
+ linux-snps-arc@lists.infradead.org <linux-snps-arc@lists.infradead.org>
+ linux-s390@vger.kernel.org
+ Gerald Schaefer <gerald.schaefer@de.ibm.com>
+ Vineet Gupta <vgupta@synopsys.com>

There is still an open git bisect issue on arm64 platform which ideally should be fixed.

- Anshuman

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-04  6:48 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Anshuman Khandual
@ 2020-09-04 15:26   ` Gerald Schaefer
  2020-09-04 16:01     ` Gerald Schaefer
  2020-09-09  8:08     ` Anshuman Khandual
  0 siblings, 2 replies; 52+ messages in thread
From: Gerald Schaefer @ 2020-09-04 15:26 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Fri, 4 Sep 2020 12:18:05 +0530
Anshuman Khandual <anshuman.khandual@arm.com> wrote:

> 
> 
> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> > This patch series includes fixes for debug_vm_pgtable test code so that
> > they follow page table updates rules correctly. The first two patches introduce
> > changes w.r.t ppc64. The patches are included in this series for completeness. We can
> > merge them via ppc64 tree if required.
> > 
> > Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
> > page table update rules.
> > 
> > These tests are broken w.r.t page table update rules and results in kernel
> > crash as below. 
> > 
> > [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
> >     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
> >     lr: c0000000005eeeec: pte_update+0x11c/0x190
> >     sp: c000000c6d1e7950
> >    msr: 8000000002029033
> >   current = 0xc000000c6d172c80
> >   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
> >     pid   = 1, comm = swapper/0
> > kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > [link register   ] c0000000005eeeec pte_update+0x11c/0x190
> > [c000000c6d1e7950] 0000000000000001 (unreliable)
> > [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
> > [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
> > [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
> > [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> > [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
> > [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
> > [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > 
> > With DEBUG_VM disabled
> > 
> > [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
> > [   20.530183] Faulting instruction address: 0xc0000000000df330
> > cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
> >     pc: c0000000000df330: memset+0x68/0x104
> >     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> >     sp: c000000c6d19f990
> >    msr: 8000000002009033
> >    dar: 0
> >   current = 0xc000000c6d177480
> >   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
> >     pid   = 1, comm = swapper/0
> > [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> > [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
> > [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
> > [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> > [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
> > [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
> > [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > 
> > Changes from v3:
> > * Address review feedback
> > * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.
> 
> This version
> 
> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well

When I quickly tested v3, it worked fine, but now it turned out to
only work fine "sometimes", both v3 and v4. I need to look into it
further, but so far it seems related to the hugetlb_advanced_tests().

I guess there was already some discussion on this test, but we did
not receive all of the thread(s). Please always add at least
linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
for further discussions.

That being said, sorry for duplications, this might already have been
discussed. Preliminary analysis showed that it only seems to go wrong
for certain random vaddr values. I cannot make any sense of that yet,
but what seems strange to me is that the hugetlb_advanced_tests()
take a (real) pte_t pointer as input, and also use that for all
kinds of operations (set_huge_pte_at, huge_ptep_get_and_clear, etc.).

Although all the hugetlb code in the kernel is (mis)using pte_t
pointers instead of the correct pmd/pud_t pointers like THP, that
is just for historic reasons. The pointers will actually never point
to a real pte_t (i.e. page table entry), but of course to a pmd
or pud entry, depending on hugepage size.

What is passed in as ptep to hugetlb_advanced_tests() seems to be
the result from the previous ptep = pte_alloc_map(mm, pmdp, vaddr),
so I would expect that it points to a real page table entry. Need
to investigate further, but IIUC, using such a pointer for adding
large pte entries (i.e. pmd/pud entries) at least feels very wrong
to me, and I assume it is related to the issues we see on s390.

We actually see different issues, e.g. once a panic directly in
hugetlb_advanced_tests() -> huge_ptep_get_and_clear(), but also
indirect symptoms after debug_vm_pgtable() completes, like this:

[   10.533901] BUG task_struct (Not tainted): Padding overwritten. 0x0000000019f798c7-0x0000000019f798c7 @offset=30087

Last but not least, what I said about the pte vs. pmd/pud of
course also should apply to the hugetlb_basic_tests(), although
they are not directly using a pte_t pointer, and especially
also not writing to it. Still, the pte_aligned pfn parameter
is not guaranteed to also be pmd/pud_aligned, which doesn't
feel right.

So, for now, until this is sorted out, I guess we also need
to exclude s390 at least from the hugetlb_advanced_tests().
The hugetlb_basic_tests() seem to work fine so far (probably
by chance :-))

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-04 15:26   ` Gerald Schaefer
@ 2020-09-04 16:01     ` Gerald Schaefer
  2020-09-04 17:53       ` Gerald Schaefer
                         ` (2 more replies)
  2020-09-09  8:08     ` Anshuman Khandual
  1 sibling, 3 replies; 52+ messages in thread
From: Gerald Schaefer @ 2020-09-04 16:01 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Fri, 4 Sep 2020 17:26:47 +0200
Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:

> On Fri, 4 Sep 2020 12:18:05 +0530
> Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> 
> > 
> > 
> > On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> > > This patch series includes fixes for debug_vm_pgtable test code so that
> > > they follow page table updates rules correctly. The first two patches introduce
> > > changes w.r.t ppc64. The patches are included in this series for completeness. We can
> > > merge them via ppc64 tree if required.
> > > 
> > > Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
> > > page table update rules.
> > > 
> > > These tests are broken w.r.t page table update rules and results in kernel
> > > crash as below. 
> > > 
> > > [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
> > >     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
> > >     lr: c0000000005eeeec: pte_update+0x11c/0x190
> > >     sp: c000000c6d1e7950
> > >    msr: 8000000002029033
> > >   current = 0xc000000c6d172c80
> > >   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
> > >     pid   = 1, comm = swapper/0
> > > kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > [link register   ] c0000000005eeeec pte_update+0x11c/0x190
> > > [c000000c6d1e7950] 0000000000000001 (unreliable)
> > > [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
> > > [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
> > > [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
> > > [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> > > [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
> > > [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
> > > [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > 
> > > With DEBUG_VM disabled
> > > 
> > > [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
> > > [   20.530183] Faulting instruction address: 0xc0000000000df330
> > > cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
> > >     pc: c0000000000df330: memset+0x68/0x104
> > >     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > >     sp: c000000c6d19f990
> > >    msr: 8000000002009033
> > >    dar: 0
> > >   current = 0xc000000c6d177480
> > >   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
> > >     pid   = 1, comm = swapper/0
> > > [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > > [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> > > [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
> > > [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
> > > [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> > > [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
> > > [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
> > > [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > 
> > > Changes from v3:
> > > * Address review feedback
> > > * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.
> > 
> > This version
> > 
> > - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
> > - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
> > - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well
> 
> When I quickly tested v3, it worked fine, but now it turned out to
> only work fine "sometimes", both v3 and v4. I need to look into it
> further, but so far it seems related to the hugetlb_advanced_tests().
> 
> I guess there was already some discussion on this test, but we did
> not receive all of the thread(s). Please always add at least
> linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
> for further discussions.

BTW, with myself I mean the new address gerald.schaefer@linux.ibm.com.
The old gerald.schaefer@de.ibm.com seems to work (again), but is not
very reliable.

BTW2, a quick test with this change (so far) made the issues on s390
go away:

@@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
        spin_unlock(ptl);
 
 #ifndef CONFIG_PPC_BOOK3S_64
-       hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
+       hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
 #endif
 
        spin_lock(&mm->page_table_lock);

That would more match the "pte_t pointer" usage for hugetlb code,
i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
but I think the root cause is the pte_t pointer.

Not entirely sure though if that would really be the correct fix.
I somehow lost whatever little track I had about what these tests
really want to check, and if that would still be valid with that
change.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-04 16:01     ` Gerald Schaefer
@ 2020-09-04 17:53       ` Gerald Schaefer
  2020-09-09  8:38         ` Anshuman Khandual
  2020-09-08 15:39       ` Gerald Schaefer
  2020-09-09  8:15       ` Anshuman Khandual
  2 siblings, 1 reply; 52+ messages in thread
From: Gerald Schaefer @ 2020-09-04 17:53 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Fri, 4 Sep 2020 18:01:15 +0200
Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:

> On Fri, 4 Sep 2020 17:26:47 +0200
> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
> 
> > On Fri, 4 Sep 2020 12:18:05 +0530
> > Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> > 
> > > 
> > > 
> > > On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> > > > This patch series includes fixes for debug_vm_pgtable test code so that
> > > > they follow page table updates rules correctly. The first two patches introduce
> > > > changes w.r.t ppc64. The patches are included in this series for completeness. We can
> > > > merge them via ppc64 tree if required.
> > > > 
> > > > Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
> > > > page table update rules.
> > > > 
> > > > These tests are broken w.r.t page table update rules and results in kernel
> > > > crash as below. 
> > > > 
> > > > [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > > cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
> > > >     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
> > > >     lr: c0000000005eeeec: pte_update+0x11c/0x190
> > > >     sp: c000000c6d1e7950
> > > >    msr: 8000000002029033
> > > >   current = 0xc000000c6d172c80
> > > >   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
> > > >     pid   = 1, comm = swapper/0
> > > > kernel BUG at arch/powerpc/mm/pgtable.c:304!
> > > > [link register   ] c0000000005eeeec pte_update+0x11c/0x190
> > > > [c000000c6d1e7950] 0000000000000001 (unreliable)
> > > > [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
> > > > [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
> > > > [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
> > > > [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> > > > [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
> > > > [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
> > > > [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > > 
> > > > With DEBUG_VM disabled
> > > > 
> > > > [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
> > > > [   20.530183] Faulting instruction address: 0xc0000000000df330
> > > > cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
> > > >     pc: c0000000000df330: memset+0x68/0x104
> > > >     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > > >     sp: c000000c6d19f990
> > > >    msr: 8000000002009033
> > > >    dar: 0
> > > >   current = 0xc000000c6d177480
> > > >   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
> > > >     pid   = 1, comm = swapper/0
> > > > [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> > > > [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> > > > [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
> > > > [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
> > > > [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> > > > [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
> > > > [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
> > > > [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> > > > 
> > > > Changes from v3:
> > > > * Address review feedback
> > > > * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.
> > > 
> > > This version
> > > 
> > > - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
> > > - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
> > > - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well
> > 
> > When I quickly tested v3, it worked fine, but now it turned out to
> > only work fine "sometimes", both v3 and v4. I need to look into it
> > further, but so far it seems related to the hugetlb_advanced_tests().
> > 
> > I guess there was already some discussion on this test, but we did
> > not receive all of the thread(s). Please always add at least
> > linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
> > for further discussions.
> 
> BTW, with myself I mean the new address gerald.schaefer@linux.ibm.com.
> The old gerald.schaefer@de.ibm.com seems to work (again), but is not
> very reliable.
> 
> BTW2, a quick test with this change (so far) made the issues on s390
> go away:
> 
> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
>         spin_unlock(ptl);
> 
>  #ifndef CONFIG_PPC_BOOK3S_64
> -       hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> +       hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
>  #endif
> 
>         spin_lock(&mm->page_table_lock);
> 
> That would more match the "pte_t pointer" usage for hugetlb code,
> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
> but I think the root cause is the pte_t pointer.
> 
> Not entirely sure though if that would really be the correct fix.
> I somehow lost whatever little track I had about what these tests
> really want to check, and if that would still be valid with that
> change.

Another potential issue, apparently not for s390, but maybe for
others, is that the vaddr passed to hugetlb_advanced_tests() is
also not pmd/pud size aligned, like you did in pmd/pud_advanced_tests().

I guess for the hugetlb_advanced_tests() you need to choose if
you want to test pmd or pud hugepages, and accordingly prepare
the *ptep, pfn and vaddr input. If you only check for CONFIG_HUGETLB_PAGE,
then probably only pmd hugepages would be safe, there might be
architectures only supporting one hugepage size.

So, for s390, at least the ptep input value is a problem. Still
need to better understand how it goes wrong, but it seems to be
fixed when using proper pmdp, and also works with pudp.

For others, especially the apparent issues on ppc64, the other
non-hugepage aligned input pfn and vaddr might also be an issue,
e.g. power at least seems to use the vaddr in its set_huge_pte_at()
implementation for some pmd_off(mm, addr) calculation.

Again, sorry if this was already discussed, I missed most of it
and honestly didn't properly look at the scarce mails that we did
receive...

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-04 16:01     ` Gerald Schaefer
  2020-09-04 17:53       ` Gerald Schaefer
@ 2020-09-08 15:39       ` Gerald Schaefer
  2020-09-09  6:08         ` Aneesh Kumar K.V
  2020-09-09  8:15       ` Anshuman Khandual
  2 siblings, 1 reply; 52+ messages in thread
From: Gerald Schaefer @ 2020-09-08 15:39 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Fri, 4 Sep 2020 18:01:15 +0200
Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:

[...]
> 
> BTW2, a quick test with this change (so far) made the issues on s390
> go away:
> 
> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
>         spin_unlock(ptl);
> 
>  #ifndef CONFIG_PPC_BOOK3S_64
> -       hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> +       hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
>  #endif
> 
>         spin_lock(&mm->page_table_lock);
> 
> That would more match the "pte_t pointer" usage for hugetlb code,
> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
> but I think the root cause is the pte_t pointer.
> 
> Not entirely sure though if that would really be the correct fix.
> I somehow lost whatever little track I had about what these tests
> really want to check, and if that would still be valid with that
> change.

Uh oh, wasn't aware that this (or some predecessor) already went
upstream, and broke our debug kernel today.

I found out now what goes (horribly) wrong on s390, see below for
more details. In short, using hugetlb primitives with ptep pointers
that do _not_ point to a pmd or pud entry will not work on s390.
It also seems to make no sense to verify / test such a thing in general,
as it would also be a severe bug if any kernel code would do that.
After all, with hugepages, there are no pte tables, only pmd etc.
tables.

My change above would fix the issue for s390, but I can still not
completely judge if that would not break other things for your
tests. In general, for normal kernel code, much of what you do would
be very broken, but I guess your tests are doing such "special" things
because they can. E.g. because they operate on some "sandbox" mm
and page tables, and you also do not need properly populated page
tables for some exit / free cleanup, you just throw them away
explicitly with pXd_free at the end. So it might just be "the right
thing" to pass a casted pmd pointer to hugetlb_advanced_tests(),
to simulate and test (proper) usage of the hugetlb primitives.
I also see no other way to make this work for s390, than using a
proper pmd/pud pointer. If not possible, please add us to the
#ifndef.

So, for all those interested, here is what goes wrong on s390.
huge_ptep_get_and_clear() uses the "idte" instruction for the
clearing (and TLB invalidation) part. That instruction expects
a "region or segment table" origin, which is a pmd/pud/p4d/pgd,
but not a pte table. Even worse, when we calculate the table
origin from the given ptep (which *should* not point to a pte),
due to different table sizes for pte / pXd tables, we end up
at some place before the given pte table.

The "idte" instruction also gets the virtual address, and does
corresponding index addition to the given table origin. Depending
on the pmd_index we now end up either within the pte table again,
in which case we see a panic because idte complains about seeing
a pte value. If we are unlucky, then we end up outside the pte
table, and depending on the content of that memory location, idte
might succeed, effectively corrupting that memory.

That explains why we only see the panic sometimes, depending on
random vaddr, other symptoms other times, and probably completely
silent memory corruption for the rest...

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-08 15:39       ` Gerald Schaefer
@ 2020-09-09  6:08         ` Aneesh Kumar K.V
  2020-09-09 11:16           ` Gerald Schaefer
  0 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-09  6:08 UTC (permalink / raw)
  To: Gerald Schaefer, Anshuman Khandual
  Cc: linux-s390, linux-mm, Vineet Gupta, akpm, linux-snps-arc,
	linuxppc-dev, linux-riscv, Gerald Schaefer

Gerald Schaefer <gerald.schaefer@linux.ibm.com> writes:

> On Fri, 4 Sep 2020 18:01:15 +0200
> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
>
> [...]
>> 
>> BTW2, a quick test with this change (so far) made the issues on s390
>> go away:
>> 
>> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
>>         spin_unlock(ptl);
>> 
>>  #ifndef CONFIG_PPC_BOOK3S_64
>> -       hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>> +       hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
>>  #endif
>> 
>>         spin_lock(&mm->page_table_lock);
>> 
>> That would more match the "pte_t pointer" usage for hugetlb code,
>> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
>> but I think the root cause is the pte_t pointer.
>> 
>> Not entirely sure though if that would really be the correct fix.
>> I somehow lost whatever little track I had about what these tests
>> really want to check, and if that would still be valid with that
>> change.
>
> Uh oh, wasn't aware that this (or some predecessor) already went
> upstream, and broke our debug kernel today.

Not sure i followed the above. Are you finding that s390 kernel crash
after this patch series or the original patchset? As noted in my patch
the hugetlb test is broken and we should fix that. A quick fix is to
comment out that test for s390 too as i have done for PPC64.


-aneesh

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-04 15:26   ` Gerald Schaefer
  2020-09-04 16:01     ` Gerald Schaefer
@ 2020-09-09  8:08     ` Anshuman Khandual
  2020-09-09 11:36       ` Gerald Schaefer
  1 sibling, 1 reply; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-09  8:08 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer



On 09/04/2020 08:56 PM, Gerald Schaefer wrote:
> On Fri, 4 Sep 2020 12:18:05 +0530
> Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> 
>>
>>
>> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
>>> This patch series includes fixes for debug_vm_pgtable test code so that
>>> they follow page table updates rules correctly. The first two patches introduce
>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
>>> merge them via ppc64 tree if required.
>>>
>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
>>> page table update rules.
>>>
>>> These tests are broken w.r.t page table update rules and results in kernel
>>> crash as below. 
>>>
>>> [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
>>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
>>>     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
>>>     lr: c0000000005eeeec: pte_update+0x11c/0x190
>>>     sp: c000000c6d1e7950
>>>    msr: 8000000002029033
>>>   current = 0xc000000c6d172c80
>>>   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
>>>     pid   = 1, comm = swapper/0
>>> kernel BUG at arch/powerpc/mm/pgtable.c:304!
>>> [link register   ] c0000000005eeeec pte_update+0x11c/0x190
>>> [c000000c6d1e7950] 0000000000000001 (unreliable)
>>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
>>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
>>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
>>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
>>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
>>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
>>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
>>>
>>> With DEBUG_VM disabled
>>>
>>> [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
>>> [   20.530183] Faulting instruction address: 0xc0000000000df330
>>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
>>>     pc: c0000000000df330: memset+0x68/0x104
>>>     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>>>     sp: c000000c6d19f990
>>>    msr: 8000000002009033
>>>    dar: 0
>>>   current = 0xc000000c6d177480
>>>   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
>>>     pid   = 1, comm = swapper/0
>>> [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
>>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
>>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
>>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
>>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
>>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
>>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
>>>
>>> Changes from v3:
>>> * Address review feedback
>>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.
>>
>> This version
>>
>> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
>> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
>> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well
> 
> When I quickly tested v3, it worked fine, but now it turned out to
> only work fine "sometimes", both v3 and v4. I need to look into it
> further, but so far it seems related to the hugetlb_advanced_tests().
> 
> I guess there was already some discussion on this test, but we did
> not receive all of the thread(s). Please always add at least
> linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
> for further discussions.

IIRC, the V3 series previously had all these addresses copied properly
but this version once again missed copying all required addresses.

> 
> That being said, sorry for duplications, this might already have been
> discussed. Preliminary analysis showed that it only seems to go wrong
> for certain random vaddr values. I cannot make any sense of that yet,
> but what seems strange to me is that the hugetlb_advanced_tests()
> take a (real) pte_t pointer as input, and also use that for all
> kinds of operations (set_huge_pte_at, huge_ptep_get_and_clear, etc.).
> 
> Although all the hugetlb code in the kernel is (mis)using pte_t
> pointers instead of the correct pmd/pud_t pointers like THP, that
> is just for historic reasons. The pointers will actually never point
> to a real pte_t (i.e. page table entry), but of course to a pmd
> or pud entry, depending on hugepage size.

HugeTLB logically operates on a PTE entry irrespective of it's real
page table level position. Nonetheless, IIUC, vaddr here should have
been aligned to real page table level in which the entry is being
mapped currently.

> 
> What is passed in as ptep to hugetlb_advanced_tests() seems to be
> the result from the previous ptep = pte_alloc_map(mm, pmdp, vaddr),
> so I would expect that it points to a real page table entry. Need
> to investigate further, but IIUC, using such a pointer for adding
> large pte entries (i.e. pmd/pud entries) at least feels very wrong
> to me, and I assume it is related to the issues we see on s390.
Will look into this further.

> 
> We actually see different issues, e.g. once a panic directly in
> hugetlb_advanced_tests() -> huge_ptep_get_and_clear(), but also
> indirect symptoms after debug_vm_pgtable() completes, like this:
> 
> [   10.533901] BUG task_struct (Not tainted): Padding overwritten. 0x0000000019f798c7-0x0000000019f798c7 @offset=30087
> 
> Last but not least, what I said about the pte vs. pmd/pud of
> course also should apply to the hugetlb_basic_tests(), although
> they are not directly using a pte_t pointer, and especially
> also not writing to it. Still, the pte_aligned pfn parameter
> is not guaranteed to also be pmd/pud_aligned, which doesn't
> feel right.
hugetlb_basic_tests() does not directly operate on real page table
entries. But I do see the point wrt using pmd_aligned pfn instead.
I will look into this in detail and send out something after this
series settles down.

> 
> So, for now, until this is sorted out, I guess we also need
> to exclude s390 at least from the hugetlb_advanced_tests().
> The hugetlb_basic_tests() seem to work fine so far (probably
> by chance :-))

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-04 16:01     ` Gerald Schaefer
  2020-09-04 17:53       ` Gerald Schaefer
  2020-09-08 15:39       ` Gerald Schaefer
@ 2020-09-09  8:15       ` Anshuman Khandual
  2020-09-09 11:10         ` Gerald Schaefer
  2 siblings, 1 reply; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-09  8:15 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer



On 09/04/2020 09:31 PM, Gerald Schaefer wrote:
> On Fri, 4 Sep 2020 17:26:47 +0200
> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
> 
>> On Fri, 4 Sep 2020 12:18:05 +0530
>> Anshuman Khandual <anshuman.khandual@arm.com> wrote:
>>
>>>
>>>
>>> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
>>>> This patch series includes fixes for debug_vm_pgtable test code so that
>>>> they follow page table updates rules correctly. The first two patches introduce
>>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
>>>> merge them via ppc64 tree if required.
>>>>
>>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
>>>> page table update rules.
>>>>
>>>> These tests are broken w.r.t page table update rules and results in kernel
>>>> crash as below. 
>>>>
>>>> [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
>>>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
>>>>     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
>>>>     lr: c0000000005eeeec: pte_update+0x11c/0x190
>>>>     sp: c000000c6d1e7950
>>>>    msr: 8000000002029033
>>>>   current = 0xc000000c6d172c80
>>>>   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
>>>>     pid   = 1, comm = swapper/0
>>>> kernel BUG at arch/powerpc/mm/pgtable.c:304!
>>>> [link register   ] c0000000005eeeec pte_update+0x11c/0x190
>>>> [c000000c6d1e7950] 0000000000000001 (unreliable)
>>>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
>>>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
>>>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
>>>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
>>>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
>>>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
>>>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
>>>>
>>>> With DEBUG_VM disabled
>>>>
>>>> [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
>>>> [   20.530183] Faulting instruction address: 0xc0000000000df330
>>>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
>>>>     pc: c0000000000df330: memset+0x68/0x104
>>>>     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>>>>     sp: c000000c6d19f990
>>>>    msr: 8000000002009033
>>>>    dar: 0
>>>>   current = 0xc000000c6d177480
>>>>   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
>>>>     pid   = 1, comm = swapper/0
>>>> [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>>>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
>>>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
>>>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
>>>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
>>>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
>>>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
>>>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
>>>>
>>>> Changes from v3:
>>>> * Address review feedback
>>>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.
>>>
>>> This version
>>>
>>> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
>>> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
>>> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well
>>
>> When I quickly tested v3, it worked fine, but now it turned out to
>> only work fine "sometimes", both v3 and v4. I need to look into it
>> further, but so far it seems related to the hugetlb_advanced_tests().
>>
>> I guess there was already some discussion on this test, but we did
>> not receive all of the thread(s). Please always add at least
>> linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
>> for further discussions.
> 
> BTW, with myself I mean the new address gerald.schaefer@linux.ibm.com.
> The old gerald.schaefer@de.ibm.com seems to work (again), but is not
> very reliable.

Sure, noted.

> 
> BTW2, a quick test with this change (so far) made the issues on s390
> go away:
> 
> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
>         spin_unlock(ptl);
>  
>  #ifndef CONFIG_PPC_BOOK3S_64
> -       hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> +       hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
>  #endif
>  
>         spin_lock(&mm->page_table_lock);
> 
> That would more match the "pte_t pointer" usage for hugetlb code,
> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
> but I think the root cause is the pte_t pointer.

Ideally, the pte_t pointer used here should be from huge_pte_alloc()
not from pte_alloc_map_lock() as the case currently.

> 
> Not entirely sure though if that would really be the correct fix.
> I somehow lost whatever little track I had about what these tests
> really want to check, and if that would still be valid with that
> change.
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-04 17:53       ` Gerald Schaefer
@ 2020-09-09  8:38         ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-09  8:38 UTC (permalink / raw)
  To: Gerald Schaefer
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On 09/04/2020 11:23 PM, Gerald Schaefer wrote:
> On Fri, 4 Sep 2020 18:01:15 +0200
> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
> 
>> On Fri, 4 Sep 2020 17:26:47 +0200
>> Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
>>
>>> On Fri, 4 Sep 2020 12:18:05 +0530
>>> Anshuman Khandual <anshuman.khandual@arm.com> wrote:
>>>
>>>>
>>>>
>>>> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
>>>>> This patch series includes fixes for debug_vm_pgtable test code so that
>>>>> they follow page table updates rules correctly. The first two patches introduce
>>>>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
>>>>> merge them via ppc64 tree if required.
>>>>>
>>>>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
>>>>> page table update rules.
>>>>>
>>>>> These tests are broken w.r.t page table update rules and results in kernel
>>>>> crash as below. 
>>>>>
>>>>> [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
>>>>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
>>>>>     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
>>>>>     lr: c0000000005eeeec: pte_update+0x11c/0x190
>>>>>     sp: c000000c6d1e7950
>>>>>    msr: 8000000002029033
>>>>>   current = 0xc000000c6d172c80
>>>>>   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
>>>>>     pid   = 1, comm = swapper/0
>>>>> kernel BUG at arch/powerpc/mm/pgtable.c:304!
>>>>> [link register   ] c0000000005eeeec pte_update+0x11c/0x190
>>>>> [c000000c6d1e7950] 0000000000000001 (unreliable)
>>>>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
>>>>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
>>>>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
>>>>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
>>>>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
>>>>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
>>>>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
>>>>>
>>>>> With DEBUG_VM disabled
>>>>>
>>>>> [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
>>>>> [   20.530183] Faulting instruction address: 0xc0000000000df330
>>>>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
>>>>>     pc: c0000000000df330: memset+0x68/0x104
>>>>>     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>>>>>     sp: c000000c6d19f990
>>>>>    msr: 8000000002009033
>>>>>    dar: 0
>>>>>   current = 0xc000000c6d177480
>>>>>   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
>>>>>     pid   = 1, comm = swapper/0
>>>>> [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
>>>>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
>>>>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
>>>>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
>>>>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
>>>>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
>>>>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
>>>>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
>>>>>
>>>>> Changes from v3:
>>>>> * Address review feedback
>>>>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.
>>>>
>>>> This version
>>>>
>>>> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
>>>> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
>>>> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well
>>>
>>> When I quickly tested v3, it worked fine, but now it turned out to
>>> only work fine "sometimes", both v3 and v4. I need to look into it
>>> further, but so far it seems related to the hugetlb_advanced_tests().
>>>
>>> I guess there was already some discussion on this test, but we did
>>> not receive all of the thread(s). Please always add at least
>>> linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
>>> for further discussions.
>>
>> BTW, with myself I mean the new address gerald.schaefer@linux.ibm.com.
>> The old gerald.schaefer@de.ibm.com seems to work (again), but is not
>> very reliable.
>>
>> BTW2, a quick test with this change (so far) made the issues on s390
>> go away:
>>
>> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
>>         spin_unlock(ptl);
>>
>>  #ifndef CONFIG_PPC_BOOK3S_64
>> -       hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>> +       hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
>>  #endif
>>
>>         spin_lock(&mm->page_table_lock);
>>
>> That would more match the "pte_t pointer" usage for hugetlb code,
>> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
>> but I think the root cause is the pte_t pointer.
>>
>> Not entirely sure though if that would really be the correct fix.
>> I somehow lost whatever little track I had about what these tests
>> really want to check, and if that would still be valid with that
>> change.
> 
> Another potential issue, apparently not for s390, but maybe for
> others, is that the vaddr passed to hugetlb_advanced_tests() is
> also not pmd/pud size aligned, like you did in pmd/pud_advanced_tests().
> 
> I guess for the hugetlb_advanced_tests() you need to choose if
> you want to test pmd or pud hugepages, and accordingly prepare
> the *ptep, pfn and vaddr input. If you only check for CONFIG_HUGETLB_PAGE,
> then probably only pmd hugepages would be safe, there might be
> architectures only supporting one hugepage size.

I guess preparing for PMD based HugeTLB tests should be sufficient
for now, which can be improved later on to cover other levels.

> 
> So, for s390, at least the ptep input value is a problem. Still
> need to better understand how it goes wrong, but it seems to be
> fixed when using proper pmdp, and also works with pudp.
> 
> For others, especially the apparent issues on ppc64, the other
> non-hugepage aligned input pfn and vaddr might also be an issue,
> e.g. power at least seems to use the vaddr in its set_huge_pte_at()
> implementation for some pmd_off(mm, addr) calculation.
> 
> Again, sorry if this was already discussed, I missed most of it
> and honestly didn't properly look at the scarce mails that we did
> receive...

Sure, will consider these points and try improve tests afterwards.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-09  8:15       ` Anshuman Khandual
@ 2020-09-09 11:10         ` Gerald Schaefer
  0 siblings, 0 replies; 52+ messages in thread
From: Gerald Schaefer @ 2020-09-09 11:10 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Wed, 9 Sep 2020 13:45:48 +0530
Anshuman Khandual <anshuman.khandual@arm.com> wrote:

[...]
> > 
> > That would more match the "pte_t pointer" usage for hugetlb code,
> > i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
> > but I think the root cause is the pte_t pointer.
> 
> Ideally, the pte_t pointer used here should be from huge_pte_alloc()
> not from pte_alloc_map_lock() as the case currently.

Ah, good point. I assumed that this would also always return casted
pmd etc. pointers, and never pte pointers. Unfortunately, that doesn't
seem to be true for all architectures, e.g. ia64, parisc, (some) powerpc,
where they really do a pte_alloc_map() for some reason.

I guess that means you cannot simply cast the pmd pointer, as suggested,
although I really do not understand how any architecture can work with
real ptes for hugepages. But that's fair, s390 also does some things that
nobody would expect or understand for other architectures...

So, for using huge_pte_alloc() you'd also need some size, maybe
iterating over hstates with for_each_hstate() could be an option,
if they are already initialized at that point. Then you have the
size(s) with huge_page_size(hstate) and can actually call the
hugetlb tests for all supported sizes, and with proper pointer
from huge_pte_alloc().

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-09  6:08         ` Aneesh Kumar K.V
@ 2020-09-09 11:16           ` Gerald Schaefer
  0 siblings, 0 replies; 52+ messages in thread
From: Gerald Schaefer @ 2020-09-09 11:16 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: linux-s390, Anshuman Khandual, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Wed, 09 Sep 2020 11:38:39 +0530
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:

> Gerald Schaefer <gerald.schaefer@linux.ibm.com> writes:
> 
> > On Fri, 4 Sep 2020 18:01:15 +0200
> > Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
> >
> > [...]
> >> 
> >> BTW2, a quick test with this change (so far) made the issues on s390
> >> go away:
> >> 
> >> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
> >>         spin_unlock(ptl);
> >> 
> >>  #ifndef CONFIG_PPC_BOOK3S_64
> >> -       hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> >> +       hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
> >>  #endif
> >> 
> >>         spin_lock(&mm->page_table_lock);
> >> 
> >> That would more match the "pte_t pointer" usage for hugetlb code,
> >> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
> >> but I think the root cause is the pte_t pointer.
> >> 
> >> Not entirely sure though if that would really be the correct fix.
> >> I somehow lost whatever little track I had about what these tests
> >> really want to check, and if that would still be valid with that
> >> change.
> >
> > Uh oh, wasn't aware that this (or some predecessor) already went
> > upstream, and broke our debug kernel today.
> 
> Not sure i followed the above. Are you finding that s390 kernel crash
> after this patch series or the original patchset? As noted in my patch
> the hugetlb test is broken and we should fix that. A quick fix is to
> comment out that test for s390 too as i have done for PPC64.

We see it with both, it basically is broken since there is a hugetlb
test using real pte pointers. It doesn't always show, depending on
random vaddr, so it slipped through earlier testing.

I guess we also would have had one or the other chance to notice
that earlier, through better review, or better reading of previous
mails. I must admit that I neglected this a bit.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-09  8:08     ` Anshuman Khandual
@ 2020-09-09 11:36       ` Gerald Schaefer
  0 siblings, 0 replies; 52+ messages in thread
From: Gerald Schaefer @ 2020-09-09 11:36 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-s390, Aneesh Kumar K.V, linux-mm, Vineet Gupta, akpm,
	linux-snps-arc, linuxppc-dev, linux-riscv, Gerald Schaefer

On Wed, 9 Sep 2020 13:38:25 +0530
Anshuman Khandual <anshuman.khandual@arm.com> wrote:

> 
> 
> On 09/04/2020 08:56 PM, Gerald Schaefer wrote:
> > On Fri, 4 Sep 2020 12:18:05 +0530
> > Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> > 
> >>
> >>
> >> On 09/02/2020 05:12 PM, Aneesh Kumar K.V wrote:
> >>> This patch series includes fixes for debug_vm_pgtable test code so that
> >>> they follow page table updates rules correctly. The first two patches introduce
> >>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
> >>> merge them via ppc64 tree if required.
> >>>
> >>> Hugetlb test is disabled on ppc64 because that needs larger change to satisfy
> >>> page table update rules.
> >>>
> >>> These tests are broken w.r.t page table update rules and results in kernel
> >>> crash as below. 
> >>>
> >>> [   21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
> >>> cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
> >>>     pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
> >>>     lr: c0000000005eeeec: pte_update+0x11c/0x190
> >>>     sp: c000000c6d1e7950
> >>>    msr: 8000000002029033
> >>>   current = 0xc000000c6d172c80
> >>>   paca    = 0xc000000003ba0000   irqmask: 0x03   irq_happened: 0x01
> >>>     pid   = 1, comm = swapper/0
> >>> kernel BUG at arch/powerpc/mm/pgtable.c:304!
> >>> [link register   ] c0000000005eeeec pte_update+0x11c/0x190
> >>> [c000000c6d1e7950] 0000000000000001 (unreliable)
> >>> [c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
> >>> [c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
> >>> [c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
> >>> [c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> >>> [c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
> >>> [c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
> >>> [c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> >>>
> >>> With DEBUG_VM disabled
> >>>
> >>> [   20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
> >>> [   20.530183] Faulting instruction address: 0xc0000000000df330
> >>> cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
> >>>     pc: c0000000000df330: memset+0x68/0x104
> >>>     lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> >>>     sp: c000000c6d19f990
> >>>    msr: 8000000002009033
> >>>    dar: 0
> >>>   current = 0xc000000c6d177480
> >>>   paca    = 0xc00000001ec4f400   irqmask: 0x03   irq_happened: 0x01
> >>>     pid   = 1, comm = swapper/0
> >>> [link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
> >>> [c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
> >>> [c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
> >>> [c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
> >>> [c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
> >>> [c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
> >>> [c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
> >>> [c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
> >>>
> >>> Changes from v3:
> >>> * Address review feedback
> >>> * Move page table depost and withdraw patch after adding pmdlock to avoid bisect failure.
> >>
> >> This version
> >>
> >> - Builds on x86, arm64, s390, arc, powerpc and riscv (defconfig with DEBUG_VM_PGTABLE)
> >> - Runs on arm64 and x86 without any regression, atleast nothing that I have noticed
> >> - Will be great if this could get tested on s390, arc, riscv, ppc32 platforms as well
> > 
> > When I quickly tested v3, it worked fine, but now it turned out to
> > only work fine "sometimes", both v3 and v4. I need to look into it
> > further, but so far it seems related to the hugetlb_advanced_tests().
> > 
> > I guess there was already some discussion on this test, but we did
> > not receive all of the thread(s). Please always add at least
> > linux-s390@vger.kernel.org and maybe myself and Vasily Gorbik <gor@linux.ibm.com>
> > for further discussions.
> 
> IIRC, the V3 series previously had all these addresses copied properly
> but this version once again missed copying all required addresses.

I also had issues with the de.ibm.com address, which might also have
made some mails disappear, and others might simply have been overlooked
be me. Don't bother, my bad.

> 
> > 
> > That being said, sorry for duplications, this might already have been
> > discussed. Preliminary analysis showed that it only seems to go wrong
> > for certain random vaddr values. I cannot make any sense of that yet,
> > but what seems strange to me is that the hugetlb_advanced_tests()
> > take a (real) pte_t pointer as input, and also use that for all
> > kinds of operations (set_huge_pte_at, huge_ptep_get_and_clear, etc.).
> > 
> > Although all the hugetlb code in the kernel is (mis)using pte_t
> > pointers instead of the correct pmd/pud_t pointers like THP, that
> > is just for historic reasons. The pointers will actually never point
> > to a real pte_t (i.e. page table entry), but of course to a pmd
> > or pud entry, depending on hugepage size.
> 
> HugeTLB logically operates on a PTE entry irrespective of it's real
> page table level position. Nonetheless, IIUC, vaddr here should have
> been aligned to real page table level in which the entry is being
> mapped currently.

That goes back to the time where only x86 had hugepages, and they
have the same layout for pte/pmd/etc entries, so it simply didn't
matter that the code (mis)used pte pointers / entries. But even for
x86, the hugetlb pte pointers would never have pointed to real ptes,
but pmds instead. That's why I call it misuse.

s390 is very sensitive to page table level, and we can also determine
the level from the entry value, which is used for some primitives.
Others have implicit assumptions and calculations, which go wrong
if a wrong level is passed in, like in this case for
huge_ptep_get_and_clear(). Simply aligning vaddr / pfn will not
be enough to fix this for s390, it has to be a pmd/pud pointer.
Or, as you already mentioned, the result of huge_pte_alloc().

Furthermore, the pmd and pte layout are different, so we simply cannot
use any pte_xxx primitives for hugepages. That was the reason for
introducing huge_ptep_get(), which will do an implicit conversion
from the real pmd/pud entry to a "fake" pte entry, which can then
be used with such pte_xxx primitives. Before writing it back in
set_huge_pte_at() we then do the reverse conversion to a proper
pmd/pud again.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-02 11:42 ` [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test Aneesh Kumar K.V
@ 2020-09-11  2:13   ` Nathan Chancellor
  2020-09-11  5:21     ` Aneesh Kumar K.V
  2020-10-11 20:02   ` Guenter Roeck
  1 sibling, 1 reply; 52+ messages in thread
From: Nathan Chancellor @ 2020-09-11  2:13 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-mm, akpm, Anshuman Khandual, linuxppc-dev

On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote:
> pte_clear_tests operate on an existing pte entry. Make sure that
> is not a none pte entry.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 9afa1354326b..c36530c69e33 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>  #endif /* PAGETABLE_P4D_FOLDED */
>  
>  static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
> -				   unsigned long vaddr)
> +				   unsigned long pfn, unsigned long vaddr,
> +				   pgprot_t prot)
>  {
> -	pte_t pte = ptep_get(ptep);
> +	pte_t pte = pfn_pte(pfn, prot);
>  
>  	pr_debug("Validating PTE clear\n");
>  	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> @@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void)
>  
>  	ptl = pte_lockptr(mm, pmdp);
>  	spin_lock(ptl);
> -	pte_clear_tests(mm, ptep, vaddr);
> +	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
>  	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>  	pte_unmap_unlock(ptep, ptl);
>  
> -- 
> 2.26.2
> 
> 

This patch causes a panic at boot for RISC-V defconfig. The rootfs is here if it is needed:
https://github.com/ClangBuiltLinux/boot-utils/blob/3b21a5b71451742866349ba4f18638c5a754e660/images/riscv/rootfs.cpio.zst

$ make -skj"$(nproc)" ARCH=riscv CROSS_COMPILE=riscv64-linux- O=out/riscv distclean defconfig Image

$ qemu-system-riscv64 -bios default -M virt -display none -initrd rootfs.cpio -kernel Image -m 512m -nodefaults -serial mon:stdio
...

OpenSBI v0.6
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|

Platform Name          : QEMU Virt Machine
Platform HART Features : RV64ACDFIMSU
Platform Max HARTs     : 8
Current Hart           : 0
Firmware Base          : 0x80000000
Firmware Size          : 120 KB
Runtime SBI Version    : 0.2

MIDELEG : 0x0000000000000222
MEDELEG : 0x000000000000b109
PMP0    : 0x0000000080000000-0x000000008001ffff (A)
PMP1    : 0x0000000000000000-0xffffffffffffffff (A,R,W,X)
[    0.000000] Linux version 5.9.0-rc4-next-20200910 (nathan@ubuntu-n2-xlarge-x86) (riscv64-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 SMP Thu Sep 10 19:10:43 MST 2020
...
[    0.294593] NET: Registered protocol family 17
[    0.295781] 9pnet: Installing 9P2000 support
[    0.296153] Key type dns_resolver registered
[    0.296694] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
[    0.297635] Unable to handle kernel paging request at virtual address 0a7fffe01dafefc8
[    0.298029] Oops [#1]
[    0.298153] Modules linked in:
[    0.298433] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4-next-20200910 #1
[    0.298792] epc: ffffffe000205afc ra : ffffffe0008be0aa sp : ffffffe01ae73d40
[    0.299078]  gp : ffffffe0010b9b48 tp : ffffffe01ae68000 t0 : ffffffe008152000
[    0.299362]  t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffe01ae73d60
[    0.299648]  s1 : bffffffffffffffb a0 : 0a7fffe01dafefc8 a1 : bffffffffffffffb
[    0.299948]  a2 : ffffffe0010a2698 a3 : 0000000000000001 a4 : 0000000000000003
[    0.300231]  a5 : 0000000000000800 a6 : fffffffff0000080 a7 : 000000001b642000
[    0.300521]  s2 : ffffffe0081517b8 s3 : ffffffe008150a80 s4 : ffffffe01af30000
[    0.300806]  s5 : ffffffe01f8ca9b8 s6 : ffffffe008150000 s7 : ffffffe0010bb100
[    0.301161]  s8 : ffffffe0010bb108 s9 : 0000000000080202 s10: ffffffe0010bb928
[    0.301481]  s11: 000000002008085b t3 : 0000000000000000 t4 : 0000000000000000
[    0.301722]  t5 : 0000000000000000 t6 : ffffffe008150000
[    0.301947] status: 0000000000000120 badaddr: 0a7fffe01dafefc8 cause: 000000000000000f
[    0.302569] ---[ end trace 7ffb153d816164cf ]---
[    0.302797] note: swapper/0[1] exited with preempt_count 1
[    0.303101] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.303614] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

$ git bisect log
# bad: [7ce53e3a447bced7b85ed181c4d027e93c062e07] Add linux-next specific files for 20200910
# good: [34d4ddd359dbcdf6c5fb3f85a179243d7a1cb7f8] Merge tag 'linux-kselftest-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
git bisect start '7ce53e3a447bced7b85ed181c4d027e93c062e07' '34d4ddd359dbcdf6c5fb3f85a179243d7a1cb7f8'
# good: [bb1f09d126618aa1ec776d87d7f85136edbed485] Merge remote-tracking branch 'crypto/master' into master
git bisect good bb1f09d126618aa1ec776d87d7f85136edbed485
# good: [1fb7e980a0f9a0aa4c7daba1a7e35d12c97820ea] Merge remote-tracking branch 'audit/next' into master
git bisect good 1fb7e980a0f9a0aa4c7daba1a7e35d12c97820ea
# good: [afdf05baff78a658843c1013855985e7a6871406] Merge remote-tracking branch 'thunderbolt/next' into master
git bisect good afdf05baff78a658843c1013855985e7a6871406
# good: [c5f3c031bd4b8ef5fb6b07352abc284603c3edee] Merge remote-tracking branch 'kselftest/next' into master
git bisect good c5f3c031bd4b8ef5fb6b07352abc284603c3edee
# bad: [671aca25e253f2773850aefb0837a225c691e336] lib: bitmap: delete duplicated words
git bisect bad 671aca25e253f2773850aefb0837a225c691e336
# bad: [e42ac710a849403b7fe582cc555dc3b7bf5b6fa9] tools/testing/selftests/vm/hmm-tests.c: use the new SKIP() macro
git bisect bad e42ac710a849403b7fe582cc555dc3b7bf5b6fa9
# good: [1443e3384317a9dfaf1381e8134a69c1e3fc7130] device-dax: kill dax_kmem_res
git bisect good 1443e3384317a9dfaf1381e8134a69c1e3fc7130
# good: [e2aad6f1d232b457ea6a3194992dd4c0a83534a5] mm/debug_vm_pgtable/locks: take correct page table lock
git bisect good e2aad6f1d232b457ea6a3194992dd4c0a83534a5
# bad: [08075d21b791241ba9f1366f814afa4a77372250] mm: workingset: ignore slab memory size when calculating shadows pressure
git bisect bad 08075d21b791241ba9f1366f814afa4a77372250
# bad: [42e9e63856020953cc3645a2032a70179364a1d8] mm-gup-dont-permit-users-to-call-get_user_pages-with-foll_longterm-fix
git bisect bad 42e9e63856020953cc3645a2032a70179364a1d8
# good: [b77bfa2ce4c2e30c1b1d1b3eb18c5b57397277f9] mm/debug_vm_pgtable/hugetlb: disable hugetlb test on ppc64
git bisect good b77bfa2ce4c2e30c1b1d1b3eb18c5b57397277f9
# bad: [90b612d56df39666f0f6fa6a033b4a7ec2e0a16c] mm/gup_benchmark: use pin_user_pages for FOLL_LONGTERM flag
git bisect bad 90b612d56df39666f0f6fa6a033b4a7ec2e0a16c
# bad: [060e70ecf865e55d82282995b4cb478126a1163c] mm/debug_vm_pgtable: avoid none pte in pte_clear_test
git bisect bad 060e70ecf865e55d82282995b4cb478126a1163c
# first bad commit: [060e70ecf865e55d82282995b4cb478126a1163c] mm/debug_vm_pgtable: avoid none pte in pte_clear_test

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-11  2:13   ` Nathan Chancellor
@ 2020-09-11  5:21     ` Aneesh Kumar K.V
  2020-09-23  3:14       ` Anshuman Khandual
  0 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-11  5:21 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Anshuman Khandual, linux-mm, akpm, linuxppc-dev, linux-riscv

Nathan Chancellor <natechancellor@gmail.com> writes:

> On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote:
>> pte_clear_tests operate on an existing pte entry. Make sure that
>> is not a none pte entry.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>  mm/debug_vm_pgtable.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>> 
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 9afa1354326b..c36530c69e33 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>>  #endif /* PAGETABLE_P4D_FOLDED */
>>  
>>  static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>> -				   unsigned long vaddr)
>> +				   unsigned long pfn, unsigned long vaddr,
>> +				   pgprot_t prot)
>>  {
>> -	pte_t pte = ptep_get(ptep);
>> +	pte_t pte = pfn_pte(pfn, prot);
>>  
>>  	pr_debug("Validating PTE clear\n");
>>  	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>> @@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void)
>>  
>>  	ptl = pte_lockptr(mm, pmdp);
>>  	spin_lock(ptl);
>> -	pte_clear_tests(mm, ptep, vaddr);
>> +	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
>>  	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>>  	pte_unmap_unlock(ptep, ptl);
>>  
>> -- 
> This patch causes a panic at boot for RISC-V defconfig. The rootfs is here if it is needed:
> https://github.com/ClangBuiltLinux/boot-utils/blob/3b21a5b71451742866349ba4f18638c5a754e660/images/riscv/rootfs.cpio.zst
>
> $ make -skj"$(nproc)" ARCH=riscv CROSS_COMPILE=riscv64-linux- O=out/riscv distclean defconfig Image
>
> $ qemu-system-riscv64 -bios default -M virt -display none -initrd rootfs.cpio -kernel Image -m 512m -nodefaults -serial mon:stdio
> ...
>
> OpenSBI v0.6
>    ____                    _____ ____ _____
>   / __ \                  / ____|  _ \_   _|
>  | |  | |_ __   ___ _ __ | (___ | |_) || |
>  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>  | |__| | |_) |  __/ | | |____) | |_) || |_
>   \____/| .__/ \___|_| |_|_____/|____/_____|
>         | |
>         |_|
>
> Platform Name          : QEMU Virt Machine
> Platform HART Features : RV64ACDFIMSU
> Platform Max HARTs     : 8
> Current Hart           : 0
> Firmware Base          : 0x80000000
> Firmware Size          : 120 KB
> Runtime SBI Version    : 0.2
>
> MIDELEG : 0x0000000000000222
> MEDELEG : 0x000000000000b109
> PMP0    : 0x0000000080000000-0x000000008001ffff (A)
> PMP1    : 0x0000000000000000-0xffffffffffffffff (A,R,W,X)
> [    0.000000] Linux version 5.9.0-rc4-next-20200910 (nathan@ubuntu-n2-xlarge-x86) (riscv64-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 SMP Thu Sep 10 19:10:43 MST 2020
> ...
> [    0.294593] NET: Registered protocol family 17
> [    0.295781] 9pnet: Installing 9P2000 support
> [    0.296153] Key type dns_resolver registered
> [    0.296694] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
> [    0.297635] Unable to handle kernel paging request at virtual address 0a7fffe01dafefc8
> [    0.298029] Oops [#1]
> [    0.298153] Modules linked in:
> [    0.298433] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4-next-20200910 #1
> [    0.298792] epc: ffffffe000205afc ra : ffffffe0008be0aa sp : ffffffe01ae73d40
> [    0.299078]  gp : ffffffe0010b9b48 tp : ffffffe01ae68000 t0 : ffffffe008152000
> [    0.299362]  t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffe01ae73d60
> [    0.299648]  s1 : bffffffffffffffb a0 : 0a7fffe01dafefc8 a1 : bffffffffffffffb
> [    0.299948]  a2 : ffffffe0010a2698 a3 : 0000000000000001 a4 : 0000000000000003
> [    0.300231]  a5 : 0000000000000800 a6 : fffffffff0000080 a7 : 000000001b642000
> [    0.300521]  s2 : ffffffe0081517b8 s3 : ffffffe008150a80 s4 : ffffffe01af30000
> [    0.300806]  s5 : ffffffe01f8ca9b8 s6 : ffffffe008150000 s7 : ffffffe0010bb100
> [    0.301161]  s8 : ffffffe0010bb108 s9 : 0000000000080202 s10: ffffffe0010bb928
> [    0.301481]  s11: 000000002008085b t3 : 0000000000000000 t4 : 0000000000000000
> [    0.301722]  t5 : 0000000000000000 t6 : ffffffe008150000
> [    0.301947] status: 0000000000000120 badaddr: 0a7fffe01dafefc8 cause: 000000000000000f
> [    0.302569] ---[ end trace 7ffb153d816164cf ]---
> [    0.302797] note: swapper/0[1] exited with preempt_count 1
> [    0.303101] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> [    0.303614] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---


I guess it is the combination of a valid pte and usage of
RANDOM_ORVALUE. The below change get the kernel to boot. Can somebody
faimilar with riscv pte format take a look at the RANDOM_ORVALUE?

modified   mm/debug_vm_pgtable.c
@@ -548,7 +548,7 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
 	pte_t pte = pfn_pte(pfn, prot);
 
 	pr_debug("Validating PTE clear\n");
-	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
+//	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
 	set_pte_at(mm, vaddr, ptep, pte);
 	barrier();
 	pte_clear(mm, vaddr, ptep);


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH] mm/debug_vm_pgtable: Avoid doing memory allocation with pgtable_t mapped.
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (14 preceding siblings ...)
  2020-09-04  6:48 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Anshuman Khandual
@ 2020-09-13 11:03 ` Aneesh Kumar K.V
  2020-09-13 13:42   ` Matthew Wilcox
  2020-10-13 20:58 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Andrew Morton
  16 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-09-13 11:03 UTC (permalink / raw)
  To: linux-mm, akpm
  Cc: kernel test robot, linuxppc-dev, Aneesh Kumar K.V, Anshuman Khandual

With highmem, pte_alloc_map() keep the level4 page table mapped using
kmap_atomic(). Avoid doing new memory allocation with page table
mapped like above.

[    9.409233] BUG: sleeping function called from invalid context at mm/page_alloc.c:4822
[    9.410557] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper
[    9.411932] no locks held by swapper/1.
[    9.412595] CPU: 0 PID: 1 Comm: swapper Not tainted 5.9.0-rc3-00323-gc50eb1ed654b5 #2
[    9.413824] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[    9.415207] Call Trace:
[    9.415651]  ? ___might_sleep.cold+0xa7/0xcc
[    9.416367]  ? __alloc_pages_nodemask+0x14c/0x5b0
[    9.417055]  ? swap_migration_tests+0x50/0x293
[    9.417704]  ? debug_vm_pgtable+0x4bc/0x708
[    9.418287]  ? swap_migration_tests+0x293/0x293
[    9.418911]  ? do_one_initcall+0x82/0x3cb
[    9.419465]  ? parse_args+0x1bd/0x280
[    9.419983]  ? rcu_read_lock_sched_held+0x36/0x60
[    9.420673]  ? trace_initcall_level+0x1f/0xf3
[    9.421279]  ? trace_initcall_level+0xbd/0xf3
[    9.421881]  ? do_basic_setup+0x9d/0xdd
[    9.422410]  ? do_basic_setup+0xc3/0xdd
[    9.422938]  ? kernel_init_freeable+0x72/0xa3
[    9.423539]  ? rest_init+0x134/0x134
[    9.424055]  ? kernel_init+0x5/0x12c
[    9.424574]  ? ret_from_fork+0x19/0x30

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 mm/debug_vm_pgtable.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index d12bde82ae95..612c665a1136 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -994,7 +994,13 @@ static int __init debug_vm_pgtable(void)
 	p4dp = p4d_alloc(mm, pgdp, vaddr);
 	pudp = pud_alloc(mm, p4dp, vaddr);
 	pmdp = pmd_alloc(mm, pudp, vaddr);
-	ptep = pte_alloc_map(mm, pmdp, vaddr);
+	/*
+	 * Allocate pgtable_t
+	 */
+	if (pte_alloc(mm, pmdp)) {
+		pr_err("pgtable allocation failed\n");
+		return 1;
+	}
 
 	/*
 	 * Save all the page table page addresses as the page table
@@ -1048,8 +1054,7 @@ static int __init debug_vm_pgtable(void)
 	 * proper page table lock.
 	 */
 
-	ptl = pte_lockptr(mm, pmdp);
-	spin_lock(ptl);
+	ptep = pte_offset_map_lock(mm, pmdp, vaddr, &ptl);
 	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
 	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
 	pte_unmap_unlock(ptep, ptl);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH] mm/debug_vm_pgtable: Avoid doing memory allocation with pgtable_t mapped.
  2020-09-13 11:03 ` [PATCH] mm/debug_vm_pgtable: Avoid doing memory allocation with pgtable_t mapped Aneesh Kumar K.V
@ 2020-09-13 13:42   ` Matthew Wilcox
  0 siblings, 0 replies; 52+ messages in thread
From: Matthew Wilcox @ 2020-09-13 13:42 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: kernel test robot, Anshuman Khandual, linux-mm, akpm, linuxppc-dev

On Sun, Sep 13, 2020 at 04:33:27PM +0530, Aneesh Kumar K.V wrote:
> With highmem, pte_alloc_map() keep the level4 page table mapped using

.noitcerid etisoppo eht ni selbat egap eht srebmun ygolonimret xuniL


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-11  5:21     ` Aneesh Kumar K.V
@ 2020-09-23  3:14       ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-09-23  3:14 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Nathan Chancellor
  Cc: linux-mm, akpm, linuxppc-dev, linux-riscv



On 09/11/2020 10:51 AM, Aneesh Kumar K.V wrote:
> Nathan Chancellor <natechancellor@gmail.com> writes:
> 
>> On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote:
>>> pte_clear_tests operate on an existing pte entry. Make sure that
>>> is not a none pte entry.
>>>
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>>  mm/debug_vm_pgtable.c | 7 ++++---
>>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index 9afa1354326b..c36530c69e33 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -542,9 +542,10 @@ static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
>>>  #endif /* PAGETABLE_P4D_FOLDED */
>>>  
>>>  static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>>> -				   unsigned long vaddr)
>>> +				   unsigned long pfn, unsigned long vaddr,
>>> +				   pgprot_t prot)
>>>  {
>>> -	pte_t pte = ptep_get(ptep);
>>> +	pte_t pte = pfn_pte(pfn, prot);
>>>  
>>>  	pr_debug("Validating PTE clear\n");
>>>  	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>>> @@ -1049,7 +1050,7 @@ static int __init debug_vm_pgtable(void)
>>>  
>>>  	ptl = pte_lockptr(mm, pmdp);
>>>  	spin_lock(ptl);
>>> -	pte_clear_tests(mm, ptep, vaddr);
>>> +	pte_clear_tests(mm, ptep, pte_aligned, vaddr, prot);
>>>  	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>>>  	pte_unmap_unlock(ptep, ptl);
>>>  
>>> -- 
>> This patch causes a panic at boot for RISC-V defconfig. The rootfs is here if it is needed:
>> https://github.com/ClangBuiltLinux/boot-utils/blob/3b21a5b71451742866349ba4f18638c5a754e660/images/riscv/rootfs.cpio.zst
>>
>> $ make -skj"$(nproc)" ARCH=riscv CROSS_COMPILE=riscv64-linux- O=out/riscv distclean defconfig Image
>>
>> $ qemu-system-riscv64 -bios default -M virt -display none -initrd rootfs.cpio -kernel Image -m 512m -nodefaults -serial mon:stdio
>> ...
>>
>> OpenSBI v0.6
>>    ____                    _____ ____ _____
>>   / __ \                  / ____|  _ \_   _|
>>  | |  | |_ __   ___ _ __ | (___ | |_) || |
>>  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>>  | |__| | |_) |  __/ | | |____) | |_) || |_
>>   \____/| .__/ \___|_| |_|_____/|____/_____|
>>         | |
>>         |_|
>>
>> Platform Name          : QEMU Virt Machine
>> Platform HART Features : RV64ACDFIMSU
>> Platform Max HARTs     : 8
>> Current Hart           : 0
>> Firmware Base          : 0x80000000
>> Firmware Size          : 120 KB
>> Runtime SBI Version    : 0.2
>>
>> MIDELEG : 0x0000000000000222
>> MEDELEG : 0x000000000000b109
>> PMP0    : 0x0000000080000000-0x000000008001ffff (A)
>> PMP1    : 0x0000000000000000-0xffffffffffffffff (A,R,W,X)
>> [    0.000000] Linux version 5.9.0-rc4-next-20200910 (nathan@ubuntu-n2-xlarge-x86) (riscv64-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 SMP Thu Sep 10 19:10:43 MST 2020
>> ...
>> [    0.294593] NET: Registered protocol family 17
>> [    0.295781] 9pnet: Installing 9P2000 support
>> [    0.296153] Key type dns_resolver registered
>> [    0.296694] debug_vm_pgtable: [debug_vm_pgtable         ]: Validating architecture page table helpers
>> [    0.297635] Unable to handle kernel paging request at virtual address 0a7fffe01dafefc8
>> [    0.298029] Oops [#1]
>> [    0.298153] Modules linked in:
>> [    0.298433] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4-next-20200910 #1
>> [    0.298792] epc: ffffffe000205afc ra : ffffffe0008be0aa sp : ffffffe01ae73d40
>> [    0.299078]  gp : ffffffe0010b9b48 tp : ffffffe01ae68000 t0 : ffffffe008152000
>> [    0.299362]  t1 : 0000000000000000 t2 : 0000000000000000 s0 : ffffffe01ae73d60
>> [    0.299648]  s1 : bffffffffffffffb a0 : 0a7fffe01dafefc8 a1 : bffffffffffffffb
>> [    0.299948]  a2 : ffffffe0010a2698 a3 : 0000000000000001 a4 : 0000000000000003
>> [    0.300231]  a5 : 0000000000000800 a6 : fffffffff0000080 a7 : 000000001b642000
>> [    0.300521]  s2 : ffffffe0081517b8 s3 : ffffffe008150a80 s4 : ffffffe01af30000
>> [    0.300806]  s5 : ffffffe01f8ca9b8 s6 : ffffffe008150000 s7 : ffffffe0010bb100
>> [    0.301161]  s8 : ffffffe0010bb108 s9 : 0000000000080202 s10: ffffffe0010bb928
>> [    0.301481]  s11: 000000002008085b t3 : 0000000000000000 t4 : 0000000000000000
>> [    0.301722]  t5 : 0000000000000000 t6 : ffffffe008150000
>> [    0.301947] status: 0000000000000120 badaddr: 0a7fffe01dafefc8 cause: 000000000000000f
>> [    0.302569] ---[ end trace 7ffb153d816164cf ]---
>> [    0.302797] note: swapper/0[1] exited with preempt_count 1
>> [    0.303101] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
>> [    0.303614] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
> 
> 
> I guess it is the combination of a valid pte and usage of
> RANDOM_ORVALUE. The below change get the kernel to boot. Can somebody
> faimilar with riscv pte format take a look at the RANDOM_ORVALUE?
> 
> modified   mm/debug_vm_pgtable.c
> @@ -548,7 +548,7 @@ static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep,
>  	pte_t pte = pfn_pte(pfn, prot);
>  
>  	pr_debug("Validating PTE clear\n");
> -	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> +//	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>  	set_pte_at(mm, vaddr, ptep, pte);
>  	barrier();
>  	pte_clear(mm, vaddr, ptep);

Do we have a fix for this problem ? Otherwise we just risk going into
the next release with this regression on riscv platforms.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-09-02 11:42 ` [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test Aneesh Kumar K.V
  2020-09-11  2:13   ` Nathan Chancellor
@ 2020-10-11 20:02   ` Guenter Roeck
  2020-10-12  4:29     ` Aneesh Kumar K.V
  1 sibling, 1 reply; 52+ messages in thread
From: Guenter Roeck @ 2020-10-11 20:02 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-mm, akpm, Anshuman Khandual, linuxppc-dev

On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote:
> pte_clear_tests operate on an existing pte entry. Make sure that
> is not a none pte entry.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

This patch causes all riscv64 images to crash. Reverting it
as well as the follow-up patch fixes the problem, but there are
still several warning messages starting with
	BUG kmem_cache (Not tainted): Freechain corrupt
I did not try to track down this other problem.

A detailed crash log is at
	https://kerneltests.org/builders/qemu-riscv64-next/builds/523/steps/qemubuildcommand/logs/stdio

Bisect log is attached.

Guenter

---
# bad: [d67bc7812221606e1886620a357b13f906814af7] Add linux-next specific files for 20201009
# good: [549738f15da0e5a00275977623be199fbbf7df50] Linux 5.9-rc8
git bisect start 'HEAD' 'v5.9-rc8'
# good: [b71be15b496cc71a3434a198fc1a1b9e08af6c57] Merge remote-tracking branch 'bpf-next/master' into master
git bisect good b71be15b496cc71a3434a198fc1a1b9e08af6c57
# good: [3542e5a87341bdca83e3d7f061b0f4e0f4c23f73] Merge remote-tracking branch 'spi/for-next' into master
git bisect good 3542e5a87341bdca83e3d7f061b0f4e0f4c23f73
# good: [65f9c957115c749ba79fb469083caf14101a93bb] Merge remote-tracking branch 'char-misc/char-misc-next' into master
git bisect good 65f9c957115c749ba79fb469083caf14101a93bb
# good: [aadfe5ecb55641d5994a5f3b27f074beead8f49b] Merge remote-tracking branch 'scsi-mkp/for-next' into master
git bisect good aadfe5ecb55641d5994a5f3b27f074beead8f49b
# good: [060196553d119d5c252f09ed5b0316929cc25983] Merge remote-tracking branch 'memblock/for-next' into master
git bisect good 060196553d119d5c252f09ed5b0316929cc25983
# bad: [d2340d89ffa6d2643f0689600dbf0969d86fdd3c] x86/setup: simplify initrd relocation and reservation
git bisect bad d2340d89ffa6d2643f0689600dbf0969d86fdd3c
# bad: [4b23734f5ba1a0487b2b569a9e64e4e5360009c1] mm/swapfile.c: remove unnecessary goto out in _swap_info_get()
git bisect bad 4b23734f5ba1a0487b2b569a9e64e4e5360009c1
# good: [b30f5277e97be65a99ca5af3d4337cb9acbf8fa7] device-dax: introduce 'struct dev_dax' typed-driver operations
git bisect good b30f5277e97be65a99ca5af3d4337cb9acbf8fa7
# good: [8c2075296dbbbce685fa14bbf46d29fba54f8a62] mm/debug_vm_pgtable: drop hugetlb_advanced_tests()
git bisect good 8c2075296dbbbce685fa14bbf46d29fba54f8a62
# bad: [52ce97889b3c85826995d22a690cd1430c14f316] mm/filemap: fix filemap_map_pages for THP
git bisect bad 52ce97889b3c85826995d22a690cd1430c14f316
# bad: [1ddc0b707be99faece6cdd77f34544f408c6617d] proc: optimise smaps for shmem entries
git bisect bad 1ddc0b707be99faece6cdd77f34544f408c6617d
# bad: [5d685be3785638202430b6baa2ecde482b52c41e] mm: factor find_get_incore_page out of mincore_page
git bisect bad 5d685be3785638202430b6baa2ecde482b52c41e
# bad: [0797f84d689b9f1e7256d954280b28bfeaf5b1fc] mm/debug_vm_pgtable: avoid doing memory allocation with pgtable_t mapped.
git bisect bad 0797f84d689b9f1e7256d954280b28bfeaf5b1fc
# bad: [4f9a78e6bcd60a25b187adc1526ab3815fc40dae] mm/debug_vm_pgtable: avoid none pte in pte_clear_test
git bisect bad 4f9a78e6bcd60a25b187adc1526ab3815fc40dae
# first bad commit: [4f9a78e6bcd60a25b187adc1526ab3815fc40dae] mm/debug_vm_pgtable: avoid none pte in pte_clear_test

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test
  2020-10-11 20:02   ` Guenter Roeck
@ 2020-10-12  4:29     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-10-12  4:29 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Anshuman Khandual, linux-mm, akpm, linuxppc-dev, linux-riscv

Guenter Roeck <linux@roeck-us.net> writes:

> On Wed, Sep 02, 2020 at 05:12:22PM +0530, Aneesh Kumar K.V wrote:
>> pte_clear_tests operate on an existing pte entry. Make sure that
>> is not a none pte entry.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>
> This patch causes all riscv64 images to crash. Reverting it
> as well as the follow-up patch fixes the problem, but there are
> still several warning messages starting with
> 	BUG kmem_cache (Not tainted): Freechain corrupt
> I did not try to track down this other problem.
>
> A detailed crash log is at
> 	https://kerneltests.org/builders/qemu-riscv64-next/builds/523/steps/qemubuildcommand/logs/stdio
>
> Bisect log is attached.


https://lore.kernel.org/linux-mm/87zh5wx51b.fsf@linux.ibm.com

This was mentioned earlier. The RANDOM_OR_VALUE used is interacting with
some of the riscv page table accessors. 

-aneesh

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
                   ` (15 preceding siblings ...)
  2020-09-13 11:03 ` [PATCH] mm/debug_vm_pgtable: Avoid doing memory allocation with pgtable_t mapped Aneesh Kumar K.V
@ 2020-10-13 20:58 ` Andrew Morton
  2020-10-14  3:15   ` Aneesh Kumar K.V
  16 siblings, 1 reply; 52+ messages in thread
From: Andrew Morton @ 2020-10-13 20:58 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-mm, Anshuman Khandual, linuxppc-dev

On Wed,  2 Sep 2020 17:12:09 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:

> This patch series includes fixes for debug_vm_pgtable test code so that
> they follow page table updates rules correctly. The first two patches introduce
> changes w.r.t ppc64. The patches are included in this series for completeness. We can
> merge them via ppc64 tree if required.

Do you think this series is ready to be merged?

Possibly-unresolved issues which I have recorded are

Against
mm-debug_vm_pgtable-locks-move-non-page-table-modifying-test-together.patch:

https://lkml.kernel.org/r/56830efb-887e-0000-a46e-ae015e5854cd@arm.com
https://lkml.kernel.org/r/20200910075752.GC26874@shao2-debian

Against mm-debug_vm_pgtable-avoid-none-pte-in-pte_clear_test.patch:

https://lkml.kernel.org/r/87zh5wx51b.fsf@linux.ibm.com
https://lkml.kernel.org/r/37a9facc-ca36-290f-3748-16c4a7a778fa@arm.com
https://lkml.kernel.org/r/20201011200258.GA91021@roeck-us.net

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-10-13 20:58 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Andrew Morton
@ 2020-10-14  3:15   ` Aneesh Kumar K.V
  2020-10-14 20:36     ` Andrew Morton
  0 siblings, 1 reply; 52+ messages in thread
From: Aneesh Kumar K.V @ 2020-10-14  3:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, Anshuman Khandual, linuxppc-dev

On 10/14/20 2:28 AM, Andrew Morton wrote:
> On Wed,  2 Sep 2020 17:12:09 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
> 
>> This patch series includes fixes for debug_vm_pgtable test code so that
>> they follow page table updates rules correctly. The first two patches introduce
>> changes w.r.t ppc64. The patches are included in this series for completeness. We can
>> merge them via ppc64 tree if required.
> 
> Do you think this series is ready to be merged?

Hopefully, except for the Riscv crash.

> 
> Possibly-unresolved issues which I have recorded are
> 
> Against
> mm-debug_vm_pgtable-locks-move-non-page-table-modifying-test-together.patch:
> 
> https://lkml.kernel.org/r/56830efb-887e-0000-a46e-ae015e5854cd@arm.com

I guess the full series do boot fine on arm.

> https://lkml.kernel.org/r/20200910075752.GC26874@shao2-debian

This should be fixed by

https://ozlabs.org/~akpm/mmots/broken-out/mm-debug_vm_pgtable-avoid-doing-memory-allocation-with-pgtable_t-mapped.patch

> 
> Against mm-debug_vm_pgtable-avoid-none-pte-in-pte_clear_test.patch:
> 
> https://lkml.kernel.org/r/87zh5wx51b.fsf@linux.ibm.com


yes this one we should get fixed. I was hoping someone familiar with 
Riscv pte updates rules would pitch in. IIUC we need to update 
RANDON_ORVALUE similar to how we updated it for s390 and ppc64.


  Alternatively we can do this

modified   mm/debug_vm_pgtable.c
@@ -548,7 +548,7 @@ static void __init pte_clear_tests(struct mm_struct 
*mm, pte_t *ptep,
  	pte_t pte = pfn_pte(pfn, prot);

  	pr_debug("Validating PTE clear\n");
-	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
+//	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
  	set_pte_at(mm, vaddr, ptep, pte);
  	barrier();
  	pte_clear(mm, vaddr, ptep);

till we get that feedback from RiscV team?

> https://lkml.kernel.org/r/37a9facc-ca36-290f-3748-16c4a7a778fa@arm.com

same as the above.

> https://lkml.kernel.org/r/20201011200258.GA91021@roeck-us.net
> 

same as the above.

-aneesh

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-10-14  3:15   ` Aneesh Kumar K.V
@ 2020-10-14 20:36     ` Andrew Morton
  2020-10-15  2:59       ` Anshuman Khandual
  0 siblings, 1 reply; 52+ messages in thread
From: Andrew Morton @ 2020-10-14 20:36 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-mm, Anshuman Khandual, linuxppc-dev

On Wed, 14 Oct 2020 08:45:16 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:

> > Against mm-debug_vm_pgtable-avoid-none-pte-in-pte_clear_test.patch:
> > 
> > https://lkml.kernel.org/r/87zh5wx51b.fsf@linux.ibm.com
> 
> 
> yes this one we should get fixed. I was hoping someone familiar with 
> Riscv pte updates rules would pitch in. IIUC we need to update 
> RANDON_ORVALUE similar to how we updated it for s390 and ppc64.
> 
> 
>   Alternatively we can do this
> 
> modified   mm/debug_vm_pgtable.c
> @@ -548,7 +548,7 @@ static void __init pte_clear_tests(struct mm_struct 
> *mm, pte_t *ptep,
>   	pte_t pte = pfn_pte(pfn, prot);
> 
>   	pr_debug("Validating PTE clear\n");
> -	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> +//	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>   	set_pte_at(mm, vaddr, ptep, pte);
>   	barrier();
>   	pte_clear(mm, vaddr, ptep);
> 
> till we get that feedback from RiscV team?

Would it be better to do

#ifdef CONFIG_S390
	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
#endif

?


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
  2020-10-14 20:36     ` Andrew Morton
@ 2020-10-15  2:59       ` Anshuman Khandual
  0 siblings, 0 replies; 52+ messages in thread
From: Anshuman Khandual @ 2020-10-15  2:59 UTC (permalink / raw)
  To: Andrew Morton, Aneesh Kumar K.V; +Cc: linux-mm, linuxppc-dev



On 10/15/2020 02:06 AM, Andrew Morton wrote:
> On Wed, 14 Oct 2020 08:45:16 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:
> 
>>> Against mm-debug_vm_pgtable-avoid-none-pte-in-pte_clear_test.patch:
>>>
>>> https://lkml.kernel.org/r/87zh5wx51b.fsf@linux.ibm.com
>>
>>
>> yes this one we should get fixed. I was hoping someone familiar with 
>> Riscv pte updates rules would pitch in. IIUC we need to update 
>> RANDON_ORVALUE similar to how we updated it for s390 and ppc64.
>>
>>
>>   Alternatively we can do this
>>
>> modified   mm/debug_vm_pgtable.c
>> @@ -548,7 +548,7 @@ static void __init pte_clear_tests(struct mm_struct 
>> *mm, pte_t *ptep,
>>   	pte_t pte = pfn_pte(pfn, prot);
>>
>>   	pr_debug("Validating PTE clear\n");
>> -	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>> +//	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
>>   	set_pte_at(mm, vaddr, ptep, pte);
>>   	barrier();
>>   	pte_clear(mm, vaddr, ptep);
>>
>> till we get that feedback from RiscV team?
> 
> Would it be better to do
> 
> #ifdef CONFIG_S390
> 	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
> #endif

I would suggest just dropping RANDOM_ORVALUE from pte_clear_tests()
possibly with a comment saying it needs to be fixed on RISCV before
being added back later.

	pte = __pte(pte_val(pte));

OR

Disable RANDOM_ORVALUE only for RISCV.

#ifndef CONFIG_RISCV
	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
#endif

^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, back to index

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-02 11:42 [PATCH v4 00/13] mm/debug_vm_pgtable fixes Aneesh Kumar K.V
2020-09-02 11:42 ` [PATCH v4 01/13] powerpc/mm: Add DEBUG_VM WARN for pmd_clear Aneesh Kumar K.V
2020-09-02 11:42 ` [PATCH v4 02/13] powerpc/mm: Move setting pte specific flags to pfn_pte Aneesh Kumar K.V
2020-09-02 12:30   ` Christophe Leroy
2020-09-02 11:42 ` [PATCH v4 03/13] mm/debug_vm_pgtable/ppc64: Avoid setting top bits in radom value Aneesh Kumar K.V
2020-09-04  4:03   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 04/13] mm/debug_vm_pgtables/hugevmap: Use the arch helper to identify huge vmap support Aneesh Kumar K.V
2020-09-02 12:40   ` Christophe Leroy
2020-09-02 12:53     ` Aneesh Kumar K.V
2020-09-04  4:08   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 05/13] mm/debug_vm_pgtable/savedwrite: Enable savedwrite test with CONFIG_NUMA_BALANCING Aneesh Kumar K.V
2020-09-04  4:09   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 06/13] mm/debug_vm_pgtable/THP: Mark the pte entry huge before using set_pmd/pud_at Aneesh Kumar K.V
2020-09-04  5:21   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 07/13] mm/debug_vm_pgtable/set_pte/pmd/pud: Don't use set_*_at to update an existing pte entry Aneesh Kumar K.V
2020-09-04  5:34   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 08/13] mm/debug_vm_pgtable/locks: Move non page table modifying test together Aneesh Kumar K.V
2020-09-04  3:58   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 09/13] mm/debug_vm_pgtable/locks: Take correct page table lock Aneesh Kumar K.V
2020-09-04  5:39   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 10/13] mm/debug_vm_pgtable/thp: Use page table depost/withdraw with THP Aneesh Kumar K.V
2020-09-04  4:21   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 11/13] mm/debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries Aneesh Kumar K.V
2020-09-04  6:03   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 12/13] mm/debug_vm_pgtable/hugetlb: Disable hugetlb test on ppc64 Aneesh Kumar K.V
2020-09-04  6:19   ` Anshuman Khandual
2020-09-02 11:42 ` [PATCH v4 13/13] mm/debug_vm_pgtable: Avoid none pte in pte_clear_test Aneesh Kumar K.V
2020-09-11  2:13   ` Nathan Chancellor
2020-09-11  5:21     ` Aneesh Kumar K.V
2020-09-23  3:14       ` Anshuman Khandual
2020-10-11 20:02   ` Guenter Roeck
2020-10-12  4:29     ` Aneesh Kumar K.V
2020-09-02 11:46 ` Aneesh Kumar K.V
2020-09-04  4:16   ` Anshuman Khandual
2020-09-04  6:48 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Anshuman Khandual
2020-09-04 15:26   ` Gerald Schaefer
2020-09-04 16:01     ` Gerald Schaefer
2020-09-04 17:53       ` Gerald Schaefer
2020-09-09  8:38         ` Anshuman Khandual
2020-09-08 15:39       ` Gerald Schaefer
2020-09-09  6:08         ` Aneesh Kumar K.V
2020-09-09 11:16           ` Gerald Schaefer
2020-09-09  8:15       ` Anshuman Khandual
2020-09-09 11:10         ` Gerald Schaefer
2020-09-09  8:08     ` Anshuman Khandual
2020-09-09 11:36       ` Gerald Schaefer
2020-09-13 11:03 ` [PATCH] mm/debug_vm_pgtable: Avoid doing memory allocation with pgtable_t mapped Aneesh Kumar K.V
2020-09-13 13:42   ` Matthew Wilcox
2020-10-13 20:58 ` [PATCH v4 00/13] mm/debug_vm_pgtable fixes Andrew Morton
2020-10-14  3:15   ` Aneesh Kumar K.V
2020-10-14 20:36     ` Andrew Morton
2020-10-15  2:59       ` Anshuman Khandual

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git