linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] mm/debug_vm_pgtable: Some minor updates
@ 2020-11-27  5:06 Anshuman Khandual
  2020-11-27  5:06 ` [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect Anshuman Khandual
  2020-11-27  5:06 ` [PATCH 2/2] mm/debug_vm_pgtable/basic: Iterate over entire protection_map[] Anshuman Khandual
  0 siblings, 2 replies; 10+ messages in thread
From: Anshuman Khandual @ 2020-11-27  5:06 UTC (permalink / raw)
  To: linux-mm, akpm
  Cc: linux-kernel, catalin.marinas, steven.price, christophe.leroy,
	gerald.schaefer, vgupta, paul.walmsley, Anshuman Khandual

This series contains some cleanups and new test suggestions from Catalin
from an earlier discussion.

https://lore.kernel.org/linux-mm/20201123142237.GF17833@gaia/

This series is based on v5.10-rc5 and has been tested on arm64 and x86 but
has only been build tested on riscv, s390, arc etc. It would be great if
folks could test this on these platforms as well. Thank you.

Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

Anshuman Khandual (2):
  mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  mm/debug_vm_pgtable/basic: Iterate over entire protection_map[]

 Documentation/vm/arch_pgtable_helpers.rst |  8 ++--
 mm/debug_vm_pgtable.c                     | 50 +++++++++++++++++------
 2 files changed, 42 insertions(+), 16 deletions(-)

-- 
2.20.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  2020-11-27  5:06 [PATCH 0/2] mm/debug_vm_pgtable: Some minor updates Anshuman Khandual
@ 2020-11-27  5:06 ` Anshuman Khandual
  2020-11-27  8:22   ` Christophe Leroy
  2020-11-27  5:06 ` [PATCH 2/2] mm/debug_vm_pgtable/basic: Iterate over entire protection_map[] Anshuman Khandual
  1 sibling, 1 reply; 10+ messages in thread
From: Anshuman Khandual @ 2020-11-27  5:06 UTC (permalink / raw)
  To: linux-mm, akpm
  Cc: linux-kernel, catalin.marinas, steven.price, christophe.leroy,
	gerald.schaefer, vgupta, paul.walmsley, Anshuman Khandual

This adds validation tests for dirtiness after write protect conversion for
each page table level. This is important for platforms such as arm64 that
removes the hardware dirty bit while making it an write protected one. This
also fixes pxx_wrprotect() related typos in the documentation file.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 Documentation/vm/arch_pgtable_helpers.rst | 8 ++++----
 mm/debug_vm_pgtable.c                     | 3 +++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/Documentation/vm/arch_pgtable_helpers.rst b/Documentation/vm/arch_pgtable_helpers.rst
index f3591ee3aaa8..552567d863b8 100644
--- a/Documentation/vm/arch_pgtable_helpers.rst
+++ b/Documentation/vm/arch_pgtable_helpers.rst
@@ -50,7 +50,7 @@ PTE Page Table Helpers
 +---------------------------+--------------------------------------------------+
 | pte_mkwrite               | Creates a writable PTE                           |
 +---------------------------+--------------------------------------------------+
-| pte_mkwrprotect           | Creates a write protected PTE                    |
+| pte_wrprotect             | Creates a write protected PTE                    |
 +---------------------------+--------------------------------------------------+
 | pte_mkspecial             | Creates a special PTE                            |
 +---------------------------+--------------------------------------------------+
@@ -120,7 +120,7 @@ PMD Page Table Helpers
 +---------------------------+--------------------------------------------------+
 | pmd_mkwrite               | Creates a writable PMD                           |
 +---------------------------+--------------------------------------------------+
-| pmd_mkwrprotect           | Creates a write protected PMD                    |
+| pmd_wrprotect             | Creates a write protected PMD                    |
 +---------------------------+--------------------------------------------------+
 | pmd_mkspecial             | Creates a special PMD                            |
 +---------------------------+--------------------------------------------------+
@@ -186,7 +186,7 @@ PUD Page Table Helpers
 +---------------------------+--------------------------------------------------+
 | pud_mkwrite               | Creates a writable PUD                           |
 +---------------------------+--------------------------------------------------+
-| pud_mkwrprotect           | Creates a write protected PUD                    |
+| pud_wrprotect             | Creates a write protected PUD                    |
 +---------------------------+--------------------------------------------------+
 | pud_mkdevmap              | Creates a ZONE_DEVICE mapped PUD                 |
 +---------------------------+--------------------------------------------------+
@@ -224,7 +224,7 @@ HugeTLB Page Table Helpers
 +---------------------------+--------------------------------------------------+
 | huge_pte_mkwrite          | Creates a writable HugeTLB                       |
 +---------------------------+--------------------------------------------------+
-| huge_pte_mkwrprotect      | Creates a write protected HugeTLB                |
+| huge_pte_wrprotect        | Creates a write protected HugeTLB                |
 +---------------------------+--------------------------------------------------+
 | huge_ptep_get_and_clear   | Clears a HugeTLB                                 |
 +---------------------------+--------------------------------------------------+
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index c05d9dcf7891..a5be11210597 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
 	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
 	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
 	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
+	WARN_ON(pte_dirty(pte_wrprotect(pte)));
 }
 
 static void __init pte_advanced_tests(struct mm_struct *mm,
@@ -144,6 +145,7 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
 	WARN_ON(pmd_young(pmd_mkold(pmd_mkyoung(pmd))));
 	WARN_ON(pmd_dirty(pmd_mkclean(pmd_mkdirty(pmd))));
 	WARN_ON(pmd_write(pmd_wrprotect(pmd_mkwrite(pmd))));
+	WARN_ON(pmd_dirty(pmd_wrprotect(pmd)));
 	/*
 	 * A huge page does not point to next level page table
 	 * entry. Hence this must qualify as pmd_bad().
@@ -262,6 +264,7 @@ static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
 	WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
 	WARN_ON(pud_write(pud_wrprotect(pud_mkwrite(pud))));
 	WARN_ON(pud_young(pud_mkold(pud_mkyoung(pud))));
+	WARN_ON(pud_dirty(pud_wrprotect(pud)));
 
 	if (mm_pmd_folded(mm))
 		return;
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] mm/debug_vm_pgtable/basic: Iterate over entire protection_map[]
  2020-11-27  5:06 [PATCH 0/2] mm/debug_vm_pgtable: Some minor updates Anshuman Khandual
  2020-11-27  5:06 ` [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect Anshuman Khandual
@ 2020-11-27  5:06 ` Anshuman Khandual
  2020-11-27  9:14   ` Steven Price
  1 sibling, 1 reply; 10+ messages in thread
From: Anshuman Khandual @ 2020-11-27  5:06 UTC (permalink / raw)
  To: linux-mm, akpm
  Cc: linux-kernel, catalin.marinas, steven.price, christophe.leroy,
	gerald.schaefer, vgupta, paul.walmsley, Anshuman Khandual

Currently the basic tests just validate various page table transformations
after starting with vm_get_page_prot(VM_READ|VM_WRITE|VM_EXEC) protection.
Instead scan over the entire protection_map[] for better coverage. It also
makes sure that all these basic page table tranformations checks hold true
irrespective of the starting protection value for the page table entry.
There is also a slight change in the debug print format for basic tests to
capture the protection value it is being tested with. The modified output
looks something like

[pte_basic_tests          ]: Validating PTE basic ()
[pte_basic_tests          ]: Validating PTE basic (read)
[pte_basic_tests          ]: Validating PTE basic (write)
[pte_basic_tests          ]: Validating PTE basic (read|write)
[pte_basic_tests          ]: Validating PTE basic (exec)
[pte_basic_tests          ]: Validating PTE basic (read|exec)
[pte_basic_tests          ]: Validating PTE basic (write|exec)
[pte_basic_tests          ]: Validating PTE basic (read|write|exec)
[pte_basic_tests          ]: Validating PTE basic (shared)
[pte_basic_tests          ]: Validating PTE basic (read|shared)
[pte_basic_tests          ]: Validating PTE basic (write|shared)
[pte_basic_tests          ]: Validating PTE basic (read|write|shared)
[pte_basic_tests          ]: Validating PTE basic (exec|shared)
[pte_basic_tests          ]: Validating PTE basic (read|exec|shared)
[pte_basic_tests          ]: Validating PTE basic (write|exec|shared)
[pte_basic_tests          ]: Validating PTE basic (read|write|exec|shared)

This adds a missing argument 'struct mm_struct *' in pud_basic_tests() test
. This never got exposed before as PUD based THP is available only on X86
platform where mm_pmd_folded(mm) call gets macro replaced without requiring
the mm_struct i.e __is_defined(__PAGETABLE_PMD_FOLDED).

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 mm/debug_vm_pgtable.c | 47 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 12 deletions(-)

diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index a5be11210597..92b4a53d622b 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -58,11 +58,13 @@
 #define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
 #define RANDOM_NZVALUE	GENMASK(7, 0)
 
-static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
+static void __init pte_basic_tests(unsigned long pfn, int idx)
 {
+	pgprot_t prot = protection_map[idx];
 	pte_t pte = pfn_pte(pfn, prot);
+	unsigned long val = idx, *ptr = &val;
 
-	pr_debug("Validating PTE basic\n");
+	pr_debug("Validating PTE basic (%pGv)\n", ptr);
 	WARN_ON(!pte_same(pte, pte));
 	WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
 	WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
@@ -130,14 +132,16 @@ static void __init pte_savedwrite_tests(unsigned long pfn, pgprot_t prot)
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
+static void __init pmd_basic_tests(unsigned long pfn, int idx)
 {
+	pgprot_t prot = protection_map[idx];
 	pmd_t pmd = pfn_pmd(pfn, prot);
+	unsigned long val = idx, *ptr = &val;
 
 	if (!has_transparent_hugepage())
 		return;
 
-	pr_debug("Validating PMD basic\n");
+	pr_debug("Validating PMD basic (%pGv)\n", ptr);
 	WARN_ON(!pmd_same(pmd, pmd));
 	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
 	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
@@ -251,14 +255,16 @@ static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
 }
 
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
-static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
+static void __init pud_basic_tests(struct mm_struct *mm, unsigned long pfn, int idx)
 {
+	pgprot_t prot = protection_map[idx];
 	pud_t pud = pfn_pud(pfn, prot);
+	unsigned long val = idx, *ptr = &val;
 
 	if (!has_transparent_hugepage())
 		return;
 
-	pr_debug("Validating PUD basic\n");
+	pr_debug("Validating PUD basic (%pGv)\n", ptr);
 	WARN_ON(!pud_same(pud, pud));
 	WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud))));
 	WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
@@ -362,7 +368,7 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
 #endif /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
 
 #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
-static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_basic_tests(struct mm_struct *mm, unsigned long pfn, int idx) { }
 static void __init pud_advanced_tests(struct mm_struct *mm,
 				      struct vm_area_struct *vma, pud_t *pudp,
 				      unsigned long pfn, unsigned long vaddr,
@@ -375,8 +381,8 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
 }
 #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
 #else  /* !CONFIG_TRANSPARENT_HUGEPAGE */
-static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { }
-static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pmd_basic_tests(unsigned long pfn, int idx) { }
+static void __init pud_basic_tests(struct mm_struct *mm, unsigned long pfn, int idx) { }
 static void __init pmd_advanced_tests(struct mm_struct *mm,
 				      struct vm_area_struct *vma, pmd_t *pmdp,
 				      unsigned long pfn, unsigned long vaddr,
@@ -902,6 +908,7 @@ static int __init debug_vm_pgtable(void)
 	unsigned long vaddr, pte_aligned, pmd_aligned;
 	unsigned long pud_aligned, p4d_aligned, pgd_aligned;
 	spinlock_t *ptl = NULL;
+	int idx;
 
 	pr_info("Validating architecture page table helpers\n");
 	prot = vm_get_page_prot(VMFLAGS);
@@ -966,9 +973,25 @@ static int __init debug_vm_pgtable(void)
 	saved_pmdp = pmd_offset(pudp, 0UL);
 	saved_ptep = pmd_pgtable(pmd);
 
-	pte_basic_tests(pte_aligned, prot);
-	pmd_basic_tests(pmd_aligned, prot);
-	pud_basic_tests(pud_aligned, prot);
+	/*
+	 * Iterate over the protection_map[] to make sure that all
+	 * the basic page table transformation validations just hold
+	 * true irrespective of the starting protection value for a
+	 * given page table entry.
+	 */
+	for (idx = 0; idx < ARRAY_SIZE(protection_map); idx++) {
+		pte_basic_tests(pte_aligned, idx);
+		pmd_basic_tests(pmd_aligned, idx);
+		pud_basic_tests(mm, pud_aligned, idx);
+	}
+
+	/*
+	 * Both P4D and PGD level tests are very basic which do not
+	 * involve creating page table entries from the protection
+	 * value and the given pfn. Hence just keep them out from
+	 * the above iteration for now to save some test execution
+	 * time.
+	 */
 	p4d_basic_tests(p4d_aligned, prot);
 	pgd_basic_tests(pgd_aligned, prot);
 
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  2020-11-27  5:06 ` [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect Anshuman Khandual
@ 2020-11-27  8:22   ` Christophe Leroy
  2020-11-27  9:44     ` Catalin Marinas
  0 siblings, 1 reply; 10+ messages in thread
From: Christophe Leroy @ 2020-11-27  8:22 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm
  Cc: linux-kernel, catalin.marinas, steven.price, gerald.schaefer,
	vgupta, paul.walmsley



Le 27/11/2020 à 06:06, Anshuman Khandual a écrit :
> This adds validation tests for dirtiness after write protect conversion for
> each page table level. This is important for platforms such as arm64 that
> removes the hardware dirty bit while making it an write protected one. This
> also fixes pxx_wrprotect() related typos in the documentation file.
> 


> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index c05d9dcf7891..a5be11210597 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
>   	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
>   	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
>   	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
> +	WARN_ON(pte_dirty(pte_wrprotect(pte)));

Wondering what you are testing here exactly.

Do you expect that if PTE has the dirty bit, it gets cleared by pte_wrprotect() ?

Powerpc doesn't do that, it only clears the RW bit but the dirty bit remains if it is set, until you 
call pte_mkclean() explicitely.

>   }
>   
>   static void __init pte_advanced_tests(struct mm_struct *mm,
> @@ -144,6 +145,7 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>   	WARN_ON(pmd_young(pmd_mkold(pmd_mkyoung(pmd))));
>   	WARN_ON(pmd_dirty(pmd_mkclean(pmd_mkdirty(pmd))));
>   	WARN_ON(pmd_write(pmd_wrprotect(pmd_mkwrite(pmd))));
> +	WARN_ON(pmd_dirty(pmd_wrprotect(pmd)));
>   	/*
>   	 * A huge page does not point to next level page table
>   	 * entry. Hence this must qualify as pmd_bad().
> @@ -262,6 +264,7 @@ static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
>   	WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
>   	WARN_ON(pud_write(pud_wrprotect(pud_mkwrite(pud))));
>   	WARN_ON(pud_young(pud_mkold(pud_mkyoung(pud))));
> +	WARN_ON(pud_dirty(pud_wrprotect(pud)));
>   
>   	if (mm_pmd_folded(mm))
>   		return;
> 

Christophe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] mm/debug_vm_pgtable/basic: Iterate over entire protection_map[]
  2020-11-27  5:06 ` [PATCH 2/2] mm/debug_vm_pgtable/basic: Iterate over entire protection_map[] Anshuman Khandual
@ 2020-11-27  9:14   ` Steven Price
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Price @ 2020-11-27  9:14 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm
  Cc: linux-kernel, catalin.marinas, christophe.leroy, gerald.schaefer,
	vgupta, paul.walmsley

On 27/11/2020 05:06, Anshuman Khandual wrote:
> Currently the basic tests just validate various page table transformations
> after starting with vm_get_page_prot(VM_READ|VM_WRITE|VM_EXEC) protection.
> Instead scan over the entire protection_map[] for better coverage. It also
> makes sure that all these basic page table tranformations checks hold true
> irrespective of the starting protection value for the page table entry.
> There is also a slight change in the debug print format for basic tests to
> capture the protection value it is being tested with. The modified output
> looks something like
> 
> [pte_basic_tests          ]: Validating PTE basic ()
> [pte_basic_tests          ]: Validating PTE basic (read)
> [pte_basic_tests          ]: Validating PTE basic (write)
> [pte_basic_tests          ]: Validating PTE basic (read|write)
> [pte_basic_tests          ]: Validating PTE basic (exec)
> [pte_basic_tests          ]: Validating PTE basic (read|exec)
> [pte_basic_tests          ]: Validating PTE basic (write|exec)
> [pte_basic_tests          ]: Validating PTE basic (read|write|exec)
> [pte_basic_tests          ]: Validating PTE basic (shared)
> [pte_basic_tests          ]: Validating PTE basic (read|shared)
> [pte_basic_tests          ]: Validating PTE basic (write|shared)
> [pte_basic_tests          ]: Validating PTE basic (read|write|shared)
> [pte_basic_tests          ]: Validating PTE basic (exec|shared)
> [pte_basic_tests          ]: Validating PTE basic (read|exec|shared)
> [pte_basic_tests          ]: Validating PTE basic (write|exec|shared)
> [pte_basic_tests          ]: Validating PTE basic (read|write|exec|shared)
> 
> This adds a missing argument 'struct mm_struct *' in pud_basic_tests() test
> . This never got exposed before as PUD based THP is available only on X86
> platform where mm_pmd_folded(mm) call gets macro replaced without requiring
> the mm_struct i.e __is_defined(__PAGETABLE_PMD_FOLDED).
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>   mm/debug_vm_pgtable.c | 47 ++++++++++++++++++++++++++++++++-----------
>   1 file changed, 35 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index a5be11210597..92b4a53d622b 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -58,11 +58,13 @@
>   #define RANDOM_ORVALUE (GENMASK(BITS_PER_LONG - 1, 0) & ~ARCH_SKIP_MASK)
>   #define RANDOM_NZVALUE	GENMASK(7, 0)
>   
> -static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> +static void __init pte_basic_tests(unsigned long pfn, int idx)
>   {
> +	pgprot_t prot = protection_map[idx];
>   	pte_t pte = pfn_pte(pfn, prot);
> +	unsigned long val = idx, *ptr = &val;
>   
> -	pr_debug("Validating PTE basic\n");
> +	pr_debug("Validating PTE basic (%pGv)\n", ptr);
>   	WARN_ON(!pte_same(pte, pte));
>   	WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
>   	WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
> @@ -130,14 +132,16 @@ static void __init pte_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>   }
>   
>   #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> -static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
> +static void __init pmd_basic_tests(unsigned long pfn, int idx)
>   {
> +	pgprot_t prot = protection_map[idx];
>   	pmd_t pmd = pfn_pmd(pfn, prot);
> +	unsigned long val = idx, *ptr = &val;
>   
>   	if (!has_transparent_hugepage())
>   		return;
>   
> -	pr_debug("Validating PMD basic\n");
> +	pr_debug("Validating PMD basic (%pGv)\n", ptr);
>   	WARN_ON(!pmd_same(pmd, pmd));
>   	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
>   	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
> @@ -251,14 +255,16 @@ static void __init pmd_savedwrite_tests(unsigned long pfn, pgprot_t prot)
>   }
>   
>   #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> -static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
> +static void __init pud_basic_tests(struct mm_struct *mm, unsigned long pfn, int idx)
>   {
> +	pgprot_t prot = protection_map[idx];
>   	pud_t pud = pfn_pud(pfn, prot);
> +	unsigned long val = idx, *ptr = &val;
>   
>   	if (!has_transparent_hugepage())
>   		return;
>   
> -	pr_debug("Validating PUD basic\n");
> +	pr_debug("Validating PUD basic (%pGv)\n", ptr);
>   	WARN_ON(!pud_same(pud, pud));
>   	WARN_ON(!pud_young(pud_mkyoung(pud_mkold(pud))));
>   	WARN_ON(!pud_write(pud_mkwrite(pud_wrprotect(pud))));
> @@ -362,7 +368,7 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
>   #endif /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
>   
>   #else  /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
> -static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
> +static void __init pud_basic_tests(struct mm_struct *mm, unsigned long pfn, int idx) { }
>   static void __init pud_advanced_tests(struct mm_struct *mm,
>   				      struct vm_area_struct *vma, pud_t *pudp,
>   				      unsigned long pfn, unsigned long vaddr,
> @@ -375,8 +381,8 @@ static void __init pud_huge_tests(pud_t *pudp, unsigned long pfn, pgprot_t prot)
>   }
>   #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
>   #else  /* !CONFIG_TRANSPARENT_HUGEPAGE */
> -static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { }
> -static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
> +static void __init pmd_basic_tests(unsigned long pfn, int idx) { }
> +static void __init pud_basic_tests(struct mm_struct *mm, unsigned long pfn, int idx) { }
>   static void __init pmd_advanced_tests(struct mm_struct *mm,
>   				      struct vm_area_struct *vma, pmd_t *pmdp,
>   				      unsigned long pfn, unsigned long vaddr,
> @@ -902,6 +908,7 @@ static int __init debug_vm_pgtable(void)
>   	unsigned long vaddr, pte_aligned, pmd_aligned;
>   	unsigned long pud_aligned, p4d_aligned, pgd_aligned;
>   	spinlock_t *ptl = NULL;
> +	int idx;
>   
>   	pr_info("Validating architecture page table helpers\n");
>   	prot = vm_get_page_prot(VMFLAGS);
> @@ -966,9 +973,25 @@ static int __init debug_vm_pgtable(void)
>   	saved_pmdp = pmd_offset(pudp, 0UL);
>   	saved_ptep = pmd_pgtable(pmd);
>   
> -	pte_basic_tests(pte_aligned, prot);
> -	pmd_basic_tests(pmd_aligned, prot);
> -	pud_basic_tests(pud_aligned, prot);
> +	/*
> +	 * Iterate over the protection_map[] to make sure that all
> +	 * the basic page table transformation validations just hold
> +	 * true irrespective of the starting protection value for a
> +	 * given page table entry.
> +	 */
> +	for (idx = 0; idx < ARRAY_SIZE(protection_map); idx++) {
> +		pte_basic_tests(pte_aligned, idx);
> +		pmd_basic_tests(pmd_aligned, idx);
> +		pud_basic_tests(mm, pud_aligned, idx);
> +	}
> +
> +	/*
> +	 * Both P4D and PGD level tests are very basic which do not
> +	 * involve creating page table entries from the protection
> +	 * value and the given pfn. Hence just keep them out from
> +	 * the above iteration for now to save some test execution
> +	 * time.
> +	 */
>   	p4d_basic_tests(p4d_aligned, prot);
>   	pgd_basic_tests(pgd_aligned, prot);
>   
> 



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  2020-11-27  8:22   ` Christophe Leroy
@ 2020-11-27  9:44     ` Catalin Marinas
  2020-11-30  4:25       ` Anshuman Khandual
  0 siblings, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2020-11-27  9:44 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Anshuman Khandual, linux-mm, akpm, linux-kernel, steven.price,
	gerald.schaefer, vgupta, paul.walmsley

On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
> Le 27/11/2020 à 06:06, Anshuman Khandual a écrit :
> > This adds validation tests for dirtiness after write protect conversion for
> > each page table level. This is important for platforms such as arm64 that
> > removes the hardware dirty bit while making it an write protected one. This
> > also fixes pxx_wrprotect() related typos in the documentation file.
> 
> > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> > index c05d9dcf7891..a5be11210597 100644
> > --- a/mm/debug_vm_pgtable.c
> > +++ b/mm/debug_vm_pgtable.c
> > @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> >   	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
> >   	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
> >   	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
> > +	WARN_ON(pte_dirty(pte_wrprotect(pte)));
> 
> Wondering what you are testing here exactly.
> 
> Do you expect that if PTE has the dirty bit, it gets cleared by pte_wrprotect() ?
> 
> Powerpc doesn't do that, it only clears the RW bit but the dirty bit remains
> if it is set, until you call pte_mkclean() explicitely.

Arm64 has an unusual way of setting a hardware dirty "bit", it actually
clears the PTE_RDONLY bit. The pte_wrprotect() sets the PTE_RDONLY bit
back and we can lose the dirty information. Will found this and posted
patches to fix the arm64 pte_wprotect() to set a software PTE_DIRTY if
!PTE_RDONLY (we do this for ptep_set_wrprotect() already). My concern
was that we may inadvertently make a fresh/clean pte dirty with such
change, hence the suggestion for the test.

That said, I think we also need a test in the other direction,
pte_wrprotect() should preserve any dirty information:

	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));

If pte_mkwrite() makes a pte truly writable and potentially dirty, we
could also add a test as below. However, I think that's valid for arm64,
other architectures with a separate hardware dirty bit would fail this:

	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkwrite(pte))));

-- 
Catalin


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  2020-11-27  9:44     ` Catalin Marinas
@ 2020-11-30  4:25       ` Anshuman Khandual
  2020-11-30  9:38         ` Catalin Marinas
  0 siblings, 1 reply; 10+ messages in thread
From: Anshuman Khandual @ 2020-11-30  4:25 UTC (permalink / raw)
  To: Catalin Marinas, Christophe Leroy
  Cc: linux-mm, akpm, linux-kernel, steven.price, gerald.schaefer,
	vgupta, paul.walmsley



On 11/27/20 3:14 PM, Catalin Marinas wrote:
> On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
>> Le 27/11/2020 à 06:06, Anshuman Khandual a écrit :
>>> This adds validation tests for dirtiness after write protect conversion for
>>> each page table level. This is important for platforms such as arm64 that
>>> removes the hardware dirty bit while making it an write protected one. This
>>> also fixes pxx_wrprotect() related typos in the documentation file.
>>
>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>> index c05d9dcf7891..a5be11210597 100644
>>> --- a/mm/debug_vm_pgtable.c
>>> +++ b/mm/debug_vm_pgtable.c
>>> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
>>>   	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
>>>   	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
>>>   	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
>>> +	WARN_ON(pte_dirty(pte_wrprotect(pte)));
>>
>> Wondering what you are testing here exactly.
>>
>> Do you expect that if PTE has the dirty bit, it gets cleared by pte_wrprotect() ?
>>
>> Powerpc doesn't do that, it only clears the RW bit but the dirty bit remains
>> if it is set, until you call pte_mkclean() explicitely.
> 
> Arm64 has an unusual way of setting a hardware dirty "bit", it actually
> clears the PTE_RDONLY bit. The pte_wrprotect() sets the PTE_RDONLY bit
> back and we can lose the dirty information. Will found this and posted
> patches to fix the arm64 pte_wprotect() to set a software PTE_DIRTY if
> !PTE_RDONLY (we do this for ptep_set_wrprotect() already). My concern
> was that we may inadvertently make a fresh/clean pte dirty with such
> change, hence the suggestion for the test.
> 
> That said, I think we also need a test in the other direction,
> pte_wrprotect() should preserve any dirty information:
> 
> 	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));

This seems like a generic enough principle which all platforms should
adhere to. But the proposed test WARN_ON(pte_dirty(pte_wrprotect(pte)))
might fail on some platforms if the page table entry came in as a dirty
one and pte_wrprotect() is not expected to alter the dirty state.

Instead, should we just add the following two tests, which would ensure
that pte_wrprotect() never alters the dirty state of a page table entry.

WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
WARN_ON(pte_dirty(pte_wrprotect(pte_mkclean(pte))));

> 
> If pte_mkwrite() makes a pte truly writable and potentially dirty, we
> could also add a test as below. However, I think that's valid for arm64,
> other architectures with a separate hardware dirty bit would fail this:
> 
> 	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkwrite(pte))));

Right.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  2020-11-30  4:25       ` Anshuman Khandual
@ 2020-11-30  9:38         ` Catalin Marinas
  2020-11-30 10:58           ` Anshuman Khandual
  0 siblings, 1 reply; 10+ messages in thread
From: Catalin Marinas @ 2020-11-30  9:38 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Christophe Leroy, linux-mm, akpm, linux-kernel, steven.price,
	gerald.schaefer, vgupta, paul.walmsley

On Mon, Nov 30, 2020 at 09:55:00AM +0530, Anshuman Khandual wrote:
> On 11/27/20 3:14 PM, Catalin Marinas wrote:
> > On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
> >> Le 27/11/2020 à 06:06, Anshuman Khandual a écrit :
> >>> This adds validation tests for dirtiness after write protect conversion for
> >>> each page table level. This is important for platforms such as arm64 that
> >>> removes the hardware dirty bit while making it an write protected one. This
> >>> also fixes pxx_wrprotect() related typos in the documentation file.
> >>
> >>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> >>> index c05d9dcf7891..a5be11210597 100644
> >>> --- a/mm/debug_vm_pgtable.c
> >>> +++ b/mm/debug_vm_pgtable.c
> >>> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> >>>   	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
> >>>   	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
> >>>   	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
> >>> +	WARN_ON(pte_dirty(pte_wrprotect(pte)));
> >>
> >> Wondering what you are testing here exactly.
> >>
> >> Do you expect that if PTE has the dirty bit, it gets cleared by
> >> pte_wrprotect() ?
> >>
> >> Powerpc doesn't do that, it only clears the RW bit but the dirty
> >> bit remains if it is set, until you call pte_mkclean() explicitely.
> > 
> > Arm64 has an unusual way of setting a hardware dirty "bit", it actually
> > clears the PTE_RDONLY bit. The pte_wrprotect() sets the PTE_RDONLY bit
> > back and we can lose the dirty information. Will found this and posted
> > patches to fix the arm64 pte_wprotect() to set a software PTE_DIRTY if
> > !PTE_RDONLY (we do this for ptep_set_wrprotect() already). My concern
> > was that we may inadvertently make a fresh/clean pte dirty with such
> > change, hence the suggestion for the test.
> > 
> > That said, I think we also need a test in the other direction,
> > pte_wrprotect() should preserve any dirty information:
> > 
> > 	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
> 
> This seems like a generic enough principle which all platforms should
> adhere to. But the proposed test WARN_ON(pte_dirty(pte_wrprotect(pte)))
> might fail on some platforms if the page table entry came in as a dirty
> one and pte_wrprotect() is not expected to alter the dirty state.

Ah, so do we have architectures where entries in protection_map[] are
already dirty? If those are valid, maybe the check should be:

	WARN_ON(!pte_dirty(pte) && pte_dirty(pte_wrprotect(pte)));

> Instead, should we just add the following two tests, which would ensure
> that pte_wrprotect() never alters the dirty state of a page table entry.
> 
> WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
> WARN_ON(pte_dirty(pte_wrprotect(pte_mkclean(pte))));

These should be added as additional tests. However, my initial thought
was to check whether pte_wrprotect() on a new pte created from a
protection_map[] entry directly would inadvertently dirty it. On arm64,
that means a protection_map[] entry missing PTE_RDONLY. A pte_mkclean()
would set PTE_RDONLY, so we'd miss such check.

-- 
Catalin


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  2020-11-30  9:38         ` Catalin Marinas
@ 2020-11-30 10:58           ` Anshuman Khandual
  2020-11-30 11:01             ` Catalin Marinas
  0 siblings, 1 reply; 10+ messages in thread
From: Anshuman Khandual @ 2020-11-30 10:58 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Christophe Leroy, linux-mm, akpm, linux-kernel, steven.price,
	gerald.schaefer, vgupta, paul.walmsley


On 11/30/20 3:08 PM, Catalin Marinas wrote:
> On Mon, Nov 30, 2020 at 09:55:00AM +0530, Anshuman Khandual wrote:
>> On 11/27/20 3:14 PM, Catalin Marinas wrote:
>>> On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
>>>> Le 27/11/2020 à 06:06, Anshuman Khandual a écrit :
>>>>> This adds validation tests for dirtiness after write protect conversion for
>>>>> each page table level. This is important for platforms such as arm64 that
>>>>> removes the hardware dirty bit while making it an write protected one. This
>>>>> also fixes pxx_wrprotect() related typos in the documentation file.
>>>>
>>>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>>>>> index c05d9dcf7891..a5be11210597 100644
>>>>> --- a/mm/debug_vm_pgtable.c
>>>>> +++ b/mm/debug_vm_pgtable.c
>>>>> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
>>>>>   	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
>>>>>   	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
>>>>>   	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
>>>>> +	WARN_ON(pte_dirty(pte_wrprotect(pte)));
>>>>
>>>> Wondering what you are testing here exactly.
>>>>
>>>> Do you expect that if PTE has the dirty bit, it gets cleared by
>>>> pte_wrprotect() ?
>>>>
>>>> Powerpc doesn't do that, it only clears the RW bit but the dirty
>>>> bit remains if it is set, until you call pte_mkclean() explicitely.
>>>
>>> Arm64 has an unusual way of setting a hardware dirty "bit", it actually
>>> clears the PTE_RDONLY bit. The pte_wrprotect() sets the PTE_RDONLY bit
>>> back and we can lose the dirty information. Will found this and posted
>>> patches to fix the arm64 pte_wprotect() to set a software PTE_DIRTY if
>>> !PTE_RDONLY (we do this for ptep_set_wrprotect() already). My concern
>>> was that we may inadvertently make a fresh/clean pte dirty with such
>>> change, hence the suggestion for the test.
>>>
>>> That said, I think we also need a test in the other direction,
>>> pte_wrprotect() should preserve any dirty information:
>>>
>>> 	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
>>
>> This seems like a generic enough principle which all platforms should
>> adhere to. But the proposed test WARN_ON(pte_dirty(pte_wrprotect(pte)))
>> might fail on some platforms if the page table entry came in as a dirty
>> one and pte_wrprotect() is not expected to alter the dirty state.
> 
> Ah, so do we have architectures where entries in protection_map[] are
> already dirty? If those are valid, maybe the check should be:

Okay, I did not imply that actually. The current position for these new
tests in respective pxx_basic_tests() functions is right at the end and
hence the pxx might have already gone through some changes from the time
it was originally created with pfn_pxx(). The entry here is not starting
from the beginning. It is not expected as well, per design. So dirty bit
might or might not be there depending on all the previous test sequences
leading upto these new ones.

IIUC, Christophe mentioned the fact that on platforms like powerpc, dirty
bit just remains unchanged during pte_wprotect(). So the current test
WARN_ON(pte_dirty(pte_wrprotect(pte))) will not work on powerpc if the
previous tests leading upto that point has got the dirty bit set. This is
irrespective of how it was created with pfn_pte() from protection_map[]
originally at the beginning.

> 
> 	WARN_ON(!pte_dirty(pte) && pte_dirty(pte_wrprotect(pte)));
> 
>> Instead, should we just add the following two tests, which would ensure
>> that pte_wrprotect() never alters the dirty state of a page table entry.
>>
>> WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
>> WARN_ON(pte_dirty(pte_wrprotect(pte_mkclean(pte))));
> 
> These should be added as additional tests. However, my initial thought

Okay, will add them.

> was to check whether pte_wrprotect() on a new pte created from a
> protection_map[] entry directly would inadvertently dirty it. On arm64,
> that means a protection_map[] entry missing PTE_RDONLY. A pte_mkclean()
> would set PTE_RDONLY, so we'd miss such check.
> 

To achieve this, we could move the test right at the beginning just after
the pxx gets created from protection_map[], with a comment explaining the
rationale. 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect
  2020-11-30 10:58           ` Anshuman Khandual
@ 2020-11-30 11:01             ` Catalin Marinas
  0 siblings, 0 replies; 10+ messages in thread
From: Catalin Marinas @ 2020-11-30 11:01 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Christophe Leroy, linux-mm, akpm, linux-kernel, steven.price,
	gerald.schaefer, vgupta, paul.walmsley

On Mon, Nov 30, 2020 at 04:28:20PM +0530, Anshuman Khandual wrote:
> On 11/30/20 3:08 PM, Catalin Marinas wrote:
> > On Mon, Nov 30, 2020 at 09:55:00AM +0530, Anshuman Khandual wrote:
> >> On 11/27/20 3:14 PM, Catalin Marinas wrote:
> >>> On Fri, Nov 27, 2020 at 09:22:24AM +0100, Christophe Leroy wrote:
> >>>> Le 27/11/2020 à 06:06, Anshuman Khandual a écrit :
> >>>>> This adds validation tests for dirtiness after write protect conversion for
> >>>>> each page table level. This is important for platforms such as arm64 that
> >>>>> removes the hardware dirty bit while making it an write protected one. This
> >>>>> also fixes pxx_wrprotect() related typos in the documentation file.
> >>>>
> >>>>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> >>>>> index c05d9dcf7891..a5be11210597 100644
> >>>>> --- a/mm/debug_vm_pgtable.c
> >>>>> +++ b/mm/debug_vm_pgtable.c
> >>>>> @@ -70,6 +70,7 @@ static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
> >>>>>   	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
> >>>>>   	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
> >>>>>   	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
> >>>>> +	WARN_ON(pte_dirty(pte_wrprotect(pte)));
> >>>>
> >>>> Wondering what you are testing here exactly.
> >>>>
> >>>> Do you expect that if PTE has the dirty bit, it gets cleared by
> >>>> pte_wrprotect() ?
> >>>>
> >>>> Powerpc doesn't do that, it only clears the RW bit but the dirty
> >>>> bit remains if it is set, until you call pte_mkclean() explicitely.
> >>>
> >>> Arm64 has an unusual way of setting a hardware dirty "bit", it actually
> >>> clears the PTE_RDONLY bit. The pte_wrprotect() sets the PTE_RDONLY bit
> >>> back and we can lose the dirty information. Will found this and posted
> >>> patches to fix the arm64 pte_wprotect() to set a software PTE_DIRTY if
> >>> !PTE_RDONLY (we do this for ptep_set_wrprotect() already). My concern
> >>> was that we may inadvertently make a fresh/clean pte dirty with such
> >>> change, hence the suggestion for the test.
> >>>
> >>> That said, I think we also need a test in the other direction,
> >>> pte_wrprotect() should preserve any dirty information:
> >>>
> >>> 	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
> >>
> >> This seems like a generic enough principle which all platforms should
> >> adhere to. But the proposed test WARN_ON(pte_dirty(pte_wrprotect(pte)))
> >> might fail on some platforms if the page table entry came in as a dirty
> >> one and pte_wrprotect() is not expected to alter the dirty state.
> > 
> > Ah, so do we have architectures where entries in protection_map[] are
> > already dirty? If those are valid, maybe the check should be:
> 
> Okay, I did not imply that actually. The current position for these new
> tests in respective pxx_basic_tests() functions is right at the end and
> hence the pxx might have already gone through some changes from the time
> it was originally created with pfn_pxx(). The entry here is not starting
> from the beginning. It is not expected as well, per design. So dirty bit
> might or might not be there depending on all the previous test sequences
> leading upto these new ones.
> 
> IIUC, Christophe mentioned the fact that on platforms like powerpc, dirty
> bit just remains unchanged during pte_wprotect(). So the current test
> WARN_ON(pte_dirty(pte_wrprotect(pte))) will not work on powerpc if the
> previous tests leading upto that point has got the dirty bit set. This is
> irrespective of how it was created with pfn_pte() from protection_map[]
> originally at the beginning.
[...]
> To achieve this, we could move the test right at the beginning just after
> the pxx gets created from protection_map[], with a comment explaining the
> rationale. 

OK, this makes sense. Thanks for the clarification.

-- 
Catalin


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-11-30 11:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-27  5:06 [PATCH 0/2] mm/debug_vm_pgtable: Some minor updates Anshuman Khandual
2020-11-27  5:06 ` [PATCH 1/2] mm/debug_vm_pgtable/basic: Add validation for dirtiness after write protect Anshuman Khandual
2020-11-27  8:22   ` Christophe Leroy
2020-11-27  9:44     ` Catalin Marinas
2020-11-30  4:25       ` Anshuman Khandual
2020-11-30  9:38         ` Catalin Marinas
2020-11-30 10:58           ` Anshuman Khandual
2020-11-30 11:01             ` Catalin Marinas
2020-11-27  5:06 ` [PATCH 2/2] mm/debug_vm_pgtable/basic: Iterate over entire protection_map[] Anshuman Khandual
2020-11-27  9:14   ` Steven Price

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).