linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
@ 2017-02-09  3:00 Aneesh Kumar K.V
  2017-02-09  3:00 ` [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-09  3:00 UTC (permalink / raw)
  To: akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev, Aneesh Kumar K.V

Autonuma preserves the write permission across numa fault to avoid taking
a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: preserve PTE
write permissions across a NUMA hinting fault"). Architecture can implement
protnone in different ways and some may choose to implement that by clearing Read/
Write/Exec bit of pte. Setting the write bit on such pte can result in wrong
behaviour. Fix this up by allowing arch to override how to save the write bit
on a protnone pte.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 include/asm-generic/pgtable.h | 16 ++++++++++++++++
 mm/huge_memory.c              |  4 ++--
 mm/memory.c                   |  2 +-
 mm/mprotect.c                 |  4 ++--
 4 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 18af2bcefe6a..b6f3a8a4b738 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -192,6 +192,22 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
+#ifndef pte_savedwrite
+#define pte_savedwrite pte_write
+#endif
+
+#ifndef pte_mk_savedwrite
+#define pte_mk_savedwrite pte_mkwrite
+#endif
+
+#ifndef pmd_savedwrite
+#define pmd_savedwrite pmd_write
+#endif
+
+#ifndef pmd_mk_savedwrite
+#define pmd_mk_savedwrite pmd_mkwrite
+#endif
+
 #ifndef __HAVE_ARCH_PMDP_SET_WRPROTECT
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9a6bd6c8d55a..2f0f855ec911 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1300,7 +1300,7 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	goto out;
 clear_pmdnuma:
 	BUG_ON(!PageLocked(page));
-	was_writable = pmd_write(pmd);
+	was_writable = pmd_savedwrite(pmd);
 	pmd = pmd_modify(pmd, vma->vm_page_prot);
 	pmd = pmd_mkyoung(pmd);
 	if (was_writable)
@@ -1555,7 +1555,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
 			entry = pmd_modify(entry, newprot);
 			if (preserve_write)
-				entry = pmd_mkwrite(entry);
+				entry = pmd_mk_savedwrite(entry);
 			ret = HPAGE_PMD_NR;
 			set_pmd_at(mm, addr, pmd, entry);
 			BUG_ON(vma_is_anonymous(vma) && !preserve_write &&
diff --git a/mm/memory.c b/mm/memory.c
index e78bf72f30dd..88c24f89d6d3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3388,7 +3388,7 @@ static int do_numa_page(struct vm_fault *vmf)
 	int target_nid;
 	bool migrated = false;
 	pte_t pte;
-	bool was_writable = pte_write(vmf->orig_pte);
+	bool was_writable = pte_savedwrite(vmf->orig_pte);
 	int flags = 0;
 
 	/*
diff --git a/mm/mprotect.c b/mm/mprotect.c
index f9c07f54dd62..15f5c174a7c1 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -113,13 +113,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			ptent = ptep_modify_prot_start(mm, addr, pte);
 			ptent = pte_modify(ptent, newprot);
 			if (preserve_write)
-				ptent = pte_mkwrite(ptent);
+				ptent = pte_mk_savedwrite(ptent);
 
 			/* Avoid taking write faults for known dirty pages */
 			if (dirty_accountable && pte_dirty(ptent) &&
 					(pte_soft_dirty(ptent) ||
 					 !(vma->vm_flags & VM_SOFTDIRTY))) {
-				ptent = pte_mkwrite(ptent);
+				ptent = pte_mk_savedwrite(ptent);
 			}
 			ptep_modify_prot_commit(mm, addr, pte, ptent);
 			pages++;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-09  3:00 [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
@ 2017-02-09  3:00 ` Aneesh Kumar K.V
  2017-02-14  3:59   ` Michael Neuling
  2017-02-15 21:46   ` Andrew Morton
  2017-02-09  3:16 ` [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
  2017-02-14  3:58 ` Michael Neuling
  2 siblings, 2 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-09  3:00 UTC (permalink / raw)
  To: akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev, Aneesh Kumar K.V

With this our protnone becomes a present pte with READ/WRITE/EXEC bit cleared.
By default we also set _PAGE_PRIVILEGED on such pte. This is now used to help
us identify a protnone pte that as saved write bit. For such pte, we will clear
the _PAGE_PRIVILEGED bit. The pte still remain non-accessible from both user
and kernel.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  3 +++
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 32 +++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 0735d5a8049f..8720a406bbbe 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -16,6 +16,9 @@
 #include <asm/page.h>
 #include <asm/bug.h>
 
+#ifndef __ASSEMBLY__
+#include <linux/mmdebug.h>
+#endif
 /*
  * This is necessary to get the definition of PGTABLE_RANGE which we
  * need for various slices related matters. Note that this isn't the
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index e91ada786d48..efff910a84b1 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -443,8 +443,8 @@ static inline pte_t pte_clear_soft_dirty(pte_t pte)
  */
 static inline int pte_protnone(pte_t pte)
 {
-	return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED)) ==
-		cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED);
+	return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX)) ==
+		cpu_to_be64(_PAGE_PRESENT);
 }
 #endif /* CONFIG_NUMA_BALANCING */
 
@@ -514,6 +514,32 @@ static inline pte_t pte_mkhuge(pte_t pte)
 	return pte;
 }
 
+#define pte_mk_savedwrite pte_mk_savedwrite
+static inline pte_t pte_mk_savedwrite(pte_t pte)
+{
+	/*
+	 * Used by Autonuma subsystem to preserve the write bit
+	 * while marking the pte PROT_NONE. Only allow this
+	 * on PROT_NONE pte
+	 */
+	VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | _PAGE_PRIVILEGED)) !=
+		  cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED));
+	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
+}
+
+#define pte_savedwrite pte_savedwrite
+static inline bool pte_savedwrite(pte_t pte)
+{
+	/*
+	 * Saved write ptes are prot none ptes that doesn't have
+	 * privileged bit sit. We mark prot none as one which has
+	 * present and pviliged bit set and RWX cleared. To mark
+	 * protnone which used to have _PAGE_WRITE set we clear
+	 * the privileged bit.
+	 */
+	return !(pte_raw(pte) & cpu_to_be64(_PAGE_RWX | _PAGE_PRIVILEGED));
+}
+
 static inline pte_t pte_mkdevmap(pte_t pte)
 {
 	return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
@@ -885,6 +911,7 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_mkclean(pmd)	pte_pmd(pte_mkclean(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#define pmd_mk_savedwrite(pmd)	pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
 
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
@@ -901,6 +928,7 @@ static inline int pmd_protnone(pmd_t pmd)
 
 #define __HAVE_ARCH_PMD_WRITE
 #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
+#define pmd_savedwrite(pmd)	pte_savedwrite(pmd_pte(pmd))
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
  2017-02-09  3:00 [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
  2017-02-09  3:00 ` [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
@ 2017-02-09  3:16 ` Aneesh Kumar K.V
  2017-02-14  3:58 ` Michael Neuling
  2 siblings, 0 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-09  3:16 UTC (permalink / raw)
  To: akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> Autonuma preserves the write permission across numa fault to avoid taking
> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: preserve PTE
> write permissions across a NUMA hinting fault"). Architecture can implement
> protnone in different ways and some may choose to implement that by clearing Read/
> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong
> behaviour. Fix this up by allowing arch to override how to save the write bit
> on a protnone pte.
>

The problem we are trying to fix here is w.r.t autnuma related thp
migration. migrate_misplaced_transhuge_page() cannot deal with
concurrent modification of the page. It does a page copy without
following the migration pte sequence. IIUC, this was done to keep the
migration simpler and at the time of implemenation we didn't had THP
page cache which would have required a more elaborate migration scheme.
[1]. That means thp autonuma migration expect the protnone
with saved write to be done such that both kernel and user cannot update
the page content. This patch series enables archs like ppc64 to do that.
We are good with the hash translation mode with the current code,
because we never create a hardware page table entry for a protnone pte. 


-aneesh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
  2017-02-09  3:00 [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
  2017-02-09  3:00 ` [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
  2017-02-09  3:16 ` [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
@ 2017-02-14  3:58 ` Michael Neuling
  2 siblings, 0 replies; 8+ messages in thread
From: Michael Neuling @ 2017-02-14  3:58 UTC (permalink / raw)
  To: Aneesh Kumar K.V, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linuxppc-dev, linux-kernel

On Thu, 2017-02-09 at 08:30 +0530, Aneesh Kumar K.V wrote:
> Autonuma preserves the write permission across numa fault to avoid taking
> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: preserve PTE
> write permissions across a NUMA hinting fault"). Architecture can implement
> protnone in different ways and some may choose to implement that by clearing
> Read/
> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong
> behaviour. Fix this up by allowing arch to override how to save the write bit
> on a protnone pte.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

FWIW this is pretty simple and helps with us in powerpc...

Acked-By: Michael Neuling <mikey@neuling.org>

> ---
>  include/asm-generic/pgtable.h | 16 ++++++++++++++++
>  mm/huge_memory.c              |  4 ++--
>  mm/memory.c                   |  2 +-
>  mm/mprotect.c                 |  4 ++--
>  4 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 18af2bcefe6a..b6f3a8a4b738 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -192,6 +192,22 @@ static inline void ptep_set_wrprotect(struct mm_struct
> *mm, unsigned long addres
>  }
>  #endif
>  
> +#ifndef pte_savedwrite
> +#define pte_savedwrite pte_write
> +#endif
> +
> +#ifndef pte_mk_savedwrite
> +#define pte_mk_savedwrite pte_mkwrite
> +#endif
> +
> +#ifndef pmd_savedwrite
> +#define pmd_savedwrite pmd_write
> +#endif
> +
> +#ifndef pmd_mk_savedwrite
> +#define pmd_mk_savedwrite pmd_mkwrite
> +#endif
> +
>  #ifndef __HAVE_ARCH_PMDP_SET_WRPROTECT
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  static inline void pmdp_set_wrprotect(struct mm_struct *mm,
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9a6bd6c8d55a..2f0f855ec911 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1300,7 +1300,7 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t
> pmd)
>  	goto out;
>  clear_pmdnuma:
>  	BUG_ON(!PageLocked(page));
> -	was_writable = pmd_write(pmd);
> +	was_writable = pmd_savedwrite(pmd);
>  	pmd = pmd_modify(pmd, vma->vm_page_prot);
>  	pmd = pmd_mkyoung(pmd);
>  	if (was_writable)
> @@ -1555,7 +1555,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t
> *pmd,
>  			entry = pmdp_huge_get_and_clear_notify(mm, addr,
> pmd);
>  			entry = pmd_modify(entry, newprot);
>  			if (preserve_write)
> -				entry = pmd_mkwrite(entry);
> +				entry = pmd_mk_savedwrite(entry);
>  			ret = HPAGE_PMD_NR;
>  			set_pmd_at(mm, addr, pmd, entry);
>  			BUG_ON(vma_is_anonymous(vma) && !preserve_write &&
> diff --git a/mm/memory.c b/mm/memory.c
> index e78bf72f30dd..88c24f89d6d3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3388,7 +3388,7 @@ static int do_numa_page(struct vm_fault *vmf)
>  	int target_nid;
>  	bool migrated = false;
>  	pte_t pte;
> -	bool was_writable = pte_write(vmf->orig_pte);
> +	bool was_writable = pte_savedwrite(vmf->orig_pte);
>  	int flags = 0;
>  
>  	/*
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index f9c07f54dd62..15f5c174a7c1 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -113,13 +113,13 @@ static unsigned long change_pte_range(struct
> vm_area_struct *vma, pmd_t *pmd,
>  			ptent = ptep_modify_prot_start(mm, addr, pte);
>  			ptent = pte_modify(ptent, newprot);
>  			if (preserve_write)
> -				ptent = pte_mkwrite(ptent);
> +				ptent = pte_mk_savedwrite(ptent);
>  
>  			/* Avoid taking write faults for known dirty pages */
>  			if (dirty_accountable && pte_dirty(ptent) &&
>  					(pte_soft_dirty(ptent) ||
>  					 !(vma->vm_flags & VM_SOFTDIRTY))) {
> -				ptent = pte_mkwrite(ptent);
> +				ptent = pte_mk_savedwrite(ptent);
>  			}
>  			ptep_modify_prot_commit(mm, addr, pte, ptent);
>  			pages++;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-09  3:00 ` [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
@ 2017-02-14  3:59   ` Michael Neuling
  2017-02-14 11:01     ` Michael Ellerman
  2017-02-15 21:46   ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Michael Neuling @ 2017-02-14  3:59 UTC (permalink / raw)
  To: Aneesh Kumar K.V, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linuxppc-dev, linux-kernel

On Thu, 2017-02-09 at 08:30 +0530, Aneesh Kumar K.V wrote:
> With this our protnone becomes a present pte with READ/WRITE/EXEC bit cleared.
> By default we also set _PAGE_PRIVILEGED on such pte. This is now used to help
> us identify a protnone pte that as saved write bit. For such pte, we will
> clear
> the _PAGE_PRIVILEGED bit. The pte still remain non-accessible from both user
> and kernel.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>


FWIW I've tested this, so:

Acked-By: Michael Neuling <mikey@neuling.org>

> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |  3 +++
>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 32 +++++++++++++++++++++++++-
> -
>  2 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 0735d5a8049f..8720a406bbbe 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -16,6 +16,9 @@
>  #include <asm/page.h>
>  #include <asm/bug.h>
>  
> +#ifndef __ASSEMBLY__
> +#include <linux/mmdebug.h>
> +#endif
>  /*
>   * This is necessary to get the definition of PGTABLE_RANGE which we
>   * need for various slices related matters. Note that this isn't the
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index e91ada786d48..efff910a84b1 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -443,8 +443,8 @@ static inline pte_t pte_clear_soft_dirty(pte_t pte)
>   */
>  static inline int pte_protnone(pte_t pte)
>  {
> -	return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED))
> ==
> -		cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED);
> +	return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX)) ==
> +		cpu_to_be64(_PAGE_PRESENT);
>  }
>  #endif /* CONFIG_NUMA_BALANCING */
>  
> @@ -514,6 +514,32 @@ static inline pte_t pte_mkhuge(pte_t pte)
>  	return pte;
>  }
>  
> +#define pte_mk_savedwrite pte_mk_savedwrite
> +static inline pte_t pte_mk_savedwrite(pte_t pte)
> +{
> +	/*
> +	 * Used by Autonuma subsystem to preserve the write bit
> +	 * while marking the pte PROT_NONE. Only allow this
> +	 * on PROT_NONE pte
> +	 */
> +	VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX |
> _PAGE_PRIVILEGED)) !=
> +		  cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED));
> +	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
> +}
> +
> +#define pte_savedwrite pte_savedwrite
> +static inline bool pte_savedwrite(pte_t pte)
> +{
> +	/*
> +	 * Saved write ptes are prot none ptes that doesn't have
> +	 * privileged bit sit. We mark prot none as one which has
> +	 * present and pviliged bit set and RWX cleared. To mark
> +	 * protnone which used to have _PAGE_WRITE set we clear
> +	 * the privileged bit.
> +	 */
> +	return !(pte_raw(pte) & cpu_to_be64(_PAGE_RWX | _PAGE_PRIVILEGED));
> +}
> +
>  static inline pte_t pte_mkdevmap(pte_t pte)
>  {
>  	return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
> @@ -885,6 +911,7 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>  #define pmd_mkclean(pmd)	pte_pmd(pte_mkclean(pmd_pte(pmd)))
>  #define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
>  #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
> +#define pmd_mk_savedwrite(pmd)	pte_pmd(pte_mk_savedwrite(pmd_pte(pmd))
> )
>  
>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>  #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
> @@ -901,6 +928,7 @@ static inline int pmd_protnone(pmd_t pmd)
>  
>  #define __HAVE_ARCH_PMD_WRITE
>  #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
> +#define pmd_savedwrite(pmd)	pte_savedwrite(pmd_pte(pmd))
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-14  3:59   ` Michael Neuling
@ 2017-02-14 11:01     ` Michael Ellerman
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2017-02-14 11:01 UTC (permalink / raw)
  To: Michael Neuling, Aneesh Kumar K.V, akpm, Rik van Riel,
	Mel Gorman, paulus, benh
  Cc: linux-mm, linuxppc-dev, linux-kernel

Michael Neuling <mikey@neuling.org> writes:

> On Thu, 2017-02-09 at 08:30 +0530, Aneesh Kumar K.V wrote:
>> With this our protnone becomes a present pte with READ/WRITE/EXEC bit cleared.
>> By default we also set _PAGE_PRIVILEGED on such pte. This is now used to help
>> us identify a protnone pte that as saved write bit. For such pte, we will
>> clear
>> the _PAGE_PRIVILEGED bit. The pte still remain non-accessible from both user
>> and kernel.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
>
> FWIW I've tested this, so:
>
> Acked-By: Michael Neuling <mikey@neuling.org>

In future if you've tested something then "Tested-by:" is the right tag
to use.

cheers

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-09  3:00 ` [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
  2017-02-14  3:59   ` Michael Neuling
@ 2017-02-15 21:46   ` Andrew Morton
  2017-02-16  2:12     ` Aneesh Kumar K.V
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2017-02-15 21:46 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Rik van Riel, Mel Gorman, paulus, benh, linux-mm, linux-kernel,
	linuxppc-dev

On Thu,  9 Feb 2017 08:30:59 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:

> With this our protnone becomes a present pte with READ/WRITE/EXEC bit cleared.
> By default we also set _PAGE_PRIVILEGED on such pte. This is now used to help
> us identify a protnone pte that as saved write bit. For such pte, we will clear
> the _PAGE_PRIVILEGED bit. The pte still remain non-accessible from both user
> and kernel.

I don't see how these patches differ from the ones which are presently
in -mm.

It helps to have a [0/n] email for a patch series and to put a version
number in there as well.

> +#define pte_mk_savedwrite pte_mk_savedwrite
> +static inline pte_t pte_mk_savedwrite(pte_t pte)
> +{
> +	/*
> +	 * Used by Autonuma subsystem to preserve the write bit
> +	 * while marking the pte PROT_NONE. Only allow this
> +	 * on PROT_NONE pte
> +	 */
> +	VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | _PAGE_PRIVILEGED)) !=
> +		  cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED));
> +	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
> +}
> +
> +#define pte_savedwrite pte_savedwrite
> +static inline bool pte_savedwrite(pte_t pte)
> +{
> +	/*
> +	 * Saved write ptes are prot none ptes that doesn't have
> +	 * privileged bit sit. We mark prot none as one which has
> +	 * present and pviliged bit set and RWX cleared. To mark
> +	 * protnone which used to have _PAGE_WRITE set we clear
> +	 * the privileged bit.
> +	 */
> +	return !(pte_raw(pte) & cpu_to_be64(_PAGE_RWX | _PAGE_PRIVILEGED));
> +}
> +
>  static inline pte_t pte_mkdevmap(pte_t pte)
>  {
>  	return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);

arch/powerpc/include/asm/book3s/64/pgtable.h doesn't have
pte_mkdevmap().  What tree are you patching here?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-15 21:46   ` Andrew Morton
@ 2017-02-16  2:12     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 8+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-16  2:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Rik van Riel, Mel Gorman, paulus, benh, linux-mm, linux-kernel,
	linuxppc-dev



On Thursday 16 February 2017 03:16 AM, Andrew Morton wrote:
> On Thu,  9 Feb 2017 08:30:59 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:
>
>> With this our protnone becomes a present pte with READ/WRITE/EXEC bit cleared.
>> By default we also set _PAGE_PRIVILEGED on such pte. This is now used to help
>> us identify a protnone pte that as saved write bit. For such pte, we will clear
>> the _PAGE_PRIVILEGED bit. The pte still remain non-accessible from both user
>> and kernel.
> I don't see how these patches differ from the ones which are presently
> in -mm.
>
> It helps to have a [0/n] email for a patch series and to put a version
> number in there as well.
>
>> +#define pte_mk_savedwrite pte_mk_savedwrite
>> +static inline pte_t pte_mk_savedwrite(pte_t pte)
>> +{
>> +	/*
>> +	 * Used by Autonuma subsystem to preserve the write bit
>> +	 * while marking the pte PROT_NONE. Only allow this
>> +	 * on PROT_NONE pte
>> +	 */
>> +	VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | _PAGE_PRIVILEGED)) !=
>> +		  cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED));
>> +	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
>> +}
>> +
>> +#define pte_savedwrite pte_savedwrite
>> +static inline bool pte_savedwrite(pte_t pte)
>> +{
>> +	/*
>> +	 * Saved write ptes are prot none ptes that doesn't have
>> +	 * privileged bit sit. We mark prot none as one which has
>> +	 * present and pviliged bit set and RWX cleared. To mark
>> +	 * protnone which used to have _PAGE_WRITE set we clear
>> +	 * the privileged bit.
>> +	 */
>> +	return !(pte_raw(pte) & cpu_to_be64(_PAGE_RWX | _PAGE_PRIVILEGED));
>> +}
>> +
>>   static inline pte_t pte_mkdevmap(pte_t pte)
>>   {
>>   	return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
> arch/powerpc/include/asm/book3s/64/pgtable.h doesn't have
> pte_mkdevmap().  What tree are you patching here?
>
>

I did post a V2 of this for which you replied
https://lkml.kernel.org/r/20170214162008.bd592c747fc5e167c10ce7b8@linux-foundation.org

I actually found the issue with this patch. I will be sending V3 after 
more testing.

-aneesh

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-02-16  2:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-09  3:00 [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
2017-02-09  3:00 ` [PATCH 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
2017-02-14  3:59   ` Michael Neuling
2017-02-14 11:01     ` Michael Ellerman
2017-02-15 21:46   ` Andrew Morton
2017-02-16  2:12     ` Aneesh Kumar K.V
2017-02-09  3:16 ` [PATCH 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
2017-02-14  3:58 ` Michael Neuling

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).