linux-kernel.vger.kernel.org archive mirror
* [PATCH V2 0/2] Numabalancing preserve write fix
@ 2017-02-14  5:31 Aneesh Kumar K.V
  2017-02-14  5:31 ` [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
  2017-02-14  5:31 ` [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
  0 siblings, 2 replies; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-14  5:31 UTC (permalink / raw)
  To: akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev, Aneesh Kumar K.V

This patch series addresses an issue with THP migration and the autonuma
preserve-write feature. migrate_misplaced_transhuge_page() cannot deal with
concurrent modification of the page. It does a page copy without
following the migration pte sequence. IIUC, this was done to keep the
migration simpler, and at the time of implementation we didn't have THP
page cache, which would have required a more elaborate migration scheme.
That means THP autonuma migration expects the protnone with saved write
to be done such that neither the kernel nor userspace can update
the page content. This patch series enables archs like ppc64 to do that.
We are good with the hash translation mode with the current code,
because we never create a hardware page table entry for a protnone pte.

Changes from V1:
* Update the patch so that it applies cleanly to upstream.
* Add acked-by from Michael Neuling

Aneesh Kumar K.V (2):
  mm/autonuma: Let architecture override how the write bit should be
    stashed in a protnone pte.
  powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved
    write

 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  3 +++
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 32 +++++++++++++++++++++++++--
 include/asm-generic/pgtable.h                 | 16 ++++++++++++++
 mm/huge_memory.c                              |  4 ++--
 mm/memory.c                                   |  2 +-
 mm/mprotect.c                                 |  4 ++--
 6 files changed, 54 insertions(+), 7 deletions(-)

-- 
2.7.4


* [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
  2017-02-14  5:31 [PATCH V2 0/2] Numabalancing preserve write fix Aneesh Kumar K.V
@ 2017-02-14  5:31 ` Aneesh Kumar K.V
  2017-02-14  5:49   ` Michael Ellerman
  2017-02-14  5:31 ` [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
  1 sibling, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-14  5:31 UTC (permalink / raw)
  To: akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev, Aneesh Kumar K.V

Autonuma preserves the write permission across a NUMA hinting fault to avoid
taking a write fault right after it (commit b191f9b106ea "mm: numa: preserve
PTE write permissions across a NUMA hinting fault"). Architectures can implement
protnone in different ways, and some may choose to implement it by clearing the
Read/Write/Exec bits of the pte. Setting the write bit on such a pte can result
in wrong behaviour. Fix this by allowing the arch to override how the write bit
is saved on a protnone pte.
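
For illustration only, the fallback idiom described above can be sketched in
user space as follows. The toy pte_t, the TOY_* bit values and the
ARCH_HAS_OWN_SAVEDWRITE switch are assumptions made up for this sketch; only
the #ifndef/#define fallback pattern mirrors what the patch adds to
include/asm-generic/pgtable.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy pte; the bit values are placeholders, not any real architecture's. */
typedef struct { uint64_t val; } pte_t;
#define TOY_PAGE_WRITE	0x2ULL
#define TOY_PAGE_SAVED	0x100ULL	/* hypothetical software bit */

static inline bool pte_write(pte_t pte)
{
	return (pte.val & TOY_PAGE_WRITE) != 0;
}

static inline pte_t pte_mkwrite(pte_t pte)
{
	pte.val |= TOY_PAGE_WRITE;
	return pte;
}

#ifdef ARCH_HAS_OWN_SAVEDWRITE
/*
 * "Arch" side: provide the helper and a same-named macro, so that the
 * generic #ifndef checks below see it as already defined.
 */
#define pte_mk_savedwrite pte_mk_savedwrite
static inline pte_t pte_mk_savedwrite(pte_t pte)
{
	pte.val |= TOY_PAGE_SAVED;
	return pte;
}

#define pte_savedwrite pte_savedwrite
static inline bool pte_savedwrite(pte_t pte)
{
	return (pte.val & TOY_PAGE_SAVED) != 0;
}
#endif

/* "Generic" side: fall back to the plain write-bit helpers. */
#ifndef pte_savedwrite
#define pte_savedwrite pte_write
#endif

#ifndef pte_mk_savedwrite
#define pte_mk_savedwrite pte_mkwrite
#endif
```

Built without -DARCH_HAS_OWN_SAVEDWRITE, pte_mk_savedwrite() and
pte_savedwrite() are just aliases for pte_mkwrite() and pte_write(), so
callers that switch to the new names behave exactly as before.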

Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 include/asm-generic/pgtable.h | 16 ++++++++++++++++
 mm/huge_memory.c              |  4 ++--
 mm/memory.c                   |  2 +-
 mm/mprotect.c                 |  4 ++--
 4 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 18af2bcefe6a..b6f3a8a4b738 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -192,6 +192,22 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
+#ifndef pte_savedwrite
+#define pte_savedwrite pte_write
+#endif
+
+#ifndef pte_mk_savedwrite
+#define pte_mk_savedwrite pte_mkwrite
+#endif
+
+#ifndef pmd_savedwrite
+#define pmd_savedwrite pmd_write
+#endif
+
+#ifndef pmd_mk_savedwrite
+#define pmd_mk_savedwrite pmd_mkwrite
+#endif
+
 #ifndef __HAVE_ARCH_PMDP_SET_WRPROTECT
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline void pmdp_set_wrprotect(struct mm_struct *mm,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9a6bd6c8d55a..2f0f855ec911 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1300,7 +1300,7 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	goto out;
 clear_pmdnuma:
 	BUG_ON(!PageLocked(page));
-	was_writable = pmd_write(pmd);
+	was_writable = pmd_savedwrite(pmd);
 	pmd = pmd_modify(pmd, vma->vm_page_prot);
 	pmd = pmd_mkyoung(pmd);
 	if (was_writable)
@@ -1555,7 +1555,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 			entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
 			entry = pmd_modify(entry, newprot);
 			if (preserve_write)
-				entry = pmd_mkwrite(entry);
+				entry = pmd_mk_savedwrite(entry);
 			ret = HPAGE_PMD_NR;
 			set_pmd_at(mm, addr, pmd, entry);
 			BUG_ON(vma_is_anonymous(vma) && !preserve_write &&
diff --git a/mm/memory.c b/mm/memory.c
index e78bf72f30dd..88c24f89d6d3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3388,7 +3388,7 @@ static int do_numa_page(struct vm_fault *vmf)
 	int target_nid;
 	bool migrated = false;
 	pte_t pte;
-	bool was_writable = pte_write(vmf->orig_pte);
+	bool was_writable = pte_savedwrite(vmf->orig_pte);
 	int flags = 0;
 
 	/*
diff --git a/mm/mprotect.c b/mm/mprotect.c
index f9c07f54dd62..15f5c174a7c1 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -113,13 +113,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			ptent = ptep_modify_prot_start(mm, addr, pte);
 			ptent = pte_modify(ptent, newprot);
 			if (preserve_write)
-				ptent = pte_mkwrite(ptent);
+				ptent = pte_mk_savedwrite(ptent);
 
 			/* Avoid taking write faults for known dirty pages */
 			if (dirty_accountable && pte_dirty(ptent) &&
 					(pte_soft_dirty(ptent) ||
 					 !(vma->vm_flags & VM_SOFTDIRTY))) {
-				ptent = pte_mkwrite(ptent);
+				ptent = pte_mk_savedwrite(ptent);
 			}
 			ptep_modify_prot_commit(mm, addr, pte, ptent);
 			pages++;
-- 
2.7.4


* [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-14  5:31 [PATCH V2 0/2] Numabalancing preserve write fix Aneesh Kumar K.V
  2017-02-14  5:31 ` [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
@ 2017-02-14  5:31 ` Aneesh Kumar K.V
  2017-02-14 11:04   ` Michael Ellerman
  1 sibling, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-14  5:31 UTC (permalink / raw)
  To: akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev, Aneesh Kumar K.V

With this our protnone becomes a present pte with the READ/WRITE/EXEC bits cleared.
By default we also set _PAGE_PRIVILEGED on such a pte. This is now used to help
us identify a protnone pte that has the saved write bit. For such a pte, we clear
the _PAGE_PRIVILEGED bit. The pte still remains inaccessible to both user
and kernel.
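
As a rough user-space model of the encoding described above (a sketch, not
the kernel code): the SK_* names and bit values are illustrative assumptions,
and the endian conversion that the real helpers do with pte_raw() and
cpu_to_be64() is ignored here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative bit values only; the real definitions live in
 * arch/powerpc/include/asm/book3s/64/pgtable.h.
 */
#define SK_PAGE_PRESENT		0x8000000000000000ULL
#define SK_PAGE_PRIVILEGED	0x8ULL
#define SK_PAGE_RWX		0x7ULL

typedef struct { uint64_t val; } sk_pte_t;

/* protnone: present, with read/write/exec all cleared */
static inline bool sk_pte_protnone(sk_pte_t pte)
{
	return (pte.val & (SK_PAGE_PRESENT | SK_PAGE_RWX)) == SK_PAGE_PRESENT;
}

/* Stash "was writable" by clearing the privileged bit; RWX stays cleared. */
static inline sk_pte_t sk_pte_mk_savedwrite(sk_pte_t pte)
{
	/* Only valid on a protnone pte that still has the privileged bit. */
	assert((pte.val & (SK_PAGE_PRESENT | SK_PAGE_RWX | SK_PAGE_PRIVILEGED)) ==
	       (SK_PAGE_PRESENT | SK_PAGE_PRIVILEGED));
	pte.val &= ~SK_PAGE_PRIVILEGED;
	return pte;
}

/* Saved write: neither RWX nor the privileged bit is set. */
static inline bool sk_pte_savedwrite(sk_pte_t pte)
{
	return !(pte.val & (SK_PAGE_RWX | SK_PAGE_PRIVILEGED));
}
```

Starting from a pte with only the present and privileged bits set,
sk_pte_mk_savedwrite() leaves it protnone (RWX is still clear) while
sk_pte_savedwrite() starts returning true.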

Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  3 +++
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 32 +++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 0735d5a8049f..8720a406bbbe 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -16,6 +16,9 @@
 #include <asm/page.h>
 #include <asm/bug.h>
 
+#ifndef __ASSEMBLY__
+#include <linux/mmdebug.h>
+#endif
 /*
  * This is necessary to get the definition of PGTABLE_RANGE which we
  * need for various slices related matters. Note that this isn't the
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index fef738229a68..c684ef6cbd10 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -441,8 +441,8 @@ static inline pte_t pte_clear_soft_dirty(pte_t pte)
  */
 static inline int pte_protnone(pte_t pte)
 {
-	return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED)) ==
-		cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED);
+	return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX)) ==
+		cpu_to_be64(_PAGE_PRESENT);
 }
 #endif /* CONFIG_NUMA_BALANCING */
 
@@ -512,6 +512,32 @@ static inline pte_t pte_mkhuge(pte_t pte)
 	return pte;
 }
 
+#define pte_mk_savedwrite pte_mk_savedwrite
+static inline pte_t pte_mk_savedwrite(pte_t pte)
+{
+	/*
+	 * Used by Autonuma subsystem to preserve the write bit
+	 * while marking the pte PROT_NONE. Only allow this
+	 * on PROT_NONE pte
+	 */
+	VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | _PAGE_PRIVILEGED)) !=
+		  cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED));
+	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
+}
+
+#define pte_savedwrite pte_savedwrite
+static inline bool pte_savedwrite(pte_t pte)
+{
+	/*
+	 * Saved write ptes are protnone ptes that don't have the
+	 * privileged bit set. We mark protnone as a pte that has the
+	 * present and privileged bits set and RWX cleared. To mark a
+	 * protnone pte that used to have _PAGE_WRITE set, we clear
+	 * the privileged bit.
+	 */
+	return !(pte_raw(pte) & cpu_to_be64(_PAGE_RWX | _PAGE_PRIVILEGED));
+}
+
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
 	/* FIXME!! check whether this need to be a conditional */
@@ -873,6 +899,7 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_mkclean(pmd)	pte_pmd(pte_mkclean(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#define pmd_mk_savedwrite(pmd)	pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
 
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
@@ -889,6 +916,7 @@ static inline int pmd_protnone(pmd_t pmd)
 
 #define __HAVE_ARCH_PMD_WRITE
 #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
+#define pmd_savedwrite(pmd)	pte_savedwrite(pmd_pte(pmd))
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
-- 
2.7.4


* Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
  2017-02-14  5:31 ` [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte Aneesh Kumar K.V
@ 2017-02-14  5:49   ` Michael Ellerman
  2017-02-14  5:55     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2017-02-14  5:49 UTC (permalink / raw)
  To: Aneesh Kumar K.V, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev, Aneesh Kumar K.V

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> Autonuma preserves the write permission across numa fault to avoid taking
> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: preserve PTE
> write permissions across a NUMA hinting fault"). Architecture can implement
> protnone in different ways and some may choose to implement that by clearing Read/
> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong
> behaviour. Fix this up by allowing arch to override how to save the write bit
> on a protnone pte.

This is pretty obviously a nop on arches that don't implement the new
hooks, but it'd still be good to get an ack from someone in mm land
before I merge it.

cheers

> Acked-By: Michael Neuling <mikey@neuling.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  include/asm-generic/pgtable.h | 16 ++++++++++++++++
>  mm/huge_memory.c              |  4 ++--
>  mm/memory.c                   |  2 +-
>  mm/mprotect.c                 |  4 ++--
>  4 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 18af2bcefe6a..b6f3a8a4b738 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -192,6 +192,22 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
>  }
>  #endif
>  
> +#ifndef pte_savedwrite
> +#define pte_savedwrite pte_write
> +#endif
> +
> +#ifndef pte_mk_savedwrite
> +#define pte_mk_savedwrite pte_mkwrite
> +#endif
> +
> +#ifndef pmd_savedwrite
> +#define pmd_savedwrite pmd_write
> +#endif
> +
> +#ifndef pmd_mk_savedwrite
> +#define pmd_mk_savedwrite pmd_mkwrite
> +#endif
> +
>  #ifndef __HAVE_ARCH_PMDP_SET_WRPROTECT
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  static inline void pmdp_set_wrprotect(struct mm_struct *mm,
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 9a6bd6c8d55a..2f0f855ec911 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1300,7 +1300,7 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
>  	goto out;
>  clear_pmdnuma:
>  	BUG_ON(!PageLocked(page));
> -	was_writable = pmd_write(pmd);
> +	was_writable = pmd_savedwrite(pmd);
>  	pmd = pmd_modify(pmd, vma->vm_page_prot);
>  	pmd = pmd_mkyoung(pmd);
>  	if (was_writable)
> @@ -1555,7 +1555,7 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>  			entry = pmdp_huge_get_and_clear_notify(mm, addr, pmd);
>  			entry = pmd_modify(entry, newprot);
>  			if (preserve_write)
> -				entry = pmd_mkwrite(entry);
> +				entry = pmd_mk_savedwrite(entry);
>  			ret = HPAGE_PMD_NR;
>  			set_pmd_at(mm, addr, pmd, entry);
>  			BUG_ON(vma_is_anonymous(vma) && !preserve_write &&
> diff --git a/mm/memory.c b/mm/memory.c
> index e78bf72f30dd..88c24f89d6d3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3388,7 +3388,7 @@ static int do_numa_page(struct vm_fault *vmf)
>  	int target_nid;
>  	bool migrated = false;
>  	pte_t pte;
> -	bool was_writable = pte_write(vmf->orig_pte);
> +	bool was_writable = pte_savedwrite(vmf->orig_pte);
>  	int flags = 0;
>  
>  	/*
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index f9c07f54dd62..15f5c174a7c1 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -113,13 +113,13 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>  			ptent = ptep_modify_prot_start(mm, addr, pte);
>  			ptent = pte_modify(ptent, newprot);
>  			if (preserve_write)
> -				ptent = pte_mkwrite(ptent);
> +				ptent = pte_mk_savedwrite(ptent);
>  
>  			/* Avoid taking write faults for known dirty pages */
>  			if (dirty_accountable && pte_dirty(ptent) &&
>  					(pte_soft_dirty(ptent) ||
>  					 !(vma->vm_flags & VM_SOFTDIRTY))) {
> -				ptent = pte_mkwrite(ptent);
> +				ptent = pte_mk_savedwrite(ptent);
>  			}
>  			ptep_modify_prot_commit(mm, addr, pte, ptent);
>  			pages++;
> -- 
> 2.7.4


* Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
  2017-02-14  5:49   ` Michael Ellerman
@ 2017-02-14  5:55     ` Aneesh Kumar K.V
  2017-02-14 10:59       ` Michael Ellerman
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-14  5:55 UTC (permalink / raw)
  To: Michael Ellerman, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linux-kernel, linuxppc-dev



On Tuesday 14 February 2017 11:19 AM, Michael Ellerman wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>
>> Autonuma preserves the write permission across numa fault to avoid taking
>> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: preserve PTE
>> write permissions across a NUMA hinting fault"). Architecture can implement
>> protnone in different ways and some may choose to implement that by clearing Read/
>> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong
>> behaviour. Fix this up by allowing arch to override how to save the write bit
>> on a protnone pte.
> This is pretty obviously a nop on arches that don't implement the new
> hooks, but it'd still be good to get an ack from someone in mm land
> before I merge it.


To get it to apply cleanly you may need
http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch
http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes-fix.patch

They are not strictly needed after the saved write patch. But I didn't
request to drop them, because the patch helps us
get closer to the goal of no set_pte_at() calls on present ptes.

-aneesh


* Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
  2017-02-14  5:55     ` Aneesh Kumar K.V
@ 2017-02-14 10:59       ` Michael Ellerman
  2017-02-15  0:20         ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2017-02-14 10:59 UTC (permalink / raw)
  To: Aneesh Kumar K.V, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linuxppc-dev, linux-kernel

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> On Tuesday 14 February 2017 11:19 AM, Michael Ellerman wrote:
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>>
>>> Autonuma preserves the write permission across numa fault to avoid taking
>>> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: preserve PTE
>>> write permissions across a NUMA hinting fault"). Architecture can implement
>>> protnone in different ways and some may choose to implement that by clearing Read/
>>> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong
>>> behaviour. Fix this up by allowing arch to override how to save the write bit
>>> on a protnone pte.
>> This is pretty obviously a nop on arches that don't implement the new
>> hooks, but it'd still be good to get an ack from someone in mm land
>> before I merge it.
>
>
> To get it to apply cleanly you may need
> http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch
> http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes-fix.patch

Ah OK, I missed those.

In that case these two should probably go via Andrew's tree.

cheers


* Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-14  5:31 ` [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write Aneesh Kumar K.V
@ 2017-02-14 11:04   ` Michael Ellerman
  2017-02-14 12:26     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2017-02-14 11:04 UTC (permalink / raw)
  To: Aneesh Kumar K.V, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linuxppc-dev, linux-kernel, Aneesh Kumar K.V

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 0735d5a8049f..8720a406bbbe 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -16,6 +16,9 @@
>  #include <asm/page.h>
>  #include <asm/bug.h>
>  
> +#ifndef __ASSEMBLY__
> +#include <linux/mmdebug.h>
> +#endif

I assume that's for the VM_BUG_ON() you add below. But if so wouldn't
the #include be better placed in book3s/64/pgtable.h also?

> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index fef738229a68..c684ef6cbd10 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -512,6 +512,32 @@ static inline pte_t pte_mkhuge(pte_t pte)
>  	return pte;
>  }
>  
> +#define pte_mk_savedwrite pte_mk_savedwrite
> +static inline pte_t pte_mk_savedwrite(pte_t pte)
> +{
> +	/*
> +	 * Used by Autonuma subsystem to preserve the write bit
> +	 * while marking the pte PROT_NONE. Only allow this
> +	 * on PROT_NONE pte
> +	 */
> +	VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | _PAGE_PRIVILEGED)) !=
> +		  cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED));
> +	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
> +}
> +


cheers


* Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-14 11:04   ` Michael Ellerman
@ 2017-02-14 12:26     ` Aneesh Kumar K.V
  2017-02-15  0:45       ` Michael Ellerman
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-14 12:26 UTC (permalink / raw)
  To: Michael Ellerman, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linuxppc-dev, linux-kernel

Michael Ellerman <mpe@ellerman.id.au> writes:

> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>> index 0735d5a8049f..8720a406bbbe 100644
>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>> @@ -16,6 +16,9 @@
>>  #include <asm/page.h>
>>  #include <asm/bug.h>
>>  
>> +#ifndef __ASSEMBLY__
>> +#include <linux/mmdebug.h>
>> +#endif
>
> I assume that's for the VM_BUG_ON() you add below. But if so wouldn't
> the #include be better placed in book3s/64/pgtable.h also?

mmu-hash.h has a hack, which is explained below:

#ifndef __ASSEMBLY__
#include <linux/mmdebug.h>
#endif
/*
 * This is necessary to get the definition of PGTABLE_RANGE which we
 * need for various slices related matters. Note that this isn't the
 * complete pgtable.h but only a portion of it.
 */
#include <asm/book3s/64/pgtable.h>

This is the only place where we include book3s/64/pgtable.h directly in
this way. Everybody else should include asm/pgtable.h, which picks the right
version based on the config options.

>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index fef738229a68..c684ef6cbd10 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -512,6 +512,32 @@ static inline pte_t pte_mkhuge(pte_t pte)
>>  	return pte;
>>  }
>>  
>> +#define pte_mk_savedwrite pte_mk_savedwrite
>> +static inline pte_t pte_mk_savedwrite(pte_t pte)
>> +{
>> +	/*
>> +	 * Used by Autonuma subsystem to preserve the write bit
>> +	 * while marking the pte PROT_NONE. Only allow this
>> +	 * on PROT_NONE pte
>> +	 */
>> +	VM_BUG_ON((pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_RWX | _PAGE_PRIVILEGED)) !=
>> +		  cpu_to_be64(_PAGE_PRESENT | _PAGE_PRIVILEGED));
>> +	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
>> +}
>> +
>
>
> cheers

-aneesh


* Re: [PATCH V2 1/2] mm/autonuma: Let architecture override how the write bit should be stashed in a protnone pte.
  2017-02-14 10:59       ` Michael Ellerman
@ 2017-02-15  0:20         ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2017-02-15  0:20 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Aneesh Kumar K.V, Rik van Riel, Mel Gorman, paulus, benh,
	linux-mm, linuxppc-dev, linux-kernel

On Tue, 14 Feb 2017 21:59:23 +1100 Michael Ellerman <michaele@au1.ibm.com> wrote:

> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> 
> > On Tuesday 14 February 2017 11:19 AM, Michael Ellerman wrote:
> >> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> >>
> >>> Autonuma preserves the write permission across numa fault to avoid taking
> >>> a writefault after a numa fault (Commit: b191f9b106ea " mm: numa: preserve PTE
> >>> write permissions across a NUMA hinting fault"). Architecture can implement
> >>> protnone in different ways and some may choose to implement that by clearing Read/
> >>> Write/Exec bit of pte. Setting the write bit on such pte can result in wrong
> >>> behaviour. Fix this up by allowing arch to override how to save the write bit
> >>> on a protnone pte.
> >> This is pretty obviously a nop on arches that don't implement the new
> >> hooks, but it'd still be good to get an ack from someone in mm land
> >> before I merge it.
> >
> >
> > To get it to apply cleanly you may need
> > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch
> > http://ozlabs.org/~akpm/mmots/broken-out/mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes-fix.patch
> 
> Ah OK, I missed those.
> 
> In that case these two should probably go via Andrew's tree.

Done.  But
mm-autonuma-dont-use-set_pte_at-when-updating-protnone-ptes.patch is on
hold because Aneesh saw a testing issue, so these two are also on hold.


* Re: [PATCH V2 2/2] powerpc/mm/autonuma: Switch ppc64 to its own implementeation of saved write
  2017-02-14 12:26     ` Aneesh Kumar K.V
@ 2017-02-15  0:45       ` Michael Ellerman
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Ellerman @ 2017-02-15  0:45 UTC (permalink / raw)
  To: Aneesh Kumar K.V, akpm, Rik van Riel, Mel Gorman, paulus, benh
  Cc: linux-mm, linuxppc-dev, linux-kernel

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> Michael Ellerman <mpe@ellerman.id.au> writes:
>
>> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>> index 0735d5a8049f..8720a406bbbe 100644
>>> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
>>> @@ -16,6 +16,9 @@
>>>  #include <asm/page.h>
>>>  #include <asm/bug.h>
>>>  
>>> +#ifndef __ASSEMBLY__
>>> +#include <linux/mmdebug.h>
>>> +#endif
>>
>> I assume that's for the VM_BUG_ON() you add below. But if so wouldn't
>> the #include be better placed in book3s/64/pgtable.h also?
>
> mmu-hash.h has got a hack that is explained below
>
> #ifndef __ASSEMBLY__
> #include <linux/mmdebug.h>
> #endif
> /*
>  * This is necessary to get the definition of PGTABLE_RANGE which we
>  * need for various slices related matters. Note that this isn't the
>  * complete pgtable.h but only a portion of it.
>  */
> #include <asm/book3s/64/pgtable.h>
>
> This is the only place where we include book3s/64/pgtable.h directly in
> this way. Everybody else should include asm/pgtable.h, which picks the right
> version based on the config options.

I don't understand how that is related.

If you're adding a VM_BUG_ON() in book3s/64/pgtable.h, why isn't the
include of mmdebug.h in that file also?

cheers
