* [PATCH 0/3] mm/powerpc: enabling memory soft dirty tracking
@ 2015-10-16 12:07 ` Laurent Dufour
  0 siblings, 0 replies; 24+ messages in thread
From: Laurent Dufour @ 2015-10-16 12:07 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm, xemul, linuxppc-dev, mpe, benh,
	aneesh.kumar, paulus
  Cc: criu

This series enables software memory dirty tracking in the kernel for
powerpc.  It is a follow-up to commit 0f8975ec4db2 ("mm: soft-dirty
bits for user memory changes tracking"), which introduced this feature
in the mm code.

The first patch fixes an issue in the code clearing the soft dirty
bit.  The PTEs were not cleared before being modified, leading to a
hang on ppc64.

The second patch fixes a build issue when transparent huge pages are
not enabled.

The third patch introduces soft dirty tracking in the powerpc
architecture code.
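
For reference, here is a minimal user space sketch of how a checkpoint
tool consumes this interface.  It follows the generic soft-dirty procfs
documentation (clear_refs and bit 55 of pagemap) and is not part of
this series; error handling is omitted:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

/* Read the pagemap entry covering vaddr and return its soft dirty bit. */
static int soft_dirty(int pagemap_fd, unsigned long vaddr)
{
	uint64_t entry;
	off_t off = (vaddr / sysconf(_SC_PAGESIZE)) * sizeof(entry);

	if (pread(pagemap_fd, &entry, sizeof(entry), off) != sizeof(entry))
		return -1;
	return (entry >> 55) & 1;	/* bit 55 is the soft dirty bit */
}

int main(void)
{
	int clear_fd = open("/proc/self/clear_refs", O_WRONLY);
	int pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
	char *page = malloc(4096);

	write(clear_fd, "4", 1);	/* writing "4" clears soft dirty bits */
	page[0] = 1;			/* touch the page again */
	printf("soft dirty after write: %d\n",
	       soft_dirty(pagemap_fd, (unsigned long)page));
	return 0;
}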

Laurent Dufour (3):
  mm: clearing pte in clear_soft_dirty()
  mm: clear_soft_dirty_pmd requires THP
  powerpc/mm: Add page soft dirty tracking

 arch/powerpc/Kconfig                     |  2 ++
 arch/powerpc/include/asm/pgtable-ppc64.h | 13 +++++++++--
 arch/powerpc/include/asm/pgtable.h       | 40 +++++++++++++++++++++++++++++++-
 arch/powerpc/include/asm/pte-book3e.h    |  1 +
 arch/powerpc/include/asm/pte-common.h    |  5 ++--
 arch/powerpc/include/asm/pte-hash64.h    |  1 +
 fs/proc/task_mmu.c                       | 21 +++++++++--------
 7 files changed, 68 insertions(+), 15 deletions(-)

-- 
1.9.1


* [PATCH 1/3] mm: clearing pte in clear_soft_dirty()
  2015-10-16 12:07 ` Laurent Dufour
@ 2015-10-16 12:07   ` Laurent Dufour
  -1 siblings, 0 replies; 24+ messages in thread
From: Laurent Dufour @ 2015-10-16 12:07 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm, xemul, linuxppc-dev, mpe, benh,
	aneesh.kumar, paulus
  Cc: criu

As mentioned in commit 56eecdb912b5 ("mm: Use ptep/pmdp_set_numa() for
updating _PAGE_NUMA bit"), architectures like ppc64 don't do a TLB
flush in the set_pte/pmd functions.

So when dealing with an existing pte in clear_soft_dirty(), the pte
must be cleared before being modified.
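
For context, the generic fallback for the helpers used below is roughly
the following (a sketch of the generic pattern, not the ppc64
implementation): the start helper atomically gets and clears the live
pte, so a concurrent hardware or hash-table update cannot be silently
overwritten, and the commit helper installs the downgraded pte
afterwards.

/* Sketch of the generic ptep_modify_prot_start/commit pattern. */
static inline pte_t sketch_modify_prot_start(struct mm_struct *mm,
					     unsigned long addr, pte_t *ptep)
{
	/* Atomically read and clear the pte; the entry is now invalid. */
	return ptep_get_and_clear(mm, addr, ptep);
}

static inline void sketch_modify_prot_commit(struct mm_struct *mm,
					     unsigned long addr,
					     pte_t *ptep, pte_t pte)
{
	/* Install the modified pte, which was cleared by the start helper. */
	set_pte_at(mm, addr, ptep, pte);
}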

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/proc/task_mmu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e2d46adb54b4..c9454ee39b28 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -753,19 +753,20 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
 	pte_t ptent = *pte;
 
 	if (pte_present(ptent)) {
+		ptent = ptep_modify_prot_start(vma->vm_mm, addr, pte);
 		ptent = pte_wrprotect(ptent);
 		ptent = pte_clear_flags(ptent, _PAGE_SOFT_DIRTY);
+		ptep_modify_prot_commit(vma->vm_mm, addr, pte, ptent);
 	} else if (is_swap_pte(ptent)) {
 		ptent = pte_swp_clear_soft_dirty(ptent);
+		set_pte_at(vma->vm_mm, addr, pte, ptent);
 	}
-
-	set_pte_at(vma->vm_mm, addr, pte, ptent);
 }
 
 static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
 		unsigned long addr, pmd_t *pmdp)
 {
-	pmd_t pmd = *pmdp;
+	pmd_t pmd = pmdp_huge_get_and_clear(vma->vm_mm, addr, pmdp);
 
 	pmd = pmd_wrprotect(pmd);
 	pmd = pmd_clear_flags(pmd, _PAGE_SOFT_DIRTY);
-- 
1.9.1


* [PATCH 2/3] mm: clear_soft_dirty_pmd requires THP
  2015-10-16 12:07 ` Laurent Dufour
@ 2015-10-16 12:07   ` Laurent Dufour
  -1 siblings, 0 replies; 24+ messages in thread
From: Laurent Dufour @ 2015-10-16 12:07 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm, xemul, linuxppc-dev, mpe, benh,
	aneesh.kumar, paulus
  Cc: criu

Don't build clear_soft_dirty_pmd() if transparent huge pages are not
enabled.

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/proc/task_mmu.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index c9454ee39b28..fa847a982a9f 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -762,7 +762,14 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
 		set_pte_at(vma->vm_mm, addr, pte, ptent);
 	}
 }
+#else
+static inline void clear_soft_dirty(struct vm_area_struct *vma,
+		unsigned long addr, pte_t *pte)
+{
+}
+#endif
 
+#if defined(CONFIG_MEM_SOFT_DIRTY) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
 static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
 		unsigned long addr, pmd_t *pmdp)
 {
@@ -776,14 +783,7 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
 
 	set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
 }
-
 #else
-
-static inline void clear_soft_dirty(struct vm_area_struct *vma,
-		unsigned long addr, pte_t *pte)
-{
-}
-
 static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
 		unsigned long addr, pmd_t *pmdp)
 {
-- 
1.9.1


* [PATCH 3/3] powerpc/mm: Add page soft dirty tracking
  2015-10-16 12:07 ` Laurent Dufour
@ 2015-10-16 12:07   ` Laurent Dufour
  -1 siblings, 0 replies; 24+ messages in thread
From: Laurent Dufour @ 2015-10-16 12:07 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm, xemul, linuxppc-dev, mpe, benh,
	aneesh.kumar, paulus
  Cc: criu

The user space checkpoint and restart tool (CRIU) needs page changes
to be soft-dirty tracked.  This makes it possible to do a
pre-checkpoint and then dump only the touched pages.

This is done by using a newly assigned PTE bit (_PAGE_SOFT_DIRTY) when
the page is backed in memory, and a new _PAGE_SWP_SOFT_DIRTY bit when
the page is swapped out.

The _PAGE_SWP_SOFT_DIRTY bit is placed dynamically right after the
swap type in the swap pte.  A check is added to ensure that the bit is
not overwritten by _PAGE_HPTEFLAGS.
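
To illustrate the resulting layout (illustration only, assuming a 5-bit
swap type as the existing 0x1f mask in the BUILD_BUG_ON below suggests):

/*
 * _PAGE_BIT_SWAP_TYPE is 2 on hash64, so the swap type occupies bits
 * 2..6 of the swap pte and the new bit lands just above it:
 *
 *	_PAGE_SWP_SOFT_DIRTY = 1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE)
 *			     = 1UL << (5 + 2) = 0x80
 *
 * The added BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY) then
 * guarantees the hash PTE flags can never clobber that bit.
 */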

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                     |  2 ++
 arch/powerpc/include/asm/pgtable-ppc64.h | 13 +++++++++--
 arch/powerpc/include/asm/pgtable.h       | 40 +++++++++++++++++++++++++++++++-
 arch/powerpc/include/asm/pte-book3e.h    |  1 +
 arch/powerpc/include/asm/pte-common.h    |  5 ++--
 arch/powerpc/include/asm/pte-hash64.h    |  1 +
 6 files changed, 57 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9a7057ec2154..73a4a36a6b38 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -559,6 +559,7 @@ choice
 
 config PPC_4K_PAGES
 	bool "4k page size"
+	select HAVE_ARCH_SOFT_DIRTY if CHECKPOINT_RESTORE && PPC_BOOK3S
 
 config PPC_16K_PAGES
 	bool "16k page size"
@@ -567,6 +568,7 @@ config PPC_16K_PAGES
 config PPC_64K_PAGES
 	bool "64k page size"
 	depends on !PPC_FSL_BOOK3E && (44x || PPC_STD_MMU_64 || PPC_BOOK3E_64)
+	select HAVE_ARCH_SOFT_DIRTY if CHECKPOINT_RESTORE && PPC_BOOK3S
 
 config PPC_256K_PAGES
 	bool "256k page size"
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index fa1dfb7f7b48..2738bf4a8c55 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -315,7 +315,8 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
 static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 {
 	unsigned long bits = pte_val(entry) &
-		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
+		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC |
+		 _PAGE_SOFT_DIRTY);
 
 #ifdef PTE_ATOMIC_UPDATES
 	unsigned long old, tmp;
@@ -354,6 +355,7 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 	 * We filter HPTEFLAGS on set_pte.			\
 	 */							\
 	BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
+	BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY);	\
 	} while (0)
 /*
  * on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
@@ -371,6 +373,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
+
+#define _PAGE_SWP_SOFT_DIRTY	(1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
 #endif /* __ASSEMBLY__ */
 
 /*
@@ -389,7 +393,7 @@ void pgtable_cache_init(void);
  */
 #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
 			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
+			 _PAGE_THP_HUGE | _PAGE_SOFT_DIRTY)
 
 #ifndef __ASSEMBLY__
 /*
@@ -513,6 +517,11 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
 
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+#define pmd_soft_dirty(pmd)	pte_soft_dirty(pmd_pte(pmd))
+#define pmd_mksoft_dirty(pmd)	pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
+#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
+
 #define __HAVE_ARCH_PMD_WRITE
 #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
 
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 0717693c8428..88baad3d66e2 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -38,6 +38,44 @@ static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL;
 static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
 static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
 
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+static inline int pte_soft_dirty(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_SOFT_DIRTY;
+}
+static inline pte_t pte_mksoft_dirty(pte_t pte)
+{
+	pte_val(pte) |= _PAGE_SOFT_DIRTY;
+	return pte;
+}
+
+static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
+{
+	pte_val(pte) |= _PAGE_SWP_SOFT_DIRTY;
+	return pte;
+}
+static inline int pte_swp_soft_dirty(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_SWP_SOFT_DIRTY;
+}
+static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
+{
+	pte_val(pte) &= ~_PAGE_SWP_SOFT_DIRTY;
+	return pte;
+}
+
+static inline pte_t pte_clear_flags(pte_t pte, pte_basic_t clear)
+{
+	pte_val(pte) &= ~clear;
+	return pte;
+}
+static inline pmd_t pmd_clear_flags(pmd_t pmd, unsigned long clear)
+{
+	pmd_val(pmd) &= ~clear;
+	return pmd;
+}
+#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
+
 #ifdef CONFIG_NUMA_BALANCING
 /*
  * These work without NUMA balancing but the kernel does not care. See the
@@ -89,7 +127,7 @@ static inline pte_t pte_mkwrite(pte_t pte) {
 	pte_val(pte) &= ~_PAGE_RO;
 	pte_val(pte) |= _PAGE_RW; return pte; }
 static inline pte_t pte_mkdirty(pte_t pte) {
-	pte_val(pte) |= _PAGE_DIRTY; return pte; }
+	pte_val(pte) |= _PAGE_DIRTY | _PAGE_SOFT_DIRTY; return pte; }
 static inline pte_t pte_mkyoung(pte_t pte) {
 	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
 static inline pte_t pte_mkspecial(pte_t pte) {
diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/pte-book3e.h
index 8d8473278d91..df5581f817f6 100644
--- a/arch/powerpc/include/asm/pte-book3e.h
+++ b/arch/powerpc/include/asm/pte-book3e.h
@@ -57,6 +57,7 @@
 
 #define _PAGE_HASHPTE	0
 #define _PAGE_BUSY	0
+#define _PAGE_SOFT_DIRTY	0
 
 #define _PAGE_SPECIAL	_PAGE_SW0
 
diff --git a/arch/powerpc/include/asm/pte-common.h b/arch/powerpc/include/asm/pte-common.h
index 71537a319fc8..1bf670996df5 100644
--- a/arch/powerpc/include/asm/pte-common.h
+++ b/arch/powerpc/include/asm/pte-common.h
@@ -94,13 +94,14 @@ extern unsigned long bad_call_to_PMD_PAGE_SIZE(void);
  * pgprot changes
  */
 #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
-                         _PAGE_ACCESSED | _PAGE_SPECIAL)
+			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_SOFT_DIRTY)
 
 /* Mask of bits returned by pte_pgprot() */
 #define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
 			 _PAGE_WRITETHRU | _PAGE_ENDIAN | _PAGE_4K_PFN | \
 			 _PAGE_USER | _PAGE_ACCESSED | _PAGE_RO | \
-			 _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | _PAGE_EXEC)
+			 _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | \
+			 _PAGE_EXEC | _PAGE_SOFT_DIRTY)
 
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
diff --git a/arch/powerpc/include/asm/pte-hash64.h b/arch/powerpc/include/asm/pte-hash64.h
index ef612c160da7..19ffd150957f 100644
--- a/arch/powerpc/include/asm/pte-hash64.h
+++ b/arch/powerpc/include/asm/pte-hash64.h
@@ -19,6 +19,7 @@
 #define _PAGE_BIT_SWAP_TYPE	2
 #define _PAGE_EXEC		0x0004 /* No execute on POWER4 and newer (we invert) */
 #define _PAGE_GUARDED		0x0008
+#define _PAGE_SOFT_DIRTY	0x0010 /* software dirty tracking */
 /* We can derive Memory coherence from _PAGE_NO_CACHE */
 #define _PAGE_NO_CACHE		0x0020 /* I: cache inhibit */
 #define _PAGE_WRITETHRU		0x0040 /* W: cache write-through */
-- 
1.9.1


* Re: [PATCH 1/3] mm: clearing pte in clear_soft_dirty()
  2015-10-16 12:07   ` Laurent Dufour
@ 2015-10-16 15:00     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2015-10-16 15:00 UTC (permalink / raw)
  To: Laurent Dufour, linux-kernel, linux-mm, akpm, xemul,
	linuxppc-dev, mpe, aneesh.kumar, paulus
  Cc: criu

On Fri, 2015-10-16 at 14:07 +0200, Laurent Dufour wrote:
> As mentioned in the commit 56eecdb912b5 ("mm: Use ptep/pmdp_set_numa()
> for updating _PAGE_NUMA bit"), architecture like ppc64 doesn't do
> tlb flush in set_pte/pmd functions.
> 
> So when dealing with existing pte in clear_soft_dirty, the pte must
> be cleared before being modified.

Note that this is true of more than powerpc afaik.  There is a general
rule that we don't "restrict" a PTE's access permissions without first
clearing it, due to various races.

> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/proc/task_mmu.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index e2d46adb54b4..c9454ee39b28 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -753,19 +753,20 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
>  	pte_t ptent = *pte;
>  
>  	if (pte_present(ptent)) {
> +		ptent = ptep_modify_prot_start(vma->vm_mm, addr, pte);
>  		ptent = pte_wrprotect(ptent);
>  		ptent = pte_clear_flags(ptent, _PAGE_SOFT_DIRTY);
> +		ptep_modify_prot_commit(vma->vm_mm, addr, pte, ptent);
>  	} else if (is_swap_pte(ptent)) {
>  		ptent = pte_swp_clear_soft_dirty(ptent);
> +		set_pte_at(vma->vm_mm, addr, pte, ptent);
>  	}
> -
> -	set_pte_at(vma->vm_mm, addr, pte, ptent);
>  }
>  
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>  		unsigned long addr, pmd_t *pmdp)
>  {
> -	pmd_t pmd = *pmdp;
> +	pmd_t pmd = pmdp_huge_get_and_clear(vma->vm_mm, addr, pmdp);
>  
>  	pmd = pmd_wrprotect(pmd);
>  	pmd = pmd_clear_flags(pmd, _PAGE_SOFT_DIRTY);

* Re: [PATCH 0/3] mm/powerpc: enabling memory soft dirty tracking
  2015-10-16 12:07 ` Laurent Dufour
@ 2015-10-16 21:11   ` Andrew Morton
  -1 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2015-10-16 21:11 UTC (permalink / raw)
  To: Laurent Dufour
  Cc: linux-kernel, linux-mm, xemul, linuxppc-dev, mpe, benh,
	aneesh.kumar, paulus, criu

On Fri, 16 Oct 2015 14:07:05 +0200 Laurent Dufour <ldufour@linux.vnet.ibm.com> wrote:

> This series is enabling the software memory dirty tracking in the
> kernel for powerpc.  This is the follow up of the commit 0f8975ec4db2
> ("mm: soft-dirty bits for user memory changes tracking") which
> introduced this feature in the mm code.
> 
> The first patch is fixing an issue in the code clearing the soft dirty
> bit.  The PTE were not cleared before being modified, leading to hang
> on ppc64.
> 
> The second patch is fixing a build issue when the transparent huge
> page is not enabled.
> 
> The third patch is introducing the soft dirty tracking in the powerpc
> architecture code. 

I grabbed these patches, but they're more a ppc thing than a core
kernel thing.  I can merge them into 4.3 with suitable acks or drop
them if they turn up in the powerpc tree.  Or something else?

* Re: [PATCH 0/3] mm/powerpc: enabling memory soft dirty tracking
  2015-10-16 21:11   ` Andrew Morton
@ 2015-10-17  2:15     ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2015-10-17  2:15 UTC (permalink / raw)
  To: Andrew Morton, Laurent Dufour
  Cc: linux-kernel, linux-mm, xemul, linuxppc-dev, mpe, aneesh.kumar,
	paulus, criu

On Fri, 2015-10-16 at 14:11 -0700, Andrew Morton wrote:
> I grabbed these patches, but they're more a ppc thing than a core
> kernel thing.  I can merge them into 4.3 with suitable acks or drop
> them if they turn up in the powerpc tree.  Or something else?

I'm happy for you to keep the generic ones but the powerpc one at the
end should be reviewed by Aneesh at least.

Cheers,
Ben.


* Re: [PATCH 0/3] mm/powerpc: enabling memory soft dirty tracking
  2015-10-16 21:11   ` Andrew Morton
@ 2015-10-17 12:07     ` Aneesh Kumar K.V
  -1 siblings, 0 replies; 24+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 12:07 UTC (permalink / raw)
  To: Andrew Morton, Laurent Dufour
  Cc: linux-kernel, linux-mm, xemul, linuxppc-dev, mpe, benh, paulus, criu

Andrew Morton <akpm@linux-foundation.org> writes:

> On Fri, 16 Oct 2015 14:07:05 +0200 Laurent Dufour <ldufour@linux.vnet.ibm.com> wrote:
>
>> This series is enabling the software memory dirty tracking in the
>> kernel for powerpc.  This is the follow up of the commit 0f8975ec4db2
>> ("mm: soft-dirty bits for user memory changes tracking") which
>> introduced this feature in the mm code.
>> 
>> The first patch is fixing an issue in the code clearing the soft dirty
>> bit.  The PTE were not cleared before being modified, leading to hang
>> on ppc64.
>> 
>> The second patch is fixing a build issue when the transparent huge
>> page is not enabled.
>> 
>> The third patch is introducing the soft dirty tracking in the powerpc
>> architecture code. 
>
> I grabbed these patches, but they're more a ppc thing than a core
> kernel thing.  I can merge them into 4.3 with suitable acks or drop
> them if they turn up in the powerpc tree.  Or something else?

Patch 1 and patch 2 are fixes for generic code.  Those can go via the
-mm tree.  The ppc64 bits should go via the linux-powerpc tree.  We
have changes in this area pending to be merged upstream, and patch 3
will result in conflicts.


-aneesh


* Re: [PATCH 1/3] mm: clearing pte in clear_soft_dirty()
  2015-10-16 12:07   ` Laurent Dufour
@ 2015-10-17 12:12     ` Aneesh Kumar K.V
  -1 siblings, 0 replies; 24+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 12:12 UTC (permalink / raw)
  To: Laurent Dufour, linux-kernel, linux-mm, akpm, xemul,
	linuxppc-dev, mpe, benh, paulus
  Cc: criu

Laurent Dufour <ldufour@linux.vnet.ibm.com> writes:

> As mentioned in the commit 56eecdb912b5 ("mm: Use ptep/pmdp_set_numa()
> for updating _PAGE_NUMA bit"), architecture like ppc64 doesn't do
> tlb flush in set_pte/pmd functions.
>
> So when dealing with existing pte in clear_soft_dirty, the pte must
> be cleared before being modified.
>
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

> ---
>  fs/proc/task_mmu.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index e2d46adb54b4..c9454ee39b28 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -753,19 +753,20 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
>  	pte_t ptent = *pte;
>
>  	if (pte_present(ptent)) {
> +		ptent = ptep_modify_prot_start(vma->vm_mm, addr, pte);
>  		ptent = pte_wrprotect(ptent);
>  		ptent = pte_clear_flags(ptent, _PAGE_SOFT_DIRTY);
> +		ptep_modify_prot_commit(vma->vm_mm, addr, pte, ptent);
>  	} else if (is_swap_pte(ptent)) {
>  		ptent = pte_swp_clear_soft_dirty(ptent);
> +		set_pte_at(vma->vm_mm, addr, pte, ptent);
>  	}
> -
> -	set_pte_at(vma->vm_mm, addr, pte, ptent);
>  }
>
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>  		unsigned long addr, pmd_t *pmdp)
>  {
> -	pmd_t pmd = *pmdp;
> +	pmd_t pmd = pmdp_huge_get_and_clear(vma->vm_mm, addr, pmdp);
>
>  	pmd = pmd_wrprotect(pmd);
>  	pmd = pmd_clear_flags(pmd, _PAGE_SOFT_DIRTY);
> -- 
> 1.9.1


* Re: [PATCH 2/3] mm: clear_soft_dirty_pmd requires THP
  2015-10-16 12:07   ` Laurent Dufour
@ 2015-10-17 12:14     ` Aneesh Kumar K.V
  -1 siblings, 0 replies; 24+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 12:14 UTC (permalink / raw)
  To: Laurent Dufour, linux-kernel, linux-mm, akpm, xemul,
	linuxppc-dev, mpe, benh, paulus
  Cc: criu

Laurent Dufour <ldufour@linux.vnet.ibm.com> writes:

> Don't build clear_soft_dirty_pmd() if the transparent huge pages are
> not enabled.
>
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>


Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/proc/task_mmu.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index c9454ee39b28..fa847a982a9f 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -762,7 +762,14 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
>  		set_pte_at(vma->vm_mm, addr, pte, ptent);
>  	}
>  }
> +#else
> +static inline void clear_soft_dirty(struct vm_area_struct *vma,
> +		unsigned long addr, pte_t *pte)
> +{
> +}
> +#endif
>
> +#if defined(CONFIG_MEM_SOFT_DIRTY) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>  		unsigned long addr, pmd_t *pmdp)
>  {
> @@ -776,14 +783,7 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>
>  	set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
>  }
> -
>  #else
> -
> -static inline void clear_soft_dirty(struct vm_area_struct *vma,
> -		unsigned long addr, pte_t *pte)
> -{
> -}
> -
>  static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
>  		unsigned long addr, pmd_t *pmdp)
>  {
> -- 
> 1.9.1


* Re: [PATCH 3/3] powerpc/mm: Add page soft dirty tracking
  2015-10-16 12:07   ` Laurent Dufour
@ 2015-10-17 12:19     ` Aneesh Kumar K.V
  -1 siblings, 0 replies; 24+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 12:19 UTC (permalink / raw)
  To: Laurent Dufour, linux-kernel, linux-mm, akpm, xemul,
	linuxppc-dev, mpe, benh, paulus
  Cc: criu

Laurent Dufour <ldufour@linux.vnet.ibm.com> writes:

> User space checkpoint and restart tool (CRIU) needs the page's change
> to be soft tracked. This allows to do a pre checkpoint and then dump
> only touched pages.
>
> This is done by using a newly assigned PTE bit (_PAGE_SOFT_DIRTY) when
> the page is backed in memory, and a new _PAGE_SWP_SOFT_DIRTY bit when
> the page is swapped out.
>
> The _PAGE_SWP_SOFT_DIRTY bit is dynamically put after the swap type
> in the swap pte. A check is added to ensure that the bit is not
> overwritten by _PAGE_HPTEFLAGS.
>
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/Kconfig                     |  2 ++
>  arch/powerpc/include/asm/pgtable-ppc64.h | 13 +++++++++--
>  arch/powerpc/include/asm/pgtable.h       | 40 +++++++++++++++++++++++++++++++-
>  arch/powerpc/include/asm/pte-book3e.h    |  1 +
>  arch/powerpc/include/asm/pte-common.h    |  5 ++--
>  arch/powerpc/include/asm/pte-hash64.h    |  1 +
>  6 files changed, 57 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 9a7057ec2154..73a4a36a6b38 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -559,6 +559,7 @@ choice
>
>  config PPC_4K_PAGES
>  	bool "4k page size"
> +	select HAVE_ARCH_SOFT_DIRTY if CHECKPOINT_RESTORE && PPC_BOOK3S
>
>  config PPC_16K_PAGES
>  	bool "16k page size"
> @@ -567,6 +568,7 @@ config PPC_16K_PAGES
>  config PPC_64K_PAGES
>  	bool "64k page size"
>  	depends on !PPC_FSL_BOOK3E && (44x || PPC_STD_MMU_64 || PPC_BOOK3E_64)
> +	select HAVE_ARCH_SOFT_DIRTY if CHECKPOINT_RESTORE && PPC_BOOK3S
>
>  config PPC_256K_PAGES
>  	bool "256k page size"
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index fa1dfb7f7b48..2738bf4a8c55 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -315,7 +315,8 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
>  static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
>  {
>  	unsigned long bits = pte_val(entry) &
> -		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
> +		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC |
> +		 _PAGE_SOFT_DIRTY);
>
>  #ifdef PTE_ATOMIC_UPDATES
>  	unsigned long old, tmp;
> @@ -354,6 +355,7 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
>  	 * We filter HPTEFLAGS on set_pte.			\
>  	 */							\
>  	BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
> +	BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY);	\
>  	} while (0)
>  /*
>   * on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
> @@ -371,6 +373,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
>
>  void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
>  void pgtable_cache_init(void);
> +
> +#define _PAGE_SWP_SOFT_DIRTY	(1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))
>  #endif /* __ASSEMBLY__ */
>
>  /*
> @@ -389,7 +393,7 @@ void pgtable_cache_init(void);
>   */
>  #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
>  			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
> -			 _PAGE_THP_HUGE)
> +			 _PAGE_THP_HUGE | _PAGE_SOFT_DIRTY)
>
>  #ifndef __ASSEMBLY__
>  /*
> @@ -513,6 +517,11 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>  #define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
>  #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
>
> +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
> +#define pmd_soft_dirty(pmd)	pte_soft_dirty(pmd_pte(pmd))
> +#define pmd_mksoft_dirty(pmd)	pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
> +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
> +
>  #define __HAVE_ARCH_PMD_WRITE
>  #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
>
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index 0717693c8428..88baad3d66e2 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -38,6 +38,44 @@ static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL;
>  static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
>  static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
>
> +#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
> +static inline int pte_soft_dirty(pte_t pte)
> +{
> +	return pte_val(pte) & _PAGE_SOFT_DIRTY;
> +}
> +static inline pte_t pte_mksoft_dirty(pte_t pte)
> +{
> +	pte_val(pte) |= _PAGE_SOFT_DIRTY;
> +	return pte;


This will break after
https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135298.html


A good option is to drop this patch from the series and let Andrew take
the first two patches.  You can send an updated version of patch 3
against the linux-powerpc tree once Michael pulls that series into his
tree.


> +}
> +
> +static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
> +{
> +	pte_val(pte) |= _PAGE_SWP_SOFT_DIRTY;
> +	return pte;
> +}
> +static inline int pte_swp_soft_dirty(pte_t pte)
> +{
> +	return pte_val(pte) & _PAGE_SWP_SOFT_DIRTY;
> +}
> +static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
> +{
> +	pte_val(pte) &= ~_PAGE_SWP_SOFT_DIRTY;
> +	return pte;
> +}
> +
> +static inline pte_t pte_clear_flags(pte_t pte, pte_basic_t clear)
> +{
> +	pte_val(pte) &= ~clear;
> +	return pte;
> +}
> +static inline pmd_t pmd_clear_flags(pmd_t pmd, unsigned long clear)
> +{
> +	pmd_val(pmd) &= ~clear;
> +	return pmd;
> +}
> +#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
> +
>  #ifdef CONFIG_NUMA_BALANCING
>  /*
>   * These work without NUMA balancing but the kernel does not care. See the
> @@ -89,7 +127,7 @@ static inline pte_t pte_mkwrite(pte_t pte) {
>  	pte_val(pte) &= ~_PAGE_RO;
>  	pte_val(pte) |= _PAGE_RW; return pte; }
>  static inline pte_t pte_mkdirty(pte_t pte) {
> -	pte_val(pte) |= _PAGE_DIRTY; return pte; }
> +	pte_val(pte) |= _PAGE_DIRTY | _PAGE_SOFT_DIRTY; return pte; }
>  static inline pte_t pte_mkyoung(pte_t pte) {
>  	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
>  static inline pte_t pte_mkspecial(pte_t pte) {
> diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/pte-book3e.h
> index 8d8473278d91..df5581f817f6 100644
> --- a/arch/powerpc/include/asm/pte-book3e.h
> +++ b/arch/powerpc/include/asm/pte-book3e.h
> @@ -57,6 +57,7 @@
>
>  #define _PAGE_HASHPTE	0
>  #define _PAGE_BUSY	0
> +#define _PAGE_SOFT_DIRTY	0
>
>  #define _PAGE_SPECIAL	_PAGE_SW0
>
> diff --git a/arch/powerpc/include/asm/pte-common.h b/arch/powerpc/include/asm/pte-common.h
> index 71537a319fc8..1bf670996df5 100644
> --- a/arch/powerpc/include/asm/pte-common.h
> +++ b/arch/powerpc/include/asm/pte-common.h
> @@ -94,13 +94,14 @@ extern unsigned long bad_call_to_PMD_PAGE_SIZE(void);
>   * pgprot changes
>   */
>  #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
> -                         _PAGE_ACCESSED | _PAGE_SPECIAL)
> +			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_SOFT_DIRTY)
>
>  /* Mask of bits returned by pte_pgprot() */
>  #define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
>  			 _PAGE_WRITETHRU | _PAGE_ENDIAN | _PAGE_4K_PFN | \
>  			 _PAGE_USER | _PAGE_ACCESSED | _PAGE_RO | \
> -			 _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | _PAGE_EXEC)
> +			 _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | \
> +			 _PAGE_EXEC | _PAGE_SOFT_DIRTY)
>
>  /*
>   * We define 2 sets of base prot bits, one for basic pages (ie,
> diff --git a/arch/powerpc/include/asm/pte-hash64.h b/arch/powerpc/include/asm/pte-hash64.h
> index ef612c160da7..19ffd150957f 100644
> --- a/arch/powerpc/include/asm/pte-hash64.h
> +++ b/arch/powerpc/include/asm/pte-hash64.h
> @@ -19,6 +19,7 @@
>  #define _PAGE_BIT_SWAP_TYPE	2
>  #define _PAGE_EXEC		0x0004 /* No execute on POWER4 and newer (we invert) */
>  #define _PAGE_GUARDED		0x0008
> +#define _PAGE_SOFT_DIRTY	0x0010 /* software dirty tracking */
>  /* We can derive Memory coherence from _PAGE_NO_CACHE */
>  #define _PAGE_NO_CACHE		0x0020 /* I: cache inhibit */
>  #define _PAGE_WRITETHRU		0x0040 /* W: cache write-through */
> -- 
> 1.9.1


-aneesh


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/3] powerpc/mm: Add page soft dirty tracking
  2015-10-17 12:19     ` Aneesh Kumar K.V
@ 2015-10-17 13:24       ` Benjamin Herrenschmidt
  -1 siblings, 0 replies; 24+ messages in thread
From: Benjamin Herrenschmidt @ 2015-10-17 13:24 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Laurent Dufour, linux-kernel, linux-mm, akpm,
	xemul, linuxppc-dev, mpe, paulus
  Cc: criu

On Sat, 2015-10-17 at 17:49 +0530, Aneesh Kumar K.V wrote:
> This will break after
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-October/135298.html
> 
> 
> A good option is to drop this patch from the series and let Andrew take
> the first two patches. You can send an updated version of patch 3 against
> linux-powerpc tree once Michael pulls that series to his tree. 

Or not ... I'm not comfortable with your series just yet for the reasons
I mentioned earlier (basically doubling the memory footprint of the page
tables).

They are already too big.

Cheers,
Ben.
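
(A back-of-envelope illustration of the footprint objection, assuming each last-level PTE grows from 8 to 16 bytes; that assumption is the rough shape of the concern, not a number taken from the series linked above.)

	1 GiB mapped with 64 KiB pages -> 2^30 / 2^16 = 16384 last-level PTEs
	16384 entries *  8 bytes = 128 KiB of page-table pages
	16384 entries * 16 bytes = 256 KiB of page-table pages

Under that assumption the last-level page-table cost of every mapping roughly doubles, which is the scale the objection is about.
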


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads: [~2015-10-17 13:58 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-16 12:07 [PATCH 0/3] mm/powerpc: enabling memory soft dirty tracking Laurent Dufour
2015-10-16 12:07 ` Laurent Dufour
2015-10-16 12:07 ` [PATCH 1/3] mm: clearing pte in clear_soft_dirty() Laurent Dufour
2015-10-16 12:07   ` Laurent Dufour
2015-10-16 15:00   ` Benjamin Herrenschmidt
2015-10-16 15:00     ` Benjamin Herrenschmidt
2015-10-17 12:12   ` Aneesh Kumar K.V
2015-10-17 12:12     ` Aneesh Kumar K.V
2015-10-16 12:07 ` [PATCH 2/3] mm: clear_soft_dirty_pmd requires THP Laurent Dufour
2015-10-16 12:07   ` Laurent Dufour
2015-10-17 12:14   ` Aneesh Kumar K.V
2015-10-17 12:14     ` Aneesh Kumar K.V
2015-10-16 12:07 ` [PATCH 3/3] powerpc/mm: Add page soft dirty tracking Laurent Dufour
2015-10-16 12:07   ` Laurent Dufour
2015-10-17 12:19   ` Aneesh Kumar K.V
2015-10-17 12:19     ` Aneesh Kumar K.V
2015-10-17 13:24     ` Benjamin Herrenschmidt
2015-10-17 13:24       ` Benjamin Herrenschmidt
2015-10-16 21:11 ` [PATCH 0/3] mm/powerpc: enabling memory " Andrew Morton
2015-10-16 21:11   ` Andrew Morton
2015-10-17  2:15   ` Benjamin Herrenschmidt
2015-10-17  2:15     ` Benjamin Herrenschmidt
2015-10-17 12:07   ` Aneesh Kumar K.V
2015-10-17 12:07     ` Aneesh Kumar K.V
