* [PATCH v7 0/7] Radix pte update tlbflush optimizations.
From: Aneesh Kumar K.V @ 2016-11-28  6:16 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

Changes from v6:
* restrict the new pte bit to radix and DD1 config

Changes from V5:
Switch to use pte bits to track page size.


Aneesh Kumar K.V (7):
  powerpc/mm: Rename hugetlb-radix.h to hugetlb.h
  powerpc/mm/hugetlb: Handle hugepage size supported by hash config
  powerpc/mm: Introduce _PAGE_LARGE software pte bits
  powerpc/mm: Add radix__tlb_flush_pte
  powerpc/mm: update radix__ptep_set_access_flag to not do full mm tlb
    flush
  powerpc/mm: update radix__pte_update to not do full mm tlb flush
  powerpc/mm: Batch tlb flush when invalidating pte entries

 arch/powerpc/include/asm/book3s/32/pgtable.h       |  3 ++-
 .../asm/book3s/64/{hugetlb-radix.h => hugetlb.h}   | 28 ++++++++++++++++++++--
 arch/powerpc/include/asm/book3s/64/pgtable.h       | 14 +++++++++--
 arch/powerpc/include/asm/book3s/64/radix.h         | 28 ++++++++++------------
 .../powerpc/include/asm/book3s/64/tlbflush-radix.h |  2 ++
 arch/powerpc/include/asm/hugetlb.h                 |  2 +-
 arch/powerpc/include/asm/nohash/32/pgtable.h       |  3 ++-
 arch/powerpc/include/asm/nohash/64/pgtable.h       |  3 ++-
 arch/powerpc/mm/pgtable-book3s64.c                 |  3 ++-
 arch/powerpc/mm/pgtable.c                          |  2 +-
 arch/powerpc/mm/tlb-radix.c                        | 18 ++++++++++++++
 11 files changed, 81 insertions(+), 25 deletions(-)
 rename arch/powerpc/include/asm/book3s/64/{hugetlb-radix.h => hugetlb.h} (52%)

-- 
2.10.2

* [PATCH v7 1/7] powerpc/mm: Rename hugetlb-radix.h to hugetlb.h
From: Aneesh Kumar K.V @ 2016-11-28  6:16 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

We will start moving some book3s-specific hugetlb functions there.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/{hugetlb-radix.h => hugetlb.h} | 4 ++--
 arch/powerpc/include/asm/hugetlb.h                                | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)
 rename arch/powerpc/include/asm/book3s/64/{hugetlb-radix.h => hugetlb.h} (90%)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
similarity index 90%
rename from arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
rename to arch/powerpc/include/asm/book3s/64/hugetlb.h
index c45189aa7476..499268045306 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_BOOK3S_64_HUGETLB_RADIX_H
-#define _ASM_POWERPC_BOOK3S_64_HUGETLB_RADIX_H
+#ifndef _ASM_POWERPC_BOOK3S_64_HUGETLB_H
+#define _ASM_POWERPC_BOOK3S_64_HUGETLB_H
 /*
  * For radix we want generic code to handle hugetlb. But then if we want
  * both hash and radix to be enabled together we need to workaround the
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index c5517f463ec7..c03e0a3dd4d8 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -9,7 +9,7 @@ extern struct kmem_cache *hugepte_cache;
 
 #ifdef CONFIG_PPC_BOOK3S_64
 
-#include <asm/book3s/64/hugetlb-radix.h>
+#include <asm/book3s/64/hugetlb.h>
 /*
  * This should work for other subarchs too. But right now we use the
  * new format only for 64bit book3s
-- 
2.10.2

* [PATCH v7 2/7] powerpc/mm/hugetlb: Handle hugepage size supported by hash config
From: Aneesh Kumar K.V @ 2016-11-28  6:16 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

W.r.t. the hash page table config, we support 16MB and 16GB as hugepage
sizes. Update hstate_get_psize() to handle 16M and 16G.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index 499268045306..d9c283f95e05 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -21,6 +21,10 @@ static inline int hstate_get_psize(struct hstate *hstate)
 		return MMU_PAGE_2M;
 	else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
 		return MMU_PAGE_1G;
+	else if (shift == mmu_psize_defs[MMU_PAGE_16M].shift)
+		return MMU_PAGE_16M;
+	else if (shift == mmu_psize_defs[MMU_PAGE_16G].shift)
+		return MMU_PAGE_16G;
 	else {
 		WARN(1, "Wrong huge page shift\n");
 		return mmu_virtual_psize;
-- 
2.10.2

* [PATCH v7 3/7] powerpc/mm: Introduce _PAGE_LARGE software pte bits
From: Aneesh Kumar K.V @ 2016-11-28  6:17 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This patch adds a new software-defined pte bit. We use the reserved
fields of the ISA 3.0 pte definition, since we will only be using this
on DD1 code paths. We can possibly look at removing this code later.

The software bit will be used to differentiate between 64K/4K and 2M ptes.
This helps in finding the page size mapped by a pte, so that we can do an
efficient tlb flush.

We don't support 1G hugetlb pages yet, so we add a debug-only VM_WARN_ON
to catch wrong usage.
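
To illustrate how the bit is meant to be consumed, a minimal sketch (the
helper name pte_flush_psize is hypothetical; the real consumer,
radix__flush_tlb_pte(), is added later in this series):

	/* Pick the page size to flush for a given pte value. */
	static inline int pte_flush_psize(unsigned long pte_val)
	{
		if (pte_val & _PAGE_LARGE)	/* 2M hugepage/THP mapping */
			return MMU_PAGE_2M;
		return mmu_virtual_psize;	/* base 64K/4K mapping */
	}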

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h | 20 ++++++++++++++++++++
 arch/powerpc/include/asm/book3s/64/pgtable.h |  9 +++++++++
 arch/powerpc/include/asm/book3s/64/radix.h   |  2 ++
 3 files changed, 31 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index d9c283f95e05..c62f14d0bec1 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -30,4 +30,24 @@ static inline int hstate_get_psize(struct hstate *hstate)
 		return mmu_virtual_psize;
 	}
 }
+
+#define arch_make_huge_pte arch_make_huge_pte
+static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
+				       struct page *page, int writable)
+{
+	unsigned long page_shift;
+
+	if (!cpu_has_feature(CPU_FTR_POWER9_DD1))
+		return entry;
+
+	page_shift = huge_page_shift(hstate_vma(vma));
+	/*
+	 * We don't support 1G hugetlb pages yet.
+	 */
+	VM_WARN_ON(page_shift == mmu_psize_defs[MMU_PAGE_1G].shift);
+	if (page_shift == mmu_psize_defs[MMU_PAGE_2M].shift)
+		return __pte(pte_val(entry) | _PAGE_LARGE);
+	else
+		return entry;
+}
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 86870c11917b..6f39b9d134a2 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -26,6 +26,11 @@
 #define _RPAGE_SW1		0x00800
 #define _RPAGE_SW2		0x00400
 #define _RPAGE_SW3		0x00200
+#define _RPAGE_RSV1		0x1000000000000000UL
+#define _RPAGE_RSV2		0x0800000000000000UL
+#define _RPAGE_RSV3		0x0400000000000000UL
+#define _RPAGE_RSV4		0x0200000000000000UL
+
 #ifdef CONFIG_MEM_SOFT_DIRTY
 #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
 #else
@@ -34,6 +39,10 @@
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
 #define _PAGE_DEVMAP		_RPAGE_SW1
 #define __HAVE_ARCH_PTE_DEVMAP
+/*
+ * For DD1 only, we need to track whether the pte huge
+ */
+#define _PAGE_LARGE	_RPAGE_RSV1
 
 
 #define _PAGE_PTE		(1ul << 62)	/* distinguishes PTEs from pointers */
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 2a46dea8e1b1..d2c5c064e266 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -243,6 +243,8 @@ static inline int radix__pmd_trans_huge(pmd_t pmd)
 
 static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
 {
+	if (cpu_has_feature(CPU_FTR_POWER9_DD1))
+		return __pmd(pmd_val(pmd) | _PAGE_PTE | _PAGE_LARGE);
 	return __pmd(pmd_val(pmd) | _PAGE_PTE);
 }
 static inline void radix__pmdp_huge_split_prepare(struct vm_area_struct *vma,
-- 
2.10.2

* [PATCH v7 4/7] powerpc/mm: Add radix__tlb_flush_pte
From: Aneesh Kumar K.V @ 2016-11-28  6:17 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

Now that we have the page size details encoded in the pte using software
pte bits, use that to find the page size needed for the tlb flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h |  2 ++
 arch/powerpc/mm/tlb-radix.c                         | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index a9e19cb2f7c5..e9bbd10ee7e9 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -42,4 +42,6 @@ extern void radix__flush_tlb_lpid_va(unsigned long lpid, unsigned long gpa,
 				     unsigned long page_size);
 extern void radix__flush_tlb_lpid(unsigned long lpid);
 extern void radix__flush_tlb_all(void);
+extern void radix__flush_tlb_pte(unsigned long old_pte, struct mm_struct *mm,
+				 unsigned long address);
 #endif
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 3493cf4e0452..7648952e4f08 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -428,3 +428,21 @@ void radix__flush_tlb_all(void)
 		     : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(0) : "memory");
 	asm volatile("eieio; tlbsync; ptesync": : :"memory");
 }
+
+void radix__flush_tlb_pte(unsigned long old_pte, struct mm_struct *mm,
+			  unsigned long address)
+{
+	/*
+	 * We track page size in pte only for DD1, So we can
+	 * call this only on DD1.
+	 */
+	if (!cpu_has_feature(CPU_FTR_POWER9_DD1)) {
+		VM_WARN_ON(1);
+		return;
+	}
+
+	if (old_pte & _PAGE_LARGE)
+		radix__flush_tlb_page_psize(mm, address, MMU_PAGE_2M);
+	else
+		radix__flush_tlb_page_psize(mm, address, mmu_virtual_psize);
+}
-- 
2.10.2

* [PATCH v7 5/7] powerpc/mm: update radix__ptep_set_access_flag to not do full mm tlb flush
From: Aneesh Kumar K.V @ 2016-11-28  6:17 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

When we are updating a pte, we just need to flush the tlb mapping for
that pte. Right now we do a full mm flush because we don't track the page
size. Now that we have the page size details in the pte, use them to do
the optimized flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h |  3 ++-
 arch/powerpc/include/asm/book3s/64/pgtable.h |  5 +++--
 arch/powerpc/include/asm/book3s/64/radix.h   | 11 +++--------
 arch/powerpc/include/asm/nohash/32/pgtable.h |  3 ++-
 arch/powerpc/include/asm/nohash/64/pgtable.h |  3 ++-
 arch/powerpc/mm/pgtable-book3s64.c           |  3 ++-
 arch/powerpc/mm/pgtable.c                    |  2 +-
 7 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 6b8b2d57fdc8..dc58980f3ad9 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -224,7 +224,8 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 
 
 static inline void __ptep_set_access_flags(struct mm_struct *mm,
-					   pte_t *ptep, pte_t entry)
+					   pte_t *ptep, pte_t entry,
+					   unsigned long address)
 {
 	unsigned long set = pte_val(entry) &
 		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 6f39b9d134a2..696a17c561bf 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -589,10 +589,11 @@ static inline bool check_pte_access(unsigned long access, unsigned long ptev)
  */
 
 static inline void __ptep_set_access_flags(struct mm_struct *mm,
-					   pte_t *ptep, pte_t entry)
+					   pte_t *ptep, pte_t entry,
+					   unsigned long address)
 {
 	if (radix_enabled())
-		return radix__ptep_set_access_flags(mm, ptep, entry);
+		return radix__ptep_set_access_flags(mm, ptep, entry, address);
 	return hash__ptep_set_access_flags(ptep, entry);
 }
 
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index d2c5c064e266..7f31de4fb454 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -167,7 +167,8 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
  * function doesn't need to invalidate tlb.
  */
 static inline void radix__ptep_set_access_flags(struct mm_struct *mm,
-						pte_t *ptep, pte_t entry)
+						pte_t *ptep, pte_t entry,
+						unsigned long address)
 {
 
 	unsigned long set = pte_val(entry) & (_PAGE_DIRTY | _PAGE_ACCESSED |
@@ -183,13 +184,7 @@ static inline void radix__ptep_set_access_flags(struct mm_struct *mm,
 		 * new value of pte
 		 */
 		new_pte = old_pte | set;
-
-		/*
-		 * For now let's do heavy pid flush
-		 * radix__flush_tlb_page_psize(mm, addr, mmu_virtual_psize);
-		 */
-		radix__flush_tlb_mm(mm);
-
+		radix__flush_tlb_pte(old_pte, mm, address);
 		__radix_pte_update(ptep, 0, new_pte);
 	} else
 		__radix_pte_update(ptep, 0, set);
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index c219ef7be53b..65073fbc6707 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -268,7 +268,8 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 
 
 static inline void __ptep_set_access_flags(struct mm_struct *mm,
-					   pte_t *ptep, pte_t entry)
+					   pte_t *ptep, pte_t entry,
+					   unsigned long address)
 {
 	unsigned long set = pte_val(entry) &
 		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 653a1838469d..ea1c0123b85c 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -301,7 +301,8 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
  * function doesn't need to flush the hash entry
  */
 static inline void __ptep_set_access_flags(struct mm_struct *mm,
-					   pte_t *ptep, pte_t entry)
+					   pte_t *ptep, pte_t entry,
+					   unsigned long address)
 {
 	unsigned long bits = pte_val(entry) &
 		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index f4f437cbabf1..ebf9782bacf9 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -35,7 +35,8 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 #endif
 	changed = !pmd_same(*(pmdp), entry);
 	if (changed) {
-		__ptep_set_access_flags(vma->vm_mm, pmdp_ptep(pmdp), pmd_pte(entry));
+		__ptep_set_access_flags(vma->vm_mm, pmdp_ptep(pmdp),
+					pmd_pte(entry), address);
 		flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
 	}
 	return changed;
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 911fdfb63ec1..cb39c8bd2436 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -224,7 +224,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 	if (changed) {
 		if (!is_vm_hugetlb_page(vma))
 			assert_pte_locked(vma->vm_mm, address);
-		__ptep_set_access_flags(vma->vm_mm, ptep, entry);
+		__ptep_set_access_flags(vma->vm_mm, ptep, entry, address);
 		flush_tlb_page(vma, address);
 	}
 	return changed;
-- 
2.10.2

* [PATCH v7 6/7] powerpc/mm: update radix__pte_update to not do full mm tlb flush
From: Aneesh Kumar K.V @ 2016-11-28  6:17 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

When we are updating a pte, we just need to flush the tlb mapping for
that pte. Right now we do a full mm flush because we don't track the page
size. Now that we have the page size details in the pte, use them to do
the optimized flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/radix.h | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 7f31de4fb454..4e88178cf03b 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -145,13 +145,7 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
 		 * new value of pte
 		 */
 		new_pte = (old_pte | set) & ~clr;
-
-		/*
-		 * For now let's do heavy pid flush
-		 * radix__flush_tlb_page_psize(mm, addr, mmu_virtual_psize);
-		 */
-		radix__flush_tlb_mm(mm);
-
+		radix__flush_tlb_pte(old_pte, mm, addr);
 		__radix_pte_update(ptep, 0, new_pte);
 	} else
 		old_pte = __radix_pte_update(ptep, clr, set);
-- 
2.10.2

* [PATCH v7 7/7] powerpc/mm: Batch tlb flush when invalidating pte entries
From: Aneesh Kumar K.V @ 2016-11-28  6:17 UTC
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This will improve the task exit case by batching tlb invalidates.
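
A sketch of the expected exit-time path (for illustration only; the exact
call chain is an assumption based on the generic unmap code):

	/*
	 * exit_mmap()
	 *   -> unmap_vmas() / zap_pte_range()
	 *      -> ptep_get_and_clear_full()
	 *         -> radix__pte_update(mm, addr, ptep, ~0ul, 0, ...)
	 *            new_pte == 0, so the per-pte ptesync + flush is skipped
	 *   -> tlb_finish_mmu()   one batched tlb flush for the whole range
	 */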

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/radix.h | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 4e88178cf03b..20a4eb4d065a 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -140,13 +140,20 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
 		unsigned long new_pte;
 
 		old_pte = __radix_pte_update(ptep, ~0, 0);
-		asm volatile("ptesync" : : : "memory");
 		/*
 		 * new value of pte
 		 */
 		new_pte = (old_pte | set) & ~clr;
-		radix__flush_tlb_pte(old_pte, mm, addr);
-		__radix_pte_update(ptep, 0, new_pte);
+		/*
+		 * If we are trying to clear the pte, we can skip
+		 * the below sequence and batch the tlb flush. The
+		 * tlb flush batching is done by mmu gather code
+		 */
+		if (new_pte) {
+			asm volatile("ptesync" : : : "memory");
+			radix__flush_tlb_pte(old_pte, mm, addr);
+			__radix_pte_update(ptep, 0, new_pte);
+		}
 	} else
 		old_pte = __radix_pte_update(ptep, clr, set);
 	asm volatile("ptesync" : : : "memory");
-- 
2.10.2

* Re: [PATCH v7 4/7] powerpc/mm: Add radix__tlb_flush_pte
From: Michael Ellerman @ 2016-11-28 11:42 UTC
  To: Aneesh Kumar K.V, benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
> index 3493cf4e0452..7648952e4f08 100644
> --- a/arch/powerpc/mm/tlb-radix.c
> +++ b/arch/powerpc/mm/tlb-radix.c
> @@ -428,3 +428,21 @@ void radix__flush_tlb_all(void)
>  		     : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(0) : "memory");
>  	asm volatile("eieio; tlbsync; ptesync": : :"memory");
>  }
> +
> +void radix__flush_tlb_pte(unsigned long old_pte, struct mm_struct *mm,
> +			  unsigned long address)
> +{
> +	/*
> +	 * We track page size in pte only for DD1, So we can
> +	 * call this only on DD1.
> +	 */
> +	if (!cpu_has_feature(CPU_FTR_POWER9_DD1)) {
> +		VM_WARN_ON(1);
> +		return;
> +	}

That's a bit gross but I guess it's OK.

How about we give the function a name that makes it obvious as well?

Like radix__flush_tlb_pte_p9_dd1() - ugly but unlikely anyone would call
it by accident outside of a workaround.
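
Something like this, keeping the existing signature (a sketch of the
suggested rename only):

	extern void radix__flush_tlb_pte_p9_dd1(unsigned long old_pte,
						struct mm_struct *mm,
						unsigned long address);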

cheers

* Re: [PATCH v7 0/7] Radix pte update tlbflush optimizations.
From: Balbir Singh @ 2016-11-29 11:32 UTC
  To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev



On 28/11/16 17:16, Aneesh Kumar K.V wrote:
> Changes from v6:
> * restrict the new pte bit to radix and DD1 config
> 
> Changes from V5:
> Switch to use pte bits to track page size.
> 
> 

This series looks much better. I wish there was a better way of avoiding
having to pass the address to the ptep function, but I guess we get to
live with it forever.

Balbir Singh.

* Re: [PATCH v7 1/7] powerpc/mm: Rename hugetlb-radix.h to hugetlb.h
From: Balbir Singh @ 2016-11-29 11:33 UTC
  To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev



On 28/11/16 17:16, Aneesh Kumar K.V wrote:
> We will start moving some book3s-specific hugetlb functions there.

You mean for both radix and hash, right?

Balbir

* Re: [PATCH v7 2/7] powerpc/mm/hugetlb: Handle hugepage size supported by hash config
From: Balbir Singh @ 2016-11-29 11:35 UTC
  To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev



On 28/11/16 17:16, Aneesh Kumar K.V wrote:
> W.r.t. the hash page table config, we support 16MB and 16GB as hugepage
> sizes. Update hstate_get_psize() to handle 16M and 16G.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hugetlb.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
> index 499268045306..d9c283f95e05 100644
> --- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
> +++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
> @@ -21,6 +21,10 @@ static inline int hstate_get_psize(struct hstate *hstate)
>  		return MMU_PAGE_2M;
>  	else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
>  		return MMU_PAGE_1G;
> +	else if (shift == mmu_psize_defs[MMU_PAGE_16M].shift)
> +		return MMU_PAGE_16M;
> +	else if (shift == mmu_psize_defs[MMU_PAGE_16G].shift)
> +		return MMU_PAGE_16G;

Could we reorder this?

We check for 2M, 1G, 16M and 16G. The likely sizes are
2M and 16M. Can we have those up front so that the order of checks
is 2M, 16M, 1G and 16G?
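
A sketch of the reordered sequence (the same four checks, reordered):

	if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
		return MMU_PAGE_2M;
	else if (shift == mmu_psize_defs[MMU_PAGE_16M].shift)
		return MMU_PAGE_16M;
	else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
		return MMU_PAGE_1G;
	else if (shift == mmu_psize_defs[MMU_PAGE_16G].shift)
		return MMU_PAGE_16G;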

Balbir

* Re: [v7,1/7] powerpc/mm: Rename hugetlb-radix.h to hugetlb.h
From: Michael Ellerman @ 2016-11-29 12:58 UTC
  To: Aneesh Kumar K.V, benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V

On Mon, 2016-11-28 at 06:16:58 UTC, "Aneesh Kumar K.V" wrote:
> We will start moving some book3s-specific hugetlb functions there.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/bee8b3b56d1dfc4075254a61340ee3

cheers

* Re: [PATCH v7 3/7] powerpc/mm: Introduce _PAGE_LARGE software pte bits
From: Balbir Singh @ 2016-11-30  0:14 UTC
  To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev



On 28/11/16 17:17, Aneesh Kumar K.V wrote:
> This patch adds a new software-defined pte bit. We use the reserved
> fields of the ISA 3.0 pte definition, since we will only be using this
> on DD1 code paths. We can possibly look at removing this code later.
> 
> The software bit will be used to differentiate between 64K/4K and 2M ptes.
> This helps in finding the page size mapped by a pte, so that we can do an
> efficient tlb flush.
> 
> We don't support 1G hugetlb pages yet, so we add a debug-only VM_WARN_ON
> to catch wrong usage.
> 

I thought we do support it; in hugetlb_page_init(), don't we register sizes
for every size from 0 to MMU_PAGE_COUNT?

> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hugetlb.h | 20 ++++++++++++++++++++
>  arch/powerpc/include/asm/book3s/64/pgtable.h |  9 +++++++++
>  arch/powerpc/include/asm/book3s/64/radix.h   |  2 ++
>  3 files changed, 31 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
> index d9c283f95e05..c62f14d0bec1 100644
> --- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
> +++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
> @@ -30,4 +30,24 @@ static inline int hstate_get_psize(struct hstate *hstate)
>  		return mmu_virtual_psize;
>  	}
>  }
> +
> +#define arch_make_huge_pte arch_make_huge_pte
> +static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
> +				       struct page *page, int writable)
> +{
> +	unsigned long page_shift;
> +
> +	if (!cpu_has_feature(CPU_FTR_POWER9_DD1))
> +		return entry;
> +
> +	page_shift = huge_page_shift(hstate_vma(vma));
> +	/*
> +	 * We don't support 1G hugetlb pages yet.
> +	 */
> +	VM_WARN_ON(page_shift == mmu_psize_defs[MMU_PAGE_1G].shift);
> +	if (page_shift == mmu_psize_defs[MMU_PAGE_2M].shift)
> +		return __pte(pte_val(entry) | _PAGE_LARGE);
> +	else
> +		return entry;
> +}
>  #endif
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 86870c11917b..6f39b9d134a2 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -26,6 +26,11 @@
>  #define _RPAGE_SW1		0x00800
>  #define _RPAGE_SW2		0x00400
>  #define _RPAGE_SW3		0x00200
> +#define _RPAGE_RSV1		0x1000000000000000UL
> +#define _RPAGE_RSV2		0x0800000000000000UL
> +#define _RPAGE_RSV3		0x0400000000000000UL
> +#define _RPAGE_RSV4		0x0200000000000000UL
> +

We use the top 4 bits and not the _SW bits?

>  #ifdef CONFIG_MEM_SOFT_DIRTY
>  #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
>  #else
> @@ -34,6 +39,10 @@
>  #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
>  #define _PAGE_DEVMAP		_RPAGE_SW1
>  #define __HAVE_ARCH_PTE_DEVMAP
> +/*
> + * For DD1 only, we need to track whether the pte huge

For POWER9_DD1 only

> + */
> +#define _PAGE_LARGE	_RPAGE_RSV1
>  
>  
>  #define _PAGE_PTE		(1ul << 62)	/* distinguishes PTEs from pointers */
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index 2a46dea8e1b1..d2c5c064e266 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -243,6 +243,8 @@ static inline int radix__pmd_trans_huge(pmd_t pmd)
>  
>  static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
>  {
> +	if (cpu_has_feature(CPU_FTR_POWER9_DD1))
> +		return __pmd(pmd_val(pmd) | _PAGE_PTE | _PAGE_LARGE);
>  	return __pmd(pmd_val(pmd) | _PAGE_PTE);
>  }
>  static inline void radix__pmdp_huge_split_prepare(struct vm_area_struct *vma,
> 

* Re: [PATCH v7 3/7] powerpc/mm: Introduce _PAGE_LARGE software pte bits
From: Benjamin Herrenschmidt @ 2016-11-30  0:35 UTC
  To: Balbir Singh, Aneesh Kumar K.V, paulus, mpe; +Cc: linuxppc-dev

On Wed, 2016-11-30 at 11:14 +1100, Balbir Singh wrote:
> > +#define _RPAGE_RSV1          0x1000000000000000UL
> > +#define _RPAGE_RSV2          0x0800000000000000UL
> > +#define _RPAGE_RSV3          0x0400000000000000UL
> > +#define _RPAGE_RSV4          0x0200000000000000UL
> > +
> 
> We use the top 4 bits and not the _SW bits?

Correct, welcome to the discussion we've been having for the last 2 weeks
:-)

We use those bits because we are otherwise short on SW bits (we still
need _PAGE_DEVMAP etc.). We know P9 DD1 is supposed to ignore the
reserved bits, so it's a good placeholder.
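
For reference, those masks are single bits at positions 57-60 of the pte,
just below _PAGE_PTE at bit 62; a sketch equivalent to the constants in
the patch:

	#define _RPAGE_RSV1	(1UL << 60)	/* 0x1000000000000000UL */
	#define _RPAGE_RSV2	(1UL << 59)	/* 0x0800000000000000UL */
	#define _RPAGE_RSV3	(1UL << 58)	/* 0x0400000000000000UL */
	#define _RPAGE_RSV4	(1UL << 57)	/* 0x0200000000000000UL */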

Cheers,
Ben.

* Re: [PATCH v7 3/7] powerpc/mm: Introduce _PAGE_LARGE software pte bits
From: Balbir Singh @ 2016-11-30  0:50 UTC
  To: Benjamin Herrenschmidt, Aneesh Kumar K.V, paulus, mpe; +Cc: linuxppc-dev



On 30/11/16 11:35, Benjamin Herrenschmidt wrote:
> On Wed, 2016-11-30 at 11:14 +1100, Balbir Singh wrote:
>>> +#define _RPAGE_RSV1          0x1000000000000000UL
>>> +#define _RPAGE_RSV2          0x0800000000000000UL
>>> +#define _RPAGE_RSV3          0x0400000000000000UL
>>> +#define _RPAGE_RSV4          0x0200000000000000UL
>>> +
>>
>> We use the top 4 bits and not the _SW bits?
> 
> Correct, welcome to the discussion we've been having for the last 2 weeks
> :-)
> 

I thought we were following Paul's suggestion here

https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-November/151620.html
and I also noticed
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-November/151624.html

My bad, I thought we had two SW bits to use for DD1.

Balbir Singh.

* Re: [PATCH v7 0/7] Radix pte update tlbflush optimizations.
From: Michael Ellerman @ 2016-11-30  4:30 UTC
  To: Balbir Singh, Aneesh Kumar K.V, benh, paulus; +Cc: linuxppc-dev

Balbir Singh <bsingharora@gmail.com> writes:

> On 28/11/16 17:16, Aneesh Kumar K.V wrote:
>> Changes from v6:
>> * restrict the new pte bit to radix and DD1 config
>> 
>> Changes from V5:
>> Switch to use pte bits to track page size.
>
> This series looks much better. I wish there was a better way of avoiding
> having to pass the address to the ptep function, but I guess we get to
> live with it forever.

No, we can always revert it when P9 DD1 is dead and buried.

cheers
