linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 1/4] MIPS: Do not flush tlb page when updating PTE entry
@ 2020-05-19 10:03 Bibo Mao
  2020-05-19 10:03 ` [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Bibo Mao @ 2020-05-19 10:03 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Andrew Morton,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual
  Cc: linux-mips, linux-kernel, Mike Rapoport, Sergei Shtylyov,
	Maciej W. Rozycki, linux-mm, David Hildenbrand

It is not necessary to flush tlb page on all CPUs if suitable PTE
entry exists already during page fault handling, just updating
TLB is fine.

Here redefine flush_tlb_fix_spurious_fault as empty on MIPS system.
V4:
- add pte_sw_mkyoung function to implement readable privilege, and
  this function is  only in effect on MIPS system.
- add page valid bit judgement in function pte_modify
V3:
- add detailed changelog, modify typo issue in patch V2
v2:
- split flush_tlb_fix_spurious_fault and tlb update into two patches
- comments typo modification
- separate tlb update and add pte readable privilege into two patches

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/mips/include/asm/pgtable.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 9b01d2d..0d625c2 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -478,6 +478,8 @@ static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
 	return __pgprot(prot);
 }
 
+#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
+
 /*
  * Conversion functions: convert a page and protection to a page entry,
  * and a page entry and page directory to the page they refer to.
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists
  2020-05-19 10:03 [PATCH v4 1/4] MIPS: Do not flush tlb page when updating PTE entry Bibo Mao
@ 2020-05-19 10:03 ` Bibo Mao
  2020-05-20  1:26   ` Andrew Morton
  2020-05-19 10:03 ` [PATCH v4 3/4] mm/memory.c: Add memory read privilege on page fault handling Bibo Mao
  2020-05-19 10:03 ` [PATCH v4 4/4] MIPS: mm: add page valid judgement in function pte_modify Bibo Mao
  2 siblings, 1 reply; 9+ messages in thread
From: Bibo Mao @ 2020-05-19 10:03 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Andrew Morton,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual
  Cc: linux-mips, linux-kernel, Mike Rapoport, Sergei Shtylyov,
	Maciej W. Rozycki, linux-mm, David Hildenbrand

If two threads concurrently fault at the same address, the thread that
won the race updates the PTE and its local TLB. For now, the other
thread gives up, simply does nothing, and continues.

It could happen that this second thread triggers another fault, whereby
it only updates its local TLB while handling the fault. Instead of
triggering another fault, let's directly update the local TLB of the
second thread.

It is only useful to architectures where software can update TLB, it may
bring out some negative effect if update_mmu_cache is used for other
purpose also. It seldom happens where multiple threads access the same
page at the same time, so the negative effect is limited on other arches.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 mm/memory.c | 44 +++++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index f703fe8..2eb59a9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1770,8 +1770,8 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			}
 			entry = pte_mkyoung(*pte);
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-			if (ptep_set_access_flags(vma, addr, pte, entry, 1))
-				update_mmu_cache(vma, addr, pte);
+			ptep_set_access_flags(vma, addr, pte, entry, 1);
+			update_mmu_cache(vma, addr, pte);
 		}
 		goto out_unlock;
 	}
@@ -2436,17 +2436,16 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 			/*
 			 * Other thread has already handled the fault
-			 * and we don't need to do anything. If it's
-			 * not the case, the fault will be triggered
-			 * again on the same address.
+			 * and update local tlb only
 			 */
+			update_mmu_cache(vma, addr, vmf->pte);
 			ret = false;
 			goto pte_unlock;
 		}
 
 		entry = pte_mkyoung(vmf->orig_pte);
-		if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
-			update_mmu_cache(vma, addr, vmf->pte);
+		ptep_set_access_flags(vma, addr, vmf->pte, entry, 0);
+		update_mmu_cache(vma, addr, vmf->pte);
 	}
 
 	/*
@@ -2463,7 +2462,8 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
 		locked = true;
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
-			/* The PTE changed under us. Retry page fault. */
+			/* The PTE changed under us, update local tlb */
+			update_mmu_cache(vma, addr, vmf->pte);
 			ret = false;
 			goto pte_unlock;
 		}
@@ -2618,8 +2618,8 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = pte_mkyoung(vmf->orig_pte);
 	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
-		update_mmu_cache(vma, vmf->address, vmf->pte);
+	ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1);
+	update_mmu_cache(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 }
 
@@ -2752,6 +2752,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		new_page = old_page;
 		page_copied = 1;
 	} else {
+		update_mmu_cache(vma, vmf->address, vmf->pte);
 		mem_cgroup_cancel_charge(new_page, memcg, false);
 	}
 
@@ -2812,6 +2813,7 @@ vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf)
 	 * pte_offset_map_lock.
 	 */
 	if (!pte_same(*vmf->pte, vmf->orig_pte)) {
+		update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		return VM_FAULT_NOPAGE;
 	}
@@ -2936,6 +2938,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 					vmf->address, &vmf->ptl);
 			if (!pte_same(*vmf->pte, vmf->orig_pte)) {
+				update_mmu_cache(vma, vmf->address, vmf->pte);
 				unlock_page(vmf->page);
 				pte_unmap_unlock(vmf->pte, vmf->ptl);
 				put_page(vmf->page);
@@ -3341,8 +3344,10 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 						vma->vm_page_prot));
 		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 				vmf->address, &vmf->ptl);
-		if (!pte_none(*vmf->pte))
+		if (!pte_none(*vmf->pte)) {
+			update_mmu_cache(vma, vmf->address, vmf->pte);
 			goto unlock;
+		}
 		ret = check_stable_address_space(vma->vm_mm);
 		if (ret)
 			goto unlock;
@@ -3378,8 +3383,10 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
 			&vmf->ptl);
-	if (!pte_none(*vmf->pte))
+	if (!pte_none(*vmf->pte)) {
+		update_mmu_cache(vma, vmf->address, vmf->pte);
 		goto release;
+	}
 
 	ret = check_stable_address_space(vma->vm_mm);
 	if (ret)
@@ -3646,8 +3653,10 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 	}
 
 	/* Re-check under ptl */
-	if (unlikely(!pte_none(*vmf->pte)))
+	if (unlikely(!pte_none(*vmf->pte))) {
+		update_mmu_cache(vma, vmf->address, vmf->pte);
 		return VM_FAULT_NOPAGE;
+	}
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
@@ -4224,18 +4233,18 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
 	spin_lock(vmf->ptl);
 	entry = vmf->orig_pte;
-	if (unlikely(!pte_same(*vmf->pte, entry)))
+	if (unlikely(!pte_same(*vmf->pte, entry))) {
+		update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
 		goto unlock;
+	}
 	if (vmf->flags & FAULT_FLAG_WRITE) {
 		if (!pte_write(entry))
 			return do_wp_page(vmf);
 		entry = pte_mkdirty(entry);
 	}
 	entry = pte_mkyoung(entry);
-	if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
+	if (!ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
 				vmf->flags & FAULT_FLAG_WRITE)) {
-		update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
-	} else {
 		/*
 		 * This is needed only for protection faults but the arch code
 		 * is not yet telling us if this is a protection fault or not.
@@ -4245,6 +4254,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		if (vmf->flags & FAULT_FLAG_WRITE)
 			flush_tlb_fix_spurious_fault(vmf->vma, vmf->address);
 	}
+	update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
 unlock:
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	return 0;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 3/4] mm/memory.c: Add memory read privilege on page fault handling
  2020-05-19 10:03 [PATCH v4 1/4] MIPS: Do not flush tlb page when updating PTE entry Bibo Mao
  2020-05-19 10:03 ` [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
@ 2020-05-19 10:03 ` Bibo Mao
  2020-05-20  1:30   ` Andrew Morton
  2020-05-19 10:03 ` [PATCH v4 4/4] MIPS: mm: add page valid judgement in function pte_modify Bibo Mao
  2 siblings, 1 reply; 9+ messages in thread
From: Bibo Mao @ 2020-05-19 10:03 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Andrew Morton,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual
  Cc: linux-mips, linux-kernel, Mike Rapoport, Sergei Shtylyov,
	Maciej W. Rozycki, linux-mm, David Hildenbrand

Here add pte_sw_mkyoung function to make page readable on MIPS
platform during page fault handling. This patch improves page
fault latency about 10% on my MIPS machine with lmbench
lat_pagefault case.

It is noop function on other arches, there is no negative
influence on those architectures.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/mips/include/asm/pgtable.h |  2 ++
 include/asm-generic/pgtable.h   | 15 +++++++++++++++
 mm/memory.c                     |  3 +++
 3 files changed, 20 insertions(+)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 0d625c2..755d534 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -414,6 +414,8 @@ static inline pte_t pte_mkyoung(pte_t pte)
 	return pte;
 }
 
+#define pte_sw_mkyoung	pte_mkyoung
+
 #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
 static inline int pte_huge(pte_t pte)	{ return pte_val(pte) & _PAGE_HUGE; }
 
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 329b8c8..2542ef1 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -227,6 +227,21 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
 }
 #endif
 
+/*
+ * On some architectures hardware does not set page access bit when accessing
+ * memory page, it is responsibilty of software setting this bit. It brings
+ * out extra page fault penalty to track page access bit. For optimization page
+ * access bit can be set during all page fault flow on these arches.
+ * To be differentiate with macro pte_mkyoung, this macro is used on platforms
+ * where software maintains page access bit.
+ */
+#ifndef pte_sw_mkyoung
+static inline pte_t pte_sw_mkyoung(pte_t pte)
+{
+	return pte;
+}
+#endif
+
 #ifndef pte_savedwrite
 #define pte_savedwrite pte_write
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index 2eb59a9..d9700b1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2704,6 +2704,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		}
 		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 		entry = mk_pte(new_page, vma->vm_page_prot);
+		entry = pte_sw_mkyoung(entry);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 		/*
 		 * Clear the pte entry and flush it first, before updating the
@@ -3378,6 +3379,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	__SetPageUptodate(page);
 
 	entry = mk_pte(page, vma->vm_page_prot);
+	entry = pte_sw_mkyoung(entry);
 	if (vma->vm_flags & VM_WRITE)
 		entry = pte_mkwrite(pte_mkdirty(entry));
 
@@ -3660,6 +3662,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
+	entry = pte_sw_mkyoung(entry);
 	if (write)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/* copy-on-write page */
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 4/4] MIPS: mm: add page valid judgement in function pte_modify
  2020-05-19 10:03 [PATCH v4 1/4] MIPS: Do not flush tlb page when updating PTE entry Bibo Mao
  2020-05-19 10:03 ` [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
  2020-05-19 10:03 ` [PATCH v4 3/4] mm/memory.c: Add memory read privilege on page fault handling Bibo Mao
@ 2020-05-19 10:03 ` Bibo Mao
  2 siblings, 0 replies; 9+ messages in thread
From: Bibo Mao @ 2020-05-19 10:03 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Andrew Morton,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual
  Cc: linux-mips, linux-kernel, Mike Rapoport, Sergei Shtylyov,
	Maciej W. Rozycki, linux-mm, David Hildenbrand

If original PTE has _PAGE_ACCESSED bit set, and new pte has no
_PAGE_NO_READ bit set, we can add _PAGE_SILENT_READ bit to enable
page valid bit.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/mips/include/asm/pgtable.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 755d534..4f6eb56 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -509,8 +509,11 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 #else
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-	return __pte((pte_val(pte) & _PAGE_CHG_MASK) |
-		     (pgprot_val(newprot) & ~_PAGE_CHG_MASK));
+	pte_val(pte) &= _PAGE_CHG_MASK;
+	pte_val(pte) |= pgprot_val(newprot) & ~_PAGE_CHG_MASK;
+	if ((pte_val(pte) & _PAGE_ACCESSED) && !(pte_val(pte) & _PAGE_NO_READ))
+		pte_val(pte) |= _PAGE_SILENT_READ;
+	return pte;
 }
 #endif
 
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists
  2020-05-19 10:03 ` [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
@ 2020-05-20  1:26   ` Andrew Morton
  2020-05-20  6:39     ` maobibo
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2020-05-20  1:26 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand

On Tue, 19 May 2020 18:03:28 +0800 Bibo Mao <maobibo@loongson.cn> wrote:

> If two threads concurrently fault at the same address, the thread that
> won the race updates the PTE and its local TLB. For now, the other
> thread gives up, simply does nothing, and continues.
> 
> It could happen that this second thread triggers another fault, whereby
> it only updates its local TLB while handling the fault. Instead of
> triggering another fault, let's directly update the local TLB of the
> second thread.
> 
> It is only useful to architectures where software can update TLB, it may
> bring out some negative effect if update_mmu_cache is used for other
> purpose also. It seldom happens where multiple threads access the same
> page at the same time, so the negative effect is limited on other arches.

I'm still worried about the impact on other architectures.  The
additional update_mmu_cache() calls won't occur only when multiple
threads are racing against the same page, I think?  For example,
insert_pfn() will do this when making a read-only page a writable one.

Would you have time to add some instrumentation into update_mmu_cache()
(maybe a tracepoint) and see what effect this change has upon the
frequency at which update_mmu_cache() is called for a selection of
workloads?  And add this info to the changelog to set minds at ease?



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 3/4] mm/memory.c: Add memory read privilege on page fault handling
  2020-05-19 10:03 ` [PATCH v4 3/4] mm/memory.c: Add memory read privilege on page fault handling Bibo Mao
@ 2020-05-20  1:30   ` Andrew Morton
  2020-05-20  8:22     ` maobibo
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2020-05-20  1:30 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand

On Tue, 19 May 2020 18:03:29 +0800 Bibo Mao <maobibo@loongson.cn> wrote:

> Here add pte_sw_mkyoung function to make page readable on MIPS
> platform during page fault handling. This patch improves page
> fault latency about 10% on my MIPS machine with lmbench
> lat_pagefault case.
> 
> It is noop function on other arches, there is no negative
> influence on those architectures.
> 
> --- a/arch/mips/include/asm/pgtable.h
> +++ b/arch/mips/include/asm/pgtable.h
> @@ -414,6 +414,8 @@ static inline pte_t pte_mkyoung(pte_t pte)
>  	return pte;
>  }
>  
> +#define pte_sw_mkyoung	pte_mkyoung
> +
>  #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
>  static inline int pte_huge(pte_t pte)	{ return pte_val(pte) & _PAGE_HUGE; }
>  
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -227,6 +227,21 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
>  }
>  #endif
>  
> +/*
> + * On some architectures hardware does not set page access bit when accessing
> + * memory page, it is responsibilty of software setting this bit. It brings
> + * out extra page fault penalty to track page access bit. For optimization page
> + * access bit can be set during all page fault flow on these arches.
> + * To be differentiate with macro pte_mkyoung, this macro is used on platforms
> + * where software maintains page access bit.
> + */
> +#ifndef pte_sw_mkyoung
> +static inline pte_t pte_sw_mkyoung(pte_t pte)
> +{
> +	return pte;
> +}
> +#endif

Yup, that's neat enough.  Thanks for making this change.  It looks like
all architectures include asm-generic/pgtable.h so that's fine.

It's conventional to add a

#define pte_sw_mkyoung pte_sw_mkyoung

immediately above the #endif there, so we can't try to implement
pte_sw_mkyoung() twice if this header gets included twice.  But the
header has #ifndef _ASM_GENERIC_PGTABLE_H around the whole thing so
that should be OK.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists
  2020-05-20  1:26   ` Andrew Morton
@ 2020-05-20  6:39     ` maobibo
  2020-05-21  0:54       ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: maobibo @ 2020-05-20  6:39 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand



On 05/20/2020 09:26 AM, Andrew Morton wrote:
> On Tue, 19 May 2020 18:03:28 +0800 Bibo Mao <maobibo@loongson.cn> wrote:
> 
>> If two threads concurrently fault at the same address, the thread that
>> won the race updates the PTE and its local TLB. For now, the other
>> thread gives up, simply does nothing, and continues.
>>
>> It could happen that this second thread triggers another fault, whereby
>> it only updates its local TLB while handling the fault. Instead of
>> triggering another fault, let's directly update the local TLB of the
>> second thread.
>>
>> It is only useful to architectures where software can update TLB, it may
>> bring out some negative effect if update_mmu_cache is used for other
>> purpose also. It seldom happens where multiple threads access the same
>> page at the same time, so the negative effect is limited on other arches.
> 
> I'm still worried about the impact on other architectures.  The
> additional update_mmu_cache() calls won't occur only when multiple
> threads are racing against the same page, I think?  For example,
> insert_pfn() will do this when making a read-only page a writable one.
How about defining ptep_set_access_flags function like this on mips system?
which is the same on riscv platform.

static inline int ptep_set_access_flags(struct vm_area_struct *vma,
					unsigned long address, pte_t *ptep,
					pte_t entry, int dirty)
{
	if (!pte_same(*ptep, entry))
		set_pte_at(vma->vm_mm, address, ptep, entry);
	/*
	 * update_mmu_cache will unconditionally execute, handling both
	 * the case that the PTE changed and the spurious fault case.
	 */
	return true;
}

And keep the following piece of code unchanged, the change will be smaller.
@@ -1770,8 +1770,8 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
                        }
                        entry = pte_mkyoung(*pte);
                        entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-                       if (ptep_set_access_flags(vma, addr, pte, entry, 1))
-                               update_mmu_cache(vma, addr, pte);
+                       ptep_set_access_flags(vma, addr, pte, entry, 1);
+                       update_mmu_cache(vma, addr, pte);
                }

@@ -2436,17 +2436,16 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
                entry = pte_mkyoung(vmf->orig_pte);
-               if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
-                       update_mmu_cache(vma, addr, vmf->pte);
+               ptep_set_access_flags(vma, addr, vmf->pte, entry, 0);
+               update_mmu_cache(vma, addr, vmf->pte);
        }
@@ -2618,8 +2618,8 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
        flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
        entry = pte_mkyoung(vmf->orig_pte);
        entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-       if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
-               update_mmu_cache(vma, vmf->address, vmf->pte);
+       ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1);
+       update_mmu_cache(vma, vmf->address, vmf->pte);
        pte_unmap_unlock(vmf->pte, vmf->ptl);
 }



> 
> Would you have time to add some instrumentation into update_mmu_cache()
> (maybe a tracepoint) and see what effect this change has upon the
> frequency at which update_mmu_cache() is called for a selection of
> workloads?  And add this info to the changelog to set minds at ease?
OK, I will add some instrumentation data in the changelog.

 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 3/4] mm/memory.c: Add memory read privilege on page fault handling
  2020-05-20  1:30   ` Andrew Morton
@ 2020-05-20  8:22     ` maobibo
  0 siblings, 0 replies; 9+ messages in thread
From: maobibo @ 2020-05-20  8:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand



On 05/20/2020 09:30 AM, Andrew Morton wrote:
> On Tue, 19 May 2020 18:03:29 +0800 Bibo Mao <maobibo@loongson.cn> wrote:
> 
>> Here add pte_sw_mkyoung function to make page readable on MIPS
>> platform during page fault handling. This patch improves page
>> fault latency about 10% on my MIPS machine with lmbench
>> lat_pagefault case.
>>
>> It is noop function on other arches, there is no negative
>> influence on those architectures.
>>
>> --- a/arch/mips/include/asm/pgtable.h
>> +++ b/arch/mips/include/asm/pgtable.h
>> @@ -414,6 +414,8 @@ static inline pte_t pte_mkyoung(pte_t pte)
>>  	return pte;
>>  }
>>  
>> +#define pte_sw_mkyoung	pte_mkyoung
>> +
>>  #ifdef CONFIG_MIPS_HUGE_TLB_SUPPORT
>>  static inline int pte_huge(pte_t pte)	{ return pte_val(pte) & _PAGE_HUGE; }
>>  
>> --- a/include/asm-generic/pgtable.h
>> +++ b/include/asm-generic/pgtable.h
>> @@ -227,6 +227,21 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addres
>>  }
>>  #endif
>>  
>> +/*
>> + * On some architectures hardware does not set page access bit when accessing
>> + * memory page, it is responsibilty of software setting this bit. It brings
>> + * out extra page fault penalty to track page access bit. For optimization page
>> + * access bit can be set during all page fault flow on these arches.
>> + * To be differentiate with macro pte_mkyoung, this macro is used on platforms
>> + * where software maintains page access bit.
>> + */
>> +#ifndef pte_sw_mkyoung
>> +static inline pte_t pte_sw_mkyoung(pte_t pte)
>> +{
>> +	return pte;
>> +}
>> +#endif
> 
> Yup, that's neat enough.  Thanks for making this change.  It looks like
> all architectures include asm-generic/pgtable.h so that's fine.
> 
> It's conventional to add a
> 
> #define pte_sw_mkyoung pte_sw_mkyoung
> 
> immediately above the #endif there, so we can't try to implement
> pte_sw_mkyoung() twice if this header gets included twice.  But the
> header has #ifndef _ASM_GENERIC_PGTABLE_H around the whole thing so
> that should be OK.

Sure, will do, and thanks for your kindly help and guidance

 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists
  2020-05-20  6:39     ` maobibo
@ 2020-05-21  0:54       ` Andrew Morton
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew Morton @ 2020-05-21  0:54 UTC (permalink / raw)
  To: maobibo
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand

On Wed, 20 May 2020 14:39:13 +0800 maobibo <maobibo@loongson.cn> wrote:

> > I'm still worried about the impact on other architectures.  The
> > additional update_mmu_cache() calls won't occur only when multiple
> > threads are racing against the same page, I think?  For example,
> > insert_pfn() will do this when making a read-only page a writable one.
> How about defining ptep_set_access_flags function like this on mips system?
> which is the same on riscv platform.
> 
> static inline int ptep_set_access_flags(struct vm_area_struct *vma,
> 					unsigned long address, pte_t *ptep,
> 					pte_t entry, int dirty)
> {
> 	if (!pte_same(*ptep, entry))
> 		set_pte_at(vma->vm_mm, address, ptep, entry);
> 	/*
> 	 * update_mmu_cache will unconditionally execute, handling both
> 	 * the case that the PTE changed and the spurious fault case.
> 	 */
> 	return true;
> }
> 

hm, it seems a bit abusive - ptep_set_access_flags() is supposed to
return true if the pte changed, and that isn't the case here.

I suppose we could run update_mmu_cache() directly from
ptep_set_access_flags() if we're about to return false, but that
doesn't seem a lot nicer?

> > Would you have time to add some instrumentation into update_mmu_cache()
> > (maybe a tracepoint) and see what effect this change has upon the
> > frequency at which update_mmu_cache() is called for a selection of
> > workloads?  And add this info to the changelog to set minds at ease?
>
> OK, I will add some instrumentation data in the changelog.

Well, if this testing shows no effect as you expect, perhaps we can
leave the code as-is.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-05-21  0:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-19 10:03 [PATCH v4 1/4] MIPS: Do not flush tlb page when updating PTE entry Bibo Mao
2020-05-19 10:03 ` [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
2020-05-20  1:26   ` Andrew Morton
2020-05-20  6:39     ` maobibo
2020-05-21  0:54       ` Andrew Morton
2020-05-19 10:03 ` [PATCH v4 3/4] mm/memory.c: Add memory read privilege on page fault handling Bibo Mao
2020-05-20  1:30   ` Andrew Morton
2020-05-20  8:22     ` maobibo
2020-05-19 10:03 ` [PATCH v4 4/4] MIPS: mm: add page valid judgement in function pte_modify Bibo Mao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).