linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 1/3] MIPS: Do not flush tlb page when updating PTE entry
@ 2020-05-18  5:08 Bibo Mao
  2020-05-18  5:08 ` [PATCH v3 2/3] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
  2020-05-18  5:08 ` [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry Bibo Mao
  0 siblings, 2 replies; 8+ messages in thread
From: Bibo Mao @ 2020-05-18  5:08 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Andrew Morton,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual
  Cc: linux-mips, linux-kernel, Mike Rapoport, Sergei Shtylyov,
	Maciej W. Rozycki, linux-mm, David Hildenbrand

It is not necessary to flush tlb page on all CPUs if suitable PTE
entry exists already during page fault handling, just updating
TLB is fine.

Here redefine flush_tlb_fix_spurious_fault as empty on MIPS system.

Change in v2:
- split flush_tlb_fix_spurious_fault and tlb update into two patches
- comments typo modification
- separate tlb update and add pte readable privilege into two patches
Change in V3:
- add detailed changelog, modify typo issue in patch V2

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/mips/include/asm/pgtable.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 9b01d2d..0d625c2 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -478,6 +478,8 @@ static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
 	return __pgprot(prot);
 }
 
+#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
+
 /*
  * Conversion functions: convert a page and protection to a page entry,
  * and a page entry and page directory to the page they refer to.
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/3] mm/memory.c: Update local TLB if PTE entry exists
  2020-05-18  5:08 [PATCH v3 1/3] MIPS: Do not flush tlb page when updating PTE entry Bibo Mao
@ 2020-05-18  5:08 ` Bibo Mao
  2020-05-18  5:08 ` [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry Bibo Mao
  1 sibling, 0 replies; 8+ messages in thread
From: Bibo Mao @ 2020-05-18  5:08 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Andrew Morton,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual
  Cc: linux-mips, linux-kernel, Mike Rapoport, Sergei Shtylyov,
	Maciej W. Rozycki, linux-mm, David Hildenbrand

If two threads concurrently fault at the same address, the thread that
won the race updates the PTE and its local TLB. For now, the other
thread gives up, simply does nothing, and continues.

It could happen that this second thread triggers another fault, whereby
it only updates its local TLB while handling the fault. Instead of
triggering another fault, let's directly update the local TLB of the
second thread.

It is only useful to architectures where software can update TLB, it may
bring out some negative effect if update_mmu_cache is used for other
purpose also. It seldom happens where multiple threads access the same
page at the same time, so the negative effect is limited on other arches.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 mm/memory.c | 44 +++++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index f703fe8..2eb59a9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1770,8 +1770,8 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 			}
 			entry = pte_mkyoung(*pte);
 			entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-			if (ptep_set_access_flags(vma, addr, pte, entry, 1))
-				update_mmu_cache(vma, addr, pte);
+			ptep_set_access_flags(vma, addr, pte, entry, 1);
+			update_mmu_cache(vma, addr, pte);
 		}
 		goto out_unlock;
 	}
@@ -2436,17 +2436,16 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
 			/*
 			 * Other thread has already handled the fault
-			 * and we don't need to do anything. If it's
-			 * not the case, the fault will be triggered
-			 * again on the same address.
+			 * and update local tlb only
 			 */
+			update_mmu_cache(vma, addr, vmf->pte);
 			ret = false;
 			goto pte_unlock;
 		}
 
 		entry = pte_mkyoung(vmf->orig_pte);
-		if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
-			update_mmu_cache(vma, addr, vmf->pte);
+		ptep_set_access_flags(vma, addr, vmf->pte, entry, 0);
+		update_mmu_cache(vma, addr, vmf->pte);
 	}
 
 	/*
@@ -2463,7 +2462,8 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
 		locked = true;
 		if (!likely(pte_same(*vmf->pte, vmf->orig_pte))) {
-			/* The PTE changed under us. Retry page fault. */
+			/* The PTE changed under us, update local tlb */
+			update_mmu_cache(vma, addr, vmf->pte);
 			ret = false;
 			goto pte_unlock;
 		}
@@ -2618,8 +2618,8 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = pte_mkyoung(vmf->orig_pte);
 	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
-		update_mmu_cache(vma, vmf->address, vmf->pte);
+	ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1);
+	update_mmu_cache(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 }
 
@@ -2752,6 +2752,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		new_page = old_page;
 		page_copied = 1;
 	} else {
+		update_mmu_cache(vma, vmf->address, vmf->pte);
 		mem_cgroup_cancel_charge(new_page, memcg, false);
 	}
 
@@ -2812,6 +2813,7 @@ vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf)
 	 * pte_offset_map_lock.
 	 */
 	if (!pte_same(*vmf->pte, vmf->orig_pte)) {
+		update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
 		return VM_FAULT_NOPAGE;
 	}
@@ -2936,6 +2938,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
 			vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 					vmf->address, &vmf->ptl);
 			if (!pte_same(*vmf->pte, vmf->orig_pte)) {
+				update_mmu_cache(vma, vmf->address, vmf->pte);
 				unlock_page(vmf->page);
 				pte_unmap_unlock(vmf->pte, vmf->ptl);
 				put_page(vmf->page);
@@ -3341,8 +3344,10 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 						vma->vm_page_prot));
 		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
 				vmf->address, &vmf->ptl);
-		if (!pte_none(*vmf->pte))
+		if (!pte_none(*vmf->pte)) {
+			update_mmu_cache(vma, vmf->address, vmf->pte);
 			goto unlock;
+		}
 		ret = check_stable_address_space(vma->vm_mm);
 		if (ret)
 			goto unlock;
@@ -3378,8 +3383,10 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
 			&vmf->ptl);
-	if (!pte_none(*vmf->pte))
+	if (!pte_none(*vmf->pte)) {
+		update_mmu_cache(vma, vmf->address, vmf->pte);
 		goto release;
+	}
 
 	ret = check_stable_address_space(vma->vm_mm);
 	if (ret)
@@ -3646,8 +3653,10 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 	}
 
 	/* Re-check under ptl */
-	if (unlikely(!pte_none(*vmf->pte)))
+	if (unlikely(!pte_none(*vmf->pte))) {
+		update_mmu_cache(vma, vmf->address, vmf->pte);
 		return VM_FAULT_NOPAGE;
+	}
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
@@ -4224,18 +4233,18 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
 	spin_lock(vmf->ptl);
 	entry = vmf->orig_pte;
-	if (unlikely(!pte_same(*vmf->pte, entry)))
+	if (unlikely(!pte_same(*vmf->pte, entry))) {
+		update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
 		goto unlock;
+	}
 	if (vmf->flags & FAULT_FLAG_WRITE) {
 		if (!pte_write(entry))
 			return do_wp_page(vmf);
 		entry = pte_mkdirty(entry);
 	}
 	entry = pte_mkyoung(entry);
-	if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
+	if (!ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
 				vmf->flags & FAULT_FLAG_WRITE)) {
-		update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
-	} else {
 		/*
 		 * This is needed only for protection faults but the arch code
 		 * is not yet telling us if this is a protection fault or not.
@@ -4245,6 +4254,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		if (vmf->flags & FAULT_FLAG_WRITE)
 			flush_tlb_fix_spurious_fault(vmf->vma, vmf->address);
 	}
+	update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
 unlock:
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 	return 0;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry
  2020-05-18  5:08 [PATCH v3 1/3] MIPS: Do not flush tlb page when updating PTE entry Bibo Mao
  2020-05-18  5:08 ` [PATCH v3 2/3] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
@ 2020-05-18  5:08 ` Bibo Mao
  2020-05-18 20:57   ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Bibo Mao @ 2020-05-18  5:08 UTC (permalink / raw)
  To: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Andrew Morton,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual
  Cc: linux-mips, linux-kernel, Mike Rapoport, Sergei Shtylyov,
	Maciej W. Rozycki, linux-mm, David Hildenbrand

On mips platform, hw PTE entry valid bit is set in pte_mkyoung
function, it is used to set physical page with readable privilege.

Here add pte_mkyoung function to make page readable on MIPS platform
during page fault handling. This patch improves page fault latency
about 10% on my MIPS machine with lmbench lat_pagefault case.

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 mm/memory.c   | 3 +++
 mm/mprotect.c | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 2eb59a9..2399dcb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2704,6 +2704,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		}
 		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 		entry = mk_pte(new_page, vma->vm_page_prot);
+		entry = pte_mkyoung(entry);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 		/*
 		 * Clear the pte entry and flush it first, before updating the
@@ -3378,6 +3379,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	__SetPageUptodate(page);
 
 	entry = mk_pte(page, vma->vm_page_prot);
+	entry = pte_mkyoung(entry);
 	if (vma->vm_flags & VM_WRITE)
 		entry = pte_mkwrite(pte_mkdirty(entry));
 
@@ -3660,6 +3662,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 
 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
+	entry = pte_mkyoung(entry);
 	if (write)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 	/* copy-on-write page */
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 494192ca..673f1cd 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -131,6 +131,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				ptent = pte_clear_uffd_wp(ptent);
 			}
 
+			if (vma->vm_flags & VM_READ)
+				ptent = pte_mkyoung(ptent);
 			/* Avoid taking write faults for known dirty pages */
 			if (dirty_accountable && pte_dirty(ptent) &&
 					(pte_soft_dirty(ptent) ||
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry
  2020-05-18  5:08 ` [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry Bibo Mao
@ 2020-05-18 20:57   ` Andrew Morton
  2020-05-19  3:22     ` maobibo
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2020-05-18 20:57 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand

On Mon, 18 May 2020 13:08:49 +0800 Bibo Mao <maobibo@loongson.cn> wrote:

> On mips platform, hw PTE entry valid bit is set in pte_mkyoung
> function, it is used to set physical page with readable privilege.

pte_mkyoung() seems to be a strange place to set the pte's valid bit. 
Why is it done there?  Can it be done within mips's mk_pte()?

> Here add pte_mkyoung function to make page readable on MIPS platform
> during page fault handling. This patch improves page fault latency
> about 10% on my MIPS machine with lmbench lat_pagefault case.
> 
> ...
>
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2704,6 +2704,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>  		}
>  		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
>  		entry = mk_pte(new_page, vma->vm_page_prot);
> +		entry = pte_mkyoung(entry);
>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);

What is the effect on non-mips machines?  If it's only "additional
overhead" then it would be better to add a new pte_mkvalid() (or
whatever) and arrange for it to be a no-op on all but mips?

>  		/*
>  		 * Clear the pte entry and flush it first, before updating the
> @@ -3378,6 +3379,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	__SetPageUptodate(page);
>  
>  	entry = mk_pte(page, vma->vm_page_prot);
> +	entry = pte_mkyoung(entry);
>  	if (vma->vm_flags & VM_WRITE)
>  		entry = pte_mkwrite(pte_mkdirty(entry));
>  
> @@ -3660,6 +3662,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  
>  	flush_icache_page(vma, page);
>  	entry = mk_pte(page, vma->vm_page_prot);
> +	entry = pte_mkyoung(entry);
>  	if (write)
>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>  	/* copy-on-write page */
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 494192ca..673f1cd 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -131,6 +131,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>  				ptent = pte_clear_uffd_wp(ptent);
>  			}
>  
> +			if (vma->vm_flags & VM_READ)
> +				ptent = pte_mkyoung(ptent);
>  			/* Avoid taking write faults for known dirty pages */
>  			if (dirty_accountable && pte_dirty(ptent) &&
>  					(pte_soft_dirty(ptent) ||



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry
  2020-05-18 20:57   ` Andrew Morton
@ 2020-05-19  3:22     ` maobibo
  2020-05-19  3:34       ` Andrew Morton
  2020-05-27 20:55       ` Hugh Dickins
  0 siblings, 2 replies; 8+ messages in thread
From: maobibo @ 2020-05-19  3:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand



On 05/19/2020 04:57 AM, Andrew Morton wrote:
> On Mon, 18 May 2020 13:08:49 +0800 Bibo Mao <maobibo@loongson.cn> wrote:
> 
>> On mips platform, hw PTE entry valid bit is set in pte_mkyoung
>> function, it is used to set physical page with readable privilege.
> 
> pte_mkyoung() seems to be a strange place to set the pte's valid bit. 
> Why is it done there?  Can it be done within mips's mk_pte()?
On MIPS system hardware cannot set PAGE_ACCESS bit when accessing the page,
software sets PAGE_ACCESS software bit and PAGE_VALID hw bit together during page
fault stage.

If mk_pte is called in page fault flow, it is ok to set both bits. If it is not 
called in page fault, PAGE_ACCESS is set however there is no actual memory accessing.

> 
>> Here add pte_mkyoung function to make page readable on MIPS platform
>> during page fault handling. This patch improves page fault latency
>> about 10% on my MIPS machine with lmbench lat_pagefault case.
>>
>> ...
>>
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2704,6 +2704,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>>  		}
>>  		flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
>>  		entry = mk_pte(new_page, vma->vm_page_prot);
>> +		entry = pte_mkyoung(entry);
>>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
> 
> What is the effect on non-mips machines?  If it's only "additional
> overhead" then it would be better to add a new pte_mkvalid() (or
> whatever) and arrange for it to be a no-op on all but mips?

how about adding pte_sw_mkyoung alike function which is a noop on all but mips?
this function is used to set PAGE_ACCESS bit and PAGE_VALID bit on mips platform.

regards
bibo,mao

> 
>>  		/*
>>  		 * Clear the pte entry and flush it first, before updating the
>> @@ -3378,6 +3379,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>>  	__SetPageUptodate(page);
>>  
>>  	entry = mk_pte(page, vma->vm_page_prot);
>> +	entry = pte_mkyoung(entry);
>>  	if (vma->vm_flags & VM_WRITE)
>>  		entry = pte_mkwrite(pte_mkdirty(entry));
>>  
>> @@ -3660,6 +3662,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>>  
>>  	flush_icache_page(vma, page);
>>  	entry = mk_pte(page, vma->vm_page_prot);
>> +	entry = pte_mkyoung(entry);
>>  	if (write)
>>  		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>>  	/* copy-on-write page */
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 494192ca..673f1cd 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -131,6 +131,8 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>>  				ptent = pte_clear_uffd_wp(ptent);
>>  			}
>>  
>> +			if (vma->vm_flags & VM_READ)
>> +				ptent = pte_mkyoung(ptent);
>>  			/* Avoid taking write faults for known dirty pages */
>>  			if (dirty_accountable && pte_dirty(ptent) &&
>>  					(pte_soft_dirty(ptent) ||



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry
  2020-05-19  3:22     ` maobibo
@ 2020-05-19  3:34       ` Andrew Morton
  2020-05-27 20:55       ` Hugh Dickins
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2020-05-19  3:34 UTC (permalink / raw)
  To: maobibo
  Cc: Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen, Paul Burton,
	Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand

On Tue, 19 May 2020 11:22:49 +0800 maobibo <maobibo@loongson.cn> wrote:

> how about adding pte_sw_mkyoung alike function which is a noop on all but mips?
> this function is used to set PAGE_ACCESS bit and PAGE_VALID bit on mips platform.

Sounds good.  Please ensure that the interface (roles and
responsibilities, etc) is well documented in code comments.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry
  2020-05-19  3:22     ` maobibo
  2020-05-19  3:34       ` Andrew Morton
@ 2020-05-27 20:55       ` Hugh Dickins
  2020-05-28  4:33         ` maobibo
  1 sibling, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2020-05-27 20:55 UTC (permalink / raw)
  To: maobibo
  Cc: Andrew Morton, Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand

On Tue, 19 May 2020, maobibo wrote:
> On 05/19/2020 04:57 AM, Andrew Morton wrote:
> > On Mon, 18 May 2020 13:08:49 +0800 Bibo Mao <maobibo@loongson.cn> wrote:
> > 
> >> On mips platform, hw PTE entry valid bit is set in pte_mkyoung
> >> function, it is used to set physical page with readable privilege.
> > 
> > pte_mkyoung() seems to be a strange place to set the pte's valid bit. 
> > Why is it done there?  Can it be done within mips's mk_pte()?
> On MIPS system hardware cannot set PAGE_ACCESS bit when accessing the page,
> software sets PAGE_ACCESS software bit and PAGE_VALID hw bit together during page
> fault stage.
> 
> If mk_pte is called in page fault flow, it is ok to set both bits. If it is not 
> called in page fault, PAGE_ACCESS is set however there is no actual memory accessing.

Sorry for joining in so late, but would you please explain that some more:
preferably in the final commit message, if not here.

I still don't understand why this is not done in the same way as on other
architectures - those that care (I just checked x86, powerpc, arm, arm64,
but not all of them) make sure that all the bits they want are there in
mm/mmap.c's protection_map[16], which then feeds into vma->vm_page_prot,
and so into mk_pte() as Andrew indicated.

And I can see that arch/mips/mm/cache.c has a setup_protection_map()
to do that: why does it not set the additional bits that you want?
including the valid bit and the accessed (young) bit, as others do.
Are you saying that there are circumstances in which it is wrong
for mk_pte() to set the additional bits?

I'm afraid that generic mm developers will have no clue as to whether
or not to add a pte_sw_mkyoung() after a mk_pte(); and generic source
will be the cleaner if it turns out not to be needed (but thank you
for making sure that it does nothing on the other architectures).

Hugh


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry
  2020-05-27 20:55       ` Hugh Dickins
@ 2020-05-28  4:33         ` maobibo
  0 siblings, 0 replies; 8+ messages in thread
From: maobibo @ 2020-05-28  4:33 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Thomas Bogendoerfer, Jiaxun Yang, Huacai Chen,
	Paul Burton, Dmitry Korotin, Philippe Mathieu-Daudé,
	Stafford Horne, Steven Price, Anshuman Khandual, linux-mips,
	linux-kernel, Mike Rapoport, Sergei Shtylyov, Maciej W. Rozycki,
	linux-mm, David Hildenbrand



On 05/28/2020 04:55 AM, Hugh Dickins wrote:
> On Tue, 19 May 2020, maobibo wrote:
>> On 05/19/2020 04:57 AM, Andrew Morton wrote:
>>> On Mon, 18 May 2020 13:08:49 +0800 Bibo Mao <maobibo@loongson.cn> wrote:
>>>
>>>> On mips platform, hw PTE entry valid bit is set in pte_mkyoung
>>>> function, it is used to set physical page with readable privilege.
>>>
>>> pte_mkyoung() seems to be a strange place to set the pte's valid bit. 
>>> Why is it done there?  Can it be done within mips's mk_pte()?
>> On MIPS system hardware cannot set PAGE_ACCESS bit when accessing the page,
>> software sets PAGE_ACCESS software bit and PAGE_VALID hw bit together during page
>> fault stage.
>>
>> If mk_pte is called in page fault flow, it is ok to set both bits. If it is not 
>> called in page fault, PAGE_ACCESS is set however there is no actual memory accessing.
> 
> Sorry for joining in so late, but would you please explain that some more:
> preferably in the final commit message, if not here.
> 
> I still don't understand why this is not done in the same way as on other
> architectures - those that care (I just checked x86, powerpc, arm, arm64,
> but not all of them) make sure that all the bits they want are there in
> mm/mmap.c's protection_map[16], which then feeds into vma->vm_page_prot,
> and so into mk_pte() as Andrew indicated.
> 
> And I can see that arch/mips/mm/cache.c has a setup_protection_map()
> to do that: why does it not set the additional bits that you want?
> including the valid bit and the accessed (young) bit, as others do.
> Are you saying that there are circumstances in which it is wrong
> for mk_pte() to set the additional bits?
MIPS is actually strange here, _PAGE_ACCESSED is not set in protection_map.
I do not understand history of mips neither. 

On x86/aarch/powerpc system, _PAGE_ACCESSED bit is set in the beginning. How does
software track memory page accessing frequency?  Does software not care current
status about _PAGE_ACCESSED bit, just calles madvise_cold to clear this bit, and
then watches whether this bit is changed or not?

regards
bibo,mao
> 
> I'm afraid that generic mm developers will have no clue as to whether
> or not to add a pte_sw_mkyoung() after a mk_pte(); and generic source
> will be the cleaner if it turns out not to be needed (but thank you
> for making sure that it does nothing on the other architectures).
> 
> Hugh
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-05-28  4:33 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-18  5:08 [PATCH v3 1/3] MIPS: Do not flush tlb page when updating PTE entry Bibo Mao
2020-05-18  5:08 ` [PATCH v3 2/3] mm/memory.c: Update local TLB if PTE entry exists Bibo Mao
2020-05-18  5:08 ` [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry Bibo Mao
2020-05-18 20:57   ` Andrew Morton
2020-05-19  3:22     ` maobibo
2020-05-19  3:34       ` Andrew Morton
2020-05-27 20:55       ` Hugh Dickins
2020-05-28  4:33         ` maobibo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).