linuxppc-dev.lists.ozlabs.org archive mirror
From: Jerome Glisse <jglisse@redhat.com>
To: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: paulmck@linux.vnet.ibm.com, peterz@infradead.org,
	akpm@linux-foundation.org, kirill@shutemov.name,
	ak@linux.intel.com, mhocko@kernel.org, dave@stgolabs.net,
	jack@suse.cz, Matthew Wilcox <willy@infradead.org>,
	benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	hpa@zytor.com, Will Deacon <will.deacon@arm.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	kemi.wang@intel.com, sergey.senozhatsky.work@gmail.com,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com,
	npiggin@gmail.com, bsingharora@gmail.com,
	Tim Chen <tim.c.chen@linux.intel.com>,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Subject: Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
Date: Tue, 3 Apr 2018 15:10:05 -0400
Message-ID: <20180403191005.GC5935@redhat.com>
In-Reply-To: <1520963994-28477-7-git-send-email-ldufour@linux.vnet.ibm.com>

On Tue, Mar 13, 2018 at 06:59:36PM +0100, Laurent Dufour wrote:
> pte_unmap_same() is making the assumption that the page table are still
> around because the mmap_sem is held.
> This is no more the case when running a speculative page fault and
> additional check must be made to ensure that the final page table are still
> there.
> 
> This is now done by calling pte_spinlock() to check for the VMA's
> consistency while locking for the page tables.
> 
> This is requiring passing a vm_fault structure to pte_unmap_same() which is
> containing all the needed parameters.
> 
> As pte_spinlock() may fail in the case of a speculative page fault, if the
> VMA has been touched in our back, pte_unmap_same() should now return 3
> cases :
> 	1. pte are the same (0)
> 	2. pte are different (VM_FAULT_PTNOTSAME)
> 	3. a VMA's changes has been detected (VM_FAULT_RETRY)
> 
> The case 2 is handled by the introduction of a new VM_FAULT flag named
> VM_FAULT_PTNOTSAME which is then trapped in cow_user_page().
> If VM_FAULT_RETRY is returned, it is passed up to the callers to retry the
> page fault while holding the mmap_sem.
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> ---
>  include/linux/mm.h |  1 +
>  mm/memory.c        | 29 +++++++++++++++++++----------
>  2 files changed, 20 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 2f3e98edc94a..b6432a261e63 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1199,6 +1199,7 @@ static inline void clear_page_pfmemalloc(struct page *page)
>  #define VM_FAULT_NEEDDSYNC  0x2000	/* ->fault did not modify page tables
>  					 * and needs fsync() to complete (for
>  					 * synchronous page faults in DAX) */
> +#define VM_FAULT_PTNOTSAME 0x4000	/* Page table entries have changed */
>  
>  #define VM_FAULT_ERROR	(VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
>  			 VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE | \
> diff --git a/mm/memory.c b/mm/memory.c
> index 21b1212a0892..4bc7b0bdcb40 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
>   * parts, do_swap_page must check under lock before unmapping the pte and
>   * proceeding (but do_wp_page is only called after already making such a check;
>   * and do_anonymous_page can safely check later on).
> + *
> + * pte_unmap_same() returns:
> + *	0			if the PTE are the same
> + *	VM_FAULT_PTNOTSAME	if the PTE are different
> + *	VM_FAULT_RETRY		if the VMA has changed in our back during
> + *				a speculative page fault handling.
>   */
> -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> -				pte_t *page_table, pte_t orig_pte)
> +static inline int pte_unmap_same(struct vm_fault *vmf)
>  {
> -	int same = 1;
> +	int ret = 0;
> +
>  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
>  	if (sizeof(pte_t) > sizeof(unsigned long)) {
> -		spinlock_t *ptl = pte_lockptr(mm, pmd);
> -		spin_lock(ptl);
> -		same = pte_same(*page_table, orig_pte);
> -		spin_unlock(ptl);
> +		if (pte_spinlock(vmf)) {
> +			if (!pte_same(*vmf->pte, vmf->orig_pte))
> +				ret = VM_FAULT_PTNOTSAME;
> +			spin_unlock(vmf->ptl);
> +		} else
> +			ret = VM_FAULT_RETRY;
>  	}
>  #endif
> -	pte_unmap(page_table);
> -	return same;
> +	pte_unmap(vmf->pte);
> +	return ret;
>  }
>  
>  static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
> @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
>  	int exclusive = 0;
>  	int ret = 0;
>  
> -	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> +	ret = pte_unmap_same(vmf);
> +	if (ret)
>  		goto out;
>  

This changes what do_swap_page() returns, i.e. before it was returning 0
when the locked pte lookup was different from orig_pte. After this patch
it returns VM_FAULT_PTNOTSAME, but this is a new return value for
handle_mm_fault() (the do_swap_page() return value is what ultimately
gets returned by handle_mm_fault()).

Do we really want that? This might confuse some existing users of
handle_mm_fault(), and I am not sure of the value of that information
to the caller.

Note I do understand that you want to return retry if anything changed
underneath, and thus need to differentiate that from the case where the
pte values are not the same.
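
To make the concern concrete, here is a minimal userspace sketch (not
kernel code) of one possible way to keep the new flag internal: the
pte_unmap_same() result is still three-valued, but do_swap_page() masks
VM_FAULT_PTNOTSAME before returning. The model_* functions and the
pte_state enum are hypothetical names made up for this illustration,
VM_FAULT_PTNOTSAME is the value from the quoted patch, and the
VM_FAULT_RETRY value is assumed.

```c
/* Userspace model for illustration only; nothing here is kernel code. */

#define VM_FAULT_RETRY		0x0400	/* assumed value, for illustration */
#define VM_FAULT_PTNOTSAME	0x4000	/* value from the quoted patch */

/* Hypothetical stand-in for what the locked PTE check can observe. */
enum pte_state {
	PTE_SAME,	/* *vmf->pte still matches vmf->orig_pte */
	PTE_CHANGED,	/* another thread already changed the PTE */
	VMA_CHANGED,	/* pte_spinlock() failed during a speculative fault */
};

/* Models the three return cases listed in the patch description. */
int model_pte_unmap_same(enum pte_state state)
{
	switch (state) {
	case PTE_SAME:
		return 0;
	case PTE_CHANGED:
		return VM_FAULT_PTNOTSAME;
	default:
		return VM_FAULT_RETRY;
	}
}

/*
 * Models the early exit of do_swap_page(). The internal-only flag is
 * cleared so that callers of handle_mm_fault() keep seeing 0 when the
 * PTE changed, as they did before the patch, while VM_FAULT_RETRY is
 * still passed up unchanged.
 */
int model_do_swap_page(enum pte_state state)
{
	int ret = model_pte_unmap_same(state);

	if (ret) {
		ret &= ~VM_FAULT_PTNOTSAME;
		return ret;
	}

	/* The rest of the swap-in path would run here. */
	return 0;
}
```

With this shape, model_do_swap_page(PTE_CHANGED) returns 0 as before the
patch, and only VM_FAULT_RETRY reaches the caller. Whether masking the
flag or avoiding a new VM_FAULT value altogether is preferable is left
open here.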

Cheers,
Jérôme
