From: Laurent Dufour <ldufour@linux.vnet.ibm.com>
To: Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	paulmck@linux.vnet.ibm.com, akpm@linux-foundation.org,
	ak@linux.intel.com, mhocko@kernel.org, dave@stgolabs.net,
	jack@suse.cz, Matthew Wilcox <willy@infradead.org>,
	benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	hpa@zytor.com, Will Deacon <will.deacon@arm.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haren@linux.vnet.ibm.com, npiggin@gmail.com,
	bsingharora@gmail.com, Tim Chen <tim.c.chen@linux.intel.com>,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Subject: Re: [PATCH v2 14/20] mm: Provide speculative fault infrastructure
Date: Wed, 30 Aug 2017 11:53:41 +0200
Message-ID: <db7e5c3e-0bb6-a1f3-a025-379071c30183@linux.vnet.ibm.com>
In-Reply-To: <ab0634c4-274d-208f-fc4b-43991986bacf@linux.vnet.ibm.com>

On 30/08/2017 07:03, Anshuman Khandual wrote:
> On 08/29/2017 07:15 PM, Peter Zijlstra wrote:
>> On Tue, Aug 29, 2017 at 03:18:25PM +0200, Laurent Dufour wrote:
>>> On 29/08/2017 14:04, Peter Zijlstra wrote:
>>>> On Tue, Aug 29, 2017 at 09:59:30AM +0200, Laurent Dufour wrote:
>>>>> On 27/08/2017 02:18, Kirill A. Shutemov wrote:
>>>>>>> +
>>>>>>> +	if (unlikely(!vma->anon_vma))
>>>>>>> +		goto unlock;
>>>>>>
>>>>>> It deserves a comment.
>>>>>
>>>>> You're right, I'll add it in the next version.
>>>>> For the record, the root cause is that __anon_vma_prepare() requires the
>>>>> mmap_sem to be held because vm_next and vm_prev must be safe.
>>>>
>>>> But should that test not be:
>>>>
>>>> 	if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma))
>>>> 		goto unlock;
>>>>
>>>> Because !anon vmas will never have ->anon_vma set and you don't want to
>>>> exclude those.
>>>
>>> Yes, in case we later allow non-anonymous vmas to be handled.
>>> Currently only anonymous vmas are supported, so the check is good enough,
>>> isn't it?
>>
>> That wasn't at all clear from reading the code. This makes it clear
>> ->anon_vma is only ever looked at for anonymous.
>>
>> And like Kirill says, we _really_ should start allowing some (if not
>> all) vm_ops. Large file based mappings aren't particularly rare.
>>
>> I'm not sure we want to introduce a white-list or just bite the bullet
>> and audit all ->fault() implementations. But either works and isn't
>> terribly difficult, auditing all is more work though.
> 
> filemap_fault() is used as vma->vm_ops->fault() for most of the file
> systems. Changing it can enable speculative fault support for all of
> them. It will still exclude other driver-based vma->vm_ops->fault()
> implementations. AFAICS, the __lock_page_or_retry() function can drop
> mm->mmap_sem if the page could not be locked right away. As suggested
> by Peterz, making it understand FAULT_FLAG_SPECULATIVE should be good
> enough. The patch is lightly tested for file mappings on top of this
> series.

Hi Anshuman,

This sounds pretty good, except for the FAULT_FLAG_RETRY_NOWAIT case I
mentioned in another mail.

The next step would be to find a way to discriminate between the
vm_ops->fault() implementations. Any ideas?

Thanks,
Laurent.
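
For readers following the quoted exchange about the ->anon_vma test, here
is a minimal userspace sketch of the two variants of the check. It is not
kernel code: struct model_vma and all the model_* helpers are hypothetical
stand-ins, and it assumes vma_is_anonymous() amounts to "no vm_ops", as it
did in the tree this series targets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two vm_area_struct fields the check reads. */
struct model_vma {
	const void *vm_ops;	/* non-NULL for file or driver backed mappings */
	const void *anon_vma;	/* set once the anon_vma has been prepared */
};

/* Assumption: models vma_is_anonymous() as "no vm_ops". */
static bool model_vma_is_anonymous(const struct model_vma *vma)
{
	return vma->vm_ops == NULL;
}

/* Original test: bail out of the speculative path whenever ->anon_vma is unset. */
static bool model_old_check_bails(const struct model_vma *vma)
{
	return vma->anon_vma == NULL;
}

/* Suggested test: only anonymous VMAs are required to have ->anon_vma set. */
static bool model_new_check_bails(const struct model_vma *vma)
{
	return model_vma_is_anonymous(vma) && vma->anon_vma == NULL;
}
```

With this model, a file-backed VMA without an anon_vma is rejected by the
original test but passes the suggested one, while an anonymous VMA whose
anon_vma is not yet prepared is rejected by both.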

> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index a497024..08f3042 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1181,6 +1181,18 @@ int __lock_page_killable(struct page *__page)
>  int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
>                          unsigned int flags)
>  {
> +       if (flags & FAULT_FLAG_SPECULATIVE) {
> +               if (flags & FAULT_FLAG_KILLABLE) {
> +                       int ret;
> +
> +                       ret = __lock_page_killable(page);
> +                       if (ret)
> +                               return 0;
> +               } else
> +                       __lock_page(page);
> +               return 1;
> +       }
> +
>         if (flags & FAULT_FLAG_ALLOW_RETRY) {
>                 /*
>                  * CAUTION! In this case, mmap_sem is not released
> diff --git a/mm/memory.c b/mm/memory.c
> index 549d235..02347f3 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3836,8 +3836,6 @@ static int handle_pte_fault(struct vm_fault *vmf)
>         if (!vmf->pte) {
>                 if (vma_is_anonymous(vmf->vma))
>                         return do_anonymous_page(vmf);
> -               else if (vmf->flags & FAULT_FLAG_SPECULATIVE)
> -                       return VM_FAULT_RETRY;
>                 else
>                         return do_fault(vmf);
>         }
> @@ -4012,17 +4010,7 @@ int handle_speculative_fault(struct mm_struct *mm, unsigned long address,
>                 goto unlock;
>         }
> 
> -       /*
> -        * Can't call vm_ops service has we don't know what they would do
> -        * with the VMA.
> -        * This include huge page from hugetlbfs.
> -        */
> -       if (vma->vm_ops) {
> -               trace_spf_vma_notsup(_RET_IP_, vma, address);
> -               goto unlock;
> -       }
> -
> -       if (unlikely(!vma->anon_vma)) {
> +       if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma)) {
>                 trace_spf_vma_notsup(_RET_IP_, vma, address);
>                 goto unlock;
>         }
> 
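
The control flow of the FAULT_FLAG_SPECULATIVE branch added to
__lock_page_or_retry() in the patch above can be modelled in userspace
roughly as follows. This is only a sketch: the flag values, struct
model_page and the model_* functions are made up for illustration, and the
fatal_signal parameter stands in for a pending fatal signal that would make
__lock_page_killable() fail.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag bits only; these are not the kernel's real values. */
#define MODEL_FLAG_KILLABLE	0x1u
#define MODEL_FLAG_SPECULATIVE	0x2u

/* Hypothetical stand-in for struct page; only tracks the lock state. */
struct model_page {
	bool locked;
};

/* Stand-in for __lock_page(): always returns with the page locked. */
static void model_lock_page(struct model_page *page)
{
	page->locked = true;
}

/* Stand-in for __lock_page_killable(): fails if a fatal signal is pending. */
static int model_lock_page_killable(struct model_page *page, bool fatal_signal)
{
	if (fatal_signal)
		return -EINTR;
	page->locked = true;
	return 0;
}

/*
 * Models only the speculative branch: returns 1 if the page was locked,
 * 0 if the killable lock was interrupted, and -1 for the non-speculative
 * paths, which are deliberately left out of this sketch.
 */
static int model_lock_page_speculative(struct model_page *page,
				       unsigned int flags, bool fatal_signal)
{
	if (!(flags & MODEL_FLAG_SPECULATIVE))
		return -1;

	if (flags & MODEL_FLAG_KILLABLE) {
		if (model_lock_page_killable(page, fatal_signal))
			return 0;
	} else {
		model_lock_page(page);
	}
	return 1;
}
```

Note that the speculative branch never touches mmap_sem: the speculative
handler runs without holding it, so there is nothing to drop before
sleeping on the page lock.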
