All of lore.kernel.org
 help / color / mirror / Atom feed
From: Laurent Dufour <ldufour@linux.vnet.ibm.com>
To: Punit Agrawal <punitagrawal@gmail.com>
Cc: akpm@linux-foundation.org, mhocko@kernel.org,
	peterz@infradead.org, kirill@shutemov.name, ak@linux.intel.com,
	dave@stgolabs.net, jack@suse.cz,
	Matthew Wilcox <willy@infradead.org>,
	benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	hpa@zytor.com, Will Deacon <will.deacon@arm.com>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	kemi.wang@intel.com, sergey.senozhatsky.work@gmail.com,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Rientjes <rientjes@google.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Ganesh Mahendran <opensource.ganesh@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com,
	npiggin@gmail.com, bsingharora@gmail.com,
	paulmck@linux.vnet.ibm.com, Tim Chen <tim.c.chen@linux.intel.com>,
	linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Subject: Re: [PATCH v10 24/25] x86/mm: add speculative pagefault handling
Date: Thu, 3 May 2018 16:59:14 +0200	[thread overview]
Message-ID: <fb143123-d54e-b08d-1bd8-07767c86c7d0@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAD4BONd5DZiKkGPGaYqEcVb+YubVDy43MNNQ8_yztDHWpf0Y7w@mail.gmail.com>

On 30/04/2018 20:43, Punit Agrawal wrote:
> Hi Laurent,
> 
> I am looking to add support for speculative page fault handling to
> arm64 (effectively porting this patch) and had a few questions.
> Apologies if I've missed an obvious explanation for my queries. I'm
> jumping in bit late to the discussion.

Hi Punit,

Thanks for giving this series a review.
I don't have arm64 hardware to play with, but I'll be happy to add arm64
patches to my series and to try to maintain them.

> 
> On Tue, Apr 17, 2018 at 3:33 PM, Laurent Dufour
> <ldufour@linux.vnet.ibm.com> wrote:
>> From: Peter Zijlstra <peterz@infradead.org>
>>
>> Try a speculative fault before acquiring mmap_sem, if it returns with
>> VM_FAULT_RETRY continue with the mmap_sem acquisition and do the
>> traditional fault.
>>
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>>
>> [Clearing of FAULT_FLAG_ALLOW_RETRY is now done in
>>  handle_speculative_fault()]
>> [Retry with usual fault path in the case VM_ERROR is returned by
>>  handle_speculative_fault(). This allows signal to be delivered]
>> [Don't build SPF call if !CONFIG_SPECULATIVE_PAGE_FAULT]
>> [Try speculative fault path only for multi threaded processes]
>> [Try reuse to the VMA fetch during the speculative path in case of retry]
>> [Call reuse_spf_or_find_vma()]
>> [Handle memory protection key fault]
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>> ---
>>  arch/x86/mm/fault.c | 42 ++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>> index 73bd8c95ac71..59f778386df5 100644
>> --- a/arch/x86/mm/fault.c
>> +++ b/arch/x86/mm/fault.c
>> @@ -1220,7 +1220,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>>         struct mm_struct *mm;
>>         int fault, major = 0;
>>         unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
>> -       u32 pkey;
>> +       u32 pkey, *pt_pkey = &pkey;
>>
>>         tsk = current;
>>         mm = tsk->mm;
>> @@ -1310,6 +1310,30 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>>                 flags |= FAULT_FLAG_INSTRUCTION;
>>
>>         /*
>> +        * Do not try speculative page fault for kernel's pages and if
>> +        * the fault was due to protection keys since it can't be resolved.
>> +        */
>> +       if (IS_ENABLED(CONFIG_SPECULATIVE_PAGE_FAULT) &&
>> +           !(error_code & X86_PF_PK)) {
> 
> You can simplify this condition by dropping the IS_ENABLED() check as
> you already provide an alternate implementation of
> handle_speculative_fault() when CONFIG_SPECULATIVE_PAGE_FAULT is not
> defined.

Yes you're right, I completely forgot about that define of
handle_speculative_fault() when CONFIG_SPECULATIVE_PAGE_FAULT is not set, that
will definitively makes that part of code more readable.

> 
>> +               fault = handle_speculative_fault(mm, address, flags, &vma);
>> +               if (fault != VM_FAULT_RETRY) {
>> +                       perf_sw_event(PERF_COUNT_SW_SPF, 1, regs, address);
>> +                       /*
>> +                        * Do not advertise for the pkey value since we don't
>> +                        * know it.
>> +                        * This is not a matter as we checked for X86_PF_PK
>> +                        * earlier, so we should not handle pkey fault here,
>> +                        * but to be sure that mm_fault_error() callees will
>> +                        * not try to use it, we invalidate the pointer.
>> +                        */
>> +                       pt_pkey = NULL;
>> +                       goto done;
>> +               }
>> +       } else {
>> +               vma = NULL;
>> +       }
> 
> The else part can be dropped if vma is initialised to NULL when it is
> declared at the top of the function.
Sure.

> 
>> +
>> +       /*
>>          * When running in the kernel we expect faults to occur only to
>>          * addresses in user space.  All other faults represent errors in
>>          * the kernel and should generate an OOPS.  Unfortunately, in the
>> @@ -1342,7 +1366,8 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>>                 might_sleep();
>>         }
>>
>> -       vma = find_vma(mm, address);
>> +       if (!vma || !can_reuse_spf_vma(vma, address))
>> +               vma = find_vma(mm, address);
> 
> Is there a measurable benefit from reusing the vma?
> 
> Dropping the vma reference unconditionally after speculative page
> fault handling gets rid of the implicit state when "vma != NULL"
> (increased ref-count). I found it a bit confusing to follow.

I do agree, this is quite confusing. My initial goal was to be able to reuse
the VMA in the case a protection key error was detected, but it's not really
necessary on x86 since we know at the beginning of the fault operation that
protection key are in the loop. This is not the case on ppc64 but I couldn't
find a way to easily rely on the speculatively fetched VMA neither, so for
protection keys, this didn't help.

Regarding the measurable benefit of reusing the fetched vma, I did further
tests using will-it-scale/page_fault2_threads test, and I'm no more really
convince that this worth the added complexity. I think I'll drop the patch "mm:
speculative page fault handler return VMA" of the series, and thus remove the
call to can_reuse_spf_vma().

Thanks,
Laurent.

> 
>>         if (unlikely(!vma)) {
>>                 bad_area(regs, error_code, address);
>>                 return;
>> @@ -1409,8 +1434,15 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>>                 if (flags & FAULT_FLAG_ALLOW_RETRY) {
>>                         flags &= ~FAULT_FLAG_ALLOW_RETRY;
>>                         flags |= FAULT_FLAG_TRIED;
>> -                       if (!fatal_signal_pending(tsk))
>> +                       if (!fatal_signal_pending(tsk)) {
>> +                               /*
>> +                                * Do not try to reuse this vma and fetch it
>> +                                * again since we will release the mmap_sem.
>> +                                */
>> +                               if (IS_ENABLED(CONFIG_SPECULATIVE_PAGE_FAULT))
>> +                                       vma = NULL;
> 
> Regardless of the above comment, can the vma be reset here unconditionally?
> 
> Thanks,
> Punit
> 

  reply	other threads:[~2018-05-03 14:59 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-17 14:33 [PATCH v10 00/25] Speculative page faults Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 01/25] mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT Laurent Dufour
2018-04-23  5:58   ` Minchan Kim
2018-04-23 15:10     ` Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 02/25] x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Laurent Dufour
2018-05-08 11:04   ` Punit Agrawal
2018-05-08 11:04     ` Punit Agrawal
2018-05-08 11:04     ` Punit Agrawal
2018-05-14 14:47     ` Laurent Dufour
2018-05-14 15:05       ` Punit Agrawal
2018-05-14 15:05         ` Punit Agrawal
2018-05-14 15:05         ` Punit Agrawal
2018-04-17 14:33 ` [PATCH v10 03/25] powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 04/25] mm: prepare for FAULT_FLAG_SPECULATIVE Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 05/25] mm: introduce pte_spinlock " Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 06/25] mm: make pte_unmap_same compatible with SPF Laurent Dufour
2018-04-23  6:31   ` Minchan Kim
2018-04-30 14:07     ` Laurent Dufour
2018-05-01 13:04       ` Minchan Kim
2018-05-10 16:15   ` vinayak menon
2018-05-14 15:09     ` Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 07/25] mm: introduce INIT_VMA() Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 08/25] mm: VMA sequence count Laurent Dufour
2018-04-23  6:42   ` Minchan Kim
2018-04-30 15:14     ` Laurent Dufour
2018-05-01 13:16       ` Minchan Kim
2018-05-03 14:45         ` Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 09/25] mm: protect VMA modifications using " Laurent Dufour
2018-04-23  7:19   ` Minchan Kim
2018-05-14 15:25     ` Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 10/25] mm: protect mremap() against SPF hanlder Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 11/25] mm: protect SPF handler against anon_vma changes Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 12/25] mm: cache some VMA fields in the vm_fault structure Laurent Dufour
2018-04-23  7:42   ` Minchan Kim
2018-05-03 12:25     ` Laurent Dufour
2018-05-03 15:42       ` Minchan Kim
2018-05-04  9:10         ` Laurent Dufour
2018-05-08 10:56           ` Minchan Kim
2018-04-17 14:33 ` [PATCH v10 13/25] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page() Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 14/25] mm: introduce __lru_cache_add_active_or_unevictable Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 15/25] mm: introduce __vm_normal_page() Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 16/25] mm: introduce __page_add_new_anon_rmap() Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 17/25] mm: protect mm_rb tree with a rwlock Laurent Dufour
2018-04-30 18:47   ` Punit Agrawal
2018-05-02  6:37     ` Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 18/25] mm: provide speculative fault infrastructure Laurent Dufour
2018-05-15 13:09   ` vinayak menon
2018-05-15 14:07     ` Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 19/25] mm: adding speculative page fault failure trace events Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 20/25] perf: add a speculative page fault sw event Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 21/25] perf tools: add support for the SPF perf event Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 22/25] mm: speculative page fault handler return VMA Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 23/25] mm: add speculative page fault vmstats Laurent Dufour
2018-05-16  2:50   ` Ganesh Mahendran
2018-05-16  6:42     ` Laurent Dufour
2018-04-17 14:33 ` [PATCH v10 24/25] x86/mm: add speculative pagefault handling Laurent Dufour
2018-04-30 18:43   ` Punit Agrawal
2018-05-03 14:59     ` Laurent Dufour [this message]
2018-05-04 15:55       ` Punit Agrawal
2018-05-04 15:55         ` Punit Agrawal
2018-04-17 14:33 ` [PATCH v10 25/25] powerpc/mm: add speculative page fault Laurent Dufour
2018-04-17 16:51 ` [PATCH v10 00/25] Speculative page faults Christopher Lameter
2018-05-02 14:17 ` Punit Agrawal
2018-05-02 14:17   ` Punit Agrawal
2018-05-02 14:17   ` Punit Agrawal
2018-05-02 14:45   ` Laurent Dufour
2018-05-02 15:50     ` Punit Agrawal
2018-05-02 15:50       ` Punit Agrawal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fb143123-d54e-b08d-1bd8-07767c86c7d0@linux.vnet.ibm.com \
    --to=ldufour@linux.vnet.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexei.starovoitov@gmail.com \
    --cc=benh@kernel.crashing.org \
    --cc=bsingharora@gmail.com \
    --cc=daniel.m.jordan@oracle.com \
    --cc=dave@stgolabs.net \
    --cc=haren@linux.vnet.ibm.com \
    --cc=hpa@zytor.com \
    --cc=jack@suse.cz \
    --cc=jglisse@redhat.com \
    --cc=kemi.wang@intel.com \
    --cc=khandual@linux.vnet.ibm.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mhocko@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=opensource.ganesh@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=punitagrawal@gmail.com \
    --cc=rientjes@google.com \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=will.deacon@arm.com \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.