From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1165506AbdEADA5 (ORCPT ); Sun, 30 Apr 2017 23:00:57 -0400
Received: from mail-pg0-f65.google.com ([74.125.83.65]:36126 "EHLO mail-pg0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1165456AbdEADAs (ORCPT ); Sun, 30 Apr 2017 23:00:48 -0400
Date: Mon, 1 May 2017 13:00:36 +1000
From: Nicholas Piggin
To: Christophe Leroy
Cc: Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Scott Wood , linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] powerpc/mm: Only read faulting instruction when necessary in do_page_fault()
Message-ID: <20170501130023.3c10e00d@roar.ozlabs.ibm.com>
In-Reply-To: <20170428061301.27B826E713@localhost.localdomain>
References: <20170428061301.27B826E713@localhost.localdomain>
Organization: IBM
X-Mailer: Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 28 Apr 2017 08:13:01 +0200 (CEST)
Christophe Leroy wrote:

> Commit a7a9dcd882a67 ("powerpc: Avoid taking a data miss on every
> userspace instruction miss") has shown that limiting the read of
> faulting instruction to likely cases improves performance.
>
> This patch goes further into this direction by limiting the read
> of the faulting instruction to the only cases where it is definitly
> needed.
>
> On an MPC885, with the same benchmark app as in the commit referred
> above, we see a reduction of 4000 dTLB misses (approx 3%):
>
> Before the patch:
>  Performance counter stats for './fault 500' (10 runs):
>
>          720495838      cpu-cycles            ( +-  0.04% )
>             141769      dTLB-load-misses      ( +-  0.02% )
>              52722      iTLB-load-misses      ( +-  0.01% )
>              19611      faults                ( +-  0.02% )
>
>        5.750535176 seconds time elapsed       ( +-  0.16% )
>
> With the patch:
>  Performance counter stats for './fault 500' (10 runs):
>
>          717669123      cpu-cycles            ( +-  0.02% )
>             137344      dTLB-load-misses      ( +-  0.03% )
>              52731      iTLB-load-misses      ( +-  0.01% )
>              19614      faults                ( +-  0.03% )
>
>        5.728423115 seconds time elapsed       ( +-  0.14% )
>
> Signed-off-by: Christophe Leroy
> ---
>  v2: Changes 'if (cond1) if (cond2)' by 'if (cond1 && cond2)'
>
>  In case the instruction we read has value 0, store_update_sp() will
>  return false, so it will bail out.
>
>  This patch applies after the serie "powerpc/mm: some cleanup of do_page_fault()"
>
>  arch/powerpc/mm/fault.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 400f2d0d42f8..2ec82a279d28 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -280,14 +280,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>  
>  	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
>  
> -	/*
> -	 * We want to do this outside mmap_sem, because reading code around nip
> -	 * can result in fault, which will cause a deadlock when called with
> -	 * mmap_sem held
> -	 */
> -	if (is_write && is_user)
> -		__get_user(inst, (unsigned int __user *)regs->nip);
> -
>  	if (is_user)
>  		flags |= FAULT_FLAG_USER;
>  
> @@ -356,8 +348,18 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>  		 * between the last mapped region and the stack will
>  		 * expand the stack rather than segfaulting.
>  		 */
> -		if (address + 2048 < uregs->gpr[1] && !store_updates_sp(inst))
> -			goto bad_area;
> +		if (address + 2048 < uregs->gpr[1] && !inst) {
> +			/*
> +			 * We want to do this outside mmap_sem, because reading
> +			 * code around nip can result in fault, which will cause
> +			 * a deadlock when called with mmap_sem held
> +			 */
> +			up_read(&mm->mmap_sem);
> +			__get_user(inst, (unsigned int __user *)regs->nip);
> +			if (!store_updates_sp(inst))
> +				goto bad_area_nosemaphore;
> +			goto retry;
> +		}

Yes, nice patch.

I wonder if you can do __get_user first as non-faulting to avoid retaking
the mmap_sem and retrying? Along the lines of:

+		nip = (unsigned int __user *)regs->nip;
+		pagefault_disable();
+		if (unlikely(__get_user_inatomic(inst, nip))) {
+			pagefault_enable();
+			up_read(&mm->mmap_sem);
+			if (get_user(inst, nip)) {
...
				goto retry;

The user instruction should practically always have a Linux pte, so a
fault there should be exceedingly rare, I think?

Thanks,
Nick