From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754694AbdH2O6i (ORCPT ); Tue, 29 Aug 2017 10:58:38 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:53759
        "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751280AbdH2O6h (ORCPT );
        Tue, 29 Aug 2017 10:58:37 -0400
Subject: Re: [PATCH v2 19/20] x86/mm: Add speculative pagefault handling
From: Laurent Dufour
To: Anshuman Khandual , paulmck@linux.vnet.ibm.com, peterz@infradead.org,
        akpm@linux-foundation.org, kirill@shutemov.name, ak@linux.intel.com,
        mhocko@kernel.org, dave@stgolabs.net, jack@suse.cz, Matthew Wilcox ,
        benh@kernel.crashing.org, mpe@ellerman.id.au, paulus@samba.org,
        Thomas Gleixner , Ingo Molnar , hpa@zytor.com, Will Deacon
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
        haren@linux.vnet.ibm.com, npiggin@gmail.com, bsingharora@gmail.com,
        Tim Chen , linuxppc-dev@lists.ozlabs.org, x86@kernel.org
References: <1503007519-26777-1-git-send-email-ldufour@linux.vnet.ibm.com>
        <1503007519-26777-20-git-send-email-ldufour@linux.vnet.ibm.com>
        <37b9b036-e951-0a74-3e5c-31049cda7dd2@linux.vnet.ibm.com>
Date: Tue, 29 Aug 2017 16:58:23 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <37b9b036-e951-0a74-3e5c-31049cda7dd2@linux.vnet.ibm.com>
Content-Type: text/plain; charset=windows-1252
Content-Language: fr
Content-Transfer-Encoding: 7bit
X-TM-AS-GCONF: 00
x-cbid: 17082914-0012-0000-0000-000005732515
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 17082914-0013-0000-0000-000018EB4302
Message-Id: <6233b5fc-92d8-1977-82f9-e266ac1f6dec@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,,
        definitions=2017-08-29_04:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
        spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0
        bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
        engine=8.0.1-1707230000 definitions=main-1708290224
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 29/08/2017 16:50, Laurent Dufour wrote:
> On 21/08/2017 09:29, Anshuman Khandual wrote:
>> On 08/18/2017 03:35 AM, Laurent Dufour wrote:
>>> From: Peter Zijlstra
>>>
>>> Try a speculative fault before acquiring mmap_sem, if it returns with
>>> VM_FAULT_RETRY continue with the mmap_sem acquisition and do the
>>> traditional fault.
>>>
>>> Signed-off-by: Peter Zijlstra (Intel)
>>>
>>> [Clearing of FAULT_FLAG_ALLOW_RETRY is now done in
>>>  handle_speculative_fault()]
>>> [Retry with usual fault path in the case VM_ERROR is returned by
>>>  handle_speculative_fault(). This allows signal to be delivered]
>>> Signed-off-by: Laurent Dufour
>>> ---
>>>  arch/x86/include/asm/pgtable_types.h |  7 +++++++
>>>  arch/x86/mm/fault.c                  | 19 +++++++++++++++++++
>>>  2 files changed, 26 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
>>> index bf9638e1ee42..4fd2693a037e 100644
>>> --- a/arch/x86/include/asm/pgtable_types.h
>>> +++ b/arch/x86/include/asm/pgtable_types.h
>>> @@ -234,6 +234,13 @@ enum page_cache_mode {
>>>  #define PGD_IDENT_ATTR 0x001 /* PRESENT (no other attributes) */
>>>  #endif
>>>
>>> +/*
>>> + * Advertise that we call the Speculative Page Fault handler.
>>> + */
>>> +#ifdef CONFIG_X86_64
>>> +#define __HAVE_ARCH_CALL_SPF
>>> +#endif
>>> +
>>>  #ifdef CONFIG_X86_32
>>>  # include
>>>  #else
>>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>>> index 2a1fa10c6a98..4c070b9a4362 100644
>>> --- a/arch/x86/mm/fault.c
>>> +++ b/arch/x86/mm/fault.c
>>> @@ -1365,6 +1365,24 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>>>      if (error_code & PF_INSTR)
>>>          flags |= FAULT_FLAG_INSTRUCTION;
>>>
>>> +#ifdef __HAVE_ARCH_CALL_SPF
>>> +    if (error_code & PF_USER) {
>>> +        fault = handle_speculative_fault(mm, address, flags);
>>> +
>>> +        /*
>>> +         * We also check against VM_FAULT_ERROR because we have to
>>> +         * raise a signal by calling later mm_fault_error() which
>>> +         * requires the vma pointer to be set. So in that case,
>>> +         * we fall through the normal path.
>>
>> Can't mm_fault_error() be called inside handle_speculative_fault() ?
>> Falling through the normal page fault path again just to raise a
>> signal seems overkill. Looking into mm_fault_error(), it seems they
>> are different for x86 and powerpc.
>>
>> X86:
>>
>> mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>>                unsigned long address, struct vm_area_struct *vma,
>>                unsigned int fault)
>>
>> powerpc:
>>
>> mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
>>
>> Even in case of X86, I guess we would have reference to the faulting
>> VMA (after the SRCU search) which can be used to call this function
>> directly.
>
> Yes I think this is doable in the case of x86.

Actually, this is not doable: handle_speculative_fault() does not return
the vma pointer, and it cannot return it, because once srcu_read_unlock()
is called the pointer is no longer safe to use.
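To make the control flow discussed above easier to follow, here is a
minimal user-space sketch of the decision the x86 hunk makes. It is not
kernel code: every model_* / MODEL_* name is hypothetical, the flag
values are illustrative, and the vma_valid field only stands in for
"the vma was looked up while mmap_sem is held". The point it models is
that a clean speculative result ends the fault without any vma pointer,
while a retry or an error falls through to the normal path, the only
place where a vma pointer stays valid for mm_fault_error().

```c
#include <assert.h>

/* Illustrative fault-result bits for this model only (not the kernel's values). */
#define MODEL_FAULT_ERROR 0x1u
#define MODEL_FAULT_RETRY 0x2u

/* Which path ends up completing the fault in this model. */
enum model_path { MODEL_PATH_SPECULATIVE, MODEL_PATH_NORMAL };

struct model_outcome {
    enum model_path path;
    int vma_valid;  /* 1 only when the vma is looked up under the modelled mmap_sem */
};

/*
 * Model of the caller's decision. spf_result plays the role of the
 * value returned by the speculative handler.
 */
struct model_outcome model_do_page_fault(unsigned int spf_result)
{
    struct model_outcome out = { MODEL_PATH_SPECULATIVE, 0 };

    /*
     * Clean speculative result: the fault is done. No vma pointer is
     * available here, because the lockless lookup has already ended.
     */
    if (!(spf_result & (MODEL_FAULT_RETRY | MODEL_FAULT_ERROR)))
        return out;

    /*
     * Retry or error: fall through to the normal path, which takes the
     * lock and looks the vma up again, so error reporting can use it.
     */
    out.path = MODEL_PATH_NORMAL;
    out.vma_valid = 1;
    return out;
}

/* Small usage example exercising the three cases. */
void model_self_check(void)
{
    assert(model_do_page_fault(0).path == MODEL_PATH_SPECULATIVE);
    assert(model_do_page_fault(MODEL_FAULT_RETRY).path == MODEL_PATH_NORMAL);
    assert(model_do_page_fault(MODEL_FAULT_ERROR).vma_valid == 1);
}
```

The error case costs a second lookup under the lock, which is the
overhead questioned in the review, but it keeps the speculative handler
from ever handing out a pointer that may be freed after the read-side
section ends.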