Date: Mon, 22 Apr 2019 16:18:57 -0400
From: Jerome Glisse <jglisse@redhat.com>
To: Laurent Dufour
Subject: Re: [PATCH v12 17/31] mm: introduce __page_add_new_anon_rmap()
Message-ID: <20190422201857.GH14666@redhat.com>
References: <20190416134522.17540-1-ldufour@linux.ibm.com> <20190416134522.17540-18-ldufour@linux.ibm.com>
In-Reply-To: <20190416134522.17540-18-ldufour@linux.ibm.com>
Cc: jack@suse.cz, sergey.senozhatsky.work@gmail.com, peterz@infradead.org,
    Will Deacon, mhocko@kernel.org, linux-mm@kvack.org, paulus@samba.org,
    Punit Agrawal, hpa@zytor.com, Michel Lespinasse, Alexei Starovoitov,
    Andrea Arcangeli, ak@linux.intel.com, Minchan Kim,
    aneesh.kumar@linux.ibm.com, x86@kernel.org, Matthew Wilcox,
    Daniel Jordan, Ingo Molnar, David
    Rientjes, paulmck@linux.vnet.ibm.com, Haiyan Song, npiggin@gmail.com,
    sj38.park@gmail.com, dave@stgolabs.net, kemi.wang@intel.com,
    kirill@shutemov.name, Thomas Gleixner, zhong jiang, Ganesh Mahendran,
    Yang Shi, Mike Rapoport, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, Sergey Senozhatsky, vinayak menon,
    akpm@linux-foundation.org, Tim Chen, haren@linux.vnet.ibm.com

On Tue, Apr 16, 2019 at 03:45:08PM +0200, Laurent Dufour wrote:
> When dealing with the speculative page fault handler, we may race with a
> VMA being split or merged. In that case the vma->vm_start and vma->vm_end
> fields may not match the address at which the page fault is occurring.
> 
> This can only happen when the VMA is split, but in that case the
> anon_vma pointer of the new VMA will be the same as the original one,
> because in __split_vma() the new->anon_vma is set to src->anon_vma by
> the *new = *vma structure copy.
> 
> So even if the VMA boundaries are not correct, the anon_vma pointer is
> still valid.
> 
> If the VMA has been merged, then the VMA into which it has been merged
> must have the same anon_vma pointer, otherwise the merge could not have
> been done.
> 
> So in all cases we know that the anon_vma is valid: before starting the
> speculative page fault we have checked that the anon_vma pointer is
> valid for this VMA, and since there is an anon_vma, a page has been
> backed by it at some point. Before the VMA is torn down, the page table
> lock would have to be grabbed to clear the PTE, and the anon_vma field
> is checked once the PTE is locked.
> 
> This patch introduces a new __page_add_new_anon_rmap() service which
> doesn't check the VMA boundaries, and creates a new inline one which
> does the check.
> 
> When called from a page fault handler, if this is not a speculative one,
> there is a guarantee that vm_start and vm_end match the faulting address,
> so this check is useless. In the context of the speculative page fault
> handler, this check may be wrong but the anon_vma is still valid, as
> explained above.
> 
> Signed-off-by: Laurent Dufour

Reviewed-by: Jérôme Glisse
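To make the anon_vma reasoning above concrete before diving into the diff,
here is a minimal userspace sketch (hypothetical struct names, not the
kernel types) of why the *new = *vma structure copy in __split_vma()
carries the anon_vma pointer over even though the boundaries change:

#include <stdio.h>

/* Hypothetical stand-ins for struct anon_vma / struct vm_area_struct. */
struct anon_vma { int id; };
struct vma {
	unsigned long vm_start, vm_end;
	struct anon_vma *anon_vma;
};

int main(void)
{
	struct anon_vma av = { 42 };
	struct vma vma = { 0x1000, 0x3000, &av };

	/* Mimics the *new = *vma copy in __split_vma(): every field,
	 * including the anon_vma pointer, is carried over. */
	struct vma split = vma;
	split.vm_start = 0x2000;	/* boundaries diverge ...        */
	vma.vm_end = 0x2000;

	/* ... but both halves still share the same anon_vma, so a
	 * racing speculative fault dereferences a valid pointer.     */
	printf("same anon_vma: %d\n", vma.anon_vma == split.anon_vma);
	return 0;
}

Compiled with any C compiler, this prints "same anon_vma: 1": both halves
of the split share the anon_vma, which is the property the commit message
relies on.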
> ---
>  include/linux/rmap.h | 12 ++++++++++--
>  mm/memory.c          |  8 ++++----
>  mm/rmap.c            |  5 ++---
>  3 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 988d176472df..a5d282573093 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -174,8 +174,16 @@ void page_add_anon_rmap(struct page *, struct vm_area_struct *,
>  		unsigned long, bool);
>  void do_page_add_anon_rmap(struct page *, struct vm_area_struct *,
>  		unsigned long, int);
> -void page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
> -		unsigned long, bool);
> +void __page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
> +		unsigned long, bool);
> +static inline void page_add_new_anon_rmap(struct page *page,
> +					  struct vm_area_struct *vma,
> +					  unsigned long address, bool compound)
> +{
> +	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> +	__page_add_new_anon_rmap(page, vma, address, compound);
> +}
> +
>  void page_add_file_rmap(struct page *, bool);
>  void page_remove_rmap(struct page *, bool);
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index be93f2c8ebe0..46f877b6abea 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2347,7 +2347,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
>  		 * thread doing COW.
>  		 */
>  		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
> -		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
> +		__page_add_new_anon_rmap(new_page, vma, vmf->address, false);
>  		mem_cgroup_commit_charge(new_page, memcg, false, false);
>  		__lru_cache_add_active_or_unevictable(new_page, vmf->vma_flags);
>  		/*
> @@ -2897,7 +2897,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> 
>  	/* ksm created a completely new copy */
>  	if (unlikely(page != swapcache && swapcache)) {
> -		page_add_new_anon_rmap(page, vma, vmf->address, false);
> +		__page_add_new_anon_rmap(page, vma, vmf->address, false);
>  		mem_cgroup_commit_charge(page, memcg, false, false);
>  		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
>  	} else {
> @@ -3049,7 +3049,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>  	}
> 
>  	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
> -	page_add_new_anon_rmap(page, vma, vmf->address, false);
> +	__page_add_new_anon_rmap(page, vma, vmf->address, false);
>  	mem_cgroup_commit_charge(page, memcg, false, false);
>  	__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
> setpte:
> @@ -3328,7 +3328,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  	/* copy-on-write page */
>  	if (write && !(vmf->vma_flags & VM_SHARED)) {
>  		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
> -		page_add_new_anon_rmap(page, vma, vmf->address, false);
> +		__page_add_new_anon_rmap(page, vma, vmf->address, false);
>  		mem_cgroup_commit_charge(page, memcg, false, false);
>  		__lru_cache_add_active_or_unevictable(page, vmf->vma_flags);
>  	} else {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index e5dfe2ae6b0d..2148e8ce6e34 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1140,7 +1140,7 @@ void do_page_add_anon_rmap(struct page *page,
>  }
> 
>  /**
> - * page_add_new_anon_rmap - add pte mapping to a new anonymous page
> + * __page_add_new_anon_rmap - add pte mapping to a new anonymous page
>   * @page:	the page to add the mapping to
>   * @vma:	the vm area in which the mapping is added
>   * @address:	the user virtual address mapped
> @@ -1150,12 +1150,11 @@ void do_page_add_anon_rmap(struct page *page,
>   * This means the inc-and-test can be bypassed.
>   * Page does not have to be locked.
>   */
> -void page_add_new_anon_rmap(struct page *page,
> +void __page_add_new_anon_rmap(struct page *page,
>  	struct vm_area_struct *vma, unsigned long address, bool compound)
>  {
>  	int nr = compound ? hpage_nr_pages(page) : 1;
> 
> -	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
>  	__SetPageSwapBacked(page);
>  	if (compound) {
>  		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
> -- 
> 2.21.0
> 
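One more note for readers following the diff: the change is a classic
checked-wrapper/unchecked-worker split. A compilable miniature of the
pattern (hypothetical names, with assert() standing in for
VM_BUG_ON_VMA(); nothing here is the actual kernel API):

#include <assert.h>
#include <stdio.h>

/* Hypothetical miniature of struct vm_area_struct. */
struct vma { unsigned long vm_start, vm_end; };

/* Unchecked worker, analogous to __page_add_new_anon_rmap():
 * callers must know the address is safe even if the VMA boundaries
 * are stale, as in the speculative fault path. */
static void __add_new_rmap(struct vma *vma, unsigned long address)
{
	printf("rmap added at %#lx\n", address);
}

/* Checking inline wrapper, analogous to page_add_new_anon_rmap():
 * regular callers keep the boundary assertion. */
static inline void add_new_rmap(struct vma *vma, unsigned long address)
{
	assert(address >= vma->vm_start && address < vma->vm_end);
	__add_new_rmap(vma, address);
}

int main(void)
{
	struct vma vma = { 0x1000, 0x2000 };

	add_new_rmap(&vma, 0x1800);	/* checked, regular fault path */
	__add_new_rmap(&vma, 0x1800);	/* unchecked, speculative path */
	return 0;
}

The split keeps the boundary check for the ordinary fault paths (where it
is known to hold and is cheap, the wrapper being inline) while letting the
speculative path, whose VMA snapshot may have stale boundaries, call the
worker directly.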