From: Konstantin Khlebnikov
Date: Mon, 6 Jan 2020 23:20:59 +0300
Subject: Re: [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone
To: "lixinhai.lxh@gmail.com"
Cc: khlebnikov, "linux-mm@kvack.org", "richardw.yang", "kirill.shutemov"
References: <1578292679-2592-1-git-send-email-lixinhai.lxh@gmail.com> <8f930582-34a8-b44e-ed1b-c938c84f0598@yandex-team.ru> <2020010621281286862913@gmail.com>
In-Reply-To: <2020010621281286862913@gmail.com>

On Mon, Jan 6, 2020 at 4:28 PM lixinhai.lxh@gmail.com wrote:
>
> On 2020-01-06 at 18:43 Konstantin Khlebnikov wrote:
> >On 06/01/2020 09.37, Li Xinhai wrote:
> >> For fork case, the dst->vm_prev is always same as src->vm_prev when
> >> anon_vma_clone() is called. Removing the assignment from
> >> dst->vm_prev->anon_vma to dst->anon_vma, and explicitly assign from
> >> anon_vma which is shared by its parent vmas.
> >
> >This doesn't sound right.
> >
> >I see dst->vm_prev is set after anon_vma_fork(), so here it still points to
> >the parent's prev. So this doesn't work the way it is supposed to.
> >
> >I expect this logic: If parent SRC1 SRC2 .. SRCn share ANON0
> >then in child related DST1 DST2 .. DSTn should fork and share ANON1:
> >Forking DST1 creates new ANON1 and then DST2 and following share it.
>
> This logic was not fully clarified in
> https://lore.kernel.org/linux-mm/20191011072256.16275-2-richardw.yang@linux.intel.com/
> I've assumed that sharing the parent vma's anon_vma with the child vma was the
> purpose of that patch, and that it intentionally wants the first child vma to
> have its own new anon_vma (not shared, unlike the other child vmas).

Well, this more or less follows from the original design.

A page's anon-vma, along with the page offset, limits the set of vmas scanned
by rmap: it skips vmas where the page cannot be mapped for sure.
If vmas in one process share an anon-vma then they likely have non-overlapping
offsets, so there is no reason to fork a personal anon-vma for each of them
when the process forks.

But it's good to fork one new anon-vma for all of them together: then rmap can
skip scanning the parent's vmas for pages allocated or COWed in the child
process. Together they act like one big vma.

>
> >
> >Also this assumption is wrong:
> >
> > > Parent has vm_prev, which implies we have vm_prev.
> >
> >If the prev VMA in the parent has VM_DONTCOPY, then in the child the prev VMA
> >will not match pprev, or could even be NULL if it was first in the mm.
> >
> >See patch:
> >https://lore.kernel.org/lkml/157830736034.8148.7070851958306750616.stgit@buzz/T/#u
> >
> >I've tested it using this:
> >
> >--- a/fs/proc/task_mmu.c
> >+++ b/fs/proc/task_mmu.c
> >@@ -847,6 +847,12 @@ static int show_smap(struct seq_file *m, void *v)
> >                seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> >        show_smap_vma_flags(m, vma);
> >
> >+       if (vma->anon_vma)
> >+               seq_printf(m, "AnonVMA: %p %p %d\n",
> >+                               vma->anon_vma,
> >+                               vma->anon_vma->parent,
> >+                               vma->anon_vma->degree);
> >+
> >        m_cache_vma(m, vma);
> >
> >        return 0;
> >
> >---
> >
> >#include <stdio.h>
> >#include <stdlib.h>
> >#include <string.h>
> >#include <unistd.h>
> >#include <sys/mman.h>
> >#include <sys/wait.h>
> >
> >int main(int argc, char **argv) {
> >        void *ptr;
> >        char buf[100];
> >
> >        ptr = mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> >        memset(ptr, 0, 0x3000);
> >        mprotect(ptr + 0x1000, 0x1000, PROT_READ);
> >
> >        sprintf(buf, "cat /proc/%d/smaps", getpid());
> >        system(buf);
> >
> >        if (fork()) {
> >                wait(NULL);
> >        } else {
> >                printf("\n\n\n");
> >                fflush(stdout);
> >                sprintf(buf, "cat /proc/%d/smaps", getpid());
> >                system(buf);
> >        }
> >}
> >
> >---
> >
> >>
> >> Signed-off-by: Li Xinhai
> >> Cc: Wei Yang
> >> Cc: Konstantin Khlebnikov
> >> Cc: Kirill A. Shutemov
> >> ---
> >>   mm/rmap.c | 7 +++----
> >>   1 file changed, 3 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index b3e3819..3c912a6c 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -269,10 +269,10 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >>   {
> >>          struct anon_vma_chain *avc, *pavc;
> >>          struct anon_vma *root = NULL;
> >> -        struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev;
> >> +        struct vm_area_struct *pprev = src->vm_prev;
> >>
> >>          /*
> >> -         * If parent share anon_vma with its vm_prev, keep this sharing in in
> >> +         * If parent share anon_vma with its vm_prev, keep this sharing in
> >>           * child.
> >>           *
> >>           * 1. Parent has vm_prev, which implies we have vm_prev.
> >> @@ -280,8 +280,7 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >>           */
> >>          if (!dst->anon_vma && src->anon_vma &&
> >>              pprev && pprev->anon_vma == src->anon_vma)
> >> -                dst->anon_vma = prev->anon_vma;
> >> -
> >> +                dst->anon_vma = pprev->anon_vma;
> >>
> >>          list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
> >>                  struct anon_vma *anon_vma;
> >>
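
To make the expected logic above concrete (SRC1 .. SRCn share ANON0, so
DST1 .. DSTn fork one ANON1 and share it), here is a rough userspace toy model.
It is only a sketch: toy_anon, toy_vma, toy_fork() and toy_selfcheck() are
hypothetical names invented for illustration, not kernel structures or
functions. It also assumes the conservative choice that a child vma never
shares across a parent vma skipped because of VM_DONTCOPY. That choice is an
assumption of the model, not a statement about what mm/rmap.c does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's struct anon_vma / vm_area_struct. */
struct toy_anon {
	int id;
};

struct toy_vma {
	struct toy_anon *anon;	/* NULL if the vma has no anon_vma */
	int dontcopy;		/* models VM_DONTCOPY on a parent vma */
};

/*
 * Model of forking n parent vmas (src) into child vmas (dst), in order.
 * copied[i] is set to 0 when src[i] is skipped because of dontcopy.
 * pool provides storage for new child anon_vmas and must hold n entries.
 * Returns the number of new child anon_vmas created.
 *
 * Assumption of this model: a child vma shares the previous child's
 * anon_vma only if the immediately preceding parent vma was copied and
 * shared the same parent anon_vma. Otherwise a new one is forked.
 */
size_t toy_fork(const struct toy_vma *src, struct toy_vma *dst,
		int *copied, struct toy_anon *pool, size_t n)
{
	size_t created = 0;

	for (size_t i = 0; i < n; i++) {
		copied[i] = !src[i].dontcopy;
		dst[i].anon = NULL;
		dst[i].dontcopy = 0;

		if (!copied[i] || !src[i].anon)
			continue;

		if (i > 0 && copied[i - 1] && src[i - 1].anon == src[i].anon) {
			/* DST2 and following: share what DST1 forked. */
			dst[i].anon = dst[i - 1].anon;
		} else {
			/* DST1: fork a new anon_vma for the whole group. */
			pool[created].id = (int)created + 1;
			dst[i].anon = &pool[created];
			created++;
		}
	}
	return created;
}

/* Mirrors the mmap + mprotect test: three vmas sharing one anon_vma. */
int toy_selfcheck(void)
{
	struct toy_anon anon0 = { 0 };
	struct toy_vma src[3] = { { &anon0, 0 }, { &anon0, 0 }, { &anon0, 0 } };
	struct toy_vma dst[3];
	struct toy_anon pool[3];
	int copied[3];

	/* All three child vmas end up sharing one new anon_vma. */
	assert(toy_fork(src, dst, copied, pool, 3) == 1);
	assert(dst[0].anon == dst[2].anon && dst[0].anon != &anon0);
	return 0;
}
```

Calling toy_selfcheck() from a small main() is enough to exercise it. Setting
src[1].dontcopy = 1 before toy_fork() shows the VM_DONTCOPY case: the child's
prev no longer corresponds to the parent's prev, and this model then forks a
second anon_vma rather than guessing.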