From: Konstantin Khlebnikov <koct9i@gmail.com>
To: "lixinhai.lxh@gmail.com" <lixinhai.lxh@gmail.com>
Cc: khlebnikov <khlebnikov@yandex-team.ru>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	 "richardw.yang" <richardw.yang@linux.intel.com>,
	 "kirill.shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone
Date: Mon, 6 Jan 2020 23:20:59 +0300
Message-ID: <CALYGNiNzz+dxHX0g5-gNypUQc3B=8_Scp53-NTOh=zWsdUuHAw@mail.gmail.com>
In-Reply-To: <2020010621281286862913@gmail.com>

On Mon, Jan 6, 2020 at 4:28 PM lixinhai.lxh@gmail.com
<lixinhai.lxh@gmail.com> wrote:
>
> On 2020-01-06 at 18:43 Konstantin Khlebnikov wrote:
> >On 06/01/2020 09.37, Li Xinhai wrote:
> >> For fork case, the dst->vm_prev is always same as src->vm_prev when
> >> anon_vma_clone() is called. Removing the assignment from
> >> dst->vm_prev->anon_vma to dst->anon_vma, and explicitly assign from
> >> anon_vma which is shared by its parent vmas.
> >
> >This doesn't sound right.
> >
> >I see dst->vm_prev is set after anon_vma_fork(), so here it still points to the parent's prev.
> >So this doesn't work the way it is supposed to.
> >
> >I expect this logic: If parent SRC1 SRC2 .. SRCn share ANON0
> >then in child related DST1 DST2 .. DSTn should fork and share ANON1:
> >Forking DST1 creates new ANON1 and then DST2 and following share it.
>
> This logic was not fully clarified in
> https://lore.kernel.org/linux-mm/20191011072256.16275-2-richardw.yang@linux.intel.com/
> I've assumed that sharing the parent vma's anon_vma with the child vma was the
> purpose of that patch, and that it intentionally wants the first child vma to have
> its own new anon_vma (not shared, unlike the other child vmas).

Well, this more or less follows from the original design.
A page's anon-vma, together with its page offset, limits the set of vmas
that rmap has to scan: it skips vmas where the page certainly cannot be mapped.

If vmas in one process share an anon-vma, they most likely have
non-overlapping offsets, so there is no reason to fork a private
anon-vma for each of them when the process forks.
But it is good to fork one new anon-vma for all of them together: then
rmap can skip scanning the parent's vmas for pages allocated or COWed in
the child process. Together they act like one big vma.
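
To make the expected sharing concrete, here is a rough userspace toy model
of it. This is only a sketch and not kernel code: struct toy_vma,
toy_fork() and the integer anon ids are made-up names for illustration,
and skipped (VM_DONTCOPY-like) vmas are handled by comparing against the
last vma that was actually copied, which is an assumption of this sketch
rather than something the kernel is known to do.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these are not the kernel's types. */
struct toy_vma {
	int anon_id;   /* 0 = no anon-vma, otherwise an id standing in for one */
	int dontcopy;  /* models VM_DONTCOPY: the vma is skipped on fork */
};

/*
 * Copy n parent vmas into child[] and return how many were copied.
 * *next_id is the next unused anon id; it is advanced once for every
 * new anon-vma the child gets.
 */
static size_t toy_fork(const struct toy_vma *parent, size_t n,
		       struct toy_vma *child, int *next_id)
{
	size_t copied = 0;
	int last_parent_anon = 0; /* anon id of the last parent vma copied */

	for (size_t i = 0; i < n; i++) {
		if (parent[i].dontcopy)
			continue;

		child[copied] = parent[i];
		if (parent[i].anon_id != 0) {
			if (copied > 0 && parent[i].anon_id == last_parent_anon)
				/* Parent vmas shared: reuse the child's prev anon-vma. */
				child[copied].anon_id = child[copied - 1].anon_id;
			else
				/* First vma of a sharing group: fork a new anon-vma. */
				child[copied].anon_id = (*next_id)++;
		}
		last_parent_anon = parent[i].anon_id;
		copied++;
	}
	return copied;
}
```

With parent anon ids {7, 7, 9} and *next_id starting at 100, the first two
child vmas end up sharing one new id and the third gets its own, so the
child never reuses the parent's ids.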

>
> >
> >Also this assumption is wrong:
> > > Parent has vm_prev, which implies we have vm_prev.
> >If the parent's prev VMA has VM_DONTCOPY, then the child's prev VMA will
> >not match pprev, or could even be NULL if it was the first in the mm.
> >
> >See patch:
> >https://lore.kernel.org/lkml/157830736034.8148.7070851958306750616.stgit@buzz/T/#u
> >
> >I've tested it using this:
> >
> >--- a/fs/proc/task_mmu.c
> >+++ b/fs/proc/task_mmu.c
> >@@ -847,6 +847,12 @@ static int show_smap(struct seq_file *m, void *v)
> >                 seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> >         show_smap_vma_flags(m, vma);
> >
> >+       if (vma->anon_vma)
> >+               seq_printf(m, "AnonVMA: %p %p %d\n",
> >+                          vma->anon_vma,
> >+                          vma->anon_vma->parent,
> >+                          vma->anon_vma->degree);
> >+
> >         m_cache_vma(m, vma);
> >
> >         return 0;
> >
> >---
> >
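> >#include <sys/mman.h>
> >#include <sys/wait.h>
> >#include <stdlib.h>
> >#include <unistd.h>
> >#include <string.h>
> >#include <stdio.h>
> >
> >int main(int argc, char **argv) {
> >       char *ptr;
> >       char buf[100];
> >
> >       /* Map three anonymous pages and touch them so an anon_vma is allocated. */
> >       ptr = mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> >       memset(ptr, 0, 0x3000);
> >       /* Changing the middle page splits the mapping into three vmas sharing one anon_vma. */
> >       mprotect(ptr + 0x1000, 0x1000, PROT_READ);
> >
> >       /* Dump the parent's smaps before forking. */
> >       sprintf(buf, "cat /proc/%d/smaps", getpid());
> >       system(buf);
> >
> >       if (fork()) {
> >               wait(NULL);
> >       } else {
> >               /* Dump the child's smaps for comparison with the parent's. */
> >               printf("\n\n\n");
> >               fflush(stdout);
> >               sprintf(buf, "cat /proc/%d/smaps", getpid());
> >               system(buf);
> >       }
> >       return 0;
> >}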
> >
> >---
> >
> >>
> >> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> >> Cc: Wei Yang <richardw.yang@linux.intel.com>
> >> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> >> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >> ---
> >>   mm/rmap.c | 7 +++----
> >>   1 file changed, 3 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index b3e3819..3c912a6c 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -269,10 +269,10 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >>   {
> >>   struct anon_vma_chain *avc, *pavc;
> >>   struct anon_vma *root = NULL;
> >> -    struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev;
> >> +    struct vm_area_struct *pprev = src->vm_prev;
> >>
> >>   /*
> >> -    * If parent share anon_vma with its vm_prev, keep this sharing in in
> >> +    * If parent share anon_vma with its vm_prev, keep this sharing in
> >>   * child.
> >>   *
> >>   * 1. Parent has vm_prev, which implies we have vm_prev.
> >> @@ -280,8 +280,7 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> >>   */
> >>   if (!dst->anon_vma && src->anon_vma &&
> >>       pprev && pprev->anon_vma == src->anon_vma)
> >> -    dst->anon_vma = prev->anon_vma;
> >> -
> >> +    dst->anon_vma = pprev->anon_vma;
> >>
> >>   list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
> >>   struct anon_vma *anon_vma;
> >>


Thread overview: 4+ messages
2020-01-06  6:37 [PATCH] mm/rmap.c: remove useless checking to child vma->vm_prev in anon_vma_clone Li Xinhai
2020-01-06 10:43 ` Konstantin Khlebnikov
2020-01-06 13:28   ` lixinhai.lxh
2020-01-06 20:20     ` Konstantin Khlebnikov [this message]
