From: "lixinhai.lxh@gmail.com" <lixinhai.lxh@gmail.com>
To: khlebnikov <khlebnikov@yandex-team.ru>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
akpm <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
richardw.yang <richardw.yang@linux.intel.com>
Cc: "Rik van Riel" <riel@redhat.com>,
kirill.shutemov <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH v2 1/2] mm/rmap: fix and simplify reusing mergeable anon_vma as parent when fork
Date: Tue, 7 Jan 2020 21:35:09 +0800
Message-ID: <2020010721350631540064@gmail.com>
In-Reply-To: 157839239609.694.10268055713935919822.stgit@buzz
On 2020-01-07 at 18:19 Konstantin Khlebnikov wrote:
>This fixes some misconceptions in commit 4e4a9eb92133 ("mm/rmap.c: reuse
>mergeable anon_vma as parent when fork"). That commit merges anon-vmas in
>an unexpected way but fortunately still produces a valid anon-vma tree, so
>nothing crashes.
>
>If in the parent the VMAs SRC1 SRC2 .. SRCn share anon-vma ANON0, then
>after fork, before all patches, the related VMAs in the child process
>DST1 DST2 .. DSTn fork independent anon-vmas ANON1 ANON2 .. ANONn (each
>is a child of ANON0). Before this patch, only DST1 forks a new ANON1 and
>the following DST2 .. DSTn share the parent's ANON0 (i.e. the anon-vma
>tree is valid but isn't optimal). With this patch, DST1 creates a new
>ANON1 and DST2 .. DSTn share it.
>
>The root problem is caused by the initialization order in dup_mmap():
>vma->vm_prev is set after calling anon_vma_fork(). Thus, inside
>anon_vma_fork(), it still points to the previous VMA in the parent mm.
>
>The second problem is hidden behind the first one: the assumption "Parent
>has vm_prev, which implies we have vm_prev" is wrong if the first VMA in
>the parent mm has the VM_DONTCOPY flag set. Luckily, prev->anon_vma doesn't
>dereference a NULL pointer because in the current code 'prev' is actually
>the same as 'pprev'.
>
>The third hidden problem is the linking between a VMA and the anon-vmas
>whose pages it could contain. The loop in anon_vma_clone() attaches only
>the parent's anon-vmas; the shared anon-vma isn't attached. But every
>mapped page stays reachable in rmap because we erroneously share the
>anon-vma from the parent's previous VMA.
>
>This patch moves the sharing logic out of anon_vma_clone() into the more
>specific anon_vma_fork(), because it is supposed to work only at fork(),
>and simply reuses the anon_vma from the previous VMA if it is forked from
>the same anon-vma.
>
>Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>Reported-by: Li Xinhai <lixinhai.lxh@gmail.com>
>Fixes: 4e4a9eb92133 ("mm/rmap.c: reuse mergeable anon_vma as parent when fork")
>Link: https://lore.kernel.org/linux-mm/CALYGNiNzz+dxHX0g5-gNypUQc3B=8_Scp53-NTOh=zWsdUuHAw@mail.gmail.com/T/#t
The call to anon_vma_clone() is nice. It looks fine to me.

Reviewed-by: Li Xinhai <lixinhai.lxh@gmail.com>

Should I also add Suggested-by: Li Xinhai <lixinhai.lxh@gmail.com>?
>---
> include/linux/rmap.h | 3 ++-
> kernel/fork.c | 2 +-
> mm/rmap.c | 23 +++++++++--------------
> 3 files changed, 12 insertions(+), 16 deletions(-)
>
>diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>index 988d176472df..560e4480dcd0 100644
>--- a/include/linux/rmap.h
>+++ b/include/linux/rmap.h
>@@ -143,7 +143,8 @@ void anon_vma_init(void); /* create anon_vma_cachep */
> int __anon_vma_prepare(struct vm_area_struct *);
> void unlink_anon_vmas(struct vm_area_struct *);
> int anon_vma_clone(struct vm_area_struct *, struct vm_area_struct *);
>-int anon_vma_fork(struct vm_area_struct *, struct vm_area_struct *);
>+int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma,
>+ struct vm_area_struct *prev);
>
> static inline int anon_vma_prepare(struct vm_area_struct *vma)
> {
>diff --git a/kernel/fork.c b/kernel/fork.c
>index 2508a4f238a3..c33626993831 100644
>--- a/kernel/fork.c
>+++ b/kernel/fork.c
>@@ -556,7 +556,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
> tmp->anon_vma = NULL;
> if (anon_vma_prepare(tmp))
> goto fail_nomem_anon_vma_fork;
>- } else if (anon_vma_fork(tmp, mpnt))
>+ } else if (anon_vma_fork(tmp, mpnt, prev))
> goto fail_nomem_anon_vma_fork;
> tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT);
> tmp->vm_next = tmp->vm_prev = NULL;
>diff --git a/mm/rmap.c b/mm/rmap.c
>index b3e381919835..3c1e04389291 100644
>--- a/mm/rmap.c
>+++ b/mm/rmap.c
>@@ -269,19 +269,6 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> {
> struct anon_vma_chain *avc, *pavc;
> struct anon_vma *root = NULL;
>- struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev;
>-
>- /*
>- * If parent share anon_vma with its vm_prev, keep this sharing in in
>- * child.
>- *
>- * 1. Parent has vm_prev, which implies we have vm_prev.
>- * 2. Parent and its vm_prev have the same anon_vma.
>- */
>- if (!dst->anon_vma && src->anon_vma &&
>- pprev && pprev->anon_vma == src->anon_vma)
>- dst->anon_vma = prev->anon_vma;
>-
>
> list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
> struct anon_vma *anon_vma;
>@@ -332,7 +319,8 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
> * the corresponding VMA in the parent process is attached to.
> * Returns 0 on success, non-zero on failure.
> */
>-int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
>+int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma,
>+ struct vm_area_struct *prev)
> {
> struct anon_vma_chain *avc;
> struct anon_vma *anon_vma;
>@@ -342,6 +330,13 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
> if (!pvma->anon_vma)
> return 0;
>
>+ /* Share anon_vma with previous VMA if it has the same parent. */
>+ if (prev && prev->anon_vma &&
>+ prev->anon_vma->parent == pvma->anon_vma) {
>+ vma->anon_vma = prev->anon_vma;
>+ return anon_vma_clone(vma, prev);
>+ }
>+
> /* Drop inherited anon_vma, we'll reuse existing or allocate new. */
> vma->anon_vma = NULL;
>
>
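To illustrate the sharing check that the hunk above adds to anon_vma_fork(),
here is a minimal user-space sketch, not kernel code. The toy_* names, the
simplified struct vma, and the skip[] array (standing in for VM_DONTCOPY) are
assumptions made for illustration only; locking, reference counting, and the
anon_vma_chain linking done by anon_vma_clone() are deliberately left out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures: only the fields the check reads. */
struct anon_vma {
	struct anon_vma *parent;
};

struct vma {
	struct anon_vma *anon_vma;
};

/*
 * Model of the decision in the patched anon_vma_fork(): reuse the anon_vma
 * of the previous child VMA when it was forked from the parent VMA's
 * anon_vma, otherwise allocate a fresh child anon_vma.
 * Returns 0 on success, -1 on allocation failure.
 * The toy never frees what it allocates.
 */
static int toy_anon_vma_fork(struct vma *vma, const struct vma *pvma,
			     const struct vma *prev)
{
	struct anon_vma *av;

	vma->anon_vma = NULL;

	/* Nothing to inherit if the parent VMA has no anon_vma. */
	if (!pvma->anon_vma)
		return 0;

	/* Share anon_vma with the previous VMA if it has the same parent. */
	if (prev && prev->anon_vma &&
	    prev->anon_vma->parent == pvma->anon_vma) {
		vma->anon_vma = prev->anon_vma;
		return 0;
	}

	av = malloc(sizeof(*av));
	if (!av)
		return -1;
	av->parent = pvma->anon_vma;
	vma->anon_vma = av;
	return 0;
}

/*
 * Walk n parent VMAs in order, as dup_mmap() does, and fork each one.
 * skip[i] != 0 models a VM_DONTCOPY VMA that is not copied to the child.
 * Returns the number of newly allocated child anon_vmas, or -1 on failure.
 */
static int toy_fork_all(const struct vma *src, const int *skip,
			struct vma *dst, int n)
{
	const struct vma *prev = NULL;	/* previous VMA in the child */
	int created = 0;
	int i;

	for (i = 0; i < n; i++) {
		dst[i].anon_vma = NULL;
		if (skip[i])
			continue;
		if (toy_anon_vma_fork(&dst[i], &src[i], prev))
			return -1;
		/* A new anon_vma was allocated unless prev's was reused. */
		if (dst[i].anon_vma &&
		    (!prev || prev->anon_vma != dst[i].anon_vma))
			created++;
		prev = &dst[i];
	}
	return created;
}
```

With three parent VMAs sharing one anon_vma and nothing skipped,
toy_fork_all() reports a single new anon_vma shared by all three child VMAs.
If the first VMA is skipped, prev is NULL for the second one, which then
allocates a new anon_vma rather than relying on a previous VMA that does not
exist in the child.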