All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Wei Yang <richardw.yang@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>,
	Li Xinhai <lixinhai.lxh@gmail.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH v2 1/2] mm/rmap: fix and simplify reusing mergeable anon_vma as parent when fork
Date: Tue, 07 Jan 2020 13:19:56 +0300	[thread overview]
Message-ID: <157839239609.694.10268055713935919822.stgit@buzz> (raw)

This fixes some misconceptions in commit 4e4a9eb92133 ("mm/rmap.c: reuse
mergeable anon_vma as parent when fork"). It merges anon-vma in unexpected
way but fortunately still produces valid anon-vma tree, so nothing crashes.

If in parent VMAs: SRC1 SRC2 .. SRCn share anon-vma ANON0, then after fork
before all patches in child process related VMAs: DST1 DST2 .. DSTn will
fork indepndent anon-vmas: ANON1 ANON2 .. ANONn (each is child of ANON0).
Before this patch only DST1 will fork new ANON1 and following DST2 .. DSTn
will share parent's ANON0 (i.e. anon-vma tree is valid but isn't optimal).
With this patch DST1 will create new ANON1 and DST2 .. DSTn will share it.

Root problem caused by initialization order in dup_mmap(): vma->vm_prev
is set after calling anon_vma_fork(). Thus in anon_vma_fork() it points to
previous VMA in parent mm.

Second problem is hidden behind first one: assumption "Parent has vm_prev,
which implies we have vm_prev" is wrong if first VMA in parent mm has set
flag VM_DONTCOPY. Luckily prev->anon_vma doesn't dereference NULL pointer
because in current code 'prev' actually is same as 'pprev'.

Third hidden problem is linking between VMA and anon-vmas whose pages it
could contain. Loop in anon_vma_clone() attaches only parent's anon-vmas,
shared anon-vma isn't attached. But every mapped page stays reachable in
rmap because we erroneously share anon-vma from parent's previous VMA.

This patch moves sharing logic out of anon_vma_clone() into more specific
anon_vma_fork() because this supposed to work only at fork() and simply
reuses anon_vma from previous VMA if it is forked from the same anon-vma.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reported-by: Li Xinhai <lixinhai.lxh@gmail.com>
Fixes: 4e4a9eb92133 ("mm/rmap.c: reuse mergeable anon_vma as parent when fork")
Link: https://lore.kernel.org/linux-mm/CALYGNiNzz+dxHX0g5-gNypUQc3B=8_Scp53-NTOh=zWsdUuHAw@mail.gmail.com/T/#t
---
 include/linux/rmap.h |    3 ++-
 kernel/fork.c        |    2 +-
 mm/rmap.c            |   23 +++++++++--------------
 3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 988d176472df..560e4480dcd0 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -143,7 +143,8 @@ void anon_vma_init(void);	/* create anon_vma_cachep */
 int  __anon_vma_prepare(struct vm_area_struct *);
 void unlink_anon_vmas(struct vm_area_struct *);
 int anon_vma_clone(struct vm_area_struct *, struct vm_area_struct *);
-int anon_vma_fork(struct vm_area_struct *, struct vm_area_struct *);
+int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma,
+		  struct vm_area_struct *prev);
 
 static inline int anon_vma_prepare(struct vm_area_struct *vma)
 {
diff --git a/kernel/fork.c b/kernel/fork.c
index 2508a4f238a3..c33626993831 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -556,7 +556,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 			tmp->anon_vma = NULL;
 			if (anon_vma_prepare(tmp))
 				goto fail_nomem_anon_vma_fork;
-		} else if (anon_vma_fork(tmp, mpnt))
+		} else if (anon_vma_fork(tmp, mpnt, prev))
 			goto fail_nomem_anon_vma_fork;
 		tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT);
 		tmp->vm_next = tmp->vm_prev = NULL;
diff --git a/mm/rmap.c b/mm/rmap.c
index b3e381919835..3c1e04389291 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -269,19 +269,6 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
 {
 	struct anon_vma_chain *avc, *pavc;
 	struct anon_vma *root = NULL;
-	struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev;
-
-	/*
-	 * If parent share anon_vma with its vm_prev, keep this sharing in in
-	 * child.
-	 *
-	 * 1. Parent has vm_prev, which implies we have vm_prev.
-	 * 2. Parent and its vm_prev have the same anon_vma.
-	 */
-	if (!dst->anon_vma && src->anon_vma &&
-	    pprev && pprev->anon_vma == src->anon_vma)
-		dst->anon_vma = prev->anon_vma;
-
 
 	list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) {
 		struct anon_vma *anon_vma;
@@ -332,7 +319,8 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src)
  * the corresponding VMA in the parent process is attached to.
  * Returns 0 on success, non-zero on failure.
  */
-int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
+int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma,
+		  struct vm_area_struct *prev)
 {
 	struct anon_vma_chain *avc;
 	struct anon_vma *anon_vma;
@@ -342,6 +330,13 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 	if (!pvma->anon_vma)
 		return 0;
 
+	/* Share anon_vma with previous VMA if it has the same parent. */
+	if (prev && prev->anon_vma &&
+	    prev->anon_vma->parent == pvma->anon_vma) {
+		vma->anon_vma = prev->anon_vma;
+		return anon_vma_clone(vma, prev);
+	}
+
 	/* Drop inherited anon_vma, we'll reuse existing or allocate new. */
 	vma->anon_vma = NULL;
 


             reply	other threads:[~2020-01-07 10:20 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-07 10:19 Konstantin Khlebnikov [this message]
2020-01-07 10:19 ` [PATCH v2 2/2] kernel/fork: set VMA's mm/prev/next right after vm_area_dup in dup_mmap Konstantin Khlebnikov
2020-01-07 14:22   ` lixinhai.lxh
2020-01-07 13:35 ` [PATCH v2 1/2] mm/rmap: fix and simplify reusing mergeable anon_vma as parent when fork lixinhai.lxh
2020-01-08  2:32 ` Wei Yang
2020-01-08  6:42   ` lixinhai.lxh
2020-01-08 10:40   ` Konstantin Khlebnikov
2020-01-09  2:52     ` Wei Yang
2020-01-09  8:54       ` Konstantin Khlebnikov
2020-01-10  2:30         ` Wei Yang
2020-01-10  3:23           ` Li Xinhai
2020-01-10  5:34             ` Wei Yang
2020-01-10  8:11               ` Konstantin Khlebnikov
2020-01-11 22:38                 ` Wei Yang
2020-01-12  9:55                   ` Konstantin Khlebnikov
2020-01-13  0:33                     ` Wei Yang
2020-01-13 11:07                       ` Konstantin Khlebnikov
2020-01-14  2:09                         ` Wei Yang
2020-01-14 14:42                         ` Li Xinhai
2020-01-15  1:20                           ` Wei Yang
2020-01-18  8:04                             ` Konstantin Khlebnikov
2020-01-18 14:00                               ` Wei Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=157839239609.694.10268055713935919822.stgit@buzz \
    --to=khlebnikov@yandex-team.ru \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lixinhai.lxh@gmail.com \
    --cc=richardw.yang@linux.intel.com \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.