From: Michel Lespinasse <walken@google.com>
To: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hughd@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Tim Hartrick <tim@edgecast.com>, Michal Hocko <mhocko@suse.cz>
Subject: Re: [PATCH] Repeated fork() causes SLAB to grow without bound
Date: Wed, 19 Nov 2014 15:14:15 -0800
Message-ID: <CANN689G+y77m2_paF0vBpHG8EsJ2-pEnJvLJSGs-zHf+SqTEjQ@mail.gmail.com>
In-Reply-To: <CALYGNiO9_bAVVZ2GdFq=PO2yV3LPs2utsbcb2pFby7MypptLCw@mail.gmail.com>

On Wed, Nov 19, 2014 at 8:58 AM, Konstantin Khlebnikov <koct9i@gmail.com> wrote:
> On Wed, Nov 19, 2014 at 7:09 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
>> Also from reading http://lwn.net/Articles/383162/ I understand that correctness
>> also depends on the hierarchy and I wonder if there's a danger of reintroducing
>> a bug like the one described there.
>
> If I remember right that was fixed by linking non-exclusively mapped pages to
> root anon_vma instead of anon_vma from vma where fault has happened.
> After my patch this still works. Topology hierarchy actually isn't used.
> Here just one selected "root' anon_vma which dies last. That's all.

That's not how I remember it.

An anon_vma corresponds to a given vma V, and is used to track all
vmas (V and descendant vmas) that may include a page that was
originally mapped in V.

Each anon page has a link to the anon_vma corresponding to the vma
it was originally faulted in, and an offset indicating where the
page was located relative to that original vma.

The anon_vma has an interval tree of struct anon_vma_chain, and each
struct anon_vma_chain includes a link to a vma that is V or a
descendant of V. This allows rmap to quickly find all the vmas that
may map a given page (based on the page's anon_vma and offset).
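
To make the relationships above concrete, here is a minimal userspace
sketch. It is not the kernel's code: the field names (pgoff, npages,
tree_next, index) and the helpers anon_vma_link() and rmap_candidates()
are made up for illustration, and a plain linked list stands in for the
interval tree, so the lookup is linear rather than logarithmic.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace model; these are not the kernel's definitions. */
struct anon_vma;

struct vma {
	unsigned long pgoff;   /* offset of the first page, in pages */
	unsigned long npages;  /* length of the mapping, in pages */
};

struct anon_vma_chain {
	struct vma *vma;                  /* a vma that may map pages of anon_vma */
	struct anon_vma *anon_vma;        /* the anon_vma whose tree holds this link */
	struct anon_vma_chain *tree_next; /* stand-in for the interval tree node */
};

struct anon_vma {
	struct anon_vma_chain *tree;      /* list standing in for the interval tree */
};

struct page {
	struct anon_vma *anon_vma;        /* anon_vma of the vma the page was faulted in */
	unsigned long index;              /* page offset used for the lookup */
};

/* Add vma to av's "tree" through the caller-provided link avc. */
void anon_vma_link(struct anon_vma *av, struct anon_vma_chain *avc,
		   struct vma *vma)
{
	assert(av && avc && vma);
	avc->vma = vma;
	avc->anon_vma = av;
	avc->tree_next = av->tree;
	av->tree = avc;
}

/*
 * Find the vmas that may map the page. Stores at most max of them in out
 * and returns the total number of matches.
 */
size_t rmap_candidates(const struct page *page, struct vma **out, size_t max)
{
	size_t n = 0;

	for (struct anon_vma_chain *avc = page->anon_vma->tree; avc;
	     avc = avc->tree_next) {
		const struct vma *v = avc->vma;

		if (page->index >= v->pgoff &&
		    page->index - v->pgoff < v->npages) {
			if (n < max)
				out[n] = avc->vma;
			n++;
		}
	}
	return n;
}
```

The point of the sketch is only that the page names one anon_vma, and
that anon_vma alone determines which vmas the walk has to consider.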

When forking or splitting vmas, the new vma is a descendant of the
same vmas as the old one so it must be added to all the anon_vma
interval trees that were referencing the old one (that is, ancestors
of the new vma). To that end, all the struct anon_vma_chain pointing
to a given vma are kept on a linked list, and struct anon_vma_chain
includes a link to the anon_vma holding the interval tree.
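
The fork case can be sketched as follows, again as a simplified
userspace model rather than kernel code. The helper names (chain_link,
vma_new_anon_vma, vma_fork, chain_length) and the nr_links counter are
invented for this example; the counter stands in for the size of the
interval tree, and nothing is ever freed, to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified userspace model; names echo the kernel's but this is not kernel code. */
struct anon_vma {
	unsigned int nr_links;            /* stand-in for the size of the interval tree */
};

struct vma;

struct anon_vma_chain {
	struct vma *vma;
	struct anon_vma *anon_vma;
	struct anon_vma_chain *same_vma;  /* next link on the owning vma's list */
};

struct vma {
	struct anon_vma_chain *chain;     /* all links pointing at this vma */
	struct anon_vma *anon_vma;        /* anon_vma for pages faulted in this vma */
};

/* Add vma to av's tree and record the link on vma's own list. */
void chain_link(struct vma *vma, struct anon_vma *av)
{
	struct anon_vma_chain *avc = malloc(sizeof(*avc));

	assert(avc);
	avc->vma = vma;
	avc->anon_vma = av;
	avc->same_vma = vma->chain;
	vma->chain = avc;
	av->nr_links++;
}

/* Give a vma its own anon_vma and link the vma into it. */
void vma_new_anon_vma(struct vma *vma)
{
	struct anon_vma *av = calloc(1, sizeof(*av));

	assert(av);
	vma->anon_vma = av;
	chain_link(vma, av);
}

/* Fork: the child joins every tree the parent is in, then gets its own anon_vma. */
void vma_fork(struct vma *child, const struct vma *parent)
{
	child->chain = NULL;
	for (const struct anon_vma_chain *avc = parent->chain; avc;
	     avc = avc->same_vma)
		chain_link(child, avc->anon_vma);
	vma_new_anon_vma(child);
}

/* Number of anon_vma trees this vma is linked into. */
unsigned int chain_length(const struct vma *vma)
{
	unsigned int n = 0;

	for (const struct anon_vma_chain *avc = vma->chain; avc;
	     avc = avc->same_vma)
		n++;
	return n;
}
```

In this model, each generation's vma carries one more link than its
parent's, which is the growth the subject line of this thread is about.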

Locking the entire structure is done with a single lock hosted in the
root anon_vma (that is, the anon_vma of a vma that was created by
mmap() and not by cloning or forking existing vmas).

Limiting the length of the ancestors linked list is correct, though it
has performance implications. In the extreme case, forcing all vmas to
be added to the root anon_vma's interval tree would be correct, though
it may reintroduce the performance problems that led to the
introduction of anon_vma.

The good thing about Konstantin's proposal is that it does not have
any magic constant like mine did. However, I think he is mistaken in
saying that hierarchy isn't used - an ancestor vma will always have
more descendants than its children, and the reason for the hierarchy
is to limit the number of vmas that rmap must explore.
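
A rough way to see the cost difference, as a sketch under simplifying
assumptions: fork relationships are given as a parent[] array (a
made-up representation, root marked with -1), vma splits are ignored,
and "cost" counts every vma on the tree without any interval
filtering. The function names are hypothetical.

```c
#include <stddef.h>

/* parent[i] is the vma that vma i was forked from; the root has parent -1. */
int is_descendant_or_self(const int *parent, int v, int ancestor)
{
	while (v != -1) {
		if (v == ancestor)
			return 1;
		v = parent[v];
	}
	return 0;
}

/*
 * vmas rmap must visit for a page first faulted in vma origin when each
 * vma has its own anon_vma: origin itself plus all of its descendants.
 */
size_t walk_cost_hierarchy(const int *parent, size_t n, int origin)
{
	size_t cost = 0;

	for (size_t v = 0; v < n; v++)
		cost += (size_t)is_descendant_or_self(parent, (int)v, origin);
	return cost;
}

/* With every vma on the root anon_vma's tree, each walk visits all n vmas. */
size_t walk_cost_root_only(size_t n)
{
	return n;
}
```

For a root with three children, one of which has a child of its own
(parent[] = {-1, 0, 0, 0, 1}), a page faulted in a leaf child costs one
visit with the hierarchy and five with a single root tree.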

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
