From: Matthew Wilcox <willy@infradead.org>
To: Daniel Colascione <dancol@google.com>
Cc: tytso@mit.edu, dave.hansen@intel.com, linux-mm@kvack.org,
	Tim Murray <timmurray@google.com>,
	Minchan Kim <minchan@kernel.org>
Subject: Re: Why do we let munmap fail?
Date: Mon, 21 May 2018 19:11:52 -0700
Message-ID: <20180522021152.GA18682@bombadil.infradead.org>
In-Reply-To: <CAKOZuev5kMc88VOvwELv4aAwKB0n2x+uiSK8-XcNHstABcc=7w@mail.gmail.com>

On Mon, May 21, 2018 at 06:41:12PM -0700, Daniel Colascione wrote:
> On Mon, May 21, 2018 at 6:19 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
> 
> > On Mon, May 21, 2018 at 05:38:06PM -0700, Daniel Colascione wrote:
> > >
> > > One approach to dealing with this badness, the one I proposed earlier, is
> > > to prevent that giant mmap from appearing in the first place (because we'd
> > > cap vsize). If that giant mmap never appears, you can't generate a huge VMA
> > > tree by splitting it.
> > >
> > > Maybe that's not a good approach. Maybe processes really need mappings that
> > > big. If they do, then maybe the right approach is to just make 8 billion
> > > VMAs not "DoS the system". What actually goes wrong if we just let the VMA
> > > tree grow that large? So what if VMA lookup ends up taking a while --- the
> > > process with the pathological allocation pattern is paying the cost, right?
> > >
> 
> > Fine.  Let's pick a more reasonable size --- say, 1GB.  That's still
> > 2**18 4k pages.  Someone who munmaps every other 4k page is going to
> > create 2**17 VMAs.  That's a lot of VMAs.  So now the question is do
> > we pre-reserve enough VMAs for this worst case scenario, for all
> > processes in the system?  Or do we fail or otherwise kill the process
> > that is clearly attempting a DoS attack on the system?
> 
> > If your goal is that munmap must ***never*** fail, then effectively
> > you have to reserve enough resources for 50% of all 4k pages in all
> > of the virtual address spaces in use by all of the processes in the
> > system.  That's a horrible waste of resources, just to guarantee that
> > munmap(2) must never fail.
> 
> To be clear, I'm not suggesting that we actually perform this
> preallocation. (Maybe in the distant future, with strict commit accounting,
> it'd be useful.) I'm just suggesting that we perform the accounting as if
> we did. But I think Matthew's convinced me that there's no vsize cap small
> enough to be safe and still large enough to be useful, so I'll retract the
> vsize cap idea.
> 
> > Personally, I think it's not worth it.
> 
> > Why is it so important to you that munmap(2) must not fail?  Is it not
> > enough to say that if you mmap(2) a region, if you munmap(2) that
> > exact same size region as you mmap(2)'ed, it must not fail?  That's a
> > much easier guarantee to make....
> 
> That'd be good too, but I don't see how this guarantee would be easier to
> make. If you call mmap three times, those three allocations might end up
> merged into the same VMA, and if you called munmap on the middle
> allocation, you'd still have to split. Am I misunderstanding something?
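
The merge-then-split case described above, and the every-other-page count from earlier in the thread, can be sketched with a toy user-space model. This is only an illustration under loud assumptions: `ToyAddressSpace` and its methods are hypothetical names, addresses are counted in pages, and any two adjacent ranges are assumed to merge (the real kernel also requires compatible flags, backing file, and so on).

```python
class ToyAddressSpace:
    """Toy model: each VMA is a half-open page range (start, end).

    Assumption: adjacent ranges always merge. This is not kernel code.
    """

    def __init__(self):
        self.vmas = []  # sorted, non-overlapping list of (start, end)

    def mmap(self, start, length):
        # Add the new range, then coalesce anything adjacent or overlapping.
        self.vmas.append((start, start + length))
        self.vmas.sort()
        merged = []
        for s, e in self.vmas:
            if merged and merged[-1][1] >= s:
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((s, e))
        self.vmas = merged

    def munmap(self, start, length):
        # Remove [start, end); a hole in the middle of a VMA leaves two pieces.
        end = start + length
        result = []
        for s, e in self.vmas:
            if e <= start or s >= end:
                result.append((s, e))      # untouched
                continue
            if s < start:
                result.append((s, start))  # left remainder
            if e > end:
                result.append((end, e))    # right remainder
        self.vmas = result


# Three adjacent mmaps collapse into one VMA in this model ...
space = ToyAddressSpace()
for base in (0, 4, 8):
    space.mmap(base, 4)
print(space.vmas)

# ... so unmapping exactly the middle allocation still forces a split.
space.munmap(4, 4)
print(space.vmas)

# Scaled-down version of the worst case: unmap every other page.
pages = 1024  # stand-in for the 2**18 pages of a 1GB mapping
worst = ToyAddressSpace()
worst.mmap(0, pages)
for page in range(1, pages, 2):
    worst.munmap(page, 1)
print(len(worst.vmas))  # pages // 2 VMAs remain
```

The last loop leaves one single-page VMA per even page, so the count is half the number of pages; with 2**18 pages that is the 2**17 figure quoted above.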

What I think Ted's proposing (and I was too) is that we either preallocate
or make a note of how many VMAs we've merged.  So you can unmap as many
times as you've mapped without risking failure.  If you start unmapping
in the middle, then you might see munmap failures, but if you only unmap
things that you already mapped, we can guarantee that munmap won't fail.
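
One way to read the "make a note of how many VMAs we've merged" idea is as a per-process credit counter. The sketch below is an interpretation, not anything from the kernel: `SplitBudget`, `on_mmap`, and `on_munmap` are hypothetical names, and the caller is assumed to already know whether an operation merged or would split.

```python
class SplitBudget:
    """Toy accounting: every merge banks one VMA that a later split can use.

    Assumption: a merge frees (or avoids allocating) exactly one VMA
    structure, and a split in the middle of a VMA needs exactly one.
    """

    def __init__(self):
        self.reserved = 0  # VMA structures notionally set aside by merges

    def on_mmap(self, merged_with_neighbour):
        # A mapping that merged into an existing VMA banks one credit.
        if merged_with_neighbour:
            self.reserved += 1

    def on_munmap(self, causes_split):
        """Return True if this unmap is guaranteed to succeed in the model."""
        if not causes_split:
            return True           # trimming or removing a whole VMA needs nothing
        if self.reserved > 0:
            self.reserved -= 1    # spend a credit banked by an earlier merge
            return True
        return False              # needs a fresh allocation, which may fail


budget = SplitBudget()
budget.on_mmap(merged_with_neighbour=False)  # first mapping: new VMA
budget.on_mmap(merged_with_neighbour=True)   # second mapping merged
budget.on_mmap(merged_with_neighbour=True)   # third mapping merged

# Unmapping the middle allocation splits the merged VMA but is covered.
print(budget.on_munmap(causes_split=True))

# Punching holes beyond what was banked is no longer guaranteed.
print(budget.on_munmap(causes_split=True))
print(budget.on_munmap(causes_split=True))
```

In this model, unmapping only what was previously mapped never runs out of credit, while carving holes inside a single mapping falls back to an allocation that may fail.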
