All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Goldblatt <davidtgoldblatt@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@kernel.org>,
	linux-api@vger.kernel.org, Khalid Aziz <khalid.aziz@oracle.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Russell King - ARM Linux <linux@armlinux.org.uk>,
	Andrea Arcangeli <aarcange@redhat.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	linux-arch@vger.kernel.org, Florian Weimer <fweimer@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Matthew Wilcox <willy@infradead.org>,
	Abdul Haleem <abdhalee@linux.vnet.ibm.com>,
	Joel Stanley <joel@jms.id.au>, Kees Cook <keescook@chromium.org>,
	Michal Hocko <mhocko@suse.com>,
	trasz@freebsd.org, Jason Evans <jasone@canonware.com>
Subject: Re: [PATCH v2 0/2] mm: introduce MAP_FIXED_SAFE
Date: Wed, 13 Dec 2017 17:35:48 -0800	[thread overview]
Message-ID: <CAHD6eXccNCJDMPEhrYkYUO-+GH3GLseoYSVE375OcKMNYJKm-A@mail.gmail.com> (raw)
In-Reply-To: <20171213163210.6a16ccf8753b74a6982ef5b6@linux-foundation.org>

[-- Attachment #1: Type: text/plain, Size: 6278 bytes --]

(+cc the jemalloc jasone; -cc,+bcc the Google jasone).

The only time we would want MAP_FIXED (or rather, a non-broken variant) is
in the middle of trying to expand an allocation in place; "atomic address
range probing in the multithreaded programs" describes our use case pretty
well. That's in a pathway that usually fails; it's pretty far down on our
kernel mmap enhancements wish-list.

On Wed, Dec 13, 2017 at 4:32 PM, Andrew Morton <akpm@linux-foundation.org>
wrote:

> On Wed, 13 Dec 2017 10:25:48 +0100 Michal Hocko <mhocko@kernel.org> wrote:
>
> >
> > Hi,
> > I am resending with some minor updates based on Michael's review and
> > ask for inclusion. There haven't been any fundamental objections for
> > the RFC [1] nor the previous version [2].  The biggest discussion
> > revolved around the naming. There were many suggestions flowing
> > around MAP_REQUIRED, MAP_EXACT, MAP_FIXED_NOCLOBBER, MAP_AT_ADDR,
> > MAP_FIXED_NOREPLACE etc...
>
> I like MAP_FIXED_CAREFUL :)
>
> > I am afraid we can bikeshed this to death and there will still be
> > somebody finding yet another better name. Therefore I've decided to
> > stick with my original MAP_FIXED_SAFE. Why? Well, because it keeps the
> > MAP_FIXED prefix which should be recognized by developers and _SAFE
> > suffix should also be clear that all dangerous side effects of the old
> > MAP_FIXED are gone.
> >
> > If somebody _really_ hates this then feel free to nack and resubmit
> > with a different name you can find a consensus for. I am sorry to be
> > stubborn here but I would rather have this merged than go over few more
> > iterations changing the name just because it seems like a good idea
> > now. My experience tells me that chances are that the name will turn out
> > to be "suboptimal" anyway over time.
> >
> > Some more background:
> > This has started as a follow up discussion [3][4] resulting in the
> > runtime failure caused by hardening patch [5] which removes MAP_FIXED
> > from the elf loader because MAP_FIXED is inherently dangerous as it
> > might silently clobber an existing underlying mapping (e.g. stack). The
> > reason for the failure is that some architectures enforce an alignment
> > for the given address hint without MAP_FIXED used (e.g. for shared or
> > file backed mappings).
> >
> > One way around this would be excluding those archs which do alignment
> > tricks from the hardening [6]. The patch is really trivial but it has
> > been objected, rightfully so, that this screams for a more generic
> > solution. We basically want a non-destructive MAP_FIXED.
> >
> > The first patch introduced MAP_FIXED_SAFE which enforces the given
> > address but unlike MAP_FIXED it fails with EEXIST if the given range
> > conflicts with an existing one. The flag is introduced as a completely
> > new one rather than a MAP_FIXED extension because of the backward
> > compatibility. We really want a never-clobber semantic even on older
> > kernels which do not recognize the flag. Unfortunately mmap sucks wrt.
> > flags evaluation because we do not EINVAL on unknown flags. On those
> > kernels we would simply use the traditional hint based semantic so the
> > caller can still get a different address (which sucks) but at least not
> > silently corrupt an existing mapping. I do not see a good way around
> > that. Except we won't export expose the new semantic to the userspace at
> > all.
> >
> > It seems there are users who would like to have something like that.
> > Jemalloc has been mentioned by Michael Ellerman [7]
>
> http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au.
>
> It would be useful to get feedback from jemalloc developers (please).
> I'll add some cc's.
>
>
> > Florian Weimer has mentioned the following:
> > : glibc ld.so currently maps DSOs without hints.  This means that the
> kernel
> > : will map right next to each other, and the offsets between them a
> completely
> > : predictable.  We would like to change that and supply a random address
> in a
> > : window of the address space.  If there is a conflict, we do not want
> the
> > : kernel to pick a non-random address. Instead, we would try again with a
> > : random address.
> >
> > John Hubbard has mentioned CUDA example
> > : a) Searches /proc/<pid>/maps for a "suitable" region of available
> > : VA space.  "Suitable" generally means it has to have a base address
> > : within a certain limited range (a particular device model might
> > : have odd limitations, for example), it has to be large enough, and
> > : alignment has to be large enough (again, various devices may have
> > : constraints that lead us to do this).
> > :
> > : This is of course subject to races with other threads in the process.
> > :
> > : Let's say it finds a region starting at va.
> > :
> > : b) Next it does:
> > :     p = mmap(va, ...)
> > :
> > : *without* setting MAP_FIXED, of course (so va is just a hint), to
> > : attempt to safely reserve that region. If p != va, then in most cases,
> > : this is a failure (almost certainly due to another thread getting a
> > : mapping from that region before we did), and so this layer now has to
> > : call munmap(), before returning a "failure: retry" to upper layers.
> > :
> > :     IMPROVEMENT: --> if instead, we could call this:
> > :
> > :             p = mmap(va, ... MAP_FIXED_SAFE ...)
> > :
> > :         , then we could skip the munmap() call upon failure. This
> > :         is a small thing, but it is useful here. (Thanks to Piotr
> > :         Jaroszynski and Mark Hairgrove for helping me get that detail
> > :         exactly right, btw.)
> > :
> > : c) After that, CUDA suballocates from p, via:
> > :
> > :      q = mmap(sub_region_start, ... MAP_FIXED ...)
> > :
> > : Interestingly enough, "freeing" is also done via MAP_FIXED, and
> > : setting PROT_NONE to the subregion. Anyway, I just included (c) for
> > : general interest.
> >
> > Atomic address range probing in the multithreaded programs in general
> > sounds like an interesting thing to me.
> >
> > The second patch simply replaces MAP_FIXED use in elf loader by
> > MAP_FIXED_SAFE. I believe other places which rely on MAP_FIXED should
> > follow. Actually real MAP_FIXED usages should be docummented properly
> > and they should be more of an exception.
>
>

[-- Attachment #2: Type: text/html, Size: 7614 bytes --]

  reply	other threads:[~2017-12-14  1:35 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-13  9:25 [PATCH v2 0/2] mm: introduce MAP_FIXED_SAFE Michal Hocko
2017-12-13  9:25 ` Michal Hocko
2017-12-13  9:25 ` Michal Hocko
2017-12-13  9:25 ` [PATCH 1/2] " Michal Hocko
2017-12-13  9:25   ` Michal Hocko
2017-12-13  9:25   ` Michal Hocko
2017-12-13 12:50   ` Matthew Wilcox
2017-12-13 12:50     ` Matthew Wilcox
2017-12-13 12:50     ` Matthew Wilcox
2017-12-13 13:01     ` Michal Hocko
2017-12-13 13:01       ` Michal Hocko
2017-12-13  9:25 ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Michal Hocko
2017-12-13  9:25   ` Michal Hocko
2017-12-13  9:25   ` Michal Hocko
2017-12-16  0:49   ` [2/2] " Andrei Vagin
2017-12-18  9:13     ` Michal Hocko
2017-12-18  9:13       ` Michal Hocko
2017-12-18 18:12       ` Andrei Vagin
2017-12-18 18:12         ` Andrei Vagin
2017-12-18 18:49       ` [PATCH] mm: don't use the same value for MAP_FIXED_SAFE and MAP_SYNC Andrei Vagin
2017-12-18 18:49         ` Andrei Vagin
2017-12-18 20:41         ` Michal Hocko
2017-12-18 20:41           ` Michal Hocko
2018-04-18 10:51   ` [PATCH 2/2] fs, elf: drop MAP_FIXED usage from elf_map Tetsuo Handa
2018-04-18 10:51     ` Tetsuo Handa
2018-04-18 11:33     ` Michal Hocko
2018-04-18 11:43       ` Tetsuo Handa
2018-04-18 11:55         ` Michal Hocko
2018-04-18 14:07           ` [PATCH v2] fs, elf: don't complain MAP_FIXED_NOREPLACE unless -EEXIST error Tetsuo Handa
2018-04-19  5:57             ` Michal Hocko
2017-12-13  9:31 ` [PATCH 1/2] mmap.2: document new MAP_FIXED_SAFE flag Michal Hocko
2017-12-13  9:31   ` Michal Hocko
2017-12-13  9:31   ` Michal Hocko
2017-12-13  9:31   ` [PATCH 2/2] mmap.2: MAP_FIXED updated documentation Michal Hocko
2017-12-13  9:31     ` Michal Hocko
2017-12-13  9:31     ` Michal Hocko
2017-12-13 12:55     ` Pavel Machek
2017-12-13 13:03       ` Cyril Hrubis
2017-12-13 13:03         ` Cyril Hrubis
2017-12-13 13:03         ` Cyril Hrubis
2017-12-13 13:04       ` Michal Hocko
2017-12-13 13:04         ` Michal Hocko
2017-12-13 13:09         ` Pavel Machek
2017-12-13 13:16           ` Michal Hocko
2017-12-13 13:16             ` Michal Hocko
2017-12-13 13:21             ` Pavel Machek
2017-12-13 13:21               ` Pavel Machek
2017-12-13 13:35               ` Michal Hocko
2017-12-13 13:35                 ` Michal Hocko
2017-12-13 14:40               ` Cyril Hrubis
2017-12-13 14:40                 ` Cyril Hrubis
2017-12-13 14:40                 ` Cyril Hrubis
2017-12-13 23:19                 ` Kees Cook
2017-12-13 23:19                   ` Kees Cook
2017-12-14  7:07                   ` Michal Hocko
2017-12-14  7:07                     ` Michal Hocko
2017-12-14  7:07                     ` Michal Hocko
2017-12-18 19:12                   ` Michael Kerrisk (man-pages)
2017-12-18 19:12                     ` Michael Kerrisk (man-pages)
2017-12-18 20:19                     ` Kees Cook
2017-12-18 20:19                       ` Kees Cook
2017-12-18 20:33                       ` Matthew Wilcox
2017-12-18 20:33                         ` Matthew Wilcox
2017-12-21 12:38                       ` Michael Ellerman
2017-12-21 12:38                         ` Michael Ellerman
2017-12-21 12:38                         ` Michael Ellerman
2017-12-21 12:38                         ` Michael Ellerman
2017-12-21 14:59                         ` known bad patch in -mm tree was " Pavel Machek
2017-12-21 15:08                           ` Michal Hocko
2017-12-21 15:08                             ` Michal Hocko
2017-12-21 22:24                         ` Andrew Morton
2017-12-21 22:24                           ` Andrew Morton
2017-12-21 22:24                           ` Andrew Morton
2017-12-22  0:06                           ` Michael Ellerman
2017-12-22  0:06                             ` Michael Ellerman
2017-12-22  0:06                             ` Michael Ellerman
2017-12-14  2:52     ` Jann Horn
2017-12-14  2:52       ` Jann Horn
2017-12-14  5:28       ` John Hubbard
2017-12-14  5:28         ` John Hubbard
2017-12-14  5:28         ` John Hubbard
2017-12-14  5:28         ` John Hubbard
2017-12-14 23:06       ` John Hubbard
2017-12-14 23:06         ` John Hubbard
2017-12-14 23:06         ` John Hubbard
2017-12-14 23:10         ` Jann Horn
2017-12-14 23:10           ` Jann Horn
2017-12-14 23:10           ` Jann Horn
2017-12-13 12:25 ` [PATCH v2 0/2] mm: introduce MAP_FIXED_SAFE Matthew Wilcox
2017-12-13 12:25   ` Matthew Wilcox
2017-12-13 12:25   ` Matthew Wilcox
2017-12-13 12:34   ` Michal Hocko
2017-12-13 12:34     ` Michal Hocko
2017-12-13 17:13 ` Kees Cook
2017-12-13 17:13   ` Kees Cook
2017-12-13 17:13   ` Kees Cook
2017-12-15  9:02   ` Michael Ellerman
2017-12-15  9:02     ` Michael Ellerman
2017-12-14  0:32 ` Andrew Morton
2017-12-14  0:32   ` Andrew Morton
2017-12-14  0:32   ` Andrew Morton
2017-12-14  0:32   ` Andrew Morton
2017-12-14  1:35   ` David Goldblatt [this message]
2017-12-14  1:42     ` David Goldblatt
2017-12-14  1:42       ` David Goldblatt
2017-12-14 12:44   ` Edward Napierala
2017-12-14 13:15     ` Michal Hocko
2017-12-14 13:15       ` Michal Hocko
2017-12-14 14:54       ` Edward Napierala
2017-12-14 14:54         ` Edward Napierala
2017-12-19 12:40         ` David Laight
2017-12-19 12:40           ` David Laight
2017-12-19 12:46           ` Michal Hocko
2017-12-19 12:46             ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHD6eXccNCJDMPEhrYkYUO-+GH3GLseoYSVE375OcKMNYJKm-A@mail.gmail.com \
    --to=davidtgoldblatt@gmail.com \
    --cc=aarcange@redhat.com \
    --cc=abdhalee@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=fweimer@redhat.com \
    --cc=jasone@canonware.com \
    --cc=jhubbard@nvidia.com \
    --cc=joel@jms.id.au \
    --cc=keescook@chromium.org \
    --cc=khalid.aziz@oracle.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@armlinux.org.uk \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mpe@ellerman.id.au \
    --cc=trasz@freebsd.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.