All of lore.kernel.org
 help / color / mirror / Atom feed
From: Suren Baghdasaryan <surenb@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: "Michal Hocko" <mhocko@suse.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Pavel Machek" <pavel@ucw.cz>,
	"Rasmus Villemoes" <linux@rasmusvillemoes.dk>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Colin Cross" <ccross@google.com>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Kalesh Singh" <kaleshsingh@google.com>,
	"Peter Xu" <peterx@redhat.com>,
	rppt@kernel.org, "Peter Zijlstra" <peterz@infradead.org>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	vincenzo.frascino@arm.com,
	"Chinwen Chang (張錦文)" <chinwen.chang@mediatek.com>,
	"Axel Rasmussen" <axelrasmussen@google.com>,
	"Andrea Arcangeli" <aarcange@redhat.com>,
	"Jann Horn" <jannh@google.com>,
	apopple@nvidia.com, "Yu Zhao" <yuzhao@google.com>,
	"Will Deacon" <will@kernel.org>,
	fenghua.yu@intel.com, thunder.leizhen@huawei.com,
	"Hugh Dickins" <hughd@google.com>,
	feng.tang@intel.com, "Jason Gunthorpe" <jgg@ziepe.ca>,
	"Roman Gushchin" <guro@fb.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	krisman@collabora.com, "Chris Hyser" <chris.hyser@oracle.com>,
	"Peter Collingbourne" <pcc@google.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	legion@kernel.org, "Rolf Eike Beer" <eb@emlix.com>,
	"Cyrill Gorcunov" <gorcunov@gmail.com>,
	"Muchun Song" <songmuchun@bytedance.com>,
	"Viresh Kumar" <viresh.kumar@linaro.org>,
	"Thomas Cedeno" <thomascedeno@google.com>,
	sashal@kernel.org, cxfcosmos@gmail.com,
	LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm <linux-mm@kvack.org>,
	kernel-team <kernel-team@android.com>
Subject: Re: [PATCH v10 3/3] mm: add anonymous vma name refcounting
Date: Thu, 14 Oct 2021 13:16:47 -0700	[thread overview]
Message-ID: <CAJuCfpE2j91_AOwwRs_pYBs50wfLTwassRqgtqhXsh6fT+4MCg@mail.gmail.com> (raw)
In-Reply-To: <CAJuCfpGTCM_Rf3GEyzpR5UOTfgGKTY0_rvAbGdtjbyabFhrRAw@mail.gmail.com>

On Tue, Oct 12, 2021 at 10:01 AM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Tue, Oct 12, 2021 at 12:44 AM David Hildenbrand <david@redhat.com> wrote:
> >
> > > I'm still evaluating the proposal to use memfds but I'm not sure if
> > > the issue that David Hildenbrand mentioned about additional memory
> > > consumed in pagecache (which has to be addressed) is the only one we
> > > will encounter with this approach. If anyone knows of any potential
> > > issues with using memfds as named anonymous memory, I would really
> > > appreciate your feedback before I go too far in that direction.
> >
> > [MAP_PRIVATE memfd only behave that way with 4k, not with huge pages, so
> > I think it just has to be fixed. It doesn't make any sense to allocate a
> > page for the pagecache ("populate the file") when accessing via a
> > private mapping that's supposed to leave the file untouched]
> >
> > My gut feeling is if you really need a string as identifier, then try
> > going with memfds. Yes, we might hit some road blocks to be sorted out,
> > but it just logically makes sense to me: Files have names. These names
> > exist before mapping and after mapping. They "name" the content.
>
> I'm investigating this direction. I don't have much background with
> memfds, so I'll need to digest the code first.

I've done some investigation into the possibility of using memfds to
name anonymous VMAs. Here are my findings:

1. Forking a process with anonymous vmas named using memfd is 5-15%
slower than with prctl (depends on the number of VMAs in the process
being forked). Profiling shows that i_mmap_lock_write() dominates
dup_mmap(). Exit path is also slower by roughly 9% with
free_pgtables() and fput() dominating exit_mmap(). Fork performance is
important for Android because almost all processes are forked from
zygote, therefore this limitation already makes this approach
prohibitive.

2. mremap() usage to grow the mapping has an issue when used with memfds:

fd = memfd_create(name, MFD_ALLOW_SEALING);
ftruncate(fd, size_bytes);
ptr = mmap(NULL, size_bytes, prot, MAP_PRIVATE, fd, 0);
close(fd);
ptr = mremap(ptr, size_bytes, size_bytes * 2, MREMAP_MAYMOVE);
touch_mem(ptr, size_bytes * 2);

This would generate a SIGBUS in touch_mem(). I believe it's because
ftruncate() specified the size to be size_bytes and we are accessing
more than that after remapping. prctl() does not have this limitation
and we do have a usecase for growing a named VMA.

3. Leaves an fd exposed, even briefly, which may lead to unexpected
flaws (e.g. anything using mmap MAP_SHARED could allow exposures or
overwrites). Even MAP_PRIVATE, if an attacker writes into the file
after ftruncate() and before mmap(), can cause private memory to be
initialized with unexpected data.

4. There is a usecase in the Android userspace where vma naming
happens after memory was allocated. Bionic linker does in-memory
relocations and then names some relocated sections.

In the light of these findings, could the current patchset be reconsidered?
Thanks,
Suren.


>
> >
> > Maybe it's just me, but the whole interface, setting the name via a
> > prctl after the mapping was already instantiated doesn't really spark
> > joy at my end. That's not a strong pushback, but if we can avoid it
> > using something that's already there, that would be very much preferred.
>
> Actually that's one of my worries about using memfds. There might be
> cases when we need to name a vma after it was mapped. memfd_create()
> would not allow us to do that AFAIKT. But I need to check all usages
> to say if that's really an issue.
> Thanks!
>
> >
> > --
> > Thanks,
> >
> > David / dhildenb
> >
> > --
> > To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
> >

  reply	other threads:[~2021-10-14 20:17 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-01 20:56 [PATCH v10 1/3] mm: rearrange madvise code to allow for reuse Suren Baghdasaryan
2021-10-01 20:56 ` Suren Baghdasaryan
2021-10-01 20:56 ` [PATCH v10 2/3] mm: add a field to store names for private anonymous memory Suren Baghdasaryan
2021-10-01 20:56   ` Suren Baghdasaryan
2021-10-01 23:08   ` Andrew Morton
2021-10-02  0:52     ` Suren Baghdasaryan
2021-10-04 16:21       ` Suren Baghdasaryan
2021-10-07  2:39         ` Andrew Morton
2021-10-07  2:50           ` Suren Baghdasaryan
2021-10-01 23:11   ` kernel test robot
2021-10-02  1:59     ` Suren Baghdasaryan
2021-10-01 23:27   ` kernel test robot
2021-10-01 20:56 ` [PATCH v10 3/3] mm: add anonymous vma name refcounting Suren Baghdasaryan
2021-10-01 20:56   ` Suren Baghdasaryan
2021-10-05 18:42   ` Pavel Machek
2021-10-05 19:14     ` Suren Baghdasaryan
2021-10-05 19:21       ` Kees Cook
2021-10-05 20:04       ` Pavel Machek
2021-10-05 20:43         ` Suren Baghdasaryan
2021-10-06  6:57           ` John Hubbard
2021-10-06  8:27             ` Michal Hocko
2021-10-06  9:27               ` David Hildenbrand
2021-10-06 15:01                 ` Suren Baghdasaryan
2021-10-06 15:07                   ` David Hildenbrand
2021-10-06 15:20                     ` Suren Baghdasaryan
2021-10-07  2:29                       ` Andrew Morton
2021-10-07  2:46                         ` Suren Baghdasaryan
2021-10-07  2:53                           ` Andrew Morton
2021-10-07  3:01                             ` Suren Baghdasaryan
2021-10-07  7:27                               ` David Hildenbrand
2021-10-07  7:33                       ` David Hildenbrand
2021-10-07 15:42                         ` Suren Baghdasaryan
2021-10-06 17:58                   ` Pavel Machek
2021-10-06 18:18                     ` Suren Baghdasaryan
2021-10-07  8:10                       ` Michal Hocko
2021-10-07  8:41                         ` Pavel Machek
2021-10-07  8:47                         ` Rasmus Villemoes
2021-10-07 10:15                           ` Pavel Machek
2021-10-07 16:04                             ` Suren Baghdasaryan
2021-10-07 16:40                               ` Michal Hocko
2021-10-07 16:58                                 ` Suren Baghdasaryan
2021-10-07 17:31                                   ` Michal Hocko
2021-10-07 17:50                                     ` Suren Baghdasaryan
2021-10-07 18:12                                       ` Kees Cook
2021-10-07 18:50                                         ` Suren Baghdasaryan
2021-10-07 19:02                                           ` John Hubbard
2021-10-07 21:32                                             ` Suren Baghdasaryan
2021-10-08  1:04                                               ` Liam Howlett
2021-10-08  7:25                                             ` Rasmus Villemoes
2021-10-08  7:43                                               ` David Hildenbrand
2021-10-08 21:13                                                 ` Kees Cook
2021-10-08  6:34                                         ` Michal Hocko
2021-10-08 14:14                                           ` Dave Hansen
2021-10-08 14:57                                             ` Michal Hocko
2021-10-08 16:10                                               ` Suren Baghdasaryan
2021-10-08 20:58                                           ` Kees Cook
2021-10-11  8:36                                             ` Michal Hocko
2021-10-12  1:18                                               ` Suren Baghdasaryan
2021-10-12  1:20                                                 ` Suren Baghdasaryan
2021-10-12  3:00                                                   ` Johannes Weiner
2021-10-12  5:36                                                     ` Suren Baghdasaryan
2021-10-12 18:26                                                       ` Johannes Weiner
2021-10-12 18:52                                                         ` Suren Baghdasaryan
2021-10-12 20:41                                                           ` Johannes Weiner
2021-10-12 20:59                                                             ` Suren Baghdasaryan
2021-10-12  7:36                                                   ` Michal Hocko
2021-10-12 16:50                                                     ` Suren Baghdasaryan
2021-10-12  7:43                                                 ` David Hildenbrand
2021-10-12 17:01                                                   ` Suren Baghdasaryan
2021-10-14 20:16                                                     ` Suren Baghdasaryan [this message]
2021-10-15  8:03                                                       ` David Hildenbrand
2021-10-15 16:30                                                         ` Suren Baghdasaryan
2021-10-15 16:39                                                           ` David Hildenbrand
2021-10-15 18:33                                                             ` Suren Baghdasaryan
2021-10-15 17:45                                                           ` Kees Cook
2021-10-07  7:59                   ` Michal Hocko
2021-10-07 15:45                     ` Suren Baghdasaryan
2021-10-07 16:37                       ` Michal Hocko
2021-10-07 16:43                         ` Suren Baghdasaryan
2021-10-07 17:25                           ` Michal Hocko
2021-10-07 17:30                             ` Suren Baghdasaryan
2021-10-04  7:03 ` [PATCH v10 1/3] mm: rearrange madvise code to allow for reuse Rolf Eike Beer
2021-10-04  7:03   ` Rolf Eike Beer
2021-10-04 16:18   ` Suren Baghdasaryan
2021-10-05 21:00     ` Liam Howlett
2021-10-05 21:30       ` Suren Baghdasaryan
2021-10-06 17:33         ` Liam Howlett

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJuCfpE2j91_AOwwRs_pYBs50wfLTwassRqgtqhXsh6fT+4MCg@mail.gmail.com \
    --to=surenb@google.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=axboe@kernel.dk \
    --cc=axelrasmussen@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=ccross@google.com \
    --cc=chinwen.chang@mediatek.com \
    --cc=chris.hyser@oracle.com \
    --cc=corbet@lwn.net \
    --cc=cxfcosmos@gmail.com \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=eb@emlix.com \
    --cc=ebiederm@xmission.com \
    --cc=feng.tang@intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=gorcunov@gmail.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jannh@google.com \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=kaleshsingh@google.com \
    --cc=keescook@chromium.org \
    --cc=kernel-team@android.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=krisman@collabora.com \
    --cc=legion@kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@rasmusvillemoes.dk \
    --cc=mhocko@suse.com \
    --cc=pavel@ucw.cz \
    --cc=pcc@google.com \
    --cc=peterx@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rppt@kernel.org \
    --cc=sashal@kernel.org \
    --cc=songmuchun@bytedance.com \
    --cc=sumit.semwal@linaro.org \
    --cc=tglx@linutronix.de \
    --cc=thomascedeno@google.com \
    --cc=thunder.leizhen@huawei.com \
    --cc=vbabka@suse.cz \
    --cc=vincenzo.frascino@arm.com \
    --cc=viresh.kumar@linaro.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.