All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: David Herrmann <dh.herrmann@gmail.com>
Cc: Hugh Dickins <hughd@google.com>,
	Tony Battersby <tonyb@cybernetics.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Ryan Lortie <desrt@desrt.ca>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Greg Kroah-Hartman <greg@kroah.com>,
	John Stultz <john.stultz@linaro.org>,
	Kristian Hogsberg <krh@bitplanet.net>,
	Lennart Poettering <lennart@poettering.net>,
	Daniel Mack <zonque@gmail.com>, Kay Sievers <kay@vrfy.org>
Subject: Re: [PATCH v2 0/3] File Sealing & memfd_create()
Date: Mon, 19 May 2014 18:09:42 +0200	[thread overview]
Message-ID: <20140519160942.GD3427@quack.suse.cz> (raw)
In-Reply-To: <CANq1E4QgSbD9G70H7W4QeXbZ77_Kn1wV7edwzN4k4NjQJS=36A@mail.gmail.com>

On Mon 19-05-14 13:44:25, David Herrmann wrote:
> Hi
> 
> On Thu, May 15, 2014 at 12:35 AM, Hugh Dickins <hughd@google.com> wrote:
> > The aspect which really worries me is this: the maintenance burden.
> > This approach would add some peculiar new code, introducing a rare
> > special case: which we might get right today, but will very easily
> > forget tomorrow when making some other changes to mm.  If we compile
> > a list of danger areas in mm, this would surely belong on that list.
> 
> I tried doing the page-replacement in the last 4 days, but honestly,
> it's far more complex than I thought. So if no-one more experienced
> with mm/ comes up with a simple implementation, I'll have to delay
> this for some more weeks.
> 
> However, I still wonder why we try to fix this as part of this
> patchset. Using FUSE, a DIRECT-IO call can be delayed for an arbitrary
> amount of time. Same is true for network block-devices, NFS, iscsi,
> maybe loop-devices, ... This means, _any_ once mapped page can be
> written to after an arbitrary delay. This can break any feature that
> makes FS objects read-only (remounting read-only, setting S_IMMUTABLE,
> sealing, ..).
> 
> Shouldn't we try to fix the _cause_ of this?
> 
> Isn't there a simple way to lock/mark/.. affected vmas in
> get_user_pages(_fast)() and release them once done? We could increase
> i_mmap_writable on all affected address_space and decrease it on
> release. This would at least prevent sealing and could be check on
> other operations, too (like setting S_IMMUTABLE).
> This should be as easy as checking page_mapping(page) != NULL and then
> adjusting ->i_mmap_writable in
> get_writable_user_pages/put_writable_user_pages, right?
  Doing this would be quite a bit of work. Currently references returned by
get_user_pages() are page references like any other and thus are released
by put_page() or similar. Now you would make them special and they need
special releasing and there are lots of places in kernel where
get_user_pages() is used that would need changing.

Another aspect is that it could have performance implications - if there
are several processes using get_user_pages[_fast]() on a file, they would
start contending on modifying i_mmap_writeable.

One somewhat crazy idea I have is that maybe we could delay unmapping of a
page if this was last VMA referencing it until all extra page references of
pages in there are dropped. That would make i_mmap_writeable reliable for
you and it would also close those races with remount. Hugh, do you think
this might be viable?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

WARNING: multiple messages have this Message-ID (diff)
From: Jan Kara <jack@suse.cz>
To: David Herrmann <dh.herrmann@gmail.com>
Cc: Hugh Dickins <hughd@google.com>,
	Tony Battersby <tonyb@cybernetics.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Ryan Lortie <desrt@desrt.ca>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Greg Kroah-Hartman <greg@kroah.com>,
	John Stultz <john.stultz@linaro.org>,
	Kristian Hogsberg <krh@bitplanet.net>,
	Lennart Poettering <lennart@poettering.net>,
	Daniel Mack <zonque@gmail.com>, Kay Sievers <kay@vrfy.org>
Subject: Re: [PATCH v2 0/3] File Sealing & memfd_create()
Date: Mon, 19 May 2014 18:09:42 +0200	[thread overview]
Message-ID: <20140519160942.GD3427@quack.suse.cz> (raw)
In-Reply-To: <CANq1E4QgSbD9G70H7W4QeXbZ77_Kn1wV7edwzN4k4NjQJS=36A@mail.gmail.com>

On Mon 19-05-14 13:44:25, David Herrmann wrote:
> Hi
> 
> On Thu, May 15, 2014 at 12:35 AM, Hugh Dickins <hughd@google.com> wrote:
> > The aspect which really worries me is this: the maintenance burden.
> > This approach would add some peculiar new code, introducing a rare
> > special case: which we might get right today, but will very easily
> > forget tomorrow when making some other changes to mm.  If we compile
> > a list of danger areas in mm, this would surely belong on that list.
> 
> I tried doing the page-replacement in the last 4 days, but honestly,
> it's far more complex than I thought. So if no-one more experienced
> with mm/ comes up with a simple implementation, I'll have to delay
> this for some more weeks.
> 
> However, I still wonder why we try to fix this as part of this
> patchset. Using FUSE, a DIRECT-IO call can be delayed for an arbitrary
> amount of time. Same is true for network block-devices, NFS, iscsi,
> maybe loop-devices, ... This means, _any_ once mapped page can be
> written to after an arbitrary delay. This can break any feature that
> makes FS objects read-only (remounting read-only, setting S_IMMUTABLE,
> sealing, ..).
> 
> Shouldn't we try to fix the _cause_ of this?
> 
> Isn't there a simple way to lock/mark/.. affected vmas in
> get_user_pages(_fast)() and release them once done? We could increase
> i_mmap_writable on all affected address_space and decrease it on
> release. This would at least prevent sealing and could be check on
> other operations, too (like setting S_IMMUTABLE).
> This should be as easy as checking page_mapping(page) != NULL and then
> adjusting ->i_mmap_writable in
> get_writable_user_pages/put_writable_user_pages, right?
  Doing this would be quite a bit of work. Currently references returned by
get_user_pages() are page references like any other and thus are released
by put_page() or similar. Now you would make them special and they need
special releasing and there are lots of places in kernel where
get_user_pages() is used that would need changing.

Another aspect is that it could have performance implications - if there
are several processes using get_user_pages[_fast]() on a file, they would
start contending on modifying i_mmap_writeable.

One somewhat crazy idea I have is that maybe we could delay unmapping of a
page if this was last VMA referencing it until all extra page references of
pages in there are dropped. That would make i_mmap_writeable reliable for
you and it would also close those races with remount. Hugh, do you think
this might be viable?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-05-19 16:09 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-15 18:38 [PATCH v2 0/3] File Sealing & memfd_create() David Herrmann
2014-04-15 18:38 ` David Herrmann
2014-04-15 18:38 ` [PATCH v2 1/3] shm: add sealing API David Herrmann
2014-04-15 18:38   ` David Herrmann
2014-05-20  2:16   ` Hugh Dickins
2014-05-20  2:16     ` Hugh Dickins
2014-05-23 16:37     ` David Herrmann
2014-05-23 16:37       ` David Herrmann
2014-06-02 10:30       ` Hugh Dickins
2014-06-02 10:30         ` Hugh Dickins
2014-04-15 18:38 ` [PATCH v2 2/3] shm: add memfd_create() syscall David Herrmann
2014-04-15 18:38   ` David Herrmann
2014-05-20  2:20   ` Hugh Dickins
2014-05-20  2:20     ` Hugh Dickins
2014-05-23 16:57     ` David Herrmann
2014-05-23 16:57       ` David Herrmann
2014-06-02 10:59       ` Hugh Dickins
2014-06-02 10:59         ` Hugh Dickins
2014-06-02 17:50         ` Andy Lutomirski
2014-06-02 17:50           ` Andy Lutomirski
2014-06-13 10:42         ` David Herrmann
2014-06-13 10:42           ` David Herrmann
2014-05-21 10:50   ` Konstantin Khlebnikov
2014-05-21 10:50     ` Konstantin Khlebnikov
2014-04-15 18:38 ` [PATCH v2 3/3] selftests: add memfd_create() + sealing tests David Herrmann
2014-04-15 18:38   ` David Herrmann
2014-05-20  2:22   ` Hugh Dickins
2014-05-20  2:22     ` Hugh Dickins
2014-05-23 17:06     ` David Herrmann
2014-05-23 17:06       ` David Herrmann
2014-05-14  5:09 ` [PATCH v2 0/3] File Sealing & memfd_create() Hugh Dickins
2014-05-14  5:09   ` Hugh Dickins
2014-05-14 16:15   ` Tony Battersby
2014-05-14 16:15     ` Tony Battersby
2014-05-14 22:35     ` Hugh Dickins
2014-05-14 22:35       ` Hugh Dickins
2014-05-19 11:44       ` David Herrmann
2014-05-19 11:44         ` David Herrmann
2014-05-19 16:09         ` Jan Kara [this message]
2014-05-19 16:09           ` Jan Kara
2014-05-19 22:11           ` Hugh Dickins
2014-05-19 22:11             ` Hugh Dickins
2014-05-26 11:44             ` David Herrmann
2014-05-26 11:44               ` David Herrmann
2014-05-31  4:44               ` Hugh Dickins
2014-05-31  4:44                 ` Hugh Dickins
2014-06-02  4:42         ` Minchan Kim
2014-06-02  4:42           ` Minchan Kim
2014-06-02  9:14           ` Jan Kara
2014-06-02  9:14             ` Jan Kara
2014-06-02 16:04             ` David Herrmann
2014-06-02 16:04               ` David Herrmann
2014-06-03  8:31               ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140519160942.GD3427@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=desrt@desrt.ca \
    --cc=dh.herrmann@gmail.com \
    --cc=greg@kroah.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=john.stultz@linaro.org \
    --cc=kay@vrfy.org \
    --cc=krh@bitplanet.net \
    --cc=lennart@poettering.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mtk.manpages@gmail.com \
    --cc=tj@kernel.org \
    --cc=tonyb@cybernetics.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=zonque@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.