From: David Herrmann <dh.herrmann@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Michael Kerrisk <mtk.manpages@gmail.com>,
	Ryan Lortie <desrt@desrt.ca>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, Greg Kroah-Hartman <greg@kroah.com>,
	john.stultz@linaro.org,
	Lennart Poettering <lennart@poettering.net>,
	Daniel Mack <zonque@gmail.com>, Kay Sievers <kay@vrfy.org>,
	Hugh Dickins <hughd@google.com>,
	Tony Battersby <tonyb@cybernetics.com>,
	Andy Lutomirski <luto@amacapital.net>,
	David Herrmann <dh.herrmann@gmail.com>
Subject: [PATCH v3 0/7] File Sealing & memfd_create()
Date: Fri, 13 Jun 2014 12:36:52 +0200
Message-ID: <1402655819-14325-1-git-send-email-dh.herrmann@gmail.com>

Hi

This is v3 of the File-Sealing and memfd_create() patches. You can find v1 with
a longer introduction at gmane:
  http://thread.gmane.org/gmane.comp.video.dri.devel/102241
An LWN article about memfd+sealing is available, too:
  https://lwn.net/Articles/593918/
v2 with some more discussions can be found here:
  http://thread.gmane.org/gmane.linux.kernel.mm/115713

This series introduces two new APIs:
  memfd_create(): Think of this syscall as malloc(), except that it returns a
                  file-descriptor instead of a pointer. That file-descriptor is
                  backed by anonymous memory and can be memory-mapped for access.
  sealing: The sealing API can be used to prevent a specific set of operations
           on a file-descriptor. You 'seal' the file and thereby guarantee
           that it can no longer be modified in those specific ways.
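
To make the two APIs concrete, here is a minimal userspace sketch. It assumes
the interface as it was eventually merged upstream (MFD_ALLOW_SEALING,
F_ADD_SEALS and the F_SEAL_* flags), which may differ in detail from this v3
posting. make_sealed_buffer() is a hypothetical helper name, and the #define
fallbacks only matter for libc headers that predate the syscall.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for old libc headers; values match the merged Linux UAPI. */
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC       0x0001U
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS
#define F_ADD_SEALS   1033
#define F_GET_SEALS   1034
#define F_SEAL_SEAL   0x0001
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW   0x0004
#define F_SEAL_WRITE  0x0008
#endif

/*
 * Hypothetical helper: copy msg (including its NUL byte) into a new memfd
 * and seal it against resizing and writing. Returns the fd, or -1 on error.
 */
int make_sealed_buffer(const char *msg)
{
    size_t len;
    char *p;
    int fd;

    assert(msg != NULL);
    len = strlen(msg) + 1;

    /* Raw syscall, so this also builds against a libc without a wrapper. */
    fd = (int)syscall(SYS_memfd_create, "example",
                      MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;

    /* A new memfd is empty; give it a size before mapping it. */
    if (ftruncate(fd, (off_t)len) < 0)
        goto fail;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto fail;
    memcpy(p, msg, len);

    /* The writable mapping must be gone before F_SEAL_WRITE can be set. */
    munmap(p, len);

    if (fcntl(fd, F_ADD_SEALS,
              F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

A receiver of such an fd can query F_GET_SEALS and, if the expected seals are
set, rely on the size and contents staying fixed while it reads or maps the
file read-only.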

A short high-level introduction is also available here:
  http://dvdhrm.wordpress.com/2014/06/10/memfd_create2/


Changed in v3:
 - fcntl() now returns EINVAL if the FD does not support sealing. We used to
   return EBADF like pipe_fcntl() does, but that is really weird and I don't
   like repeating that.
 - seals are now saved as "unsigned int" instead of "u32".
 - i_mmap_writable is now an atomic so we can deny writable mappings just like
   i_writecount does.
 - SHMEM_ALLOW_SEALING is dropped. We initialize all objects with F_SEAL_SEAL
   and only unset it for memfds that shall support sealing.
 - memfd_create() no longer has a size argument. It was redundant; use
   ftruncate() or fallocate() instead.
 - memfd_create() flags are "unsigned int" now, instead of "u64".
 - NAME_MAX off-by-one fix
 - several cosmetic changes
 - Added AIO/Direct-IO page-pinning protection
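
Two of the user-visible changes in the list above can be sketched from
userspace: sizing a memfd with ftruncate() now that there is no size argument,
and the EINVAL returned by fcntl() on an FD that does not support sealing. The
sketch assumes the merged upstream interface (F_GET_SEALS and friends);
memfd_with_size(), fd_size() and get_seals_errno_on_pipe() are hypothetical
helper names.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for old libc headers; values match the merged Linux UAPI. */
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC 0x0001U
#endif
#ifndef F_GET_SEALS
#define F_GET_SEALS 1034
#endif

/* Hypothetical helper: create a memfd and size it in a second step. */
int memfd_with_size(const char *name, off_t size)
{
    int fd;

    assert(size >= 0);
    fd = (int)syscall(SYS_memfd_create, name, MFD_CLOEXEC);
    if (fd < 0)
        return -1;

    /* memfd_create() takes no size; the file starts out empty. */
    if (ftruncate(fd, size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: current file size as reported by fstat(), or -1. */
off_t fd_size(int fd)
{
    struct stat st;

    return fstat(fd, &st) < 0 ? (off_t)-1 : st.st_size;
}

/*
 * Hypothetical helper: ask for the seals of a pipe, which does not support
 * sealing. Returns the resulting errno, 0 if the call unexpectedly
 * succeeded, or -1 if the pipe could not be created.
 */
int get_seals_errno_on_pipe(void)
{
    int fds[2];
    int err = 0;

    if (pipe(fds) < 0)
        return -1;
    if (fcntl(fds[0], F_GET_SEALS) < 0)
        err = errno;
    close(fds[0]);
    close(fds[1]);
    return err;
}
```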

The last point (AIO/Direct-IO page-pinning protection) is the most important
change in this version: we now bail out if any page refcount is elevated while
setting F_SEAL_WRITE. This prevents parallel GUP (get_user_pages) users from
writing to sealed files _after_ they were sealed. There is also a new
FUSE-based test case to trigger such situations.
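
The pinned-page case needs AIO/Direct-IO or FUSE to reproduce, so the sketch
below shows only the closely related and much simpler case of shared writable
mappings, which comes from the i_mmap_writable change rather than from the
pinning protection. It assumes the merged upstream behaviour: F_SEAL_WRITE is
refused with EBUSY while such a mapping exists, and new ones are refused with
EPERM once the seal is set. Both function names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for old libc headers; values match the merged Linux UAPI. */
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC       0x0001U
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS
#define F_ADD_SEALS  1033
#define F_SEAL_WRITE 0x0008
#endif

/*
 * Try to set F_SEAL_WRITE while a shared writable mapping exists.
 * Returns the errno of the fcntl() call, 0 if the seal was accepted,
 * or -1 if the setup itself failed.
 */
int seal_write_while_mapped(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd, result = -1;
    void *p;

    assert(page > 0);
    fd = (int)syscall(SYS_memfd_create, "mapped",
                      MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)page) == 0) {
        p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            result = fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) < 0 ? errno : 0;
            munmap(p, (size_t)page);
        }
    }
    close(fd);
    return result;
}

/*
 * Set F_SEAL_WRITE first, then ask for a shared writable mapping.
 * Returns the errno of the mmap() call, 0 if the mapping was created,
 * or -1 if the setup itself failed.
 */
int map_writable_after_seal(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int fd, result = -1;
    void *p;

    assert(page > 0);
    fd = (int)syscall(SYS_memfd_create, "sealed",
                      MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)page) == 0 &&
        fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) == 0) {
        p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            result = errno;
        } else {
            munmap(p, (size_t)page);
            result = 0;
        }
    }
    close(fd);
    return result;
}
```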

The last 2 patches try to improve the page-pinning handling. I included both in
this series, but obviously only one of them is needed (or we could stack them):
 - 6/7: This waits for up to 150ms for pages to be unpinned
 - 7/7: This isolates pinned pages and replaces them with a fresh copy

Hugh, patch 6 is basically your code. In case that gets merged, can I put your
Signed-off-by on it?

I hope I didn't miss anything. Further comments welcome!

Thanks
David

David Herrmann (7):
  mm: allow drivers to prevent new writable mappings
  shm: add sealing API
  shm: add memfd_create() syscall
  selftests: add memfd_create() + sealing tests
  selftests: add memfd/sealing page-pinning tests
  shm: wait for pins to be released when sealing
  shm: isolate pinned pages when sealing files

 arch/x86/syscalls/syscall_32.tbl               |   1 +
 arch/x86/syscalls/syscall_64.tbl               |   1 +
 fs/fcntl.c                                     |   5 +
 fs/inode.c                                     |   1 +
 include/linux/fs.h                             |  29 +-
 include/linux/shmem_fs.h                       |  17 +
 include/linux/syscalls.h                       |   1 +
 include/uapi/linux/fcntl.h                     |  15 +
 include/uapi/linux/memfd.h                     |   8 +
 kernel/fork.c                                  |   2 +-
 kernel/sys_ni.c                                |   1 +
 mm/mmap.c                                      |  24 +-
 mm/shmem.c                                     | 320 ++++++++-
 mm/swap_state.c                                |   1 +
 tools/testing/selftests/Makefile               |   1 +
 tools/testing/selftests/memfd/.gitignore       |   4 +
 tools/testing/selftests/memfd/Makefile         |  40 ++
 tools/testing/selftests/memfd/fuse_mnt.c       | 110 +++
 tools/testing/selftests/memfd/fuse_test.c      | 311 +++++++++
 tools/testing/selftests/memfd/memfd_test.c     | 913 +++++++++++++++++++++++++
 tools/testing/selftests/memfd/run_fuse_test.sh |  14 +
 21 files changed, 1807 insertions(+), 12 deletions(-)
 create mode 100644 include/uapi/linux/memfd.h
 create mode 100644 tools/testing/selftests/memfd/.gitignore
 create mode 100644 tools/testing/selftests/memfd/Makefile
 create mode 100755 tools/testing/selftests/memfd/fuse_mnt.c
 create mode 100644 tools/testing/selftests/memfd/fuse_test.c
 create mode 100644 tools/testing/selftests/memfd/memfd_test.c
 create mode 100755 tools/testing/selftests/memfd/run_fuse_test.sh

-- 
2.0.0


