All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 2/6] shm: add sealing API
@ 2014-04-10 21:33 ` Tony Battersby
  0 siblings, 0 replies; 26+ messages in thread
From: Tony Battersby @ 2014-04-10 21:33 UTC (permalink / raw)
  To: David Herrmann; +Cc: linux-kernel, linux-fsdevel, linux-mm, dri-devel

(reposted from my comments at http://lwn.net/Articles/593918/)

I may have thought of a way to subvert SEAL_WRITE using O_DIRECT and
asynchronous I/O.  I am not sure if this is a real problem or not, but
better to ask, right?

The exploit would go like this:

1) mmap() the shared memory
2) open some *other* file with O_DIRECT
3) prepare a read-type iocb for the *other* file pointing to the
mmap()ed memory
4) io_submit(), but don't wait for it to complete
5) munmap() the shared memory
6) SEAL_WRITE the shared memory
7) the "sealed" memory is overwritten by DMA from the disk drive at some
point in the future when the I/O completes

So this exploit effectively changes the contents of the supposedly
write-protected memory after SEAL_WRITE is granted.

For O_DIRECT the kernel pins the submitted pages in memory for DMA by
incrementing the page reference counts when the I/O is submitted,
allowing the pages to be modified by DMA even if they are no longer
mapped in the address space of the process.  This is different from a
regular read(), which uses the CPU to copy the data and will fail if the
pages are not mapped.

I am sure there are also other direct I/O mechanisms in the kernel that
can be used to setup a DMA transfer to change the contents of unmapped
memory; the SCSI generic driver comes to mind.

I suppose SEAL_WRITE could iterate over all the pages in the file and
check to make sure no page refcount is greater than the "expected"
value, and return an error instead of granting the seal if a page is
found with an unexpected extra reference that might have been added by
e.g. get_user_pages() for direct I/O. But looking over shmem_set_seals()
in patch 2/6, it doesn't seem to do that.

Tony Battersby
Cybernetics

^ permalink raw reply	[flat|nested] 26+ messages in thread
* [PATCH 0/6] File Sealing & memfd_create()
@ 2014-03-19 19:06 David Herrmann
  2014-03-19 19:06   ` David Herrmann
  0 siblings, 1 reply; 26+ messages in thread
From: David Herrmann @ 2014-03-19 19:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hugh Dickins, Alexander Viro, Matthew Wilcox, Karol Lewandowski,
	Kay Sievers, Daniel Mack, Lennart Poettering,
	Kristian Høgsberg, john.stultz, Greg Kroah-Hartman,
	Tejun Heo, Johannes Weiner, dri-devel, linux-fsdevel, linux-mm,
	Andrew Morton, Linus Torvalds, Ryan Lortie,
	Michael Kerrisk (man-pages),
	David Herrmann

Hi

This series introduces the concept of "file sealing". Sealing a file restricts
the set of allowed operations on the file in question. Multiple seals are
defined and each seal will cause a different set of operations to return EPERM
if it is set. The following seals are introduced:

 * SEAL_SHRINK: If set, the inode size cannot be reduced
 * SEAL_GROW: If set, the inode size cannot be increased
 * SEAL_WRITE: If set, the file content cannot be modified

Unlike existing techniques that provide similar protection, sealing allows
file-sharing without any trust-relationship. This is enforced by rejecting seal
modifications if you don't own an exclusive reference to the given file. So if
you own a file-descriptor, you can be sure that no-one besides you can modify
the seals on the given file. This allows mapping shared files from untrusted
parties without the fear of the file getting truncated or modified by an
attacker.

Several use-cases exist that could make great use of sealing:

  1) Graphics Compositors
     If a graphics client creates a memory-backed render-buffer and passes a
     file-decsriptor to it to the graphics server for display, the server
     _has_ to setup SIGBUS handlers whenever mapping the given file. Otherwise,
     the client might run ftruncate() or O_TRUNC on the on file in parallel,
     thus crashing the server.
     With sealing, a compositor can reject any incoming file-descriptor that
     does _not_ have SEAL_SHRINK set. This way, any memory-mappings are
     guaranteed to stay accessible. Furthermore, we still allow clients to
     increase the buffer-size in case they want to resize the render-buffer for
     the next frame. We also allow parallel writes so the client can render new
     frames into the same buffer (client is responsible of never rendering into
     a front-buffer if you want to avoid artifacts).

     Real use-case: Wayland wl_shm buffers can be transparently converted

  2) Geneal-purpose IPC
     IPC mechanisms that do not require a mutual trust-relationship (like dbus)
     cannot do zero-copy so far. With sealing, zero-copy can be easily done by
     sharing a file-descriptor that has SEAL_SHRINK | SEAL_GROW | SEAL_WRITE
     set. This way, the source can store sensible data in the file, seal the
     file and then pass it to the destination. The destination verifies these
     seals are set and then can parse the message in-line.
     Note that these files are usually one-shot files. Without any
     trust-relationship, a destination can notify the source that it released a
     file again, but a source can never rely on it. So unless the destination
     releases the file, a source cannot clear the seals for modification again.
     However, this is inherent to situations without any trust-relationship.

     Real use-case: kdbus messages already use a similar interface and can be
                    transparently converted to use these seals

Other similar use-cases exist (eg., audio), but these two I am personally
working on. Interest in this interface has been raised from several other camps
and I've put respective maintainers into CC. If more information on these
use-cases is needed, I think they can give some insights.

The API introduced by this patchset is:

 * fcntl() extension:
   Two new fcntl() commands are added that allow retrieveing (SHMEM_GET_SEALS)
   and setting (SHMEM_SET_SEALS) seals on a file. Only shmfs implements them so
   far and there is no intention to implement them on other file-systems.
   All shmfs based files support sealing.

   Patch 2/6

 * memfd_create() syscall:
   The new memfd_create() syscall is a public frontend to the shmem_file_new()
   interface in the kernel. It avoids the need of a local shmfs mount-point (as
   requested by android people) and acts more like MAP_ANON than O_TMPFILE.

   Patch 3/6

The other 4 patches are cleanups, self-tests and docs.

The commit-messages explain the API extensions in detail. Man-page proposals
are also provided. Last but not least, the extensive self-tests document the
intended behavior, in case it is still not clear.

Technically, sealing and memfd_create() are independent, but the described
use-cases would greatly benefit from the combination of both. Hence, I merged
them into the same series. Please also note that this series is based on earlier
works (ashmem, memfd, shmgetfd, ..) and unifies these attempts.

Comments welcome!

Thanks
David

David Herrmann (4):
  fs: fix i_writecount on shmem and friends
  shm: add sealing API
  shm: add memfd_create() syscall
  selftests: add memfd_create() + sealing tests

David Herrmann (2): (man-pages)
  fcntl.2: document SHMEM_SET/GET_SEALS commands
  memfd_create.2: add memfd_create() man-page

 arch/x86/syscalls/syscall_32.tbl           |   1 +
 arch/x86/syscalls/syscall_64.tbl           |   1 +
 fs/fcntl.c                                 |  12 +-
 fs/file_table.c                            |  27 +-
 include/linux/shmem_fs.h                   |  17 +
 include/linux/syscalls.h                   |   1 +
 include/uapi/linux/fcntl.h                 |  13 +
 include/uapi/linux/memfd.h                 |   9 +
 kernel/sys_ni.c                            |   1 +
 mm/shmem.c                                 | 267 +++++++-
 tools/testing/selftests/Makefile           |   1 +
 tools/testing/selftests/memfd/.gitignore   |   2 +
 tools/testing/selftests/memfd/Makefile     |  29 +
 tools/testing/selftests/memfd/memfd_test.c | 972 +++++++++++++++++++++++++++++
 14 files changed, 1338 insertions(+), 15 deletions(-)
 create mode 100644 include/uapi/linux/memfd.h
 create mode 100644 tools/testing/selftests/memfd/.gitignore
 create mode 100644 tools/testing/selftests/memfd/Makefile
 create mode 100644 tools/testing/selftests/memfd/memfd_test.c

-- 
1.9.0


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2014-04-14  0:03 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-10 21:33 [PATCH 2/6] shm: add sealing API Tony Battersby
2014-04-10 21:33 ` Tony Battersby
2014-04-11  0:22 ` David Herrmann
2014-04-11  0:22   ` David Herrmann
2014-04-11  0:22   ` David Herrmann
2014-04-11  1:27   ` Andy Lutomirski
2014-04-11  1:27     ` Andy Lutomirski
2014-04-11 13:43     ` Tony Battersby
2014-04-11 13:43       ` Tony Battersby
2014-04-11 21:31       ` David Herrmann
2014-04-11 21:31         ` David Herrmann
2014-04-11 21:31         ` David Herrmann
2014-04-11 21:36         ` Andy Lutomirski
2014-04-11 21:36           ` Andy Lutomirski
2014-04-11 21:42           ` David Herrmann
2014-04-11 21:42             ` David Herrmann
2014-04-11 21:42             ` David Herrmann
2014-04-11 22:07             ` Andy Lutomirski
2014-04-11 22:07               ` Andy Lutomirski
2014-04-14  0:03               ` David Herrmann
2014-04-14  0:03                 ` David Herrmann
2014-04-14  0:03                 ` David Herrmann
  -- strict thread matches above, loose matches on Subject: below --
2014-03-19 19:06 [PATCH 0/6] File Sealing & memfd_create() David Herrmann
2014-03-19 19:06 ` [PATCH 2/6] shm: add sealing API David Herrmann
2014-03-19 19:06   ` David Herrmann
2014-03-19 19:06   ` David Herrmann
2014-03-19 19:06   ` David Herrmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.