linux-mm.kvack.org archive mirror
* [PATCH v3 0/3] permit write-sealed memfd read-only shared mappings
@ 2023-10-07 20:50 Lorenzo Stoakes
  2023-10-07 20:50 ` [PATCH v3 1/3] mm: drop the assumption that VM_SHARED always implies writable Lorenzo Stoakes
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Lorenzo Stoakes @ 2023-10-07 20:50 UTC
  To: linux-mm, linux-kernel, Andrew Morton
  Cc: Mike Kravetz, Muchun Song, Alexander Viro, Christian Brauner,
	Matthew Wilcox, Hugh Dickins, Andy Lutomirski, Jan Kara,
	linux-fsdevel, bpf, Lorenzo Stoakes

The man page for fcntl() describing memfd file seals states the following
about F_SEAL_WRITE:-

    Furthermore, trying to create new shared, writable memory-mappings via
    mmap(2) will also fail with EPERM.

With emphasis on 'writable'. It turns out that the kernel currently
disallows all new shared memory mappings for a memfd with F_SEAL_WRITE
applied, rendering this documentation inaccurate.

This matters because users are unable to obtain any shared mapping of a
memfd once it has been write-sealed, which limits the usefulness of
write-sealed memfds. This was reported in the discussion thread [1]
originating from a bug report [2].
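
As a rough illustration (not part of this series), the userspace sketch
below write-seals a memfd and then asks for a read-only shared mapping. The
function name is made up for this example, and the result depends on the
kernel it runs on.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for memfd_create() and the F_SEAL_* constants */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Write-seal a fresh memfd, then request a read-only shared mapping of it.
 * Returns 0 if mmap() succeeded, the positive errno if mmap() failed, or
 * -1 if the memfd could not be set up at all.
 * (Hypothetical helper name, for illustration only.)
 */
int ro_shared_map_after_write_seal(void)
{
	const size_t len = 4096;
	int ret = -1;
	void *p;
	int fd;

	fd = memfd_create("seal-demo", MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, len) < 0 || fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) < 0)
		goto out;

	/* PROT_READ only: this mapping never asks for write access. */
	p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		ret = errno; /* saved before close() can overwrite it */
	} else {
		munmap(p, len);
		ret = 0;
	}
out:
	close(fd);
	return ret;
}
```

Going by the behaviour described above, a kernel without this series
should return EPERM here, and a kernel with it applied should return 0.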

This is a product of two things: the kernel uses the struct
address_space->i_mmap_writable atomic counter to determine whether writing
may be permitted, and it adjusts this counter whenever any VM_SHARED
mapping is performed, more generally assuming that VM_SHARED implies
writable.

It seems sensible that we should only update this counter if VM_MAYWRITE is
specified, i.e. if it is possible that this mapping could at any point be
written to.
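
The difference between the two rules can be sketched with a small userspace
model. This is not kernel code: the struct, the function names and the flag
values are all invented for illustration, and a plain int stands in for the
atomic counter.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values; NOT the kernel's actual bit assignments. */
#define MODEL_VM_SHARED   0x1u
#define MODEL_VM_MAYWRITE 0x2u

/* Stand-in for the i_mmap_writable counter in struct address_space. */
struct model_mapping {
	int i_mmap_writable; /* > 0: counted mappings exist, < 0: writes denied */
};

/* Count one possibly-writable mapping; fails once writes are denied. */
int model_map_writable(struct model_mapping *m)
{
	if (m->i_mmap_writable < 0)
		return -EPERM;
	m->i_mmap_writable++;
	return 0;
}

/* Deny further writable mappings; fails while any are counted. */
int model_deny_writable(struct model_mapping *m)
{
	if (m->i_mmap_writable > 0)
		return -EBUSY;
	m->i_mmap_writable--;
	return 0;
}

/* Current rule: every shared mapping is treated as writable. */
bool model_counted_old(unsigned int flags)
{
	return (flags & MODEL_VM_SHARED) != 0;
}

/* Proposed rule: only shared mappings that may ever be written to. */
bool model_counted_new(unsigned int flags)
{
	const unsigned int both = MODEL_VM_SHARED | MODEL_VM_MAYWRITE;

	return (flags & both) == both;
}

/* Model of the counter update performed when a file is mapped. */
int model_mmap(struct model_mapping *m, unsigned int flags, bool new_rule)
{
	bool counted = new_rule ? model_counted_new(flags)
				: model_counted_old(flags);

	return counted ? model_map_writable(m) : 0;
}
```

With writes denied on the model mapping, a shared mapping without
MODEL_VM_MAYWRITE fails under the old rule and succeeds under the new one,
while a shared mapping with MODEL_VM_MAYWRITE fails under both.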

If we do so then all we need to do to permit write seals to function as
documented is to clear VM_MAYWRITE when mapping read-only. It turns out
this functionality already exists for F_SEAL_FUTURE_WRITE - we can
therefore simply adapt this logic to do the same for F_SEAL_WRITE.
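
A minimal model of such a seal check might look like the following. It is a
sketch only: the names and constant values are invented, and it assumes
(as I understand the existing F_SEAL_FUTURE_WRITE handling) that a shared
mapping requesting write access is rejected outright and that private
mappings are left untouched so they remain COW-writable.

```c
#include <errno.h>

/* Illustrative values; NOT the kernel's actual bit assignments. */
#define MODEL_VM_WRITE          0x1u
#define MODEL_VM_SHARED         0x2u
#define MODEL_VM_MAYWRITE       0x4u
#define MODEL_SEAL_WRITE        0x1u
#define MODEL_SEAL_FUTURE_WRITE 0x2u

/*
 * Model of the seal check run when a memfd is mapped. Returns 0 if the
 * mapping is allowed (possibly with MODEL_VM_MAYWRITE cleared in
 * *vm_flags), or -EPERM if it must be refused.
 */
int model_seal_check_write(unsigned int seals, unsigned int *vm_flags)
{
	if (!(seals & (MODEL_SEAL_WRITE | MODEL_SEAL_FUTURE_WRITE)))
		return 0; /* no write seal: nothing to do */

	/* A shared mapping that asks for write access is refused. */
	if ((*vm_flags & MODEL_VM_SHARED) && (*vm_flags & MODEL_VM_WRITE))
		return -EPERM;

	/*
	 * A read-only shared mapping is allowed, but must never be able to
	 * become writable later on, so drop MODEL_VM_MAYWRITE.
	 */
	if (*vm_flags & MODEL_VM_SHARED)
		*vm_flags &= ~MODEL_VM_MAYWRITE;

	return 0;
}
```

The point of clearing the flag rather than only checking the requested
protection is that the mapping then no longer counts as one that could at
any point be written to.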

We then hit a chicken and egg situation in mmap_region() where the check
for VM_MAYWRITE occurs before we are able to clear this flag. To work
around this, separate the check and its enforcement across call_mmap() -
allowing for this function to clear VM_MAYWRITE.
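
The ordering problem can be shown with another small model. This is one
possible reading of the description above and is not the code from the
patch: every name is invented, the callback only models read-only shared
requests, and the real mmap_region() does considerably more.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag values; NOT the kernel's actual bit assignments. */
#define MODEL_VM_SHARED   0x1u
#define MODEL_VM_MAYWRITE 0x2u

struct model_file {
	int i_mmap_writable; /* < 0 models "writable mappings denied" */
	bool write_sealed;   /* models a memfd with a write seal applied */
};

static bool model_shared_maywrite(unsigned int flags)
{
	const unsigned int both = MODEL_VM_SHARED | MODEL_VM_MAYWRITE;

	return (flags & both) == both;
}

/*
 * Stand-in for the file's mmap callback: for a write-sealed file, a
 * (read-only) shared mapping has MODEL_VM_MAYWRITE cleared.
 */
static void model_call_mmap(const struct model_file *f, unsigned int *flags)
{
	if (f->write_sealed && (*flags & MODEL_VM_SHARED))
		*flags &= ~MODEL_VM_MAYWRITE;
}

/* Check and enforce before the callback: fails before flags can change. */
int model_mmap_check_first(struct model_file *f, unsigned int flags)
{
	if (model_shared_maywrite(flags)) {
		if (f->i_mmap_writable < 0)
			return -EPERM;
		f->i_mmap_writable++;
	}
	model_call_mmap(f, &flags);
	return 0;
}

/* Check before the callback, enforce after it against the final flags. */
int model_mmap_split(struct model_file *f, unsigned int flags)
{
	/* Check: note the outcome, but do not fail the mapping yet. */
	bool denied = model_shared_maywrite(flags) && f->i_mmap_writable < 0;

	model_call_mmap(f, &flags); /* may clear MODEL_VM_MAYWRITE */

	/* Enforce: only mappings that are still shared + maywrite count. */
	if (model_shared_maywrite(flags)) {
		if (denied)
			return -EPERM;
		f->i_mmap_writable++;
	}
	return 0;
}
```

For a write-sealed model file, model_mmap_check_first() returns -EPERM for
a read-only shared request while model_mmap_split() lets it through,
because the callback has cleared the flag by the time the result is acted
upon.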

Thanks to Andy Lutomirski for the suggestion!

[1]:https://lore.kernel.org/all/20230324133646.16101dfa666f253c4715d965@linux-foundation.org/
[2]:https://bugzilla.kernel.org/show_bug.cgi?id=217238

v3:
- Don't defer the writable check until after call_mmap() in case this
  breaks f_ops->mmap() callbacks which assume this has been done
  first. Instead, separate the check and enforcement of it across the call,
  allowing for it to change vma->vm_flags in the meanwhile.
- Improve/correct commit messages and comments throughout.

v2:
- Removed RFC tag.
- Corrected the incorrect goto pointed out by Jan.
- Reworded cover letter as suggested by Jan.
https://lore.kernel.org/all/cover.1682890156.git.lstoakes@gmail.com/

v1:
https://lore.kernel.org/all/cover.1680560277.git.lstoakes@gmail.com/

Lorenzo Stoakes (3):
  mm: drop the assumption that VM_SHARED always implies writable
  mm: update memfd seal write check to include F_SEAL_WRITE
  mm: enforce the mapping_map_writable() check after call_mmap()

 fs/hugetlbfs/inode.c |  2 +-
 include/linux/fs.h   |  4 ++--
 include/linux/mm.h   | 26 +++++++++++++++++++-------
 kernel/fork.c        |  2 +-
 mm/filemap.c         |  2 +-
 mm/madvise.c         |  2 +-
 mm/mmap.c            | 28 ++++++++++++++++++----------
 mm/shmem.c           |  2 +-
 8 files changed, 44 insertions(+), 24 deletions(-)

-- 
2.42.0




Thread overview: 7+ messages
2023-10-07 20:50 [PATCH v3 0/3] permit write-sealed memfd read-only shared mappings Lorenzo Stoakes
2023-10-07 20:50 ` [PATCH v3 1/3] mm: drop the assumption that VM_SHARED always implies writable Lorenzo Stoakes
2023-10-07 20:51 ` [PATCH v3 2/3] mm: update memfd seal write check to include F_SEAL_WRITE Lorenzo Stoakes
2023-10-07 20:51 ` [PATCH v3 3/3] mm: enforce the mapping_map_writable() check after call_mmap() Lorenzo Stoakes
2023-10-11  9:46   ` Jan Kara
2023-10-11 18:14     ` Lorenzo Stoakes
2023-10-12  8:38       ` Jan Kara
