linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Andrew Lutomirski <luto@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Hugh Dickins <hughd@google.com>, Jann Horn <jannh@google.com>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	Linux API <linux-api@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>, Linux-MM <linux-mm@kvack.org>,
	marcandre.lureau@redhat.com, Matthew Wilcox <willy@infradead.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Shuah Khan <shuah@kernel.org>
Subject: Re: [PATCH -next 1/2] mm/memfd: make F_SEAL_FUTURE_WRITE seal more robust
Date: Wed, 21 Nov 2018 19:25:26 -0800	[thread overview]
Message-ID: <CALCETrUGyhqi+M3cTdqJNNOPfTWn-R-ekM_R5heq2mbdVqPUAw@mail.gmail.com> (raw)
In-Reply-To: <20181121182701.0d8a775fda6af1f8d2be8f25@linux-foundation.org>

On Wed, Nov 21, 2018 at 6:27 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 20 Nov 2018 13:13:35 -0800 Joel Fernandes <joel@joelfernandes.org> wrote:
>
> > > > I am Ok with whatever Andrew wants to do, if it is better to squash it with
> > > > the original, then I can do that and send another patch.
> > > >
> > > >
> > >
> > > From experience, Andrew will food in fixups on request :)
> >
> > Andrew, could you squash this patch into the one titled ("mm: Add an
> > F_SEAL_FUTURE_WRITE seal to memfd")?
>
> Sure.
>
> I could of course queue them separately but I rarely do so - I don't
> think that the intermediate development states are useful in the
> infinite-term, and I make them available via additional Link: tags in
> the changelog footers anyway.
>
> I think that the magnitude of these patches is such that John Stultz's
> Reviewed-by is invalidated, so this series is now in the "unreviewed"
> state.
>
> So can we have a re-review please?  For convenience, here's the
> folded-together [1/1] patch, as it will go to Linus.
>
>
> From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> Subject: mm: Add an F_SEAL_FUTURE_WRITE seal to memfd
>
> Android uses ashmem for sharing memory regions.  We are looking forward to
> migrating all usecases of ashmem to memfd so that we can possibly remove
> the ashmem driver in the future from staging while also benefiting from
> using memfd and contributing to it.  Note staging drivers are also not ABI
> and generally can be removed at anytime.
>
> One of the main usecases Android has is the ability to create a region and
> mmap it as writeable, then add protection against making any "future"
> writes while keeping the existing already mmap'ed writeable-region active.
> This allows us to implement a usecase where receivers of the shared
> memory buffer can get a read-only view, while the sender continues to
> write to the buffer.  See CursorWindow documentation in Android for more
> details:
> https://developer.android.com/reference/android/database/CursorWindow
>
> This usecase cannot be implemented with the existing F_SEAL_WRITE seal.
> To support the usecase, this patch adds a new F_SEAL_FUTURE_WRITE seal
> which prevents any future mmap and write syscalls from succeeding while
> keeping the existing mmap active.  The following program shows the seal
> working in action:
>
>  #include <stdio.h>
>  #include <errno.h>
>  #include <sys/mman.h>
>  #include <linux/memfd.h>
>  #include <linux/fcntl.h>
>  #include <asm/unistd.h>
>  #include <unistd.h>
>  #define F_SEAL_FUTURE_WRITE 0x0010
>  #define REGION_SIZE (5 * 1024 * 1024)
>
> int memfd_create_region(const char *name, size_t size)
> {
>     int ret;
>     int fd = syscall(__NR_memfd_create, name, MFD_ALLOW_SEALING);
>     if (fd < 0) return fd;
>     ret = ftruncate(fd, size);
>     if (ret < 0) { close(fd); return ret; }
>     return fd;
> }
>
> int main() {
>     int ret, fd;
>     void *addr, *addr2, *addr3, *addr1;
>     ret = memfd_create_region("test_region", REGION_SIZE);
>     printf("ret=%d\n", ret);
>     fd = ret;
>
>     // Create map
>     addr = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>     if (addr == MAP_FAILED)
>             printf("map 0 failed\n");
>     else
>             printf("map 0 passed\n");
>
>     if ((ret = write(fd, "test", 4)) != 4)
>             printf("write failed even though no future-write seal "
>                    "(ret=%d errno =%d)\n", ret, errno);
>     else
>             printf("write passed\n");
>
>     addr1 = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>     if (addr1 == MAP_FAILED)
>             perror("map 1 prot-write failed even though no seal\n");
>     else
>             printf("map 1 prot-write passed as expected\n");
>
>     ret = fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE |
>                                  F_SEAL_GROW |
>                                  F_SEAL_SHRINK);
>     if (ret == -1)
>             printf("fcntl failed, errno: %d\n", errno);
>     else
>             printf("future-write seal now active\n");
>
>     if ((ret = write(fd, "test", 4)) != 4)
>             printf("write failed as expected due to future-write seal\n");
>     else
>             printf("write passed (unexpected)\n");
>
>     addr2 = mmap(0, REGION_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>     if (addr2 == MAP_FAILED)
>             perror("map 2 prot-write failed as expected due to seal\n");
>     else
>             printf("map 2 passed\n");
>
>     addr3 = mmap(0, REGION_SIZE, PROT_READ, MAP_SHARED, fd, 0);
>     if (addr3 == MAP_FAILED)
>             perror("map 3 failed\n");
>     else
>             printf("map 3 prot-read passed as expected\n");
> }
>
> The output of running this program is as follows:
> ret=3
> map 0 passed
> write passed
> map 1 prot-write passed as expected
> future-write seal now active
> write failed as expected due to future-write seal
> map 2 prot-write failed as expected due to seal
> : Permission denied
> map 3 prot-read passed as expected
>
> [joel@joelfernandes.org: make F_SEAL_FUTURE_WRITE seal more robust]
>   Link: http://lkml.kernel.org/r/20181120052137.74317-1-joel@joelfernandes.org
> Link: http://lkml.kernel.org/r/20181108041537.39694-1-joel@joelfernandes.org
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: John Reck <jreck@google.com>
> Cc: Todd Kjos <tkjos@google.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Daniel Colascione <dancol@google.com>
> Cc: J. Bruce Fields <bfields@fieldses.org>
> Cc: Jeff Layton <jlayton@kernel.org>
> Cc: Khalid Aziz <khalid.aziz@oracle.com>
> Cc: Lei Yang <Lei.Yang@windriver.com>
> Cc: Marc-Andr Lureau <marcandre.lureau@redhat.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Shuah Khan <shuah@kernel.org>
> Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Jann Horn <jannh@google.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>
> --- a/include/uapi/linux/fcntl.h~mm-add-an-f_seal_future_write-seal-to-memfd
> +++ a/include/uapi/linux/fcntl.h
> @@ -41,6 +41,7 @@
>  #define F_SEAL_SHRINK  0x0002  /* prevent file from shrinking */
>  #define F_SEAL_GROW    0x0004  /* prevent file from growing */
>  #define F_SEAL_WRITE   0x0008  /* prevent writes */
> +#define F_SEAL_FUTURE_WRITE    0x0010  /* prevent future writes while mapped */
>  /* (1U << 31) is reserved for signed error codes */
>
>  /*
> --- a/mm/memfd.c~mm-add-an-f_seal_future_write-seal-to-memfd
> +++ a/mm/memfd.c
> @@ -131,7 +131,8 @@ static unsigned int *memfd_file_seals_pt
>  #define F_ALL_SEALS (F_SEAL_SEAL | \
>                      F_SEAL_SHRINK | \
>                      F_SEAL_GROW | \
> -                    F_SEAL_WRITE)
> +                    F_SEAL_WRITE | \
> +                    F_SEAL_FUTURE_WRITE)
>
>  static int memfd_add_seals(struct file *file, unsigned int seals)
>  {
> --- a/fs/hugetlbfs/inode.c~mm-add-an-f_seal_future_write-seal-to-memfd
> +++ a/fs/hugetlbfs/inode.c
> @@ -530,7 +530,7 @@ static long hugetlbfs_punch_hole(struct
>                 inode_lock(inode);
>
>                 /* protected by i_mutex */
> -               if (info->seals & F_SEAL_WRITE) {
> +               if (info->seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) {
>                         inode_unlock(inode);
>                         return -EPERM;
>                 }
> --- a/mm/shmem.c~mm-add-an-f_seal_future_write-seal-to-memfd
> +++ a/mm/shmem.c
> @@ -2119,6 +2119,23 @@ out_nomem:
>
>  static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
>  {
> +       struct shmem_inode_info *info = SHMEM_I(file_inode(file));
> +
> +       /*
> +        * New PROT_READ and MAP_SHARED mmaps are not allowed when "future

PROT_WRITE, perhaps?

> +        * write" seal active.
> +        */
> +       if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE) &&
> +           (info->seals & F_SEAL_FUTURE_WRITE))
> +               return -EPERM;
> +
> +       /*
> +        * Since the F_SEAL_FUTURE_WRITE seals allow for a MAP_SHARED read-only
> +        * mapping, take care to not allow mprotect to revert protections.
> +        */
> +       if (info->seals & F_SEAL_FUTURE_WRITE)
> +               vma->vm_flags &= ~(VM_MAYWRITE);
> +

This might all be clearer as:

if (info->seals & F_SEAL_FUTURE_WRITE) {
  if (vma->vm_flags ...)
    return -EPERM;
  vma->vm_flags &= ~VM_MAYWRITE;
}

with appropriate comments inserted.

  reply	other threads:[~2018-11-22  3:25 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-20  5:21 [PATCH -next 1/2] mm/memfd: make F_SEAL_FUTURE_WRITE seal more robust Joel Fernandes (Google)
2018-11-20  5:21 ` [PATCH -next 2/2] selftests/memfd: modify tests for F_SEAL_FUTURE_WRITE seal Joel Fernandes (Google)
2018-11-22 23:21   ` Joel Fernandes
2018-11-20 15:13 ` [PATCH -next 1/2] mm/memfd: make F_SEAL_FUTURE_WRITE seal more robust Andy Lutomirski
2018-11-20 18:39   ` Joel Fernandes
2018-11-20 20:07     ` Stephen Rothwell
2018-11-20 20:33       ` Andy Lutomirski
2018-11-20 20:47         ` Joel Fernandes
2018-11-20 21:02           ` Andy Lutomirski
2018-11-20 21:13             ` Joel Fernandes
2018-11-22  2:27               ` Andrew Morton
2018-11-22  3:25                 ` Andy Lutomirski [this message]
2018-11-22 23:09                   ` Joel Fernandes
2018-11-25  0:42                     ` Andrew Morton
2018-11-25  0:47                       ` Matthew Wilcox
2018-11-26 13:35                         ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrUGyhqi+M3cTdqJNNOPfTWn-R-ekM_R5heq2mbdVqPUAw@mail.gmail.com \
    --to=luto@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=jannh@google.com \
    --cc=joel@joelfernandes.org \
    --cc=khalid.aziz@oracle.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=marcandre.lureau@redhat.com \
    --cc=mike.kravetz@oracle.com \
    --cc=sfr@canb.auug.org.au \
    --cc=shuah@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).