All of lore.kernel.org
 help / color / mirror / Atom feed
From: Axel Rasmussen <axelrasmussen@google.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	Jerome Glisse <jglisse@redhat.com>, Joe Perches <joe@perches.com>,
	Lokesh Gidra <lokeshgidra@google.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Peter Xu <peterx@redhat.com>, Shaohua Li <shli@fb.com>,
	Shuah Khan <shuah@kernel.org>, Wang Qing <wangqing@vivo.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, Linux MM <linux-mm@kvack.org>,
	linux-kselftest@vger.kernel.org,
	Brian Geffon <bgeffon@google.com>,
	Cannon Matthews <cannonmatthews@google.com>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Michel Lespinasse <walken@google.com>,
	Mina Almasry <almasrymina@google.com>,
	Oliver Upton <oupton@google.com>
Subject: Re: [PATCH v2 1/5] userfaultfd: support minor fault handling for shmem
Date: Tue, 9 Mar 2021 11:57:26 -0800	[thread overview]
Message-ID: <CAJHvVcjjX8FEsdngUTH+0c3C17T0ud-FXj=GJf90R8hn8PKekA@mail.gmail.com> (raw)
In-Reply-To: <04697A35-AEC7-43F1-8462-1CD39648544A@nvidia.com>

On Tue, Mar 9, 2021 at 11:52 AM Zi Yan <ziy@nvidia.com> wrote:
>
> On 1 Mar 2021, at 19:01, Axel Rasmussen wrote:
>
> > Modify the userfaultfd register API to allow registering shmem VMAs in
> > minor mode. Modify the shmem mcopy implementation to support
> > UFFDIO_CONTINUE in order to resolve such faults.
> >
> > Combine the shmem mcopy handler functions into a single
> > shmem_mcopy_atomic_pte, which takes a mode parameter. This matches how
> > the hugetlbfs implementation is structured, and lets us remove a good
> > chunk of boilerplate.
> >
> > Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
> > ---
> >  fs/userfaultfd.c                 |  6 +--
> >  include/linux/shmem_fs.h         | 26 ++++-----
> >  include/uapi/linux/userfaultfd.h |  4 +-
> >  mm/memory.c                      |  8 +--
> >  mm/shmem.c                       | 92 +++++++++++++++-----------------
> >  mm/userfaultfd.c                 | 27 +++++-----
> >  6 files changed, 79 insertions(+), 84 deletions(-)
> >
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index 14f92285d04f..9f3b8684cf3c 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -1267,8 +1267,7 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma,
> >       }
> >
> >       if (vm_flags & VM_UFFD_MINOR) {
> > -             /* FIXME: Add minor fault interception for shmem. */
> > -             if (!is_vm_hugetlb_page(vma))
> > +             if (!(is_vm_hugetlb_page(vma) || vma_is_shmem(vma)))
> >                       return false;
> >       }
> >
> > @@ -1941,7 +1940,8 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
> >       /* report all available features and ioctls to userland */
> >       uffdio_api.features = UFFD_API_FEATURES;
> >  #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
> > -     uffdio_api.features &= ~UFFD_FEATURE_MINOR_HUGETLBFS;
> > +     uffdio_api.features &=
> > +             ~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM);
> >  #endif
> >       uffdio_api.ioctls = UFFD_API_IOCTLS;
> >       ret = -EFAULT;
> > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> > index d82b6f396588..f0919c3722e7 100644
> > --- a/include/linux/shmem_fs.h
> > +++ b/include/linux/shmem_fs.h
> > @@ -9,6 +9,7 @@
> >  #include <linux/percpu_counter.h>
> >  #include <linux/xattr.h>
> >  #include <linux/fs_parser.h>
> > +#include <linux/userfaultfd_k.h>
> >
> >  /* inode in-kernel data */
> >
> > @@ -122,21 +123,16 @@ static inline bool shmem_file(struct file *file)
> >  extern bool shmem_charge(struct inode *inode, long pages);
> >  extern void shmem_uncharge(struct inode *inode, long pages);
> >
> > +#ifdef CONFIG_USERFAULTFD
> >  #ifdef CONFIG_SHMEM
> > -extern int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
> > -                               struct vm_area_struct *dst_vma,
> > -                               unsigned long dst_addr,
> > -                               unsigned long src_addr,
> > -                               struct page **pagep);
> > -extern int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm,
> > -                                 pmd_t *dst_pmd,
> > -                                 struct vm_area_struct *dst_vma,
> > -                                 unsigned long dst_addr);
> > -#else
> > -#define shmem_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, dst_addr, \
> > -                            src_addr, pagep)        ({ BUG(); 0; })
> > -#define shmem_mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, \
> > -                              dst_addr)      ({ BUG(); 0; })
> > -#endif
> > +int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
> > +                        struct vm_area_struct *dst_vma,
> > +                        unsigned long dst_addr, unsigned long src_addr,
> > +                        enum mcopy_atomic_mode mode, struct page **pagep);
> > +#else /* !CONFIG_SHMEM */
> > +#define shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, \
> > +                            src_addr, mode, pagep)        ({ BUG(); 0; })
> > +#endif /* CONFIG_SHMEM */
> > +#endif /* CONFIG_USERFAULTFD */
> >
> >  #endif
> > diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> > index bafbeb1a2624..47d9790d863d 100644
> > --- a/include/uapi/linux/userfaultfd.h
> > +++ b/include/uapi/linux/userfaultfd.h
> > @@ -31,7 +31,8 @@
> >                          UFFD_FEATURE_MISSING_SHMEM |         \
> >                          UFFD_FEATURE_SIGBUS |                \
> >                          UFFD_FEATURE_THREAD_ID |             \
> > -                        UFFD_FEATURE_MINOR_HUGETLBFS)
> > +                        UFFD_FEATURE_MINOR_HUGETLBFS |       \
> > +                        UFFD_FEATURE_MINOR_SHMEM)
> >  #define UFFD_API_IOCTLS                              \
> >       ((__u64)1 << _UFFDIO_REGISTER |         \
> >        (__u64)1 << _UFFDIO_UNREGISTER |       \
> > @@ -196,6 +197,7 @@ struct uffdio_api {
> >  #define UFFD_FEATURE_SIGBUS                  (1<<7)
> >  #define UFFD_FEATURE_THREAD_ID                       (1<<8)
> >  #define UFFD_FEATURE_MINOR_HUGETLBFS         (1<<9)
> > +#define UFFD_FEATURE_MINOR_SHMEM             (1<<10)
> >       __u64 features;
> >
> >       __u64 ioctls;
> > diff --git a/mm/memory.c b/mm/memory.c
> > index c8e357627318..a1e5ff55027e 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3929,9 +3929,11 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf)
> >        * something).
> >        */
> >       if (vma->vm_ops->map_pages && fault_around_bytes >> PAGE_SHIFT > 1) {
> > -             ret = do_fault_around(vmf);
> > -             if (ret)
> > -                     return ret;
> > +             if (likely(!userfaultfd_minor(vmf->vma))) {
> > +                     ret = do_fault_around(vmf);
> > +                     if (ret)
> > +                             return ret;
> > +             }
> >       }
> >
> >       ret = __do_fault(vmf);
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index b2db4ed0fbc7..6f81259fabb3 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -77,7 +77,6 @@ static struct vfsmount *shm_mnt;
> >  #include <linux/syscalls.h>
> >  #include <linux/fcntl.h>
> >  #include <uapi/linux/memfd.h>
> > -#include <linux/userfaultfd_k.h>
> >  #include <linux/rmap.h>
> >  #include <linux/uuid.h>
> >
> > @@ -1785,8 +1784,8 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
> >   * vm. If we swap it in we mark it dirty since we also free the swap
> >   * entry since a page cannot live in both the swap and page cache.
> >   *
> > - * vmf and fault_type are only supplied by shmem_fault:
> > - * otherwise they are NULL.
> > + * vma, vmf, and fault_type are only supplied by shmem_fault: otherwise they
> > + * are NULL.
> >   */
> >  static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
> >       struct page **pagep, enum sgp_type sgp, gfp_t gfp,
> > @@ -1830,6 +1829,12 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
> >               return error;
> >       }
> >
> > +     if (page && vma && userfaultfd_minor(vma)) {
> > +             unlock_page(page);
> > +             *fault_type = handle_userfault(vmf, VM_UFFD_MINOR);
> > +             return 0;
> > +     }
> > +
> >       if (page)
> >               hindex = page->index;
> >       if (page && sgp == SGP_WRITE)
> > @@ -2354,14 +2359,12 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
> >       return inode;
> >  }
> >
> > -static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> > -                               pmd_t *dst_pmd,
> > -                               struct vm_area_struct *dst_vma,
> > -                               unsigned long dst_addr,
> > -                               unsigned long src_addr,
> > -                               bool zeropage,
> > -                               struct page **pagep)
> > +int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
> > +                        struct vm_area_struct *dst_vma,
> > +                        unsigned long dst_addr, unsigned long src_addr,
> > +                        enum mcopy_atomic_mode mode, struct page **pagep)
> >  {
> > +     bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
> >       struct inode *inode = file_inode(dst_vma->vm_file);
> >       struct shmem_inode_info *info = SHMEM_I(inode);
> >       struct address_space *mapping = inode->i_mapping;
> > @@ -2378,12 +2381,17 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> >       if (!shmem_inode_acct_block(inode, 1))
> >               goto out;
> >
> > -     if (!*pagep) {
> > +     if (is_continue) {
> > +             ret = -EFAULT;
> > +             page = find_lock_page(mapping, pgoff);
> > +             if (!page)
> > +                     goto out_unacct_blocks;
> > +     } else if (!*pagep) {
> >               page = shmem_alloc_page(gfp, info, pgoff);
> >               if (!page)
> >                       goto out_unacct_blocks;
> >
> > -             if (!zeropage) {        /* mcopy_atomic */
> > +             if (mode == MCOPY_ATOMIC_NORMAL) {      /* mcopy_atomic */
> >                       page_kaddr = kmap_atomic(page);
> >                       ret = copy_from_user(page_kaddr,
> >                                            (const void __user *)src_addr,
>
> Hi Axel,
>
> shmem_mcopy_atomic_pte is not guarded by CONFIG_USERFAULTFD, thus it is
> causing compilation errors due to the use of enum mcopy_atomic_mode mode,
> when CONFIG_USERFAULTFD is not set.

Ah, my apologies, I guarded it in the header but forgot to do so in
shmem.c. I'll send an updated patch today.

>
>
> —
> Best Regards,
> Yan Zi

  reply	other threads:[~2021-03-09 19:58 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-02  0:01 [PATCH v2 0/5] userfaultfd: support minor fault handling for shmem Axel Rasmussen
2021-03-02  0:01 ` Axel Rasmussen
2021-03-02  0:01 ` [PATCH v2 1/5] " Axel Rasmussen
2021-03-02  0:01   ` Axel Rasmussen
2021-03-09 19:52   ` Zi Yan
2021-03-09 19:57     ` Axel Rasmussen [this message]
2021-03-09 19:57       ` Axel Rasmussen
2021-03-09 22:58     ` [PATCH v2.5] " Axel Rasmussen
2021-03-09 22:58       ` Axel Rasmussen
2021-03-10  4:40       ` Axel Rasmussen
2021-03-10  4:40         ` Axel Rasmussen
2021-03-02  0:01 ` [PATCH v2 2/5] userfaultfd/selftests: use memfd_create for shmem test type Axel Rasmussen
2021-03-02  0:01   ` Axel Rasmussen
2021-03-02  0:01 ` [PATCH v2 3/5] userfaultfd/selftests: create alias mappings in the shmem test Axel Rasmussen
2021-03-02  0:01   ` Axel Rasmussen
2021-03-02  0:01 ` [PATCH v2 4/5] userfaultfd/selftests: reinitialize test context in each test Axel Rasmussen
2021-03-02  0:01   ` Axel Rasmussen
2021-03-02  0:01 ` [PATCH v2 5/5] userfaultfd/selftests: exercise minor fault handling shmem support Axel Rasmussen
2021-03-02  0:01   ` Axel Rasmussen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJHvVcjjX8FEsdngUTH+0c3C17T0ud-FXj=GJf90R8hn8PKekA@mail.gmail.com' \
    --to=axelrasmussen@google.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=bgeffon@google.com \
    --cc=cannonmatthews@google.com \
    --cc=dgilbert@redhat.com \
    --cc=hughd@google.com \
    --cc=jglisse@redhat.com \
    --cc=joe@perches.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lokeshgidra@google.com \
    --cc=oupton@google.com \
    --cc=peterx@redhat.com \
    --cc=rientjes@google.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=shli@fb.com \
    --cc=shuah@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=walken@google.com \
    --cc=wangqing@vivo.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.