All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jann Horn <jannh@google.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: akpm@linux-foundation.org, viro@zeniv.linux.org.uk,
	brauner@kernel.org, shuah@kernel.org, aarcange@redhat.com,
	lokeshgidra@google.com, peterx@redhat.com, david@redhat.com,
	hughd@google.com, mhocko@suse.com, axelrasmussen@google.com,
	rppt@kernel.org, willy@infradead.org, Liam.Howlett@oracle.com,
	zhangpeng362@huawei.com, bgeffon@google.com,
	kaleshsingh@google.com, ngeoffray@google.com, jdduke@google.com,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH 2/3] userfaultfd: UFFDIO_REMAP uABI
Date: Wed, 20 Sep 2023 18:11:19 +0200	[thread overview]
Message-ID: <CAG48ez0WQiZvBzYBHXVP8ZVFXJEfRsHxNuXvMTFL+ietgyB9yQ@mail.gmail.com> (raw)
In-Reply-To: <CAJuCfpGEPPmEW6GqCjQXB5g04PA=BwhNgLVooM+DcroQj8RjyA@mail.gmail.com>

On Wed, Sep 20, 2023 at 3:49 AM Suren Baghdasaryan <surenb@google.com> wrote:
> On Tue, Sep 19, 2023 at 4:51 PM Jann Horn <jannh@google.com> wrote:
> > On Wed, Sep 20, 2023 at 1:08 AM Suren Baghdasaryan <surenb@google.com> wrote:
> > > On Thu, Sep 14, 2023 at 7:28 PM Jann Horn <jannh@google.com> wrote:
> > > > On Thu, Sep 14, 2023 at 5:26 PM Suren Baghdasaryan <surenb@google.com> wrote:
> > > > > From: Andrea Arcangeli <aarcange@redhat.com>
> > > > >
> > > > > This implements the uABI of UFFDIO_REMAP.
> > > > >
> > > > > Notably one mode bitflag is also forwarded (and in turn known) by the
> > > > > lowlevel remap_pages method.
> > [...]
> > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > [...]
> > > > > +int remap_pages_huge_pmd(struct mm_struct *dst_mm,
> > > > > +                        struct mm_struct *src_mm,
> > > > > +                        pmd_t *dst_pmd, pmd_t *src_pmd,
> > > > > +                        pmd_t dst_pmdval,
> > > > > +                        struct vm_area_struct *dst_vma,
> > > > > +                        struct vm_area_struct *src_vma,
> > > > > +                        unsigned long dst_addr,
> > > > > +                        unsigned long src_addr)
> > > > > +{
> > > > > +       pmd_t _dst_pmd, src_pmdval;
> > > > > +       struct page *src_page;
> > > > > +       struct anon_vma *src_anon_vma, *dst_anon_vma;
> > > > > +       spinlock_t *src_ptl, *dst_ptl;
> > > > > +       pgtable_t pgtable;
> > > > > +       struct mmu_notifier_range range;
> > > > > +
> > > > > +       src_pmdval = *src_pmd;
> > > > > +       src_ptl = pmd_lockptr(src_mm, src_pmd);
> > > > > +
> > > > > +       BUG_ON(!pmd_trans_huge(src_pmdval));
> > > > > +       BUG_ON(!pmd_none(dst_pmdval));
> > > >
> > > > Why can we assert that pmd_none(dst_pmdval) is true here? Can we not
> > > > have concurrent faults (or userfaultfd operations) populating that
> > > > PMD?
> > >
> > > IIUC dst_pmdval is a copy of the value from dst_pmd, so that local
> > > copy should not change even if some concurrent operation changes
> > > dst_pmd. We can assert that it's pmd_none because we checked for that
> > > before calling remap_pages_huge_pmd. Later on we check if dst_pmd
> > > changed from under us (see pmd_same(*dst_pmd, dst_pmdval) check) and
> > > retry if that happened.
> >
> > Oh, right, I don't know what I was thinking when I typed that.
> >
> > But now I wonder about the check directly above that: What does this
> > code do for swap PMDs? It looks like that might splat on the
> > BUG_ON(!pmd_trans_huge(src_pmdval)). All we've checked on the path to
> > here is that the virtual memory area is aligned, that the destination
> > PMD is empty, and that pmd_trans_huge_lock() succeeded; but
> > pmd_trans_huge_lock() explicitly permits swap PMDs (which is the
> > swapped-out version of transhuge PMDs):
> >
> > static inline spinlock_t *pmd_trans_huge_lock(pmd_t *pmd,
> >                 struct vm_area_struct *vma)
> > {
> >         if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd))
> >                 return __pmd_trans_huge_lock(pmd, vma);
> >         else
> >                 return NULL;
> > }
>
> Yeah... Ok, I think I'm missing a check for  pmd_trans_huge(*src_pmd)
> after we lock it with pmd_trans_huge_lock(src_pmd, src_vma). And we
> can remove the above BUG_ON(). Would that address your concern?

Sounds good. It'll end up splitting huge swap entries but I guess the
extra code for moving huge swap entries might not be worth it.

  reply	other threads:[~2023-09-20 16:12 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-14 15:26 [PATCH 0/3] userfaultfd remap option Suren Baghdasaryan
2023-09-14 15:26 ` [PATCH 1/3] userfaultfd: UFFDIO_REMAP: rmap preparation Suren Baghdasaryan
2023-09-14 17:56   ` Matthew Wilcox
2023-09-14 18:34     ` Suren Baghdasaryan
2023-09-14 15:26 ` [PATCH 2/3] userfaultfd: UFFDIO_REMAP uABI Suren Baghdasaryan
2023-09-14 18:11   ` Matthew Wilcox
2023-09-14 18:43     ` David Hildenbrand
2023-09-14 18:45       ` David Hildenbrand
2023-09-21 18:04         ` Suren Baghdasaryan
2023-09-21 18:17           ` David Hildenbrand
2023-09-22  1:57             ` Suren Baghdasaryan
2023-09-14 18:47   ` David Hildenbrand
2023-09-14 18:54     ` Suren Baghdasaryan
2023-09-14 19:28   ` Jann Horn
2023-09-14 20:57     ` Suren Baghdasaryan
2023-09-19 23:08     ` Suren Baghdasaryan
2023-09-19 23:40       ` Suren Baghdasaryan
2023-09-19 23:50       ` Jann Horn
2023-09-20  1:49         ` Suren Baghdasaryan
2023-09-20 16:11           ` Jann Horn [this message]
2023-09-21 16:59     ` Jann Horn
2023-09-14 21:57   ` Nadav Amit
2023-09-15  3:28     ` Suren Baghdasaryan
2023-09-15  4:03       ` Nadav Amit
2023-09-15  4:15         ` Suren Baghdasaryan
2023-09-15 23:33   ` Jann Horn
2023-09-15 23:39     ` Suren Baghdasaryan
2023-09-14 15:26 ` [PATCH 3/3] selftests/mm: add UFFDIO_REMAP ioctl test Suren Baghdasaryan
2023-09-14 16:00 [PATCH 2/3] userfaultfd: UFFDIO_REMAP uABI kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAG48ez0WQiZvBzYBHXVP8ZVFXJEfRsHxNuXvMTFL+ietgyB9yQ@mail.gmail.com \
    --to=jannh@google.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=bgeffon@google.com \
    --cc=brauner@kernel.org \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=jdduke@google.com \
    --cc=kaleshsingh@google.com \
    --cc=kernel-team@android.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lokeshgidra@google.com \
    --cc=mhocko@suse.com \
    --cc=ngeoffray@google.com \
    --cc=peterx@redhat.com \
    --cc=rppt@kernel.org \
    --cc=shuah@kernel.org \
    --cc=surenb@google.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=zhangpeng362@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.