All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ira Weiny <ira.weiny@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	kernel test robot <yujie.liu@intel.com>,
	"Fabio M. De Francesco" <fmdefrancesco@gmail.com>,
	<lkp@lists.01.org>, <lkp@intel.com>,
	<linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	Peter Xu <peterx@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [shmem] 7a7256d5f5: WARNING:possible_recursive_locking_detected
Date: Fri, 21 Oct 2022 17:00:24 -0700	[thread overview]
Message-ID: <Y1MymJ/INb45AdaY@iweiny-desk3> (raw)
In-Reply-To: <Y1Mh2S7fUGQ/iKFR@iweiny-desk3>

On Fri, Oct 21, 2022 at 03:48:57PM -0700, Ira wrote:
> On Fri, Oct 21, 2022 at 01:30:41PM -0700, Andrew Morton wrote:
> > On Fri, 21 Oct 2022 14:09:16 +0100 Matthew Wilcox <willy@infradead.org> wrote:
> > 
> > > On Fri, Oct 21, 2022 at 12:10:17PM +0800, kernel test robot wrote:
> > > > FYI, we noticed WARNING:possible_recursive_locking_detected due to commit (built with gcc-11):
> > > > 
> > > > commit: 7a7256d5f512b6c17957df7f59cf5e281b3ddba3 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > Ummm.  Looks to me like this now occurs because of this part of the
> > > change:
> > > 
> > >                 if (!zeropage) {        /* COPY */
> > > -                       page_kaddr = kmap_atomic(page);
> > > +                       page_kaddr = kmap_local_folio(folio, 0);
> > >                         ret = copy_from_user(page_kaddr,
> > >                                              (const void __user *)src_addr,
> > >                                              PAGE_SIZE);
> > > -                       kunmap_atomic(page_kaddr);
> > > +                       kunmap_local(page_kaddr);
> > > 
> > > Should I be using __copy_from_user_inatomic() here?
> 
> I would say not.  I'm curious why copy_from_user() was safe (at least did not
> fail the checkers).  :-/
> 
> > 
> > Caller __mcopy_atomic() is holding mmap_read_lock(dst_mm) and this
> > copy_from_user() calls
> > might_fault()->might_lock_read(current->mm->mmap_lock).
> > 
> > And I guess might_lock_read() gets upset because we're holding another
> > mm's mmap_lock.  Which sounds OK to me, unless a concurrent
> > mmap_write_lock() could jam things up.
> > 
> > But I cannot see why your patch would suddenly trigger this warning -
> > kmap_local_folio() and kmap_atomic() are basically the same thing.
> 
> It is related to your patch but I think what you did made sense on the surface.
> 
> On the surface copy_from_user() should not require pagefaults to be disabled.
> But that side affect of kmap_atomic() was being used here because it looks like
> the code is designed to fallback if the fault was not allowed:[1]
> 
> mm/shmem.c
> ...
>                         page_kaddr = kmap_local_folio(folio, 0);
>                         ret = copy_from_user(page_kaddr,
>                                              (const void __user *)src_addr,
>                                              PAGE_SIZE);
>                         kunmap_local(page_kaddr);
> 
>                         /* fallback to copy_from_user outside mmap_lock */
>                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                         if (unlikely(ret)) {
>                                 *pagep = &folio->page;
>                                 ret = -ENOENT;
>                                 /* don't free the page */
>                                 goto out_unacct_blocks;
>                         }
> ...
> 
> So this is one of those rare places where the kmap_atomic() side effects were
> being depended on...  :-(
> 
> [1] might_fault() does not actually mean the code completes the fault.
> 
> mm/memory.c
> ...
> void __might_fault(const char *file, int line)
> {
>         if (pagefault_disabled())
>                 return;
> ...
> 
> > 
> > I see that __mcopy_atomic() is using plain old kmap(), perhaps to work
> > around this?  But that's 2015 code and I'm not sure we had such
> > detailed lock checking in those days.
> 
> No kmap() can't work around this.  That works because the lock is released just
> above that.
> 
> mm/userfaultfd.c
> ...
>                         mmap_read_unlock(dst_mm);
>                         BUG_ON(!page);
> 
>                         page_kaddr = kmap(page);
>                         err = copy_from_user(page_kaddr,
>                                              (const void __user *) src_addr,
>                                              PAGE_SIZE);
>                         kunmap(page);
> ...
> 
> So I think the correct solution is below because we want to prevent the page
> fault.

I was about to get this patch ready to send when I found this:

commit b6ebaedb4cb1a18220ae626c3a9e184ee39dd248
Author: Andrea Arcangeli <aarcange@redhat.com>
Date:   Fri Sep 4 15:47:08 2015 -0700

    userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

    If the rwsem starves writers it wasn't strictly a bug but lockdep
    doesn't like it and this avoids depending on lowlevel implementation  
    details of the lock.
    
    [akpm@linux-foundation.org: delete weird BUILD_BUG_ON()]
    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Acked-by: Pavel Emelyanov <xemul@parallels.com>
...

So I wonder if the true fix is something to lockdep?

Regardless I'll send the below patch because it will restore things to a
working order.

But I'm CC'ing Andrea for comments.

Ira

> 
> Ira
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 8280a5cb48df..6c8e99bf5983 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2424,9 +2424,11 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> 
>                 if (!zeropage) {        /* COPY */
>                         page_kaddr = kmap_local_folio(folio, 0);
> +                       pagefault_disable()
>                         ret = copy_from_user(page_kaddr,
>                                              (const void __user *)src_addr,
>                                              PAGE_SIZE);
> +                       pagefault_enable()
>                         kunmap_local(page_kaddr);
> 
>                         /* fallback to copy_from_user outside mmap_lock */
> 
> 

  reply	other threads:[~2022-10-22  0:00 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-21  4:10 [shmem] 7a7256d5f5: WARNING:possible_recursive_locking_detected kernel test robot
2022-10-21 13:09 ` Matthew Wilcox
2022-10-21 20:30   ` Andrew Morton
2022-10-21 22:36     ` Matthew Wilcox
2022-10-21 22:48     ` Ira Weiny
2022-10-22  0:00       ` Ira Weiny [this message]
2022-10-22  0:02         ` Randy Dunlap
2022-10-22  0:08           ` Ira Weiny
2022-10-22  0:17         ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y1MymJ/INb45AdaY@iweiny-desk3 \
    --to=ira.weiny@intel.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=fmdefrancesco@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=lkp@lists.01.org \
    --cc=peterx@redhat.com \
    --cc=willy@infradead.org \
    --cc=yujie.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.