llvm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Peter Xu <peterx@redhat.com>
Cc: Hugh Dickins <hughd@google.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yang Shi <shy828301@gmail.com>,
	Matthew Wilcox <willy@infradead.org>,
	syzbot <syzbot+152d76c44ba142f8992b@syzkaller.appspotmail.com>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, llvm@lists.linux.dev, nathan@kernel.org,
	ndesaulniers@google.com, songmuchun@bytedance.com,
	syzkaller-bugs@googlegroups.com, trix@redhat.com
Subject: Re: [syzbot] general protection fault in PageHeadHuge
Date: Tue, 27 Sep 2022 09:24:37 -0700	[thread overview]
Message-ID: <YzMjxY5O6Hf/IPTx@monkey> (raw)
In-Reply-To: <YzDuHbuo2x/b2Mbr@x1n>

On 09/25/22 20:11, Peter Xu wrote:
> On Sat, Sep 24, 2022 at 12:01:16PM -0700, Mike Kravetz wrote:
> > On 09/24/22 11:06, Peter Xu wrote:
> > > 
> > > Sorry I forgot to reply on this one.
> > > 
> > > I didn't try linux-next, but I can easily reproduce this with mm-unstable
> > > already, and I verified that Hugh's patch fixes the problem for shmem.
> > > 
> > > When I was testing I found hugetlb selftest is broken too but with some
> > > other errors:
> > > 
> > > $ sudo ./userfaultfd hugetlb 100 10  
> > > ...
> > > bounces: 6, mode: racing ver read, ERROR: unexpected write fault (errno=0, line=779)
> > > 
> > > The failing check was making sure all MISSING events are not triggered by
> > > writes, but frankly I don't really know why it's required, and that check
> > > existed since the 1st commit when test was introduced.
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c47174fc362a089b1125174258e53ef4a69ce6b8
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/vm/userfaultfd.c?id=c47174fc362a089b1125174258e53ef4a69ce6b8#n291
> > > 
> > > And obviously some recent hugetlb-related change caused that to happen.
> > > 
> > > Dropping that check can definitely work, but I'll have a closer look soon
> > > too to make sure I didn't miss something.  Mike, please also let me know if
> > > you are aware of this problem.
> > > 
> > 
> > Peter, I am not aware of this problem.  I really should make running ALL
> > hugetlb tests part of my regular routine.
> > 
> > If you do not beat me to it, I will take a look in the next few days.
> 
> Just to update - my bisection points to 00cdec99f3eb ("hugetlbfs: revert
> use i_mmap_rwsem to address page fault/truncate race", 2022-09-21).
> 
> I don't understand how they are related so far, though.  It should be a
> timing thing because the failure cannot be reproduced on a VM but only on
> the host, and it can also pass sometimes even on the host but rarely.

Thanks Peter!

After your analysis, I also started looking at this.
- I did reproduce a few times in a VM
- On BM (a laptop) I could reproduce but it would take several (10's of) runs

> Logically all the uffd messages in the stress test should be generated by
> the locking thread, upon:
> 
> 		pthread_mutex_lock(area_mutex(area_dst, page_nr));

I personally find that test program hard to understand/follow and it takes me 
a day or so to understand what it is doing, then I immediately loose context
when I stop looking at it. :(

So, as you mention below the program is depending on pthread_mutex_lock()
doing a read fault before a write.

> I thought a common scheme for lock() fast path should already be an
> userspace cmpxchg() and that should be a write fault already.
> 
> For example, I did some stupid hack on the test and I can trigger the write
> check fault with anonymous easily with an explicit cmpxchg on byte offset 128:
> 
> diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
> index 74babdbc02e5..a7d6938d4553 100644
> --- a/tools/testing/selftests/vm/userfaultfd.c
> +++ b/tools/testing/selftests/vm/userfaultfd.c
> @@ -637,6 +637,10 @@ static void *locking_thread(void *arg)
>                 } else
>                         page_nr += 1;
>                 page_nr %= nr_pages;
> +               char *ptr = area_dst + (page_nr * page_size) + 128;
> +               char _old = 0, new = 1;
> +               (void)__atomic_compare_exchange_n(ptr, &_old, new, false,
> +                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
>                 pthread_mutex_lock(area_mutex(area_dst, page_nr));
>                 count = *area_count(area_dst, page_nr);
>                 if (count != count_verify[page_nr])
> 
> I'll need some more time thinking about it before I send a patch to drop
> the write check..

I did another stupid hack, and duplicated the statement:
	count = *area_count(area_dst, page_nr);
before the,
	pthread_mutex_lock(area_mutex(area_dst, page_nr));

This should guarantee a read fault independent of what pthread_mutex_lock
does.  However, it still results in the occasional "ERROR: unexpected write
fault".  So, something else if happening.  I will continue to experiment
and think about this.
-- 
Mike Kravetz

  reply	other threads:[~2022-09-27 16:25 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-23 16:05 [syzbot] general protection fault in PageHeadHuge syzbot
2022-09-23 21:11 ` Mike Kravetz
2022-09-23 21:37   ` Yang Shi
2022-09-24  0:14   ` Hugh Dickins
2022-09-24  0:18     ` Nathan Chancellor
2022-09-24  7:28       ` Hugh Dickins
2022-09-24  0:58     ` Peter Xu
2022-09-24 15:06       ` Peter Xu
2022-09-24 19:01         ` Mike Kravetz
2022-09-26  0:11           ` Peter Xu
2022-09-27 16:24             ` Mike Kravetz [this message]
2022-09-27 16:45               ` Peter Xu
2022-09-29 23:33                 ` Mike Kravetz
2022-09-30  1:29                   ` Peter Xu
2022-09-30  1:35                     ` Peter Xu
2022-09-30 16:05                   ` Peter Xu
2022-09-30 21:38                     ` Mike Kravetz
2022-10-01  2:47                       ` Peter Xu
2022-10-01  3:01                         ` Peter Xu
2022-10-03  1:16                           ` Mike Kravetz
2022-10-03 15:24                             ` Peter Xu
2022-09-25 18:55     ` Andrew Morton
2022-09-24 21:56   ` Matthew Wilcox
2022-09-27  4:32     ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YzMjxY5O6Hf/IPTx@monkey \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=llvm@lists.linux.dev \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=peterx@redhat.com \
    --cc=shy828301@gmail.com \
    --cc=songmuchun@bytedance.com \
    --cc=syzbot+152d76c44ba142f8992b@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=trix@redhat.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).