linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: syzbot <syzbot+e58112d71f77113ddb7b@syzkaller.appspotmail.com>,
	aarcange@redhat.com, akpm@linux-foundation.org,
	christian@brauner.io, davem@davemloft.net, ebiederm@xmission.com,
	elena.reshetova@intel.com, guro@fb.com, hch@infradead.org,
	james.bottomley@hansenpartnership.com, jglisse@redhat.com,
	keescook@chromium.org, ldv@altlinux.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-parisc@vger.kernel.org, luto@amacapital.net,
	mhocko@suse.com, mingo@kernel.org, namit@vmware.com,
	peterz@infradead.org, syzkaller-bugs@googlegroups.com,
	viro@zeniv.linux.org.uk, wad@chromium.org
Subject: Re: WARNING in __mmdrop
Date: Wed, 24 Jul 2019 04:05:17 -0400	[thread overview]
Message-ID: <20190724040238-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <e0c91b89-d1e8-9831-00fe-23fe92d79fa2@redhat.com>

On Wed, Jul 24, 2019 at 10:17:14AM +0800, Jason Wang wrote:
> 
> On 2019/7/23 下午11:02, Michael S. Tsirkin wrote:
> > On Tue, Jul 23, 2019 at 09:34:29PM +0800, Jason Wang wrote:
> > > On 2019/7/23 下午6:27, Michael S. Tsirkin wrote:
> > > > > Yes, since there could be multiple co-current invalidation requests. We need
> > > > > count them to make sure we don't pin wrong pages.
> > > > > 
> > > > > 
> > > > > > I also wonder about ordering. kvm has this:
> > > > > >           /*
> > > > > >             * Used to check for invalidations in progress, of the pfn that is
> > > > > >             * returned by pfn_to_pfn_prot below.
> > > > > >             */
> > > > > >            mmu_seq = kvm->mmu_notifier_seq;
> > > > > >            /*
> > > > > >             * Ensure the read of mmu_notifier_seq isn't reordered with PTE reads in
> > > > > >             * gfn_to_pfn_prot() (which calls get_user_pages()), so that we don't
> > > > > >             * risk the page we get a reference to getting unmapped before we have a
> > > > > >             * chance to grab the mmu_lock without mmu_notifier_retry() noticing.
> > > > > >             *
> > > > > >             * This smp_rmb() pairs with the effective smp_wmb() of the combination
> > > > > >             * of the pte_unmap_unlock() after the PTE is zapped, and the
> > > > > >             * spin_lock() in kvm_mmu_notifier_invalidate_<page|range_end>() before
> > > > > >             * mmu_notifier_seq is incremented.
> > > > > >             */
> > > > > >            smp_rmb();
> > > > > > 
> > > > > > does this apply to us? Can't we use a seqlock instead so we do
> > > > > > not need to worry?
> > > > > I'm not familiar with kvm MMU internals, but we do everything under of
> > > > > mmu_lock.
> > > > > 
> > > > > Thanks
> > > > I don't think this helps at all.
> > > > 
> > > > There's no lock between checking the invalidate counter and
> > > > get user pages fast within vhost_map_prefetch. So it's possible
> > > > that get user pages fast reads PTEs speculatively before
> > > > invalidate is read.
> > > > 
> > > > -- 
> > > 
> > > In vhost_map_prefetch() we do:
> > > 
> > >          spin_lock(&vq->mmu_lock);
> > > 
> > >          ...
> > > 
> > >          err = -EFAULT;
> > >          if (vq->invalidate_count)
> > >                  goto err;
> > > 
> > >          ...
> > > 
> > >          npinned = __get_user_pages_fast(uaddr->uaddr, npages,
> > >                                          uaddr->write, pages);
> > > 
> > >          ...
> > > 
> > >          spin_unlock(&vq->mmu_lock);
> > > 
> > > Is this not sufficient?
> > > 
> > > Thanks
> > So what orders __get_user_pages_fast wrt invalidate_count read?
> 
> 
> So in invalidate_end() callback we have:
> 
> spin_lock(&vq->mmu_lock);
> --vq->invalidate_count;
>         spin_unlock(&vq->mmu_lock);
> 
> 
> So even PTE is read speculatively before reading invalidate_count (only in
> the case of invalidate_count is zero). The spinlock has guaranteed that we
> won't read any stale PTEs.
> 
> Thanks

I'm sorry I just do not get the argument.
If you want to order two reads you need an smp_rmb
or stronger between them executed on the same CPU.

Executing any kind of barrier on another CPU
will have no ordering effect on the 1st one.


So if CPU1 runs the prefetch, and CPU2 runs invalidate
callback, read of invalidate counter on CPU1 can bypass
read of PTE on CPU1 unless there's a barrier
in between, and nothing CPU2 does can affect that outcome.


What did I miss?

> 
> > 

  reply	other threads:[~2019-07-24  8:05 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-19  3:35 WARNING in __mmdrop syzbot
2019-07-20 10:08 ` syzbot
2019-07-21 10:02   ` Michael S. Tsirkin
2019-07-21 12:18     ` Michael S. Tsirkin
2019-07-22  5:24       ` Jason Wang
2019-07-22  8:08         ` Michael S. Tsirkin
2019-07-23  4:01           ` Jason Wang
2019-07-23  5:01             ` Michael S. Tsirkin
2019-07-23  5:47               ` Jason Wang
2019-07-23  7:23                 ` Michael S. Tsirkin
2019-07-23  7:53                   ` Jason Wang
2019-07-23  8:10                     ` Michael S. Tsirkin
2019-07-23  8:49                       ` Jason Wang
2019-07-23  9:26                         ` Michael S. Tsirkin
2019-07-23 13:31                           ` Jason Wang
2019-07-25  5:52                             ` Michael S. Tsirkin
2019-07-25  7:43                               ` Jason Wang
2019-07-25  8:28                                 ` Michael S. Tsirkin
2019-07-25 13:21                                   ` Jason Wang
2019-07-25 13:26                                     ` Michael S. Tsirkin
2019-07-25 14:25                                       ` Jason Wang
2019-07-26 11:49                                         ` Michael S. Tsirkin
2019-07-26 12:00                                           ` Jason Wang
2019-07-26 12:38                                             ` Michael S. Tsirkin
2019-07-26 12:53                                               ` Jason Wang
2019-07-26 13:36                                                 ` Jason Wang
2019-07-26 13:49                                                   ` Michael S. Tsirkin
2019-07-29  5:54                                                     ` Jason Wang
2019-07-29  8:59                                                       ` Michael S. Tsirkin
2019-07-29 14:24                                                         ` Jason Wang
2019-07-29 14:44                                                           ` Michael S. Tsirkin
2019-07-30  7:44                                                             ` Jason Wang
2019-07-30  8:03                                                               ` Jason Wang
2019-07-30 15:08                                                               ` Michael S. Tsirkin
2019-07-31  8:49                                                                 ` Jason Wang
2019-07-31 23:00                                                                   ` Jason Gunthorpe
2019-07-26 13:47                                                 ` Michael S. Tsirkin
2019-07-26 14:00                                                   ` Jason Wang
2019-07-26 14:10                                                     ` Michael S. Tsirkin
2019-07-26 15:03                                                     ` Jason Gunthorpe
2019-07-29  5:56                                                       ` Jason Wang
2019-07-21 12:28     ` RFC: call_rcu_outstanding (was Re: WARNING in __mmdrop) Michael S. Tsirkin
2019-07-21 13:17       ` Paul E. McKenney
2019-07-21 17:53         ` Michael S. Tsirkin
2019-07-21 19:28           ` Paul E. McKenney
2019-07-22  7:56             ` Michael S. Tsirkin
2019-07-22 11:57               ` Paul E. McKenney
2019-07-21 21:08         ` Matthew Wilcox
2019-07-21 23:31           ` Paul E. McKenney
2019-07-22  7:52             ` Michael S. Tsirkin
2019-07-22 11:51               ` Paul E. McKenney
2019-07-22 13:41                 ` Jason Gunthorpe
2019-07-22 15:52                   ` Paul E. McKenney
2019-07-22 16:04                     ` Jason Gunthorpe
2019-07-22 16:15                       ` Michael S. Tsirkin
2019-07-22 16:15                       ` Paul E. McKenney
2019-07-22 15:14             ` Joel Fernandes
2019-07-22 15:47               ` Michael S. Tsirkin
2019-07-22 15:55                 ` Paul E. McKenney
2019-07-22 16:13                   ` Michael S. Tsirkin
2019-07-22 16:25                     ` Paul E. McKenney
2019-07-22 16:32                       ` Michael S. Tsirkin
2019-07-22 18:58                         ` Paul E. McKenney
2019-07-22  5:21     ` WARNING in __mmdrop Jason Wang
2019-07-22  8:02       ` Michael S. Tsirkin
2019-07-23  3:55         ` Jason Wang
2019-07-23  5:02           ` Michael S. Tsirkin
2019-07-23  5:48             ` Jason Wang
2019-07-23  7:25               ` Michael S. Tsirkin
2019-07-23  7:55                 ` Jason Wang
2019-07-23  7:56               ` Michael S. Tsirkin
2019-07-23  8:42                 ` Jason Wang
2019-07-23 10:27                   ` Michael S. Tsirkin
2019-07-23 13:34                     ` Jason Wang
2019-07-23 15:02                       ` Michael S. Tsirkin
2019-07-24  2:17                         ` Jason Wang
2019-07-24  8:05                           ` Michael S. Tsirkin [this message]
2019-07-24 10:08                             ` Jason Wang
2019-07-24 18:25                               ` Michael S. Tsirkin
2019-07-25  3:44                                 ` Jason Wang
2019-07-25  5:09                                   ` Michael S. Tsirkin
2019-07-24 16:53                             ` Jason Gunthorpe
2019-07-24 18:25                               ` Michael S. Tsirkin
2019-07-23 10:42                   ` Michael S. Tsirkin
2019-07-23 13:37                     ` Jason Wang
2019-07-22 14:11     ` Jason Gunthorpe
2019-07-25  6:02       ` Michael S. Tsirkin
2019-07-25  7:44         ` Jason Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190724040238-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=christian@brauner.io \
    --cc=davem@davemloft.net \
    --cc=ebiederm@xmission.com \
    --cc=elena.reshetova@intel.com \
    --cc=guro@fb.com \
    --cc=hch@infradead.org \
    --cc=james.bottomley@hansenpartnership.com \
    --cc=jasowang@redhat.com \
    --cc=jglisse@redhat.com \
    --cc=keescook@chromium.org \
    --cc=ldv@altlinux.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mhocko@suse.com \
    --cc=mingo@kernel.org \
    --cc=namit@vmware.com \
    --cc=peterz@infradead.org \
    --cc=syzbot+e58112d71f77113ddb7b@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wad@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).