linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Boqun Feng <boqun.feng@gmail.com>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Laurent Dufour <ldufour@linux.ibm.com>,
	Suren Baghdasaryan <surenb@google.com>
Subject: Re: Silencing false lockdep warning related to seq lock
Date: Wed, 19 May 2021 12:50:27 +0800	[thread overview]
Message-ID: <YKSZE1HfOk1nh/5A@boqun-archlinux> (raw)
In-Reply-To: <CAEXW_YTiL_hsqw9f0fiXWYen8i8R=Uj+eYM8tBaV-K6Hw1CRSQ@mail.gmail.com>

On Tue, May 18, 2021 at 11:53:57AM -0400, Joel Fernandes wrote:
> On Mon, May 17, 2021 at 10:24 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> > > [...]
> > > > > After apply Laurent's SPF patchset [1] , we're facing a large number
> > > > > of (seemingly false positive) lockdep reports which are related to
> > > > > circular dependencies with seq locks.
> > > > >
> > > > >  lock(A); write_seqcount(B)
> > > > >   vs.
> > > > > write_seqcount(B); lock(A)
> > > > >
> > > >
> > > > Two questions here:
> > > >
> > > > *   Could you provide the lockdep splats you saw? I wonder whether
> > > >     it's similar to the one mentioned in patch #9[1].
> > >
> > > Sure I have appended them to this email. Here is the tree with Laurent's
> > > patches applied, in case you want to take a look:
> > > https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/chromeos-5.4
> > >
> > > Yes, the splat is similar to the one mentioned in that patch #9, however the
> > > first splat I appended below shows an issue with lockdep false positive -
> > > there is clearly problem with lockdep where it thinks following sequence is
> > > bad, when in fact it is not:
> > >
> > >     init process (skipped some functions):
> > >     exec->
> > >      flush_old_exec->
> > >       exit_mmap ->
> > >         free_pgtables->
> > >           vm_write_begin(vma) // Y: acquires seq lock in write mode
> > >              unlink_anon_vmas // Z: acquires anon_vma->rwsem
> > >
> > >     exec->
> > >     load_elf_binary->
> > >       -> setup_arg_pages
> > >         -> expand_downwards (in the if (grow <=) block)
> > >            mm->page_table_lock // X
> > >            vm_write_begin(vma) // Y: acquires seq lock
> > >
> > >     exec->
> > >      do_execve_file->
> > >        ->get_user_pages
> > >          -> handle_pte_fault
> > >           -> anon_vma_prepare
> > >               -> acquire anon_vma->rwsem  // Z
> > >               -> acquire mm->page_table_lock // X
> > >
> > >     If vm_write_begin ever had to wait, then it could lockup like this if following happened concurrently:
> > >     Acquire->TryToAcquire
> > >     Y->Z
> > >     X->Y
> > >     Z->X
> > >
> > >     But Y can never result in a wait since it is a sequence lock. So this is
> > >     a lockdep false positive.
> > >
> > > >
> > > > *   What keeps write_seqcount(vm_seqcount) serialized? If it's only
> > > >     one lock that serializes the writers, we probably can make it
> > > >     as the nest_lock argument for seqcount_acquire(), and that will
> > > >     help prevent the false positives.
> > >
> > > I think the write seqcount is serialized by the mmap_sem in all the code
> > > paths I audited in Laurent's patchset.
> > >
> > > I am not sure how making it nest_lock argument will help though? The issue is
> > > that lockdep's understanding of seqcount needs to relate seqcount readers to
> >
> > The thing lockdep will not report deadlock for the following sequences:
> >
> >         T1:
> >         lock(A);
> >         lock_nest(B, A);
> >         lock(C);
> >
> >         T2:
> >         lock(A);
> >         lock(C);
> >         lock_nest(B, A);
> >
> > because with the nest_lock, lockdep would know A serializes B, so the
> > following case cannot happen:
> >
> >         CPU 0                   CPU 1
> >         ============            ============
> >         lock_nest(B,A);
> >                                 lock(C);
> >         lock(C);
> >                                 lock_nest(B, A);
> >
> > becaue either CPU 0 or CPU 1 will already hold A, so they are
> > serialized. So nest_lock can solve part of the problem if all the
> > writers of vm_seqcount are serialized somehow.
> >
> > Yes, seqcount writers cannot block each other, however, they are
> > supposed to be seralized with each other, right? So if we have the
> > reason to believe the above two CPUs case can happen, though it's not
> > a deadlock, but it's a data race, right?
> >
> > I think the proper fix here is to annotate vm_seqcount with the correct
> > locks serializing the writers.
> >
> 
> I agree with you now and that's the best way forward. I will work on
> something like this (unless you already did), thanks Boqun!
> 

Go ahead! I haven't started anything yet, although I did read a little
to understand what is needed to change.

Regards,
Boqun

> -Joel

      reply	other threads:[~2021-05-19  4:50 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-14 14:52 Silencing false lockdep warning related to seq lock Joel Fernandes
2021-05-17  4:21 ` Boqun Feng
2021-05-18  1:52   ` Joel Fernandes
2021-05-18  2:24     ` Boqun Feng
2021-05-18 15:53       ` Joel Fernandes
2021-05-19  4:50         ` Boqun Feng [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YKSZE1HfOk1nh/5A@boqun-archlinux \
    --to=boqun.feng@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=ldufour@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=surenb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).