From: Peter Zijlstra <peterz@infradead.org>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org,
	tip-bot2 for Peter Zijlstra <tip-bot2@linutronix.de>,
	Qian Cai <cai@redhat.com>, x86 <x86@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>
Subject: Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
Date: Wed, 28 Oct 2020 20:59:10 +0100
Message-ID: <20201028195910.GI2651@hirez.programming.kicks-ass.net>
In-Reply-To: <20201028194208.GF2628@hirez.programming.kicks-ass.net>

On Wed, Oct 28, 2020 at 08:42:09PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 28, 2020 at 05:40:48PM +0000, Chris Wilson wrote:
> > Quoting Chris Wilson (2020-10-27 16:34:53)
> > > Quoting Peter Zijlstra (2020-10-27 15:45:33)
> > > > On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:
> > > > 
> > > > > <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> > > > > <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
> > > > 
> > > > Urgh, I don't think I've _ever_ seen that warning trigger.
> > > > 
> > > > The comments that go with it suggest memory corruption is the most
> > > > likely trigger of it. Is it easy to trigger?
> > > 
> > > For the automated CI, yes, the few machines that run that particular HW
> > > test seem to hit it regularly. I have not yet reproduced it for myself.
> > > I thought it looked like something kasan would provide some insight for
> > > and we should get a kasan run through CI over the w/e. I suspect we've
> > > feed in some garbage and called it a lock.
> > 
> > I tracked it down to a second invocation of lock_acquire_shared_recursive()
> > intermingled with some other regular mutexes (in this case ww_mutex).
> > 
> > We hit this path in validate_chain():
> > 	/*
> > 	 * Mark recursive read, as we jump over it when
> > 	 * building dependencies (just like we jump over
> > 	 * trylock entries):
> > 	 */
> > 	if (ret == 2)
> > 		hlock->read = 2;
> > 
> > and that is modifying hlock_id() and so the chain-key, after it has
> > already been computed.
> 
> Ooh, interesting.. I'll have to go look at this in the morning, brain is
> fried already. Thanks for digging into it.

So that's commit f611e8cf98ec ("lockdep: Take read/write status in
consideration when generate chainkey") that did that.

So validate_chain() requires the new chain_key, but can change ->read
which then invalidates the chain_key we just calculated.

This happens when check_deadlock() returns 2, which only happens when:

  - next->read == 2 && ... ; however @next is our @hlock, so that's
    pointless

  - when there's a nest_lock involved ; ww_mutex uses that !!!

I suppose something like the below _might_ just do it, but I haven't
compiled it, and like I said, my brain is fried.

Boqun, could you have a look, you're a few timezones ahead of us so your
morning is earlier ;-)

---

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 3e99dfef8408..3caf63532bc2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3556,7 +3556,7 @@ static inline int lookup_chain_cache_add(struct task_struct *curr,
 
 static int validate_chain(struct task_struct *curr,
 			  struct held_lock *hlock,
-			  int chain_head, u64 chain_key)
+			  int chain_head, u64 *chain_key)
 {
 	/*
 	 * Trylock needs to maintain the stack of held locks, but it
@@ -3568,6 +3568,7 @@ static int validate_chain(struct task_struct *curr,
 	 * (If lookup_chain_cache_add() return with 1 it acquires
 	 * graph_lock for us)
 	 */
+again:
 	if (!hlock->trylock && hlock->check &&
-	    lookup_chain_cache_add(curr, hlock, chain_key)) {
+	    lookup_chain_cache_add(curr, hlock, *chain_key)) {
 		/*
@@ -3597,8 +3598,12 @@ static int validate_chain(struct task_struct *curr,
 		 * building dependencies (just like we jump over
 		 * trylock entries):
 		 */
-		if (ret == 2)
+		if (ret == 2) {
 			hlock->read = 2;
+			*chain_key = iterate_chain_key(hlock->prev_chain_key, hlock_id(hlock));
+			goto again;
+		}
+
 		/*
 		 * Add dependency only if this lock is not the head
 		 * of the chain, and if it's not a secondary read-lock:
@@ -3620,7 +3625,7 @@ static int validate_chain(struct task_struct *curr,
 #else
 static inline int validate_chain(struct task_struct *curr,
 				 struct held_lock *hlock,
-				 int chain_head, u64 chain_key)
+				 int chain_head, u64 *chain_key)
 {
 	return 1;
 }
@@ -4834,7 +4839,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 		WARN_ON_ONCE(!hlock_class(hlock)->key);
 	}
 
-	if (!validate_chain(curr, hlock, chain_head, chain_key))
+	if (!validate_chain(curr, hlock, chain_head, &chain_key))
 		return 0;
 
 	curr->curr_chain_key = chain_key;
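
The invalidation described above can be shown in a toy userspace model. This is only a sketch under stated assumptions, not the lockdep code: `struct toy_held_lock`, `toy_hlock_id()`, `toy_iterate_chain_key()` and `toy_validate_chain()` are hypothetical stand-ins, the 13-bit shift is an assumed layout, and the multiply-based mixer is a placeholder for the kernel's hash. It shows that flipping `->read` changes the lock's id and therefore the chain key, and that passing the key by pointer lets the caller see the recomputed value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct held_lock; only the fields the key depends on. */
struct toy_held_lock {
	uint64_t prev_chain_key;
	unsigned int class_idx;
	unsigned int read;	/* 0 = write, 1 = read, 2 = recursive read */
};

/* Assumed layout: read state packed above a 13-bit class index. */
static unsigned int toy_hlock_id(const struct toy_held_lock *hlock)
{
	return hlock->class_idx | (hlock->read << 13);
}

/* Placeholder mixer (FNV-style multiply), not the kernel's hash. */
static uint64_t toy_iterate_chain_key(uint64_t key, unsigned int id)
{
	return (key ^ id) * 0x100000001b3ULL;
}

/*
 * Mimics the flow of the patch: when the deadlock check reports 2,
 * flip ->read and refresh the caller's key through the pointer.
 */
static void toy_validate_chain(struct toy_held_lock *hlock,
			       uint64_t *chain_key, int deadlock_ret)
{
	assert(hlock != NULL && chain_key != NULL);
	if (deadlock_ret == 2) {
		hlock->read = 2;
		*chain_key = toy_iterate_chain_key(hlock->prev_chain_key,
						   toy_hlock_id(hlock));
	}
}
```

With a by-value key, the caller would keep the key computed for the old `->read` and store it as the current chain key, which is the mismatch check_chain_key() reports.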
