linux-kernel.vger.kernel.org archive mirror
From: Boqun Feng <boqun.feng@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: g@hirez.programming.kicks-ass.net, Qian Cai <cai@redhat.com>,
	Waiman Long <longman@redhat.com>, Ingo Molnar <mingo@redhat.com>,
	Will Deacon <will@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: lockdep null-ptr-deref
Date: Wed, 30 Sep 2020 20:18:18 +0800
Message-ID: <20200930121818.GD17810@debian-boqun.qqnc3lrjykvubdpftowmye0fmh.lx.internal.cloudapp.net>
In-Reply-To: <20200930094937.GE2651@hirez.programming.kicks-ass.net>

On Wed, Sep 30, 2020 at 11:49:37AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 30, 2020 at 11:16:11AM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 30, 2020 at 07:08:23AM +0800, Boqun Feng wrote:
> > > I think there are two problems here:
> > > 
> > > 1) the "(null)" means we don't have the "usage_str" for a usage bit,
> > > which I think is the LOCK_USED_READ bit introduced by Peter at
> > > 23870f122768 ('locking/lockdep: Fix "USED" <- "IN-NMI" inversions').
> > > 
> > > 2) the next null-ptr-deref, and I think this is also caused by
> > > LOCK_USED_READ bit, because in the loop inside
> > > print_lock_class_header(), we iterate from 0 to LOCK_USAGE_STATES (which
> > > is 4*2 + 2), however the class->usage_traces[] only has
> > > XXX_LOCK_USAGE_STATES (which is 4*2 + 1) elements, so if we have
> > > LOCK_USED_READ bit set in ->usage_mask, we will try to access an element
> > > out of the ->usage_traces[] array.
> > > 
> > > Probably the following helps? And another possible fix is to enlarge the
> > > ->usage_traces[] array and record the call trace of LOCK_USED_READ.
> > 
> > Urgh.. yeah, I wanted to avoid saving that trace; it's pretty useless :/
> > The existing USED trace is already mostly pointless, the consistent
> > thing would be to remove both but that might be too radical.
> > 
> > But you're right in that I made a right mess of it. Not sure what's
> > best here.
> > 
> > Let me have a play.
> 
> How's something like this? It's bigger than I'd like, but I feel the
> result is more consistent/readable.
> 

Looks good to me.

For one thing, I do think that the LOCK_USED_READ trace is helpful for
better reporting: if a read lock in the dependency path causes the
deadlock, the LOCK_USED_READ trace tells us at least where the initial
READ usage happened. For example, if we have
	void f1(...)
	{
		write_lock(&A);
		spin_lock(&C);
		// A -> C
		...
	}

	void g(...)
	{
		read_lock(&A);
		...
	}

	void f2(...)
	{
		spin_lock(&B);
		g(...);
		// B -> A
	}

	void f3(...)
	{
		spin_lock(&C);
		spin_lock(&B);
		// C -> B, trigger lockdep splat
	}

when lockdep reports the deadlock (at the time f3() is called), it will
be useful if we have a trace like:

	INITIAL READ usage at:
	g+0x.../0x...
	f2+0x.../0x...

Thoughts?

Regards,
Boqun

> ---
> diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
> index bb35b449f533..a55b1d314ae8 100644
> --- a/include/linux/lockdep_types.h
> +++ b/include/linux/lockdep_types.h
> @@ -35,8 +35,12 @@ enum lockdep_wait_type {
>  /*
>   * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
>   * the total number of states... :-(
> + *
> + * XXX_LOCK_USAGE_STATES is the number of lines in lockdep_states.h, for each
> + * of those we generates 4 states, Additionally we (for now) report on USED.
>   */
> -#define XXX_LOCK_USAGE_STATES		(1+2*4)
> +#define XXX_LOCK_USAGE_STATES		2
> +#define LOCK_TRACE_STATES		(XXX_LOCK_USAGE_STATES*4 + 1)
>  
>  /*
>   * NR_LOCKDEP_CACHING_CLASSES ... Number of classes
> @@ -106,7 +110,7 @@ struct lock_class {
>  	 * IRQ/softirq usage tracking bits:
>  	 */
>  	unsigned long			usage_mask;
> -	const struct lock_trace		*usage_traces[XXX_LOCK_USAGE_STATES];
> +	const struct lock_trace		*usage_traces[LOCK_TRACE_STATES];
>  
>  	/*
>  	 * Generation counter, when doing certain classes of graph walking,
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 454355c033d2..4f98ac8b4575 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -600,6 +600,8 @@ static const char *usage_str[] =
>  #include "lockdep_states.h"
>  #undef LOCKDEP_STATE
>  	[LOCK_USED] = "INITIAL USE",
> +	[LOCK_USED_READ] = "INITIAL READ USE",
> +	/* abused as string storage for verify_lock_unused() */
>  	[LOCK_USAGE_STATES] = "IN-NMI",
>  };
>  #endif
> @@ -2231,7 +2233,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
>  #endif
>  	printk(KERN_CONT " {\n");
>  
> -	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
> +	for (bit = 0; bit < LOCK_TRACE_STATES; bit++) {
>  		if (class->usage_mask & (1 << bit)) {
>  			int len = depth;
>  
> @@ -4354,27 +4356,24 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
>  	old_mask = hlock_class(this)->usage_mask;
>  	hlock_class(this)->usage_mask |= new_mask;
>  
> -	/*
> -	 * Save one usage_traces[] entry and map both LOCK_USED and
> -	 * LOCK_USED_READ onto the same entry.
> -	 */
> -	if (new_bit == LOCK_USED || new_bit == LOCK_USED_READ) {
> -		if (old_mask & (LOCKF_USED | LOCKF_USED_READ))
> -			goto unlock;
> -		new_bit = LOCK_USED;
> +	if (new_bit < LOCK_TRACE_STATES) {
> +		if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
> +			return 0;
>  	}
>  
> -	if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
> -		return 0;
> -
>  	switch (new_bit) {
> +	case 0 ... LOCK_USED-1:
> +		ret = mark_lock_irq(curr, this, new_bit);
> +		if (!ret)
> +			return 0;
> +		break;
> +
>  	case LOCK_USED:
>  		debug_atomic_dec(nr_unused_locks);
>  		break;
> +
>  	default:
> -		ret = mark_lock_irq(curr, this, new_bit);
> -		if (!ret)
> -			return 0;
> +		break;
>  	}
>  
>  unlock:
> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
> index b0be1560ed17..67dc46e46552 100644
> --- a/kernel/locking/lockdep_internals.h
> +++ b/kernel/locking/lockdep_internals.h
> @@ -20,9 +20,12 @@ enum lock_usage_bit {
>  #undef LOCKDEP_STATE
>  	LOCK_USED,
>  	LOCK_USED_READ,
> -	LOCK_USAGE_STATES
> +	LOCK_USAGE_STATES,
>  };
>  
> +/* states after LOCK_USED are not traced and printed */
> +static_assert(LOCK_TRACE_STATES == LOCK_USED_READ);
> +
>  #define LOCK_USAGE_READ_MASK 1
>  #define LOCK_USAGE_DIR_MASK  2
>  #define LOCK_USAGE_STATE_MASK (~(LOCK_USAGE_READ_MASK | LOCK_USAGE_DIR_MASK))
> @@ -121,7 +124,7 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
>  extern struct list_head all_lock_classes;
>  extern struct lock_chain lock_chains[];
>  
> -#define LOCK_USAGE_CHARS (1+LOCK_USAGE_STATES/2)
> +#define LOCK_USAGE_CHARS (2*XXX_LOCK_USAGE_STATES + 1)
>  
>  extern void get_usage_chars(struct lock_class *class,
>  			    char usage[LOCK_USAGE_CHARS]);
