Re: GPF from __srcu_read_lock() via drm_minor_acquire()

From: "Paul E. McKenney" <paulmck@kernel.org>
To: Nick Desaulniers <ndesaulniers@google.com>
Cc: Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	jiangshanlai@gmail.com,
	"Joel Fernandes (Google)" <joel@joelfernandes.org>,
	rcu@vger.kernel.org,
	clang-built-linux <clang-built-linux@googlegroups.com>
Subject: Re: GPF from __srcu_read_lock() via drm_minor_acquire()
Date: Thu, 17 Sep 2020 13:58:44 -0700
Message-ID: <20200917205844.GA1978@paulmck-ThinkPad-P72>
In-Reply-To: <20200916213730.GE29330@paulmck-ThinkPad-P72>

On Wed, Sep 16, 2020 at 02:37:30PM -0700, Paul E. McKenney wrote:
> On Wed, Sep 16, 2020 at 01:48:22PM -0700, Nick Desaulniers wrote:
> > Hey Paul and RCU folks,
> > I noticed we have a bug report from two users who seem to have similar
> > stack traces in SRCU code:
> > https://github.com/ClangBuiltLinux/linux/issues/1081
> > 
> > Is there a way we should go about starting to debug this?
> 
> Hello, Nick,
> 
> Huh.  It looks like the per-CPU memory referenced by the srcu_struct
> structure's ->sda field is unmapped.  That would certainly leave
> the next __srcu_read_lock() dazed and confused!
> 
> The trapping instruction is the increment instruction that I would
> expect to be there.  The source code is as follows:
> 
> 	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
> 	this_cpu_inc(ssp->sda->srcu_lock_count[idx]);
> 	smp_mb();
> 
> Looking at the assembly:
> 
> 	  1e:	55                   	push   %ebp
> 	  1f:	89 e5                	mov    %esp,%ebp
> 
> The above is the function prologue.
> 
> 	  21:	8b 48 68             	mov    0x68(%eax),%ecx
> 
> The above instruction does READ_ONCE(ssp->srcu_idx).
> 
> 	  24:	8b 40 7c             	mov    0x7c(%eax),%eax
> 
> The above instruction fetches ssp->sda into %eax.  I therefore find it
> quite surprising that the dump contains "EAX: 00000000".  Or is this
> register value inaccurate?
> 
> 	  27:	83 e1 01             	and    $0x1,%ecx
> 
> The above instruction does the "& 0x1".  Therefore, at this point,
> %eax contains the address of the per-CPU srcu_data structure, but
> without the per-CPU offset having been applied.  Also, %ecx contains
> the array index, either 0 or 1.  In this dump %ecx is zero, which is
> perfectly legitimate.
> 
> 	  2a:*	64 ff 04 88          	incl   %fs:(%eax,%ecx,4)
> 
> The above instruction does the this_cpu_inc().  Here %fs is presumably
> this CPU's offset from the base address of the per-CPU ->sda pointer.
> 
> 	  2e:	f0 83 44 24 fc 00    	lock addl $0x0,-0x4(%esp)
> 
> The above instruction is the smp_mb().
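
The address arithmetic in the walkthrough above can be sketched in
user-space C.  This is only an illustration: incl_linear_address() is a
made-up helper name, and the fs_base argument stands in for whatever base
%fs supplies on the faulting CPU, which the dump does not show.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: the linear address touched
 * by "incl %fs:(%eax,%ecx,4)" is fs_base + eax + ecx * 4, computed with
 * 32-bit wraparound.
 */
static uint32_t incl_linear_address(uint32_t fs_base, uint32_t eax, uint32_t ecx)
{
	return fs_base + eax + ecx * 4u;
}
```

If the dump is accurate, %eax and %ecx are both zero, so the result is
simply fs_base: the increment lands at offset zero of this CPU's per-CPU
area instead of at an allocated srcu_data structure.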
> 
> So here are a few questions that I would ask:

Oh, and this one:

0.	Did someone call srcu_read_lock() before init_srcu_struct()
	had been called on this srcu_struct structure?
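
	A rough user-space model of that failure mode follows.  It is
	a sketch under assumptions, not kernel code: every model_*
	name is invented, one heap-allocated structure stands in for
	the per-CPU srcu_data, and a zero-initialized structure is
	assumed to have a NULL ->sda.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* User-space model only; these are not the kernel's definitions. */
struct model_srcu_data {
	unsigned long srcu_lock_count[2];
};

struct model_srcu_struct {
	unsigned int srcu_idx;
	struct model_srcu_data *sda;	/* NULL until model_init_srcu() runs */
};

/* Hypothetical init: allocates zeroed counters, returns 0 or -ENOMEM. */
static int model_init_srcu(struct model_srcu_struct *ssp)
{
	ssp->srcu_idx = 0;
	ssp->sda = calloc(1, sizeof(*ssp->sda));
	return ssp->sda ? 0 : -ENOMEM;
}

/* Checked read lock: refuses to touch the counters when ->sda is NULL. */
static int model_srcu_read_lock_checked(struct model_srcu_struct *ssp)
{
	int idx;

	if (!ssp->sda)
		return -EINVAL;	/* used before init or after cleanup */
	idx = ssp->srcu_idx & 0x1;
	ssp->sda->srcu_lock_count[idx]++;
	return idx;
}

/* Hypothetical cleanup: frees the counters and resets ->sda to NULL. */
static void model_cleanup_srcu(struct model_srcu_struct *ssp)
{
	free(ssp->sda);
	ssp->sda = NULL;
}
```

	In this model, a read lock attempted before model_init_srcu()
	or after model_cleanup_srcu() returns -EINVAL rather than
	incrementing through a NULL ->sda.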

							Thanx, Paul

> 1.	Did the init_srcu_struct() for this srcu_struct report an error?
> 	(Though with current mainline, that memory-allocation failure
> 	would more likely have page-faulted in init_srcu_struct().)
> 
> 2.	Has the srcu_struct in question already been passed to
> 	cleanup_srcu_struct()?
> 
> 3.	Has the value of %fs been clobbered?  Though that seems
> 	unlikely given that it also happens on aarch64.  Plus, the
> 	smoking gun seems to me to be the zero value of %eax.
> 
> 4.	If the above three questions fail to provide enlightenment,
> 	I suggest adding debug checks to anything that can unmap
> 	memory, and recording the value of ->sda somewhere so that
> 	you can check whether it is being changed (it should remain
> 	constant from init_srcu_struct()'s return through the
> 	corresponding call to cleanup_srcu_struct()).
> 
> Please let me know how it goes!
> 
> 							Thanx, Paul
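
The ->sda bookkeeping suggested in question 4 might look roughly like the
following user-space sketch.  The structure, the sda_recorded field, and
both helper names are hypothetical; nothing like them is claimed to exist
in the kernel's srcu_struct.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for an srcu_struct with one extra debug field. */
struct model_ssp_dbg {
	void *sda;		/* models ssp->sda */
	void *sda_recorded;	/* hypothetical copy taken right after init */
};

/* Call once, immediately after initialization has set ->sda. */
static void model_record_sda(struct model_ssp_dbg *ssp)
{
	ssp->sda_recorded = ssp->sda;
}

/* True only if ->sda is non-NULL and still equals the recorded value. */
static bool model_sda_unchanged(const struct model_ssp_dbg *ssp)
{
	return ssp->sda != NULL && ssp->sda == ssp->sda_recorded;
}
```

In a real kernel debug patch, a check along these lines could be wrapped
in WARN_ON_ONCE() at the top of the read-lock path so that the first
mismatch produces a stack trace.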