linux-kernel.vger.kernel.org archive mirror
* Re: Default enable RCU list lockdep debugging with PROVE_RCU
@ 2020-05-14 12:25 Stephen Rothwell
  2020-05-14 12:31 ` Qian Cai
  0 siblings, 1 reply; 14+ messages in thread
From: Stephen Rothwell @ 2020-05-14 12:25 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Madhuparna Bhowmik, Amol Grover, Dmitry Vyukov


Hi Paul,

This patch in the rcu tree

  d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")

is causing whack-a-mole in the syzbot testing of linux-next.  Because
they always do a debug build of linux-next, no testing is getting done. :-(

Can we find another way to find all the bugs that are being discovered
(very slowly)?
-- 
Cheers,
Stephen Rothwell



* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 12:25 Default enable RCU list lockdep debugging with PROVE_RCU Stephen Rothwell
@ 2020-05-14 12:31 ` Qian Cai
  2020-05-14 13:33   ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Qian Cai @ 2020-05-14 12:31 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Paul E. McKenney, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov



> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> 
> Hi Paul,
> 
> This patch in the rcu tree
> 
>  d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
> 
> is causing whack-a-mole in the syzbot testing of linux-next.  Because
> they always do a debug build of linux-next, no testing is getting done. :-(
> 
> Can we find another way to find all the bugs that are being discovered
> (very slowly)?

Alternatively, could syzbot use PROVE_RCU=n temporarily, since it can’t keep up with these reports? I personally find PROVE_RCU_LIST=y still useful for my linux-next testing, and I don’t want to lose that coverage overnight.


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 12:31 ` Qian Cai
@ 2020-05-14 13:33   ` Paul E. McKenney
  2020-05-14 13:39     ` Qian Cai
  2020-05-14 13:44     ` Qian Cai
  0 siblings, 2 replies; 14+ messages in thread
From: Paul E. McKenney @ 2020-05-14 13:33 UTC (permalink / raw)
  To: Qian Cai
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov

On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
> 
> 
> > On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> > 
> > Hi Paul,
> > 
> > This patch in the rcu tree
> > 
> >  d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
> > 
> > is causing whack-a-mole in the syzbot testing of linux-next.  Because
> > they always do a debug build of linux-next, no testing is getting done. :-(
> > 
> > Can we find another way to find all the bugs that are being discovered
> > (very slowly)?
> 
> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.

The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
to test without PROVE_LOCKING is a no-go in my opinion.  But of course
on the other hand if there is no testing of RCU list lockdep debugging,
those issues will never be found, let alone fixed.

One approach would be to do as Stephen asks (either remove d13fee049fa8
or pull it out of -next) and have testers force-enable the RCU list
lockdep debugging.

Would that work for you?

							Thanx, Paul


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 13:33   ` Paul E. McKenney
@ 2020-05-14 13:39     ` Qian Cai
  2020-05-14 13:44     ` Qian Cai
  1 sibling, 0 replies; 14+ messages in thread
From: Qian Cai @ 2020-05-14 13:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov



> On May 14, 2020, at 9:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
>> 
>> 
>>> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>>> 
>>> Hi Paul,
>>> 
>>> This patch in the rcu tree
>>> 
>>> d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
>>> 
>>> is causing whack-a-mole in the syzbot testing of linux-next.  Because
>>> they always do a debug build of linux-next, no testing is getting done. :-(
>>> 
>>> Can we find another way to find all the bugs that are being discovered
>>> (very slowly)?
>> 
>> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.
> 
> The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
> to test without PROVE_LOCKING is a no-go in my opinion.  But of course
> on the other hand if there is no testing of RCU list lockdep debugging,
> those issues will never be found, let alone fixed.
> 
> One approach would be to do as Stephen asks (either remove d13fee049fa8
> or pull it out of -next) and have testers force-enable the RCU list
> lockdep debugging.
> 
> Would that work for you?

Yes, if there is a way to enable PROVE_RCU_LIST=y manually, that is fine. I think we would want to make it easier to enable, though. Currently, it is buried under RCU_EXPERT?
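
To make the "buried" point concrete: an option hidden this way typically carries a dependency on RCU_EXPERT, so it does not even show up in menuconfig unless RCU_EXPERT is set. A rough sketch of that shape follows; the prompt text and the exact lines are assumptions, not a copy of kernel/rcu/Kconfig.debug:

	# Sketch only; not copied from the tree.
	config PROVE_RCU_LIST
		bool "RCU list lockdep debugging"
		depends on PROVE_RCU && RCU_EXPERT
		default n

With a definition like this, putting CONFIG_PROVE_RCU_LIST=y into a .config that lacks CONFIG_RCU_EXPERT=y has no effect, because the unmet dependency keeps the symbol off.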


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 13:33   ` Paul E. McKenney
  2020-05-14 13:39     ` Qian Cai
@ 2020-05-14 13:44     ` Qian Cai
  2020-05-14 13:54       ` Paul E. McKenney
  1 sibling, 1 reply; 14+ messages in thread
From: Qian Cai @ 2020-05-14 13:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov



> On May 14, 2020, at 9:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
>> 
>> 
>>> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>>> 
>>> Hi Paul,
>>> 
>>> This patch in the rcu tree
>>> 
>>> d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
>>> 
>>> is causing whack-a-mole in the syzbot testing of linux-next.  Because
>>> they always do a debug build of linux-next, no testing is getting done. :-(
>>> 
>>> Can we find another way to find all the bugs that are being discovered
>>> (very slowly)?
>> 
>> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.
> 
> The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
> to test without PROVE_LOCKING is a no-go in my opinion.  But of course
> on the other hand if there is no testing of RCU list lockdep debugging,
> those issues will never be found, let alone fixed.
> 
> One approach would be to do as Stephen asks (either remove d13fee049fa8
> or pull it out of -next) and have testers force-enable the RCU list
> lockdep debugging.
> 
> Would that work for you?

Alternatively, how about having

PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT

since it is only syzbot that can’t keep up with it?
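
As a rough illustration, such a conditional default could be written with stacked "default" lines, where the first one whose condition holds wins. This is a sketch, not a tested patch; dropping the prompt and the RCU_EXPERT dependency are assumptions made here for brevity:

	# Sketch only: off for syzbot builds, on for everyone else with PROVE_RCU.
	config PROVE_RCU_LIST
		bool
		depends on PROVE_RCU
		default n if DEBUG_AID_FOR_SYZBOT
		default y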


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 13:44     ` Qian Cai
@ 2020-05-14 13:54       ` Paul E. McKenney
  2020-05-14 14:03         ` Qian Cai
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2020-05-14 13:54 UTC (permalink / raw)
  To: Qian Cai
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov

On Thu, May 14, 2020 at 09:44:28AM -0400, Qian Cai wrote:
> 
> 
> > On May 14, 2020, at 9:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
> >> 
> >> 
> >>> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >>> 
> >>> Hi Paul,
> >>> 
> >>> This patch in the rcu tree
> >>> 
> >>> d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
> >>> 
> >>> is causing whack-a-mole in the syzbot testing of linux-next.  Because
> >>> they always do a debug build of linux-next, no testing is getting done. :-(
> >>> 
> >>> Can we find another way to find all the bugs that are being discovered
> >>> (very slowly)?
> >> 
> >> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.
> > 
> > The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
> > to test without PROVE_LOCKING is a no-go in my opinion.  But of course
> > on the other hand if there is no testing of RCU list lockdep debugging,
> > those issues will never be found, let alone fixed.
> > 
> > One approach would be to do as Stephen asks (either remove d13fee049fa8
> > or pull it out of -next) and have testers force-enable the RCU list
> > lockdep debugging.
> > 
> > Would that work for you?
> 
> Alternatively, how about having
> 
> PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT
> 
> since it is only syzbot can’t keep up with it?

Sounds good to me, assuming that this works for the syzkaller guys.
Or could there be a "select PROVE_RCU_LIST" for the people who would
like to test it?
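
One way to picture the "select" variant is a small opt-in symbol that pulls PROVE_RCU_LIST in. The name PROVE_RCU_LIST_OPT_IN below is hypothetical and exists only for this sketch:

	# Hypothetical opt-in symbol; the name is made up for illustration.
	config PROVE_RCU_LIST_OPT_IN
		bool "Opt in to RCU list lockdep debugging"
		depends on PROVE_RCU
		select PROVE_RCU_LIST

Note that "select" forces the selected symbol on without checking its dependencies, so this would enable PROVE_RCU_LIST even with RCU_EXPERT unset.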

Alternatively, if we revert d13fee049fa8 from -next, I could provide
you a script that updates your .config to set both RCU_EXPERT and
PROVE_RCU_LIST.

There are a lot of ways to approach this.

So what would work best for everyone?

							Thanx, Paul


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 13:54       ` Paul E. McKenney
@ 2020-05-14 14:03         ` Qian Cai
  2020-05-14 15:34           ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Qian Cai @ 2020-05-14 14:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov



> On May 14, 2020, at 9:54 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Thu, May 14, 2020 at 09:44:28AM -0400, Qian Cai wrote:
>> 
>> 
>>> On May 14, 2020, at 9:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
>>> 
>>> On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
>>>> 
>>>> 
>>>>> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>>>>> 
>>>>> Hi Paul,
>>>>> 
>>>>> This patch in the rcu tree
>>>>> 
>>>>> d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
>>>>> 
>>>>> is causing whack-a-mole in the syzbot testing of linux-next.  Because
>>>>> they always do a debug build of linux-next, no testing is getting done. :-(
>>>>> 
>>>>> Can we find another way to find all the bugs that are being discovered
>>>>> (very slowly)?
>>>> 
>>>> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.
>>> 
>>> The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
>>> to test without PROVE_LOCKING is a no-go in my opinion.  But of course
>>> on the other hand if there is no testing of RCU list lockdep debugging,
>>> those issues will never be found, let alone fixed.
>>> 
>>> One approach would be to do as Stephen asks (either remove d13fee049fa8
>>> or pull it out of -next) and have testers force-enable the RCU list
>>> lockdep debugging.
>>> 
>>> Would that work for you?
>> 
>> Alternatively, how about having
>> 
>> PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT
>> 
>> since it is only syzbot can’t keep up with it?
> 
> Sound good to me, assuming that this works for the syzkaller guys.
> Or could there be a "select PROVE_RCU_LIST" for the people who would
> like to test it.
> 
> Alternatively, if we revert d13fee049fa8 from -next, I could provide
> you a script that updates your .config to set both RCU_EXPERT and
> PROVE_RCU_LIST.
> 
> There are a lot of ways to appraoch this.
> 
> So what would work best for everyone?


If “PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT” works for the syzbot guys, that would be great: other testing agents could still report and fix those RCU-list bugs, paving the way for syzbot to return once all those false positives have been sorted out.

Otherwise, “select PROVE_RCU_LIST” *might* be better than burying it under RCU_EXPERT, where we will probably never see those false positives addressed, since my configs do not cover a wide range of subsystems and probably not many other bots would enable RCU_EXPERT.


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 14:03         ` Qian Cai
@ 2020-05-14 15:34           ` Paul E. McKenney
  2020-05-14 15:46             ` Qian Cai
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2020-05-14 15:34 UTC (permalink / raw)
  To: Qian Cai
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov

On Thu, May 14, 2020 at 10:03:21AM -0400, Qian Cai wrote:
> 
> 
> > On May 14, 2020, at 9:54 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > On Thu, May 14, 2020 at 09:44:28AM -0400, Qian Cai wrote:
> >> 
> >> 
> >>> On May 14, 2020, at 9:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> >>> 
> >>> On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
> >>>> 
> >>>> 
> >>>>> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >>>>> 
> >>>>> Hi Paul,
> >>>>> 
> >>>>> This patch in the rcu tree
> >>>>> 
> >>>>> d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
> >>>>> 
> >>>>> is causing whack-a-mole in the syzbot testing of linux-next.  Because
> >>>>> they always do a debug build of linux-next, no testing is getting done. :-(
> >>>>> 
> >>>>> Can we find another way to find all the bugs that are being discovered
> >>>>> (very slowly)?
> >>>> 
> >>>> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.
> >>> 
> >>> The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
> >>> to test without PROVE_LOCKING is a no-go in my opinion.  But of course
> >>> on the other hand if there is no testing of RCU list lockdep debugging,
> >>> those issues will never be found, let alone fixed.
> >>> 
> >>> One approach would be to do as Stephen asks (either remove d13fee049fa8
> >>> or pull it out of -next) and have testers force-enable the RCU list
> >>> lockdep debugging.
> >>> 
> >>> Would that work for you?
> >> 
> >> Alternatively, how about having
> >> 
> >> PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT
> >> 
> >> since it is only syzbot can’t keep up with it?
> > 
> > Sound good to me, assuming that this works for the syzkaller guys.
> > Or could there be a "select PROVE_RCU_LIST" for the people who would
> > like to test it.
> > 
> > Alternatively, if we revert d13fee049fa8 from -next, I could provide
> > you a script that updates your .config to set both RCU_EXPERT and
> > PROVE_RCU_LIST.
> > 
> > There are a lot of ways to appraoch this.
> > 
> > So what would work best for everyone?
> 
> 
> If PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT works for syzbot guys, that would be great, so other testing agents could still report/fix those RCU-list bugs and then pave a way for syzbot to return back once all those false positives had been sorted out.

On that, I must defer to the syzbot guys.

> Otherwise,  “select PROVE_RCU_LIST” *might* be better than buried into RCU_EXPERT where we will probably never saw those false positives been addressed since my configs does not cover a wide range of subsystems and probably not many other bots would enable RCU_EXPERT.

Yet another option would be to edit your local kernel/rcu/Kconfig.debug
and change the code to the following:

	config PROVE_RCU_LIST
		def_bool y
		help
		  Enable RCU lockdep checking for list usages. It is default
		  enabled with CONFIG_PROVE_RCU.

Removing the RCU_EXPERT dependency would not go over at all well with
some people whose opinions are difficult to ignore.  ;-)

							Thanx, Paul


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 15:34           ` Paul E. McKenney
@ 2020-05-14 15:46             ` Qian Cai
  2020-05-14 18:13               ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Qian Cai @ 2020-05-14 15:46 UTC (permalink / raw)
  To: paulmck
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov



> On May 14, 2020, at 11:34 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Thu, May 14, 2020 at 10:03:21AM -0400, Qian Cai wrote:
>> 
>> 
>>> On May 14, 2020, at 9:54 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
>>> 
>>> On Thu, May 14, 2020 at 09:44:28AM -0400, Qian Cai wrote:
>>>> 
>>>> 
>>>>> On May 14, 2020, at 9:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
>>>>> 
>>>>> On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
>>>>>> 
>>>>>> 
>>>>>>> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>>>>>>> 
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> This patch in the rcu tree
>>>>>>> 
>>>>>>> d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
>>>>>>> 
>>>>>>> is causing whack-a-mole in the syzbot testing of linux-next.  Because
>>>>>>> they always do a debug build of linux-next, no testing is getting done. :-(
>>>>>>> 
>>>>>>> Can we find another way to find all the bugs that are being discovered
>>>>>>> (very slowly)?
>>>>>> 
>>>>>> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.
>>>>> 
>>>>> The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
>>>>> to test without PROVE_LOCKING is a no-go in my opinion.  But of course
>>>>> on the other hand if there is no testing of RCU list lockdep debugging,
>>>>> those issues will never be found, let alone fixed.
>>>>> 
>>>>> One approach would be to do as Stephen asks (either remove d13fee049fa8
>>>>> or pull it out of -next) and have testers force-enable the RCU list
>>>>> lockdep debugging.
>>>>> 
>>>>> Would that work for you?
>>>> 
>>>> Alternatively, how about having
>>>> 
>>>> PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT
>>>> 
>>>> since it is only syzbot can’t keep up with it?
>>> 
>>> Sound good to me, assuming that this works for the syzkaller guys.
>>> Or could there be a "select PROVE_RCU_LIST" for the people who would
>>> like to test it.
>>> 
>>> Alternatively, if we revert d13fee049fa8 from -next, I could provide
>>> you a script that updates your .config to set both RCU_EXPERT and
>>> PROVE_RCU_LIST.
>>> 
>>> There are a lot of ways to appraoch this.
>>> 
>>> So what would work best for everyone?
>> 
>> 
>> If PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT works for syzbot guys, that would be great, so other testing agents could still report/fix those RCU-list bugs and then pave a way for syzbot to return back once all those false positives had been sorted out.
> 
> On that, I must defer to the syzbot guys.
> 
>> Otherwise,  “select PROVE_RCU_LIST” *might* be better than buried into RCU_EXPERT where we will probably never saw those false positives been addressed since my configs does not cover a wide range of subsystems and probably not many other bots would enable RCU_EXPERT.
> 
> Yet another option would be to edit your local kernel/rcu/Kconfig.debug
> and change the code to the following:
> 
> 	config PROVE_RCU_LIST
> 		def_bool y
> 		help
> 		  Enable RCU lockdep checking for list usages. It is default
> 		  enabled with CONFIG_PROVE_RCU.
> 
> Removing the RCU_EXPERT dependency would not go over at all well with
> some people whose opinions are difficult to ignore.  ;-)

I am trying not to get into the game of carrying any custom patches myself.

Let’s see what the syzbot guys say, and then I’ll enable RCU_EXPERT myself if needed, but again, we will probably never see PROVE_RCU_LIST used in syzbot again if we take this path. I surely have no cycles to expand the testing coverage to more subsystems at the moment.



* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 15:46             ` Qian Cai
@ 2020-05-14 18:13               ` Paul E. McKenney
  2020-05-15 18:36                 ` Qian Cai
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2020-05-14 18:13 UTC (permalink / raw)
  To: Qian Cai
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov

On Thu, May 14, 2020 at 11:46:23AM -0400, Qian Cai wrote:
> 
> 
> > On May 14, 2020, at 11:34 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > On Thu, May 14, 2020 at 10:03:21AM -0400, Qian Cai wrote:
> >> 
> >> 
> >>> On May 14, 2020, at 9:54 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> >>> 
> >>> On Thu, May 14, 2020 at 09:44:28AM -0400, Qian Cai wrote:
> >>>> 
> >>>> 
> >>>>> On May 14, 2020, at 9:33 AM, Paul E. McKenney <paulmck@kernel.org> wrote:
> >>>>> 
> >>>>> On Thu, May 14, 2020 at 08:31:13AM -0400, Qian Cai wrote:
> >>>>>> 
> >>>>>> 
> >>>>>>> On May 14, 2020, at 8:25 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >>>>>>> 
> >>>>>>> Hi Paul,
> >>>>>>> 
> >>>>>>> This patch in the rcu tree
> >>>>>>> 
> >>>>>>> d13fee049fa8 ("Default enable RCU list lockdep debugging with PROVE_RCU")
> >>>>>>> 
> >>>>>>> is causing whack-a-mole in the syzbot testing of linux-next.  Because
> >>>>>>> they always do a debug build of linux-next, no testing is getting done. :-(
> >>>>>>> 
> >>>>>>> Can we find another way to find all the bugs that are being discovered
> >>>>>>> (very slowly)?
> >>>>>> 
> >>>>>> Alternatively, could syzbot to use PROVE_RCU=n temporarily because it can’t keep up with it? I personally found PROVE_RCU_LIST=y is still useful for my linux-next testing, and don’t want to lose that coverage overnight.
> >>>>> 
> >>>>> The problem is that PROVE_RCU is exactly PROVE_LOCKING, and asking people
> >>>>> to test without PROVE_LOCKING is a no-go in my opinion.  But of course
> >>>>> on the other hand if there is no testing of RCU list lockdep debugging,
> >>>>> those issues will never be found, let alone fixed.
> >>>>> 
> >>>>> One approach would be to do as Stephen asks (either remove d13fee049fa8
> >>>>> or pull it out of -next) and have testers force-enable the RCU list
> >>>>> lockdep debugging.
> >>>>> 
> >>>>> Would that work for you?
> >>>> 
> >>>> Alternatively, how about having
> >>>> 
> >>>> PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT
> >>>> 
> >>>> since it is only syzbot can’t keep up with it?
> >>> 
> >>> Sound good to me, assuming that this works for the syzkaller guys.
> >>> Or could there be a "select PROVE_RCU_LIST" for the people who would
> >>> like to test it.
> >>> 
> >>> Alternatively, if we revert d13fee049fa8 from -next, I could provide
> >>> you a script that updates your .config to set both RCU_EXPERT and
> >>> PROVE_RCU_LIST.
> >>> 
> >>> There are a lot of ways to appraoch this.
> >>> 
> >>> So what would work best for everyone?
> >> 
> >> 
> >> If PROVE_RCU_LIST=n if DEBUG_AID_FOR_SYZBOT works for syzbot guys, that would be great, so other testing agents could still report/fix those RCU-list bugs and then pave a way for syzbot to return back once all those false positives had been sorted out.
> > 
> > On that, I must defer to the syzbot guys.
> > 
> >> Otherwise,  “select PROVE_RCU_LIST” *might* be better than buried into RCU_EXPERT where we will probably never saw those false positives been addressed since my configs does not cover a wide range of subsystems and probably not many other bots would enable RCU_EXPERT.
> > 
> > Yet another option would be to edit your local kernel/rcu/Kconfig.debug
> > and change the code to the following:
> > 
> > 	config PROVE_RCU_LIST
> > 		def_bool y
> > 		help
> > 		  Enable RCU lockdep checking for list usages. It is default
> > 		  enabled with CONFIG_PROVE_RCU.
> > 
> > Removing the RCU_EXPERT dependency would not go over at all well with
> > some people whose opinions are difficult to ignore.  ;-)
> 
> I am trying to not getting into a game of carrying any custom patch myself.
> 
> Let’s see what syzbot guys will say, and then I’ll enable RCU_EXPERT myself if needed, but again we probably never see PROVE_RCU_LIST to be used again in syzbot for this path. I surely have no cycles to expand the testing coverage for more subsystems at the moment.

Fair enough!  And yes, the Linux kernel is quite large, so I certainly am
not asking you to test the whole thing yourself.

								Thanx, Paul


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-14 18:13               ` Paul E. McKenney
@ 2020-05-15 18:36                 ` Qian Cai
  2020-05-17 21:47                   ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Qian Cai @ 2020-05-15 18:36 UTC (permalink / raw)
  To: paulmck
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov



> On May 14, 2020, at 2:13 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> Fair enough!  And yes, the Linux kernel is quite large, so I certainly am
> not asking you to test the whole thing yourself.

Ok, I saw that the 0day bot has also started to report those, which is good. For example,

lkml.org/lkml/2020/5/12/1358

which so far is not blocking 0day on linux-next since it does not use panic_on_warn yet (while syzbot does).

Thus, I am more convinced that we should not revert the commit just for syzbot, at least not until someone can also convince 0day to select RCU_EXPERT and then DEBUG_RCU_LIST.


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-15 18:36                 ` Qian Cai
@ 2020-05-17 21:47                   ` Paul E. McKenney
  2020-05-18  5:54                     ` Rong Chen
  0 siblings, 1 reply; 14+ messages in thread
From: Paul E. McKenney @ 2020-05-17 21:47 UTC (permalink / raw)
  To: Qian Cai
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov, philip.li, lkp, fengguang.wu, rong.a.chen

On Fri, May 15, 2020 at 02:36:26PM -0400, Qian Cai wrote:
> 
> 
> > On May 14, 2020, at 2:13 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > Fair enough!  And yes, the Linux kernel is quite large, so I certainly am
> > not asking you to test the whole thing yourself.
> 
> Ok, I saw 0day bot also started to report those which is good. For example,
> 
> lkml.org/lkml/2020/5/12/1358
> 
> which so far is nit blocking 0day on linux-next since it does not use panic_on_warn yet (while syzbot does).
> 
> Thus, I am more convinced that we should not revert the commit just for syzbot until someone could also convince 0day to select RCU_EXPERT and then DEBUG_RCU_LIST?

Let's ask the 0day people, now CCed, if they would be willing to
build with CONFIG_RCU_EXPERT=y and CONFIG_DEBUG_RCU_LIST=y on some
fraction of their testing.  ;-)

							Thanx, Paul


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-17 21:47                   ` Paul E. McKenney
@ 2020-05-18  5:54                     ` Rong Chen
  2020-05-18 12:44                       ` Paul E. McKenney
  0 siblings, 1 reply; 14+ messages in thread
From: Rong Chen @ 2020-05-18  5:54 UTC (permalink / raw)
  To: paulmck, Qian Cai
  Cc: Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov, philip.li, lkp, fengguang.wu



On 5/18/20 5:47 AM, Paul E. McKenney wrote:
> On Fri, May 15, 2020 at 02:36:26PM -0400, Qian Cai wrote:
>>
>>> On May 14, 2020, at 2:13 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
>>>
>>> Fair enough!  And yes, the Linux kernel is quite large, so I certainly am
>>> not asking you to test the whole thing yourself.
>> Ok, I saw 0day bot also started to report those which is good. For example,
>>
>> lkml.org/lkml/2020/5/12/1358
>>
>> which so far is nit blocking 0day on linux-next since it does not use panic_on_warn yet (while syzbot does).
>>
>> Thus, I am more convinced that we should not revert the commit just for syzbot until someone could also convince 0day to select RCU_EXPERT and then DEBUG_RCU_LIST?
> Let's ask the 0day people, now CCed, if they would be willing to
> build with CONFIG_RCU_EXPERT=y and CONFIG_DEBUG_RCU_LIST=y on some
> fraction of their testing.  ;-)
>
> 							Thanx, Paul

Hi,

Thanks for your advice, we'll support it in the near future.

Best Regards,
Rong Chen


* Re: Default enable RCU list lockdep debugging with PROVE_RCU
  2020-05-18  5:54                     ` Rong Chen
@ 2020-05-18 12:44                       ` Paul E. McKenney
  0 siblings, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2020-05-18 12:44 UTC (permalink / raw)
  To: Rong Chen
  Cc: Qian Cai, Stephen Rothwell, Linux Next Mailing List,
	Linux Kernel Mailing List, Madhuparna Bhowmik, Amol Grover,
	Dmitry Vyukov, philip.li, lkp, fengguang.wu

On Mon, May 18, 2020 at 01:54:13PM +0800, Rong Chen wrote:
> 
> 
> On 5/18/20 5:47 AM, Paul E. McKenney wrote:
> > On Fri, May 15, 2020 at 02:36:26PM -0400, Qian Cai wrote:
> > > 
> > > > On May 14, 2020, at 2:13 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > 
> > > > Fair enough!  And yes, the Linux kernel is quite large, so I certainly am
> > > > not asking you to test the whole thing yourself.
> > > Ok, I saw 0day bot also started to report those which is good. For example,
> > > 
> > > lkml.org/lkml/2020/5/12/1358
> > > 
> > > which so far is nit blocking 0day on linux-next since it does not use panic_on_warn yet (while syzbot does).
> > > 
> > > Thus, I am more convinced that we should not revert the commit just for syzbot until someone could also convince 0day to select RCU_EXPERT and then DEBUG_RCU_LIST?
> > Let's ask the 0day people, now CCed, if they would be willing to
> > build with CONFIG_RCU_EXPERT=y and CONFIG_DEBUG_RCU_LIST=y on some
> > fraction of their testing.  ;-)
> > 
> > 							Thanx, Paul
> 
> Hi,
> 
> Thanks for your advice, we'll support it in the near future.

Thank you very much!

							Thanx, Paul
