* [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
@ 2020-07-25  1:30 Tetsuo Handa
  2020-07-25  4:48 ` Dmitry Vyukov
  2020-08-27 15:20 ` [PATCH v2] lockdep: Allow tuning tracing capacity constants Tetsuo Handa
  0 siblings, 2 replies; 38+ messages in thread
From: Tetsuo Handa @ 2020-07-25  1:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Dmitry Vyukov, Tetsuo Handa, syzbot, syzbot, syzbot

Since syzkaller continues running various test cases until the kernel crashes,
it tends to exercise more locking dependencies than normal systems do.
As a result, syzbot reports that fuzz testing was terminated
because it hit the upper limits lockdep can track [1] [2] [3].

Like CONFIG_LOCKDEP_SMALL, which halves the upper limits, let's introduce
CONFIG_LOCKDEP_LARGE, which doubles them.

[1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
[2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
[3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f

Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/locking/lockdep.c           | 4 ++++
 kernel/locking/lockdep_internals.h | 5 +++++
 lib/Kconfig.debug                  | 8 ++++++++
 3 files changed, 17 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 29a8de4..85ba7eb 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1349,7 +1349,11 @@ static int add_lock_to_list(struct lock_class *this,
 /*
  * For good efficiency of modular, we use power of 2
  */
+#ifdef CONFIG_LOCKDEP_LARGE
+#define MAX_CIRCULAR_QUEUE_SIZE		8192UL
+#else
 #define MAX_CIRCULAR_QUEUE_SIZE		4096UL
+#endif
 #define CQ_MASK				(MAX_CIRCULAR_QUEUE_SIZE-1)
 
 /*
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index baca699..00a3ec3 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -93,6 +93,11 @@ enum {
 #define MAX_LOCKDEP_CHAINS_BITS	15
 #define MAX_STACK_TRACE_ENTRIES	262144UL
 #define STACK_TRACE_HASH_SIZE	8192
+#elif defined(CONFIG_LOCKDEP_LARGE)
+#define MAX_LOCKDEP_ENTRIES	65536UL
+#define MAX_LOCKDEP_CHAINS_BITS	17
+#define MAX_STACK_TRACE_ENTRIES	1048576UL
+#define STACK_TRACE_HASH_SIZE	32768
 #else
 #define MAX_LOCKDEP_ENTRIES	32768UL
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9ad9210..69ba624 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1266,6 +1266,14 @@ config LOCKDEP
 config LOCKDEP_SMALL
 	bool
 
+config LOCKDEP_LARGE
+	bool "Use larger buffer for tracking more locking dependencies"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	help
+	  If you say Y here, the upper limits used by the lock dependency
+	  engine will be doubled. This is useful for fuzz testing, which tends
+	  to exercise more complicated dependencies than normal systems do.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
-- 
1.8.3.1
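
As background on the CQ_MASK hunk above: MAX_CIRCULAR_QUEUE_SIZE must stay a
power of 2 so that index wrap-around reduces to a bitwise AND. A minimal
sketch of the same pattern follows, as a simplified analogue of lockdep's BFS
queue rather than the kernel's actual struct circular_queue:

/*
 * Simplified power-of-2 ring buffer: "index & mask" replaces "index %
 * size", which is why MAX_CIRCULAR_QUEUE_SIZE must be a power of 2.
 */
#define QUEUE_SIZE	4096UL			/* must be a power of 2 */
#define QUEUE_MASK	(QUEUE_SIZE - 1)

struct ring {
	unsigned long elem[QUEUE_SIZE];
	unsigned long front, rear;
};

static int ring_enqueue(struct ring *q, unsigned long e)
{
	if (((q->rear + 1) & QUEUE_MASK) == q->front)
		return -1;	/* full: the case behind "lockdep bfs error" */
	q->elem[q->rear] = e;
	q->rear = (q->rear + 1) & QUEUE_MASK;	/* wrap via AND, not "%" */
	return 0;
}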



* Re: [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
  2020-07-25  1:30 [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE Tetsuo Handa
@ 2020-07-25  4:48 ` Dmitry Vyukov
  2020-07-25  5:23   ` Tetsuo Handa
  2020-08-27 15:20 ` [PATCH v2] lockdep: Allow tuning tracing capacity constants Tetsuo Handa
  1 sibling, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-07-25  4:48 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, LKML, syzbot, syzbot, syzbot

On Sat, Jul 25, 2020 at 3:30 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Since syzkaller continues running various test cases until the kernel crashes,
> it tends to exercise more locking dependencies than normal systems do.
> As a result, syzbot reports that fuzz testing was terminated
> because it hit the upper limits lockdep can track [1] [2] [3].
>
> Like CONFIG_LOCKDEP_SMALL, which halves the upper limits, let's introduce
> CONFIG_LOCKDEP_LARGE, which doubles them.
>
> [1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
> [2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
> [3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
>
> Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  kernel/locking/lockdep.c           | 4 ++++
>  kernel/locking/lockdep_internals.h | 5 +++++
>  lib/Kconfig.debug                  | 8 ++++++++
>  3 files changed, 17 insertions(+)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 29a8de4..85ba7eb 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1349,7 +1349,11 @@ static int add_lock_to_list(struct lock_class *this,
>  /*
>   * For good efficiency of modular, we use power of 2
>   */
> +#ifdef CONFIG_LOCKDEP_LARGE
> +#define MAX_CIRCULAR_QUEUE_SIZE                8192UL
> +#else
>  #define MAX_CIRCULAR_QUEUE_SIZE                4096UL

Maybe this number should be the config value? So that we don't ever
return here to introduce "VERY_LARGE" :)
Also, somebody may use it to _reduce_ the size of the table for a smaller kernel.

> +#endif
>  #define CQ_MASK                                (MAX_CIRCULAR_QUEUE_SIZE-1)
>
>  /*
> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
> index baca699..00a3ec3 100644
> --- a/kernel/locking/lockdep_internals.h
> +++ b/kernel/locking/lockdep_internals.h
> @@ -93,6 +93,11 @@ enum {
>  #define MAX_LOCKDEP_CHAINS_BITS        15
>  #define MAX_STACK_TRACE_ENTRIES        262144UL
>  #define STACK_TRACE_HASH_SIZE  8192
> +#elif defined(CONFIG_LOCKDEP_LARGE)
> +#define MAX_LOCKDEP_ENTRIES    65536UL
> +#define MAX_LOCKDEP_CHAINS_BITS        17
> +#define MAX_STACK_TRACE_ENTRIES        1048576UL
> +#define STACK_TRACE_HASH_SIZE  32768
>  #else
>  #define MAX_LOCKDEP_ENTRIES    32768UL
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 9ad9210..69ba624 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1266,6 +1266,14 @@ config LOCKDEP
>  config LOCKDEP_SMALL
>         bool
>
> +config LOCKDEP_LARGE
> +       bool "Use larger buffer for tracking more locking dependencies"
> +       depends on LOCKDEP && !LOCKDEP_SMALL
> +       help
> +         If you say Y here, the upper limits used by the lock dependency
> +         engine will be doubled. This is useful for fuzz testing, which tends
> +         to exercise more complicated dependencies than normal systems do.
> +
>  config DEBUG_LOCKDEP
>         bool "Lock dependency engine debugging"
>         depends on DEBUG_KERNEL && LOCKDEP
> --
> 1.8.3.1
>


* Re: [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
  2020-07-25  4:48 ` Dmitry Vyukov
@ 2020-07-25  5:23   ` Tetsuo Handa
  2020-08-04  2:36     ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-07-25  5:23 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, LKML, syzbot, syzbot, syzbot

On 2020/07/25 13:48, Dmitry Vyukov wrote:
>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>> index 29a8de4..85ba7eb 100644
>> --- a/kernel/locking/lockdep.c
>> +++ b/kernel/locking/lockdep.c
>> @@ -1349,7 +1349,11 @@ static int add_lock_to_list(struct lock_class *this,
>>  /*
>>   * For good efficiency of modular, we use power of 2
>>   */
>> +#ifdef CONFIG_LOCKDEP_LARGE
>> +#define MAX_CIRCULAR_QUEUE_SIZE                8192UL
>> +#else
>>  #define MAX_CIRCULAR_QUEUE_SIZE                4096UL
> 
> Maybe this number should be the config value? So that we don't ever
> return here to introduce "VERY_LARGE" :)

They can be "tiny, small, medium, compact, large and huge". Yeah, it's a joke. :-)

> Also, somebody may use it to _reduce_ the size of the table for a smaller kernel.

Maybe. But my feeling is that it is very rare for the kernel to actually deadlock
as soon as lockdep warns of a possible deadlock.

Since syzbot runs many instances in parallel, a lot of CPU time is spent
checking the same dependency tree. However, possible deadlocks can be
reported only for locks actually held within each kernel boot, and it is
impossible to hold all locks in one kernel boot.

It might therefore be nice if lockdep could merely record "which lock was held,
from which context, and with what backtrace" and export that log like KCOV data
(instead of evaluating the possibility of deadlock itself), so that the whole
dependency graph could be rebuilt (and the possibility of deadlock evaluated)
across multiple kernel boots in userspace.

> 
>> +#endif
>>  #define CQ_MASK                                (MAX_CIRCULAR_QUEUE_SIZE-1)



* Re: [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
  2020-07-25  5:23   ` Tetsuo Handa
@ 2020-08-04  2:36     ` Tetsuo Handa
  2020-08-18  9:57       ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-08-04  2:36 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon; +Cc: Dmitry Vyukov, LKML

Hello, Peter, Ingo and Will.

(Q1) Can we change the capacity using kernel config?

(Q2) If we can change the capacity, is it OK to specify these constants
     independently? (In other words, is there inter-dependency among
     these constants?)

(Q3) Do you think that we can extend lockdep to be used as a tool for auditing
     locks held in kernel space and rebuilding the lock dependency map in user space?

On 2020/07/25 14:23, Tetsuo Handa wrote:
>> Also, somebody may use it to _reduce_ the size of the table for a smaller kernel.
> 
> Maybe. But my feeling is that it is very rare for the kernel to actually deadlock
> as soon as lockdep warns of a possible deadlock.
> 
> Since syzbot runs many instances in parallel, a lot of CPU time is spent
> checking the same dependency tree. However, possible deadlocks can be
> reported only for locks actually held within each kernel boot, and it is
> impossible to hold all locks in one kernel boot.
> 
> It might therefore be nice if lockdep could merely record "which lock was held,
> from which context, and with what backtrace" and export that log like KCOV data
> (instead of evaluating the possibility of deadlock itself), so that the whole
> dependency graph could be rebuilt (and the possibility of deadlock evaluated)
> across multiple kernel boots in userspace.



* Re: [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
  2020-08-04  2:36     ` Tetsuo Handa
@ 2020-08-18  9:57       ` Dmitry Vyukov
  2020-08-18 11:07         ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-08-18  9:57 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, LKML

On Tue, Aug 4, 2020 at 4:36 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Hello, Peter, Ingo and Will.
>
> (Q1) Can we change the capacity using kernel config?
>
> (Q2) If we can change the capacity, is it OK to specify these constants
>      independently? (In other words, is there inter-dependency among
>      these constants?)


I think we should do this.
syzbot uses a very beefy kernel config and a very broad load.
We have been hitting "BUG: MAX_LOCKDEP_ENTRIES too low!" for the past 428
days and have already hit it 96K times. It's just harming overall kernel
testing:
https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b

I think it's better if the exact values are not hardcoded but rather
specified in the config. Today we are switching from 4K to 8K, but as
we enable more configs and learn to reach more code, we may need 16K.


> (Q3) Do you think that we can extend lockdep to be used as a tool for auditing
>      locks held in kernel space and rebuilding the lock dependency map in user space?

This looks like lots of work. It also adds unpleasant dependencies on
user-space. If there is a user-space component, it will need to be
deployed to _all_ kernel testing systems and to all users of
syzkaller. And it will also be a dependency for reproducers. Currently
one can run a C reproducer and get the error messages from LOCKDEP. It
seems that a user-space component would make it significantly more
complicated.


> On 2020/07/25 14:23, Tetsuo Handa wrote:
> >> Also somebody may use it to _reduce_ size of the table for a smaller kernel.
> >
> > Maybe. But my feeling is that it is very rare that the kernel actually deadlocks
> > as soon as lockdep warned the possibility of deadlock.
> >
> > Since syzbot runs many instances in parallel, a lot of CPU resource is spent for
> > checking the same dependency tree. However, the possibility of deadlock can be
> > warned for only locks held within each kernel boot, and it is impossible to hold
> > all locks with one kernel boot.
> >
> > Then, it might be nice if lockdep can audit only "which lock was held from which
> > context and what backtrace" and export that log like KCOV data (instead of evaluating
> > the possibility of deadlock), and rebuild the whole dependency (and evaluate the
> > possibility of deadlock) across multiple kernel boots in userspace.
>


* Re: [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
  2020-08-18  9:57       ` Dmitry Vyukov
@ 2020-08-18 11:07         ` Tetsuo Handa
  2020-08-18 12:02           ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-08-18 11:07 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, LKML

On 2020/08/18 18:57, Dmitry Vyukov wrote:
> On Tue, Aug 4, 2020 at 4:36 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> Hello, Peter, Ingo and Will.
>>
>> (Q1) Can we change the capacity using kernel config?
>>
>> (Q2) If we can change the capacity, is it OK to specify these constants
>>      independently? (In other words, is there inter-dependency among
>>      these constants?)
> 
> 
> I think we should do this.
> syzbot uses a very beefy kernel config and a very broad load.
> We have been hitting "BUG: MAX_LOCKDEP_ENTRIES too low!" for the past 428
> days and have already hit it 96K times. It's just harming overall kernel
> testing:
> https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
>
> I think it's better if the exact values are not hardcoded but rather
> specified in the config. Today we are switching from 4K to 8K, but as
> we enable more configs and learn to reach more code, we may need 16K.

In the short term, increasing the capacity would be fine. But in the long term, I have my doubts.

As enabling more configs leads to more locks being held within one boot, I suspect
that it becomes more and more timing-dependent and difficult to hold all the locks
that can generate a lockdep warning.

> 
> 
>> (Q3) Do you think that we can extend lockdep to be used as a tool for auditing
>>      locks held in kernel space and rebuilding the lock dependency map in user space?
> 
> This looks like lots of work. It also adds unpleasant dependencies on
> user-space. If there is a user-space component, it will need to be
> deployed to _all_ kernel testing systems and to all users of
> syzkaller. And it will also be a dependency for reproducers. Currently
> one can run a C reproducer and get the error messages from LOCKDEP. It
> seems that a user-space component would make it significantly more
> complicated.

My suggestion is to detach lockdep warnings from realtime alarming.

Since not all locks are always held (e.g. some locks are held only when some
threshold is exceeded), requiring that all locks be held within one boot sounds
difficult. Such a requirement results in flaky bisection like the "Fix bisection:
failed" in https://syzkaller.appspot.com/bug?id=b23ec126241ad0d86628de6eb5c1cff57d282632 .

I am therefore hoping that we could build non-realtime alarming based on all locks
held across all boots of each vmlinux file.

> 
> 
>> On 2020/07/25 14:23, Tetsuo Handa wrote:
>>>> Also, somebody may use it to _reduce_ the size of the table for a smaller kernel.
>>>
>>> Maybe. But my feeling is that it is very rare for the kernel to actually deadlock
>>> as soon as lockdep warns of a possible deadlock.
>>>
>>> Since syzbot runs many instances in parallel, a lot of CPU time is spent
>>> checking the same dependency tree. However, possible deadlocks can be
>>> reported only for locks actually held within each kernel boot, and it is
>>> impossible to hold all locks in one kernel boot.
>>>
>>> It might therefore be nice if lockdep could merely record "which lock was held,
>>> from which context, and with what backtrace" and export that log like KCOV data
>>> (instead of evaluating the possibility of deadlock itself), so that the whole
>>> dependency graph could be rebuilt (and the possibility of deadlock evaluated)
>>> across multiple kernel boots in userspace.
>>



* Re: [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
  2020-08-18 11:07         ` Tetsuo Handa
@ 2020-08-18 12:02           ` Dmitry Vyukov
  2020-08-18 12:59             ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-08-18 12:02 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, LKML, syzkaller

On Tue, Aug 18, 2020 at 1:07 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/08/18 18:57, Dmitry Vyukov wrote:
> > On Tue, Aug 4, 2020 at 4:36 AM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >>
> >> Hello, Peter, Ingo and Will.
> >>
> >> (Q1) Can we change the capacity using kernel config?
> >>
> >> (Q2) If we can change the capacity, is it OK to specify these constants
> >>      independently? (In other words, is there inter-dependency among
> >>      these constants?)
> >
> >
> > I think we should do this.
> > syzbot uses a very beefy kernel config and a very broad load.
> > We have been hitting "BUG: MAX_LOCKDEP_ENTRIES too low!" for the past 428
> > days and have already hit it 96K times. It's just harming overall kernel
> > testing:
> > https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
> >
> > I think it's better if the exact values are not hardcoded but rather
> > specified in the config. Today we are switching from 4K to 8K, but as
> > we enable more configs and learn to reach more code, we may need 16K.
>
> In the short term, increasing the capacity would be fine. But in the long term, I have my doubts.
>
> As enabling more configs leads to more locks being held within one boot, I suspect
> that it becomes more and more timing-dependent and difficult to hold all the locks
> that can generate a lockdep warning.
>
> >
> >
> >> (Q3) Do you think that we can extend lockdep to be used as a tool for auditing
> >>      locks held in kernel space and rebuilding the lock dependency map in user space?
> >
> > This looks like lots of work. It also adds unpleasant dependencies on
> > user-space. If there is a user-space component, it will need to be
> > deployed to _all_ kernel testing systems and to all users of
> > syzkaller. And it will also be a dependency for reproducers. Currently
> > one can run a C reproducer and get the error messages from LOCKDEP. It
> > seems that a user-space component would make it significantly more
> > complicated.
>
> My suggestion is to detach lockdep warnings from realtime alarming.
>
> Since not all locks are always held (e.g. some locks are held only when some
> threshold is exceeded), requiring that all locks be held within one boot sounds
> difficult. Such a requirement results in flaky bisection like the "Fix bisection:
> failed" in https://syzkaller.appspot.com/bug?id=b23ec126241ad0d86628de6eb5c1cff57d282632 .
>
> I am therefore hoping that we could build non-realtime alarming based on all locks
> held across all boots of each vmlinux file.

Unless I am missing something, the deployment/maintenance story for this
(for syzbot, syzkaller users, other kernel testing, reproducer extraction,
bisection, and reproducer hermeticity) is quite complicated. I don't see it
outweighing any potential benefit in reporting quality.

I also don't see how it will improve reproducer/bisection quality: to
confirm the presence of a bug we still need to trigger all cycle edges
within a single run anyway; it does not have to be a single VM, but it
still needs to be a single test case. And this "having all edges
within a single test case" seems to be the root problem. I don't see
how this proposal addresses it.

> >> On 2020/07/25 14:23, Tetsuo Handa wrote:
> >>>> Also, somebody may use it to _reduce_ the size of the table for a smaller kernel.
> >>>
> >>> Maybe. But my feeling is that it is very rare for the kernel to actually deadlock
> >>> as soon as lockdep warns of a possible deadlock.
> >>>
> >>> Since syzbot runs many instances in parallel, a lot of CPU time is spent
> >>> checking the same dependency tree. However, possible deadlocks can be
> >>> reported only for locks actually held within each kernel boot, and it is
> >>> impossible to hold all locks in one kernel boot.
> >>>
> >>> It might therefore be nice if lockdep could merely record "which lock was held,
> >>> from which context, and with what backtrace" and export that log like KCOV data
> >>> (instead of evaluating the possibility of deadlock itself), so that the whole
> >>> dependency graph could be rebuilt (and the possibility of deadlock evaluated)
> >>> across multiple kernel boots in userspace.
> >>
>


* Re: [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE
  2020-08-18 12:02           ` Dmitry Vyukov
@ 2020-08-18 12:59             ` Tetsuo Handa
  0 siblings, 0 replies; 38+ messages in thread
From: Tetsuo Handa @ 2020-08-18 12:59 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, LKML, syzkaller

Peter, Ingo and Will. Would you answer (Q1) and (Q2)?

On 2020/08/18 21:02, Dmitry Vyukov wrote:
> On Tue, Aug 18, 2020 at 1:07 PM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>
>> On 2020/08/18 18:57, Dmitry Vyukov wrote:
>>> On Tue, Aug 4, 2020 at 4:36 AM Tetsuo Handa
>>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>>
>>>> Hello, Peter, Ingo and Will.
>>>>
>>>> (Q1) Can we change the capacity using kernel config?
>>>>
>>>> (Q2) If we can change the capacity, is it OK to specify these constants
>>>>      independently? (In other words, is there inter-dependency among
>>>>      these constants?)
>>>
>>>
>>> I think we should do this.
>>> syzbot uses a very beefy kernel config and a very broad load.
>>> We have been hitting "BUG: MAX_LOCKDEP_ENTRIES too low!" for the past 428
>>> days and have already hit it 96K times. It's just harming overall kernel
>>> testing:
>>> https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
>>>
>>> I think it's better if the exact values are not hardcoded but rather
>>> specified in the config. Today we are switching from 4K to 8K, but as
>>> we enable more configs and learn to reach more code, we may need 16K.
>>
>> In the short term, increasing the capacity would be fine. But in the long term, I have my doubts.
>>
>> As enabling more configs leads to more locks being held within one boot, I suspect
>> that it becomes more and more timing-dependent and difficult to hold all the locks
>> that can generate a lockdep warning.
>>
>>>
>>>
>>>> (Q3) Do you think that we can extend lockdep to be used as a tool for auditing
>>>>      locks held in kernel space and rebuilding the lock dependency map in user space?
>>>
>>> This looks like lots of work. It also adds unpleasant dependencies on
>>> user-space. If there is a user-space component, it will need to be
>>> deployed to _all_ kernel testing systems and to all users of
>>> syzkaller. And it will also be a dependency for reproducers. Currently
>>> one can run a C reproducer and get the error messages from LOCKDEP. It
>>> seems that a user-space component would make it significantly more
>>> complicated.
>>
>> My suggestion is to detach lockdep warnings from realtime alarming.
>>
>> Since not all locks are always held (e.g. some locks are held only when some
>> threshold is exceeded), requiring that all locks be held within one boot sounds
>> difficult. Such a requirement results in flaky bisection like the "Fix bisection:
>> failed" in https://syzkaller.appspot.com/bug?id=b23ec126241ad0d86628de6eb5c1cff57d282632 .
>>
>> I am therefore hoping that we could build non-realtime alarming based on all locks
>> held across all boots of each vmlinux file.
> 
> Unless I am missing something, the deployment/maintenance story for this
> (for syzbot, syzkaller users, other kernel testing, reproducer extraction,
> bisection, and reproducer hermeticity) is quite complicated. I don't see it
> outweighing any potential benefit in reporting quality.

What I'm imagining is: do not try to judge lock dependency problems within
syzkaller (or other kernel testing) kernels. That is, no reproducers for
lock dependency problems, and no bisection for them. Utilize those resources
for gathering only, and build the lock dependency map (like KCOV data) in
some dedicated userspace component.
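
As a rough illustration of that idea, and purely hypothetical (no such tool
or log format exists today), a userspace checker could merge "held acquired"
class-name pairs gathered from many boots and search the merged graph for a
cycle:

/*
 * Hypothetical offline checker: reads one "A B" pair per line on stdin
 * (meaning "A was held when B was acquired", merged across many boots,
 * names already resolved against vmlinux) and reports any cycle found.
 * The log format and the tool itself are assumptions, not existing code.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CLASSES	8192
#define MAX_EDGES	32

static char *name[MAX_CLASSES];
static int adj[MAX_CLASSES][MAX_EDGES], deg[MAX_CLASSES];
static int nr, state[MAX_CLASSES];	/* 0=unvisited 1=on stack 2=done */

static int intern(const char *s)
{
	int i;

	for (i = 0; i < nr; i++)
		if (!strcmp(name[i], s))
			return i;
	if (nr == MAX_CLASSES)
		exit(1);
	name[nr] = strdup(s);
	return nr++;
}

static int dfs(int v)
{
	int i;

	state[v] = 1;
	for (i = 0; i < deg[v]; i++) {
		int w = adj[v][i];

		if (state[w] == 1) {	/* back edge: potential deadlock */
			printf("cycle closed by: %s -> %s\n", name[v], name[w]);
			return 1;
		}
		if (state[w] == 0 && dfs(w))
			return 1;
	}
	state[v] = 2;
	return 0;
}

int main(void)
{
	char a[256], b[256];
	int i;

	while (scanf("%255s %255s", a, b) == 2) {
		int u = intern(a), v = intern(b);

		if (deg[u] < MAX_EDGES)
			adj[u][deg[u]++] = v;
	}
	for (i = 0; i < nr; i++)
		if (state[i] == 0 && dfs(i))
			return 1;
	return 0;
}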

> 
> I also don't see how it will improve reproducer/bisection quality: to
> confirm the presence of a bug we still need to trigger all cycle edges
> within a single run anyway; it does not have to be a single VM, but it
> still needs to be a single test case. And this "having all edges
> within a single test case" seems to be the root problem. I don't see
> how this proposal addresses it.
> 
>>>> On 2020/07/25 14:23, Tetsuo Handa wrote:
>>>>>> Also, somebody may use it to _reduce_ the size of the table for a smaller kernel.
>>>>>
>>>>> Maybe. But my feeling is that it is very rare for the kernel to actually deadlock
>>>>> as soon as lockdep warns of a possible deadlock.
>>>>>
>>>>> Since syzbot runs many instances in parallel, a lot of CPU time is spent
>>>>> checking the same dependency tree. However, possible deadlocks can be
>>>>> reported only for locks actually held within each kernel boot, and it is
>>>>> impossible to hold all locks in one kernel boot.
>>>>>
>>>>> It might therefore be nice if lockdep could merely record "which lock was held,
>>>>> from which context, and with what backtrace" and export that log like KCOV data
>>>>> (instead of evaluating the possibility of deadlock itself), so that the whole
>>>>> dependency graph could be rebuilt (and the possibility of deadlock evaluated)
>>>>> across multiple kernel boots in userspace.
>>>>
>>



* [PATCH v2] lockdep: Allow tuning tracing capacity constants.
  2020-07-25  1:30 [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE Tetsuo Handa
  2020-07-25  4:48 ` Dmitry Vyukov
@ 2020-08-27 15:20 ` Tetsuo Handa
  2020-09-04 16:05   ` Tetsuo Handa
  2020-10-10 12:58   ` [PATCH v3] " Tetsuo Handa
  1 sibling, 2 replies; 38+ messages in thread
From: Tetsuo Handa @ 2020-08-27 15:20 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
  Cc: linux-kernel, Dmitry Vyukov, syzbot, syzbot, syzbot

Since syzkaller continues running various test cases until the kernel crashes,
it tends to exercise more locking dependencies than normal systems do.
As a result, syzbot reports that fuzz testing was terminated
because it hit the upper limits lockdep can track [1] [2] [3].

Let's allow individually tuning the upper limits via kernel config options
(based on the assumption that there is no inter-dependency among these
constants).

[1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
[2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
[3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f

Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 kernel/locking/lockdep.c           |  2 +-
 kernel/locking/lockdep_internals.h |  8 +++---
 lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2fad21d345b0..5a058fcec435 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1349,7 +1349,7 @@ static int add_lock_to_list(struct lock_class *this,
 /*
  * For good efficiency of modular, we use power of 2
  */
-#define MAX_CIRCULAR_QUEUE_SIZE		4096UL
+#define MAX_CIRCULAR_QUEUE_SIZE		(1UL << CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS)
 #define CQ_MASK				(MAX_CIRCULAR_QUEUE_SIZE-1)
 
 /*
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index baca699b94e9..8435a11bf456 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -94,16 +94,16 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 #define MAX_STACK_TRACE_ENTRIES	262144UL
 #define STACK_TRACE_HASH_SIZE	8192
 #else
-#define MAX_LOCKDEP_ENTRIES	32768UL
+#define MAX_LOCKDEP_ENTRIES	(1UL << CONFIG_LOCKDEP_BITS)
 
-#define MAX_LOCKDEP_CHAINS_BITS	16
+#define MAX_LOCKDEP_CHAINS_BITS	CONFIG_LOCKDEP_CHAINS_BITS
 
 /*
  * Stack-trace: tightly packed array of stack backtrace
  * addresses. Protected by the hash_lock.
  */
-#define MAX_STACK_TRACE_ENTRIES	524288UL
-#define STACK_TRACE_HASH_SIZE	16384
+#define MAX_STACK_TRACE_ENTRIES	(1UL << CONFIG_LOCKDEP_STACK_TRACE_BITS)
+#define STACK_TRACE_HASH_SIZE	(1 << CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS)
 #endif
 
 /*
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e068c3c7189a..d7612e132986 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1311,6 +1311,46 @@ config LOCKDEP
 config LOCKDEP_SMALL
 	bool
 
+config LOCKDEP_BITS
+	int "Bitsize for MAX_LOCKDEP_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 15
+	help
+	  Try increasing this value if you hit the "BUG: MAX_LOCKDEP_ENTRIES too low!" message.
+
+config LOCKDEP_CHAINS_BITS
+	int "Bitsize for MAX_LOCKDEP_CHAINS"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 16
+	help
+	  Try increasing this value if you hit the "BUG: MAX_LOCKDEP_CHAINS too low!" message.
+
+config LOCKDEP_STACK_TRACE_BITS
+	int "Bitsize for MAX_STACK_TRACE_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 19
+	help
+	  Try increasing this value if you hit the "BUG: MAX_STACK_TRACE_ENTRIES too low!" message.
+
+config LOCKDEP_STACK_TRACE_HASH_BITS
+	int "Bitsize for STACK_TRACE_HASH_SIZE"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 14
+	help
+	  Try increasing this value if you need a large MAX_STACK_TRACE_ENTRIES.
+
+config LOCKDEP_CIRCULAR_QUEUE_BITS
+	int "Bitsize for elements in circular_queue struct"
+	depends on LOCKDEP
+	range 10 30
+	default 12
+	help
+	  Try increasing this value if you hit the "lockdep bfs error:-1" warning due to a __cq_enqueue() failure.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
-- 
2.18.4
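
With this patch applied, a fuzzing kernel could then bump the limits with a
config fragment like the one below. The values are only examples chosen
within the 10..30 ranges above, not recommendations from the patch itself:

CONFIG_LOCKDEP_BITS=16
CONFIG_LOCKDEP_CHAINS_BITS=17
CONFIG_LOCKDEP_STACK_TRACE_BITS=20
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=15
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=13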




* Re: [PATCH v2] lockdep: Allow tuning tracing capacity constants.
  2020-08-27 15:20 ` [PATCH v2] lockdep: Allow tuning tracing capacity constants Tetsuo Handa
@ 2020-09-04 16:05   ` Tetsuo Handa
  2020-09-16 11:28     ` Dmitry Vyukov
  2020-10-10 12:58   ` [PATCH v3] " Tetsuo Handa
  1 sibling, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-09-04 16:05 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton
  Cc: linux-kernel, Dmitry Vyukov

Hello. Can we apply this patch?

This patch addresses top crashers for syzbot, and applying it will help
free up syzbot's resources for finding other bugs.

On 2020/08/28 0:20, Tetsuo Handa wrote:
> Since syzkaller continues running various test cases until the kernel crashes,
> it tends to exercise more locking dependencies than normal systems do.
> As a result, syzbot reports that fuzz testing was terminated
> because it hit the upper limits lockdep can track [1] [2] [3].
> 
> Let's allow individually tuning the upper limits via kernel config options
> (based on the assumption that there is no inter-dependency among these
> constants).
> 
> [1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
> [2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
> [3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
> 
> Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  kernel/locking/lockdep.c           |  2 +-
>  kernel/locking/lockdep_internals.h |  8 +++---
>  lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
>  3 files changed, 45 insertions(+), 5 deletions(-)
> 


* Re: [PATCH v2] lockdep: Allow tuning tracing capacity constants.
  2020-09-04 16:05   ` Tetsuo Handa
@ 2020-09-16 11:28     ` Dmitry Vyukov
  2020-09-16 11:50       ` peterz
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-09-16 11:28 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton, LKML, syzkaller

On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Hello. Can we apply this patch?
>
> This patch addresses top crashers for syzbot, and applying it will help
> free up syzbot's resources for finding other bugs.

Acked-by: Dmitry Vyukov <dvyukov@google.com>

Peter, do you still have concerns with this?

Both MAX_LOCKDEP_ENTRIES and MAX_LOCKDEP_CHAINS still fire on syzbot a
lot and harm our ability to test the whole kernel:
https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
https://syzkaller.appspot.com/bug?id=63fc8d0501c39609dd2f268e4190ec9a72619563

I hate disabling lockdep entirely as it also finds lots of useful things.


> On 2020/08/28 0:20, Tetsuo Handa wrote:
> > Since syzkaller continues running various test cases until the kernel crashes,
> > it tends to exercise more locking dependencies than normal systems do.
> > As a result, syzbot reports that fuzz testing was terminated
> > because it hit the upper limits lockdep can track [1] [2] [3].
> >
> > Let's allow individually tuning the upper limits via kernel config options
> > (based on the assumption that there is no inter-dependency among these
> > constants).
> >
> > [1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
> > [2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
> > [3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
> >
> > Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
> > Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
> > Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
> > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > ---
> >  kernel/locking/lockdep.c           |  2 +-
> >  kernel/locking/lockdep_internals.h |  8 +++---
> >  lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
> >  3 files changed, 45 insertions(+), 5 deletions(-)
> >


* Re: [PATCH v2] lockdep: Allow tuning tracing capacity constants.
  2020-09-16 11:28     ` Dmitry Vyukov
@ 2020-09-16 11:50       ` peterz
  2020-09-16 12:14         ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: peterz @ 2020-09-16 11:50 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Tetsuo Handa, Ingo Molnar, Will Deacon, Andrew Morton, LKML, syzkaller

On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > Hello. Can we apply this patch?
> >
> > This patch addresses top crashers for syzbot, and applying it will help
> > free up syzbot's resources for finding other bugs.
> 
> Acked-by: Dmitry Vyukov <dvyukov@google.com>
> 
> Peter, do you still have concerns with this?

Yeah, I still hate it with a passion; it discourages thinking. A bad
annotation that blows up the lockdep storage, no worries, we'll just
increase this :/

IIRC the issue with syzbot is that the current sysfs annotation is
pretty terrible and generates a gazillion classes, and syzbot likes
poking at /sys a lot and thus floods the system.

I don't know enough about sysfs to suggest an alternative, and haven't
exactly had spare time to look into it either :/

An example of a bad annotation is giving every CPU a separate class, which
leads to nr_cpus! chains if CPUs nest arbitrarily (nr_cpus^2 if there's
only a single nesting level).


* Re: [PATCH v2] lockdep: Allow tuning tracing capacity constants.
  2020-09-16 11:50       ` peterz
@ 2020-09-16 12:14         ` Dmitry Vyukov
  2020-09-28  0:24           ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-09-16 12:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tetsuo Handa, Ingo Molnar, Will Deacon, Andrew Morton, LKML, syzkaller

On Wed, Sep 16, 2020 at 1:51 PM <peterz@infradead.org> wrote:
>
> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
> > On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > >
> > > Hello. Can we apply this patch?
> > >
> > > This patch addresses top crashers for syzbot, and applying it will help
> > > free up syzbot's resources for finding other bugs.
> >
> > Acked-by: Dmitry Vyukov <dvyukov@google.com>
> >
> > Peter, do you still have concerns with this?
>
> Yeah, I still hate it with a passion; it discourages thinking. A bad
> annotation that blows up the lockdep storage, no worries, we'll just
> increase this :/
>
> IIRC the issue with syzbot is that the current sysfs annotation is
> pretty terrible and generates a gazillion classes, and syzbot likes
> poking at /sys a lot and thus floods the system.
>
> I don't know enough about sysfs to suggest an alternative, and haven't
> exactly had spare time to look into it either :/
>
> An example of a bad annotation is giving every CPU a separate class, which
> leads to nr_cpus! chains if CPUs nest arbitrarily (nr_cpus^2 if there's
> only a single nesting level).

Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
sort and show existing chains so that it's possible to identify if
there are any worst offenders and who they are.

Currently we only have a hypothesis that there are some worst
offenders vs lots of normal load. And we can't point fingers which
means that, say, sysfs, or other maintainers won't be too inclined to
fix anything.

If we would know for sure that lock class X is guilty. That would make
the situation much more actionable.
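
A rough sketch of that kind of aggregation, assuming the /proc/lockdep_chains
layout where each class in a recorded chain is printed as a "[<address>]
<name>" line (the exact format varies by kernel version), would be to rank
classes by how many chains they appear in:

$ grep '^\[' /proc/lockdep_chains |
	awk '{ $1 = ""; sub(/^ /, ""); print }' |
	sort | uniq -c | sort -rn | head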


* Re: [PATCH v2] lockdep: Allow tuning tracing capacity constants.
  2020-09-16 12:14         ` Dmitry Vyukov
@ 2020-09-28  0:24           ` Tetsuo Handa
  2020-09-28  5:12             ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-09-28  0:24 UTC (permalink / raw)
  To: Dmitry Vyukov, Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, Andrew Morton, LKML, syzkaller

On 2020/09/16 21:14, Dmitry Vyukov wrote:
> On Wed, Sep 16, 2020 at 1:51 PM <peterz@infradead.org> wrote:
>>
>> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
>>> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
>>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>>
>>>> Hello. Can we apply this patch?
>>>>
>>>> This patch addresses top crashers for syzbot, and applying it will help
>>>> free up syzbot's resources for finding other bugs.
>>>
>>> Acked-by: Dmitry Vyukov <dvyukov@google.com>
>>>
>>> Peter, do you still have concerns with this?
>>
>> Yeah, I still hate it with a passion; it discourages thinking. A bad
>> annotation that blows up the lockdep storage, no worries, we'll just
>> increase this :/
>>
>> IIRC the issue with syzbot is that the current sysfs annotation is
>> pretty terrible and generates a gazillion classes, and syzbot likes
>> poking at /sys a lot and thus floods the system.
>>
>> I don't know enough about sysfs to suggest an alternative, and haven't
>> exactly had spare time to look into it either :/
>>
>> An example of a bad annotation is giving every CPU a separate class, which
>> leads to nr_cpus! chains if CPUs nest arbitrarily (nr_cpus^2 if there's
>> only a single nesting level).
> 
> Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
> sort, and show the existing chains so that it's possible to identify
> whether there are any worst offenders and who they are.
>
> Currently we only have a hypothesis that there are some worst
> offenders versus lots of normal load. And we can't point fingers, which
> means that, say, sysfs or other maintainers won't be too inclined to
> fix anything.
>
> If we knew for sure that lock class X is guilty, that would make
> the situation much more actionable.
> 

Dmitry is thinking that we need to use CONFIG_LOCKDEP=n temporarily until the
lockdep problems are resolved. ( https://github.com/google/syzkaller/issues/2140 )

But I think it is better to apply this patch (and revert it once it becomes
possible to identify whether there are any worst offenders and who they are) than
to use CONFIG_LOCKDEP=n.

CONFIG_LOCKDEP=n causes "#syz test" requests to give false responses regarding
locking-related issues, for we are not ready to enforce "retest without the
proposed patch when testing with the proposed patch did not reproduce the crash".

I think that "not detecting lock-related problems introduced by new patches" costs
more than "postponing fixes for lock-related problems in existing code".


* Re: [PATCH v2] lockdep: Allow tuning tracing capacity constants.
  2020-09-28  0:24           ` Tetsuo Handa
@ 2020-09-28  5:12             ` Dmitry Vyukov
  0 siblings, 0 replies; 38+ messages in thread
From: Dmitry Vyukov @ 2020-09-28  5:12 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Andrew Morton, LKML, syzkaller

On Mon, Sep 28, 2020 at 2:24 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/09/16 21:14, Dmitry Vyukov wrote:
> > On Wed, Sep 16, 2020 at 1:51 PM <peterz@infradead.org> wrote:
> >>
> >> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
> >>> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
> >>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >>>>
> >>>> Hello. Can we apply this patch?
> >>>>
> >>>> This patch addresses top crashers for syzbot, and applying it will help
> >>>> free up syzbot's resources for finding other bugs.
> >>>
> >>> Acked-by: Dmitry Vyukov <dvyukov@google.com>
> >>>
> >>> Peter, do you still have concerns with this?
> >>
> >> Yeah, I still hate it with a passion; it discourages thinking. A bad
> >> annotation that blows up the lockdep storage, no worries, we'll just
> >> increase this :/
> >>
> >> IIRC the issue with syzbot is that the current sysfs annotation is
> >> pretty terrible and generates a gazillion classes, and syzbot likes
> >> poking at /sys a lot and thus floods the system.
> >>
> >> I don't know enough about sysfs to suggest an alternative, and haven't
> >> exactly had spare time to look into it either :/
> >>
> >> An example of a bad annotation is giving every CPU a separate class, which
> >> leads to nr_cpus! chains if CPUs nest arbitrarily (nr_cpus^2 if there's
> >> only a single nesting level).
> >
> > Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
> > sort, and show the existing chains so that it's possible to identify
> > whether there are any worst offenders and who they are.
> >
> > Currently we only have a hypothesis that there are some worst
> > offenders versus lots of normal load. And we can't point fingers, which
> > means that, say, sysfs or other maintainers won't be too inclined to
> > fix anything.
> >
> > If we knew for sure that lock class X is guilty, that would make
> > the situation much more actionable.
> >
>
> Dmitry is thinking that we need to use CONFIG_LOCKDEP=n temporarily until the
> lockdep problems are resolved. ( https://github.com/google/syzkaller/issues/2140 )
>
> But I think it is better to apply this patch (and revert it once it becomes
> possible to identify whether there are any worst offenders and who they are) than
> to use CONFIG_LOCKDEP=n.
>
> CONFIG_LOCKDEP=n causes "#syz test" requests to give false responses regarding
> locking-related issues, for we are not ready to enforce "retest without the
> proposed patch when testing with the proposed patch did not reproduce the crash".

FWIW patch testing for previously reported bugs should still work
because it uses the kernel config associated with the bug report.

> I think that "not detecting lock-related problems introduced by new patches" costs
> more than "postponing fixes for lock-related problems in existing code".


* [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-08-27 15:20 ` [PATCH v2] lockdep: Allow tuning tracing capacity constants Tetsuo Handa
  2020-09-04 16:05   ` Tetsuo Handa
@ 2020-10-10 12:58   ` Tetsuo Handa
  2020-10-18 13:02     ` Tetsuo Handa
  2021-01-01  8:09     ` [PATCH v4] " Tetsuo Handa
  1 sibling, 2 replies; 38+ messages in thread
From: Tetsuo Handa @ 2020-10-10 12:58 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Andrew Morton, linux-kernel, Dmitry Vyukov, Linus Torvalds

Since syzkaller continues running various test cases until the kernel crashes,
it tends to exercise more locking dependencies than normal systems do.
As a result, syzbot reports that fuzz testing was terminated
because it hit the upper limits lockdep can track [1] [2] [3].

Peter Zijlstra does not want to allow tuning these limits via kernel
config options, for such a change discourages thinking. But currently we
have nothing actionable, for lockdep does not report the culprit when
these limits are hit [4].

Therefore, I propose this patch again, with the caveat that it is
expected to be reverted once lockdep becomes capable of reporting the
culprit, for I consider "postponing fixes for lock-related problems in
existing code" less painful than "not detecting lock-related problems
introduced by new patches".

[1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
[2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
[3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
[4] https://lkml.kernel.org/r/CACT4Y+agTiEF-1i9LbAgp-q_02oYF0kAPZGAAJ==-wx2Xh7xzQ@mail.gmail.com

Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
---
 kernel/locking/lockdep.c           |  2 +-
 kernel/locking/lockdep_internals.h |  8 +++---
 lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2facbbd146ec..2144708a867c 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1349,7 +1349,7 @@ static int add_lock_to_list(struct lock_class *this,
 /*
  * For good efficiency of modular, we use power of 2
  */
-#define MAX_CIRCULAR_QUEUE_SIZE		4096UL
+#define MAX_CIRCULAR_QUEUE_SIZE		(1UL << CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS)
 #define CQ_MASK				(MAX_CIRCULAR_QUEUE_SIZE-1)
 
 /*
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index b0be1560ed17..cf7752847eb7 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -96,16 +96,16 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 #define MAX_STACK_TRACE_ENTRIES	262144UL
 #define STACK_TRACE_HASH_SIZE	8192
 #else
-#define MAX_LOCKDEP_ENTRIES	32768UL
+#define MAX_LOCKDEP_ENTRIES	(1UL << CONFIG_LOCKDEP_BITS)
 
-#define MAX_LOCKDEP_CHAINS_BITS	16
+#define MAX_LOCKDEP_CHAINS_BITS	CONFIG_LOCKDEP_CHAINS_BITS
 
 /*
  * Stack-trace: tightly packed array of stack backtrace
  * addresses. Protected by the hash_lock.
  */
-#define MAX_STACK_TRACE_ENTRIES	524288UL
-#define STACK_TRACE_HASH_SIZE	16384
+#define MAX_STACK_TRACE_ENTRIES	(1UL << CONFIG_LOCKDEP_STACK_TRACE_BITS)
+#define STACK_TRACE_HASH_SIZE	(1 << CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS)
 #endif
 
 /*
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 0c781f912f9f..41d083be7ec3 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1311,6 +1311,46 @@ config LOCKDEP
 config LOCKDEP_SMALL
 	bool
 
+config LOCKDEP_BITS
+	int "Bitsize for MAX_LOCKDEP_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 15
+	help
+	  Try increasing this value if you hit the "BUG: MAX_LOCKDEP_ENTRIES too low!" message.
+
+config LOCKDEP_CHAINS_BITS
+	int "Bitsize for MAX_LOCKDEP_CHAINS"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 16
+	help
+	  Try increasing this value if you hit the "BUG: MAX_LOCKDEP_CHAINS too low!" message.
+
+config LOCKDEP_STACK_TRACE_BITS
+	int "Bitsize for MAX_STACK_TRACE_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 19
+	help
+	  Try increasing this value if you hit the "BUG: MAX_STACK_TRACE_ENTRIES too low!" message.
+
+config LOCKDEP_STACK_TRACE_HASH_BITS
+	int "Bitsize for STACK_TRACE_HASH_SIZE"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 14
+	help
+	  Try increasing this value if you need a large MAX_STACK_TRACE_ENTRIES.
+
+config LOCKDEP_CIRCULAR_QUEUE_BITS
+	int "Bitsize for elements in circular_queue struct"
+	depends on LOCKDEP
+	range 10 30
+	default 12
+	help
+	  Try increasing this value if you hit the "lockdep bfs error:-1" warning due to a __cq_enqueue() failure.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
-- 
2.18.4




* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-10-10 12:58   ` [PATCH v3] " Tetsuo Handa
@ 2020-10-18 13:02     ` Tetsuo Handa
  2020-11-18 13:57       ` Tetsuo Handa
  2021-01-01  8:09     ` [PATCH v4] " Tetsuo Handa
  1 sibling, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-10-18 13:02 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Andrew Morton, linux-kernel, Dmitry Vyukov, Linus Torvalds

Peter, what do you think? Can we apply this patch?

A potential workaround for now would be to let syzkaller avoid encountering
the BUG: message (by masking the BUG: message on the kernel side) when hitting
these limits, for continuing to test until the kernel crashes (due to other
bugs like UAF) would still be useful to some degree.

On 2020/10/10 21:58, Tetsuo Handa wrote:
> Since syzkaller continues running various test cases until the kernel crashes,
> it tends to exercise more locking dependencies than normal systems do.
> As a result, syzbot reports that fuzz testing was terminated
> because it hit the upper limits lockdep can track [1] [2] [3].
> 
> Peter Zijlstra does not want to allow tuning these limits via kernel
> config options, for such a change discourages thinking. But currently we
> have nothing actionable, for lockdep does not report the culprit when
> these limits are hit [4].
> 
> Therefore, I propose this patch again, with the caveat that it is
> expected to be reverted once lockdep becomes capable of reporting the
> culprit, for I consider "postponing fixes for lock-related problems in
> existing code" less painful than "not detecting lock-related problems
> introduced by new patches".
> 
> [1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
> [2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
> [3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
> [4] https://lkml.kernel.org/r/CACT4Y+agTiEF-1i9LbAgp-q_02oYF0kAPZGAAJ==-wx2Xh7xzQ@mail.gmail.com
> 
> Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
> Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Acked-by: Dmitry Vyukov <dvyukov@google.com>
> ---
>  kernel/locking/lockdep.c           |  2 +-
>  kernel/locking/lockdep_internals.h |  8 +++---
>  lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
>  3 files changed, 45 insertions(+), 5 deletions(-)


* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-10-18 13:02     ` Tetsuo Handa
@ 2020-11-18 13:57       ` Tetsuo Handa
  2020-11-18 14:23         ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-11-18 13:57 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: Andrew Morton, linux-kernel, Dmitry Vyukov, Linus Torvalds

Peter, do you have a plan to make this problem actionable?

On 2020/10/18 22:02, Tetsuo Handa wrote:
> Peter, what do you think? Can we apply this patch?
> 
> A potential workaround for now would be to let syzkaller avoid encountering
> the BUG: message (by masking the BUG: message on the kernel side) when hitting
> these limits, for continuing to test until the kernel crashes (due to other
> bugs like UAF) would still be useful to some degree.
> 
> On 2020/10/10 21:58, Tetsuo Handa wrote:
>> Since syzkaller continues running various test cases until the kernel crashes,
>> it tends to exercise more locking dependencies than normal systems do.
>> As a result, syzbot reports that fuzz testing was terminated
>> because it hit the upper limits lockdep can track [1] [2] [3].
>>
>> Peter Zijlstra does not want to allow tuning these limits via kernel
>> config options, for such a change discourages thinking. But currently we
>> have nothing actionable, for lockdep does not report the culprit when
>> these limits are hit [4].
>>
>> Therefore, I propose this patch again, with the caveat that it is
>> expected to be reverted once lockdep becomes capable of reporting the
>> culprit, for I consider "postponing fixes for lock-related problems in
>> existing code" less painful than "not detecting lock-related problems
>> introduced by new patches".
>>
>> [1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
>> [2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
>> [3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
>> [4] https://lkml.kernel.org/r/CACT4Y+agTiEF-1i9LbAgp-q_02oYF0kAPZGAAJ==-wx2Xh7xzQ@mail.gmail.com
>>
>> Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
>> Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
>> Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> Acked-by: Dmitry Vyukov <dvyukov@google.com>
>> ---
>>  kernel/locking/lockdep.c           |  2 +-
>>  kernel/locking/lockdep_internals.h |  8 +++---
>>  lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
>>  3 files changed, 45 insertions(+), 5 deletions(-)



* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-18 13:57       ` Tetsuo Handa
@ 2020-11-18 14:23         ` Peter Zijlstra
  2020-11-18 14:30           ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2020-11-18 14:23 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ingo Molnar, Will Deacon, Andrew Morton, linux-kernel,
	Dmitry Vyukov, Linus Torvalds

On Wed, Nov 18, 2020 at 10:57:08PM +0900, Tetsuo Handa wrote:
> Peter, do you have a plan to make this problem actionable?

I don't even know what the problem is... nor do you. You can keep sending
this bad paper-over-issues patch until you're blue in the face; I'll not
take it for reasons I've explained before.

I've also previously said that the right way to go about this is to
figure out what exactly is consuming these resources and then figure out
how to cure _that_. Blindly increasing the number only enables bad
actors to get away with being bad.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-18 14:23         ` Peter Zijlstra
@ 2020-11-18 14:30           ` Tetsuo Handa
  2020-11-18 15:10             ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-11-18 14:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, Andrew Morton, linux-kernel,
	Dmitry Vyukov, Linus Torvalds

The problem is that we can't know what exactly is consuming these resources.
My question is: do you have a plan to make it possible to know what exactly
is consuming these resources?

On 2020/11/18 23:23, Peter Zijlstra wrote:
> On Wed, Nov 18, 2020 at 10:57:08PM +0900, Tetsuo Handa wrote:
>> Peter, do you have a plan to make this problem actionable?
> 
> I don't even know what the problem is.. nor do you. You can keep sending
> this bad paper-over-issues patch until you're blue in the face, I'll not
> take it for reasons I've explained before.
> 
> I've also previously said that the right way to go about this is to
> figure out what exactly is consuming these resources and then figure out
> how to cure _that_. Blindly increasing the number only enables bad
> actors to get away with being bad.
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-18 14:30           ` Tetsuo Handa
@ 2020-11-18 15:10             ` Peter Zijlstra
  2020-11-18 15:31               ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2020-11-18 15:10 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ingo Molnar, Will Deacon, Andrew Morton, linux-kernel,
	Dmitry Vyukov, Linus Torvalds

On Wed, Nov 18, 2020 at 11:30:05PM +0900, Tetsuo Handa wrote:
> The problem is that we can't know what exactly is consuming these resources.
> My question is: do you have a plan to make it possible to know what exactly
> is consuming these resources?

I'm pretty sure it's in /proc/lockdep* somewhere.

IIRC you were suffering from "MAX_LOCKDEP_ENTRIES too low!", which you
find in alloc_list_entry(), that's used in add_lock_to_list(), which in
turn is used in check_prev_add() to add to ->locks_after and
->locks_before.

/me frobs in lockdep_proc.c and finds l_show() uses locks_after;
l_show() is the implementation for /proc/lockdep.

So something like:

$ grep "^ ->" /proc/lockdep | wc -l
5064

Should be roughly half the number of list_entries used (it only shows
the forward dependencies).

/proc/lockdep_stats: direct dependencies:                 11886 [max: 32768]

gives the number of list_entries used

The trick then is finding where they all go, because as you can see, my
machine is nowhere near saturated, even though it's been running a few
days.

So, go forth and analyze your problem.
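
(For illustration, a rough sketch of such an analysis, assuming the
"<hash> FD: <n> BD: <m> <usage>: <name>" line layout that /proc/lockdep
uses for class entries elsewhere in this thread:)

# Rank lock classes by BD (incoming dependencies), highest first.
awk '$2 == "FD:" { print $5, $0 }' /proc/lockdep | sort -rn | head -20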

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-18 15:10             ` Peter Zijlstra
@ 2020-11-18 15:31               ` Tetsuo Handa
  2020-11-19 12:33                 ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-11-18 15:31 UTC (permalink / raw)
  To: Dmitry Vyukov, syzkaller
  Cc: Ingo Molnar, Will Deacon, Andrew Morton, linux-kernel,
	Linus Torvalds, Peter Zijlstra

On 2020/11/19 0:10, Peter Zijlstra wrote:
> On Wed, Nov 18, 2020 at 11:30:05PM +0900, Tetsuo Handa wrote:
>> The problem is that we can't know what exactly is consuming these resources.
>> My question is: do you have a plan to make it possible to know what exactly
>> is consuming these resources?
> 
> I'm pretty sure it's in /proc/lockdep* somewhere.

OK. Then...

Dmitry, can you update syzkaller to dump /proc/lockdep* before terminating
the run as a crash, as soon as it encounters one of

  BUG: MAX_LOCKDEP_ENTRIES too low!
  BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  BUG: MAX_LOCKDEP_CHAINS too low!
  BUG: MAX_LOCKDEP_KEYS too low!
  WARNING in print_bfs_bug

messages?
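
(For illustration, a minimal sketch of such a dump step; the trigger
pattern, log file name, and output file name are assumptions, not
syzkaller's actual code:)

# On seeing one of the messages above in the console output, save the
# lockdep state before reporting the crash.
if grep -q 'MAX_LOCKDEP.*too low!' console.log; then
        cat /proc/lockdep_stats /proc/lockdep /proc/lockdep_chains \
                > lockdep-dump.txt
fi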

On 2020/09/16 21:14, Dmitry Vyukov wrote:
> On Wed, Sep 16, 2020 at 1:51 PM <peterz@infradead.org> wrote:
>>
>> On Wed, Sep 16, 2020 at 01:28:19PM +0200, Dmitry Vyukov wrote:
>>> On Fri, Sep 4, 2020 at 6:05 PM Tetsuo Handa
>>> <penguin-kernel@i-love.sakura.ne.jp> wrote:
>>>>
>>>> Hello. Can we apply this patch?
>>>>
>>>> This patch addresses top crashers for syzbot, and applying this patch
>>>> will help utilizing syzbot's resource for finding other bugs.
>>>
>>> Acked-by: Dmitry Vyukov <dvyukov@google.com>
>>>
>>> Peter, do you still have concerns with this?
>>
>> Yeah, I still hate it with a passion; it discourages thinking. A bad
>> annotation that blows up the lockdep storage, no worries, we'll just
>> increase this :/
>>
>> IIRC the issue with syzbot is that the current sysfs annotation is
>> pretty terrible and generates a gazillion classes, and syzbot likes
>> poking at /sys a lot and thus floods the system.
>>
>> I don't know enough about sysfs to suggest an alternative, and haven't
>> exactly had spare time to look into it either :/
>>
>> An example of a bad annotation is giving every CPU a separate class, which
>> leads to nr_cpus! chains if CPUs arbitrarily nest (nr_cpus^2 if there's
>> only a single nesting level).
> 
> Maybe on "BUG: MAX_LOCKDEP_CHAINS too low!" we should then aggregate,
> sort and show existing chains so that it's possible to identify if
> there are any worst offenders and who they are.
> 
> Currently we only have a hypothesis that there are some worst
> offenders vs lots of normal load. And we can't point fingers, which
> means that, say, sysfs or other maintainers won't be too inclined to
> fix anything.
> 
> If we knew for sure that lock class X is guilty, that would make
> the situation much more actionable.
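
(In the direction of the aggregation suggested above, a first-cut sketch
that counts how often each class appears across the recorded chains,
assuming the "[hashed pointer] class name" lines that /proc/lockdep_chains
emits:)

# Count chain membership per class; the most frequent classes are
# candidate worst offenders.
grep '^\[' /proc/lockdep_chains | sort | uniq -c | sort -rn | head -20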
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-18 15:31               ` Tetsuo Handa
@ 2020-11-19 12:33                 ` Dmitry Vyukov
  2020-11-19 12:43                   ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-19 12:33 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On Wed, Nov 18, 2020 at 4:32 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/11/19 0:10, Peter Zijlstra wrote:
> > On Wed, Nov 18, 2020 at 11:30:05PM +0900, Tetsuo Handa wrote:
> >> The problem is that we can't know what exactly is consuming these resources.
> >> My question is: do you have a plan to make it possible to know what exactly
> >> is consuming these resources?
> >
> > I'm pretty sure it's in /proc/lockdep* somewhere.
>
> OK. Then...
>
> Dmitry, can you update syzkaller to dump /proc/lockdep* before terminating
> the run as a crash, as soon as it encounters one of
>
>   BUG: MAX_LOCKDEP_ENTRIES too low!
>   BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
>   BUG: MAX_LOCKDEP_CHAINS too low!
>   BUG: MAX_LOCKDEP_KEYS too low!
>   WARNING in print_bfs_bug
>
> messages?
>
> [...]

I am trying to reproduce this locally first. syzbot claims it can
reproduce it with a number of very simple reproducers (like spawning a
process, unshare, creating a socket):
https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc

I see a drift, but it's very slow, so it only gets to:
 direct dependencies:                 22072 [max: 32768]

But that's running a very uniform workload.

However when I tried to cat /proc/lockdep to see if there is anything
fishy already,
I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).

Some missing locks?

==================================================================
BUG: KASAN: use-after-free in string_nocheck lib/vsprintf.c:611 [inline]
BUG: KASAN: use-after-free in string+0x39c/0x3d0 lib/vsprintf.c:693
Read of size 1 at addr ffff888295833740 by task less/1855

CPU: 0 PID: 1855 Comm: less Tainted: G        W         5.10.0-rc4+ #68
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x4c8 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 string_nocheck lib/vsprintf.c:611 [inline]
 string+0x39c/0x3d0 lib/vsprintf.c:693
 vsnprintf+0x71b/0x14f0 lib/vsprintf.c:2618
 seq_vprintf fs/seq_file.c:398 [inline]
 seq_printf+0x195/0x240 fs/seq_file.c:413
 print_name+0x98/0x1d0 kernel/locking/lockdep_proc.c:50
 l_show+0x111/0x2c0 kernel/locking/lockdep_proc.c:82
 seq_read_iter+0xae4/0x10c0 fs/seq_file.c:268
 proc_reg_read_iter+0x1fb/0x2d0 fs/proc/inode.c:310
 call_read_iter include/linux/fs.h:1897 [inline]
 new_sync_read+0x41e/0x6e0 fs/read_write.c:415
 vfs_read+0x35c/0x570 fs/read_write.c:496
 ksys_read+0x12d/0x250 fs/read_write.c:634
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f1d48906310
Code: 73 01 c3 48 8b 0d 28 4b 2b 00 31 d2 48 29 c2 64 89 11 48 83 c8
ff eb ea 90 90 83 3d e5 a2 2b 00 00 75 10 b8 00 00 00 00 0f 05 <48> 3d
01 f04
RSP: 002b:00007fff8ad3f048 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000ae0ec0 RCX: 00007f1d48906310
RDX: 0000000000002000 RSI: 0000000000ae0eec RDI: 0000000000000004
RBP: 0000000000072000 R08: 0000000000000038 R09: 0000000001000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000a6d7f0
R13: 0000000000000000 R14: 0000000000000039 R15: 0000000000ae0ec0

Allocated by task 2828:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:461
 kmalloc include/linux/slab.h:557 [inline]
 kzalloc include/linux/slab.h:664 [inline]
 tomoyo_encode2.part.0+0xe9/0x3a0 security/tomoyo/realpath.c:45
 tomoyo_encode2 security/tomoyo/realpath.c:31 [inline]
 tomoyo_encode+0x28/0x50 security/tomoyo/realpath.c:80
 tomoyo_path_perm+0x368/0x400 security/tomoyo/file.c:831
 tomoyo_path_symlink+0x94/0xe0 security/tomoyo/tomoyo.c:200
 security_path_symlink+0xdf/0x150 security/security.c:1110
 do_symlinkat+0x123/0x2c0 fs/namei.c:3985
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 2828:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
 kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355
 __kasan_slab_free+0x102/0x140 mm/kasan/common.c:422
 slab_free_hook mm/slub.c:1544 [inline]
 slab_free_freelist_hook+0x5d/0x150 mm/slub.c:1577
 slab_free mm/slub.c:3142 [inline]
 kfree+0xdb/0x360 mm/slub.c:4124
 tomoyo_path_perm+0x3b0/0x400 security/tomoyo/file.c:840
 tomoyo_path_symlink+0x94/0xe0 security/tomoyo/tomoyo.c:200
 security_path_symlink+0xdf/0x150 security/security.c:1110
 do_symlinkat+0x123/0x2c0 fs/namei.c:3985
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at ffff888295833740
 which belongs to the cache kmalloc-32 of size 32
The buggy address is located 0 bytes inside of
 32-byte region [ffff888295833740, ffff888295833760)
The buggy address belongs to the page:
page:00000000706b2f94 refcount:1 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x295833
flags: 0x57ffe0000000200(slab)
raw: 057ffe0000000200 ffffea0004adca40 0000000200000002 ffff888100041a00
raw: 0000000000000000 0000000000400040 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888295833600: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
 ffff888295833680: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
>ffff888295833700: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
                                           ^
 ffff888295833780: 00 00 00 fc fc fc fc fc 00 00 00 fc fc fc fc fc
 ffff888295833800: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc
==================================================================

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-19 12:33                 ` Dmitry Vyukov
@ 2020-11-19 12:43                   ` Dmitry Vyukov
  2020-11-19 12:49                     ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-19 12:43 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On Thu, Nov 19, 2020 at 1:33 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> [...]
>
> I am trying to reproduce this locally first. syzbot claims it can
> reproduce it with a number of very simple reproducers (like spawning a
> process, unshare, creating a socket):
> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
>
> I see a drift, but it's very slow, so it only gets to:
>  direct dependencies:                 22072 [max: 32768]
>
> But that's running a very uniform workload.
>
> [...]


Trying again, I wasn't able to reproduce the drift.
Then I realized that, running that simple workload, I am getting a mix of:
[ 1459.589213][ T3142] WARNING: CPU: 1 PID: 3142 at
drivers/net/bonding/bond_main.c:4310
bond_update_slave_arr+0xcaf/0x10c0
[  222.027968][ T8662] BUG: sleeping function called from invalid
context at net/mac80211/sta_info.c:1962
which turns lockdep off completely.
Perhaps it would be useful to add an ON/OFF status to lockdep_stats.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-19 12:43                   ` Dmitry Vyukov
@ 2020-11-19 12:49                     ` Dmitry Vyukov
  2020-11-19 13:06                       ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-19 12:49 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On Thu, Nov 19, 2020 at 1:43 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > [...]
>
>
> Trying again, I wasn't able to reproduce the drift.
> Then I realized that, running that simple workload, I am getting a mix of:
> [ 1459.589213][ T3142] WARNING: CPU: 1 PID: 3142 at
> drivers/net/bonding/bond_main.c:4310
> bond_update_slave_arr+0xcaf/0x10c0
> [  222.027968][ T8662] BUG: sleeping function called from invalid
> context at net/mac80211/sta_info.c:1962
> which turns lockdep off completely.
> Perhaps it would be useful to add an ON/OFF status to lockdep_stats.

I can reproduce the slow drift by running just:

#define _GNU_SOURCE
#include <unistd.h>
#include <sched.h>
int main() {
    for (int i = 0; i < 16; i++) {
        if (fork() == 0) {
            for (;;) {
                unshare(CLONE_NEWNET);
            }
        }
    }
    sleep(1<<20);
}

 lock-classes:                         1765 [max: 8192]
 direct dependencies:                 12882 [max: 32768]
 dependency chains:                   16841 [max: 65536]
 dependency chain hlocks used:        65720 [max: 327680]
 stack-trace entries:                141213 [max: 524288]
...
 lock-classes:                         1768 [max: 8192]
 direct dependencies:                 13048 [max: 32768]
 dependency chains:                   17144 [max: 65536]
 dependency chain hlocks used:        67194 [max: 327680]
 stack-trace entries:                143043 [max: 524288]
...
 lock-classes:                         1768 [max: 8192]
 direct dependencies:                 13056 [max: 32768]
 dependency chains:                   17173 [max: 65536]
 dependency chain hlocks used:        67353 [max: 327680]
 stack-trace entries:                143138 [max: 524288]
...
 lock-classes:                         1770 [max: 8192]
 direct dependencies:                 13072 [max: 32768]
 dependency chains:                   17232 [max: 65536]
 dependency chain hlocks used:        67604 [max: 327680]
 stack-trace entries:                143390 [max: 524288]
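
(A sketch of how samples like the above can be collected while the
reproducer runs; the 60-second interval is arbitrary:)

# Periodically sample the counters of interest from lockdep_stats.
while sleep 60; do
        grep -E 'lock-classes|direct dependencies|dependency chain|stack-trace' \
                /proc/lockdep_stats
done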

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-19 12:49                     ` Dmitry Vyukov
@ 2020-11-19 13:06                       ` Dmitry Vyukov
  2020-11-19 13:45                         ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-19 13:06 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On Thu, Nov 19, 2020 at 1:49 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Thu, Nov 19, 2020 at 1:43 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > [...]
>
> I can reproduce the slow drift by running just:
>
> #define _GNU_SOURCE
> #include <unistd.h>
> #include <sched.h>
> int main() {
>     for (int i = 0; i < 16; i++) {
>         if (fork() == 0) {
>             for (;;) {
>                 unshare(CLONE_NEWNET);
>             }
>         }
>     }
>     sleep(1<<20);
> }
>
>  lock-classes:                         1765 [max: 8192]
>  direct dependencies:                 12882 [max: 32768]
>  dependency chains:                   16841 [max: 65536]
>  dependency chain hlocks used:        65720 [max: 327680]
>  stack-trace entries:                141213 [max: 524288]
> ...
>  lock-classes:                         1768 [max: 8192]
>  direct dependencies:                 13048 [max: 32768]
>  dependency chains:                   17144 [max: 65536]
>  dependency chain hlocks used:        67194 [max: 327680]
>  stack-trace entries:                143043 [max: 524288]
> ...
>  lock-classes:                         1768 [max: 8192]
>  direct dependencies:                 13056 [max: 32768]
>  dependency chains:                   17173 [max: 65536]
>  dependency chain hlocks used:        67353 [max: 327680]
>  stack-trace entries:                143138 [max: 524288]
> ...
>  lock-classes:                         1770 [max: 8192]
>  direct dependencies:                 13072 [max: 32768]
>  dependency chains:                   17232 [max: 65536]
>  dependency chain hlocks used:        67604 [max: 327680]
>  stack-trace entries:                143390 [max: 524288]


Part of this can be explained by the non-deterministically taken
scheduler/kmalloc locks.
These are slowly taken in more and more random places:

root@syzkaller:~# egrep "BD:  [4-9]" /proc/lockdep
0000000095032c93 FD:    2 BD:  822 -.-.: &obj_hash[i].lock
00000000b7a46d70 FD:    1 BD:  823 ..-.: pool_lock
0000000083d7269e FD:    2 BD:  518 ..-.: &zone->lock
000000002ccb1edd FD:    1 BD:  526 ...-: &____s->seqcount
0000000084e3526f FD:    1 BD:  586 -.-.: depot_lock
00000000902ec5bd FD:    1 BD:  495 ..-.: &n->list_lock
0000000090368662 FD:   13 BD:  747 -.-.: &rq->lock
00000000584dac99 FD:    1 BD:  748 ....: &cfs_b->lock
0000000040e6e0e2 FD:   18 BD:  471 -.-.: &pool->lock
00000000463b385b FD:   20 BD:  402 -.-.: &pool->lock/1
00000000cc84b7dd FD:    1 BD:  407 -.-.: rcu_node_0
000000001c39e710 FD:    2 BD:  759 ----: tk_core.seq.seqcount
000000001666d8c1 FD:    3 BD:  782 -.-.: &base->lock
00000000c2cc6df8 FD:   22 BD:  426 +.+.: sysctl_lock
00000000ce7f6164 FD:    5 BD:  750 -.-.: hrtimer_bases.lock
000000004d814e16 FD:   33 BD:  425 +.+.: &dentry->d_lock
000000006bce20c6 FD:   15 BD:  650 -.-.: &p->pi_lock
00000000b0649d65 FD:    1 BD:  750 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
00000000565db0bb FD:   16 BD:  472 ..-.: wq_mayday_lock
00000000b2aab64b FD:    1 BD:  748 -.-.: &cfs_rq->removed.lock
00000000fca10e98 FD:    7 BD:  748 -...: &rt_b->rt_runtime_lock
000000009ecec09b FD:    1 BD:  749 -...: &rt_rq->rt_runtime_lock
0000000049455d82 FD:    1 BD:  748 ....: &cp->lock
00000000680101ba FD:    2 BD:  429 +.+.: &____s->seqcount#4
0000000096be0d2b FD:    1 BD:  406 ..-.: quarantine_lock
000000009ca236d8 FD:   21 BD:  438 ....: krc.lock
00000000362192dc FD:   16 BD:  426 ....: &wq
000000001680ad80 FD:    6 BD:  426 +.+.: &dentry->d_lock/1
00000000d7e85735 FD:    1 BD:  760 -.-.: pvclock_gtod_data
00000000553448d2 FD:   16 BD:  439 ....: &pgdat->kswapd_wait
00000000ad927e1a FD:   16 BD:  474 ....: &x->wait#12
0000000032646360 FD:   16 BD:  402 -.-.: &rcu_state.expedited_wq
00000000463b551e FD:    1 BD:  651 ..-.: &tsk->delays->lock
0000000036b1c7d2 FD:   16 BD:  426 ....: &wq#2
000000007211be38 FD:    1 BD:  427 +.+.: &lru->node[i].lock
000000002e97aa5e FD:    1 BD:  426 ....: &wq#3
000000000c53876e FD:    4 BD:  427 +.+.: &dentry->d_lock/2
00000000fb45c9bd FD:    3 BD:  428 +.+.: &dentry->d_lock/3
000000009f8394cb FD:    1 BD:  430 +.+.: &____s->seqcount#4/1
00000000439f870a FD:    1 BD:  426 ....: &wq#4

And by diffing samples I can see how they slowly creep.

But I don't know if it's enough to explain the overflow or not...
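
(One way to quantify the creep is to diff two samples of the BD column.
A sketch, assuming the same line layout as above; note that class names
containing spaces get truncated by $NF, which is good enough for a
quick look:)

# Sample twice, then list the classes whose BD count grew the most.
awk '$2 == "FD:" { print $NF, $5 }' /proc/lockdep | sort > bd.1
sleep 600
awk '$2 == "FD:" { print $NF, $5 }' /proc/lockdep | sort > bd.2
join bd.1 bd.2 | awk '$3 > $2 { print $3 - $2, $1 }' | sort -rn | head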

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-19 13:06                       ` Dmitry Vyukov
@ 2020-11-19 13:45                         ` Tetsuo Handa
  2020-11-19 14:05                           ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-11-19 13:45 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On 2020/11/19 22:06, Dmitry Vyukov wrote:
>>>>
>>>> I am trying to reproduce this locally first. syzbot caims it can
>>>> reproduce it with a number of very simpler reproducers (like spawn
>>>> process, unshare, create socket):
>>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
>>>>
>>>> I see a very slow drift, but it's very slow, so get only to:
>>>>  direct dependencies:                 22072 [max: 32768]
>>>>
>>>> But that's running a very uniform workload.
>>>>
>>>> However when I tried to cat /proc/lockdep to see if there is anything
>>>> fishy already,
>>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
>>>>
>>>> Some missing locks?

Not a TOMOYO bug. Maybe a lockdep bug.

> 
> But I don't know if it's enough to explain the overflow or not...
> 

Since you can't hit the limit locally, I guess we need to ask syzbot to
run massive testcases.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-19 13:45                         ` Tetsuo Handa
@ 2020-11-19 14:05                           ` Dmitry Vyukov
       [not found]                             ` <CACT4Y+aNJmuhk0KicX4FzKW6PhawFBgvrC2gSJcWwUkR8VSSmg@mail.gmail.com>
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-19 14:05 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/11/19 22:06, Dmitry Vyukov wrote:
> >>>> [...]
> >>>> However when I tried to cat /proc/lockdep to see if there is anything
> >>>> fishy already,
> >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> >>>>
> >>>> Some missing locks?
>
> Not a TOMOYO bug. Maybe a lockdep bug.
>
> >
> > But I don't know if it's enough to explain the overflow or not...
> >
>
> Since you can't hit the limit locally, I guess we need to ask syzbot to
> run massive testcases.

I am trying to test the code that will do this; otherwise we will get
days-long round-trips for stupid bugs. These files are also quite
huge; I'm afraid they may not fit into storage.

So far I get to at most:

 lock-classes:                         2901 [max: 8192]
 direct dependencies:                 25574 [max: 32768]
 dependency chains:                   40605 [max: 65536]
 dependency chain hlocks used:       176814 [max: 327680]
 stack-trace entries:                258590 [max: 524288]

with these worst offenders:

# egrep "BD: [0-9]" /proc/lockdep
00000000df5b6792 FD:    2 BD: 1235 -.-.: &obj_hash[i].lock
000000005dfeb73c FD:    1 BD: 1236 ..-.: pool_lock
00000000b86254b1 FD:   14 BD: 1111 -.-.: &rq->lock
00000000866efb75 FD:    1 BD: 1112 ....: &cfs_b->lock
000000006970cf1a FD:    2 BD: 1126 ----: tk_core.seq.seqcount
00000000f49d95b0 FD:    3 BD: 1180 -.-.: &base->lock
00000000ba3f8454 FD:    5 BD: 1115 -.-.: hrtimer_bases.lock
00000000fb340f16 FD:   16 BD: 1030 -.-.: &p->pi_lock
00000000c9f6f58c FD:    1 BD: 1114 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
0000000049d3998c FD:    1 BD: 1112 -.-.: &cfs_rq->removed.lock
00000000fdf7f396 FD:    7 BD: 1112 -...: &rt_b->rt_runtime_lock
0000000021aedb8d FD:    1 BD: 1113 -...: &rt_rq->rt_runtime_lock
000000004e34c8d4 FD:    1 BD: 1112 ....: &cp->lock
00000000b2ac5d96 FD:    1 BD: 1127 -.-.: pvclock_gtod_data
00000000c5df4dc3 FD:    1 BD: 1031 ..-.: &tsk->delays->lock
00000000fe623698 FD:    1 BD: 1112 -...:
per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)


But the kernel continues to crash on different unrelated bugs...

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
       [not found]                             ` <CACT4Y+aNJmuhk0KicX4FzKW6PhawFBgvrC2gSJcWwUkR8VSSmg@mail.gmail.com>
@ 2020-11-19 14:36                               ` Dmitry Vyukov
  2020-11-19 18:08                                 ` Dmitry Vyukov
  2020-11-19 14:57                               ` Tetsuo Handa
  1 sibling, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-19 14:36 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On Thu, Nov 19, 2020 at 3:30 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
> > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > >
> > > On 2020/11/19 22:06, Dmitry Vyukov wrote:
> > > >>>>
> > > >>>> I am trying to reproduce this locally first. syzbot caims it can
> > > >>>> reproduce it with a number of very simpler reproducers (like spawn
> > > >>>> process, unshare, create socket):
> > > >>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> > > >>>>
> > > >>>> I see a very slow drift, but it's very slow, so get only to:
> > > >>>>  direct dependencies:                 22072 [max: 32768]
> > > >>>>
> > > >>>> But that's running a very uniform workload.
> > > >>>>
> > > >>>> However when I tried to cat /proc/lockdep to see if there is anything
> > > >>>> fishy already,
> > > >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> > > >>>>
> > > >>>> Some missing locks?
> > >
> > > Not a TOMOYO's bug. Maybe a lockdep's bug.
> > >
> > > >
> > > > But I don't know if it's enough to explain the overflow or not...
> > > >
> > >
> > > Since you can't hit the limit locally, I guess we need to ask syzbot to
> > > run massive testcases.
> >
> > I am trying to test the code that will do this. Otherwise we will get
> > days-long round-trips for stupid bugs. These files are also quite
> > huge, I afraid that may not fit into storage.
> >
> > So far I get to at most:
> >
> >  lock-classes:                         2901 [max: 8192]
> >  direct dependencies:                 25574 [max: 32768]
> >  dependency chains:                   40605 [max: 65536]
> >  dependency chain hlocks used:       176814 [max: 327680]
> >  stack-trace entries:                258590 [max: 524288]
> >
> > with these worst offenders:
> >
> > # egrep "BD: [0-9]" /proc/lockdep
> > 00000000df5b6792 FD:    2 BD: 1235 -.-.: &obj_hash[i].lock
> > 000000005dfeb73c FD:    1 BD: 1236 ..-.: pool_lock
> > 00000000b86254b1 FD:   14 BD: 1111 -.-.: &rq->lock
> > 00000000866efb75 FD:    1 BD: 1112 ....: &cfs_b->lock
> > 000000006970cf1a FD:    2 BD: 1126 ----: tk_core.seq.seqcount
> > 00000000f49d95b0 FD:    3 BD: 1180 -.-.: &base->lock
> > 00000000ba3f8454 FD:    5 BD: 1115 -.-.: hrtimer_bases.lock
> > 00000000fb340f16 FD:   16 BD: 1030 -.-.: &p->pi_lock
> > 00000000c9f6f58c FD:    1 BD: 1114 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
> > 0000000049d3998c FD:    1 BD: 1112 -.-.: &cfs_rq->removed.lock
> > 00000000fdf7f396 FD:    7 BD: 1112 -...: &rt_b->rt_runtime_lock
> > 0000000021aedb8d FD:    1 BD: 1113 -...: &rt_rq->rt_runtime_lock
> > 000000004e34c8d4 FD:    1 BD: 1112 ....: &cp->lock
> > 00000000b2ac5d96 FD:    1 BD: 1127 -.-.: pvclock_gtod_data
> > 00000000c5df4dc3 FD:    1 BD: 1031 ..-.: &tsk->delays->lock
> > 00000000fe623698 FD:    1 BD: 1112 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
> >
> >
> > But the kernel continues to crash on different unrelated bugs...
>
>
> Here is one successful sample. How do we debug it? What should we be
> looking for?
>
> p.s. it's indeed huge; the full log was 11MB, so this probably won't be
> chewed by syzbot.
> Peter, are these [hex numbers] needed? Could we strip them during
> post-processing? At first sight they look like derivatives of the
> name.

The worst back-edge offenders are:

00000000b445a595 FD:    2 BD: 1595 -.-.: &obj_hash[i].lock
0000000055ae0468 FD:    1 BD: 1596 ..-.: pool_lock
00000000b1336dc4 FD:    2 BD: 1002 ..-.: &zone->lock
000000009a0cabce FD:    1 BD: 1042 ...-: &____s->seqcount
000000001f2849b5 FD:    1 BD: 1192 ..-.: depot_lock
00000000d044255b FD:    1 BD: 1038 -.-.: &n->list_lock
000000005868699e FD:   17 BD: 1447 -.-.: &rq->lock
00000000bb52ab59 FD:    1 BD: 1448 ....: &cfs_b->lock
000000004f442fff FD:    2 BD: 1469 ----: tk_core.seq.seqcount
00000000c908cc32 FD:    3 BD: 1512 -.-.: &base->lock
00000000478677cc FD:    5 BD: 1435 -.-.: hrtimer_bases.lock
00000000b5b65cb1 FD:   19 BD: 1255 -.-.: &p->pi_lock
000000007f313bd5 FD:    1 BD: 1451 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
00000000bac5d8ed FD:    1 BD: 1004 ...-: &____s->seqcount#2
000000000f57e411 FD:    1 BD: 1448 -.-.: &cfs_rq->removed.lock
0000000013c1ab65 FD:    7 BD: 1449 -.-.: &rt_b->rt_runtime_lock
000000003bdf78f4 FD:    1 BD: 1450 -.-.: &rt_rq->rt_runtime_lock
00000000975d5b80 FD:    1 BD: 1448 ....: &cp->lock
000000002586e81b FD:    1 BD: 1471 -.-.: pvclock_gtod_data
00000000d03aed24 FD:    1 BD: 1275 ..-.: &tsk->delays->lock
000000001119414f FD:    1 BD: 1448 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
000000006f3d793b FD:    6 BD: 1449 -.-.: &ctx->lock
00000000f3f0190c FD:    9 BD: 1448 -...: &rq->lock/1
000000007410cf1a FD:    1 BD: 1448 -...: &rd->rto_lock

There are 19 with ~1500 incoming edges. So that's 20K.
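
A quick cross-check (just a sketch; it assumes the "<hash> FD: n BD: m"
layout of the dump above, with no OPS: column) is to sum the BD column
of these offenders directly:

  egrep "BD: [0-9]" /proc/lockdep | awk '{ sum += $5 } END { print sum }'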

In my local testing I was at around 20-something K and these worst
offenders were at ~1000 back edges.
Now they got to 1500, so that is what got us over the 32K limit, right?

Does this analysis make sense?

Any ideas what to do with these?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
       [not found]                             ` <CACT4Y+aNJmuhk0KicX4FzKW6PhawFBgvrC2gSJcWwUkR8VSSmg@mail.gmail.com>
  2020-11-19 14:36                               ` Dmitry Vyukov
@ 2020-11-19 14:57                               ` Tetsuo Handa
  1 sibling, 0 replies; 38+ messages in thread
From: Tetsuo Handa @ 2020-11-19 14:57 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On 2020/11/19 23:30, Dmitry Vyukov wrote:
> p.s. it's indeed huge; the full log was 11MB, so this probably won't be
> chewed by syzbot.

Is "cat /proc/lockdep*" output written from userspace?
Then, we could try "xz -9" + "base64" for recording.
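
Something like this sketch (the file names are made up) would turn the
dumps into a console-safe blob and recover them later:

  cat /proc/lockdep* | xz -9 | base64 > lockdep.txt.xz.b64
  base64 -d lockdep.txt.xz.b64 | xz -d > lockdep.txt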

> Peter, are these [hex numbers] needed? Could we strip them during
> post-processing? At first sight they look like derivatives of the
> name.

kernel/locking/lockdep.c uses %px (raw address) whereas
kernel/locking/lockdep_proc.c uses %p (__ptr_to_hashval() value).
I think saving these [hashed hex numbers] is unlikely to be useful.
At least for this testing, we can strip the leading 00000000 part.
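
For example, something like this sketch (over the dump recorded above;
the same works on /proc/lockdep output directly):

  sed 's/^00000000//' lockdep.txt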

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-19 14:36                               ` Dmitry Vyukov
@ 2020-11-19 18:08                                 ` Dmitry Vyukov
  2020-11-20  9:22                                   ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-19 18:08 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 5389 bytes --]

On Thu, Nov 19, 2020 at 3:36 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Thu, Nov 19, 2020 at 3:30 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > >
> > > On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
> > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > >
> > > > On 2020/11/19 22:06, Dmitry Vyukov wrote:
> > > > >>>>
> > > > >>>> I am trying to reproduce this locally first. syzbot claims it can
> > > > >>>> reproduce it with a number of very simple reproducers (like spawn
> > > > >>>> process, unshare, create socket):
> > > > >>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> > > > >>>>
> > > > >>>> I see a drift, but it's very slow, so I get only to:
> > > > >>>>  direct dependencies:                 22072 [max: 32768]
> > > > >>>>
> > > > >>>> But that's running a very uniform workload.
> > > > >>>>
> > > > >>>> However when I tried to cat /proc/lockdep to see if there is anything
> > > > >>>> fishy already,
> > > > >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> > > > >>>>
> > > > >>>> Some missing locks?
> > > >
> > > > Not a TOMOYO bug. Maybe a lockdep bug.
> > > >
> > > > >
> > > > > But I don't know if it's enough to explain the overflow or not...
> > > > >
> > > >
> > > > Since you can't hit the limit locally, I guess we need to ask syzbot to
> > > > run massive testcases.
> > >
> > > I am trying to test the code that will do this. Otherwise we will get
> > > days-long round-trips for stupid bugs. These files are also quite
> > > huge; I'm afraid they may not fit into storage.
> > >
> > > So far I get to at most:
> > >
> > >  lock-classes:                         2901 [max: 8192]
> > >  direct dependencies:                 25574 [max: 32768]
> > >  dependency chains:                   40605 [max: 65536]
> > >  dependency chain hlocks used:       176814 [max: 327680]
> > >  stack-trace entries:                258590 [max: 524288]
> > >
> > > with these worst offenders:
> > >
> > > # egrep "BD: [0-9]" /proc/lockdep
> > > 00000000df5b6792 FD:    2 BD: 1235 -.-.: &obj_hash[i].lock
> > > 000000005dfeb73c FD:    1 BD: 1236 ..-.: pool_lock
> > > 00000000b86254b1 FD:   14 BD: 1111 -.-.: &rq->lock
> > > 00000000866efb75 FD:    1 BD: 1112 ....: &cfs_b->lock
> > > 000000006970cf1a FD:    2 BD: 1126 ----: tk_core.seq.seqcount
> > > 00000000f49d95b0 FD:    3 BD: 1180 -.-.: &base->lock
> > > 00000000ba3f8454 FD:    5 BD: 1115 -.-.: hrtimer_bases.lock
> > > 00000000fb340f16 FD:   16 BD: 1030 -.-.: &p->pi_lock
> > > 00000000c9f6f58c FD:    1 BD: 1114 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
> > > 0000000049d3998c FD:    1 BD: 1112 -.-.: &cfs_rq->removed.lock
> > > 00000000fdf7f396 FD:    7 BD: 1112 -...: &rt_b->rt_runtime_lock
> > > 0000000021aedb8d FD:    1 BD: 1113 -...: &rt_rq->rt_runtime_lock
> > > 000000004e34c8d4 FD:    1 BD: 1112 ....: &cp->lock
> > > 00000000b2ac5d96 FD:    1 BD: 1127 -.-.: pvclock_gtod_data
> > > 00000000c5df4dc3 FD:    1 BD: 1031 ..-.: &tsk->delays->lock
> > > 00000000fe623698 FD:    1 BD: 1112 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
> > >
> > >
> > > But the kernel continues to crash on different unrelated bugs...
> >
> >
> > Here is one successful sample. How do we debug it? What should we be
> > looking for?
> >
> > p.s. it's indeed huge; the full log was 11MB, so this probably won't be
> > chewed by syzbot.
> > Peter, are these [hex numbers] needed? Could we strip them during
> > post-processing? At first sight they look like derivatives of the
> > name.
>
> The worst back-edge offenders are:
>
> 00000000b445a595 FD:    2 BD: 1595 -.-.: &obj_hash[i].lock
> 0000000055ae0468 FD:    1 BD: 1596 ..-.: pool_lock
> 00000000b1336dc4 FD:    2 BD: 1002 ..-.: &zone->lock
> 000000009a0cabce FD:    1 BD: 1042 ...-: &____s->seqcount
> 000000001f2849b5 FD:    1 BD: 1192 ..-.: depot_lock
> 00000000d044255b FD:    1 BD: 1038 -.-.: &n->list_lock
> 000000005868699e FD:   17 BD: 1447 -.-.: &rq->lock
> 00000000bb52ab59 FD:    1 BD: 1448 ....: &cfs_b->lock
> 000000004f442fff FD:    2 BD: 1469 ----: tk_core.seq.seqcount
> 00000000c908cc32 FD:    3 BD: 1512 -.-.: &base->lock
> 00000000478677cc FD:    5 BD: 1435 -.-.: hrtimer_bases.lock
> 00000000b5b65cb1 FD:   19 BD: 1255 -.-.: &p->pi_lock
> 000000007f313bd5 FD:    1 BD: 1451 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
> 00000000bac5d8ed FD:    1 BD: 1004 ...-: &____s->seqcount#2
> 000000000f57e411 FD:    1 BD: 1448 -.-.: &cfs_rq->removed.lock
> 0000000013c1ab65 FD:    7 BD: 1449 -.-.: &rt_b->rt_runtime_lock
> 000000003bdf78f4 FD:    1 BD: 1450 -.-.: &rt_rq->rt_runtime_lock
> 00000000975d5b80 FD:    1 BD: 1448 ....: &cp->lock
> 000000002586e81b FD:    1 BD: 1471 -.-.: pvclock_gtod_data
> 00000000d03aed24 FD:    1 BD: 1275 ..-.: &tsk->delays->lock
> 000000001119414f FD:    1 BD: 1448 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
> 000000006f3d793b FD:    6 BD: 1449 -.-.: &ctx->lock
> 00000000f3f0190c FD:    9 BD: 1448 -...: &rq->lock/1
> 000000007410cf1a FD:    1 BD: 1448 -...: &rd->rto_lock
>
> There are 19 with ~1500 incoming edges. So that's 20K.
>
> In my local testing I was at around 20-something K and these worst
> offenders were at ~1000 back edges.
> Now they got to 1500, so that is what got us over the 32K limit, right?
>
> Does this analysis make sense?
>
> Any ideas what to do with these?

And here is a full dump for:
BUG: MAX_LOCKDEP_CHAINS too low!

[-- Attachment #2: log29.gz --]
[-- Type: application/gzip, Size: 659232 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-19 18:08                                 ` Dmitry Vyukov
@ 2020-11-20  9:22                                   ` Dmitry Vyukov
  2020-11-20  9:27                                     ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-20  9:22 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 8685 bytes --]

On Thu, Nov 19, 2020 at 7:08 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > > On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
> > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > >
> > > > > On 2020/11/19 22:06, Dmitry Vyukov wrote:
> > > > > >>>>
> > > > > >>>> I am trying to reproduce this locally first. syzbot claims it can
> > > > > >>>> reproduce it with a number of very simple reproducers (like spawn
> > > > > >>>> process, unshare, create socket):
> > > > > >>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> > > > > >>>>
> > > > > >>>> I see a drift, but it's very slow, so I get only to:
> > > > > >>>>  direct dependencies:                 22072 [max: 32768]
> > > > > >>>>
> > > > > >>>> But that's running a very uniform workload.
> > > > > >>>>
> > > > > >>>> However when I tried to cat /proc/lockdep to see if there is anything
> > > > > >>>> fishy already,
> > > > > >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> > > > > >>>>
> > > > > >>>> Some missing locks?
> > > > >
> > > > > Not a TOMOYO bug. Maybe a lockdep bug.
> > > > >
> > > > > >
> > > > > > But I don't know if it's enough to explain the overflow or not...
> > > > > >
> > > > >
> > > > > Since you can't hit the limit locally, I guess we need to ask syzbot to
> > > > > run massive testcases.
> > > >
> > > > I am trying to test the code that will do this. Otherwise we will get
> > > > days-long round-trips for stupid bugs. These files are also quite
> > > > huge; I'm afraid they may not fit into storage.
> > > >
> > > > So far I get to at most:
> > > >
> > > >  lock-classes:                         2901 [max: 8192]
> > > >  direct dependencies:                 25574 [max: 32768]
> > > >  dependency chains:                   40605 [max: 65536]
> > > >  dependency chain hlocks used:       176814 [max: 327680]
> > > >  stack-trace entries:                258590 [max: 524288]
> > > >
> > > > with these worst offenders:
> > > >
> > > > # egrep "BD: [0-9]" /proc/lockdep
> > > > 00000000df5b6792 FD:    2 BD: 1235 -.-.: &obj_hash[i].lock
> > > > 000000005dfeb73c FD:    1 BD: 1236 ..-.: pool_lock
> > > > 00000000b86254b1 FD:   14 BD: 1111 -.-.: &rq->lock
> > > > 00000000866efb75 FD:    1 BD: 1112 ....: &cfs_b->lock
> > > > 000000006970cf1a FD:    2 BD: 1126 ----: tk_core.seq.seqcount
> > > > 00000000f49d95b0 FD:    3 BD: 1180 -.-.: &base->lock
> > > > 00000000ba3f8454 FD:    5 BD: 1115 -.-.: hrtimer_bases.lock
> > > > 00000000fb340f16 FD:   16 BD: 1030 -.-.: &p->pi_lock
> > > > 00000000c9f6f58c FD:    1 BD: 1114 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
> > > > 0000000049d3998c FD:    1 BD: 1112 -.-.: &cfs_rq->removed.lock
> > > > 00000000fdf7f396 FD:    7 BD: 1112 -...: &rt_b->rt_runtime_lock
> > > > 0000000021aedb8d FD:    1 BD: 1113 -...: &rt_rq->rt_runtime_lock
> > > > 000000004e34c8d4 FD:    1 BD: 1112 ....: &cp->lock
> > > > 00000000b2ac5d96 FD:    1 BD: 1127 -.-.: pvclock_gtod_data
> > > > 00000000c5df4dc3 FD:    1 BD: 1031 ..-.: &tsk->delays->lock
> > > > 00000000fe623698 FD:    1 BD: 1112 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
> > > >
> > > >
> > > > But the kernel continues to crash on different unrelated bugs...
> > >
> > >
> > > Here is one successful sample. How do we debug it? What should we be
> > > looking for?
> > >
> > > p.s. it's indeed huge; the full log was 11MB, so this probably won't be
> > > chewed by syzbot.
> > > Peter, are these [hex numbers] needed? Could we strip them during
> > > post-processing? At first sight they look like derivatives of the
> > > name.
> >
> > The worst back-edge offenders are:
> >
> > 00000000b445a595 FD:    2 BD: 1595 -.-.: &obj_hash[i].lock
> > 0000000055ae0468 FD:    1 BD: 1596 ..-.: pool_lock
> > 00000000b1336dc4 FD:    2 BD: 1002 ..-.: &zone->lock
> > 000000009a0cabce FD:    1 BD: 1042 ...-: &____s->seqcount
> > 000000001f2849b5 FD:    1 BD: 1192 ..-.: depot_lock
> > 00000000d044255b FD:    1 BD: 1038 -.-.: &n->list_lock
> > 000000005868699e FD:   17 BD: 1447 -.-.: &rq->lock
> > 00000000bb52ab59 FD:    1 BD: 1448 ....: &cfs_b->lock
> > 000000004f442fff FD:    2 BD: 1469 ----: tk_core.seq.seqcount
> > 00000000c908cc32 FD:    3 BD: 1512 -.-.: &base->lock
> > 00000000478677cc FD:    5 BD: 1435 -.-.: hrtimer_bases.lock
> > 00000000b5b65cb1 FD:   19 BD: 1255 -.-.: &p->pi_lock
> > 000000007f313bd5 FD:    1 BD: 1451 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
> > 00000000bac5d8ed FD:    1 BD: 1004 ...-: &____s->seqcount#2
> > 000000000f57e411 FD:    1 BD: 1448 -.-.: &cfs_rq->removed.lock
> > 0000000013c1ab65 FD:    7 BD: 1449 -.-.: &rt_b->rt_runtime_lock
> > 000000003bdf78f4 FD:    1 BD: 1450 -.-.: &rt_rq->rt_runtime_lock
> > 00000000975d5b80 FD:    1 BD: 1448 ....: &cp->lock
> > 000000002586e81b FD:    1 BD: 1471 -.-.: pvclock_gtod_data
> > 00000000d03aed24 FD:    1 BD: 1275 ..-.: &tsk->delays->lock
> > 000000001119414f FD:    1 BD: 1448 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
> > 000000006f3d793b FD:    6 BD: 1449 -.-.: &ctx->lock
> > 00000000f3f0190c FD:    9 BD: 1448 -...: &rq->lock/1
> > 000000007410cf1a FD:    1 BD: 1448 -...: &rd->rto_lock
> >
> > There are 19 with ~1500 incoming edges. So that's 20K.
> >
> > In my local testing I was at around 20-something K and these worst
> > offenders were at ~1000 back edges.
> > Now they got to 1500, so that is what got us over the 32K limit, right?
> >
> > Does this analysis make sense?
> >
> > Any ideas what to do with these?
>
> And here is a full dump for:
> BUG: MAX_LOCKDEP_CHAINS too low!

I attached the wrong one; here is the one with "BUG: MAX_LOCKDEP_CHAINS too low!".

There are also some outliers with a high FD count. But these mostly look
fine too, just locks with high fan-out
(genl_mutex/deferred_process_work):

$ egrep "FD: [0-9]" log42
00000000745aa345 FD: 1032 BD:   13 +.+.: cgroup_mutex
00000000cdf4d414 FD: 1193 BD:    8 ++++: pernet_ops_rwsem
00000000582c60d4 FD: 1707 BD:    1 +.+.: (wq_completion)events
000000003f5e465a FD: 1012 BD:   46 +.+.: rtnl_mutex
000000005c1c41b7 FD: 1041 BD:    1 ++++: cb_lock
000000001dea9384 FD: 1038 BD:    3 +.+.: genl_mutex
00000000d158b76e FD: 1483 BD:    2 +.+.: (delayed_fput_work).work
00000000caec859a FD: 1036 BD:    1 +.+.: (wq_completion)events_power_efficient
00000000e36b9bc8 FD: 1033 BD:    1 +.+.: blkcg_pol_register_mutex
00000000927de1d8 FD: 1013 BD:    2 +.+.: (linkwatch_work).work
00000000161fed05 FD: 1013 BD:    1 +.+.: dlci_ioctl_mutex
000000003b6f60a4 FD: 1194 BD:    1 +.+.: masq_mutex
0000000050f06ef0 FD: 1015 BD:    1 +.+.: (wq_completion)ipv6_addrconf
0000000022d8bdd2 FD: 1013 BD:    2 +.+.: (addr_chk_work).work
00000000208719ed FD: 1014 BD:    1 +.+.: br_ioctl_mutex
0000000034a64a9d FD: 1013 BD:    1 +.+.: vlan_ioctl_mutex
000000004bbb2b36 FD: 1013 BD:    2 +.+.: reg_work
00000000f3d10da2 FD: 1075 BD:    2 +.+.: (work_completion)(&fw_work->work)
00000000bf633318 FD: 1089 BD:    2 +.+.: &p->lock
00000000a04e4ca6 FD: 1342 BD:    4 +.+.: &sb->s_type->i_mutex_key#11
0000000074e4d032 FD: 1065 BD:    2 .+.+: sb_writers#7
000000005b21b2d8 FD: 1056 BD:    6 +.+.: &of->mutex
000000004e03ff55 FD: 1189 BD:    1 +.+.: &f->f_pos_lock
000000003c8d0466 FD: 1013 BD:    2 +.+.: (crda_timeout).work
00000000d7d5a678 FD: 1013 BD:    2 +.+.: (work_completion)(&(&ifa->dad_work)->work)
0000000085c47bd2 FD: 1013 BD:    2 +.+.: (reg_check_chans).work
0000000008a61c9f FD: 1065 BD:    2 .+.+: sb_writers#10
000000003728e6ec FD: 1065 BD:    1 .+.+: sb_writers#11
000000007fe449ca FD: 1035 BD:    1 ++++: &type->s_umount_key#43
0000000078ce5279 FD: 1013 BD:    2 +.+.: deferred_process_work
00000000c673625f FD: 1029 BD:    1 .+.+: kn->active#80
000000003734f752 FD: 1028 BD:    8 +.+.: nsim_bus_dev_list_lock
0000000085797ea4 FD: 1023 BD:   20 +.+.: devlink_mutex
0000000019a4328d FD: 1020 BD:    2 +.+.: (work_completion)(&(&nsim_dev->trap_data->trap_report_dw)->work)
00000000615ae4c1 FD: 1018 BD:   22 +.+.: &nsim_dev->port_list_lock
00000000bf207b94 FD: 1039 BD:    2 +.+.: nlk_cb_mutex-GENERIC
00000000d0a722ec FD: 1015 BD:    1 +.+.: (wq_completion)cfg80211
00000000e2d69b1e FD: 1013 BD:   10 +.+.: (work_completion)(&rdev->event_work)
0000000094cc69ac FD: 1040 BD:    3 +.+.: &type->i_mutex_dir_key#6/1
00000000897451d9 FD: 1035 BD:    2 +.+.: &type->i_mutex_dir_key#7/1
000000008bade0a8 FD: 1014 BD:    1 +.+.: ppp_mutex
000000004f82d94d FD: 1196 BD:    1 +.+.: (wq_completion)netns
000000002d98bcf4 FD: 1195 BD:    2 +.+.: net_cleanup_work
00000000cade4b38 FD: 1343 BD:    1 .+.+: sb_writers#13
00000000530be43e FD: 1013 BD:    2 +.+.: (work_completion)(&(&rdev->dfs_update_channels_wk)->work)
0000000027cc2a05 FD: 1032 BD:    2 +.+.: (work_completion)(&map->work)

[-- Attachment #2: log42.gz --]
[-- Type: application/gzip, Size: 467848 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-20  9:22                                   ` Dmitry Vyukov
@ 2020-11-20  9:27                                     ` Dmitry Vyukov
  2020-11-22  1:56                                       ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-20  9:27 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Peter Zijlstra

On Fri, Nov 20, 2020 at 10:22 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Thu, Nov 19, 2020 at 7:08 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > > > On Thu, Nov 19, 2020 at 2:45 PM Tetsuo Handa
> > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > >
> > > > > > On 2020/11/19 22:06, Dmitry Vyukov wrote:
> > > > > > >>>>
> > > > > > >>>> I am trying to reproduce this locally first. syzbot claims it can
> > > > > > >>>> reproduce it with a number of very simple reproducers (like spawn
> > > > > > >>>> process, unshare, create socket):
> > > > > > >>>> https://syzkaller.appspot.com/bug?id=8a18efe79140782a88dcd098808d6ab20ed740cc
> > > > > > >>>>
> > > > > > >>>> I see a drift, but it's very slow, so I get only to:
> > > > > > >>>>  direct dependencies:                 22072 [max: 32768]
> > > > > > >>>>
> > > > > > >>>> But that's running a very uniform workload.
> > > > > > >>>>
> > > > > > >>>> However when I tried to cat /proc/lockdep to see if there is anything
> > > > > > >>>> fishy already,
> > > > > > >>>> I got this (on c2e7554e1b85935d962127efa3c2a76483b0b3b6).
> > > > > > >>>>
> > > > > > >>>> Some missing locks?
> > > > > >
> > > > > > Not a TOMOYO bug. Maybe a lockdep bug.
> > > > > >
> > > > > > >
> > > > > > > But I don't know if it's enough to explain the overflow or not...
> > > > > > >
> > > > > >
> > > > > > Since you can't hit the limit locally, I guess we need to ask syzbot to
> > > > > > run massive testcases.
> > > > >
> > > > > I am trying to test the code that will do this. Otherwise we will get
> > > > > days-long round-trips for stupid bugs. These files are also quite
> > > > > huge; I'm afraid they may not fit into storage.
> > > > >
> > > > > So far I get to at most:
> > > > >
> > > > >  lock-classes:                         2901 [max: 8192]
> > > > >  direct dependencies:                 25574 [max: 32768]
> > > > >  dependency chains:                   40605 [max: 65536]
> > > > >  dependency chain hlocks used:       176814 [max: 327680]
> > > > >  stack-trace entries:                258590 [max: 524288]
> > > > >
> > > > > with these worst offenders:
> > > > >
> > > > > # egrep "BD: [0-9]" /proc/lockdep
> > > > > 00000000df5b6792 FD:    2 BD: 1235 -.-.: &obj_hash[i].lock
> > > > > 000000005dfeb73c FD:    1 BD: 1236 ..-.: pool_lock
> > > > > 00000000b86254b1 FD:   14 BD: 1111 -.-.: &rq->lock
> > > > > 00000000866efb75 FD:    1 BD: 1112 ....: &cfs_b->lock
> > > > > 000000006970cf1a FD:    2 BD: 1126 ----: tk_core.seq.seqcount
> > > > > 00000000f49d95b0 FD:    3 BD: 1180 -.-.: &base->lock
> > > > > 00000000ba3f8454 FD:    5 BD: 1115 -.-.: hrtimer_bases.lock
> > > > > 00000000fb340f16 FD:   16 BD: 1030 -.-.: &p->pi_lock
> > > > > 00000000c9f6f58c FD:    1 BD: 1114 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
> > > > > 0000000049d3998c FD:    1 BD: 1112 -.-.: &cfs_rq->removed.lock
> > > > > 00000000fdf7f396 FD:    7 BD: 1112 -...: &rt_b->rt_runtime_lock
> > > > > 0000000021aedb8d FD:    1 BD: 1113 -...: &rt_rq->rt_runtime_lock
> > > > > 000000004e34c8d4 FD:    1 BD: 1112 ....: &cp->lock
> > > > > 00000000b2ac5d96 FD:    1 BD: 1127 -.-.: pvclock_gtod_data
> > > > > 00000000c5df4dc3 FD:    1 BD: 1031 ..-.: &tsk->delays->lock
> > > > > 00000000fe623698 FD:    1 BD: 1112 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
> > > > >
> > > > >
> > > > > But the kernel continues to crash on different unrelated bugs...
> > > >
> > > >
> > > > Here is one successful sample. How do we debug it? What should we be
> > > > looking for?
> > > >
> > > > p.s. it's indeed huge; the full log was 11MB, so this probably won't be
> > > > chewed by syzbot.
> > > > Peter, are these [hex numbers] needed? Could we strip them during
> > > > post-processing? At first sight they look like derivatives of the
> > > > name.
> > >
> > > The worst back-edge offenders are:
> > >
> > > 00000000b445a595 FD:    2 BD: 1595 -.-.: &obj_hash[i].lock
> > > 0000000055ae0468 FD:    1 BD: 1596 ..-.: pool_lock
> > > 00000000b1336dc4 FD:    2 BD: 1002 ..-.: &zone->lock
> > > 000000009a0cabce FD:    1 BD: 1042 ...-: &____s->seqcount
> > > 000000001f2849b5 FD:    1 BD: 1192 ..-.: depot_lock
> > > 00000000d044255b FD:    1 BD: 1038 -.-.: &n->list_lock
> > > 000000005868699e FD:   17 BD: 1447 -.-.: &rq->lock
> > > 00000000bb52ab59 FD:    1 BD: 1448 ....: &cfs_b->lock
> > > 000000004f442fff FD:    2 BD: 1469 ----: tk_core.seq.seqcount
> > > 00000000c908cc32 FD:    3 BD: 1512 -.-.: &base->lock
> > > 00000000478677cc FD:    5 BD: 1435 -.-.: hrtimer_bases.lock
> > > 00000000b5b65cb1 FD:   19 BD: 1255 -.-.: &p->pi_lock
> > > 000000007f313bd5 FD:    1 BD: 1451 -.-.: &per_cpu_ptr(group->pcpu, cpu)->seq
> > > 00000000bac5d8ed FD:    1 BD: 1004 ...-: &____s->seqcount#2
> > > 000000000f57e411 FD:    1 BD: 1448 -.-.: &cfs_rq->removed.lock
> > > 0000000013c1ab65 FD:    7 BD: 1449 -.-.: &rt_b->rt_runtime_lock
> > > 000000003bdf78f4 FD:    1 BD: 1450 -.-.: &rt_rq->rt_runtime_lock
> > > 00000000975d5b80 FD:    1 BD: 1448 ....: &cp->lock
> > > 000000002586e81b FD:    1 BD: 1471 -.-.: pvclock_gtod_data
> > > 00000000d03aed24 FD:    1 BD: 1275 ..-.: &tsk->delays->lock
> > > 000000001119414f FD:    1 BD: 1448 -...: per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu)
> > > 000000006f3d793b FD:    6 BD: 1449 -.-.: &ctx->lock
> > > 00000000f3f0190c FD:    9 BD: 1448 -...: &rq->lock/1
> > > 000000007410cf1a FD:    1 BD: 1448 -...: &rd->rto_lock
> > >
> > > There are 19 with ~1500 incoming edges. So that's 20K.
> > >
> > > In my local testing I was at around 20-something K and these worst
> > > offenders were at ~1000 back edges.
> > > Now they got to 1500, so that is what got us over the 32K limit, right?
> > >
> > > Does this analysis make sense?
> > >
> > > Any ideas what to do with these?
> >
> > And here is a full dump for:
> > BUG: MAX_LOCKDEP_CHAINS too low!
>
> I attached the wrong one; here is the one with "BUG: MAX_LOCKDEP_CHAINS too low!".
>
> There are also some outliers with a high FD count. But these mostly look
> fine too, just locks with high fan-out
> (genl_mutex/deferred_process_work):
>
> $ egrep "FD: [0-9]" log42
> 00000000745aa345 FD: 1032 BD:   13 +.+.: cgroup_mutex
> 00000000cdf4d414 FD: 1193 BD:    8 ++++: pernet_ops_rwsem
> 00000000582c60d4 FD: 1707 BD:    1 +.+.: (wq_completion)events
> 000000003f5e465a FD: 1012 BD:   46 +.+.: rtnl_mutex
> 000000005c1c41b7 FD: 1041 BD:    1 ++++: cb_lock
> 000000001dea9384 FD: 1038 BD:    3 +.+.: genl_mutex
> 00000000d158b76e FD: 1483 BD:    2 +.+.: (delayed_fput_work).work
> 00000000caec859a FD: 1036 BD:    1 +.+.: (wq_completion)events_power_efficient
> 00000000e36b9bc8 FD: 1033 BD:    1 +.+.: blkcg_pol_register_mutex
> 00000000927de1d8 FD: 1013 BD:    2 +.+.: (linkwatch_work).work
> 00000000161fed05 FD: 1013 BD:    1 +.+.: dlci_ioctl_mutex
> 000000003b6f60a4 FD: 1194 BD:    1 +.+.: masq_mutex
> 0000000050f06ef0 FD: 1015 BD:    1 +.+.: (wq_completion)ipv6_addrconf
> 0000000022d8bdd2 FD: 1013 BD:    2 +.+.: (addr_chk_work).work
> 00000000208719ed FD: 1014 BD:    1 +.+.: br_ioctl_mutex
> 0000000034a64a9d FD: 1013 BD:    1 +.+.: vlan_ioctl_mutex
> 000000004bbb2b36 FD: 1013 BD:    2 +.+.: reg_work
> 00000000f3d10da2 FD: 1075 BD:    2 +.+.: (work_completion)(&fw_work->work)
> 00000000bf633318 FD: 1089 BD:    2 +.+.: &p->lock
> 00000000a04e4ca6 FD: 1342 BD:    4 +.+.: &sb->s_type->i_mutex_key#11
> 0000000074e4d032 FD: 1065 BD:    2 .+.+: sb_writers#7
> 000000005b21b2d8 FD: 1056 BD:    6 +.+.: &of->mutex
> 000000004e03ff55 FD: 1189 BD:    1 +.+.: &f->f_pos_lock
> 000000003c8d0466 FD: 1013 BD:    2 +.+.: (crda_timeout).work
> 00000000d7d5a678 FD: 1013 BD:    2 +.+.: (work_completion)(&(&ifa->dad_work)->work)
> 0000000085c47bd2 FD: 1013 BD:    2 +.+.: (reg_check_chans).work
> 0000000008a61c9f FD: 1065 BD:    2 .+.+: sb_writers#10
> 000000003728e6ec FD: 1065 BD:    1 .+.+: sb_writers#11
> 000000007fe449ca FD: 1035 BD:    1 ++++: &type->s_umount_key#43
> 0000000078ce5279 FD: 1013 BD:    2 +.+.: deferred_process_work
> 00000000c673625f FD: 1029 BD:    1 .+.+: kn->active#80
> 000000003734f752 FD: 1028 BD:    8 +.+.: nsim_bus_dev_list_lock
> 0000000085797ea4 FD: 1023 BD:   20 +.+.: devlink_mutex
> 0000000019a4328d FD: 1020 BD:    2 +.+.: (work_completion)(&(&nsim_dev->trap_data->trap_report_dw)->work)
> 00000000615ae4c1 FD: 1018 BD:   22 +.+.: &nsim_dev->port_list_lock
> 00000000bf207b94 FD: 1039 BD:    2 +.+.: nlk_cb_mutex-GENERIC
> 00000000d0a722ec FD: 1015 BD:    1 +.+.: (wq_completion)cfg80211
> 00000000e2d69b1e FD: 1013 BD:   10 +.+.: (work_completion)(&rdev->event_work)
> 0000000094cc69ac FD: 1040 BD:    3 +.+.: &type->i_mutex_dir_key#6/1
> 00000000897451d9 FD: 1035 BD:    2 +.+.: &type->i_mutex_dir_key#7/1
> 000000008bade0a8 FD: 1014 BD:    1 +.+.: ppp_mutex
> 000000004f82d94d FD: 1196 BD:    1 +.+.: (wq_completion)netns
> 000000002d98bcf4 FD: 1195 BD:    2 +.+.: net_cleanup_work
> 00000000cade4b38 FD: 1343 BD:    1 .+.+: sb_writers#13
> 00000000530be43e FD: 1013 BD:    2 +.+.: (work_completion)(&(&rdev->dfs_update_channels_wk)->work)
> 0000000027cc2a05 FD: 1032 BD:    2 +.+.: (work_completion)(&map->work)


Peter, so far it looks like just a very large, but normal graph to me.
The cheapest solution from an engineering point of view would be just
to increase the constants. I assume a 2x increase should buy us lots
of time before we overflow again.
I can think of more elaborate solutions, e.g. using bitmasks to
represent hot leaf and top-level locks. But it will both increase the
resulting code complexity (no uniform representation anymore, all code
will need to deal with different representations) and require some
time investment (which I, at least, can't justify compared to
just throwing a bit more machine memory at it). And in the end it
won't really reduce the size of the graph.
What do you think?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-20  9:27                                     ` Dmitry Vyukov
@ 2020-11-22  1:56                                       ` Tetsuo Handa
  2020-11-27  9:00                                         ` Dmitry Vyukov
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-11-22  1:56 UTC (permalink / raw)
  To: Dmitry Vyukov, Peter Zijlstra
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML, Linus Torvalds

On 2020/11/20 18:27, Dmitry Vyukov wrote:
> Peter, so far it looks like just a very large, but normal graph to me.
> The cheapest solution from an engineering point of view would be just
> to increase the constants. I assume a 2x increase should buy us lots
> of time before we overflow again.
> I can think of more elaborate solutions, e.g. using bitmasks to
> represent hot leaf and top-level locks. But it will both increase the
> resulting code complexity (no uniform representation anymore, all code
> will need to deal with different representations) and require some
> time investment (which I, at least, can't justify compared to
> just throwing a bit more machine memory at it). And in the end it
> won't really reduce the size of the graph.
> What do you think?
> 

Yes, I think it is a normal graph; it is simply that syzkaller kernels
tend to record a few times more dependencies than my idle kernel does
(shown at the bottom).

Peter, you guessed that the culprit is sysfs at
https://lkml.kernel.org/r/20200916115057.GO2674@hirez.programming.kicks-ass.net , but
syzbot reported at https://syzkaller.appspot.com/text?tag=MachineInfo&x=99b8f2b092d9714f
that "BUG: MAX_LOCKDEP_ENTRIES too low!" can occur on a VM with only 2 CPUs.
Does your guess catch the culprit?

We could improve a few locks, but as a whole we won't be able to afford keeping
the sum of individual dependencies under the current threshold. Therefore, allowing
lockdep to tune the capacity, and allowing syzkaller to dump state upon reaching
the capacity, will be the way to go.



# cat /proc/lockdep_stats
 lock-classes:                         1236 [max: 8192]
 direct dependencies:                  9610 [max: 32768]
 indirect dependencies:               40401
 all direct dependencies:            174635
 dependency chains:                   11398 [max: 65536]
 dependency chain hlocks used:        42830 [max: 327680]
 dependency chain hlocks lost:            0
 in-hardirq chains:                      61
 in-softirq chains:                     414
 in-process chains:                   10923
 stack-trace entries:                 93041 [max: 524288]
 number of stack traces:               4997
 number of stack hash chains:          4292
 combined max dependencies:       281074520
 hardirq-safe locks:                     50
 hardirq-unsafe locks:                  805
 softirq-safe locks:                    146
 softirq-unsafe locks:                  722
 irq-safe locks:                        155
 irq-unsafe locks:                      805
 hardirq-read-safe locks:                 2
 hardirq-read-unsafe locks:             129
 softirq-read-safe locks:                11
 softirq-read-unsafe locks:             123
 irq-read-safe locks:                    11
 irq-read-unsafe locks:                 129
 uncategorized locks:                   224
 unused locks:                            0
 max locking depth:                      15
 max bfs queue depth:                   215
 chain lookup misses:                 11664
 chain lookup hits:                37393935
 cyclic checks:                       11053
 redundant checks:                        0
 redundant links:                         0
 find-mask forwards checks:            1588
 find-mask backwards checks:           1779
 hardirq on events:                17502380
 hardirq off events:               17502376
 redundant hardirq ons:                   0
 redundant hardirq offs:                  0
 softirq on events:                   90845
 softirq off events:                  90845
 redundant softirq ons:                   0
 redundant softirq offs:                  0
 debug_locks:                             1

 zapped classes:                          0
 zapped lock chains:                      0
 large chain blocks:                      1
# awk ' { if ($2 == "OPS:") print $5" "$9 } ' /proc/lockdep | sort -rV | head -n 30
423 (wq_completion)events
405 (wq_completion)events_unbound
393 &f->f_pos_lock
355 &p->lock
349 sb_writers#3
342 sb_writers#6
338 &of->mutex
330 (work_completion)(&entry->work)
330 pernet_ops_rwsem
289 epmutex
288 &ep->mtx
281 tty_mutex
280 &tty->legacy_mutex
273 &tty->legacy_mutex/1
269 &tty->ldisc_sem
268 (wq_completion)ipv6_addrconf
266 (work_completion)(&(&ifa->dad_work)->work)
266 (linkwatch_work).work
266 (addr_chk_work).work
266 &ldata->atomic_read_lock
265 (work_completion)(&buf->work)
265 rtnl_mutex
263 &tty->atomic_write_lock
262 &buf->lock
261 &tty->termios_rwsem
242 &port->buf.lock/1
242 kn->active#40
241 &o_tty->termios_rwsem/1
240 registration_lock
239 &ldata->output_lock
# awk ' { if ($2 == "OPS:") print $7" "$9 } ' /proc/lockdep | sort -rV | head -n 30
642 pool_lock
641 &obj_hash[i].lock
582 tk_core.seq.seqcount
561 hrtimer_bases.lock
560 &rt_rq->rt_runtime_lock
559 &rt_b->rt_runtime_lock
559 &cp->lock
559 &cfs_rq->removed.lock
559 &cfs_b->lock
558 &rq->lock
550 &tsk->delays->lock
549 &p->pi_lock
506 &base->lock
504 &n->list_lock
432 &____s->seqcount
404 &x->wait#10
401 &pool->lock
369 &zone->lock
330 rcu_node_0
326 &(kretprobe_table_locks[i].lock)
326 pgd_lock
324 &pgdat->lru_lock
323 lock#5
321 &page_wait_table[i]
319 ptlock_ptr(page)#2/1
318 ptlock_ptr(page)#2
316 &sem->wait_lock
316 &mm->page_table_lock
316 ptlock_ptr(page)
315 &anon_vma->rwsem

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-22  1:56                                       ` Tetsuo Handa
@ 2020-11-27  9:00                                         ` Dmitry Vyukov
  2020-12-03 13:47                                           ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Dmitry Vyukov @ 2020-11-27  9:00 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Peter Zijlstra, syzkaller, Ingo Molnar, Will Deacon,
	Andrew Morton, LKML, Linus Torvalds

On Sun, Nov 22, 2020 at 2:56 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2020/11/20 18:27, Dmitry Vyukov wrote:
> > Peter, so far it looks like just a very large, but normal graph to me.
> > The cheapest solution from an engineering point of view would be just
> > to increase the constants. I assume a 2x increase should buy us lots
> > of time before we overflow again.
> > I can think of more elaborate solutions, e.g. using bitmasks to
> > represent hot leaf and top-level locks. But it will both increase the
> > resulting code complexity (no uniform representation anymore, all code
> > will need to deal with different representations) and require some
> > time investment (which I, at least, can't justify compared to
> > just throwing a bit more machine memory at it). And in the end it
> > won't really reduce the size of the graph.
> > What do you think?
> >
>
> Yes, I think it is a normal graph; it is simply that syzkaller kernels
> tend to record a few times more dependencies than my idle kernel does
> (shown at the bottom).
>
> Peter, you guessed that the culprit is sysfs at
> https://lkml.kernel.org/r/20200916115057.GO2674@hirez.programming.kicks-ass.net , but
> syzbot reported at https://syzkaller.appspot.com/text?tag=MachineInfo&x=99b8f2b092d9714f
> that "BUG: MAX_LOCKDEP_ENTRIES too low!" can occur on a VM with only 2 CPUs.
> Does your guess catch the culprit?
>
> We could improve a few locks, but as a whole we won't be able to afford keeping
> the sum of individual dependencies under the current threshold. Therefore, allowing
> lockdep to tune the capacity, and allowing syzkaller to dump state upon reaching
> the capacity, will be the way to go.
>
>
>
> # cat /proc/lockdep_stats
>  lock-classes:                         1236 [max: 8192]
>  direct dependencies:                  9610 [max: 32768]
>  indirect dependencies:               40401
>  all direct dependencies:            174635
>  dependency chains:                   11398 [max: 65536]
>  dependency chain hlocks used:        42830 [max: 327680]
>  dependency chain hlocks lost:            0
>  in-hardirq chains:                      61
>  in-softirq chains:                     414
>  in-process chains:                   10923
>  stack-trace entries:                 93041 [max: 524288]
>  number of stack traces:               4997
>  number of stack hash chains:          4292
>  combined max dependencies:       281074520
>  hardirq-safe locks:                     50
>  hardirq-unsafe locks:                  805
>  softirq-safe locks:                    146
>  softirq-unsafe locks:                  722
>  irq-safe locks:                        155
>  irq-unsafe locks:                      805
>  hardirq-read-safe locks:                 2
>  hardirq-read-unsafe locks:             129
>  softirq-read-safe locks:                11
>  softirq-read-unsafe locks:             123
>  irq-read-safe locks:                    11
>  irq-read-unsafe locks:                 129
>  uncategorized locks:                   224
>  unused locks:                            0
>  max locking depth:                      15
>  max bfs queue depth:                   215
>  chain lookup misses:                 11664
>  chain lookup hits:                37393935
>  cyclic checks:                       11053
>  redundant checks:                        0
>  redundant links:                         0
>  find-mask forwards checks:            1588
>  find-mask backwards checks:           1779
>  hardirq on events:                17502380
>  hardirq off events:               17502376
>  redundant hardirq ons:                   0
>  redundant hardirq offs:                  0
>  softirq on events:                   90845
>  softirq off events:                  90845
>  redundant softirq ons:                   0
>  redundant softirq offs:                  0
>  debug_locks:                             1
>
>  zapped classes:                          0
>  zapped lock chains:                      0
>  large chain blocks:                      1
> # awk ' { if ($2 == "OPS:") print $5" "$9 } ' /proc/lockdep | sort -rV | head -n 30
> 423 (wq_completion)events
> 405 (wq_completion)events_unbound
> 393 &f->f_pos_lock
> 355 &p->lock
> 349 sb_writers#3
> 342 sb_writers#6
> 338 &of->mutex
> 330 (work_completion)(&entry->work)
> 330 pernet_ops_rwsem
> 289 epmutex
> 288 &ep->mtx
> 281 tty_mutex
> 280 &tty->legacy_mutex
> 273 &tty->legacy_mutex/1
> 269 &tty->ldisc_sem
> 268 (wq_completion)ipv6_addrconf
> 266 (work_completion)(&(&ifa->dad_work)->work)
> 266 (linkwatch_work).work
> 266 (addr_chk_work).work
> 266 &ldata->atomic_read_lock
> 265 (work_completion)(&buf->work)
> 265 rtnl_mutex
> 263 &tty->atomic_write_lock
> 262 &buf->lock
> 261 &tty->termios_rwsem
> 242 &port->buf.lock/1
> 242 kn->active#40
> 241 &o_tty->termios_rwsem/1
> 240 registration_lock
> 239 &ldata->output_lock
> # awk ' { if ($2 == "OPS:") print $7" "$9 } ' /proc/lockdep | sort -rV | head -n 30
> 642 pool_lock
> 641 &obj_hash[i].lock
> 582 tk_core.seq.seqcount
> 561 hrtimer_bases.lock
> 560 &rt_rq->rt_runtime_lock
> 559 &rt_b->rt_runtime_lock
> 559 &cp->lock
> 559 &cfs_rq->removed.lock
> 559 &cfs_b->lock
> 558 &rq->lock
> 550 &tsk->delays->lock
> 549 &p->pi_lock
> 506 &base->lock
> 504 &n->list_lock
> 432 &____s->seqcount
> 404 &x->wait#10
> 401 &pool->lock
> 369 &zone->lock
> 330 rcu_node_0
> 326 &(kretprobe_table_locks[i].lock)
> 326 pgd_lock
> 324 &pgdat->lru_lock
> 323 lock#5
> 321 &page_wait_table[i]
> 319 ptlock_ptr(page)#2/1
> 318 ptlock_ptr(page)#2
> 316 &sem->wait_lock
> 316 &mm->page_table_lock
> 316 ptlock_ptr(page)
> 315 &anon_vma->rwsem

The latest syzbot logs now contain these dumps as well, if anybody is
interested in more:
https://syzkaller.appspot.com/bug?id=63fc8d0501c39609dd2f268e4190ec9a72619563
https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-11-27  9:00                                         ` Dmitry Vyukov
@ 2020-12-03 13:47                                           ` Tetsuo Handa
  2020-12-04 14:35                                             ` Tetsuo Handa
  0 siblings, 1 reply; 38+ messages in thread
From: Tetsuo Handa @ 2020-12-03 13:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Dmitry Vyukov

On Sun, Nov 22, 2020 at 2:56 AM Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> wrote:
> Peter, you guessed that the culprit is sysfs at
> https://lkml.kernel.org/r/20200916115057.GO2674@hirez.programming.kicks-ass.net , but
> syzbot reported at https://syzkaller.appspot.com/text?tag=MachineInfo&x=99b8f2b092d9714f
> that "BUG: MAX_LOCKDEP_ENTRIES too low!" can occur on a VM with only 2 CPUs.
> Does your guess catch the culprit?
>
> We could improve a few locks, but as a whole we won't be able to afford keeping
> the sum of individual dependencies under the current threshold. Therefore, allowing
> lockdep to tune the capacity, and allowing syzkaller to dump state upon reaching
> the capacity, will be the way to go.

An example dump from a VM with only 2 CPUs
( https://syzkaller.appspot.com/text?tag=CrashLog&x=12016d55500000 ) had
a chain with more than 150 kn->active#<number> entries.

kn->active#<number> is initialized by __kernfs_create_file() but is not
deinitialized by kernfs_put() (is there no method for erasing a lockdep map?).
Since syzkaller continues object creation/deletion via fuzz testing until
the kernel crashes, is __kernfs_create_file() called that many times?



struct kernfs_node *__kernfs_create_file(struct kernfs_node *parent,
                                         const char *name,
                                         umode_t mode, kuid_t uid, kgid_t gid,
                                         loff_t size,
                                         const struct kernfs_ops *ops,
                                         void *priv, const void *ns,
                                         struct lock_class_key *key)
{
(...snipped...)
        kn = kernfs_new_node(parent, name, (mode & S_IALLUGO) | S_IFREG,
                             uid, gid, flags);
        if (!kn)
                return ERR_PTR(-ENOMEM);

        kn->attr.ops = ops;
        kn->attr.size = size;
        kn->ns = ns;
        kn->priv = priv;

#ifdef CONFIG_DEBUG_LOCK_ALLOC
        if (key) {
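                /*
                 * Each distinct lock_class_key passed in as "key" registers
                 * a separate lockdep class here; since they all share the
                 * name "kn->active", lockdep reports them as kn->active#2,
                 * kn->active#3, ... as seen in the dump above.
                 */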
                lockdep_init_map(&kn->dep_map, "kn->active", key, 0);
                kn->flags |= KERNFS_LOCKDEP;
        }
#endif
(...snipped...)
}

void kernfs_put(struct kernfs_node *kn)
{
(...snipped...)
        kmem_cache_free(kernfs_node_cache, kn);
(...snipped...)
}



FD:  919 BD:   59 +.+.: rtnl_mutex
 ->fs_reclaim
 ->depot_lock
 ->pcpu_alloc_mutex
 ->&x->wait#6
 ->&obj_hash[i].lock
 ->gdp_mutex
 ->&k->list_lock
 ->lock#2
 ->kernfs_mutex
 ->bus_type_sem
 ->sysfs_symlink_target_lock
 ->&____s->seqcount
 ->&dev->power.lock
 ->dpm_list_mtx
 ->uevent_sock_mutex
 ->&zone->lock
 ->running_helpers_waitq.lock
 ->subsys mutex#17
 ->dev_hotplug_mutex
 ->dev_base_lock
 ->input_pool.lock
 ->nl_table_lock
 ->nl_table_wait.lock
 ->net_rwsem
 ->&tbl->lock
 ->sysctl_lock
 ->&rq->lock
 ->krc.lock
 ->cpu_hotplug_lock
 ->kthread_create_lock
 ->&p->pi_lock
 ->&cfs_rq->removed.lock
 ->&x->wait
 ->&p->alloc_lock
 ->&cpuset_rwsem
 ->wq_pool_mutex
 ->&crng->lock
 ->rcu_node_0
 ->&rcu_state.expedited_wq
 ->&n->list_lock
 ->&pool->lock
 ->pool_lock
 ->lweventlist_lock
 ->rtnl_mutex.wait_lock
 ->proc_subdir_lock
 ->proc_inum_ida.xa_lock
 ->&k->k_lock
 ->pcpu_lock
 ->param_lock
 ->logbuf_lock
 ->(console_sem).lock
 ->subsys mutex#52
 ->pin_fs_lock
 ->&sb->s_type->i_mutex_key#3
 ->&local->iflist_mtx
 ->&wdev->mtx
 ->subsys mutex#53
 ->&sdata->sec_mtx
 ->&local->iflist_mtx#2
 ->lock#6
 ->net_dm_mutex
 ->failover_lock
 ->&tn->lock
 ->&cn->lock
 ->&ndev->lock
 ->&idev->mc_lock
 ->&mc->mca_lock
 ->&base->lock
 ->j1939_netdev_lock
 ->&pnettable->lock
 ->&vn->sock_lock
 ->&reg_requests_lock
 ->&reg_pending_beacons_lock
 ->&reg_indoor_lock
 ->_xmit_LOOPBACK
 ->(inetaddr_validator_chain).rwsem
 ->(inetaddr_chain).rwsem
 ->&in_dev->mc_tomb_lock
 ->&im->lock
 ->fib_info_lock
 ->cbs_list_lock
 ->taprio_list_lock
 ->(inet6addr_validator_chain).rwsem
 ->addrconf_hash_lock
 ->&ifa->lock
 ->&tb->tb6_lock
 ->rlock-AF_NETLINK
 ->_xmit_ETHER
 ->noop_qdisc.q.lock
 ->quarantine_lock
 ->kernfs_mutex.wait_lock
 ->x25_neigh_list_lock
 ->&mm->mmap_lock#2
 ->free_vmap_area_lock
 ->vmap_area_lock
 ->&ul->lock
 ->&n->lock
 ->&pn->hash_lock
 ->init_mm.page_table_lock
 ->&nr_netdev_addr_lock_key
 ->listen_lock
 ->&r->consumer_lock
 ->batched_entropy_u32.lock
 ->(switchdev_blocking_notif_chain).rwsem
 ->&br->hash_lock
 ->nf_hook_mutex
 ->console_owner_lock
 ->console_owner
 ->&bat_priv->tvlv.handler_list_lock
 ->&bat_priv->tvlv.container_list_lock
 ->&bat_priv->softif_vlan_list_lock
 ->key#14
 ->&bat_priv->tt.changes_list_lock
 ->tk_core.seq.seqcount
 ->&wq->mutex
 ->init_lock
 ->deferred_lock
 ->target_list_lock
 ->&br->lock
 ->&dev_addr_list_lock_key/1
 ->team->team_lock_key
 ->&hard_iface->bat_iv.ogm_buff_mutex
 ->ptype_lock
 ->&hsr->list_lock
 ->&rnp->exp_wq[1]
 ->&rnp->exp_wq[2]
 ->&rnp->exp_wq[3]
 ->&xa->xa_lock#7
 ->&dev_addr_list_lock_key#3/1
 ->req_lock
 ->&x->wait#9
 ->subsys mutex#74
 ->&devlink_port->type_lock
 ->pgd_lock
 ->&dev->tx_global_lock
 ->&rnp->exp_wq[0]
 ->&sch->q.lock
 ->(&tbl->proxy_timer)
 ->class
 ->bpf_devs_lock
 ->devnet_rename_sem
 ->&net->nft.commit_mutex
 ->&ent->pde_unload_lock
 ->team->team_lock_key#2
 ->&wg->device_update_lock
 ->_xmit_SIT
 ->&bridge_netdev_addr_lock_key/1
 ->team->team_lock_key#3
 ->_xmit_NONE
 ->_xmit_TUNNEL
 ->_xmit_IPGRE
 ->mount_lock
 ->team->team_lock_key#4
 ->_xmit_TUNNEL6
 ->team->team_lock_key#5
 ->&dev_addr_list_lock_key#2/1
 ->team->team_lock_key#6
 ->_xmit_ETHER/1
 ->&nn->netlink_tap_lock
 ->&batadv_netdev_addr_lock_key/1
 ->&vlan_netdev_addr_lock_key/1
 ->&macvlan_netdev_addr_lock_key/1
 ->&br->multicast_lock
 ->&macsec_netdev_addr_lock_key/1
 ->key#18
 ->&bat_priv->tt.commit_lock
 ->&sb->s_type->i_lock_key#8
 ->k-slock-AF_INET/1
 ->k-slock-AF_INET#2
 ->k-sk_lock-AF_INET
 ->k-slock-AF_INET6
 ->k-sk_lock-AF_INET6
 ->&ul->lock#2
 ->&____s->seqcount#5
 ->&nf_conntrack_locks[i]
 ->&wdev->event_lock
 ->&wdev->mgmt_registrations_lock
 ->&local->key_mtx
 ->&dentry->d_lock
 ->&sb->s_type->i_lock_key#7
 ->&s->s_inode_list_lock
 ->&xa->xa_lock#4
 ->&rfkill->lock
 ->&local->chanctx_mtx
 ->&data->mutex
 ->&local->queue_stop_reason_lock
 ->&sch->seqlock
 ->&cpuctx_lock
 ->act_mod_lock
 ->&tn->idrinfo->lock
 ->vmap_purge_lock
 ->cpa_lock
 ->&fp->aux->used_maps_mutex
 ->&p->tcfa_lock
 ->slock-AF_INET6
 ->sk_lock-AF_INET6
 ->__ip_vs_mutex
 ->flowtable_lock
 ->&net->xdp.lock
 ->mirred_list_lock
 ->&pnn->pndevs.lock
 ->&pnn->routes.lock
 ->kn->active#164
 ->kn->active#165
 ->kernfs_idr_lock
 ->kn->active#166
 ->kn->active#167
 ->kn->active#168
 ->kn->active#169
 ->kn->active#170
 ->kn->active#171
 ->kn->active#172
 ->kn->active#173
 ->kn->active#174
 ->kn->active#175
 ->dev_pm_qos_sysfs_mtx
 ->kn->active#80
 ->kn->active#81
 ->kn->active#82
 ->kn->active#83
 ->kn->active#84
 ->kn->active#2
 ->kn->active#176
 ->kn->active#177
 ->kn->active#178
 ->kn->active#179
 ->kn->active#180
 ->kn->active#181
 ->kn->active#182
 ->kn->active#183
 ->kn->active#184
 ->kn->active#185
 ->kn->active#186
 ->kn->active#187
 ->kn->active#188
 ->kn->active#189
 ->kn->active#190
 ->kn->active#191
 ->kn->active#192
 ->kn->active#193
 ->kn->active#194
 ->kn->active#195
 ->kn->active#196
 ->kn->active#197
 ->kn->active#198
 ->kn->active#199
 ->kn->active#200
 ->kn->active#53
 ->kn->active#201
 ->kn->active#64
 ->kn->active#63
 ->kn->active#62
 ->kn->active#55
 ->kn->active#56
 ->kn->active#202
 ->kn->active#203
 ->kn->active#54
 ->kn->active#204
 ->kn->active#205
 ->kn->active#206
 ->kn->active#207
 ->kn->active#208
 ->kn->active#209
 ->kn->active#210
 ->kn->active#211
 ->kn->active#212
 ->kn->active#213
 ->kn->active#214
 ->kn->active#215
 ->kn->active#216
 ->kn->active#217
 ->kn->active#218
 ->kn->active#65
 ->kn->active#219
 ->kn->active#220
 ->kn->active#221
 ->kn->active#222
 ->deferred_probe_mutex
 ->wfs_lock
 ->device_links_lock
 ->qdisc_mod_lock
 ->dev->qdisc_running_key ?: &qdisc_running_key
 ->(&q->adapt_timer)
 ->&block->lock
 ->&block->cb_lock
 ->(&q->adapt_timer)#2
 ->&chain->filter_chain_lock
 ->cls_mod_lock
 ->&head->masks_lock
 ->&tp->lock
 ->&tn->idrinfo->lock#2
 ->&block->proto_destroy_lock
 ->lock#10
 ->&tun->lock
 ->wlock-AF_UNSPEC
 ->elock-AF_UNSPEC
 ->kn->active#343
 ->kn->active#344
 ->kn->active#345
 ->pcpu_alloc_mutex.wait_lock
 ->rcu_state.exp_mutex.wait_lock
 ->&rnp->exp_lock
 ->(&fnew->perturb_timer)
 ->ematch_mod_lock
 ->&pn->all_ppp_mutex
 ->&ppp->rlock
 ->&ppp->wlock
 ->&dev_addr_list_lock_key#4
 ->&pf->rwait
 ->&macvlan_netdev_addr_lock_key/2
 ->kn->active#71
 ->&dev->devres_lock
 ->kn->active#346
 ->kn->active#347
 ->kn->active#348
 ->kn->active#349
 ->rcu_state.exp_mutex
 ->&tn->idrinfo->lock#3
 ->netdev_unregistering_wq.lock
 ->&(&net->nexthop.notifier_chain)->rwsem
 ->&caifn->caifdevs.lock
 ->&net->rules_mod_lock
 ->(&mrt->ipmr_expire_timer)
 ->&ht->mutex
 ->nf_connlabels_lock
 ->&tn->idrinfo->lock#4
 ->zones_mutex
 ->uevent_sock_mutex.wait_lock
 ->napi_hash_lock
 ->kn->active#356
 ->kn->active#357
 ->kn->active#358
 ->kn->active#359
 ->kn->active#360
 ->kn->active#361
 ->kn->active#362
 ->kn->active#363
 ->kn->active#364
 ->kn->active#365
 ->kn->active#366
 ->kn->active#367
 ->kn->active#368
 ->kn->active#369
 ->kn->active#370
 ->kn->active#371
 ->kn->active#372
 ->kn->active#373
 ->kn->active#374
 ->kn->active#375
 ->kn->active#376
 ->kn->active#377
 ->kn->active#378
 ->kn->active#379
 ->kn->active#380
 ->kn->active#381
 ->kn->active#382
 ->kn->active#383
 ->kn->active#384
 ->kn->active#385
 ->kn->active#386
 ->kn->active#387
 ->kn->active#388
 ->kn->active#389
 ->kn->active#390
 ->kn->active#391
 ->kn->active#392
 ->kn->active#393
 ->kn->active#394
 ->kn->active#395
 ->kn->active#396
 ->kn->active#397
 ->kn->active#398
 ->kn->active#399
 ->kn->active#400
 ->kn->active#401
 ->&bridge_netdev_addr_lock_key
 ->rcu_state.barrier_mutex
 ->&app->lock#2
 ->(&app->join_timer)#2
 ->(&app->periodic_timer)
 ->&list->lock#13
 ->(&app->join_timer)
 ->&app->lock
 ->&list->lock#12
 ->&tn->idrinfo->lock#5
 ->ife_mod_lock
 ->_xmit_NETROM#2
 ->&this->info_list_lock
 ->&local->mtx
 ->&lock->wait_lock
 ->&list->lock#2
 ->(&port->multicast_router_timer)
 ->kn->active#404
 ->kn->active#405
 ->kn->active#406
 ->kn->active#407
 ->kn->active#408
 ->kn->active#409
 ->kn->active#410
 ->kn->active#411
 ->kn->active#412
 ->kn->active#413
 ->kn->active#414
 ->kn->active#415
 ->kn->active#416
 ->kn->active#417
 ->kn->active#418
 ->kn->active#419
 ->kn->active#420
 ->kn->active#421
 ->kn->active#422
 ->kn->active#423
 ->kn->active#424
 ->kn->active#425
 ->kn->active#426
 ->kn->active#427
 ->kn->active#428
 ->kn->active#429
 ->kn->active#430
 ->kn->active#431
 ->kn->active#432
 ->kn->active#433
 ->kn->active#434
 ->&tn->idrinfo->lock#6
 ->&local->sta_mtx
 ->acaddr_hash_lock
 ->(&br->hello_timer)
 ->(&br->topology_change_timer)
 ->(&br->tcn_timer)
 ->(&br->multicast_router_timer)
 ->(&br->ip4_other_query.timer)
 ->(&br->ip4_own_query.timer)
 ->(&br->ip6_other_query.timer)
 ->(&br->ip6_own_query.timer)
 ->slock-AF_CAN
 ->sk_lock-AF_CAN
 ->&list->lock#16
 ->&wg->socket_update_lock
 ->&table->hash[i].lock
 ->k-clock-AF_INET
 ->k-clock-AF_INET6
 ->&mc_lst->sflock
 ->&tn->idrinfo->lock#7
 ->_xmit_AX25
 ->&bridge_netdev_addr_lock_key/2
 ->fastmap_lock

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] lockdep: Allow tuning tracing capacity constants.
  2020-12-03 13:47                                           ` Tetsuo Handa
@ 2020-12-04 14:35                                             ` Tetsuo Handa
  0 siblings, 0 replies; 38+ messages in thread
From: Tetsuo Handa @ 2020-12-04 14:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: syzkaller, Ingo Molnar, Will Deacon, Andrew Morton, LKML,
	Linus Torvalds, Dmitry Vyukov

I attempted some analysis, but I couldn't find which lock is causing nr_list_entries == 32768.
Since "struct lock_list list_entries[MAX_LOCKDEP_ENTRIES]" is marked as a "static" variable in lockdep.c,
none of /proc/lockdep_stats, /proc/lockdep and /proc/lockdep_chains can report which lock is responsible
for the "BUG: MAX_LOCKDEP_ENTRIES too low!" message, right?

 lock-classes:                         3456 [max: 8192]
 direct dependencies:                 32768 [max: 32768]
 indirect dependencies:              200953
 all direct dependencies:           1213844
 dependency chains:                   56497 [max: 65536]
 dependency chain hlocks used:       249022 [max: 327680]
 dependency chain hlocks lost:            0
 in-hardirq chains:                      88
 in-softirq chains:                    2636
 in-process chains:                   53773
 stack-trace entries:                303496 [max: 524288]
 number of stack traces:              15714
 number of stack hash chains:         10103
 combined max dependencies:
 hardirq-safe locks:                     57
 hardirq-unsafe locks:                 2613
 softirq-safe locks:                    381
 softirq-unsafe locks:                 2202
 irq-safe locks:                        395
 irq-unsafe locks:                     2613
 hardirq-read-safe locks:                 4
 hardirq-read-unsafe locks:             233
 softirq-read-safe locks:                21
 softirq-read-unsafe locks:             218
 irq-read-safe locks:                    21
 irq-read-unsafe locks:                 233
 uncategorized locks:                   491
 unused locks:                            1
 max locking depth:                      20
 max bfs queue depth:                   749
 debug_locks:                             0

 zapped classes:                        224
 zapped lock chains:                    640
 large chain blocks:                      1

I was able to calculate "lock-classes:" and "indirect dependencies:" from /proc/lockdep, but
I failed to calculate "direct dependencies:" from /proc/lockdep. How can we calculate it?

grep ^FD: CrashLog-12016d55500000 | wc -l
3456

grep -F -- '->' CrashLog-12016d55500000 | wc -l
16004

grep '^ ->' CrashLog-12016d55500000 | wc -l
14014

grep '^ ->' CrashLog-12016d55500000 | sort -u | wc -l
2506

grep ^FD: CrashLog-12016d55500000 | awk ' BEGIN { FD=0; BD=0; } { if ($1 == "FD:") { FD=FD+$2; BD=BD+$4; } } END { print FD" "BD; } '
200953 200856

Top FD entries tend to be long-held locks (e.g. workqueue) whereas
top BD entries tend to be short-held locks (e.g. scheduler and timekeeping).
But given that this result is from a VM with only 2 CPUs, I think that
this result does not match your

  Examples of bad annotations is getting every CPU a separate class, that
  leads to nr_cpus! chains if CPUs arbitrarily nest (nr_cpus^2 if there's
  only a single nesting level).

case.

awk ' { if ($1 == "FD:") print $2" "$6$7 } ' CrashLog-12016d55500000 | sort -rV | head -n 40
1933 (wq_completion)usb_hub_wq
1932 (work_completion)(&hub->events)
1730 (wq_completion)events
1508 (delayed_fput_work).work
1332 &sb->s_type->i_mutex_key#13
1227 &f->f_pos_lock
1173 sb_writers#5
1095 (wq_completion)netns
1094 net_cleanup_work
1094 masq_mutex
1092 pernet_ops_rwsem
1046 (work_completion)(&m->wq)
1046 (work_completion)(&m->rq)
1045 &pipe->mutex/1
1043 tty_mutex
1042 &tty->legacy_mutex
1039 &p->lock
1035 &tty->legacy_mutex/1
1033 &tty->ldisc_sem
1009 cb_lock
1007 (work_completion)(&nlk->work)
1007 nlk_cb_mutex-GENERIC
1006 genl_mutex
985 sb_writers#8
984 sb_writers#13
984 sb_writers#6
979 &of->mutex
964 (work_completion)(&fw_work->work)
946 (wq_completion)events_power_efficient
943 &rdma_nl_types[idx].sem
942 link_ops_rwsem
939 (wq_completion)gid-cache-wq
937 (work_completion)(&work->work)
937 (work_completion)(&ndev_work->work)
936 devices_rwsem
931 kn->active#88
930 nsim_bus_dev_list_lock
926 rdma_nets_rwsem
924 &device->compat_devs_mutex
922 (wq_completion)ipv6_addrconf

awk ' { if ($3 == "BD:") print $4" "$6$7 } ' CrashLog-12016d55500000 | sort -urV | head -n 40
1618 pool_lock
1617 &obj_hash[i].lock
1510 &base->lock
1489 pvclock_gtod_data
1487 tk_core.seq.seqcount
1465 hrtimer_bases.lock
1460 &per_cpu_ptr(group->pcpu,cpu)->seq
1459 &rt_rq->rt_runtime_lock
1459 &ctx->lock/1
1458 &rt_b->rt_runtime_lock
1458 &ctx->lock
1457 &tsk->delays->lock
1457 &rq->lock/1
1457 &rd->rto_lock
1457 &cp->lock
1457 &cfs_rq->removed.lock
1457 &cfs_b->lock
1457 per_cpu_ptr(&cgroup_rstat_cpu_lock,cpu)
1456 &rq->lock
1299 &p->pi_lock
1266 depot_lock
1114 &____s->seqcount
1085 &n->list_lock
1071 &zone->lock
1039 &x->wait#13
1039 wq_mayday_lock
1038 &pool->lock
869 quarantine_lock
859 &cpuctx_lock
854 rcu_node_0
827 &rcu_state.expedited_wq
812 key#10
808 &memcg->deferred_split_queue.split_queue_lock
807 &pgdat->lru_lock
806 lock#3
804 &eb->refs_lock
798 &mapping->private_lock
788 &rcu_state.gp_wq
782 &page_wait_table[i]
778 pgd_lock

grep -F -- '->kn->active#' CrashLog-12016d55500000 | wc -l    # dependency lines pointing at some kn->active instance
533
grep -F -- '->kn->active#' CrashLog-12016d55500000 | sort -u | wc -l    # unique such lines
439



FD:  931 BD:    1 .+.+: kn->active#88
 ->fs_reclaim
 ->&____s->seqcount
 ->kernfs_open_file_mutex
 ->nsim_bus_dev_list_lock
 ->&zone->lock
 ->nsim_bus_dev_list_lock.wait_lock
 ->&p->pi_lock
 ->&rq->lock



FD: 1933 BD:    1 +.+.: (wq_completion)usb_hub_wq
 ->(work_completion)(&hub->events)

FD: 1932 BD:    2 +.+.: (work_completion)(&hub->events)
 ->kcov_remote_lock
 ->&dev->power.lock
 ->fs_reclaim
 ->depot_lock
 ->free_vmap_area_lock
 ->vmap_area_lock
 ->&____s->seqcount
 ->init_mm.page_table_lock
 ->&zone->lock
 ->&rq->lock
 ->&obj_hash[i].lock
 ->&cfs_rq->removed.lock
 ->&kcov->lock

FD:    2 BD:  261 ..-.: kcov_remote_lock
 ->depot_lock

FD:   27 BD:  262 -...: &dev->power.lock
 ->&dev->power.lock/1
 ->&dev->power.wait_queue
 ->&obj_hash[i].lock
 ->&ws->lock

FD:   55 BD:  763 +.+.: fs_reclaim
 ->&rq->lock
 ->&cfs_rq->removed.lock
 ->&obj_hash[i].lock
 ->depot_lock
 ->&mapping->i_mmap_rwsem
 ->mmu_notifier_invalidate_range_start
 ->pgd_lock
 ->&cpuctx_lock
 ->rcu_node_0
 ->&rcu_state.expedited_wq

FD:    1 BD: 1266 ..-.: depot_lock

FD:    9 BD:  378 +.+.: free_vmap_area_lock
 ->&obj_hash[i].lock
 ->depot_lock
 ->init_mm.page_table_lock
 ->quarantine_lock
 ->&zone->lock
 ->&____s->seqcount

FD:    1 BD:  383 +.+.: vmap_area_lock

FD:    1 BD: 1114 ...-: &____s->seqcount

FD:    6 BD:  379 +.+.: init_mm.page_table_lock
 ->pgd_lock
 ->&obj_hash[i].lock
 ->&zone->lock

FD:    2 BD: 1071 ..-.: &zone->lock
 ->&____s->seqcount

FD:   19 BD: 1456 -.-.: &rq->lock
 ->&per_cpu_ptr(group->pcpu, cpu)->seq
 ->&base->lock
 ->&cfs_rq->removed.lock
 ->&rt_b->rt_runtime_lock
 ->&cp->lock
 ->&rt_rq->rt_runtime_lock
 ->tk_core.seq.seqcount
 ->&tsk->delays->lock
 ->&ctx->lock
 ->&rd->rto_lock
 ->&rq->lock/1

FD:    2 BD: 1617 -.-.: &obj_hash[i].lock
 ->pool_lock

FD:    1 BD: 1457 -.-.: &cfs_rq->removed.lock

FD:    3 BD:  260 ..-.: &kcov->lock
 ->kcov_remote_lock


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v4] lockdep: Allow tuning tracing capacity constants.
  2020-10-10 12:58   ` [PATCH v3] " Tetsuo Handa
  2020-10-18 13:02     ` Tetsuo Handa
@ 2021-01-01  8:09     ` Tetsuo Handa
  1 sibling, 0 replies; 38+ messages in thread
From: Tetsuo Handa @ 2021-01-01  8:09 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Linus Torvalds
  Cc: Andrew Morton, linux-kernel, Dmitry Vyukov

Since syzkaller continues various test cases until the kernel crashes,
syzkaller tends to examine more locking dependencies than normal systems.
As a result, syzbot is reporting that the fuzz testing was terminated
due to hitting upper limits lockdep can track [1] [2] [3].

Peter Zijlstra does not want to allow tuning these limits via kernel
config options, because such a change discourages thinking. But analysis
via /proc/lockdep* did not show any obvious culprit [4] [5]. It is
possible that the many hundreds of kn->active lock instances are to some
degree contributing to these problems, but there is no way to verify
whether these instances are created for protecting the same callback
functions. Unless Peter provides a way to key these instances by which
callback functions the lock instance protects (identified by something
like an MD5 hash of the string representations of the callback functions
each lock instance protects) rather than by a plain serial number, I
don't think we can identify the culprit.

Therefore, despite Peter's objection, I am pushing this patch again.
The ball is in Peter's court. Peter, please don't ignore it.

[1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
[2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
[3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
[4] https://lkml.kernel.org/r/4b8f7a57-fa20-47bd-48a0-ae35d860f233@i-love.sakura.ne.jp
[5] https://lkml.kernel.org/r/1c351187-253b-2d49-acaf-4563c63ae7d2@i-love.sakura.ne.jp

Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
---
 kernel/locking/lockdep.c           |  2 +-
 kernel/locking/lockdep_internals.h |  8 +++---
 lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c1418b47f625..c0553872668a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1391,7 +1391,7 @@ static int add_lock_to_list(struct lock_class *this,
 /*
  * For good efficiency of modular, we use power of 2
  */
-#define MAX_CIRCULAR_QUEUE_SIZE		4096UL
+#define MAX_CIRCULAR_QUEUE_SIZE		(1UL << CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS)
 #define CQ_MASK				(MAX_CIRCULAR_QUEUE_SIZE-1)
 
 /*
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index de49f9e1c11b..ecb8662e7a4e 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -99,16 +99,16 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 #define MAX_STACK_TRACE_ENTRIES	262144UL
 #define STACK_TRACE_HASH_SIZE	8192
 #else
-#define MAX_LOCKDEP_ENTRIES	32768UL
+#define MAX_LOCKDEP_ENTRIES	(1UL << CONFIG_LOCKDEP_BITS)
 
-#define MAX_LOCKDEP_CHAINS_BITS	16
+#define MAX_LOCKDEP_CHAINS_BITS	CONFIG_LOCKDEP_CHAINS_BITS
 
 /*
  * Stack-trace: tightly packed array of stack backtrace
  * addresses. Protected by the hash_lock.
  */
-#define MAX_STACK_TRACE_ENTRIES	524288UL
-#define STACK_TRACE_HASH_SIZE	16384
+#define MAX_STACK_TRACE_ENTRIES	(1UL << CONFIG_LOCKDEP_STACK_TRACE_BITS)
+#define STACK_TRACE_HASH_SIZE	(1 << CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS)
 #endif
 
 /*
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e6e58b26e888..2c23939dc6a6 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1340,6 +1340,46 @@ config LOCKDEP
 config LOCKDEP_SMALL
 	bool
 
+config LOCKDEP_BITS
+	int "Bitsize for MAX_LOCKDEP_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 15
+	help
+	  Try increasing this value if you hit "BUG: MAX_LOCKDEP_ENTRIES too low!" message.
+
+config LOCKDEP_CHAINS_BITS
+	int "Bitsize for MAX_LOCKDEP_CHAINS"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 16
+	help
+	  Try increasing this value if you hit "BUG: MAX_LOCKDEP_CHAINS too low!" message.
+
+config LOCKDEP_STACK_TRACE_BITS
+	int "Bitsize for MAX_STACK_TRACE_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 19
+	help
+	  Try increasing this value if you hit "BUG: MAX_STACK_TRACE_ENTRIES too low!" message.
+
+config LOCKDEP_STACK_TRACE_HASH_BITS
+	int "Bitsize for STACK_TRACE_HASH_SIZE"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 14
+	help
+	  Try increasing this value if you need large MAX_STACK_TRACE_ENTRIES.
+
+config LOCKDEP_CIRCULAR_QUEUE_BITS
+	int "Bitsize for elements in circular_queue struct"
+	depends on LOCKDEP
+	range 10 30
+	default 12
+	help
+	  Try increasing this value if you hit "lockdep bfs error:-1" warning due to __cq_enqueue() failure.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
-- 
2.18.4
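
For anyone who wants to try this, a minimal sketch of raising the limits on a
fuzzing kernel once the patch is applied (the values below are only an example,
not a recommendation; each option accepts 10..30):

./scripts/config --set-val LOCKDEP_BITS 16 \
                 --set-val LOCKDEP_CHAINS_BITS 17 \
                 --set-val LOCKDEP_STACK_TRACE_BITS 20 \
                 --set-val LOCKDEP_STACK_TRACE_HASH_BITS 15 \
                 --set-val LOCKDEP_CIRCULAR_QUEUE_BITS 13
make olddefconfig    # fold the new values into .config before rebuilding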



^ permalink raw reply related	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-01-01  8:12 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-25  1:30 [PATCH] lockdep: Introduce CONFIG_LOCKDEP_LARGE Tetsuo Handa
2020-07-25  4:48 ` Dmitry Vyukov
2020-07-25  5:23   ` Tetsuo Handa
2020-08-04  2:36     ` Tetsuo Handa
2020-08-18  9:57       ` Dmitry Vyukov
2020-08-18 11:07         ` Tetsuo Handa
2020-08-18 12:02           ` Dmitry Vyukov
2020-08-18 12:59             ` Tetsuo Handa
2020-08-27 15:20 ` [PATCH v2] lockdep: Allow tuning tracing capacity constants Tetsuo Handa
2020-09-04 16:05   ` Tetsuo Handa
2020-09-16 11:28     ` Dmitry Vyukov
2020-09-16 11:50       ` peterz
2020-09-16 12:14         ` Dmitry Vyukov
2020-09-28  0:24           ` Tetsuo Handa
2020-09-28  5:12             ` Dmitry Vyukov
2020-10-10 12:58   ` [PATCH v3] " Tetsuo Handa
2020-10-18 13:02     ` Tetsuo Handa
2020-11-18 13:57       ` Tetsuo Handa
2020-11-18 14:23         ` Peter Zijlstra
2020-11-18 14:30           ` Tetsuo Handa
2020-11-18 15:10             ` Peter Zijlstra
2020-11-18 15:31               ` Tetsuo Handa
2020-11-19 12:33                 ` Dmitry Vyukov
2020-11-19 12:43                   ` Dmitry Vyukov
2020-11-19 12:49                     ` Dmitry Vyukov
2020-11-19 13:06                       ` Dmitry Vyukov
2020-11-19 13:45                         ` Tetsuo Handa
2020-11-19 14:05                           ` Dmitry Vyukov
     [not found]                             ` <CACT4Y+aNJmuhk0KicX4FzKW6PhawFBgvrC2gSJcWwUkR8VSSmg@mail.gmail.com>
2020-11-19 14:36                               ` Dmitry Vyukov
2020-11-19 18:08                                 ` Dmitry Vyukov
2020-11-20  9:22                                   ` Dmitry Vyukov
2020-11-20  9:27                                     ` Dmitry Vyukov
2020-11-22  1:56                                       ` Tetsuo Handa
2020-11-27  9:00                                         ` Dmitry Vyukov
2020-12-03 13:47                                           ` Tetsuo Handa
2020-12-04 14:35                                             ` Tetsuo Handa
2020-11-19 14:57                               ` Tetsuo Handa
2021-01-01  8:09     ` [PATCH v4] " Tetsuo Handa
